
Introduction to Wave Optics

Nick Vamivakas
Institute of Optics
University of Rochester

April 22, 2019


Chapter 1

Introduction

Interference and diffraction (OPT 261) is the first physical optics class you en-
counter in the Institute of Optics undergraduate program. Its physical underpin-
ning is based on the notion that light is a wave. And, these waves are a medium
for energy transmission that can be used to communicate information between
distinct spatial locations. We will unpack the previous sentence as we progress
this semester. Two comments are in order. First, we will use the phrase light
rather generically to refer to the phenomena we discuss. In a strict sense light
refers to waves with temporal frequencies 1 (to be defined soon) that lie in the
visible part of the electromagnetic spectrum. Visible just means we can see light
waves with our built in photodetectors - our eyes! - and we will learn that the
spectrum of a wave refers to the variety of temporal frequencies that contribute
to the wave disturbance. But, what we will be discussing applies to visible and
invisible waves as well as many types of non-light waves.

A familiar example of such a light wave is a laser. A laser is a nearly monochromatic
light wave and it gives the visual impression of being a single color (a monochromatic
wave has a single temporal frequency; mono = one and chroma = color). Light
waves with broad temporal frequency content (polychromatic) also exist. Think
of a light bulb or the sun. These sources emit waves that correspond to a collection
of temporal frequencies and are understood to have a broad frequency
spectrum and thus appear white. There will be more discussion of this as we
progress. Second, describing light as a wave is only one of many ways to model
light. Each model is a different mathematical representation of light that cap-
tures certain features of light while disregarding/ignoring other features. As you
progress through the Optics curriculum you encounter these different models
and your task as a student is to learn when a particular model is appropriate. In
fact, the curriculum is structured so that you devote a full semester to each model
of light.

Lastly, before describing the different models of light, let me elaborate on how
1 We use the term temporal frequency to make explicit that this is the frequency related to the temporal variation of a light wave; this foreshadows the idea of spatial frequency.

the topics we encounter this semester are connected. Specifically, our journey
this semester will focus on light waves, but will be generally applicable to all wave
phenomena. Since wave optics does not provide a full description of how light
behaves [see below 1.1] it is necessary to provide some postulates that we take as
true (and can be shown to be true when you have a bit more machinery at your
disposal, i.e. OPT 262). With these postulates we will start by understanding what
a wave “is" and then consider two simple waves: a plane wave and a spherical
wave. These elementary waves are the primitives that the rest of the course is built
on. You will learn how to visualize these waves and understand the relationship
between their strength, amplitude and phase. With these single waves, we will
discuss situations where you add - the grown up word is superpose - 2 waves. It is
here that the idea of interference is engaged. With an understanding of generic
interference effects, we will consider instruments to manipulate interference, so
called interferometers. This will provide us with an opportunity to consider what
coherence means in the context of light waves. Although still an active area of
research, loosely speaking coherence is related to the ability of a light wave to
interfere with a suitably temporally delayed or spatially shifted replica of itself.
Continuing on our journey we will encounter scenarios where instead of 2 we
add up (superpose) N waves. This provides the physical underpinnings for thin
film filters, optical cavities and diffraction gratings. Finally, we let N, the number
of waves we consider, become infinite in a controlled way and we arrive at wave
diffraction. From this perspective diffraction is nothing but the interference of
many waves. The reward for all this heavy lifting is we learn that diffraction sets
the fundamental limits to optical system performance. I will stress many times
throughout this class wave optics is nothing but examining the consequences of
adding up, a possibly infinite number, of waves. So at the outset I alert you to the
importance of fully understanding the primitive objects - the plane and spherical
wave - and their manipulation since this provides the basis the rest of the course
is built on. A second cautionary note. Many students struggle in moving from N
to infinity since this introduces multi-dimensional calculus into the discussion.
You will find it useful to review this material now and when appropriate.

Now we can begin with models of light.

1.1 Models of Light


The exact nature of light, and what light “is", I would guess has been discussed
as long as people have been discussing. It certainly has occupied the minds of
many scientists and engineers. At a basic level, there is a fundamental tension as
to whether light is a wave or light is a particle (what are the physical characteristics
of a particle? of a wave? how can you test these?). As these discussions have evolved
so has our understanding of what light is. And, perhaps ironically, with these
continued debates, we are not much closer to answering the question of what
light “is"! You will learn that the answer to the previous question ultimately depends
on how you generate, propagate and detect it - depending on the exact scenario,
sometimes light seems like a particle and sometimes like a wave. With this short detour
regarding the nature of light, we now set out to describe briefly the different ways
we think about light. I encourage you as you learn about each model to consider
how the model deals with generation, propagation and detection.

There are four main models used to understand light. To organize these I like
to think of an archery target (see Fig. 1.1). Such a target is a collection of four

[Figure 1.1: four nested circles; from the outermost ring inward the labels read
QUANTUM OPTICS, ELECTROMAGNETIC OPTICS, WAVE OPTICS, RAY OPTICS.]

Figure 1.1 The nested structure of light models. The least descriptive model is ray
optics and the most comprehensive model is quantum optics.

nested circles. Each model corresponds to a different circle and the larger, outer
circles not only capture some new feature(s) about light, but also can explain
all the effects described by the inner ones. In the bullseye of the target is ray
or geometrical optics. You encountered this material in OPT241. Some would
describe this as the “simplest" model of light. Simple just means the mathematical
model is based primarily on geometry and real analysis. Trust me, simple does
not mean easy by any means! Most of the results are derived via application of
Snell’s law for refraction n1 sin θ1 = n2 sin θ2 and the angle of reflection equals
the angle of incidence for light reflection. The previous rules can be derived
by an application of Fermat’s principle also called the principle of least time.
Quantitatively, the path traversed between two points P1 and P2 is determined
by minimizing the optical path length (which is analogous to finding the path that
takes the least time) between the two points:

    min_path (OPL) = ∫_{P1}^{P2} n(s) ds    (1.1)

where OPL means optical path length. Principles of least time occupy an interesting
place in the physical sciences (not just optics), but we will not have much
more to say about them in this course. Presumably, OPT241 provided a thorough
grounding in the consequences of Eq. (1.1). Notice by finding the path that
minimizes the optical path length we are making an explicit statement about light
propagation. For example, if n is constant throughout space, the minimum path
will be a line connecting P1 and P2. This line then represents a ray of light. The
resultant OPL in this case is n|P2 − P1|.
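As a small numerical sanity check (not part of the original notes; all values and helper names are illustrative), we can test Fermat's principle directly: for a ray crossing a flat interface between two media, minimizing the optical path length over the crossing point recovers Snell's law.

```python
import math

def travel_time(x, n1, n2, h1, h2, d):
    """Optical path length (proportional to travel time) for a ray that starts
    a height h1 above the interface, crosses it at horizontal position x, and
    ends a depth h2 below the interface at horizontal offset d."""
    return n1 * math.hypot(x, h1) + n2 * math.hypot(d - x, h2)

def minimize(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d_ = b - g * (b - a), a + g * (b - a)
        if f(c) < f(d_):
            b = d_
        else:
            a = c
    return (a + b) / 2

n1, n2 = 1.0, 1.5          # air to glass (illustrative indices)
h1, h2, d = 1.0, 1.0, 1.0  # geometry of the two endpoints

x_star = minimize(lambda x: travel_time(x, n1, n2, h1, h2, d), 0.0, d)

# Sines of the angles measured from the interface normal at the crossing point.
sin_t1 = x_star / math.hypot(x_star, h1)
sin_t2 = (d - x_star) / math.hypot(d - x_star, h2)

print(n1 * sin_t1, n2 * sin_t2)  # nearly equal: Snell's law recovered
```

The path of least optical path length bends at the interface exactly so that n1 sin θ1 = n2 sin θ2.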

Moving out from the center in Fig. 1.1, the next model of light is wave op-
tics. This is the purview of OPT261 and you are here to learn about the new
phenomena we encounter when light is upgraded from being a ray to being a
wave. Important in this discussion of waves is the notion of phase. Phase, and
in particular phase differences, manifest themselves in both interferometry and
diffraction. The mathematical machinery is also expanded to include complex
analysis, and so called phasors, as well as partial differential equations (algebraic
relationships the waves and their suitably arranged derivatives with respect to
space and time must satisfy) and multivariable calculus. The wider range of math
tools required for wave optics is what makes it more difficult than geometrical
optics. Since wave optics encompasses geometrical optics, it is able in principle to
describe and reproduce all the predictions of geometrical optics. That said, many
problems are more easily understood and solved approximately, but satisfactorily,
from a geometrical optics perspective. In fact, ray/geometrical optics is the limit
of wave optics when the wavelength of a wave (λ) is much smaller than the size of
the physical size of objects encountered by the wave.

Continuing outward, the next model encountered is electromagnetic optics.


Electromagnetic optics considers problems where the vectorial nature of light is
important in addition to its wavelike behavior. In OPT 262 you are introduced to
electromagnetic theory and Maxwell’s equations. The most important new fea-
ture to appear is light polarization. It is also possible to understand completely
how light propagates across boundaries that separate different materials.

Last but not least is quantum optics. Quantum optics is the most complete
theory of light. It can explain from first principles how light interacts with mat-
ter. It is remarkable in its predictions, but also challenging in how it changes
our physical conception of light. Recall that we mentioned for many centuries a
debate has raged regarding what light “is". In quantum optics we find the
following answer: light is a quantum object that can behave both as a particle
and as a wave! And, whether it seems to be a particle or a wave depends on how
you look at the light. This tension can be made evident in simple interference
experiments such as a Young’s two-pinhole interferometer. OPT223 also serves
as your first real encounter with photons. Photons can loosely be conceived as
particles of light. In OPT223 you will learn when particle descriptions of light
are useful and when wave descriptions are useful. As a rule of thumb, photons
make analyzing generation and detection problems easy whereas waves are more

suitable in describing propagation. This is by no means true in all situations, but


is typically useful for a first attempt in understanding a problem.

So in summary, we can catalog the different models of light and how they
describe light:

• Geometrical optics → light is a ray

• Wave optics → light is a wave (has a phase)

• Electromagnetic optics → light is a vector wave (polarization)

• Quantum optics → light is a quantum object (particle and wave)

This concludes our brief tour of how we model and understand light. Your job
moving forward, as you master these different models, is to learn when to apply
a particular model(s) to a given problem. Deciding which model to use is by no
means easy: it requires intuition about what limitations may arise in modeling the
system, and making the right decision comes only from experience that you
acquire through practice.

1.2 Systems approach to Optics


Even at this early stage I think it is useful to make some general observations
about features common to each model of light. Consulting Fig. 1.2, each model
of light provides the scientist and engineer with tools to mathematically model
phenomena associated with the generation, propagation and detection of light.
Important is that optical energy can be generated (in the sources), delivered (and
possibly modified) by the optical system to the detector, and finally the energy
that arrives at the detector is measured. Often the modification of this flow of

[Figure 1.2: block diagram. Energy flows from a SOURCE (laser, light bulb, sun)
through the OPTICS (lenses, mirrors, free space) to a DETECTOR (charge-coupled
device (CCD), eye); together these form the optical system.]

Figure 1.2 Block diagram of the elements that go into a system designed to transfer
optical energy from a source to detector. The system includes a light source, an optical
system and the detector.

optical energy is what carries information that we are interested in. In this course
(this may or may not be true in your other optics courses so pay attention!) I will
refer to the totality of the light source, the machinery that delivers the light from
the source to the detector 2 , and the detector as the optical system. The part of the
system that delivers the light from the source to the detector we will call the optics.
Each of the models that we will discuss makes a statement about the process of
light generation, how light propagates and interacts with optical components and
how light is detected. Later in 261 we will find explicit mathematical functions for
the waves that represent the light source properties, functions representing the
optical system and a formula to quantify the amount of energy in an optical wave
(this is important for light detection). We will emphasize situations where inter-
ference effects are critical to understanding the spatial and temporal distribution
of optical energy.

Some examples of optical systems that are relevant in optics are microscopes,
telescopes, Michelson interferometers - the list goes on and on. In each of these
systems it is possible to identify each of the component sub-systems in Fig. 1.3.
A simple example, which allows us to compare what we know from OPT241 with
what we will learn in OPT261, is a single-lens imaging system. The general system
consists of a self-luminous person that serves as the light source. This energy
propagates through the optical system that is composed of a free space segment, a
thin lens and then a second free space segment. If the system is arranged such that
1/l1 + 1/l2 = 1/f, then an image is formed on the plane located at l2. This image
could be seen on a sheet of paper placed at this plane or recorded by an electronic
imaging detector.
In Fig. 1.3(a), the OPT241 approach is the descriptive tool and the light energy
that emanates from each object point travels as straight lines through the optical
system. We can trace the rays from input to output to determine where the rays
arrive on the detector plane. Three convenient rays are illustrated. In OPT241, if
this is an ideal thin lens imaging system then each point in the object is mapped
to a point in the image. There is no limit to how small a feature of the object you
can resolve in the image. For example, if the size of the person shrunk and the
hand (end of arm) moved closer and closer to the foot (end of leg) in the object,
with an ideal 241 thin lens imaging system you could always resolve (i.e. clearly
distinguish) the hand and foot in the image of the object.
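The imaging condition 1/l1 + 1/l2 = 1/f is easy to exercise numerically. A minimal sketch (the function name and the particular distances are my own, for illustration only):

```python
def image_distance(l1, f):
    """Solve the thin-lens imaging condition 1/l1 + 1/l2 = 1/f for l2."""
    if l1 == f:
        raise ValueError("object at the focal plane: image forms at infinity")
    return 1.0 / (1.0 / f - 1.0 / l1)

# Object 300 mm in front of a 100 mm focal-length thin lens (illustrative numbers).
l2 = image_distance(300.0, 100.0)
print(l2)  # ~150 mm: the image plane sits 150 mm behind the lens

# The imaging condition is satisfied by construction.
assert abs(1.0 / 300.0 + 1.0 / l2 - 1.0 / 100.0) < 1e-12
```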
In contrast, in Fig. 1.3(b), each point in the self-luminous object generates a
wave. The curved lines illustrate the wave’s wavefront. We will define wavefronts
in the next chapter. What is important in this illustration is that energy from each
point in the object is spread out into a blur spot (a collection of points) in the
image. This blur is fundamental and present even with an ideal thin lens imaging
system. It is known as the diffraction limit and is a manifestation of the wave
nature of light. The diffraction limit constrains the smallest possible features in
an object we can resolve in the image to roughly the wavelength of the light generated
by the self-luminous object. What this means is as the hand and foot move closer
2 this machinery could include lenses, mirrors, free space propagation, etc

[Figure 1.3: two panels, each laid out as source → optics → detector. a) OPT241,
rays: rays from an object point pass through a thin lens of focal length f and focus
to a POINT at the image plane; object distance l1, image distance l2. b) OPT261,
waves: wavefronts pass through the same system and focus to a BLUR at the image
plane.]
Figure 1.3 Ray and Wave Models to Image Formation. a) OPT241 based on rays. All
rays focus to a point in an ideal thin lens system. b) OPT261 based on waves. In an
ideal thin lens imaging system all waves focus to a blur spot that is a manifestation of
interference and diffraction. The blur sets a fundamental limit to the resolving power
of an imaging system (the smallest features of an object discernible in an image).

and closer in the object, there is some minimum separation between the two
that allows the hand and foot to be clearly resolved in the image. If the hand and
foot become closer together in the object than this minimum distance then it is
not possible to distinguish the hand and foot in the image - they blur together
or can not be resolved. The previous is what we mean when we say diffraction
limited imaging resolution. In this course we will unravel how to determine the
resolution limits of imaging and general optical system architectures that use
light.
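The notes have not yet derived a quantitative resolution formula; that comes later in the course. Purely as a placeholder number, here is the standard Abbe estimate d ≈ λ/(2·NA), which is consistent with the claim above that resolution is set by the wavelength scale. The formula, function name and values are supplied by me for illustration, not taken from this chapter.

```python
def abbe_limit(wavelength, numerical_aperture):
    """Smallest resolvable feature by the common Abbe estimate d = lambda / (2 NA).
    NOTE: this standard rule of thumb is not derived in this chapter; it is
    quoted only to attach a number to the wavelength scaling."""
    return wavelength / (2.0 * numerical_aperture)

# Green light (550 nm) through a lens of numerical aperture 0.5 (illustrative).
d = abbe_limit(550e-9, 0.5)
print(d)  # 5.5e-07 m: features much below ~0.55 micrometres blur together
```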
This is an important point and is worth emphasizing:

Wave optics sets the fundamental performance limits of an optical
system. Said another way, all optical systems have a resolving
power that is fundamentally governed by the wave nature of light.
Chapter 2

Fundamentals of wave optics

In this chapter we begin our discussion of light waves. After some qualitative
discussions of wave phenomena a set of postulates that govern the behavior of
light waves is given. The necessity of these postulates is a result of not having
the full machinery of Maxwell’s equations at our disposal (this waits for OPT
262). With the postulates in hand the two simplest and most useful waves for
understanding wave phenomena are presented - plane waves and spherical waves.
Various approaches to representing these waves are discussed. As phase is a
conceptually new feature of waves we will spend time exploring it. Finally, the
propagation of waves through simple media will be presented. Additionally, since
the most economical way of representing waves leans on phasors a short review
of the necessary complex algebra rules are given in the last section of the chapter.

2.1 What is a wave?


Waves are all around us. Often the best way to determine, what something “is",
is by first considering some examples and then distilling from the examples
common features. These features can provide the foundation for a definition.
To get some feeling for waves and develop some intuition we can make a list of
common waves. A short and not exhaustive list of waves is

• water wave

• sound wave

• seismic wave

• arm wave in a sports stadium

• electromagnetic waves

• gravitational waves

• quantum waves


Can you think of more wave examples? A natural next question is what exactly is
“waving" in each of these. Returning to our previous list:

• water wave → water is the thing waving

• sound wave → coordinated pressure variations in air

• seismic wave → an elastic wave propagating through the earth itself

• arm wave in a sports stadium → the crowd’s arms

• electromagnetic waves → the electromagnetic field

• gravitational waves → space-time

• quantum waves → the quantum state

Figuring out what is waving in each case is not as trivial a question as it seems
and in fact initiated a revolution in physics at the end of the 19th century. Until
the late 1800’s, most waves were mechanical in origin and propagated through a
recognizable medium; think water, sound and seismic waves. It was natural to
assume the same held true for the newly discovered electromagnetic waves and
that these waves must propagate through a medium, the so-called ether. This
will take us too far off track, but in short trying to detect the ether not only led to
the invention of the Michelson interferometer (we will see this device in the next
chapter), but also ushered in the special theory of relativity.

From considering these previous examples it is clear that there are similarities
between each example (at least the first 4-5!). First something is moving through
space and continues to persist as time elapses. There is a shape or waveform that
is preserved as the wave moves. Maybe not as obvious is that each of these examples
also presents a way to transfer energy (the possibility to do work) between
separated locations in space. Leaning on the previous we make the following
qualitative definition for a wave:

A traveling wave is a self-sustaining disturbance that can transfer
energy between distant points.

As we progress through the material, we can always check back to make sure our
qualitative definition of a wave captures the physical wave properties we uncover.

The next task is to begin developing a mathematical formalism to quantify


wave motion. Put simply, we want a way to attach a number that characterizes
the wave strength for all positions in space and instants of time. At this point the
most general mathematical description we can give to a (scalar) wave is to assign
a function that catalogs the wave’s strength as a function of position and time:

u(r, t ) (2.1)

where r is a vector that denotes the point in space where the strength u is mea-
sured and t is the time it is measured. It turns out this function, u(r, t ), is too
general to describe wave motion. By appealing to some physical intuition re-
garding wave motion and consistency of measurements of the wave strength in
different reference frames (coordinate systems moving at constant linear relative
velocity to one another), it is possible to constrain the set of functions u(r, t ) to a
subset of functions that represents waves; so called wavefunctions. To do this we
will consider the wave function illustrated in Fig. 2.1. Note we will consider only
one dimension, but this reasoning would apply in 3D too.

The wave function in Fig. 2.1 is described mathematically as u(z′) = exp[−(z′ − z_o)²].
The previous function describes the wave’s strength at all positions and all
times in the wave’s frame of reference. This is illustrated in the top row of Fig. 2.1.
The wave’s frame of reference is plotted for two different times t = 0 and t = t_o. In
the wave’s frame (top row) the wave is fixed with respect to the z′ axis. In the lab
frame, bottom row, the wave moves along the coordinate axis z (parallel, but not
equal, to z′). The strength of the wave at location z′_a in its own frame is a. If this is
in fact a traveling wave in the lab frame (you are sitting at your desk watching the
wave move by), the wave appears to be moving with velocity v. See the second
row for snapshots of the wave strength at two different times in the lab frame.

[Figure 2.1: top row, the optical disturbance u(z′) in the wave’s frame, a pulse of
height a centered at z′_a, shown at two times; bottom row, the same pulse in the
lab frame as u(z, t = 0) and u(z, t = t_o), moving with velocity v from z_a to z_b.]

Figure 2.1 General wave function illustrated in two frames. The wave’s frame is the top
row of panels. In this frame the disturbance is stationary and fixed for all times. In the
bottom row, the wave is illustrated in the lab frame. The lab frame is at rest and the
wave moves with a velocity v through this frame.

The physical constraint on the general function u(r, t ), to make it a wave


function, emerges when we demand the distribution of strength be identical
regardless of what frame we inspect the wave from. At time t = 0, the wave
strength at location z′_a in the wave frame equals the strength of the wave at z_a
in the lab frame and z′_a = z_a. It is easy to see the same is true for all locations

labeled by z and z′. At time t = t_o, the wave has moved in the lab frame and at this
later time, the wave strength at z_a no longer equals the strength at z′_a. From the
snapshot it is discovered the wave strength at z_b now equals the wave strength at
z′_a. How are z′_a, z_b and t related? From geometry we find z_b = v t_o + z′_a in the lab
frame. There is nothing special about these two points and at a given time the
coordinates of the wave frame are related to the coordinates of the lab frame via
the relationship z = z′ + v t. The previous implies z′ = z − v t so that

    u(z′) → u(z − v t).    (2.2)

From the previous physical considerations we have constrained the set of func-
tions u(z, t ) to the subset describing wave motion u(z − v t ). In fact z − v t is not
the only allowable combination of space and time for a function to be a wave
function. There are four possible ways to express this constraint, each with a
slightly different meaning. They are

• z − v t → in time t the wave moves a distance vt in the positive z direction

• z + v t → in time t the wave moves a distance vt in the negative z direction

• t − z/v → moving a distance z in the positive z direction takes time z/v

• t + z/v → moving a distance z in the negative z direction takes time z/v

These particular combinations of space and time will hold true for all waves. So,
what did we just learn? First, we saw an example of how physical reasoning, the
strength of the wave should be independent of the given frame of reference -
the wave exists independently of the coordinate systems we use to describe it,
constrains the relationship between space and time in a wavefunction. Second,
given a general function of a single independent variable, we now know how to
turn this into a wave; simply replace the single independent variable with z − v t
(z +v t ) to arrive at a mathematical representation of a wave moving in the positive
(negative) z direction in the lab frame. When we have outlined the postulates
of wave optics we will also discover the permissible wave functions need to be
differentiable in a certain way. With this we are ready to outline the postulates of
wave optics.
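The recipe above, replace the single independent variable with z − vt, can be played with numerically. A short sketch (the Gaussian pulse and the specific numbers are illustrative choices of mine):

```python
import math

def f(s):
    """Arbitrary pulse shape in the wave's own frame (a Gaussian here)."""
    return math.exp(-s ** 2)

def u(z, t, v):
    """Promote f to a wave moving in the +z direction: replace s by z - v t."""
    return f(z - v * t)

v, t0 = 2.0, 3.0
# The strength found at z + v*t0 at time t0 equals the t = 0 strength at z:
# the whole waveform has simply translated by a distance v*t0.
for z in [-1.0, 0.0, 0.5, 4.0]:
    assert abs(u(z + v * t0, t0, v) - u(z, 0.0, v)) < 1e-12
print("pulse moved a distance", v * t0, "in the +z direction")
```

Using f(z + vt) instead would translate the pulse in the −z direction, matching the sign conventions listed above.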

2.2 The Postulates of Wave Optics


As discussed in Chapter 1 wave optics is an approximation to electromagnetic
optics. As such there are certain results that we need to take as postulates. These
postulates emerge from Maxwell’s equations (the mathematical rules governing
electromagnetic optics) and you will discover them in your course on electromag-
netic theory. In this section we state these postulates as they guide and constrain
our discussion moving forward.

2.2.1 The optical disturbance function a.k.a. the wave optics wave function
The optical disturbance function is the basic object that will occupy our attention
for the rest of the course/text and is the wave function of wave optics. This
function is the tool we use to quantify the strength of our optical disturbance in
space and time. Since a single function (not a set of functions, i.e. not a vector function)
quantifies the wave behavior, the optical disturbance in wave optics is a scalar
wave. We will use the following notation for the optical disturbance function:

u (r, t ) (2.3)

At a given moment of time t = t_o, the optical disturbance function u(r, t = t_o) assigns


a strength to the wave at each location in space r. Likewise, at a fixed location
in space r = ro , u(r = ro , t ) assigns a strength to each instant of time. In the
next subsection we will introduce a partial differential equation that constrains
the space-time behavior of the optical disturbance function – this is the wave
equation. Moving forward, we will use the term optical disturbance function
interchangeably with wave, wave function and light wave.

2.2.2 The wave equation


All waves satisfy an equation that connects their distribution through space and
evolution with time. In our qualitative definition presented in Section 2.1, the self-
sustaining and moving through space nature of a wave are captured quantitatively
by the wave equation. This 2nd order partial differential equation is derivable from
Maxwell’s equations and stated here without proof. The scalar wave equation for
our optical disturbance function is:

    ( ∇² − (1/c²) ∂²/∂t² ) u(r, t) = 0.    (2.4)

In the previous, c = c_o/n is the wave phase velocity and is equal to the speed of light
c_o (3 × 10^8 m/s) divided by the medium’s refractive index n. The differential
operator represented by the nabla symbol in Cartesian coordinates is
∇ = (∂/∂x, ∂/∂y, ∂/∂z) and is also expressible in other coordinate systems (for
example spherical coordinates). For the wave optics problems we consider we will deal with linear media
that are piecewise homogeneous (essentially the material can be broken into
slabs that are large with respect to the wavelength and have constant refractive
index).
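As an illustrative check (the 500 nm vacuum wave, step sizes, and sample point are my own choices, not from the notes), one can verify numerically that a right-moving harmonic disturbance satisfies Eq. (2.4) in one dimension, approximating the second derivatives with central finite differences:

```python
import math

c = 3.0e8                  # phase velocity in vacuum (n = 1)
k = 2 * math.pi / 500e-9   # wavenumber of a 500 nm wave (illustrative)

def u(z, t):
    """A right-moving harmonic wave, u = cos(k (z - c t))."""
    return math.cos(k * (z - c * t))

# Central finite differences for the two second derivatives at one point.
dz, dt = 1e-9, 1e-17
z0, t0 = 0.3e-6, 1e-15
d2z = (u(z0 + dz, t0) - 2 * u(z0, t0) + u(z0 - dz, t0)) / dz**2
d2t = (u(z0, t0 + dt) - 2 * u(z0, t0) + u(z0, t0 - dt)) / dt**2

# By Eq. (2.4) the residual should vanish up to discretization error.
residual = d2z - d2t / c**2
print(abs(residual) / abs(d2z))  # small relative residual
```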

The most important property of the wave equation for wave optics is that it
is linear and allows for superposition. What this means is that if you have two
optical disturbance functions, u 1 (r, t ) and u 2 (r, t ), that are solutions to Eq. (2.4)
then their sum, u S (r, t ), is also a solution to Eq. (2.4), u S (r, t ) = u 2 (r, t ) + u 1 (r, t ).
It is the superposition principle that gives rise to the range of interference and
diffraction effects encountered in wave optics.
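Superposition can be seen in miniature with two equal-amplitude harmonic disturbances that differ only by a phase δ: their sum is again harmonic, with amplitude 2a cos(δ/2), by the sum-to-product identity. A small sketch (function names and the particular δ are illustrative):

```python
import math

def u1(theta, a=1.0):
    return a * math.cos(theta)

def u2(theta, a=1.0, delta=math.pi / 3):
    return a * math.cos(theta + delta)

# For equal amplitudes, cos(x) + cos(x + delta) = 2 cos(delta/2) cos(x + delta/2):
# the superposition is again a harmonic wave whose amplitude depends on delta.
delta = math.pi / 3
for theta in [0.0, 0.7, 2.0, 5.0]:
    total = u1(theta) + u2(theta, delta=delta)
    predicted = 2 * math.cos(delta / 2) * math.cos(theta + delta / 2)
    assert abs(total - predicted) < 1e-12

# delta = 0 doubles the wave (constructive); delta = pi cancels it (destructive).
print(u1(0.0) + u2(0.0, delta=0.0))      # 2.0
print(u1(0.0) + u2(0.0, delta=math.pi))  # ~0.0
```

This phase-dependent amplitude is exactly the interference we will spend much of the course studying.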

2.2.3 A measure of energy flux


The reason for studying wave optics is to understand how the wave properties
influence the spatial-temporal distribution of light energy. At a basic level all
optical systems modulate the flow of optical energy in a useful way that permits
information exchange between remote locations. Therefore, it is paramount in
wave optics to be able to quantify the wave’s energy flux. Again, we need Maxwell’s
equations to justify this relationship, so we take as a postulate the energy flux in
the wave is
    I(r, t) = 2⟨u(r, t)²⟩_T = lim_{T→∞} (2/T) ∫_0^T u(r, t′)² dt′.    (2.5)

where I is the wave’s irradiance and carries units of [J/(s·m²)] (J is joules, s is seconds
and m is meters) and ⟨u(r, t)²⟩_T is the time average of the optical disturbance
function squared. This may seem overly complicated and we will find the calculation
of irradiance changes from a calculus thing to an algebra thing depending
on how we represent the optical disturbance. Finally, if we want the power in the
wave we integrate the irradiance over the area normal to the propagation direction,
i.e. P(t) = ∫ I(r, t) dA.

An obvious question at this point is why must we time average the optical
disturbance function to find its energy flux? It turns out for optical waves (from
the UV to the IR) the frequencies are in the 10’s to 100’s of THz which result in
periods shorter than a picosecond. Detectors for measuring the optical wave’s
flux are too slow to respond to these rapid oscillations and hence average the
instantaneous field irradiance. We will see this complicates determining the phase
of a wave.
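For a harmonic disturbance u = a cos(ωt) the time average of u² is a²/2, so Eq. (2.5) gives I = a². A numerical stand-in for the T → ∞ limit (the amplitude, frequency and sample counts are illustrative choices of mine):

```python
import math

a = 3.0
omega = 2 * math.pi * 5.0e14  # a ~500 THz optical temporal frequency

def u(t):
    return a * math.cos(omega * t)

# Average u^2 over many optical cycles: a discrete stand-in for the
# long-time average appearing in Eq. (2.5).
T = 1000 * (2 * math.pi / omega)  # 1000 periods
N = 200_000
avg = sum(u(n * T / N) ** 2 for n in range(N)) / N

I = 2 * avg
print(I)  # close to a**2 = 9.0: the irradiance of a harmonic wave is a^2
```

Note the fast oscillation at ω has averaged away entirely; only the amplitude survives, which is exactly why detectors report the slow envelope and not the optical-frequency phase.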

2.3 Simple Waves


In wave optics, there are two types of waves that we will use to represent the
optical disturbance function. At the outset, these are not the only scalar optical
waves that exist, but these are the simplest to deal with and are suitable to help
us understand interference and diffraction. Both of these waves are harmonic
in space and harmonic in time. What this means is the wave varies in space
with a well defined frequency (its spatial frequency) and varies in time with a well
defined frequency (its temporal frequency). We will try to be explicit about which
type of frequency we are thinking about by qualifying the word with either spatial
or temporal. This is not often done and after some experience with this material
it will be obvious which type of frequency is being considered.

2.3.1 Simple harmonic motion


Before dealing directly with the optical disturbance functions of waves – i.e. the
wave optics wave function, it is useful to recall the solution of the simple harmonic

oscillator from intro physics. The equation of motion for simple harmonic motion (this is analogous to our wave equation; it describes the oscillating system's dynamics) is

d²y(t)/dt² = −ω² y(t) (2.6)
where y is the displacement of the simple harmonic oscillator and ω is the angular
temporal frequency of the oscillator. It is easy to check that the following function
is a solution of Eq. (2.6)
y (t ) = a cos (ωt + δ) . (2.7)
A point that is worth emphasizing is that the displacement at time t has the same
units as the amplitude of the oscillation a, both [m]. The argument of the cos in
Eq. (2.7) is called the phase of the simple harmonic oscillator. Specifically,

θ (t ) = ωt + δ. (2.8)

where δ is called the initial phase angle. For each instant of time, the phase of
the simple harmonic oscillator quantifies where the system is located in a given
oscillation cycle. Figure 2.2 illustrates simple harmonic motion for δ = 0 in Fig. 2.2(a) and for δ < 0 in Fig. 2.2(b).

Figure 2.2 Simple harmonic motion. Simple harmonic motion [see Eq. (2.7)] is illustrated with zero initial phase angle (top panel) and a negative initial phase angle (bottom panel). The blue curve in the bottom panel leads the black curve (the black curve lags the oscillator described by the blue curve).

Referring to Fig. 2.2(a), we can ask what is the change in temporal phase between two displacement maxima. This requires

ωt_m = m2π → t_m = m2π/ω = mT (2.9)

where m is an integer (the subscript m on t denotes the times of maximum


displacement) and T is the temporal oscillation period. The temporal period
equals
T = 2π/ω = 2π/(2πf) = 1/f (2.10)
where we have introduced the linear frequency f [s−1 ] (and ω = 2π f ). The pe-
riod tells us how long it takes the harmonic oscillator to go through one cycle
and the linear frequency tells us how many cycles the oscillator makes in 1 second.
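These relations can be checked with a minimal numerical sketch (the oscillator's amplitude and frequency below are made-up illustrative values, not from the text):

```python
import math

def shm_displacement(a, f, t, delta=0.0):
    """Displacement of a simple harmonic oscillator, y(t) = a cos(wt + delta), Eq. (2.7)."""
    omega = 2.0 * math.pi * f          # angular temporal frequency, omega = 2*pi*f
    return a * math.cos(omega * t + delta)

# Illustrative numbers: a 2 Hz oscillator with 3 mm amplitude.
f = 2.0                 # linear frequency [1/s]
T = 1.0 / f             # temporal period [s], Eq. (2.10)
a = 3e-3                # amplitude [m]

# The motion repeats after one period: y(t + T) = y(t).
t = 0.123
assert abs(shm_displacement(a, f, t + T) - shm_displacement(a, f, t)) < 1e-12
```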

A final point is to consider the relative phase of two oscillations and identify
when one oscillation is ahead (leads) or is behind (lags) the second oscillation.
Phase differences are fundamental to wave optics. This point can be understood
by examining Fig. 2.2(b) and assuming the two oscillations refer to two different
harmonic oscillators. By consulting Eq. (2.8), we see a negative initial phase
angle requires we wait an additional time t_o = |δ|/ω to measure the same oscillator displacement value. Said a different way, the oscillator represented by the blue curve (maximum displacement at t = 0) leads the oscillator described by the black curve (maximum displacement shifted to a time t_o = |δ|/ω).

2.3.2 Phasors and simple harmonic motion


At this point it is useful to introduce the phasor representation of simple har-
monic motion. Simply put, a phasor is a recipe to associate a complex function
with a sinusoidal function. The utility of this representation becomes apparent
when we consider superposition of waves (adding sinusoidal functions is re-
placed/simplified by vector addition). Recalling the Euler identity (this relates the
complex exponential function with sinusoidal functions) it is possible to express
the displacement as (the time dependence of the phase θ is suppressed)

y(t) = a cos(θ) = ℜ[a e^{iθ}] = ℜ[a cos(θ) + i a sin(θ)] (2.11)

which leads to the following definition of a phasor

Y(t) = a e^{i(ωt+δ)} (2.12)

where i is the imaginary unit and Y is the complex function associated with y. We will use a capital letter to denote the complex function associated with the sinusoidal function that we represent with a lower case letter (in this case y and Y). Geometrically the phasor is a vector in the complex plane with its tail affixed to the origin and its length equal to a. The orientation of this vector with respect to the real axis at a given time is determined by the phase of the phasor.
Figure 2.3 plots the phasor in Eq. (2.12). It is important to specify that positive angles correspond to counter-clockwise phasor rotation. For a simple harmonic oscillator, with the phase defined in Eq. (2.8), as time evolves the phasor continuously rotates in the counter-clockwise direction around a circle of radius a with a temporal (angular) frequency of ω. The temporal phase accrued after one cycle of the oscillator is 2π. From this perspective, the phasor Y2 is a time-evolved version of the phasor Y1. Using the language of lag and lead, phasor Y2 leads phasor Y1.

Figure 2.3 Phasor Representation. Two phasors Y1 = a e^{iθ1} and Y2 = a e^{iθ2} illustrated in the complex plane.
In comparing Fig. 2.2 and Fig. 2.3 we see the variation in the harmonic oscillator's displacement is not a result of its amplitude changing (a is fixed), but is due to the variation of its phase. The phasor representation makes this clear. Again, at any instant of time, the displacement of the oscillator is found from the projection of the phasor along the complex plane's real axis [see Eq. (2.11)].
So, if the initial phase angle (δ) of the harmonic oscillator is 0, at time t = 0 the displacement is equal to a and the simple harmonic oscillator phasor is parallel to the real axis. At a later time t = π/(2ω) the displacement is zero (at this instant the harmonic oscillator's location is equal to its rest position) and this is consistent with the phasor being parallel to the imaginary axis in the complex plane.
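The phasor picture can be sketched numerically using Python's complex numbers; a minimal example (parameter values are illustrative, not from the text):

```python
import cmath
import math

def phasor(a, omega, t, delta=0.0):
    """Phasor Y(t) = a*exp(i(omega*t + delta)) associated with y(t) = a*cos(omega*t + delta)."""
    return a * cmath.exp(1j * (omega * t + delta))

a, omega = 2.0, 5.0                                   # illustrative amplitude and angular frequency
# The displacement is the projection onto the real axis: y(t) = Re[Y(t)].
assert abs(phasor(a, omega, 0.0).real - a) < 1e-12    # t = 0: phasor along real axis, y = a
t_quarter = math.pi / (2.0 * omega)                   # phase advanced by pi/2
assert abs(phasor(a, omega, t_quarter).real) < 1e-12  # y = 0, phasor along imaginary axis
```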

2.3.3 From simple harmonic motion to waves


With a clear understanding of harmonic motion in time, we can apply our al-
gorithm from Eq. (2.2) to turn this simple harmonic motion in time to simple
harmonic motion in time and space. Specifically, for a wave traveling along the
+z-direction we make the substitution
t → z/c − t (2.13)
where we have replaced the wave velocity v with the speed of light c since this
is the phase velocity of a light wave. This becomes evident from consulting the
wave equation. We will also begin to change notation for our optical waves and
substitute the optical disturbance function for the displacement y (t ) → u (z, t )
and the amplitude of oscillation with what we will define below as the complex
envelope a → u o . With these substitutions we have arrived at our first example of
a time and space harmonic wave
y(t) → u(z, t) = |u_o| cos(ω(z/c − t) + δ). (2.14)

The reason for the modulus of u_o is that the pair of numbers |u_o| and δ will specify the complex amplitude of the wave, denoted u_o. The phase of this wave is
θ(z, t) = ω(z/c − t) + δ (2.15)
and has both a temporal phase −ωt and a spatial phase ωz/c in addition to the initial phase angle δ. We will see in the next section that the previous wave is an example of a monochromatic plane wave. The reason for this nomenclature will be described in detail then, but we observe in passing that the equation ωz/c = b, where b is a constant (some number of radians), describes a plane parallel to the x − y plane located at z = cb/ω. Figure 2.4 plots Eq. (2.14) for the case z = 0 (Fig. 2.4 top panel) and for the case t = 0 (Fig. 2.4 bottom panel); δ = 0 for each plot.

Figure 2.4 Time and space harmonic wave. Top panel: u(z = 0, t). Bottom panel: u(z, t = 0).

Considering the discussion regarding simple harmonic motion, the angular frequency of the wave still relates to its linear frequency and period as ω = 2πf → f = 1/T.
The new bit in Eq. (2.14) is the spatial phase.

We focus on the spatial phase of Eq. (2.14) to understand the factor that multiplies the position variable z. From the bottom panel of Fig. 2.4 it is apparent that ωz_m/c = m2π determines the distances between maxima (crests) of the optical disturbance. Manipulating the previous expression we find

ωz_m/c = m2π → z_m = m(2πc)/(2πf) = m(c/f) = mλ (2.16)
where, as expected, the spatial phase advances by multiples of 2π when a spatial distance denoted by λ [m] is traversed. Since the constant λ quantifies the wave's spatial periodicity it is defined as the wave's wavelength. In Eq. (2.16) a


critical relationship in wave optics has been discovered. It relates a wave’s velocity,
wavelength and frequency
f λ = c. (2.17)
The previous equation relates temporal properties of the wave, captured by f ,
to spatial properties of the wave, cataloged by λ, and the two are related via the
wave’s speed c.
Next, in analogy to the temporal phase, the prefactor of z in the spatial phase is associated with the wave's spatial frequency (it carries units of [m⁻¹]).

(Margin note: If the wave propagates in a medium other than vacuum, the wave's speed and wavelength are reduced by the medium's refractive index. The wave's frequency is unchanged. One can understand this as a requirement of energy conservation, since a wave's frequency f is connected to the energy of the photons that constitute the wave. You will need to wait until quantum optics to understand this fully!)

We define the linear spatial frequency as

ν = f/c (2.18)

and the angular spatial frequency, the wavenumber of the wave, as

k = 2πf/c = ω/c = 2π/λ. (2.19)

Much like the relationship between temporal period and temporal frequency, the wavelength tells us how far through space we need to walk to find successive peaks in the optical disturbance strength and the spatial frequency tells us how many peaks we cross in traversing a unit length along the wave. With the wavenumber defined it is common to write the phase of the time and space harmonic wave as

θ (z, t ) = kz − ωt + δ (2.20)

where k is the wave’s wavenumber in angular units.

Question: How can we change this to a wave traveling in the −z-direction?

In the next section when we generalize our definition of plane waves we will
also consider the phasor representation of these waves.

2.3.4 Monochromatic Plane Waves


In 2.3.3 the time and space harmonic wave was introduced. This type of wave is called a monochromatic plane wave. Since the oscillation in time consists of a single frequency we call this wave monochromatic. (There are also polychromatic waves.) Since the spatial phase describes a planar surface this wave is called a monochromatic plane wave. The planar geometry of the wave's phase results in a succession of planar surfaces, separated by a distance equal to the wavelength, that yield the same optical disturbance strength. Surfaces separated by the wavelength change the spatial phase of the wave by 2π. The inverse of the wavelength defines the wave's spatial frequency.

Recall that the spatial phase of a time and space harmonic wave (when combined with the initial phase angle) can be expressed as φ(z) = ωz/c + δ = kz + δ where k is the wave's wavenumber. As was hinted in the last section, kz = b, where b is a constant (δ is absorbed into b), geometrically describes a plane parallel to the x − y plane located at z = b/k. This wave is a monochromatic plane wave traveling in the +z-direction. We can make one more observation regarding the spatial phase. We can ask what is the gradient of this scalar function. Remember the gradient returns the local normal vector to a surface of constant phase. The gradient of the plane wave's spatial phase is

∇φ(z) = (∂φ/∂x, ∂φ/∂y, ∂φ/∂z) = (0, 0, k) (2.21)

which is a vector along the z-direction with a magnitude equal to the wave num-
ber. This vector is called the wave vector and it is normal to the planar wavefronts
(in this case parallel to the z-direction). Two facts have been discovered about
monochromatic plane waves:

• The spatial phase of a plane wave describes a plane, hence the name plane
wave

• The normal direction to the plane is the wave’s wave vector

See Fig. 2.5a for one way to visualize monochromatic plane waves traveling in the +z direction assuming the initial phase angle δ = 0. The wave is visualized as
a set of parallel planes orthogonal to the z-axis. In the visualization the planes
are separated by multiples of the wavelength (mλ where m can be a positive or
negative integer) and are labeled with the constant value of spatial phase taken
across the entire plane. The wave vector is locally normal to each of these surfaces.
Figure (2.5)b displays a common way to illustrate monochromatic plane waves as
a set of parallel lines. The snapshot is taken at two different times. The top panel
is at time t = 0 and the bottom panel is at a time ωt = π. Notice in the bottom
panel the wave has advanced a distance equal to half a wavelength along the
z-axis. Also decorating the planes are the constant phase assumed across the
x − y plane for the given z-locations. From the phase it is clear these planes are
separated by a spatial distance equal to the wavelength. We are now ready to
define the wavefront of a wave.

The wavefront of a wave is a locus (connected collection) of spatial points


that share the same spatial phase at a given instant of time

So, a plane wave is a plane wave since its wavefronts are planar! This is captured
in the visualization of Fig. 2.5(a) and (b).
By construction, the time and space wave that was arrived at in Sec. 2.3.3 was
propagating along the z-direction. The optical disturbances we are discussing
have no preference about the direction they propagate along and we can easily generalize our equation representing a plane wave propagating along the z-direction to a plane wave propagating along any direction in space.

Figure 2.5 Monochromatic Plane Wave Wavefront Visualization. a) Planes of constant phase for a monochromatic plane wave propagating along the +z direction. The wavevector k is normal to these surfaces of constant phase. The dashed arrows indicate these planes cover the entire x − y plane for a fixed z value. b) A common way to visualize a monochromatic plane wave is sets of lines, separated by the wavelength, that have constant phase. In this illustration the constant phase is also indicated for the given z-location of the plane. The top panel is a snapshot of the wavefronts at time t = 0 and the bottom snapshot is at time ωt = π. Notice at this time the wavefronts have advanced half a wavelength along the z-axis. c) A monochromatic plane wave propagating along an arbitrary direction through space.

Here we can lean on our intuition from geometry and what we discovered for time and space harmonic waves to determine the form of a general monochromatic plane wave's spatial phase. The general equation of a plane in three dimensional space
is ax + by + cz − p = 0, where x, y, and z are the position coordinates and a, b, c and p are constants that determine the plane's geometric properties. Since the phase of the wave is a scalar quantity, it should be clear from the preceding discussion that the wave vector will still describe the direction of propagation, and for waves propagating in an arbitrary direction through space the wave vector will have three components. A natural generalization of the phase in Eq. (2.20) to account for
three dimensions is
θ (r, t ) = k · r − ωt + δ (2.22)
where the wave vector, wavenumber and wavelength are related via
|k| = √(k·k) = k = 2π/λ. (2.23)
The discussion surrounding Eq. (2.21) is applicable to the generalized spatial
phase and it should be clear the wave vector will point along the wave’s propaga-
tion direction in three dimensional space.

Before writing down the formula for a general monochromatic plane wave
we will make sense of the wave vector and its components. To do this it is useful
to introduce a vector space commonly referred to as k-space in wave optics.
This new mode of thinking will prove particularly powerful when we consider

diffraction and propagation of waves in optical systems. The main idea is to


think of a second vector space with each point in this vector space identifying
a potential wavevector. This should not be confused with the vector space we
use to represent positions in space. Figure 2.6 illustrates both the more common
position vector space (locations in this space are denoted by position vectors r)
and the wavevector k-space. Any point in each of these spaces needs three pieces
of information to locate it since the spaces are three dimensional.
The illustrations in Fig. 2.6 have made prominent three orthogonal axes. In position space, these are commonly labeled x, y and z and each axis also has an associated unit vector x̂, ŷ, and ẑ such that each point in position space is identified with a vector r_p = x_p x̂ + y_p ŷ + z_p ẑ. The vector algebra is the same for k-space and each point in k-space is identified with a vector k_p = k_xp k̂_x + k_yp k̂_y + k_zp k̂_z. In the previous expression we have been explicit about the point by adding the subscript p, but in the future the label p is suppressed. Here is a crucial point: we assume the Cartesian axes of these two vector spaces are parallel so that we can identify the unit vectors of each vector space. This identification lets us write k = k_x n̂_x + k_y n̂_y + k_z n̂_z. Physically this makes sense as we built up
our wave vector by asking for a quantity that coordinated itself with points in
position space in such a way as to generate planes. With this identification k · r
makes sense (we know how to take this vector dot product) and we see the spatial
phase is expressible as
ϕ (r) = k x x + k y y + k z z + δ (2.24)
where the initial phase angle is absorbed into the definition of the spatial phase.

Figure 2.6 Position space and k-space vector spaces. a) An illustration of the position space vector space. Points are located by three pieces of information. In this illustration a Cartesian coordinate system is adopted and each point r_p is identified by its three Cartesian components (x_p, y_p, z_p). Notice there are points in space and a vector associated with each point that has its tail at the origin and head at the point. b) An illustration of k-space. We will see each point in k-space corresponds to a permissible monochromatic plane wave. The wavevector can be expressed with respect to a Cartesian coordinate system, via spherical coordinates (not illustrated) or using direction cosines with angles (α, β, γ).

From Eq. (2.23) we see |k| = √(k_x² + k_y² + k_z²) = k.

There are a few more comments we can make regarding k-space. There are different ways to parameterize the vectors that inhabit k-space. We have seen one representation in Eq. (2.24) with the wave vector parameterized by its Cartesian components k = (k_x, k_y, k_z). It is also possible to use other orthonormal coordinate systems such as spherical coordinates. Again referring to Fig. (2.6)b, in spherical coordinates the wavevector is expressible as k = |k|(sin θ cos φ, sin θ sin φ, cos θ). In this representation the angles determine the

propagation direction (this is not explicitly illustrated in Fig. (2.6)b). This means if I give you θ and φ you will know the wave's propagation direction. A closely related representation of the wavevector also relies on angles, but in this third representation the direction cosines of the wavevector are used. The direction cosines are found by measuring the angle between the three Cartesian axes and the wavevector itself. Consulting Fig. (2.6)b it can be seen that the angle between n̂_x and k is α, between n̂_y and k is β and between n̂_z and k is γ. These angles satisfy the property cos²α + cos²β + cos²γ = 1. With these angles defined the wavevector can also be written as k = |k|(cos α, cos β, cos γ).
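The direction-cosine identity can be verified numerically; a short sketch with an arbitrarily chosen wavevector:

```python
import math

# An illustrative wavevector (components chosen arbitrarily) and its direction cosines.
kx, ky, kz = 3.0, 4.0, 12.0
k = math.sqrt(kx**2 + ky**2 + kz**2)          # |k| = sqrt(kx^2 + ky^2 + kz^2), Eq. (2.23)

cos_a, cos_b, cos_g = kx / k, ky / k, kz / k  # direction cosines cos(alpha), cos(beta), cos(gamma)

# The direction cosines of any vector satisfy cos^2(a) + cos^2(b) + cos^2(g) = 1.
assert abs(cos_a**2 + cos_b**2 + cos_g**2 - 1.0) < 1e-12

# Reconstructing k = |k| (cos a, cos b, cos g) recovers the Cartesian components.
assert abs(k * cos_a - kx) < 1e-12 and abs(k * cos_g - kz) < 1e-12
```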

After this exhaustive discussion of the spatial phase and wavevector we can
write down the real wavefunction for a monochromatic plane wave as

u (r, t ) = |u o | cos (k · r − ωt + δ) (2.25)

where all the parameters have been previously identified. To specify a monochro-
matic plane wave you need 4 pieces of information

• the amplitude of the optical disturbance

• its temporal frequency, wavelength or wavenumber (these are all related)

• its direction of propagation (this could itself contain 3 pieces of information!)

• its initial phase angle

With these quantities we can define the set of all monochromatic plane waves. An illustration of the wavefronts for a general monochromatic plane wave is presented in Fig. (2.5)c. Notice that to make a plot of the scalar optical disturbance function in Eq. (2.25) requires 5 dimensions! The optical strength u is one dimension, the position vector occupies three dimensions and time provides the last dimension.
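As a sketch of Eq. (2.25), the real wavefunction can be evaluated directly at a few points; the wavelength below is an arbitrary illustrative choice:

```python
import math

def plane_wave(u0, kvec, r, omega, t, delta=0.0):
    """Real wavefunction of a monochromatic plane wave, Eq. (2.25):
    u(r, t) = |u0| cos(k . r - omega*t + delta)."""
    k_dot_r = sum(kc * rc for kc, rc in zip(kvec, r))
    return u0 * math.cos(k_dot_r - omega * t + delta)

# Illustrative parameters: a wave propagating along +z.
lam = 500e-9                        # wavelength [m] (arbitrary choice)
k = 2.0 * math.pi / lam             # wavenumber
c = 299792458.0
omega = k * c                       # dispersion relation, omega = kc
kvec = (0.0, 0.0, k)                # wavevector along z

# Points one wavelength apart along z have the same disturbance strength.
u1 = plane_wave(1.0, kvec, (0.0, 0.0, 0.0), omega, t=0.0)
u2 = plane_wave(1.0, kvec, (0.0, 0.0, lam), omega, t=0.0)
assert abs(u1 - u2) < 1e-9
```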

It is instructive to check that the plane wave in Eq. (2.25) satisfies the wave
equation given in Eq. (2.4). This is done by substituting Eq. (2.25) into Eq. (2.4).

We find

(∇² − (1/c²) ∂²/∂t²) u_o cos(k·r − ωt + δ) = 0 (2.26)

→ k_x² + k_y² + k_z² = ω²/c² → ω = kc. (2.27)
The relationship ω = kc is an example of a dispersion relation. We see the wave
equation provides a constraint that the wavenumber and angular frequency satisfy. The two cannot be arbitrarily fixed but are related by the speed of light. Thinking of this constraint in k-space we see that the wavenumber is equal to the length of the wavevector and it defines a sphere of radius k = ω/c. One such
sphere is illustrated in Fig. (2.6)b. Each point on the sphere in k-space defines a
monochromatic plane wave with the same angular frequency, but propagating in
a different direction.
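The dispersion relation can be checked numerically: a plane wave built with ω = kc should make the 1D wave-equation residual vanish. A minimal finite-difference sketch (nondimensional illustrative values for k and c):

```python
import math

# Check numerically that u(z,t) = cos(k z - w t) satisfies the wave equation
# u_zz - (1/c^2) u_tt = 0 when w = k c (the dispersion relation).
k, c = 2.0, 3.0
w = k * c
u = lambda z, t: math.cos(k * z - w * t)

def second_diff(f, x, h=1e-4):
    """Central second-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

z0, t0 = 0.3, 0.7
u_zz = second_diff(lambda z: u(z, t0), z0)
u_tt = second_diff(lambda t: u(z0, t), t0)
residual = u_zz - u_tt / c**2
assert abs(residual) < 1e-5     # wave equation satisfied when w = k c
```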

Just like the simple harmonic oscillator, the monochromatic plane wave can
be expressed as a phasor and the prescription for doing this is identical to the
simple harmonic oscillator; recall Eq. (2.11). The first phasor we define is also
called the complex wavefunction. The complex wavefunction is defined as

U(r, t) = |u_o| e^{i(k·r−ωt+δ)} = |u_o| e^{iθ(r,t)} (2.28)

where the complex wavefunction and real wavefunction are related via

u (r, t ) = ℜU (r, t ). (2.29)

The complex wavefunction is a dynamic phasor since its phase changes as time
evolves. It is also useful to define a static phasor known as the complex amplitude.
The complex amplitude is defined as

U(r) = u_o e^{i(k·r+δ)} = u_o e^{iϕ(r)} (2.30)

where there is a slight abuse of notation since U is used for both the complex
wavefunction and complex amplitude. It will be clear from the context which
phasor is being used. Generally we will be primarily focused on the complex
amplitude.
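The relation u = ℜU in Eq. (2.29) is easy to confirm numerically; a minimal sketch with illustrative parameters:

```python
import cmath
import math

# Complex wavefunction U = |u0| exp(i(k z - w t + delta)) for a 1D plane wave
# along z [cf. Eq. (2.28)], and its relation u = Re[U], Eq. (2.29).
# All parameter values below are illustrative.
u0, delta = 1.5, 0.4
k, z, w, t = 2.0e6, 1.0e-6, 6.0e14, 1.0e-15

U = u0 * cmath.exp(1j * (k * z - w * t + delta))   # complex wavefunction (a dynamic phasor)
u = u0 * math.cos(k * z - w * t + delta)           # real wavefunction

assert abs(U.real - u) < 1e-12                     # u = Re[U]
assert abs(abs(U) - u0) < 1e-12                    # the phasor magnitude is the amplitude
```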

Visualizing plane waves can be accomplished in a variety of different ways.


The “simple" propagating wave we described in 2.3.3 is an example of a monochromatic plane wave propagating along the +z-direction. This wave is visualized in Fig. 2.7. The left panel of Fig. 2.7 presents snapshots of the optical disturbance at two different times (t = 0, top and t = T/4, bottom). Notice the wavefront parallel to the x − y plane at z = 0 and t = 0 has propagated along the +z-direction a distance l = cT/4 at time t = T/4. The two panels in the center are linecuts along x = 0 of this wave. Finally, the right panel presents the irradiance for the same linecut as displayed in the center panels. Since the irradiance is independent of time only a single snapshot is shown. This is to be expected as our detector averages over many oscillations in time of the optical disturbance. Finally, the top panel on the right illustrates the phasor associated with the complex wavefunction of a monochromatic plane wave. There is also a time-independent phasor associated with the complex amplitude of the wave.

A final point is in regards to u_o. When initially introduced it was assumed real, but it can be a complex quantity too and is called the complex envelope. The complex envelope is

u_o = |u_o| e^{iδ} (2.31)

where δ refers to the phase of the complex amplitude. The complex envelope as given is constant throughout space and time. It is possible for this amplitude to become time and/or space dependent. Examples include when we deal with optical waves that are beams or temporal pulses. We will typically write the complex envelope as a real quantity u_o and we will be explicit if it also has a phase.

2.3.5 Monochromatic Spherical Waves


In this section we describe the spherical wave. The discussion mirrors the plane wave discussion. We first define the real wavefunction and complex amplitude, then we compare this to the monochromatic plane wave.

(Margin note: A spherical wave is the type of light wave generated by a point source. Atoms are the canonical examples of point sources of light. If you go on to study the basics of light emission from atoms you will discover in the far-field atoms radiate spherical wavefronts with an amplitude that depends on the orientation of the atom's emission dipole.)

Figure 2.7 Plane wave visualization. Left: Top panel and bottom panel plot the optical strength in the x − z plane of a monochromatic plane wave propagating in the +z-direction. The bottom panel has advanced in time by t = T/4 resulting in a phase shift of π/2. Middle: Top and bottom are linecuts of the optical strength along x = 0. Right: The bottom panel is the irradiance along the same linecut (x = 0). The irradiance is constant throughout space and time for a monochromatic plane wave. Top panel: Phasor associated with the complex wavefunction.

The real wavefunction

for a monochromatic spherical wave, assumed to be located at the origin, is

u(r, t) = (|u_o|/r) cos(kr − ωt + δ) (2.32)

where as usual r = |r| and k = |k|. The complex amplitude of the spherical wave is

U(r) = (|u_o|/r) e^{i(kr+δ)} = (u_o/r) e^{iϕ(r)}. (2.33)

These formulas can be changed if the spherical source is not located at the origin by replacing r with |r − r_s|, where r_s denotes the location of the spherical source.
There are two major changes when comparing the monochromatic spherical wave to the monochromatic plane wave. First, the wavefront of a spherical wave is not planar, but is spherical. This can be seen by considering the spatial phase of Eq. (2.33) assuming the initial phase angle is 0. The surfaces of constant phase are determined by the equation kr = b where b is a constant number. Remembering r = √(x² + y² + z²) it is clear r = b/k describes a sphere with radius b/k. The gradient of the spherical wave's spatial phase returns the wavevector k = k n̂_r, which is everywhere normal to the spherical surface. Notice the wavevector is parallel to the unit radial vector in spherical coordinates. The second major change when compared to the monochromatic plane wave is that the complex envelope now depends on position. Notice
|u_o| e^{i∠u_o} → (|u_o|/r) e^{i∠u_o} (2.34)

and the complex amplitude of the wave decays as 1/r.
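A quick numerical sketch of this 1/r decay, and of the resulting 1/r² falloff of the irradiance |U|²; the parameter values are illustrative:

```python
import cmath

# Spherical wave complex amplitude U(r) = (|u0|/r) exp(i k r) [cf. Eq. (2.33),
# with initial phase angle 0]. Illustrative parameter values.
u0, k = 1.0, 2.0e6

def U_spherical(r):
    """Complex amplitude of a monochromatic spherical wave at radial distance r."""
    return (u0 / r) * cmath.exp(1j * k * r)

# The envelope decays as 1/r, so the irradiance |U|^2 decays as 1/r^2:
I1 = abs(U_spherical(1.0))**2
I2 = abs(U_spherical(2.0))**2
assert abs(I1 / I2 - 4.0) < 1e-9    # doubling r reduces the irradiance by 4x
```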

Figure 2.8 visualizes the wavefronts, the optical disturbance strength and
irradiance of a monochromatic spherical wave.

Figure 2.8 Visualizing monochromatic spherical waves. a) Monochromatic spherical wave's wavefronts in the x − z plane. The circles are the wavefronts, i.e. loci of constant phase. b) The optical disturbance strength along the direction r identified in a). Note this is the same for all directions in the x − z plane. The wave's complex envelope is u_o/r and its initial phase angle is 0. c) The irradiance associated with the optical disturbance in b) along the direction r.

2.3.6 Irradiance revisited


Calculating the irradiance simplifies when we work with the complex wavefunc-
tion. From the definition of irradiance found in Eq. (2.5) and the definition of

a monochromatic wave (planar or spherical will work, it depends on ϕ (r) - the


spatial phase) we find

lim_{T→∞} (2/T) ∫₀ᵀ u(r, t′)² dt′ = lim_{T→∞} (2u_o²/2T) ∫₀ᵀ [1 + cos(2ωt′ + 2ϕ(r))] dt′ = u_o² (2.35)

where T is the response time of the detector. The detector response time T determines the fastest signal a detector can faithfully reproduce. The fastest optical detectors are on the order of 100 GHz (1 GHz = 10⁹ Hz) and the frequency of optical waves is a few hundred THz (1 THz = 10¹² Hz). This means that there are more than 10³ cycles of the optical wave during the detector response time T. As a result of this, optical wave detectors integrate the instantaneous intensity over the detector response time and only report an average value. The limiting procedure in Eq. (2.35) ensures mathematically the integral converges to a well-defined value.

Calculating the magnitude squared of the complex wavefunction given in Eq.


(2.28) we find
U (r, t )U (r, t )∗ = |U (r, t ) |2 = u o2 (2.36)
where U∗ is the complex conjugate of the complex wavefunction. What we have discovered is that we can dispense with time averaging the real wavefunction and instead take the magnitude squared of the complex wavefunction to find the wave's irradiance. Specifically, we can take advantage of the fact that

2〈u (r, t )2 〉T = |U (r, t ) |2 = u o2 (2.37)

to calculate the irradiance. In using this equation we will need to be cautious


if U results from the superposition of two monochromatic waves of different
frequencies. This will be addressed in Chapter 3.
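Equation (2.37) can be checked by brute force: numerically time-average the real wavefunction over many cycles and compare with |U|² = u_o². A minimal sketch (illustrative values, nondimensional units):

```python
import math

# Compare the time-averaged irradiance 2<u^2>_T with |U|^2 = u0^2, Eq. (2.37),
# by midpoint-rule averaging over many full cycles. Parameter values illustrative.
u0 = 1.3
w = 2.0 * math.pi                  # angular frequency chosen so the period is 1 (arbitrary units)
phi = 0.7                          # a fixed spatial phase value

N = 50_000
T_int = 1000.0                     # integrate over 1000 full cycles
dt = T_int / N
avg = sum(u0**2 * math.cos(phi - w * (n + 0.5) * dt)**2 * dt for n in range(N)) / T_int

assert abs(2.0 * avg - u0**2) < 1e-3    # 2<u^2>_T matches |U|^2 = u0^2
```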
For the complex amplitudes we have defined, the monochromatic plane and
spherical waves, the wave’s phase information is lost in evaluating the irradiance
and only variation due to the magnitude of the complex envelope is possible (e.g.
the r −1 dependence of the spherical wave). Focusing on a monochromatic plane
wave (assuming the complex envelope is real) we find

U(r, t)U(r, t)∗ = u_o e^{iθ(r,t)} u_o e^{−iθ(r,t)} = u_o² (2.38)

for the irradiance. Notice in Eq. (2.38) the monochromatic plane wave's phase does not influence the irradiance. So, although we found the phase is responsible for the variation in optical strength for a plane wave, it does not impact the irradiance. This turns out to be a generic challenge in measuring optical waves and fields: accessing the phase information. Recovering the phase of an optical wave is an ongoing challenge in optical measurements, and many techniques for phase retrieval leverage interference effects.

2.4 Waves and Materials


Waves can propagate through a variety of different materials. Some examples are
free space (refractive index of 1) and glass (refractive index of 1.52). Materials
influence wave propagation in many ways. From the perspective of our course
on wave optics there are two important features that will concern us. First, a wave that encounters a material discontinuity will have its amplitude split into a reflected and transmitted portion. Beamsplitters exploit this fact. In addition to redistributing the wave's amplitude, the wave's phase is also modified. Phase can be changed both in reflection and via propagation. Propagation phase modification can be harnessed to shape wavefronts for light focusing (a lens does this!) or to intentionally delay wavefronts. In what follows we examine how propagation through a material influences a wave's phase. We will consider reflection and transmission in a later chapter.

[Figure 2.9, left: wavefronts at z = 0 and z = L for paths through free space (n = 1) and through the slab of index n; lower left: spatial phase along x at planes A, B, C. Right: phasor diagram with U_1 at angle kL and U_2 at angle nkL, separated by (n − 1)kL (positive angles = counter-clockwise rotation; U_1 leads, U_2 lags).]

Figure 2.9 Plane wave propagation through a planar slab. Left: The panel represents
the wavefronts of a monochromatic plane wave propagating through a distance L of
free space and through a medium of refractive index n. The spacing of the wavefronts
is determined by the wavelength (λ). In the medium, the distance between wavefronts
is contracted to λ/n. Right: Phasor diagram for the two complex amplitudes associated
with each path at z = L. Phasor 2 lags phasor 1 by an amount equal to (n − 1)kL. Also,
although wavefronts stay connected through space, walking along the x-direction
there is a pronounced jump in the spatial phase.

Figure 2.9 illustrates the wavefront of a monochromatic plane wave, propa-


gating along the positive z-direction, that encounters a piece of material with
refractive index n and thickness L. The surrounding medium has a refractive
index of 1. For illustration, the middle part of the wavefronts are connected with
dashed lines. In this discussion we want to develop intuition for how a material
delays a wave. Recall that in a material both a wave’s phase velocity and its wave-
length decrease in such a way that its frequency remains constant. This fact will
enable us to discover how the wave’s phase is modified. Notice, the wavefronts
in free space run ahead of the wavefronts in the medium. Therefore, if I walk

a distance L outside the medium I encounter fewer crests than if I walk through
the medium. With this perspective, there is more spatial phase acquired by the
wave propagating through a medium of length L as compared to the same wave
outside the medium. Phasors can be used to quantify this intuition.
First, in propagating a distance L a monochromatic plane wave’s complex
amplitude acquires a phase

U_1(z = L) = u_o e^{ikL} (2.39)

and by the same reasoning the spatial phase acquired in propagating through a
medium of the same thickness with refractive index n is

U_2(z = L) = u_o e^{inkL}. (2.40)

Notice in both Eq. (2.39) and (2.40) the waves propagate a physical distance L
(a geometric length), although in the second case the acquired phase is nkL.
Recall, the physical distance scaled by the refractive index is called the optical
path length. These are distinct “lengths". We can also ask, what is the relative
phase of the waves in Eq. (2.39) and (2.40) at the location z = L. This is

U_2(z = L)/U_1(z = L) = (u_o e^{inkL})/(u_o e^{ikL}) = e^{i(n−1)kL}. (2.41)

where the phase difference is (n − 1)kL, which is the same as k multiplied by the
difference in optical path lengths of the two waves. This makes sense, as we have
said the wavenumber multiplies distances (optical path lengths) to return a phase.

The discussion of this paragraph is important. Notice the second wave has
acquired a relative phase of (n−1)kL as compared to the wave that has propagated
through free space. It is useful to consider the previous in the context of phasors.
The right panel of Fig. 2.9 presents a phasor diagram that contains the complex
amplitudes associated with each of the waves. If we assume the phasors are
aligned with the positive real axis when t = z = 0, then the orientation of each
phasor catalogs the phase acquired by each wave, and the difference is the relative
phase difference between the two waves at z = L (we are ignoring the fact that the
phasors rotate counter-clockwise as time evolves and are only bookkeeping for
the phasor rotation due to spatial phase accumulation). Since nkL > kL, it is clear
phasor 2 acquires more phase than phasor 1. From our earlier discussions this
means phasor 2 lags phasor 1, or phasor 1 leads phasor 2. This also makes sense
with our intuition. If we labeled the wavefront at z = 0 at time t = 0, we would
have to wait longer for the wave that propagates through the material to arrive at
z = L as compared to the wave that propagates the same distance through free
space. Why do we wait longer? The phase velocity in the medium, v = c/n, is
slower than the free-space phase velocity, v = c, and it takes longer for the wave
to travel through the medium. Also, notice in the lower right panel of Fig. 2.9 if
you walk along a

direction orthogonal to the propagation direction (in the illustration x), there is a
pronounced jump in the spatial phase in going from the high index medium into
free space.
So we have found in propagating through a medium of refractive index n, an
optical wave's phase is modified by the medium, and we are led to make the
following observations:

The physical path length, PL, traveled by a wave equals the physical distance
traveled; what you would measure with a ruler.

The optical path length, OPL, traveled by a wave equals the physical dis-
tance traveled multiplied by the refractive index.

The acquired phase is determined by multiplying the OPL by the wavenumber.
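These three observations can be turned into arithmetic. A minimal sketch, with illustrative numbers (a 1 mm slab of n = 1.52 glass at λ = 500 nm):

```python
import math

lam = 500e-9   # vacuum wavelength (illustrative)
n   = 1.52     # refractive index of the slab
L   = 1e-3     # physical (geometric) path length, 1 mm

k = 2 * math.pi / lam      # vacuum wavenumber

PL  = L                    # physical path length: what a ruler measures
OPL = n * L                # optical path length: index times distance

phase_free   = k * PL      # phase acquired over L in free space
phase_medium = k * OPL     # phase acquired over L in the medium

# relative phase between the two waves at z = L, Eq. (2.41)
delta = phase_medium - phase_free
assert abs(delta - (n - 1) * k * L) < 1e-6
```

The relative phase (n − 1)kL is exactly k times the optical path difference (n − 1)L, as stated above.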

A general observation about traveling waves can now be made. Equation
(2.21) reveals that the gradient of the wave's spatial phase locally defines the wave
propagation direction. For a plane wave, each location on a given wavefront, at a
particular instant of time, has the same gradient, so the wave vector (direction and
magnitude) is identical. So, as time progresses the wave continues to propagate
with planar phase fronts. If instead, at a given instant of time, across a given plane
the spatial phase were a nonuniform function of position (i.e. not constant for all x
and y values at a given z location), then the subsequent propagation would reflect
this nonuniform phase. This is one of the essential points of wave optics – the
spatial phase of the wave tells you how the wave is propagating. Said another way,
the wave's spatial phase encodes where the wave is going. We will highlight this
as we consider wave propagation through media that distort the wavefunction.

2.5 Connecting monochromatic plane waves and spherical waves
To this point we have discussed the two simplest waves of wave optics, the
monochromatic plane wave and the monochromatic spherical wave. Don't be
fooled by the simplicity: a vast majority of wave optics effects, in particular in-
terference and diffraction, can be appreciated by adding these two waves up in
the right way. And, although this chapter has presented these waves as distinct,
there are two “natural" ways to connect planar and spherical monochromatic
waves.

The first connection is provided by simply investigating a small cross-section
of the complex amplitude generated by a spherical wave at a distance far removed
from the source location. This is illustrated in Fig. 2.10a. The illustration corre-
sponds to the physical situation of having a distant point source of light delivering
optical energy to an optical system. This could be an atom radiating light that

is collected by a lens, or perhaps a telescope gazing out at a star in space. For the
ensuing discussion it is assumed the optical system has a circular entrance pupil
with radius ∆a (the entrance pupil may also be simply called the aperture in what
follows). Our goal is to use some mathematical tools to quantify how planar the
plane wave delivered to the entrance pupil of the optical system by the point
source is. Said another way, we will be able to quantify when it is reasonable to
approximate the incoming optical energy as being delivered by a plane wave.
What this all means is we need to figure out how to turn the complex amplitude
of the spherical wave into the complex amplitude of a plane wave. (In a later
chapter we will discover that the finite size of the star, which is not really a point,
influences the regularity of the wavefront the telescope receives.)
To get started, we of course assume the point source generates a spherical
wave - the wavefronts are spherical as discussed in Sec. 2.3. In moving from left


Figure 2.10 Relating spherical waves and plane waves. a) A spherical point source
delivers a planar wavefront to the entrance pupil of an optical system provided the
distance z = L_z is far from the point source. This is the origin of the phrase “a
plane wave is generated by a spherical wave at infinity." In between, the wavefront is
paraboloidal. b) Focusing a plane wave (left to right) or collimating a spherical wave
(right to left). c) Geometry for finding the transmission coefficient of a thin lens.

to right the distance from the point source, z = L z , increases. When L z becomes
large and we consider points x and y close to the z-axis the spatial phase of a
spherical wave can be expressed as

kr = k(L_z^2 + x^2 + y^2)^{1/2} = kL_z(1 + (x^2 + y^2)/L_z^2)^{1/2}. (2.42)

Why the previous rearrangement of the spatial phase? We are building a line of
reasoning that involves the physical intuition provided by the illustration. From
the illustration, it appears that points in the transverse plane close to the optical
axis will provide the domain over which the wavefront can be approximated as
planar. Notice in Eq. (2.42) the second term in the square root is a ratio of
something small to something big; maybe an approximation can be used to
simplify this spatial phase function? (Recall the wavevectors in wave optics are
co-linear with the rays of ray optics. The nearly on-axis points we are considering
correspond to nearly on-axis wavevectors or rays.)
The last bit necessary to make progress on the given problem is a math tool.
The tool is the binomial expansion

(1 + x)^n ≈ 1 + nx + n(n − 1)x^2/2! + n(n − 1)(n − 2)x^3/3! + .... (2.43)
valid for |x| < 1. In Eq. (2.42) we now see that since the second term is small it
is indeed reasonable to employ the binomial expansion. The result of utilizing Eq.
(2.43) is

kr = kL_z(1 + (x^2 + y^2)/L_z^2)^{1/2} ≈ kL_z + k(x^2 + y^2)/(2L_z). (2.44)

where only the first two terms in the binomial expansion, with n = 1/2 and
x = (x^2 + y^2)/L_z^2, have been retained. Notice the approximated spatial phase
consists of two terms. The first term

kL_z (2.45)

is referred to as the linear phase. Why? The spatial phase is linearly dependent on
L_z. The second term is

k(x^2 + y^2)/(2L_z) (2.46)
and is called the quadratic phase. Quadratic phases are extremely important and
will be encountered many times in wave optics. Notice, at the level of approxi-
mation in Eq. (2.44), in the region between the source origin and the entrance
pupil plane, the wavefronts are parabolas and this wave is called a paraboloidal
wave. The second term on the right hand side in Eq. (2.44) describes a parabolic
or quadratic phase variation and when added to the constant linear phase the
resultant wavefront is still a parabola.
With Eq. (2.44) it is possible to determine both under what conditions the
optical system’s entrance pupil receives a plane wave and what is the deviation
from planarity. For the spatial phase in Eq. (2.44) to be linear the second term
must be small. Quantitatively

k(x^2 + y^2)_max/(2L_z) << π  →  N_F = ∆a^2/(λL_z) << 1 (2.47)

the quadratic phase must be much smaller than π for it to be negligible in the
entrance pupil. In Eq. (2.47), the maximum transverse distance in the entrance
pupil (orthogonal to the optical axis) has been denoted by ∆a (for a circular en-
trance pupil this is the pupil radius). After rearranging the algebraic constraint in
Eq. (2.47), the Fresnel number N_F is defined. In this new statement, small
quadratic phase is equivalent to having a Fresnel number much less than 1. (The
bound π is chosen by convention, since this is the phase most different from 0.)
Therefore, given
the entrance pupil extent, wavelength and source pupil distance we can deter-
mine when the quadratic phase variation of the wavefront is negligible so that the

wavefront received by the optical system (see Fig. 2.10a) is approximately planar
(only the first term on the right hand side in Eq. (2.44) contributes). What is the
maximum phase variation across the entrance pupil for a given N_F? Just evaluate
Eq. (2.46) at the pupil edge. The result is πN_F!
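The planarity test is easy to evaluate numerically. A sketch with illustrative numbers (a 1 mm radius pupil, 10 m from the source, at λ = 500 nm):

```python
import math

lam = 500e-9   # wavelength (illustrative)
Lz  = 10.0     # source-to-pupil distance, 10 m (illustrative)
da  = 1e-3     # entrance-pupil radius, 1 mm (illustrative)

k = 2 * math.pi / lam

# Fresnel number, Eq. (2.47): small N_F means a nearly planar wavefront
NF = da**2 / (lam * Lz)

# maximum quadratic-phase deviation across the pupil is pi * N_F
max_dev = k * da**2 / (2 * Lz)
assert abs(max_dev - math.pi * NF) < 1e-9

# sanity check against the exact spherical phase at the pupil edge
exact_dev = k * (math.sqrt(Lz**2 + da**2) - Lz)
assert abs(exact_dev - max_dev) < 1e-3   # binomial expansion is accurate here
```

Here N_F = 0.2, so the wavefront deviates from planar by about 0.63 rad (πN_F) at the pupil edge; shrinking the pupil or moving the source farther away drives N_F, and hence the deviation, toward zero.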
A second connection between spherical and planar wave fronts is provided
by a thin lens. This is illustrated in Fig. 2.10b. In this illustration, in going from
left to right a planar wavefront is focused by the lens. Focusing converts the
planar wavefront to a spherical wavefront. Similarly, in going from right to left,
a spherical wave emitted by a point source is converted into a collimated plane
wave by a thin lens. For this reason it is often said a lens is able to make the
plane at infinity accessible in an optical system. To use a lens in wave optics it
is important to understand the phase shift introduced by a thin lens. The phase
shift can be determined by considering Fig. 2.10c. Assuming the lens is centered
on the z-axis, has a refractive index n and a focal length f we can determine the
phase shift for each ray, propagating parallel to the z-axis, that encounters the
spherical surface at location x, y. The path length is

d_o − d(x, y) = d_o − (f − z') = d_o − (f − (f^2 − x^2 − y^2)^{1/2}) ≈ d_o − (x^2 + y^2)/(2f) (2.48)
where we have assumed f >> x, y and made the paraxial approximation. The
phase is found to be

≈ kd_o − k(x^2 + y^2)/(2f) (2.49)
which consists of a constant phase we will ignore and a second quadratic phase.
Assuming the lens has a unity power transmission (no loss or reflection for the
wave propagating through it) we can write the transmission function for the lens
as

t_l(x, y) = e^{−ik(x^2 + y^2)/(2f)}. (2.50)
which is by assumption a pure phase transformation on the wave that passes
through the lens. The negative sign in the lens phase indicates that the more
off-axis the wavefront location, the more advanced it is with respect to the lens
center. This is physically sensible since the wave takes longer to pass through the
thicker center of the lens when compared to the thinner portions. The nonuniform
phase delay across the lens is what allows it to convert planar wavefronts to
spherical wavefronts and vice versa (see Fig. 2.11).
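As a sketch of Eq. (2.50) in code (the wavelength and focal length are illustrative choices):

```python
import cmath
import math

lam = 633e-9          # wavelength (illustrative)
f   = 0.1             # focal length, 10 cm (illustrative)
k   = 2 * math.pi / lam

def t_l(x, y):
    """Thin-lens transmission function, Eq. (2.50): a pure quadratic phase."""
    return cmath.exp(-1j * k * (x**2 + y**2) / (2 * f))

# a unit-amplitude plane wave picks up exactly the lens phase, Eq. (2.51)
for x, y in [(0.0, 0.0), (1e-3, 0.0), (1e-3, 1e-3)]:
    U_out = t_l(x, y) * 1.0
    # magnitude unchanged: the lens only reshapes the phase
    assert abs(abs(U_out) - 1.0) < 1e-12
```

The unit magnitude everywhere confirms the pure-phase assumption; all of the lens's action is in the quadratic phase profile.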
In summary, as illustrated in Fig. 2.11a, a thin spherical lens results in a plane
wave being focused to a focal spot. In passing through the lens (assuming the
lens center is coincident with the coordinate origin) the incoming wave's complex
amplitude gets modified according to

U_out(x, y) = t_l(x, y)U_in(x, y) = U_in(x, y)e^{−ik(x^2 + y^2)/(2f)} (2.51)

which for a plane monochromatic wave equals U_o e^{−ik(x^2 + y^2)/(2f)}. Although the converg-
ing wavefronts are spherical, we can also investigate the phase of the converging

Figure 2.11 Focusing a plane wave. a) A plane monochromatic wave is focused by a
thin spherical lens. The planar wavefronts are converted to spherical wavefronts. b)
Spatial phase across the plane indicated by the dashed line in a).

wave across the plane indicated by the vertical dashed line in Fig. 2.11a; we find
the result plotted in Fig. 2.11b. These are two different quantities related to a
wave's phase. We see the converging spherical wavefronts result in a spatial phase
distribution, across a plane perpendicular to the optical axis, whose contours are
concentric circles. The particular slice plotted has the location x = y = 0 coincident
with an anti-node of the converging wave for the instant of time selected.

2.6 Huygens’ principle and wave propagation


We close this chapter on waves with a brief description of wave propagation pio-
neered by Christiaan Huygens in the 1600s. His description of wave propagation
for a plane wave is illustrated in Fig. 2.12a. The idea is physically motivated by
the following. Given a wavefront (the primary wave) at a given time (labeled
t1 in the illustration), we can determine the wavefront at a later time (labeled t2)
by assuming the primary wavefront is comprised of fictitious secondary spherical
wave radiators. The wavefront at t2 is constructed by finding the tangent surface
to each of the constituent spherical waves, also called wavelets. For the example
in Fig. 2.12a plane wavefronts continue on as plane wavefronts.

We will see Huygens' notion of spherical secondary sources is particularly
powerful in studying wavefront splitting interferometers and diffraction. Consider
Fig. 2.12b. If an opaque screen, with a small pinhole, is placed in front of a wave
(in this case a plane wave), the pinhole selects only one spherical wavelet from
the wavefront. This is the only bit of the wave that can continue to propagate
to the right of the screen. It is easy to imagine now adding multiple pinholes in
the screen or even finite size openings. Understanding that in each of the previous
scenarios only a subset of secondary spherical waves add to the resultant wave
provides a physical explanation for interference (the former) and diffraction (the
latter).


Figure 2.12 Plane wave propagation via Huygens’ principle.

Chapter 3

The interference of two waves

The stage has now been set to explore what happens when we start to add up
waves. As we said in Ch. 2, since the wave equation is linear it obeys the
principle of superposition. This is just a fancy way of saying if we know multiple
solutions to the wave equation then we know their sum is too! It is this fact that
makes interference possible in wave optics. A second point to recall is that the
optical disturbance function’s, i.e. the wavefunction’s, strength was found to vary
as a result of its phase changing both in position and time. Interference occurs
when we superpose two waves with a relative phase difference. In this chapter
we will concern ourselves with the interference of two waves. Later in the book we
will relax this constraint. By focusing on the interference of two waves we can gain
intuition and understanding without the complication of added mathematical
clutter such as summations and integrals. In this chapter, after some general
discussion of two-wave interference, we will consider the interference of the
specific elementary waves we introduced in Ch. 2. This will provide a foundation
to investigate interferometers in the next chapter and many wave interference
effects as well as diffraction. Remember, wave optics is fundamentally about
understanding how to add up elementary waves!

3.1 Interference: general considerations


Interference occurs when we add up waves. Interference is the manifestation of
adding up two waves that have a relative phase difference. Interference modu-
lates the space-time distribution of the total optical strength and irradiance when
compared to that of the constituent waves that are interfering. As we saw in
Eq. (2.37) the irradiance can be determined from the complex wavefunction so
in the interference problems we consider we will work at the level of complex
wavefunctions. Before embarking on a discussion of interference it is important
to recognize that irradiance, unlike the optical disturbance function, does not
satisfy a wave equation. As such, irradiance does not generally share the features
of a traveling wave that we expect. (Important: irradiance is not a wave!)
In this first section we will derive a general form of the interference equation

and then apply it to a variety of scenarios. The result we derive is fundamental to
wave optics and should be thoroughly understood. If you appreciate, understand,
and can derive this result, the next two chapters should be a breeze. To capture
both plane and spherical waves, we will write the complex wavefunction as

U(r, t) = |u_o(r)|e^{iθ(r,t)} (3.1)

where we have added a spatial dependence to the complex envelope modulus
|u_o|. To observe interference, we need to add up two waves. We define the total
complex wavefunction as

U_T(r, t) = U_1(r, t) + U_2(r, t) = |u_1(r)|e^{iθ_1(r,t)} + |u_2(r)|e^{iθ_2(r,t)} (3.2)

where the subscript refers to wave 1 and wave 2 and the complex envelope’s initial
phase angle (if there is one) is absorbed into the wave’s total phase. Although we
will not pursue this point, if both phasors rotate with the same temporal frequency,
we can find a third phasor that rotates with the same temporal frequency that has
a new complex envelope and spatial phase.
To find the irradiance associated with Eq. (3.2) we need to find the magnitude
squared of the total complex wavefunction. Explicitly we calculate

|U_T(r, t)|^2 = |U_1 + U_2|^2 = U_1U_1^* + U_2U_2^* + U_2U_1^* + U_1U_2^* (3.3)

where the space and time dependence of Ui has been suppressed for simplicity.
Expanding each one of the four terms we find

U_1U_1^* = |u_1(r)|^2 (3.4)
U_2U_2^* = |u_2(r)|^2 (3.5)
U_2U_1^* = |u_2(r)||u_1(r)|e^{i[θ_2(r,t)−θ_1(r,t)]} (3.6)
U_1U_2^* = |u_1(r)||u_2(r)|e^{−i[θ_2(r,t)−θ_1(r,t)]}. (3.7)

Notice the third and fourth lines are of the form ae^{iψ} + ae^{−iψ}, which equals
2a cos ψ.
Next, we define the irradiance associated with wave i as I_i(r) = I_i (note |u_i| =
√(I_i), and of course there is no time dependence in the constituent wave irradiance)
so that we can recollect all the terms in Eqs. (3.4)-(3.7) as

I_T(r, t) = I_1 + I_2 + 2√(I_1 I_2) cos(∆φ(r, t)) (3.8)

where
∆φ(r, t) = θ_2(r, t) − θ_1(r, t) (3.9)
is the phase difference between the two waves. The irradiance pattern resulting
from this superposition of two waves is called an interferogram. In Eq. (3.8) I 1
and I 2 are called the direct terms resulting solely from each constituent wave and

the “interference" term depending on the relative phase arises from the cross-
terms (Eqs. (3.6) and (3.7)). If the complex envelopes of each wave are equal,
i.e. |u_1| = |u_2|, such that I_1 = I_2 = I_o, then Eq. (3.8) simplifies to

I_T(r, t) = 2I_o[1 + cos(∆φ(r, t))] = 4I_o cos^2(∆φ(r, t)/2) (3.10)
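Equation (3.8) can be verified directly by superposing two phasors and comparing the magnitude squared with the formula; the irradiances below are illustrative.

```python
import cmath
import math

I1, I2 = 1.0, 0.25            # constituent irradiances (illustrative)
u1, u2 = math.sqrt(I1), math.sqrt(I2)

for dphi in (0.0, math.pi / 3, math.pi, 2.2):
    # superpose the two phasors and take the magnitude squared
    U_T = u1 + u2 * cmath.exp(1j * dphi)
    I_T = abs(U_T) ** 2

    # compare with the interference equation, Eq. (3.8)
    I_formula = I1 + I2 + 2 * math.sqrt(I1 * I2) * math.cos(dphi)
    assert abs(I_T - I_formula) < 1e-12
```

At ∆φ = 0 the total is (√I_1 + √I_2)^2 and at ∆φ = π it is (√I_1 − √I_2)^2, the maximum and minimum of the interferogram.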

It is important to emphasize that Eq. (3.8) is one of the fundamental results
of wave optics. First, it makes explicit that interference depends on the phase
difference between the two waves that are interfering. Second, and a more subtle
feature, the waves must have a stable (to be precise, a correlated) phase relation-
ship so that they can interfere. If the relative phase is stable, the waves are said to
be coherent. Qualitatively a stable phase means with knowledge of the phase at a
specific location in space and time it is possible to predict the phase’s value at all
other points in space and time. If this is true for the constituent phases then this
is also true for the phase difference. The notion of wave coherence can be made
quantitative and we will make some first steps in this direction in a later chapter
where we have a fuller discussion of coherence in both space and time. For now,
when we use the word coherent we simply mean the waves can interfere, i.e. Eq.
(3.8) can be used. If the waves are incoherent then they cannot interfere and the
total irradiance is

I_T(r, t) = I_1 + I_2. (3.11)

Coherent and incoherent are idealized descriptions of actual light sources, and all
physically realizable waves are partially coherent. Said another way, there are no
physical sources of truly coherent or incoherent light. That said, the light source
that most closely approximates a coherent source is a laser. The light source that
most closely resembles an incoherent source is a light bulb or the sun. It turns
out to be much easier to produce an interferogram if the light source is nearly
perfectly coherent.
We can make a few observations concerning Eq. (3.8). First, in an interfer-
ogram, the spatial locations that have the same values of phase difference also
share the same value of irradiance. This leads to the following definition

A fringe in an interferogram is a connected locus of points that share the
same phase difference ∆φ.

Again, a fringe also has the same value of irradiance and it is common to refer to
the irradiance maxima as fringes. An interferogram is composed of a collection
of fringes and much information can often be extracted from the geometric
structure of a fringe pattern. In the following we will discuss many different
fringe patterns. Finally, the definition of a fringe should be contrasted with the
definition of a wavefront. A wavefront is the locus of points of constant phase for
an optical disturbance whereas a fringe is the locus of points in an interferogram
with constant phase difference. Make sure you appreciate this difference since it
is often a source of confusion.
Second, a natural question is where is the interference irradiance a maximum?
Focusing on the argument of the interference term it is clear I T is a maximum

when ∆φ = 2πm, where m = 0, ±1, ±2, ... is an integer; this makes cos(∆φ) = 1.
What does m, the fringe order, tell us about the interfering waves?

The magnitude of the fringe order m is the number of waves of optical path
difference between the interfering waves.

For example, m = 2 means there are two waves of optical path difference, or a
relative phase difference of 4π, between the two interfering waves. The value of
the irradiance at the maximum is I_1 + I_2 + 2√(I_1 I_2). Similarly, the minimum
occurs for ∆φ = π(2p + 1), where p = 0, ±1, ±2, ... is an integer; this makes
cos(∆φ) = −1. The value of the irradiance at the minimum is I_1 + I_2 − 2√(I_1 I_2).
A convenient definition
that aims to capture “how much" or “how deep" is the interference is the visibility.
Visibility is defined as

V = (I_max − I_min)/(I_max + I_min) (3.12)

where I_max (I_min) are the maximum (minimum) values in the irradiance pattern.
From the previous considerations the visibility can be expressed as

V = 2√(I_1 I_2)/(I_1 + I_2). (3.13)

If the complex envelopes of the two waves are equal then I_1 = I_2 = I_o and the
visibility equals one, V = 1. Also, if the waves are incoherent, so that I_T = I_1 + I_2,
then the visibility equals zero, V = 0. These properties make the visibility a good
measure of an interferogram's strength. Notice V < 1 if the waves are of unequal
amplitude and/or if the waves are partially coherent. We will return later to how
an interferometer's output reflects properties, such as coherence, of a light source.
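A small helper makes these limiting cases explicit; a sketch, with illustrative irradiances:

```python
import math

def visibility(I1, I2):
    """Fringe visibility of two coherent waves, Eq. (3.13)."""
    return 2 * math.sqrt(I1 * I2) / (I1 + I2)

# equal irradiances give unity visibility ...
assert math.isclose(visibility(1.0, 1.0), 1.0)

# ... while unequal irradiances reduce it
V = visibility(1.0, 0.25)
I_max = 1.0 + 0.25 + 2 * math.sqrt(1.0 * 0.25)
I_min = 1.0 + 0.25 - 2 * math.sqrt(1.0 * 0.25)
assert math.isclose(V, (I_max - I_min) / (I_max + I_min))
```

The second check confirms that Eq. (3.13) and the defining ratio in Eq. (3.12) agree, as the derivation above requires.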

3.2 Interference of two monochromatic plane waves


In Section 3.1 a number of general relationships regarding interference were
derived. In this section, we will focus on applying Eq. (3.8) to the case of two
interfering monochromatic plane waves.

3.2.1 Spherical point sources and plane wave interference


Before examining the interference of two monochromatic plane waves, this
subsection provides a brief discussion to identify when plane-wave-like inter-
ference may be observed. Remember, although mathematically easy to deal with,
plane waves are unphysical since their infinite extent requires an infinite
amount of energy. Nevertheless, there are many physical situations in which
plane wave interference is a good approximation to the observed interference
phenomena. The next two examples illustrate two scenarios in which plane wave
interference would be observed. Both rely on point sources of light located at
infinity.

In Fig. 3.1 two point sources radiate spherical waves that are made planar by
a thin lens (see Section 2.5) and then overlapped in space. The role of the lens
is to bring the plane at infinity to the lens's focal plane. In this way the planar
wavefronts are easily presented to the region of space on the opposite side of the
lens. Critical in this example is that the two point sources are coherent. In general,
if these were two atoms or two stars, the sources would be incoherent and we
would not observe any interference. By assuming coherence, interference effects
become pronounced. The dashed box identifies the region in space where the
total optical disturbance irradiance is measured. To quantify the total irradiance it
is necessary to find expressions for the two wavevectors k_A and k_B. This depends
on the coordinate system.

Figure 3.1 Point sources in the focal plane of a lens. Two point sources that radiate
spherical waves are each situated at the focal point of a lens. After the lens collimates
the waves radiated by each source, the two plane waves are overlapped. The irradiance
is measured within the dashed box. By considering only the dashed box, as presented
in the inset, it is clear that treating this as the interference of two monochromatic plane
waves is appropriate.

The second scenario is presented in Fig. 3.2: two point sources, located at
infinity, radiate spherical waves. The coordinate origin is coincident with the
geometric center of the sphere. The irradiance in this region can be understood
as resulting from the interference of two monochromatic plane waves. For this
example, approximations to the complex envelope and spatial phase of each point
source will be employed to justify the plane wave description. In Fig. 3.2, point
source i is located at a distance r_i = r from the coordinate origin since both point
sources are situated the same distance from the center of the sphere. The
calculation for sources A and B is the same, so only the calculation for point
source A will be presented; the derivation is identical for B. Before starting, the
goal of this derivation is to

demonstrate that in the vicinity of the sphere’s center, each point source delivers a
plane wave that propagates along a direction determined by the source’s location.
Consulting the illustration, our intuition tells us the ray emitted by the point
source (that connects the point source to the origin) should be collinear with the
approximate plane wave.


Figure 3.2 Point sources at infinity. Two point sources located at infinity radiate spher-
ical waves. The coordinate origin is located at the geometric center of the sphere. In
this region the total irradiance can be understood from considering the interference of
two monochromatic plane waves. The dashed box identifies the region to be consid-
ered and illustrates the planar wavefronts delivered by the two sources. The grey circle
identifies a specific detector point.

For point source A, the geometry in Fig. 3.2 reveals r_A + r_Ad = r_d, where r_d is
the point of detection (d). The spatial phase at this location for source A is

k_A r_Ad = k_A |r_d − r_A| = k_A((x_d − x_A)^2 + (z_d + z_A)^2)^{1/2} (3.14)

where the definitions r_d = (x_d, z_d) and r_A = (x_A, z_A) have been used. The next
step is to rearrange the spatial phase to leverage the paraxial approximation.
Specifically

k_A((x_d − x_A)^2 + (z_d + z_A)^2)^{1/2} = k_A r(1 + (−2x_d x_A + 2z_d z_A)/r^2 + (x_d^2 + z_d^2)/r^2)^{1/2} (3.15)

where r_A^2 = x_A^2 + z_A^2 = r^2 is the sphere radius. In carrying through the paraxial
approximation, the third term under the square root is assumed to be small

(the Fresnel number is much less than 1 for this situation) and the spatial phase
becomes
k_A((x_d − x_A)^2 + (z_d + z_A)^2)^{1/2} ≈ k_A r − (k_A x_d x_A − k_A z_d z_A)/r. (3.16)
r
In Eq. (3.16) we notice there is a global phase in r and linear phase in the detector
coordinates x d and z d . From the geometry, z A /r = cos θ A and x A /r = sin θ A .
Substituting this into Eq. (3.16) the spatial phase becomes

k A r − k A sin θ A x d + k A cos θ A z d . (3.17)

What emerges in the spatial phase is k Ax = k A sin θ A and k Az = −k A cos θ A . These


define the wavevector of the plane wave generated by source A

k A = (k A sin θ A , −k A cos θ A ) = (−k A cos α, −k A cos γ) (3.18)

where α = π+θ A and γ = θ A are the appropriate direction cosine angles. The final
result for the spatial phase of the wave generated by point source A in the vicinity
of the sphere’s geometric center, where the irradiance will be measured, is

k A r + k A · rd (3.19)

which is the spatial phase of a monochromatic plane wave. A similar story holds
for source B. With this conceptual background in hand, the next section considers
in detail the interference of two monochromatic plane waves.
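As a sanity check on the plane-wave approximation, the short sketch below (not from the text; all numbers and variable names are illustrative choices) compares the exact spherical-wave phase k|r_d − r_A| with the plane-wave phase k r − k sin θ_A x_d + k cos θ_A z_d of Eq. (3.17) for a source on a large sphere:

```python
import numpy as np

# Illustrative check: a point source on a large sphere (radius r) delivers an
# approximately planar phase near the sphere's center. All numbers are
# hypothetical choices for this sketch.
lam = 1.0                      # wavelength (arbitrary units)
k = 2 * np.pi / lam            # wavenumber
r = 1.0e6 * lam                # sphere radius, r >> detector coordinates
theta_A = np.deg2rad(20.0)     # angular location of source A

# The text's distance uses (z_d + z_A), i.e. the source sits at (x_A, -z_A)
# with z_A = r*cos(theta_A) > 0.
x_A = r * np.sin(theta_A)
z_A = r * np.cos(theta_A)

# Detector point near the origin (paraxial region).
x_d, z_d = 3.0 * lam, -2.0 * lam

exact_phase = k * np.hypot(x_d - x_A, z_d + z_A)          # k |r_d - r_A|
plane_phase = k * r - k * np.sin(theta_A) * x_d + k * np.cos(theta_A) * z_d

# The two phases should agree to a small fraction of a radian.
phase_error = abs(exact_phase - plane_phase)
print(phase_error)
```

With r a million wavelengths away and the detector only a few wavelengths off the origin, the residual phase error is far below a milliradian, which is why the point source at infinity can be treated as a plane wave.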

3.2.2 Monochromatic plane wave interference


To quantify the observed interferogram from interfering two coherent monochro-
matic plane waves it is necessary to apply the phase difference in Eq. (3.9) to
these two waves. If we assume the waves have the same temporal frequency then
the phase difference is expressible as

∆φ(r) = (k2 − k1 ) · r + (δ2 − δ1 ) (3.20)

where the wavevector and initial phase angle difference determines the relative
phase. Two monochromatic plane waves with the same temporal frequency do
not necessarily have the same wavevector. The temporal frequency also specifies
the wave's wavenumber and wavelength, but not the propagation direction. We
will see the observed fringe pattern in a monochromatic plane wave interferogram
will reflect the relative angle of propagation between the two waves.
Since k1 = (k_1x, k_1y, k_1z) and k2 = (k_2x, k_2y, k_2z) we can evaluate the phase
difference in Eq. (3.20) as

∆φ(r) = (k 2x − k 1x )x + (k 2y − k 1y )y + (k 2z − k 1z )z + (δ2 − δ1 ) (3.21)



where the position vector r = (x, y, z) locates the observation point. If
we remember the wavevector definition in terms of the direction cosines then the
phase difference is equivalent to

∆φ(r) = k(cos α2 − cos α1 )x + k(cos β2 − cos β1 )y + k(cos γ2 − cos γ1 )z + (δ2 − δ1 ) .


(3.22)
In the interferogram we can define characteristic lengths, based on traversing
paths through the fringe pattern along which
the phase difference changes by 2π. These lengths define what are called fringe
spacings. Specifically, we can define the fringe spacing along each of the cartesian
coordinate axes as well as the shortest distance between fringes. Quantitatively
the fringe spacing along each direction is

along x: |∆φ(x + Λ_x, y, z) − ∆φ(x, y, z)| = 2π   (3.23)

along y: |∆φ(x, y + Λ_y, z) − ∆φ(x, y, z)| = 2π   (3.24)

along z: |∆φ(x, y, z + Λ_z) − ∆φ(x, y, z)| = 2π.  (3.25)

By substituting Eq. (3.22) into Eqs. (3.23)-(3.25) we find the following expressions for the
fringe spacing

Λ_x = λ / |cos α_2 − cos α_1|   (3.26)

Λ_y = λ / |cos β_2 − cos β_1|   (3.27)

Λ_z = λ / |cos γ_2 − cos γ_1|.  (3.28)
Let’s consider the previous results for two plane waves propagating in the x − z
plane (in which case k_2y and k_1y equal zero) with the wavevector for wave
2 making an angle of 60° = π/3 with the z-axis. Figure 3.3 illustrates the real
wavefunctions for both of these waves, see Fig. 3.3a and Fig. 3.3b, and the real
wavefunction that results from their superposition (Fig. 3.3c). In Fig. 3.3c energy
flows along the direction of the white arrow that is parallel to the wavefunction
variation (the red and blue peaks and valleys). The second column of Fig. 3.3
illustrates the wavefronts of the two waves (Fig. 3.3d and Fig. 3.3e) as well as how
the fringes result (solid line) in their superposition (the wavefronts are the two
dashed lines) in Fig. 3.3f.
For this geometry the wavevectors, in terms of their direction cosines (note
cos β_1 = cos β_2 = 0), are

k_1 = k (0, 0, 1)   (3.29)

k_2 = k (cos 5π/6, 0, cos π/3) = k (−√3/2, 0, 0.5)   (3.30)

where k is the magnitude of k_1 and k_2 since they have the same temporal frequencies.
We can use Eq. (3.22) to find the phase difference

∆φ(r) = −(√3 k/2)x − (k/2)z + (δ_2 − δ_1).   (3.31)

Notice the slope of the fringes can be found from the previous result. Viewing
Eq. (3.31), at constant phase, as the equation of a line x = mz + b we see that m = −1/√3. The fringe
spacings are

Λ_x = λ/|−√3/2| = 2λ/√3   (3.32)

Λ_z = λ/|0.5 − 1| = 2λ   (3.33)

Λ = Λ_x Λ_z / (Λ_x² + Λ_z²)^{1/2} = λ.   (3.34)

where Λ is the shortest distance between adjacent fringes. In this particular
example the fringe spacing Λ = λ. This is not always the case and is specific
to the angles that have been selected. Finally, we can also determine the fringe
angular orientation relative to the z-axis from tan⁻¹(Λ_x/Λ_z), which (carrying the
sign of the slope m) equals −30°. In magnitude this is half of the relative angle
between k_1 and k_2. The fringe pattern that results from
this superposition is illustrated in Fig. 3.4. Fig 3.4a is the resultant irradiance
pattern in the x − z plane. Labeled on this plot are Λ_x, Λ_z and Λ. The irradiance
can also be visualized in the x − y plane at z = 0. This visualization is shown in Fig.
3.4b and Λ_x determines the observed fringe pattern. The linear fringes observed
in the x − y plane are called tilt fringes since they result from two plane waves
propagating in different directions. When this is observed in an interferometer it
is typically the result of two nonparallel mirror surfaces, i.e. the mirrors are tilted
with respect to one another.
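The numbers in this example can be checked with a few lines of code. The sketch below (our own construction, with an illustrative wavelength) computes the fringe spacings directly from the wavevector difference k_2 − k_1:

```python
import numpy as np

# Sketch reproducing the worked example: wave 1 along +z, wave 2 at 60 degrees
# from the z-axis, both with wavelength lam. Variable names are our own.
lam = 0.5e-6                         # an arbitrary illustrative wavelength [m]
k = 2 * np.pi / lam

k1 = k * np.array([0.0, 0.0, 1.0])
k2 = k * np.array([np.cos(5 * np.pi / 6), 0.0, np.cos(np.pi / 3)])

dk = k2 - k1
# Fringe spacing along each axis: the distance over which the phase
# difference (k2 - k1) . r changes by 2*pi.
Lx = 2 * np.pi / abs(dk[0])          # expect 2*lam/sqrt(3)
Lz = 2 * np.pi / abs(dk[2])          # expect 2*lam
L = Lx * Lz / np.hypot(Lx, Lz)       # shortest fringe spacing, expect lam

fringe_angle = np.degrees(np.arctan(Lx / Lz))   # 30 degrees in magnitude
print(Lx / lam, Lz / lam, L / lam, fringe_angle)
```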


Figure 3.3 Monochromatic plane wave interference in the x − z plane. a) Real wave-
function for plane wave 1. b) Real wavefunction for plane wave 2 propagating at π/3
from the z-axis. c) The resultant superposition. d) and e) are the wavefronts of wave 1
and 2. f) shows how the fringes (solid line) arise from the wavefronts, the two dashed
lines.


Figure 3.4 Monochromatic plane wave interference irradiance. a) The irradiance in


the x − z plane of the two interfering monochromatic plane waves. The relevant fringe
spacings are labeled. b) The irradiance in the x − y plane at z = 0. Λx determines the
observed fringe spacing.

3.3 Interference of two monochromatic spherical waves


In this section, we will focus on the case of two interfering monochromatic spher-
ical waves. Figure 3.5 illustrates the scenario we will consider. There are two point


Figure 3.5 Interference of two spherical waves. This is the geometry we consider for
the interference of two spherical waves.

sources located at a distance ±d/2 along the x-axis. Each of these point sources
generates a diverging spherical wave; see Eq. (2.34). The observation point is located
by the observation point vector r_P = (x_p, y_p, z_p). The displacement along the
x-axis causes the r in Eq. (2.34) to be replaced by r_AP and r_BP, the distances from
point source A and B to the observation point. These distances are found from the
magnitudes of the vectors r_AP = (x_p − d/2, y_p, z_p) and r_BP = (x_p + d/2, y_p, z_p).
If we assume the waves have the same temporal frequency and initial phase
angles then from the given geometry the phase difference in Eq. (3.9) is expressible
as
∆φ(k, r_AP, r_BP) = (r_BP − r_AP) k   (3.35)

where the difference in path length r_BP − r_AP determines the phase difference at
the observation point. It is clear from Eq. (3.10) and Eq. (3.35) that maxima
in the interference pattern occur when r_BP − r_AP = mλ, i.e. when the path length
difference to the specified point is an integer multiple of wavelengths. The integer
m is again the number of full waves of path difference to the irradiance maximum.
To figure out the irradiance, we either directly use Eq. (3.35) or we consider
certain regions of space where the path length difference simplifies. In general, the
fringes that result from the interference of two spherical waves are hyperboloids.
These fringes are illustrated in Fig. (3.6). The two panels evaluate the irradiance
associated with the phase difference given in Eq. (3.35) for two different source
locations. In panel a, it is assumed the separation between the sources is d = 4λ.
Notice along the x-axis for locations greater than d/2 the irradiance is a maximum.
In panel b, the distance is some fraction of a full wavelength so that the phase
difference related to the source geometry does not correspond to a multiple of
2π. For this configuration, along the x-axis for locations greater than d/2, the
irradiance takes on an intermediate value between its max and min. It should
also be clear from the images that there is some maximum number of fringes
that fit between the two spherical sources. Referring to Fig. 3.5 there are four regions


Figure 3.6 Fringe pattern for two interfering monochromatic spherical waves. a)
The two source separation is a multiple of the wavelength. b) The source separation is
not an integer multiple of the wavelength.

in which the analysis simplifies. The easiest region to consider is the set of points that are
equidistant from the two point sources. These points coincide with the x − y
plane at z = 0. For all points in this plane r_BP = r_AP so this determines the m = 0
fringe. The x − y plane at z = 0 is coincident with the m = 0 irradiance maxima.

A second region that is relatively easy corresponds with the collection of


points defined by −d /2 < x p < d /2. This is the line segment that connects the two
spherical sources. Recognizing y p = z p = 0 for this line segment we can write the
optical path difference as
r_BP − r_AP = ((x_p + d/2)²)^{1/2} − ((x_p − d/2)²)^{1/2}   (3.36)

→ ((x_p + d/2)²)^{1/2} − ((d/2 − x_p)²)^{1/2} = 2x_p   (3.37)

where we have ensured that the number being squared underneath each square root
is positive. Constructive interference occurs for locations

2x_p^m k = 2πm → x_p^m = mλ/2   (3.38)

where the m superscript indicates that there is a discrete collection of x_p values
for which there are irradiance maxima. Since x_p must be less than d/2, from Eq.
(3.38) it follows that the largest fringe order m_max occurs for

x_p^{m_max} = d/2 = m_max λ/2 → m_max = d/λ.   (3.39)

If d/λ is not an integer then you should take the integer portion of d/λ. In total
there are 2m_max + 1 fringes between the two point sources along the x axis.
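A minimal sketch (our own, using the d = 4λ values of Fig. 3.7a) of this fringe count:

```python
# Counting bright fringes between two point sources separated by d along x.
# On-axis maxima sit at x_m = m*lam/2 with |x_m| <= d/2, so the largest order
# is m_max = floor(d/lam) and there are 2*m_max + 1 bright fringes in total.
import math

lam = 1.0e-6      # wavelength [m]
d = 4 * lam       # source separation [m], matches Fig. 3.7a

m_max = math.floor(d / lam)
n_fringes = 2 * m_max + 1
bright_x = [m * lam / 2 for m in range(-m_max, m_max + 1)]
print(m_max, n_fringes)
```

For d = 4λ this gives m_max = 4 and nine bright fringes, with the outermost maxima landing exactly at the sources (±d/2).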
We can plot the irradiance along the x-axis and this is illustrated in Fig. 3.7. It
combines what we have observed in the previous two cases. In the region between
the sources, the sources generate counter-propagating waves that interfere as
standing waves with a spatial period of λ/2. These are the fringes we identified in
Fig. 3.6. For locations beyond the sources, the geometry and source wavelength
determines the phase difference for all x p and the irradiance is a constant value.
This was also evidenced in Fig. 3.6. In Fig. 3.7a) it is assumed the source sepa-
ration is a whole number of wavelengths and in panel b) the distance is not an
integer multiple of the wavelength.
Next we will consider the irradiance I(x_p, 0, L_z) along the x-axis at the location
z_p = L_z. This region is illustrated in Fig. 3.5. For points such that x_p << L_z we
can make use of the paraxial approximation to the path length difference. This
is a simplification that arises due to the mismatch in length between the two
coordinates of our observation point. Recalling the binomial expansion in Eq.
(2.43) we can simplify both r AP and r BP . First, considering r AP we find
r_AP = ((x_p − d/2)² + L_z²)^{1/2} = L_z (1 + (x_p² + (d/2)² − x_p d)/L_z²)^{1/2}   (3.40)

≈ L_z (1 + (x_p² + (d/2)² − x_p d)/(2L_z²)) = L_z + (x_p² + (d/2)² − x_p d)/(2L_z)   (3.41)


Figure 3.7 Irradiance along the x-axis. a) The irradiance pattern I (x p , y p = 0, z p = 0).
For this example d = 4 µm and λ = 1 µm. b) Same as a) except λ = 0.750 µm

where we have made the paraxial approximation in the second line of the above
equation. We find a similar expression for r BP
r_BP = ((x_p + d/2)² + L_z²)^{1/2} = L_z (1 + (x_p² + (d/2)² + x_p d)/L_z²)^{1/2}   (3.42)

≈ L_z (1 + (x_p² + (d/2)² + x_p d)/(2L_z²)) = L_z + (x_p² + (d/2)² + x_p d)/(2L_z)   (3.43)

Bringing together Eq. (3.41) and Eq. (3.43) the optical path length difference
(assuming the medium’s refractive index is 1) is

r_BP − r_AP ≈ x_p d / L_z,   (3.44)

the phase difference is

k (r_BP − r_AP) ≈ k x_p d / L_z,   (3.45)

and the irradiance is

I(x_p) = 2I_o [1 + cos(k x_p d / L_z)]   (3.46)

where I_o = u_o²/L_z².
In the detector plane, there are irradiance maxima at locations x_p^m determined by

k x_p^m d / L_z = 2πm → x_p^m = mλL_z/d   (3.47)

where again the fringe order m gives the number of wavelengths of path
difference between the two sources in delivering optical energy to the observation
point. The fringes generated in this region are called tilt fringes. The origin of
this name can be understood from considering the dashed box in Fig. 3.5. In this
location far removed from the source plane, each spherical source can locally be
approximated as having a planar wavefront, each with a wavevector propagating
in a different direction. From this perspective it is as if we have the interference
of two monochromatic plane waves propagating with a relative angle between
their wave vectors. This was discussed in the previous section. The fringe pattern
we see in the x − y plane as well as a linecut along the y = 0 line is shown in Fig.
3.8a and 3.8b. The horizontal line in Fig. 3.8a is the resultant irradiance if the two
sources are incoherent.
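The paraxial result can be checked numerically. The sketch below (our own, using the text's example values) compares the exact path difference with Eq. (3.44) and evaluates the fringe spacing implied by Eq. (3.47):

```python
import math

# Sketch of the paraxial tilt-fringe result: far from two sources separated
# by d, the path difference is approximately x_p*d/L_z, giving bright fringes
# at x_m = m*lam*L_z/d. Numbers follow the text's example (d = 100 um,
# lam = 1 um, L_z = 1 m).
d = 100e-6
lam = 1e-6
Lz = 1.0

def exact_opd(xp):
    """Exact path difference r_BP - r_AP for y_p = 0, z_p = L_z."""
    r_bp = math.hypot(xp + d / 2, Lz)
    r_ap = math.hypot(xp - d / 2, Lz)
    return r_bp - r_ap

xp = 2e-3                                # 2 mm off axis
paraxial = xp * d / Lz                   # Eq. (3.44)
err = abs(exact_opd(xp) - paraxial)      # residual of the approximation

fringe_spacing = lam * Lz / d            # separation of adjacent maxima
print(err, fringe_spacing)
```

The residual is orders of magnitude smaller than a wavelength, confirming the paraxial replacement is safe this far from the sources.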

Figure 3.8 Tilt fringes for L z >> x p , y p . a) The irradiance pattern I (x p , 0, L z ). For this
example d = 100 µm, λ = 1 µm and L = 1 m. Across the top of the plot is the fringe
order. b) The irradiance pattern I (x p , y p , L z ). The observed fringes are called tilt fringes.
Note this is only valid for a finite size x p − y p plane about z p = L z .

Finally, we determine the irradiance I(L_x, y_p, z_p) in the y − z plane for the
location x_p = L_x (see Fig. 3.5). We further assume L_x >> y_p, z_p so that we can
consider a paraxial region about the x-axis. Again we need to evaluate r_AP and
r_BP. For this region r_AP = (−(L_x + d/2), y_p, z_p) and r_BP = (−(L_x − d/2), y_p, z_p). If
we define L_A = L_x + d/2 and ρ_p² = y_p² + z_p² we find
r_AP = (L_A² + ρ_p²)^{1/2} = L_A (1 + ρ_p²/L_A²)^{1/2}   (3.48)

≈ L_A + ρ_p²/(2L_A)   (3.49)

where we have made the paraxial approximation in the second line of the above
equation. With L B = L x − d /2 a similar expression for r BP results
r_BP = (L_B² + ρ_p²)^{1/2} = L_B (1 + ρ_p²/L_B²)^{1/2}   (3.50)

≈ L_B + ρ_p²/(2L_B).   (3.51)

The two previous equations result in the following optical path length difference

r_BP − r_AP = (L_B − L_A) + ρ_p²/(2L_B) − ρ_p²/(2L_A) = L_− + ρ_p²/(2L_eff)   (3.52)

where L_− = L_B − L_A and L_eff = L_A L_B/(L_A − L_B). Multiplying the previous by k we


arrive at the phase difference

k (r_BP − r_AP) = kL_− + kρ_p²/(2L_eff)   (3.53)

which results in the following expression for the irradiance

I(L_x, y_p, z_p) = 2I_o [1 + cos(kL_− + kρ_p²/(2L_eff))].   (3.54)

There are two terms in Eq. (3.53) and each has a different meaning. First notice
that if y_p = z_p = 0 then the phase difference is completely determined by kL_−.
So, we see that along the x_p axis, for distances |x_p| > d/2, the ratio of the source
separation d and wavelength determines the strength of the interference along
this axis (see Fig. 3.6). The second piece is a quadratic phase that is a function of
the radial distance ρ p . The circular symmetry in this plane generates so-called
defocus fringes, circular fringes with an increasing spatial frequency away from
the x-axis. The reason for using the word defocus is that the two spherical
sources are not coincident: if one of the sources were at the focal point of a
lens, the other would be displaced axially from the focal point/plane. After the
lens, the in-focus point would generate a plane wave and the out-of-focus point
would generate a converging or diverging spherical wave. The overlap of these
two waves, one with a planar and the other with a spherical wavefront, would
result in circular fringes.
Figure 3.9 examines the fringe patterns observed in the y − z plane at distance
L x far from the two sources. The observed fringe pattern exhibits so-called Fresnel
zones. A Fresnel zone is any one of the circularly symmetric rings. When we learn
a bit more about diffraction we will see such a zone plate can leverage diffraction
to focus light much like a spherical mirror or normal refracting lens. And, such a
plate can be manufactured via holography that records the interference pattern
of 2 displaced spherical sources. From considering Eq. (3.53) we can find the
locations of both minima and maxima in this fringe pattern. Let's consider the
location of the minima. If we assume the geometry is such that kL_− is a multiple
of 2π then it ensures that along the x-axis the irradiance is a maximum. So, in the
y − z plane we can find the minima for

kρ_p^{(j)2}/(2L_eff) = (2j + 1)π → ρ_p^{(j)} = ((2j + 1)λL_eff)^{1/2}   (3.55)

where j is an integer and the superscript in parentheses identifies the spatial
location ρ_p^{(j)}. j = 0 corresponds to a spatial distance of (λL_eff)^{1/2} to the 1st

Figure 3.9 Defocus fringes for L x >> y p , z p . a) The irradiance pattern I (L x , 0, z p ). For
this example d = 100 µm, λ = 1 µm and L = 1 m. Notice the spatial frequency of these
fringes change as a function of position. b) The irradiance pattern I (L x , y p , z p ). The
observed fringes are called defocus fringes. Note this is only valid for a finite size y p − z p
plane about x p = L x . Circular defocus fringes are often referred to as Fresnel zones.

minimum. What is the area of the first bright fringe - the 0th zone? Using the
previous radius it is πρ_p^{(0)2} = πλL_eff where the superscript in parentheses denotes
the index j = 0. We can also ask for the area of the j-th zone. That is
πρ_p^{(j+1)2} − πρ_p^{(j)2}, which equals 2πλL_eff. Remarkably, this is true for all the
zones - each bright fringe has an area of 2πλL_eff! Notice each bright fringe
(Fresnel zone) has an area that is twice the area of the 0th zone.
How about the maxima? These occur at

kρ_p^{(m)2}/(2L_eff) = 2mπ → ρ_p^{(m)} = (2mλL_eff)^{1/2}   (3.56)

where m is an integer and the superscript in parentheses identifies the spatial
location ρ_p^{(m)}. With the maxima, we can ask what the areas of the null fringes are.
The formula will be the same as before and the area of the m-th dark zone is
πρ_p^{(m+1)2} − πρ_p^{(m)2}. Not surprisingly, the result is the same and equals 2πλL_eff.
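The equal-area property is easy to verify numerically. In the sketch below (our own, with arbitrary λ and L_eff values), each bright annulus between successive minima has area 2πλL_eff:

```python
import math

# Sketch verifying the zone-area claim: with minima radii
# rho_j = sqrt((2j+1)*lam*L_eff), every bright annulus between successive
# minima has the same area, 2*pi*lam*L_eff, twice the central disk's area.
# lam and L_eff values are arbitrary choices for this check.
lam = 1e-6
L_eff = 0.5

def rho_min(j):
    """Radius of the j-th dark ring, Eq. (3.55)."""
    return math.sqrt((2 * j + 1) * lam * L_eff)

disk_area = math.pi * rho_min(0) ** 2               # central bright disk
zone_areas = [math.pi * (rho_min(j + 1) ** 2 - rho_min(j) ** 2)
              for j in range(5)]
print(disk_area, zone_areas)
```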

In summary

the phase difference for displaced spherical point sources results in circular
defocus fringes that are commonly called Fresnel zones. All Fresnel zones,
excluding the central disk, have an area equal to 2πλL_eff. The area of the
central disk, πλL_eff, is half that of any other zone.

When we begin discussions about diffraction we will find (and revisit) the previous
observations provide enormous intuition into understanding diffraction.
Chapter 4

Interferometry of 2 simple
sources

Interferometry aims to leverage interference to do useful things. Devices designed


to control interference are called interferometers and understanding their opera-
tion - specifically how phase differences arise - is at the heart of interferometry.
In this chapter we will consider simple sources - monochromatic spherical and
plane waves - and then relax the constraints on spectral content and spatial extent
in a subsequent chapter. Remember, from our perspective a plane wave results
from a spherical point source at infinity (like an atom) or at the focal plane of a
lens.
Before looking underneath the hood of inteferometers, why are they useful?
You can use interferometers to

1. Measure the geometry of a surface

2. Determine the mechanical displacement of an object

3. Determine the refractive index of a material

4. Characterize the spectral content of a light source

5. Elucidate the spatial and temporal coherence of a light source

6. Reveal the modal content of a light source

In the previous list, items 1-3 primarily belong to optical engineering and form
the backbone of optical shop testing and metrology. The last three items, 4, 5
and 6, are typically identified with optical physics and provide a window into the
understanding of both light sources and materials.
Remarkably, interferometers can be classified as one of two general types,
called wavefront splitting and amplitude
splitting interferometers. Wavefront splitting interferometers sample the wave-
front in distinct locations and recombine the samples on a detector. We will discuss
this class of interferometers first. Amplitude splitting interferometers divide the


wave's amplitude into two or more replicas. These replicas are then combined on a
detector. Adopting our systems viewpoint for studying interferometers, Fig. 4.1a)
presents a block diagram of an interferometer. It is illuminated by a light source,
the light is manipulated by the optical system - it is divided and recombined
depending on the optical system hardware - and the total irradiance that arrives
at the detector plane is recorded. The irradiance is the result of adding up the
waves delivered to the detector via two distinct paths that are the result of either
amplitude or wavefront splitting the light source that enters the optical system.
For this chapter we will assume one of our two previously studied monochromatic
waves illuminates the interferometer - a spherical wave or a plane wave, see Fig.
4.1b). The plane wave illumination, the lower panel of 4.1b), can result from
either collimating the spherical waves generated by a point source or via laser
illumination (not illustrated). Finally, the detector in 4.1a) may be either a bucket
detector or a spatially resolving detector like a charge-coupled device (CCD).


Figure 4.1 a) System level illustration of an interferometer. Different types of optical


systems used for interferometry are described in this chapter. b) The two types of
simple sources that will illuminate the interferometers we consider. Monochromatic
spherical waves (top illustration) and monochromatic plane waves (bottom illustra-
tions).

In Ch. 3 we have developed all the tools to predict the irradiance output of
both interferometer classes. Before beginning the discussion, a word of caution.
In the previous chapter, we assumed two independent sources were superposed
to create the optical disturbance superpositions and interferograms we analyzed
(two plane waves, two spherical waves). In the real world of optics, unless great
care is taken, two independent light sources - atoms, light bulbs, lasers, etc -
will never interfere and are incoherent; their total irradiance is the sum of their
constituent irradiances. You will notice as we describe various interferometric
geometries, that each optical system makes a replica of the source and these
replicas are superposed. By deriving interfering waves from the same source,
coherence is ensured. One caveat is that as the source's spatial extent or spectral
width increases (we have only considered point sources of a single frequency) the
source's ability to interfere with replicas of itself is reduced. Said another way,
monochromatic point sources are always completely coherent - in our studies
these sources generate spherical and plane waves.

4.1 Wavefront splitting interferometers; 2 Sources


Chapter 3 has prepared us well for the analysis of interferometers and the exam-
ination of wavefront splitting interferometers will benefit from the discussions
of Sec. 3.3. First, what is meant by wavefront splitting? Wavefront splitting occurs
when an optical device samples two different regions of a wavefront and these
sampled regions are then recombined. Since one source generates the two waves
via wavefront sampling, the coherence properties of the source are inherited by
the two waves emanating from the sampled wavefront locations. If the primary
source is a monochromatic point source, the secondary waves, like the primary
wave, are guaranteed to be completely temporally and spatially coherent.

Figure 4.2 Huygens' principle.

Critical in the operation of a wavefront sampling interferometer is that the sam-
pled wavefront locations spatially isolate two secondary waves. What is meant by
secondary waves? The notion of secondary waves arose in discussions of wave
propagation pioneered by Christiaan Huygens in the 1600’s. See Figure 4.2 and
Sec. 2.6. Recall in Huygens' eyes, each point on a primary wavefront can be de-
scribed by fictitious secondary sources that radiate as spherical waves. To make
this clear, in the left panel of Fig. 4.2 the propagation of a monochromatic plane
wave as understood through Huygens' principle is illustrated. We imagine each
point of a wavefront is the location of a secondary source that radiates a spherical
wave. To determine the wavefront at a given instant of time it is necessary to
draw the surface tangent to all secondary waves radiated at an earlier time. For
planar secondary sources the subsequent wavefront is planar. Can you illustrate
Huygens' principle for a spherical wave?
t Is there a problem with
Huygens' principle when it
comes to direction of wave
propagation?

4.1.1 Young’s Two Pinhole Interferometer


The operation of Young’s two pinhole interferometer is the canonical example
of a wavefront splitting interferometer that isolates two wavefront secondary
sources with two pinholes in an otherwise opaque screen. Figure 4.3 presents
an illustration of the device. The pinhole locations are denoted A and B and
are separated by a distance d. The aperture/source plane is parallel to the x-
y plane at z = 0 with the location of coordinates in the source plane labeled
by (x s , y s ). The pinholes are symmetrically located along the x s axis with their
midpoint coincident with the coordinate origin. The detector plane is described
with coordinates (x d , y d , L z ). In the following we assume y d = 0.
How is the interferometer illuminated? The dashed box to the left of the pin-
hole screen (we will also call this the aperture) is meant to represent illumination
of the aperture by either of the two sources on the left side of the figure. The top
source box represents illumination by a spherical wave and the bottom source
box is illumination by a plane wave (a spherical point source collimated by a
lens). The spectrum of each source, represented by the plot S(ω), demonstrates
the assumed sources are monochromatic.

t The spectrum of a source is the range of frequencies that represent the source.

t We could also consider other illuminations. For example, extended sources
comprised of incoherent monochromatic spherical sources or a single point
source that is polychromatic (composed of many colors) with colors that are
incoherent.
Figure 4.3 Young’s Interferometer.

With the previous setup we can lean directly on the results of Ch. 3 to write
down the total irradiance received by the detector plane. Restating the results of
Section 3.3, the irradiance is

I(x_d) = 2I_o [1 + cos(k x_d d / L_z)]   (4.1)

where I_o = u_o²/L_z². In the detector plane, there are irradiance maxima at locations
x_d^m determined by

k x_d^m d / L_z = 2πm → x_d^m = mλL_z/d   (4.2)

where again the fringe order m gives the number of wavelengths of path
difference between the two sources in delivering optical energy to the observa-
tion point. The fringes generated in this region are called tilt fringes. In this
location far removed from the source plane, each spherical source can locally be
approximated as having a planar wavefront, each with a wavevector propagating
in a different direction. From this perspective it is as if we have the interference
of two monochromatic plane waves propagating with a relative angle between
their wave vectors. This was discussed in the previous section. The fringe pattern
we see in the x − y plane as well as a linecut along the y = 0 line is shown in Fig.
4.4a and 4.4b. The horizontal line in Fig. 4.4a is the resultant irradiance if the two
sources are incoherent.
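A short sketch (our own, with the example values quoted in the caption of Fig. 4.4) evaluating Eq. (4.1) shows the coherent fringes swinging between 0 and 4I_o around the incoherent level 2I_o:

```python
import numpy as np

# Sketch of the Young's two-pinhole irradiance, Eq. (4.1):
# I(x_d) = 2*I_o*(1 + cos(k*x_d*d/L_z)). Coherent fringes swing between
# 0 and 4*I_o; incoherent sources would instead give the flat level 2*I_o.
lam = 1e-6      # wavelength [m]
d = 100e-6      # pinhole separation [m]
Lz = 1.0        # screen distance [m]
Io = 1.0        # single-pinhole irradiance (arbitrary units)
k = 2 * np.pi / lam

x_d = np.linspace(-0.04, 0.04, 20001)
I = 2 * Io * (1 + np.cos(k * x_d * d / Lz))

I_max, I_min = I.max(), I.min()
incoherent_level = 2 * Io       # what two incoherent sources would produce
print(I_max, I_min, incoherent_level)
```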

Figure 4.4 Tilt fringes for L z >> x d , y d . a) The irradiance pattern I (x d , 0, L z ). For this
example d = 100 µm, λ = 1 µm and L = 1 m. Across the top of the plot is the fringe
order. b) The irradiance pattern I (x d , y d , L z ). The observed fringes are called tilt fringes.
Note this is only valid for a finite size x d − y d plane about z d = L z .

4.1.2 Other Wavefront Splitting Interferometers


The Young’s two pinhole interferometer is of course not the only interferometer
that can be analyzed from the perspective of wavefront splitting.

4.2 Amplitude splitting interferometers; 2 Sources


Amplitude splitting interferometers are based on optical elements that make
replicas of the input wave and then recombine these potentially phase shifted
replicas. Figure 4.5a) presents an illustration of a common amplitude splitting
optical element, the beam splitter. For this section, it is also assumed that the
source spectrum is monochromatic; see Fig. 4.5b). Another common way to
illustrate a beam splitter is shown in Fig. 4.5c). Diagrams of interferometers
may use either of these ways to depict beamsplitters. One important point is
that beamsplitters modify the amplitude and phase of waves that traverse these
elements. Without going into details, we will assume that the reflection (r) and
transmission (t) coefficients are both equal to 1/√2 in magnitude and that the
reflection coefficient introduces a phase shift of π/2. So, in problem solving the
result is that r = i/√2 and t = 1/√2. Notice the magnitude squared of r and t is
1/2 so that this type of beam splitter reflects half the impinging power and
transmits half of the power. For this reason such a beam splitter is called a 50/50
beam splitter.
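These coefficients can be checked in a couple of lines. The sketch below (our own) verifies that r = i/√2 and t = 1/√2 conserve energy and that the reflection carries a π/2 phase shift:

```python
# Sketch of the 50/50 beam splitter coefficients used in the text:
# r = i/sqrt(2) (reflection, with a pi/2 phase shift) and t = 1/sqrt(2).
# Energy conservation requires |r|**2 + |t|**2 == 1.
import cmath
import math

r = 1j / math.sqrt(2)
t = 1 / math.sqrt(2)

R = abs(r) ** 2                      # reflected power fraction
T = abs(t) ** 2                      # transmitted power fraction
phase_shift = cmath.phase(r)         # phase of the reflection coefficient
print(R, T, phase_shift)
```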


Figure 4.5 Amplitude Splitting a) The beam splitter is an example of an amplitude
splitting optical element. b) The source spectrum is monochromatic. c) A common way
to illustrate a wave being split by a beam splitter.

Beam splitters are used in amplitude splitting interferometers such as the


Michelson and Mach-Zehnder interferometers. A thin slab or film of material
can also lead to amplitude splitting interference. In this section we will study
each of these types of interferometers and assume both devices are illuminated by
monochromatic plane waves.

4.2.1 Michelson Interferometer


Figure 4.6b) illustrates a Michelson interferometer. It consists of an illumination
source, a central beamsplitter and then 2 separate paths that each encounter a
mirror. Upon reflection, the two paths propagate back toward the beamsplitter
where they are recombined and propagate toward a detector. Typically, the inter-
ferometer path that encounters the fixed mirror is called the reference path. The
second path reflects from a mirror that is free to displace. We will see this mirror
displacement results in a change in the total irradiance received by the detector
due to interference.
t The amplitude and phase of the reflection and transmission coefficients for a beam splitter are determined by energy conservation and the principle of reversibility.

Just as for the Young's interferometer it is critical to understand how path length difference manifests in the interference signal. Unlike the Young's apparatus, where the two pinholes deliver secondary waves to the detector plane that interfere, in the Michelson (and Mach-Zehnder) the distinct, uncommon paths of each interferometer serve as the interfering sources. To analyze the Michelson, we will first determine the total optical path length from the input plane to the detector in Fig. 4.6 for both path A and path B. Each path is labeled by its physical length $L_i$. If we assume the interferometer is embedded in a medium of refractive index $n$ then the optical path length from input to detector for path A is

$nL_A = n(L_{in} + 2L_A + L_{A+B})$ (4.3)

and for path B


$nL_B = n(L_{in} + 2L_B + L_{A+B})$ (4.4)


Figure 4.6 Amplitude Splitting Interferometers a) The interferometer illumination.


b) A Michelson interferometer. The plane wave illumination is split into two paths A
and B that are recombined by the beam splitter and directed toward a bucket detector
that records the total irradiance. Path A is reflected from a fixed mirror and path B
encounters a mirror that can be displaced. The dashed vertical line corresponds to
the mirror position when paths A and B correspond to equal optical path lengths. c)
A Mach-Zehnder interferometer. In path B a phase shift can be introduced. A bucket
detector records the total irradiance.

The factor of 2 arises because the wave traverses paths A and B twice. Notice that if the moving mirror is situated at the location of the dashed line then the lengths of path A and path B are equal: $L_A = L_B$. We will call this length $L_o$. If the mirror is displaced from the equal-path configuration then $L_B$ can be expressed as $L_B = L_o + \Delta l$.
In the following discussion, the interferometer is assumed to be illuminated with a monochromatic plane wave. The input plane (the dash-dot line in Fig. 4.6) coincides with the wavefront that has a spatial phase of 0, so that the complex amplitude of the wave that propagates through the interferometer along path A and arrives at the detector is

$U_A = tr\,u_o\,e^{ik_o n(L_{in} + 2L_o + L_{A+B})}$ (4.5)

where we have ignored phase shifts from mirror reflections, as each path sees the
same number of mirrors, and r and t correspond to the beam splitter reflection
and transmission coefficients. The wave that travels along path B and arrives at
the detector is
$U_B = rt\,u_o\,e^{ik_o n(L_{in} + 2L_o + 2\Delta l + L_{A+B})}$. (4.6)
With the previous complex amplitudes, the total wave that arrives at the detector is

$U_A + U_B = rt\,u_o\,e^{ik_o n(L_{in} + 2L_o + L_{A+B})}\left(1 + e^{i2k_o n\Delta l}\right)$ (4.7)

and the measured irradiance is

$I(\Delta l) = |U_A + U_B|^2 = 2|rt\,u_o|^2\left(1 + \cos(2k_o n\Delta l)\right)$ (4.8)

where the dependence on the displacing mirror is explicit. Notice the engineered
phase difference in the Michelson interferometer is

$2k_o n\Delta l.$ (4.9)

For the phase difference to equal a multiple of $2\pi$ ($2k_o n\Delta l = m2\pi$) the mirror needs to move a distance $\Delta l = \frac{m\lambda}{2n}$. This feature of a Michelson interferometer allows it to measure extremely small displacements as well as determine the refractive index of a material placed in one of the paths. If $n = 1$ in Eq. (4.9) then each mirror displacement that corresponds to a physical length of $\frac{\lambda}{2}$ will result in the detector observing a different fringe maximum. Starting from a fringe maximum and counting the number of fringes that pass through the detector is one way to determine how far the reference mirror has been displaced. Also notice that Eq. (4.9) depends on the refractive index $n$. If the refractive index changes by an amount $\Delta n$ then the optical path length will change and this change will modify the observed interference signal.

t If the detector is observing a null in the irradiance, where is the optical energy?
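The fringe counting idea can be captured in a few lines. This is an illustrative sketch; the 633 nm wavelength and the fringe count are assumed example values, not from the text:

```python
# Michelson fringe counting: each fringe maximum corresponds to a mirror
# displacement of lambda/(2n), from Eq. (4.9) with 2*k_o*n*dl = m*2*pi.
def displacement_from_fringes(m, wavelength, n=1.0):
    """Mirror displacement that produces m fringe maxima at the detector."""
    return m * wavelength / (2 * n)

# Example (assumed values): counting 100 fringes of a 633 nm HeNe-like
# source in air (n = 1) gives the mirror displacement in metres.
dl = displacement_from_fringes(100, 633e-9)
print(dl)  # 3.165e-05 m, i.e. about 31.65 micrometres
```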

4.2.2 Mach-Zehnder Interferometer


Figure 4.6c) illustrates a Mach-Zehnder interferometer. It consists of an illumination source (still assumed to be a monochromatic plane wave), two beam splitters and two mirrors. It is common to refer to the paths of the interferometer as rails. Inserted in path B is a relative phase that can be acquired by light propagating along this rail of the device. It arises from a change in the optical path length of B. The paths have optical path lengths $nL_A$ and $nL_B$, and the geometry of the Mach-Zehnder is such that when $\Delta\theta = 0$ the physical path lengths of the device are equal: $L_A = L_B = L_o$. Just like the Michelson, it is assumed that the Mach-Zehnder is embedded in a medium of refractive index $n$.
Notice, in the illustration of Fig. 4.6c), light enters along the port labeled 1, propagates along paths A and B, and exits through ports 3 and 4. It is critical to account for the beam splitter phase shifts to correctly calculate how the waves exit the device. For the scenario in Fig. 4.6c), the total complex amplitude arriving at the detector in port 4 is

$U_A + U_B = rt\,u_o\,e^{ik_o nL_o}\left(1 + e^{i\Delta\theta}\right)$ (4.10)

where $\Delta\theta$ represents a phase difference arising from a change in the physical path length, $L_o \to L_o + \Delta l$, or a local change in the refractive index, $n \to n + \Delta n$. If the phase change is a result of the latter, the phase difference is $\Delta\theta = k_o\Delta n\,\Delta l$, where the local refractive index change results from inserting a new material within rail B that has a thickness $\Delta l$ and a refractive index $n + \Delta n$. The resultant irradiance in output 4 is

$I(\Delta\theta) = |U_A + U_B|^2 = 2|rt\,u_o|^2(1 + \cos\Delta\theta)$ (4.11)

and has the same structure as many of the interference signals we have determined. The strategy always is to identify what results in a phase difference between the paths/rails of an amplitude splitting interferometer.

t The Mach-Zehnder interferometer is like a beam splitter that has some internal structure that can be engineered to control how electromagnetic energy exits the device.
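Eq. (4.11) can be sketched numerically. This is a minimal illustration assuming the 50/50 coefficients from earlier and an illustrative unit input amplitude $u_o = 1$:

```python
import numpy as np

# Output-4 irradiance of the Mach-Zehnder, Eq. (4.11):
# I = 2|r t u_o|^2 (1 + cos(dtheta)), with the 50/50 values r = i/sqrt(2),
# t = 1/sqrt(2), and an assumed unit input amplitude u_o = 1.
r, t, u_o = 1j / np.sqrt(2), 1 / np.sqrt(2), 1.0

def irradiance(dtheta):
    return 2 * abs(r * t * u_o) ** 2 * (1 + np.cos(dtheta))

print(round(irradiance(0.0), 12))    # 1.0 -> all input power exits port 4
print(round(irradiance(np.pi), 12))  # 0.0 -> destructive interference in port 4
```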

4.2.3 Thin film interference with two paths


With an understanding of amplitude splitting interferometers it is possible to make sense of thin film interference. You will now be able to understand why you see a rainbow of colors in an oil slick or in a soap bubble. Figure 4.7 is an illustration of a thin film, with refractive index $n$, imagined to be embedded in air ($n = 1$). For simplicity, a monochromatic plane wave illuminates the thin slab through a beam splitter (assume the beam splitter has $r = t = 1$). The strategy to understand the total field that arrives on the detector plane is to trace all the multiple paths that reflect from the slab and contribute to the total field $U_T$ that arrives at the detector. In this section we will only consider 2 reflections/paths (in a later chapter we will account for all reflections).


Figure 4.7 Thin film interference Illustration of a thin film, with refractive index $n$, illuminated through a beam splitter by a monochromatic plane wave. Two paths, labeled A and B, contribute to the total complex amplitude $U_T$ that arrives at the detector.

The two paths that contribute to $U_T$ are labeled A and B in Fig. 4.7. The arrows are used as a guide to illustrate the paths. Some of the light will pass through the slab; this is indicated by the sequence of downward facing arrows. Path A, the solitary upward pointing arrow, corresponds to light that has reflected off the front interface of the slab with reflection coefficient $r$. Path B, illustrated by the double headed arrow within the slab and the upward pointing arrow above the slab, corresponds to light that has transmitted through the top interface with a transmission coefficient $t$, propagated to the bottom interface of the slab acquiring phase $kh$, reflected off the bottom interface with a reflection coefficient $r$, propagated back to the top interface acquiring phase $kh$ and lastly transmitted through the top interface with transmission coefficient $t$. Using the previous, an expression for the light propagating along the z-axis toward the detector (after reflecting off the beamsplitter), assuming a complex envelope of $u_o$, is

$U_T = u_o\left(r\,e^{ikz} + trt\,e^{i(kz + 2kh)}\right)$ (4.12)

where $k = nk_o$ and $k_o$ is taken as the free space wavenumber. Notice, the only uncommon part of the two paths is Path B's propagation through the thin slab. Another feature in the phase difference, which we will insert by hand, is an extra $\pi$ phase shift. We do not have the tools yet to derive this fact, but if two amplitudes are interfered and one has reflected off a low-high index boundary and the other has reflected off a high-low index boundary, with the same index contrast, there will be an extra phase shift of $\pi$ between the two amplitudes. With the previous in mind Eq. (4.12) becomes

$U_T = u_o\left(r\,e^{ikz} + trt\,e^{i(kz + 2kh + \pi)}\right).$ (4.13)

The irradiance is calculated in the usual way as

$U_T U_T^* = I_T = r^2 u_o^2\left(1 + t^4 + 2t^2\cos(2kh + \pi)\right) \;\rightarrow\; \Delta\theta = 2kh + \pi = \frac{4\pi nh}{\lambda_o} + \pi$ (4.14)

where the phase difference has been made explicit as well as its dependence on wavelength ($\lambda_o$ is the free space wavelength). Fixing $\lambda_o$ and $n$ it is possible to determine the set of slab thicknesses that would result in constructive interference. The thicknesses are

$\Delta\theta = m2\pi = \frac{4\pi nh}{\lambda_o} + \pi \;\rightarrow\; h = \frac{(2m - 1)\lambda_o}{4n}$ (4.15)

where $m$ is a positive integer greater than zero. $m = 1$ determines the thinnest slab that will support constructive interference and it corresponds to $\frac{\lambda_o}{4n}$. Notice that for $m = 1$ the slab is a quarter-wavelength thick (in the material, since it is divided by $n$). For other orders $m$ the thickness is an odd multiple of this quarter wave thickness.
The quarter-wave condition (although it corresponds to a $\pi/2$ phase shift) is a manifestation of the extra $\pi$ phase shift experienced by the two interfering paths. Conversely, if the slab thickness is fixed, the spectral irradiance (the interferogram as a function of wavelength) will exhibit maxima at

$m2\pi = \frac{4\pi nh}{\lambda_o} + \pi \;\rightarrow\; \lambda_o = \frac{4nh}{2m - 1}.$ (4.16)

From the previous, with $m = 1$, the longest wavelength to exhibit constructive interference will be $4nh$. Additionally, as $m$ increases the wavelengths exhibiting constructive interference become progressively closer together.
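Eq. (4.15) is easy to tabulate. A minimal sketch, with an assumed example wavelength of 500 nm (not from the text) and the glass-like index $n = 1.5$:

```python
# Slab thicknesses giving constructive interference in reflection, Eq. (4.15):
# h = (2m - 1) * lambda_o / (4 n). Assumed example: lambda_o = 500 nm, n = 1.5.
lam, n = 500e-9, 1.5
thicknesses = [(2 * m - 1) * lam / (4 * n) for m in (1, 2, 3)]
print([f"{h*1e9:.1f} nm" for h in thicknesses])  # ['83.3 nm', '250.0 nm', '416.7 nm']
```

The thicknesses are odd multiples of the 83.3 nm quarter-wave layer, as the text describes.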
In Fig. 4.8a) the irradiance is plotted as a function of $\lambda_o$ (from 0.4 to 1 µm), assuming the slab has a refractive index of 1.5 (approximately the refractive index of glass) and a thickness of 0.75 µm. In this wavelength window, constructive interference maxima are expected at 0.409, 0.5, 0.643 and 0.9 µm (these correspond to $m = 6, 5, 4, 3$ in order of shortest wavelength to longest wavelength) and this is observed in the figure. Notice the magnitude of the

Figure 4.8 Thin film interference interferograms a) Spectral interferogram for light reflected from a glass slab ($n = 1.5$) of thickness 0.75 µm. b) The same slab as in a), but with the refractive index increased to 3.5. c) The interferogram as a function of wedge thickness.

constructive interference peaks. These are determined by the refractive index contrast of the slab with air; larger contrast means bigger reflection. In Fig. 4.8b) the index is changed to 3.5. Notice the magnitude of the constructive interference peaks increases and the number of maxima increases. The latter is to be expected since the wavelength in the medium is now shorter. Lastly, Fig. 4.8c) presents the interferogram that would be measured as a function of wedge thickness.
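The maxima quoted for Fig. 4.8a) can be checked directly from Eq. (4.16):

```python
import numpy as np

# Constructive-interference wavelengths for the slab of Fig. 4.8a),
# lambda_o = 4 n h / (2m - 1), Eq. (4.16), with n = 1.5 and h = 0.75 um.
n, h = 1.5, 0.75e-6
wavelengths_um = [4 * n * h / (2 * m - 1) * 1e6 for m in (3, 4, 5, 6)]
print(np.round(wavelengths_um, 3))  # 0.9, 0.643, 0.5 and 0.409 um, as in the text
```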
Chapter 5

Interlude on coherence

In the previous chapters we discussed simple waves and the interference that
can result when two of these waves are superposed. We learned that when the waves are coherent they can interfere and, if their amplitudes are equal, the observed irradiance is found to be $I_T = 2I_o(1 + \cos\Delta\phi)$ where $\Delta\phi$ is the phase difference between the two waves at the observation point, see Eq. (??). If the waves are
incoherent then their joint irradiance is simply I T = I 1 + I 2 which equals 2I o if the
amplitudes of the waves are equal. A measure called the visibility, see Eq. (3.12),
was introduced to quantify the strength of interference. For the coherent case
the visibility is 1 and for the incoherent scenario the visibility is 0. Said another
way, the ability of a light source to produce interference fringes in an amplitude
or wave-front splitting interferometer is determined by the source’s coherence
properties. We will find that a visibility equal to 0 or 1 represents the two extreme
cases of coherence, and sources of light can exist that are partially coherent. Partially
coherent sources generate interference fringes of reduced visibility (< 1). In this
chapter we will show how we can use the visibility as an indicator of a light source’s
coherence properties and we will derive an expression for the irradiance seen by
our detector plane for the two cases illustrated in Fig. 5.1.
We will build on this operational definition of coherence (i.e. the ability to
generate fringes of finite visibility) to try and make sense of coherence effects
in wave optics. The aspects presented in this chapter are designed to give some
physical optics interpretation to coherence effects. In optics, coherence can come in many different flavors; at the level of wave optics we will be concerned with spatial and temporal coherence (for example, we ignore polarization coherence).
Connecting with our mathematical model of wave optics, to quantify spatial
coherence we will study the similarity of an optical disturbance at two points in
space at the same instance of time and to quantify temporal coherence we will
study the similarity of an optical disturbance at two instances of time at the same
location in space. Remember, to this point we have only considered two of the
simplest optical disturbances, monochromatic spherical and plane waves. Both
of these optical disturbances are spatially and temporally coherent. More general
optical disturbances need not be fully coherent.



Figure 5.1 Probing spatial and temporal coherence. In this chapter we will examine the visibility of the interferogram in a Young's interferometer for two different source configurations. Configuration one is an extended monochromatic source. Configuration two is a point source with an extended spectrum - the source is polychromatic.

Returning to spatial coherence, we are interested in understanding the similarity between $U(\mathbf{r}_1, t)$ and $U(\mathbf{r}_2, t)$. One way to access these two quantities is in an interferometer. Remember an interferometer takes two sources and produces a signal that depends on $U(\mathbf{r}_1, t)U^*(\mathbf{r}_2, t)$. Recall the Young's interferometer samples two locations of an optical disturbance (we labeled these locations A and B) and superposes them on a detector downstream. These cross-terms are sensitive to both the complex envelopes and phases of the contributing complex wavefunctions. To quantify temporal coherence we will be interested in $U(\mathbf{r}, t_1)$ and $U(\mathbf{r}, t_2)$. Again, an interferometer can reveal this relationship as it produces a signal that depends on $U(\mathbf{r}, t_1)U^*(\mathbf{r}, t_2)$. The Michelson interferometer with unbalanced arms will deliver this time-delayed product to the detector, as will the off-optical-axis points of a Young's interferometer.
Unfortunately, a complete analysis of coherence effects in optics is built on
the theory of stochastic processes! This is well outside the scope of our text, but, it
turns out, by considering some rather simple examples of more complicated light
sources we can still gain some physical optics intuition into coherence. Common to both of these situations is that the source has been extended. In one scenario we will introduce a source plane that contains a spatially extended source that consists of monochromatic spherical wave emitting point sources. This will reveal the connection between source size and spatial coherence. Next, we will consider a single point source that emits many spherical waves of different color; the
source is polychromatic. Remarkably, the final expression will look no different

than what we have discovered previously except the visibility will be found to
depend on the extended source properties. In fact we will find

$I = 2I_o[1 + \cos(\Delta\phi)] \;\rightarrow\; 2I_o[1 + V\cos(\Delta\phi)]$ (5.1)

when finite size effects are included in the source.

5.1 Intuition about coherence


Before embarking on a detailed calculation connecting visibility and source prop-
erties we can develop some intuition by considering a somewhat simplified situa-
tion. We will focus on spatial coherence, but we could apply similar reasoning to
temporal coherence. Recall with the Young’s double slit (the same holds true with
the Michelson interferometer) that regardless of whether the interferometer is
illuminated by a monochromatic point source at infinity - a long winded way of
saying a monochromatic plane wave - or a point source at a finite distance from
the interferometer, the recorded visibility is 1 provided each path receives the
same complex amplitude. This is an important point:

monochromatic point source illumination is always spatially and temporally


coherent

Consider the scenario illustrated in Fig. 5.2. In this example, we will imagine that in addition to the point source on axis (labeled $S_1$) there is a second source located at $S_2$, a distance $\Delta s$ from $S_1$. Each one of these sources generates its own interferogram, also labeled accordingly. The question we want to ask is how far $S_2$ needs to


Figure 5.2 Spatial coherence intuition. Fringes from two spherical sources. Source $S_2$ is displaced a distance $\Delta s$ from the on-axis source $S_1$. The inset is the assumed power spectrum of the source - it is monochromatic. From this setup we can determine what displacement of $S_2$ spoils the interference fringes of $S_1$.

be shifted so that the first minimum of $S_1$ is located at the same detector location as the $m = 0$ maximum fringe of $S_2$. The minimum of $S_1$ is located at $x_p^{min} = \frac{\lambda L_z}{2d}$. From similar triangles (see Fig. 5.2) we can locate the maximum of $S_2$ at $x_p^{max} = \frac{L_z \Delta s}{L_s}$. If we equate these two we find

$\frac{\lambda L_z}{2d} = \frac{L_z \Delta s}{L_s} \;\rightarrow\; \Delta s = \frac{\lambda L_s}{2d}$ (5.2)
which is the desired displacement. So, if the second source is displaced by ∆s
the recorded visibility is zero. This result can be reorganized in the following way.
First notice that the angle subtended by the source at the slit plane is θs = 2∆s /L s .
Notice if we divide the wavelength λ by θs we find a quantity with units of length.
This observation leads to the following definition

$\rho_c = \frac{\lambda}{\theta_s}$ (5.3)
which we call the transverse coherence length. Why do we call this the transverse coherence length? The transverse coherence length captures the ability of the source to produce interference fringes in the Young's interferometer. If $\rho_c < d$ then $\Delta s$ is bigger than $\lambda L_s/2d$ and the interference fringes disappear, whereas if $\rho_c > d$ then there will be observable fringes on the detector and $\Delta s$ is less than $\lambda L_s/2d$. We can say this another way: if the source presents a transverse coherence length that is larger than the slit separation then the waves leaving the slits will interfere; otherwise the interference is suppressed.
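As an illustrative sketch of Eq. (5.3) (the solar angular size and the wavelength below are assumed example values, not from the text):

```python
# Transverse coherence length, Eq. (5.3): rho_c = lambda / theta_s.
# Assumed example: the Sun subtends roughly 9.3 mrad; at 500 nm this gives
# a coherence length of tens of micrometres at the slit plane.
lam = 500e-9          # wavelength [m] (assumed value)
theta_s = 9.3e-3      # angular size of the source [rad] (assumed value)
rho_c = lam / theta_s
print(f"{rho_c*1e6:.1f} um")  # 53.8 um; pinholes closer than this see fringes
```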
A final point. It is clear from the previous that if our source $S_2$ displaces by more than $\Delta s$ we will recover fringes with only two sources. Since in the end we will be concerned with sources that fill in the region between $S_1$ and $S_2$, our previous reasoning is almost correct. In fact, we will find that for slit separations larger than $\Delta s$ there will be a recovery of visibility, but it will not generate fringes that exhibit visibilities on the order of unity. Using the transverse coherence length, we can determine a coherence area $A_c$ presented by the source to the optical system (pinhole plane in this example) as $A_c = \pi\left(\frac{\rho_c}{2}\right)^2$. This is the area throughout which one could sample the impinging wave with two pinholes and still see interference.
A similar story can be constructed in the context of temporal coherence, but now we keep our source fixed on axis and allow its spectrum to broaden; see Fig. 5.3. Important is that each spectral component in the broadened source does not interfere with light of a different color/wavelength. To get started, let's express the phase difference at the observation point as

$\Delta\theta = \frac{2\pi d x_p}{\lambda L_z} = 2\pi f\,\frac{d x_p}{c L_z} = \omega\tau$ (5.4)

where we have made explicit the difference in time, $\tau = \frac{d x_p}{c L_z}$, it takes for optical energy to propagate from pinhole A to the observation point and from pinhole B to the observation point, and have used the relationship $f\lambda = c$. Next we imagine the source contains two spectral sources $S_1$ and $S_2$ with frequencies $f_1$ and $f_2$. We assume source $S_1$ exhibits its first minimum at a location where $S_2$ has its second
Figure 5.3 Temporal coherence intuition. A bi-chromatic point source delivers two different spherical waves to the Young's pinhole plane. The blue color (solid curve) is such that its second maximum overlaps with the first null of the red source (dashed line). From this construction we can determine how the spectral width of the source results in loss of interference visibility on the detector.

maximum. This means $2\pi f_1\tau = \pi$ and $2\pi f_2\tau = 2\pi$. At this particular location on the detector

$2\pi\left(f_2 - f_1\right)\tau = \pi \;\rightarrow\; \tau = \frac{1}{2\delta f}$ (5.5)
where $\delta f = f_2 - f_1$ is a measure of the spectral spread of the source. Since we have assumed the source in Fig. 5.3 is symmetric about $f_1$, let's define $\Delta f = 2\delta f$ as the source bandwidth. The previous means that when the observation point on the detector corresponds to a time delay equal to the inverse of the source's spectral bandwidth, interference will no longer occur. This leads us to define the source coherence time $\tau_c$ as

$\tau_c = \frac{1}{\Delta f}.$ (5.6)

Using the speed of light $c$, the coherence time can be related to a longitudinal coherence length $l_c$ as

$l_c = c\tau_c$ (5.7)

where $l_c$ is a measure of the path length difference above which interference no longer occurs.
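Eqs. (5.6) and (5.7) in a short sketch; the bandwidth values below are illustrative assumptions, not from the text:

```python
# Coherence time and longitudinal coherence length, Eqs. (5.6)-(5.7):
# tau_c = 1/(delta f), l_c = c * tau_c.
c = 3e8  # speed of light [m/s], approximate

def coherence_length(bandwidth_hz):
    """Longitudinal coherence length l_c = c / (delta f)."""
    return c / bandwidth_hz

print(coherence_length(1e9))   # 0.3 m: a narrow 1 GHz laser-like line
print(coherence_length(3e14))  # 1e-06 m: a broad, white-light-like bandwidth
```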
Figure 5.4 illustrates how lack of complete coherence, so-called partial coherence, influences the structure of a Young's two pinhole interferogram. The upper left panel illustrates the loss of interference as a function of position on the detector due to a spectrally broad source that is spatially coherent. The lower left panel zooms in on the region that contains the detector location of zero optical path length difference (synonymous with zero time delay). In contrast, the upper right panel illustrates what happens as the source being interrogated goes from exhibiting full spatial coherence (unit visibility, dashed line) to successively

Figure 5.4 Coherence influence on interferograms. Panels: upper left, temporal; upper right, spatial; lower left, temporal (zoomed); lower right, spatial and temporal. Vertical axes are $I_T/I_o$; horizontal axes are $x_p$ [a.u.].

lower degrees of spatial coherence. The lack of complete spatial coherence is manifested in the reduced visibility of the interferogram across the detector. The lower right panel illustrates the combined effect of partial spatial and temporal coherence. The lack of complete spatial coherence is revealed by a visibility everywhere across the detector that is less than one, and the partial temporal coherence is evident from the loss of interference for regions of the detector away from the central maximum. Bringing the coherence length together with the coherence area we can define the spectrally broad and extended source's coherence volume $V_c$ as $V_c = A_c l_c = \pi\left(\frac{\rho_c}{2}\right)^2 l_c$. This is the volume throughout which one can sample with a pair of pinholes and still observe interference.

5.2 Spatial coherence


In this section we will develop a quantitative model of the spatial coherence prop-
erties of an extended monochromatic source. Figure 5.5 is an illustration of the
system we will consider. New in this apparatus as compared to our previous exam-
ination of the Young’s interferometer is the incorporation of an extended source
(of width 2∆) of spherical wave radiators on the left hand side of the slit plane.
The distance between the source and pinhole screen is L s . In developing the full
expression for the measured irradiance in the detector plane at a location x p it is
useful to recall that the irradiance $I(x_p, L_z)$ results from the superposition of

Figure 5.5 Spatial coherence. Setup to examine the spatial coherence properties of an
extended monochromatic source.

the complex amplitudes that leave slit A and slit B

$I(x_p, L_z) = \left| U_A(x_a)\frac{e^{-ikr_{AP}}}{r_{AP}} + U_B(x_a)\frac{e^{-ikr_{BP}}}{r_{BP}} \right|^2.$ (5.8)

where we have made explicit the location in the plane of the slits, $x_a$, at which each complex amplitude is generated. Notice the irradiance from each of the slits propagates to the detector as a spherical wave. The distance from slit A is labeled $r_{AP}$ and from slit B is labeled $r_{BP}$. Previously $U_A$ and $U_B$ were taken as complex numbers. In the current discussion, to determine $U_A$ and $U_B$ we look back from the slit plane to the source plane to determine the complex numbers that feed the detector plane. The sources are assumed to be located at discrete locations labeled $x_s^{(j)}$ in the source plane. Each source point generates a monochromatic wave at location $x_s^{(j)}$ of complex amplitude $U_s(x_s^{(j)})$ with initial phase angle $e^{i\delta(x_s^{(j)})}$ that propagates to slit A acquiring a spatial phase $e^{-ikr_{SA}^{(j)}}$ and to slit B acquiring a spatial phase $e^{-ikr_{SB}^{(j)}}$.
With the previous, we can expand Eq. (5.8) to reflect the source point that delivers the optical disturbance arriving at $x_p$. We find

$\left| \sum_j \left[ U_s(x_s^{(j)})\frac{e^{-i[k(r_{SA}^{(j)} + r_{AP}) - \delta(x_s^{(j)})]}}{r_{AP}} + U_s(x_s^{(j)})\frac{e^{-i[k(r_{SB}^{(j)} + r_{BP}) - \delta(x_s^{(j)})]}}{r_{BP}} \right] \right|^2$ (5.9)

where the sum over $j$ indicates each point in the source plane contributes to the detected signal at $x_p$. If the sum does not make things bad enough, we also need to multiply the sum by its complex conjugate! We can simplify the appearance of Eq. (5.9) if we call the entire first term $U_A^{(j)}$ and the second term $U_B^{(j)}$. With these

definitions we rewrite Eq. (5.9) as

$\sum_j \left[ U_A^{(j)} + U_B^{(j)} \right] \sum_m \left[ U_A^{(m)*} + U_B^{(m)*} \right] =$ (5.10)

$\sum_{jm} \left[ (U_A^{(j)} + U_B^{(j)})(U_A^{(m)*} + U_B^{(m)*}) \right]$ (5.11)

where the complex conjugate is indicated by $*$. There is a physical reason for the two different sums. At the outset there is no reason why the wave generated at source location $x_s^{(j)}$ should not interfere with the wave generated at source location $x_s^{(m)}$ when it arrives at slit A or slit B. The double sum captures this physical fact.
That said, in our current model, we will neglect the possibility of different source locations interfering. In our derivation it is the waves' initial phase angles that capture this. We can see this explicitly by considering one of the product terms in Eq. (5.11). The same reasoning will apply to each of the four terms. Looking at what arrives at slit A we find

$U_A^{(j)} U_A^{(m)*} = U_s(x_s^{(j)})\,U_s^*(x_s^{(m)})\,\frac{e^{-i\left[k(r_{SA}^{(j)} - r_{SA}^{(m)}) - (\delta(x_s^{(j)}) - \delta(x_s^{(m)}))\right]}}{r_{AP}^2}$ (5.12)

where we notice, in addition to the propagation phase from source to slit, the initial phase angle difference is also relevant. Recall that the irradiance is the time average of the total optical disturbance function squared and that we have used this synonymously with the magnitude squared of the complex wavefunction. But, when we considered the case of superposing two monochromatic plane waves of different angular frequency we needed to be careful about the time dependence. That result exhibited modulation of the interferogram at the difference frequency of the two waves. If that difference frequency was too fast then the interference term would average to zero and the two waves are said to be incoherent. The same reasoning applies here except the time dependence is implicit in the initial phase angles and this is what spoils the interference. We imagine each source has an initial phase angle that can change as a function of time and these phase variations are uncorrelated between different locations in the source. Focusing only on the time varying pieces, we quantify this intuition by writing

$\langle U_A^{(j)} U_A^{(m)*} \rangle_t = \langle e^{i(\delta(x_s^{(j)}) - \delta(x_s^{(m)}))} \rangle_t = \delta_{j,m}$ (5.13)

where we have again the notation for time average < f (t ) >t of a function and
we also introduced the Kronecker delta function δ j ,m that equals 1 if j = m and
equals 0 if j 6= m.
The delta function collapses the double sum in Eq. (5.18) back into a single
sum. This happens since only the terms where j = m get multiplied by 1. The
other j 6= m terms are multiplied by 0. The delta function captures the fact that
(j)
the source points located at x s and x s(m) do not interfere! Utilizing the delta

function and multiplying out the product in Eq. (5.11) we find

$\sum_j \left[ U_A^{(j)}U_A^{(j)*} + U_B^{(j)}U_A^{(j)*} + U_A^{(j)}U_B^{(j)*} + U_B^{(j)}U_B^{(j)*} \right].$ (5.14)

It is now possible to evaluate each term by considering the given geometry. Term by term the result is

$U_A^{(j)}U_A^{(j)*} = \frac{I_s(x_s^{(j)})}{r_{AP}^2}$ (5.15)

$U_B^{(j)}U_B^{(j)*} = \frac{I_s(x_s^{(j)})}{r_{BP}^2}$ (5.16)

$U_A^{(j)}U_B^{(j)*} = \frac{I_s(x_s^{(j)})\,e^{ik(r_{SB}^{(j)} - r_{SA}^{(j)})}\,e^{ik(r_{BP} - r_{AP})}}{r_{AP}\,r_{BP}}$ (5.17)

$U_B^{(j)}U_A^{(j)*} = \frac{I_s(x_s^{(j)})\,e^{-ik(r_{SB}^{(j)} - r_{SA}^{(j)})}\,e^{-ik(r_{BP} - r_{AP})}}{r_{AP}\,r_{BP}}$ (5.18)
(j) (j) (j)
where I s (x s ) = U s (x s )U s∗ (x s )we can now apply on both the source side and
detector side the paraxial approximation to the previous expression. The paraxial
approximation amounts to constraining L s >> |∆s | and L z >> x p . In the phase,
on the detector side this leads to (see Sec. ***) r B P − r AP = kx p /L z and on the
source sider SB − r S A =. In the complex envelope we can write r AP u r B P = L z .
Bringing this all together we have

\[
\sum_j \frac{2 I_s(x_s^{(j)})}{L_z^2} + \sum_j \frac{I_s(x_s^{(j)})}{L_z^2} e^{ikd x_p/L_z} e^{ikd x_s^{(j)}/L_s} + \sum_j \frac{I_s(x_s^{(j)})}{L_z^2} e^{-ikd x_p/L_z} e^{-ikd x_s^{(j)}/L_s} \tag{5.19}
\]

which we will rewrite as
\[
\frac{I_{tot}}{L_z^2} \left[ 1 + e^{ikd x_p/L_z} \frac{g_{AB}}{2} + e^{-ikd x_p/L_z} \frac{g_{AB}^*}{2} \right] \tag{5.20}
\]
where I_tot = Σ_j 2 I_s(x_s^{(j)}) and
\[
g_{AB} = \frac{\sum_j I_s(x_s^{(j)})\, e^{ikd x_s^{(j)}/L_s}}{I_{tot}}. \tag{5.21}
\]
Two interesting things have started to emerge. First, Eq. (5.20) is starting to look very much like the expression we derived for the interference of two simple sources, except the cross term now depends on the complex number g_AB. Second, g_AB depends only on the properties of the source! It is useful to express g_AB as
\[
g_{AB} = |g_{AB}|\, e^{i\angle g_{AB}} \tag{5.22}
\]

which results in
\[
I_{tot} \left[ 1 + \frac{|g_{AB}|}{2} e^{i(kd x_p/L_z + \angle g_{AB})} + \frac{|g_{AB}|}{2} e^{-i(kd x_p/L_z + \angle g_{AB})} \right] \tag{5.23}
\]
74 Chapter 5 Interlude on coherence

or the more familiar expression
\[
I_{tot} \left[ 1 + |g_{AB}| \cos(kd x_p/L_z + \angle g_{AB}) \right]. \tag{5.24}
\]

Finally, when all the dust settles, we have an expression for the irradiance that looks very much like our earlier results! From the definition of visibility in Eq. (3.12) we find with the previous expression that the visibility V is
\[
V = |g_{AB}|. \tag{5.25}
\]
Notice, from how g_AB is defined, its magnitude varies between 0 and 1. It should also be clear the phase of g_AB determines the location of the interferogram's m = 0 fringe.
So, what is g_AB? First, in optical coherence theory it is called the mutual intensity of the optical disturbance. From how it appears in the interference expression, its magnitude determines the strength of interference and it is therefore related to the source's spatial coherence properties. Notice that it is a two-point relation. What I mean by this is the visibility of the interferogram is determined by how correlated the complex amplitude at slit A is with the complex amplitude at slit B. For this reason g carries the subscripts A and B. For our geometry, the two points are located at d/2 and −d/2 in the plane of the slits. We can think of the slits as sampling the source distribution at 2 points and delivering them to a detector that registers an interferogram with a visibility that quantifies the coherence between the two sampled points.
With the previous intuition, we can unpack g_AB to understand the coherence length ρ_c and coherence area A_c the source presents to the aperture plane. For this we will imagine that we fill the source-plane region of width 2Δ_s with a continuum of point sources, not a collection of discrete point sources. Ignoring any of the formalities in doing this, our sum over j will be replaced by an integral over the source plane, ∫ dx_s. With this substitution we have
\[
g_{AB} = \frac{\int I_s(x_s)\, e^{ikd x_s/L_s}\, dx_s}{I_{tot}}. \tag{5.26}
\]
I tot

We will rewrite the phase as
\[
\frac{kd x_s}{L_s} = \frac{2\pi d x_s}{\lambda L_s} = 2\pi \nu_x x_s \tag{5.27}
\]
where we have defined
\[
\nu_x = \frac{d}{\lambda L_s}. \tag{5.28}
\]
Notice ν_x carries units of inverse meters, so it is a spatial frequency! The integral we have just encountered can be written as
\[
g_{AB} = \frac{\int I_s(x_s)\, e^{i 2\pi \nu_x x_s}\, dx_s}{I_{tot}} \tag{5.29}
\]

and the numerator is called a Fourier transform. We will adopt the following notation for Fourier transforms in this text
\[
I(\nu_x) = \int I_s(x_s)\, e^{i 2\pi \nu_x x_s}\, dx_s. \tag{5.30}
\]

In the coming chapters we will also find the Fourier transform emerges naturally in the analysis of far-field diffraction patterns. Equation (5.29) is a special case of a more general result from optical coherence theory known as the van Cittert-Zernike theorem. In words, the van Cittert-Zernike theorem relates the spatial coherence properties of a source in a plane to the Fourier transform of the source irradiance!
Time for a deep breath and to ask, what does this all mean? We started out wanting to understand the visibility of an interferogram, since our intuition was that the strength of interference is an indicator of the source's spatial coherence properties. We have found that the Fourier transform of our source irradiance determines our interferogram's visibility.
But how do we use this? The first step is to find I(ν_x). Operationally, you take the Fourier transform of the source distribution, or more plainly evaluate the integral in Eq. (5.30). The integral returns a function I(ν_x) of the spatial frequency ν_x. The question now becomes how to relate this new distribution to a specific slit separation d. To do this we remember that ν_x = d/λL_s. So, given a particular slit separation, I can figure out the visibility by first finding ν_x = d/λL_s, the spatial frequency associated with the given geometry, and then determining the magnitude of the Fourier transform at this ν_x,
\[
|I(\nu_x)|. \tag{5.31}
\]
With the previous we can figure out the modulation depth of the interferogram produced by any spatially extended source that is composed of uncorrelated point radiators!
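To make the recipe concrete, here is a short numerical sketch (not taken from the text; the wavelength, source width, and geometry below are arbitrary illustrative choices) that evaluates g_AB for a uniform one-dimensional source and compares it to the analytic sinc transform of a uniform slit:

```python
import numpy as np

# Hypothetical geometry, chosen only for illustration:
lam, Ls = 633e-9, 1.0        # wavelength [m] and source-to-slit distance [m]
half_width = 50e-6           # source half-width Delta_s [m]
d = 2e-3                     # slit separation [m]

nu_x = d / (lam * Ls)        # spatial frequency probed by this slit pair

# Numerically evaluate g_AB = (1/I_tot) * Int I_s(x_s) e^{i 2 pi nu_x x_s} dx_s
xs = np.linspace(-half_width, half_width, 20001)
dx = xs[1] - xs[0]
Is = np.ones_like(xs)                      # uniform irradiance across the source
num = np.sum(Is * np.exp(1j * 2 * np.pi * nu_x * xs)) * dx
Itot = np.sum(Is) * dx
V = abs(num / Itot)                        # visibility = |g_AB|

# For a uniform source the transform is a sinc (np.sinc(x) = sin(pi x)/(pi x))
V_analytic = abs(np.sinc(2 * nu_x * half_width))
print(V, V_analytic)
```

Sweeping d (and hence ν_x) traces out the full visibility curve and locates the slit separation where the fringes first wash out.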
Let's use this machinery for a simple example. First, we will assume our source is a monochromatic point source that radiates an irradiance I_o = I_tot. To mathematically represent this point source we introduce the Dirac delta function. The Dirac delta function is the continuous-variable analog of the Kronecker delta. Technically the Dirac delta is a distribution that only exists to be integrated and has unit area. For wave optics we will define the Dirac delta function as
\[
\delta(x - x_o) = 1 \quad \text{when } x = x_o \tag{5.32}
\]
\[
\delta(x - x_o) = 0 \quad \text{when } x \neq x_o \tag{5.33}
\]
and will not be concerned with the mathematical subtleties regarding its definition. With the previous definition we have a mathematical function to represent point sources of irradiance and complex amplitude. Notice, the Dirac delta is equal to 1 when its argument is 0 and is zero otherwise. For δ(x − x_o) this happens when x = x_o. We will also encounter δ(x + x_o). This function is one when x = −x_o, since this makes the argument of δ(x + x_o) equal zero.

The Dirac delta function has a useful property called the sifting property, which details how to use the Dirac delta function in an integral. Specifically,
\[
\int f(x)\, \delta(x - x_o)\, dx = f(x_o) \tag{5.34}
\]
so that a function integrated against the delta function has its value f(x_o) sifted out of all its possible values. Recall f(x) is a catalog (or table) of values relating a number to each independent-variable value x; when integrating against the Dirac delta function the result is one number, the value of f at x_o.
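The sifting property can be checked numerically by standing in for the Dirac delta with a narrow, unit-area Gaussian (the width and test function below are arbitrary choices, not from the text):

```python
import numpy as np

# Model delta(x - xo) as a narrow normalized Gaussian of width sigma.
sigma, xo = 1e-3, 0.7
x = np.linspace(-5.0, 5.0, 200001)
dx = x[1] - x[0]
delta_approx = np.exp(-(x - xo) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

f = np.cos(x)                            # any smooth test function f(x)
area = np.sum(delta_approx) * dx         # unit area, as a delta must have
sifted = np.sum(f * delta_approx) * dx   # approximates Int f(x) delta(x - xo) dx
print(sifted, np.cos(xo))                # the integral sifts out the value f(xo)
```

As sigma shrinks (while keeping the grid fine enough to resolve the spike), the sifted value converges to f(x_o) exactly.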
With the previous definitions it is possible to determine g_AB for a monochromatic point source of irradiance I_o located at the origin. Our mathematical representation of this point source is
\[
I_s(x_s) = I_o\, \delta(x_s) \tag{5.35}
\]
where x_o = 0, so that the delta function is 1 at x_s = 0, the origin. To determine the visibility it is necessary to evaluate Eq. (5.29). Evaluating this we find
\[
g_{AB} = \frac{\int I_o\, \delta(x_s)\, e^{ikd x_s/L_s}\, dx_s}{I_{tot}} = \frac{I_o\, e^{ikd \cdot 0/L_s}}{I_{tot}} = \frac{I_o}{I_{tot}} \tag{5.36}
\]
where we have substituted in our definition of an irradiance point source and used the sifting property of the delta function. Since I_o = I_tot this means g_AB = 1! The interferogram has unit visibility. This is exactly what we would expect for a monochromatic point source. Here is a question: what happens if the point source is located at x_s = x_o? How do you represent this point source mathematically, and what is the resultant interferogram? Physically we have displaced the point source off the optical axis in the source plane.
The next example we consider is a two-dimensional circular aperture. Although we have developed our expressions for a one-dimensional source plane, it is straightforward to extend the previous discussion to 2D. First, we find g_AB becomes
\[
g_{AB} = \frac{\int\!\!\int I_s(x_s, y_s)\, e^{i 2\pi(\nu_x x_s + \nu_y y_s)}\, dx_s\, dy_s}{I_{tot}} \tag{5.37}
\]
where the source has been extended along the y-direction and we now need to take a 2D Fourier transform. To determine the visibility of an interferogram in a Young's interferometer it is necessary to orient a linecut through the 2D Fourier transform so that it is aligned with the line connecting the two pinholes in the Young's aperture. Imagine our source is a disk of radius Δ_s.
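Although the derivation above is one-dimensional, the 2D transform of a uniform disk can be explored numerically. The sketch below (an illustration with an arbitrary disk radius, not a result quoted from the text) brute-forces the ν_y = 0 linecut of Eq. (5.37); for a uniform disk the result follows 2J₁(u)/u with u = 2πν_xΔ_s, whose first zero sits at u ≈ 3.8317:

```python
import numpy as np

R = 100e-6                                 # disk radius Delta_s [m], arbitrary
x = np.linspace(-R, R, 401)
dA = (x[1] - x[0]) ** 2
X, Y = np.meshgrid(x, x)
inside = (X ** 2 + Y ** 2) <= R ** 2       # uniform irradiance on the disk

def g_AB(nu_x):
    # linecut (nu_y = 0) of the 2D Fourier transform, normalized by I_tot
    num = np.sum(np.exp(1j * 2 * np.pi * nu_x * X) * inside) * dA
    return num / (np.sum(inside) * dA)

nus = np.linspace(0.0, 8000.0, 161)        # spatial frequencies [1/m]
g = np.array([g_AB(nu).real for nu in nus])
# the first sign change of g locates the first zero of the visibility
```

The spatial frequency of that first zero, converted back through ν_x = d/λL_s, gives the slit separation at which the disk source first produces zero fringe visibility.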

5.3 Temporal coherence

In this section we adapt the previous section's discussion to a point source with an extended spectrum. An illustration of the setup considered is shown in Fig. 5.6.
Figure 5.6 Temporal coherence. Setup to examine the temporal coherence properties of a polychromatic point source.

Although the spectrum is continuous, we will again begin by assuming the spectrum is discrete and then take the continuum limit in our final expression. In the derivation below we will find it convenient to measure the frequency from the center of the spectrum. With the previous, the frequency will be written as f^{(j)} = f_o + f_c^{(j)}, where f_c^{(j)} is the frequency measured from the spectrum's center frequency f_o. These frequencies are all illustrated in Fig. 5.6.
If the spectral complex amplitude is expressed as U_S(f^{(j)}), and we further assume that each spectral component radiates with an initial phase that is uncorrelated with the other spectral components, we find the irradiance measured on our detector can be expressed as
\[
\sum_j U_A(f^{(j)}) U_A^*(f^{(j)}) + U_B(f^{(j)}) U_B^*(f^{(j)}) \tag{5.38}
\]
\[
\quad + U_A(f^{(j)}) U_B^*(f^{(j)}) + U_B(f^{(j)}) U_A^*(f^{(j)}) \tag{5.39}
\]
where U_i(f^{(j)}) denotes the complex amplitude from slit i with frequency j that arrives at the detector location. We had a similar expression in the previous section, where j denoted the source location instead of the spectral component.
It is now possible to evaluate each term by considering the given spectrum. Term by term the result is
\[
U_A(f^{(j)}) U_A^*(f^{(j)}) = \frac{I_s(f^{(j)})}{r_{AP}^2} \tag{5.40}
\]
\[
U_B(f^{(j)}) U_B^*(f^{(j)}) = \frac{I_s(f^{(j)})}{r_{BP}^2} \tag{5.41}
\]
\[
U_A(f^{(j)}) U_B^*(f^{(j)}) = \frac{I_s(f^{(j)})\, e^{i k^{(j)}(r_{BP} - r_{AP})}}{r_{AP}\, r_{BP}} \tag{5.42}
\]
\[
U_B(f^{(j)}) U_A^*(f^{(j)}) = \frac{I_s(f^{(j)})\, e^{-i k^{(j)}(r_{BP} - r_{AP})}}{r_{AP}\, r_{BP}} \tag{5.43}
\]
where I_s(f^{(j)}) = U_s(f^{(j)}) U_s^*(f^{(j)}) and k^{(j)} = 2π f^{(j)}/c. Referring to Fig. 5.6, we will decompose the linear frequency in terms of the center frequency f_o and a frequency measured from the center frequency, f_c^{(j)}. With this substitution, and the paraxial approximation, the spatial phase can be expressed as
\[
e^{i k^{(j)}(r_{BP} - r_{AP})} \;\rightarrow\; e^{i 2\pi \frac{f_o x_p d}{c L_z}}\, e^{i 2\pi \frac{f_c^{(j)} x_p d}{c L_z}} \tag{5.44}
\]

where it has factorized into a constant phase dependent only on the central frequency chosen and a piece dependent on the spectral component. Notice, x_p d/(c L_z) carries units of seconds and is the difference in the time it takes for light to travel to the detector from slit A and from slit B. Let's define τ = x_p d/(c L_z) and I_tot = Σ_j I_s(f^{(j)}) to write the total irradiance at detector point (x_p, L_z) as
\[
\frac{2 I_{tot}}{L_z^2} + \sum_j \frac{I_s(f_c^{(j)})}{L_z^2} e^{i 2\pi f_o \tau} e^{i 2\pi f_c^{(j)} \tau} + \sum_j \frac{I_s(f_c^{(j)})}{L_z^2} e^{-i 2\pi f_o \tau} e^{-i 2\pi f_c^{(j)} \tau} \tag{5.45}
\]

which we will rewrite as
\[
\frac{I_{tot}}{L_z^2} \left[ 1 + e^{i 2\pi f_o \tau} \frac{g_{AB}(\tau)}{2} + e^{-i 2\pi f_o \tau} \frac{g_{AB}^*(\tau)}{2} \right] \tag{5.46}
\]
where
\[
g_{AB}(\tau) = \frac{\sum_j I_s(f_c^{(j)})\, e^{i 2\pi f_c^{(j)} \tau}}{I_{tot}}. \tag{5.47}
\]
In the previous, g_AB(τ) is the time-domain analog of the mutual intensity and is often called the complex degree of coherence.
The next steps are identical to those for the spatially extended source. First, recognizing g_AB(τ) is a complex number, the expression for the irradiance can be cast in the following form
\[
I_{tot} \left( 1 + |g_{AB}(\tau)| \cos[k_o d x_p/L_z + \angle g_{AB}(\tau)] \right) \tag{5.48}
\]
which again looks very much like our earlier results! Notice 2πτf_o has been replaced with k_o d x_p/L_z, where k_o is the wavenumber of the central linear temporal frequency. From the definition of visibility in Eq. (3.12) we find with the previous expression that the visibility V is
\[
V = |g_{AB}(\tau)|. \tag{5.49}
\]
Notice, from how g_AB(τ) is defined, its magnitude varies between 0 and 1. It should also be clear the phase of g_AB(τ) determines the location of the interferogram's m = 0 fringe.
Finally, we will imagine our source spectrum is continuous and not discrete, so that f_c^{(j)} → f_c. Ignoring any of the formalities in doing this, our sum over j will be replaced by an integral over the source spectrum, ∫ df_c. With this substitution we have
\[
g_{AB}(\tau) = \frac{\int I_s(f_c)\, e^{i 2\pi f_c \tau}\, df_c}{I_{tot}} \tag{5.50}
\]
where the numerator is called an inverse Fourier transform. Notice that, although this is an inverse Fourier transform, it has the same sign as Eq. (5.30). The sign convention is a result of how we have chosen to express the phase of our waves. Equation (5.50) is the time-domain analog of the van Cittert-Zernike theorem. In words, this version of the theorem relates the temporal coherence properties of the source to the inverse Fourier transform of the source spectrum!
In discussing spatial coherence, we identified the first zero of |g_AB| with the lateral coherence length of the spatially extended source. We will do the same here and use the first zero crossing of the inverse Fourier transform of the source spectrum to define the coherence time τ_c. The coherence time tells you the relative time delay two waves can experience and still produce interference fringes. Using the speed of light, it is also possible to define the longitudinal coherence length l_c, often simply called the coherence length: l_c = cτ_c. The coherence length is a measure of the largest optical path length difference two waves can sustain before they can no longer interfere.
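As a sketch of how τ_c is extracted in practice (a flat spectrum of arbitrary width is assumed purely for illustration), the code below evaluates |g_AB(τ)| for a rectangular spectrum of full width df, for which the transform is a sinc whose first zero lands at τ_c = 1/df:

```python
import numpy as np

c = 3e8                                    # speed of light [m/s]
df = 1e12                                  # full spectral width [Hz], arbitrary
fc = np.linspace(-df / 2, df / 2, 4001)    # frequency measured from the center
dfc = fc[1] - fc[0]
Is = np.ones_like(fc)                      # flat (rectangular) spectrum
Itot = np.sum(Is) * dfc

taus = np.linspace(0.0, 2.0 / df, 2001)
g = np.array([abs(np.sum(Is * np.exp(1j * 2 * np.pi * fc * t)) * dfc) / Itot
              for t in taus])

tau_c = taus[int(np.argmax(g < 1e-2))]     # first (near-)zero of |g_AB(tau)|
l_c = c * tau_c                            # longitudinal coherence length
print(tau_c * df, l_c)                     # tau_c * df should be close to 1
```

Swapping in a different assumed spectrum (Gaussian, Lorentzian, ...) changes the shape of |g_AB(τ)| but the same reciprocal scaling, τ_c ~ 1/df, survives.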
Chapter 6

Interferometry of N simple sources

In this chapter we will explore interferometry when a countable number of objects interfere. A generalization of the Young's two-pinhole experiment will provide the physical-optics underpinning of diffraction gratings. It will also serve as a springboard for our discussion of diffraction in subsequent chapters. A second example we will study in this chapter is an amplitude-splitting interferometer where the amplitude is split N times. The previous is the basis for the Fabry-Perot etalon and the Fabry-Perot cavity.

6.1 Wavefront splitting interferometers; n=N → the diffraction grating

Consider the scenario illustrated in Fig. 6.1. The illustration in the right-most panel will be analyzed in this section. We imagine an otherwise opaque screen contains N pinholes along a line. Each pinhole samples the assumed spatially coherent illuminating wavefront. Calculation of the total field and irradiance requires the superposition of these N complex amplitudes.
Specifically, the total complex amplitude arriving at the observation point r_p is
\[
U_T(\mathbf{r}_p) = U_1 + U_2 + \cdots + U_N = \frac{u_o}{r_{1p}} e^{ikr_{1p}} + \frac{u_o}{r_{2p}} e^{ikr_{2p}} + \cdots + \frac{u_o}{r_{Np}} e^{ikr_{Np}} \tag{6.1}
\]
where the complex amplitude contributed by the i-th pinhole is given by
\[
U_i(\mathbf{r}_p) = \frac{u_o}{r_{ip}} e^{ikr_{ip}}. \tag{6.2}
\]
At the end of the calculation we will use the paraxial approximation (L_z >> d_s, x_p, y_p) so r_p can be replaced by L_z in the complex envelope. Factoring out the complex amplitude associated with pinhole 1, the sum in Eq. (6.1) becomes
\[
U_T(\mathbf{r}_p) = \frac{u_o}{L_z} e^{ikr_{1p}} \left( 1 + e^{ik(r_{2p} - r_{1p})} + \cdots + e^{ik(r_{Np} - r_{1p})} \right). \tag{6.3}
\]

Figure 6.1 From Young's Two-Pinhole to the Diffraction Grating. In moving from the illustration on the left to the one on the right, progressively more pinholes are poked into the opaque screen. Each pinhole samples a single secondary source on the illuminating wavefront and this source propagates to the right toward the detector plane. Provided each source is coherent with the others, an interference pattern will form.

Defining δ = d_s sin θ (θ is the angle between the z-axis and r_p), the phasor sum in the previous equation becomes
\[
1 + e^{ik\delta} + \cdots + e^{ik(N-1)\delta} = \sum_{n=0}^{N-1} e^{ink\delta} = \sum_{n=0}^{N-1} x^n \tag{6.4}
\]
since each adjacent pair of pinholes has a path length difference of d_s sin θ, and x = e^{ikδ} has been used. The previous can be expressed as
\[
\sum_{n=0}^{N-1} x^n = \frac{1 - x^N}{1 - x} \tag{6.5}
\]

where the N-term sum has been evaluated. Substituting back into the formula we find
\[
\frac{1 - e^{ik\delta N}}{1 - e^{ik\delta}} = \frac{e^{ik\delta N/2}}{e^{ik\delta/2}} \cdot \frac{e^{ik\delta N/2} - e^{-ik\delta N/2}}{e^{ik\delta/2} - e^{-ik\delta/2}}. \tag{6.6}
\]
Equation (6.6) simplifies by introducing the sin function and recognizing that
\[
r_{1p} + \frac{N-1}{2}\,\delta = r_p. \tag{6.7}
\]
Bringing together Eqs. (6.3), (6.6) and (6.7) we find
\[
U_T(\mathbf{r}_p) = \frac{u_o e^{ikr_p}}{L_z} \frac{\sin(k d_s N \sin\theta / 2)}{\sin(k d_s \sin\theta / 2)}. \tag{6.8}
\]
Finally, recognizing as usual that in the paraxial limit sin θ ≈ θ = x_p/L_z, we have
\[
U_T(\mathbf{r}_p) = \frac{u_o e^{ikr_p}}{L_z} \frac{\sin\!\left(\frac{k d_s x_p N}{2 L_z}\right)}{\sin\!\left(\frac{k d_s x_p}{2 L_z}\right)}. \tag{6.9}
\]
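The closed form can be sanity-checked against the direct phasor sum. This sketch uses arbitrary illustrative numbers and the paraxial path difference δ ≈ d_s x_p / L_z:

```python
import numpy as np

lam, ds, Lz, N = 1e-6, 10e-6, 1.0, 10   # arbitrary illustrative values
k = 2 * np.pi / lam
xp = 0.37e-3                            # an off-peak detector point [m]

delta = ds * xp / Lz                    # paraxial path difference d_s sin(theta)
direct = np.sum(np.exp(1j * k * np.arange(N) * delta))   # sum of N phasors
closed = np.sin(N * k * delta / 2) / np.sin(k * delta / 2)
print(abs(direct), abs(closed))         # the two magnitudes agree

# As xp -> 0 both expressions tend to N, since all N phasors align.
```

The closed form differs from the direct sum only by the overall phase factor e^{ik(N−1)δ/2}, which drops out of the magnitude.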

We can understand the previous expression as consisting of two parts. The first part, which contributes the far-field phase, is a spherical wave. Notice that in this limit L_z ≈ r_p, and so the factor multiplying the sin ratio is identical to a spherical wave situated at the coordinate origin. The second factor, the sin ratio, captures the effect of the pinhole array and is referred to as a radiation pattern. It captures how the far-field complex amplitude is modulated by the array of pinholes. If one thinks of the pinhole array as a collection of phased antennas, the influence of adding more antennas is to induce more directivity in the delivered optical energy. Can you think of a way to scan the spatial location of the main irradiance lobe in time?
Notice already there is a potential problem at x_p = 0, since the expression becomes indeterminate (0/0). In fact this is true whenever the numerator and denominator arguments are both a multiple of π. To find the field amplitude at x_p = 0, L'Hopital's rule can be used, where (ignoring the phase) the ratio of the derivatives of the numerator and denominator evaluated as x_p → 0 is
\[
\frac{\frac{d}{dx_p} \sin\!\left(\frac{k d_s x_p N}{2 L_z}\right)}{\frac{d}{dx_p} \sin\!\left(\frac{k d_s x_p}{2 L_z}\right)} = \frac{N \cos(0)}{\cos(0)} = N. \tag{6.10}
\]

The same line of reasoning holds for all scenarios that result in 0/0. These locations correspond to the maxima of the measured/observed irradiance distribution, which is given as
\[
I_T(\mathbf{r}_p) = \frac{I_o}{L_z^2} \frac{\sin^2\!\left(\frac{k d_s x_p N}{2 L_z}\right)}{\sin^2\!\left(\frac{k d_s x_p}{2 L_z}\right)}. \tag{6.11}
\]

In Fig. 6.2 we plot the irradiance of Eq. (6.11) for N = 2, 10, 100, assuming d_s = 10 µm, λ = 1 µm and L_z = 1 m. As N increases there are three changes in the observed irradiance. First, each peak height increases. Notice, as expected, the peak irradiance increases as N² I_o, where N is the number of pinholes in the screen. Second, the lobe width narrows. And, lastly, the number of secondary maxima between the main lobes increases with N: between adjacent main lobes there are N − 1 minima and N − 2 secondary maxima.
We can quantify the previous observations. First, notice the numerator zeros determine the main lobe width. Consulting Eq. (6.11), the numerator has zeros at
\[
\frac{k d_s x_p N}{2 L_z} = m\pi \;\rightarrow\; x_p = \frac{m \lambda L_z}{N d_s} \tag{6.12}
\]
where m is an integer. m = 0 gives the main lobe maximum location and m = 1 determines the main lobe width, which is
\[
\frac{k d_s x_p N}{2 L_z} = \pi \;\rightarrow\; x_p = \frac{\lambda L_z}{N d_s} \tag{6.13}
\]
and clearly becomes narrower as N increases.


Figure 6.2 Diffraction grating irradiance. Upper left panel: the irradiance for N = 2; this is identical to the Young's two-pinhole irradiance. Bottom left panel: N = 10 irradiance; inset: zoom into the diffraction pattern to see the number of minima and secondary maxima. Upper right panel: N = 100 diffraction grating; notice the lobe narrowing and peak height increase. Bottom right panel: zoom in to the main lobe of the N = 100 grating showing the main lobe width.

Finally, we can determine the number of minima (and secondary maxima) between two primary irradiance maxima (the primary maxima are the large peaks in Fig. 6.2). To determine the number of minima we need to count the number of zeros given by Eq. (6.12) that occur between the first and second denominator zeros. The zeros of the denominator are found from the equation
\[
\frac{k d_s x_p}{2 L_z} = p\pi \;\rightarrow\; x_p = \frac{p \lambda L_z}{d_s} \tag{6.14}
\]
where p is an integer. Clearly, in both equations (6.12) and (6.14) the first zero occurs for m = p = 0. Next we need to find the value of m in Eq. (6.12) that results in the same phase as the denominator when p = 1. Notice the factor of N in the phase of the numerator amplifies the phase evolution as x_p moves along the detector plane. Specifically
\[
\frac{k d_s x_p}{2 L_z} = \frac{m\pi}{N} = \pi \tag{6.15}
\]
which reveals that when m = N the phase of the numerator results in a zero that coincides with the second zero of the denominator, leading to the second primary maximum. So, between the two maxima there must be N − 1 zeros and N − 2 subsidiary maxima. Consulting the inset of Fig. 6.2, for N = 10, it is possible to count 9 minima (10 − 1 = 9) and 8 subsidiary maxima (10 − 2 = 8), as expected.
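The counting argument can be checked numerically. The sketch below samples the sin-ratio of Eq. (6.11) strictly between the m = 0 and the next principal maximum for N = 10 and counts the local extrema (the grid values are arbitrary choices matching Fig. 6.2):

```python
import numpy as np

lam, ds, Lz, N = 1e-6, 10e-6, 1.0, 10
k = 2 * np.pi / lam

# Sample strictly between the principal maxima at xp = 0 and xp = lam*Lz/ds
xp = np.linspace(1e-6, lam * Lz / ds - 1e-6, 200001)
phi = k * ds * xp / (2 * Lz)
I = (np.sin(N * phi) / np.sin(phi)) ** 2

mid = I[1:-1]
n_minima = int(np.sum((mid < I[:-2]) & (mid < I[2:])))
n_secondary_maxima = int(np.sum((mid > I[:-2]) & (mid > I[2:])))
print(n_minima, n_secondary_maxima)      # expect N - 1 and N - 2
```

Re-running with other values of N reproduces the general N − 1 / N − 2 count.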
6.2 Amplitude splitting interferometers; n=N → the Fabry-Perot etalon
Chapter 7

Scalar Diffraction Theory

In this chapter, building on our understanding of how to add up waves and leaning on Huygens' principle, we discover how to handle a wide variety of optical-disturbance propagation problems. The basic problem of diffraction is to determine the complex amplitude of the field in some region of space given that it is specified in some other spatial region. For us, we will only consider situations where the optical disturbance is specified across a plane and we want to find it on a second plane. Remember, we began our study of propagation by considering the wave equation, one of the postulates of scalar wave optics. We considered monochromatic plane and spherical waves. Knowing such a solution in some region of space at a given time, it is easy to determine the disturbance in a different region of space at a different time. As illustrated in Fig. 2.12a, if I know the optical disturbance is a monochromatic plane wave propagating in the positive z-direction and I know its complex amplitude in the x − y plane at z = 0 is U, then it is a simple matter to figure out the optical disturbance at a plane located at z = L_z:
\[
U \rightarrow U e^{ikL_z}. \tag{7.1}
\]
Building on the previous, if we knew the complex amplitude on a plane consisted of two monochromatic plane waves, we could also determine the resultant complex amplitude on another plane. For this we exploit the fact that the wave equation is linear and that a viable solution can be constructed from the superposition of two known solutions; in this example, the two plane waves. In this chapter we will continue with this style of reasoning, except we will add up spherical instead of plane waves!
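The linearity argument can be sketched in a few lines. Here two plane waves with different transverse wavenumbers (arbitrary illustrative values, not from the text) are each advanced by their own phase factor, and the field on the second plane is their superposition:

```python
import numpy as np

lam, Lz = 1e-6, 0.01                      # wavelength and plane separation [m]
k = 2 * np.pi / lam
x = np.linspace(-50e-6, 50e-6, 1001)      # transverse coordinate in each plane

def propagate(kx, U0):
    """Advance one plane-wave component from z = 0 to z = Lz."""
    kz = np.sqrt(k ** 2 - kx ** 2)        # axial wavenumber of that component
    return U0 * np.exp(1j * kz * Lz)

kx1, kx2 = 0.0, 0.05 * k                  # an on-axis wave and a tilted wave
U1 = np.exp(1j * kx1 * x)                 # complex amplitudes at z = 0
U2 = np.exp(1j * kx2 * x)

U_total = propagate(kx1, U1) + propagate(kx2, U2)   # superposition at z = Lz
```

The irradiance |U_total|² exhibits the two-wave interference fringes of Chapter 3, and the on-axis component picks up exactly the e^{ikL_z} phase of Eq. (7.1).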

7.1 Huygens-Fresnel Integral

Recall Huygens' principle. A given wavefront of an optical disturbance can be imagined to “radiate" a spherical wave, also called a spherical wavelet or secondary spherical source, from each point on the wavefront. A subsequent wavefront is found by connecting all the “radiated" spherical wavelets at a later time. Augustin-Jean Fresnel took Huygens' principle and turned it into a mathematical formula to describe wave propagation and diffraction. With our current understanding regarding the wave nature of light, Fresnel's formulation seems natural and obvious, but at the time he devised his construction, the wave nature of light was hardly recognized or appreciated.
Fresnel’s reasoning can be understood by considering Fig. 7.1. The left panel
of the figure is an illustration of the Young’s double slit experiment. In describing

d=2Δs d=2Δs

Figure 7.1 From Young’s To Fresnel Diffraction. Adding up spherical wavelets provides
a good intuitive physical optics picture to understanding both interference and diffrac-
tion. The left panel is a Young’s double slit aperture. The middle two panels are the
result of adding more pinholes in the aperture screen. Finally, if the slit is imagined
to be filled with a continuum of secondary spherical sources we can make sense of
diffraction.

this apparatus in Ch. 4 we assumed a single monochromatic wave illuminated


the aperture screen from the left and that each of the two slits A and B generated
secondary spherical waves. The amplitude and phase of the secondary waves
was determined by the specifics of the illuminating monochromatic and spatially
coherent source. The superposition of these secondary spherical waves on the
detector plane gave rise to an interferogram in the observed irradiance. It should
be obvious that in the middle panels of Fig. 7.1 we could apply similar reason-
ing regarding the superposition of 3 and N waves to determine the measured
irradiance (this is in fact provides a simple physical optics model of a diffraction
grating).
Consider the right panel of Fig. 7.1. In this case we can imagine the aperture screen has been filled with so many pinholes that it now contains a finite-size opening. In this situation how would we calculate the downstream complex amplitude and irradiance? Simply add up the spherical wavelets. Our intuition says we should expect something like
\[
U_d(\mathbf{r}_p) \approx \int_{source} U_s(\mathbf{r}_s) \frac{e^{ikr_{sp}}}{r_{sp}}\, dx_s\, dy_s \tag{7.2}
\]
where the distance between the source point r_s and detector point r_p is r_sp and the source-plane complex amplitude distribution U_s(r_s) has been introduced. Equation (7.2) is nearly correct. From some further reasoning (see section 6.**)

Fresnel arrived at the following diffraction integral
\[
U_d(\mathbf{r}_p) = \frac{-i}{\lambda} \int_{source} U_s(\mathbf{r}_s) \frac{e^{ikr_{sp}}}{r_{sp}} \cos\theta_{sd}\, dx_s\, dy_s \tag{7.3}
\]
now called the Huygens-Fresnel diffraction integral. In Eq. (7.3) the changes are in the prefactor of the integral, which is equal to −i/λ, and in the integral kernel, where there is a cos θ_sd. The cos-factor is called the obliquity factor and is a function of the angle between the normal to the source plane and the vector connecting the source point and observation point. Although given for completeness, in the paraxial scenarios we will consider the obliquity factor is approximately 1. Notice the obliquity factor reduces the contribution of the spherical wavelets at points that are displaced away from the axis connecting the source and detector planes through the source point.
What is remarkable about Fresnel's findings is that they predated Maxwell's equations by nearly 50 years! Following the discovery of Maxwell's equations many approaches have been devised to treat the problems initially considered by Fresnel. These later approaches can handle a wider variety of problems – for example the polarization properties of waves and observation planes that are a few wavelengths from the source plane – but Fresnel's result is found to still be correct within its restricted domain, and fortunately for us it can explain everything we are interested in.
Figure 7.2 presents the general diffraction problem we will consider. Although the illustration is only given in the x − z plane, it is a simple matter to include the y direction in the diffraction equations. We imagine a monochromatic and spatially coherent source illuminates the diffraction aperture. For example, the aperture could be illuminated by a monochromatic plane wave with complex envelope U_in = u_o. If the aperture presents this wave with an amplitude and phase transmission coefficient t_s(x_s, y_s), the input complex amplitude in the source plane becomes U_s(x_s, y_s) = U_out(x_s, y_s) = u_o t_s(x_s, y_s). The lower 3 panels in Fig. 7.2 are illustrations of the source plane as viewed from the detector plane. The most generic case is the left panel of the source-plane illustrations, and here it is assumed U_s(x_s, y_s) = u_o t_s(x_s, y_s) is completely arbitrary. Also, notice the source aperture has some longest opening across it. In this illustration this distance is parallel to the x_s axis and has a length 2Δ_s. The entire aperture can be circumscribed by a circle of radius Δ_s. Moving to the right, we can also imagine the aperture has unit transmission associated with it, so that it is only the geometry of the aperture that determines the diffraction pattern. Finally, the third example aperture has a phase shift of π/3 for x_s < 0 and a magnitude of u_o across the entire aperture. Equation (7.3) can handle all these cases.
Notice, from Eq. (7.3), each detector point, at a distance L_z from the source plane, receives complex amplitude from all source points. Each complex amplitude in the source plane U_s(x_s, y_s), located at r_s, is propagated as a spherical wave to the detector point r_p. To determine the complex amplitude on the detector plane, it is in principle necessary to evaluate the integral in Eq. (7.3) for each

Figure 7.2 Diffraction from an aperture. The upper left panel is a general illustration of diffraction from an aperture. It is assumed the illumination is monochromatic and coherent across the aperture. The lower three panels illustrate the source plane as viewed from the detector plane. The left source-plane panel illustrates a generic 2D diffraction aperture with complex amplitude U_s(x_s, y_s), the middle panel has U_s(x_s, y_s) = u_o, a constant real amplitude, and the right aperture is illuminated with a complex amplitude that has a π/3 phase shift for x_s < 0 and a magnitude of u_o across the entire aperture. The detector plane, upper right panel, is assumed to contain a circular detector of radius Δ_d. If the detector were rectangular, 2Δ_d would be the length of the rectangle's diagonal that the circle circumscribes. Δ_s and Δ_d will be important in what follows.

detector point. In general cases this is done numerically. The upper right panel of Fig. 7.2 is an illustration of the detector plane that contains a detector of radius Δ_d. If the detector were rectangular, 2Δ_d would be the length of the rectangle's diagonal and the circle of radius Δ_d would circumscribe the rectangular detector. In the coming sections we will make approximations dependent on the system geometry L_z, Δ_d, and Δ_s, as well as the wavelength λ_o, so that the evaluation of Eq. (7.3) simplifies and even allows for analytic solutions of the Huygens-Fresnel diffraction integral.
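Such a numerical evaluation can be done by brute force. The sketch below (a one-dimensional cut with the obliquity factor set to 1, so constant prefactors are not exact; the geometry values are arbitrary choices) sums spherical wavelets over a slit and recovers the expected symmetric, centrally peaked far-field pattern:

```python
import numpy as np

lam = 633e-9
k = 2 * np.pi / lam
Lz = 0.5                                  # source-to-detector distance [m]
half = 100e-6                             # slit half-width Delta_s [m]

xs = np.linspace(-half, half, 4001)       # source samples, U_s = 1 inside the slit
dxs = xs[1] - xs[0]

def Ud(xp):
    rsp = np.sqrt(Lz ** 2 + (xp - xs) ** 2)
    # (-i/lam) * sum of spherical wavelets e^{i k r_sp}/r_sp over the aperture
    return (-1j / lam) * np.sum(np.exp(1j * k * rsp) / rsp) * dxs

xp = np.linspace(-5e-3, 5e-3, 201)
I = np.array([abs(Ud(p)) ** 2 for p in xp])   # detector-plane irradiance
```

Because each detector point requires a full sum over source samples, the cost scales as (detector points) × (source points); the Fresnel approximation developed next is what makes faster, transform-based evaluation possible.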

7.2 The Fresnel Approximation

In this section we begin the application of the paraxial approximation to the Huygens-Fresnel diffraction integral in Eq. (7.3). The starting point is that we will be interested in source-detector plane separations that are much larger than the difference between source-plane and detector-plane coordinates. As is becoming standard fare in wave optics, we will start with the distance between the source and detector points,
\[
r_{sp} = \sqrt{L_z^2 + (x_p - x_s)^2 + (y_p - y_s)^2} = \sqrt{L_z^2 + (\delta x_{sp})^2 + (\delta y_{sp})^2} \tag{7.4}
\]

where \delta x_{sp} = x_p - x_s and \delta y_{sp} = y_p - y_s. When |\delta x_{sp}| and |\delta y_{sp}| are much smaller
than L_z the previous can be approximated as

r_{sp} \approx L_z + \frac{(\delta x_{sp})^2 + (\delta y_{sp})^2}{2 L_z} - \frac{[(\delta x_{sp})^2 + (\delta y_{sp})^2]^2}{8 L_z^3}    (7.5)

where only the first three terms in the binomial expansion are retained. The
Fresnel approximation is enforced by substituting the previous into Eq. (7.3).
This approximation will influence both the amplitude and phase of the diffraction
integral. Dealing with the amplitude, as usual, is easier. For these distances there
is little difference between L_z and r_{sp}, so L_z can replace r_{sp} in Eq. (7.3). The
previous substitution modifies the complex envelope of each spherical wave that
connects source and detector points and causes the obliquity factor, equal to
\cos\theta_{sp} = L_z / r_{sp}, to approximately equal 1.
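As a quick numerical check of the expansion we will use, the sketch below compares the exact distance of Eq. (7.4) with the three-term form of Eq. (7.5). The separation and transverse offsets are illustrative values chosen for this sketch, not numbers from the text.

```python
import math

# Illustrative numbers (assumptions of this sketch, not from the text)
Lz = 10.0              # source-detector separation (m)
dx, dy = 2e-3, 1e-3    # delta x_sp and delta y_sp (m)

rho2 = dx**2 + dy**2
r_exact = math.sqrt(Lz**2 + rho2)                        # Eq. (7.4)
r_approx = Lz + rho2 / (2 * Lz) - rho2**2 / (8 * Lz**3)  # Eq. (7.5)

# The first neglected term is of order rho2**3 / Lz**5, which for these
# numbers sits far below floating-point resolution.
print(abs(r_exact - r_approx))
```

The residual is dominated by floating-point rounding here, which is the point: for millimeter offsets over meter-scale distances the three-term expansion is essentially exact.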
We need to be more careful with the phase of the diffraction integral kernel
since the phase is defined modulo 2π. Concentrating on the complex exponential
in the integral kernel of Eq. (7.3) we find

e^{i k r_{sp}} \approx e^{i k L_z} \; e^{i k \frac{(\delta x_{sp})^2 + (\delta y_{sp})^2}{2 L_z}} \; e^{-i k \frac{[(\delta x_{sp})^2 + (\delta y_{sp})^2]^2}{8 L_z^3}}    (7.6)

where the single exponential has been split into three distinct pieces by keeping
the first three terms of the binomial expansion. Before focusing on the first two
exponential factors, let's make an argument for ignoring the third exponential. The
essence of the argument is to find a relationship between optical system parame-
ters that results in the phase of the third exponential being close to zero. Physically
this means that terms of third order and higher have a negligible effect on the phase
received by the detector point from the source point: if the third order and higher
terms were the only contribution to the propagation phase, our approximation
would have all points across the source arriving in phase at the detector point and
constructively interfering.
To simplify the discussion without loss of generality we will assume the source
point is situated at the coordinate origin and we allow the detector point to
explore the detector plane within a radius ∆_d (see Fig. 7.2, upper right panel). By
assuming the source sits at the origin we can express (\delta x_{sp})^2 + (\delta y_{sp})^2 as x_d^2 + y_d^2.
For the approximation, it is necessary that

k \frac{[x_d^2 + y_d^2]_{max}^2}{8 L_z^3} \ll \pi    (7.7)

where [x_d^2 + y_d^2]_{max} is the maximum squared radial distance in the detector plane. With the source
at the origin, and recalling that we assume the detector sits within a radius ∆_d in
the detector plane (see Fig. 7.2), the previous becomes
k \frac{\Delta_d^4}{8 L_z^3} \ll \pi    (7.8)

t In reality this distance should be the maximum distance across the source and detector planes; in our notation this would be ∆_d + ∆_s.

where ∆_d defines the largest separation of source and detector points. Rearrang-
ing terms, Eq. (7.8) can be written as

\frac{N_f^d \, \theta_d^2}{4} \ll 1    (7.9)

where \theta_d = \Delta_d / L_z defines the angle between the source point and the observation
point, and the detector Fresnel number has been defined as

N_f^d = \frac{\Delta_d^2}{\lambda L_z}.    (7.10)

The meaning of the previous is the following: when the optical system param-
eters satisfy Eq. (7.9), only the two lowest order terms in the binomial expansion
of the source-detector distance, see Eqs. (7.4) and (7.5), make a contribution
to the propagation phase between the two points. With the previous approxima-
tions to the amplitude and phase, the diffraction integral Eq. (7.3) simplifies to
the Fresnel diffraction integral

U_p(x_p, y_p, L_z) = \frac{-i e^{i k L_z}}{\lambda L_z} \int_S U_s(x_s, y_s) \, e^{i k \frac{(x_p - x_s)^2 + (y_p - y_s)^2}{2 L_z}} \, dx_s \, dy_s.    (7.11)

In the previous the integral is taken over the source plane. We can rewrite Eq.
(7.11) to make apparent the different phase contributions. Opening up the ex-
ponent in the integral's kernel we find the second form of the Fresnel diffraction
integral

U_p(x_p, y_p, L_z) = \frac{-i e^{i k L_z}}{\lambda L_z} \, e^{i k \frac{x_p^2 + y_p^2}{2 L_z}} \int_S U_s(x_s, y_s) \, e^{i k \frac{x_s^2 + y_s^2}{2 L_z}} \, e^{-i 2\pi \frac{x_s x_p + y_s y_p}{\lambda L_z}} \, dx_s \, dy_s    (7.12)
where in the integral prefactor there is a linear, plane-wave-like phase due to
the propagation between the source and detector planes and a quadratic
phase in the detector coordinates x_p and y_p; in the kernel of the integral there
is a quadratic phase in the source coordinates x_s and y_s as well as a linear phase
that depends on both the source and detector coordinates. The next set of ap-
proximations are arguments designed to ignore the quadratic phases in the source
and detector coordinates.
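The passage from Eq. (7.11) to Eq. (7.12) is a pure algebraic factoring of the kernel, which can be verified numerically. In this sketch the wavelength and the source and detector coordinates are illustrative assumptions, not values from the text.

```python
import cmath, math

# Illustrative numbers (assumptions of this sketch)
lam = 500e-9
k = 2 * math.pi / lam
Lz = 10.0
xs, ys = 20e-6, -35e-6     # a source-plane point (m)
xp, yp = 1.2e-3, 0.7e-3    # a detector-plane point (m)

# Fresnel kernel of Eq. (7.11)
kernel = cmath.exp(1j * k * ((xp - xs)**2 + (yp - ys)**2) / (2 * Lz))

# The three phase pieces of Eq. (7.12)
factored = (cmath.exp(1j * k * (xp**2 + yp**2) / (2 * Lz))
            * cmath.exp(1j * k * (xs**2 + ys**2) / (2 * Lz))
            * cmath.exp(-1j * 2 * math.pi * (xs * xp + ys * yp) / (lam * Lz)))

print(abs(kernel - factored))   # zero up to floating-point rounding
```

The expansion (x_p - x_s)^2 = x_p^2 - 2 x_p x_s + x_s^2 is exact, so the only residual here is rounding error.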

7.3 Fraunhofer Diffraction: The far-field


Building off the Fresnel approximation there is a second set of simplifications
that can be made to the diffraction integral in Eq. (7.12). To appreciate these
approximations, Fig. 7.3 illustrates the quadratic phases associated with the
Fresnel integral, Eq. (7.12). Notice there is a quadratic phase in the kernel of the
integral that is a function of the source coordinates and that there is a quadratic
phase in the integral’s prefactor that depends on the detector coordinates. Each

[Figure: source plane of width 2∆_s and detector plane of width 2∆_d separated by L_z, with quadratic phases k x_s^2 / 2L_z and k x_p^2 / 2L_z drawn in each plane.]

Figure 7.3 Quadratic phases in the Fresnel integral. This illustration demonstrates the
quadratic phase variation of Eq. (7.12) found in the source and detector plane. The
next set of approximations deals with these quadratic phases.

of these phases is illustrated by two curves, one solid and one dashed. The
next simplification to the diffraction integral is to constrain the quadratic phase
of the source coordinates so that it does not vary across the source aperture, as
illustrated by the solid curve in Fig. 7.3. The approximation amounts to ensuring
that

\frac{k (x_s^2 + y_s^2)_{max}}{2 L_z} \ll \pi \quad \rightarrow \quad \frac{(x_s^2 + y_s^2)_{max}}{\lambda L_z} = N_f^s \ll 1    (7.13)

where we have equated (x_s^2 + y_s^2)_{max} with \Delta_s^2, the square of the largest distance ∆_s from the optical axis to the perimeter of the source
aperture, and defined the source Fresnel number

N_f^s = \frac{\Delta_s^2}{\lambda L_z}.    (7.14)

We see that if the source Fresnel number, which is a property of the source size, the
wavelength, and the distance to the detector plane, is much less than 1, then the
Fresnel diffraction integral simplifies to the Fraunhofer diffraction integral

U_p(x_p, y_p, L_z) = \frac{-i e^{i k L_z}}{\lambda L_z} \, e^{i k \frac{x_p^2 + y_p^2}{2 L_z}} \int_{source} U_s(x_s, y_s) \, e^{-i 2\pi \frac{x_s x_p + y_s y_p}{\lambda L_z}} \, dx_s \, dy_s.    (7.15)

One further simplification is possible and this occurs if the quadratic phase
in the Fraunhofer integral prefactor can be ignored. The reasoning is identical to

that used in the Fraunhofer approximation of the source quadratic phase, and can be expressed as

\frac{k (x_d^2 + y_d^2)_{max}}{2 L_z} \ll \pi \quad \rightarrow \quad N_f^d \ll 1    (7.16)

where the detector Fresnel number needs to be much less than 1. If this inequality is
also satisfied, then the Fraunhofer integral becomes

U_p(x_p, y_p, L_z) = \frac{-i e^{i k L_z}}{\lambda L_z} \int_{source} U_s(x_s, y_s) \, e^{-i 2\pi \frac{x_s x_p + y_s y_p}{\lambda L_z}} \, dx_s \, dy_s    (7.17)

where all quadratic phase dependence has been removed.


In summary, starting with the general form of the diffraction integral in Eq.
(7.3), we have made a number of system specific approximations that result in
simplifications to the diffraction integral. In principle, each one of these simplifi-
cations has made solving for the diffracted field complex amplitude distribution
easier. Restating the series of approximations:

\frac{N_f^d \, \theta_d^2}{4} \ll 1 \;\rightarrow\; \text{Fresnel approximation, use Eq. (7.12)}    (7.18)

N_f^s \ll 1 \;\rightarrow\; \text{Fraunhofer approximation, use Eq. (7.15)}    (7.19)

N_f^d \ll 1 \;\rightarrow\; \text{Fraunhofer approximation, use Eq. (7.17)}    (7.20)

The previous is also summarized in the illustration presented in Fig. 7.4. The
expressions we have developed are not valid in the near-field region of the source
plane, within a distance of a few wavelengths. Once the detector plane is multiple
wavelengths away from the source plane, the detector aperture, the source-detector
distance L_z, and the source wavelength determine the diffraction region in which
the observation is made.

7.3.1 Some Examples


How do you use the previous to solve problems? It is generally a two-step process.
First, decide what is the appropriate diffraction formula to use and then integrate!
As an example, assume λ = 500 nm, L z =10 m, ∆d = 2 mm and ∆s = 50 µm. We
find that

\frac{N_f^d \, \theta_d^2}{4} = 0.8 \times 4 \times 10^{-8} / 4 = 8 \times 10^{-9}    (7.21)

N_f^s = 5 \times 10^{-4}    (7.22)

N_f^d = 0.8    (7.23)

so in this case it is appropriate to use Eq. (7.15): the Fresnel approximation is
justified and the source Fresnel number is small, but the detector Fresnel number
is not much less than 1, so Eq. (7.17) cannot be used.
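The bookkeeping in this example is easy to automate. The helper below is this sketch's own construction (the function name and the 0.01 threshold standing in for "much less than 1" are assumptions, not definitions from the text); it evaluates Eqs. (7.18)-(7.20) for the numbers above.

```python
# Sketch: decide which diffraction formula applies. The helper name and the
# 0.01 stand-in for "much less than 1" are assumptions of this sketch.
def diffraction_regime(lam, Lz, dd, ds, small=0.01):
    Nfd = dd**2 / (lam * Lz)        # detector Fresnel number, Eq. (7.10)
    Nfs = ds**2 / (lam * Lz)        # source Fresnel number, Eq. (7.14)
    theta_d = dd / Lz               # detector angle
    if Nfd * theta_d**2 / 4 >= small:      # Eq. (7.18) violated
        return Nfd, Nfs, "paraxial (Fresnel) approximation not justified"
    if Nfs < small and Nfd < small:        # Eqs. (7.19) and (7.20) both hold
        return Nfd, Nfs, "Fraunhofer, Eq. (7.17)"
    if Nfs < small:                        # Eq. (7.19) only
        return Nfd, Nfs, "Fraunhofer, Eq. (7.15)"
    return Nfd, Nfs, "Fresnel, Eq. (7.12)"

# The numbers of the worked example: 500 nm, 10 m, 2 mm, 50 um
Nfd, Nfs, regime = diffraction_regime(500e-9, 10.0, 2e-3, 50e-6)
print(Nfd, Nfs, regime)   # -> use Eq. (7.15)
```

Moving the detector closer or enlarging the aperture pushes the answer back toward the Fresnel regime, matching the picture in Fig. 7.4.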

[Figure: source aperture of width 2∆_s; near-field within a few λ of the source plane; then the Fresnel region (1) and, farther out at distance L_z, the Fraunhofer region (2), each with detector width 2∆_d.]

Figure 7.4 Different diffraction regions. This illustration demonstrates the different
regions of diffraction.

The last thing to do is to calculate some diffraction patterns! We will first
consider the far-field, i.e. Fraunhofer, diffraction pattern from a one-dimensional
(1D) aperture. Let's assume the amplitude transmission function associated with
the source plane aperture can be expressed as

t_s(x_s) = \text{rect}(x_s | d)    (7.24)

where d is the full width (i.e. diameter) of the aperture; in what follows we write the half-width as ∆_s = d/2. The input source
complex amplitude distribution in this case is completely real and equal to
U_s(x_s) = u_o t_s(x_s). The diffraction pattern we will calculate is what would be
expected from a 1D slit. Using C to represent the complex prefactor of Eq. (7.15),
the integral we need is
U_p(x_p, L_z) = C u_o \int_S \text{rect}(x_s | \Delta_s) \, e^{-i 2\pi \frac{x_s x_p}{\lambda L_z}} \, dx_s = C u_o \int_{-\Delta_s}^{\Delta_s} e^{-i 2\pi \frac{x_s x_p}{\lambda L_z}} \, dx_s    (7.25)

where the rectangular function limits the integration across the source plane.
Note the source complex amplitude distribution is constant and coherent across
the aperture. Evaluating the integral we find

U_p(x_p, L_z) = 2 C u_o \Delta_s \frac{\sin(2\pi \Delta_s x_p / \lambda L_z)}{2\pi \Delta_s x_p / \lambda L_z} = 2 C u_o \Delta_s \, \text{sinc}(2\pi \Delta_s x_p / \lambda L_z)    (7.26)

where we have defined the sinc function


\text{sinc}(2\pi \Delta_s x_p / \lambda L_z) = \frac{\sin(2\pi \Delta_s x_p / \lambda L_z)}{2\pi \Delta_s x_p / \lambda L_z}.    (7.27)

The resultant irradiance measured by the detector is

I_p(x_p, L_z) = |U_p(x_p, L_z)|^2 = |2 C u_o \Delta_s|^2 \, \text{sinc}^2(2\pi \Delta_s x_p / \lambda L_z)    (7.28)
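The closed form in Eq. (7.26) can be checked against a brute-force numerical evaluation of the slit integral in Eq. (7.25). Setting C u_o = 1 and the specific sampling density are assumptions of this sketch; the other numbers follow the worked example.

```python
import cmath, math

# Brute-force check of Eq. (7.25) against the closed form of Eq. (7.26).
# C*u_o is set to 1; the sample count N is an assumption of this sketch.
lam, Lz, ds = 500e-9, 10.0, 50e-6
xp = 0.02                                   # detector-plane point (m)

N = 20001                                   # samples across the slit
h = 2 * ds / (N - 1)
integral = h * sum(cmath.exp(-1j * 2 * math.pi * (-ds + i * h) * xp / (lam * Lz))
                   for i in range(N))

arg = 2 * math.pi * ds * xp / (lam * Lz)
closed = 2 * ds * math.sin(arg) / arg       # 2*Delta_s*sinc, Eq. (7.26)

print(abs(integral - closed))               # small discretization residual
```

The two agree to the level of the discretization error, confirming the analytic evaluation of the integral.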



The 1D diffraction geometry, the complex amplitude and the irradiance is


presented in Fig. 7.5. The left column illustrates the diffraction geometry and
the complex amplitude in the source plane. The right column illustrates various
representations of the diffraction pattern. The top plot in the right column is the
observed complex amplitude distribution. Another way to visualize it is to plot the
magnitude and phase of the complex amplitude. The second plot illustrates the

[Figure: left column, the diffraction geometry (source plane of width 2∆_s, propagation distance L_z) and the source-plane amplitude U_s(x_s); right column, detector-plane plots of U_p(x_p), |U_p(x_p)|, the phase of U_p(x_p) (0 or π), and I_p(x_p).]

Figure 7.5 Fraunhofer diffraction from a 1D slit/rectangular aperture. The left column
shows the diffraction geometry and the source-plane complex amplitude; the right column
shows the detector-plane complex amplitude, its magnitude and phase, and the irradiance.

magnitude of the complex amplitude and the third plot is the associated phase.
Recall again that in optics a minus 1 is in reality a phase shift of π! The sinc is a
special diffraction pattern in that its phase is only 0 or π depending on the detector
location. Other complex functions can have phases different from 0 and π, and it
is only possible to visualize these functions by separately plotting the magnitude
and phase, as is done in plots two and three of the right column. Finally, an optical
detector measures irradiance, and the bottom plot illustrates the sinc² irradiance.
From the properties of the sinc function we can identify interesting locations in
the diffraction pattern. First, and most important, is x_p = 0. Notice Eq. (7.28) is
divided by x_p, so it would seem the diffraction pattern should be
infinite at this location. But, in fact, the entire expression is indeterminate at x_p = 0, so to find the
value at this location it is necessary to use l'Hopital's rule (remember calculus!).
From this we find the value of sinc(x) at x = 0 is 1. Second, the numerator of Eq.
(7.27) locates the zeros of the diffraction pattern. Specifically,

\frac{2\pi \Delta_s x_p^{(m)}}{\lambda L_z} = m\pi \quad \rightarrow \quad x_p^{(m)} = \frac{m \lambda L_z}{2 \Delta_s}    (7.29)

where m identifies the spatial location of the m-th zero. A generic feature of
diffraction, already present in this result, is that the diffracted beam spreads out in
a direction orthogonal to the slit direction. Said another way, trying to limit

the beam along one axis causes it to spread out along that limiting direction in
observation planes beyond the confining aperture. This is quantitatively clear as
x_p^{(m)} \propto \Delta_s^{-1}. With the numbers given previously, the first zero is located at 50 mm
in the detector plane and the main lobe width is 100 mm.
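Using the same numbers (λ = 500 nm, L_z = 10 m, ∆_s = 50 µm), the zero locations of Eq. (7.29) can be tabulated directly; the helper name below is this sketch's own.

```python
# Zero locations of the slit pattern, Eq. (7.29), for the numbers used in
# the text: lambda = 500 nm, Lz = 10 m, slit half-width Delta_s = 50 um.
lam, Lz, ds = 500e-9, 10.0, 50e-6

def slit_zero(m):
    """x_p^(m) = m * lam * Lz / (2 * Delta_s), the m-th zero of the sinc."""
    return m * lam * Lz / (2 * ds)

x1 = slit_zero(1)
print(x1)   # first zero at 0.05 m = 50 mm; the main lobe width is 100 mm
```

Shrinking ∆_s moves every zero outward, which is the ∆_s^{-1} spreading noted above.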
A second important diffraction aperture is the circle. In contrast to the slit,
the circle is a two-dimensional (2D) aperture in the source plane. We can think of
the circle as a 1D slit rotated about the optical axis. The transmission function
t_s(x_s, y_s) of the circle is defined in the same way as the rect function:

\text{circ}(\sqrt{x_s^2 + y_s^2} \,|\, \Delta_s) = 1 \quad \text{for } \sqrt{x_s^2 + y_s^2} < \Delta_s    (7.30)

\text{circ}(\sqrt{x_s^2 + y_s^2} \,|\, \Delta_s) = 0 \quad \text{for } \sqrt{x_s^2 + y_s^2} > \Delta_s    (7.31)

where ∆_s is the circle radius. With the previous definition the following diffraction
integral needs to be solved:

U_p(\rho_p, \phi_p, L_z) = C u_o \int_S \text{circ}(\sqrt{x_s^2 + y_s^2} \,|\, \Delta_s) \, e^{-i 2\pi \frac{x_s x_p + y_s y_p}{\lambda L_z}} \, dx_s \, dy_s    (7.32)

where, due to the circular symmetry, it is convenient to use cylindrical coordinates
in both the source plane, (\rho_s, \phi_s), and in the detector plane, (\rho_p, \phi_p), instead of
the Cartesian coordinates we have been using. Working in cylindrical coordinates
we can write x_s = \rho_s \cos\phi_s, y_s = \rho_s \sin\phi_s, x_p = \rho_p \cos\phi_p, and y_p = \rho_p \sin\phi_p, so
in the source plane it is possible to express \rho_s = \sqrt{x_s^2 + y_s^2}. To solve Eq. (7.32) we
transform both source and detector coordinates to cylindrical coordinates and
then integrate! We will not solve this integral directly and instead simply quote
the result. The diffraction pattern of a circular aperture is

U_p(\rho_p, \phi_p, L_z) = C u_o \, 2\pi \Delta_s^2 \, \text{Jinc}(2\pi \rho_p \Delta_s / \lambda L_z)    (7.33)

where the Jinc function is defined as

\text{Jinc}(x) = \frac{J_1(x)}{x}, \qquad x = 2\pi \rho_p \Delta_s / \lambda L_z    (7.34)

\pi \Delta_s^2 is the area of the source's circular aperture, and \rho_p = \sqrt{x_p^2 + y_p^2}. The Jinc
function is the circular equivalent of the sinc function. The irradiance of the
diffraction pattern formed by the circular aperture is the well-known Airy disk.
Figure ** is a plot of the Airy disk. Looking forward, we will discover that this
determines the diffraction-limited resolution of an imaging system. The first zero
of Jinc(x) is important and is located at x = 3.83. From the argument of the Jinc,

2\pi \rho_p^{(0)} \Delta_s / \lambda L_z = 3.83 \quad \rightarrow \quad \rho_p^{(0)} = \frac{1.22 \, \lambda L_z}{2 \Delta_s} = \frac{1.22 \, \lambda L_z}{D}    (7.35)

where D is the diameter of the source aperture. When considering imaging


situations the location of this zero will be important.
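The 1.22 prefactor in Eq. (7.35) comes from the first zero of J_1, near 3.83. As a self-contained check, the sketch below finds that zero with a power-series J_1 and bisection; both the series implementation and the bracketing interval are this sketch's own constructions, not from the text.

```python
import math

def j1(x, terms=40):
    """Bessel J1 via its power series (adequate for small arguments)."""
    return sum((-1)**m / (math.factorial(m) * math.factorial(m + 1))
               * (x / 2)**(2 * m + 1) for m in range(terms))

# Bisection: J1 is positive at 3.0 and negative at 4.5, so its first
# zero lies in this bracket.
lo, hi = 3.0, 4.5
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if j1(lo) * j1(mid) <= 0:
        hi = mid
    else:
        lo = mid

print(lo)              # approx 3.8317
print(lo / math.pi)    # approx 1.22, the prefactor in Eq. (7.35)
```

Dividing 3.8317 by π gives 1.2197, which is the "1.22" that appears throughout diffraction-limited resolution formulas.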

7.4 The thin lens and diffraction


In simplifying the Fresnel diffraction integral, Eq. (7.12), to the Fraunhofer diffrac-
tion integral, Eq. (7.15), it was necessary to constrain the optical system so that
the source Fresnel number N_f^s is much less than 1. In this scenario the quadratic
phase in the Fresnel integral plays a negligible role in the diffraction calculation.
There is another approach to realizing the Fraunhofer diffraction pattern that
does not involve going to the far-field of the source plane aperture. This second
approach involves placing a thin lens in the diffracting aperture. In this scenario
the lens pulls the diffraction plane from the far-field to the focal plane of the lens.
Figure 7.6 is an illustration of the lens being incorporated into the source plane of
the optical system.

[Figure: a thin lens of focal length f placed in the source aperture of width 2∆_s; Fresnel diffraction is observed on either side of the focal plane and Fraunhofer diffraction exactly in the focal plane x_p.]
Figure 7.6 Using a lens to observe diffraction. Placing a lens in the source plane aper-
ture introduces a quadratic phase across the source plane wavefront that can directly
compensate for the diffraction quadratic phase in the lens focal plane. The result is the
diffraction pattern of the lens pupil is observed in the lens focal plane. On either side of
the focal plane a Fresnel diffraction pattern is observed.

When a lens is incorporated into the aperture, the aperture can be modeled
with the following complex transmission function:

t_s(x_s, y_s) = P(x_s, y_s) \, e^{-i \frac{k (x_s^2 + y_s^2)}{2 f}}    (7.36)

where we have introduced the pupil of the lens and the spatial phase transforma-
tion associated with a thin lens. If the source aperture is only one dimensional
then the pupil is a rect function. Specifically

P (x s ) = rect(x s |∆s ) (7.37)



where ∆s is the radius of the pupil and is identical to the radius of the lens ∆l .
When discussing the lens we can use ∆s and ∆l interchangeably. If our source
plane is two dimensional then the pupil becomes
P(x_s, y_s) = \text{circ}(\sqrt{x_s^2 + y_s^2} \,|\, \Delta_s).    (7.38)

The previous equations model the optical properties of a cylindrical lens, Eq.
(7.37), and spherical lens, Eq. (7.38). Important in each is the quadratic phase of
the lens.
To see how the lens directly influences the diffraction problem, let's concen-
trate on one dimension and substitute Eq. (7.36) into Eq. (7.12). We find

U_p(x_p, L_z) \simeq \int_S \text{rect}(x_s | \Delta_s) \, e^{i k \frac{x_s^2}{2} \left( \frac{1}{L_z} - \frac{1}{f} \right)} \, e^{-i 2\pi \frac{x_s x_p}{\lambda L_z}} \, dx_s    (7.39)

where the complex prefactor has been ignored and, initially, the detector plane is
not coincident with the lens focal plane. From Eq. (7.39), notice that for the quadratic
phase in the integral kernel to vanish we require L_z = f. When this is satisfied,

U_p(x_p, L_z) \simeq \int_S \text{rect}(x_s | \Delta_s) \, e^{-i 2\pi \frac{x_s x_p}{\lambda L_z}} \, dx_s    (7.40)

and in the focal plane of the lens we see the Fraunhofer diffraction pattern of the
lens pupil P. Notice that the geometry of the pupil determines the observed diffraction
pattern and the lens quadratic phase enables the observation of this pattern in
the lens focal plane. Figure 7.6 also illustrates that before and after the lens focal
plane we observe Fresnel diffraction, whereas exactly in the focal plane we observe
Fraunhofer diffraction.
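The cancellation that leads to Eq. (7.40) can be seen numerically: the residual quadratic phase in the kernel of Eq. (7.39) vanishes exactly when L_z = f. The focal length and pupil coordinate below are illustrative assumptions of this sketch.

```python
import cmath, math

lam = 500e-9
k = 2 * math.pi / lam
f = 0.1      # thin-lens focal length (m), illustrative
xs = 1e-3    # a point in the lens pupil (m), illustrative

for Lz in (0.09, 0.10, 0.11):
    # residual quadratic phase factor from the kernel of Eq. (7.39)
    residual = cmath.exp(1j * k * xs**2 / 2 * (1 / Lz - 1 / f))
    print(Lz, cmath.phase(residual))   # the phase is 0 only when Lz = f
```

Away from the focal plane the residual quadratic phase survives, which is why those planes show Fresnel rather than Fraunhofer patterns.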
