
Physics 195ab

Course Notes
020917 F. Porter
(fcp/tex/ph125/ay0102/outline)

1 Introduction
This note describes the Ph 195 course, “Advanced Quantum Mechanics,” and
gives some administrative information.
There is a web page for the course:
http://www.cithep.caltech.edu/~fcp/ph195/index.html
All course materials will be posted to this web page.
Quantum Mechanics is at the foundation of modern physics; a thorough
understanding is important for a practicing physicist. This is a serious course,
designed to go deeper into the concepts and techniques than the more stan-
dard course. It is a 12 unit course, reflecting the greater intensity. The
outline (below) of material to be covered is ambitious for two quarters. I
intend to remain flexible concerning both pace and subject concentration.
I hope to keep classes somewhat informal, and encourage discussion among
participants. My notion is that the notes will provide the backbone, the
classes will aid understanding, and the homework will provide depth and
practical experience. In particular, I would like to use the classes to ad-
dress questions which arise both in the reading and on the homework, and
to provide illustrative examples relevant to the discussions in the notes.

2 Course Outline
Here follows a course outline. We may change it midstream according to the
interests and background of the course participants. There will (more-or-
less) be a course note handout corresponding to each major topic heading in
the outline.
1. Introduction
(a) Units
(b) Bohr atom
(c) Basic principles

2. Ideas of Quantum Mechanics

(a) Probability amplitudes, Wave equations, and Dispersion relations.


(b) Hilbert spaces, Self-adjoint operators
(c) Postulates of quantum mechanics
(d) Uncertainty Principle

3. Path Integral Approach

(a) Hamiltonian for a charged particle in an electromagnetic field


(b) Aharonov-Bohm effect

4. Density Matrix Formalism

(a) Statistical ensembles


(b) Postulates of quantum mechanics
(c) Entropy
(d) Canonical ensemble

5. Two state system – K^0 K̄^0 mixing

(a) Dealing with non-mass eigenstates


(b) Dealing with decaying particles

6. Harmonic Oscillator in one dimension

(a) Annihilation and Creation operators


(b) Hermite polynomials

7. Resolvents and Green’s Functions

8. Angular Momentum

(a) SU(2), spin-1/2


(b) Addition of Angular Momenta
(c) Wigner-Eckart Theorem
(d) Angular distributions from rotational invariance

(e) Breaking of rotational symmetry

9. Solving the Schrödinger equation, approximately

(a) Variational method


(b) WKB approximation
(c) Time-independent perturbation theory
(d) Time-dependent perturbation theory

10. Scattering

(a) Fermi’s golden rule


(b) Partial wave expansion
(c) Resonances

11. Identical particles

(a) Bosons
(b) Fermions, Pauli exclusion principle

12. Second Quantization (photon field)

13. Relativistic Invariance, the Dirac Equation

Physics 195a
Course Notes
Preliminaries
020908 Frank Porter

1 Introduction
This note sets conventions and gives some handy conversions for Ph 195.

2 Units
We adopt a system of units which tends to avoid carrying cumbersome con-
stants in expressions. This has the benefit of permitting the physical mean-
ings to be more readily apparent. The downside is the need to do conversions
to “engineering” units. However, this is not a severe difficulty, as long as a
couple of conversion constants are remembered. Indeed, many practicing
physicists (and not only theorists) adopt these units in their everyday work.
Our system of units is such that Planck’s constant (over 2π), h̄ = 1, and
the speed of light in a vacuum, c = 1. Note that this implies the conversion
factors between SI (Système Internationale) units:
1 = ħ = h/2π ≈ (6.62/2π) × 10^−34 Js, (1)
1 = c ≈ 3 × 10^8 m s^−1. (2)
Thus, if we are given a quantity in Joules, we may convert it to inverse meters
by multiplying by
2π/[(6.62 × 10^−34 Js)(3 × 10^8 m s^−1)]. (3)

We typically don’t find Joules to be an especially useful unit in quantum
mechanics, so this isn’t a particularly common conversion, but the idea is
the same for other conversions – you can multiply and divide by 1 as you
please, to go from one unit to another.
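
To make the bookkeeping concrete, here is a minimal numerical sketch (in Python; the constants and function names are mine, chosen purely for illustration) of inserting factors of ħ and c to convert a quantity in Joules to inverse meters, and an inverse femtometer to MeV:

    # hbar = c = 1 bookkeeping: energies and inverse lengths are interchangeable.
    hbar = 6.626e-34 / (2 * 3.141592653589793)   # J s
    c = 3.0e8                                    # m/s
    hbar_c_MeV_fm = 197.3                        # MeV fm

    def joules_to_inverse_meters(E_joules):
        # E = hbar c k  =>  k = E/(hbar c), in m^-1
        return E_joules / (hbar * c)

    def inverse_fm_to_MeV(k_inverse_fm):
        # multiply by 1 = hbar c = 197.3 MeV fm
        return k_inverse_fm * hbar_c_MeV_fm

    print(joules_to_inverse_meters(1.0))   # ~3.2e25 m^-1 for 1 J
    print(inverse_fm_to_MeV(1.0))          # ~197 MeV for 1 fm^-1
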
We actually will use a variety of units, as dictated by the physical scales
and desired intuition in any given situation. For example, atomic size scales
are conveniently described in angstroms (Å) and energy scales in electron
volts (eV), while the nuclear size scales are more convenient to discuss in

femtometers (or “fermis”) (fm) and energy scales in millions of electron volts
(MeV).
The above specification is not unique. The table below shows how Maxwell’s
equations and related electromagnetic formulae appear in one choice of the
c = 1 (and ħ = 1, as desired) units, the “Rationalized Heaviside-Lorentz”
units, in comparison with the more familiar modified (c = 1) Gaussian units.

Example: Electromagnetism

  Modified Gaussian              Rationalized Heaviside-Lorentz
  ∇ · D = 4πρ                    ∇ · D = ρ
  ∇ × E = −∂t B                  same
  ∇ · B = 0                      same
  ∇ × H = 4πJ + ∂t D             ∇ × H = J + ∂t D
  D = εE;  B = µH                same
  E = −∇Φ − ∂t A                 same
  B = ∇ × A                      same
  F = q(E + v × B)               same

Let us look at the implications of the first Maxwell equation for Coulomb’s
law between two charges, q1 and q2. If ε = 1, then in the modified Gaussian

Figure 1: Force on charge q2 due to charge q1.

units, ∇ · D = 4πρ. Integrating this over a sphere of radius r12 = |x1 − x2|
centered at q1 gives an electric field at q2, due to q1, of
E1(r12) = q1/r12², (4)

or a force on q2 of
F = q1 q2/r12². (5)
If we repeat the exercise in rationalized Heaviside-Lorentz units, we find
E1(r12) = q1/(4πr12²), F = q1 q2/(4πr12²). (6)
Consider now the electrostatic potential energy of two electrons separated
by an electron’s Compton wavelength, r = 1/m, where m is the electron
mass. We’ll measure the strength of the electromagnetic force by comparing
this energy with the electron’s mass. This strength is a dimensionless con-
stant α, known as the fine structure constant. Let K = 1 or 4π depending
on whether we are using modified Gaussian or rationalized Heaviside-Lorentz
units, respectively. Denote the electron charge by −e. Then:
α = V/m = (1/m) ∫_∞^{1/m} F · dx = (1/m) ∫_∞^{1/m} [e²/(Kr²)] dr (7)
  = e²/K. (8)
Thus, the fine structure constant is expressed as e² in modified Gaussian
units, and as e²/4π in rationalized Heaviside-Lorentz units. Since this is a
dimensionless constant, this may be confusing, but we can readily trace the
origin to the manner in which we wrote the first Maxwell equation. Because
the fine structure constant has a physical interpretation, it is numerically the
same independent of the units, approximately 1/137.
We will use the modified Gaussian system here, with e² ≈ 1/137. But
be aware that quantum electrodynamics work is typically done in the other
system, where e²/4π ≈ 1/137.

2.1 Handy conversions


There are two conversion constants that are especially useful in quantum
mechanics computations:

1 = c ≈ 3 × 10^8 m/s, (9)
1 = ħc ≈ 197 MeV-fm. (10)
Of course, we also use different scales depending on the problem, for example:
1 = 10^5 fm/Å, (11)
1 = 10^6 eV/MeV. (12)

When magnetic fields are involved, a handy formula to remember is:

p(GeV) = 0.3B(T)ρ(m)Q(e). (13)

This relates the radius of curvature (ρ) which a particle of charge Q and
momentum p has in a magnetic field B. The charge unit “e” is the electron
charge. The 0.3 in the equation is more precisely the numerical value of the
speed of light in m/ns (≈ 0.2998).
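
As a rough numerical illustration of Eq. 13 (a Python sketch; the field and radius values are arbitrary examples, and the function names are mine):

    def momentum_GeV(B_tesla, rho_m, Q_e=1.0):
        # p(GeV) = 0.3 B(T) rho(m) Q(e)
        return 0.3 * B_tesla * rho_m * Q_e

    def radius_m(p_GeV, B_tesla, Q_e=1.0):
        # the same relation, solved for the radius of curvature
        return p_GeV / (0.3 * B_tesla * Q_e)

    print(momentum_GeV(B_tesla=1.5, rho_m=1.0))   # ~0.45 GeV
    print(radius_m(p_GeV=1.0, B_tesla=1.5))       # ~2.2 m
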

2.2 Example: “Microscope Design”


• If an atom has a size ∼ 1 Å = 10^−10 m, how high a momentum must a
probe have, in order to be sensitive to internal structure?
Recall the deBroglie relation for the quantum mechanical wavelength
(over 2π) of a particle of momentum p:
λ̄ = 1/p. (14)
Thus, for a probe with a wavelength of the same scale as the atomic
size, we must have:
p = 1/λ̄ ∼ 1/(1 Å) = 10^10 m^−1
  = 10^10 m^−1 (10^−15 m/fm)(200 MeV-fm)(10^3 keV/MeV)
  = 2 keV. (15)

If the probe is a photon, it is in the X-ray regime.

• A nucleus has a size of order 1 fm. How high a momentum probe is


required in this case?
p = 1/λ̄ ∼ (1/1 fm)(200 MeV-fm) ∼ 200 MeV. (16)

2.3 Bohr Model of Atom
Even though the Bohr semi-classical model for the atom is incorrect (for
one thing, it would seem that the orbiting electron should radiate its energy
away), it is a handy model for estimating some general features. We consider
here the Bohr model for the 1-electron atom, with nuclear charge Ze. Let m_e
be the mass of the electron and m_A be the mass of the nucleus. The features
are:

• The electron orbits in the Coulomb potential:

V(r) = −Ze²/r, (17)
where r ≡ |x − xA |, x is the electron position, and xA is the nucleus
position.
Assuming the virial theorem is valid,
⟨T⟩ = (1/2)⟨r ∂V/∂r⟩ = Ze²/2r = (1/2)mv², (18)
where we are assuming that the motion is non-relativistic. We have
expressed the kinetic energy, T, in terms of the reduced mass,
m = m_e m_A/(m_e + m_A) ≈ m_e, (19)
and the relative speed of the electron with respect to the nucleus,

v = |ve − vA |. (20)

Alternatively, we could remember the force equation for circular mo-


tion:
F = Ze²/r² = mv²/r, (21)
which leads to the same result. We assume circular orbits, with constant
r, and hence ⟨T⟩ = T.

• The angular momentum is assumed to be quantized:

L = |r × p| = n, n = 1, 2, 3, . . . (22)

This restricts the possible energy levels. Setting mvr = n, and
E = T + V = −(1/2)Ze²/r = −(1/2)mv², (23)
we find v = Ze²/n = Zα/n. The non-relativistic approximation is
self-consistent if
Zα/n ≪ 1. (24)
In particular, the ground state energy for hydrogen in this model is
E = −(1/2)(0.511 × 10^6 eV)/(137)²
  = −13.6 eV. (25)
This agrees nicely with experiment! Also, v = α = 1/137 in the ground
state, and r = n²/mα leads to a ground state radius of
r = 1/mα = 137/m
  = (137/0.511 MeV)(200 MeV-fm)(10^−5 Å/fm) ≈ 0.5 Angstrom. (26)
However, the orbital angular momentum in the ground state is L = 1
in this model, and that is ultimately not correct.
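
The numbers quoted in Eqs. 25 and 26 are easy to reproduce; the following Python sketch (the constants and function names are illustrative, not part of the notes) works in natural units and converts at the end using ħc ≈ 197 MeV-fm:

    m_e = 0.511        # electron mass, MeV
    alpha = 1.0 / 137  # fine structure constant
    hbar_c = 197.3     # MeV fm

    def bohr_energy_eV(Z=1, n=1, m=m_e):
        # E_n = -(1/2) m (Z alpha / n)^2, converted from MeV to eV
        return -0.5 * m * (Z * alpha / n) ** 2 * 1.0e6

    def bohr_radius_angstrom(Z=1, n=1, m=m_e):
        # r_n = n^2/(Z alpha m); multiply by hbar_c for fm, then 1e-5 A/fm
        return (n ** 2 / (Z * alpha * m)) * hbar_c * 1.0e-5

    print(bohr_energy_eV())        # ~ -13.6 eV
    print(bohr_radius_angstrom())  # ~ 0.53 A
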
We notice that the quantities of interest can be described in terms of the
“coupling strength” α = e² and the reduced mass m ≈ m_e. In other words,
we have only one “scale” in the problem: m_e. In addition to this scale,
much of atomic physics is then describable in terms of three dimensionless
parameters:
1. α (strength of the interaction)
2. Z (number of protons in the nucleus)
3. me /mA (corrections for finite mass of nucleus).
Note that, if m_N is the nucleon (proton, neutron) mass, m_e/m_N ≈ 0.511/940 ∼
1/2000. This is a small number. Hence, the nucleus is essentially at rest in
the atom’s rest frame, since p_A = −p_e gives m_A v_A = −m_e v_e, and thus
|v_A| = (m_e/m_A)|v_e|. (27)

3 Exercises
1. We have reviewed the Bohr atom briefly. This atom is held together
by electromagnetism. We may consider the very similar problem of a
gravitationally bound “Bohr atom”. Part of the point of this problem is
to give you some practice making calculations with our simple h̄ = c = 1
units, so I hope you will attempt your calculations in this spirit. The
gravitational potential between two objects of masses m1 and m2 is
Gm1 m2
V =− ,
r
where r = |r1 − r2|, and

GN = 6.67 × 10^−11 m³ kg^−1 s^−2, (28)
   = 6.71 × 10^−39 GeV^−2, (29)

is Newton’s constant.

(a) With the same quantization condition as for Bohr’s atom, find
the formulas for the energy levels, relative velocities, and relative
separations, for a “gravity atom”. Let n be the quantum level
(i.e., n = 1, 2, 3, . . . is the orbital angular momentum).
(b) Find n for the earth-sun system. Is quantum mechanics important
here?
(c) What would r be for the ground state of the earth-sun system
(give answer in meters)?
(d) Consider a gravitationally bound system of two neutrons, where
we suppose that we have “turned off” other potentially important
interactions (such as the strong interaction). What is the ground
state energy (in GeV) and the ground state separation (in meters)
for this system? Is gravity important compared with other forces
for a real system of two neutrons in a real ground state?
(e) Gravity may be important in a quantum mechanical system if the
interaction potential is comparable with other forces. For exam-
ple, the electromagnetic strength is characterized by e2 = α =
1/137, and the strong interaction (though not strictly Coulom-
bic), is characterized by a larger number, αs ≈ 1. Supposing we

have a system of two equal masses, determine the mass (in GeV),
such that the corresponding gravitational interaction strength is
equal to one. That is, the potential energy should be given sim-
ply by 1/r (cf., V = e2 /r for two electrons in electromagnetism).
[This mass has a name; do you know what it is?]

2. Resonances I: An ensemble of neutron decays, observed from time t = 0,


will exhibit the characteristic radioactive decay law:

N(t) = N(0)e−t/τ , (30)

where τ = 886.7 ± 1.9 s (Review of Particle Properties, 2000) is the


mean lifetime of the neutron (convince yourself that this should be
the case). Also, if an ensemble of neutrons were each weighed very
precisely, it would be found that the mass distribution has a (small)
width. The full width at half maximum (FWHM) of this distribution
is typically denoted by Γ. The mean lifetime is inversely related to this
width: τ = 1/Γ.

(a) What is the width of the neutron, in electron volts?


(b) Classical resonances: Let us probe a bit what this “width” means,
and what it has to do with decay rate. We’ll start with a classical
example: Consider a damped, driven harmonic oscillator:

ẍ + Γẋ + ω0²x = cos ωt

Determine the frequency response of this classical oscillator, i.e.,


determine the square of the amplitude of oscillation as a function
of ω. (Why the square?)
(c) In the limit of a narrow resonance, what is the full width at half
maximum of the distribution you found in part b.
(d) Now suppose the driving force is turned off. How does the energy
stored in the oscillator change with time? Find the “lifetime” of
this oscillator. In this classical example, I mean the time it takes
for the oscillator to reach e−1 of its original energy. Your answer
should be very simply related to your answer for part c).
3. The Bohr model for the atom, while wrong, gave some remarkable
agreement with experiment, and a means of estimating atomic scales

as long as we didn’t push the model too hard. Let’s play with an-
other, wrong, model, in this problem, the “plum pudding” model of
J.J. Thomson. In this model, the atomic electrons are embedded in
a region of neutralizing positive charge. We assume that, within the
radius of the atom, the positive charge is uniformly distributed. We
consider an atom with atomic number Z, but which has been ion-
ized such that only one electron remains. A simple calculation with
Maxwell’s divergence equation yields that the electric field inside the
atom, due to the positive charge distribution, is linear in radius. The
force on the electron can therefore be written as:

Fr = −eEr = −αkr,

where −e is the electron charge, α = e2 , and k depends on the radius


R of the atom, and on Z.

(a) Write down the Hamiltonian for the electron in this “atom”, mak-
ing sure you define any quantities not already defined above. As-
sume that any contribution from radii greater than R may be
neglected.
(b) Assuming circular orbits, use the Bohr quantization condition on
angular momentum to derive the allowed energy spectrum.
(c) For the hydrogen atom, the ground state energy is -13.6 eV. Note
that we have to be a bit careful now in discussing the energy, since
we need to know where our reference (zero) is. Also, our for-
mula for the spectrum in part (b) must have a cut-off somewhere
due to the finite atom size. However, we’ll circumvent the com-
plication here by considering the difference between the ground
state and the first excited state, which for hydrogen is about 10
eV. Using this fact, determine the radius of the ground state orbit,
expressing your answer in Å to one significant digit.
[We really ought to check that your answer to part (c) is consistent
with the model, i.e., whether the radius obtained is less than R
or not. But I’ll leave this to your own amusement.]

Physics 195a
Course Notes
Preliminaries – Solutions to Exercises
021003 Frank Porter

1 Exercises
1. We have reviewed the Bohr atom briefly. This atom is held together
by electromagnetism. We may consider the very similar problem of a
gravitationally bound “Bohr atom”. Part of the point of this problem is
to give you some practice making calculations with our simple h̄ = c = 1
units, so I hope you will attempt your calculations in this spirit. The
gravitational potential between two objects of masses m1 and m2 is
Gm1 m2
V =− ,
r
where r = |r1 − r2|, and
GN = 6.67 × 10^−11 m³ kg^−1 s^−2, (1)
   = 6.71 × 10^−39 GeV^−2, (2)
is Newton’s constant.
(a) With the same quantization condition as for Bohr’s atom, find
the formulas for the energy levels, relative velocities, and relative
separations, for a “gravity atom”. Let n be the quantum level
(i.e., n = 1, 2, 3, . . . is the orbital angular momentum).
Solution: First, let us write:
V = −GmM/r,
where M = m1 + m2 and m is the reduced mass. The virial
theorem argument gives, for a circular orbit,
T = −(1/2)V = (1/2)GmM/r = (1/2)mv²,
where v is the relative speed and non-relativistic motion is as-
sumed (but should be checked!). Now we set mvr = n, and see
what this implies. The energy is:
E = T + V = −(1/2)GMm/r = −(1/2)mv²,
and thus
r = [n²/(GMm)](1/m), (3)
v = GMm/n, (4)
E = −(1/2)(GMm/n)² m. (5)

(b) Find n for the earth-sun system. Is quantum mechanics important


here?
Solution: The period is approximately T = 2πr/v = π × 10^7 s, and
the scale m is approximately m ≈ m⊕ = 6 × 10^27 g. The orbit
radius is r = 1.5 × 10^13 cm. Thus,
rv = 2πr²/T = n/m, (6)
or
n = 2πmr²/T
  = 2π [6 × 10^27 (g)][0.511 (MeV)/(9.11 × 10^−28 (g))][2.25 × 10^52 (fm²)]
      / {[π × 10^7 (s)][3 × 10^23 (fm/s)][200 (MeV-fm)]}
  = [2 × 6 × 0.511 × 2.25/(9.11 × 3 × 2)] × 10^(27+28+52−7−23−2)
  ≈ 2 × 10^74. (7)

The levels for such high values of n are extremely closely spaced,
hence the “quantumness” is essentially invisible.
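
A quick numerical check of the estimate in part (b), done directly in SI units (a Python sketch; the astronomical inputs are approximate):

    import math

    hbar = 1.055e-34      # J s
    m_earth = 6.0e24      # kg
    r_orbit = 1.5e11      # m
    T_year = 3.15e7       # s

    v = 2 * math.pi * r_orbit / T_year    # orbital speed, m/s
    n = m_earth * v * r_orbit / hbar      # quantization condition m v r = n hbar
    print(v, n)                           # v ~ 3e4 m/s, n ~ 2.5e74
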
(c) What would r be for the ground state of the earth-sun system
(give answer in meters)?
Solution: Since r ∝ n², and r = 1.5 × 10^26 fm for n = 2 × 10^74,
we have for n = 1:
r1 = (1.5 × 10^26/4 × 10^148) fm ≈ 4 × 10^−123 fm, (8)
a very small distance indeed!

(d) Consider a gravitationally bound system of two neutrons, where
we suppose that we have “turned off” other potentially important
interactions (such as the strong interaction). What is the ground
state energy (in GeV) and the ground state separation (in meters)
for this system? Is gravity important compared with other forces
for a real system of two neutrons in a real ground state?
Solution: We’ll approximate the mass of the neutron as mn = 1
GeV for this exercise. The reduced mass is m = 1/2mn and the
total mass is M = 2mn . The ground state energy corresponds to
n = 1 in this model:
E1 = −(1/2)G²[(2m_n)(m_n/2)]²(m_n/2) = −(1/4)G² m_n^5. (9)
Newton’s constant is, approximately,
G = 6.7 × 10^−39 GeV^−2. (10)
Thus,
E1 ≈ −10^−77 GeV. (11)
The ground state separation between the two neutrons in this
model is
r1 ≈ 2/(G m_n³)
   ≈ [2 × (200 MeV-fm) × (10^−15 m/fm)] / [(6.7 × 10^−39 GeV^−2)(1 GeV)³(10^3 MeV/GeV)]
   ≈ 6 × 10^22 m. (12)
Nuclear sizes are of order 1 fm, and nuclear binding energies are
of order MeV (e.g., the binding energy of the deuteron is approxi-
mately 2 MeV). It appears that gravity is unimportant compared
with the strong interaction (at least) for a system of two neutrons
which are not far apart.
(e) Gravity may be important in a quantum mechanical system if the
interaction potential is comparable with other forces. For exam-
ple, the electromagnetic strength is characterized by e2 = α =
1/137, and the strong interaction (though not strictly Coulom-
bic), is characterized by a larger number, αs ≈ 1. Supposing we
have a system of two equal masses, determine the mass (in GeV),
such that the corresponding gravitational interaction strength is

equal to one. That is, the potential energy should be given sim-
ply by 1/r (cf., V = e2 /r for two electrons in electromagnetism).
[This mass has a name; do you know what it is?]
Solution: We wish to find the mass MP such that
|V(r)| = 1/r = GMP²/r. (13)
Hence, the “Planck mass” is
MP = 1/√G = 1/√(6.7 × 10^−39 GeV^−2) ≈ 10^19 GeV. (14)
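
The three numbers in parts (d) and (e) can be checked with a few lines of Python (a sketch; G is taken in GeV^−2 and ħc in GeV·m, with the neutron mass approximated as 1 GeV as above):

    G = 6.7e-39                 # Newton's constant, GeV^-2
    m_n = 1.0                   # neutron mass, GeV (approximate)
    hbar_c_GeV_m = 0.1973e-15   # GeV m

    E1 = -0.25 * G ** 2 * m_n ** 5               # Eq. 9, in GeV
    r1 = 2.0 / (G * m_n ** 3) * hbar_c_GeV_m     # Eq. 12, converted to meters
    M_planck = G ** -0.5                         # Eq. 14, in GeV

    print(E1, r1, M_planck)   # ~ -1e-77 GeV, ~ 6e22 m, ~ 1.2e19 GeV
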

2. Resonances I: An ensemble of neutron decays, observed from time t = 0,


will exhibit the characteristic radioactive decay law:

N(t) = N(0)e−t/τ , (15)

where τ = 886.7 ± 1.9 s (Review of Particle Properties, 2000) is the


mean lifetime of the neutron (convince yourself that this should be
the case). Also, if an ensemble of neutrons were each weighed very
precisely, it would be found that the mass distribution has a (small)
width. The full width at half maximum (FWHM) of this distribution
is typically denoted by Γ. The mean lifetime is inversely related to this
width: τ = 1/Γ.

(a) What is the width of the neutron, in electron volts?


Solution: The width of the mass distribution is
Γ = 1/τ = (197.3 MeV-fm)(10^6 eV/MeV)/[(886.7 ± 1.9 s)(3 × 10^23 fm/s)]
        = (7.417 ± 0.016) × 10^−19 eV.
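
The same arithmetic, as a one-line check (Python sketch):

    hbar_c_eV_fm = 197.3e6   # eV fm
    c_fm_per_s = 3.0e23      # fm/s
    tau = 886.7              # s

    Gamma_eV = hbar_c_eV_fm / (tau * c_fm_per_s)   # Gamma = hbar/tau
    print(Gamma_eV)                                # ~ 7.4e-19 eV
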

(b) Classical resonances: Let us probe a bit what this “width” means,
and what it has to do with decay rate. We’ll start with a classical
example: Consider a damped, driven harmonic oscillator:

ẍ + Γẋ + ω0²x = cos ωt

Determine the frequency response of this classical oscillator, i.e.,


determine the square of the amplitude of oscillation as a function
of ω. (Why the square?)

Solution: The square of the amplitude is interesting because it
is proportional to the stored energy. The general solution to the
motion is the sum of the solution to the homogeneous equation
plus a particular solution to the inhomogeneous equation. Because
the solution to the homogeneous equation is transient (due to the
damping term), we need only concern ourselves with a particular
solution, for which we take:
x(t) = A cos(ωt + φ). (16)
We substitute this into the differential equation, obtaining:
−ω²A cos(ωt + φ) − ΓωA sin(ωt + φ) + ω0²A cos(ωt + φ) = cos ωt. (17)
We may evaluate this at ωt + φ = 0,
−A(ω² − ω0²) = cos φ, (18)
and at ωt + φ = π/2,
−AΓω = sin φ. (19)
Using sin²φ + cos²φ = 1, we thus find
A² = 1/[(ω² − ω0²)² + ω²Γ²]. (20)

(c) In the limit of a narrow resonance, what is the full width at half
maximum of the distribution you found in part b.
Solution: The peak of the distribution is at
d(1/A²)/dω² = 0 = 2(ω² − ω0²) + Γ², (21)
or ω² = ω0² − Γ²/2. The value of A² at the peak is therefore
A²_max = 1/[Γ²(ω0² − Γ²/4)] ≈ 1/(ω0²Γ²), (22)
the approximation being in the “narrow resonance” limit, ω0 ≫ Γ.
Thus, the amplitude at half-maximum is
(1/2)·1/[Γ²(ω0² − Γ²/4)] = 1/[(ωh² − ω0²)² + ωh²Γ²], (23)
where ωh are the frequencies at half maximum. Taking the inverse
of both sides, we obtain a quadratic equation in ωh² − ω0², with
solution
ωh² − ω0² = (Γ²/2)[−1 ± √((2ω0/Γ)² − 1)]. (24)
In the narrow resonance limit, we find that FWHM = Γ.
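
The FWHM = Γ result of part (c) is easy to confirm numerically; e.g., the following Python sketch scans A²(ω) of Eq. 20 for arbitrary illustrative values of ω0 and Γ:

    import numpy as np

    omega0, Gamma = 10.0, 0.1          # arbitrary illustrative values
    omega = np.linspace(omega0 - 1.0, omega0 + 1.0, 200001)
    A2 = 1.0 / ((omega**2 - omega0**2)**2 + omega**2 * Gamma**2)

    half = 0.5 * A2.max()
    above = omega[A2 >= half]          # frequencies at or above half maximum
    print(above[-1] - above[0], Gamma) # FWHM ~ 0.1 ~ Gamma
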
(d) Now suppose the driving force is turned off. How does the energy
stored in the oscillator change with time? Find the “lifetime” of
this oscillator. In this classical example, I mean the time it takes
for the oscillator to reach e−1 of its original energy. Your answer
should be very simply related to your answer for part c).
Solution: Once the driving force is removed, the oscillations
damp down according to the solution to the homogeneous equa-
tion:
x(t) = A e^{−Γt/2} cos(√(ω0² − Γ²/4) t). (25)

The stored energy is proportional to the square of the maximum


amplitude of each cycle, in the limit ω0 ≫ Γ. Hence the time to
decay to 1/e of the initial stored energy is just 1/Γ.

3. The Bohr model for the atom, while wrong, gave some remarkable
agreement with experiment, and a means of estimating atomic scales
as long as we didn’t push the model too hard. Let’s play with an-
other, wrong, model, in this problem, the “plum pudding” model of
J.J. Thomson. In this model, the atomic electrons are embedded in
a region of neutralizing positive charge. We assume that, within the
radius of the atom, the positive charge is uniformly distributed. We
consider an atom with atomic number Z, but which has been ion-
ized such that only one electron remains. A simple calculation with
Maxwell’s divergence equation yields that the electric field inside the
atom, due to the positive charge distribution, is linear in radius. The
force on the electron can therefore be written as:

Fr = −eEr = −αkr,

where −e is the electron charge, α = e2 , and k depends on the radius


R of the atom, and on Z.

(a) Write down the Hamiltonian for the electron in this “atom”, mak-
ing sure you define any quantities not already defined above. As-
sume that any contribution from radii greater than R may be
neglected.
Solution: Let m be the reduced mass of the electron-positive
charge system, and p be the relative momentum vector with mag-
nitude p = mv. The Hamiltonian is:
H = p²/2m + (1/2)αkr².

(b) Assuming circular orbits, use the Bohr quantization condition on


angular momentum to derive the allowed energy spectrum.
Solution: The Bohr quantization condition is that the angular
momentum is quantized:

L = mvr = n,

for an orbit of radius r. To get the energy spectrum, we may


consider that the centripetal and electrostatic forces must be equal
and opposite:
αkr = mv²/r.
Hence,
v² = (αk/m) r²,
r² = n/√(αkm),
v² = (n/m)√(αk/m).
Since the force balance gives (1/2)αkr² = (1/2)mv², the energy is
E = (1/2)mv² + (1/2)αkr² = mv², and finally:
En = n√(αk/m),
where n = 1, 2, . . .
(c) For the hydrogen atom, the ground state energy is -13.6 eV. Note
that we have to be a bit careful now in discussing the energy, since

we need to know where our reference (zero) is. Also, our for-
mula for the spectrum in part (b) must have a cut-off somewhere
due to the finite atom size. However, we’ll circumvent the com-
plication here by considering the difference between the ground
state and the first excited state, which for hydrogen is about 10
eV. Using this fact, determine the radius of the ground state orbit,
expressing your answer in Å to one significant digit.
[We really ought to check that your answer to part (c) is consistent
with the model, i.e., whether the radius obtained is less than R
or not. But I’ll leave this to your own amusement.]
Solution: The ground state radius, r0, in terms of this energy
difference, ∆ = √(αk/m) = 10 eV, is:
r0 = 1/√(∆m)
   = [1/√((10 eV)(0.5 MeV)(10^6 eV/MeV))] (200 MeV-fm)(10^6 eV/MeV)(10^−5 Å/fm)
   = 0.9 Å

(1 Å is an acceptable answer here also.)

Physics 195a
Course Notes
Ideas of Quantum Mechanics
021024 F. Porter

1 Introduction
This note summarizes and examines the foundations of quantum mechanics,
including the mathematical background.

2 General Review of the Ideas of Quantum


Mechanics
2.1 States
We have in mind that there is a “system”, which is describable in terms
of possible “states”. A system could be something simple, such as a single
electron, or complex, such as a table.
Suppose we have a system consisting of N spinless particles. We use
the term “particle” to denote any object for which any internal structure
is unimportant. Classically, we may describe the state of this system by
specifying, at some time t the generalized coordinates and momenta:
{qi (t), pi (t), i = 1, 2, . . . , N} , (1)
where the spatial dimensionality of the qi and pi is implicit. The time evolu-
tion of this system is given by Hamilton’s equations:
q̇i = ∂H/∂pi, (2)
−ṗi = ∂H/∂qi. (3)
In quantum mechanics, it is not possible to give such a complete speci-
fication to arbitrary precision. For example, the limit to how well we may
specify the position and momentum of a particle in one dimension is limited
by the “uncertainty principle”: ∆x∆p ≥ 1/2, where ∆ indicates a range of
possible values. We’ll investigate this relation more explicitly later, but for

now it should just be a reminder of your elementary quantum mechanics un-
derstanding. We must be content with selecting a suitable set of quantities
which can be simultaneously specified to describe the state. We refer to this
set as a “Complete Set of Commuting Observables” (CSCO).
Specifying a CSCO corresponds to specifying the eigenvalues of an ap-
propriate complete set of commuting Hermitian operators, for the state in
question. Measurements (eigenvalues of Hermitian operators) of other quan-
tities cannot be predicted with certainty, only probabilities of outcomes can
be given. The evolution in time of the system is described by a “wave equa-
tion”, for example, the Schrödinger equation.

2.2 Probability Amplitudes


The quantum mechanical state of a system is described in terms of waves,
called probability amplitudes, or just “amplitudes” for short. Note that
probabilities themselves are always non-negative, so it is more difficult to
imagine the probabilities themselves as wavelike. Instead, the probabilities
are obtained by squaring the amplitudes:

Probability ∼ |ψ|2 , (4)

where ψ stands for the amplitude. More explicitly, the probability of ob-
serving state variable (e.g., position) x in volume element d3 (x) around x is
equal to:
|ψ(x)|2 d3 (x). (5)
A quantum mechanical probability is analogous to the intensity of a classical
wave.
The quantum mechanical wave evolves in time according to a time evo-
lution operator, e−iHt involving the Hamiltonian, H. Hence, if

e−iHt ψ0 (x) = ψ(x, t), (6)

where ψ0 (x) is the wave function at t = 0 in terms of coordinate position x,


then differentiation gives:
dψ(x, t)
i = Hψ(x, t). (7)
dt
We recognize this as the Schrödinger equation. Thus, the temporal frequency
of the wave is determined by the energy structure. For a particle of energy

E, the frequency is ω = E (or ν = E/2π). This hypothesis is also applied in
relativistic situations, for example, for a photon.
The spatial behavior of a wave is given by the deBroglie hypothesis:
A particle is described as a quantum mechanical wave with wavelength:
λ̄ ≡ λ/2π = 1/p, (8)

or with wavenumber k = 1/λ̄ = p. This relation is assumed to also hold


relativistically.
We may make a brief aside on the subject of “dispersion relations”. As in
classical electrodynamics, the relation between ω and k for a wave is called a
dispersion relation. In the case of the quantum mechanics of a free particle
of mass m, the dispersion relation is

ω = k 2 /2m (9)

for a non-relativistic particle (when we do not include the rest mass in the
energy, hence E = p2 /2m), and

ω 2 = k 2 + m2 (10)

for a relativistic particle.

2.3 Wave Equations


In quantum mechanics, the dynamics is determined by the wave equation.
The form of the wave equation is given by the dispersion relation. By analogy
with light, and ignoring issues of mathematical rigor, let us build physical
waves describing a particle of mass m from superpositions of plane waves.
Note well that we are assuming that the wave equation is linear, so linear
combinations of solutions are also solutions. Our plane wave building blocks
are:
ψ(x, t) = Aei(k·x−ωt) , (11)
where k=p and ω = E. We presume that this forms a “complete set” of
functions, that is, any physical state may be expanded as a linear superpos-
tion of elements of this set. Let us suppose we have a free particle. We search
for a differential equation which is satisfied by all plane waves which could

describe the particle. We shall postulate this to be our wave equation. As we
saw above, the dispersion equation for a (possibly relativistic) free particle is

E 2 − p2 = m2 , (12)

or
ω 2 − k 2 = m2 . (13)
Considering the following derivatives of the plane waves
∂²ψ(x, t)/∂t² = −ω²ψ(x, t), (14)
∇²ψ(x, t) = −k²ψ(x, t), (15)
we have
∂²ψ/∂t² − ∇²ψ = −m²ψ. (16)

We postulate this to be the desired wave equation. It is known as the Klein-


Gordon equation. It describes the motion, in quantum mechanics, of a free
particle of mass m.
However, free particles quickly become boring; we really want to be able
to discuss interactions, e.g., interacting particles. Demanding relativistic
invariance leads us into quantum field theory. However, we often don’t require
full invariance, and typically make two very useful simplifying assumptions
in non-relativistic quantum mechanics:

• The creation and destruction of particles is assumed not to occur. The


number of each particle type is constant. However, there is occasional
need to make exceptions to this assumption; the most notable is for
the photon.

• All particles (again, except for photons!) are assumed to be non-


relativistic. Typically this means we stop at order v 2 in the energy,
but sometimes we carry out calculations to higher order.

We have already seen that these assumptions are reasonable for ordinary
atomic systems.
In non-relativistic quantum mechanics (Schrödinger theory) the wave
function ψ(x, t) really has the precise meaning:

The probability that a measurement of the particle’s position at
time t will yield x in d3 (x) around x is:

|ψ(x, t)|2d3 (x). (17)

|ψ(x, t)|2 is a probability density.

Note that ψ(x, t) and ψ′(x, t) = e^{iθ}ψ(x, t), where θ is a real number, de-
scribe the same physical situation, since the probability density is unchanged
[where we make the inherent assumption that it is probabilities we can mea-
sure, not probability amplitudes].
Let us take the non-relativistic limit of our free particle wave equation:
E = √(m² + p²)
  = m + p²/2m + pO[(p/m)³]. (18)
Hence,
ψ(x, t) = A e^{i(p·x−Et)}
        ≈ A e^{i(p·x − p²t/2m)} e^{−imt} (19)
        ≈ ψS(x, t) e^{−imt},
where
ψS(x, t) ≡ A e^{i(p·x − p²t/2m)}. (20)
But |ψ(x, t)|² = |ψS(x, t)|², so we may drop the overall e^{−imt} phase factor and
look for a linear differential equation satisfied by our non-relativistic plane
wave solutions (dropping the S subscript now). We have

∂ψ/∂t = −i(p²/2m)ψ, (21)
∇²ψ = −p²ψ. (22)
Thus,
i ∂ψ/∂t = −(1/2m)∇²ψ. (23)
This is the Schrödinger equation for a free particle of mass m. Note the
correspondence with the dispersion relation, Eq. 9.
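
One may verify directly that the plane wave of Eq. 20 satisfies Eq. 23; for example, the following sympy sketch does so in one spatial dimension (with ħ = 1 and the overall constant set to one):

    import sympy as sp

    x, t = sp.symbols('x t', real=True)
    p, m = sp.symbols('p m', positive=True)

    psi = sp.exp(sp.I * (p * x - p**2 / (2 * m) * t))  # non-relativistic plane wave
    lhs = sp.I * sp.diff(psi, t)                       # i dpsi/dt
    rhs = -sp.diff(psi, x, 2) / (2 * m)                # -(1/2m) d^2 psi/dx^2
    print(sp.simplify(lhs - rhs))                      # 0
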

In the non-relativistic case, it is easy to generalize to situations where
the particle is not free: Introduce a potential function, V(x, t), to describe
interactions. Our hypothesis is that the time dependence of the wave is
determined by ω = E. With V ≠ 0, this gives
E = T + V = p²/2m + V(x, t). (24)
Thus,
i ∂ψ/∂t = i ∂/∂t exp[i(p · x − Et)]
        = Eψ
        = Hψ, (25)

where H = T + V is the Hamiltonian operator. We may sometimes need to


make the distinction between an operator and its spectral values (eigenvalues)
more explicit. When this arises, we will use notation of the form Ĥ or Hop
to denote the operator. However, we usually rely on context, and omit such
notational guides.
We also have that
−(1/2m)∇²ψ(x, t) = (p²/2m)ψ(x, t) = (E − V)ψ(x, t). (26)
Putting this together with Eq. 25, we find:
Hψ(x, t) = { Eψ(x, t)
           { i ∂t ψ(x, t)
         = −(1/2m)∇²ψ(x, t) + V(x, t)ψ(x, t). (27)
The upper form, in which E is the energy eigenvalue for a static potential,
is referred to as the time-independent Schrödinger equation. The lower
form is referred to as the time-dependent Schrödinger equation.

3 Mathematical Considerations
Let us take a step back now, and set up a more rigorous mathematical frame-
work in which to implement the notions we have been discussing. It is a
highly reassuring feature of quantum mechanics that we are able to do so.
We have decided to describe particles by “waves”, giving “probability am-
plitudes”, where absolute squares lead to measurable physical probabilities.

Waves are conveniently described by complex-valued functions of whatever
generalized coordinates are involved. An essential feature is that waves in-
terfere, hence our state space must allow for the possibility of superposition
of waves.
• The requirement of superposability suggests that the state space of
permissible wave functions should be a vector space. This gives us the
property that linear combinations of physical amplitudes lead to new
physically allowed amplitudes.

• To deal with the probability interpretation, we briefly consider the


definition of probability:
Def: Probability: If S is a (sample) space, and P (E) is a real additive
set function defined on sets E in S, then P is referred to as a
probability function if:
1. If E is a subset (event) in S, then P (E) ≥ 0.
2. P (S) = 1.
3. If E, F ⊆ S, and E ∩ F = ∅, then P (E ∪ F ) = P (E) + P (F ).
4. If S is an infinite sample space, we require that:

P (E1 ∪ E2 ∪ . . .) = P (E1 ) + P (E2 ) + . . . (28)

for any sequence of disjoint events E1 , E2 , . . . in S.


[For those with the mathematical background, we remark that a shorter
definition for probability is: A probability function is a measure on S
such that P (S) = 1. We note that P is defined on all subsets E of S,
hence is defined on a σ-ring.]
Thus, the requirement of a probability interpretation means that any
allowable wave function ψ(s) defined on sample space S must be nor-
malizable and square-integrable such that:

∫_S ψ∗(s)ψ(s) µ(ds) = 1. (29)

The integral here is in the Lebesgue-summable sense, and µ is the


appropriate measure function on subsets of S. A “measure function”
is simply a prescription for measuring the “sizes” of sets, implemented
in a mathematically rigorous manner.

The mathematical considerations here are both critical to the foundation
of quantum mechanics and potentially unfamiliar to the reader, so we will
digress briefly in order to develop an intuitive understanding of the need for
them.
Apparently, it is important to know how to measure the sizes of sets
in our probability sample space. This is implemented abstractly in measure
theory. We will not develop this here; a couple of examples should provide the
intuition that is sufficient for present purposes. For a first example, suppose
our sample space is the set of real numbers, R1 . In this case, the appropriate
way to measure sizes of sets is a suitable generalization of our ordinary notion
that the size of the interval (a, b) is just b − a. This generalization is called
the Lebesgue measure on R1. It has the property that a denumerable set
of discrete points is measurable, with measure zero.
We remark that the Riemann integral is not sufficiently general for our
purposes. A function f (x) is Riemann-integrable on [a, b] if and only if:

• f (x) is bounded.

• The set of points of discontinuity of f has Lebesgue measure zero.

For example, the integral
∫_0^1 f(x) dx, (30)
where
f(x) = { 0 if x is rational,
       { 1 if x is irrational, (31)
is not defined. The function is discontinuous at every point, hence the mea-
sure of the points of discontinuity is non-zero:
µ({points of discontinuity}) = µ((0, 1)) = 1. (32)

This is perhaps a pathological example. A more obvious example is that the


Riemann sum doesn’t allow us to sum over state variables with possibly dis-
crete spectra, e.g., quantized energy levels. We could handle such situations
in an ad hoc manner, but to build a rigorous foundation we resort to the
Lebesgue integral.
The idea of the Lebesgue integral is simple and elegant. Rather than
divide the “x-axis” up into intervals, as in the Riemann integral, we divide
the “y-axis”. That is, we partition the y-axis into intervals ∆yi , i = 1, 2, . . .

Choose a point yi in each interval. Consider the sets f −1 (∆yi ). Multiply the
measure of each such set by the corresponding yi. Then sum the products.
The Lebesgue integral is the value of this sum in the limit where all of the
∆yi intervals vanish.

Figure 1: (a) The Riemann integral is a limit of slices in x. (b) The Lebesgue
integral is a limit of slices in y.


I_L ≡ lim_{∆yi→0} Σ_i yi µ[f^−1(∆yi)]. (33)

For this to work, f^−1(∆yi) must be measurable sets, that is, f(x) must
be a “measurable function”:
Def: A real function f (x), defined on S is said to be measurable if, for
every real number u, the set Su = {x : f (x) < u, x ∈ S} is measurable.

For example, consider the function of Eqn. 31. Take, in the limit, y1 = 0 and
y2 = 1. Then
f −1 (∆y1 ) = rational numbers
f −1 (∆y2 ) = irrational numbers, (34)

Figure 2: The set Su .

and

1 = µ([0, 1]) = µ({rationals on [0, 1]}) + µ({irrationals on [0, 1]})


= 0 + µ({irrationals on [0, 1]}). (35)

Hence,
∫_0^1 f(x) µ(dx) = 1 (36)
is the Lebesgue integral.
The choice of measure (of the size of a set) may depend on the physical
circumstance. We have used the Lebesgue measure on R1 , appropriate to
continuous state variables. Another important measure is the Dirac measure
(on S = R1 ): Let x0 ∈ R1 . The Dirac measure associated with point x0 is
defined by:
µ(E) = { 1 if x0 ∈ E
       { 0 if x0 ∉ E. (37)

Note that this is the appropriate measure to use for “discrete” state variables:
∫_S f(x) µ(dx) = f(x0). (38)

We are still trying to build a suitable function space for our quantum
mechanical wave functions. What about “pathological” functions, e.g., with
many discontinuities? We will build into our space the concept that two
functions that differ only in ways which will not affect observable probabilities
are not to be considered as distinct. We proceed as follows:

Def: A property, Q(x), which depends on location x in space S, is said to


hold almost everywhere if the set of points for which Q does not
hold has measure zero.

Def: Two functions f1 (x), f2 (x), defined on S (that is, assume finite values
at every point of S) are said to be equivalent if f1 (x) = f2 (x) almost
everywhere: f1 ∼ f2 .

If equivalent functions f1 (x) and f2 (x) are integrable in the Lebesgue


sense (“summable”) on a set E, then
∫_E f1(x)µ(dx) = ∫_E f2(x)µ(dx). (39)

Thus, if we decompose the set of summable functions into classes of equivalent


functions, the integral can be regarded as a functional defined on the space
F , of these classes.

3.1 The Space L2


For our quantum mechanical wave functions we are of course interested in
complex functions. A complex function f (x) = f1 (x) + if2 (x), where f1 and
f2 are real functions, is said to be summable on E if f1 and f2 are summable:
∫_E f(x)µ(dx) = ∫_E f1(x)µ(dx) + i ∫_E f2(x)µ(dx). (40)

Theorem: A complex function f (x) is summable if and only if its absolute


value,
|f(x)| = √(f1(x)² + f2(x)²), (41)
is summable.

Proof: Suppose f(x) = f1(x) + if2(x) is summable on E. Then f1 and f2
are summable on E. By virtue of our definition of the integral, Eqn. 33,
|f1| and |f2| are therefore also summable. Hence,
∫_E |f(x)|µ(dx) = ∫_E |f1(x) + if2(x)|µ(dx)
≤ ∫_E [|f1(x)| + |f2(x)|]µ(dx), by the triangle inequality
< ∞. (42)

Conversely, suppose |f| is summable on E. Then, again referring to
the definition in Eqn. 33,
|∫_E f(x)µ(dx)| ≤ ∫_E |f(x)|µ(dx)
< ∞. (43)

Even for complex functions, the integral defines a linear functional. Let
L denote the space of complex functions f (x) such that |f (x)|2 is summable
on S:
∫_S |f(x)|²µ(dx) < ∞, for f(x) ∈ L. (44)

Theorem: The space L is a linear space (or “vector” space).

Proof: The principal step of the proof is as follows: Suppose f (x) ∈ L and
g(x) ∈ L. Then

|f(x) + g(x)|² = 2|f(x)|² + 2|g(x)|² − |f(x) − g(x)|²
               ≤ 2|f(x)|² + 2|g(x)|². (45)

But 2|f (x)|2 + 2|g(x)|2 is summable, and hence |f (x) + g(x)|2 is sum-
mable.

The space L is our candidate space for physical quantum mechanical wave
functions. However, there is a problem with it: There are distinct elements
of L, differing on sets of measure zero, which correspond to the same physics.
Let us tidy this ugliness up. Consider Z the subset of L consisting of functions
f(x) such that
∫_S |f(x)|²µ(dx) = 0. (46)

Note that Z is a linear subspace of L, since if f ∈ Z, then kf ∈ Z for
all complex constants k, and if f, g ∈ Z, then f + g ∈ Z since |f + g|2 ≤
2|f |2 + 2|g|2. Thus, we can define the “factor space”,

L2 = L/Z. (47)

That is, two functions fa (x), fb (x) in L determine the same class in L2 if and
only if the difference fa − fb vanishes almost everywhere, i.e.,
∫_S |fa(x) − fb(x)|²µ(dx) = 0. (48)

We say that the space L2 consists of functions f (x) such that |f (x)|2 is
summable on S, with the understanding that equivalent functions are not
considered distinct. In other words, L2 is a space of equivalence classes.
Finally, we add to this space the notion of a scalar product. We start by
noting that the product of two elements of L2 is summable:
Theorem: If f, g ∈ L2 , then f ∗ g is summable on S.

Proof: Write
f∗g = (1/4)[|f + g|² − |f − g|² + i|f − ig|² − i|f + ig|²]. (49)
Each term on the right is summable, and hence the product f∗g is summable.

Theorem: L2 is a Hilbert space, with scalar product defined by:



⟨f|g⟩ ≡ ∫_S f(x)∗ g(x)µ(dx). (50)

Proof: The proof starts by showing that L2 is a pre-Hilbert space, that is,
a linear space upon which a scalar product has been properly defined.
This consists in showing that:

1. ⟨f|f⟩ = 0 if and only if f = 0. The fact that L2 is a space of
equivalence classes is crucial here.
2. ⟨f|g⟩ = ⟨g|f⟩∗.
3. ⟨f|cg⟩ = c⟨f|g⟩.
4. ⟨f|g1 + g2⟩ = ⟨f|g1⟩ + ⟨f|g2⟩.

Once it has been demonstrated that L2 is a pre-Hilbert space, it remains
to show that L2 is complete. That is, it must be shown that every
Cauchy sequence of vectors in L2 converges to a vector in L2 .

A fundamental postulate of quantum mechanics is:


To every physical system S there corresponds
a separable Hilbert space, HS .

• L2 appears to be a suitable space for our probability amplitudes since


it is a linear space (hence we have superposition), and its elements are
normalizable (square-summable, hence can make a probability inter-
pretation).

• The addition of the scalar product permits us to make projections in our


vector space. Note that L itself was not sufficient for this construction,
since ⟨f|f⟩ = 0 is not equivalent to f = 0 in L. It should be understood
that using L2 is all right, since functions which differ only on sets
of measure zero will yield the same probabilistic, and hence physical,
results.

• The availability of the scalar product in particular leads to the possi-


bility of constructing a (orthonormal) “basis”.

• Completeness means that we have included a sufficiently large set of


vectors that we won’t encounter difficulties when we consider certain
sequences of vectors. We can construct a complete orthonormal ba-
sis {|eα } on a Hilbert space such that every vector |x ∈ H can be
expanded:
|x = |eα eα |x. (51)
α

• Abstractly, a separable space is a topological space T which contains a


denumerable (countable) set of points {t1 , t2 , . . .} which is dense in T .
The point of the postulate that the Hilbert space corresponding to any
physical system be separable is that there is then a denumerable dense
set of vectors. We may find a complete denumerable basis in which to
expand our vectors.

To complete the connection of this postulate with our space L2 we have


the theorem:

Theorem: The space L2 (a, b) (where it is permissible for a = −∞, b = ∞)
with Lebesgue measure is separable.

Proof: To prove this theorem, we first prove that there exists a complete
denumerable orthonormal basis in L2 . For example, on L2 (0, 2π), the
set of functions:
e^{ikx}/√(2π), k = 0, ±1, ±2, . . . , (52)
forms a complete orthonormal system. Then we show that from this
basis we can construct a countable dense set of vectors in L2 .

It may be noted that non-separable Hilbert spaces do exist. However, we


have so far not found a need to consider them for quantum mechanics.
On L2(−∞, ∞), with measure µ(dx) = e^{−x²}dx, the Hermite polynomials:
Hn(x) = [(−1)^n e^{x²}/√(2^n n! √π)] (d^n/dx^n) e^{−x²}, n = 0, 1, . . . (53)
form a complete orthonormal system. Alternatively, with measure µ(dx) =
dx, the functions e^{−x²/2}Hn(x) form a complete orthonormal system.
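
A numerical spot-check of this orthonormality, using Gauss-Hermite quadrature for the weight e^{−x²} (a Python sketch; hermval uses the physicists' Hermite polynomials, and the normalization matches Eq. 53 up to sign):

    import numpy as np
    from math import factorial
    from numpy.polynomial.hermite import hermgauss, hermval

    nodes, weights = hermgauss(60)   # sum w_i f(x_i) ~ integral f(x) exp(-x^2) dx

    def H_bar(n, x):
        # physicists' Hermite polynomial H_n, normalized for the weight exp(-x^2)
        c = np.zeros(n + 1)
        c[n] = 1.0
        return hermval(x, c) / np.sqrt(2.0**n * factorial(n) * np.sqrt(np.pi))

    for m in range(4):
        for n in range(4):
            val = np.sum(weights * H_bar(m, nodes) * H_bar(n, nodes))
            print(m, n, round(val, 6))   # ~1 when m == n, ~0 otherwise
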

4 Observables
An observable Q is a physical quantity. In quantum mechanics, we deal
with the probability p(Q, ∆) that a measurement will yield a value of Q in
a subset ∆ of the set of real numbers. A fundamental postulate of quantum
mechanics is that:
Every observable corresponds to a self-adjoint
operator defined in HS .
The term “defined in HS ” means, for operator Q, that x ∈ DQ ⊂ HS , and
Qx ∈ RQ ⊂ HS , where DQ is the domain of the operator, and RQ is its
range.
Self-adjoint operators are evidently an important class of operator – the
key point is that a self-adjoint operator is also a Hermitian operator, and
hence has a real eigenvalue spectrum. This is the physical reason why they
are of interest. Let us look at some of the mathematical aspects.

Def: (Adjoint) Let L be a linear operator, defined in HS with domain DL ,
such that DL is dense in HS (that is, D̄L = HS , where D̄L is the closure
of DL). The adjoint, L†, of L is defined by:[1]
⟨L†u|v⟩ = ⟨u|Lv⟩, ∀v ∈ DL. (54)

In other words, u is a vector in HS such that there exists a w ∈ HS satisfying
⟨u|Lv⟩ = ⟨w|v⟩. If this holds, then we say w = L†u; the adjoint operator
maps u to w. The requirement that DL be dense in HS is necessary in order
for L† to be uniquely defined. To see this, suppose that it is not unique, i.e.,
suppose there exist two vectors wa, wb such that
⟨u|Lv⟩ = ⟨wa|v⟩ = ⟨wb|v⟩, ∀v ∈ DL. (55)
In this case, ⟨(wa − wb)|v⟩ = 0. But wa − wb is thus orthogonal to every
vector in a dense set, and therefore wa − wb = 0. This last point could use
some further proof; we’ll depend on its evident plausibility here.

Def: Self-adjoint: If L† = L (which means: DL† = DL , and L† u = Lu for


all u ∈ DL ), then L is said to be self-adjoint.

Note the distinction between a self-adjoint operator, and a Hermitian oper-


ator, defined according to:
Def: Hermitian: A linear operator L, with DL ⊂ HS , is called Hermitian
if
⟨Lu|v⟩ = ⟨u|Lv⟩, ∀u, v ∈ DL. (56)

For example, in the case of a finite dimensional vector space, L is a square


matrix, and we have:

⟨u|Lv⟩ = u†Lv (57)
⟨Lu|v⟩ = (Lu)†v (58)
       = u†L†v
       = u†Lv if L† = L. (59)
[1] There are a variety of notations used to denote the adjoint of an operator, most
notably L† , L+ , and L∗ . We’ll adopt the “dagger” notation here, as it is consistent
with the familiar “complex-conjugate–transpose” notation for matrices. The “asterisk”
notation is common also, but we avoid it here on the grounds of potential confusion with
simple complex conjugation.

In this case, a self-adjoint operator is also a Hermitian operator:
⟨L†u|v⟩ = ⟨Lu|v⟩ = ⟨u|Lv⟩, ∀u, v ∈ DL = DL†. (60)

However, a Hermitian operator is not necessarily a self-adjoint operator, if


the space is infinite dimensional. The issue is one of domain. It can happen,
in an infinite dimensional Hilbert space, that a Hermitian operator, L, has
DL ⊂ DL†, as a proper subset. In this case, L is not self-adjoint.
Consider an example to illustrate this inequivalence. A differential equa-
tion (where L is a differential operator, and we write Lu = a) is not com-
pletely specified until we give certain boundary conditions which the solution
must satisfy. Thus, for a function u to belong to DL, not only must the ex-
pression Lu be defined, but u must also satisfy the boundary conditions. If
the boundary conditions are too restrictive, we might have DL ⊂ DL† but
DL ≠ DL†, so that a Hermitian operator may not be self-adjoint.
To illustrate with a specific example, let L = p be the momentum operator
in one dimension:
p = (1/i) d/dx, x ∈ [a, b]. (61)
The boundary conditions are to be specified, but the domain of this operator
is otherwise the set of continuous functions on [a, b]. This set of functions is
dense in our Hilbert space L2 (a, b). We look at the scalar product of pv with
u, where u and v are continuous functions:
⟨u|pv⟩ = ∫_a^b u∗(x) (1/i)(d/dx) v(x) dx (62)
       = (1/i) u∗(x)v(x)|_a^b − ∫_a^b (1/i)(d/dx)u∗(x) v(x) dx
       = (1/i)[u∗(b)v(b) − u∗(a)v(a)] + ⟨pu|v⟩ (63)
       = ⟨p†u|v⟩ (64)

The u∗ (b)v(b) − u∗ (a)v(a) boundary term portion of Eqn. 63 must vanish for
all v ∈ Dp in order for p to be a Hermitian operator; hence we shall assume
this condition. There is, however, more than one way to achieve this, even
with the dense requirement. For example, we could impose the boundary
condition v(a) = v(b) = 0, so that Dp = {v|v is continuous, and v(a) =
v(b) = 0}. In this case, u need not satisfy any constraints at a or b, and

Dp† = {u|u is continuous} ≠ Dp, since
⟨p†u|v⟩ = ⟨u|pv⟩ = ⟨(1/i)(du/dx)|v⟩, (65)
for all continuous functions u. We have p† = (1/i)d/dx with Dp† = {u|u continu-
ous on [a, b]}. So, p is Hermitian, but not self-adjoint, since Dp is a proper
subset of Dp†.
On the other hand, if we had chosen the extension of the above p with
boundary condition v(a) = v(b), then we would find a restriction of the above
p†, with Dp† = {u|u(a) = u(b), u continuous on [a, b]}. With this definition
p is a self-adjoint operator.
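
A concrete check of the Hermiticity condition for the first choice of domain (v(a) = v(b) = 0, u unconstrained) can be done symbolically; e.g., the following sympy sketch uses an arbitrarily chosen polynomial u and v = sin x on [0, 2π]:

    import sympy as sp

    x = sp.symbols('x', real=True)
    a, b = 0, 2 * sp.pi

    u = x**2          # u subject to no boundary condition
    v = sp.sin(x)     # satisfies v(a) = v(b) = 0

    pu = sp.diff(u, x) / sp.I
    pv = sp.diff(v, x) / sp.I

    lhs = sp.integrate(sp.conjugate(u) * pv, (x, a, b))   # <u|pv>
    rhs = sp.integrate(sp.conjugate(pu) * v, (x, a, b))   # <pu|v>
    print(sp.simplify(lhs - rhs))   # 0: the boundary term vanishes here
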

5 The Uncertainty Principle


The famous “uncertainty principle” is discussed in every introductory quan-
tum mechanics course. We revisit it briefly here. First, the reader is reminded
of the important Schwarz inequality:
Theorem: For any vectors φ, ψ in our Hilbert space,
|⟨φ|ψ⟩| ≤ √(⟨φ|φ⟩⟨ψ|ψ⟩). (66)

Equality holds if and only if φ and ψ are linearly dependent: φ = cψ,


where c is a complex number.

Proof: One way to prove the Schwarz inequality is to consider the non-
negative definite scalar product:
⟨φ + re^{iθ}ψ|φ + re^{iθ}ψ⟩ ≥ 0. (67)
Expanding the left hand side results in a quadratic expression in r. Con-
sidering the possible solutions for r yields a constraint on the discrim-
inant. The resulting inequality is the Schwarz inequality.

Suppose now that we have two self-adjoint operators A and B, and a


state vector in the domains of both. The average (mean) value (expectation
value) of observable A if the system is in state ψ is:

⟨A⟩ = ⟨ψ|Aψ⟩, (68)

where we assume that ψ is normalized. Likewise, the mean of observable B is
⟨B⟩ = ⟨ψ|Bψ⟩. We are presently interested in learning something about the
spreads of the distributions of observations of A and B. Thus, it is convenient
to subtract out the means by defining “shifted” operators:
AS ≡ A − ⟨A⟩, (69)
BS ≡ B − ⟨B⟩. (70)
The domains of the shifted operators are the same as the domains of the
unshifted operators. We immediately have that ⟨AS⟩ = ⟨BS⟩ = 0.
Define the commutator of AS and BS :

[AS , BS ] ≡ AS BS − BS AS = [A, B]. (71)

It should be noted that the product of two operators is defined by their


operation on a state vector: AB|ψ means first apply operator B to ψ, then
apply A to the result. The obvious questions of domain need to be dealt with,
of course. Thus, let us further require ψ ∈ DAB , ψ ∈ DBA , and consider:

|⟨[A, B]⟩| = |⟨ψ|AS BS ψ⟩ − ⟨ψ|BS AS ψ⟩| (72)
          ≤ |⟨ψ|AS BS ψ⟩| + |⟨ψ|BS AS ψ⟩| (triangle inequality)
          ≤ |⟨AS ψ|BS ψ⟩| + |⟨BS ψ|AS ψ⟩| (self-adjointness)
          ≤ 2|⟨AS ψ|BS ψ⟩|
          ≤ 2√(⟨AS ψ|AS ψ⟩⟨BS ψ|BS ψ⟩) (Schwarz inequality)
          ≤ 2√(⟨ψ|AS²ψ⟩⟨ψ|BS²ψ⟩) (self-adjointness). (73)

The variance of a distribution is a measure of its spread. For an observ-


able Q, the variance for a system in state ψ is defined by:
σQ² ≡ ⟨ψ|(Q − ⟨Q⟩)²ψ⟩ = ⟨Q²⟩ − ⟨Q⟩². (74)
The square root of the variance, σQ, is called the standard deviation. We
see that, for example, σA² = ⟨ψ|AS²ψ⟩. Thus, we may rewrite Eqn. 73 in the
form:
σA σB ≥ (1/2)|⟨[A, B]⟩|. (75)
This is a precise statement of the celebrated “uncertainty principle”. We shall
often use the convenient notation ∆A ≡ σA . The physical interpretation is

that, if we have two non-commuting observables, the product of the variances
of the probability distributions for these two observables is bounded below.
This is typically interpreted further with statements such as “the ability to
measure both variables simultaneously is limited. A measurement of one
observable disturbs the system in a way that affects the result of a second
measurement of the other observable.” While there is some justification for
such statements, one must be careful not to carry them too far – in case
of confusion, come back to what the principle actually says! For example,
“the ability to measure” carries a connotation that there may be an issue
of experimental resolution involved. While expermental resolution generally
needs to be folded into the analysis of an actual experiment, it has nothing
to do with the present point.

5.1 Example: Angular Momentum


The angular momentum operator for a particle is L = x × p, where x is the
position operator, and p is the momentum operator. This may be expressed
in components as:
Li = εijk xj pk. (76)
The summation convention is used here: a sum is implied over repeated
indices, in this case, there is a sum over j, k = 1, 2, 3. The quantity εijk is
known as the antisymmetric symbol:
εijk = { +1  i, j, k = cyclic permutation of 1, 2, 3,
       { −1  i, j, k = anti-cyclic permutation of 1, 2, 3, (77)
       {  0  any two indices the same.
We remark that L = x × p = −p × x are both acceptable, since only
commuting components of x and p are paired.
We know that
[pm , xn ] = −iδmn . (78)
We are interested in the commutation relations of the angular momentum
operators:

[Lα, Lβ] = εαjk εβmn [xj pk, xm pn] (79)
         = i(εαjm εβjn − εαjn εβjm) xm pn, (80)

where the algebra between Eqns. 79 and 80 is left as an exercise for the
reader. The reader is also encouraged to demonstrate that
        Eαβ,mn ≡ εαjm εβjn − εαjn εβjm = εαjβ εmjn .    (81)
With the identity of Eqn. 81, we obtain
        [Lα , Lβ ] = iεαβγ Lγ .    (82)
Thus, the uncertainty relation between components of angular momentum
is:
        ∆Lα ∆Lβ ≥ (1/2)|⟨[Lα , Lβ ]⟩| = (1/2)|εαβγ ⟨Lγ⟩|.    (83)
Let us illustrate this with an explicit example. We first anticipate the
generalization of angular momentum to include spin, with the same com-
mutation relations, and consider the simplest system with non-zero angular
momentum, spin-1/2. We’ll follow common convention, and pick our basis
to be eigenvectors of J3 (using now J to indicate angular momentum, leaving
L to stand for “orbital” angular momentum). We again anticipate the quan-
tization of spin, where the eigenvalues of J3 are ±1/2 for a spin-1/2 system.
In this basis, our angular momentum operator is:

        J = σ/2,    (84)

where σ = (σ1, σ2, σ3) are the Pauli matrices:

        σ1 = ( 0  1 ),   σ2 = ( 0  −i ),   σ3 = ( 1   0 ).    (85)
             ( 1  0 )         ( i   0 )         ( 0  −1 )
These are Hermitian matrices, hence correspond to observables.
Suppose, in this basis, we have the state

        ψ = (1/√2) ( 1 ),    (86)
                   ( 1 )

which is a superposition of J3 = ±1/2 eigenstates. We may compute expec-
tation values of angular momentum for this state:

        ⟨J1⟩ = (1/4)(1, 1) ( 0  1 ) ( 1 ) = 1/2    (87)
                           ( 1  0 ) ( 1 )

        ⟨J2⟩ = (1/4)(1, 1) ( 0  −i ) ( 1 ) = 0    (88)
                           ( i   0 ) ( 1 )

        ⟨J3⟩ = (1/4)(1, 1) ( 1   0 ) ( 1 ) = 0.    (89)
                           ( 0  −1 ) ( 1 )

To obtain the second moments, we notice that σi² = 1, i = 1, 2, 3. Thus,

        ⟨Ji²⟩ = 1/4,   i = 1, 2, 3,    (90)

hence

        (∆J1)² = ⟨J1²⟩ − ⟨J1⟩² = 1/4 − (1/2)² = 0,    (91)
        (∆J2)² = 1/4 − 0 = 1/4,    (92)
        (∆J3)² = 1/4 − 0 = 1/4.    (93)
Let us check the uncertainty relation involving J1 and J2 :
        ∆J1 ∆J2 = 0 · (1/2) = 0 ≥ (1/2)|⟨J3⟩| = 0.    (94)
So this relation is satisfied. Physically, it may readily be seen that our state is
actually an eigenstate of J1 with eigenvalue 1/2. It is a superposition of J2 =
±1/2 eigenstates. Even though our lower bound on the product of uncertainties
is zero, and is achieved, we cannot measure J1 and J2 simultaneously with
arbitrary precision. As soon as we know J1 = 1/2, a measurement of J2
will yield ±1/2 with equal probability. Alternatively, if we first measure J2 ,
obtaining a value of either 1/2 or −1/2, a subsequent measurement of J1 will
yield ±1/2 with equal probability. The measurement of J2 has disturbed the
state.
It should perhaps be remarked that the term “precision” here is in the
frequency sense: Imagine that you can prepare the identical state many times
and repeat the measurements. The measurements will yield different results
among the samplings, with expectation values as we have calculated, in the
limit of averaging over an infinite number of samplings.
Finally, let us also look at:
        ∆J2 ∆J3 = (1/2) · (1/2) = 1/4 ≥ (1/2)|⟨J1⟩| = 1/4.    (95)
Again, the uncertainty principle is satisfied.
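
As a quick numerical illustration of this example (an illustrative sketch, not part of the original discussion), the expectation values, standard deviations, and uncertainty products above can be checked directly in Python with numpy; the state and operators are exactly those of Eqns. 84–86:

import numpy as np

# Pauli matrices and spin-1/2 operators J_i = sigma_i / 2 (Eqns. 84-85)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
J = [s / 2 for s in sigma]

psi = np.array([1, 1], dtype=complex) / np.sqrt(2)   # the state of Eqn. 86

def expval(op):
    # <psi| op |psi> for the (normalized) state above
    return np.vdot(psi, op @ psi)

means = [expval(Ji).real for Ji in J]                 # 1/2, 0, 0
sigmas = [np.sqrt(expval(Ji @ Ji).real - m**2)        # 0, 1/2, 1/2
          for Ji, m in zip(J, means)]

# Check Delta J_a Delta J_b >= |<[J_a, J_b]>| / 2 for each pair
for a, b in [(0, 1), (1, 2), (2, 0)]:
    comm = J[a] @ J[b] - J[b] @ J[a]
    print(sigmas[a] * sigmas[b], ">=", abs(expval(comm)) / 2)

Running this prints the three uncertainty products together with their lower bounds, reproducing Eqns. 94 and 95.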

6 Exercises
1. Show that L2 is complete.

2. Complete the proof that the space L2 (a, b) is separable.

3. Show that if x ∈ H, where H is a separable Hilbert space, is orthogonal


to every vector in a dense set, then x = 0.

4. Complete the proof of the Schwarz inequality.

5. Complete the derivation of Eqns. 80, 81, and 82.

6. Time Reversal in Quantum Mechanics:


We wish to define an operation of time reversal, denoted by T , in
quantum mechanics. We demand that T be a “physically acceptable”
transformation, i.e., that transformed states are also elements of the
Hilbert space of acceptable wave functions, and that it be consistent
with the commutation relations between observables. We also demand
that T have the appropriate classical correspondence with the classical
time reversal operation.
Consider a system of structureless (“fundamental”) particles and let
X = (X1 , X2 , X3 ) and P = (P1 , P2 , P3 ) be the position and momentum
operators (observables) corresponding to one of the particles in the
system. The commutation relations are, of course:

[Pm , Xn ] = −iδmn ,
[Pm , Pn ] = 0,
[Xm , Xn ] = 0.

The time reversal operation T : t → t = −t, operating on a state


vector gives (in Schrödinger picture – you may consider how to make
the equivalent statement in the Heisenberg picture):

        T |ψ(t)⟩ = |ψ′(t′)⟩.

The time reversal of any operator, Q, representing an observable is then:

        Q′ = T Q T⁻¹.

(a) By considering the commutation relations above, and the obvious


classical correspondence for these operators, show that

T iT −1 = −i.

Thus, we conclude that T must contain the complex conjugation


operator K:

KzK −1 = z ∗ ,
for any complex number z, we require that T on any state yields
another state in the Hilbert space. We can argue that (for you
to think about) we can write: T = UK, where U is a unitary
transformation. If we operate twice on a state with T , then we
should restore the original state, up to a phase:

T 2 = η1,
where η is a pure phase factor (modulus = 1).
(b) Prove that η = ±1. Hence, T 2 = ±1. Which phase applies in any
given physical situation depends on the nature of U, and will turn
out to have something to do with spin, as we shall examine in the
future.

7. Let us consider the action of Galilean transformations on a quantum


mechanical wave function. We restrict ourselves here to the “proper”
Galilean Transformations: (i ) translations; (ii ) velocity boosts; (iii )
rotations. We shall consider a transformation to be acting on the state
(not on the observer). Thus, a translation by x0 on a state localized at
x1 produces a new state, localized at x1 + x0 . In “configuration space”,
we have a wave function of the form ψ(x, t). A translation T (x0 ) by
x0 of this state yields a new state (please don’t confuse this translation
operator with the time reversal operator of the previous problem, also
denoted by T , but without an argument):

        ψ′(x, t) = T (x0 )ψ(x, t) = ψ(x − x0 , t).    (96)

Note that we might have attempted a definition of this transformation
with an additional introduction of some overall phase factor. However,
it is our interest to define such operators as simply as possible, consis-
tent with what should give a valid classical correspondence. Whether
we have succeeded in preserving the appropriate classical limit must be
checked, of course.
Consider a free particle of mass m. The momentum space wave function
is

        ψ̂(p, t) = f̂(p) exp(−itp²/2m),    (97)

where p = |p|. The configuration space wave function is related by the
(inverse) Fourier transform:

        ψ(x, t) = (2π)^{−3/2} ∫_{(∞)} d³(p) e^{ix·p} ψ̂(p, t).    (98)

Obtain simple transformation laws, on both the momentum and config-
uration space wave functions, for each of the following proper Galilean
transformations:

(a) Translation by x0 : T (x0 ) (note that we have already seen the result
in configuration space).
(b) Translation by time t0 : M(t0 ).
(c) Velocity boost by v0 : V (v0 ). (Hint: first find

        ψ̂′(p, 0) = f̂′(p) = V (v0 ) f̂(p),    (99)

then

        ψ̂′(p, t) = f̂′(p) e^{−itp²/2m},    (100)

etc.)
(d) Rotation about the origin given by 3 × 3 matrix R: U(R).

Make sure your answers make sense to you in terms of classical corre-
spondence.
8. Consider the (real) vector space of real continuous functions with con-
tinuous first derivatives in the closed interval [0, 1]. Which of the fol-
lowing defines a scalar product?

(a) ⟨f|g⟩ = ∫_0^1 f′(x) g′(x) dx + f(0)g(0)

(b) ⟨f|g⟩ = ∫_0^1 f′(x) g′(x) dx

9. Consider the following equation in E∞ (infinite-dimensional Euclidean
space – let the scalar product be ⟨x|y⟩ ≡ Σ_{n=1}^{∞} xn* yn ):

Cx = a,

where the operator C is defined by (in some basis):

C(x1 , x2 , . . .) = (0, x1 , x2 , . . .)

Is C:

(a) A bounded operator [i.e., does there exist a non-negative real


number α such that, for every x ∈ E∞ , we have |Cx| ≤ α|x|
(“|x|” denotes the norm: √⟨x|x⟩)]?
(b) A linear operator?
(c) A hermitian operator (i.e., does ⟨x|Cy⟩ = ⟨Cx|y⟩)?
(d) Does Cx = 0 have a non-trivial solution? Does Cx = a always
have a solution?

Now answer the same questions for the operator defined by:

G(α1 , α2 , . . .) = (α1 , α2 /2, α3/3, . . .). (101)

Note that we require a vector to be normalizable if it is to belong to


E∞ – i.e., the scalar product of a vector with itself must exist.

10. Let f ∈ L2 (−π, π) be a summable complex function on the real interval


[−π, π] (with Lebesgue measure).

(a) Define the scalar product by:


        ⟨f|g⟩ = ∫_{−π}^{π} f*(x) g(x) dx,    (102)

for f, g ∈ L2 (−π, π). Starting with the intuitive, but non-trivial,


assumption that there is no vector in L2 (−π, π) other than the

trivial vector (f ∼ 0) which is orthogonal to all of the functions
sin(nx), cos(nx), n = 0, 1, 2, . . ., show that any vector f may be
expanded as:


        f (x) = Σ_{n=0}^{∞} (an cos nx + bn sin nx),    (103)

where

        a0 = (1/2π) ∫_{−π}^{π} f (x) dx    (104)

        an = (1/π) ∫_{−π}^{π} f (x) cos nx dx   (n > 0)    (105)

        bn = (1/π) ∫_{−π}^{π} f (x) sin nx dx.    (106)

[You may consult a text such as Fano’s Mathematical Methods of


Quantum Mechanics for a full proof of the completeness of such
functions.]
(b) Consider the function:

 −1
x < 0,
f (x) = 0 x = 0, (107)

+1 x > 0.
Determine the coefficients an , bn , n = 0, 1, 2, . . . for this function
for the expansion of part (a).
(c) We wish to investigate the partial sums in this expansion:
        fN (x) = Σ_{n=0}^{N} (an cos nx + bn sin nx).    (108)

Find the position, xN of the first maximum of fN (for x > 0).


Evaluate the limit of fN (xN ) as N → ∞. Give a numerical an-
swer. In so doing, you are finding the maximum value of the series
expansion in the limit of an infinite number of terms. [You may
find the following identity useful:
        Σ_{n=1}^{N} cos(2n − 1)x = sin 2Nx / (2 sin x).]    (109)
(d) Obviously, the maximum value of f (x), defined in part (b), is 1.
If the value you found for the series expansion is different from 1,
comment on the possible reconciliation of this difference with the
theorem you demonstrated in part (a).

11. Show that, with a suitable measure, any summation over discrete in-
dices may be written as a Lebesgue integral:


        Σ_{n=1}^{∞} f (xn ) = ∫_{{x}} f (x) µ(dx).    (110)

12. Resonances II: Quantum mechanical resonances – Earlier we investi-


gated some features of a classical oscillator with a “resonant” behavior
under a driving force. Let us begin now to develop a quantum me-
chanical analogue, of relevance also to scattering and particle decays.
For concreteness, consider an atom with two energy levels, E0 < E1 ,
where the transition E0 → E1 may be effected by photon absorption,
and the decay E1 → E0 via photon emission. Because the level E1
has a finite lifetime – we denote the mean lifetime of the E1 state by
τ – it does not have a precisely defined energy. In other words, it has
a finite width, which (assuming that E0 is the ground state) can be
measured by measuring precisely the distribution of photon energies in
the E1 → E0 decay. Call the mean of this distribution ω0 .

(a) Assume that the amplitude for the atom to be in state E1 is given
by the damped oscillatory form:
        ψ(t) = ψ0 e^{−iω0 t − t/2τ}

Show that the mean lifetime is given by τ , as desired.


(b) Note that our amplitude above satisfies a “Schrödinger equation”:

dψ(t) i
i = (ω0 − ) ψ(t)
dt 2τ
Suppose we add a sinusoidal “driving force” F e−iωt on the right
hand side, to describe the situation where we illuminate the atom
with monochromatic light of frequency ω. Solve the resulting in-
homogeneous equation for its steady state solution.

(c) Convince yourself (e.g., by “conservation of probability”) that the
intensity of the radiation emitted by the atom in this steady-state
situation is just | ψ(t) |2 . Thus, the incident radiation is “scat-
tered” by our atom, with the amount of scattering proportional
to the emitted radiation intensity in the steady state. Give an
expression for the amount of radiation scattered (per unit time,
per unit amplitude of the incident radiation), as a function of
ω. For convenience, normalize your expression to the amount of
scattering at ω = ω0 . Determine the full-width at half maximum
(FWHM) of this function of ω, and relate to the lifetime τ .
Note that the “Breit-Wigner” function is just the Cauchy distribution
in probability.
13. Time Reversal in Quantum Mechanics, Part II
We earlier showed that the time reversal operator, T , could be written
in the form:
T = UK,
where K is the complex conjugation operator and U is a unitary oper-
ator. We also found that
T 2 = ±1.
Consider a spinless, structureless particle. All kinematic operators for
such a particle may be written in terms of the X and P operators,
where
        [Pj , Xk ] = −iδjk
        T X T⁻¹ = X
        T P T⁻¹ = −P
(where the latter two equations follow simply from classical correspon-
dence).
If we work in a basis consisting of the eigenvectors of X, the eigenvalues
are simply the real position vectors, and hence:

        U X U⁻¹ = X.

In this basis, the matrix elements of P = −i∇ may be evaluated:

        ⟨x1 | P | x2⟩ = ∫_{(∞)} δ⁽³⁾(x − x1)(−i∇x) δ⁽³⁾(x − x2) d⁽³⁾x
                      = −i∇x1 δ⁽³⁾(x1 − x2).

Thus, these matrix elements are pure imaginary, and

        K P K⁻¹ = −P,

which implies finally

        U P U⁻¹ = P.
We conclude that for our spinless, structureless particle:

U = 1eiθ ,

where the phase θ may be chosen to be zero if we wish. In any event,


we have:
T = eiθ K,
and
T 2 = eiθ Keiθ K = 1.

(a) Show that, for a spin 1/2 particle, we may in the Pauli representa-
tion (that is, an angular momentum basis for our spin-1/2 system
such that the angular momentum operators are given by one-half
the Pauli matrices) write:

T = σ2 K,

and hence show that:


T 2 = −1.
Note that the point here is to consider the classical correspondence
for the action of time reversal on angular momentum.
By considering a direct product space made up of many spin-0
and spin 1/2 states (or by other equivalent arguments), this result
may be generalized: If the total spin is 1/2-integral, then T 2 = −1;
otherwise T 2 = +1.

(b) Show the following useful general property of an antiunitary op-
erator such as T :
Let
|ψ   = T |ψ
|φ = T |φ.
Then
ψ  |φ = φ|ψ.
This, of course, should agree nicely with your intuition about what
time reversal should do to this kind of scalar product.
(c) Show that, if |ψ is a state vector in an “odd” system (T 2 = −1),
then T |ψ is orthogonal to |ψ.

14. Suppose we have a particle of mass m in a one-dimensional potential


V = (1/2)kx² (and the motion is in one dimension). What is the minimum
energy that this system can have, consistent with the uncertainty prin-
ciple? [The uncertainty relation is a handy tool for making estimates
of such things as ground state energies.]

Physics 195a
Course Notes
Ideas of Quantum Mechanics – Solutions to Exercises
021024 F. Porter

1 Exercises
1. Show that L2 is complete.
2. Complete the proof that the space L2 (a, b) is separable.
3. Show that if x ∈ H, where H is a separable Hilbert space, is orthogonal
to every vector in a dense set, then x = 0.
Solution: Let {yα} ⊂ H be a set of elements dense in H such that
        ⟨x|yα⟩ = 0, ∀α.    (1)
Consider an element y ∈ H, in the closure of the set {yα }. We wish to
show that
        ⟨x|y⟩ = 0.    (2)
If we can show this, then we have demonstrated that x is orthogonal to
every element of H (including x), and hence x = 0, since H is a metric
space.
Since the closure of the dense set {yα } is obtained by including the
limits of all Cauchy sequences of elements of {yα }, we may consider
such a Cauchy sequence, say, z1 , z2 , . . ., such that
lim zn = z. (3)
n→∞

Thus,
        |⟨x|z⟩| = |⟨x|z⟩ − ⟨x|zn⟩|,   since ⟨x|zn⟩ = 0,    (4)
                = |⟨x|z − zn⟩|    (5)
                = |⟨x| lim_{j→∞} (zj − zn )⟩|.    (6)

But we can pick n to be sufficiently large that the ket vector is as close
as we wish to the zero vector. Hence |⟨x|z⟩| is smaller than any positive
number, and must be zero.

4. Complete the proof of the Schwarz inequality.
Solution: Given any two vectors φ, ψ ∈ H, consider the quantity:

        ⟨φ + re^{iθ}ψ | φ + re^{iθ}ψ⟩ ≥ 0,    (7)

where r and θ are real numbers. The value 0 is attained if and only if
φ + reiθ ψ = 0. Expand to obtain:
 
        r²⟨ψ|ψ⟩ + 2r Re[e^{iθ}⟨φ|ψ⟩] + ⟨φ|φ⟩ ≥ 0.    (8)

Equality holds if and only if φ and ψ are linearly dependent. Suppose


they are not; then the relation is a strict inequality, and there is no
real solution for r for the equality. Hence, the discriminant must be
negative:

        (Re[e^{iθ}⟨φ|ψ⟩])² − ⟨ψ|ψ⟩⟨φ|φ⟩ < 0,    (9)
or, since this must hold for all phases θ,

        |⟨φ|ψ⟩| < √(⟨ψ|ψ⟩⟨φ|φ⟩).    (10)

The case where φ and ψ are linearly related is readily verified.
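
For what it is worth, the inequality is easy to check numerically; the following sketch (an illustration, with an arbitrary dimension and arbitrary random vectors) shows strict inequality for generic vectors and equality for linearly dependent ones:

import numpy as np

rng = np.random.default_rng(0)
dim = 5

# Random complex vectors playing the role of phi and psi
phi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)

lhs = abs(np.vdot(phi, psi))                                     # |<phi|psi>|
rhs = np.sqrt(np.vdot(psi, psi).real * np.vdot(phi, phi).real)   # norm product
print(lhs, "<=", rhs)

# Linearly dependent case: equality (up to rounding)
psi2 = (2 - 3j) * phi
print(abs(np.vdot(phi, psi2)), "==",
      np.sqrt(np.vdot(psi2, psi2).real * np.vdot(phi, phi).real))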


5. Complete the derivation of Eqns. 80, 81, and 82.
Solution: For equation 80, first note that:

[xj pk , xm pn ] = xj pk xm pn − xm pn xj pk
= xj xm pk pn + xj [pk , xm ]pn − xm xj pn pk − xm [pn , xj ]pk
= −ixj pn δmk + ixm pk δnj .

Then,

        [Lα , Lβ ] = εαjk εβmn [xj pk , xm pn ]
                   = iεαjk εβmn (−xj pn δmk + xm pk δnj )
                   = i(−εαjm εβmn xj pn + εαjk εβmj xm pk )
                   = i(−εαmj εβjn xm pn + εαjn εβmj xm pn )
                   = i(εαjm εβjn − εαjn εβjm ) xm pn .

For equation 81, start with the definition:

        Eαβ,mn ≡ εαjm εβjn − εαjn εβjm .    (11)

Note first that Eαβ,mn = 0 if α = β or if m = n, because then the
two terms will cancel. Now assume that α ≠ β and m ≠ n. If α ≠ m
and α ≠ n, then we again get zero, since if j ≠ α and j ≠ m, then
j = n, which means the first term is zero; similarly for the second term.
Likewise, we have zero if β ≠ m and β ≠ n. Thus, consider α = m; in
this case:

        Emβ,mn = −εmjn εβjm
                = εmjn εmjβ = 1,   n.b. β = n.    (12)

Likewise, when α = n we get −1. Putting all these facts together, we
have

        Eαβ,mn = εαjβ εmjn .    (13)

For equation 82, we use 80 and 81:

        [Lα , Lβ ] = iEαβ,mn xm pn
                   = iεαjβ εmjn xm pn
                   = −iεαjβ εjmn xm pn
                   = −iεαjβ Lj
                   = iεαβj Lj .    (14)
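
The index identity of Eqn. 13, and hence the chain above, can be verified by brute force over all index values. A small numerical check (an illustration, not part of the solution), using numpy's einsum:

import numpy as np

# Antisymmetric (Levi-Civita) symbol epsilon_{ijk}
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0    # cyclic permutations
    eps[k, j, i] = -1.0   # anticyclic permutations

# E_{alpha beta, m n} = eps_{alpha j m} eps_{beta j n} - eps_{alpha j n} eps_{beta j m}
E = (np.einsum('ajm,bjn->abmn', eps, eps)
     - np.einsum('ajn,bjm->abmn', eps, eps))

# Claimed identity: E_{alpha beta, m n} = eps_{alpha j beta} eps_{m j n}
rhs = np.einsum('ajb,mjn->abmn', eps, eps)

print(np.allclose(E, rhs))   # True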

6. Time Reversal in Quantum Mechanics:


We wish to define an operation of time reversal, denoted by T , in
quantum mechanics. We demand that T be a “physically acceptable”
transformation, i.e., that transformed states are also elements of the
Hilbert space of acceptable wave functions, and that it be consistent
with the commutation relations between observables. We also demand
that T have the appropriate classical correspondence with the classical
time reversal operation.
Consider a system of structureless (“fundamental”) particles and let
X = (X1 , X2 , X3 ) and P = (P1 , P2 , P3 ) be the position and momentum
operators (observables) corresponding to one of the particles in the
system. The commutation relations are, of course:

[Pm , Xn ] = −iδmn ,

3
[Pm , Pn ] = 0,
[Xm , Xn ] = 0.
The time reversal operation T : t → t = −t, operating on a state
vector gives (in Schrödinger picture – you may consider how to make
the equivalent statement in the Heisenberg picture):

        T |ψ(t)⟩ = |ψ′(t′)⟩.


The time reversal of any operator, Q, representing an observable is
then:

        Q′ = T Q T⁻¹.
(a) By considering the commutation relations above, and the obvious
classical correspondence for these operators, show that
T iT −1 = −i.
Thus, we conclude that T must contain the complex conjugation
operator K:

KzK −1 = z ∗ ,
for any complex number z, we require that T on any state yields
another state in the Hilbert space. We can argue that (for you
to think about) we can write: T = UK, where U is a unitary
transformation. If we operate twice on a state with T , then we
should restore the original state, up to a phase:

T 2 = η1,
where η is a pure phase factor (modulus = 1).
Solution: Consider
T [P1 , X1 ]T −1 = T (−i)T −1 (15)
= T (P1 X1 − X1 P1 )T −1 (16)
= T P1 T⁻¹ T X1 T⁻¹ − T X1 T⁻¹ T P1 T⁻¹ (17)
= (−P1 )X1 − X1 (−P1 ) (18)
= i. (19)

(b) Prove that η = ±1. Hence, T 2 = ±1. Which phase applies in any
given physical situation depends on the nature of U, and will turn
out to have something to do with spin, as we shall examine in the
future.
Solution: Consider T 3 :
T 3 = T 2 T = ηT (20)
= T T 2 = T η = T ηT −1 T = η ∗ T. (21)
Hence η = η ∗ , and since it is of modulus one, we must have η = ±1.

7. Let us consider the action of Galilean transformations on a quantum


mechanical wave function. We restrict ourselves here to the “proper”
Galilean Transformations: (i ) translations; (ii ) velocity boosts; (iii )
rotations. We shall consider a transformation to be acting on the state
(not on the observer). Thus, a translation by x0 on a state localized at
x1 produces a new state, localized at x1 + x0 . In “configuration space”,
we have a wave function of the form ψ(x, t). A translation T (x0 ) by
x0 of this state yields a new state (please don’t confuse this translation
operator with the time reversal operator of the previous problem, also
denoted by T , but without an argument):
        ψ′(x, t) = T (x0 )ψ(x, t) = ψ(x − x0 , t).    (22)

Note that we might have attempted a definition of this transformation


with an additional introduction of some overall phase factor. However,
it is our interest to define such operators as simply as possible, consis-
tent with what should give a valid classical correspondence. Whether
we have succeeded in preserving the appropriate classical limit must be
checked, of course.
Consider a free particle of mass m. The momentum space wave function
is

        ψ̂(p, t) = f̂(p) exp(−itp²/2m),    (23)
where p = |p|. The configuration space wave function is related by the
(inverse) Fourier transform:

        ψ(x, t) = (2π)^{−3/2} ∫_{(∞)} d³(p) e^{ix·p} ψ̂(p, t).    (24)

Obtain simple transformation laws, on both the momentum and config-
uration space wave functions, for each of the following proper Galilean
transformations:

(a) Translation by x0 : T (x0 ) (note that we have already seen the


result in configuration space).
Solution: [Should draw a figure...] In configuration space, the
result was:

        ψ′(x, t) = T (x0 )ψ(x, t) = ψ(x − x0 , t).    (25)

Thus, in momentum space, we have:

        ψ̂′(p, t) = T (x0 )ψ̂(p, t)    (26)
                  = (2π)^{−3/2} ∫_{(∞)} e^{−ip·x} ψ′(x, t) d³(x)
                  = (2π)^{−3/2} ∫_{(∞)} e^{−ip·x} ψ(x − x0 , t) d³(x)
                  = (2π)^{−3/2} ∫_{(∞)} e^{−ip·(x+x0)} ψ(x, t) d³(x)
                  = e^{−ip·x0} ψ̂(p, t).    (27)

(b) Translation by time t0 : M(t0 ).


Solution: Let’s take the time translation to act on a configuration
space wave function in the obvious way:

M(t0 )ψ(x, t) = ψ(x, t − t0 ). (28)

Likewise,

M(t0 )ψ̂(p, t) = ψ̂(p, t − t0 ) (29)


 
                  = exp(it0 p²/2m) ψ̂(p, t).    (30)

(c) Velocity boost by v0 : V (v0 ). (Hint: first find

        ψ̂′(p, 0) = f̂′(p) = V (v0 ) f̂(p),    (31)

then

        ψ̂′(p, t) = f̂′(p) e^{−itp²/2m},    (32)

etc.)
Solution: Under a velocity transformation, the space coordinates
and momentum transform according to

        x′ = x + v0 t    (33)
        p′ = p + mv0 .    (34)

Thus, it appears reasonable to take

V (v0 )fˆ(p) = fˆ(p − mv0 ). (35)

Thus,
        ψ̂′(p, t) = f̂′(p) e^{−itp²/2m}    (36)
                  = f̂(p − mv0 ) e^{−itp²/2m}
                  = ψ̂(p − mv0 , t) exp[it(p − mv0 )²/2m] e^{−itp²/2m}
                  = exp[−it(p · v0 − mv0²/2)] ψ̂(p − mv0 , t).    (37)
In configuration space, this becomes

        ψ′(x, t) = (2π)^{−3/2} ∫_{(∞)} d³(p) e^{ip·x} ψ̂′(p, t)
                 = (2π)^{−3/2} ∫_{(∞)} d³(p) e^{ip·x} exp[−it(p · v0 − mv0²/2)] ψ̂(p − mv0 , t)
                 = (2π)^{−3/2} ∫_{(∞)} d³(q) e^{i(q+mv0)·x} exp[−it((q + mv0 ) · v0 − mv0²/2)] ψ̂(q, t)
                 = e^{imv0·x} e^{−it(mv0² − mv0²/2)} ∫_{(∞)} (2π)^{−3/2} d³(q) e^{iq·x} exp(−itq · v0 ) ψ̂(q, t)
                 = e^{imv0·x − it mv0²/2} ∫_{(∞)} (2π)^{−3/2} d³(q) exp[iq · (x − v0 t)] ψ̂(q, t)
                 = exp[imv0 · x − it mv0²/2] ψ(x − v0 t, t).    (38)

(d) Rotation about the origin given by 3 × 3 matrix R: U(R).


Solution: For a rotation on a vector x, rotating it to a new
vector x′, x′ = Rx. Acting on a wave function, the rotated wave
function ψ′, when evaluated at the rotated point x′, is the same as
the unrotated wave function evaluated at the unrotated point x:

        ψ′(x′, t) = ψ′(Rx, t) = ψ(x, t).    (39)

Thus,

        ψ′(x, t) = Rop ψ(x, t) = ψ(R⁻¹ x, t).    (40)
I have used the notation Rop to distinguish the operator on the
Hilbert space from the 3 × 3 matrix R.
In momentum space, we may note that the same argument holds
for momenta, or we may Fourier transform the configuration space
result. In either event, we obtain
Rop ψ̂(p, t) = ψ̂(R−1 p, t). (41)

Make sure your answers make sense to you in terms of classical corre-
spondence.
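
The transformation laws above are easy to spot-check numerically. As an illustration of the translation result, Eqn. 27 (a one-dimensional sketch with an arbitrary Gaussian packet, not part of the solution), translating ψ(x) by x0 multiplies its Fourier transform by e^{−ipx0}:

import numpy as np

x = np.linspace(-40.0, 40.0, 4001)
x0 = 1.7                                   # arbitrary translation
psi = np.exp(-0.5 * (x - 0.3)**2)          # arbitrary (unnormalized) wave packet

def ft(f, p):
    # (2*pi)^(-1/2) * integral of e^{-ipx} f(x) dx, approximated on the grid
    return np.trapz(np.exp(-1j * p * x) * f, x) / np.sqrt(2 * np.pi)

psi_shifted = np.exp(-0.5 * (x - x0 - 0.3)**2)     # psi(x - x0)
for p in np.linspace(-3.0, 3.0, 7):
    lhs = ft(psi_shifted, p)                       # transform of the translated packet
    rhs = np.exp(-1j * p * x0) * ft(psi, p)        # phase factor times original transform
    assert abs(lhs - rhs) < 1e-6
print("translation <-> momentum-space phase factor e^{-i p x0} verified")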
8. Consider the vector space of real continuous functions with continuous
first derivatives in the closed interval [0, 1]. Which of the following
defines a scalar product?
(a) ⟨f|g⟩ = ∫_0^1 f′(x) g′(x) dx + f(0)g(0)
(b) ⟨f|g⟩ = ∫_0^1 f′(x) g′(x) dx
Solution: A scalar product must satisfy, for any f, g, h ∈ V , the
conditions:
(a) ⟨f|f⟩ ≥ 0, with ⟨f|f⟩ = 0 iff f = 0.
(b) ⟨f|g⟩ = ⟨g|f⟩*.
(c) ⟨f|cg⟩ = c⟨f|g⟩, where c is any complex number.
(d) ⟨f|g + h⟩ = ⟨f|g⟩ + ⟨f|h⟩.
In the present case, we are dealing with real vector spaces, hence the
second condition becomes ⟨f|g⟩ = ⟨g|f⟩ and the constant in the third
condition is restricted to be a real number.
It may be readily checked that the scalar product defined in part (a):
        ⟨f|g⟩ = ∫_0^1 f′(x) g′(x) dx + f(0)g(0),    (42)

satisfies all of the properties. However, the product defined in (b):
        ⟨f|g⟩ = ∫_0^1 f′(x) g′(x) dx,    (43)

does not. The property that fails is the first – this product will yield
zero for ⟨f|f⟩ if f is any constant, i.e., not only f = 0.

9. Consider the following equation in E∞ (infinite-dimensional Euclidean
space – let the scalar product be ⟨x|y⟩ ≡ Σ_{n=1}^{∞} xn* yn ):

Cx = a,

where the operator C is defined by (in some basis):

C(x1 , x2 , . . .) = (0, x1 , x2 , . . .)

Is C:

(a) A bounded operator [i.e., does there exist a non-negative real


number α such that, for every x ∈ E∞ , we have |Cx| ≤ α|x|
(“|x|” denotes the norm: √⟨x|x⟩)]?
(b) A linear operator?
(c) A hermitian operator (i.e., does ⟨x|Cy⟩ = ⟨Cx|y⟩)?
(d) Does Cx = 0 have a non-trivial solution? Does Cx = a always
have a solution?

Now answer the same questions for the operator defined by:

G(α1 , α2 , . . .) = (α1 , α2 /2, α3/3, . . .). (44)

Note that we require a vector to be normalizable if it is to belong to


E∞ – i.e., the scalar product of a vector with itself must exist.
Solution: Both C and G are bounded operators: We note that |Cx| ≤
|x| and |Gx| ≤ |x|. Both C and G are also linear operators:

C(ax + by) = aCx + bCy, (45)

for any x, y ∈ E∞ and any complex numbers a, b. The same holds for
operator G. C is not a Hermitian operator:


        ⟨x|Cy⟩ = Σ_{n=2}^{∞} xn* yn−1
               ≠ Σ_{n=2}^{∞} xn−1* yn = ⟨Cx|y⟩.    (46)

However, G is Hermitian. Neither the equation Cx = 0 nor the equa-


tion Gx = 0 possess non-trivial solutions.
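
These statements can be illustrated on finite truncations of E∞. The sketch below (an illustration only; the truncation dimension is arbitrary and merely suggestive of the infinite-dimensional statements) checks boundedness and the hermiticity question for C and G:

import numpy as np

N = 6                                        # truncation dimension (illustrative only)
C = np.diag(np.ones(N - 1), k=-1)            # C(x1, x2, ...) = (0, x1, x2, ...)
G = np.diag(1.0 / np.arange(1, N + 1))       # G divides the n-th component by n

rng = np.random.default_rng(1)
x = rng.normal(size=N) + 1j * rng.normal(size=N)
y = rng.normal(size=N) + 1j * rng.normal(size=N)

# Boundedness with alpha = 1 (for these sample vectors)
print(np.linalg.norm(C @ x) <= np.linalg.norm(x))
print(np.linalg.norm(G @ x) <= np.linalg.norm(x))

# Hermiticity: compare <x|Cy> with <Cx|y>, and likewise for G
print(np.isclose(np.vdot(x, C @ y), np.vdot(C @ x, y)))   # False: C is not Hermitian
print(np.isclose(np.vdot(x, G @ y), np.vdot(G @ x, y)))   # True:  G is Hermitian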

10. Let f ∈ L2 (−π, π) be a summable complex function on the real interval


[−π, π] (with Lebesgue measure).

(a) Define the scalar product by:


        ⟨f|g⟩ = ∫_{−π}^{π} f*(x) g(x) dx,    (47)

for f, g ∈ L2 (−π, π). Starting with the intuitive, but non-trivial,


assumption that there is no vector in L2 (−π, π) other than the
trivial vector (f ∼ 0) which is orthogonal to all of the functions
sin(nx), cos(nx), n = 0, 1, 2, . . ., show that any vector f may be
expanded as:


        f (x) = Σ_{n=0}^{∞} (an cos nx + bn sin nx),    (48)

where

        a0 = (1/2π) ∫_{−π}^{π} f (x) dx    (49)

        an = (1/π) ∫_{−π}^{π} f (x) cos nx dx   (n > 0)    (50)

        bn = (1/π) ∫_{−π}^{π} f (x) sin nx dx.    (51)

[You may consult a text such as Fano’s Mathematical Methods of


Quantum Mechanics for a full proof of the completeness of such
functions.]

Solution: Let us start a bit more generally. We argued in the
solutions to problems 2 and 3 that: If we have an orthonormal set
of vectors {eα } in a Hilbert space H, then this set is complete if
and only if there is no vector in H, other than the zero vector,
which is orthogonal to all elements of {eα }. Recall also that {eα }
is complete if and only if the closure of the subspace formed by
{eα } is H.
Let us show that, if {eα } is a complete set of vectors, then we also
have that we may expand any vector f ∈ H as:

        f = Σ_α ⟨eα|f⟩ eα ,    (52)

where at most a denumerable number of terms are non-zero.


Consider a finite dimensional subspace, S, formed by a subset
of the {eα }, giving them labels {e1 , e2 , . . . , en }. We can make a
projection of f ∈ H onto this subspace according to:
        fS = Σ_{i=1}^{n} ⟨ei|f⟩ ei .    (53)

Take the scalar product of both sides with ej (j ∈ 1, 2, . . . , n):

        ⟨ej|fS⟩ = ⟨ej|f⟩.    (54)

Then ⟨ej|f − fS⟩ = 0, or f − fS is orthogonal to S. In this case,


the Pythagorean theorem applies, and

|f |2 = |fS |2 + |f − fS |2 . (55)

Thus, we have proven the following form of Bessel’s inequality:


        |f|² ≥ Σ_{i=1}^{n} |⟨ei|f⟩|²,    (56)

where the vector norm appears on the left, and absolute value
signs on the right.
We may now see that at most a countable set of coefficients ⟨ei|f⟩
can be non-vanishing: If the set is uncountable, then there must
be a limit point in the set of coefficients other than zero, and hence

11
the right hand side of Bessel’s inequality can be made arbitrarily
large. Hence, we consider the expansion in a countable set:



        f′ = Σ_{i=1}^{∞} ⟨ei|f⟩ ei .    (57)

Similarly with the above, we take the scalar product of both sides
with ej and learn that f − f′ is orthogonal to the subspace gen-
erated by {ei }; Bessel’s inequality generalizes to

        |f|² ≥ Σ_{i=1}^{∞} |⟨ei|f⟩|².    (58)

Consider now the scalar product of f − f′ with any element eα of
{eα }, where α = 1, 2, . . . This scalar product must be zero. Hence,
f − f′ is orthogonal to the subspace generated by {eα }. Since
this is a complete set, f − f′ = 0, which is the desired result.
The remainder of the exercise, deriving the expansion coefficients,
and showing that the functions are orthonormal, is straightfor-
ward.
(b) Consider the function:

 −1
x < 0,
f (x) = 0 x = 0, (59)

+1 x > 0.
Determine the coefficients an , bn , n = 0, 1, 2, . . . for this function
for the expansion of part (a).
Solution: Since f is an odd function, all of the ai coefficients are
zero. The remaining coefficients are:

        bn = (1/π) ∫_{−π}^{π} f (x) sin nx dx
           = (2/π) ∫_0^π sin nx dx
           = 0 for n even,   4/(nπ) for n odd.    (60)

In Fig. 1 we show the first 52 partial series expansions, fN (x).

Figure 1: The first 52 functions fN (x).

(c) We wish to investigate the partial sums in this expansion:


        fN (x) = Σ_{n=0}^{N} (an cos nx + bn sin nx).    (61)

Find the position, xN of the first maximum of fN (for x > 0).


Evaluate the limit of fN (xN ) as N → ∞. Give a numerical an-
swer. In so doing, you are finding the maximum value of the series
expansion in the limit of an infinite number of terms. [You may
find the following identity useful:
        Σ_{n=1}^{N} cos(2n − 1)x = sin 2Nx / (2 sin x).]    (62)

Solution: Only odd terms in the sum will contribute, so substi-

tute index n with 2k − 1
        fN (x) = (4/π) Σ_{n=1, n odd}^{N} sin nx / n
               = (4/π) Σ_{k=1}^{[(N+1)/2]} sin(2k − 1)x / (2k − 1).    (63)

Take the derivative and set equal to zero to find the extrema:
        0 = fN′(x) = (4/π) Σ_{k=1}^{[(N+1)/2]} cos(2k − 1)x
                   = (2/π) sin(2[(N + 1)/2]x) / sin x.    (64)
If N is even, the first maximum is at NxN = π, and if N is odd,
it is at (N + 1)xN = π. We want the limit:
        lim_{N→∞} fN (xN ) = lim_{N→∞} ∫_0^{xN} fN′(x) dx
                           = (2/π) lim_{N→∞} ∫_0^{xN} [sin(2[(N + 1)/2]x) / sin x] dx
                           = (2/π) lim_{N→∞} ∫_0^π [sin y / sin(y/(2[(N + 1)/2]))] dy/(2[(N + 1)/2])
                           = (2/π) ∫_0^π (sin y / y) dy    (65)
                           = (2/π) Si(π) = 1.179,    (66)
using the Handbook of Mathematical Functions to determine Si(π) ≈
1.85193.
(d) Obviously, the maximum value of f (x), defined in part (b), is 1.
If the value you found for the series expansion is different from 1,
comment on the possible reconciliation of this difference with the
theorem you demonstrated in part (a).
Solution: Two (related) points may be noted here: First, there
is no contradiction with the assertion that f (x) may be arbitrarily
closely approximated, at any x, by taking the expansion to suffi-
ciently high numbers of terms. Second, in the limit N → ∞, the

expansion differs from f (x) at a set of measure zero, and hence is
not considered to be a distinct function in our Hilbert space.
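
The Gibbs overshoot computed in part (c) is easy to see directly from the partial sums. The following check (an illustration, not part of the solution) evaluates fN at its first maximum for increasing N and compares with (2/π) Si(π):

import numpy as np

def f_N(x, N):
    # Partial Fourier sum of the step function: only odd sine terms contribute
    n = np.arange(1, N + 1, 2)
    return (4 / np.pi) * np.sum(np.sin(np.outer(np.atleast_1d(x), n)) / n, axis=1)

for N in (10, 100, 1000):
    xN = np.pi / N if N % 2 == 0 else np.pi / (N + 1)   # first maximum for x > 0
    print(N, f_N(xN, N)[0])

y = np.linspace(0, np.pi, 100001)
si_pi = np.trapz(np.sinc(y / np.pi), y)    # Si(pi), since sinc(y/pi) = sin(y)/y
print("limit:", 2 / np.pi * si_pi)         # ~1.179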

11. Show that, with a suitable measure, any summation over discrete in-
dices may be written as a Lebesgue integral:

 
        Σ_{n=1}^{∞} f (xn ) = ∫_{{x}} f (x) µ(dx).    (67)

Solution:

12. Resonances II: Quantum mechanical resonances – Earlier we investi-


gated some features of a classical oscillator with a “resonant” behavior
under a driving force. Let us begin now to develop a quantum me-
chanical analogue, of relevance also to scattering and particle decays.
For concreteness, consider an atom with two energy levels, E0 < E1 ,
where the transition E0 → E1 may be effected by photon absorption,
and the decay E1 → E0 via photon emission. Because the level E1
has a finite lifetime – we denote the mean lifetime of the E1 state by
τ – it does not have a precisely defined energy. In other words, it has
a finite width, which (assuming that E0 is the ground state) can be
measured by measuring precisely the distribution of photon energies in
the E1 → E0 decay. Call the mean of this distribution ω0 .

(a) Assume that the amplitude for the atom to be in state E1 is given
by the damped oscillatory form:
        ψ(t) = ψ0 e^{−iω0 t − t/2τ}

Show that the mean lifetime is given by τ , as desired.


Solution: The probability to be in the state ψ depends on time
according to:
        P (t) = |ψ(t)|² = (1/τ) e^{−t/τ},    (68)
where I have assumed the normalization |ψ0 |2 = 1/τ . Note that
with this normalization, we have a properly normalized probabil-
ity. The mean lifetime is thus:
        ⟨t⟩ = ∫_0^∞ (t/τ) e^{−t/τ} dt = τ.    (69)

(b) Note that our amplitude above satisfies a “Schrödinger equation”:
        i dψ(t)/dt = (ω0 − i/2τ) ψ(t)
Suppose we add a sinusoidal “driving force” F e−iωt on the right
hand side, to describe the situation where we illuminate the atom
with monochromatic light of frequency ω. Solve the resulting in-
homogeneous equation for its steady state solution.
Solution: The new Schrödinger equation is
        i dψ(t)/dt = (ω0 − i/2τ) ψ(t) + F e^{−iωt}.    (70)
The steady state solution is
        ψ(t) = [F/(ω − ω0 + i/2τ)] e^{−iωt}.    (71)

(c) Convince yourself (e.g., by “conservation of probability”) that the


intensity of the radiation emitted by the atom in this steady-state
situation is just | ψ(t) |2 . Thus, the incident radiation is “scat-
tered” by our atom, with the amount of scattering proportional
to the emitted radiation intensity in the steady state. Give an
expression for the amount of radiation scattered (per unit time,
per unit amplitude of the incident radiation), as a function of
ω. For convenience, normalize your expression to the amount of
scattering at ω = ω0 . Determine the full-width at half maximum
(FWHM) of this function of ω, and relate to the lifetime τ .
Solution: Population of state ψ in steady state must be achieved
by absorbing photons from the light source. In steady state, the
same rate of radiation by de-exciting atoms must prevail to keep a
balance. The population probability is just |ψ|2 , hence this gives
the intensity of the emitted radiation. The amount of radiation
scattered per unit time per unit incident amplitude normalized to
the intensity at ω0 is:
        |ψ|²/F² = |1/(ω − ω0 + i/(2τ))|²
                = 1/[(ω − ω0 )² + 1/(4τ²)].    (72)

The maximum of this distribution occurs at ω = ω0 , with peak
value 4τ 2 . Half maximum is at ω − ω0 = ±1/(2τ ). Thus, the
FWHM is Γ = 1/τ .

Note that the “Breit-Wigner” function is just the Cauchy distribution


in probability.
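
The Lorentzian line shape and its width are easily confirmed numerically; the sketch below (an illustration, with arbitrary values of τ and ω0) checks that the peak value is 4τ² and the FWHM is 1/τ:

import numpy as np

tau, omega0 = 2.0, 5.0                                  # arbitrary lifetime and resonance
omega = np.linspace(omega0 - 5 / tau, omega0 + 5 / tau, 200001)

intensity = 1.0 / ((omega - omega0)**2 + 1.0 / (4 * tau**2))   # Eqn. 72 (per unit F^2)

peak = intensity.max()
above = omega[intensity >= peak / 2]
print(peak, 4 * tau**2)                 # peak value at omega = omega0
print(above[-1] - above[0], 1 / tau)    # FWHM Gamma = 1/tau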

13. Time Reversal in Quantum Mechanics, Part II


We earlier showed that the time reversal operator, T , could be written
in the form:
T = UK,
where K is the complex conjugation operator and U is a unitary oper-
ator. We also found that
T 2 = ±1.
Consider a spinless, structureless particle. All kinematic operators for
such a particle may be written in terms of the X  and P operators,
where
[Pj , Xk ] = −iδjk
T XT −1 = X 

T P T −1 = −P
(where the latter two equations follow simply from classical correspon-
dence).
If we work in a basis consisting of the eigenvectors of X, the eigenvalues
are simply the real position vectors, and hence:

        U X U⁻¹ = X.

In this basis, the matrix elements of P = −i∇ may be evaluated:

        ⟨x1 | P | x2⟩ = ∫_{(∞)} δ⁽³⁾(x − x1)(−i∇x) δ⁽³⁾(x − x2) d⁽³⁾x
                      = −i∇x1 δ⁽³⁾(x1 − x2).

Thus, these matrix elements are pure imaginary, and

K P K −1 = −P ,

which implies finally


U P U −1 = P .
We conclude that for our spinless, structureless particle:

U = 1eiθ ,

where the phase θ may be chosen to be zero if we wish. In any event,


we have:
T = eiθ K,
and
T 2 = eiθ Keiθ K = 1.

(a) Show that, for a spin 1/2 particle, we may in the Pauli representa-
tion (that is, an angular momentum basis for our spin-1/2 system
such that the angular momentum operators are given by one-half
the Pauli matrices) write:

T = σ2 K,

and hence show that:


T 2 = −1.
Note that the point here is to consider the classical correspondence
for the action of time reversal on angular momentum.
By considering a direct product space made up of many spin-0
and spin 1/2 states (or by other equivalent arguments), this result
may be generalized: If the total spin is 1/2-integral, then T 2 = −1;
otherwise T 2 = +1.
Solution: The Pauli matrices are:

        σ1 = ( 0  1 ),   σ2 = ( 0  −i ),   σ3 = ( 1   0 ).    (73)
             ( 1  0 )         ( i   0 )         ( 0  −1 )

We assume that spin angular momentum behaves similarly with


orbital angular momentum under time reversal. Since L = r × p,

we have that angular momentum reverses sign under time reversal.
Hence, we must have:

−σi = T σi T −1 = UKσi K −1 U −1 = Uσi∗ U −1 . (74)

Thus,

Uσ1 U −1 = −σ1 ; Uσ2 U −1 = σ2 ; Uσ3 U −1 = −σ3 . (75)

That is, U commutes with σ2 and anticommutes with σ1 and σ3 .


Since any 2 × 2 matrix can be written as a complex linear com-
bination of the identity and the three Pauli matrices, we see that
U must be of the form U = eiθ σ2 , where we may pick the phase
θ to be zero. Hence, we can write T = σ2 K. Then,

        T² = σ2 Kσ2 K = −σ2² = −1.    (76)
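
A short numerical confirmation (an illustration, not part of the solution) that U = σ2 has the behavior required by Eqns. 74–75 and that T² = σ2 Kσ2 K = −1:

import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

# U sigma_i^* U^{-1} = -sigma_i for U = sigma_2 (the content of Eqns. 74-75)
for s in (s1, s2, s3):
    assert np.allclose(s2 @ s.conj() @ np.linalg.inv(s2), -s)

# T^2 = sigma_2 K sigma_2 K = sigma_2 sigma_2^* = -1
assert np.allclose(s2 @ s2.conj(), -np.eye(2))
print("T = sigma_2 K gives T^2 = -1")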

(b) Show the following useful general property of an antiunitary op-


erator such as T :
Let
        |ψ′⟩ = T |ψ⟩
        |φ′⟩ = T |φ⟩.
Then
        ⟨ψ′|φ′⟩ = ⟨φ|ψ⟩.
This, of course, should agree nicely with your intuition about what
time reversal should do to this kind of scalar product.
Solution: We have:

        |ψ′⟩ = T |ψ⟩ = UK|ψ⟩ = U(|ψ⟩)*
        |φ′⟩ = T |φ⟩ = UK|φ⟩ = U(|φ⟩)*.    (77)

Hence,

        ⟨ψ′|φ′⟩ = (⟨ψ|)* U†U (|φ⟩)*
                = ⟨ψ|φ⟩*
                = ⟨φ|ψ⟩.    (78)

(c) Show that, if |ψ⟩ is a state vector in an “odd” system (T² = −1),
then T |ψ⟩ is orthogonal to |ψ⟩.

Solution: Let |φ⟩ = T |ψ⟩. From part (b), we have the final step in
the sequence:

        ⟨φ|ψ⟩ = −⟨φ|T²|ψ⟩ = −⟨φ|T|φ⟩ = −⟨φ|ψ⟩.    (79)

This can only be true if ⟨φ|ψ⟩ = 0, that is, T |ψ⟩ is orthogonal to |ψ⟩.

14. Suppose we have a particle of mass m in a one-dimensional potential


V = (1/2)kx² (and the motion is in one dimension). What is the minimum
energy that this system can have, consistent with the uncertainty prin-
ciple? [The uncertainty relation is a handy tool for making estimates
of such things as ground state energies.]
Solution: The Hamiltonian is:
        H = p²/2m + (1/2)kx².
We consider expectation values in the ground state. The average mo-
mentum and position are zero (by symmetry):

        ⟨p⟩ = 0;   ⟨x⟩ = 0.

Thus:

        (∆p)² = ⟨p²⟩ − ⟨p⟩² = ⟨p²⟩,
        (∆x)² = ⟨x²⟩.

Let E be the ground state energy:

        E = ⟨H⟩ = (∆p)²/2m + (1/2)k(∆x)².
The uncertainty relation for x and p is:
        (∆p)²(∆x)² ≥ (1/4)|⟨[x, p]⟩|² = 1/4.
Thus,
        E ≥ 1/[8m(∆x)²] + (k/2)(∆x)².
This has a minimum at:
        0 = dE/d(∆x)² = −1/[8m((∆x)²)²] + k/2,

or,

        (∆x)² = 1/(2√(mk)).
Therefore,

        E ≥ ω/2,

where ω = √(k/m) is the classical oscillator frequency. We note that
the minimum is actually achieved in this case: the ground state energy
of the simple harmonic oscillator is ω/2, in quantum mechanics.
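
The minimization can also be confirmed numerically (an illustrative sketch only; with m = k = 1 the bound indeed reaches ω/2 = 1/2):

import numpy as np

m, k = 1.0, 1.0
omega = np.sqrt(k / m)

dx2 = np.linspace(1e-3, 5.0, 200001)            # trial values of (Delta x)^2
E_bound = 1.0 / (8 * m * dx2) + 0.5 * k * dx2   # the lower bound on E derived above

i = np.argmin(E_bound)
print(E_bound[i], omega / 2)                    # minimum of the bound vs omega/2
print(dx2[i], 1 / (2 * np.sqrt(m * k)))         # minimizing (Delta x)^2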

Physics 195a
Course Notes
Path Integrals: An Example
021024 F. Porter

1 Introduction
This note illustrates the use of path integral arguments in quantum mechanics
via a famous example, the “Aharonov-Bohm” effect. Along the way, we get a
glimpse of the more fundamental role which electromagnetic potentials take
on in quantum mechanics, compared with classical electrodynamics.

2 Electromagnetism in Quantum Mechanics


Consider the motion of a charge q in an electromagnetic field (φ, A), where
φ is the scalar potential and A is the vector potential. Classically, the force
on the charge is the “Lorentz force”:
F = q(E + v × B), (1)
where E = −∇φ − ∂t A is the electric field, B = ∇ × A is the magnetic field,
and v is the velocity of the charge. The classical Hamiltonian is
        H = (1/2m)(p − qA)² + qφ,    (2)
where m is the mass of the charge and p = ∂L/∂ẋ is the generalized momentum
conjugate to x (L is the Lagrangian). Since H = (1/2)mv² + V, v = (p − qA)/m.
Thus, p − qA is the ordinary kinematic momentum. In quantum mechanics
we presume a form:

        H = (1/2m)[p − qA(x, t)]² + qφ(x, t)    (3)
          = (1/2m)[−i∇ − qA(x, t)]² + qφ(x, t).    (4)

3 Is the Vector Potential Real?


In classical electrodynamics, we invented the electromagnetic potentials as a
mathematical aid. The real “physics” (i.e., forces affecting motion) is in the

fields. Interestingly, when we pursue the correspondence in quantum physics,
we find a new phenomenon. In this section, we look at a specific example,
and use path integral arguments as a tool to investigate this.
Let us consider the following problem: We are given a very long, thin
“solenoid” (Fig. 1). Assume that the current is static, and that the net
charge density is everywhere zero. Thus, we can take φ(x, t) = 0, and hence
E = 0 everywhere.

y
x

Figure 1: A section of the long solenoid, with coordinate system indicated.

Let us further assume that we have made the pitch of the winding very
fine, so that we may assume that the current is perpendicular to the z di-
rection, to whatever approximation we wish (perhaps we could do this by
setting up a persistent current in a cylindrical superconductor). Then, to
whatever approximation we desire, we have Boutside = 0, where “outside”
refers to the region outside the solenoid windings. Inside the solenoid, we
have a non-zero magnetic flux:

        Φ = ∫_{solenoid cross section} B · dS.    (5)

That is, we have a magnetic field given by:

        B = 0 outside the solenoid,   B0 ez inside the solenoid,    (6)
where B0 is a constant. This may be demonstrated with Maxwell’s equations
and symmetry arguments.
Thus, we have a situation where the electromagnetic field is zero every-
where outside of a thin cylindrical region, to whatever approximation we
need. What is the vector potential outside this solenoid? Starting with
B = ∇ × A, and using Stokes’ theorem we find:

        Aoutside (x, t) = A eφ = [Φ/(2πr)] eφ .    (7)
We have made a particular choice of gauge here, in which Ar = Az = 0.
Since Jr = Jz = 0, these two components of the vector potential must be
constants.
Notice now that the line integral of A along a closed curve outside and
around the solenoid (see Fig. 2) is equal to the enclosed flux, which is non-
zero. Thus, it is impossible to find a gauge transformation such that A = 0
everywhere outside the solenoid. Let’s check this more explicitly, by seeing

Figure 2: Cross section view of the solenoid, with a path around it in a region
with vector potential A.

how we fail: A is ambiguous up to a gauge transformation:

        A → A′ = A + ∇χ,    (8)

where χ is an arbitrary differentiable function of position. We already have a


solution in the form of Eqn. 7. If we attempt to find a gauge transformation
so that A′ = 0, we must have
        ∇χ = −[Φ/(2πr)] eφ .    (9)
In cylindrical coordinates,
        ∇χ = (∂χ/∂r) er + (1/r)(∂χ/∂φ) eφ + (∂χ/∂z) ez .    (10)
Only the eφ component contributes, hence
        χ = −Φφ/(2π)   (plus any constant).    (11)

Following a brief moment of elation that we have succeeded, we realize
that there is a problem. This solution is not a single-valued function of
spatial position. Any attempt to patch this, e.g., by introducing a point of
discontinuity in χ(φ), results in a vector potential which is non-zero outside
the solenoid for some value(s) of φ.
We have constructed a physical situation, to a good approximation, in
which the magnetic field vanishes in a region, but the vector potential does
not. Since there is something which we cannot get rid of by a gauge trans-
formation, we might wonder whether there really is “something” outside the
solenoid that “knows” about the field inside, even if the field outside is zero.
Is there an experimentally observable consequence, or is this merely a math-
ematical oddity?
Classically, it appears that we in fact see nothing outside the solenoid:
If we probe the outside region with charged particles, their trajectories are
unaltered, because
        F = dp/dt = q(E + v × B) = 0.    (12)
What about quantum mechanics? We expect that the only hope is that
something observable happens in the phase of the amplitude, since there are
no classical effects. Let us imagine an experiment in which phase effects
could appear.
We consider a wave packet (particle) suitably localized, and split, away
from the solenoid (see Fig. 3). To begin the analysis of what happens to this
wave packet, suppose first that the magnet is off, Φ = 0. Let

        ψ0 (x, t) = ψℓ (x, t) + ψr (x, t),    (13)

where ψℓ is the wave passing to the left, and ψr is the wave passing to the
right. That is, ψℓ is a solution to the (A = 0) Schrödinger equation with the
property that it vanishes, for all times, on the half plane to the right of the
solenoid. Switching right and left, ψr has the corresponding interpretation.
We assume that at some early time, ti , the wave packet is in front of the
solenoid, and that at a later time, tf , the wave packet is predominantly
behind the solenoid. We look for interference patterns behind the solenoid
at tf .

Wave
Packet

Figure 3: Schematic of the experimental arrangement.

With this general setup, and magnet-off solutions, let us now consider
what happens when there is a flux Φ in the solenoid. We’ll assume that this
flux is time-independent. Let

        sℓ (x) ≡ ∫_{xi}^{x} (left path) A(x′) · dx′,    (14)

        sr (x) ≡ ∫_{xi}^{x} (right path) A(x′) · dx′,    (15)

where “left path” (“right path”) is a path which does not intersect the
right(left) half-plane at the solenoid (or perhaps intersects it an even number
of times).

left path

x0 . x .
.x
solenoid

right path
Figure 4: A section of the long solenoid, with sample left and right paths
from x0 to x.

We know that, independent of the details of the paths taken (Fig. 4),
 
        ∫_{right path} A(x′) · dx′ − ∫_{left path} A(x′) · dx′ =  0   x in front of solenoid,
                                                                  Φ   x behind solenoid.    (16)
Actually, we could worry about less-probable scenarios, such as paths which
circulate the solenoid multiple times, which would require modification to
the above statement. We thus have that sℓ (x) and sr (x) are independent of
their particular paths, as long as we maintain the left, right constraints.
So, what is the solution to the Schrödinger equation in the presence of Φ?
It will be left to the reader to verify that the solution is:

        ψΦ (x, t) = ψℓ (x, t) e^{iq sℓ(x)} + ψr (x, t) e^{iq sr(x)} .    (17)

Verification may be accomplished by direct substitution into the Schrödinger


equation; the left and right pieces independently satisfy the equation.

If x is in front of the solenoid, then sℓ (x) = sr (x), since Φenclosed = 0.
Then

ψΦ (x, t) = [ψ (x, t) + ψr (x, t)] eiqs (x)


= ψ0 (x, t)eiqs (x) (early times, ti ). (18)

Thus, at early times, ψΦ and ψ0 represent the same state, because they differ
only in phase.
At later times we consider the region “behind” the solenoid. In this case,

        sr (x) = Φ + sℓ (x),    (19)

and thus,
 
        ψΦ (x, t) = [ψℓ (x, t) + e^{iqΦ} ψr (x, t)] e^{iq sℓ(x)}    (late times, tf ).    (20)

Thus, the flux manifests itself in a phase shift by qΦ of the wave passing to the
right relative to the wave passing to the left. The interference pattern behind
the solenoid will then be affected, unless qΦ = 2πn, where n is an integer. The
flux inside the solenoid may be “observed” with a probe outside the solenoid
in a purely quantum mechanical way. This has been experimentally verified.
An early theoretical paper, for which this effect is named, is Aharonov and
Bohm, Physical Review 115 (1959) 485. An early experimental paper is
Chambers, Physical Review Letters 5 (1960) 3.
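
The observable consequence is a shift of the interference pattern behind the solenoid by the phase qΦ. As a schematic toy model (an illustration of my own construction, not from the papers cited), take ψℓ and ψr at a point on the screen to be unit-amplitude waves with a relative geometric phase; the flux then shifts the fringes unless qΦ = 2πn:

import numpy as np

def intensity(phi_geom, q_flux):
    # |psi_l + e^{i q Phi} psi_r|^2 for two unit-amplitude partial waves with
    # relative geometric phase phi_geom (toy model of the screen pattern)
    return np.abs(1.0 + np.exp(1j * (phi_geom + q_flux)))**2

phi = np.linspace(0, 2 * np.pi, 9)            # sample geometric phases across the screen
for q_flux in (0.0, np.pi / 2, np.pi, 2 * np.pi):
    print(round(q_flux, 3), np.round(intensity(phi, q_flux), 3))
# q*Phi = 2*pi*n reproduces the flux-off pattern; other values shift the fringes.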

4 Exercises
1. Show that the B field is as given in Eqn. 6, and that the vector potential
is as given in Eqn. 7, up to gauge transformations.
2. Verify that the wave function in Eqn. 17 satisfies the Schrödinger equa-
tion.
3. A number of assumptions have been made, possibly implicitly in the
discussion of this effect.
(a) Critique the discussion, pointing out areas where the argument
may break down.
(b) Resolve the problematic areas in your critique, or else demonstrate
that the argument really does break down.

4. We have discussed the interesting Aharonov-Bohm effect. Let us con-
tinue a bit further the thinking in this example.
(a) Consider again the path integral in the vicinity of the long thin
solenoid. In particular, consider a path which starts at x, loops
around the solenoid, and returns to x. Since the B and E fields
are zero everywhere in the region of the path, the only effect on
the particle’s wave function in traversing this path is a phase shift,
and the amount of phase shift depends on the magnetic flux in the
solenoid, as we discussed in class. Suppose we are interested in a
particle with charge of magnitude e (e.g., an electron). Show that
the magnetic flux Φ in the solenoid must be quantized, and give
the possible values that Φ can have.
(b) Wait a minute!!! Did you just show that there is no Aharonov-
Bohm effect? We know from experiment that the effect is real.
So, if you did what I expected you to do in part (a), there is a
problem. Discuss!
(c) The BCS theory for superconductivity assumes that the basic
“charge carrier” in a superconductor is a pair of electrons (a
“Cooper pair”). The Meissner effect for a (Type I) supercon-
ductor is that when such a material is placed in a magnetic field,
and then cooled below a critical temperature, the magnetic field
is excluded from the superconductor. Suppose that there is a
small non-superconducting region traversing the superconductor,
in which magnetic flux may be “trapped” as the material is cooled
below the critical temperature. Ignoring part (b) above, what val-
ues do you expect to be possible for the trapped flux? [This effect
has been experimentally observed.] What is the value (in Tesla-
m2 ) of the smallest non-zero flux value. You may find the following
conversion constant handy:

1 = 0.3 Tesla-m/GeV, (21)

where the “0.3” is more precisely the speed of light in meters per
nanosecond.
(d) How can we reconcile the answer to part (c), which turns out to be
a correct result (even if the derivation might be flawed), with your
discussion in part (b)? Let’s examine the superconducting case

more carefully. Let us suppose we have a ring of superconducting
material. We assume a model for superconductivity in which the
superconducting electrons are paired, and the resulting pairs are
in a “Bose-condensate”. Well, this precedes our discussion on
identical particles, but we essentially mean that the pairs are all
in the same quantum state. We may write our wave function for
the superconducting pairs in the form:

        ψ(x) = √(ρs/2) e^{iθ(x)} ,    (22)
where ρs is the number density of superconducting electrons, and
θ is a position-dependent phase. Note that we have normalized
our wave function so that its absolute square gives the density of
Cooper pairs. Find an expression for (ρs/2)v, the Cooper pair number
current density. Use this with the expression for the canonical mo-
mentum of a Cooper pair in a magnetic field (vector potential A)
to arrive at an expression for the electromagnetic current density
of the superconducting electrons.
Now consider the following scenario: We apply an external mag-
netic field with the superconductor above its critical temperature
(that is, not in a superconducting state). We then cool this sys-
tem down below the critical temperature. We want to know what
we can say about any magnetic flux which is trapped in the hole
in the superconductor. Consider a contour in the interior of the
superconductor, much further from the surfaces than any pene-
tration depths. By considering an integral around this contour,
see what you can say about the allowed values of flux through the
hole.
(e) So far, no one has observed (at least not convincingly) a magnetic
“charge”, analogous to the electric charge. But there is nothing
fundamental that seems to prevent us from modifying Maxwell’s
equations to accommodate the existence of such a “magnetic mono-
pole”. In particular, we may alter the divergence equation to:

∇ · B = 4πρM ,

where ρM is the magnetic charge density.

Consider a magnetic monopole of strength eM located at the ori-
gin. The B-field due to this charge is simply:

        B = (eM/r²) r̂,

where r̂ is a unit vector in the radial direction. The r̂-component
of the curl of the vector potential is:

        (1/(r sin θ)) [∂(Aφ sin θ)/∂θ − ∂Aθ/∂φ].

A solution, as you should quickly convince yourself, is a vector
potential in the φ direction:

        Aφ = eM (1 − cos θ)/(r sin θ).

Unfortunately(?), this is singular at θ = π, i.e., on the negative
z-axis. We can fix this by using this form everywhere except in
a cone about θ = π, i.e., for θ ≤ π − ε, and use the alternate
solution:

        Aφ′ = eM (−1 − cos θ)/(r sin θ)

in the (overlapping) region θ ≥ ε, thus covering the entire space.
In the overlap region (ε ≤ θ ≤ π − ε), either A or A′ may be used,
and must give the same result, i.e., the two solutions are related
by a gauge transformation – that is, they differ by the gradient of
a scalar function.
Consider the effect of the vector potential on the wave function
of an electron (charge −e). Invoke single-valuedness of the wave
function, and determine the possible values of eM that a magnetic
charge can have. [This is sometimes called a “Dirac monopole”.]

Physics 195a
Course Notes
Path Integrals: An Example – Solutions to Exercises
021024 F. Porter

1 Exercises
1. Show that the B field is as given in Eqn. 6, and that the vector potential
is as given in Eqn. 7, up to gauge transformations.
Solution: We wish to show that:

        B = 0 outside the solenoid,   B0 ez inside the solenoid,    (1)
Use the form of Maxwell’s equations which states:

        ∮ B · dℓ = 4π Ienclosed .    (2)
Consider a circle of radius r, centered on the solenoid and perpendicular
to the z axis, where r may be larger or smaller than the solenoid radius.
In either case the enclosed current is zero, and hence Bφ = 0 both inside
and outside the solenoid (using the circular symmetry to do the line
integral of the magnetic field). etc.
2. Verify that the wave function in Eqn. 17 satisfies the Schrödinger equa-
tion.
Solution: The Schrödinger equation is

        i∂t ψΦ = (1/2m)(−i∇ − qA)² ψΦ .    (3)

We are given left and right solutions that satisfy the Schrödinger equa-
tion with A = 0:

        i∂t ψΦ = i∂t ψℓ e^{iq sℓ} + i∂t ψr e^{iq sr}
               = −(1/2m) e^{iq sℓ} ∇²ψℓ − (1/2m) e^{iq sr} ∇²ψr .    (4)
2m 2m
Consider now

        (−i∇ − qA)² ψℓ e^{iq sℓ} = (−i∇ − qA) · [(−i∇ψℓ ) e^{iq sℓ} + ψℓ (q∇sℓ ) e^{iq sℓ} − qA ψℓ e^{iq sℓ}]
                                 = (−i∇ − qA) · (−i∇ψℓ ) e^{iq sℓ}
                                 = −∇²ψℓ e^{iq sℓ} ,    (5)

using ∇sℓ = A in the region where ψℓ is non-vanishing.

Repeat for ψr and plug into the Schrödinger equation.
3. A number of assumptions have been made, possibly implicitly in the
discussion of this effect.
(a) Critique the discussion, pointing out areas where the argument
may break down.
(b) Resolve the problematic areas in your critique, or else demonstrate
that the argument really does break down.
Solution:

4. We have discussed the interesting Aharonov-Bohm effect. Let us con-


tinue a bit further the thinking in this example.
(a) Consider again the path integral in the vicinity of the long thin
solenoid. In particular, consider a path which starts at x, loops
around the solenoid, and returns to x. Since the B and E fields
are zero everywhere in the region of the path, the only effect on
the particle’s wave function in traversing this path is a phase shift,
and the amount of phase shift depends on the magnetic flux in the
solenoid, as we discussed in the note. Suppose we are interested
in a particle with charge of magnitude e (e.g., an electron). Show
that the magnetic flux Φ in the solenoid must be quantized, and
give the possible values that Φ can have.
Solution: We know that the total change in phase of the wave
function in traversing a loop is given by:

        ∆θ = e ∮ A · ds
            = eΦ,

where the second equality holds if the loop encloses the solenoid.
Single-valuedness for the wave function imposes the constraint
that ∆θ must be an integral multiple of 2π. Hence, we can only
have:

eΦ = 2πn, where n = . . . , −2, −1, 0, 1, 2, . . . .

BUT SEE BELOW!!

(b) Wait a minute!!! Did you just show that there is no Aharonov-
Bohm effect? We know from experiment that the effect is real.
So, if you did what I expected you to do in part (a), there is a
problem. Discuss!
Solution: We may see a hint of the problem by noticing that
the answer for the quantum of magnetic flux obtained in part
(a) seems to depend on the charge of the particle being used to
probe the vector potential. If we use a different charge, we’ll get
a different quantum – the answer isn’t intrinsic to the solenoid
alone.
We’ll discuss this further in class. It’ll motivate us into a discus-
sion of Berry’s phase.
(c) The BCS theory for superconductivity assumes that the basic
“charge carrier” in a superconductor is a pair of electrons (a
“Cooper pair”). The Meissner effect for a (Type I) supercon-
ductor is that when such a material is placed in a magnetic field,
and then cooled below a critical temperature, the magnetic field
is excluded from the superconductor. Suppose that there is a
small non-superconducting region traversing the superconductor,
in which magnetic flux may be “trapped” as the material is cooled
below the critical temperature. Ignoring part (b) above, what val-
ues do you expect to be possible for the trapped flux? [This effect
has been experimentally observed.] What is the value (in Tesla-m²) of the smallest non-zero flux? You may find the following conversion constant handy:
    1 = 0.3 Tesla-m/GeV,     (6)
where the "0.3" is more precisely the speed of light in meters per nanosecond.
Solution: In this case, the “basic” charge carriers have charge
magnitude 2e, and hence the quantization condition in part (a)
becomes:
    Φn = 2πn/(2e) ,  where n = . . . , −2, −1, 0, 1, 2, . . . .
Let's determine the value of the flux for n = 1. We have to be a bit careful about our units here. If we were to blindly say e = √α, we would get:
    Φ1 = π/e = (π/√(1/137)) × 0.3 T-m/GeV × 10^−3 GeV/MeV × 197 MeV-fm × 10^−15 m/fm
       ≈ 2.2 × 10^−15 T-m² .     (WRONG)     (7)

But this isn’t quite right for the MKS system. Actually,

e2
= α, (8)
4π0
or, since 0 = 1/µ0, and µ0 = 4π × 10−7 N/A2 ,

107 2
e2 = A /N. (9)
137
Thus, 
π
Φ1 = = π 137 × 107 N/A2 . (10)
e

So what is √(N/A²), in T-m² (or Webers)? Well,
    √(N/A²) = √[ (kg-m/s²) / (kg/(T-s²))² ]
            = T-m² √[ s²/(m³ kg) ]
            = T-m² √[ 1/(m-J) ]
            = T-m² √[ (1/(m-J)) × 1.602 × 10^−19 J/eV × 197 × 10^−9 eV-m ]
            = 1.78 × 10^−13 T-m² .     (11)
Hence,
    Φ1 ≈ 2.07 × 10^−15 T-m² .     (12)
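As a purely numerical cross-check of this part-(c) result (this code is an addition to the solution, not part of the original notes; it assumes Python with the standard math module), we can re-evaluate Φ1 from Eqns. 10 and 11 and compare with the familiar SI combination h/(2e):

import math

# Numerical check of the trapped-flux quantum, using the conversion
# constants quoted in the text above.
J_per_eV = 1.602e-19       # Joules per eV
hbarc_eV_m = 197e-9        # 1 = 197 MeV-fm = 197e-9 eV-m in natural units

sqrt_N_per_A2 = math.sqrt(J_per_eV * hbarc_eV_m)        # sqrt(N/A^2) in T-m^2, Eqn. 11
phi1 = math.pi * math.sqrt(137.0e-7) * sqrt_N_per_A2    # Eqn. 10, in T-m^2

h_SI = 6.626e-34           # Planck constant, J-s
e_SI = 1.602e-19           # electron charge, C
print(phi1, h_SI / (2 * e_SI))   # both approximately 2.07e-15 T-m^2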
(d) How can we reconcile the answer to part (c), which turns out to be
a correct result (even if the derivation might be flawed), with your
discussion in part (b)? Let’s examine the superconducting case
more carefully. Let us suppose we have a ring of superconducting

material. We assume a model for superconductivity in which the
superconducting electrons are paired, and the resulting pairs are
in a “Bose-condensate”. Well, this precedes our discussion on
identical particles, but we essentially mean that the pairs are all
in the same quantum state. We may write our wave function for
the superconducting pairs in the form:

    ψ(x) = √(ρs/2) e^{iθ(x)} ,     (13)
where ρs is the number density of superconducting electrons, and
θ is a position-dependent phase. Note that we have normalized
our wave function so that its absolute square gives the density of
Cooper pairs. Find an expression for (ρs/2) v, the Cooper pair number
current density. Use this with the expression for the canonical mo-
mentum of a Cooper pair in a magnetic field (vector potential A)
to arrive at an expression for the electromagnetic current density
of the superconducting electrons.
Now consider the following scenario: We apply an external mag-
netic field with the superconductor above its critical temperature
(that is, not in a superconducting state). We then cool this sys-
tem down below the critical temperature. We want to know what
we can say about any magnetic flux which is trapped in the hole
in the superconductor. Consider a contour in the interior of the
superconductor, much further from the surfaces than any pene-
tration depths. By considering an integral around this contour,
see what you can say about the allowed values of flux through the
hole.
Solution: We first look for an expression for the Cooper pair
number current density. The Schrödinger equation for a free par-
ticle is:
    i ∂t ψ(x, t) = −(1/2m) ∇² ψ(x, t),     (14)
where m is the mass of the electron. Multiply this equation by
ψ ∗ . Take the complex conjugate of the Schrödinger equation, and
multiply by ψ. Finally, take the difference between the two results,
to obtain:
    ψ∗ i∂t ψ − ψ(−i)∂t ψ∗ = −ψ∗ (∇²/2m) ψ + ψ (∇²/2m) ψ∗     (15)
    i∂t |ψ|² = −(1/2m)(ψ∗∇²ψ − ψ∇²ψ∗)     (16)
    i∂t (ρs/2) = (1/2m) ∇ · (ψ∇ψ∗ − ψ∗∇ψ) .     (17)
This is of the form of a continuity equation (∂t Q + ∇ · J = 0), and
we read off the Cooper pair number current density:
    (ρs/2) v = (i/2m) (ψ∇ψ∗ − ψ∗∇ψ) ,     (18)
where v is the Cooper pair speed (assumed to be non-relativistic,
of course).
Now for the canonical momentum of a Cooper pair. For this
purpose, we see that a Cooper pair is a “particle” of charge −2e
and mass 2m. The canonical momentum is

    p = −i∇,     (19)
and is related to the kinematic momentum 2mv in a magnetic field by
    p = 2mv + 2eA.     (20)
We take expectation values to find:
    hpi = ∇θ = 2mhvi + 2eA.     (21)
The electromagnetic current density carried by Cooper pairs is thus:
    Jem = 2e (ρs/2) v     (22)
        = (e²ρs/m) [ (1/2e)∇θ − A ] .     (23)
Now for our scenario. The essential physical point, which makes
this example different from the Aharonov-Bohm situation, is that,
deep enough into the superconductor, the Cooper-pair current
density is zero. We integrate around the contour to obtain:
    (1/2e) ∮C ∇θ · ds = ∮C A · ds = Φ .     (24)
Single-valuedness of the wave function implies that

    Φ = (1/2e) ∮C ∇θ · ds = 2πk/(2e) ,     (25)
where k must be an integer.
(e) So far, no one has observed (at least not convincingly) a magnetic
“charge”, analogous to the electric charge. But there is nothing
fundamental that seems to prevent us from modifying Maxwell’s
equations to accommodate the existence of such a “magnetic mono-
pole”. In particular, we may alter the divergence equation to:

    ∇ · B = 4πρM ,
where ρM is the magnetic charge density.
Consider a magnetic monopole of strength eM located at the origin. The B-field due to this charge is simply:
    B = (eM/r²) r̂ ,
where r̂ is a unit vector in the radial direction. The r̂-component of the curl of the vector potential is:
    (1/(r sin θ)) [ ∂(Aφ sin θ)/∂θ − ∂Aθ/∂φ ] .
A solution, as you should quickly convince yourself, is a vector
potential in the φ direction:
    Aφ = eM (1 − cos θ)/(r sin θ) .
Unfortunately(?), this is singular at θ = π, i.e., on the negative z-axis. We can fix this by using this form everywhere except in a cone about θ = π, i.e., for θ ≤ π − ε, and use the alternate solution:
    A′φ = −eM (1 + cos θ)/(r sin θ)
in the (overlapping) region θ ≥ ε, thus covering the entire space. In the overlap region (ε ≤ θ ≤ π − ε), either A or A′ may be used,
and must give the same result, i.e., the two solutions are related
by a gauge transformation – that is, they differ by the gradient of
a scalar function.
Consider the effect of the vector potential on the wave function
of an electron (charge −e). Invoke single-valuedness of the wave
function, and determine the possible values of eM that a magnetic
charge can have. [This is sometimes called a “Dirac monopole”.]
Solution: Consider the gauge transformation relating the two
vector potentials in the overlap region:
    A′ − A = −(2eM/(r sin θ)) êφ = ∇χ ,
where χ is a scalar function. Since
    ∇χ = (∂χ/∂r) êr + (1/r)(∂χ/∂θ) êθ + (1/(r sin θ))(∂χ/∂φ) êφ ,
up to an unimportant constant, we thus have the gauge function
    χ = −2eM φ .
Under a gauge transformation of A, the wave function undergoes a corresponding transformation in phase:
    ψ → ψ′ = ψ exp(ieχ)
(since ∫ A · ds → ∫ A · ds + ∫ ∇χ · ds, and ∫ ∇χ · ds = χ). Hence, we have:
    ψ′ = ψ exp(−2ieeM φ).
But this must be single-valued, giving the condition that
    2eeM = n,  where n = . . . , −2, −1, 0, 1, 2, . . . .
Thus, the magnetic charge must be quantized in units of:
    1/(2e) = (137/2) e .
Physics 125c
Course Notes
Density Matrix Formalism
040511 Frank Porter

1 Introduction
In this note we develop an elegant and powerful formulation of quantum me-
chanics, the “density matrix” formalism. This formalism provides a structure
in which we can address such matters as:
• We typically assume that it is permissible to work within an appropriate
subspace of the Hilbert space for the universe. Is this all right?
• In practice we often have situations involving statistical ensembles of
states. We have not yet addressed how we might deal with this.

2 The Density Operator


Suppose that we have a state space, with a denumerable orthonormal basis
{|un i, n = 1, 2, . . .}. If the system is in state |ψ(t)i at time t, we have the
expansion in this basis:
X
|ψ(t)i = an (t)|un i. (1)
n

We’ll assume that |ψ(t)i is normalized, and hence:


XX
hψ(t)|ψ(t)i = 1 = an (t)a∗m (t)hum |un i
n m
X
= |an (t)|2 (2)
n

Suppose that we have an observable (self-adjoint operator) Q. The matrix


elements of Q in this basis are:
Qmn = hum |Qun i = hQum |un i = hum |Q|un i. (3)
The average (expectation) value of Q at time t, for the system in state |ψ(t)i
is: XX
hQi = hψ(t)|Qψ(t)i = a∗m (t)an (t)Qmn . (4)
n m

We see that hQi is an expansion quadratic in the {an } coefficients.
Consider the operator |ψ(t)ihψ(t)|. It has matrix elements:

hum |ψ(t)ihψ(t)|uni = am (t)a∗n (t). (5)

These matrix elements appear in the calculation of hQi. Hence, define

ρ(t) ≡ |ψ(t)ihψ(t)|. (6)

We call this the density operator. It is a Hermitian operator, with matrix


elements
ρmn (t) = hum |ρ(t)un i = am (t)a∗n (t). (7)
Since ψ(t) is normalized, we also have that
X X
1= |an (t)|2 = ρnn (t) = Tr [ρ(t)] . (8)
n n

We may now re-express the expectation value of observable Q using the


density operator:
XX
hQi(t) = a∗m (t)an (t)Qmn
m n
XX
= ρnm (t)Qmn
m n
X
= [ρ(t)Q]nn
n
= Tr [ρ(t)Q] . (9)
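As a concrete illustration (an added sketch, not part of the original notes; it assumes Python with numpy), one can check numerically that Tr(ρQ) reproduces hψ|Q|ψi for a simple two-dimensional example:

import numpy as np

# Build rho = |psi><psi| for a normalized two-component state and compare
# Tr(rho Q) with the direct expectation value <psi|Q|psi>.
psi = np.array([1.0, 1.0j]) / np.sqrt(2.0)
rho = np.outer(psi, psi.conj())

Q = np.array([[0.0, 1.0], [1.0, 0.0]])   # some Hermitian observable

print(np.trace(rho @ Q).real)            # Tr(rho Q)
print((psi.conj() @ Q @ psi).real)       # <psi|Q psi>, the same number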

The time evolution of a state is given by the Schrödinger equation:


d
i |ψ(t)i = H(t)|ψ(t)i, (10)
dt
where H(t) is the Hamiltonian. Thus, the time evolution of the density
operator may be computed according to:
    dρ(t)/dt = (d/dt) [|ψ(t)ihψ(t)|]
             = (1/i) H(t)|ψ(t)ihψ(t)| − (1/i) |ψ(t)ihψ(t)|H(t)
             = (1/i) [H(t), ρ(t)] .     (11)
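For a time-independent Hamiltonian, Eqn. 11 is solved by ρ(t) = U(t) ρ(0) U†(t), with U(t) = e^{−iHt}. The following small sketch (added here, not part of the notes; assumes numpy) checks that this evolution preserves both Tr(ρ) and the purity Tr(ρ²):

import numpy as np

H = np.array([[0.0, 0.5], [0.5, 1.0]])          # a Hermitian Hamiltonian (units with hbar = 1)
rho0 = np.array([[0.75, 0.25], [0.25, 0.25]])   # an initial (mixed) density matrix

t = 2.0
w, V = np.linalg.eigh(H)                         # H = V diag(w) V^dagger
U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T
rho_t = U @ rho0 @ U.conj().T                    # rho(t)

print(np.trace(rho_t).real)                                       # 1.0, trace preserved
print(np.trace(rho0 @ rho0).real, np.trace(rho_t @ rho_t).real)   # purity preserved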
Suppose we wish to know the probability, P ({q}), that a measurement of Q will yield a result in the set {q}. We compute this probability by projecting out of |ψ(t)i that portion which lies in the eigensubspace associated with eigenvalues in the set {q}. Let P{q} be the projection operator. Then:

P ({q}) = hψ(t)|P{q} ψ(t)i


h i
= Tr P{q} ρ(t) . (12)

We note that the density operator, unlike the state vector, has no phase ambiguity. The same state is described by |ψ(t)i and |ψ′(t)i = eiθ |ψ(t)i.
Under this phase transformation, the density operator transforms as:

    ρ(t) → ρ′(t) = eiθ |ψ(t)ihψ(t)| e−iθ = ρ(t).     (13)

Furthermore, expectation values are quadratic in |ψ(t)i, but only linear in


ρ(t).
For the density operators we have been considering so far, we see that:

ρ2 (t) = |ψ(t)ihψ(t)||ψ(t)ihψ(t)|
= ρ(t). (14)

That is, ρ(t) is an idempotent operator. Hence,

Trρ2 (t) = Trρ(t) = 1. (15)

Finally, notice that:

hun |ρ(t)un i = ρnn (t) = |an (t)|2 ≥ 0 ∀n. (16)

Thus, for an arbitrary state |φi, hφ|ρ(t)φi ≥ 0, as may be demonstrated by


expanding |φi in the |ui basis. We conclude that ρ is a non-negative definite
operator.
We postulate, in quantum mechanics, that the states of a system are in
one-to-one correspondence with the non-negative definite density operators
of trace 1 (defined on the Hilbert space).

3 Statistical Mixtures
We may wish to consider cases where the system is in any of a number of
different states, with various probabilities. The system may be in state |ψ1 i
with probability p1 , state |ψ2 i with probability p2 , and so forth (more gener-
ally, we could consider states over some arbitrary, possibly non-denumerable,
P
index set). We must have 1 ≥ pi ≥ 0 for i ∈ {index set}, and i pi = 1.
Note that this situation is not the same thing as supposing that we are in

the state |ψi = p1 |ψ1 i + p2 |ψ2 i + · · · (or even with √p1 , etc.). Such statistical
mixtures might occur, for example, when we prepare a similar system (an
atom, say) many times. In general, we will not be able to prepare the same
exact state every time, but will have some probability distribution of states.
We may ask, for such a system, for the probability P ({q}) that a mea-
surement of Q will yield a result in the set {q}. For each state in our mixture,
we have

Pn ({q}) = hψn |P{q} ψn i


 
= Tr ρn P{q} , (17)

where ρn = |ψn ihψn |. To determine the overall probability, we must sum over
the individual probabilities, weighted by pn :
X
P ({q}) = pn Pn ({q})
n
X  
= pn Tr ρn P{q}
n
!
X
= Tr pn ρn P{q}
n
 
= Tr ρP{q} , (18)

where X
ρ≡ pn ρn . (19)
n

Now ρ is the density operator of the system, and is a simple linear combina-
tion of the individual density operators. Note that ρ is the “average” of the
ρn ’s with respect to probability distribution pn .
Let us investigate this density operator:
• Since ρn are Hermitian, and pn are real, ρ is Hermitian.

P P P
• Trρ = Tr ( n pn ρn ) = n pn Trρn = n pn = 1.
P
• ρ is non-negative-definite: hφ|ρφi = n pn hφ|ρn φi ≥ 0.
• Let Q be an operator with eigenvalues qn . In the current situation, hQi
refers to the average of Q over the statistical mixture. We have:
X X  
hQi = qn P ({qn }) = qn Tr ρP{qn }
n n
!
X
= Tr ρ qn P{qn }
n
X
= Tr(ρQ), since Q = qn P{qn } . (20)
n

• We may determine the time evolution of ρ. For ρn (t) = |ψn (t)ihψn (t)|
we know (Eqn. 11) that
dρn (t)
i = [H(t), ρn (t)] . (21)
dt
P
Since ρ(t) is linear in the ρn , ρ(t) = n pn ρn (t), we have
dρ(t)
i = [H(t), ρ(t)] . (22)
dt
• Now look at
XX
ρ2 = pm pn ρm ρn
m n
XX
= pm pn |ψm ihψm |ψn ihψn |
m n
6= ρ, in general. (23)
What about the trace of ρ2 ? Let
X
|ψm i = (am )j |uj i. (24)
j

Then
XX
ρ2 = pm pn |ψm ihψm |ψn ihψn |
m n
 " #
XX XX XX
= pm pn  (am )∗i (an )j δij  (am )k (an )∗` |uk ihu`|
m n i j k `
X
= pm pn (am )∗i (an )i (am )k (an )∗` |uk ihu`|. (25)
m,n,i,k,`

Let’s take the trace of this. Notice that Tr(|uk ihu`|) = δk` , so that
X
Tr(ρ2 ) = pm pn (am )∗i (an )i (am )k (an )∗k . (26)
m,n,i,k
P ∗
But hψm |ψn i = i (am )i (an )i , and thus:
XX
Tr(ρ2 ) = pm pn |hψm |ψn i|2
m n
XX
≤ pm pn hψm |ψm ihψn |ψn i, (Schwarz inequality)
m n
X X
≤ pm pn
m n
≤ 1. (27)
The reader is encouraged to check that equality holds if and only if the
system can be in only one physical state (that is, all but one of the pn ’s
corresponding to independent states must be zero).
Note that, if Tr(ρ2 ) = 1, then ρ = |ψihψ|, which is a projection operator.
We encapsulate this observation into the definition:
Def: A state of a physical system is called a pure state if Tr(ρ2 ) = 1; the
density operator is a projection. Otherwise, the system is said to be in
a mixed state, or simply a mixture.
The diagonal matrix elements of ρ have a simple physical interpretation:
X
ρnn = pj (ρj )nn
j
X
= pj hun |ψj ihψj |un i
j
X
= pj |(aj )n |2 . (28)
j

This is just the probability to find the system in state |un i. Similarly, the
off-diagonal elements are
X
ρmn = pj (aj )m (aj )∗n . (29)
j

The off-diagonal elements are called coherences. Note that it is possible to


choose a basis in which ρ is diagonal (since ρ is Hermitian). In such a basis,
the coherences are all zero.
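A small numerical example (added; a sketch assuming numpy) makes these statements concrete: a 50/50 mixture of two non-orthogonal pure states has Tr(ρ²) < 1, and its coherences vanish in the basis in which ρ is diagonal:

import numpy as np

psi1 = np.array([1.0, 0.0])
psi2 = np.array([1.0, 1.0]) / np.sqrt(2.0)

# Statistical mixture with p1 = p2 = 1/2:
rho = 0.5 * np.outer(psi1, psi1.conj()) + 0.5 * np.outer(psi2, psi2.conj())

print(np.trace(rho).real)           # 1.0
print(np.trace(rho @ rho).real)     # 0.75 < 1: a mixed state

# In the eigenbasis of rho the off-diagonal elements (coherences) are zero:
vals, vecs = np.linalg.eigh(rho)
print(np.round(vecs.conj().T @ rho @ vecs, 12))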

4 Measurements, Statistical Ensembles, and
Density Matrices
Having developed the basic density matrix formalism, let us now revisit it,
filling in some motivational aspects. First, we consider the measurement
process. It is useful here to regard an experiment as a two-stage process:
1. Preparation of the system.
2. Measurement of some physical aspect(s) of the system.
For example, we might prepare a system of atoms with the aid of spark gaps,
magnetic fields, laser beams, etc., then make a measurement of the system
by looking at the radiation emitted. The distinction between preparation
and measurement is not always clear, but we’ll use this notion to guide our
discussion.
We may further remark that we can imagine any measurement as a sort
of “counter” experiment: First, consider an experiment as a repeated prepa-
ration and measurement of a system, and refer to each measurement as an
“event”. Think of the measuring device as an array of one or more “counters”
that give a response (a “count”) if the variables of the system are within some
range. For example, we might be measuring the gamma ray energy spectrum
in some nuclear process. We have a detector which absorbs a gamma ray
and produces an electrical signal proportional to the absorbed energy. The
signal is processed and ultimately sent to a multichannel analyzer (MCA)
which increments the channel corresponding to the detected energy. In this
case, the MCA is functioning as our array of counters.
The process is imagined to be repeated many times, and we are not con-
cerned with issues of the statistics of finite counting here. The result of such
an experiment is expressed as the probability that the various counters will
register, given the appropriate preparation of the system. These probabilities
may include correlations.
Let us take this somewhat hazy notion and put it into more concrete
mathematical language: Associate with each counter a dichotomic vari-
able, D, as follows:
If the counter registers in an event, D = 1.
If the counter does not register in an event, D = 0.
We assert that we can, in principle, express all physical variables in terms of
dichotomic ones, so this appears to be a sufficiently general approach.

By repeatedly preparing the system and observing the counter D, we can
determine the probability that D registers: The average value of D, hDi, is
the probability that D registers in the experiment. We refer to the particu-
lar preparation of the system in the experiment as a statistical ensemble
and call hDi the average of the dichotomic variable D with respect to this
ensemble.
If we know the averages of all possible dichotomic variables, then the en-
semble is completely known. The term “statistical ensemble” is synonymous
with a suitable set of averages of dichotomic variables (i.e., probabilities).
Let us denote a statistical ensemble with the letter ρ. The use of the same
symbol as we used for the density matrix is not coincidental, as we shall see.
The quantity hDiρ explicitly denotes the average of D for the ensemble ρ.
Clearly:
0 ≤ hDiρ ≤ 1. (30)
D is precisely known for ensemble ρ if hDiρ = 0 or hDiρ = 1. Otherwise,
variable D possesses a statistical spread. Note that we may prepare a system
(for example, an atom) many times according to a given ensemble. However,
this does not mean that the system is always in the same state.
We have the important concept of the superposition of two ensembles:
Let ρ1 and ρ2 be two distinct ensembles. An ensemble ρ is said to be an
incoherent superposition of ρ1 and ρ2 if there exists a number θ such
that 0 < θ < 1, and for every dichotomic variable D we have:
hDiρ = θhDiρ1 + (1 − θ)hDiρ2 . (31)
This is expressed symbolically as:
ρ = θρ1 + (1 − θ)ρ2 , (32)
“ρ is a superposition of ρ1 and ρ2 with probabilities θ and 1 − θ.”
We assume that if ρ1 and ρ2 are physically realizable, then any incoherent superposition of them is also physically realizable. For example, we might
prepare a beam of particles from two independent sources, each of which
may hit our counter: ρ1 corresponds to source 1, ρ2 corresponds to source
2. When both sources are on, the beam hitting the counter is an incoherent
mixture of ρ1 and ρ2 . We may compute the probability, P (1|hit), that a
particle hitting the counter is from beam 1. Using Bayes’ theorem:
P (hit|1)P (1)
P (1|hit) =
P (hit)

8
hDiρ1 θ
=
θhDiρ1 + (1 − θ)hDiρ2
hDiρ1
= θ . (33)
hDiρ
(34)

The generalization to an incoherent superposition of an arbitrary number


of ensembles is clear: Let ρ1 , ρ2 , . . . be a set of distinct statistical ensembles,
and let θ1 , θ2 , . . . be a set of real numbers such that
X
θn > 0, and θn = 1. (35)
n

The incoherent sum of these ensembles, with probabilities {θn } is denoted


X
ρ= θn ρn . (36)
n

This is to be interpreted as meaning that, for every dichotomic variable D:


X
hDiρ = θn hDiρn . (37)
n

A particular prepared system is regarded as an element of the statistical


ensemble. We have the intuitive notion that our level of information about
an element from an ensemble ρ = θρ1 + (1 − θ)ρ2 , which is an incoherent
superposition of distinct ensembles ρ1 and ρ2 , is less than our information
about an element in either ρ1 or ρ2 . For example, consider D a dichotomic
variable such that hDiρ1 6= hDiρ2 . Such a variable must exist, since ρ1 6= ρ2 .
We have:
hDiρ = θhDiρ1 + (1 − θ)hDiρ2 . (38)
Consider
    hDiρ − 1/2 = θ(hDiρ1 − 1/2) + (1 − θ)(hDiρ2 − 1/2).     (39)
We find:
    |hDiρ − 1/2| ≤ θ |hDiρ1 − 1/2| + (1 − θ) |hDiρ2 − 1/2|
                 < max ( |hDiρ1 − 1/2|, |hDiρ2 − 1/2| ).     (40)
What does this result tell us? The quantity |hDiρ − 1/2| ∈ [0, 1/2] can be regarded as a measure of the information we have about variable D for ensemble ρ. For example, if |hDiρ − 1/2| = 1/2, then hDiρ = 1 or 0, and D is precisely known for ensemble ρ. On the other hand, if |hDiρ − 1/2| = 0, then hDiρ = 1/2, and each of the possibilities D = 0 and D = 1 is equally likely, corresponding to maximal ignorance about D for ensemble ρ. Thus, our inequality says that, for at least one of ρ1 and ρ2 , we know more about D than for the incoherent superposition ρ.
We may restate our definition of pure and mixed states:

Def: A pure ensemble (or pure state) is an ensemble which is not an


incoherent superposition of any other two distinct ensembles. A mixed
ensemble (or mixed state) is an ensemble which is not pure.

Intuitively, a pure ensemble is a more carefully prepared ensemble – we have


more (in fact, maximal) information about the elements – than a mixed
ensemble.
The set of all physical statistical ensembles is a convex set,1 with an inco-
herent superposition of two ensembles a convex combination of two elements
of the convex set. Pure states are the extreme points of the set – i.e., points
which are not convex combinations of other points.
So far, this discussion has been rather general, and we have not made
any quantum mechanical assumptions. In fact, let us think about classical
1
Convex set: A subset K ⊂ C n of n-dimensional complex Euclidean space is convex
if, given any two points α, β ∈ K, the straight line segment joining α and β is entirely
contained in K:


(a) Not a convex set. (b) A convex set. (c) A convex set: Any convex combination of
α, β, x = θα + (1 − θ)β, where 0 < θ < 1 is an element of the set.

mechanics first. In classical physics, the pure states correspond to a complete
absence of any statistical spread in the dichotomic variables. If a preparation
yields a pure state, then a repeated measurement of any variable will always
yield the same result, either 0 or 1. Experimentally, this does not seem to be
the case. Instead, no matter how carefully we prepare our ensemble, there
will always be at least one dichotomic variable D such that hDi = 1/2, corre-
sponding to maximal statistical spread. Quantum mechanics (ignoring now
issues of superselection rules) also deals with the nature of dichotomic vari-
ables and the set of ensembles, in a way which agrees so far with experiment.
Let us restate some earlier postulates of quantum mechanics, modified and
expanded in this context:
1. To every physical system we associate a Hilbert space H. The pure
ensembles of the system are in 1:1 correspondence with the set of all
one-dimensional projections in H. Such a projection, P , is an operator
on the Hilbert space satisfying (A):

 2
P =P idempotent,

(A) P = P Hermitian, (41)


Tr(P ) = 1 “primitive”, or one-dimensional.
The set of all such projections is in one-to-one correspondence with the
set of all rays2 in H. Alternatively, we say that there is a one-to-one
correspondence between the rays and the pure states.
Given any ray R, we can pick a unit vector φ ∈ R, and the idempotent
P associated with R is
(B) P = |φihφ|. (42)
Conversely, any idempotent with the properties (A) can also be written
in the form (B).
Proof: We assume (see Exercises) that it has been demonstrated that
any linear operator in an n-dimensional Euclidean space may be
expressed as an n-term dyad, and that the extension of this idea
to an infinite-dimensional separable space has been made. Hence,
we may write: X
P = |ai ihbi |. (43)
i
2
A ray is the set of all non-zero multiples of a given non-zero vector. Such a multiple
is called an element of the ray.

Note that in some orthonormal basis {|ei i}, the matrix elements
of P are Pij = hei |P |ej i, and hence,
X
P = |ei iPij hej |. (44)
i,j

In the present case, P is Hermitian and therefore diagonalizable.


Let {|ei i} be a basis in which P is diagonal:
X
P = |ei iPii hei |. (45)
i

Since P † = P , the Pii are all real. Calculate:


X
P2 = |ei iPii hei |ej iPjj hej |
i,j
X
= |ei iPii2 hei |
i
= P, (46)

where the latter equality can be true if and only if Pii2 = Pii for
all i. That is, for each i we must either have Pii = 1 or Pii = 0.
P
But we must also have Tr(P ) = i Pii = 1, which holds if exactly
one Pii 6= 0, say Paa . In this basis,

P = |ea ihea | (47)

The ray R associated with P is then {c|ea i; c 6= 0}.


2. To every dichotomic variable D there corresponds a projection on some
subspace of H. That is, such a variable is represented by an operator
D on H satisfying:

D† = D (48)
D 2 = D 6= 0, (49)

the latter since the eigenvalues of D are 0 and 1.


3. The average of D in pure ensemble P (corresponding to projection P ) is:
    hDiP = Tr(DP )     (50)
(if P = |φihφ|, then hDiP = hφ|D|φi).

4. An arbitrary ensemble ρ is represented by a statistical operator,
or density matrix, which we also denote by symbol ρ. This is a
Hermitian operator on H with spectral decomposition,
X
ρ= ri Pi , (51)
i

where

    Pi Pj = δij Pi     (52)
    Σi Pi = I     (53)
    ri ≥ 0     (54)
    Σi ri = 1.     (55)

The set {ri } is the set of eigenvalues of the operator ρ. The properties
of this density matrix are precisely as in our earlier discussion.
Our symbolic equation for the incoherent superposition of two ensem-
bles, ρ = θρ1 + (1 − θ)ρ2 , can be interpreted as an equation for the cor-
responding density matrices represented by the same symbols. Hence,
the density matrix ρ describing the superposition of ρ1 and ρ2 with
probabilities θ and 1 − θ is ρ = θρ1 + (1 − θ)ρ2 . Thus, if ρ is any density
matrix, and D any dichotomic variable, then:

hDiρ = Tr(Dρ). (56)

For example,

hDiρ = hDiθρ1 +(1−θ)ρ2 (57)


= θhDiρ1 + (1 − θ)hDiρ2
= Tr(Dθρ1 ) + Tr [D(1 − θ)ρ2 ]
= Tr {D [θρ1 + (1 − θ)ρ2 ]}
= Tr(Dρ)

5. We regard every projection as corresponding to an observable, i.e.,


every primitive Hermitian idempotent P corresponds to an observable.
If ρ is a density matrix, then

ρ = P ⇔ Tr(P ρ) = 1. (58)

Proof: Suppose ρ = P . Then Tr(P P ) = Tr(P ), since P 2 = P . But
TrP = 1, since P is primitive. Now suppose Tr(P ρ) = 1. Then
!
X
1 = Tr P ri Pi
i
X
= ri Tr(P Pi ). (59)
i

Expand the one-dimensional projection operator in the basis in


which Pi = |ei ihei |:
X
P = |ej ihej |P |ek ihek |. (60)
j,k

Then:
 
X X
1 = ri Tr  |ej ihej |P |ek ihek |ei ihei |
i j,k
X X
= ri hej |P |ei iTr (|ej ihei |)
i j
X
= ri hei |P |ei i. (61)
i
P P
But we also have i hei |P |ei i = 1 and i ri = 1, with 0 ≤ ri ≤ 1.
P
Thus, i ri hei |P |ei i < 1, unless there is a k such that rk = 1,
and all of the other ri = 0, i 6= k. Hence, hek |P |ek i = 1, or
P = |ek ihek | = ρ.

Thus, P is the observable which tests whether an element of the sta-


tistical ensemble is in the state corresponding to ray “P ”.

6. In addition to the projection operators, we regard general self-adjoint


operators as observables, and the laws of nature deal with these observ-
ables. For example, we may consider operators with spectral resolutions
of the form: X X
Q= qi Pi = qi |ei ihei |, (62)
i i

where Pi Pj = δij Pi , and where the eigenvalues qi are real. We may


regard this as expressing the physical variable Q in terms of the di-
chotomic variables Pi (noting that the eigenvalues of Pi are 0 and 1).

Hence it is natural to define the ensemble average of Q in an ensemble
ρ by:
X
hQiρ = h qi Pi iρ
i
X
= qi Tr(ρPi )
i
= Tr(ρQ). (63)

This completes our picture of the mathematical structure and postulates


of quantum mechanics in this somewhat new language. We see that we
need not discuss “state vectors” in quantum mechanics, we can talk about
“ensembles” instead. In fact, the latter description has a more “physical”
aspect, in the sense that experimentally we seem to be able to prepare systems
as statistical ensembles, but not so readily as pure states.
Of course, we have no proof that our experimental ensembles and di-
chotomic variables must obey the above postulates. It may be that there
is some other theory which is more correct. However, there is so far no ex-
perimental conflict with our orthodox theory, and we shall continue in this
vein.

5 Coherent Superpositions
Theorem: Let P1 , P2 be two primitive Hermitian idempotents (i.e., rays, or
pure states, with P † = P , P 2 = P , and TrP = 1). Then:
1 ≥ Tr(P1 P2 ) ≥ 0. (64)
If Tr(P1 P2 ) = 1, then P2 = P1 . If Tr(P1 P2 ) = 0, then P1 P2 = 0 (vectors
in ray 1 are orthogonal to vectors in ray 2).
More generally, if ρ is a density matrix, and Q is any projection, then
1 ≥ Tr(Qρ) ≥ 0, (65)
Tr(Qρ) = 1 ⇔ Qρ = ρQ = ρ, (66)
Tr(Qρ) = 0 ⇔ Qρ = 0. (67)
Suppose we have orthogonal pure states, P1 P2 = 0. There then exists a
unique two parameter family of pure states {P } such that
Tr(P P1 ) + Tr(P P2 ) = 1. (68)

Any member P of this family is a ray corresponding to any vector in the
two-dimensional subspace defined by the projection P1 + P2 = S. We say
that P is a coherent superposition of the pure states P1 and P2 .
Let’s give an explicit construction of the operators P : Pick unit vector
|e1 i from ray P1 and |e2 i from ray P2 . Construct the following four operators:
S = P1 + P2 = |e1 ihe1 | + |e2 ihe2 | (69)
σ1 = |e1 ihe2 | + |e2 ihe1 | (70)
σ2 = i (|e2 ihe1 | − |e1 ihe2 |) (71)
σ3 = |e1 ihe1 | − |e2 ihe2 |. (72)
These operators satisfy the algebraic relations (noting the obvious similarities
with the Pauli matrices):
S2 = S (73)
Sσi = σi (74)
σi2 = S (75)
[σi , σj ] = 2iεijk σk .     (76)
Let u = (u1 , u2 , u3 ) be a unit vector in three-dimensional Euclidean space.
Define
    P (u) ≡ (1/2)(S + u · σ).     (77)
The reader should demonstrate that P (u) is the most general coherent su-
perposition of pure states P1 and P2 . This set is parameterized by the two-
parameter unit vector u. This, of course, is very characteristic of quantum
mechanics: If we have a “two-state” system we may form arbitrary super-
positions |ψi = α|ψ1 i + β|ψ2 i (assume hψ1 |ψ2 i = 0). The overall phase is
arbitrary, and the normalization constraint |α|2 + |β|2 = 1 uses another de-
gree of freedom, hence two parameters are required to describe an arbitrary
state. Note that the coherent superposition of pure states is itself a pure
state, unlike an incoherent superposition.
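A quick numerical confirmation (added; a sketch assuming numpy) that P (u) of Eqn. 77 is a primitive idempotent, and that it satisfies Eqn. 68, for an arbitrary unit vector u:

import numpy as np

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
P1, P2 = np.outer(e1, e1), np.outer(e2, e2)
S = P1 + P2
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),      # sigma_1
         np.array([[0, -1j], [1j, 0]], dtype=complex),   # sigma_2
         np.array([[1, 0], [0, -1]], dtype=complex)]     # sigma_3

u = np.array([0.3, -0.8, 0.5])
u = u / np.linalg.norm(u)                                # a unit vector
P = 0.5 * (S + sum(ui * si for ui, si in zip(u, sigma)))

print(np.allclose(P @ P, P), np.trace(P).real)           # idempotent, trace 1
print((np.trace(P @ P1) + np.trace(P @ P2)).real)        # = 1, as in Eqn. 68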

6 Density Matrices in a Finite-Dimensional


Hilbert Space
Consider a finite-dimensional Hilbert space H. The set of Hermitian opera-
tors on H defines a real vector space (real, so that aQ is Hermitian if Q is

Hermitian). Call this vector space O (for vector space of Operators). Define
a positive definite [(X, X) > 0 unless X = 0] symmetric [(X, Y ) = (Y, X)]
scalar product on O by:
(X, Y ) ≡ Tr(XY ), (78)
for any two vectors (i.e., Hermitian operators) X, Y ∈ O. The set of all
density matrices forms a convex subset of O, with norm ≤ 1.
Consider a complete orthonormal basis in O:

{B} = {B1 , B2 , . . .} ⊂ O such that Tr(Bi Bj ) = δij . (79)

Expand any vector X ∈ O in this basis according to


X
X= Bi Tr(Bi X). (80)
i

For a density matrix ρ this expansion is


X
ρ= Bi Tr(Bi ρ), (81)
i

but, as we have seen before, Tr(Bi ρ) = hBi iρ is just the ensemble average of
observable Bi in the ensemble ρ. Hence, the density matrix may be deter-
mined through measurements, uniquely, if we measure the ensemble averages
of a complete set of operators.
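As an illustration (an added sketch, not from the notes; assumes numpy), for a two-dimensional Hilbert space one may take the orthonormal operator basis {I/√2, σ1/√2, σ2/√2, σ3/√2} and rebuild ρ from the four ensemble averages hBi iρ :

import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [B / np.sqrt(2.0) for B in (I2, sx, sy, sz)]      # Tr(Bi Bj) = delta_ij

rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])    # some valid density matrix

averages = [np.trace(B @ rho) for B in basis]             # the "measurements" <Bi>
rho_rebuilt = sum(a * B for a, B in zip(averages, basis)) # expansion of Eqn. 81

print(np.allclose(rho_rebuilt, rho))                      # True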

7 Entropy, Mixing, Correlations


For this discussion, we need to first define the concept of a function of an
operator. Consider a self-adjoint operator Q, with a pure point spectrum
consisting of (real) eigenvalues {qi ; i = 1, 2, . . .} and no finite point of accu-
mulation.3 Let |ki denote the eigenvector corresponding to eigenvalue qk , and
assume it has been normalized. Then {|ki} forms a complete orthonormal
set, i.e.: X
hk|ji = δkj ; I = |kihk|. (82)
k

3
Abstractly, a point of accumulation (or a limit point) is a point x ∈ S ⊂ T , where
T is a topological space, if every neighborhood N (x) contains a point of S distinct from
x.

The spectral resolution of Q is given by:
X
Q= qk |kihk|. (83)
k

Let Σ(Q) denote the spectrum {q} of Q. If f (q) is any function defined
on Σ(Q), we define the operator f (Q) by:
X
f (Q) ≡ f (qk )|kihk|. (84)
k

For example,
    Q² = Σk qk² |kihk|,     (85)
which may be compared with
    Q² = Σ_{k,j} qk qj |kihk|jihj| = Σk qk² |kihk|,     (86)

which is what we hope should happen. In particular, we may perform Taylor


series expansions of functions of operators.
We wish to define a measure of the amount of (or lack of) information
concerning the elements of a statistical ensemble ρ. Thus, define the entropy
s = s(ρ) by:
s ≡ −Tr(ρ ln ρ) (= −hln ρiρ ). (87)
Note that, with an expansion (spectral decomposition) of ρ according to
X X
ρ= ri Pi = ri |ei ihei |, (88)
i i

we have
    ln ρ = Σi (ln ri ) Pi ,     (89)
and hence
    s = −Tr [ Σi (ln ri ) ρ Pi ]
      = − Σi ln ri Tr(ρPi )
      = − Σi ln ri Tr( Σj rj Pj Pi )
      = − Σi ri ln ri .     (90)
Since 0 ≤ ri ≤ 1, we always have s ≥ 0, and also s = 0 if and only if the
ensemble is a pure state. Roughly speaking, the more non-zero ri ’s there are,
that is the more the number of pure states involved, the greater the entropy.
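In practice s is computed from the eigenvalues of ρ, as in Eqn. 90. A minimal numerical sketch (added; assumes numpy):

import numpy as np

def entropy(rho):
    r = np.linalg.eigvalsh(rho)
    r = r[r > 1e-12]                # the convention 0 ln 0 = 0
    return float(-np.sum(r * np.log(r)))

pure = np.array([[1.0, 0.0], [0.0, 0.0]])      # a pure state
mixed = np.array([[0.5, 0.0], [0.0, 0.5]])     # maximally mixed two-state ensemble
print(entropy(pure))    # 0.0
print(entropy(mixed))   # ln 2 = 0.693...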
Consistent with our classical thermodynamic notion that entropy in-
creases with “mixing”, we have the “von Neumann mixing theorem”:

Theorem: If 0 < θ < 1, and ρ1 6= ρ2 , then:

s [θρ1 + (1 − θ)ρ2 ] > θs(ρ1 ) + (1 − θ)s(ρ2 ). (91)

8 Combination of Systems
Consider the situation where the system of interest may be regarded as the
“combination” of two subsystems, 1 and 2. For example, perhaps the system
consists of two atoms. For simplicity of illustration, assume that the states
of system 1 alone form a finite-dimensional Hilbert space H1 , and the states
of system 2 alone form another finite-dimensional Hilbert space H2 . The
combined system is then associated with Hilbert space H = H1 ⊗ H2 . For
example, we may have a two-dimensional space H1 and a three-dimensional
space H2 , with sets of vectors:
    (a, b)^T   and   (α, β, γ)^T ,     (92)
respectively. Then the product space consists of direct product vectors of the form:
    (aα, bα, aβ, bβ, aγ, bγ)^T .     (93)
The operators on H which refer only to subsystem 1 are of the form X ⊗I,
and the operators on H which refer only to subsystem 2 are of the form I ⊗Y

(X is an operator on H1 and Y is an operator on H2 ). For example:
    X ⊗ I = ( x1 x2 ) ⊗ ( 1 0 0 )   =  ( x1 x2  0  0  0  0 )
            ( x3 x4 )   ( 0 1 0 )      ( x3 x4  0  0  0  0 )
                        ( 0 0 1 )      (  0  0 x1 x2  0  0 )
                                       (  0  0 x3 x4  0  0 )
                                       (  0  0  0  0 x1 x2 )
                                       (  0  0  0  0 x3 x4 ) .     (94)
We see that this operator does not mix up the α, β, γ components of vectors in H2 .
Consider now an operator on H of the special form Z = X ⊗ Y . Define
“partial traces” for such an operator according to the mappings:

Tr1 (Z) = Tr1 (X ⊗ Y ) ≡ Y Tr(X) (95)


Tr2 (Z) = Tr2 (X ⊗ Y ) ≡ XTr(Y ) (96)

For our example:
    Z = X ⊗ Y = ( x1 x2 ) ⊗ ( y1 y2 y3 )
                ( x3 x4 )   ( y4 y5 y6 )     (97)
                            ( y7 y8 y9 )

      = ( x1 y1  x2 y1  x1 y2  x2 y2  x1 y3  x2 y3 )
        ( x3 y1  x4 y1  x3 y2  x4 y2  x3 y3  x4 y3 )
        ( x1 y4  x2 y4  x1 y5  x2 y5  x1 y6  x2 y6 )
        ( x3 y4  x4 y4  x3 y5  x4 y5  x3 y6  x4 y6 )
        ( x1 y7  x2 y7  x1 y8  x2 y8  x1 y9  x2 y9 )
        ( x3 y7  x4 y7  x3 y8  x4 y8  x3 y9  x4 y9 ) ,     (98)

and thus, for example,
    Tr1 (Z) = (x1 + x4 ) ( y1 y2 y3 )
                         ( y4 y5 y6 ) ,     (99)
                         ( y7 y8 y9 )
and also
Tr [Tr1 (Z)] = (x1 + x4 )(y1 + y5 + y9 ) = Tr(Z). (100)
These mappings thus map operators on H of this form into operators on H1 or on H2 .
An arbitrary linear operator on H may be expressed as a linear combina-
tion of operators of this form, and we extend the definition of Tr1 and Tr2 by

linearity to all operators on H. For example, suppose Z = X1 ⊗ Y1 + X2 ⊗ Y2 .
Then

Tr1 (Z) = Tr1 (X1 ⊗ Y1 + X2 ⊗ Y2 )


= Tr1 (X1 ⊗ Y1 ) + Tr1 (X2 ⊗ Y2 )
= Y1 Tr(X1 ) + Y2 Tr(X2 ), (101)

and the result is an operator on H2 .


Now let ρ be a density matrix on H, describing a statistical ensemble of
the combined system. Define “reduced density matrices” for subsystems 1
and 2:
ρ1 ≡ Tr2 (ρ), ρ2 ≡ Tr1 (ρ). (102)
The interpretation is that ρ1 summarizes all of the information contained in ρ
about the variables of subsystem 1 alone, and similarly for ρ2 . For example,
if X is any operator on system 1 alone:

hXiρ = Tr [ρ(X ⊗ I)]


= hXiρ1 = Tr(Xρ1 ). (103)
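As a numerical illustration of reduced density matrices (an added sketch, assuming numpy; it uses the standard Kronecker-product index ordering, which groups by the H1 index first, unlike the ordering of the example in Eqn. 94), consider a correlated pure state of two two-dimensional subsystems:

import numpy as np

d1, d2 = 2, 2
psi = np.zeros(d1 * d2, dtype=complex)
psi[0] = psi[3] = 1.0 / np.sqrt(2.0)          # the correlated state (|00> + |11>)/sqrt(2)
rho = np.outer(psi, psi.conj())

R = rho.reshape(d1, d2, d1, d2)               # rho_{(ij),(kl)}
rho1 = np.einsum('ijkj->ik', R)               # Tr_2(rho)
rho2 = np.einsum('ijil->jl', R)               # Tr_1(rho)

print(np.round(rho1, 3))                      # I/2: subsystem 1 alone is maximally mixed
print(np.trace(rho @ rho).real, np.trace(rho1 @ rho1).real)   # 1.0 (pure) vs 0.5 (mixed)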

From the reduced density matrices ρ1 and ρ2 we can form a new density
matrix on H:
ρ12 = ρ1 ⊗ ρ2 . (104)
It contains the same information which ρ1 and ρ2 contain together — ρ12
describes a statistical ensemble for which the variables of subsystem 1 are
completely uncorrelated with the variables of subsystem 2. If ρ is not of
this form (ρ 6= ρ12 ), then ρ describes an ensemble for which there is some
correlation between the variables of the two subsystems.
For the entropy in particular, we have

s(ρ12 ) = s(ρ1 ⊗ ρ2 ) = s(ρ1 ) + s(ρ2 ). (105)

Proof: We can choose a basis in which ρ1 and ρ2 are diagonal, and in this
basis ρ12 = ρ1 ⊗ ρ2 is also diagonal. Denote the diagonal elements of
ρ1 as di , i.e., di ≡ (ρ1 )ii , and the diagonal elements of ρ2 as δi . Then
the diagonal elements of ρ12 are given by all products of the form di δj ,
where i = 1, 2, . . . , n1 , and j = 1, 2, . . . , n2 , and where n1 and n2 are

the dimensions of H1 and H2 , respectively. Thus,

s(ρ12 ) = −Tr(ρ12 ln ρ12 )


n1 X
X n2
= − (di δj ) ln(di δj ). (106)
i=1 j=1

We compare this with (noting that Trρ1 = Trρ2 = 1):


 
n1
X n2
X
s(ρ1 ) + s(ρ2 ) = −  di ln di + δj ln δj 
i=1 j=1
 
X n2
n1 X n1 X
X n2
= − δj di ln di + di δj ln δj 
i=1 j=1 i=1 j=1
n
XX1 n2
= − di δj (ln di + ln δj )
i=1 j=1
= s(ρ12 ). (107)

Thus, the entropy for an ensemble (ρ12 ) for which the subsystems are
uncorrelated is just equal to the sum of the entropies of the reduced ensem-
bles for the subsystems. When there are correlations, we should expect an
inequality instead, since in this case ρ contains additional information con-
cerning the correlations, which is not present in ρ1 and ρ2 (ρ12 = ρ1 ⊗ρ2 6= ρ).
Then:
s(ρ12 ) = s(ρ1 ) + s(ρ2 ) ≥ s(ρ), (108)
where equality holds if and only if ρ = ρ12 , that is, if there are no correlations.
It is interesting that this inequality is specific for −x ln x, in the following
sense: Let s(ρ) = Tr [f (ρ)]. If this inequality, including the condition for
equality, holds for all finite-dimensional Hilbert spaces H1 and H2 , and all
density matrices ρ on H = H1 ⊗ H2 , then f (x) = −kx ln x, where k > 0
(and we may take k = 1). Since this inequality appears to be determined
by physical considerations, this becomes a strong argument for the form
s(ρ) = −Tr(ρ ln ρ) for the entropy.

9 Some Statistical Mechanics
Consider a Hamiltonian H with point spectrum {ωi ; i = 1, 2, . . .}, bounded
below. The partition function, Z(T ), for temperature T > 0 is defined by:

    Z(T ) ≡ Σ_{k=1}^∞ e^{−ωk /T} .     (109)

We are assuming that this sum converges. The density matrix (or statis-
tical operator) for the canonical distribution is given by:
    ρ = ρ(T ) = e^{−H/T} / Z(T )     (110)
              = (1/Z(T )) Σ_{k=1}^∞ |kihk| e^{−ωk /T} .     (111)
This makes intuitive sense – our canonical, thermodynamic distribution con-
sists of a mixture of states, with each state receiving a “weight” of exp(−ωk /T ).
Note that
    Z(T ) = Σ_{k=1}^∞ e^{−ωk /T} = Tr( Σ_{k=1}^∞ |kihk| e^{−ωk /T} ) = Tr( e^{−H/T} ) .     (112)
Hence, Tr [ρ(T )] = 1.
The ensemble average of any observable (self-adjoint operator), Q, in the
canonical ensemble is:
hQiρ = Tr [Qρ(T )] . (114)
For example, the mean energy is:
    U = hHiρ = (1/Z(T )) Tr( H e^{−H/T} )
      = (T²/Z(T )) ∂T Tr( e^{−H/T} )
      = (T²/Z(T )) ∂T Z(T )     (115)
      = T² ∂T ln [Z(T )] .     (116)
The entropy is:
    S = −Tr(ρ ln ρ)
      = −Tr{ (e^{−H/T}/Z(T )) [ −H/T − ln Z(T ) ] }
      = U/T + ln [Z(T )] .     (117)
If we define the Helmholtz free energy, F = −T ln Z, then S = −∂T F .
Alternatively, U = T S + F .
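These relations are easy to verify numerically. A small sketch (added; assumes numpy) for a two-level system with energies 0 and ω:

import numpy as np

w, T = 1.0, 0.7
E = np.array([0.0, w])                        # spectrum of H

def lnZ(temp):
    return np.log(np.sum(np.exp(-E / temp)))

p = np.exp(-E / T) / np.exp(lnZ(T))           # canonical weights, the diagonal of rho

U = np.sum(p * E)                             # Tr(rho H)
dT = 1e-6
U_from_Z = T**2 * (lnZ(T + dT) - lnZ(T - dT)) / (2 * dT)    # Eqn. 116

S = -np.sum(p * np.log(p))                    # -Tr(rho ln rho)
S_from_Z = U / T + lnZ(T)                     # Eqn. 117

print(U, U_from_Z)    # agree
print(S, S_from_Z)    # agree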

10 Exercises
1. Show that any linear operator in an n-dimensional Euclidean space
may be expressed as an n-term dyad. Show that this may be extended
to an infinite-dimensional Euclidean space.
2. Suppose we have a system with total angular momentum 1. Pick a
basis corresponding to the three eigenvectors of the z-component of
angular momentum, Jz , with eigenvalues +1, 0, −1, respectively. We
are given an ensemble described by density matrix:
 
2 1 1
1 
ρ = 1 1 0.
4
1 0 1

(a) Is ρ a permissible density matrix? Give your reasoning. For the


remainder of this problem, assume that it is permissible. Does it
describe a pure or mixed state? Give your reasoning.
(b) Given the ensemble described by ρ, what is the average value of
Jz ?
(c) What is the spread (standard deviation) in measured values of Jz ?

3. Prove the first theorem in section 5.


4. Prove the von Neumann mixing theorem.
5. Show that an arbitrary linear operator on a product space H = H1 ⊗H2
may be expressed as a linear combination of operators of the form
Z =X ⊗Y.

6. Let us try to improve our understanding of the discussions on the den-
sity matrix formalism, and the connections with “information” or “en-
tropy” that we have made. Thus, we consider a simple “two-state”
system. Let ρ be any general density matrix operating on the two-
dimensional Hilbert space of this system.

(a) Calculate the entropy, s = −Tr(ρ ln ρ) corresponding to this den-


sity matrix. Express your result in terms of a single real para-
meter. Make sure the interpretation of this parameter is clear, as
well as its range.
(b) Make a graph of the entropy as a function of the parameter. What
is the entropy for a pure state? Interpret your graph in terms of
knowledge about a system taken from an ensemble with density
matrix ρ.
(c) Consider a system with ensemble ρ a mixture of two ensembles
ρ1 , ρ2 :
ρ = θρ1 + (1 − θ)ρ2 , 0≤θ≤1 (118)
As an example, suppose
   
1 1 0 1 1 1
ρ1 = , and ρ2 = , (119)
2 0 1 2 1 1
in some basis. Prove that VonNeuman’s mixing theorem holds for
this example:
s(ρ) ≥ θs(ρ1 ) + (1 − θ)s(ρ2 ), (120)
with equality iff θ = 0 or θ = 1.

7. Consider an N-dimensional Hilbert space. We define the real vector


space, O of Hermitian operators on this Hilbert space. We define a
scalar product on this vector space according to:

(x, y) = Tr(xy), ∀x, y ∈ O. (121)

Consider a basis {B} of orthonormal operators in O. The set of den-


sity operators is a subset of this vector space, and we may expand an
arbitrary density matrix as:
X X
ρ= Bi Tr(Bi ρ) = Bi hBi iρ . (122)
i i

By measuring the average values for the basis operators, we can thus
determine the expansion coefficients for ρ.
(a) How many such measurements are required to completely deter-
mine ρ?
(b) If ρ is known to be a pure state, how many measurements are
required?
8. Two scientists (they happen to be twins, named “Oivil” and “Livio”,
but never mind. . . ) decide to do the following experiment: They set
up a light source, which emits two photons at a time, back-to-back in
the laboratory frame. The ensemble is given by:
    ρ = (1/2) (|LLihLL| + |RRihRR|),     (123)
where “L” refers to left-handed polarization, and “R” refers to right-
handed polarization. Thus, |LRi would refer to a state in which photon
number 1 (defined as the photon which is aimed at scientist Oivil, say)
is left-handed, and photon number 2 (the photon aimed at scientist
Livio) is right-handed.
These scientists (one of whom is of a diabolical bent) decide to play a
game with Nature: Oivil (of course) stays in the lab, while Livio treks
to a point a light-year away. The light source is turned on and emits
two photons, one directed toward each scientist. Oivil soon measures
the polarization of his photon; it is left-handed. He quickly makes a
note that his brother is going to see a left-handed photon, sometime
after next Christmas.
Christmas has come and gone, and finally Livio sees his photon, and
measures its polarization. He sends a message back to his brother Oivil,
who learns in yet another year what he knew all along: Livio’s photon
was left-handed.
Oivil then has a sneaky idea. He secretly changes the apparatus, with-
out telling his forlorn brother. Now the ensemble is:
    ρ = (1/2) (|LLi + |RRi)(hLL| + hRR|).     (124)
He causes another pair of photons to be emitted with this new appa-
ratus, and repeats the experiment. The result is identical to the first
experiment.

(a) Was Oivil just lucky, or will he get the right answer every time,
for each apparatus? Demonstrate your answer explicitly, in the
density matrix formalism.
(b) What is the probability that Livio will observe a left-handed pho-
ton, or a right-handed photon, for each apparatus? Is there a
problem with causality here? How can Oivil know what Livio is
going to see, long before he sees it? Discuss! Feel free to modify
the experiment to illustrate any points you wish to make.

9. Let us consider the application of the density matrix formalism to the


problem of a spin-1/2 particle (such as an electron) in a static external
magnetic field. In general, a particle with spin may carry a magnetic
moment, oriented along the spin direction (by symmetry). For spin-
1/2, we have that the magnetic moment (operator) is thus of the form:
    µ = (1/2) γ σ ,     (125)
where σ are the Pauli matrices, the 1/2 is by convention, and γ is a
constant, giving the strength of the moment, called the gyromagnetic
ratio. The term in the Hamiltonian for such a magnetic moment in an
external magnetic field, B is just:
    H = −µ · B.     (126)

Our spin-1/2 particle may have some spin-orientation, or “polarization


vector”, given by:
    P = hσi.     (127)
Drawing from our classical intuition, we might expect that in the ex-
ternal magnetic field the polarization vector will exhibit a precession
about the field direction. Let us investigate this.
Recall that the expectation value of an operator may be computed from
the density matrix according to:
hAi = Tr(ρA). (128)
Furthermore, recall that the time evolution of the density matrix is
given by:
    i ∂ρ/∂t = [H(t), ρ(t)].     (129)
What is the time evolution, dP /dt, of the polarization vector? Express
your answer as simply as you can (more credit will be given for right
answers that are more physically transparent than for right answers
which are not). Note that we make no assumption concerning the
purity of the state.

10. Let us consider a system of N spin-1/2 particles (see the previous prob-
lem) per unit volume in thermal equilibrium, in our external magnetic
field B . Recall that the canonical distribution is:
    ρ = e^{−H/T} / Z ,     (130)
with partition function:
 
Z = Tr e−H/T . (131)

Such a system of particles will tend to orient along the magnetic field,
resulting in a bulk magnetization (having units of magnetic moment
per unit volume), M .

(a) Give an expression for this magnetization (don’t work too hard to
evaluate).
(b) What is the magnetization in the high-temperature limit, to lowest
non-trivial order (this I want you to evaluate as completely as you
can!)?

Physics 125c
Course Notes
Density Matrix Formalism
Solutions to Problems
040520 Frank Porter

1 Exercises
1. Show that any linear operator in an n-dimensional Euclidean space
may be expressed as an n-term dyad. Show that this may be extended
to an infinite-dimensional Euclidean space.
Solution: Consider operator A in n-dimensional Euclidean space,
which may be expressed as a matrix in a given basis:
n
X
A = aij |ei ihej |
i,j=1
 
X X
= |ei i  aij hej | . (1)
i j
This is in the form of an n-term dyad A = Σ_{i=1}^n |αi ihβi |, with |αi i = |ei i and hβi | = Σ_{j=1}^n aij hej |.
An arbitrary vector in an infinite dimensional Euclidean space may be
expanded in a countable basis according to:

X
|αi = αi |ei i. (2)
i=1

Another way to say this is that the basis is “complete”, with “com-
pleteness relation”: X
I= |ei ihei |. (3)
i
An arbitrary linear operator can thus be defined in terms of its actions
on the basis vectors:
X
A = IAI = |ei ihei |A|ej ihej |. (4)
i,j

The remainder proceeds as for the finite-dimensional case.

2. Suppose we have a system with total angular momentum 1. Pick a
basis corresponding to the three eigenvectors of the z-component of
angular momentum, Jz , with eigenvalues +1, 0, −1, respectively. We
are given an ensemble described by density matrix:
 
2 1 1
1 
ρ = 1 1 0.
4
1 0 1

(a) Is ρ a permissible density matrix? Give your reasoning. For the


remainder of this problem, assume that it is permissible. Does it
describe a pure or mixed state? Give your reasoning.
Solution: Clearly ρ is hermitian. It is also trace one. This is
almost sufficient for ρ to be a valid density matrix. We can see
this by noting that, given a hermitian matrix, we can make a
transformation of basis to one in which ρ is diagonal. Such a
transformation preserves the trace. In this diagonal basis, ρ is of
the form:
ρ = a|e1 ihe1 | + b|e2 ihe2 | + c|e3 ihe3 |,
where a, b, c are real numbers such that a + b + c = 1. This is
clearly in the form of a density operator. Another way of arguing
this is to consider the n-term dyad representation for a hermitian
matrix.
However, we must also have that ρ is positive, in the sense that
a, b, c cannot be negative. Otherwise, we would interpret some
probabilities as negative. There are various ways to check this.
For example, we can check that the expectation value of ρ with
respect to any state is not negative. Thus, let an arbitrary state
be: |ψi = (α, β, γ). Then
    4hψ|ρ|ψi = 2|α|2 + |β|2 + |γ|2 + 2Re(α∗β) + 2Re(α∗γ).     (5)
This quantity can never be negative, by virtue of the relation:
    |x|2 + |y|2 + 2Re(x∗y) = |x + y|2 ≥ 0.     (6)
Therefore ρ is a valid density operator.
To determine whether ρ is a pure or mixed state, we consider:
    Tr(ρ2 ) = (6 + 2 + 2)/16 = 5/8 .
This is not equal to one, so ρ is a mixed state. Alternatively, one
can show explicitly that ρ2 6= ρ.
(b) Given the ensemble described by ρ, what is the average value of
Jz ?
Solution: We are working in a diagonal basis for Jz :
 
1 0 0
 
Jz =  0 0 0  .
0 0 −1

The average value of Jz is:


    hJz i = Tr(ρJz ) = (2 + 0 − 1)/4 = 1/4 .

(c) What is the spread (standard deviation) in measured values of Jz ?


Answer: We’ll need the average value of Jz2 for this:
1 3
hJz2 i = Tr(ρJz2 ) = (2 + 0 + 1) = .
4 4
Then: √
q 11
∆Jz = hJz2 i − hJz i2 = .
4
3. Prove the first theorem in section 5.
Solution: The theorem we wish to prove is:

Theorem: Let P1 , P2 be two primitive Hermitian idempotents (i.e.,


rays, or pure states, with P † = P , P 2 = P , and TrP = 1). Then:

1 ≥ Tr(P1 P2 ) ≥ 0. (7)

If Tr(P1 P2 ) = 1, then P2 = P1 . If Tr(P1 P2 ) = 0, then P1 P2 = 0


(vectors in ray 1 are orthogonal to vectors in ray 2).

First, suppose P1 = P2 . Then Tr(P1 P2 ) = Tr(P12 ) = Tr(P1 ) = 1. If


P1 P2 = 0, then Tr(P1 P2 ) = Tr(0) = 0.

More generally, expand P1 and P2 with respect to an orthonormal basis
{|ei i}:
X
P1 = aij |ei ihej | (8)
i,j
X
P2 = bij |ei ihej | (9)
i,j
X
P 1 P2 = aik bkj |ei ihej |. (10)
i,j,k

We know from the discussion of the pure-state postulate (Section 4 of the Density Matrix note) that we can work in a basis in which aij = δij δi1 . In this basis,
note, that we can work in a basis in which aij = δij δi1. In this basis,
X
P1 P2 = |e1 i b1i hei |. (11)
i

The trace, which is invariant under choice of basis, is


Tr(P1 P2 ) = b11 (12)

We are almost there, but we need to show that b11 > 0, if P1 P2 6= 0.


A simple way to see this is to notice that P2 is the outer product of a
vector with itself, hence b11 ≥ 0, with b11 = 0 if and only if P1 P2 = 0
(since b1i = bi1 = 0 for all i if b11 = 0). Finally, b11 < 1 if P1 6= P2 .
4. Prove the von Neumann mixing theorem.
Solution: The mixing theorem states that, given two distinct en-
sembles ρ1 6= ρ2 , a number 0 < θ < 1, and a mixed ensemble ρ =
θρ1 + (1 − θ)ρ2 , then

s(ρ) > θs(ρ1 ) + (1 − θ)s(ρ2 ). (13)

Let us begin by proving the following:


Lemma: Let
    x = Σ_{i=1}^n λi xi ,     (14)
where 1 > xi > 0, 0 < λi < 1 for all i, and Σ_{i=1}^n λi = 1. Then
    −x ln x > − Σ_{i=1}^n λi xi ln xi .     (15)
Proof: This follows because −x ln x is a concave function of x. Its
second derivative is
    d²(−x ln x)/dx² = d(− ln x − 1)/dx = −1/x.     (16)
For 1 > x > 0 this is always negative; −x ln x is a concave function
for 1 > x > 0. Hence, any point on a straight line between two
points on the curve of this function lies below the curve. The
theorem is for a linear combination of n points on the curve of
−x ln x. Here, x is a weighted average of points xi . The function
−x ln x evaluated at this weighted average point is to be compared
with the weighted average of the values of the function at the n
points x1 , x2 , . . . , xn . Again, the function evaluated at the linear
combination is a point on the curve, and the weighted average of
the function over the n points must lie below that point on the
curve. The region of possible values of the weighted average of the
function is the polygon joining neighboring points on the curve,
and the first and last points. See Fig. 1.

Now we must see how our present problem can be put in the form where
this lemma may be applied. Consider the spectral decompositions of
ρ1 , ρ2 :
X X
ρ1 = ai Pi = ai |ei ihei | (17)
i i
X X
ρ2 = bi Qi = bi |fi ihfi |, (18)
i i

where the decompositions have been “padded” with complete sets of


one-dimensional projections. That is, some of the ai ’s and bi ’s may
be zero. The idea is that the sets {|ei i} and {|fi i} form complete
orthonormal bases. Note that we cannot have Pi = Qi in general.
Then we have:

ρ = θρ1 + (1 − θ)ρ2 (19)


X
= [θai |ei ihei | + (1 − θ)bi |fi ihfi |] (20)
i
X
= ci |gi ihgi |, (21)
i

[Plot omitted: a concave curve f (x), with points x1 , x2 , x3 , x4 and the weighted average hxi marked on the x axis; f (hxi) lies above the average hf (x)i.]
Figure 1: Illustration showing that the weighted average of a concave function is smaller than the function evaluated at the weighted average point. The allowed values of the ordered pairs (hxi, hf (x)i) lie in the polygon.

where we have defined another complete orthonormal basis, {|gi i}, cor-
responding to a basis in which ρ is diagonal.
We may expand the {|ei i} and {|fi i} bases in terms of the {|gi i} basis.
For example, let X
|ei i = Aij |gj i, (22)
j

where A = {Aij } is a unitary matrix. The inverse transformation is


X
|gi i = A∗ji |ej i. (23)
j

Also,
    hei | = ( Σj Aij |gj i )† = Σj A∗ij hgj | ,     (24)
and hence, XX
|ei ihei | = Aij |gj ihgk |A∗ik . (25)
j k

Similarly, we define matrix B such that


XX

|fi ihfi | = Bij |gj ihgk |Bik . (26)
j k

Substituting Eqns. 25 and 26 into Eqn. 20:


 
X X X
ρ= θai A∗ik Aij + (1 − θ)bi ∗
Bik Bij  |gj ihgk |. (27)
i j,k j,k

Thus, the numbers c` are:

    cℓ = hgℓ |ρ|gℓ i
       = Σ_i Σ_{j,k} [ θ ai A∗ik Aij + (1 − θ) bi B∗ik Bij ] δℓj δℓk     (28)
       = Σ_i [ θ |Aiℓ |² ai + (1 − θ) |Biℓ |² bi ] .     (29)

The entropy for density matrix ρ is:


X
s(ρ) = − ci ln ci (30)
i
   
X X h i X h i
= − 
θ|Aji |2 aj + (1 − θ)|Bji |2 bj  ln  θ|Aji |2 aj + (1 − θ)|Bji |2 bj  .
i j j

Note that ci is of the form


    ci = Σ_j ( λ^(a)_ij aj + λ^(b)_ij bj ),     (31)

where
    λ^(a)_ij ≡ θ|Aji |²     (32)
    λ^(b)_ij ≡ (1 − θ)|Bji |² .     (33)

Furthermore,
    Σ_j ( λ^(a)_ij + λ^(b)_ij ) = 1.     (34)

Thus, according to the lemma (some of the ci ’s might be zero; there is


an equality, 0 = 0 in such cases),
X h (a) (b)
i
−ci ln ci > − λij aj ln aj + λij bj ln bj . (35)
j

Finally, we sum the above inequality over i:


    s(ρ) = − Σ_i ci ln ci
         > − Σ_i Σ_j [ λ^(a)_ij aj ln aj + λ^(b)_ij bj ln bj ]
         = − Σ_j [ θ aj ln aj + (1 − θ) bj ln bj ]     (36)
         = θ s(ρ1 ) + (1 − θ) s(ρ2 ) .     (37)

This completes the proof.

5. Show that an arbitrary linear operator on a product space H = H1 ⊗H2


may be expressed as a linear combination of operators of the form
Z =X ⊗Y.
Solution: We are given an arbitrary linear operator A on H = H1 ⊗H2 .
We wish to show that there exists a decomposition of the form:
X X
A= Ai Zi = Ai Xi ⊗ Yi , (38)
i i

where Xi are operators on H1 and Yi are operators on H2 .


Let {fi : i = 1, 2, . . .} be an orthonormal basis in H1 and {gi : i =
1, 2, . . .} be an orthonormal basis in H2 . Then we may obtain an ortho-
normal basis for H composed of vectors of the form:

eij = fi ⊗ gj , i = 1, 2, . . . ; j = 1, 2, . . . . (39)

It is readily checked that {eij } is, in fact, an orthonormal basis for H.

Expand A with respect to basis {eij }:
X
A = Aij,mn |eij ihemn | (40)
i,j,m,n
X
= Aij,mn |fi i ⊗ |gj ihfm | ⊗ hgn | (41)
i,j,m,n
X
= Aij,mn |fi ihfm | ⊗ |gj ihgn | (42)
i,j,m,n
X
= Ak Xk ⊗ Yk , (43)
k

where k is a relabeling for i, j, m, n.


The only step above which requires further comment is setting:
|fi i ⊗ |gj ihfm | ⊗ hgn | = |fi ihfm | ⊗ |gj ihgn |. (44)
One way to check this is as follows. Pick our bases to be in the form:
(fi )k = δik (45)
(gj )` = δj` . (46)
Then
(|fi i ⊗ |gj ihfm | ⊗ hgn |)k`,pq = δik δj` δmp δnq . (47)
and
(|fi ihfm | ⊗ |gj ihgn |)k`,pq = δik δj`δmp δnq . (48)

6. Let us try to improve our understanding of the discussions on the den-


sity matrix formalism, and the connections with “information” or “en-
tropy” that we have made. Thus, we consider a simple “two-state”
system. Let ρ be any general density matrix operating on the two-
dimensional Hilbert space of this system.
(a) Calculate the entropy, s = −Tr(ρ ln ρ) corresponding to this den-
sity matrix. Express your result in terms of a single real para-
meter. Make sure the interpretation of this parameter is clear, as
well as its range.
Solution: Density matrix ρ is Hermitian, hence diagonal in some
basis. Work in such a basis. In this basis, ρ has the form:
 
ρ = [ θ  0 ; 0  1−θ ],    (49)
where 0 ≤ θ ≤ 1 is the probability that the system is in state 1.
We have a pure state if and only if either θ = 1 or θ = 0.
The entropy is
s = −θ ln θ − (1 − θ) ln(1 − θ). (50)

(b) Make a graph of the entropy as a function of the parameter. What


is the entropy for a pure state? Interpret your graph in terms of
knowledge about a system taken from an ensemble with density
matrix ρ.
Solution:
[Figure 2 (plot of the entropy versus θ) omitted in this transcription.]

Figure 2: The entropy as a function of θ.

The entropy for a pure state, with θ = 1 or θ = 0, is zero. The


entropy increases as the state becomes “less pure”, reaching maxi-
mum when the probability of being in either state is 1/2, reflecting
minimal “knowledge” about the state.
(c) Consider a system with ensemble ρ a mixture of two ensembles
ρ1 , ρ2 :
ρ = θρ1 + (1 − θ)ρ2 , 0≤θ≤1 (51)
As an example, suppose

ρ1 = (1/2) [ 1  0 ; 0  1 ],   and   ρ2 = (1/2) [ 1  1 ; 1  1 ],    (52)

in some basis. Prove that von Neumann's mixing theorem holds for
this example:
s(ρ) ≥ θs(ρ1 ) + (1 − θ)s(ρ2 ), (53)
with equality iff θ = 0 or θ = 1.

Solution: The entropy of ensemble 1 is:

s(ρ1) = −Tr(ρ1 ln ρ1) = −(1/2) ln(1/2) − (1/2) ln(1/2) = ln 2 = 0.6931.    (54)
It may be noticed that ρ22 = ρ2 , hence ensemble 2 is a pure state,
with entropy s(ρ2 ) = 0. Next, we need the entropy of the combined
ensemble:

ρ = θρ1 + (1 − θ)ρ2 = (1/2) [ 1  1−θ ; 1−θ  1 ].    (55)

To compute the entropy, it is convenient to determine the eigenvalues;
they are 1 − θ/2 and θ/2. Note that they are in the range from zero to
one, as they must be. The entropy is

s(ρ) = −(1 − θ/2) ln(1 − θ/2) − (θ/2) ln(θ/2).    (56)

We must compare s(ρ) with

θs(ρ1 ) + (1 − θ)s(ρ2 ) = θ ln 2. (57)

It is readily checked that equality holds for θ = 1 or θ = 0. For the


case 0 < θ < 1, take the difference of the two expressions:
s(ρ) − [θs(ρ1) + (1 − θ)s(ρ2)] = −(1 − θ/2) ln(1 − θ/2) − (θ/2) ln(θ/2) − θ ln 2
                               = − ln [ (1 − θ/2)^(1−θ/2) (θ/2)^(θ/2) 2^θ ].    (58)

This must be larger than zero if the mixing theorem is correct. This is
equivalent to asking whether
(1 − θ/2)^(1−θ/2) (θ/2)^(θ/2) 2^θ    (59)

is less than 1. This expression may be rewritten as

(1 − θ/2)^(1−θ/2) (2θ)^(θ/2).    (60)

It must be less than one. To check, let's locate its interior extremum, by
setting its derivative with respect to θ equal to zero:

0 = d/dθ [ (1 − θ/2)^(1−θ/2) (2θ)^(θ/2) ]
  = d/dθ exp [ (1 − θ/2) ln(1 − θ/2) + (θ/2) ln(2θ) ],

which requires

0 = −(1/2) ln(1 − θ/2) − 1/2 + (1/2) ln(2θ) + 1/2
  = (1/2) [ ln(2θ) − ln(1 − θ/2) ].    (61)

Thus, the only interior extremum occurs at θ = 2/5, where the expression takes
the value 0.8 < 1; since the expression equals one only at the endpoints θ = 0, 1,
it is below one throughout 0 < θ < 1. At θ = 2/5, s(ρ) = 0.500, while
θs(ρ1) + (1 − θ)s(ρ2) = (2/5) ln 2 = 0.277. The theorem holds.
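As a quick numerical cross-check of these numbers (my own sketch, not part of the original argument), one can evaluate both sides of the mixing inequality directly for the ρ1, ρ2 of Eqn. 52:

```python
import numpy as np

def entropy(rho):
    # s = -Tr(rho ln rho), computed from the eigenvalues of rho
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # convention: 0 ln 0 = 0
    return -np.sum(evals * np.log(evals))

rho1 = 0.5 * np.eye(2)                    # maximally mixed ensemble
rho2 = 0.5 * np.ones((2, 2))              # pure state: projector onto (1,1)/sqrt(2)

for theta in [0.1, 0.4, 0.7, 0.9]:
    rho = theta * rho1 + (1 - theta) * rho2
    lhs = entropy(rho)
    rhs = theta * entropy(rho1) + (1 - theta) * entropy(rho2)
    print(f"theta={theta:.1f}  s(rho)={lhs:.4f}  theta*s1+(1-theta)*s2={rhs:.4f}")
# At theta = 0.4 this prints s(rho) = 0.5004 and 0.2773, matching the values quoted above.
```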

7. Consider an N-dimensional Hilbert space. We define the real vector


space, O of Hermitian operators on this Hilbert space. We define a
scalar product on this vector space according to:

(x, y) = Tr(xy), ∀x, y ∈ O. (62)

Consider a basis {B} of orthonormal operators in O. The set of den-


sity operators is a subset of this vector space, and we may expand an
arbitrary density matrix as:
ρ = Σi Bi Tr(Bi ρ) = Σi Bi ⟨Bi⟩ρ.    (63)

By measuring the average values for the basis operators, we can thus
determine the expansion coefficients for ρ.

(a) How many such measurements are required to completely deter-


mine ρ?
Solution: The question is, how many independent basis operators
are there in O? An arbitrary N × N complex matrix is described

by 2N² real parameters. The requirement of Hermiticity provides
the independent constraint equations:

Re(Hij) = Re(Hji),   i < j,    (64)
Im(Hij) = −Im(Hji),   i ≤ j.    (65)

This is N + 2[N(N − 1)/2] = N² equations. Thus, O is an N²-
dimensional vector space. But to completely determine the den-
sity matrix, we have one further constraint, that Tr ρ = 1. Thus,
it takes N² − 1 measurements to completely determine ρ.
(b) If ρ is known to be a pure state, how many measurements are
required?
Solution: We note that a complex vector in N dimensions is
completely specified by 2N real parameters. However, one para-
meter is an arbitrary phase, and another parameter is eaten by
the normalization constraint. Thus, it takes 2(N − 1) parameters
to completely specify a pure state.
If ρ is a pure state, then ρ2 = ρ. How many additional constraints
over the result in part (a) does this imply? Let’s try to get a more
intuitive understanding by attacking this issue from a slightly dif-
ferent perspective. Ask, instead, how many parameters it takes
to build an arbitrary density matrix as a mixture of pure states.
Our response will be to add pure states into the mixture one at
a time, counting parameters as we go, until we cannot add any
more.
It takes 2(N − 1) parameters to define the first pure state in our
mixture. The second pure state must be a distinct state. That is,
it must be drawn from an N − 1-dimensional subspace. Thus the
second pure state requires 2(N − 2) parameters to define. There
will also be another parameter required to specify the relative
probabilities of the first and second state, but we’ll count up these
probabilities later. The third pure state requires 2(N − 3) para-
meters, and so forth, stopping at 2 · 1 parameters for the (N − 1)st
pure state. Thus, it takes
Σ(k=1 to N−1) 2k = N(N − 1)    (66)

parameters to define all the pure states in the arbitrary mixture.
There can be a total of N pure states making up a mixture (the
Nth one required no additional parameters in the count we just
made). It takes N − 1 parameters to specify the relative prob-
abilities of these N components in the mixture. Thus, the total
number of parameters required is:

N(N − 1) + (N − 1) = N² − 1.    (67)

Notice that this is just the result we obtained in part (a).
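To make the counting concrete for N = 2 (this example is mine, not part of the original solution), take the orthonormal operator basis {1/√2, σx/√2, σy/√2, σz/√2}; the three Pauli "measurements" Tr(σi ρ) fix ρ, with the identity coefficient supplied by Tr ρ = 1:

```python
import numpy as np

# Orthonormal basis of Hermitian 2x2 operators under (x, y) = Tr(xy)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [np.eye(2, dtype=complex) / np.sqrt(2),
         sx / np.sqrt(2), sy / np.sqrt(2), sz / np.sqrt(2)]

# An arbitrary density matrix rho = (1/2)(1 + P.sigma), |P| <= 1
P = np.array([0.3, -0.5, 0.6])
rho = 0.5 * (np.eye(2) + P[0] * sx + P[1] * sy + P[2] * sz)

# Expansion coefficients <B_i> = Tr(B_i rho); only the three Pauli ones are
# genuine measurements, the identity coefficient is 1/sqrt(2) from Tr(rho) = 1.
coeffs = [np.trace(B @ rho).real for B in basis]
rho_rebuilt = sum(c * B for c, B in zip(coeffs, basis))
print(np.allclose(rho, rho_rebuilt))   # True: N^2 - 1 = 3 measurements suffice
```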


8. Two scientists (they happen to be twins, named “Oivil” and “Livio”,
but never mind. . . ) decide to do the following experiment: They set
up a light source, which emits two photons at a time, back-to-back in
the laboratory frame. The ensemble is given by:
ρ = (1/2) ( |LL⟩⟨LL| + |RR⟩⟨RR| ),    (68)

where "L" refers to left-handed polarization, and "R" refers to right-
handed polarization. Thus, |LR⟩ would refer to a state in which photon
number 1 (defined as the photon which is aimed at scientist Oivil, say)
is left-handed, and photon number 2 (the photon aimed at scientist
Livio) is right-handed.
These scientists (one of whom is of a diabolical bent) decide to play a
game with Nature: Oivil (of course) stays in the lab, while Livio treks
to a point a light-year away. The light source is turned on and emits
two photons, one directed toward each scientist. Oivil soon measures
the polarization of his photon; it is left-handed. He quickly makes a
note that his brother is going to see a left-handed photon, sometime
after next Christmas.
Christmas has come and gone, and finally Livio sees his photon, and
measures its polarization. He sends a message back to his brother Oivil,
who learns in yet another year what he knew all along: Livio’s photon
was left-handed.
Oivil then has a sneaky idea. He secretly changes the apparatus, with-
out telling his forlorn brother. Now the ensemble is:
ρ = (1/2) ( |LL⟩ + |RR⟩ ) ( ⟨LL| + ⟨RR| ).    (69)
He causes another pair of photons to be emitted with this new appa-
ratus, and repeats the experiment. The result is identical to the first
experiment.

(a) Was Oivil just lucky, or will he get the right answer every time,
for each apparatus? Demonstrate your answer explicitly, in the
density matrix formalism.
Solution: Yup, he’ll get it right, every time, in either case. Let’s
first define a basis so that we can see how it all works with explicit
matrices:
       
|LL⟩ = (1, 0, 0, 0)ᵀ,   |LR⟩ = (0, 1, 0, 0)ᵀ,   |RL⟩ = (0, 0, 1, 0)ᵀ,   |RR⟩ = (0, 0, 0, 1)ᵀ.    (70)
In this basis the density matrix for the first apparatus is:

ρ = (1/2) ( |LL⟩⟨LL| + |RR⟩⟨RR| )
  = (1/2) diag(1, 0, 0, 0) + (1/2) diag(0, 0, 0, 1)
  = (1/2) diag(1, 0, 0, 1).    (71)

Since Tr(ρ²) = 1/2, we know that this is a mixed state.


Now, Oivil observes that his photon is left-handed. His left-
handed projection operator is
 
PL = diag(1, 1, 0, 0),    (72)

so once he has made his measurement, the state has “collapsed”

to:

PL ρ = (1/2) diag(1, 0, 0, 0).    (73)
This corresponds to a pure |LL⟩ state, hence Livio will observe
left-handed polarization.
For the second apparatus, the density matrix is

ρ = (1/2) ( |LL⟩ + |RR⟩ ) ( ⟨LL| + ⟨RR| )
  = (1/2) [ 1 0 0 1 ; 0 0 0 0 ; 0 0 0 0 ; 1 0 0 1 ].    (74)
Since Tr(ρ²) = 1, we know that this is a pure state. Applying the
left-handed projection for Oivil's photon, we obtain:

PL ρ = (1/2) [ 1 0 0 1 ; 0 0 0 0 ; 0 0 0 0 ; 0 0 0 0 ],    (75)

whose only support is again on the |LL⟩ component.
Again, Livio will observe left-handed polarization.
(b) What is the probability that Livio will observe a left-handed pho-
ton, or a right-handed photon, for each apparatus? Is there a
problem with causality here? How can Oivil know what Livio is
going to see, long before he sees it? Discuss! Feel free to modify
the experiment to illustrate any points you wish to make.
Solution: Livio's left-handed projection operator is

PL(Livio) = diag(1, 0, 1, 0).    (76)

The probability that Livio will observe a left-handed photon for
the first apparatus is:

⟨PL(Livio)⟩ = Tr [ PL(Livio) ρ ] = 1/2.    (77)

The same result is obtained for the second apparatus.
Here is my take on the philosophical issue (beware!):
If causality means propagation of information faster than the
speed of light, then the answer is “no”, causality is not violated.
Oivil has not propagated any information to Livio at superlumi-
nal velocities. Livio made his own observation on the state of the
system. Notice that the statistics of Livio’s observations are unal-
tered; independent of what Oivil does, he will still see left-handed
photons 50% of the time. If this were not the case, then there
would be a problem, since Oivil could exploit this to propagate a
message to Livio long after the photons are emitted.
However, people widely interpret (and write flashy headlines about) this
sort of effect as a kind of “action at a distance”: By measuring the
state of his photon, Oivil instantly “kicks” Livio’s far off photon
into a particular state (without being usable for the propagation
of information, since Oivil can’t tell Livio about it any faster than
the speed of light). Note that this philosophical dilemma is not
silly: The wave function for Livio’s photon has both left- and right-
handed components; how could a measurement of Oivil’s photon
“pick” which component Livio will see? Because of this, quantum
mechanics is often labelled “non-local”.
On the other hand, this philosophical perspective may be avoided
(ignored): It may be suggested that it doesn’t make sense to talk
this way about the “wave function” of Livio’s photon, since the
specification of the wave function involves also Oivil’s photon.
Oivil is merely making a measurement of the state of the two-
photon system, by observing the polarization of one of the pho-
tons, and knowing the coherence of the system. He doesn’t need
to make two measurements to know both polarizations, they are
completely correlated. Nothing is causing anything else to hap-
pen at faster than light speed. We might take the (determinis-
tic?) point of view that it was already determined at production
which polarization Livio would see for a particular photon – we
just don’t know what it will be unless Oivil makes his measure-
ment. There appears to be no way of falsifying this point of view,
as stated. However, taking this point of view leads to the fur-
ther philosophical question of how the pre-determined information

is encoded – is the photon propagating towards Livio somehow
carrying the information that it is going to be measured as left-
handed? This conclusion seems hard to avoid. It leads to the
notion of “hidden variables”, and there are theories of this sort,
which are testable.
We know that our quantum mechanical foundations are compati-
ble with special relativity, hence with the notion of causality that
implies.
As Feynman remarked several years ago in a seminar I arranged
concerning EPR, the substantive question to be asking is, “Do
the predictions of quantum mechanics agree with experiment?”.
So far the answer is a resounding “yes”. Indeed, we often rely
heavily on this quantum coherence in carrying out other research
activities. Current experiments to measure CP violation in B 0
decays crucially depend on it, for example.
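Returning to the computations in parts (a) and (b), here is a small numerical sketch (my own check; the names PL_oivil and PL_livio are mine, the basis is that of Eqn. 70) verifying the purity of each ensemble and the probability Tr[PL(Livio)ρ] = 1/2:

```python
import numpy as np

LL, LR, RL, RR = np.eye(4)          # basis vectors of Eqn. 70

def proj(v):
    return np.outer(v, v)

rho_mixed = 0.5 * (proj(LL) + proj(RR))      # first apparatus, Eqn. 71
rho_pure = 0.5 * proj(LL + RR)               # second apparatus, Eqn. 74

PL_oivil = np.diag([1., 1., 0., 0.])         # photon 1 left-handed
PL_livio = np.diag([1., 0., 1., 0.])         # photon 2 left-handed

for name, rho in [("mixed", rho_mixed), ("pure", rho_pure)]:
    print(name,
          "Tr(rho^2) =", np.trace(rho @ rho).real,
          "P(Livio sees L) =", np.trace(PL_livio @ rho).real)
    # After Oivil projects onto L, Livio's photon is certainly left-handed:
    collapsed = PL_oivil @ rho @ PL_oivil
    collapsed /= np.trace(collapsed)
    print("   P(Livio sees L | Oivil saw L) =", np.trace(PL_livio @ collapsed).real)
```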

9. Let us consider the application of the density matrix formalism to the


problem of a spin-1/2 particle (such as an electron) in a static external
magnetic field. In general, a particle with spin may carry a magnetic
moment, oriented along the spin direction (by symmetry). For spin-
1/2, we have that the magnetic moment (operator) is thus of the form:
μ = (1/2) γ σ,    (78)
where σ are the Pauli matrices, the 1/2 is by convention, and γ is a
constant, giving the strength of the moment, called the gyromagnetic
ratio. The term in the Hamiltonian for such a magnetic moment in an
external magnetic field, B is just:

H = −μ · B.    (79)

Our spin-1/2 particle may have some spin-orientation, or “polarization


vector”, given by:
σi.
P = hσ (80)
Drawing from our classical intuition, we might expect that in the ex-
ternal magnetic field the polarization vector will exhibit a precession
about the field direction. Let us investigate this.

Recall that the expectation value of an operator may be computed from
the density matrix according to:

hAi = Tr(ρA). (81)

Furthermore, recall that the time evolution of the density matrix is


given by:
i ∂ρ/∂t = [H(t), ρ(t)].    (82)
What is the time evolution, dP/dt, of the polarization vector? Express
What is the time evolution, dP
your answer as simply as you can (more credit will be given for right
answers that are more physically transparent than for right answers
which are not). Note that we make no assumption concerning the
purity of the state.
Solution: Let us consider the ith-component of the polarization:
i dPi/dt = i d⟨σi⟩/dt    (83)
         = i (∂/∂t) Tr(ρσi)    (84)
         = i Tr( (∂ρ/∂t) σi )    (85)
         = Tr( [H, ρ] σi )    (86)
         = Tr( [σi, H] ρ )    (87)
         = −(1/2) γ Σ(j=1 to 3) Bj Tr( [σi, σj] ρ ).    (88)

To proceed further, we need the density matrix for a state with polar-
ization P. Since ρ is hermitian, it must be of the form:

ρ = a(1 + b · σ).    (89)

But its trace must be one, so a = 1/2. Finally, to get the right polar-
ization vector, we must have b = P.
Thus, we have

i dPi/dt = −(1/4) γ Σ(j=1 to 3) Bj { Tr[σi, σj] + Σ(k=1 to 3) Pk Tr([σi, σj] σk) }.    (90)

Now [σi, σj] = 2iεijk σk, which is traceless. Further, Tr([σi, σj] σk) =
4iεijk. This gives the result:

dPi/dt = −γ Σ(j=1 to 3) Σ(k=1 to 3) εijk Bj Pk.    (91)

This may be re-expressed in the vector form:


dP/dt = γ P × B.    (92)
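As an illustration (my own numerical sketch, with an arbitrary choice of γ and B), one can integrate Eqn. 92 directly and check that |P| is conserved while P precesses about B with angular frequency γ|B|:

```python
import numpy as np

gamma = 2.0                      # gyromagnetic ratio (arbitrary units)
B = np.array([0.0, 0.0, 1.0])    # field along z
P = np.array([1.0, 0.0, 0.0])    # initial polarization along x

dt, steps = 1e-4, 31416          # integrate about one period, T = 2*pi/(gamma*|B|)
for _ in range(steps):
    # simple midpoint (RK2) step of dP/dt = gamma * P x B
    k1 = gamma * np.cross(P, B)
    k2 = gamma * np.cross(P + 0.5 * dt * k1, B)
    P = P + dt * k2

print("|P| =", np.linalg.norm(P))   # stays ~1
print("P   =", P)                   # back near (1, 0, 0) after one full period
```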

10. Let us consider a system of N spin-1/2 particles (see the previous prob-
lem) per unit volume in thermal equilibrium, in our external magnetic
field B . Recall that the canonical distribution is:
ρ = e^(−H/T) / Z,    (93)

with partition function:

Z = Tr( e^(−H/T) ).    (94)

Such a system of particles will tend to orient along the magnetic field,
resulting in a bulk magnetization (having units of magnetic moment
per unit volume), M .

(a) Give an expression for this magnetization (don’t work too hard to
evaluate).
Solution: Let us orient our coordinate system so that the z-axis
is along the magnetic field direction. Then Mx = 0, My = 0, and:

Mz = (1/2) N γ ⟨σz⟩    (95)
   = (1/(2Z)) N γ Tr[ e^(−H/T) σz ],    (96)
where H = −γBz σz /2.
(b) What is the magnetization in the high-temperature limit, to lowest
non-trivial order (this I want you to evaluate as completely as you
can!)?

Solution: In the high temperature limit, we'll discard terms
of order higher than 1/T in the expansion of the exponential:
e^(−H/T) ≈ 1 − H/T = 1 + γBz σz/2T. Thus,

Mz = (1/(2Z)) N γ Tr[ (1 + γBz σz/2T) σz ]    (97)
   = (1/(2ZT)) N γ² Bz.    (98)
Furthermore,

Z = Tr e^(−H/T)    (99)
  = 2 + O(1/T²).    (100)

And we have the result:

Mz = N γ² Bz / 4T.    (101)

This is referred to as the “Curie Law” (for magnetization of a


system of spin-1/2 particles).
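A small numerical sketch (mine, with arbitrary values of γ, Bz and N) comparing the exact magnetization, evaluated from the trace formulas of Eqns. 93–96, with this high-temperature limit:

```python
import numpy as np

gamma, Bz, N = 1.0, 0.5, 1.0
sz = np.diag([1.0, -1.0])
H = -0.5 * gamma * Bz * sz

for T in [10.0, 5.0, 2.0, 1.0]:
    W = np.diag(np.exp(-np.diag(H) / T))        # e^{-H/T} (H is diagonal here)
    Z = np.trace(W)
    Mz_exact = 0.5 * N * gamma * np.trace(W @ sz) / Z    # Eqn. 96
    Mz_curie = N * gamma**2 * Bz / (4 * T)               # Eqn. 101
    print(f"T={T:4.1f}  exact={Mz_exact:.5f}  Curie={Mz_curie:.5f}")
# The two agree at high T and deviate as T decreases.
```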

Physics 195a
Course Notes
The K 0 : An Interesting Example of a “Two-State” System
021029 F. Porter

1 Introduction
An example of a two-state system is considered. Interesting complexities
involve:
1. Treatment of a decaying particle.
2. Superposition of states with different masses.

2 The K 0 Meson and its Anti-particle


The K0 meson is a pseudoscalar state consisting (for present purposes) of a
d quark and an s̄ antiquark:

|K0⟩ = |ds̄⟩.    (1)

Its antiparticle is the K̄0:

|K̄0⟩ = |d̄s⟩.    (2)

If we define a "strangeness" operator, S (which counts strange quarks),
these states are eigenstates, with:

S|K0⟩ = |K0⟩,    (3)
S|K̄0⟩ = −|K̄0⟩.    (4)

We may write S as the two-by-two matrix [ 1 0 ; 0 −1 ] in the |K0⟩, |K̄0⟩ basis,
but a convenient "basis-independent" form is:

S = |K0⟩⟨K0| − |K̄0⟩⟨K̄0|.    (5)

These are not eigenstates of C, the charge conjugation operator (which
changes particles to antiparticles). It is convenient to pick the antiparticle
phases such that:

C|K0⟩ = −|K̄0⟩,    (6)
C|K̄0⟩ = −|K0⟩.    (7)

If we multiply the C operator by the parity operator P, we have:

CP|K0⟩ = |K̄0⟩,    (8)
CP|K̄0⟩ = |K0⟩.    (9)

We thus have, in the |K0⟩, |K̄0⟩ basis:

CP = [ 0 1 ; 1 0 ],    (10)

which we may also express in the basis-independent form:

CP = |K0⟩⟨K̄0| + |K̄0⟩⟨K0|.    (11)

The eigenstates of CP are (the choice of nomenclature will shortly be moti-
vated):

|KS0⟩ = (1/√2) ( |K0⟩ + |K̄0⟩ ),  with CP = +1,    (12)
|KL0⟩ = (1/√2) ( |K0⟩ − |K̄0⟩ ),  with CP = −1.    (13)

We remark that the K0 (or K̄0) is the lowest-mass particle containing
the strange quark. Thus, the only permitted decays must be via the weak
interaction. To a good approximation (but not exactly!), CP is conserved
in the weak interaction (and even more so in the strong and electromagnetic
interactions); we shall assume this here.
A neutral K meson (K0 or K̄0) is observed to decay sometimes to two
pions and sometimes to three pions. For example, consider the observed
process K 0 → π 0 π 0 . Since all of the particles in this decay are spinless, the
decay must proceed with zero orbital angular momentum (“S-wave” decay).
Thus, the parity of the π 0 π 0 system in the final state must be positive. But
we said that the K 0 is a pseudoscalar particle, i.e., has negative parity.
Thus, this is a parity-violating decay. The weak interaction is known to
violate parity (i.e., parity is not conserved in the weak interaction), so this
is all right. The π0 is its own anti-particle, hence the |π0π0⟩ final state is an
eigenstate of C with eigenvalue +1. Thus, the |π0π0⟩ final state is also an
eigenstate of CP with eigenvalue +1.
Under our approximation that CP is conserved in the weak interaction,
we therefore conclude that the observation of a K0 → π0π0 decay projects
out the KS0 component of the K0 meson (likewise for the K̄0). The 2π decay
mode is favored by phase space over decays to greater numbers of pions.
However, the KL0 → 2π decay is forbidden by CP conservation. Hence, the
KL0 → 3π decay is important for KL0 . Because the phase space is considerably
suppressed, the KL0 decay rate is much slower than the KS0 rate. The observed
lifetimes of the KS0 and KL0 are, respectively:

τS = 9 × 10−11 s (90 ps), (14)


τL = 5 × 10−8 s (50 ns). (15)

3 Time Evolution of a Kaon State


Suppose that at time t = 0 we have the state

ψ(0) = |KS0⟩.    (16)

How does this state evolve in time? We should have, at time t,

ψ(t) = e^(−iHt) |KS0⟩.    (17)

For a free particle, the energy is ωS = √(p² + mS²), where mS is the mass of
the KS0. But if we just use this for H, we won't have a particle which decays
in time. We know that, if we start with a particle at t = 0, the probability
to find it undecayed at a later time t, if it has a lifetime τS = 1/ΓS, is:

P(t) = e^(−ΓS t).    (18)

Thus, the amplitude should have an exp(−ΓS t/2) time dependence, in addition
to the phase variation:

ψ(t) = e^(−iωS t − ΓS t/2) |KS0⟩.    (19)

Letting ωL = √(p² + mL²), where mL is the mass of the KL0, and ΓL = 1/τL,
we similarly have for an initial KL0 state (ψ(0) = |KL0⟩):

ψ(t) = e^(−iωL t − ΓL t/2) |KL0⟩.    (20)

In the |KS0⟩, |KL0⟩ basis, the Hamiltonian operator is:

H = [ ωS − iΓS/2 , 0 ; 0 , ωL − iΓL/2 ].    (21)

This requires some further discussion. For example, how did I know that
H is diagonal in this basis (and not, perhaps, in the |K0⟩, |K̄0⟩ basis)? The
answer is that we are assuming that CP is conserved. Hence, [H, CP] = 0.
The Hamiltonian cannot mix states of differing CP quantum numbers, so
there are no off-diagonal terms in H in the |KS0⟩, |KL0⟩ basis. The second
point is that we have allowed the possibility that the masses of the two
CP eigenstates are not the same (having already noted that the lifetimes
are different). This might be a bit worrisome – the C operation does not
change mass.¹ However, the |KS0⟩ and |KL0⟩ are not antiparticles of one
another, so there is no constraint that their masses must be equal. So, we
allow the possibility that they may be different. We will address shortly the
measurement of the mass difference.
Now suppose that at time t = 0 we have a pure K̄0 state:

ψ(0) = |K̄0⟩.    (22)

Experimentally, this is a reasonable proposition, since we may produce such
states via the strong interaction. For example, if we collide two particles with
no initial strangeness (perhaps a proton and an anti-proton), we make strange
particles in "associated production", i.e., in the production of ss̄ pairs. Thus,
we might have the reaction p̄p → nΛ̄K̄0 (see Fig. 1). The presence of the Λ̄,
which contains the s̄ quark, tells us that the kaon produced is a K̄0, since it
contains the s quark.

So, we can realistically imagine producing a K̄0 at t = 0. But the time-
evolution to later times is governed by the Hamiltonian, which is not diagonal
in the |K0⟩, |K̄0⟩ basis. Thus, we might expect that at some later time we
may observe a K0. What is the probability, PK0(t), that a K0 meson is
observed at time t, given a pure K̄0 state at t = 0? The answer, noting that
ψ(0) = |K̄0⟩ = (|KS0⟩ − |KL0⟩)/√2, is:

PK0(t) = |⟨K0|ψ(t)⟩|²    (23)
       = (1/2) |⟨K0|KS0⟩ e^(−iωS t − ΓS t/2) − ⟨K0|KL0⟩ e^(−iωL t − ΓL t/2)|²
       = (1/4) [ e^(−ΓS t) + e^(−ΓL t) − 2 e^(−(ΓS+ΓL)t/2) cos((ωS − ωL) t) ].    (24)
¹Actually, this is only an assumption here. But it is a fundamental theorem in relativistic
quantum mechanics that particle and anti-particle have the same mass (as well as the same
total lifetime).

[Figure 1 (quark-flow diagram) omitted in this transcription.]

Figure 1: A possible reaction to produce a K̄0 meson. The lines indicate
flow of quark flavors from left to right. No interactions are shown. Note that
the production of the antibaryon tells us that it is a K̄0, not a K0.

By measuring the frequency of the oscillation in the last term, we may


measure the mass difference between the KS0 and the KL0 . When the mo-
mentum is small, ωS − ωL ≈ mS − mL . Because this difference is very small,
it is experimentally intractable to attempt this with direct kinematic mea-
surements. Measurements of the oscillation frequency yield a mass difference
of

|mS − mL| = 0.5 × 10¹⁰ s⁻¹    (25)
          = (0.5 × 10¹⁰ s⁻¹ / 3 × 10²³ fm/s) × 200 × 10⁶ eV·fm
          = 3 µeV,    (26)

a difference comparable to the energy of a microwave photon. Since the mass


of the kaon is approximately 500 MeV, this is a fractional difference of order
one part in 1014 !
We remark that this example shows that sometimes, even in non-relativistic
quantum mechanics, the rest mass term in the energy must be included. This
is because we may have a superposition of states with different masses, and
the time evolution of the components is correspondingly different, such that
there is a time-dependent interference.

[Figure 2 (plot of PK0(t) versus t/τS) omitted in this transcription.]

Figure 2: Upper curve: the K̄0 → K0 oscillation probability as a function of
time (in units of τS). Lower curve: the oscillation probability if mS = mL.
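A short numerical sketch (mine) reproducing the upper curve of Fig. 2 from Eqn. 24, using the lifetimes of Eqns. 14–15 and the mass difference of Eqn. 25:

```python
import numpy as np

tau_S, tau_L = 0.9e-10, 0.5e-7          # seconds, from Eqns. 14-15
Gamma_S, Gamma_L = 1.0 / tau_S, 1.0 / tau_L
dm = 0.5e10                             # |m_S - m_L| in s^-1, Eqn. 25

def P_K0(t):
    # Probability to observe a K0 at time t given a pure K0bar at t = 0 (Eqn. 24)
    return 0.25 * (np.exp(-Gamma_S * t) + np.exp(-Gamma_L * t)
                   - 2.0 * np.exp(-0.5 * (Gamma_S + Gamma_L) * t) * np.cos(dm * t))

for x in [0.0, 2.0, 5.0, 10.0, 20.0]:   # time in units of tau_S
    print(f"t = {x:5.1f} tau_S   P_K0 = {P_K0(x * tau_S):.4f}")
# The probability starts at 0, oscillates, and levels off near 0.25 for tau_S << t << tau_L.
```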

4 Exercises
1. Find the neutral kaon Hamiltonian in the |K0⟩, |K̄0⟩ basis. Is the
symmetry of your result consistent with the notion that the masses of
particles and antiparticles are the same? Same question for the decay
rates?

2. Repeat the derivation of Eqn. 24, but work in the density matrix for-
malism. We did not consider the possibility of decay when we developed
this formalism, so be careful – you may find that you need to modify
some of our discussion.

3. In this note, we discussed the neutral kaon (K) meson, in particular the
phenomenon of K 0 − K̄ 0 mixing. Let us think about this system a bit
further. The K 0 and K̄ 0 mesons interact in matter, dominantly via the
strong interaction. Approximately, the cross section for an interaction
with a deuteron is:

σ(K 0 d) = 36 millibarns (27)


σ(K̄ 0 d) = 59 millibarns, (28)

at a kaon momentum of, say, 1.5 GeV. Note that a “barn” is a unit of
area equal to 10−24 cm2 .

(a) Consider a beam of kaons (momentum 1.5 GeV) incident on a


target of liquid deuterium. Let λ be the K 0 “interaction length”,
i.e., the average distance that a K 0 will travel in the deuterium
before it interacts according to the above cross section. Similarly,
let λ̄ be the K̄ 0 interaction length. To a good enough approxima-
tion for our purposes, you may treat the deuterium as a collection
of deuterons (why?). The density of liquid deuterium is approxi-
mately ρ = 0.17 g/cm3 . What are λ and λ̄, in centimeters?
(b) Suppose we have prepared a beam of KL0 mesons, e.g., by first
creating a K 0 beam and waiting long enough for the KS0 compo-
nent to decay away. If we let this KL0 beam traverse a deuterium
target, the K 0 and K̄ 0 components will interact differently, and
we may end up with some KS0 mesons exiting the target. Let us
make an estimate for the size of this effect.
Since the kaon is relativistic, we need to be a little careful com-
pared with our discussion in the note: In the KL0 rest frame, the
amplitude depends on time t∗ according to:

exp(−imL t∗ − ΓL t∗ /2), (29)


where ΓL = 1/τL is the KL0 decay rate. In the laboratory frame,
where the kaon is moving with speed v, and γ = 1/√(1 − v²), t* →
t/γ, where t is the time as measured in the laboratory frame.


In the lab frame, we have t/γ = x/γv, and we may write the
amplitude for the KL0 as:

exp(−imL x/γv − ΓL x/2γv), (30)


Let us consider a deuterium target, of thickness w, along the beam
direction. At a distance x into the target, an interaction may
occur, resulting in a final state:
(1/√2) ( f |K0⟩ − f̄ |K̄0⟩ ),    (31)

where, for example, the amplitude f for the K0 component tra-
versing distance dx is just:

f = e^(−dx/2λ) ≈ 1 − dx/(2λ).    (32)

Put all this together and find an expression for the probability to
observe a KS0 to emerge from the deuterium, for a KL0 incident.
Assume that w ≪ λ. You may wish to use ∆m ≡ mL − mS,
ΓS,L ≡ 1/τS,L, and ∆Γ ≡ ΓL − ΓS ≈ −ΓS.
(c) Suppose w = 10 cm and γv = 3. What is the probability to
observe a KS0 emerging from the target? What is the probability
to observe a KL0 ? You may use:

ΓS = 1.1 × 1010 s−1 , (33)


∆m = 0.5 × 1010 s−1 . (34)

You have been investigating a phenomenon often called “regeneration”


– by passing through material, a KS0 component to the beam has been
“regenerated”. A similar consideration has been proposed to help ex-
plain the “solar neutrino problem”.

Physics 195a
Course Notes
The K 0 : An Interesting Example of a “Two-State” System: Solutions to Exercises
021114 F. Porter

1 Exercises
1. Find the neutral kaon Hamiltonian in the |K0⟩, |K̄0⟩ basis. Is the
symmetry of your result consistent with the notion that the masses of
particles and antiparticles are the same? Same question for the decay
rates?
Solution: In the |KS0⟩, |KL0⟩ basis, the Hamiltonian operator is:

HSL = [ ωS − iΓS/2 , 0 ; 0 , ωL − iΓL/2 ].    (1)
We wish to make a basis change to the |K0⟩, |K̄0⟩ basis:

HKK = R⁻¹ HSL R,    (2)

where R is the matrix which transforms a vector in the |K0⟩, |K̄0⟩ basis
to one in the |KS0⟩, |KL0⟩ basis.
The two bases are related by:
|KS0⟩ = (1/√2) ( |K0⟩ + |K̄0⟩ )    (3)
|KL0⟩ = (1/√2) ( |K0⟩ − |K̄0⟩ )    (4)

The matrix which transforms a vector in the |K0⟩, |K̄0⟩ basis to one in
the |KS0⟩, |KL0⟩ basis is thus:

R = (1/√2) [ 1 1 ; 1 −1 ].    (5)
[Aside: this is a Hadamard matrix of order two. Roughly, a Hadamard
matrix is a square matrix in which each element is 1 or −1, and any two
distinct rows (or columns) agree in half of their elements and are opposite
in the other half.]

Thus,

HKK = (1/√2) [ 1 1 ; 1 −1 ] [ ωS − iΓS/2 , 0 ; 0 , ωL − iΓL/2 ] (1/√2) [ 1 1 ; 1 −1 ]
    = (1/2) [ ωS + ωL − i(ΓS + ΓL)/2 , ωS − ωL − i(ΓS − ΓL)/2 ;
              ωS − ωL − i(ΓS − ΓL)/2 , ωS + ωL − i(ΓS + ΓL)/2 ]
    = [ M − iΓ/2 , (∆M − i∆Γ/2)/2 ; (∆M − i∆Γ/2)/2 , M − iΓ/2 ],    (6)

where M ≡ (ωS + ωL)/2, Γ ≡ (ΓS + ΓL)/2, ∆M ≡ ωS − ωL, and
∆Γ ≡ ΓS − ΓL.

2. Repeat the derivation of Eqn. 24, but work in the density matrix for-
malism. We did not consider the possibility of decay when we developed
this formalism, so be careful – you may find that you need to modify
some of our discussion.
Solution: In the density matrix course note, we had the time evolution
of the density matrix given by:
dρ(t)/dt = −i[H, ρ].    (7)
If we blindly use this equation in the present problem, we get a nonsen-
sical result. The problem is that the K0's are decaying, so the density
matrix no longer corresponds to "conserved" probability content. Thus,
we need to work out the time dependence anew. We repeat the steps
in the density matrix note, but now without assuming H = H†:

d/dt ρ(t) = d/dt [ |ψ(t)⟩⟨ψ(t)| ] = [ d/dt |ψ(t)⟩ ] ⟨ψ(t)| + |ψ(t)⟩ [ d/dt |ψ(t)⟩ ]†
          = [ (1/i) H|ψ(t)⟩ ] ⟨ψ(t)| + |ψ(t)⟩ [ (1/i) H|ψ(t)⟩ ]†
          = (1/i) [ H|ψ(t)⟩⟨ψ(t)| − |ψ(t)⟩⟨ψ(t)| H† ]
          = −i [ Hρ(t) − ρ(t) H† ].    (8)

We are given that at time t = 0:


 
ψ(0) = |K̄0⟩ = (1/√2) (1, −1)ᵀ,    (9)

where the latter representation is in the KS0, KL0 basis. We'll work in
this basis here, though that isn’t required – we could use the result of
0
exercise 1 and work in the |K 0 , |K  basis just as well.
We may write the time-dependent density matrix in the form:
   
ρ(t) = |ψ(t)⟩⟨ψ(t)| = ( α(t) ; β(t) ) ( α*(t)  β*(t) ) = [ |α|² , αβ* ; α*β , |β|² ].    (10)

Let's work out the time dependence, knowing that the Hamiltonian is

HSL = [ ωS − iΓS/2 , 0 ; 0 , ωL − iΓL/2 ] ≡ [ S 0 ; 0 L ].    (11)
We thus have:
     ∗ 
d/dt ρ(t) = (1/i) { [ S 0 ; 0 L ] [ |α|² , αβ* ; α*β , |β|² ] − [ |α|² , αβ* ; α*β , |β|² ] [ S* 0 ; 0 L* ] }
          = (1/i) [ (S − S*)|α|² , (S − L*)αβ* ; (L − S*)α*β , (L − L*)|β|² ]
          = [ −ΓS|α|² , −i(S − L*)αβ* ; −i(L − S*)α*β , −ΓL|β|² ].    (12)
We may integrate this equation to find:
ρ(t) = [ |α(0)|² e^(−ΓS t) , α(0)β*(0) e^(−i(S−L*)t) ; α*(0)β(0) e^(−i(L−S*)t) , |β(0)|² e^(−ΓL t) ].    (13)

Our initial condition is that α(0) = 1/√2 and β(0) = −1/√2. Thus,

ρ(t) = (1/2) [ e^(−ΓS t) , −e^(−i(S−L*)t) ; −e^(−i(L−S*)t) , e^(−ΓL t) ].    (14)

We wish to know the probability to find a K0 at time t:

PK0(t) = Tr [ ρ(t) |K0⟩⟨K0| ]
       = Tr { (1/2) [ e^(−ΓS t) , −e^(−i(S−L*)t) ; −e^(−i(L−S*)t) , e^(−ΓL t) ] (1/2) [ 1 1 ; 1 1 ] }
       = (1/4) [ e^(−ΓS t) + e^(−ΓL t) − e^(−i(S−L*)t) − e^(−i(L−S*)t) ]
       = (1/4) [ e^(−ΓS t) + e^(−ΓL t) − 2 e^(−(ΓS+ΓL)t/2) cos((ωS − ωL) t) ].    (15)

3. In this note, we discussed the neutral kaon (K) meson, in particular the
phenomenon of K 0 − K̄ 0 mixing. Let us think about this system a bit
further. The K 0 and K̄ 0 mesons interact in matter, dominantly via the
strong interaction. Approximately, the cross section for an interaction
with a deuteron is:

σ(K 0 d) = 36 millibarns (16)


σ(K̄ 0 d) = 59 millibarns, (17)

at a kaon momentum of, say, 1.5 GeV. Note that a “barn” is a unit of
area equal to 10−24 cm2 .

(a) Consider a beam of kaons (momentum 1.5 GeV) incident on a


target of liquid deuterium. Let λ be the K 0 “interaction length”,
i.e., the average distance that a K 0 will travel in the deuterium
before it interacts according to the above cross section. Similarly,
let λ̄ be the K̄ 0 interaction length. To a good enough approxima-
tion for our purposes, you may treat the deuterium as a collection
of deuterons (why?). The density of liquid deuterium is approxi-
mately ρ = 0.17 g/cm3 . What are λ and λ̄, in centimeters?
Solution: The mean distance to interact is given by the inverse
of the effective size (cross section) presented by a scattering cen-
ter, divided by the number density of scattering centers in the
material:
λ = 1/(σ ρ#).    (18)

The mass of a deuterium atom is md ∼ 1876 MeV, hence, the
number density of scattering centers in liquid deuterium is

ρ# = ρ/md = (0.17 g/cm³) / (1876 MeV × 1.783 × 10⁻²⁷ g/MeV) = 5.08 × 10²² cm⁻³.    (19)
Thus,
λ = 1 / (36 mb × 10⁻²⁷ cm²/mb × 5.08 × 10²² cm⁻³) = 550 cm,    (20)
λ̄ = 330 cm.    (21)

(b) Suppose we have prepared a beam of KL0 mesons, e.g., by first
creating a K 0 beam and waiting long enough for the KS0 compo-
nent to decay away. If we let this KL0 beam traverse a deuterium
target, the K 0 and K̄ 0 components will interact differently, and
we may end up with some KS0 mesons exiting the target. Let us
make an estimate for the size of this effect.
Since the kaon is relativistic, we need to be a little careful com-
pared with our discussion in the note: In the KL0 rest frame, the
amplitude depends on time t∗ according to:

exp(−imL t∗ − ΓL t∗ /2), (22)


where ΓL = 1/τL is the KL0 decay rate. In the laboratory frame,
where the kaon is moving with speed v, and γ = 1/√(1 − v²), t* →
t/γ, where t is the time as measured in the laboratory frame.


In the lab frame, we have t/γ = x/γv, and we may write the
amplitude for the KL0 as:

exp(−imL x/γv − ΓL x/2γv), (23)


Let us consider a deuterium target, of thickness w, along the beam
direction. At a distance x into the target, an interaction may
occur, resulting in a final state:
(1/√2) ( f |K0⟩ − f̄ |K̄0⟩ ),    (24)

where, for example, the amplitude f for the K0 component tra-
versing distance dx is just:

f = e^(−dx/2λ) ≈ 1 − dx/(2λ).    (25)
Put all this together and find an expression for the probability to
observe a KS0 to emerge from the deuterium, for a KL0 incident.
Assume that w ≪ λ. You may wish to use ∆m ≡ mL − mS,
ΓS,L ≡ 1/τS,L, and ∆Γ ≡ ΓL − ΓS ≈ −ΓS.
Solution: We’ll use distance x as surrogate for time t, with x =
t = 0 at the entrance to the deuterium target. We’ll work in the

|KS0⟩, |KL0⟩ basis here. There are two pieces to the Hamiltonian
to worry about now.
First, there is the weak interaction piece we have already been
discussing:

HW = [ S 0 ; 0 L ],    (26)

where

S ≡ (mS − iΓS/2)/(γv)    (27)
L ≡ (mL − iΓL/2)/(γv).    (28)

Note that we have defined things so that the Hamiltonian is now
the operator i d/dx, i.e., i dψ/dx = Hψ.
The other piece of the Hamiltonian, Hd describes the strong in-
teraction with the deuterium. It “takes away” bits of our wave
function, at different rates for the K0 and K̄0 components. Sup-
pose, for example, we start with a pure KL0 state, and let it traverse
a small distance dx in the deuterium. The state becomes modified
according to:

|ψ(dx)⟩ = (1/√2) ( f |K0⟩ − f̄ |K̄0⟩ )
        = (1/√2) [ (1 − dx/2λ) |K0⟩ − (1 − dx/2λ̄) |K̄0⟩ ]    (29)

Hence,
 
dψ/dx = −(1/(2√2)) [ (1/λ) |K0⟩ − (1/λ̄) |K̄0⟩ ]
      = −(1/4) [ (1/λ + 1/λ̄) |KL0⟩ + (1/λ − 1/λ̄) |KS0⟩ ]
      = −a|KL0⟩ + b|KS0⟩,    (30)

where
 
a ≡ (1/4) (1/λ + 1/λ̄)    (31)
b ≡ (1/4) (1/λ̄ − 1/λ)    (32)

For an initial KS0 we similarly obtain:

dψ/dx = −a|KS0⟩ + b|KL0⟩.    (33)
Hence, the Hamiltonian driving this transformation is:

Hd = i d/dx = [ −ia , ib ; ib , −ia ].    (34)
Note that the diagonal elements are pure imaginary, similar to the
decay terms in HW . This reflects the fact that kaons are actually
being removed in the scattering. Let’s re-express the Hamiltonian
in the form H = H0 + H1 , where
    
H0 = [ S′ 0 ; 0 L′ ],   H1 = [ 0 ib ; ib 0 ],    (35)

where

S′ = (mS − iΓS′/2)/γv,    (36)
L′ = (mL − iΓL′/2)/γv,    (37)
ΓS′ = ΓS + 2γva,    (38)
ΓL′ = ΓL + 2γva.    (39)
Thus, we have absorbed the a terms into effective decay rates for
the kaons.
The Schrödinger equation we wish to solve is:
i dψ(x)/dx = (H0 + H1) ψ(x).    (40)

Explicitly, if the KS0 component is α and the KL0 component is β:

i d/dx ( α(x) ; β(x) ) = ( S′α + ibβ ; L′β + ibα ).    (41)
Let’s see if we can solve this pair of coupled equations.
The form of these equations suggests we consider a solution of the
form:

α(x) = f(x) e^(−iS′x)    (42)
β(x) = g(x) e^(−iL′x).    (43)

We substitute these back into the differential equations, and find
that we may eliminate g(x) and obtain a homogeneous equation
for f (x):
f″ − i(S′ − L′) f′ − b² f = 0.    (44)

The solution which satisfies the boundary condition f(0) = 0 is:

f(x) = A e^(−i(L′−S′)x/2) { exp[(i/2)√((S′−L′)² − 4b²) x] − exp[−(i/2)√((S′−L′)² − 4b²) x] }.    (45)
Thus,
     
α(x) = A e^(−i(L′+S′)x/2) { exp[(i/2)√((S′−L′)² − 4b²) x] − exp[−(i/2)√((S′−L′)² − 4b²) x] }.    (46)
We can substitute this back into the equation for β:
 
β(x) = (1/b) ( dα/dx + iS′α )    (47)
     = (iA/2b) e^(−i(L′+S′)x/2) [ (S′ − L′)(e^(iRx/2) − e^(−iRx/2)) + R(e^(iRx/2) + e^(−iRx/2)) ],

where

R ≡ √((S′ − L′)² − 4b²).    (48)
The boundary condition is that
1 = β(0) = (iA/b) R,    (49)
hence
A = −ib/R. (50)
The probability of seeing a KS0 emerging from the deuterium is
thus

PKS0(w) = |α(w)|²
        = (b²/|R|²) e^(w Im(L′+S′)) | e^(iRw/2) − e^(−iRw/2) |²
        = (b²/|R|²) e^(w Im(L′+S′)) [ e^(−w Im R) + e^(w Im R) − 2 cos(w Re R) ].    (51)

Let's rewrite R in terms of the physical inputs:

R = √( (S′ − L′)² − 4b² )
  = √( [(mS − iΓS′/2) − (mL − iΓL′/2)]²/(γv)² − 4b² )
  = √( (∆m − i∆Γ/2)²/(γv)² − 4b² ).    (52)

Hence,

|R|² = | (∆m − i∆Γ/2)²/(γv)² − 4b² |
     = | (∆m)² − (∆Γ/2)² − (2bγv)² − i∆m∆Γ | / (γv)²
     = √( [(∆m)² + (∆Γ/2)²]² − 2(2γvb)² [(∆m)² − (∆Γ/2)²] + (2γvb)⁴ ) / (γv)²
     ≈ [ (∆m)² + (∆Γ/2)² ] / (γv)²,    (53)

where the approximation is for (γvb)2 small. Note that we have


not formerly made any such approximations, so our result may
be applied in situations where this is not valid. However, it is
valid here, and we now proceed to use the approximation that b
is small. We note that this corresponds to assuming that there is
at most one interaction in the target.
In this approximation, R ≈ (∆m − i∆Γ/2)/(γv). We finally have:
 
PKS0(w) ≈ [ b²(γv)² / ((∆m)² + (∆Γ/2)²) ] e^(−2aw) [ e^(−ΓL w/γv) + e^(−ΓS w/γv) − 2 e^(−(ΓL+ΓS)w/2γv) cos(∆m w/γv) ].    (54)
Likewise, the probability to observe a KL0 emerging from the target
is:

PKL0(w) = |β(w)|²
        = e^(−2aw) e^(−ΓL w/γv).    (55)

(c) Suppose w = 10 cm and γv = 3. What is the probability to


observe a KS0 emerging from the target? What is the probability
to observe a KL0 ? You may use:

ΓS = 1.1 × 1010 s−1 , (56)


∆m = 0.5 × 1010 s−1 . (57)

Solution: I get:

PKS0 (w) ≈ 4.9 × 10−6 (58)


PKL0 (w) ≈ 0.974. (59)

You have been investigating a phenomenon often called “regeneration”


– by passing through material, a KS0 component to the beam has been
“regenerated”. A similar consideration has been proposed to help ex-
plain the “solar neutrino problem”.
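As a cross-check of the numbers quoted in part (c), here is a small numerical sketch of my own (factors of c are inserted to convert the centimeter-based quantities to the s⁻¹ units of ΓS,L and ∆m; small differences from Eqns. 58–59 come from rounding λ and λ̄):

```python
import numpy as np

c = 3.0e10                                   # cm/s
sigma, sigma_bar = 36e-27, 59e-27            # cm^2
n = 0.17 / (1876.0 * 1.783e-27)              # deuterons per cm^3
lam, lam_bar = 1.0 / (sigma * n), 1.0 / (sigma_bar * n)   # ~550 cm, ~330 cm

a = 0.25 * (1.0 / lam + 1.0 / lam_bar)       # cm^-1
b = 0.25 * (1.0 / lam_bar - 1.0 / lam)       # cm^-1

Gamma_S, Gamma_L = 1.1e10, 1.0 / 5e-8        # s^-1
dm = 0.5e10                                  # s^-1
dG = Gamma_L - Gamma_S
w, gv = 10.0, 3.0                            # target thickness (cm), gamma*v

tw = w / (gv * c)                            # time corresponding to w/(gamma v)
pref = (b * c * gv) ** 2 / (dm ** 2 + (dG / 2) ** 2)
bracket = (np.exp(-Gamma_L * tw) + np.exp(-Gamma_S * tw)
           - 2.0 * np.exp(-0.5 * (Gamma_S + Gamma_L) * tw) * np.cos(dm * tw))
P_KS = pref * np.exp(-2 * a * w) * bracket                 # Eqn. 54
P_KL = np.exp(-2 * a * w) * np.exp(-Gamma_L * tw)          # Eqn. 55
print(P_KS, P_KL)   # ~4.5e-6 and ~0.97, in line with Eqns. 58-59
```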

Physics 195a
Course Notes
The Simple Harmonic Oscillator: Creation and Destruction Operators
021105 F. Porter

1 Introduction
The harmonic oscillator is an important model system pervading many areas
in classical physics; it is likewise ubiquitous in quantum mechanics. The non-
relativistic Schrödinger equation with a harmonic oscillator potential is readily
solved with standard analytic methods, whether in one or three dimensions.
However, we will take a different tack here, and address the one-dimension
problem more as an excuse to introduce the notion of “creation” and “an-
nihilation” operators, or “step-up” and “step-down” operators. This is an
example of a type of operation which will repeat itself in many contexts,
including the theory of angular momentum.

2 Harmonic Oscillator in One Dimension


Consider the Hamiltonian:
H = p²/2m + (1/2) mω² x².    (1)
This is the Hamiltonian for a particle of mass m in a harmonic oscillator
potential with spring constant k = mω 2 , where ω is the “classical frequency”
of the oscillator. We wish to find the eigenstates and eigenvalues of this
Hamiltonian, that is, we wish to solve the Schrödinger equation for this
system. We’ll use an approach due to Dirac.
Define the operators:

a ≡ √(mω/2) x + i p/√(2mω)    (2)
a† ≡ √(mω/2) x − i p/√(2mω).    (3)
Note that a† is the Hermitian adjoint of a. We can invert these equations to
obtain the x and p operators:
x = (1/√(2mω)) (a + a†),    (4)
p = −i √(mω/2) (a − a†).    (5)
Further, define

N ≡ a†a    (6)
  = (mω/2) x² + (1/2mω) p² − 1/2,    (7)

where we have made use of the commutator [x, p] = i. Thus, we may rewrite
the Hamiltonian in the form:

H = ω (N + 1/2),    (8)
and our problem is equivalent to finding the eigenvectors and eigenvalues of
N.

3 Algebraic Determination of the Spectrum


Let |n⟩ denote an eigenvector of N, with eigenvalue n:

N|n⟩ = n|n⟩.    (9)

Then |n⟩ is also an eigenstate of H with eigenvalue (n + 1/2)ω. Since N is
Hermitian,

N† = (a†a)† = a†(a†)† = a†a = N,    (10)

its eigenvalues are real. Also n ≥ 0, since

n = ⟨n|N|n⟩ = ⟨n|a†a|n⟩ = ⟨an|an⟩ ≥ 0.    (11)

Now consider the commutator:


  
[a, a†] = [ √(mω/2) x + i p/√(2mω) , √(mω/2) x − i p/√(2mω) ]
        = (i/2) { −[x, p] + [p, x] } = 1.    (12)
Next, notice that

Na = a† aa = aa† a − a = a(N − 1), (13)


Na† = a† aa† = a† a† a + a† = a† (N + 1). (14)

Thus,

Na|n⟩ = a(N − 1)|n⟩ = (n − 1) a|n⟩.    (15)
We see that a|n⟩ is also an eigenvector of N, with eigenvalue n − 1. Assuming
|n⟩ is normalized, ⟨n|n⟩ = 1, then

⟨an|an⟩ = ⟨n|N|n⟩ = n,    (16)

or a|n⟩ = √n |n − 1⟩, where we have also normalized the new eigenvector
|n − 1⟩.
We may continue this process to higher powers of a, e.g.,

a²|n⟩ = a √n |n − 1⟩ = √(n(n − 1)) |n − 2⟩.    (17)

If n is an eigenvalue of N, then so are n − 1, n − 2, n − 3, . . .. But we showed


that all of the eigenvalues of N are ≥ 0, so this sequence cannot go on forever.
In order for it to terminate, we must have n an integer, so that we reach the
value 0 eventually, and conclude with the states

a|1⟩ = √1 |0⟩,    (18)
a|0⟩ = 0.    (19)

Hence, the spectrum of N is {0, 1, 2, 3, . . . , n, ?}.


To investigate further, we similarly consider:

Na†|n⟩ = (n + 1) a†|n⟩.    (20)

Hence a†|n⟩ is also an eigenvector of N with eigenvalue n + 1:

a†|n⟩ = √(n + 1) |n + 1⟩.    (21)

The spectrum of N is thus the set of all non-negative integers. The energy
spectrum is:

En = ω (n + 1/2),   n = 0, 1, 2, . . . .    (22)
We notice two interesting differences between this spectrum and our clas-
sical experience. First, of course, is that the energy levels are quantized.
Second, the ground state energy is E0 = ω/2 > 0. Zero is not an allowed
energy. This lowest energy value is referred to as “zero-point motion”. We
cannot give the particle zero energy and be consistent with the uncertainty
principle, since this would require p = 0 and x = 0, simultaneously.

We may give a "physical" intuition to the operators a and a†: The energy
of the oscillator is quantized, in units of ω, starting at the ground state with
energy ω/2. The a† operator "creates" a quantum of energy when operating on
a state – we call it a "creation operator" (alternatively, a "step-up operator").
Similarly, the a operator "destroys" a quantum of energy – we call it a
“destruction operator” (or “step-down”, or “annihilation operator”). This
idea takes on greater significance when we encounter the subject of “second
quantization” and quantum field theory.

4 The Eigenvectors
We have determined the eigenvalues; let us turn our attention now to the
eigenvectors. Notice that we can start with the ground state |0 and generate
all eigenvectors by repeated application of a† :

|1 = a† |0
(a† )2
|2 = √ |0
2
..
.
(a† )n
|n = √ |0. (23)
n!
So, if we can find the ground state wave function, we have a prescription for
determining any other state.
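A compact numerical illustration (my own sketch) of the algebra just described: build a and a† as matrices in a truncated {|n⟩} basis and confirm that H = ω(a†a + 1/2) has the spectrum of Eqn. 22:

```python
import numpy as np

nmax, omega = 8, 1.0
# a|n> = sqrt(n)|n-1>  ->  matrix elements <n-1|a|n> = sqrt(n)
a = np.diag(np.sqrt(np.arange(1, nmax + 1)), k=1)
adag = a.T

H = omega * (adag @ a + 0.5 * np.eye(nmax + 1))
print(np.linalg.eigvalsh(H))    # [0.5, 1.5, 2.5, ...] * omega
print(a @ adag - adag @ a)      # ~identity, i.e. [a, a^dagger] = 1
                                # (the last diagonal entry is a truncation artifact)
```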
We wish to find the ground state wave function, in the position represen-
tation (for example). If |x⟩ represents the amplitude describing a particle at
position x, then ⟨x|0⟩ = ψ0(x) is the ground state wave function in position
space.¹ Consider:

⟨x|a|0⟩ = 0
        = ⟨x| [ √(mω/2) x + ip/√(2mω) ] |0⟩
        = [ √(mω/2) x + (1/√(2mω)) d/dx ] ⟨x|0⟩.    (24)

¹The notion here is that the wave function of a particle at position x is (proportional
to) a δ-function in position. Hence, ψ0(x) = ⟨x|0⟩ = ∫ δ(x − x′) ψ0(x′) dx′.

Thus, we have the (first order!) differential equation:

dψ0(x)/dx = −mωx ψ0(x),    (25)

with solution:

ln ψ0(x) = −mω x²/2 + constant.    (26)

The constant is determined (up to an arbitrary phase) by normalizing, to
obtain:

ψ0(x) = (mω/π)^(1/4) exp(−(1/2) mωx²).    (27)
This is a Gaussian form. The Fourier transform of a Gaussian is also a
Gaussian, so the momentum space wave function is also of Gaussian form.
In this case, the position-momentum uncertainty relation is an equality,
∆x∆p = 1/2. This is sometimes referred to as a “minimum wave packet”.
Now that we have the ground state, we may follow our plan to obtain the
excited states. For example,
 
⟨x|1⟩ = ⟨x|a†|0⟩ = [ √(mω/2) x − (1/√(2mω)) d/dx ] ⟨x|0⟩.    (28)

Substituting in our above result for ψ0, we thus obtain:

ψ1(x) = √(2mω) x (mω/π)^(1/4) exp(−(1/2) mωx²).    (29)

To express the general eigenvector, it is convenient to define y = √(mω) x,
and

a† = (1/√2)(y − dy),    (30)

where dy is a shorthand notation meaning d/dy. The general eigenfunctions
are clearly all polynomials times e^(−y²/2), since

ψn(y) = ((a†)ⁿ/√(n!)) ψ0(y) = (1/√(2ⁿ n!)) (y − dy)ⁿ (1/π^(1/4)) e^(−y²/2).    (31)

Thus, we may let


ψn(y) = (1/√(2ⁿ n!)) (1/π^(1/4)) Hn(y) e^(−y²/2),    (32)
where Hn is a polynomial, to be determined.
We may derive a recurrence relation which these polynomials satisfy.
First, use
ψn+1(y) = (a†/√(n+1)) ψn(y) = (1/√(2(n+1))) (y − dy) ψn(y),    (33)

to obtain:
Hn+1 (y) = 2yHn(y) − dy Hn (y). (34)
On the other hand,
ψn−1(y) = (a/√n) ψn(y) = (1/√(2n)) (y + dy) ψn(y),    (35)
or,
dy Hn (y) = 2nHn−1 (y). (36)
Combining Eqns. 34 and 36 to eliminate the derivative, we find:
Hn+1 (y) = 2yHn (y) − 2nHn−1(y). (37)
This result is the familiar recurrence relation for the Hermite polynomials
(hence our choice of symbol).
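A minimal sketch (mine) generating the first few Hn from this recurrence with numpy's polynomial tools, and comparing against numpy's built-in (physicists') Hermite polynomials:

```python
import numpy as np
from numpy.polynomial import polynomial as P
from numpy.polynomial.hermite import herm2poly

# H_{n+1}(y) = 2 y H_n(y) - 2 n H_{n-1}(y),   with H_0 = 1, H_1 = 2y
H = [np.array([1.0]), np.array([0.0, 2.0])]      # coefficient arrays, lowest power first
for n in range(1, 6):
    H.append(P.polysub(P.polymul([0.0, 2.0], H[n]), 2.0 * n * H[n - 1]))

for n in range(7):
    ref = herm2poly([0.0] * n + [1.0])           # physicists' H_n coefficients from numpy
    print(n, np.allclose(H[n], ref))             # all True
```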
We could go on to determine other properties of these polynomials, such
as Rodrigues’ formula:
Hn(x) = (−1)ⁿ e^(x²) (dⁿ/dxⁿ) e^(−x²),   n = 0, 1, 2, . . .    (38)
But we’ll conclude here only by noting that the harmonic oscillator wave
functions are also eigenstates of parity:
P ψn(x) = ψn(−x) = (−1)ⁿ ψn(x),    (39)

where P is the parity (space reflection) operator. This fact shouldn't be
surprising, since [H, P ] = 0.
In Fig. 1 we show the first five harmonic oscillator wave functions. It is
worth noting several characteristics of these curves, which are representative
of wave functions for one-dimensional potential problems.
• The ground state wave function has no nodes (at finite y); the first
excited state has one node. In general, the n-th wave function has
n − 1 nodes.

[Figure 1 (plot of ψ0(y), . . . , ψ4(y) and V(y)) omitted in this transcription.]

Figure 1: The first five harmonic oscillator wave functions, ψ0(y), . . . , ψ4(y),
from Eqn. 32. Also shown is the quadratic potential, V(y), for ω = 1/8. The
classical turning point (point of inflection) for ψ4 is projected to the x-axis at
x = 3. The projection is extended to the potential curve, and thence to the
y-axis, where it may be seen that the energy of this state is 9/16 = (4 + 1/2)ω.

• Correlating with the increase in nodes, the higher the excited state, the
greater the spatial frequency of the wave function oscillations. This cor-
responds to higher momenta, as expected from the deBroglie relation.

• Each wave function has a region around y = 0 of oscillatory behav-


ior, in which the curve is concave towards the horizontal axis (the sign
of the second derivative is opposite that of the wave function), and a
region at larger values of |y| of “decay”, in which the curve is convex
towards the horizontal axis. This feature may be understood as follows:
Wherever the total energy, E, is larger than the potential energy, V,
the kinetic energy, T, is positive, and p = √(2mT) is real. This is the
“classically allowed” region, and the wave function really looks like a
wave. On the other hand, wherever E < V , T < 0, and the correspond-
ing “momentum” is imaginary. This is the classically forbidden region.
In this region, the probability to find the particle falls off rapidly. Mea-

surements of the particle’s position and momentum (yielding a real
number) are restricted according to the uncertainty principle such that
no internal contradictions occur.
The point of inflection in the wave function between the classically
allowed and forbidden regions is the “classical turning point” of the
system. This is where the classical momentum is zero, and E = V .

5 Exercises
1. We have noticed some things about the qualitative behavior of wave
functions in our discussion of Fig. 1. Consider the one-dimensional
problem with potential function given by:

V(x) = 0 for |x| ≤ a,   V(x) = V0 for |x| > a,    (40)

where V0 > 0 and a > 0.

(a) Suppose that there are four bound states. Make a qualitative, but
careful, drawing of what you expect the first four wave functions
to look like, in the spirit of Fig. 1.
(b) Make a qualitative drawing for the wave function of a state with
energy above V0 .

2. Let us generalize the discussion of the simple harmonic oscillator to


three dimensions. In this case, the Hamiltonian is:
H = (1/2m)(p1² + p2² + p3²) + (1/2) m (Ωx)²,    (41)
where Ω is a 3 × 3 symmetric real matrix.

(a) Determine the energy spectrum and eigenvectors of this system.


(b) Suppose the potential is spherically symmetric. Using the equiva-
lent one-dimensional potential approach, find the eigenvalues and
eigenvectors of H corresponding to the possible values of orbital
angular momentum.

Physics 195a
Course Notes
The Simple Harmonic Oscillator: Creation and Destruction Operators: Solutions to Exercises
021129 F. Porter

1 Exercises
1. We have noticed some things about the qualitative behavior of wave
functions in our discussion of Fig. 1 of the course note. Consider the one-dimensional
problem with potential function given by:

V(x) = 0 for |x| ≤ a,   V(x) = V0 for |x| > a,    (1)
where V0 > 0 and a > 0.
(a) Suppose that there are four bound states. Make a qualitative, but
careful, drawing of what you expect the first four wave functions
to look like, in the spirit of Fig. 1 of the course note.
Solution:

[Figure 1 (sketch of the well and its first four bound-state wave functions) omitted in this transcription.]

Figure 1: Qualitative wave functions for bound states.

(b) Make a qualitative drawing for the wave function of a state with
energy above V0 .
Solution:

Figure 2: Qualitative wave function for continuum state.

2. Let us generalize the discussion of the simple harmonic oscillator to


three dimensions. In this case, the Hamiltonian is:
H = (1/2m)(p1² + p2² + p3²) + (1/2) m (Ωx)²,    (2)
where Ω is a 3 × 3 symmetric real matrix.
(a) Determine the energy spectrum and eigenvectors of this system.
Solution: The matrix Ω is diagonalizable by an orthogonal ma-
trix (a rotation), so let:
 2 
ω1 0 0
 
RΩRT =  0 ω22 0 , (3)
0 0 ω32
and let y = Rx. The kinetic energy piece is a scalar, indepen-
dent of what coordinate basis we use. Hence, we may rewrite our
Hamiltonian as:
H = (1/2m)(p1² + p2² + p3²) + (1/2) m (ω1²y1² + ω2²y2² + ω3²y3²).    (4)

This is a sum of three independent one-dimensional oscillators, so the energy
spectrum is E(n1, n2, n3) = ω1(n1 + 1/2) + ω2(n2 + 1/2) + ω3(n3 + 1/2), with
eigenvectors given by products of the corresponding one-dimensional eigenstates
in the yi coordinates.

(b) Suppose the potential is spherically symmetric. Using the equiva-
lent one-dimensional potential approach, find the eigenvalues and
eigenvectors of H corresponding to the possible values of orbital
angular momentum.
Solution: The spherically symmetric Hamiltonian may be written in
the form:
H = (1/2m) p² + (1/2) mω² x².    (5)

Letting ψnℓm(x) = (unℓ(r)/r) Yℓm(θ, φ), we have the equivalent one-dimensional
Schrödinger equation:

[ −(1/2m) d²/dr² + (1/2) mω² r² + ℓ(ℓ+1)/(2mr²) ] unℓ(r) = E unℓ(r).    (6)

We put the equation in a more convenient dimensionless form. Let



k ≡ √(2mE),    (7)
ρ ≡ kr,    (8)
λ ≡ (mω/k²)².    (9)
Then the Schrödinger equation may be written:

[ d²/dρ² − λρ² − ℓ(ℓ+1)/ρ² + 1 ] v(ρ) = 0,    (10)

where v(ρ) = unℓ(ρ/k).
The asymptotic form of the Schrödinger equation is

[ d²/dρ² − λρ² ] v(ρ) = 0,    (11)

which suggests we try a solution with the asymptotic form:

v(ρ) = f(ρ) e^(−√λ ρ²/2).    (12)

Near ρ = 0, the equation is of the form:



[ d²/dρ² − ℓ(ℓ+1)/ρ² ] v(ρ) = 0.    (13)

Thus, we try a solution of the form:

v(ρ) = ρ^(ℓ+1) g(ρ) e^(−√λ ρ²/2),    (14)

where we will look for a series solution for g(ρ).


Substituting the above form into the Schrödinger equation, we obtain
the following differential equation for g(ρ):

ρg″ + 2(ℓ + 1 − √λ ρ²) g′ + [1 − √λ(2ℓ + 3)] ρ g = 0.    (15)

Letting

g(ρ) = Σ(j=0 to ∞) cj ρʲ,    (16)

we find that cj = 0 for j odd, and for even j we have the recurrence
relation:

cj+2 = [1 − √λ(2ℓ + 2j + 3)] / [(j + 2)(2ℓ + j + 3)] cj.    (17)
We already have the asymptotic behavior, so we suppose that the series
stops at j = n, i.e.,

1 = √λ (2ℓ + 2n + 3).    (18)

Thus, we have the discrete energy spectrum:

Enℓ = (ω/2)(2ℓ + 2n + 3) = ω (n + ℓ + 3/2).    (19)
Note that n is even.

We may now substitute √λ = 1/(2ℓ + 2n + 3) and use the recurrence
relation to obtain an expression for coefficient cj:

cj = 2(n − j + 2) / [(2n + 2ℓ + 3) j (2ℓ + j + 1)] cj−2    (20)
   = [2/(2n + 2ℓ + 3)]^(j/2) (n − j + 2)(n − j + 4) · · · (n) / [ j(j − 2) · · · 2 · (2ℓ + j + 1)(2ℓ + j − 1) · · · (2ℓ + 3) ] c0
   = [2/(2n + 2ℓ + 3)]^(j/2) [n!!/(n − j)!!] [1/j!!] [(2ℓ + 1)!!/(2ℓ + j + 1)!!] c0.    (21)

Putting it all together, we have:

ψnℓm(x) = (unℓ(r)/r) Yℓm(θ, φ)    (22)
        = (vnℓ(kr)/r) Yℓm(θ, φ)    (23)
        = [(kr)^(ℓ+1)/r] exp[ −(1/2)(kr)²/(2(n + ℓ + 3/2)) ] Yℓm(θ, φ)
          × Σ(j=0 to n) [2/(2n + 2ℓ + 3)]^(j/2) [n!!/(n − j)!!] [1/j!!] [(2ℓ + 1)!!/(2ℓ + j + 1)!!] (kr)ʲ c0nℓ,    (24)

where c0nℓ is determined by normalization.
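As a consistency check (my own, not part of the solution): for the isotropic case, the Cartesian spectrum from part (a) is E = ω(n1 + n2 + n3 + 3/2), and the number of Cartesian states at each level should equal the number of (n, ℓ, m) states of Eqn. 19 with n even. A small sketch:

```python
from itertools import product

def cartesian_count(N):
    # part (a), isotropic case: E = omega*(n1 + n2 + n3 + 3/2)
    return sum(1 for n1, n2, n3 in product(range(N + 1), repeat=3)
               if n1 + n2 + n3 == N)

def spherical_count(N):
    # part (b): E = omega*(n + l + 3/2), n even; each l carries (2l+1) m-values
    return sum(2 * l + 1 for l in range(N + 1)
               for n in range(0, N + 1, 2) if n + l == N)

for N in range(7):
    print(N, cartesian_count(N), spherical_count(N))   # the two countings agree
```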

Course Notes
Solving the Schrödinger Equation: Resolvents
051117 F. Porter
Revision 111109 F. Porter

1 Introduction
Once a system is well-specified, the problem posed in non-relativistic quan-
tum mechanics is to solve the Schrödinger equation. Once the solutions are
known, then any question of interest can in principle be answered, by taking
appropriate expectation values of operators between states made from the
solutions. There are many approaches to obtaining the solutions, analytic,
approximate, partial, and numerical. All have important applications. In
this note, we examine the approach of obtaining analytic solutions using con-
ventional methods of analysis. In particular, we develop a means to apply
the powerful techniques of complex analysis to this problem.

2 Resolvents and Green’s Functions


We have already considered the interpretation of a function of a self-adjoint
operator Q, with point spectrum (eigenvalues) Σ(Q) = {qi ; i = 1, 2, . . .}, and
spectral resolution

Q = Σk qk |k⟩⟨k|,    (1)

where

Q|k⟩ = qk |k⟩    (2)
⟨k|j⟩ = δkj    (3)
I = Σk |k⟩⟨k|.    (4)

In particular, the eigenvectors of Q form a complete orthonormal set.


If f (q) is any function defined on Σ(Q), then we define
f(Q) = Σk f(qk) |k⟩⟨k|.    (5)

It may be observed that [f (Q), Q] = 0. If f (q) is defined and bounded on


Σ(Q), then f (Q) is a bounded operator, where the norm of an operator is

defined according to:

‖f(Q)‖op ≡ sup{‖φ‖=1} ‖f(Q)φ‖    (6)
         = sup{q∈Σ(Q)} |f(q)|    (7)
         < ∞, if f(q) is bounded.    (8)

Now define an operator-valued function G(z), called the resolvent of Q,


of complex variable z, for all z not in Σ(Q) by:1
G(z) = 1/(Q − z),   z ∉ Σ(Q).    (9)

For any such z the operator G(z) is bounded, and we have


‖G(z)‖op = sup_k 1/|qk − z|.    (10)

The resolvent satisfies the identities


G(z) − G(z0) = Σk [ 1/(qk − z) − 1/(qk − z0) ] |k⟩⟨k|
             = Σk (z − z0)/[(qk − z)(qk − z0)] |k⟩⟨k|
             = (z − z0)/[(Q − z)(Q − z0)]
             = (z − z0) G(z) G(z0),    (11)

G(z) = G(z0) / [1 + (z0 − z) G(z0)],    (12)
and
Q = z + 1/G(z). (13)
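A small numerical sketch (mine) checking the first resolvent identity, Eqn. 11, and Eqn. 13 for a random finite-dimensional Hermitian "operator":

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
Q = A + A.conj().T                    # a Hermitian operator on C^5

def G(z):
    return np.linalg.inv(Q - z * np.eye(5))

z, z0 = 0.3 + 1.0j, -2.0 + 0.5j       # both off the (real) spectrum
lhs = G(z) - G(z0)
rhs = (z - z0) * G(z) @ G(z0)
print(np.allclose(lhs, rhs))          # True (Eqn. 11)
print(np.allclose(Q, z * np.eye(5) + np.linalg.inv(G(z))))   # True (Eqn. 13)
```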
If the eigenvectors are written as functions of x ∈ R3 (assuming that
the Hilbert space is L2 (R3 )), we can represent the resolvent as an integral
transform on the wave functions. That is, with

G(z) = 1/(Q − z) = Σk |k⟩⟨k|/(qk − z),    (14)

¹The resolvent is sometimes defined with the opposite sign. We shall see eventually
that G(z) also finds motivation in terms of Cauchy's integral formula.

and

|k⟩ = φk(x),    (15)

we define

G(x, y; z) = Σk φk(x)φk*(y)/(qk − z),    (16)

so that G operates on a wave function according to

[G(z)ψ](x) = ∫(∞) d³(y) G(x, y; z) ψ(y).    (17)

Thus G(x, y; z) is the kernel of an integral transform. We may see that this
correspondence is as claimed as follows: Expand
ψ(y) = Σℓ ψℓ φℓ(y).    (18)

Then

∫(∞) d³(y) Σk [φk(x)φk*(y)/(qk − z)] Σℓ ψℓ φℓ(y)
   = Σk [φk(x)/(qk − z)] Σℓ ψℓ ∫(∞) d³(y) φk*(y) φℓ(y)
   = Σk ψk φk(x)/(qk − z).    (19)

This is to be compared with


[G(z)ψ](x) = Σk [ |k⟩⟨k|/(qk − z) ] Σℓ ψℓ |ℓ⟩
           = Σk ψk φk(x)/(qk − z).    (20)

We have thus demonstrated the representation as an integral transform. Sim-


ilar results would apply on other L2 spaces of wave functions besides L2 (R3 ).
Consider now the formal relation:
(Q − z) G(z) = (Q − z) · 1/(Q − z) = I.    (21)

Corresponding to this, we have:²

(Qx − z)G(x, y; z) = δ (3) (x − y), (22)


²We use a subscript x on Q to denote that, if, for example, Q is a differential operator,
the differentiation is on variable x.

since δ⁽³⁾(x − y) is the kernel corresponding to I. This delta-function is thus
symbolic of the relation:

(Qx − z) ∫(∞) d³(y) G(x, y; z) ψ(y) = ψ(x),    (23)

for any continuous ψ(x) in L2 (R3 ) (continuous in case Q is a differential


operator).
The kernel G(x, y; z) is called the Green’s function for the (differential)
operator Q. This Green’s function is the “kernel for a resolvent”(as with
the resolvent, the sign convention is not universal). It is the solution of the
inhomogeneous differential equation (Eqn. 22) for an “impulse source”, which
satisfies “the” boundary conditions since it may be expressed in an expansion
of basis vectors satisfying the boundary conditions:

X φk (x)φ∗k (y)
G(x, y; z) = . (24)
k=1 qk − z

From this relation, we also see the symmetry property:

G(x, y; z)∗ = G(y, x; z ∗ ). (25)


1
Our definition of G(z) ≡ Q−z suggests that the resolvent is an analytic
(operator-valued) function of z in the complement of the spectrum of Q. For
any point z0 ∈
/ Σ(Q), we have the power series:

[(z − z0 )G(z0 )]n .
X
G(z) = G(z0 ) (26)
n=0

This series converges in norm inside any disk |z − z0 | < ρ which does not
intersect Σ(Q). Further, for the nth term in the series, we have
!n
n ρ 1
kG(z0 ) [(z − z0 )G(z0 )] kop ≤ , (27)
ρ0 ρ0

where
1
= kG(z0 )kop = distance [z0 , Σ(Q)] . (28)
ρ0
Since the resolvent is “analytic” we may use contour integration. For exam-
ple, in Fig. 1 we suppose that contour C4 encircles a single, non-degenerate,
eigenvalue q4 . Then

4
z

. . . . ... . q
4
C4

Figure 1: The complex z plane, with eigenvalues of Q indicated on the real


axis. A contour is shown encircling one of the eigenvalues.


1 Z 1 Z X |kihk|
G(z) dz = dz (29)
2πi C4 2πi C4 k=1 qk − z
1 Z |φ4 ihφ4 |
= dz (30)
2πi C4 q4 − z
!
z − q4
= |φ4 ihφ4 | lim (31)
z→q4 q − z
4
= −|φ4 ihφ4 |. (32)

That is,
1 Z
|φ4 ihφ4 | = − G(z) dz. (33)
2πi C4
The contour integral of G around an eigenvalue of Q gives the projection
onto the one-dimensional subspace of the corresponding eigenvector of Q.
Now suppose that the spectrum of Q is bounded below, i.e., there exists
an α > −∞ such that qk > α, ∀k. Of particular interest is the Hamiltonian,
which has this property. In this case, we may consider a contour which
encircles all eigenvalues, as in Fig. 2: Then we have
1 Z
I=− G(z) dz, (34)
2πi C∞
as may be proven by a limiting process, and noting that convergence is all
right.
According to Cauchy’s integral formula, we can express analytic functions
of Q in terms of contour integrals: Let f (z) be a function which is analytic

5
Ch
h
. . . . . . . .. .
q
1
h

Figure 2: A contour which encircles the entire spectrum of Q.

in a region which contains C∞ . Then

1 Z f (z) dz 1 Z
f (Q) = =− dzf (z)G(z), (35)
2πi C∞ z − Q 2πi C∞

assuming the integral converges. In particular consider the function f (z) =


e−izt : ∞
1 Z
U (t) = e−itQ = − G(z)e−izt dz = |kihk|e−itqk ,
X
(36)
2πi C∞ k=1

with the restriction that Im(t) ≤ 0.


The principal application of all this occurs when Q = H is the Hamilto-
nian. For real t in this case, U (t) is the time development transforma-
tion, or evolution operator. Note that it satisfies (assuming H carries no
explicit time dependence):

d d
i U (t) = i e−itH = HU (t) (37)
dt dt
U (0) = I. (38)

We shall consider this case (Q = H) henceforth. The kernel of the integral


transform representing U (t) is:
1 Z
U (x, y; t) = − dze−itz G(x, y; z) (39)
2πi C∞

φk (x)φ∗k (y)e−iqk t .
X
= (40)
k=1

6
If we know U (x, y; t) then we can solve the time-dependent Schrödinger
equation given any initial wave function. Corresponding to the above differ-
ential equation (Eqn. 37), we have for U (x, y; t):

φk (x)φ∗k (y)qk e−iqk t
X
i∂t U (x, y; t) =
k=1

Hx φk (x)φ∗k (y)e−iqk t
X
=
k=1
= Hx U (x, y; t), (41)
and initial condition
U (x, y; 0) = δ (3) (x − y). (42)
Hence, if ψ(x) is any wave function, then ψ(x; t) defined by:
Z
ψ(x; t) ≡ d3 (y)U (x, y; t)ψ(y) (43)
(∞)

satisfies the Schrödinger equation,


i∂t ψ(x; t) = Hx ψ(x; t), (44)
and initial condition
ψ(x; 0) = ψ(x). (45)
We remark that the resolvent and Green’s function, and U (t), actually
exist for a larger class of operators, not just those with a pure point spec-
trum. For example, the resolvent exists for any self-adjoint operator which
is bounded below. In particular, we have existence for the Hamiltonian of a
free particle, with H = p2 /2m. However, we cannot in general express G(z)
and U (t) as sums over states, and the Green’s function cannot be expressed
as a sum over products of eigenfunctions. The contour integral relation:
1 Z
U (x, y; t) = − dze−itz G(x, y; z) (46)
2πi C∞
remains valid. We have resorted to the case with a pure point spectrum,
with sums over states, in order to develop the feeling for how things work
without getting bogged down in mathematical issues.

3 Connection between Schrödinger Equation


and Diffusion Equation
Suppose
1 2
H=− ∇ + V (x), (47)
2m

7
and recall ∞
U (t) = e−itH = |kihk|eiqk t .
X
(48)
k=1

These relations still make sense mathematically if we consider imaginary


times t = −iτ where τ ≥ 0:

U (−iτ ) = e−τ H , (49)

and then
d
U (−iτ ) = −HU (−iτ ). (50)

The corresponding equation for the kernel is thus:

∂τ U (x, y; −iτ ) = −Hx U (x, y; −iτ )


1 2
= ∇ U (x, y; −iτ ) − V (x)U (x, y; −iτ ). (51)
2m x
This is in the form of a diffusion equation. Thus, the Schrödinger equation
is closely related to a diffusion equation, corresponding to the Schrödinger
equation with imaginary time.
However, there is a difference. The Schrödinger equation can be solved in
both time directions – “backward prediction” is all right. But the diffusion
equation can, for general initial conditions, only be solved in the forward time
direction. This is because the operator U (−iτ ) = e−τ H has the entire Hilbert
space as its domain for τ > 0, but not for τ < 0 (since e−τ H = k e−τ ωk |kihk|,
P

and the e−τ ωk “weight” has unbounded contributions for τ < 0. On the other
hand U (t) = e−itH is a unitary operator for all real t, and hence the entire
Hilbert space is its domain for all real t.

4 A Brief Revisit to Statistical Mechanics


In the note on density matrices we gave the density matrix for the canonical
thermodynamic distribution. With U (t) = eitH , we may also write it in terms
of U :
e−H/T 1
ρ(T ) = = U (−i/T ), (52)
Z(T ) Z(T )
where we have made the substitution t = −i/T . As long as T ≥ 0, this sub-
stitution is mathematically acceptable. The inverse temperature corresponds
to an imaginary time coordinate. With this substitution, we also have the

8
partition function:
 
Z(T ) = Tr e−H/T
= Tr [U (−i/T )] (53)
Z
= d3 (x)U (x, x; −i/T ). (54)
(∞)

5 Practical Matters
We see that the objects G(z), U (z), etc., are potentially very useful tools
towards solving the Schrödinger equation, calculating the partition func-
tion, and perhaps other applications. We’ll also make the connection of
G(z) with perturbation theory later. Let us here address the question of
how one goes about constructing these tools in practice in an explicit prob-
lem. “Closed form” solutions for the Green’s function exist for special cases
(with special symmetries of H), and useful forms can be constructed for
one-dimensional problems (hence also for spherically symmetric problems in
three-dimensions). Otherwise, we can attempt to apply perturbation theory
methods towards obtaining useful approximations.

5.1 General Procedure to Construct the Green’s Func-


tion for a One-dimensional Schrödinger Equation
Let
1 d2
H=− + V (x), (55)
2m dx2
where x ∈ (a, b) (and a → −∞, b → ∞ is permissible). The Hilbert space is
L2 (a, b). Assume V (x) is such that H is bounded below.
Let z be a complex number with Im(z) 6= 0, and consider solutions u(x; z)
of the following differential equation:

Hu(x; z) = zu(x; z), (56)

with boundary condition that the solutions must vanish at the endpoints of
the interval. Let uL (x; z) be a solution satisfying the boundary conditions at
the left endpoint, and let uR (x; z) be a solution satisfying the boundary con-
ditions at the right endpoint. Consider the quantity, called the Wronskian:

W (z) ≡ u0L (x; z)uR (x; z) − uL (x; z)u0R (x; z). (57)

9
We only give the Wronskian a z argument, because it has the remarkable
property that is is independent of x:
dW
= u00L uR − uL u00R
dx
= −2m(z − V )uL uR − uL [−2m(z − V )] uR
= 0. (58)

Thus, the Wronskian may be evaluated at any convenient value of x.


Now define
2m
G(x, y; z) ≡ [uL (x; z)uR (y; z)θ(y − x) + uL (y; z)uR (x; z)θ(x − y)] ,
W (z)
(59)
x where the step function θ(x) is defined by:

0 if x < 0,

θ(x) ≡ 1/2 if x = 0 (60)



1 if x > 0.
Since uL and uR are continuous functions on (a, b), with continuous first
derivatives (in order to be in the domain of H), it follows that

1. G(x, y; z) is a continuous function of x, for x ∈ (a, b).

2. G(x, y; z) is a differentiable function of x, and the first derivative is


continuous at all points x ∈ (a, b), except for x = y, where there is a
discontinuity of magnitude:

lim [(∂1 G)(x + , x; z) − (∂1 G)(x − , x; z)] = −2m, (61)


→0+

where the notation ∂1 is used to mean “differentiation with respect to


the first argument”.

3. For x 6= y, G satisfies the differential equation

Hx G(x, y; z) = zG(x, y; z), (x 6= y). (62)

We may include the point x = y by writing

(Hx − z)G(x, y; z) = δ(x − y). (63)

This corresponds to the right magnitude for the discontinuity in the


first derivative.

10
4. If H is self-adjoint, then G(x, y; z) is, in fact, our earlier discussed
Green’s function. It is given by Eqn. 59 for all z (including real z),
not in the spectrum of H. The discrete eigenvalues of H correspond to
poles of G as a function of z. Thus, the bound states can be found by
searching for the poles of G, in a suitably cut complex plane (if H has
a continuous spectrum, we have a branch cut).

5.2 Example: Force-free Motion


Let us evaluate the Green’s function for force-free motion, V (x) = 0, in
x ∈ (−∞, ∞) configuration space:

1 d2
Hu(x; z) = − u(x; z) = zu(x; z). (64)
2m dx2
We must have uL → 0 as x → −∞, and uR → 0 as x → ∞. Then we have
solutions of the form:

uL (x; z) = e−iρx , uR (x; z) = eiρx , (65)

where √
ρ= 2mz. (66)
We select the branch of the square root so that the imaginary part of ρ is
positive, as long as z is not along the non-negative real axis. We know that
the spectrum of H is the non-negative real axis. Thus, we cut the z-plane
along the positive real axis.
To obtain the Wronskian, note that u0L = −iρuL and u0R = iρuR . Evaluate
at x = 0 for convenience: uL (0) = uR (0) = 1. Hence,

W (z) = (−iρ) − (iρ) = −2iρ = −2i 2mz. (67)

Thus, we obtain the Green’s function:


2m h i
G(x, y; z) = √ e−iρx eiρy θ(y − x) + e−iρy eiρx θ(x − y)
−2i 2mz
m iρ|x−y|
r
= i e . (68)
2z
We could have noticed from the start that G had to be a function of (x − y)
only, by the translational invariance of the problem.
Let us continue, and obtain the time development transformation U (x, y; t):
1 Z
U (x, y; t) = − dze−itz G(x, y; z), (69)
2πi C∞

11
z
Ch

(a)
Ch ρ

(b)

Figure 3: The contour C∞ : (a) in the z plane; (b) in the ρ plane.

where C∞ is the contour in Fig. 3. With ρ2 = 2mz, we may make the


substitution z = ρ2 /2m and dz = ρdρ/m, to obtain the integral in the ρ
plane:
im Z −∞+i dρ −itρ2 /2m iρ|x−y|
U (x, y; t) = − e e . (70)
2πi ∞+i m
We may take the  → 0 limit, and guarantee convergence by evaluating the
integral at complex time t → t − iτ , where τ > 0:
1 Z∞ 1
    
2
U (x, y; t − iτ ) = dρ exp − ρ (τ + it) − ρi|x − y| . (71)
2π −∞ 2m
We compute by completing the square in the exponent:
 h q i2 
1
1 a2 Z ∞

 ρ − a/ 2m
(τ + it)  
U (x, y; t − iτ ) = e dρ exp − 2
, (72)
2π −∞ 
 2σ 

where
i |x − y|
a = q (73)
2 1 (τ + it)
2m
1 1
σ = √ q . (74)
2 1 (τ + it)
2m

12
The integral
√ is now in the form of the integral of a Gaussian, and has the
value 2πσ. Therefore,
1√ 1 2
U (x, y; t − iτ ) = 2π q ea
2π 1
(τ + it)
2m
m(x − y)2
s " #
m
= exp − , (75)
2π(τ + it) 2(τ + it)

with Re τ + it > 0.
It is interesting to look at this result for t = 0:
m(x − y)2
" #
m
r
U (x, y, −iτ ) = exp − . (76)
2πτ 2τ
Referring back to our earlier discussion, we see that this gives the solution
to the diffusion equation, or, for example, to the heat conduction problem
(for a homogeneous medium) with an initial heat distribution proportional to
δ(x − y). As time τ increases, the heat propagates out from x = y, spreading
according to a broadening Gaussian.
+
In quantum mechanics, we are √ more interested in the limit τ → 0 . The
only subtle issue is the phase in τ + it. Let
√ √ √
τ + it = Reiθ = Reiθ/2 , (77)
√ →
where R = τ 2 + t2 τ →0 + |t|. Referring to Fig. 4, for t > 0 we have 0 <
θ < π/2, approaching θ = π/2 as τ → 0+ . Similarly, for t < 0 we have
−π/2 < θ < 0, approaching θ = −π/2 as τ → 0+ . Hence,

√ q  eiπ/4 = √1 (1 + i), t > 0,
τ + it →τ →0+ |t|  −iπ/4 21 (78)
e = √2 (1 − i), t < 0.

Thus,
im(x − y)2
!s " #
1 t m
U (x, y; t) = 1−i exp . (79)
2 |t| 2π|t| 2t
This is the time development transformation for the free particle Schrödinger
equation in one dimension, where we have kept proper track of the phase for
all times.
We may check that the behavior of this transformation is as expected
when we transform by time t, followed by transforming by time −t. The
result ought to be what we started with, i.e., this product should be the
identity. Thus, we consider the product
m im h
 
i
U (y2 , x; −t)U (x, y1 ; t) = exp (x − y1 )2 − (x − y2 )2 . (80)
2π|t| 2t

13
it
Rei θ
R
θ
τ

Figure 4: Illustration to help in the evaluation of the phase of the free particle
time development transformation.

Integrating over the intermediate variable x:


Z ∞ m im 2
 
1 Z∞

im

dxU (y2 , x; −t)U (x, y1 ; t) = exp (y1 − y22 ) dx exp (y2 − y1 )x
−∞ |t| 2t 2π −∞ t
m im 2 m
   
= exp (y1 − y22 ) δ (y2 − y1 )
|t| 2t t
= δ(y2 − y1 ). (81)
This has the hoped-for behavior.

5.3 Example: Reflecting Wall


The translation invariance of the example above will be lost if the configura-
tion space is changed to a half-line x ∈ [0, ∞). This may be interpreted as a
free-particle problem, except with a reflecting wall at x = 0. Again,
1 d2
H=− . (82)
2m dx2

14
We still have uR (x; z) = eiρx , but now the left boundary condition is uL (0; z) =
0. Hence, a left solution is

uL (x; z) = sin(ρx). (83)

We obtain W = ρ, by evaluating at x = 0. Thus,


2m h i
G(x, y; z) = sin(ρx)eiρy θ(y − x) + sin(ρy)eiρx θ(x − y)
ρ
m h iρ(x+y)    i
= e − eiρ(y−x) θ(y − x) + eiρ(x+y) − eiρ(x−y) θ(x − y)

m h iρ|x−y|
r i
= i e − eiρ(x+y) . (84)
2z
This Green’s function is not translation invariant. However, if x → ∞,
y → ∞ such that x − y is finite, then this Green’s function tends toward
our first example (Imρ > 0 is still our branch). This is compatible with the
intuition that the local physics far from the wall at x = 0 should be nearly
independent of the existence of the wall.

5.4 Example: Force-free Motion in Three Dimensions


Consider the Green’s function problem for force-free motion in three dimen-
sions:
1 2
H=− ∇, (85)
2m
where x ∈ R3 . The resolvent is most easily found in momentum space, since
H = p2 /2m is just multiplication by a factor there. Hence, the resolvent in
momentum space is:
1
G(z) = p2 . (86)
2m
− z
The Green’s function in momentum space is, formally:
X φk (x)φ∗k (y)
G(x, y; z) = , (87)
k ωk − z

where
p2
ωk = , (88)
2m
1
φk (x) = eip·x . (89)
(2π)3/2

15
That is,
1 Z 1
G(x, y; z) = 3
d3 (p) p2 eip·(x−y) . (90)
(2π) (∞) 2m
−z
To evaluate this integral, let us first evaluate another handy integral, the
Fourier transform of the “Yukawa potential”:
Z
e−µr
Y = d3 (x)e−ix·p , where r ≡ |x| (91)
(∞) 4πr
Z ∞Z 1 2π
Z
1
= dφd cos θdrr2 exp(−µr − irp cos θ), where x · p = rp cos θ
0 −1 0 4πr
1Z ∞Z 1
= d cos θdrre−µr e−irp cos θ
2 0 −1
i Z∞  
= dre−µr e−irp − eirp
2p 0
!
i 1 1
= −
2p µ + ip µ − ip
1
= . (92)
p + µ2
2

We notice in passing that the Coulomb potential corresponds to µ → 0, with


Z
1 1
d3 (x)e−ix·p = 2. (93)
(∞) 4π|x| p

The inverse Fourier transform theorem tells us that then:


−µ|x|
1 Z s 1 ix·p 3/2 e
d (p) e = (2π) . (94)
(2π)3/2 (∞) p2 + µ 2 4π|x|
Hence,
1 Z 1
G(x, y; z) = 3
d3 (p) p2 ei(x−y)·p
(2π) (∞) 2m
−z
e−iρ|x−y|
= 2m . (95)
4π|x − y|

We have selected the branch of the square root function so that Imρ < 0.
We could just as well have selected the branch with Imρ > 0, in which case
the Green’s function is:
eiρ|x−y| √
G(x, y; z) = 2m ; ρ= 2mz. (96)
4π|x − y|

16
6 Perturbation Theory with Resolvents
Let H and H̄ = H + V be self-adjoint operators. The choice of symbol is
motivated by the fact that we are especially interested in the case in which
H is a Hamiltonian, and H̄ is another Hamiltonian related to the first by the
addition of a potential term. We form the resolvents (with z ∈
/ Σ(H), Σ(H̄)):
1
G(z) = (97)
H −z
1 1
Ḡ(z) = = . (98)
H̄ − z H +V −z
Then, noting that
1 1
V = − , (99)
Ḡ(z) G(z)
it may readily be verified that
Ḡ(z) = G(z) − G(z)V Ḡ(z) = G(z) − Ḡ(z)V G(z) (100)
and
Ḡ(z) = G(z) − G(z)V G(z) + G(z)V Ḡ(z)V G(z). (101)
These identities are very important in perturbation theory – if G(z) is known
for Hamiltonian H, then we may learn something about a perturbed Hamil-
tonian H + V .
We could try to iterate these identities still further:
Ḡ(z) = G(z) − Ḡ(z)V G(z)
= G(z) − G(z)V G(z) + Ḡ(z)V G(z)V G(z)
N
[−G(z)V ]n G(z) + (−)N +1 Ḡ(z) [V G(z)]N +1 .
X
= (102)
n=0

If V is such that the “remainder” term above approaches 0 as N → ∞, then


we have the Liouville-Neumann Series:

[−V G(z)]n .
X
Ḡ(z) = G(z) (103)
n=0

We may state a convergence theorem:


Theorem: Let H and V be self-adjoint operators. Let G(z) be the resolvent
for H, and let DH ⊂ DV . If
kV φk < α1 kφk + α2 kHφk, ∀φ ∈ DH , (104)
where α1 > 0 and 0 < α2 < 1, then the Liouville-Neumann series
converges in operator norm for some open region of the complex plane.

17
Proof: Let ψ ∈ H and z ∈
/ Σ(H). Then
1
φ = G(z)ψ = ψ ∈ DH , (105)
H −z
H
since H−z
is a bounded operator. By assumption we have

H
kV φk = kV G(z)ψk < α1 kG(z)ψk + α2 k ψk. (106)
H −z
Since ψ is arbitrary, this implies:
H
kV G(z)kop < α1 kG(z)kop + α2 k kop . (107)
H −z
Let z = x + iy (x, y real). Use

kf (H)kop = sup |f (ω)|, (108)


ω∈Σ(H)

to obtain
1 1 1

kG(z)kop = k kop ≤ = , (109)
H −z x−z
|y|
and
H ω

k kop = sup < 1. (110)
H −z ω∈Σ(H) ω − z

ω
The last part expresses the fact that limω→∞ ω−z = 1.

We thus have
α1
kV G(z)kop < + α2 . (111)
|y|
Since α2 < 1, for large enough y = y0 , say, we have the result

kV G(z)kop < 1 whenever |y| > y0 . (112)

Hence, the series converges in operator norm whenever |y| > y0 .3

This series is the basis for the Born expansion in scattering theory, as will
be discussed in another note.
Consider now the case where H is the Hamiltonian for force-free motion:
1 2
H=− ∇, x ∈ R3 , (113)
2m
3
If the spectrum of H is bounded below, it will also converge for x < x0 , for small
enough x0 .

18
y
z

y0

-y0

Figure 5: The region of convergence of the perturbation series is the unshaded


area.

V = V (x) is a potential function, and H̄ = H + V . Then the identity

Ḡ(z) = G(z) − G(z)V Ḡ(z) (114)

corresponds to the integral equation:


Z
Ḡ(x, y; z) = G(x, y; z) − d3 (x0 )G(x, x0 ; z)V (x0 )Ḡ(x0 , y; z), (115)
(∞)

and
Ḡ(z) = G(z) − G(z)V G(z) + G(z)V Ḡ(z)V G(z) (116)
corresponds to:
Z
Ḡ(x, y; z) = G(x, y; z) − d3 (x0 )G(x, x0 ; z)V (x0 )G(x0 , y; z) (117)
(∞)
Z Z
+ d3 (x0 )d3 (y0 )G(x, x0 ; z)V (x0 )Ḡ(x0 , y0 ; z)V (y0 )G(y0 , y; z),
(∞)

where z ∈/ Σ(H), z ∈/ Σ(H̄).


We can also express the Schrödinger Equation for eigenstates of the per-
turbed Hamiltonian in the form of an integral equation. Let φ̄k be an
eigenstate of H̄, corresponding to eigenvalue ω̄k . Use the identity Ḡ(z) =
G(z) − G(z)V Ḡ(z), and operate on φ̄k , noting that (ω̄k − z)Ḡ(z)φ̄k = φ̄k :

φ̄k = (ω̄k − z)G(z)φ̄k − G(z)V φ̄k . (118)

19
If ω̄k ∈
/ Σ(H), we may now substitute z = ω̄k to obtain:

φ̄k = −G(ω̄k )V φ̄k . (119)

Using our free particle Green’s function, Eqn. 96, this corresponds to the
integral equation:
 √ 
2m Z exp i 2mω̄ k |x − y|
φ̄k (x) = − d3 (y) V (y)φ̄k (y). (120)
4π (∞) |x − y|

In the case of a discrete bound state spectrum (ω̄k < 0),


√ q
i 2mω̄k = − 2m|ω̄k | < 0, (121)

and this portion of the integrand falls off rapidly as |y| becomes large. This
equation can be more convenient for studying the properties of φ̄k than using
the Schrödinger equation itself.

7 Exercises
1. Prove identities 12 and 13.

2. Prove the power series expansion for resolvent G(z) (Eqn. 26):

[(z − z0 )G(z0 )]n .
X
G(z) = G(z0 )
n=0

You may wish to attempt to do this either “directly”, or via iteration


on the identity of Eqn. 11.

3. Prove the result in Eqn. 34.

4. Let’s consider once again the Hamiltonian

1 d2
H=− , (122)
2m dx2
but now in configuration space x ∈ [a, b] (“infinite square well”).

(a) Construct the Green’s function, G(x, y; z) for this problem.


(b) From your answer to part (a), determine the spectrum of H.

20
(c) Notice that, using

X φk (x)φ∗k (y)
G(x, y; z) = , (123)
k=1 ωk − z

the normalized eigenstate, φk (x), can be obtained by evaluating


the residue of G at the pole z = ωk . Do this calculation, and check
that your result is properly normalized.
(d) Consider the limit a → −∞, b → ∞. Show, in this limit that
G(x, y; z) tends to the Green’s function we obtain in this note for
this Hamiltonian on x ∈ (−∞, ∞):
m iρ|x−y|
r
G(x, y; z) = i e . (124)
2z
5. Let us investigate the Green’s function for a slightly more complicated
situation. Consider th potential:
V |x| ≤ ∆

V (x) = (125)
0 |x| > ∆

_∞ ∞
_Δ Δ x

Figure 6: The “finite square potential”.

(a) Determine the Green’s function for a particle of mass m in this


potential.
Remarks: You will need to construct your “left” and “right” so-
lutions by considering the three different regions of the potential,
matching the functions and their first derivatives at the bound-
aries. Note that the “right” solution may be very simply obtained
from the “left” solution by the symmetry of the problem. In your
solution, let
q
ρ = 2m(z − V ) (126)

ρ0 = 2mz. (127)

21
Make sure that you describe any cuts in the complex plane, and
your selected branch. You may find it convenient to express your
answer to some extent in terms of the force-free Green’s function:
im iρ0 |x−y|
G0 (x, y; z) = e . (128)
ρ

(b) Assume V > 0. Show that your Green’s function G(x, y; z) is


analytic in your cut plane, with a branch point at z = 0.
(c) Assume V < 0. Show that G(x, y; z) is analytic in your cut plane,
except for a finite number of simple poles at the bound states of
the Hamiltonian.

6. Find the time development transformation U (x, y; t) for the one-dimensional


harmonic oscillator:
p2 k q
H= + x2 , ω≡ k/m. (129)
2m 2
Be sure to clearly specify any choice of branch.
Note that you can approach this problem in different ways, e.g., by
directly integrating the differential equation U must satisfy, or by con-
sidering its expansion in terms of eigenfunctions of H (and perhaps
using the creation and annihilation operators).

7. Let us investigate the application of the time development transfor-


mation for the harmonic oscillator that we computed in the previous
exercise. Explicitly, let us consider the problem of finding the wave
function at time t corresponding to an initial (t = 0) wave function:
1/4
αmω αmω
  
φ(x; 0) = exp − (x − a)2 , (130)
2π 4
where α > 0. Our initial wave function thus corresponds to a Gaussian
probability in position, with hxi = a, and h(x − a)2 i = 1/αmω. Using
U (x, y; t) we can solve for φ(x; t) in closed form with this initial wave
function.

(a) Solve for φ(x; t). Do not go to great effort to simplify your result
in this part – we’ll consider a case with simple cancellations in
part (b).

22
(b) From your answer to part (a), show that, for α = 2:

a2
( " #)
mω 1 mω iω
φ(x; t) = ( ) 4 exp − (x − a cos ωt)2 − t + 2max sin ωt − m sin 2ωt .
π 2 2 2
(131)
You should give some thought to how you might have attacked this
problem using “elementary methods”, without knowing U (x, y; t).
(c) Notice that the choice α = 2 corresponds to an initial state wave
function something like a “displaced” ground state wave function.
Solve for the probability distribution to find the particle at x as
a function of time (for α = 2). Your result should have a simple
form and should have an obvious classical correspondence.

23
Physics 195a
Course Notes
Solving the Schrödinger Equation: Resolvents
Solutions to Exercises
021209 F. Porter

1 Exercises
1. Prove identities (12) and (13).
2. Prove the power series expansion for resolvent G(z) (Eqn. 2):

[(z − z0 )G(z0 )]n .
X
G(z) = G(z0 )
n=0

You may wish to attempt to do this either “directly”, or via iteration


on the identity of Eqn. 11.
3. Prove the result in Eqn. ??.
4. Let’s consider once again the Hamiltonian
1 d2
H=− , (1)
2m dx2
but now in configuration space x ∈ [a, b] (“infinite square well”).
(a) Construct the Green’s function, G(x, y; z) for this problem.
Please do not look at this solution until you have turned
in problem set number 8
Solution: To construct the Green’s function, we look for solutions
to:
1 d2
Hu(x) = − u(x; z) = zu(x; z). (2)
2m dx2
The solutions may be expressed in the form

u(x; z) = A sin ρ(x + α), (3)



where ρ ≡ 2mz. The left and right solutions, giving the bound-
ary conditions u(a) = u(b) = 0 are thus:

uL (x; z) = A sin ρ(x − a) (4)


uR (x; z) = B sin ρ(b − x). (5)

1
The Green’s function is:
2m
G(x, y; z) ≡ [uL (x; z)uR (y; z)θ(y − x) + uL (y; z)uR (x; z)θ(x − y)] ,
W (z)
(6)
where the Wronskian is

W (z) ≡ u0L (x; z)uR (x; z) − uL (x; z)u0R (x; z). (7)

Since the Wronskian is independent of x, we pick a convenient


place to evaluate it, x = b:

W (z) = ABρ sin ρ(b − a). (8)

Hence, the Green’s function is


2m
G(x, y; z) = (9)
ρ sin ρ(b − a)
h
× sin ρ(x − a) sin ρ(b − y)θ(y − x)
i
+ sin ρ(y − a) sin ρ(b − x)θ(x − y) .

(b) From your answer to part (a), determine the spectrum of H.


Solution: We first remark that G is regular at ρ = 0, hence at
z = 0. The poles appear at:

2mz(b − a) = kπ, k = ±1, ±2, . . . (10)

That is, the eigenvalues of H are:

π2k2
ωk = , k = 1, 2, . . . (11)
2m(b − a)2

(c) Notice that, using



X φk (x)φ∗k (x)
G(x, y; z) = , (12)
k=1 ωk − z

the normalized eigenstate, φk (x), can be obtained by evaluating


the residue of G at the pole z = ωk . Do this calculation, and check
that your result is properly normalized.
Please do not look at this solution until you have turned
in problem set number 9

2
Solution: We need to evaluate the residue at the pole z = ωk :
−φk (x)φ∗k (y) = lim (z − ωk )G(x, y; z)
z→ωk
(13)
2m
= lim (z − ωk ) (14)
z→ω k ρ sin ρ(b − a)
h
× sin ρ(x − a) sin ρ(b − y)θ(y − x)
i
+ sin ρ(y − a) sin ρ(b − x)θ(x − y) .
Hence, consider
π 2 k2
s
2m 1 (2m)2 (b − a) z − 2m(b−a)2
lim (z − ω k) h√ i = lim h√ i
z→ωk z sin 2mz(b − a) kπ z→ωk
sin 2mz(b − a)
2
= (−1)k . (15)
b−a
Thus,
πk πk
sin (b − y) = sin (b − a + a − y) (16)
b−a b−a
a−y y−a
= sin πk cos πk − sin πk cos πk
b−a b−a
y−1
= (−)k+1 sin πk . (17)
b−a

2 h πk πk
φk (x)φ∗k (y) = (−)k+1 sin (x − a) sin (b − y)θ(y − x) (18)
b−a b−a b−a
πk πk i
+ sin (y − a) sin (b − x)θ(x − y) .
b−a b−a
2 h πk πk
= (−)k+1 sin (x − a) sin (b − y)θ(y − x) (19)
b−a b−a b−a
πk πk i
+ sin (y − a) sin (b − x)θ(x − y) .
b−a b−a
2 h πk πk
= (−)k+1 sin (x − a) sin (y − a)θ(y − x)(−)k+1
b−a b−a b−a
πk πk i
+ sin (y − a) sin (x − a)θ(x − y)(−)k+1 .
b−a b−a
2 πk πk
= sin (x − a) sin (y − a). (20)
b−a b−a b−a
Finally, s
2 πk
φk (x) = sin (x − a). (21)
b−a b−a

3
The normalization may be checked by:
Z b 2 Zb πk
dx|φk (x)|2 = dx| sin (x − a)|2
a b−a a b−a
Z 1
= 2 dy sin2 πky
0
= 1. (22)

(d) Consider the limit a → −∞, b → ∞. Show, in this limit that


G(x, y; z) tends to the Green’s function we obtain in this note for
this Hamiltonian on x ∈ (−∞, ∞):
m iρ|x−y|
r
G(x, y; z) = i e . (23)
2z
Solution: We wish to take the limit a → −∞, b → ∞ on the
Green’s function:
2m
G(x, y; z) = (24)
ρ sin ρ(b − a)
h
× sin ρ(x − a) sin ρ(b − y)θ(y − x)
i
+ sin ρ(y − a) sin ρ(b − x)θ(x − y) .

To do this, consider,
1 h iρ(x−a) i
sin ρ(x − a) = e − e−iρ(x−a) . (25)
2i
Let =ρ > 0. Then:
1 −iρ(x−a)
sin ρ(x − a) → − e , −a → ∞; (26)
2i
1
sin ρ(b − a) → − e−iρ(b−a) , −a → ∞, b → ∞; (27)
2i
1
sin ρ(y − a) → − e−iρ(y−a) , −a → ∞; (28)
2i
1
sin ρ(b − x) → − e−iρ(b−x) , b → ∞; (29)
2i
1
sin ρ(b − y) → − e−iρ(b−y) , b → ∞. (30)
2i
Hence,

2m e−iρ(b−a)
G(x, y; z) =
ρ −2i

4
h i
× θ(y − x)e−iρ(x−a) e−iρ(b−y) + θ(x − y)e−iρ(b−x) e−iρ(y−a)
mh
r i
= i θ(y − x)eiρ(y−x) + θ(x − y)eiρ(x−y)
r 2z
m iρ|y−x|
= i e , (31)
2z
as hoped.

5
Physics 125
Course Notes
Angular Momentum
040429 F. Porter

1 Introduction
This note constitutes a discussion of angular momentum in quantum mechan-
ics. Several results are obtained, important to understanding how to solve
problems involving rotational symmetries in quantum mechanics. We will of-
ten give only partial proofs to the theorems, with the intent that the reader
complete them as necessary. In the hopes that this will prove to be a useful
reference, the discussion is rather more extensive than usually encountered
in quantum mechanics textbooks.

2 Rotations: Conventions and Parameteriza-


tions
A rotation by angle θ about an axis e (passing through the origin in R3 )
is denoted by Re (θ). We’ll denote an abstract rotation simply as R. It is
considered to be “positive” if it is performed in a clockwise sense as we look
along e. As with other transformations, our convention is that we think of a
rotation as a transformation of the state of a physical system, and not as a
change of coordinate system (sometimes referred to as the “active view”). If
Re (θ) acts on a configuration of points (“system”) we obtain a new, rotated,
configuration: If x is a point of the old configuration, and x0 is its image
under the rotation, then:
x0 = Re (θ)x. (1)
That is, R is a linear transformation described by a 3×3 real matrix, relative
to a basis (e1 , e2 , e3 ).
Geometrically, we may observe that:
Re (−θ) = Re−1 (θ), (2)
I = Re (0) = Re (2πn), n = integer, (3)
0
Re (θ)Re (θ ) = Re (θ + θ0 ), (4)
Re (θ + 2πn) = Re (θ), n = integer, (5)
R−e = Re (−θ). (6)
1
A product of the form Re (θ)Re0 (θ 0 ) means “first do Re0 (θ 0 ), then do Re (θ)
to the result”. All of these identities may be understood in terms of ma-
trix identities, in addition to geometrically. Note further that the set of all
rotations about a fixed axis e forms a one-parameter abelian group.
It is useful to include in our discussion of rotations the notion also of
reflections: We’ll denote the space reflection with respect to the origin by P ,
for parity:
P x = −x. (7)
Reflection in a plane through the origin is called a mirroring. Let e be a unit
normal to the mirror plane. Then
Me x = x − 2e(e · x), (8)
since the component of the vector in the plane remains the same, and the
normal component is reversed.
Theorem: 1.
[P, Re (θ)] = 0. (9)
[P, Me ] = 0. (10)
(The proof of this is trivial, since P = −I.)
2.
P Me = Me P = Re (π). (11)
3. P , Me , and Re (π) are “involutions”, that is:
P 2 = I. (12)
Me2 = I. (13)
[Re (π)]2 = I. (14)

Theorem: Let Re (θ) be a rotation and let e0 , e00 be two unit vectors per-
pendicular to unit vector e such that e00 is obtained from e0 according
to:
e00 = Re (θ/2)e0 . (15)
Then
Re (θ) = Me00 Me0 = Re00 (π)Re0 (π). (16)
Hence, every rotation is a product of two mirrorings, and also a product
of two rotations by π.

2
M e''

e'

θ /2 M e'
e''
e
x
a
a+b= θ /2
θ/2
b
θ
Re ( θ ) x

Figure 1: Proof of the theorem that a rotation about e by angle θ is equivalent


to the product of two mirrorings or rotations by π.

Proof: We make a graphical proof, referring to Fig. 1.

Theorem: Consider the spherical triangle in Fig. 2.

1. We have
Re3 (2α3 )Re2 (2α2 )Re1 (2α1 ) = I, (17)
where the unit vectors {ei } are as labelled in the figure.
2. Hence, the product of two rotations is a rotation:

Re2 (2α2 )Re1 (2α1 ) = Re3 (−2α3 ). (18)

The set of all rotations is a group, where group multiplication is


application of successive rotations.

Proof: Use the figure and label M1 the mirror plane spanned by e2 , e3 , etc.
Then we have:

Re1 (2α1 ) = M3 M2
Re2 (2α2 ) = M1 M3
Re3 (2α3 ) = M2 M1 . (19)

3
.e3

Μ1
α
Μ
2
α

. α
e
2
Μ3 . e
1

Figure 2: Illustration for theorem. Spherical triangle vertices are defined as


the intersections of unit vectors e1 , e2 , e3 on the surface of the unit sphere.

Thus,

Re3 (2α3 )Re2 (2α2 )Re1 (2α1 ) = (M2 M1 )(M1 M3 )(M3 M2 ) = I. (20)

The combination of two rotations may thus be expressed as a problem


in spherical trigonometry.

As a corollary to this theorem, we have the following generalization of


the addition formula for tangents:

Theorem: If Re (θ) = Re00 (θ 00 )Re0 (θ 0 ), and defining:

τ = e tan θ/2
τ 0 = e0 tan θ0 /2
τ 00 = e00 tan θ00 /2, (21)

then
τ 0 + τ 00 + τ 00 × τ 0
τ = . (22)
1 − τ 0 · τ 00
This will be left as an exercise for the reader to prove.

4
Theorem: The most general mapping x → x0 of R3 into itself, such that
the origin is mapped into the origin, and such that all distances are
preserved, is a linear, real orthogonal transformation Q:

x0 = Qx, where QT Q = I, and Q∗ = Q. (23)

Hence,
x0 · y0 = x · y ∀ points x, y ∈ R3 . (24)
For such a mapping, either:

1. det(Q) = 1, Q is called a proper orthogonal transformation, and


is in fact a rotation. In this case,

x0 × y0 = (x × y)0 ∀ points x, y ∈ R3 . (25)

or,
2. det(Q) = −1, Q is called an improper orthogonal transforma-
tion, and is the product of a reflection (parity) and a rotation. In
this case,

x0 × y0 = −(x × y)0 ∀ points x, y ∈ R3 . (26)

The set of all orthogonal transformations on three dimensions forms a


group (denoted O(3)), and the set of all proper orthogonal transforma-
tions forms a subgroup (O+ (3) or SO(3) of O(3)), in 1 : 1 correspon-
dence with, hence a “representation” of, the set of all rotations.

Proof of this theorem will be left to the reader.

3 Some Useful Representations of Rotations


Theorem: We have the following representations of rotations (u is a unit
vector):

Ru (θ)x = uu · x + [x − uu · x] cos θ + u × x sin θ, (27)

and
Ru (θ) = eθu·J = I + (u · J )2 (1 − cos θ) + u · J sin θ, (28)

5
where J = (J1 , J2 , J3 ) with:
     
0 0 0 0 0 1 0 −1 0
     
J1 =  0 0 −1  , J2 =  0 0 0  , J3 =  1 0 0  .
0 1 0 −1 0 0 0 0 0
(29)

Proof: The first relation may be seen by geometric inspection: It is a de-


composition of the rotated vector into components along the axis of
rotation, and the two orthogonal directions perpendicular to the axis
of rotation.
The second relation may be demonstrated by noticing that Ji x = ei ×x,
where e1 , e2 , e3 are the three basis unit vectors. Thus,

(u · J )x = u × x, (30)

and
(u · J )2 x = u × (u × x) = u(u · x) − x. (31)
Further,

(u · J )2n+m = (−)n (u · J )m , n = 1, 2, . . . ; m = 1, 2. (32)

The second relation then follows from the first.

Note that
h i
Tr [Ru (θ)] = Tr I + (u · J )2 (1 − cos θ) . (33)

With
     
0 0 0 −1 0 0 −1 0 0
     
J12 =  0 −1 0  , J22 =  0 0 0  , J32 =  0 −1 0  , (34)
0 0 −1 0 0 −1 0 0 0

we have
Tr [Ru (θ)] = 3 − 2(1 − cos θ) = 1 + 2 cos θ. (35)
This is in agreement with the eigenvalues of Ru (θ) being 1, eiθ , e−iθ .

6
Theorem: (Euler parameterization) Let R ∈ O + (3). Then R can be
represented in the form:

R = R(ψ, θ, φ) = Re3 (ψ)Re2 (θ)Re3 (φ), (36)

where the Euler angles ψ, θ, φ can be restricted to the ranges:

0 ≤ ψ < 2π; 0 ≤ θ ≤ π; 0 ≤ φ < 2π. (37)

With these restrictions, the parameterization is unique, unless R is a


rotation about e3 , in which case Re3 (α) = R(α − β, 0, β) for any β.

Proof: We refer to Fig. 3 to guide us. Let e0k = Rek , k = 1, 2, 3, noting


that it is sufficient to consider the transformation of three orthogonal
unit vectors, which we might as well take to be intitially along the basis
directions. We must show that we can orient e0k in any desired direction
in order to prove that a general rotation can be described as asserted
in the theorem.
e
3
e'
3

θ
e'
2
e
1
ψ φ
ψ

θ e
2
e'
1

Figure 3: Illustration for visualizing the Euler angle theorem. It may be


useful to think of the figure as the action of R on a “rigid body”, a unit disk
with a unit normal attached, and two orthogonal unit vectors attached in the
plane of the disk. The original and final positions of the disk and attached
unit vectors are shown.

7
We note that e03 does not depend on φ since this first rotation is about
e3 itself. The polar angles of e03 are given precisely by θ and ψ. Hence
θ and ψ are uniquely determined (within the specified ranges, and
up to the ambiguous case mentioned in the theorem) by e03 = Re3 ,
which can be specified to any desired orientation. The angle φ is then
determined uniquely by the orientation of the pair (e01 , e02 ) in the plane
perpendicular to e03 .

We note that the rotation group [O + (3)] is a group of infinite order (or, is
an “infinite group”, for short). There are also an infinite number of subgroups
of O+ (3), including both finite and infinite subgroups. Some of the important
finite subgroups may be classified as:
1. The Dihedral groups, Dn , corresponding to the proper symmetries of
an n-gonal prism. For example, D6 ⊂ O + (3) is the group of rotations
which leaves a hexagonal prism invariant. This is a group of order 12,
generated by rotations Re3 (2π/6) and Re2 (π).
2. The symmetry groups of the regular solids:
(a) The symmetry group of the tetrahedron.
(b) The symmetry group of the octahedron, or its “dual” (replace
vertices by faces, faces by vertices) the cube.
(c) The symmetry group of the icosahedron, or its dual, the dodeca-
hedron.
We note that the tetrahedron is self-dual.
An example of an infinite subgroup of O+ (3) is D∞ , the set of all rotations
which leaves a circular disk invariant, that is, including all rotations about
the normal to the disk, and rotations by π about any axis in the plane of the
disk.

4 Special Unitary Groups


The set of all n × n unitary matrices forms a group (under normal matrix
multiplication), denoted by U (n). U (n) includes as a subgroup, the set of
all n × n unitary matrices with determinant equal to 1 (“unimodular”, or
“special”. This subgroup is denoted by SU (n), for “Special Unitary” group.

8
e
3

e
1
Figure 4: A hexagonal prism, to illustrate the group D6 .

The group of 2 × 2 unimodular unitary matrices, SU (2), has a special


connection with O+ (3), which is very important in quantum mechanics. Con-
sider the real vector space of all 2 × 2 traceless hermitian matrices, which
we denote by V3 . The “3” refers to the fact that this is a three-dimensional
vector space (even though it consists of 2 × 2 matrices). Hence, it can be put
into 1 : 1 correspondence with Euclidean 3-space, R3 . We may make this cor-
respondence an isometry by introducing a positive-definite symmetric scalar
product on V3 :
1
(X, Y ) = Tr(XY ), ∀X, Y ∈ V3 . (38)
2
Let u be any matrix in SU (2): u−1 = u† and det(u) = 1. Consider the
mapping:
X → X 0 = uXu†. (39)
We demonstrate that this is a linear mapping of V3 into itself: If X is her-

9
mitian, so is X 0 . If X is traceless then so is X 0 :

Tr(X 0 ) = Tr(uXu†) = Tr(Xuu†) = Tr(X). (40)

This mapping of V3 into itself also preserves the norms, and hence, the scalar
products:
1
(X 0 , X 0 ) = Tr(X 0 X 0 )
2
1
= Tr(uXu†uXu†)
2
= (X, X). (41)

The mapping is therefore a rotation acting on V3 and we find that to every


element of SU (2) there corresponds a rotation.
Let us make this notion of a connection more explicit, by picking an
orthonormal basis (σ1 , σ2 , σ3 ) of V3 , in the form of the Pauli matrices:
     
0 1 0 −i 1 0
σ1 = , σ2 = , σ3 = . (42)
1 0 i 0 0 −1

Note that the Pauli matrices form an orthonormal basis:


1
Tr(σα σβ ) = δαβ . (43)
2
We have the products:

σ1 σ2 = iσ3 , σ2 σ3 = iσ1 , σ3 σ1 = iσ2 . (44)

Different Pauli matrices anti-commute:

{σα , σβ } ≡ σα σβ + σβ σα = 2δαβ I (45)

The commutation relations are:

[σα , σβ ] = 2iαβγ σγ . (46)

Any element of V3 may be written in the form:


3
X
X = x·σ = xi σi , (47)
i=1

10
where x ∈ R3 . This establishes a 1 : 1 correspondence between elements of
V3 and R3 . We note that
1
Tr [(a · σ )(b · σ )] = a · b, (48)
2
and
3
X 3 X
X
(a · σ )(b · σ ) = ai bi σi2 + ai bj σi σj
i=1 i=1 j6=i
= (a · b)I + i(a × b) · σ . (49)

Finally, we may see that the mapping is isometric:


1 1
(X, Y ) = Tr(XY ) = Tr [(x · σ )(y · σ )] = x · y. (50)
2 2
Let’s investigate SU (2) further, and see the relevance of V3 : Every unitary
matrix can be expressed as the exponential of a skew-hermitian (A† = −A)
matrix. If the unitary matrix is also unimodular, then the skew-hermitian
matrix can be selected to be traceless. Hence, every u ∈ SU (2) is of the form
u = e−iH , where H = H † and Tr(H) = 0. For every real unit vector e and
every real θ, we define ue (θ) ∈ SU (2) by:
 
i
u(θ) = exp − θe · σ . (51)
2
Any element of SU (2) can be expressed in this form, since every traceless
hermitian matrix is a (real) linear combination of the Pauli matrices.
Now let us relate ue (θ) to the rotation Re (θ):

Theorem: Let x ∈ R3 , and X = x · σ . Let


 
i
u(θ) = exp − θe · σ , (52)
2
and let
ue (θ)Xu†e (θ) = X 0 = x0 · σ . (53)
Then
x0 = Re (θ)x. (54)

11
Proof: Note that
 
i θ θ
ue (θ) = exp − θe · σ = I cos − i(e · σ ) sin . (55)
2 2 2
This may be demonstrated by using the identity (a · σ )(b · σ ) = (a ·
b)I + (a × b) · σ , and letting a = b = e to get (e · σ)2 = I, and using
this to sum the exponential series.
Thus,
x0 · σ = ue (θ)Xu†e (θ)
" # " #
θ θ θ θ
= I cos − i(e · σ ) sin (x · σ ) I cos + i(e · σ ) sin
2 2 2 2
θ θ
= x · σ cos2 + (e · σ )(x · σ )(e · σ ) sin2
2 2
θ θ
+ [−i(e · σ )(x · σ ) + i(x · σ )(e · σ )] sin cos . (56)
2 2
But
(e · σ )(x · σ ) = (e · x)I + i(e × x) · σ (57)
(x · σ )(e · σ ) = (e · x)I − i(e × x) · σ (58)
(e · σ )(x · σ )(e · σ ) = (e · x)(e · σ) + i [(e × x) · σ ] (e · σ )
= (e · x)(e · σ ) + i2 [(e × x) × e] · σ
= 2(e · x)(e · σ ) − x · σ , (59)
where we have made use of the identity (C × B) × A = B(A · C) −
C(A · B) to obtain (e × x) × e = x − e(e · x). Hence,
( )
θ θ
x · σ = cos x + sin2 [2(e · x)e − x] · σ
0 2
2 2
θ θ
+i sin cos [−2i(e × x) · σ ] . (60)
2 2
Equating coefficients of σ we obtain:
θ θ θ θ
x0 = x cos2 + [2(e · x)e − x] sin2 + 2(e × x) sin cos
2 2 2 2
= (e · x)e + [x − (e · x)e] cos θ + (e × x) sin θ
= Re (θ)x. (61)

12
Thus, we have shown that to every rotation Re (θ) corresponds at least
one u ∈ SU (2), and also to every element of SU (2) there corresponds a
rotation. We may restate the theorem just proved in the alternative form:

uXu† = u(x · σ )u† = x · (uσ


σ u† )
h i
= x · σ = [Re (θ)x] · σ = x · Re−1 (θ)σ
σ . (62)

But x is arbitrary, so
σu† = Re−1 (θ)σ
uσ σ, (63)
or,
u−1σ u = Re (θ)σ
σ. (64)
More explicitly, this means:
3
X
u−1 σi u = Re (θ)ij σj . (65)
j=1

There remains the question of uniqueness: Suppose u1 ∈ SU (2) and


u2 ∈ SU (2) are such that

u1 Xu†1 = u2 Xu†2 , ∀X ∈ V3 . (66)

Then u−1
2 u1 commutes with every X ∈ V3 and therefore this matrix must
be a multiple of the identity (left for the reader to prove). Since it is uni-
tary and unimodular, it must equal I or −I. Thus, there is a two-to-one
correspondence between SU (2) and O + (3): To every rotation Re (θ) corre-
sponds the pair ue (θ) and −ue (θ) = ue (θ + 2π). Such a mapping of SU (2)
onto O + (3) is called a homomorphism (alternatively called an unfaithful
representation).
We make this correspondence precise in the following:
Theorem: 1. There is a two-to-one correspondence between SU (2) and
O+ (3) under the mapping:
1
u → R(u), where Rij (u) = Tr(u† σi uσj ), (67)
2
and the rotation Re (θ) corresponds to the pair:

Re (θ) ↔ {ue (θ), −ue (θ) = ue (θ + 2π)}. (68)

13
2. In particular, the pair of elements {I, −I} ⊂ SU (2) maps to I ∈
O + (3).
3. This mapping is a homomorphism: u → R(u) is a representation
of SU (2), such that

R(u0 u00 ) = R(u0 )R(u00 ), ∀u0 , u00 ∈ SU (2). (69)

That is, the “multiplication table” is preserved under the map-


ping.

Proof: 1. We have
3
X
u−1 σi u = Re (θ)ij σj . (70)
j=1

Multiply by σk and take the trace:


 
3
X
Tr(u−1 σi uσk ) = Tr  Re (θ)ij σj σk  , (71)
j=1

or
3
X

Tr(u σi uσk ) = Re (θ)ij Tr(σj σk ). (72)
j=1

But 12 Tr(σj σk ) = δjk , hence


1
Rik (u) = Re (θ)ik = Tr(u† σi uσk ). (73)
2

Proof of the remaining statements is left to the reader.

A couple of comments may be helpful here:


1. Why did we restrict u to be unimodular? That is, why are we consider-
ing SU (2), and not U (2). In fact, we could have considered U (2), but
the larger group only adds unnecessary complication. All U (2) adds is
multiplication by an overall phase factor, and this has no effect in the
transformation:
X → X 0 = uXu†. (74)
This would enlarge the two-to-one mapping to infinity-to-one, appar-
ently without achieving anything of interest. So, we keep things as
simple as we can make them.

14
2. Having said that, can we make things even simpler? That is, can we
impose additional restrictions to eliminate the “double-valuedness” in
the above theorem? The answer is no – SU (2) has no subgroup which
is isomorphic with O + 3.

5 Lie Groups: O+(3) and SU (2)


Def: An abstract n-dimensional Lie algebra is an n-dimensional vector
space V on which is defined the notion of a product of two vectors (∗)
with the properties (x, y, z ∈ V, c a complex number):

1. Closure: x ∗ y ∈ V.
2. Distributivity:

x ∗ (y + z) = x ∗ y + x ∗ z (75)
(y + z) ∗ x = y ∗ x + z ∗ x. (76)

3. Associativity with respect to multiplication by a complex number:

(cx) ∗ y = c(x ∗ y). (77)

4. Anti-commutativity:
x ∗ y = −y ∗ x (78)
5. Non-associative (“Jacobi identity”):

x ∗ (y ∗ z) + z ∗ (x ∗ y) + y ∗ (z ∗ x) = 0 (79)

We are especially interested here in Lie algebras realized in terms of matri-


ces (in fact, every finite-dimensional Lie algebra has a faithful representation
in terms of finite-dimensional matrices):

Def: A Lie algebra of matrices is a vector space M of matrices which is


closed under the operation of forming the commutator:

[M 0 , M 00 ] = M 0 M 00 − M 00 M 0 ∈ M, ∀M 0 , M 00 ∈ M. (80)

Thus, the Lie product is the commutator: M 0 ∗ M 00 = [M 0 , M 00 ]. The


vector space may be over the real or complex fields.

15
Let’s look at a couple of relevant examples:
1. The set of all real skew-symmetric 3 × 3 matrices is a three-dimensional
Lie algebra. Any such matrix is a real linear combination of the matri-
ces
     
0 0 0 0 0 1 0 −1 0
     
J1 =  0 0 −1  , J2 =  0 0 0  , J3 =  1 0 0 
0 1 0 −1 0 0 0 0 0
(81)
as defined already earlier. The basis vectors satisfy the commutation
relations:
[Ji , Jj ] = ijk Jk . (82)
We say that this Lie algebra is the Lie algebra associated with the
group O + (3). Recall that

Ru (θ) = eθu·J . (83)

2. The set of all 2×2 skew-hermitian matrices is a Lie algebra of matrices.


This is also 3-dimensional, and if we write:
i
S j = σj , j = 1, 2, 3, (84)
2
we find {S} satisfy the “same” commutation relations as {J }:

[Si , Sj ] = ijk Sk . (85)

This is the Lie algebra associated with the group SU (2). Recall that

ue (θ) = eθe·(− 2 σ ) = eθe·S .


i
(86)

This is also a real Lie algebra, i.e., a vector space over the real field,
even though the matrices are not in general real.
We see that the Lie algebras of O + (3) and SU (2) have the same “struc-
ture”, i.e., a 1 : 1 correspondence can be established between them which is
linear and preserves all commutators. As Lie algrebras, the two are isomor-
phic.
We explore a bit more the connection between Lie algebras and Lie
groups. Let M be an n-dimensional Lie algebra of matrices. Associated

16
with M there is an n-dimensional Lie group G of matrices: G is the matrix
group generated by all matrices of the form eX , where X ∈ M. We see
that O + (3) and SU (2) are Lie groups of this kind – in fact, every element
of either of these groups corresponds to an exponential of an element of the
appropriate Lie algrebra.1

6 Continuity Structure
As a finite dimensional vector space, M has a continuity structure in the
usual sense (i.e., it is a topological space with the “usual” topology). This
induces a continuity structure (topology) on G (for O+ (3) and SU (2), there
is nothing mysterious about this, but we’ll keep our discussion a bit more
general for a while). G is an n-dimensional manifold (a topological space
such that every point has a neighborhood which can be mapped homeo-
morphically onto n-dimensional Euclidean space). The structure of G (its
multiplication table) in some neighborhood of the identity is uniquely deter-
mined by the structure of the Lie algebra M. This statement follows from
the Campbell-Baker-Hausdorff theorem for matrices: If matrices X, Y are
sufficiently “small”, then eX eY = eZ , where Z is a matrix in the Lie algebra
generated by matrices X and Y . That is, Z is a series of repeated commu-
tators of the matrices X and Y . Thus, we have the notion that the local
structure of G is determined solely by the structure of M as a Lie algebra.
We saw that the Lie algebras of O+ (3) and SU (2) are isomorphic, hence
the group O+ (3) is locally isomorphic with SU (2). Note, on the other hand,
that the properties

(u · J )3 = −(u · J ) and (u · J )2n+m = (−)n (u · J )m , (87)

for all positive integers n, m, are not shared by the Pauli matrices, which
instead satisfy:
(u · σ)3 = u · σ. (88)
Such algebraic properties are outside the realm of Lie algebras (the products
being taken are not Lie products). We also see that (as with O+ (3) and
1
This latter fact is not a general feature of Lie groups: To say that G is generated
by matrices of the form eX means that G is the intersection of all matrix groups which
contain all matrices eX where X ∈ M. An element of G may not be of the form eX .

17
SU (2)) it is possible for two Lie algebras to have the same local structure,
while not being globally isomorphic.
A theorem describing this general situation is the following:

Theorem: (and definition) To every Lie algebra M corresponds a unique


simply-connected Lie group, called the Universal Covering Group,
defined by M. Denote this group by GU . Every other Lie group with
a Lie algebra isomorphic with M is then isomorphic with the quotient
group of GU relative to some discrete (central – all elements which
map to the identity) subgroup of GU iself. If the other group is simply
connected, it is isomorphic with GU itself.

We apply this to rotations: The group SU (2) is the universal covering


group defined by the Lie algebra of the rotation group, hence SU (2) takes
on special significance. We note that SU (2) can be parameterized as the
surface of a unit sphere in four dimensions, hence is simply connected (all
closed loops may be continuously collapsed to a point). On the other hand,
O + (3) is isomorphic with the quotient group SU (2)/I(2), where I(2) is the
inversion group in two dimensions:
   
1 0 −1 0
I(2) = , . (89)
0 1 0 −1

7 The Haar Integral


We shall find it desirable to have the ability to perform an “invariant inte-
gral” on the manifolds O+ (3) and SU (2), which in some sense assigns an
equal “weight” to every element of the group. The goal is to find a way of
democratically “averaging” over the elements of a group. For a finite group,
the correspondence is to a sum over the group elements, with the same weight
for each element.
For the rotation group, let us denote the desired “volume element” by
d(R). We must find an expression for d(R) in terms of the parameterization,
for some parameterization of O + (3). For example, we consider the Euler
angle parameterization. Recall, in terms of Euler angles the representation
of a rotation as:

R = R(ψ, θ, φ) = Re3 (ψ)Re2 (θ)Re3 (φ). (90)

18
We will argue that the appropriate volume element must be of the form:

d(R) = Kdψ sin θdθdφ, K > 0. (91)

The argument for this form is as follows: We have a 1 : 1 correspondence


between elements of O+ (3) and orientations of a rigid body (such as a sphere
with a dot at the north pole, centered at the origin; let I ∈ O+ (3) correspond
to the orientation with the north pole on the +e3 axis, and the meridian along
the +e2 axis, say). We want to find a way to average over all positions of the
sphere, with each orientation receiving the same weight. This corresponds
to a uniform averaging over the sphere of the location of the north pole.
Now notice that if R(ψ, θ, φ) acts on the reference position, we obtain
an orientation where the north pole has polar angles (θ, ψ). Thus, the (θ, ψ)
dependence of d(R) must be dψ sin θdθ. For fixed (θ, ψ), the angle φ describes
a rotation of the sphere about the north-south axis – the invariant integral
must correspond to a uniform averaging over this angle. Hence, we intuitively
arrive at the above form for d(R). The constant K > 0 is arbitrary; we pick
it for convenience. We shall choose K so that the integral over the entire
group is one:
Z Z 2π Z π Z 2π
1
1= d(R) = 2 dψ sin θdθ dφ. (92)
+
O (3) 8π 0 0 0

We can thus evaluate the integral of a (suitably behaved) function f (R) =


f (ψ, θ, φ) over O + (3):
Z Z 2π Z π Z 2π
1
f (R) = f (R)d(R) = 2 dψ sin θdθ dφf (ψ, θ, φ). (93)
O + (3) 8π 0 0 0

The overbar notation is intended to suggest an average.


What about the invariant integral over SU (2)? Given the answer for
O + (3), we can obtain the result for SU (2) using the connection between the
two groups. First, parameterize SU (2) by the Euler angles:
     
i i i
u(ψ, θ, φ) = exp − ψσ3 exp − θσ2 exp − φσ3 , (94)
2 2 2
with
0 ≤ ψ < 2π; 0 ≤ θ ≤ π; 0 ≤ φ < 4π. (95)
Notice that the ranges are the same as for O + (3), except for the doubled
range required for φ. With these ranges, we obtain every element of SU (2),

19
uniquely, up to a set of measure 0 (when θ = 0, π). The integral of function
g(u) on SU (2) is thus:
Z Z 2π Z π Z 4π
1
g(u) = g(u)d(u) = dψ sin θdθ dφg [u(ψ, θ, φ)] , (96)
SU (2) 16π 2 0 0 0

with the volume element normalized to give unit total volume:


Z
d(u) = 1. (97)
SU (2)

A more precise mathematical treatment is possible, making use of mea-


sure theory; we’ll mention some highlights here. The goal is to define a
measure µ(S) for suitable subsets of O+ (3) (or SU (2)) such that if R0 is any
element of O+ (3), then:

µ(SR0 ) = µ(S), where SR0 = {RR0 |R ∈ S} . (98)

Intuitively, we think of the following picture: S may be some region in O+ (3),


and SR0 is the image of S under the mapping R → RR0 . The idea then, is
that the regions S and SR0 should have the same “volume” for all R ∈ O+ (3).
Associated with such a measure we have an integral.

SR 0
R0

Figure 5: Set S mapping to set SR0 , under the rotation R0 .

20
It may be shown2 that a measure with the desired property exists and is
unique, up to a factor, for any finite-dimensional Lie group. Such a measure
is called a Haar measure. The actual construction of such a measure must
deal with coordinate system issues. For example, there may not be a good
global coordinate system on the group, forcing the consideration of local
coordinate systems.
Fortunately, we have already used our intuition to obtain the measure
(volume element) for the Euler angle parameterization, and a rigorous treat-
ment would show it to be correct. The volume element in other parameter-
izations may be found from this one by suitable Jacobian calculations. For
example, if we parameterize O+ (3) by:

Re (θ) = eθ ·J , (99)

where θ ≡ θe, and |θθ| ≤ π, then the volume element (normalized again to
unit total volume of the group) is:
1 1 − cos θ 3
d(R) = d (θθ), (100)
4π 2 θ2
where d3 (θθ) is an ordinary volume element on R3 . Thus, the group-averaged
value of f (R) is:
Z Z  
1 1 − cos θ 3
f (R)d(R) = 2
d (θθ)f eθ ·J (101)
O + (3) 4π O+(3) θ2
Z Z 2π  
1 θ ·J
= dΩe (1 − cos θ)dθf e . (102)
4π 2 4π 0

Alternatively, we may substitute 1 − cos θ = 2 sin2 2θ . For SU (2) we have the


corresponding result:
1 θ
d(u) = 2
dΩe sin2 dθ, 0 ≤ θ ≤ 2π. (103)
4π 2
We state without pro0f that the Haar measure is both left- and right-
invariant. That is, µ(S) = µ(SR0 ) = µ(R0 S) for all R0 ∈ O+ (3) and for all
measurable sets S ⊂ O + (3). This is to be hoped for on “physical” grounds.
The invariant integral is known as the Haar integral, or its particular real-
ization for the rotation group as the Hurwitz integral.
2
This may be shown by construction, starting with a small neighborhood of the identity,
and using the desired property to transfer the right volume element everywhere.

21
8 Unitary Representations of SU (2) (and O+(3))
A unitary representation of SU (2) is a mapping

u ∈ SU (2) → U (u) ∈ U (n) such that U (u0 )U (u00 ) = U (u0 u00 ), ∀u0 , u00 ∈ SU (2).
(104)
That is, we “represent” the elements of SU (2) by unitary matrices (not
necessarily 2 × 2), such that the multiplication table is preserved, either
homomorphically or isomorphically. We are very interested in such mappings,
because they permit the study of systems of arbitrary angular momenta,
as well as providing the framework for adding angular momenta, and for
studying the angular symmetry properties of operators. We note that, for
every unitary representation R → T (R) of O+ (3) there corresponds a unitary
representation of SU (2): u → U (u) = T [R(u)]. Thus, we focus our attention
on SU (2), without losing any generality.
For a physical theory, it seems reasonable to to demand some sort of
continuity structure. That is, whenever two rotations are near each other,
the representations for them must also be close.
Def: A unitary representation U (u) is called weakly continuous if, for any
two vectors φ, ψ, and any u:

lim hφ| [U (u0 ) − U (u)] ψi = 0. (105)


u0 →u

In this case, we write:

w-lim
0
U (u0 ) = U (u), (106)
u →u

and refer to it as the “weak-limit”.

Def: A unitary representation U (u) is called strongly continuous if, for


any vector φ and any u:

lim k [U (u0 ) − U (u)] φk = 0. (107)


u0 →u

In this case, we write:

s-lim
0
U (u0 ) = U (u), (108)
u →u

and refer to it as the “strong-limit”.

22
Strong continuity implies weak continuity, since:
|hφ| [U (u0 ) − U (u)] ψi| ≤ kφkk [U (u0 ) − U (u)] ψk (109)
We henceforth (i.e., until experiment contradicts us) adopt these notions of
continuity as physical requirements.
An important concept in representation theory is that of “(ir)reducibility”:
Def: A unitary representation U (u) is called irreducible if no subspace of
the Hilbert space is mapped into itself by every U (u). Otherwise, the
representation is said to be reducible.
Irreducible representations are discussed so frequently that the jargon “ir-
rep” has emerged as a common substitute for the somewhat lengthy “irre-
ducible representation”.
Lemma: A unitary representation U (u) is irreducible if and only if every
bounded operator Q which commutes with every U (u) is a multiple of
the identity.
Proof of this will be left as an exercise. Now for one of our key theorems:
Theorem: If u → U (u) is a strongly continuous irreducible representation of
SU (2) on a Hilbert space H, then H has a finite number of dimensions.
Proof: The proof consists of showing that we can place a finite upper bound
on the number of mutually orthogonal vectors in H: Let E be any one-
dimensional projection operator, and φ, ψ be any two vectors in H.
Consider the integral:
Z
B(ψ, φ) = d(u)hψ|U (u)EU (u−1 )φi. (110)
SU (2)

This integral exists, since the integrand is continuous and bounded,


because U (u)EU (u−1 ) is a one-dimensional projection, hence of norm
1.
Now
Z


|B(ψ, φ)| = d(u)hψ|U (u)EU (u−1 )φi
SU (2)
Z
≤ d(u)|hψ|U (u)EU (u−1 )φi| (111)
SU (2)
Z
≤ d(u)kψkkφkkU (u)EU (u−1)k (112)
SU (2)
≤ kψkkφk, (113)

23
where
R
we have made use of the Schwarz inequality and of the fact
SU (2) d(u) = 1.
B(ψ, φ) is linear in φ, anti-linear in ψ, and hence defines a bounded
operator B0 such that:

B(ψ, φ) = hψ|B0 φi. (114)

Let u0 ∈ SU (2). Then


Z
hψ|U (u0 )B0 U (u−1
0 )φi = d(u)hψ|U (u0 u)EU ((u0 u)−1 )φi
(115)
SU (2)
Z
= d(u)hψ|U (u)EU (u−1 )φi (116)
SU (2)
= hψ|B0 φi, (117)

where the second line follows from the invariance of the Haar integral.
Since ψ and φ are arbitrary vectors, we thus have;

U (u0 )B0 = B0 U (u0 ), ∀u0 ∈ SU (2). (118)

Since B0 commutes with every element of an irreducible representation,


it must be a multiple of the identity, B0 = pI.
Z
d(u)U (u)EU (u−1 ) = pI. (119)
SU (2)

Now let {φn |n = 1, 2, . . . , N } be a set of N orthonormal vectors,


hφn |φm i = δnm , and take E = |φ1 ihφ1 |. Then,
Z
hφ1 | d(u)U (u)EU (u−1 )|φ1 i = phφ1 |I|φ1 i = p, (120)
SU (2)
Z
= d(u)hφ1 |U (u)|φ1 ihφ1 |U (u−1 )|φ1 i
SU (2)
Z
= d(u)|hφ1 |U (u)|φ1 i|2 > 0. (121)
SU (2)

Note that the integral cannot be zero, since the integrand is a contin-
uous non-negative definite function of u, and is equal to one for u = I.

24
Thus, we have:
N
X
pN = hφn |pIφn i (122)
n=1
XN Z
= d(u)hφn |U (u)EU (u−1 )φn i (123)
n=1 SU (2)
XN Z
= d(u)hφn |U (u)|φ1 ihφ1 |U (u−1 )φn i (124)
n=1 SU (2)
XN Z
= d(u)hU (u)φ1 |φn ihφn |U (u)φ1 i (125)
n=1 SU (2)
Z N
X
= d(u)hU (u)φ1 | |φn ihφn |U (u)φ1 i (126)
SU (2) n=1
Z
≤ d(u)hU (u)φ1 |I|U (u)φ1 i (127)
SU (2)
Z
≤ d(u)kU (u)φ1 k2 = 1. (128)
SU (2)
(129)

That is, pN ≤ 1. But p > 0, so N < ∞, and hence H cannot contain


an arbitrarily large number of mutually orthogonal vectors. In other
words, H is finite-dimensional.

Thus, we have the important result that if U (u) is irreducible, then the
operators U (u) are finite-dimensional unitary matrices. We will not have to
worry about delicate issues that might arise if the situation were otherwise.3
Before actually building representations, we would like to know whether
it is “sufficient” to consider only unitary representations of SU (2).
Def: Two (finite dimensional) representations U and W of a group are called
equivalent if and only if they are similar, that is, if there exists a
fixed similarity transformation S which maps one representation onto
the other:
U (u) = SW (u)S −1 , ∀u ∈ SU (2). (130)
Otherwise, the representations are said to be inequivalent.
3
The theorem actually holds for any compact Lie group, since a Haar integral normal-
ized to one exists.

25
Note that we can think of equivalence as just a basis transformation. The
desired theorem is:
Theorem: Any finite-dimensional (continuous) representation u → W (u) of
SU (2) is equivalent to a (continuous) unitary representation u → U (u)
of SU (2).
Proof: We prove this theorem by constructing the required similarity trans-
formation: Define matrix
Z
P = d(u)W † (u)W (u). (131)
SU (2)

This matrix is positive definite and Hermitian, since the integrand is.
Thus P has a unique positive-definite Hermitian square root S:

P = P † > 0 =⇒ P = S = S † > 0. (132)
Now, let u0 , u ∈ SU (2). We have,
h i
W † (u0 )W † (u)W (u) = W † (uu0 )W (uu0 ) W (u−1
0 ). (133)
From the invariance of the Haar integral, we find:
Z h i
W † (u0 )P = d(u) W † (uu0 )W (uu0 ) W (u−1
0 ) (134)
SU (2)

= P W (u−1
0 ), ∀u0 ∈ SU (2). (135)

Now define, for all u ∈ SU (2),


U (u) = SW (u)S −1 . (136)
The mapping u → U (u) defines a continuous representation of SU (2),
and furthermore:
h i† h i
U † (u)U (u) = SW (u)S −1 SW (u)S −1
 †
= S −1 W † (u)S † SW (u)S −1
 †
= S −1 P W †(u−1 )W (u)S −1
 †
= S −1 P S −1
 †
= S −1 S † SS −1
= I. (137)
That is, U (u) is a unitary representation, equivalent to W (u).

26
We have laid the fundamental groundwork: It is sufficient to determine all
unitary finite-dimensional irreducible representations of SU (2).
This brings us to some important “tool theorems” for working in group
representaion theory.
Theorem: Let u → D 0 (u) and u → D 00 (u) be two inequivalent irreducible
representations of SU (2). Then the matrix elements of D 0 (u) and D 00 (u)
satisfy: Z
0 00
d(u)Dmn (u)Drs (u) = 0. (138)
SU (2)

Proof: Note that the theorem can be thought of as a sort of orthogonality


property between matrix elements of inequivalent representations. Let
V 0 be the N 0 -dimensional carrier space of the representation D0 (u), and
let V 00 be the N 00 -dimensional carrier space of the representation D 00 (u).
Let A be any N 0 × N 00 matrix. Define another N 0 × N 00 matrix, A0 by:
Z
A0 ≡ d(u)D 0 (u−1 )AD 00 (u). (139)
SU (2)

Consider (in the sceond line, we use the invariance of the Haar integral
under the substitution u → uu0 ):
Z
0
D (u0 )A0 = d(u)D 0 (u0 u−1 )AD 00 (u)
SU (2)
Z
= d(u)D 0 (u−1 )AD 00 (uu0 )
SU (2)
Z
= d(u)D 0 (u−1 )AD 00 (u)D 00 (u0 )
SU (2)
= A0 D 00 (u0 ), ∀u0 ∈ SU (2). (140)

Now define N 0 × N 0 matrix B 0 and N 00 × N 00 matrix B 00 by:

B 0 ≡ A0 A†0 , B 00 ≡ A†0 A0 . (141)

Then we have:

D 0 (u)B 0 = D 0 (u)A0 A†0


= A0 D 00 (u)A†0
= A0 A†0 D 0 (u)
= B 0 D 0 (u), ∀u ∈ SU (2). (142)

27
Similarly,
D 00 (u)B 00 = B 00 D 00 (u), ∀u ∈ SU (2). (143)
Thus, B 0 , an operator on V 0 , commutes with all elements of irreducible
representation D0 , and is therefore a multiple of the identity operator
on V 0 : B 0 = b0 I 0 . Likewise, B 00 = b00 I 00 on V 00 .
If A0 6= 0, this can be possible only if N 0 = N 00 , and A0 is non-
singular. But if A0 is non-singular, then D 0 and D00 are equivalent,
since D 0 (u)A0 = A0 D 00 , ∀u ∈ SU (2). But this contradicts the as-
sumption in the theorem, hence A0 = 0. To complete the proof, select
Anr = 1 for any desired n, r and set all of the other elements equal to
zero.

Next, we quote the corresponding “orthonormality” theorem among ele-


ments of the same irreducible representation:
Theorem: Let u → D(u) be a (continuous) irreducible representation of
SU (2) on a carrier space of dimension d. Then
Z
d(u)Dmn (u−1 )Drs (u) = δms δnr /d. (144)
SU (2)

Proof: The proof of this theorem is similar to the preceding theorem. Let
A be an arbitrary d × d matrix, and define
Z
A0 ≡ d(u)D(u−1 )AD(u). (145)
SU (2)

As before, we may show that

D(u)A0 = A0 D(u), (146)

and hence A0 = aI is a multiple of the identity. We take the trace to


find the multiple:
"Z #
1
a = Tr d(u)D(u−1 )AD(u) (147)
d SU (2)
Z h i
1
= d(u)Tr D(u−1 )AD(u) (148)
d SU (2)
1
= TrA. (149)
d
28
This yields the result
Z
Tr(A)
d(u)D(u−1)AD(u) = I. (150)
SU (2) d
Again, select A with any desired element equal to one, and all other
elements equal to 0, to finish the proof.

We consider now the set of all irreducible representations of SU (2). More


precisely, we do not distinguish between equivalent representations, so this set
is the union of all equivalence classes of irreducible representations. Use the
symbol j to label an equivalence class, i.e., j is an index, taking on values in
an index set in 1:1 correspondence with the set of all equivalence classes. We
denote D(u) = D j (u) to indicate that a particular irreducible representation
u → D(u) belongs to equivalence class “j”. Two representations D j (u) and
0
Dj (u) are inequivalent if j 6= j 0 . Let dj be the dimension associated with
equivalence class j. With this new notation, we may restate our above two
theorems in the form:
Z
j j0 1
d(u)Dmn (u−1 )Drs = δjj 0 δms δnr . (151)
SU (2) d
This is an important theorem in representation theory, and is sometimes
referred to as the “General Orthogonality Relation”.
For much of what we need, we can deal with simpler objects than the full
representation matrices. In particular, the traces are very useful invariants
under similarity transformations. So, we define:
Def: The character χ(u) of a finite-dimensional representation u → D(u)
of SU (2) is the function on SU (2):
χ(u) = Tr [D(u)] . (152)

We immediately remark that the characters of two equivalent representations


are identical, since h i
Tr SD(u)S −1 = Tr [D(u)] . (153)
In fact, we shall see that the representation is completely determined by the
characters, up to similarity transformations.
Let χj (u) denote the character of irreducible representation D j (u). The
index j uniquely determines χj (u). We may summarize some important
properties of characters in a theorem:

29
Theorem: 1. For any finite-dimensional representation u → D(u) of SU (2):
χ(u0 uu−1
0 ) = χ(u), ∀u, u0 ∈ SU (2). (154)

2.
χ(u) = χ∗ (u) = χ(u−1 ) = χ(u∗ ), ∀u ∈ SU (2). (155)
3. For the irreducible representations u → D j (u) of SU (2):
Z
0 1
d(u)χj (u0 u−1 )D j (u) = δjj 0 D j (u0 ) (156)
SU (2) dj
Z
1
d(u)χj (u0 u−1 )χj 0 (u) = δjj 0 χj (u0 ) (157)
SU (2) dj
Z Z
−1
d(u)χj (u )χ (u) = j0 d(u)χ∗j (u)χj 0 (u) = δjj 0 . (158)
SU (2) SU (2)

Proof: (Selected portions)


1.
h i
χ(u0 uu−1 −1
0 ) = Tr D(u0 uu0 )
h i
= Tr D(u0 )D(u)D(u−1
0 )

= χ(u). (159)

2.
h i h i
χ(u−1 ) = Tr D(u−1 ) = Tr D(u)−1
h i
= Tr (SU (u)S −1 )−1 , where U is a unitary representation,
h i
= Tr S −1 U † (u)S
h i
= Tr U † (u)
= χ∗ (u). (160)

The property χ(u) = χ(u∗ ) holds for SU (2), but not more gener-
ally [e.g., it doesn’t hold for SU (3)]. It holds for SU (2) because
the replacement u → u∗ gives an equivalent representation for
SU (2). Let us demonstrate this. Consider the parameterization:
θ θ
u = ue (θ) = cos I − i sin e · σ . (161)
2 2
30
Now form the complex conjugate, and make the following similar-
ity transformation:
" #
θ θ
σ2 u∗ σ2−1 = σ2 cos I + i sin e · σ ∗ σ2−1
2 2
θ θ
= cos I + i sin [e1 σ2 σ1 σ2 − e2 σ2 σ2 σ2 + e3 σ2 σ3 σ2 ]
2 2
θ θ
= cos I − i sin e · σ
2 2
= u. (162)

We thus see that u and u∗ are equivalent representations for


SU (2). Now, for representation D (noting that iσ2 ∈ SU (2)):

D(u) = D(iσ2 )D(u∗ )D −1 (iσ2 ), (163)

and hence, χ(u) = χ(u∗ ).


3. We start with the general orthogonality relation, and use
X
χj (u0 u−1 ) = j
Dnm j
(u0 )Dmn (u−1 ), (164)
m,n

to obtain
Z Z X
0 0
d(u)χj (u0 u−1 )Drs
j
(u) = d(u) j
Dnm j
(u0 )Dmn (u−1 )Drs
j
(u)
SU (2) SU (2) m,n
X 1
j
= Dnm (u0 ) δjj 0 δnr δms
m,n dj
1 j
= δjj 0 Drs (u0 ). (165)
dj

Now take the trace of both sides of this, as a matrix equation:


Z
1
d(u)χj (u0 u−1 )χj 0 (u) = δjj 0 χj (u0 ). (166)
SU (2) dj

Let u0 = I. The character of the identity is just dj , hence we


obtain our last two relations.

31
9 Reduction of Representations
Theorem: Every continuous finite dimensional representation u → D(u) of
SU (2) is completely reducible, i.e., it is the direct sum:
X
D(u) = ⊕D r (u) (167)
r

of a finite number of irreps Dr (u). The multiplicities mj (the number


of irreps D r which belong to the equivalence class of irreps characterized
by index j) are unique, and they are given by:
Z
mj = d(u)χj (u−1 )χ(u), (168)
SU (2)

where χ(u) = Tr [D(u)]. Two continuous finite dimensional represen-


tations D 0 (u) and D00 (u) are equivalent if and only if their characters
χ0 (u) and χ00 (u) are identical as functions on SU (2).

Proof: It is sufficient to consider the case where D(u) is unitary and re-
ducible. In this case, there exists a proper subspace of the carrier
space of D(u), with projection E 0 , which is mapped into itself by D(u):

E 0 D(u)E 0 = D(u)E 0 . (169)

Take the hermitian conjugate, and relabel u → u−1 :


h i† h i†
E 0 D(u−1 )E 0 = D(u−1 )E 0 (170)
E 0 D † (u−1 )E 0 = E 0 D † (u−1 ) (171)
E 0 D(u)E 0 = E 0 D(u). (172)

Hence, E 0 D(u) = D(u)E 0 , for all elements u in SU (2).


Now, let E 00 = I − E 0 be the projection onto the subspace orthogonal
to E 0 . Then:

D(u) = (E 0 + E 00 )D(u)(E 0 + E 00 ) (173)


= E 0 D(u)E 0 + E 00 D(u)E 00 (174)

(since, e.g., E 0 D(u)E 00 = D(u)E 0 E 00 = D(u)E 0 (I − E 0 ) = 0). This


formula describes a reduction of D(u). If D(u) restricted to subspace

32
E 0 (or E 00 ) is reducible, we repeat the process until we have only ir-
reps remaining. The finite dimensionality of the carrier space of D(u)
implies that there are a finite number of steps to this procedure.
Thus, we obtain a set of projections Er such that
X
I = Er (175)
r
Er Es = δrsEr (176)
X
D(u) = Er D(u)Er , (177)
r

where D(u) restricted to any of subspaces Er is irreducible:


X
D(u) = ⊕D r (u). (178)
r

The multiplicity follows from


Z
d(u)χj (u−1 )χj 0 (u) = δjj 0 . (179)
SU (2)

Thus,
Z Z " #
X
−1 −1 r
d(u)χj (u )χ(u) = χj (u )Tr ⊕D (u) (180)
SU (2) SU (2) r
= the number of terms in the sum
with D r = D j (181)
= mj . (182)

Finally, suppose D 0 (u) and D 00 are equivalent. We have already shown


that the characters must be identical. Suppose, on the other hand, that
D 0 (u) and D 00 are inequivalent. In this case, the characters cannot be
identical, or this would violate our other relations above, as the reader
is invited to demonstrate.

10 The Clebsch-Gordan Series


We are ready to embark on solving the problem of the addition of angular
momenta. Let D 0 (u) be a representation of SU (2) on carrier space V 0 , and

33
let D 00 (u) be a representation of SU (2) on carrier space V 00 , and assume V 0
and V 00 are finite dimensional. Let V = V 0 ⊗ V 00 denote the tensor product
of V 0 and V 00 .
The representations D0 and D 00 induce a representation on V in a natural
way. Define the representation D(u) on V in terms of its action on any φ ∈ V
of the form φ = φ0 ⊗ φ00 as follows:

D(u)(φ0 ⊗ φ00 ) = [D 0 (u)φ0 ] ⊗ [D00 (u)φ00 ] . (183)

Denote this representation as D(u) = D 0 (u) ⊗ D00 (u) and call it the tensor
product of D 0 and D 00 . The matrix D(u) is the Kronecker product of D0 (u)
and D00 (u). For the characters, we clearly have

χ(u) = χ0 (u)χ00 (u). (184)

We can extend this tensor product definition to the product of any finite
number of representations.
0 00
The tensor product of two irreps, D j (u) and D j (u), is in general not
irreducible. We know however, that it is completely reducible, hence a direct
sum of irreps: X
0 00
Dj (u) ⊗ Dj (u) = ⊕Cj 0 j 00 j D j (u). (185)
j

This is called a “Clebsch-Gordan series”. The Cj 0 j 00 j coefficients are some-


times referred to as Clebsch-Gordan coefficients, although we tend to use that
name for a different set of coefficients. These coefficients must, of course, be
non-negative integers. We have the corresponding identity:
X
χj 0 (u)χj 00 (u) = Cj 0 j 00 j χj (u). (186)
j

We now come to the important theorems on combining angular momenta


in quantum mechanics:
Theorem:
1. There exists only a countably infinite number of inequivalent ir-
reps of SU (2). For every positive integer (2j +1), j = 0, 12 , 1, 32 , . . .,
there exists precisely one irrep Dj (u) (up to similarity transforma-
tions) of dimension (2j + 1). As 2j runs through all non-negative
integers, the representations Dj (u) exhaust the set of all equiva-
lence classes.

34
2. The character of irrep D j (u) is:

sin 2j+1
2
θ
χj [ue (θ)] = 1 , (187)
sin 2 θ

and
χj (I) = dj = 2j + 1. (188)
3. The representation Dj (u) occurs precisely once in the reduction
of the 2j-fold tensor product (⊗u)2j , and we have:
Z
d(u)χj (u) [Tr(u)]2j = 1. (189)
SU (2)

R
Proof: We take from SU (2) d(u)χ∗j (u)χj 0 (u) = δjj 0 the suggestion that the
characters of the irreps are a complete set of orthonormal functions
on a Hilbert space of class-functions of SU (2) (A class-function is a
function which takes the same value for every element in a conjugate
class).
Consider the following function on SU (2):
 
1 h i 1 1
ω(u) ≡ 1 − Tr (u − I)† (u − I) = 1 + Tr(u) , (190)
8 2 2
which satisfies the conditions 1 > ω(u) ≥ 0 for u 6= I, and u(I) = 1
(e.g., noting that ue (θ) = cos θ2 I + a traceless piece). Hence, we have
the lemma: If f (u) is a continuous function on SU (2), then
R
SU (2) d(u)f (u) [ω(u)]n
lim R n = f (I). (191)
SU (2) d(u) [ω(u)]
n→∞

The intuition behind this lemma is that, as n → ∞, [ω(u)]n becomes


increasingly peaked about u = I.
Thus, if D j (u) is any irrep of SU (2), then there exists an integer n such
that: Z
d(u)χj (u−1 ) [Tr(u)]n 6= 0. (192)
SU (2)

Therefore, the irrep Dj (u) occurs in the reduction of the tensor product
(⊗u)n .

35
Next, we apply the Gram-Schmidt process to the infinite sequence
{[Tr(u)]n |n = 0, 1, 2, . . .} of linearly independent class functions on SU (2)
to obtain the orthonormal sequence {Bn (u)|n = 0, 1, 2, . . .} of class
functions:
Z
1 Z 2π θ
d(u)Bn∗ (u)Bm (u) = dθ sin2 Bn∗ [ue3 (θ)] Bm [ue3 (θ)] = δnm ,
SU (2) π 0 2
(193)
1 2 θ
where we have used the measure d(u) = 4π2 dΩe sin 2 dθ and the fact
that, since Bn is a class function, it has the same value for a rotation
by angle θ about any axis.
Now, write
βn (θ) = Bn [ue3 (θ)] = Bn [ue (θ)] . (194)
Noting that
!n n  
n θ X n iθ( n2 −m)
[Tr(u)] = 2 cos = e , (195)
2 m=0 m
we may obtain the result
sin [(n + 1)θ/2]
βn (θ) = = Bn [ue (θ)] = Bn∗ [ue (θ)] , (196)
sin θ/2
by adopting suitable phase conventions for the B’s. Furthermore,
Z 
0 if n > m,
d(u)Bn∗ (u) [Tr(u)]m = (197)
SU (2) 1 if n = m.

We need to prove now that the functions Bn (u) are characters of the
irreps. We shall prove this by induction on n. First, B0 (u) = 1; B0 (u) is
the character of the trivial one-dimensional identity representation:
D(u) = 1. Assume now that for some integer n0 ≥ 0 the functions
Bn (u) for n = 0, 1, . . . , n0 are all characters of irreps. Consider the
reduction of the representation (⊗u)n0 +1 :
n0
X X
n0 +1
[Tr(u)] = Nn0 ,n Bn (u) + cn0 ,j χj (u), (198)
n=0 j∈Jn0

where Nn0 ,n and cn0 ,j are integers ≥ 0, and the cn0 ,j sum is over irreps
D j such that the characters are not in the set {Bn (u)|n = 0, 1, . . . , n0 }
(j runs over a finite subset of the Jn0 index set).

36
With the above fact that:
Z 
0 if n > n0 + 1,
d(u)Bn∗ (u) [Tr(u)]n0 +1 = (199)
SU (2) 1 if n = n0 + 1,

we have, X
Bn0 +1 (u) = cn0 ,j χj (u). (200)
j∈Jn0

Squaring both sides, and averaging over SU (2) yields


X
1= (cn0 ,j )2 . (201)
j∈Jn0

But the cn0 ,j are integers, so there is only one term, with cn0 ,j = 1.
Thus, Bn0 +1 (u) is a character of an irrep, and we see that there are a
countably infinite number of (inequivalent) irreducible representations.
Let us obtain the dimensionality of the irreducible representation. The
Bn (u), n = 0, 1, 2, . . . correspond to characters of irreps. We’ll label
these irredusible representations D j (u) according to the 1 : 1 mapping
2j = n. Then

dj = χj (I) = B2j (I) = β2j (0) (202)


sin [(2j + 1)θ/2]
= lim (203)
θ→0 sin θ/2
= 2j + 1. (204)

Next, we consider the reduction of the tensor product of two irreducible


representations of SU (2). This gives our “rule for combining angular momen-
tum”. That is, it tells us what angular momenta may be present in a system
composed of two components with angular momenta. For example, it may
be applied to the combination of a spin and an orbital angular momentum.

Theorem: Let j1 and j2 index two irreps of SU (2). Then the Clebsch-
Gordan series for the tensor product of these representations is:
j1X
+j2
j1 j2
D (u) ⊗ D (u) = ⊕D j (u). (205)
j=|j1 −j2 |

37
Equivalently,
j1X
+j2 X
χj1 (u)χj2 (u) = χj (u) = Cj1 j2 j χj (u), (206)
j=|j1 −j2 | j

where
Z
Cj1 j2 j = d(u)χj1 (u)χj2 (u)χj (u) (207)
SU (2)

1 iff j1 + j2 + j is an integer, and a triangle can
=  be formed with sides j1 , j2 , j, (208)
0 otherwise.

The proof of this theorem is straightforward, by considering


    X
1 Z 2π 1 1 j
Cj1 j2 j = dθ sin (j1 + )θ sin (j2 + )θ e−imθ , (209)
π 0 2 2 m=−j

etc., as the reader is encouraged to carry out.


We have found all the irreps of SU (2). Which are also irreps of O+ (3)?
This is the subject of the next theorem:

Theorem: If 2j is an odd integer, then D j (u) is a faithful representation


of SU (2), and hence is not a representation of O + (3). If 2j > 0 is an
even integer (and hence j > 0 is an integer), then Re (θ) → Dj [ue (θ)]
is a faithful representation of O+ (3). Except for the trivial identity
representation, all irreps of O + (3) are of this form.

The proof of this theorem is left to the reader.


We will not concern ourselves especially much with issues of constructing
a proper Hilbert space, such as completeness, here. Instead, we’ll concentrate
on making the connection between SU (2) representation theory and angular
momentum in quantum mechanics a bit more concrete. We thus introduce
the quantum mechanical angular momentum operators.

Theorem: Let u → U (u) be a (strongly-)continuous unitary representation


of SU (2) on Hilbert space H. Then there exists a set of 3 self-adjoint
operators Jk , k = 1, 2, 3 such that

U [ue (θ)] = exp(−iθee · J). (210)

38
To keep things simple, we’ll consider now U (u) = D(u), where D(u) is
a finite-dimensional representation – the appropriate extension to the
general case may be demonstrated, but takes some care, and we’ll omit
it here.
1. The function D[ue (θ)] is (for e fixed), an infinitely differentiable
function of θ. Define the matrices J(ee) by:
( )

J(ee) ≡ i D [ue (θ)] . (211)
∂θ θ=0

Also, let J1 = J(ee1 ), J2 = J(ee2 ), J3 = J(ee3 ). Then


3
X
J(ee) = e · J = (ee · e k )Jk . (212)
k=1

For any unit vector e and any θ, we have

D [ue (θ)] = exp(−iθee · J). (213)

The matrices Jk satisfy:

[Jk , J` ] = ik`m Jm . (214)

The matrices −iJk , k = 1, 2, 3 form a basis for a representation of


the Lie algebra of O+ (3) under the correspondence:

Jk → −iJk , k = 1, 2, 3. (215)

The matrices Jk are hermitian if and only if D(u) is unitary.


2. The matrices Jk , k = 1, 2, 3 form an irreducible set if and only if
the representation D(u) is irreducible. If D(u) = Dj (u) is irre-
ducible, then for any e , the eigenvalues of e · J are the numbers
m = −j, −j + 1, . . . , j − 1, j, and each eigenvalue has multiplicity
one. Furthermore:

J2 = J12 + J22 + J32 = j(j + 1). (216)

3. For any representation D(u) we have

D(u)JJ2 D(u−1 ) = J 2 ; [Jk , J2 ] = 0, k = 1, 2, 3. (217)

39
Proof: (Partial) Consider first the case where D(u) = D j (u) is an irrep. Let
Mj = {m|m = −j, . . . , j}, let e be a fixed unit vector, and define the
operators Fm (ee) (with m ∈ Mj ) by:

1 Z 4π
Fm (ee) ≡ dθeimθ D j [ue (θ)]. (218)
4π 0

Multiply this defining equation by D j [ue (θ 0 )]:


1 Z 4π
D j [ue (θ 0 )]Fm (ee) = dθeimθ D j [ue (θ + θ0 )] (219)
4π 0
Z
1 4π 0
= dθeim(θ−θ ) D j [ue (θ)] (220)
4π 0
0
= e(−imθ ) Fm (ee). (221)
0
Thus, either Fm (ee) = 0, or e(−imθ ) is an eigenvalue of D j [ue (θ 0 )]. But
j
X
χj [ue (θ)] = e−inθ , (222)
n=−j

and hence, 
1 if m ∈ Mj ,
Tr[Fm (ee)] = (223)
0 otherwise.
Therefore, Fm (ee) 6= 0.
We see that the {Fm (ee)} form a set of 2j+1 independent one-dimensional
projection operators, and we can represent the D j (u) by:
j
X
D j [ue (θ)] = e−imθ Fm (ee). (224)
m=−j

From this, we obtain:


( ) j
∂ j X
J(ee) ≡ i D [ue (θ)] = mFm (ee), (225)
∂θ θ=0 m=−j

and
Dj [ue (θ)] = exp [−iθJ(ee)] , (226)
which is an entire function of θ for fixed e.

40
Since every finite-dimensional continuous representation D(u) is a di-
rect sum of a finite number of irreps, this result holds for any such
representation:
D[ue (θ)] = exp [−iθJ(ee)] . (227)

Let w be a unit vector with components wk , k = 1, 2, 3. Consider


“small” rotations about w:

uw (θ) = ue1 (θw1 )ue2 (θw2 )ue3 (θw3 )ue(θ,w) [α(θ, w )], (228)

where α(θ, w) = O(θ2 ) for small θ. Thus,

w)] = e−iθw1 J1 e−iθw2 J2 e−iθw3 J3 e−iαJ(e) .


exp [−iθJ(w (229)

Expanding the exponentials and equating coefficients of terms linear in


θ yields the result:

w) = w1 J1 + w2 J2 + w3 J3 = w · J.
J(w (230)

To obtain the commutation relations, consider

ue1 (θ)ue2 (θ 0 )ue−11 (θ) (231)


! !
0 0
θ θ θ θ
= cos I − i sin σ1 cos I − i sin σ2 (232)
2 2 2 2
!
θ θ
cos I + i sin σ1 (233)
2 2
0 0
θ θ
= cos I − i sin (cos θσ2 + sin θσ3 ) (234)
2 2
= ue2 cos θ+e3 sin θ (θ 0 ). (235)

Thus,

exp(−iθJ1 ) exp(−iθ0 J2 ) exp(iθJ1 ) = exp [−iθ0 (J2 cos θ + J3 sin θ)] .


(236)
Expanding the exponentials, we have:
X ∞ X
∞ X ∞ n ` m ∞
X
n+m 0 ` J1 J2 J1 n+` m r 1
θ θ (−i) i = (−i)r θ0 (J2 cos θ+J3 sin θ)r .
n=0 `=0 m=0 n!`!m! r=0 r!
(237)

41
We equate coefficients of the same powers of θ, θ0 . In particular, the
terms of order θθ0 yield the result:

[J1 , J2 ] = iJ3 . (238)

We thus also have, for example:

[J1 , J2 ] = [J1 , J12 + J22 + J32 ] (239)


= J1 J2 J2 − J2 J2 J1 + J1 J3 J3 − J3 J3 J1 (240)
= (iJ3 + J2 J1 )J2 − J2 (−iJ3 + J1 J2 ) + (241)
(−iJ2 + J3 J1 )J3 − J3 (iJ2 + J1 J3 ) (242)
= 0. (243)

As a consequence, we also have:

D(u)JJ2 D(u−1 ) = e−iθe·J J2 eiθe·J = J2 . (244)

In particular, this is true for an irrep:

D j (u)JJ2 D j (u−1 ) = J 2 . (245)

Therefore J2 is a multiple of the identity (often referred to as a “casimir


operator”).
Let us determine the multiple. Take the trace:

Tr(JJ2 ) = 3Tr(J32 ) (246)


( )2 
 ∂ j 
= 3Tr D [ue3 (θ)] (247)
 ∂θ 
θ=0
( h
1 i
= −3Tr lim D j (ue3 (θ + ∆)) − D j (ue3 (θ))(248)
∆→0
∆0 →0
∆∆0
)
h i
j 0 j
D (ue3 (θ + ∆ )) − D (ue3 (θ)) (249)
θ=0
( 
1 h j
= −3Tr lim D (ue3 (∆ + ∆0 )) (250)
∆→0
∆0 →0
∆∆0
)
i
j 0 j j
−D (ue3 (∆ )) − D (ue3 (∆)) + D (ue3 (0)) (251)

42
(" # )
∂2 j
= −3Tr D [ue3 (θ)] (252)
∂θ2 θ=0
( )
∂2 j
= −3 χ [ue3 (θ)] . (253)
∂θ2 θ=0

There are different ways to evaluate this. We could insert


sin(j + 12 )θ
χj [ue3 (θ)] = , (254)
sin 12 θ
or we could use
j
X
χj [ue3 (θ)] = eimθ . (255)
m=−j

In either case, we find:


Tr(JJ2 ) = j(j + 1)(2j + 1). (256)
Since dj = 2j + 1, this gives the familiar result:
J2 = j(j + 1)I. (257)
P
Finally, we’ll compute the eigenvalues of e ·JJ = J(ee) = jm=−j mFm (ee).
We showed earlier that the Fm (ee) are one-dimensional projection oper-
ators, and it thus may readily be demonstrated that
J(ee)Fm (ee) = mFm (ee). (258)
Hence, the desired eigenvalues are m = {−j, −j + 1, . . . , j − 1, j}.

11 Standard Conventions
Let’s briefly summarize where we are. We’ll pick some “standard” conven-
tions towards building explicit representations for rotations in quantum me-
chanics.
• The rotation group in quantum mechanics is postulated to be SU (2),
with elements
u = ue (θ) = e− 2 θe·σ ,
i
(259)
describing a rotation by angle θ about vector e , in the clockwise sense
as viewed along e.

43
• O+ (3) is a two to one homomorphism of SU (2):
1
Rmn (u) = Tr(u† σm uσn ). (260)
2

• To every representation of SU (2) there corresponds a representation


of the Lie algrebra of SU (2) given by the real linear span of the three
matrices −iJk , k = 1, 2, 3, where

[Jm , Jn ] = imnp Jp . (261)

The vector operator J is interpreted as angular momentum. Its square


is invariant under rotations.

• The matrix group SU (2) is a representation of the abstract group


1
SU (2), and this representation is denoted D 2 (u) = u. For this rep-
resentation, Jk = 12 σk .

• Every finite dimensional representation of SU (2) is equivalent to a


unitary representation, and every unitary irreducible representation of
SU (2) is finite dimensional. Therefore, the generating operators, Jk ,
can always be chosen to be hermitian.

• Let 2j be a non-negative integer. To every 2j there corresponds a


unique irrep by unitary transformations on a 2j + 1-dimensional carrier
space, which we denote
D j = D j (u). (262)
These representations are constructed according to conventions which
we take to define the “standard representations”:
The matrices Jk are hermitian and constructed according to the follow-
ing: Let |j, mi, m = −j, −j + 1, . . . , j − 1, j be a complete orthonormal
basis in the carrier space such that:

J2 |j, mi = j(j + 1)|j, mi (263)


J3 |j, mi = m|j, mi (264)
q
J+ |j, mi = (j − m)(j + m + 1)|j, m + 1i (265)
q
J− |j, mi = (j + m)(j − m + 1)|j, m − 1i, (266)

44
where
J± ≡ J1 ± iJ2 . (267)
According to convention, matrices J1 and J3 are real, and matrix J2 is
pure imaginary. The matrix

D j [ue (θ)] = exp(−iθee · J), (268)

describes a rotation by θ about unit vector e.

• If j = 12 -integer, then Dj (u) are faithful representations of SU (2). If j


is an integer, then D j (u) are representations of O + (3) (and are faithful
if j > 0). Also, if j is an integer, then the representation D j (u) is
similar to a representation by real matrices:
3 h i
Rmn (u) = Tr D j† (u)Jm D j (u)Jn . (269)
j(j + 1)(2j + 1)

• In the standard basis, the matrix elements of D j (u) are denoted:


j
Dm 1 m2
(u) = hj, m1 |D j (u)|j, m2 i, (270)

and thus,
j
X j
Dj (u)|j, mi = Dm 0
0 m (u)|j, m i. (271)
m0 =−j

Let |φi be an element in the carrier space of D j , and let |φ0 i = D j (u)|φi
be the result of applying rotation D j (u) to |φi. We may expand these
vectors in the basis:
X
|φi = φm |j, mi (272)
m
X
|φ0 i = φ0m |j, mi. (273)
m

Then X j
φ0m = Dmm 0 (u)φm0 . (274)
m0

• Since matrices u and D j (u) are unitary,


j
Dm 1 m2
(u−1 ) = Dm
j
1 m2
(u† ) = Dm
∗j
2 m1
(u). (275)

45
• The representation Dj is the symmetrized (2j)-fold tensor product of
1
the representation D 2 = SU (2) with itself. For the standard repre-
sentation, this is expressed by an explicit formula for matrix elements
j
Dm 1 m2
(u) as polynomials of degree 2j in the matrix elements of SU (2):
j
Define the quantities Dm 1 m2
(u) for m1 , m2 = −j, . . . , j by:
hλ∗ |u|ηi2j = (λ1 u11 η1 + λ1 u12 η2 + λ2 u21 η1 + λ2 u22 η2 )2j (276)
X
= (2j)! (277)
m1 ,m2

λj+m
1
1 j−m1 j+m2 j−m2
λ2 η1 η2 j
q Dm 1 m2
(u).
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
j
We defer to later the demonstration that the matrix elements Dm 1 m2
(u)
so defined are identical with the earlier definition for the standard rep-
resentation. A consequence of this formula (which the reader is encour-
aged to demonstrate) is that, in the standard representation,
D j (u∗ ) = D∗j (u) (278)
Dj (uT ) = DjT (u). (279)
Also, noting that u∗ = σ2 uσ2 , we obtain
D∗j (u) = exp(−iπJ2 )D j (u) exp(iπJ2 ), (280)
making explicit our earlier statement that the congugate representation
was equivalent in SU (2) [We remark that this property does not hold
for SU (n), if n > 2].
• We can also describe the standard representation in terms of an ac-
tion of the rotation group on homogeneous polynomials of degree 2j
of complex variables x and y. We define, for each j = 0, 12 , 1, . . ., and
m = −j, . . . , j the polynomial:
xj+m y j−m
Pjm (x, y) ≡ q ; P00 ≡ 1. (281)
(j + m)!(j − m)!

We also define Pjm ≡ 0 if m ∈ / {−j, . . . , j}. In addition, define the


differential operators Jk , k = 1, 2, 3, J+ , J− :
1
J3 = (x∂x − y∂y ) (282)
2
46
1 1
J1 = (x∂y + y∂x ) = (J+ + J− ) (283)
2 2
i i
J2 = (y∂x − x∂y ) = (J− − J+ ) (284)
2 2
J+ = x∂y = J1 + iJ2 (285)
J− = y∂x = J1 − iJ2 (286)
These definitions give
1h i
J2 = (x∂x − y∂y )2 + 2(x∂x + y∂y ) . (287)
4
We let these operators act on our polynomials:
 
1 xj+m y j−m
J3 Pjm (x, y) = (x∂x − y∂y ) q (288)
2 (j + m)!(j − m)!
1
= [j + m − (j − m)]Pjm (x, y) (289)
2
= mPjm (x, y). (290)
Similarly,
xj+m y j−m
J+ Pjm (x, y) = (x∂y ) q (291)
(j + m)!(j − m)!
xj+m+1 y j−m−1
= (j − m) q (292)
(j + m)!(j − m)!
v
u
u (j + m + 1)!(j − m − 1)!
= (j − m)t Pj,m+1 (x, y)
(j + m)!(j − m)!
q
= (j − m)(j + m + 1)Pj,m+1 (x, y). (293)
Likewise,
q
J− Pjm (x, y) = (j + m)(j − m + 1)Pj,m−1 (x, y), (294)
and
J2 Pjm (x, y) = j(j + 1)Pjm (x, y). (295)

We see that the actions of these differential operators on the monomials,


Pjm , are according to the standard representation of the Lie algebra of

47
the rotation group (that is, we compare with the actions of the standard
representation for J on orthonormal basis |j, mi).
Thus, regarding Pjm (x, y) as our basis, a rotation corresponds to:
X j
D j (u)Pjm (x, y) = Dm 0 m (u)Pjm0 (x, y). (296)
m0

Now,
1 X 1
D 2 (u)P 1 m (x, y) = Dm
2
0 m (u)P 1 m0 (x, y) (297)
2 2
m0
X
= um0 m (u)P 1 m0 (x, y). (298)
2
m0

Or,
uP 1 m (x, y) = u 1 m P 1 1 (x, y) + u− 1 m P 1 − 1 (x, y). (299)
2 2 2 2 2 2 2

With P 1 1 (x, y) = x, and P 1 − 1 (x, y) = y, we thus have (using normal


2 2 2 2
matrix indices now on u)

uP 1 1 (x, y) = u11 x + u21 y, (300)


2 2

uP 1 − 1 (x, y) = u12 x + u22 y. (301)


2 2

Hence,

D j (u)Pjm (x, y) = Pjm (u11 x + u21 y, u12 x + u22 y) (302)


X j
= Dm0 m (u)Pjm0 (x, y). (303)
m0

Any homogeneous polynomial of degree 2j in (x, y) can be written as a


unique linear combination of the monomials Pjm (x, y). Therefore, the
set of all such polynomials forms a vector space of dimension 2j + 1,
and carries the standard representation Dj of the rotation group if the
action of the group elements on the basis vectors Pjm is as above. Note
that
0 0
∂xj+m ∂yj−m xj+m y j−m
Pjm (∂x , ∂y )Pjm0 (x, y) = q
(j + m)!(j − m)!(j + m0 )!(j − m0 )!
= δmm0 . (304)

48
Apply this to
X 1
Pjm (u11 x + u21 y, u12 x + u22 y) = Dm2 0 m (u)P 1 m0 (x, y) : (305)
2
m0
X 1
j
2
Pjm (∂x , ∂y ) Dm 00 m0 (u)P 1 m00 (x, y) = Dmm0 (u). (306)
2
m00
Hence,
j
Dmm 0 (u) = Pjm (∂x , ∂y )Pjm0 (u11 x + u21 y, u12 x + u22 y), (307)
j
and we see that Dmm 0 (u) is a homogeneous polynomial of degree 2j in

the matrix elements of u.


Now,
j j
X X (x1 x2 )j+m (y1 + y2 )j−m
Pjm (x1 , y1 )Pjm (x2 , y2 ) = . (308)
m=−j m=−j (j + m)!(j − m)!
Using the binomial theorem, we can write:
j
(x1 x2 + y1 y2 )2j X (x1 x2 )j+m (y1 + y2 )j−m
= . (309)
(2j)! m=−j (j + m)!(j − m)!
Thus,
j
X (x1 x2 + y1 y2 )2j
Pjm (x1 , y1 )Pjm (x2 , y2 ) = . (310)
m=−j (2j)!
One final step remains to get our asserted equation defining the D j (u)
standard representation in terms of u:
X λj+m
1
1 j−m1 j+m2 j−m2
λ2 η1 η2 j
q Dm 1 m2
(u) (311)
m1 ,m2 (j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
X
j
= Pjm1 (λ1 , λ2 )Dm 1 m2
(u)Pjm2 (η1 , η2 ) (312)
m1 ,m2
X
= Pjm2 (u11 λ1 + u21 λ2 , u12 λ1 + u22 λ2 )Pjm2 (η1 , η2 ) (313)
m2
1
= (λ1 u11 η1 + λ1 u12 η2 + λ2 u21 η1 + λ2 u22 η2 )2j . (314)
(2j)!
The step in obtaining Eqn. 313 follows from Eqn. 303, or it can be
demonstrated by an explicit computation. Thus, we have now demon-
strated our earlier formula, Eqn. 277, for the standard representation
for D j .

49
12 “Special” Cases
We have obtained a general expression for the rotation matrices for an irrep.
Let us consider some “special cases”, and derive some more directly useful
formulas for the matrix elements of the rotation matrices.
1. Consider, in the standard representation, a rotation by angle π about
the coordinate axes. Let ρ1 = exp(−iπσ1 /2) = −iσ1 . Using
j
Dmm 0 (u) = Pjm (∂x , ∂y )Pjm0 (u11 x + u21 y, u12 x + u22 y), (315)
we find:
j
Dmm 0 (ρ1 ) = Pjm (∂x , ∂y )Pjm0 (−iy, −ix), (316)
= Pjm (∂x , ∂y )(−)j Pjm0 (x, y), (317)
= e−iπj δmm0 . (318)
Hence,
exp(−iπJ1 )|j, mi = e−iπj |j, −mi. (319)
Likewise, we define
ρ2 = exp(−iπσ2 /2) = −iσ2 , (320)
ρ3 = exp(−iπσ3 /2) = −iσ3 , (321)
which have the properties:
ρ1 ρ2 = −ρ2 ρ1 = ρ3 , (322)
ρ2 ρ3 = −ρ3 ρ2 = ρ1 , (323)
ρ3 ρ1 = −ρ1 ρ3 = ρ2 , (324)
and hence,
D j (ρ2 ) = D j (ρ3 )D j (ρ1 ). (325)
In the standard representation, we already know that
exp(−iπJ3 )|j, mi = e−iπm |j, mi. (326)
Therefore,
exp(−iπJ2 )|j, mi = exp(−iπJ3 ) exp(−iπJ1 )|j, mi (327)
= exp(−iπJ3 ) exp(−iπj)|j, −mi (328)
= exp(−iπ(j − m))|j, −mi. (329)

50
2. Consider the parameterization by Euler angles ψ, θ, φ:

u = eψJ3 eθJ2 eφJ3 , (330)

(here Jk = − 2i σk ) or,

Dj (u) = D j (ψ, θ, φ) = e−iψJ3 e−iθJ2 e−iφJ3 , (331)

where it is sufficient (for all elements of SU (2)) to choose the range of


parameters:

0 ≤ ψ < 2π, (332)


0 ≤ θ ≤ π, (333)
0 ≤ φ < 4π (or 2π, if j is integral). (334)

We define the functions


j
Dm 1 m2
(ψ, θ, φ) = e−i(m1 ψ+m2 φ) djm1 m2 (θ) = hj, m1 |D j (u)|j, m2 i, (335)

where we have introduced the real functions djm1 m2 (θ) given by:

djm1 m2 (θ) ≡ Dm
j
1 m2
(0, θ, 0) = hj, m1 |e−iθJ2 |j, m2 i. (336)

The “big-D” and “little-d” functions are useful in solving quantum


mechanics problems involving angular momentum. The little-d func-
tions may be found tabulated in various tables, although we have built
enough tools to compute them ourselves, as we shall shortly demon-
strate. Note that the little-d functions are real.
Here are some properties of the little-d functions, which the reader is
encouraged to prove:

djm1 m2 (θ) = dj∗


m1 m2 (θ) (337)
= (−)m1 −m2 djm2 m1 (θ) (338)
= (−)m1 −m2 dj−m1 ,−m2 (θ) (339)
djm1 m2 (π − θ) = (−)j−m2 dj−m1 m2 (θ) (340)
= (−)j+m1 djm1 ,−m2 (θ) (341)
djm1 m2 (−θ)= djm2 m1 (θ) (342)
j
dm1 m2 (2π + θ) = (−)2j djm1 m2 (θ). (343)

51
The dj functions are homogeneous polynomials of degree 2j in cos(θ/2)
and sin(θ/2). Note that slightly different conventions from those here
are sometimes used for the big-D and little-d functions.
j
The Dm 1 m2
(u) functions form a complete and essentially orthonormal
basis of the space of square integrable functions on SU (2):
Z δjj 0 δm1 m01 δm2 m02
j 0
∗j
d(u)Dm 1 m2
(u)Dm 0 m0 (u) = . (344)
SU (2) 1 2 2j + 1
1
In terms of the Euler angles, d(u) = 16π 2
dψ sin θdθdφ, and
δjj 0 1 Z 2π Z π Z 4π
= dψ sin θdθ dφ (345)
2j + 1 16π 2 0 0 0
0
ei(m1 ψ+m2 φ) e−i(m1 ψ+m2 φ) djm1 m2 (θ)djm1 m2 (θ) (346)
Z
1 π 0
= sin θdθdjm1 m2 (θ)djm1 m2 (θ). (347)
2 0

3. Spherical harmonics and Legendre polynomials: The Y`m functions are


special cases of the D j . Hence we may define:
s
2` + 1 ∗`
Y`m (θ, ψ) ≡ Dm0 (ψ, θ, 0), (348)

where ` is an integer, and m ∈ {−`, −` + 1, . . . , `}.
This is equivalent to the standard definition, where the Y`m ’s are con-
structed to transform under rotations like |`mi basis vectors, and where
s
2` + 1
Y`0 (θ, 0) = P`(cos θ). (349)

According to our definition,


s s
2` + 1 ∗` 2` + 1 `
Y`0(θ, 0) = D00 (0, θ, 0) = d (θ). (350)
4π 4π 00
Thus, we need to show that d`00 (θ) = P` (cos θ). This may be done by
comparing the generating function for the Legendre polynomials:

X √
t` P`(x) = 1/ 1 − 2tx + t2 , (351)
`=0

52
with the generating function for d00 (θ), obtained from considering the
generating function for the characters of the irreps of SU (2):
X
χj (u)z 2j = 1/(1 − 2τ z + z 2 ), (352)
j

where τ = Tr(u/2). We haven’t discussed this generating function, and


we leave it to the reader to pursue this demonstration further.
The other aspect of our assertion of equivalence requiring proof con-
cerns the behavior of our spherical harmonics under rotations. Writing
Y`m (ee) = Y`m (θ, ψ), where θ, ψ are the polar angles of unit vector e , we
can express our definition in the form:
s
2` + 1 `
Y`m [R(u)ee3 ] = D0m (u−1 ), (353)

since,

Y`m [R(u)ee3 ] = Y`m (ee) = Y`m (θ, ψ) (354)


s
2` + 1 `
= Dm0 (ψ, θ, 0) (355)

s
2` + 1
= hj, m|D ` (u)|j, 0i∗ (356)

s
2` + 1 `
= hD (u)(j, 0)||j, mi (357)

s
2` + 1
= hj, 0|D†` (u)|j, mi. (358)

To any u0 ∈ SU (2) corresponds R̂(uo ) on any function f (ee) on the unit


sphere as follows:
h i
R̂(uo )f (ee) = f R−1 (u0 )ee . (359)

Thus,
h i
R̂(uo )Y`m (ee) = Y`m R−1 (u0 )ee (360)
h i
= Y`m R−1 (u0 )R(u)ee3 (361)

53
h i
= Y`m R(u−1 e3
0 u)e (362)
s
2` + 1 `
= D0m (u−1 u0 ) (363)

s
2` + 1 X `
= D 0 (u−1 )Dm
`
0 m (u0 ) (364)
4π m0 0m
X
`
= Dm e).
0 m (u0 )Y`m0 (e (365)
m0

This shows that the Y`m transform under rotations according to the
|`, mi basis vectors.
We immediately have the following properties:
(a) If J = −ixx × ∇ is the angular momentum operator, then
J3 Y`m (ee) = mY`m (ee) (366)
J2 Y`m (ee) = `(` + 1)Y`m (ee). (367)

(b) From th D ∗ D orthogonality relation, we further have:


Z 2π Z π

dψ dθ sin θY`m (θ, ψ)Y`0 m0 (θ, ψ) = δmm0 δ``0 . (368)
0 0

The Y`m (θ, ψ) form a complete orthonormal set in the space of


square-integrable functions on the unit sphere.
We give a proof of completeness here: If f (θ, ψ) is square-integrable,
and if
Z 2π Z π

dψ dθ sin θf (θ, ψ)Y`m (θ, ψ) = 0, ∀(`, m), (369)
0 0

we must show that this means that the integral of |f |2 vanishes.


This follows from the completeness of D j (u) on SU (2): Extend
the domain of definition of f to F (u) = F (ψ, θ, φ) = f (θ, ψ), ∀u ∈
SU (2). Then,

Z if m0 = 0, by assumption
0

j
d(u)F (u)Dmm 0 (u) = 0 if m0 6= 0 since
R 4π
F (u) is independent
SU (2) 
 0
of φ, and 0 dφe−im φ = 0.
(370)
R 2 j
Hence, SU (2) d(u)|F (u)| = 0 by the completeness of D (u), and
R
therefore (4π) dΩ|f |2 = 0.

54
q
2`+1
(c) We recall that Y`0 (θ, 0) = 4π
P`(cos θ). With
h i X
R(uo )Y`m (ee) = Y`m R−1 (u0 )ee = `
Dm e),
0 m (u0 )Y`m0 (e (371)
m0

we have, for m = 0,
h i X
R(uo )Y`0 (ee) = Y`0 R−1 (u0 )ee = `
Dm e) (372)
0 0 (u0 )Y`m0 (e
m0
s
4π X ∗ 0
= Y 0 (ee )Y`m0 (ee), (373)
2` + 1 m0 `m

where we have defined R(u0 )ee3 = e0 . But

e · e0 = e · [R(u0 )ee3 = [R−1 (u0 )ee] · e3 , (374)

and

Y`0[R−1 (u0 )ee] = function of θ[R−1 (u0 )ee] only, (375)


s
2` + 1
= P` (cos θ), (376)

where cos θ = e · e 0 . Thus, we have the “addition theorem” for
spherical harmonics:
4π X ∗ 0
P` (ee · e0 ) = Y (ee )Y`m (ee). (377)
2` + 1 m `m

(d) In momentum space, we can therefore write



δ(p − q) X
δ (3) (p
p − q) = (2` + 1)P`(cos θ) (378)
4πpq `=0
∞ X̀
δ(p − q) X ∗
= Y`m (θp , ψp )Y`m (θq , ψq ),(379)
4πpq `=0 m=−`

p|, and θ is the angle between p and q.


where p = |p
(e) Let us see now how we may compute the dj (θ) functions. Recall
j
Dmm 0 (u) = Pjm (∂x , ∂y )Pjm0 (u11 x + u21 y, u12 x + u22 y). (380)

55
With
θ θ θ
u(0, θ, 0) = e−i 2 σ2 = cos I − i sin σ2 (381)

2  2
cos θ2 − sin 2θ
= , (382)
sin θ2 cos θ2
we have:

djmm0 (θ) = Dmm j


0 (0, θ, 0) (383)
θ θ θ θ
= Pjm (∂x , ∂y )Pjm0 (x cos + y sin , −x sin + y cos ) (384)
2 2 2 2
0 0
∂xj+m ∂yj−m (x cos 2θ + y sin θ2 )j+m (−x sin θ2 + y cos θ2 )j−m
= q (385)
.
(j + m)!(j − m)!(j + m0 )!(j − m0 )!

Thus, we have an explicit means to compute the little-d functions.


An alternate equation, which is left to the reader to derive (again
using the Pjm functions), is
v
u
1u
t (j + m)!(j − m)!
djmm0 (θ) = (386)
2π (j + m0 )!(j − m0 )!
Z 2π
0 θ θ 0 θ θ
× dαei(m−m )α (cos + eiα sin )j+m (cos − e−iα sin )j−m
0 2 2 2 2
In tabulating the little-d functions it is standard to use the label-
ing:  j 
djj . . . djj,−j
 . .. .. 
dj =  .. . 
. . (387)
j j
d−j,j . . . d−j,−j
For example, we find:
 
1 cos θ2 − sin θ2
d (θ) =
2 , (388)
sin θ2 cos θ2
and
1 
(1 + cos θ) − √12 sin θ 1
(1 − cos θ)
2 1 2

d1 (θ) =  √ sin θ
 2
cos θ − √12 sin θ 
. (389)
1 1 1
2
(1 − cos θ) √2 sin θ 2
(1 + cos θ)

56
13 Clebsch-Gordan (Vector Addition) Coef-
ficients
We consider the tensor (or Kronecker) product of two irreps, and its reduction
according to the Clebsch-Gordan series:
j1X
+j2
j1 j2
D (u) ⊗ D (u) = D j (u). (390)
j=|j1 −j2 |

The carrier space of the representation Dj1 ⊗D j2 is of dimension (2j1 +1)(2j2 +


1). It is the tensor product of the carrier spaces for the representations D j1
and D j2 . Corresponding to the above reduction of the kronecker product, we
have a decomposition of the carrier space into orthogonal subspaces carrying
the representations D j .
For each j we can select a standard basis system:
X
|j1 j2 ; j, mi = C(j1 j2 j; m1 m2 m)|j1 , m1 ; j2 , m2 i, (391)
m1 ,m2

where the latter ket is just the tensor product of the standard basis vectors
|j1 , m1 i and |j2 , m2 i. The coefficients C are the “vector-addition coefficients”,
or “Wigner coefficients”, or, more commonly now, “Clebsch-Gordan (CG)
coefficients”. These coefficients must be selected so that the unit vectors
|j1 j2 ; j, mi transform under rotations according to the standard representa-
tion.
We notice that the CG coefficients relate two systems of orthonormal
basis vectors. Hence, they are matrix elements of a unitary matrix with rows
labeled by the (m1 , m2 ) pair, and columns labeled by the (j, m) pair. Thus,
we have the orthonormality relations:
X
C ∗ (j1 j2 j; m1 m2 m)C(j1 j2 j 0 ; m1 m2 m0 ) = δjj 0 δmm0 , (392)
m1 ,m2
X
C ∗ (j1 j2 j; m1 m2 m)C(j1 j2 j; m01 m02 m) = δm1 m01 δm2 m02 . (393)
j,m

The inverse basis transformation is:


X
|j1 , m1 ; j2 , m2 i = C ∗ (j1 j2 j; m1 m2 m)|j1 j2 ; j, mi. (394)
j,m

57
We wish to learn more about the CG coefficients, including how to com-
pute them in general. Towards accomplishing this, evaluate the matrix ele-
ments of the Clebsch-Gordan series, with
j1 0 j1 00
Dm 0 m00 (u) = hj1 , m1 |D (u)|j1 , m1 i, (395)
1 1

etc. Thus we obtain the explicit reduction formula for the matrices of the
standard representations:
j1 j2
X
Dm 0 m00 (u)Dm0 m00 (u) = hj1 , m01 ; j2 , m02 | D j (u)|j1 , m001 ; j2 , m002 i (396)
1 1 2 2
j
XX X
= C(j1 j2 j; m01 m02 m0 )hj1 j2 ; j, m0 |D j (u) C ∗ (j1 j2 j; m001 m002 m00 )|j1 j2 ; j, m00 i
j m0 m00
X j
0 0 0 ∗ 00 00 00
= C(j1 j2 j; m1 m2 m )C (j1 j2 j; m1 m2 m )Dm 0 m00 (u). (397)
j,m0 ,m00

∗j
Next, we take this equation, multiply by Dm 0 m00 (u), integrate over SU (2),

and use orthogonality to obtain:


Z
∗j j1 j2
C(j1 j2 j; m01 m02 m0 )C ∗ (j1 j2 j; m001 m002 m00 ) = (2j+1) d(u)Dm 0 m00 (u)Dm0 m00 (u)Dm0 m00 (u).
SU (2) 1 1 2 2

(398)
Thus, the CG coefficients for the standard representations are determined by
the matrices in the standard representations, except for a phase factor which
depends on (j1 j2 j) [but not on (m1 m2 m)].
The coefficients vanish unless:
1. j ∈ {|j1 −j2 |, |j1 −j2 |+1, . . . , j1 +j2 } (from the Clebsch-Gordan series).

2. m = m1 + m2 (as will be seen anon).

3. m ∈ {−j, −j +1, . . . , j}, m1 ∈ {−j1 , −j1 +1, . . . , j1 }, m2 ∈ {−j2 , −j2 +


1, . . . , j2 } (by convention).
Consider the Euler angle parameterization for the matrix elements, giv-
ing:

C(j1 j2 j; m01 m02 m0 )C ∗ (j1 j2 j; m001 m002 m00 )


1 Z 2π Z 4π Z π
= (2j + 1) dψ dφ sin θdθ (399)
16π 2 0 0 0
exp {−i [(−m0 + m01 + m02 )ψ + (−m00 + m001 + m002 )φ]} djm0 m00 (θ)djm1 0 m00 (θ)djm2 0 m00 (θ).
1 1 2 2

58
We see that this is zero unless m0 = m01 + m02 and m00 = m001 + m002 , verifying
the above assertion. Hence, we have:

C(j1 j2 j; m01 m02 m0 )C ∗ (j1 j2 j; m001 m002 m00 ) (400)


Z π
(2j + 1)
= sin θdθdjm0 m00 (θ)djm1 0 m00 (θ)djm2 0 m00 (θ).
2 0 1 1 2 2

Now, to put things in a more symmetric form, let m03 = −m0 , m003 = −m00 ,
and use
m03 −m00
dj−m
3
0 ,−m00 (θ) = (−)
j
3d 3
m0 ,m00 (θ), (401)
3 3 3 3

to obtain the result:


2 0 00
(−)m3 −m3 C(j1 j2 j3 ; m01 m02 , −m03 )C ∗ (j1 j2 j3 ; m001 m002 , −m003 ) (402)
2j + 1
Z π
= sin θdθdjm1 0 m00 (θ)djm2 0 m00 (θ)djm3 0 m00 (θ).
0 1 1 2 2 3 3

The d-functions are real. Thus, we can choose our arbitrary phase in C for
given j1 j2 j3 so that at least one non-zero coefficient is real. Then, according
to this formula, all coefficients for given j1 j2 j3 will be real. We therefore
adopt the convention that all CG coefficients are real.
The right-hand side of Eqn. 402 is highly symmetric. Thus, it is useful
to define the “3-j” symbol (Wigner):
 
j1 j2 j3 (−)j1 −j2 −m3 C(j1 j2 j3 ; m1 m2 , −m3 )
≡ √ . (403)
m1 m2 m3 2j3 + 1

In terms of these 3-j symbols we have:


  
j1 j2 j3 j1 j2 j3
(404)
m01 m02 m03 m001 m002 m003
Z π
1
= sin θdθ(θ)djm1 0 m00 (θ)djm2 0 m00 (θ)djm3 0 m00 (θ),
2 0 1 1 2 2 3 3

for m01 + m02 + m03 = m001 + m002 + m003 = 0.


According to the symmetry of the right hand side, we see, for example,
that the square of the 3-j symbol is invariant under permutations of the
columns. Furthermore, since
00
djm0 m00 (π − θ) = (−)j−m dj−m0 m00 (θ), (405)

59
we find that
   
j1 j2 j3 j1 +j2 +j3 j1 j2 j3
= (−) . (406)
m1 m2 m3 −m1 −m2 −m3
It is interesting to consider the question of which irreps occur in the
symmetric, and in the anti-symmetric, tensor products of Dj with itself:

Theorem:
h i [j]
X
j j
D (u) ⊗ D (u) = D 2j−2n (u), (407)
s
n=0

h i [j− 21 ]
X
D j (u) ⊗ Dj (u) = D 2j−2n−1 (u), (408)
a
n=0

where s denotes the symmetric tensor product, a denotes the anti-


symmetric tensor product, and [j] means the greatest integer which is
not larger than j.

Proof: We prove this by considering the characters. Let S(u) ≡ [D j (u) ⊗ Dj (u)]s ,
and A(u) ≡ [Dj (u) ⊗ Dj (u)]a . Then, by definition,
1 j j j j

S(m01 m02 )(m001 m002 ) = Dm0 m00 Dm 0 m00 + Dm0 m00 Dm0 m00 , (409)
2 1 1 2 2 1 2 2 1

1 j j j j

A(m01 m02 )(m001 m002 ) = Dm0 m00 Dm 0 m00 − Dm0 m00 Dm0 m00 . (410)
2 1 1 2 2 1 2 2 1

(411)

Taking the traces means to set m01 = m001 and m02 = m002 , and sum over
m01 and m02 . This yields
1 2
TrS(u) = [χ (u) + χj (u2 )] (412)
2 j
1 2
TrA(u) = [χ (u) − χj (u2 )]. (413)
2 j

We need to evaluate χj (u2 ). If u is a rotation by θ, then u2 is a rotation


by 2θ, hence,
j
X
χj (u2 ) = e−2imθ (414)
m=−j

60
2j
X 2j−1
X 2j−2
X
−ikθ −ikθ
= e − e + e−ikθ − . . . (415)
k=−2j k=−2j+1 k=−2j+2
2j
X 2j−n
X
= (−)n e−ikθ (416)
n=0 k=−2j+n
2j
X
= (−)n χ2j−n (u). (417)
n=0

Next, consider χ2j (u). Since


0 +j 00
jX
χj 0 (u)χj 00 (u) = χj (u), (418)
j=|j 0 −j 00 |

we have
2j
X 2j
X
χ2j (u) = χk (u) = χ2j−n (u). (419)
k=0 n=0

Thus,
2j
1X
TrS(u) = [χ2j−n (u) + (−)n χ2j−n (u)] (420)
2 n=0
[j]
X
= χ2j−2n (u). (421)
n=0

Similarly, we obtain:
[j− 21 ]
X
TrA(u) = χ2j−2n−1 (u). (422)
n=0

This completes the proof.

This theorem implies an important symmetry relation for the CG coeffi-


cients when two j’s are equal. From Eqn. 398, we obtain
Z
∗j j1 j1
C(j1 j1 j; m01 m02 m0 )C(j1 j1 j; m001 m002 m00 ) = (2j+1) d(u)Dm 0 m00 (u)Dm0 m00 (u)Dm0 m00 (u).
SU (2) 1 1 2 2

(423)

61
But
j1 j1
Dm 0 m00 (u)Dm0 m00 (u) = S(m0 m0 )(m00 m00 ) (u) + A(m0 m0 )(m00 m00 ) (u),
1 2 1 2 1 2 1 2
(424)
1 1 2 2

and the integral of this with D∗j (u) picks the D j piece of this quantity,
which must be either m01 ↔ m02 symmetric or anti-symmetric, according to
the theorem we have just proven. If symmetric, then

C(j1 j1 j; m02 m01 m0 ) = C(j1 j1 j; m01 m02 m0 ), (425)

and if anti-symmetric, then

C(j1 j1 j; m02 m01 m0 ) = −C(j1 j1 j; m01 m02 m0 ). (426)

Let’s try a simple example: Let j1 = j2 = 12 . That is, we wish to


combine two spin-1/2 systems (with zero orbital angular momentum). From
the theorem:
  [1/2]
1 1 X
D ⊗D 2 2 = D 1−2n = D 1 , (427)
s
n=0
  [0]
1 1 X
D2 ⊗ D2 = D 1−2n−1 = D 0 . (428)
a
n=0

Hence, the spin-1 combination is symmetric, with basis

|j = 1, m = 1; m1 = 12 , m2 = 12 i, (429)
1  
√ |1, 0; 12 , − 12 i + |1, 0; − 12 , 12 i , (430)
2
|1, −1; − 12 , − 12 i, (431)

and the spin-0 combination is antisymmetric:


1  
√ |0, 0; 12 , − 12 i − |0, 0; − 12 , 12 i . (432)
2
The generalization of this example is that the symmetric combinations
are j = 2j1 , 2j1 − 2, . . ., and the antisymmetric combinations are j = 2j1 −
1, 2j1 − 3, . . . Therefore,

C(j1 j1 j; m02 m01 m0 ) = (−)2j1 +j C(j1 j1 j; m01 m02 m0 ). (433)

62
Also,    
j j J 2j+J j j J
= (−) , (434)
m1 m2 M m2 m1 M
as well as the corresponding column permutations, e.g.,
   
j J j 2j+J j J J
= (−) , (435)
m1 M m2 m2 M m1

We adopt a “standard construction” of the 3-j and CG coefficients: First,


they are selected to be real. Second, for any triplet j1 j2 j the 3-j symbols
are then uniquely determined, except for an overall sign which depends on
j1 j2 j only. By convention, we pick (this is a convention when j1 , j2 , j3 are all
different, otherwise it is required):
     
j1 j2 j3 j2 j3 j1 j2 j1 j3
= = (−)j1 +j2 +j3 .
m1 m2 m3 m2 m3 m1 m2 m1 m3
(436)
That is, the 3-j symbol is chosen to be symmetric under cyclic permutation
of the columns, and either symmetric or anti-symmetric, depending on j1 +
j2 + j3 , under anti-cyclic permutations.
Sometimes the symmetry properties are all we need, e.g., to determine
whether some process is permitted or not by angular momentum conserva-
tion. However, we often need to know the CG (or 3-j) coefficients themselves.
These are tabulated in many places. We can also compute them ourselves,
and we now develop a general formula for doing this. We can take the fol-
lowing as the defining relation for the 3-j symbols, i.e., it can be shown to
be consistent with all of the above constraints:
(x1 y2 − x2 y1 )2k3 (x2 y3 − x3 y2 )2k1 (x3 y1 − x1 y3 )2k2
G({k}; {x}, {y}) ≡ q
(2k3 )!(2k1 )!(2k2 )!(j1 + j2 + j3 + 1)!
X  
j1 j2 j3
≡ Pj1 m1 (x1 , y1 )Pj2 m2 (x2 , y2 )Pj3 m3 (x3 , y3 ), (437)
m1 m2 m3 m1 m2 m3

where 2k1 , 2k2 , 2k3 are non-negative integers given by:

2k3 = j1 + j2 − j3 ; 2k1 = j2 + j3 − j1 ; 2k2 = j3 + j1 − j2 . (438)

We’ll skip the proof of this consistency here. The interested reader may
wish to look at T.Regge, Il Nuovo Cimento X (1958) 296; V. Bargmann, Rev.

63
Mod. Phys. 34 (1962) 829. Since Pjm (∂x , ∂y )Pjm0 (x, y) = δmm0 , we obtain
the explicit formula:
 
j1 j2 j3
= Pj1 m1 (∂x1 , ∂y1 )Pj2 m2 (∂x2 , ∂y2 )Pj3 m3 (∂x3 , ∂y3 )G({k}; {x}, {y}).
m1 m2 m3
(439)
For example consider the special case in which j3 = j1 + j2 (and m3 =
−(m1 + m2 )). In this case, k3 = 0, k1 = j2 , and k2 = j1 . Thus,
 
j1 j2 j1 + j2
(440)
m1 m2 −m1 − m2
∂xj11+m1 ∂yj11 −m1 ∂xj22+m2 ∂yj22 −m2 ∂xj13+j2 −m1 −m2 ∂yj13 +j2 +m1 +m2
=q
(j1 + m1 )!(j1 − m1 )!(j2 + m2 )!(j2 − m2 )!(j1 + j2 − m1 − m2 )!(j1 + j2 + m1 + m2 )!
(x2 y3 − x3 y2 )2j2 (x3 y1 − x1 y3 )2j1
× q (441)
(2j2 )!(2j1 )!(2j1 + 2j2 + 1)!
v
u
u (2j1 )!(2j2 )!(j1 + j2 − m1 − m2 )!(j1 + j2 + m1 + m2 )!
j1 +j2 +m1 −m2 t
= (−) . (442)
(2j1 + 2j2 + 1)!(j1 + m1 )!(j1 − m1 )!(j2 + m2 )!(j2 − m2 )!
For j1 = j3 = j, j2 = 0, we find
 
j 0 j (−)j+m
=√ . (443)
m 0 −m 2j + 1
We may easily derive the corresponding formulas for the CG coefficients.
In constructing tables of coefficients, much computation can be saved by
using symmetry relations and orthogonality of states. For example, we have
really already computed the table for the 12 ⊗ 12 = 1 ⊕ 0 case:

C( 12 12 j; m1 m2 m)

j 1 1 0 1
m1 m2 m 1 0 0 -1
1 1
2 2
1
1
√ √
2
− 12 1/ 2 1/ 2
√ √
− 12 1
2
1/ 2 −1/ 2
− 12 − 12 1

64
1 1 3
For 2
⊗1= 2
⊕ 2
we find:

C(1 12 j; m1 m2 m)
3 3 1 3 1 3
j 2 2 2 2 2 2
3 1 1
m1 m2 m 2 2 2
− 12 − 12 − 32
1
1 2
1
√ q
1 − 12 1/ 3 2/3
q √
1
0 2
2/3 −1/ 3
q √
0 − 12 2/3 1/ 3
√ q
1
-1 2
1/ 3 − 2/3
-1 − 12 1

65
14 Wigner-Eckart Theorem
Consider a complex vector space Hop of operators on H which is closed
under U (u)QU −1 (u), where Q ∈ Hop , and U (u) is a continuous unitary
representation of SU (2). We denote the action of an element of the group
SU (2) on Q ∈ Hop by

Û (u)Q = U (u)QU −1 (u). (444)

Corresponding to this action of the group elements, we have the action


of the Lie algebra of SU (2):

Jˆk Q = [Jk , Q]. (445)

We have obtained this result by noting that, picking a rotation about the k
axis, with U (u) = exp(−iθJk ),

U (u)QU −1 (u) = (1 − iθJk )Q(1 + iθJk ) + O(θ 2 ) (446)


= Q − iθ[Jk , Q] + O(θ2 ), (447)

and comparing with

Û (u)Q = exp(−iθ Jˆk )Q = Q − iθ Jˆk Q + O(θ2 ). (448)

We may also compute the commutator:

[Jˆk , Jˆ`]Q = Jˆk [J` , Q] − Jˆ` [Jk , Q] (449)


= [Jk , [J` , Q]] − [J` , [Jk , Q]] (450)
= [[Jk , J` ], Q] (451)
= [ik`m Jm , Q]. (452)

Hence,
[Jˆk , Jˆ` ] = ik`m Jˆm . (453)

Def: A set of operators Q(j, m), m = −j, −j + 1, . . . , j consists of the 2j + 1


components of a spherical tensor of rank j if:

1. Jˆ3 Q(j, m) = [J3 , Q(j, m)] = mQ(j, m).


q
2. Jˆ+ Q(j, m) = [J+ , Q(j, m)] = (j − m)(j + m + 1)Q(j, m + 1).

66
q
3. Jˆ− Q(j, m) = [J− , Q(j, m)] = (j + m)(j − m + 1)Q(j, m − 1).
Equivalently,
3
X
2
ĴJ Q(j, m) = (Jˆk )2 Q(j, m) (454)
k=1
X3
= [Jk , [Jk , Q(j, m)]] (455)
k=1
= j(j + 1)Q(j, m), (456)
and X j
U (u)Q(j, m)U −1 (u) = Dm 0
0 m (u)Q(j, m ). (457)
m0
That is, the set of operators Q(j, m) forms a spherical tensor if they
transform under rotations like the basis vectors |jmi in the standard
representation.
For such an object, we conclude that the matrix elements of Q(j, m) must
depend on the m values in a particular way (letting k denote any “other”
quantum numbers describing our state in the situation):
⟨(j′m′)(k′)|Q(j, m)|(j″m″)(k″)⟩   (458)
 = “⟨j′m′|(|jm⟩|j″m″⟩) ⟨j′, k′||Qj ||j″, k″⟩”   (459)
 = A(j, j′, j″) C(jj″j′; mm″m′) ⟨j′, k′||Qj ||j″, k″⟩,   (460)

where a common convention lets

A(j, j′, j″) = (−)^(j+j′−j″) / √(2j′ + 1).   (461)
The symbol ⟨j′, k′||Qj ||j″, k″⟩ is called the reduced matrix element of the
tensor Qj . Eqn. 460 is the statement of the “Wigner-Eckart theorem”.
Let us try some examples:
1. Scalar operator: In the case of a scalar operator, there is only one
component:
Q(j, m) = Q(0, 0). (462)
The Wigner-Eckart theorem reads
⟨(j′m′)(k′)|Q(0, 0)|(j″m″)(k″)⟩   (463)
 = [ (−)^(j′−j″) / √(2j′ + 1) ] C(0j″j′; 0m″m′) ⟨j′, k′||Q0 ||j″, k″⟩.   (464)

But
C(0j″j′; 0m″m′) = ⟨j′m′|(|00⟩|j″m″⟩)   (465)
 = δj′j″ δm′m″ ,   (466)

and hence,

⟨(j′m′)(k′)|Q(0, 0)|(j″m″)(k″)⟩ = δj′j″ δm′m″ ⟨j′, k′||Q0 ||j″, k″⟩ / √(2j′ + 1).   (467)
The presence of the Kronecker deltas tells us that a scalar operator
cannot change the angular momentum of a system, i.e., the matrix
element of the operator between states of differing angular momenta is
zero.
2. Vector operator: For j = 1 the Wigner-Eckart theorem is:
⟨(j′m′)(k′)|Q(1, m)|(j″m″)(k″)⟩   (468)
 = [ (−)^(1+j′−j″) / √(2j′ + 1) ] C(1j″j′; mm″m′) ⟨j′, k′||Q1 ||j″, k″⟩.   (469)
Before pursuing this equation with an example, let’s consider the con-
struction of the tensor components of a vector operator. We are given,
say, the Cartesian components of the operator: Q = (Qx , Qy , Qz ). We
wish to find the tensor components Q(1, −1), Q(1, 0), Q(1, 1) in terms
of these Cartesian components, in the standard representation. We
must have:
Jˆ3 Q(j, m) = [J3 , Q(j, m)] = mQ(j, m),   (470)
Jˆ± Q(j, m) = [J± , Q(j, m)] = √[ (j ∓ m)(j ± m + 1) ] Q(j, m ± 1).   (471)

The Qx , Qy , Qz components of a vector operator obey the commutation
relations with angular momentum:

[Jk , Qℓ ] = iεkℓm Qm .   (472)

Thus, consistency with the desired relations is obtained if

Q(1, 1) = −(1/√2)(Qx + iQy )   (473)
Q(1, 0) = Qz   (474)
Q(1, −1) = (1/√2)(Qx − iQy ).   (475)

These, then, are the components of a spherical tensor of rank 1, ex-
pressed in terms of Cartesian components.
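As a small numerical illustration (added here; it is not needed for the argument), one can build these spherical components for the spin-1 matrices, taking Q to be J itself, and verify the defining commutation relations of a rank-1 tensor directly:

    # Verify Eqs. (473)-(475) numerically for spin 1, with Q taken to be J itself.
    import numpy as np

    s2 = np.sqrt(2.0)
    Jz = np.diag([1.0, 0.0, -1.0]).astype(complex)
    Jp = np.array([[0, s2, 0], [0, 0, s2], [0, 0, 0]], dtype=complex)   # J_+
    Jm = Jp.conj().T                                                    # J_-
    Jx, Jy = (Jp + Jm) / 2, (Jp - Jm) / 2j

    def comm(A, B):
        return A @ B - B @ A

    # Spherical components of the vector operator, Eqs. (473)-(475):
    Q = {1: -(Jx + 1j*Jy)/s2, 0: Jz, -1: (Jx - 1j*Jy)/s2}

    for m in (-1, 0, 1):
        # [J_3, Q(1,m)] = m Q(1,m)
        assert np.allclose(comm(Jz, Q[m]), m * Q[m])
        # [J_+, Q(1,m)] = sqrt((1-m)(1+m+1)) Q(1,m+1)  (zero when m = +1)
        target = np.zeros((3, 3)) if m == 1 else np.sqrt((1 - m)*(m + 2)) * Q[m + 1]
        assert np.allclose(comm(Jp, Q[m]), target)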
Now let’s consider the case where Q = J, that is, the case in which
our vector operator is the angular momentum operator itself. To evaluate
the reduced matrix element, we choose any convenient component, for
example, Q(1, 0) = Jz . Hence,

⟨j′m′(k′)|Jz |j″m″(k″)⟩ = δj′j″ δm′m″ δk′k″ m″   (476)
 = [ ⟨j′, k′||J||j″, k″⟩ / √(2j′ + 1) ] C(1j″j′; 0m″m′) (−)^(1+j′−j″).   (477)

We see that the reduced matrix element vanishes unless j′ = j″ (and k′ = k″).
The relevant CG coefficients are given by

                                            ( 1   j′    j′ )
C(1j′j′; 0m′m′) = √(2j′ + 1) (−)^(1−j′+m′)  ( 0   m′   −m′ ) ,   (478)

where

( 1   j′    j′ )
( 0   m′   −m′ )   (479)

 = P10 (∂x1 , ∂y1 ) Pj′m′ (∂x2 , ∂y2 ) Pj′,−m′ (∂x3 , ∂y3 ) G({k}; {x}, {y})   (480)

 = (−)^(j′−m′) 2m′ / √[ (2j′ + 2)(2j′ + 1)(2j′) ] ,   (481)

which the reader is invited to demonstrate (with straightforward, though
somewhat tedious, algebra). Therefore,

C(1j′j′; 0m′m′) = −m′ / √[ j′(j′ + 1) ] .   (482)

Inserting the CG coefficients into our expression for the reduced matrix
element, we find

⟨j′, k′||J||j″, k″⟩ / √(2j′ + 1) = δj′j″ δk′k″ √[ j′(j′ + 1) ] .   (483)

We see that this expression behaves like √⟨J²⟩. Plugging back into the
Wigner-Eckart theorem, we find:

⟨(j′m′)(k′)|Jm |(j″m″)(k″)⟩   (484)
 = δj′j″ δk′k″ √[ j′(j′ + 1) ] [−C(1j′j′; mm″m′)],   (485)

where Jm = J1 , J0 , J−1 denotes the tensor components of J.


Consider, for example, J(1, 1) = −(1/√2) J+ . The Wigner-Eckart theorem
now tells us that

⟨(j, m + 1)| −(1/√2) J+ |(j, m)⟩ = −(1/√2) √[ (j − m)(j + m + 1) ]   (486)
 = √[ j(j + 1) ] [−C(1jj; 1m m + 1)].   (487)

Thus, we have found an expression which may be employed to compute


some CG coefficients:
C(1jj; 1m m + 1) = √[ (j − m)(j + m + 1) / (2j(j + 1)) ] .   (488)

We see that the Wigner-Eckart theorem applied to J itself can be used


to determine CG coefficients.
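For instance (an added check, not part of the original notes), Eq. (488) can be compared with sympy's Clebsch-Gordan coefficients:

    # Compare Eq. (488) with sympy's CG coefficients, <1,1; j,m | j,m+1>.
    from sympy import S, sqrt, simplify
    from sympy.physics.quantum.cg import CG

    def cg_from_wigner_eckart(j, m):
        return sqrt((j - m) * (j + m + 1) / (2 * j * (j + 1)))   # Eq. (488)

    for (j, m) in [(S(1)/2, -S(1)/2), (1, 0), (S(3)/2, S(1)/2), (2, -2)]:
        direct = CG(1, 1, j, m, j, m + 1).doit()
        assert simplify(direct - cg_from_wigner_eckart(j, m)) == 0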

15 Exercises
1. Prove the theorem we state in this note:

Theorem: The most general mapping x → x′ of R3 into itself, such


that the origin is mapped into the origin, and such that all dis-
tances are preserved, is a linear, real orthogonal transformation
Q:

x′ = Qx, where QT Q = I, and Q∗ = Q.   (489)

Hence,
x′ · y′ = x · y ∀ points x, y ∈ R3 .   (490)
For such a mapping, either:

(a) det(Q) = 1, Q is called a proper orthogonal transformation,
and is in fact a rotation. In this case,

x′ × y′ = (x × y)′ ∀ points x, y ∈ R3 .   (491)

or,
(b) det(Q) = −1, Q is called an improper orthogonal trans-
formation, and is the product of a reflection (parity) and a
rotation. In this case,

x′ × y′ = −(x × y)′ ∀ points x, y ∈ R3 .   (492)

The set of all orthogonal transformations forms a group (denoted


O(3)), and the set of all proper orthogonal transformations forms
a subgroup (O + (3) or SO(3) of O(3)), identical with the set of all
rotations.

[You may wish to make use of the following intuitive lemma: Let
e′1 , e′2 , e′3 be any three mutually perpendicular unit vectors such that:

e′3 = e′1 × e′2 (right-handed system).   (493)

Then there exists a unique rotation Ru (θ) such that

e′i = Ru (θ)ei , i = 1, 2, 3. ]   (494)

2. We stated the following generalization of the addition law for tangents:

Theorem: If Re (θ) = Re″ (θ″)Re′ (θ′), and defining:

τ = e tan θ/2   (495)
τ′ = e′ tan θ′/2   (496)
τ″ = e″ tan θ″/2,   (497)

then:
τ = (τ′ + τ″ + τ″ × τ′) / (1 − τ′ · τ″).   (498)

A simple way to prove this theorem is to use SU(2) to represent the
rotations, i.e., the rotation Re (θ) is represented by the SU(2) matrix
exp(−(i/2) θ e · σ). You are asked to carry out this proof.

3. We made the assertion that if we had an element u ∈ SU (2) which


commuted with every element of the vector space of traceless 2 × 2
Hermitian matrices, then u must be a multiple of the identity (i.e.,
either u = I or u = −I). Let us demonstrate this, learning a little
more group theory along the way.
First, we note that if we have a matrix group, it is possible to generate
another matrix representation of the group by replacing each element
with another according to the mapping:

u → v (499)
where
v = SuS −1 . (500)

and S is any chosen non-singular matrix.

(a) Show that if {u} is a matrix group, then {v|v = SuS −1 ; S a


non-singular matrix} is a representation of the group (i.e., the
mapping is 1 : 1 and the multiplication table is preserved under
the mapping). The representations {u} and {v} are considered to
be equivalent.
A group of unitary matrices is said to be reducible if there ex-
ists a mapping of the above form such that every element may
simultaneously be written in block-diagonal form:
 
        ( A(g)    0     0 )
M(g) =  (  0    B(g)    0 )
        (  0      0     ⋱ )

∀g ∈ the group (A(g) and B(g) are sub-matrices).


(b) Show that SU (2) is not reducible (i.e., SU (2) is irreducible).
(c) Now prove the following useful lemma: A matrix which commutes
with every element of an irreducible matrix group is a multiple of
the identity matrix. [Hint: Let B be such a matrix commuting
with every element, and consider the eigenvector equation Bx =

λx. Then consider the vector ux where u is any element of the
group, and x is the eigenvector corresponding to eigenvalue λ.]
(d) Finally, prove the assertion we stated at the beginning of this
problem.

4. We have discussed rotations using the language of group theory. Let us


look at a simple application of group theory in determining “selection
rules” implied by symmetry under the elements of the group (where
the group is a group of operations, such as rotations). The point is
that we can often predict much about the physics of a situation simply
by “symmetry” arguments, without resorting to a detailed solution.
Consider a positronium “atom”, i.e., the bound state of an electron and
a positron. The relevant binding interaction here is electromagnetism.
The electromagnetic interaction doesn’t depend on the orientation of
the system, that is, it is invariant with respect to rotations. It also
is invariant with respect to reflections. You may wish to convince
yourself of these statements by writing down an explicit Hamiltonian,
and verifying the invariance.
Thus, the Hamiltonian for positronium is invariant with respect to
the group O(3), and hence commutes with any element of this group.
Hence, angular momentum and parity are conserved, and the eigen-
states of energy are also eigenstates of parity and total angular mo-
mentum (J). In fact, the spin and orbital angular momentum degrees
of freedom are sufficiently decoupled that the total spin (S) and or-
bital angular momentum (L) are also good quantum numbers for the
energy eigenstates to an excellent approximation. The ground state
of positronium (“parapositronium”) is the 1 S0 state in 2S+1 LJ spectro-
scopic notation, where L = S means zero orbital angular momentum.
Note that the letter “S” in the spectroscopic notation is not the same
as the “S” referring to the total spin quantum number. Sorry about
the confusion, but it’s established tradition. . .
In the ground state, the positron and electron have no relative orbital
angular momentum, and their spins are anti-parallel. The first excited
state (“orthopositronium”) is the 3 S1 state, in which the spins of the
positron and electron are now aligned parallel with each other. The
3S1 − 1S0 splitting is very small, and is analogous to the “hyperfine”
splitting in normal atoms.

Positronium decays when the electron and positron annihilate to pro-
duce photons. The decay process is also electromagnetic, hence also
governed by a Hamiltonian which is invariant under O(3). As a conse-
quence of this symmetry, angular momentum and parity are conserved
in the decay.

(a) We said that parity was a good quantum number for positronium
states. To say just what the parity is, we need to anticipate a
result from the Dirac equation (sorry): The intrinsic parities of
the electron and positron are opposite. What is the parity of
parapositronium? Of orthopositronium?
(b) We wish to know whether positronium can decay into two photons.
Let us check parity conservation. What are the possible parities
of a state of two photons, in the center-of-mass frame? Can you
exclude the decay of positronium to two photons on the basis of
parity conservation?
(c) Let us consider now whether rotational invariance, i.e., conserva-
tion of angular momentum, puts any constraints on the permitted
decays of positronium. Can the orthopositronium state decay to
two photons? What about the parapositronium state?

5. The “charge conjugation” operator, C, is an operator that changes all


particles into their anti-particles. Consider the group of order 2 gen-
erated by the charge conjugation operator. This group has elements
{I, C}, where I is the identity element. The electromagnetic interac-
tion is invariant with respect to the actions of this group. That is, any
electromagnetic process for a system of particles should proceed iden-
tically if all the particles are replaced by their anti-particles. Hence, C
is a conserved quantity. Let’s consider the implications of this for the
1S0 and 3S1 positronium decays to two photons. [See the preceding
exercise for a discussion of positronium. Note that you needn’t have
done that problem in order to do this one.]

(a) The result of operating C on a photon is to give a photon, i.e.,


the photon is its own anti-particle, and is thus an eigenstate of
C. What is the eigenvalue? That is, what is the “C-parity” of
the photon? You should give your reasoning. No credit will be
given for just writing down the answer, even if correct. [Hint:

think classically about electromagnetic fields and how they are
produced.] Hence, what is the C-parity of a system of n photons?
(b) It is a bit trickier to figure out the charge conjugation of the
positronium states. Since these are states consisting of a particle
and its antiparticle, we suspect that they may also be eigenstates
of C. But is the eigenvalue positive or negative? To determine
this, we need to know a bit more than we know so far.
Let me give an heuristic argument for the new understanding that
we need. First, although we haven’t talked about it yet, you prob-
ably already know about the “Pauli Exclusion Principle”, which
states that two identical fermions cannot be in the same state.
Suppose we have a state consisting of two electrons, |x1 , S 1 ; x2 , S 2 ⟩.
We may borrow an idea we introduced in our discussion of the har-
monic oscillator, and define a “creation operator”, a† (x, S ), which
creates an electron at x with spin S . Consider the two-electron
state:

[a† (x1 , S 1 )a† (x2 , S 2 ) + a† (x2 , S 2 )a† (x1 , S 1 )]|0⟩,   (501)

where |0i is the “vacuum” state, with no electrons. But this puts
both electrons in the same state, since it is invariant under the in-
terchange 1 ↔ 2. Therefore, in order to satisfy the Pauli principle,
we must have that

a† (x1 , S 1 )a† (x2 , S 2 ) + a† (x2 , S 2 )a† (x1 , S 1 ) = 0.   (502)

That is, the creation operators anti-commute. To put it an-


other way, if two electrons are interchanged, a minus sign is in-
troduced. You may be concerned that a positron and an electron
are non-identical particles, so maybe this has nothing to do with
positronium. However, the relativistic description is such that the
positron and electron may be regarded as different “components”
of the electron (e.g., the positron may be interpreted in terms of
“negative-energy” electron states), so this anti-commutation rela-
tion is preserved even when creating electrons and positrons.
Determine the C-parity of the 3 S1 and 1 S0 states of positronium,
and thus deduce whether decays to two photons are permitted
according to conservation of C. [Hint: Consider a positronium

state and let C act on it. Relate this back to the original state by
appropriate transformations.]

6. Suppose we have a system with total angular momentum 1. We pick


a basis corresponding to the three eigenvectors of the z-component of
angular momentum, Jz , with eigenvalues +1, 0, −1, respectively. We
are given an ensemble described by density matrix:
 
ρ = (1/4) ( 2  1  1 )
          ( 1  1  0 ) .
          ( 1  0  1 )

(a) Is ρ a permissible density matrix? Give your reasoning. For the


remainder of this problem, assume that it is permissible. Does it
describe a pure or mixed state? Give your reasoning.
(b) Given the ensemble described by ρ, what is the average value of
Jz ?
(c) What is the spread (standard deviation) in measured values of Jz ?

7. Let us consider the application of the density matrix formalism to the


problem of a spin-1/2 particle (such as an electron) in a static external
magnetic field. In general, a particle with spin may carry a magnetic
moment, oriented along the spin direction (by symmetry). For spin-
1/2, we have that the magnetic moment (operator) is thus of the form:
µ = (γ/2) σ,   (503)

where σ are the Pauli matrices, the 1/2 is by convention, and γ is a
constant, giving the strength of the moment, called the gyromagnetic
ratio. The term in the Hamiltonian for such a magnetic moment in an
external magnetic field, B is just:

H = −µ · B.   (504)

Our spin-1/2 particle may have some spin-orientation, or “polarization


vector”, given by:
P = ⟨σ⟩.   (505)

Drawing from our classical intuition, we might expect that in the ex-
ternal magnetic field the polarization vector will exhibit a precession
about the field direction. Let us investigate this.
Recall that the expectation value of an operator may be computed from
the density matrix according to:

⟨A⟩ = Tr(ρA).   (506)

Furthermore, recall that the time evolution of the density matrix is


given by:
i ∂ρ/∂t = [H(t), ρ(t)].   (507)
What is the time evolution, dP/dt, of the polarization vector? Express
your answer as simply as you can (more credit will be given for right
answers that are more physically transparent than for right answers
which are not). Note that we make no assumption concerning the
purity of the state.
8. Let us consider a system of N spin-1/2 particles (as in the previous
problem) per unit volume in thermal equilibrium, in our external mag-
netic field B. [Even though we refer to the previous exercise, the solu-
tion to this problem does not require solving the previous one.] Recall
that the canonical distribution is:
ρ = e−H/T /Z ,   (508)
with partition function:
 
Z = Tr e−H/T . (509)

Such a system of particles will tend to orient along the magnetic field,
resulting in a bulk magnetization (having units of magnetic moment
per unit volume), M.
(a) Give an expression for this magnetization (don’t work too hard to
evaluate).
(b) What is the magnetization in the high-temperature limit, to lowest
non-trivial order (this I want you to evaluate as completely as you
can!)?

9. We have discussed Lie algebras (with Lie product given by the com-
mutator) and Lie groups, in our attempt to deal with rotations. At one
point, we asserted that the structure (multiplication table) of the Lie
group in some neighborhood of the identity was completely determined
by the structure (multiplication table) of the Lie algebra. We noted
that, however intuitively pleasing this might sound, it was not actually
a trivial statement, and that it followed from the “Baker-Campbell-
Hausdorff” theorem. Let’s try to tidy this up a bit further here.
First, let’s set up some notation: Let L be a Lie algebra, and G be the
Lie group generated by this algebra. Let X, Y ∈ L be two elements of
the algebra. These generate the elements eX , eY ∈ G of the Lie group.
We assume the notion that if X and Y are close to the zero element of
the Lie algebra, then eX and eY will be close to the identity element of
the Lie group.
What we want to show is that the group product eX eY may be ex-
pressed in the form eZ , where Z ∈ L, at least for X and Y not too
“large”. Note that the non-trivial aspect of this problem is that, first,
X and Y may not commute, and second, objects of the form XY may
not be in the Lie algebra. Elements of L generated by X and Y must
be linear combinations of X, Y , and their repeated commutators.

(a) Suppose X and Y commute. Show explicitly that the product


eX eY is of the form eZ , where Z is an element of L. (If you think
this is trivial, don’t worry, it is!)
(b) Now suppose that X and Y may not commute, but that they are
very close to the zero element. Keeping terms to quadratic order
in X and Y , show once again that the product eX eY is of the form
eZ , where Z is an element of L. Give an explicit expression for Z.
(c) Finally, for more of a challenge, let’s do the general theorem: Show
that eX eY is of the form eZ , where Z is an element of L, as long
as X and Y are sufficiently “small”. We won’t concern ourselves
here with how “small” X and Y need to be – you may investigate
that at more leisure.
Here are some hints that may help you: First, we remark that the
differential equation
df/du = Xf (u) + g(u),   (510)
where X ∈ L, and letting f (0) = f0 , has the solution:
f (u) = euX f0 + ∫_0^u e(u−v)X g(v) dv.   (511)

This can be readily verified by back-substitution. If g is indepen-


dent of u, then the integral may be performed, with the result:
f (u) = euX f0 + h(u, X)g, (512)
Where, formally,
h(u, X) = (euX − 1)/X .   (513)
Second, if X, Y ∈ L, then
eX Y e−X = eXc (Y ), (514)
where I have introduced the notation “Xc” to mean “take the
commutator”. That is, Xc (Y ) ≡ [X, Y ]. This fact may be demon-
strated by taking the derivative of
A(u; Y ) ≡ euX Y e−uX (515)
with respect to u, and comparing with our differential equation
above to obtain the desired result.
Third, assuming X = X(u) is differentiable, we have
eX(u) (d/du) e−X(u) = −h(1, X(u)c) dX/du .   (516)
This fact may be verified by considering the object:
B(t, u) ≡ etX(u) (∂/∂u) e−tX(u) ,   (517)
and differentiating (carefully!) with respect to t, using the above
two facts, and finally letting t = 1.
One final hint: Consider the quantity
 
Z(u) = ln euX eY . (518)
The series:
ℓ(z) = ln z/(z − 1) = 1 − (z − 1)/2 + (z − 1)²/3 − · · ·   (519)
plays a role in the explicit form for the result. Again, you are not
asked to worry about convergence issues.
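A quick numerical experiment (an added sketch, not part of the exercise) shows what is being claimed: for small generic matrices, log(e^X e^Y) agrees with the first few Baker-Campbell-Hausdorff terms, and the agreement improves order by order. The matrix exponential and logarithm are taken from scipy.

    # Numerical illustration of the BCH expansion for small generic matrices.
    import numpy as np
    from scipy.linalg import expm, logm

    def comm(A, B):
        return A @ B - B @ A

    rng = np.random.default_rng(0)
    eps = 1e-2
    X = eps * rng.normal(size=(3, 3))
    Y = eps * rng.normal(size=(3, 3))

    Z = logm(expm(X) @ expm(Y))
    Z_bch = X + Y + comm(X, Y)/2 + (comm(X, comm(X, Y)) + comm(Y, comm(Y, X)))/12

    print(np.max(np.abs(Z - (X + Y))))   # O(eps^2): the commutator term matters
    print(np.max(np.abs(Z - Z_bch)))     # O(eps^4): agreement through third order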

10. In an earlier exercise we considered the implication of rotational in-
variance for the decays of positronium states into two photons. Let us
generalize and broaden that discussion here. Certain neutral particles
(e.g., π0 , η, η′ ) are observed to decay into two photons, and others (e.g.,
ω, φ, ψ) are not. Let us investigate the selection rules implied by an-
gular momentum and parity conservation (satisfied by electromagnetic
and strong interactions) for the decay of a particle (call it “X”) into
two photons. Thus, we ask the question, what angular momentum J
and parity P states are allowed for two photons?
Set up the problem in the center-of-mass frame of the two photons,
with the z-axis in the direction of one photon. We know that since a
photon is a massless spin-one particle, it has two possible spin states,
which we can describe by its “helicity”, i.e., its spin projection along its
direction of motion, which can take on the values ±1. Thus, a system
of two photons can have the spin states:

|↑↑⟩, |↓↓⟩, |↑↓ + ↓↑⟩, |↑↓ − ↓↑⟩

(The first arrow refers to the photon in the +z direction, the second
to the photon in the −z direction, and the direction of the arrow in-
dicates the spin component along the z-axis, NOT to the helicity.)
We consider the effect on these states of three operations (which, by
parity and angular momentum conservation, should commute with the
Hamiltonian):

• P : parity – reverses direction of motion of a particle, but leaves


its angular momentum unaltered.
• Rz (α): rotation by angle α about the z-axis. A state with a given
value of Jz (z-component of angular momentum) is an eigenstate,
with eigenvalue eiαJz .
• Rx (π): rotation by π about x-axis. For our two photons, this
reverses the direction of motion and also the angular momentum
of each photon. For our “X” particle, this operation has the effect
corresponding to the effect on the spherical harmonic with the
appropriate eigenvalues:

Rx (π)YJJz (Ω)

(Note that the Ylm functions are sufficient, since a fermion ob-
viously can’t decay into two photons and conserve angular mo-
mentum – hence X is a boson, and we needn’t consider 12 -integer
spins.)
Make sure that the above statements are intuitively clear to you.
(a) By considering the actions of these operations on our two-photon
states, complete the following table: (one entry is filled in for you)

Photonic Spin                 Transformation
   State               P          Rz (α)        Rx (π)
   |↑↑⟩             +|↑↑⟩
   |↓↓⟩
   |↑↓ + ↓↑⟩
   |↑↓ − ↓↑⟩
(b) Now fill in a table of eigenvalues for a state (i.e., our particle
“X”) of arbitrary integer spin J and parity P (or, if states are not
eigenvectors, what the transformations yield):
  Spin J               Transformation
                  P       Rz (α)      Rx (π)
     0           +1
                 −1
     1           +1
                 −1
 2, 4, 6, ...    +1
                 −1
 3, 5, 7, ...    +1
                 −1

Note that there may be more than one eigenvalue of Rz (α) for a
given row, corresponding to the different possible values of Jz .
(c) Finally, by using your answers to parts (a) and (b), determine the
allowed and forbidden J P states decaying into two photons, and
the appropriate photonic helicity states for the allowed transitions.
Put your answer in the form of a table:
 Parity                    Spin
                0          1       2,4,...    3,5,...
   +1
   −1       |↑↓ − ↓↑⟩

You have (I hope) just derived something which is often referred to as
“Yang’s theorem”. Note: People often get this wrong, so be careful!
11. We said that if we are given an arbitrary representation, D(u), of
SU (2), it may be reduced to a direct sum of irreps D r (u):
D(u) = Σ_r ⊕ D^r (u).   (520)

The multiplicities mj (the number of irreducible representations D r


which belong to the equivalence class of irreducible representations
characterized by index j) are unique, and they are given by:
mj = ∫_{SU(2)} d(u) χj (u−1 ) χ(u),   (521)

where χ(u) = Tr [D(u)].


(a) Suppose you are given a representation, with characters:

χ [ue (θ)] = 1 + [ sin(3θ/2) + 2 sin(7θ/4) cos(θ/4) ] / sin(θ/2) .   (522)
What irreducible representations appear in the reduction of this
representation, with what multiplicities?
(b) Does the representation we are given look like it could correspond
to rotations of a physically realizable system? Discuss.
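One practical way to carry out such a reduction (an added sketch, not part of the exercise) is to evaluate Eq. (521) numerically. For class functions the SU(2) integral reduces to an integral over the rotation angle; the weight (1/π) sin²(θ/2) dθ on [0, 2π] used below is an assumption chosen so that the characters χj(θ) = sin[(2j+1)θ/2]/sin(θ/2) come out orthonormal.

    # Project the character of Eq. (522) onto SU(2) irrep characters.
    import numpy as np
    from scipy.integrate import quad

    def chi_irrep(j, th):
        return np.sin((2*j + 1)*th/2) / np.sin(th/2)

    def chi_given(th):                     # Eq. (522)
        return 1 + (np.sin(1.5*th) + 2*np.sin(1.75*th)*np.cos(0.25*th)) / np.sin(th/2)

    def multiplicity(j):
        integrand = lambda th: np.sin(th/2)**2 * chi_irrep(j, th) * chi_given(th) / np.pi
        val, _ = quad(integrand, 1e-8, 2*np.pi - 1e-8)
        return val

    for j in [0, 0.5, 1, 1.5, 2, 2.5]:
        print("m_%s = %+.3f" % (j, multiplicity(j)))   # nonzero values give the reduction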
12. In nuclear physics, we have the notion of “charge independence”, or
the idea that the nuclear (strong) force does not depend on whether
we are dealing with neutrons or protons. Thus, the nuclear force is
supposed to be symmetric with respect to unitary transformations on
a space with basis vectors p = (1, 0)ᵀ , n = (0, 1)ᵀ . Neglecting the overall
phase symmetry, we see that the symmetry group is SU(2), just as for
rotations. As with angular momentum, we can generate representations
of other dimensions, and work with systems of more than one nucleon.
By analogy with angular momentum, we say that the neutron and
proton form an “isotopic spin” (or “isospin”) doublet, with
|n⟩ = |I = 1/2, I3 = −1/2⟩   (523)
|p⟩ = |I = 1/2, I3 = +1/2⟩   (524)
(The symbol “T ” is also often used for isospin). Everything you know
about SU(2) can now be applied in isotopic spin space.
Study the isobar level diagram of the He6 , Li6 , Be6 nuclear level schemes,
and discuss in detail the evidence for charge independence of the nu-
clear force. These graphs are quantitative, so your discussion should
also be quantitative. Also, since these are real-life physical systems,
you should worry about real-life effects which can modify an idealized
vision.
You may find an appropriate level scheme via a google search (you want
a level diagram for the nuclear isobars of 6 nucleons), e.g., at:
http://www.tunl.duke.edu/nucldata/figures/06figs/06 is.pdf
For additional reference, you might find it of interest to look up:
F. Ajzenberg-Selove, “Energy Levels of Light Nuclei, A = 5-10,” Nucl.
Phys. A490 1-225 (1988)
(see also http://www.tunl.duke.edu/nucldata/fas/88AJ01.shtml).
13. We defined the “little-d” functions according to:
d^j_{m1 m2}(θ) = D^j_{m1 m2}(0, θ, 0) = ⟨j, m1 |e−iθJ2 |j, m2 ⟩

where the matrix elements D^j_{m1 m2}(ψ, θ, φ), parameterized by Euler an-
gles ψ, θ, φ, are given in the “standard representation” by:

D^j_{m1 m2}(ψ, θ, φ) = ⟨j, m1 |D^j (u)|j, m2 ⟩ = e−i(m1 ψ+m2 φ) d^j_{m1 m2}(θ)

We note that an explicit calculation for these matrix elements is pos-
sible via:

D^j_{m1 m2}(u) = Pjm1 (∂x , ∂y )Pjm2 (u11 x + u21 y, u12 x + u22 y)   (525)

where

Pjm (x, y) ≡ x^(j+m) y^(j−m) / √[ (j + m)!(j − m)! ] .   (526)

Prove the following handy formulas for the d^j_{m1 m2}(θ) functions:

a) d^j_{m1 m2}(θ)∗ = d^j_{m1 m2}(θ)  (reality of the dj functions)

b) d^j_{m1 m2}(−θ) = d^j_{m2 m1}(θ)

c) d^j_{m1 m2}(θ) = (−)^(m1−m2) d^j_{m2 m1}(θ)

d) d^j_{−m1,−m2}(θ) = (−)^(m1−m2) d^j_{m1 m2}(θ)

e) d^j_{m1 m2}(π − θ) = (−)^(j−m2) d^j_{−m1,m2}(θ) = (−)^(j+m1) d^j_{m1,−m2}(θ)

f) d^j_{m1 m2}(2π + θ) = (−)^(2j) d^j_{m1 m2}(θ)
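These identities are easy to spot-check by machine before proving them. The sketch below is an addition to the notes; it assumes that sympy's Rotation.d follows the same standard-representation convention as used here, and verifies (b)-(e) symbolically for j = 1.

    # Symbolic spot-check of identities (b)-(e) for j = 1.
    from sympy import Symbol, pi, simplify
    from sympy.physics.quantum.spin import Rotation

    theta = Symbol('theta', real=True)

    def d(j, m1, m2, angle):
        return Rotation.d(j, m1, m2, angle).doit()

    j = 1
    for m1 in (-1, 0, 1):
        for m2 in (-1, 0, 1):
            assert simplify(d(j, m1, m2, -theta) - d(j, m2, m1, theta)) == 0           # (b)
            assert simplify(d(j, m1, m2, theta)
                            - (-1)**(m1 - m2) * d(j, m2, m1, theta)) == 0              # (c)
            assert simplify(d(j, -m1, -m2, theta)
                            - (-1)**(m1 - m2) * d(j, m1, m2, theta)) == 0              # (d)
            assert simplify(d(j, m1, m2, pi - theta)
                            - (-1)**(j - m2) * d(j, -m1, m2, theta)) == 0              # (e)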

14. We would like to consider the (qualitative) effects on the energy levels
of an atom which is moved from freedom to an external potential (a
crystal, say) with cubic symmetry. Let us consider a one-electron atom
and ignore spin for simplicity. Recall that the wave function for the
case of the free atom looks something like Rnl (r)Ylm (θ, φ), and that
all states with the same n and l quantum numbers have the same
energy, i .e., are (2l + 1)-fold degenerate. The Hamiltonian for a free
atom must have the symmetry of the full rotation group, as there are
no special directions. Thus, we recall some properties of this group
for the present discussion. First, we remark that the set of functions
{Ylm : m = −l, −l + 1, · · · , l − 1, l} for a given l forms the basis for a
(2l + 1)-dimensional subspace which is invariant under the operations
of the full rotation group. [A set {ψi } of vectors is said to span an
invariant subspace Vs under a given set of operations {Pj } if Pj ψi ∈ Vs
∀i, j.] Furthermore, this subspace is “irreducible,” that is, it cannot be
split into smaller subspaces which are also invariant under the rotation
group.
Let us denote the linear transformation operator corresponding to ele-
ment R of the rotation group by the symbol P̂R , i.e.:

P̂R f (~x) = f (R−1~x)

The way to think about this equation is to regard the left side as giving
a “rotated function,” which we evaluate at point ~x. The right side tells
us that this is the same as the original function evaluated at the point
R−1~x, where R−1 is the inverse of the rotation matrix corresponding to
rotation R. Since {Ylm } forms an invariant subspace, we must have:
P̂R Ylm = Σ_{m′=−l}^{l} Ylm′ D l (R)m′m

The expansion coefficients, Dl (R)m0 m , can be regarded as the elements


of a matrix D l (R). As discussed in the note, D ` corresponds to an
irreducible representation of the rotation group.
Thus, for a free atom, we have that the degenerate eigenfunctions of
a given energy must transform according to an irreducible representa-
tion of this group. If the eigenfunctions transform according to the lth
representation, the degeneracy of the energy level is (2l + 1) (assuming
no additional, “accidental” degeneracy).
I remind you of the following:
Definition: Two elements a and b of a group are said to belong to the
same “class” (or “equivalence class” or “conjugate class”) if there exists
a group element g such that g −1 ag = b.
The first two parts of this problem, are really already done in the note,
but here is an opportunity to think about it for yourself:

(a) Show that all proper rotations through the same angle ϕ, about
any axis, belong to the same class of the rotation group.
(b) We will need the character table of this group. Since all elements
in the same class have the same character, we pick a convenient
element in each class by considering rotations about the z-axis,
R = (α, z) (means rotate by angle α about the z-axis). Thus:

P̂(α,z) Y`m = e−imα Y`m

(which you should convince yourself of).


Find the character “table” of the rotation group, that is, find
χ` (α), the character of representation D ` for the class of rotations
through angle α. If you find an expression for the character in the

form of a sum, do the sum, expressing your answer in as simple a
form as you can.
(c) At last we are ready to put our atom into a potential with cubic
symmetry. Now the symmetry of the free Hamiltonian is broken,
and we are left with the discrete symmetry of the cube. The
symmetry group of proper rotations of the cube is a group of
order 24 with 5 classes. Call this group “O”.
Construct the character table for O.
(d) Consider in particular how the f -level (l = 3) of the free atom
may split when it is placed in the “cubic potential”. The seven
eigenfunctions which transform according to the irreducible rep-
resentation D 3 of the full group will most likely not transform
according to an irreducible representation of O. On the other
hand, since the operations of O are certainly operations of D, the
eigenfunctions will generate some representation of O.
Determine the coefficients in the decomposition.

D 3 = a1 O1 ⊕ a2 O2 ⊕ a3 O3 ⊕ a4 O4 ⊕ a5 O5 ,

where Oi are the irreducible representations of O. Hence, show


how the degeneracy of the 7-fold level may be reduced by the cubic
potential. Give the degeneracies of the final levels.
Note that we cannot say anything here about the magnitude of
any splittings (which could “accidentally” turn out to be zero!),
or even about the ordering of the resulting levels – that depends
on the details of the potential, not just its symmetry.

15. We perform an experiment in which we shine a beam of unpolarized


white light at a gas of excited hydrogen atoms. We label atomic states
by |n`mi, where ` is the total (orbital, we are neglecting spin in this
problem) angular momentum, m is the z-component of angular momen-
tum (Lz |n`mi = m|n`mi), and n is a quantum number determining
the radial wave function. The light beam is shone along the x-axis.
We are interested in transition rates between atomic states, induced by
the light. Since we are dealing with visible light, its wavelength is much
larger than the size of the atom. Thus, it is a good first approximation
to consider only the interaction of the atomic dipole moment with the

electric field of the light beam. That is, the spatial variation in the
plane wave eikx, describing the light beam, may be replaced by the
lowest-order term in its expansion, i.e., by 1. Thus, we need only
consider the interaction of the dipole moment with the electric field of
the light beam, taken to be uniform. The electric dipole moment of
the atom is proportional to exx, where x is the position of the electron
relative to the nucleus. Hence, in the “dipole approximation”, we are
E, where E is the electric field vector
interested in matrix elements of x ·E
of the light beam.
Calculate the following ratios of transition rates in the dipole approxi-
mation:
a) Γ(|23, 1, 1⟩ → |1, 0, 0⟩) / Γ(|23, 1, 0⟩ → |1, 0, 0⟩)

b) Γ(|3, 1, 0⟩ → |4, 2, 1⟩) / Γ(|3, 1, −1⟩ → |4, 2, 0⟩).

[Hint: this is an application of the Wigner-Eckart theorem.]

16. It is possible to arrive at the Clebsch-Gordan coefficients for a given


situation by “elementary” means, i.e., by considering the action of the
raising and lowering operators and demanding orthonormality. Hence,
construct a table of Clebsch-Gordan coefficients, using this approach,
for a system combining j1 = 2 and j2 = 1 angular momenta. I find it
convenient to use the simple notation |jmi for total quantum numbers
and |j1 m1 i|j2 m2 i for the individual angular momentum states being
added, but you may use whatever notation you find convenient.] You
will find (I hope) that you have the freedom to pick certain signs. You
are asked to be consistent with the usual conventions where

⟨33| (|22⟩|11⟩) ≥ 0   (527)
⟨22| (|22⟩|10⟩) ≥ 0   (528)
⟨11| (|22⟩|1 −1⟩) ≥ 0   (529)

(in notation ⟨jm| (|j1 m1 ⟩|j2 m2 ⟩)).

17. In our discussion of the Wigner-Eckart theorem, we obtained the re-
duced matrix element for the angular momentum operator: ⟨j′ k′ ||J||j″ k″ ⟩.
This required knowing the Clebsch-Gordan coefficient C(1, j, j; 0, m, m).
By using the general prescription for calculating the 3j symbols we de-
veloped, calculate the 3j symbol
 
( 1   j    j )
( 0   m   −m ) ,

and hence obtain C(1, j, j; 0, m, m).

18. Rotational Invariance and angular distributions: A spin-1 particle is


polarized such that its spin direction is along the +z axis. It decays,
with total decay rate Γ, to π + π − . What is the angular distribution,
dΓ/dΩ, of the π + ? Note that the π ± is spin zero. What is the angular
distribution if the initial spin projection is zero along the z-axis? Minus
one? What is the result if the initial particle is unpolarized (equal
probabilities for all spin orientations)?

19. Here is another example of how we can use the rotation matrices to
compute the angular distribution in a decay process. Let’s try another
similar example. Consider a spin one particle, polarized with its angular
momentum along the ±z-axis, with equal probabilities. Suppose it
decays to two spin-1/2 particles, e.g., an electron and a positron.

(a) Suppose the decay occurs with no orbital angular momentum. What
is the angular distribution of the decay products, in the frame of
the decaying particle?
(b) If this is an electromagnetic decay to e+ e− , and the mass of the
decaying particle is much larger than the electron mass, the sit-
uation is altered, according to relativistic QED. In this case, the
final state spins will be oriented in such a way as to give either
m = 1 or m = −1 along the decay axis, where m is the total
projected angular momentum. What is the angular distribution
of the decay products in this case?
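For both of the last two exercises, the computation can be organized around the Wigner d-functions: for a spin-J state with Jz = m decaying to a two-body state whose net helicity along the decay axis is λ, rotational invariance fixes the shape of dΓ/dΩ to be proportional to |d^J_{mλ}(θ)|². The sketch below (an added illustration, with illustrative function names) evaluates and normalizes that shape for J = 1; it is a tool for checking your answers, not a substitute for the argument.

    # Normalized angular distributions from |d^1_{m,lam}(theta)|^2.
    import numpy as np
    from scipy.integrate import quad

    def small_d1(m, lam, theta):
        """Wigner d^1_{m,lam}(theta), standard convention."""
        c, s = np.cos(theta), np.sin(theta)
        table = {(1, 1): (1 + c)/2,    (1, 0): -s/np.sqrt(2),  (1, -1): (1 - c)/2,
                 (0, 1): s/np.sqrt(2), (0, 0): c,              (0, -1): -s/np.sqrt(2),
                 (-1, 1): (1 - c)/2,   (-1, 0): s/np.sqrt(2),  (-1, -1): (1 + c)/2}
        return table[(m, lam)]

    def dGamma_dOmega(m, lam, theta, Gamma=1.0, J=1):
        """dGamma/dOmega = Gamma (2J+1)/(4 pi) |d^J_{m,lam}(theta)|^2, here for J = 1."""
        return Gamma * (2*J + 1)/(4*np.pi) * small_d1(m, lam, theta)**2

    # The normalized distributions integrate back to the total rate Gamma = 1:
    for (m, lam) in [(1, 0), (0, 0), (1, 1), (-1, 1)]:
        total, _ = quad(lambda th: 2*np.pi*np.sin(th)*dGamma_dOmega(m, lam, th), 0, np.pi)
        assert abs(total - 1.0) < 1e-9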

Physics 195
Course Notes
Angular Momentum
Solutions to Problems
030131 F. Porter

1 Exercises
1. Prove the theorem we state in this note:
Theorem: The most general mapping x → x′ of R3 into itself, such
that the origin is mapped into the origin, and such that all dis-
tances are preserved, is a linear, real orthogonal transformation
Q:
x′ = Qx, where QT Q = I, and Q∗ = Q.   (1)
Hence,
x′ · y′ = x · y ∀ points x, y ∈ R3 .   (2)
For such a mapping, either:
(a) det(Q) = 1, Q is called a proper orthogonal transformation,
and is in fact a rotation. In this case,
x′ × y′ = (x × y)′ ∀ points x, y ∈ R3 .   (3)
or,
(b) det(Q) = −1, Q is called an improper orthogonal trans-
formation, and is the product of a reflection (parity) and a
rotation. In this case,
x′ × y′ = −(x × y)′ ∀ points x, y ∈ R3 .   (4)
The set of all orthogonal transformations forms a group (denoted
O(3)), and the set of all proper orthogonal transformations forms
a subgroup (O + (3) or SO(3) of O(3)), identical with the set of all
rotations.
[You may wish to make use of the following intuitive lemma: Let
e′1 , e′2 , e′3 be any three mutually perpendicular unit vectors such that:
e′3 = e′1 × e′2 (right-handed system).   (5)
Then there exists a unique rotation Ru (θ) such that
e′i = Ru (θ)ei , i = 1, 2, 3.   (6)

]
Solution: Let us first show that the scalar product is preserved, under
a mapping which takes the origin to the origin and which preserves
distances. Consider the scalar product between any pair of vectors
x ∈ R3 and y ∈ R3 . Under the mapping, we have:
x′ · y′ = −(1/2) [ (x′ − y′) · (x′ − y′) − x′ · x′ − y′ · y′ ]
 = −(1/2) [ d²(x′, y′) − d²(x′, 0) − d²(y′, 0) ]
 = −(1/2) [ d²(x, y) − d²(x, 0) − d²(y, 0) ]
 = x · y.   (7)

Now we’ll show linearity. Let a be any real number. Let z = ax + y.


We have

z · z = Qz · Qz
= a2 x2 + 2ax · y + y2
= a2 (Qx)2 + 2a(Qx) · (Qy) + (Qy)2 . (8)

Since Qz · Qx = z · x and Qz · Qy = z · y as well, |Qz − aQx − Qy|² expands to zero, so we must have

Qz = aQx + Qy, (9)

that is, Q is a linear operator (this may also be demonstrated by con-


sidering a basis). Hence, Q is a 3 × 3 matrix. Since the mapping is
from R3 into itself, every element of the matrix must be real: To see
this explicitly, consider a vector x with components xj = δjk . Then
3

xi = Qij xj = Qik . (10)
j=1

We can pick out each element of Q in this way, and hence each must
be real.

Let us look at orthogonality. We must have

x′ · x′ = x′T x′
= (Qx)T Qx
= xT QT Qx
= xT x. (11)

Since x is arbitrary, we must have

QT Q = I, (12)

that is, Q is an orthogonal matrix.


The determinant of an orthogonal matrix is either +1 or −1:

1 = det(I) = det(QT Q) = det(QT ) det(Q) = det(Q)2 . (13)

Consider the triple product:

(x′ × y′ ) · z′ = (Qx × Qy) · Qz   (14)

But a triple product may be computed as the determinant of the matrix


formed by putting each vector in a column. The transformed triple
product thus corresponds to the determinant of the matrix obtained
by taking the product of Q times the original triple product matrix.
Therefore,
(x′ × y′ ) · z′ = det(Q)(x × y) · z.   (15)

For the case det(Q) = +1: Pick an orthonormal basis {ei }, i = 1, 2, 3.


The handedness of this basis is preserved under Q, since the triple
product is preserved. Let e′i = Qei , i = 1, 2, 3. The three vectors
{e′i } constitute a new orthonormal basis with the same handedness.
Since there is a unique rotation relating a basis and a rotated basis
(with handedness preserved), there exists a rotation R such that e′i =
Rei , i = 1, 2, 3. We see that R = Q.
For the case det(Q) = −1 we note that any such matrix may be written
as the product of −I and a real orthogonal matrix with det = +1. The
remainder of the proof is straightforward.

2. We stated the following generalization of the addition law for tangents:

Theorem: If Re (θ) = Re″ (θ″)Re′ (θ′), and defining:
τ = e tan θ/2   (16)
τ′ = e′ tan θ′/2   (17)
τ″ = e″ tan θ″/2,   (18)
then:
τ = (τ′ + τ″ + τ″ × τ′) / (1 − τ′ · τ″).   (19)
A simple way to prove this theorem is to use SU(2) to represent the
rotations, i.e., the rotation Re (θ) is represented by the SU(2) matrix
exp(−(i/2) θ e · σ). You are asked to carry out this proof.
Solution: We have
Re (θ) = Re″ (θ″)Re′ (θ′).   (20)
Represent our rotations with the form:
Re (θ) = exp(−(i/2) θ e · σ) = cos(θ/2) − i e · σ sin(θ/2).   (21)
Thus,
cos(θ/2) − i e · σ sin(θ/2)   (22)
 = [ cos(θ″/2) − i e″ · σ sin(θ″/2) ] [ cos(θ′/2) − i e′ · σ sin(θ′/2) ]
 = cos(θ′/2) cos(θ″/2) − (e″ · σ)(e′ · σ) sin(θ′/2) sin(θ″/2)
   − i [ e′ · σ cos(θ″/2) sin(θ′/2) + e″ · σ cos(θ′/2) sin(θ″/2) ]
 = cos(θ′/2) cos(θ″/2) − e′ · e″ sin(θ′/2) sin(θ″/2)
   − i [ (e″ × e′) · σ sin(θ′/2) sin(θ″/2) + e′ · σ cos(θ″/2) sin(θ′/2) + e″ · σ cos(θ′/2) sin(θ″/2) ],   (23)
where we have used the identity
(e″ · σ)(e′ · σ) = e″ · e′ + i (e″ × e′) · σ.   (24)

The above result may be separated into real and imaginary parts:

cos(θ/2) = cos(θ′/2) cos(θ″/2) − e′ · e″ sin(θ′/2) sin(θ″/2)   (25)
e sin(θ/2) = (e″ × e′) sin(θ′/2) sin(θ″/2) + e′ cos(θ″/2) sin(θ′/2) + e″ cos(θ′/2) sin(θ″/2).   (26)
Dividing these two equations gives us a result which may be expressed
in terms of tangents, and thence in the form of the theorem.
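A numerical check of the final statement is straightforward (an added sketch, not part of the graded solution): compose two SU(2) matrices, read off the resulting axis and angle, and compare with Eq. (19).

    # Numerical check of the tangent-addition theorem via SU(2).
    import numpy as np

    sig = [np.array([[0, 1], [1, 0]], dtype=complex),
           np.array([[0, -1j], [1j, 0]], dtype=complex),
           np.array([[1, 0], [0, -1]], dtype=complex)]

    def su2(e, theta):
        e = np.asarray(e, float); e = e/np.linalg.norm(e)
        return np.cos(theta/2)*np.eye(2) - 1j*np.sin(theta/2)*sum(ei*s for ei, s in zip(e, sig))

    def axis_angle(u):
        """Recover (e, theta) from u = cos(theta/2) I - i sin(theta/2) e.sigma."""
        c = np.real(np.trace(u))/2
        v = np.array([np.imag(np.trace(u @ s)) for s in sig])/(-2)   # = sin(theta/2) e
        s = np.linalg.norm(v)
        return v/s, 2*np.arctan2(s, c)

    e1, e2 = np.array([1.0, 2.0, -1.0]), np.array([0.5, -1.0, 2.0])
    e1, e2 = e1/np.linalg.norm(e1), e2/np.linalg.norm(e2)
    t1, t2 = 0.7, 1.3

    e, theta = axis_angle(su2(e2, t2) @ su2(e1, t1))   # R_e(theta) = R_e''(theta'') R_e'(theta')
    tau, tau1, tau2 = np.tan(theta/2)*e, np.tan(t1/2)*e1, np.tan(t2/2)*e2
    assert np.allclose(tau, (tau1 + tau2 + np.cross(tau2, tau1))/(1 - np.dot(tau1, tau2)))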

3. We made the assertion that if we had an element u ∈ SU(2) which


commuted with every element of the vector space of traceless 2 × 2
Hermitian matrices, then u must be a multiple of the identity (i.e.,
either u = I or u = −I). Let us demonstrate this, learning a little
more group theory along the way.
First, we note that if we have a matrix group, it is possible to generate
another matrix representation of the group by replacing each element
with another according to the mapping:

u → v (27)
where
v = SuS −1. (28)

and S is any chosen non-singular matrix.

(a) Show that if {u} is a matrix group, then {v|v = SuS −1 ; S a


non-singular matrix} is a representation of the group (i.e., the
mapping is 1 : 1 and the multiplication table is preserved under
the mapping). The representations {u} and {v} are considered to
be equivalent.
A group of unitary matrices is said to be reducible if there ex-
ists a mapping of the above form such that every element may
simultaneously be written in block-diagonal form:
 
        ( A(g)    0     0 )
M(g) =  (  0    B(g)    0 )
        (  0      0     ⋱ )

∀g ∈ the group (A(g) and B(g) are sub-matrices).

Solution: The mapping is clearly invertible, hence 1:1. It remains
to show that the multiplication table is preserved. Suppose u3 =
u1 u2 . Under the mapping u → v:
v3 = Su3 S −1 (29)
= Su1 u2 S −1 (30)
= Su1 S −1 Su2 S −1 (31)
= v1 v2 . (32)
(33)
The multiplication table is preserved.
(b) Show that SU(2) is not reducible (i.e., SU(2) is irreducible).
Solution: There are various ways of demonstrating this, including
a “brute-force” calculation. Let us make a simple argument based
on counting degrees-of-freedom. We know that it takes three real
parameters to uniquely specify an element of SU(2). “Block diag-
onal” form for SU(2) means diagonal, i.e., if SU(2) is reducible,
then every element of the group may be represented simultane-
ously by a diagonal matrix. A general diagonal 2 × 2 complex
matrix may be parameterized by four real numbers. However, the
determinant is preserved under the similarity transformation. Re-
quiring unimodularity reduces the number of degrees of freedom
to only two. This is not sufficient, hence SU(2) is not reducible.
(c) Now prove the following useful lemma: A matrix which commutes
with every element of an irreducible matrix group is a multiple of
the identity matrix. [Hint: Let B be such a matrix commuting
with every element, and consider the eigenvector equation Bx =
λx. Then consider the vector ux where u is any element of the
group, and x is the eigenvector corresponding to eigenvalue λ.]
Solution: Let V be the vector space (“carrier space”) operated
on by the elements of the group, G. Consider Bx = λx. We find
Bux = uBx = λux. (34)
But we assume that our group is irreducible. This means that
there is no invariant subspace of V under the actions of the group.
The set {ux : u ∈ G} thus spans V . Hence, every element of V is
an eigenvector of B, with eigenvalue λ, and therefore B = λI.

(d) Finally, prove the assertion we stated at the beginning of this
problem.
Solution: As in the notes, let V3 be the vector space of traceless
2 × 2 traceless Hermitian matrices. Suppose u ∈ SU(2) commutes
with every element of V3 . Any element X of V3 may be written in
the form:
3

X= xi σi . (35)
i=1
We know further that any element g of SU(2) may be expressed
as
 3


−iX
g = e = exp −i xi σi (36)
i=1

x
= I cos |x| − i · σ sin |x|. (37)
|x|
Certainly [u, I] = 0, and we are given [u, X] = 0, hence [u, σi ] =
0, i = 1, 2, 3. Thus [u, g] = 0, i.e., u commutes with every element
of SU(2). By part (c), u is therefore a multiple of the identity.

4. We have discussed rotations using the language of group theory. Let us


look at a simple application of group theory in determining “selection
rules” implied by symmetry under the elements of the group (where
the group is a group of operations, such as rotations). The point is
that we can often predict much about the physics of a situation simply
by “symmetry” arguments, without resorting to a detailed solution.
Consider a positronium “atom”, i.e., the bound state of an electron and
a positron. The relevant binding interaction here is electromagnetism.
The electromagnetic interaction doesn’t depend on the orientation of
the system, that is, it is invariant with respect to rotations. It also
is invariant with respect to reflections. You may wish to convince
yourself of these statements by writing down an explicit Hamiltonian,
and verifying the invariance.
Thus, the Hamiltonian for positronium is invariant with respect to
the group O(3), and hence commutes with any element of this group.
Hence, angular momentum and parity are conserved, and the eigen-
states of energy are also eigenstates of parity and total angular mo-
mentum (J). In fact, the spin and orbital angular momentum degrees

of freedom are sufficiently decoupled that the total spin (S) and or-
bital angular momentum (L) are also good quantum numbers for the
energy eigenstates to an excellent approximation. The ground state
of positronium (“parapositronium”) is the 1 S0 state in 2S+1 LJ spectro-
scopic notation, where L = S means zero orbital angular momentum.
Note that the letter “S” in the spectroscopic notation is not the same
as the “S” referring to the total spin quantum number. Sorry about
the confusion, but it’s established tradition. . .
In the ground state, the positron and electron have no relative orbital
angular momentum, and their spins are anti-parallel. The first excited
state (“orthopositronium”) is the 3 S1 state, in which the spins of the
positron and electron are now aligned parallel with each other. The
3S1 − 1S0 splitting is very small, and is analogous to the “hyperfine”
splitting in normal atoms.
Positronium decays when the electron and positron annihilate to pro-
duce photons. The decay process is also electromagnetic, hence also
governed by a Hamiltonian which is invariant under O(3). As a conse-
quence of this symmetry, angular momentum and parity are conserved
in the decay.

(a) We said that parity was a good quantum number for positronium
states. To say just what the parity is, we need to anticipate a
result from the Dirac equation (sorry): The intrinsic parities of
the electron and positron are opposite. What is the parity of
parapositronium? Of orthopositronium?
Solution: Both states are L = 0, so the spatial wave function
depends only on electron-positron separation, and has no angular
dependence. Thus, the spatial wave function is even under
parity, and the overall parity is odd due to the opposite intrinsic
parities of the electron and positron. P = −1 for both states.
(b) We wish to know whether positronium can decay into two photons.
Let us check parity conservation. What are the possible parities
of a state of two photons, in the center-of-mass frame? Can you
exclude the decay of positronium to two photons on the basis of
parity conservation?
Solution: The intrinsic parity of the photon, which happens to
be ηγ = −1, doesn’t matter here, since we have two photons,

and ηγ2 = +1, independent of whether it is ±1. Since the spin
polarization is unaffected by a spatial reflection, it is only the
spatial wave function we must consider. For S-wave (L = 0)
the spatial wave function does not depend on orientation, and
hence P = +1. For P -wave (L = 1), however, the spatial wave
function is odd under parity (i.e., the spatial wave function has a
Y1m (θ, φ) angular dependence, and P Y1m (θ, φ) = Y1m (π − θ, φ +
π) = −Y1m (θ, φ). So, either parity is possible for a state of two
photons (assuming we can put them in both L = 0 and L = 1
states of angular momentum!), and the decay of positronium may
not be excluded on this basis.
(c) Let us consider now whether rotational invariance, i.e., conserva-
tion of angular momentum, puts any constraints on the permitted
decays of positronium. Can the orthopositronium state decay to
two photons? What about the parapositronium state?
Solution: Align a coordinate system so that the z-axis is along
the decay direction, in the center-of-mass frame. Since photons
carry a spin of ±1 along their direction of motion, the total spin
along the z-axis must be either Sz = 0 or Sz = 2. The orbital
angular momentum must be zero along the direction of motion,
Lz = 0. Thus, the total angular momentum projection along z is
either Jz = 0 or Jz = 2. By conservation of angular momentum,
we cannot have a system with angular momentum 0 or 1 decay
into a state with Jz = 2, so we exclude this possibility henceforth.
If Jz = 0, then the two photons must have spin projections which
are opposite along z: S1z = −S2z . Consider a rotation, Ry (π)
about the y-axis by angle π. This simply interchanges the two
photons, which results in a state indistinguishable from the state
before the rotation.
Now consider the effect of Ry (π) on the positronium state. On the
J = 0 state, the rotation has no effect, since the state is invariant
with respect to orientation. However, the J = 1 state must change
sign under this rotation, since it must have Jz = 0, and hence must
have a wave functions which transforms under rotations according

to Y10 (θ, φ). That is,
 
Ry (π)Y10 (θ, φ) = Ry (π) √(3/4π) cos θ = √(3/4π) cos(π − θ) = −Y10 (θ, φ).   (38)
(38)
Thus, we find that the parapositronium decay to two photons is
permitted, as far as we have checked (it is observed experimen-
tally), but the orthopositronium decay to two photons is forbidden
by rotational invariance.

5. The “charge conjugation” operator, C, is an operator that changes all


particles into their anti-particles. Consider the group of order 2 gen-
erated by the charge conjugation operator. This group has elements
{I, C}, where I is the identity element. The electromagnetic interac-
tion is invariant with respect to the actions of this group. That is, any
electromagnetic process for a system of particles should proceed iden-
tically if all the particles are replaced by their anti-particles. Hence, C
is a conserved quantity. Let’s consider the implications of this for the
1S0 and 3S1 positronium decays to two photons. [See the preceding
exercise for a discussion of positronium. Note that you needn’t have
done that problem in order to do this one.]

(a) The result of operating C on a photon is to give a photon, i.e.,


the photon is its own anti-particle, and is thus an eigenstate of
C. What is the eigenvalue? That is, what is the “C-parity” of
the photon? You should give your reasoning. No credit will be
given for just writing down the answer, even if correct. [Hint:
think classically about electromagnetic fields and how they are
produced.] Hence, what is the C-parity of a system of n photons?
Solution: Electromagnetic fields, and hence photons, are pro-
duced by charged particles. Under C, all electric charges are re-
placed by their opposites. Thus, under C, all electromagnetic
fields are reversed. Thus, the C-parity of the photon is −1, and a
system of n photons will have C = (−1)n .
(b) It is a bit trickier to figure out the charge conjugation of the
positronium states. Since these are states consisting of a particle
and its antiparticle, we suspect that they may also be eigenstates

of C. But is the eigenvalue positive or negative? To determine
this, we need to know a bit more than we know so far.
Let me give an heuristic argument for the new understanding that
we need. First, although we haven’t talked about it yet, you prob-
ably already know about the “Pauli Exclusion Principle”, which
states that two identical fermions cannot be in the same state.
Suppose we have a state consisting of two electrons, |x1 , S 1 ; x2 , S 2 ⟩.
We may borrow an idea we introduced in our discussion of the har-
monic oscillator, and define a “creation operator”, a† (x, S ), which
creates an electron at x with spin S . Consider the two-electron
state:

[a† (x1 , S 1 )a† (x2 , S 2 ) + a† (x2 , S 2 )a† (x1 , S 1 )]|0⟩ ,   (39)

where |0⟩ is the “vacuum” state, with no electrons. But this puts


both electrons in the same state, since it is invariant under the in-
terchange 1 ↔ 2. Therefore, in order to satisfy the Pauli principle,
we must have that

a† (x1 , S 1 )a† (x2 , S 2 ) + a† (x2 , S 2 )a† (x1 , S 1 ) = 0.   (40)

That is, the creation operators anti-commute. To put it an-


other way, if two electrons are interchanged, a minus sign is in-
troduced. You may be concerned that a positron and an electron
are non-identical particles, so maybe this has nothing to do with
positronium. However, the relativistic description is such that the
positron and electron may be regarded as different “components”
of the electron (e.g., the positron may be interpreted in terms of
“negative-energy” electron states), so this anti-commutation rela-
tion is preserved even when creating electrons and positrons.
Determine the C-parity of the 3 S1 and 1 S0 states of positronium,
and thus deduce whether decays to two photons are permitted
according to conservation of C. [Hint: Consider a positronium
state and let C act on it. Relate this back to the original state by
appropriate transformations.]
Solution: Our two-particle state is of the form:

ψ = |e(x1 , S 1 ), ē(x2 , S 2 )⟩ .   (41)

Operating with C gives:

Cψ = |ē(x1 , S 1 ), e(x2 , S 2 )⟩ .   (42)

Now we need to relate this back to the original state to determine
the eigenvalue. First, according to the above discussion, let us
interchange the electron and positron:

Cψ = −|e(x2 , S 2 ), ē(x1 , S 1 )⟩ .   (43)

Now, consider the interchange of the spatial coordinates. This has
no effect on our S-wave wave functions, but more generally gives
a factor of (−1)L :

Cψ = −(−1)L |e(x1 , S 2 ), ē(x2 , S 1 )⟩ .   (44)

Finally, we need to exchange the spins. Our positronium states
are labelled with total spin quantum numbers (S = S 1 + S 2 ).
The S = 1 state is obtained when the positron and electron spins
are aligned with each other. This spin state is symmetric with
respect to the interchange of the two spins. In order to obtain
an orthogonal spin state then, the S = 0 state must be anti-
symmetric under spin interchange. We may summarize by saying
that the possible spin states have an interchange symmetry given
by the factor −(−1)S . Thus,

Cψ = (−1)L+S |e(x1 , S 1 ), ē(x2 , S 2 )⟩ .   (45)

The positronium states are thus eigenstates of C with eigenvalue


(−1)L+S , or, for L = 0, (−1)S . The 3 S1 state has C = −1, and
therefore is forbidden to decay into two photons by C-conservation.
The 1 S0 state has C = +1, and so the decay into two photons is
permitted by C-conservation.

6. Suppose we have a system with total angular momentum 1. We pick


a basis corresponding to the three eigenvectors of the z-component of
angular momentum, Jz , with eigenvalues +1, 0, −1, respectively. We
are given an ensemble described by density matrix:
 
ρ = (1/4) ( 2  1  1 )
          ( 1  1  0 ) .
          ( 1  0  1 )

(a) Is ρ a permissible density matrix? Give your reasoning. For the
remainder of this problem, assume that it is permissible. Does it
describe a pure or mixed state? Give your reasoning.
(b) Given the ensemble described by ρ, what is the average value of
Jz ?
(c) What is the spread (standard deviation) in measured values of Jz ?
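A numerical sketch for this problem (added here; it checks the arithmetic, but is not a substitute for the reasoning asked for in parts (a)-(c)):

    # Check the defining properties of rho and compute <Jz> and its spread.
    import numpy as np

    rho = np.array([[2, 1, 1],
                    [1, 1, 0],
                    [1, 0, 1]], float) / 4
    Jz = np.diag([1.0, 0.0, -1.0])

    print("hermitian:        ", np.allclose(rho, rho.conj().T))
    print("unit trace:       ", np.isclose(np.trace(rho), 1.0))
    print("eigenvalues >= 0: ", np.linalg.eigvalsh(rho))
    print("Tr(rho^2):        ", np.trace(rho @ rho))      # equals 1 only for a pure state
    mean_Jz = np.trace(rho @ Jz)
    var_Jz = np.trace(rho @ Jz @ Jz) - mean_Jz**2
    print("<Jz>:             ", mean_Jz)
    print("Delta Jz:         ", np.sqrt(var_Jz))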

7. Let us consider the application of the density matrix formalism to the


problem of a spin-1/2 particle (such as an electron) in a static external
magnetic field. In general, a particle with spin may carry a magnetic
moment, oriented along the spin direction (by symmetry). For spin-
1/2, we have that the magnetic moment (operator) is thus of the form:
µ = (γ/2) σ,   (46)

where σ are the Pauli matrices, the 1/2 is by convention, and γ is a
constant, giving the strength of the moment, called the gyromagnetic
ratio. The term in the Hamiltonian for such a magnetic moment in an
external magnetic field, B is just:

H = −µ · B.   (47)

Our spin-1/2 particle may have some spin-orientation, or “polarization


vector”, given by:
P = ⟨σ⟩ .   (48)
Drawing from our classical intuition, we might expect that in the ex-
ternal magnetic field the polarization vector will exhibit a precession
about the field direction. Let us investigate this.
Recall that the expectation value of an operator may be computed from
the density matrix according to:

A = Tr(ρA). (49)

Furthermore, recall that the time evolution of the density matrix is


given by:
∂ρ
i = [H(t), ρ(t)]. (50)
∂t

13
What is the time evolution, dP/dt, of the polarization vector? Express
your answer as simply as you can (more credit will be given for right
answers that are more physically transparent than for right answers
which are not). Note that we make no assumption concerning the
purity of the state.
Solution: Let us consider the ith-component of the polarization:
dPi d σi
i = i (51)
dt dt

= i Tr(ρσi ) (52)
∂t 
∂ρ
= iTr σi (53)
∂t
= Tr ([H, ρ]σi ) (54)
= Tr ([σi , H]ρ) (55)
3
1 
= − γ Bj Tr ([σi , σj ]ρ) . (56)
2 j=1

To proceed further, we need the density matrix for a state with polar-
ization P. Since ρ is hermitian, it must be of the form:
ρ = a(1 + b · σ ). (57)
But its trace must be one, so a = 1/2. Finally, to get the right polar-
ization vector, we must have b = P.
Thus, we have
3
 3

dPi 1  
i =− γ Bj Tr[σi , σj ] + Pk Tr ([σi , σj ]σk ) . (58)
dt 4 j=1 k=1

Now [σi , σj ] = 2i/ijk σk , which is traceless. Further, Tr ([σi , σj ]σk ) =


4i/ijk . This gives the result:
3  3
dPi
= −γ /ijk Bj Pk . (59)
dt j=1 k=1

This may be re-expressed in the vector form:


dP
= γP × B. (60)
dt

14
8. Let us consider a system of N spin-1/2 particles (as in the previous
problem) per unit volume in thermal equilibrium, in our external mag-
netic field B. [Even though we refer to the previous exercise, the solu-
tion to this problem does not require solving the previous one.] Recall
that the canonical distribution is:
e−H/T
ρ= , (61)
Z
with partition function:
 
Z = Tr e−H/T . (62)

Such a system of particles will tend to orient along the magnetic field,
resulting in a bulk magnetization (having units of magnetic moment
per unit volume), M.

(a) Give an expression for this magnetization (don’t work too hard to
evaluate).
Solution: Let us orient our coordinate system so that the z-axis
is along the magnetic field direction. Then Mx = 0, My = 0, and:

1
Mz = N γ σz (63)
2
1  
= Nγ Tr e−H/T σz , (64)
2Z
where H = −γBz σz /2.
(b) What is the magnetization in the high-temperature limit, to lowest
non-trivial order (this I want you to evaluate as completely as you
can!)?
Solution: In the high temperature limit, we’ll discard terms
of order higher than 1/T in the expansion of the exponential:
e−H/T ≈ 1 − H/T = 1 + γBz σz /2T . Thus,
1
Mz = Nγ Tr [(1 + γBz σz /2T )σz ] (65)
2Z
1
= Nγ 2 Bz . (66)
2ZT

15
Furthermore,

Z = Tre−H/T (67)
= 2 + O(1/T 2). (68)

And we have the result:

Mz = Nγ 2 Bz /4T. (69)

This is referred to as the “Curie Law” (for magnetization of a


system of spin-1/2 particles).
9. We have discussed Lie algrebras (with Lie product given by the com-
mutator) and Lie groups, in our attempt to deal with rotations. At one
point, we asserted that the structure (multiplication table) of the Lie
group in some neighborhood of the identity was completely determined
by the structure (multiplication table) of the Lie algebra. We noted
that, however intuitively pleasing this might sound, it was not actually
a trivial statement, and that it followed from the “Baker-Campbell-
Hausdorff” theorem. Let’s try to tidy this up a bit further here.
First, let’s set up some notation: Let L be a Lie algebra, and G be the
Lie group generated by this algebra. Let X, Y ∈ L be two elements of
the algebra. These generate the elements eX , eY ∈ G of the Lie group.
We assume the notion that if X and Y are close to the zero element of
the Lie algebra, then eX and eY will be close to the identity element of
the Lie group.
What we want to show is that the group product eX eY may be ex-
pressed in the form eZ , where Z ∈ L, at least for X and Y not too
“large”. Note that the non-trivial aspect of this problem is that, first,
X and Y may not commute, and second, objects of the form XY may
not be in the Lie algebra. Elements of L generated by X and Y must
be linear combinations of X, Y , and their repeated commutators.

(a) Suppose X and Y commute. Show explicitly that the product


eX eY is of the form eZ , where Z is an element of L. (If you think
this is trivial, don’t worry, it is!)
Solution: Since X and Y commute, the series expansion for

eZ = eX+Y (70)

16
behaves just like an expansion in ordinary numbers, and hence
eX eY = eX+Y , and of course Z = X + Y is an element of the
algebra.
(b) Now suppose that X and Y may not commute, but that they are
very close to the zero element. Keeping terms to quadratic order
in X and Y , show once again that the product eX eY is of the form
eZ , where Z is an element of L. Give an explicit expression for Z.
Solution: We expand:



X Y X2 Y2
e e = 1+X + + O(3) 1 + Y + + O(3) (71)
2! 2!
X2 Y 2
= 1+X +Y + + + XY + O(3) (72)
2! 2!
and
(X + Y )2
eX+Y = 1 + (X + Y ) + + O(3) (73)
2!
X2 Y 2 1
= 1+X +Y + + + XY − [X, Y ] + O(3).
2! 2! 2
Thus,
1
eX eY = eX+Y + 2 [X,Y ]+O(3) , (74)
and Z = X + Y + 12 [X, Y ] ∈ L.
(c) Finally, for more of a challenge, let’s do the general theorem: Show
that eX eY is of the form eZ , where Z is an element of L, as long
as X and Y are sufficiently “small”. We won’t concern ourselves
here with how “small” X and Y need to be – you may investigate
that at more leisure.
Here are some hints that may help you: First, we remark that the
differential equation
df
= Xf (u) + g(u), (75)
du
where X ∈ L, and letting f (0) = f0 , has the solution:
 u
uX
f (u) = e f0 + e(u−v)X g(v)dv. (76)
0

17
This can be readily verified by back-substitution. If g is indepen-
dent of u, then the integral may be performed, with the result:

f (u) = euX f0 + h(u, X)g, (77)

Where, formally,
euX − 1
h(u, X) = . (78)
X
Second, if X, Y ∈ L, then

eX Y e−X = eXc (Y ), (79)

where I have introduced the notation “Xc ” to mean “take the


commutator”. That is, Xc (Y ) ≡ [X, Y ]. This fact may be demon-
strated by taking the derivative of

A(u; Y ) ≡ euX Y e−uX (80)

with respect to u, and comparing with our differential equation


above to obtain the desired result.
Third, assuming X = X(u) is differentiable, we have
d −X(u) dX
eX(u) e = −h(1, X(u)c ) . (81)
du du
This fact may be verified by considering the object:
∂ −tX(u)
B(t, u) ≡ etX(u) e , (82)
∂u
and differentiating (carefully!) with respect to t, using the above
two facts, and finally letting t = 1.
One final hint: Consider the quantity
 
Z(u) = ln euX eY . (83)

The series:
ln z z − 1 (z − 1)2
5(z) = =1− + −··· (84)
z−1 2 3
plays a role in the explicit form for the result. Again, you are not
asked to worry about convergence issues.

18
Solution: With Z(u) as just defined, consider:
d −Z(u) d  −Y −uX 
eZ(u) e = euX eY e e = −X. (85)
du du
Thus,
dZ
X = h (1, Z(u)c)
. (86)
du
From our second “fact” above, we may deduce that:

eZ(u)c = euXc eYc , (87)

or, taking the logarithm,


 
Z(u)c = ln euXc eYc . (88)

We notice that:
h(1, ln X)5(X) = 1. (89)
Thus,
    
h (1, Z(u)c) = h 1, ln euXc eYc = 5−1 euXc eYc , (90)

and
dZ(u)  
= 5 euXc eYc X. (91)
du
Finally, we integrate to obtain:
  1   
X Y uXc Yc
e e = exp Y + 5 e e Xdu (92)
0

We see that the term in the exponential on the right is a linear


combination of X, Y , and their repeated commutators, hence is
an element of L. For example, the term of order 3 in the operators
is:
1
([X, [X, Y ]] − [Y, [X, Y ]]) . (93)
12
10. In an earlier exercise we considered the implication of rotational in-
variance for the decays of positronium states into two photons. Let us
generalize and broaden that discussion here. Certain neutral particles
(e.g., π 0 , η, η ) are observed to decay into two photons, and others (e.g.,

19
ω, φ, ψ) are not. Let us investigate the selection rules implied by an-
gular momentum and parity conservation (satisfied by electromagnetic
and strong interactions) for the decay of a particle (call it “X”) into
two photons. Thus, we ask the question, what angular momentum J
and parity P states are allowed for two photons?
Set up the problem in the center-of-mass frame of the two photons,
with the z-axis in the direction of one photon. We know that since a
photon is a massless spin-one particle, it has two possible spin states,
which we can describe by its “helicity”, i.e., its spin projection along its
direction of motion, which can take on the values ±1. Thus, a system
of two photons can have the spin states:

|↑↑ , |↓↓ , |↑↓ + ↓↑ , |↑↓ − ↓↑

(The first arrow refers to the photon in the +z direction, the second
to the photon in the −z direction, and the direction of the arrow in-
dicates the spin component along the z-axis, NOT to the helicity.)
We consider the effect on these states of three operations (which, by
parity and angular momentum conservation, should commute with the
Hamiltonian):

• P : parity – reverses direction of motion of a particle, but leaves


its angular momentum unaltered.
• Rz (α): rotation by angle α about the z-axis. A state with a given
value of Jz (z-component of angular momentum) is an eigenstate,
with eigenvalue eiαJz .
• Rx (π): rotation by π about x-axis. For our two photons, this
reverses the direction of motion and also the angular momentum
of each photon. For our “X” particle, this operation has the effect
corresponding to the effect on the spherical harmonic with the
appropriate eigenvalues:

Rx (π)YJJz (Ω)

(Note that the Ylm functions are sufficient, since a fermion ob-
viously can’t decay into two photons and conserve angular mo-
mentum – hence X is a boson, and we needn’t consider 12 -integer
spins.)

20
Make sure that the above statements are intuitively clear to you.

(a) By considering the actions of these operations on our two-photon


states, complete the following table: (one entry is filled in for you)

Photonic Spin Transformation


State P Rz (α) Rx (π)
|↑↑ + |↑↑
|↓↓
|↑↓ + ↓↑
|↑↓ − ↓↑
Solution:
Photonic Spin Transformation
State P Rz (α) Rx (π)
2iα
|↑↑ + |↑↑ e | ↑↑ | ↓↓
|↓↓ |↓↓ e2iα | ↓↓ | ↑↑
|↑↓ + ↓↑ |↑↓ + ↓↑ | ↑↓ + ↓↑ | ↑↓ + ↓↑
|↑↓ − ↓↑ −(|↑↓ − ↓↑ ) | ↑↓ − ↓↑ | ↑↓ − ↓↑

(b) Now fill in a table of eigenvalues for a state (i.e., our particle
“X”) of arbitrary integer spin J and parity P (or, if states are not
eigenvectors, what the transformations yield):
Spin J Transformation
P
 
Rz (α) Rx (π)
+1
0
 −1 
+1
1
 −1 
+1
2, 4, 6, ...
 −1 
+1
3, 5, 7, ... −1

Note that there may be more than one eigenvalue of Rz (α) for a
given row, corresponding to the different possible values of Jz .
Solution: For the rotation by π about the x axis, we know that:
 
2j + 1 ∗j 2j + 1 imφ j
Yjm(θ, φ) = Dm0 (φ, θ, 0) = e dm0 (θ). (94)
4π 4π

21
Hence:

Rx (π)Yjm(θ, φ) = Yjm (π − θ, −φ) (95)



2j + 1 −imφ j
= e dm0 (π − θ) (96)


2j + 1
= (−)j+m e−imφ djm0 (θ) (97)

= (−)j+m Yjm∗
(θ, φ) (98)
j+m
= (−) Yj,−m(θ, φ). (99)

States with m = 0 are eigenstates of Rx (π), with eigenvalue (−)j .

Spin J Transformation
P
 
Rz (α) Rx (π)
+1
0 1 1
 −1 
+1
1 eimα , m = −1, 0, 1 Yjm → (−)m+1 Yj,−m
 −1 
+1
2, 4, 6, ... eimα , m = −j, . . . , j Yjm → (−)m Yj,−m
 −1 
+1
3, 5, 7, ... −1
eimα , m = −j, . . . , j Yjm → (−)m+1 Yj,−m

(c) Finally, by using your answers to parts (a) and (b), determine the
allowed and forbidden J P states decaying into two photons, and
the appropriate photonic helicity states for the allowed transitions.
Put your answer in the form of a table:
Parity Spin
0 1 2,4,... 3,5,...
+1
−1 |↑↓ − ↓↑

Solution: Note that there is no component of orbital angular


momentum along the z axis for the two photon states – the entire
Jz is due to the spins of the photons. For spin 0, we must have
Jz = 0 for the two photons. The two Jz = 0 states have different
parities, hence we make the assignments in the table below.
For spin 1, we must also have Jz = 0 for the two photons. But for
spin one, the m = 0 state is odd under Rx (π), while the possible

22
two photon states are both even. Thus, a spin 1 particle cannot
decay to two photons and preserve rotational invariance.
For spin 2, 4, 6, . . ., we must have m = 0, ±2 in order to match
the possible photon states (property under Rz (α). By considering
Rx (π) we see that we thus take m → −m with no sign change.
All of the photon states are consistent with this. It only remains
to match the parities of the initial and final states; the result is in
the table below.
For spin 3, 5, 7, . . ., we again must have m = 0, ±2 in order to
match the possible photon states (property under Rz (α). Now,
however, Rx (π) takes m → −m with a sign change. This excludes
the two Jz = 0 two photon states. We are unable to construct a
negative parity two-photon state with Jz = ±2. We may construct
a positive parity state with the desired sign change under Rx (π),
as shown in the table.
Parity Spin
0 1 2,4,... 3,5,...
+1 |↑↓ + ↓↑ forbidden | ↑↑ , | ↓↓ , | ↑↑ −| ↓↓
| ↑↓ + ↓↑
−1 |↑↓ − ↓↑ forbidden | ↑↓ − ↓↑ forbidden

You have (I hope) just derived something which is often referred to as


“Yang’s theorem”. Note: People often get this wrong, so be careful!

11. We said that if we are given an arbitrary representation, D(u), of


SU(2), it may be reduced to a direct sum of irreps D r (u):

D(u) = ⊕D r (u). (100)
r

The multiplicities mj (the number of irreducible representations D r


which belong to the equivalence class of irreducible representations
characterized by index j) are unique, and they are given by:

mj = d(u)χj (u−1)χ(u), (101)
SU (2)

where χ(u) = Tr [D(u)].

23
(a) Suppose you are given a representation, with characters:

sin 32 θ + 2 sin 74 θ cos 14 θ


χ [ue (θ)] = 1 + . (102)
sin 12 θ
What irreducible representations appear in the reduction of this
representation, with what multiplicities?
Solution: We choose to work with the parameterization of the
volume element in Eqn. (103), and with the characters of the
irreducible representations given in Eqn. (255):
  sin(j + 12 )θ
χj (u−1 (θ)) = χj u−1
e3 (θ) = . (103)
sin 12 θ
Then,

mj = d(u)χj (u−1 )χ(u)
SU (2)
  2π

1 2 θ sin(j + 12 )θ sin 32 θ + 2 sin 74 θ cos 14 θ
= dΩe sin dθ 1+
4π 2 4π 0 2 sin 12 θ sin 12 θ
   
1 2π 2j + 1 1 3 7 1
= dθ sin θ sin θ + sin θ + 2 sin θ cos θ . (104)
π 0 2 2 2 4 4
We have the orthogonality relation, n and m positive integers:
 2π
nθ mθ
sin sin dθ = πδnm (105)
0 2 2
Also,
7 1 θ
2 sin θ cos θ = sin 2θ + sin 3 . (106)
4 4 2
Then,
  
1 2π 2j + 1 θ
dθ sin θ sin = πδ(2j+1)1 = πδj0 (107)
π 0 2 2
  
1 2π 2j + 1 3θ
dθ sin θ sin = πδj1 (108)
π 0 2 2
  
1 2π 2j + 1 4θ
dθ sin θ sin = πδj 3 (109)
π 0 2 2 2
 2π  
1 2j + 1 3θ
dθ sin θ sin = πδj1 (110)
π 0 2 2

24
Thus, the irreducible representations which appear are j = 0, with
multiplicity m0 = 1, j = 1, with multiplicity m1 = 2, and j = 3/2,
with multiplicity m 3 = 1.
2

(b) Does the representation we are given look like it could correspond
to rotations of a physically realizable system? Discuss.
Solution: This representation is a direct sum of irreducible rep-
resentations with j = 0, 1, 32 . Thus, the state space contains states
which are fermionic and states which are bosonic. We know of no
way we can build a system which can have both integer angular
momenta and half-integer angular momenta.

12. In nuclear physics, we have the notion of “charge independence”, or


the idea that the nuclear (strong) force does not depend on whether
we are dealing with neutrons or protons. Thus, the nuclear force is
supposed to be symmetric with respect
  to unitary
 transformations on
1 0
a space with basis vectors p = 0 , n = 1 . Neglecting the overall
phase symmetry, we see that the symmetry group is SU(2), just as for
rotations. As with angular momentum, we can generate representations
of other dimensions, and work with systems of more than one nucleon.
By analogy with angular momentum, we say that the neutron and
proton form an “isotopic spin” (or “isospin”) doublet, with
1 1
|n = |I = , I3 = − (111)
2 2
1 1
|p = |I = , I3 = + (112)
2 2
(The symbol “T ” is also often used for isospin). Everything you know
about SU(2) can now be applied in isotopic spin space.
Study the isobar level diagram of the He6 , Li6 , Be6 nuclear level schemes,
and discuss in detail the evidence for charge independence of the nu-
clear force. These graphs are quantitative, so your discussion should
also be quantitative. Also, since these are real-life physical systems,
you should worry about real-life effects which can modify an idealized
vision.
You may find an appropriate level scheme via a google search (you want
a level diagram for the nuclear isobars of 6 nucleons), e.g., at:
http://www.tunl.duke.edu/nucldata/figures/06figs/06 is.pdf

25
For additional reference, you might find it of interest to look up:
F. Ajzenberg-Selove, “Energy Levels of Light Nuclei, A = 5-10,” Nucl.
Phys. A490 1-225 (1988)
(see also http://www.tunl.duke.edu/nucldata/fas/88AJ01.shtml).
Solution: The isobar diagram for A = 6 is shown in Fig. 1. Note
that 6 He, 6 Li, and 6 Be all have a total of 6 nucleons, and differ in
how many of those are protons or neutrons. Thus, to test for charge
independence, we may compare energy levels for states with the same
quntum numbers. Thus, for example, the 6 He ground state is J P = 0+ ;
we should compare it with the lowest J P = 0+ states of 6 Li and 6 Be.
Charge independence says that these levels should be degenerate in
energy. Likewise, we should be able to compare the lowest J P = 2+
energy levels (actually, the lowest 2+ level for 6 Li is identified as an
isospin singlet, so we take the first excited 2+ state in this case – note
that isospin singlets are allowed for lithium, with equal numbers of
neutrons and protons, but not for helium and beryllium, which must
have T ≥ 1). There is evidently some question about the assignment
of the J P = 2+ level for 6 He, but let’s assume it is correct for now.
Even without understanding what the energies really mean, we may
make a first test of charge independence by asking whether the energy
gaps between the levels are independent. We summarize the situation
in Table 1:

Table 1: Levels of some comaprable states in the A = 6 isobar system.


6
Energy (MeV) He 6 Li 6
Be
J P = 0+ 4.05 3.563 3.09
J P = 2+ 5.85 5.37 4.76
J P = 2− 21.0 29
J P = 4− 25 26
J P = 3− 26.6 30
2+ − 0+ 1.80 1.81 1.67
2− − 0+ 17.4 26
4− − 0+ 21 23
3− − 0+ 23.0 27

26
Figure 1: Nuclear isobar levels for A = 6. From
http://www.tunl.duke.edu/nucldata/fas/88AJ01.shtml [F. Ajzenberg-
Selove, “Energy Levels of Light Nuclei, A = 5-10,” Nucl. Phys. A490 1-225
(1988)].

27
Error bars are not explicitly given, and some further research may be
necessary to elucidate exactly how reliable these numbers are. For our
purposes, we’ll assume that the values are reliable up to ±5 in the
least significant quoted digit. Thus, the 2+ − 0+ energy differences for
6
He and 6 Li appear to be in excellent agreement, but the 6 Li − 6 Be
difference is something like 0.14±0.07 MeV. While small compared with
the overall level spacings, this difference may be real, and is worthy of
further investigation.
More generally, we note that the helium and beryllium isobars are
related by a simple n ↔ p reflection in isospin space, so we should
expect the numbers of states to be the same, and the spacings to be the
same, up to neutron-proton mass difference effects and electromagnetic
effects. As already noticed, lithium may exist in isospin zero states,
and hence more levels may be expected for lithium. These features are
qualitatively borne out by the figure, though the detailed spacings are
not the same. It is worth remarking that the Pauli exclusion principle
is present, though hidden, in our comment on counting states.
To proceed further, we need to understand what the energy scales really
mean. The reference states that the energies in the square brackets are
computed according to:

E[ (Z, N )] = E(Z, N ) − E(3, 3), (113)

where E(Z, N ) is the “aproximate nuclear energy” computed as

E(Z, N ) = M(Z, N ) − ZM(H) − NM(n) − EC . (114)

We may compare the [4.05] relative energy for 6 He with the mea-
surement of its beta decay. A little further reading in the references
above tells us that the Q value (kinetic energy released) in the decay
6
He(β − )6 Li is 3.507 MeV. To an excellent approximation, this corre-
sponds to a nuclear mass difference of

∆M = Q + me = 4.02 MeV. (115)

This is pretty close to the estimate of [4.05] MeV in the figure. Hence,
we seem to have a consistent understanding: The 6 He nucleus, in its
ground state, is 4.02 MeV more massive than the 6 Li ground state

28
nucleus, up to uncertainties in the last digit. Thus, for example, the
difference in mass between the 6 He ground state nucleus and its 6 Li
isospin partner is 4.02 − 3.56 = 0.46 MeV. Considering the neutron-
proton mass difference (1.29 MeV), and the difference in estimated
Coulomb energy (-1.32 MeV), this difference may be compared with a
difference of -0.03 MeV expected from charge independence. This is a
significant disagreement. However, the Coulomb correction is large on
the scale of the difference, and may not be trustworthy at this level.

13. We defined the “little-d” functions according to:

djm1 m2 (θ) = Dm
j
1 m2
(0, θ, 0) = j, m1 |e−iθJ2 |j, m2
j
where the matrix elements Dm 1 m2
(ψ, θ, φ), parameterized by Euler an-
gles ψ, θ, φ, are given in the “standard representation” by:
j
Dm 1 m2
(ψ, θ, φ) = j, m1 |D j (u)|j, m2 = e−i(m1 ψ+m2 φ) djm1 m2 (θ)

We note that an explicit calculation for these matrix elements is pos-


sible via:
j
Dm 1 m2
(u) = Pjm1 (∂x , ∂y )Pjm2 (u11 x + u21 y, u12x + u22 y) (116)
where
xj+m y j−m
Pjm (x, y) ≡  . (117)
(j + m)!(j − m)!

Prove the following handy formulas for the djm1 m2 (θ) functions:

a) dj∗ j
m1 m2 (θ) = dm1 m2 (θ) (reality of dj functions)

Solution: Let us determine SU(2) matrix u for Euler angles (0, θ, 0),
so that we may use Eqn. 116. This must describe a rotation about the
two-axis by angle θ. Hence,
 
i
u = exp − θσ2 (118)
2
θ θ
= I cos − iσ2 sin (119)

2 2
θ θ 
cos 2 − sin 2
= . (120)
sin 2θ cos 2θ

29
This is a real matrix, and substitution into Eqn. 116 preserves that
reality.

b) djm1 m2 (−θ) = djm2 m1 (θ)

Solution:
djm1 m2 (−θ) = dj∗
m1 m2 (−θ) (121)
j∗
= Dm 1 m2
(0, −θ, 0) (122)
j
= Dm2 m1 (0, θ, 0) (123)
= djm2 m1 (θ). (124)

c) djm1 m2 (θ) = (−)m1 −m2 djm2 m1 (θ)


Solution: Let’s write out Eqn. 116 for the little-d functions:
 j+m2  j−m2
∂xj+m1 ∂yj−m1 x cos θ2 + y sin θ2 −x sin θ2 + y cos θ2
djm1 m2 (θ) =  
(j + m1 )!(j − m1 )! (j + m2 )!(j − m2 )!
(125)
Now, starting with the result of part (b),
djm2 m1 (θ) = djm1 m2 (−θ) (126)
 j+m2  j−m2
∂xj+m1 ∂yj−m1 x cos θ2 − y sin θ2 x sin 2θ + y cos θ2
= 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
Now let x → −x:
 j+m2  j−m2
∂xj+m1 ∂yj−m1 −x cos θ2 − y sin θ2 −x sin θ2 + y cos 2θ
djm2 m1 (θ) = (−)j+m1 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
 j+m2  j−m2
∂xj+m1 ∂yj−m1 x cos θ2 + y sin θ2 −x sin θ2 + y cos θ2
= (−)j+m1 +j+m2 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
 j+m2  j−m2
∂xj+m1 ∂yj−m1 x cos θ2 + y sin 2θ −x sin θ2 + y cos θ2
= (−)2j+2m2 +m1 −m2 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!

30
But (−)2j+2m2 +m1 −m2 = (−)m1 −m2 , since either 2j and 2m2 are both
odd or both even, hence their sum is always even.

d) dj−m1 ,−m2 (θ) = (−)m1 −m2 djm1 m2 (θ)

Solution:
 j−m2  j+m2
∂xj−m1 ∂yj+m1 x cos θ2 + y sin θ2 −x sin θ2 + y cos θ2
dj−m1 ,−m2 (θ) = 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
(127)
Now interchange x and y, and replace the resulting x by −x:
 j−m2  j+m2
∂xj+m1 ∂yj−m1 y cos 2θ − x sin θ2 −y sin θ2 − x cos θ2
dj−m1 ,−m2 (θ) = (−)j+m1 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
 j−m2  j+m2
∂xj+m1 ∂yj−m1 y cos θ2 − x sin θ2 y sin 2θ + x cos θ2
= (−)j+m1 −j−m2 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
= (−)m1 −m2 djm1 m2 (θ) (128)

e) djm1 m2 (π − θ) = (−)j−m2 dj−m1 ,m2 (θ) = (−)j+m1 djm1 ,−m2 (θ)

Solution: The second equality follows from part (d):


(−)j−m2 dj−m1 ,m2 (θ) = (−)j−m2 (−)m1 +m2 djm1 ,−m2 (θ) (129)
= (−)j+m1 djm1 ,−m2 (θ) (130)
   
π θ
For the first equality, using cos 2
− 2
= sin θ2 and sin π
2
− θ
2
= cos 2θ :
 j+m2  j−m2
∂xj+m1 ∂yj−m1 x sin 2θ + y cos θ2 −x cos θ2 + y sin θ2
djm1 m2 (π − θ) = 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
 j+m2  j−m2
∂xj+m1 ∂yj−m1 −x sin θ2 + y cos θ2 x cos θ2 + y sin θ2
= (−)j+m1 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
= (−)j+m1 djm1 ,−m2 (θ). (131)

31
f) djm1 m2 (2π + θ) = (−)2j djm1 m2 (θ)
   
θ
Solution: We use cos π + 2
= − cos θ2 and sin π + θ
2
= − sin 2θ :
 j+m2  j−m2
∂xj+m1 ∂yj−m1 −x cos θ2 − y sin θ2 x sin θ2 − y cos θ2
djm1 m2 (2π + θ) = 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
 j+m2  j−m2
∂xj+m1 ∂yj−m1 x cos 2θ + y sin θ2 −x sin θ2 + y cos θ2
= (−)j+m2 +j−m2 
(j + m1 )!(j − m1 )!(j + m2 )!(j − m2 )!
= (−)2j djm1 m2 (θ). (132)

14. We would like to consider the (qualitative) effects on the energy levels
of an atom which is moved from freedom to an external potential (a
crystal, say) with cubic symmetry. Let us consider a one-electron atom
and ignore spin for simplicity. Recall that the wave function for the
case of the free atom looks something like Rnl (r)Ylm(θ, φ), and that
all states with the same n and l quantum numbers have the same
energy, i .e., are (2l + 1)-fold degenerate. The Hamiltonian for a free
atom must have the symmetry of the full rotation group, as there are
no special directions. Thus, we recall some properties of this group
for the present discussion. First, we remark that the set of functions
{Ylm : m = −l, −l + 1, · · · , l − 1, l} for a given l forms the basis for a
(2l + l)-dimensional subspace which is invariant under the operations
of the full rotation group. [A set {ψi } of vectors is said to span an
invariant subspace Vs under a given set of operations {Pj } if Pj ψi ∈ Vs
∀i, j.] Furthermore, this subspace is “irreducible,” that is, it cannot be
split into smaller subspaces which are also invariant under the rotation
group.
Let us denote the linear transformation operator corresponding to ele-
ment R of the rotation group by the symbol P̂R , i.e.:

P̂R f (Cx) = f (R−1Cx)

The way to think about this equation is to regard the left side as giving
a “rotated function,” which we evaluate at point Cx. The right side tells

32
us that this is the same as the original function evaluated at the point
R−1Cx, where R−1 is the inverse of the rotation matrix corresponding to
rotation R. Since {Ylm } forms an invariant subspace, we must have:
l

P̂R Ylm = Ylm D l (R)m m

m =−1

The expansion coefficients, Dl (R)m m , can be regarded as the elements


of a matrix D l (R). As discussed in the note, D & corresponds to an
irreducible representation of the rotation group.
Thus, for a free atom, we have that the degenerate eigenfunctions of
a given energy must transform according to an irreducible representa-
tion of this group. If the eigenfunctions transform according to the lth
representation, the degeneracy of the energy level is (2l + 1) (assuming
no additional, “accidental” degeneracy).
I remind you of the following:
Definition: Two elements a and b of a group are said to belong to the
same “class” (or “equivalence class” or “conjugate class”) if there exists
a group element g such that g −1 ag = b.
The first two parts of this problem, are really already done in the note,
but here is an opportunity to think about it for yourself:

(a) Show that all proper rotations through the same angle ϕ, about
any axis, belong to the same class of the rotation group.
Solution: Consider two rotations by angle θ about axes e and
e : Re (θ) and Re (θ). Let R be the rotation which takes axis e
into e . Such a rotation must exist, according to the theorem you
proved in exercise one. That is:

e = Re . (133)

Then,
Re (θ) = RRe (θ)R−1 , (134)
since this sequence has the effect of first rotating the e axis (think-
ing of it as embedded in the object to be rotated) to be along e ,
then rotating about this axis, and finally putting the e axis back
to its original orientation.

33
(b) We will need the character table of this group. Since all elements
in the same class have the same character, we pick a convenient
element in each class by considering rotations about the z-axis,
R = (α, z) (means rotate by angle α about the z-axis). Thus:

P̂(α,z) Y&m = e−imα Y&m

(which you should convince yourself of).


Find the character “table” of the rotation group, that is, find
χ& (α), the character of representation D & for the class of rotations
through angle α. If you find an expression for the character in the
form of a sum, do the sum, expressing your answer in as simple a
form as you can.
j
Solution: Consider Dm 1 m2
(0, 0, α) = eim1 α δm1 m2 .
 
χj (θ) = Tr D j (0, 0, α) (135)
j

= eimα (136)
m=−j
2j 
 k
= e−ijα eiα (137)
k=0
∞      
−ijα iα k iα 2j+1
= e e 1− e (138)
k=0
iα 2j+1
−ijα (e ) − 1
= e (139)
e −1

1 1
ei(j+ 2 )α − e−i(j+ 2 )α
= (140)
eiα/2 − e−iα/2
sin(j + 12 )α
= . (141)
sin α2

(c) At last we are ready to put our atom into a potential with cubic
symmetry. Now the symmetry of the free Hamiltonian is broken,
and we are left with the discrete symmetry of the cube. The
symmetry group of proper rotations of the cube is a group of
order 24 with 5 classes. Call this group “O”.
Construct the character table for O.

34
Solution: Let the cube be centered at the origin and oriented
with its faces centered on the x, y, z axes. The classes of O are:
Class pk Elements
C1 1 Identity
C2 6 Rotations by ± π2 about x, y, z
C3 3 Rotations by π about x, y, z
C4 6 Rotations by π about lines joining centers of opposite edges
C5 8 Rotations by ± 2π 3
about the four diagonals
In constructing the character table, we note:
• The number of irreducible representations is equal to the num-
ber of classes, nr = nc = 5.
• We have the following orthogonality relations among rows and
among columns:
nc

pk χj  (Ck )∗ χj  (Ck ) = hδj  j  , (142)
k=1
nc
 h
χi (Cj )∗ χi (C& ) = δj& . (143)
i=1 pj

• The character of a class of dimension d and whose elements


are of order m is a sum of d m-th roots of unity.
• The sums of the squares of the dimensions of the irreducible
representations is the order of the group:
5

d2n = h = 24. (144)
n=1

For O, the solution is unique, the irreductible representations


must have dimensions 1, 1, 2, 3, 3.
• The second column (dimension 1 irreducible representation)
may be determined uniquely by orthogonality with the first
column.
• The second column may be obtained by using orthogonality
with the first two columns, plus the normalization on the col-
umn itself, plus noting that the characters are given by sums
of roots of unity. Thus, for example, the character χ3 (C5)
must be the sum of two of 1, e±2πi/3 .

35
• The same type of considerations are sufficient to uniquely fill
in the remaining table. In this case, it is not actually necessary
to write any explicit rotation matrices, though that can be
done.
The character table is therefore:

1 C1 1 1 2 3 3
6 C2 1 -1 0 1 -1
3 C3 1 1 2 -1 -1
6 C4 1 -1 0 -1 1
8 C5 1 1 -1 0 0

(d) Consider in particular how the f -level (l = 3) of the free atom


may split when it is placed in the “cubic potential”. The seven
eigenfunctions which transform according to the irreducible rep-
resentation D 3 of the full group will most likely not transform
according to an irreducible representation of O. On the other
hand, since the operations of O are certainly operations of D, the
eigenfunctions will generate some representation of O.
Determine the coefficients in the decomposition.

D 3 = a1 O 1 ⊕ a2 O 2 ⊕ a3 O 3 ⊕ a4 O 4 ⊕ a5 O 5,

where O i are the irreducible representations of O. Hence, show


how the degeneracy of the 7-fold level may be reduced by the cubic
potential. Give the degeneracies of the final levels.
Note that we cannot say anything here about the magnitude of
any splittings (which could “accidentally” turn out to be zero!),
or even about the ordering of the resulting levels – that depends
on the details of the potential, not just its symmetry.
Solution:
Using the result of part (b), letting j = 3:

sin(3 + 12 )α
χ(α) = , (145)
sin α2
we have:

χ(C1 ) = 7 (146)

36
sin 7π
4
χ(C2 ) = π = −1 (147)
sin 4
sin 7π
2
χ(C3 ) = = −1 (148)
sin π2
sin 7π
2
χ(C4 ) = = −1 (149)
sin π2
sin 14π
6
χ(C5 ) = 2π = 1. (150)
sin 6

Finally, we may determine the coefficients in the expansion:

D3 = a1 O 1 ⊕ a2 O 2 ⊕ a3 O 3 ⊕ a4 O 4 ⊕ a5 O 5,

or,
5

χ= ai χi . (151)
i=1

The coefficients are determined using the orthogonality relations


once more:
5
1 
aj = pk χ∗j (Ck )χ(Ck ). (152)
24 k=1
The result is:

D 3 = 0O 1 ⊕ 1O 2 ⊕ 0O 3 ⊕ 1O 4 ⊕ 1O 5 . (153)

15. We perform an experiment in which we shine a beam of unpolarized


white light at a gas of excited hydrogen atoms. We label atomic states
by |n5m , where 5 is the total (orbital, we are neglecting spin in this
problem) angular momentum, m is the z-component of angular momen-
tum (Lz |n5m = m|n5m ), and n is a quantum number determining
the radial wave function. The light beam is shone along the x-axis.
We are interested in transition rates between atomic states, induced by
the light. Since we are dealing with visible light, its wavelength is much
larger than the size of the atom. Thus, it is a good first approximation
to consider only the interaction of the atomic dipole moment with the
electric field of the light beam. That is, the spatial variation in the
plane wave eikx , describing the light beam, may be replaced by the

37
lowest-order term in its expansion, i.e., by 1. Thus, we need only
consider the interaction of the dipole moment with the electric field of
the light beam, taken to be uniform. The electric dipole moment of
the atom is proportional to exx, where x is the position of the electron
relative to the nucleus. Hence, in the “dipole approximation”, we are
interested in matrix elements of x ·E
E, where E is the electric field vector
of the light beam.
Calculate the following ratios of transition rates in the dipole approxi-
mation:
Γ(|23, 1, 1 → |1, 0, 0 )
a)
Γ(|23, 1, 0 → |1, 0, 0 )

Solution: In the dipole approximation, we are concerned with matrix


elements of the position operator x. This is a vector operator, with
commutation relations
[Jk , x& ] = i/k&m xm . (154)
Thus, x, y, z are the Cartesian components of a spherical tensor of rank
1, with spherical components:
1
X1 = − √ (x + iy), (155)
2
X0 = z, (156)
1
X−1 = √ (x − iy). (157)
2

If the light is unpolarized, and traveling along the x axis, we have an


ensemble of photons with electric fields equally likely to be along the y
and z axes. Thus, we are interested in the matrix elements of:
i
y = √ (X1 + X−1 ) (158)
2
z = X0 . (159)
That is, we want
Γ(|n 5 m → |n 5 m ) (160)
1
∝ | n 5 m |X1 + X−1 |n 5 m |2 + | n 5 m |X0 |n 5 m |2 .
2
38
From the notes, the Wigner-Eckart theorem states that the matrix
elements for a vector operator may be written:

(j  m )(k  )|Q(1, m)|(j  m )(k  ) (161)


1+j  −j 
(−)
= √  C(1j  j  ; mm m ) j  , k  ||Q1 ||j  , k  .
2j + 1

Thus, the desired decay rates may be expressed as:

Γ(|n 5 m → |n 5 m ) (162)


 
    1         2     2
= K(n 5 n 5 ) |C(15 5 ; 1m m ) + C(15 5 ; −1m m )| + |C(15 5 ; 0m m )| .
2
The factor of K cancels out in the desired ratios for this problem, so
the results depend only on the Clebsch-Gordan coefficients.
Finally,
1
Γ(|23, 1, 1 → |1, 0, 0 ) 2
|C(110; 110) + C(110; −110)|2 + |C(110; 010)|2
= 1
Γ(|23, 1, 0 → |1, 0, 0 ) 2
|C(110; 100) + C(110; −100)|2 + |C(110; 000)|2
1
2
|0 + √13 |2 + 0
= (163)
0 + | − √13 |2
1
= . (164)
2

Γ(|3, 1, 0 → |4, 2, 1 )
b).
Γ(|3, 1, −1 → |4, 2, 0 )

[Hint: this is an application of the Wigner-Eckart theorem.]


Solution: We proceed as in part (a):
1
Γ(|3, 1, 0 → |4, 2, 1 ) 2
+ C(112; −101)|2 + |C(112; 001)|2
|C(112; 101)
= 1
Γ(|3, 1, −1 → |4, 2, 0 ) 2
|C(112; 1−10) + C(112; −1−10)|2 + |C(112; 0−10)|2
1 √1
2
| 2 + 0|2 + 0
= 1 √1 (165)
2
| 6 + 0|2 + 0
= 3. (166)

39
16. It is possible to arrive at the Clebsch-Gordan coefficients for a given
situation by “elementary” means, i.e., by considering the action of the
raising and lowering operators and demanding orthonormality. Hence,
construct a table of Clebsch-Gordan coefficients, using this approach,
for a system combining j1 = 2 and j2 = 1 angular momenta. I find it
convenient to use the simple notation |jm for total quantum numbers
and |j1 m1 |j2 m2 for the individual angular momentum states being
added, but you may use whatever notation you find convenient.] You
will find (I hope) that you have the freedom to pick certain signs. You
are asked to be consistent with the usual conventions where

33| (|22 |11 ) ≥ 0 (167)


22| (|22 |10 ) ≥ 0 (168)
11| (|22 |1 −1 ) ≥ 0 (169)

(in notation jm| (|j1 m1 |j2 m2 )).


bf Solution:
The action of the lowering operator on a state |jm is:

J− |jm = (j + m)(j − m + 1)|j, m − 1 . (170)

The Clebsch-Gordan series for combining j1 = 2 and j2 = 1 is:

D2 ⊗ D1 = D3 ⊕ D2 ⊕ D1. (171)

We start with
33| (|22 |11 ) = 1. (172)
The lowering operator on |33 gives

J− |33 = 6|32 . (173)

Also, √
J− (|22 |11 ) = 2 (|21 |11 ) + 2 (|22 |10 ) (174)
Thus, 
2 1
|32 = (|21 |11 ) + √ (|22 |10 ) . (175)
3 3

40
The |22 state is also a linear combination of (|21 |11 ) and (|22 |10 ),
which we obtain by requiring orthogonality with |32 and using the
convention specified in the problem statement:

1 2
|22 = − √ (|21 |11 ) + (|22 |10 ) . (176)
3 3

Let’s show one more, to make sure the idea is clear:

J− |22 = 2|21 (177)


  
1 2
J− − √ (|21 |11 ) + (|22 |10 ) (178)
3 3

1 √ √  2 √ 
= −√ 6 (|20 |11 ) + 2 (|21 |10 ) + 2 (|21 |10 ) + 2 (|22 |1 − 1 )
3 3

√ 2 2
= − 2 (|20 |11 ) + (|21 |10 ) + √ (|22 |1 − 1 ) . (179)
3 3
Hence,
1 1 1
|21 = − √ (|20 |11 ) + √ (|21 |10 ) + √ (|22 |1 − 1 ) . (180)
2 6 3

The entire table is shown below.

41
C(21j; m1 m2 m)

j 3 3 2 3 2 1 3 2 1 3 2 1 3 2 3
m1 m2 m 3 2 2 1 1 1 0 0 0 -1 -1 -1 -2 -2 -3

2 1 1
2
2 0 √1
 3
2
3
−1
1 1 3

3 1 3
2 -1 √1
 15
8
 3
1
 35
1 0 −
152 −1
6 10
0 1 √ √1
5 2 10
3
1 -1 √1 √1
 5
3
2 10
0 0 0 − 25
5
−1
3
-1 1 √1 √
10
5 2
2
0 -1 √1 √1
 85 −1
2  10
3
-1 0 √ − 10
15
−1
6 3
-2 1 √1 √
5
15 3
2
-1 -1 √1
3
 32
-2 0 √1 −
3 3

-2 -1 1

17. In our discussion of the Wigner-Eckart theorem, we obtained the re-


duced matrix element for the angular momentum operator: j  k  Jj  k  .
This required knowing the Clebsch-Gordan coefficient C(1, j, j; 0, m, m).
By using the general prescription for calculating the 3j symbols we de-
veloped, calculate the 3j symbol
 
1 j j
,
0 m −m

and hence obtain C(1, j, j; 0, m, m).


Solution: Our general prescription for computing 3j symbols is:
 
j1 j2 j3
= Pj1 m1 (∂x1 , ∂y1 )Pj2 m2 (∂x2 , ∂y2 )Pj3 m3 (∂x3 , ∂y3 )G({k}; {x}, {y}),
m1 m2 m3
(181)

42
where
(x1 y2 − x2 y1 )2k3 (x2 y3 − x3 y2 )2k1 (x3 y1 − x1 y3 )2k2
G({k}; {x}, {y}) ≡  ,
(2k3 )!(2k1 )!(2k2 )!(j1 + j2 + j3 + 1)!
(182)
where 2k1 , 2k2 , 2k3 are non-negative integers given by:
2k3 = j1 + j2 − j3 ; 2k1 = j2 + j3 − j1 ; 2k2 = j3 + j1 − j2 , (183)
and
xj+m y j−m
Pjm (x, y) ≡  ; P00 ≡ 1. (184)
(j + m)!(j − m)!
We also define Pjm ≡ 0 if m ∈
/ {−j, . . . , j}.
The k indices in the present case are:
2k3 = 1 (185)
2k1 = 2j − 1 (186)
2k2 = 1, (187)
and thus,
(x1 y2 − x2 y1 )(x2 y3 − x3 y2 )2j−1 (x3 y1 − x1 y3 )
G({k}; {x}, {y}) =  .
(2j − 1)!(2 + 2j)!
(188)
Finally,
 
1 j j
= P10 (∂x1 , ∂y1 )Pjm (∂x2 , ∂y2 )Pj−m(∂x3 , ∂y3 )G({k}; {x}, {y})
0 m −m
∂xj+m ∂yj−m ∂xj−m ∂yj+m
= ∂x1 ∂y1  2 2
 3 3

(j + m)!(j − m)! (j + m)!(j − m)!


(x1 y2 − x2 y1 )(x2 y3 − x3 y2 )2j−1 (x3 y1 − x1 y3 )

(2j − 1)!(2 + 2j)!
∂xj+m
2
∂yj−m
2
∂xj−m
3
∂yj+m
3
(x2 y3 + x3 y2 )(x2 y3 − x3 y2 )2j−1
= 
(j + m)!(j − m)! (2j − 1)!(2 + 2j)!
∂xj+m
2
∂yj−m
2
∂xj−m
3
∂yj+m
3
= 
(j + m)!(j − m)! (2j − 1)!(2 + 2j)!

43
2j−1
 (2j − 1)!
(x2 y3 + x3 y2 ) (x2 y3 )k (−x3 y2 )2j−1−k
k=0 k!(2j − 1 − k)!
!
∂xj+m
2
∂yj−m
2
∂xj−m
3
∂yj+m
3
! (2j
" − 1)!
=
(j + m)!(j − m)! (2 + 2j)!
2j−1
 (x2 y3 )k+1 (−x3 y2 )2j−1−k − (x2 y3 )k (−x3 y2 )2j−k
k=0 k!(2j − 1 − k)!
!
[(j + m)!(j − m)!)]2 !
" (2j − 1)!
=
(j + m)!(j − m)! (2 + 2j)!


j−m 1 1
(−) −
(j + m − 1)!(j − m)! (j + m)!(j − m − 1)!
2m
= (−)j−m  . (189)
(2j + 2)(2j + 1)2j

Hence, the desired Clebsch-Gordan coefficients are:


  
1−j+m 1 j j
C(1jj; 0mm) = (−) 2j + 1 (190)
0 m −m
m
= − . (191)
j(j + 1)

This agrees with the result asserted in the notes.

18. Rotational Invariance and angular distributions: A spin-1 particle is


polarized such that its spin direction is along the +z axis. It decays,
with total decay rate Γ, to π + π − . What is the angular distribution,
dΓ/dΩ, of the π + ? Note that the π ± is spin zero.
Solution: We assume that the problem statement refers to the center-
of-mass frame. The initial state is |i = |j = 1, m = 1 , where j is
the total angular momentum, and m is its projection on the z-axis.
There is clearly no azimuth angle dependendence in this problem, so
the distribution is uniform in azimuth, φ. By angular momentum con-
servation, the final state must have total angular momentum 1. Since
the pions are spinless, this means that the orbital angular momentum
must be 1. Furthermore, the z projection must also be 1. Thus, the

44
angular distribution must be:
3
dΓ/dΩ(θ, φ) = Γ|Y1 1(θ, φ)|2 = Γ sin2 θ. (192)

Notice that this is zero at θ = 0 – if the pions are emitted along the
z-axis, the orbital angular momentum cannot be along z, so emission
at this angle would violate angular momentum conservation.
Let us also solve this problem using the formula we developed in class:

dΓ/dΩ(θ, φ) ∝ |Aλ1 λ2 |2 |djmδ (θ)|2 , (193)
λ1 ,λ2

where δ ≡= λ1 − λ2 . In this case, the products are spinless, so λ1 =


λ2 = 0. Thus, as applied to the present problem:
1
dΓ/dΩ(θ, φ) ∝ Γ|d110 (θ)|2 = Γ sin2 θ. (194)
2
Normalizing to get a total decay rate of Γ, we recover the result of
Eqn. 192.

19. Here is another example of how we can use the rotation matrices to
compute the angular distribution in a decay process. Let’s try another
similar example. Consider a spin one particle, polarized with its angular
momentum along the ±z-axis, with equal probabilities. Suppose it
decays to two spin-1/2 particles, e.g., an electron and a positron.

(a) Such the decay occurs with no orbital angular momentum. What
is the angular distribution of the decay products, in the frame of
the decaying particle?
Solution: Actually, we are given the answer: If there is no orbital
angular momentum, the spatial portion of the wave function is
uniform in angle. Thus,
1 dΓ 1
= . (195)
Γ dΩ 4π
But suppose we didn’t notice this, or just want to check that
our formalism gives the right answer. We must average over the

45
two initial polarizations m = ±1, and sum over all possible final
helicities (deciding to work in the helicity formalism here). Thus,
1 1
1 dΓ 1   2 2
(m) 2 1
(θ, φ) = |A | |dmδ (θ)|2 . (196)
Γ dΩ 2 m=−1,1 λ =− 1 λ =− 1 λ1 λ2
1 2 2 2

We may simplify a bit by noticing that the helicity amplitudes for


m = −1 must be related to the m = 1 amplitudes by reversing
the z direction, i.e., by letting θ → π − θ:
1 1
1 dΓ 1  2 2  
(θ, φ) = |Aλ1 λ2 |2 |d11δ (θ)|2 + |d11δ (π − θ)| 2
(197) .
Γ dΩ 2 λ =− 1 λ =− 1
1 2 2 2

1  
= (|A++ |2 + |A−− |2 ) |d110 (θ)|2 + |d110 (π − θ)|2
2  
+|A+− |2 |d111 (θ)|2 + |d111 (π − θ)|2
 
+|A−+ |2 |d11−1 (θ)|2 + |d11−1 (π − θ)|2 . (198)

Now,
1
d111 (θ) = d11−1 (π − θ) = (1 + cos θ), (199)
2
1 1 1
d10 (θ) = = d10 (π − θ) = − √ sin θ, (200)
2
1
d11−1 (θ) = d111 (π − θ) = (1 − cos θ). (201)
2
(202)
Thus,
1 dΓ 1
(θ, φ) = (|A++ |2 + |A−− |2 ) sin2 θ (203)
Γ dΩ 2
1  
+ (|A+− |2 + |A−+ |2 ) (1 + cos θ)2 + (1 − cos θ)2 ,
2
1  
= (|A++ |2 + |A−− |2 ) sin2 θ + (|A+− |2 + |A−+ |2 ) 1 + cos2 θ .
2
In order to be consistent with our assumption of L = 0, the helicity
amplitudes must be constrained such that
|A++ |2 + |A−− |2 = |A+− |2 + |A−+ |2 . (204)

46
(b) If this is an electromagnetic decay to e+ e− , and the mass of the
decaying particle is much larger than the electron mass, the sit-
uation is altered, according to relativistic QED. In this case, the
final state spins will be oriented in such a way as to give either
m = 1 or m = −1 along the decay axis, where m is the total
projected angular momentum. What is the angular distribution
of the decay products in this case?
Solution: We should no longer assume L = 0 for this part. We
may proceed as in part (a), except omitting the m = 0 component
along the decay axis:
1 dΓ 1  
(θ, φ) = (|A+− |2 + |A−+ |2 ) (1 + cos θ)2 + (1 − cos θ)2 ,
Γ dΩ 2
3
= (1 + cos2 θ). (205)
16π
Let us derive this result a bit more intuitively, without using the
general formula. Our initial state is either |j = 1, m = 1 or |1−1 ,
in the basis where m is the angular momentum along the z axis.
The description of the final state is similar, except that now the
quantization axis (call it z  ) is at an angle θ with respect to the z
axis. That is, the final state is either |j = 1, m = 1 or |1 − 1  ,
where the prime on the ket is intended to indicate the basis in
which the z  axis is the quantization axis. We want to determine:
1 dΓ (1) (−1)
(θ, φ) = |A+− 11||11 |2 + |A+− 11| |1 − 1 |2 (206)
Γ dΩ
(1) (−1)
+|A−+ 1 − 1| |11 |2 + |A−+ 1 − 1| |1 − 1 |2

To determine the indicated scalar products, we would like to ex-


press the primed basis vectors in terms of the unprimed basis. A
primed basis vector can be obtained by rotating an unprimed ba-
sis vector by angle θ; since we have azimuthal symmetry, let us
take the rotation to be about the y axis:
|1m  = Ry (θ)|1m . (207)
Hence,
1 dΓ (1) (−1)
(θ, φ) = |A+− 11|Ry (θ)|11 |2 + |A+− 1 − 1|Ry (θ)|11 |2 (208)
Γ dΩ
47
(1) (−1)
+|A−+ 11|Ry (θ)|1 − 1 |2 + |A−+ 1 − 1|Ry (θ)|1 − 1 |2
(1) (−1) (1) (−1)
= |A+− d111 (θ)|2 + |A+− d1−11 (θ)|2 + |A−+ d11−1 (θ)|2 + |A−+ d1−1−1 (θ)|2 .

Again, we can use the fact that the amplitude for the initial m =
−1 decay is related to the amplitude for the m = +1 decay by
reversing the z axis:
1 dΓ (1) (1) (1) (1)
(θ, φ) = |A+− d111 (θ)|2 + |A+− d111 (π − θ)|2 + |A−+ d11−1 (θ)|2 + |A−+ d11−1 (π − θ)|2
Γ dΩ
= |A+− d111 (θ)|2 + |A+− d11−1 (θ)|2 + |A−+ d11−1 (θ)|2 + |A−+ d111 (θ)|2
 
= (|A+− |2 + |A−+ |2 ) d111 (θ)2 + d11−1 (θ)2 (209)
3
= (1 + cos2 θ). (210)
16π

48
Physics 125c
Course Notes
Approximate Methods
040415 F. Porter

Contents
1 Introduction 2

2 Variational Method 2
2.1 Bound on Ground State Energy . . . . . . . . . . . . . . . . . 2
2.2 Example: Helium Atom . . . . . . . . . . . . . . . . . . . . . 3
2.3 Other Applications of Variational Method . . . . . . . . . . . 6
2.4 Variational Theorem . . . . . . . . . . . . . . . . . . . . . . . 7
2.5 The “Ritz Variational Method” . . . . . . . . . . . . . . . . . 8

3 The WKB Approximation 10


3.1 Example: Infinite Square Well . . . . . . . . . . . . . . . . . . 12
3.2 Example: Harmonic Oscillator . . . . . . . . . . . . . . . . . . 13

4 Method of Stationary Phase 14


4.1 Application: Asymptotic Bessel Function . . . . . . . . . . . . 15
4.2 Application: Quantum Mechanics of Free Particle Asymptotic
Wave Function . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4.3 Application: A Scattering Problem . . . . . . . . . . . . . . . 18

5 Stationary State Perturbation Theory 19


5.1 Normalization . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

6 Degenerate State Perturbation Theory 25

7 Time-dependent Perturbation Theory 26


7.1 The Time-Ordered Product . . . . . . . . . . . . . . . . . . . 28
7.2 Transition Probability, Fermi’s Golden Rule . . . . . . . . . . 29
7.3 Coulomb Scattering . . . . . . . . . . . . . . . . . . . . . . . . 36
7.4 Decays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
7.5 Adiabatically Increasing Potential . . . . . . . . . . . . . . . . 39

8 Eigenvalues – Comparison Theorems 41


1
9 Exercises 45

1 Introduction
Typically, problems in quantum mechanics are difficult to solve exactly with
analytic methods. We thus resort to approximate methods, or to numerical
methods. In this note, I review several approximate approaches.

2 Variational Method
There are many applications of the technique of varying quantities to find a
useful extremum. This is the gist of the “variational method”. As a means
of finding approximate solutions to the Schrödinger equation, a common
approach is to guess an approximate form for a solution, parameterized in
some way. The parameters are varied until an extremum is found. We
illustrate this approach with examples.

2.1 Bound on Ground State Energy


Given a system with Hamiltonian H, and ground state energy E0 , we may
note that for any state vector |ψi we must have:
hψ|H|ψi
≥ E0 . (1)
hψ|ψi
This suggests that we may be able to use some set of functions ψ, parame-
terized in some way, to obtain an upper limit on the ground state energy,
even if we cannot solve the problem exactly. With careful choice of “trial”
function, we may even be able to get a good approximation to the energy
level. Thus, the program is to find the minimum of the quantity in Eqn. 1
over variations in the parameter space to get a “least” upper bound on E0
for our trial wave functions:
hψ{θ } |H|ψ{θ } i
δ hψ{θ } |ψ{θ } i
= 0. (2)
δ{θθ}
Here, the parameter set to be varied is denoted {θθ }.

2
2.2 Example: Helium Atom
The Coulombic Hamiltonian for the helium atom is:
p21 p2 2α 2α α
H = + 2 − − + (3)
2m 2m |x1 | |x2 | |x1 − x2 |
1 2α 2α α
= − (∇21 + ∇22 ) − − + , (4)
2m |x1 | |x2 | |x1 − x2 |
where α = e2 , m is the mass of the electron, and we are neglecting the motion
of the nucleus. Let
r1 = |x1 |, (5)
r2 = |x2 |, (6)
r12 = |x1 − x2 |. (7)
Toward guessing a “good” trial wave function, note that, if the interaction
term α/r12 were not present, the ground state wave function would be simply
a product of two hydrogenic ground state wave functions in x1 and x2 :
Z 3 − aZ (r1 +r2 ) 1
ψ(x1 , x2 ) = 3 e 0 , a0 = . (8)
πa0 mα
There is no reason to expect that the α/r12 term to be especially “small”
compared with the other terms, so the perturbation theory approach (dis-
cussed later) may not work especially well here. However, let us use the
above wave function here in the variational method, and let Z be a variable
parameter, to get an upper bound on the helium ground state energy.
We need to evaluate the expectation value of H. The kinetic energy of
one electron is:
Z
p21 Z 3 −Zr1 /a0 p21 −Zr1 /a0
hψ| |ψi = d3 (x1 ) e e
2m (∞) πa30 2m
Z
Z3
d3 (x2 ) 3 e−2Zr2 /a0 (9)
(∞) πa0
= Z 2 × kinetic energy of hydrogen atom ground state (10)
1
= Z 2 mα2 . (11)
2
Thus,
p21 p2
hψ| + 2 |ψi = Z 2 mα2 . (12)
2m 2m
3
Similarly,

hψ| − |ψi = 2Z × potential energy of hydrogen atom ground state
r1
= −2Zmα2 . (13)

A remark is in order: The “2” on the left hand side is for the Coulomb
potential felt by electron number one in the Z = 2 field of the helium nucleus.
Note that this “2” is from the Hamiltonian for helium - it is not a variational
parameter. The factor of Z that appears on the right side is a variational
parameter, since it arises from the trial wave function (h1/r1 i = Z/a0 ). Thus,
we so far have:
!
1 2 1 1 1
hψ| (p1 + p22 ) − 2α + |ψi = mα2 (2Z 2 − 8Z). (14)
2m |x1 | |x2 | 2
It remains to evaluate the “interaction energy” between the two electrons,
for our trial wave function:
Z Z !2 − 2Z (r1 +r2 )
α 3 Z33 ea0
hψ| |ψi = α d (x1 )d (x2 ) . (15)
r12 (∞) (∞) πa30 |x1 − x2 |
Let us digress for a moment here to obtain a couple of handy integrals.

Theorem: Let u, v, and w be three positive real numbers (one of which


may also be zero). Then
1.
Z
0 exp(−u|y − x| − v|y − x0 |)
I(u, v; x, x ) ≡ d3 (y)
(∞) |y − x||y − x0 |
 
4π e−v∆ − e−u∆
= , (16)
∆(u2 − v 2 )
where ∆ ≡ |x − x0 |.
2.
Z Z
3 exp(−u|x| − v|y| − w|x − y|)
J(u, v, w) ≡ d (x) d3 (y)
(∞) (∞) |x||y||x − y|
2
(4π)
= . (17)
(u + v)(v + w)(w + u)

4
Proof: (sketch)
1. Let z = y−x, and let |z| = r. Then we may make the replacement

d3 (y) → d3 (z) = r 2 drd cos θdφ. (18)

Pick the 3-axis to be along x − x0 for the integration over angles:



|y − x0 | = |z + x − x0 | = r2 + ∆2 + 2r∆ cos θ. (19)

Thus,
Z ∞ √
Z 1
0 −ur exp(−v r2 + ∆2 + 2r∆ cos θ)
I(u, v; x, x ) = 2π re dr d cos θ √ .
0 −1 r2 + ∆2 + 2r∆ cos θ
(20)
Integrating over cos θ yields
Z ∞ h i

I(u, v; x, x0 ) = dre−ur e−v|r−∆| − e−v(r+∆) . (21)
v∆ 0

Finally, integrate over r to obtain


 
4π e−v∆ − e−u∆
I(u, v; x, x0) = . (22)
∆(u2 − v 2 )

2. Let x = |x|, y = |y|, and write:


Z ∞ Z ∞ Z 1
2 2
J(u, v, w) = 4π x dx2π
y dy d cos θ (23)
0 0
√ 2 −1 2
exp(−ux − vy − w x + y − 2xy cos θ)
√ .
xy x2 + y 2 − 2xy cos θ
The integration then proceeds similarly as above.

We apply this theorem now to our problem.


!2 Z Z − 2Z (|x|+|y|)
α Z3 3 3 e a 0
hψ| |ψi = α d (x)d (y)
r12 πa30 (∞) (∞) |x − y|
!2
Z3
= α ∂u ∂v J(u = 2Z/a0 , v = 2Z/a0 , 0) (24)
πa30
1 5
= mα2 Z. (25)
2 4
5
Thus,  
1 27
hψ|H|ψi = mα2 2Z 2 − Z . (26)
2 4
The minimum is at Z = 27/16. Hence
"  2 #
1 27
hψ|H|ψimin = − mα2 2 (27)
2 16
= −77.0 eV. (28)

Experimentally, the ground state energy of helium, from the first and second
ionization energies, is

E0 = −(24.59 + 54.41) = −79.00 eV. (29)

We have come within about 2.5% of the right value by our variational
method with the “hydrogen” trial function. More careful variational cal-
culations give good agreement. Note that the best value was obtained for
Z = 27/16 instead of Z = 2. This is suggestive of the “screening” of the nu-
cleus from each electron by the other electron, reducing the effective charge
5
by 16 e.

2.3 Other Applications of Variational Method


We may derive other potentially useful relations in connection with the vari-
ational approach. For example, let {ψn |n = 0, 1, 2, . . .} be an orthonormal
set of true eigenfunctions of H, Hψn = En ψn , with E0 < E1 ≤ . . .. Then we
can expand our trial wave function (assumed to be normalized) in this basis:

X
ψ= cn ψn . (30)
n=0

We then obtain the bound:



X
2
hψ|H|ψi = |c0 | E0 + |cn |2 En (31)
n=1

X
≥ |c0 |2 E0 + E1 |cn |2 (32)
n=1
≥ (1 − |c0 |2 )(E1 − E0 ) + E0 . (33)

6
Thus,
hψ|H|ψi − E0
1 − |c0 |2 = 1 − |hψ|ψ0 i|2 ≤ , (34)
E1 − E0
giving us a bound on how close our trial wave function is to the true ground
state wave function. Of course, this is useful only if we have sufficient knowl-
edge of the spectrum.
Likewise, we can obtain a lower bound on E0 :

Theorem: If we have a normalized function |ψi such that

E0 ≤ hψ|Hψi ≤ E1 , (35)

then
hHψ|Hψi − hψ|H|ψi2
E0 ≥ hψ|H|ψi − . (36)
E1 − hψ|H|ψi
The proof of this will be left to the reader. To use this theorem, a lower
bound on E1 − hψ|H|ψi may be inserted.

2.4 Variational Theorem


We may put our intuition on a firmer foundation with the following “Varia-
tional Theorem”:

Theorem: Let ψ ∈ H (such that 0 < hψ|ψi < ∞), and define the following
functional on H:
hψ|H|ψi
E(ψ) ≡ , (37)
hψ|ψi
where H is the Hamiltonian operator, with a discrete spectrum. Then,
any vector ψ for which the variation of E is stationary [that is, δE(ψ) =
0] is an eigenvector in the discrete spectrum of H, and the correspond-
ing eigenvalue of H is E(ψ).

Proof: Write:
E(ψ)hψ|ψi = hψ|H|ψi. (38)
Take the variation of both sides:

δEhψ|ψi + Ehδψ|ψi + Ehψ|δψi = hδψ|H|ψi + hψ|H|δψi. (39)

7
Thus,
hψ|ψiδE = hδψ|H − E|ψi + hψ|H − E|δψi. (40)
If (H − E)|ψi = 0, then δE = 0, that is, if ψ is an eigenstate of H with
eigenvalue E(ψ), then E is stationary. Suppose, instead, that δE = 0.
In this case,
hδψ|H − E|ψi + hψ|H − E|δψi = 0. (41)
Note that ψ is complex, and the variation of the real and imaginary
parts are independent (the normalization is not constrained). We may
deal with this by considering the variation of iψ. If δE = 0, we have:

hδ(iψ)|H − E|ψi + hψ|H − E|δ(iψ)i = 0. (42)

Hence,
−ihδ(ψ)|H − E|ψi + ihψ|H − E|δ(ψ)i = 0. (43)
Combining with Eq. 41, we obtain:

hδψ|H − E|ψi = 0 (44)


hψ|H − E|δψi = 0. (45)

This must hold for all variations δψ, implying that:

H|ψi = E|ψi, (46)

completing the proof.

2.5 The “Ritz Variational Method”


We can modify our notion of taking a single trial wave function, with parame-
ters to be varied, to a set of orthonormal trial wave functions. In particular,
we could try to use an orthonormal set of solutions to a simpler, but prefer-
ably related, problem.
Consider a finite set of such functions:

{|ni : n = 0, 1, 2, . . . , N }. (47)

Our trial wave function is constructed according to:


N
X
|ψi = an |ni. (48)
n=0

8
The an are N + 1 complex parameters to be varied. Let us impose the
normalization canstraint
N
X
|an |2 = 1, (49)
n=0

so that hψ|ψi = 1.
We wish to find the parameters such that the variation of the expectation
of H vanishes:
δ(hψ|H|ψi) = 0, (50)
subject to the constraint 49. Thus,
 
0 = δ( Σ_{m,n=0}^{N} ⟨ψ|m⟩⟨m|H|n⟩⟨n|ψ⟩ )   (51)
  = δ( Σ_{m,n=0}^{N} a*m an ⟨m|H|n⟩ ),   (52)

subject to

0 = δ(1) = δ( Σ_{n=0}^{N} |an|² ) = δ( Σ_{m,n=0}^{N} a*m an δmn ).   (53)

We may impose the constraint with the method of Lagrange multipliers:

0 = δ( Σ_{m,n} a*m an ⟨m|H|n⟩ ) − δ( λ Σ_{m,n} a*m an δmn )   (54)
  = Σ_{m,n} [ δa*m an (⟨m|H|n⟩ − λδmn) + a*m δan (⟨m|H|n⟩ − λδmn) ]   (55)
  = Σ_{m,n} [ δa*m an (⟨m|H|n⟩ − λδmn) + δam a*n (⟨m|H|n⟩* − λδmn) ],   (56)

where the last equation was obtained via a relabeling of the m, n indices. As
in the previous section, we note that the real and imaginary parts may be
varied separately, and hence:

Σ_{m,n} δa*m an (⟨m|H|n⟩ − λδmn) = 0,   (57)
Σ_{m,n} δam a*n (⟨m|H|n⟩* − λδmn) = 0.   (58)

We may also vary each of the individual an's separately, setting all δan = 0
except for one. This gives:

Σ_n an (⟨m|H|n⟩ − λδmn) = 0.   (59)

This may be rewritten, in our finite basis:

⟨m| Σ_n an (H − λI) |n⟩ = 0, or   (60)
⟨m| (H − λI) Σ_n an |n⟩ = 0, ∀m.   (61)

That is, Σ_n a_n^{(i)} |n⟩, i = 0, 1, . . . , N, are eigenvectors of H, in this reduced
basis, with eigenvalues λ^{(i)}.
If we managed to pick “good” functions |ni, then the first few E (i) = λ(i)
may be good approximations to the first few energy levels of the full problem.
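As an added illustration (not part of the original notes), here is a minimal numerical
sketch of the Ritz procedure in Python, assuming as an example the anharmonic
oscillator H = p²/2 + x²/2 + λx⁴ (ℏ = m = ω = 1) and using the lowest N + 1
harmonic-oscillator eigenstates as the orthonormal trial set. The truncated position
matrix only approximates ⟨m|H|n⟩ near the edge of the basis, so the low-lying
eigenvalues should be trusted only once N is reasonably large.

import numpy as np

def ritz_levels(N, lam=0.1):
    """Ritz estimates for H = p^2/2 + x^2/2 + lam*x^4 (hbar = m = omega = 1),
    using the lowest N+1 harmonic-oscillator states as the trial basis."""
    n = np.arange(N + 1)
    x = np.zeros((N + 1, N + 1))
    x[n[:-1], n[:-1] + 1] = np.sqrt((n[:-1] + 1) / 2.0)   # <n|x|n+1> = sqrt((n+1)/2)
    x = x + x.T
    H = np.diag(n + 0.5) + lam * np.linalg.matrix_power(x, 4)
    return np.linalg.eigvalsh(H)

# The low-lying estimates settle down as the basis is enlarged.
print(ritz_levels(10)[:3])
print(ritz_levels(40)[:3])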

3 The WKB Approximation


The WKB (for Wentzel-Kramers-Brillouin) method makes use of the wave
nature of the solutions to the Schrödinger equation between classical turning
points. We’ll give a simple-minded treatment here; refinements are possible.
We wish to consider the problem of stationary states in a one-dimensional
potential well (or equivalent one-dimensional potential for three-dimensional
problems with spherical symmetry). Label the states by n, where n =
0, 1, . . ., and E0 < E1 < . . .. Consider En , with classical turning points
x1 , x2 . The energy En corresponds to the (n + 1)th state, or the nth “excited
state.” The wave function will be oscillatory, with roughly n + 1 half-waves
between the classical turning points. Fig. 1 illustrates this for the fourth
excited state.
Let’s see how we can use this idea to estimate the energy levels. Consider
an oscillatory solution of the form:
ψ(x) = A(x) sin φ(x), (62)
where A(x) > 0, and φ(x) is a phase increasing monotonically with x. When-
ever φ(x) = kπ, where k is an integer, there is a node in ψ(x). Consider the
change in φ between turning points:
∆φ = φ(x2 ) − φ(x1 ). (63)

Figure 1: Illustration of the WKB method. The classical turning points x1
and x2 are shown for the fourth excited state with energy E4. The phase is
counted between the turning points.

The figure is suggestive of ∆φ ≈ (n + 1/2)π. More rigorously, we may state:

nπ < ∆φ ≤ (n + 1)π.   (64)

It is readily seen that the right bound is achieved for an infinite square well
potential. Typically, we simply make the choice (n + 1/2)π, although it may
not be difficult to do better in some circumstances.
Now let us make an approximate calculation for this change in phase
according to the Schrödinger equation. For a region of constant V < En the
wave function is

ψ(x) = A sin[(x − x0)p],   (65)

where A and x0 are constants, and

p = √(2m(E − V)).   (66)

Thus, in a region of constant V, the phase varies as φ(x) = (x − x0)p. The
change in phase as we increase x slightly is

dφ = p dx = √(2m(E − V)) dx.   (67)
Adding up all such contributions between the turning points yields:

∆φ = ∫_{x1}^{x2} (dφ/dx) dx ≈ ∫_{x1}^{x2} √(2m(E − V)) dx.   (68)

The approximation is better the more slowly V(x) varies with x, since we have
used an integrand based on the assumption of constant-V regions. Therefore,
we have:

∫_{x1}^{x2} √(2m[En − V(x)]) dx ≈ (n + 1/2)π.   (69)
To use this result to estimate the energy levels En, we must first solve for
the turning points as a function of E:

V(x1) = V(x2) = E,  x2 > x1,   (70)

yielding the functions x1(E) and x2(E). Next, we solve for the function of E:

f(E) = ∫_{x1(E)}^{x2(E)} √(2m[E − V(x)]) dx.   (71)

Finally, we determine En as the solution to

f(En) = (n + 1/2)π.   (72)
This is the “WKB method”.
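The recipe above is easy to automate. The following sketch (an added illustration,
not from the original notes) assumes, as an example, the quartic well V(x) = x⁴
with ℏ = m = 1; f(E) is evaluated by quadrature and Eq. 72 is solved by root
finding.

import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

m = 1.0
V = lambda x: x**4                      # example potential, hbar = 1

def f(E):
    """Phase integral f(E) = int_{x1}^{x2} sqrt(2m[E - V(x)]) dx, Eq. (71)."""
    x2 = E**0.25                        # turning points of x^4 are at +/- E^(1/4)
    integrand = lambda x: np.sqrt(max(2.0 * m * (E - V(x)), 0.0))
    return quad(integrand, -x2, x2)[0]

def E_wkb(n):
    """Solve f(E_n) = (n + 1/2) pi, Eq. (72), by root finding."""
    return brentq(lambda E: f(E) - (n + 0.5) * np.pi, 1e-6, 1e3)

print([round(E_wkb(n), 3) for n in range(4)])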

3.1 Example: Infinite Square Well


Consider the infinite square well potential:

V(x) = 0 for 0 < x < ∆;  V(x) = ∞ for x ≤ 0 or x ≥ ∆.   (73)

We have x1 = 0, and x2 = ∆, independent of E. Thus,

f(E) = ∫_0^∆ √(2mE) dx = √(2mE) ∆.   (74)

Letting

∆ √(2mEn) = (n + 1/2)π,   (75)

we obtain the result

En = (n + 1/2)² π² / (2m∆²),  n = 0, 1, 2, . . .   (76)

Actually, we can do better than this. As we noted earlier, the change in
phase for this potential is really (n + 1)π, since the turning points are nodes.
Thus, we expect

En = (n + 1)² π² / (2m∆²),  n = 0, 1, 2, . . .   (77)
This is, in fact, the exact result, which is to be expected, because the as-
sumption of constant V is valid in this case.

3.2 Example: Harmonic Oscillator


Consider the one-dimensional simple harmonic oscillator potential:

V(x) = (1/2) k x².   (78)

Let ω0 = √(k/m) be the classical angular frequency. Find the turning points:

E = (1/2) k x1² = (1/2) k x2²,   (79)
−x1 = x2 = x0 ≡ √(2E/k).   (80)

Now find f(E):

f(E) = ∫_{−x0}^{x0} dx √(2m[E − V(x)])   (81)
     = 2 √(mk) x0² ∫_0^1 dx √(1 − x²)   (82)
     = 2 √(mk) x0² ∫_0^{π/2} cos²θ dθ   (83)
     = π √(mk) x0² / 2   (84)
     = π E / ω0 .   (85)

Setting f(En) = (n + 1/2)π, we arrive at the result:

En = (n + 1/2) ω0 ,  n = 0, 1, 2, . . .   (86)
This turns out to be the exact spectrum for the bound states of the simple
harmonic oscillator potential.

4 Method of Stationary Phase


Suppose that we wish to evaluate an integral of the form (known as the
generalized Fourier integral):

I(ε) = ∫_{−∞}^{∞} f(x) e^{iθ(x)/ε} dx,   (87)

where f and θ are real, and ε > 0. If ε is very small, the phase oscillation
is very rapid, and we may anticipate little contribution to the integral from
such a region. The dominant contribution may be expected to arise where
the phase variation is slow, that is, where dθ/dx = 0. This is the idea behind
the method of stationary phase. Figure 2 illustrates the idea.
Let us pursue this notion. Suppose θ(x) has a (single) stationary point
at x = x0: θ′(x0) = 0. We do a Taylor series expansion about this point,
letting x = x0 + √ε u:

I(ε) = ∫_{−∞}^{∞} [ f(x0) + √ε u f′(x0) + (1/2) ε u² f″(x0) + O(ε^{3/2}) ]
       × exp{ i [ θ(x0)/ε + (1/2) u² θ″(x0) + (√ε/3!) u³ θ‴(x0) + O(ε) ] } √ε du   (88)

     = √ε f(x0) e^{iθ(x0)/ε} ∫_{−∞}^{∞} [ 1 + √ε u f′(x0)/f(x0) + (1/2) ε u² f″(x0)/f(x0) + O(ε^{3/2}) ]
       × e^{(i/2) u² θ″(x0)} exp{ i [ (√ε/3!) u³ θ‴(x0) + O(ε) ] } du   (89)

     = √ε f(x0) e^{iθ(x0)/ε} ∫_{−∞}^{∞} { 1 + √ε [ u f′(x0)/f(x0) + (i/3!) u³ θ‴(x0) ] + O(ε) }
       × e^{(i/2) u² θ″(x0)} du.   (90)

The terms odd in u vanish on integration, leaving:

I(ε) = √ε f(x0) e^{iθ(x0)/ε} ∫_{−∞}^{∞} e^{(i/2) u² θ″(x0)} du

Figure 2: Illustration for the method of stationary phase. The smooth concave-
down curve is f(x). The smooth concave-up curve is θ(x). The oscillating
curve of constant amplitude is the real part of e^{iθ(x)/ε}. The remaining
oscillatory curve is the real part of the desired integrand, f(x) e^{iθ(x)/ε}. In this
illustration, ε = 0.01, f(x) = 0.8 exp[−(x − 0.2)²/2], and θ(x) = x².

     = f(x0) e^{iθ(x0)/ε} e^{i(π/4) sign θ″(x0)} √(2πε/|θ″(x0)|) [1 + O(ε)].   (91)

For multiple stationary points, we simply sum over this expression evaluated
at each stationary point.
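One can check Eq. 91 directly against numerical integration. The sketch below is an
added illustration (not from the original notes), using the same f, θ, and ε as in
Fig. 2, for which the single stationary point is x0 = 0 with θ″(x0) = 2.

import numpy as np

eps = 0.01
f = lambda x: 0.8 * np.exp(-0.5 * (x - 0.2) ** 2)
theta = lambda x: x ** 2                 # stationary point x0 = 0, theta'' = 2

# Direct numerical evaluation on a fine grid (f cuts the integrand off by |x| ~ 8).
x = np.linspace(-8.0, 8.0, 400001)
dx = x[1] - x[0]
I_num = np.sum(f(x) * np.exp(1j * theta(x) / eps)) * dx

# Stationary-phase estimate, Eq. (91), with theta(x0) = 0 and sign(theta'') = +1.
I_sp = f(0.0) * np.exp(1j * np.pi / 4) * np.sqrt(2 * np.pi * eps / 2.0)

print(I_num, I_sp)   # agreement up to O(eps) corrections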

4.1 Application: Asymptotic Bessel Function


Suppose we wish to evaluate the Jn (z) Bessel function for large z. We start
with the integral representation:
Jn(z) = (1/(iⁿ π)) ∫_0^π e^{iz cos φ} cos(nφ) dφ.   (92)

The role of ε is carried by 1/z. The quantity θ(φ) is just cos φ. This has
stationary points at φ = 0 and φ = π, within the region of integration. There
is a potential difficulty with the fact that these points are at the limits of the
integration. However, we may use the fact that the integrand is symmetric
about φ = 0, and periodic, to write:

Jn(z) = (1/(2iⁿ π)) ∫_{−π/2}^{3π/2} e^{iz cos φ} cos(nφ) dφ.   (93)

We need to sum Eqn. 91 over the two stationary points, with

f(0) = cos(n·0) = 1,   (94)
f(π) = cos(nπ) = (−1)ⁿ,   (95)
θ(0) = −θ(π) = 1,   (96)
θ″(0) = −θ″(π) = −1.   (97)

The result is, for large z:

Jn(z) ∼ (1/√(2πz)) (1/iⁿ) [ e^{i(z−π/4)} + e^{iπn} e^{−i(z−π/4)} ]   (98)
      = (1/√(2πz)) [ e^{i(z−nπ/2−π/4)} + e^{−i(z−nπ/2−π/4)} ]   (99)
      = √(2/(πz)) cos(z − nπ/2 − π/4).   (100)
This is the familiar result.
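A quick numerical comparison of Eq. 100 with the exact Bessel function (via
scipy.special.jv) shows how good the stationary-phase estimate becomes at large z;
this snippet is an added illustration, not part of the original notes.

import numpy as np
from scipy.special import jv

def jn_asym(n, z):
    """Asymptotic form of Jn(z), Eq. (100)."""
    return np.sqrt(2.0 / (np.pi * z)) * np.cos(z - n * np.pi / 2 - np.pi / 4)

for z in (5.0, 20.0, 100.0):
    print(z, jv(2, z), jn_asym(2, z))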
The reader may wish to go back and verify that the application of Eqn. 91
is really proper, as we have violated some of the assumptions made in ob-
taining that result. Indeed, a more general treatment may be obtained based
on the “Riemann-Lebesgue Lemma”:

Lemma: Let |f(x)| be an integrable function of real variable x, and θ(x) be
a continuously differentiable real function on the interval [a, b], except
that θ is nowhere constant on any finite subinterval of [a, b]. Then

I(ε) ≡ ∫_a^b f(x) e^{iθ(x)/ε} dx → 0 as ε → 0+.   (101)

4.2 Application: Quantum Mechanics of Free Particle


Asymptotic Wave Function
Let’s try an application of the method of stationary phase in quantum me-
chanics. Consider the free particle (in one dimension), and the question:

What is the asymptotic behavior of the wave function as t → ∞? Suppose
in particular, we are interested in a wave function which in momentum space
is localized around a momentum q at time t = 0. The position space wave
function at t = 0 is:
ψ(x, 0) = (1/2π) ∫ dp ψ̂(p) e^{ipx}.   (102)

Taking ψ̂ to be real, we find that |ψ(x, 0)| is symmetric about 0. If we further
suppose that ψ̂ > 0, then |ψ| is maximal at x = 0 (alternatively, we could
choose our coordinate system such that this is the case¹).
The time evolution of the wave function is:

ψ(x, t) = (1/2π) ∫ dp ψ̂(p) e^{ipx − itp²/2m}.   (103)

The phase of interest here is

θ(p)/ε = px − tp²/2m.   (104)

This is stationary at p = p0, where:

0 = (1/ε) (dθ/dp)|_{p0} = x − (p0/m) t.   (105)

Now we use equation 91 to determine the asymptotic form of ψ(x, t) as
t → ∞, with ε = 1/t. The stationary point is p0 = mx/t, and

θ(p0) = (m/2)(x²/t²),   (106)
θ″(p0) = −1/m,   (107)
f(p0) = (1/2π) ψ̂(mx/t).   (108)

Thus,

ψ(x, t) → (1/√t) ψ̂(mx/t) e^{imx²/2t} e^{−iπ/4} √(m/2π),  as t → ∞,   (109)
        = ψ̂(mx/t) √(m/(2πt)) e^{−iπ/4} e^{imx²/2t}.   (110)
1
You may wish to recall that a translation by x0 in position corresponds to multiplica-
tion by a phase e−ipx0 in momentum space

The probability density, at large times, is thus:

|ψ(x, t)|² = (m/2πt) |ψ̂(mx/t)|².   (111)
Note that this is peaked around x = qt/m, if ψ̂(p) is peaked around q, giving
the appropriate classical correspondence for the motion. The probability
density falls as 1/t, but also spreads out as 1/t (due to the 1/t in the argument
of ψ̂). Probability is conserved:

∫ |ψ(x, t)|² dx →_{t→∞} ∫ (m/2πt) |ψ̂(mx/t)|² dx = (1/2π) ∫ |ψ̂(p)|² dp = constant.   (112)
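Equation 111 is easy to test numerically. The sketch below is an added illustration
(not from the original notes): it assumes a Gaussian ψ̂(p) centered at q and evaluates
Eq. 103 by direct quadrature, using the same 1/(2π) normalization convention as
Eq. 102; at large t the two sides of Eq. 111 agree up to O(1/t) corrections.

import numpy as np

m, q, sigma, t = 1.0, 1.0, 0.2, 200.0
psi_hat = lambda p: np.exp(-0.5 * ((p - q) / sigma) ** 2)   # real, > 0, peaked at q

p = np.linspace(q - 8 * sigma, q + 8 * sigma, 40001)
dp = p[1] - p[0]

def psi(x):
    """Eq. (103): psi(x,t) = (1/2pi) int dp psihat(p) exp[i(px - t p^2/2m)]."""
    return np.sum(psi_hat(p) * np.exp(1j * (p * x - t * p**2 / (2 * m)))) * dp / (2 * np.pi)

for x in (180.0, 200.0, 220.0):          # near the classical position x = qt/m = 200
    print(x, abs(psi(x))**2, (m / (2 * np.pi * t)) * abs(psi_hat(m * x / t))**2)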

4.3 Application: A Scattering Problem


Let us try another example, a one-dimensional scattering problem. Suppose
that there is a localized potential V (x):

V (x) = 0, |x| > L. (113)

For a given (positive) momentum component, and |x| > L, E = p²/2m, and
the wave function corresponding to this momentum is:

ψE(x) = e^{ipx} + R(E) e^{−ipx} for x < −L;  ψE(x) = T(E) e^{ipx} for x > L.   (114)

The solution to the time-dependent Schrödinger equation, for |x| > L, is

ψ(x, t) = (1/2π) ∫ dp ψ̂(p) ψE(x) e^{−iEt},   (115)

where we have now summed over the plane wave components to obtain a
physical wave packet. Again, we assume that ψ̂(p) is localized around p = q.
Note that, if V(x) = 0, this is just

ψ(x, t) = (1/2π) ∫ dp ψ̂(p) e^{ipx} e^{−ip²t/2m},   (116)

as in the previous section.
Consider x > L: Let T(E) = |T(E)| e^{iδ(E)}. Then,

ψ(x, t) = (1/2π) ∫ dp ψ̂(p) |T(E)| e^{iδ(E)} e^{i(px−Et)}.   (117)

The phase factor is:

φ = θ(p)/ε = δ(E) + px − Et.   (118)

This phase is stationary at dφ/dp = 0. The solution is p = p0 (x, t). Now,
significant contributions to ψ(x, t) will only occur when p = p0 (x, t) is near
q. Hence, we may approximate

δ(E) ≈ δ(E0) + (E − E0) τ(E0),   (119)

where E0 ≡ q²/2m and τ(E) ≡ ∂δ(E)/∂E.


Thus, our problem is to evaluate:

ψ(x, t) ≈ e^{i[δ(E0) − E0 τ(E0)]} (1/2π) ∫ dp ψ̂(p) |T(E)| e^{i[px − E(t − τ(E0))]}.   (120)

This is just like our free particle case, Eqn. 103, except t is replaced with
t − τ(E0). Hence, by the method of stationary phase, as t → ∞, for x > L:

ψ(x, t) = ψ̂(mx/(t − τ(E0))) √(m/(2π[t − τ(E0)])) e^{−iπ/4} e^{imx²/2[t−τ(E0)]} |T(E)| e^{i[δ(E0) − E0 τ(E0)]}.   (121)

The asymptotic probability density is:

|ψ(x, t)|² = |ψ̂(mx/(t − τ(E0)))|² (m/(2π[t − τ(E0)])) |T([mx/(t − τ(E0))]²/2m)|²,   (122)

where we have substituted E = [mx/(t − τ(E0))]²/2m.
This result is of the form of the free particle probability density times the
transmission probability. However, the time t is replaced by t − τ (E0 ). That
is, the outgoing wave is delayed relative to free propagation (V = 0) by a
time delay given by τ (E) = dδ/dE. It is suggested that the reader consider
the classical correspondence for this effect. We will encounter this notion
again when we discuss phase shifts in scattering theory.

5 Stationary State Perturbation Theory


We are frequently concerned with the problem of determining the stationary
state eigenvalues and eigenfunctions of the Hamiltonian. There is a rela-
tively systematic iterative approach to solving this problem, if a suitable

first approximation can be found. This is the method of stationary state
perturbation theory, which we develop with a simplified approach here.
Suppose we are given a Hamiltonian of the form H = H0 + V , where we
know the eigenfunctions and eigenvalues corresponding to H0 :

H0 |ni = εn |ni. (123)

We are interested in solving, at least approximately, the problem:

H|N i = EN |Ni. (124)

If V is “small”, we expect that |ni and εn will be approximate eigenfunctions


and eigenvalues of H. We use this idea to form an iterative expansion for
|N i and EN .
We introduce a “bookkeeping” parameter, λ, to count powers (of V ) in
the expansion. Let H(λ) ≡ H0 + λV . We expect the eigenstates of H(λ) to
vary smoothly from eigenstates of H0 at λ = 0 to eigenstates of H at λ = 1.
Thus, consider the following series:

|Ni = |ni + λ|N1 i + λ2 |N2 i + . . . , (125)


En = εn + λEn1 + λ2 En2 + . . . (126)

Note, however, that such an expansion does not always make sense. For
example, we cannot do perturbation theory on the free particle problem to
solve a bound state problem, no matter how “weak” the potential.
We need to develop an algorithm to solve for the terms in the series.
Assume our unperturbed system of eigenstates is orthonormal:

hn|mi = δnm . (127)

Normalize the eigenstates of H so that:

hn|N i = 1. (128)

Assuming our perturbation is not too large, this should be possible. Of


course, we won’t have hN |Ni = 1 in general, and will have to renormalize
these functions at the end. Then we have

1 = hn|N i = hn|ni + λhn|N1 i + λ2 hn|N2 i + . . . (129)

The coefficient of each power of λ must vanish, so that:

hn|Nk i = 0, k = 1, 2, . . . (130)

Now consider the Schrödinger equation:

(H0 + λV )|N i = EN |Ni, (131)

or

(H0 + λV )(|ni + λ|N1 i + λ2 |N2 i + . . .) (132)


2 2
= (εn + λEN 1 + λ EN 2 + . . .)(|ni + λ|N1 i + λ |N2 i + . . .).

Equating powers of λ on both sides, we obtain:

λ⁰:  H0 |n⟩ = εn |n⟩   (133)
λ¹:  H0 |N1⟩ + V |n⟩ = εn |N1⟩ + EN1 |n⟩   (134)
λᵏ:  H0 |Nk⟩ + V |Nk−1⟩ = Σ_{j=0}^{k} Enj |Nk−j⟩,   (135)

where

En0 ≡ εn (136)
|N0 i ≡ |ni. (137)

Consider the λ1 equation, and take the scalar product with hn|:

hn|H0 |N1 i + hn|V |ni = hn|εn |N1 i + hn|EN 1 |ni. (138)

Thus, the first order correction to the nth energy level is

EN 1 = hn|V |ni. (139)

Or, to first order in the potential V (setting λ = 1):

EN = εn + hn|V |ni + O(V 2 ). (140)

This is the most commonly used equation in stationary state perturbation


theory.

In general, we find from Eqn. 135:
Enk = hn|V |Nk−1 i. (141)
If we know the (k − 1)th order correction to the wave function, we can
obtain the kth order energy correction. To find the wave function corrections,
expand in the eigenstates of H0 :
|Nk⟩ = Σ_{m≠n} |m⟩⟨m|Nk⟩,  k = 1, 2, . . . ,   (142)

where the m = n terms in the sum are excluded since ⟨n|Nk⟩ = 0, ∀k > 0.
Now take the scalar product of Eqn. 135 with ⟨m| to get the expansion
coefficients:

⟨m|H0|Nk⟩ + ⟨m|V|Nk−1⟩ = Σ_{j=0}^{k} Enj ⟨m|Nk−j⟩   (143)
εm ⟨m|Nk⟩ + ⟨m|V|Nk−1⟩ = εn ⟨m|Nk⟩ + Σ_{j=1}^{k−1} Enj ⟨m|Nk−j⟩,   (144)

where the j = k term in the sum may be excluded, since ⟨m|n⟩ = 0. Thus,
if εm ≠ εn:

⟨m|Nk⟩ = [1/(εn − εm)] ( ⟨m|V|Nk−1⟩ − Σ_{j=1}^{k−1} Enj ⟨m|Nk−j⟩ ).   (145)

We’ll try an example – estimating the helium ground state energy. Let
H = H0 + V, with

H0 = p1²/2m + p2²/2m − 2α/r1 − 2α/r2   (146)
V = α/r12 .   (147)

The ground state wave function for the unperturbed H0 is just

|n = 0⟩ = |0⟩0 = (Z³/πa0³) e^{−Z(r1+r2)/a0},  with Z = 2.   (148)

The unperturbed ground state energy is:

ε0 = −(1/2) mα² (2Z²) = −108.9 eV.   (149)

The first order correction to the energy is

E01 = 0⟨0|V|0⟩0 = ∫_{(∞)} d³(x1) ∫_{(∞)} d³(x2) (Z³/πa0³)² e^{−2Z(r1+r2)/a0} (α/r12).   (150)

This is an integral we already evaluated, hence

E01 = [ (1/2) mα² (5Z/4) ]_{Z=2} = 34.0 eV.   (151)

The ground state energy, to first order in perturbation theory is thus


E0 = ε0 + 0⟨0|V|0⟩0 + O(V²)   (152)
   = −(1/2) mα² (2Z² − 5Z/4) + O(V²)   (153)
   = −74.9 eV + O(V²).   (154)
Our first order perturbation theory correction to the energy is 34 eV, to be
applied to the zeroth order level at -109 eV. This is a rather large correction,
so it isn’t surprising that the result is still 4.1 eV away from the observed
value of -79.0 eV. Note that the variational calculation we performed does
better – it really is the same calculation, except that there in addition we
allowed Z to vary to accommodate the screening effect.
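For reference, the numbers above follow from (1/2)mα² ≈ 13.6 eV; a two-line check
(added here purely as an illustration):

Ry = 13.6057                  # (1/2) m alpha^2 in eV
Z = 2
eps0 = -Ry * 2 * Z**2         # Eq. (149): about -108.8 eV
E01 = Ry * 5 * Z / 4          # Eq. (151): about +34.0 eV
print(eps0, E01, eps0 + E01)  # total ~ -74.8 eV, matching Eq. (154) up to rounding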
We might guess that the size of the second order effect could be estimated
by squaring the size of the first order correction:
O(V²) ≈ 74.9 × (34.0/108.9)² = 7.3 eV.   (155)
This is certainly of the right order, but is only an order-of-magnitude esti-
mate. To do better, we should perform the second order perturbation theory
calculation:
En2 = ⟨n|V|N1⟩   (156)
    = Σ_{m≠n} ⟨n|V|m⟩⟨m|N1⟩   (157)
    = Σ_{m≠n} |⟨m|V|n⟩|² / (εn − εm).   (158)

The second order correction depends on the “overlap” between hm| and V |ni
where hm| 6= hn|, and also on the energy level spacing in the unperturbed
Hamiltonian.
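As an added illustration of Eqs. 139 and 158 (not part of the original notes), the
following sketch assumes the exactly solvable example H0 = p²/2 + x²/2 with
V = λx (ℏ = m = ω = 1); the first-order shift vanishes, and the second-order sum
reproduces the exact shift −λ²/2 obtained by completing the square.

import numpy as np

lam, N = 0.1, 60
n = np.arange(N)
x = np.zeros((N, N))
x[n[:-1], n[:-1] + 1] = np.sqrt((n[:-1] + 1) / 2.0)
x = x + x.T                   # <m|x|n> in the oscillator basis (hbar = m = omega = 1)

eps = n + 0.5                 # unperturbed levels of H0
V = lam * x                   # perturbation V = lam * x

def E2(k):
    """Second-order correction for level k, Eq. (158)."""
    return sum(abs(V[m, k])**2 / (eps[k] - eps[m]) for m in range(N) if m != k)

# First order (Eq. 139) vanishes since <n|x|n> = 0; second order gives -lam^2/2.
print(V[0, 0], E2(0), -lam**2 / 2)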

5.1 Normalization
We chose the normalization ⟨n|N⟩ = 1 in order to make the expansion conve-
nient. The drawback is that |N⟩ is not normalized to unit total probability.
We may renormalize to obtain a normalized wave function as follows: Let
|N̂⟩ = √A |N⟩ such that ⟨N̂|N̂⟩ = 1. The constant A = 1/⟨N|N⟩ is called
the “wave function renormalization constant”. Take

√A = √A ⟨n|N⟩ = ⟨n|N̂⟩.   (159)

Let’s compute A to second order in V:

1/A = ⟨N|N⟩   (160)
    = ( ⟨n| + λ⟨N1| + λ²⟨N2| ) ( |n⟩ + λ|N1⟩ + λ²|N2⟩ ) + O(λ³)   (161)
    = 1 + λ² ⟨N1|N1⟩ + O(λ³)   (162)
    = 1 + λ² Σ_{m≠n} ⟨N1|m⟩⟨m|N1⟩ + O(λ³)   (163)
    = 1 + λ² Σ_{m≠n} |⟨m|V|n⟩|² / (εn − εm)² + . . .   (164)

We see that |Ni is normalized already to one to first order, with corrections
only appearing at second order.
To second order,

A = 1 − λ² Σ_{m≠n} |⟨m|V|n⟩|² / (εn − εm)² + O(λ³)
  = (∂/∂εn) [ εn + λ⟨n|V|n⟩ + λ² Σ_{m≠n} |⟨m|V|n⟩|² / (εn − εm) ] + O(λ³)
  = ∂En/∂εn + O(λ³).   (165)
This result is actually valid to all orders in perturbation theory: If a system
is in eigenstate |N̂⟩ = √A |N⟩ of the perturbed Hamiltonian, the probability,
A = |⟨n|N̂⟩|², to observe it in the unperturbed state, |n⟩, is just the partial
derivative, ∂En/∂εn, of the perturbed energy with respect to the unperturbed
energy. The partial derivative here means keeping εm (m ≠ n) and ⟨m|V|n⟩
fixed.

6 Degenerate State Perturbation Theory
If εn = εm and ⟨n|V|m⟩ ≠ 0, then the perturbation theory we developed
above breaks down. This is not an essential difficulty, however, and we
address getting around it here.
Suppose that a set of states,

|n1 i, |n2 i, . . . , |n`i, (166)

are degenerate with respect to H0 :

H0 |ni i = εn |ni i, i = 1, 2, . . . , `. (167)

As we already noticed, if hni |V |nj i 6= 0 for i 6= j, the perturbation theory we


have developed breaks down. However, we may choose any linear combina-
tions of these degenerate states and again obtain eigenstates of H0 with the
same eigenvalue. Thus, choose the set which diagonalizes V in this subspace:

|n0i i = bij |nj i, (168)
j=1

such that
hn0i |V |n0j i = 0, i 6= j (169)
The matrix B = {bij } which does this is just the matrix formed by the
normalized eigenvectors of the V matrix (in this `-dimensional subspace):

V Bj = λj Bj , (170)

where

|(Bj )i |2 = 1. (171)
i=1

Thus,

|n0i i = (Bi )j |nj i. (172)
j=1

With this change in basis, we recover our original first order perturbation
theory result for the wave functions:

|N′i⟩ = |n′i⟩ + Σ_{m∉{nk}} |m⟩⟨m|V|n′i⟩ / (εn − εm) + . . .   (173)

We also recover our original second order perturbation theory result for the
energies:

Eni = εn + ⟨n′i|V|n′i⟩ + Σ_{m∉{nk}} |⟨m|V|n′i⟩|² / (εn − εm) + . . .   (174)

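A minimal numerical sketch of this prescription (added here as an illustration; the
two-level numbers below are a made-up example, not a hydrogen calculation):
diagonalize V restricted to the degenerate subspace and read off the first-order
energies and the "good" zeroth-order states.

import numpy as np

def first_order_degenerate(eps_n, V_block):
    """Diagonalize V restricted to the degenerate subspace (Eq. 170); returns the
    first-order energies eps_n + lambda_j and the 'good' states as columns of B."""
    lam, B = np.linalg.eigh(V_block)
    return eps_n + lam, B

# Example: a doubly degenerate level coupled only off-diagonally;
# it splits symmetrically by 2|v|, and the good states are (|1> -/+ |2>)/sqrt(2).
v = 0.3
E, B = first_order_degenerate(1.0, np.array([[0.0, v], [v, 0.0]]))
print(E)
print(B)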
7 Time-dependent Perturbation Theory


We now add an element of time variation to our discussion of perturbations.
We’ll develop the basic ideas here; they are especially useful in the application
to scattering theory, which we’ll treat at more length in a later note.
Suppose that, at some time t < t0 , the system is in a state |ψt0 i, satisfying
the Schrödinger equation:

i∂t |ψt0 i = H0 |ψt0 i, t < t0 . (175)

At time t = t0 , we turn on a perturbing potential, Vt . For t > t0 , we thus


must solve:
i∂t |ψt i = (H0 + Vt )|ψt i, t > t0 , (176)
with the boundary condition

|ψt i = |ψt0 i, t ≤ t0 . (177)

Often, Vt is “small enough” so that we can find an approximate solution,


even if an exact solution is too daunting a prospect. As for the stationary
state perturbation theory, we make an expansion in powers of Vt . Note that
|ψt i contains the time dependence from H0 , and if H0  Vt , we expect this
to be a large portion of the time dependence. As this isn’t usually the time
dependence of interest in such problems, it is convenient to factor it out by
writing:
|ψt i = e−iH0 t |ψ(t)i. (178)
Then,
i∂t |ψt i = H0 |ψt i + e−iH0 t i∂t |ψ(t)i, (179)
and hence:

e−iH0 t i∂t |ψ(t)i = (i∂t − H0 )|ψt i (180)


= Vt |ψt i (181)
= Vt e−iH0 t |ψ(t)i. (182)

If we define
V (t) ≡ eiH0 t Vt e−iH0 t , (183)
we may write
i∂t |ψ(t)i = V (t)|ψ(t)i. (184)
We see that V (t) looks like a Hamiltonian for “state” |ψ(t)i. We call |ψ(t)i
the state vector in the interaction representation, implying that the time
dependence is due only to the interaction. The operator V (t) is the inter-
action representation for operator Vt . We may note that, if Vt = 0, the
interaction representation is just the Heisenberg representation.
Now integrate with respect to time:
Z t
∂t |ψ(t)idt = |ψ(t)i − |ψ(t0 )i, (185)
t0

or,

|ψ(t)⟩ = |ψ(t0)⟩ + (1/i) ∫_{t0}^{t} V(t1) |ψ(t1)⟩ dt1.   (186)
This suggests that we try an iterative solution, which we hope converges.
Thus, to first order in V :
Z t
1
|ψ(t)i = |ψ(t0 )i + V (t1 )|ψ(t0 )idt1 . (187)
i t0

To find the second order solution, we substitute the approximation of Eqn. 187
into Eqn. 186:
Z t  2 Z t Z t1
1 1
|ψ(t)i = |ψ(t0 )i+ dt1 V (t1 )|ψ(t0 )i+ dt1 V (t1 ) dt2 V (t2 )|ψ(t0 )i.
i t0 i t0 t0
(188)
In general, we see that the nth order correction in this expansion is:
 n Z t Z t1 Z tn−1
1
dt1 dt2 . . . dtn V (t1 )V (t2 ) . . . V (tn )|ψ(t0 )i. (189)
i t0 t0 t0

If we define a kernel as K(t0 , t) = V (t)θ(t0 − t), we see that this series is really
in the form of a Neumann series for the solution of an integral equation.
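The iteration of Eqn. 186 is easy to carry out numerically. The sketch below is an
added illustration (not from the original notes), assuming a two-level system with a
constant perturbation switched on at t0 = 0; it builds the Neumann partial sums on
a time grid and compares them with the exact interaction-picture evolution.

import numpy as np
from scipy.linalg import expm

w, v = 1.0, 0.2
H0 = np.diag([0.0, w])
Vt = v * np.array([[0.0, 1.0], [1.0, 0.0]])        # constant perturbation for t > 0

times = np.linspace(0.0, 5.0, 2001)
dt = times[1] - times[0]
Vint = np.array([expm(1j * H0 * s) @ Vt @ expm(-1j * H0 * s) for s in times])  # Eq. (183)

psi0 = np.array([1.0, 0.0], dtype=complex)
psi_exact = np.array([expm(1j * H0 * s) @ expm(-1j * (H0 + Vt) * s) @ psi0 for s in times])

def neumann(n_orders):
    """Iterate Eq. (186): psi_{k+1}(t) = psi0 + (1/i) int_0^t V(t') psi_k(t') dt'."""
    psi = np.tile(psi0, (len(times), 1))
    for _ in range(n_orders):
        integrand = np.einsum('tij,tj->ti', Vint, psi)
        cumint = np.concatenate([np.zeros((1, 2), dtype=complex),
                                 np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * dt, axis=0)])
        psi = psi0 + cumint / 1j
    return psi

for k in (1, 2, 4):
    print(k, np.max(np.abs(neumann(k) - psi_exact)))   # error falls as more orders are kept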

7.1 The Time-Ordered Product
There is another interesting way to express this series result. Let us define the
concept of a time-ordered product, denoted by {AB}t, of two operators
A and B, according to:

{A(t1)B(t2)}t = A(t1)B(t2), if t1 ≥ t2;  = B(t2)A(t1), if t1 < t2.   (190)

With this idea, consider

{ [∫_{t0}^{t} V(t1) dt1]² }t = { ∫_{t0}^{t} dt1 V(t1) ∫_{t0}^{t} dt2 V(t2) }t   (191)
   = ∫_{t0}^{t} dt1 ∫_{t0}^{t} dt2 {V(t1)V(t2)}t   (192)
   = ∫_{t0}^{t} dt1 ∫_{t0}^{t1} dt2 {V(t1)V(t2)}t + ∫_{t0}^{t} dt2 ∫_{t0}^{t2} dt1 {V(t2)V(t1)}t
   = 2 ∫_{t0}^{t} dt1 ∫_{t0}^{t1} dt2 {V(t1)V(t2)}t .   (193)

In general, we find:

{ [∫_{t0}^{t} V(t1) dt1]ⁿ }t = ∫_{t0}^{t} dt1 ∫_{t0}^{t} dt2 . . . ∫_{t0}^{t} dtn {V(t1)V(t2) . . . V(tn)}t
   = n! ∫_{t0}^{t} dt1 ∫_{t0}^{t1} dt2 . . . ∫_{t0}^{tn−1} dtn V(t1)V(t2) . . . V(tn).   (194)

Thus,

(1/iⁿ)(1/n!) { [∫_{t0}^{t} V(t1) dt1]ⁿ }t |ψ(t0)⟩   (195)

is the nth order term in our expansion for |ψ(t)i. We might think of this in
terms of the picture in Fig. 3: Imagine that, in propagating from time t0 to
time t, the wave “interacts” with the potential at discrete times t1 , t2 , . . . tn .
To get the total evolution of the wave, we must integrate over all possible
interaction times, and sum over all possible numbers of interactions. When
we sum over all terms in this expansion, we find:

|ψ(t)⟩ = { exp[ −i ∫_{t0}^{t} V(t1) dt1 ] }t |ψ(t0)⟩.   (196)

Figure 3: Illustration of nth order interaction term in time evolution from t0
to t.

7.2 Transition Probability, Fermi’s Golden Rule


Suppose a system is initially (at t = t0 ) in eigenstate |ii of H0 :
|ψ(t)i = |ii, where H0 |ii = εi |ii. (197)
Let |f i denote an arbitrary eigenstate of H0 . For convenience, we choose |ii,
|f i to be in the interaction representation.
We wish to address the question: At time t > t0 , what is the probability
that the system will be observed in state |f i, i.e., what is the probability
that the interaction has caused the transition |ii → |f i? In the interaction
representation, this amplitude is hf |ψ(t)i. Using Eqn. 187, we have to first
order in V :
⟨f|ψ(t)⟩ = ⟨f|i⟩ + (1/i) ∫_{t0}^{t} dt1 ⟨f|V(t1)|i⟩.   (198)
We find for the matrix element in the integrand:
hf |V (t1 )|ii = hf |eiH0 t1 Vt1 e−iH0 t1 |ii (199)
= ei(εf −εi )t1 hf |Vt1 |ii. (200)

Hence, if ⟨f|i⟩ = 0, the transition amplitude is

⟨f|ψ(t)⟩ = (1/i) ∫_{t0}^{t} dt1 e^{i(εf−εi)t1} ⟨f|Vt1|i⟩.   (201)

The transition probability is:

Pi→f(t) = |⟨f|ψ(t)⟩|² = | ∫_{t0}^{t} dt1 e^{i(εf−εi)t1} ⟨f|Vt1|i⟩ |².   (202)

For example, suppose the potential is “turned on” at time t = 0, and is
constant thereafter:

Vt = 0 for t < t0 = 0;  Vt = V = V(x) for t > 0.   (203)

In this case,

Pi→f(t) = | ∫_{t0}^{t} dt1 e^{i(εf−εi)t1} |² |⟨f|V|i⟩|²   (204)
        = [ sin((εf−εi)t/2) / ((εf−εi)/2) ]² |⟨f|V|i⟩|².   (205)

This probability is plotted as a function of the energy difference in Fig. 4.
The zeros occur at εf = εi ± 2πn/t, n = 1, 2, . . .. The magnitudes of the bumps
decrease as 1/(εf − εi)².
For very small times,

Pi→f (t) ≈ t2 |hf |V |ii|2 , (206)

approximately independent of εf − εi, if |⟨f|V|i⟩|² is not strongly dependent on εf. As
t increases, the probability is largest for states with εf near εi – the height
of the central bump varies approximately as t², and the width as 1/t, yielding
a total probability to be in the central bump that grows approximately as
t. This may be thought of in terms of an “uncertainty relation”: If the
perturbation turns on, or acts, in a very short time ∆t, transitions may be
induced in first order to a wide range of energy states,

∆ε ∆t ≳ (2π/∆t) ∆t = 2π.   (207)

Figure 4: The i → f transition probability as a function of energy difference.
The vertical axis is Pi→f(t)/|⟨f|V|i⟩|² in units of t². The horizontal axis is
εf − εi in units of 2π/t.

But as the interaction is turned on more slowly, or has acted for a longer
time, the uncertainty in energy induced by the perturbation decreases, i.e.,
energy is conserved to ∆ε ∼ 2π/t.
If the levels εf and εi are discrete, with εf 6= εi , the transition prob-
ability simply oscillates with period T = 2π/|εf − εi |. If |f i and |ii are
degenerate, then the probability grows as t2 . This cannot continue indefi-
nitely, since probabilities are bounded by one. Eventually, higher orders in
the perturbation become important.
Consider the case where |f i is drawn from a continuum of energy states
(or, perhaps a very closely spaced spectrum). For example, we could be
dealing with a free particle in H0 . In this case it makes more sense to ask for
the transition probability to some set of states in a neighborhood of |ni. For
example, for a free particle, we are interested in the transition probability
to phase space volume element d3 (p) about p. Since the area of the central
bump grows as t, we expect the transition probability to a set of such states

with εf ≈ εi to grow linearly with time. Hence, the transition rate to such a
set of states is a constant. Let us calculate the transition rate for this case.
We must sum Eqn. 205 over the region of interest (call this region R):

Σ_{f∈R} Pi→f(t) = ∫_R dεf ρ(εf) [ sin((εf−εi)t/2) / ((εf−εi)/2) ]² |⟨f|V|i⟩|²,   (208)

where ρ(εf ) is the number of states per unit energy, known as the density
of states. Let’s suppose |hf |V |ii|2 doesn’t change much over the region of
interest, and take this quantity outside the integration:

Σ_{f∈R} Pi→f(t) ≈ |⟨f|V|i⟩|² ∫_R dεf ρ(εf) [ sin((εf−εi)t/2) / ((εf−εi)/2) ]².   (209)

As t becomes larger, the central bump falls entirely within R, and then the
density of states can also be considered effectively constant over the sharply
peaked integrand. Hence, we take ρ(εf ) outside the integral also, in this
limit:

Σ_{f∈R} Pi→f(t) ≈ [ |⟨f|V|i⟩|² ρ(εf) ]_{εf=εi} ∫_R dεf [ sin((εf−εi)t/2) / ((εf−εi)/2) ]².   (210)

Finally, since the central peak is contained entirely within R, we may let the
limits of integration go to ±∞.
We wish to evaluate the integral:

∫_{−∞}^{∞} dεf [ sin((εf−εi)t/2) / ((εf−εi)/2) ]² = 2t ∫_{−∞}^{∞} dx sin²x / x².   (211)

One way to compute this integral is to notice that

sin²x = (1/2)(1 − cos 2x) = ℜ[ (1 − e^{2ix})/2 ],   (212)

and consider the contour integral:

(1/2) ∮_C (1 − e^{2iz})/z² dz = 0   (213)
around the contour in Fig. 5 The integral around the large semicircle is zero

Figure 5: Contour for the evaluation of the integral in Eqn. 213.

in the limit R → ∞. The desired integral is thus minus the integral around
the small semicircle, in the limit ε → 0:

∫_{−∞}^{∞} dx sin²x / x² = − lim_{ε→0} ∫_π^0 [ (1 − e^{2iεe^{iθ}}) / (2ε²e^{2iθ}) ] iεe^{iθ} dθ   (214)
   = −(i/2)(−2i) ∫_π^0 dθ   (215)
   = π.   (216)

Hence,

Σ_{f∈R} Pi→f(t) ≈ Γt,   (217)

where the transition rate, Γ, is:

Γ = 2π [ |⟨f|V|i⟩|² ρ(εf) ]_{εf=εi}.   (218)

Equation 218 is an important result; it is known as Fermi’s Golden Rule.


Our discussion is evidently not valid when:

1. The time t is too “short”. We must have the central bump within the
region of interest. That is, we must have (∆ε)R large compared with

2π/t, i.e.,

t > 2π/(∆ε)_R.   (219)

2. The time t is too “long”. If t is too long, then there may be only a
few states within the central bump (not a problem if the spectrum is
continuous, of course). Suppose δε is the level spacing in the region of
interest. This spacing must be small compared with 2π/t for the above
analysis. That is, we must have

t < 2π/δε.   (220)
Furthermore, if t becomes too long, the initial state becomes depleted,
and the transition rate will no longer be constant.
Let us apply this framework to the case of a particle in a box of volume
L3 .2 Turn on the potential V (x) inside the box. Start with a particle in
momentum state p, and ask for the rate at which it transitions to other
momentum states, p′. The matrix element of V(x) between momentum
states is

⟨p′|V|p⟩ = ∫_{L³} d³(x) (e^{−ip′·x}/L^{3/2}) V(x) (e^{ip·x}/L^{3/2})   (221)
         = V̂(p′ − p)/L³,   (222)

where V̂ is the Fourier transform (in the box) of V(x).


To put this into the golden rule, we restate the golden rule somewhat:
We notice that as t grows, the function

[ sin((εf−εi)t/2) / ((εf−εi)/2) ]² →_{t→∞} 2πt δ(εf − εi).   (223)

Then our transition rate is:

Γ = 2π |⟨f|V|i⟩|² δ(εf − εi).   (224)

This version of Fermi’s Golden Rule must be applied in the context of a sum
over states |f i, that is, there must be an integral over the delta function.
2
We have in mind that we will eventually take the limit as the box size becomes infinite,
and develop this into a theory for scattering.

Using Eqn. 224, the transition rate for p → p′ is:

Γ_{p→p′} = 2π [ |V̂(p′ − p)|² / L⁶ ] δ(ε_{p′} − ε_p),   (225)

where

ε_p = p²/2m.   (226)
The rate Γ here is actually a differential decay rate. Let us apply it to obtain
the rate of scattering, dΓ, into an element of solid angle dΩ′:

dΓ = Σ_{p′∈dΩ′} Γ_{p→p′}.   (227)

To perform this summation, we need the density of states, i.e., we need


the number of states in phase space element d3 (p0 ). On dimensional grounds,
we must have a number of states dN 0 :
dN 0 ∝ L3 d3 (p0 ), (228)
and it remains to determine the constant of proportionality. Let’s consider
the problem in one dimension. The free particle wave functions are

ψ_p(x) = (1/√L) e^{±ipx}.   (229)

Imposing periodic boundary conditions ensures no net flux of particles out
of the box³:

ψ(x) = ψ(x + L)   (230)
ψ′(x) = ψ′(x + L).   (231)

Thus, we must have e^{ipL} = 1, or pL = 2πn, where n is an integer. Hence,

dN/dp = L/2π.   (232)

We generalize to three dimensions to obtain

dN = [L³/(2π)³] d³(p).   (233)
3
Note that we are really thinking in terms of eventually letting the box boundaries go
off to infinity, and the constraint we want is conservation of probability. We don’t care
here whether the wave function goes to zero at the box boundary.

Thus, we have

dΓ = ∫_{p′∈dΩ′} [L³/(2π)³] d³(p′) 2π [ |V̂(p′ − p)|²/L⁶ ] δ(ε_{p′} − ε_p)   (234)
   = [dΩ′/(L³(2π)²)] ∫_0^∞ p′ m dε_{p′} |V̂(p′ − p)|² δ(ε_{p′} − ε_p)   (235)
   = [dΩ′ mp/(L³(2π)²)] |V̂(p′ − p)|²,   (236)

where p′ = |p| Ω̂′, with Ω̂′ a unit vector in dΩ′.
We may think of this example as a “scattering experiment” since the
potential is effectively “turned on” as the incident particle nears it. As long as
the potential falls off rapidly enough at large distances, the use of free particle
wave functions for the incident wave at early times and for the scattered wave
at late times is a plausible approximation to make. If we suppose that we
have a beam of incident particles, then Eqn. 236 tells us the rate at which
particles scatter into dΩ0 per incident particle in volume L3 . The flux of
particles per incident particle of momentum p in volume L3 is just:
1 p
Number of beam/area/time = 3 |v| = . (237)
L mL3
This may be seen by interpreting the factor 1/L3 as the number of beam
particles per unit volume (i.e., we have normalized our wave to one parti-
cle in the box of volume L3 ), and |v| (the speed of a beam particle) gives
the distance per unit time. Dividing the rate by this flux, we obtain the
differential scattering cross section:

dσ/dΩ′ = [m²/(2π)²] |V̂(p′ − p)|².   (238)
dΩ (2π)
This formula is referred to as the Born Approximation (or, as the “first”
Born approximation) for the differential cross section. Notice that the size
of the box has disappeared once we have divided out the incident flux; we
expect the formula to apply in the limit of infinite spatial extent (i.e., in the
continuum limit).

7.3 Coulomb Scattering


In Coulomb scattering, we consider the potential

V(x) = q1 q2 / r,   (239)

where r = |x|. We need the Fourier transform:

V̂(p′ − p)/(q1 q2) = ∫_{(∞)} d³(x) (1/r) e^{−i(p′−p)·x}   (240)
   = 2π ∫_0^∞ r dr ∫_{−1}^{1} d cos θ exp(−i|p′ − p| r cos θ)   (241)
   = 2π ∫_0^∞ r dr [ e^{−i|p′−p|r} − e^{i|p′−p|r} ] / (−i|p′ − p| r)   (242)
   = [ 2πi/(p′ − p)² ] (−2i) ∫_0^∞ sin x dx   (243)
   = 4π/(p′ − p)².   (244)

Actually, the integral ∫_0^∞ sin x dx = 1 may be a bit suspicious. However, we
get the same result if we consider scattering on a Yukawa potential and take
the limit as the Yukawa range goes to infinity. Writing

(p′ − p)² = 2p² (1 − cos θ),   (245)

we obtain the differential cross section:

dσ/dΩ = (q1 q2)² (m²/4π²) · 16π² / [4p⁴(1 − cos θ)²]   (246)
      = (q1 q2)² m² / [p⁴(1 − cos θ)²]   (247)
      = (q1 q2)² / [16E² sin⁴(θ/2)],   (248)

where p2 = 2mE, and θ is the scattering angle. This result may be recog-
nized as the Rutherford cross section. There should be some real concern
whether we had any business applying the Born approximation here, since
the Coulomb potential falls off so slowly with distance. We’ll discuss this
issue further later.
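The “suspicious” step above can be checked by doing the screened (Yukawa-
regularized) Fourier transform numerically and letting the screening mass go to
zero. The sketch below is an added illustration (not from the original notes), with
arbitrary example values chosen for k and μ.

import numpy as np
from scipy.integrate import quad

def vhat_screened(k, mu):
    """Vhat(k) = int d^3x e^{-ik.x} e^{-mu r}/r = (4 pi/k) int_0^inf e^{-mu r} sin(kr) dr."""
    val, _ = quad(lambda r: np.exp(-mu * r), 0.0, np.inf, weight='sin', wvar=k)
    return 4.0 * np.pi * val / k

k = 1.7                                   # arbitrary example momentum transfer
for mu in (1.0, 0.1, 1e-3):
    print(mu, vhat_screened(k, mu), 4 * np.pi / (k**2 + mu**2))
print(4 * np.pi / k**2)                   # the mu -> 0 (Coulomb) limit, Eq. (244)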

7.4 Decays
We are sometimes faced with the problem of describing a decay process, such
as the radioactive decay of a nucleus, in which a particle of momentum p

is produced. To be more explicit, suppose that we have a nucleus in an
initial state |ii which decays to a final state |f i plus a particle in momentum
state |pi. We’ll neglect the nuclear recoil here, i.e., we’ll assume that the
momentum p is small compared with the nuclear masses involved.
Assume that we know the interaction matrix element, hf ; p|V |ii for the
decay. The differential rate to emit the particle into solid angle element dΩ
is
dΓ = dΩ ∫_0^∞ [mpL³/(2π)³] dεp 2π |⟨f; p|V|i⟩|² δ(Ei − Ef − εp)   (249)
   = dΩ [mpL³/(2π)²] |⟨f; p|V|i⟩|²,   (250)

where we have used

dN = [L³/(2π)³] d³(p) = [L³/(2π)³] dΩ p m dεp,   (251)

Ei is the energy of the nucleus in the initial state, Ef is the energy of the
nucleus in the final state, εp = p²/2m, p = p Ω̂, and p = √(2m(Ei − Ef)).
Note that the wave function of the emitted particle will have a normalization
proportional to 1/L3/2 , so the rate will actually be independent of L.
Integrating over dΩ gives the total decay rate:

Γ = [mpL³/4π²] ∫_{(4π)} dΩ |⟨f; p|V|i⟩|².   (252)

If at t = 0 we have N0 nuclei in state |ii, then at later time t we will have


seen ∆N = N0 Γt decays, according to first order perturbation theory. For
large times, this must break down, since the initial state becomes depleted.
We may rectify this by writing
dN
= −N(t)Γ, (253)
dt
i.e., the rate of observing nuclear decays is proportional to the number of avail-
able nuclei, as well as to the decay rate of a nucleus. Thus, we have the
familiar exponential decay law:

N(t) = N0 e−Γt . (254)

7.5 Adiabatically Increasing Potential
Earlier, we considered that our potential was turned on more-or-less suddenly
at some time t0 . Let us consider the situation where the potential is turned
on very slowly, compared with some relevant time scale. We’ll see that this
case is not fundamentally different from our previous discussion, in the spirit
of the perturbative nature of the interaction.
Let us consider a situation where we imagine a slow turn-on of a potential,
for example, suppose we have an atom in state |ii which we subject to an
electromagnetic field which is slowly increased from zero. A measure of
“slow” here must mean that the rate of energy change associated with the
external field must be small compared with the orbit frequency of the atomic
electrons, i.e., the time scale for significant variation must be longer than
a0/v ∼ (1/mα)(1/α) ∼ 10⁻¹⁶ s.   (255)
Formally, we may turn on a potential, V, slowly by writing
Vt = eγt V, (256)
where γ > 0 so that V−∞ → 0. We’ll assume here that V itself is independent
of time. To first order in time-dependent perturbation theory:
|ψ(t)⟩ = |ψ(t0)⟩ + (1/i) ∫_{t0}^{t} dt1 V(t1) |ψ(t0)⟩.   (257)

Consider the transition from |ii to |f i, assuming |ii and |f i are orthogonal.
The transition amplitude, in first order, is:

⟨f|ψ(t)⟩ = (1/i) ∫_{t0→−∞}^{t} dt1 ⟨f|V(t1)|i⟩   (258)
         = (1/i) ∫_{−∞}^{t} dt1 ⟨f|e^{iH0t1} Vt1 e^{−iH0t1}|i⟩   (259)
         = (1/i) ∫_{−∞}^{t} dt1 e^{i(εf−εi)t1} e^{γt1} ⟨f|V|i⟩   (260)
         = [ e^{γt+i(εf−εi)t} / (εi − εf + iγ) ] ⟨f|V|i⟩.   (261)

The resulting transition probability is:

|⟨f|ψ(t)⟩|² = [ e^{2γt} / ((εf − εi)² + γ²) ] |⟨f|V|i⟩|².   (262)

The dependence of this probability on energy is in the form of a Breit-
Wigner distribution, or of a Cauchy probability distribution.

Figure 6: The i → f transition probability as a function of energy difference.
The smooth curve is γ²/[(εf − εi)² + γ²], plotted for γ = 2. The horizontal
axis is εf − εi in units of 2π/t. The wavy curve is reproduced from Fig. 4 for
comparison.

The energy

spread of this distribution is of order γ, which may be interpreted roughly


as the inverse of the length of time the potential has been “on”.4
If |f⟩ is one of a continuum of states, we may calculate the transition rate
to such states according to:

(d/dt) |⟨f|ψ(t)⟩|² = e^{2γt} [ 2γ / ((εf − εi)² + γ²) ] |⟨f|V|i⟩|².   (263)

Consider the limit of arbitrarily slow turn-on: γ → 0, and

lim_{γ→0} e^{2γt} [ 2γ / ((εf − εi)² + γ²) ] → A δ(εf − εi),   (264)
4
Note that the standard deviation, or rms spread, of a Cauchy distribution is infinite.
Hence, we use here something like the half width at half maximum as our measure of
energy spread.

where A is a constant to be determined by matching the normalization in
the γ → 0 limit:

A = ∫_{−∞}^{∞} dεf 2γ / ((εf − εi)² + γ²)   (265)
  = 2 ∫_{−∞}^{∞} dx / (1 + x²)   (266)
  = 2π.   (267)

Thus, we have the transition rate

Γi→f = 2π|hf |V |ii|2 δ(εf − εi ), (268)

which we recognize as Fermi’s Golden rule once again! We observe that this
rule is robust with respect to the details of how the perturbing potential is
turned on.

8 Eigenvalues – Comparison Theorems


We return to some further discussion of some techniques similar to our dis-
cussion of the variational method. Consider the problem of a particle in a
(time-independent) potential, and ask what we might say qualitatively about
the existence and number of bound states, and related questions.
We start with the following “comparison theorem”:
Theorem: Consider a self-adjoint Hamiltonian:

H = −(1/2m) ∇² + V(x).   (269)

Let θ > 0. Then

H(θ) = −(1/2m) ∇² + θ² V(θx)   (270)

is also self-adjoint, and if ψ(x) ∈ D_H, then ψ(θx) ∈ D_{H(θ)}. If λ is an
eigenvalue of H (λ ∈ Σ(H)), then θ²λ is an eigenvalue of H(θ), and we
have:

Σ[H(θ)] = { θ²λ | λ ∈ Σ(H) }.   (271)
In particular, if the negative spectrum of H is discrete (with only 0 as
a possible point of accumulation), then the number of negative eigen-
values of H(θ) is the same as of H.

Figure 7: Illustration of the scaling of the wave function corresponding to
the scaling of the potential.

Proof: Domain is a question of boundary conditions – the vectors must scale


in the same way the potential is scaled. See Fig. 7 for an illustration.
Let Hψ(x) = λψ(x). Consider:

H(θ)ψ(θx) = [ −(1/2m) ∂x² + θ² V(θx) ] ψ(θx)   (272)
           = θ² [ −(1/2m) ∂²_{θx} + V(θx) ] ψ(θx)   (273)
           = θ² λ ψ(θx).   (274)

This “comparison theorem”, relating the spectra of two related operators,


tells us, for example, that any valid formula which gives an upper/lower limit
on the number of negative eigenvalues in terms of the potential must be
invariant under the substitution V (x) → θ2 V (θx), for θ > 0. For example,
consider
H = p²/2m − V0 + (1/2) k x².   (275)

The energy levels are at −V0 + (n + 1/2)ω, where ω = √(k/m). The number
of negative eigenvalues is n = [V0/ω − 1/2]. Now consider modifying the potential
from V(x) = −V0 + (1/2)kx² to

θ² V(θx) = −θ² V0 + (1/2) θ⁴ k x²,   (276)

giving the spectrum:

Σ[H(θ)] = θ² [ −V0 + (n + 1/2) ω ].   (277)
The solution for the number of negative eigenvalues is the same as before.
Now consider, for a finite dimensional Hilbert space the following “min-
max” theorem:
Theorem: Let Q be a Hermitian N × N matrix, with eigenvalues:

λ1 ≤ λ2 ≤ . . . ≤ λN .   (278)

Let Pk, 1 ≤ k ≤ N, be the set of Hermitian projections onto a k-
dimensional subspace. Then:

λn = min_{F∈Pn} ( max_{ψ∈F, ‖ψ‖=1} ⟨ψ|Q|ψ⟩ ),   (279)

where we use the symbol F to mean both the projection operator onto
a subspace, and the corresponding subspace itself.
This theorem tells us that we can find the nth eigenvalue by first finding
the maximum of hψ|Q|ψi for all unit vectors ψ in a fixed n-dimensional
subspace F , and then minimizing the result as a function of F . If we don’t
minimize, then we obtain an upper bound on λn . The case n = 1 corresponds
to the variational principle we have already discussed, since the subspaces are
then one-dimensional, hence there is only one unit vector in each subspace
(i.e., there is no maximization step required), and our minimization step is
only to the extent of our trial function parameterization. The case n = N is
also trivial, since then F = H, and there is thus no minimization step.

Proof: We have already dealt with the trivial cases n = 1 and n = N, so we
now suppose that 1 < n < N. Since Q is Hermitian, it has the spectral
decomposition:

Q = Σ_{k=1}^{N} λk Ek,   (280)

where Ek E_ℓ = δ_{kℓ} Ek and I = Σ_{k=1}^{N} Ek. That is, the Ek are Hermitian
projections into one-dimensional subspaces. Then we can write:

Q = λn I + Σ_{k=n}^{N} (λk − λn) Ek − Σ_{k=1}^{n} (λn − λk) Ek.   (281)

Hence, for any ‖ψ‖ = 1:

⟨ψ|Q|ψ⟩ = λn + | ⟨ψ| Σ_{k=n}^{N} (λk − λn) Ek |ψ⟩ | − | ⟨ψ| Σ_{k=1}^{n} (λn − λk) Ek |ψ⟩ |,   (282)

where taking the absolute values does not alter the validity, since each
term in the two sums is non-negative. Now let F be any n-dimensional
subspace, and let ψ ∈ F. Select ψ to be orthogonal to E1, E2, . . . , En−1,
that is, orthogonal to these n−1 projections. This is certainly possible,
since there are n independent directions in F. Then we have:

⟨ψ|Q|ψ⟩ = λn + | ⟨ψ| Σ_{k=n}^{N} (λk − λn) Ek |ψ⟩ |   (283)
        ≥ λn.   (284)

Now select F orthogonal to En+1, En+2, . . . , EN. Again, this is certainly
possible, since there are still n directions left. In this case,

⟨ψ|Q|ψ⟩ = λn − | ⟨ψ| Σ_{k=1}^{n} (λn − λk) Ek |ψ⟩ |   (285)
        ≤ λn.   (286)
For both statements 284 and 286 to be true, we must have the statement
in the theorem.
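The theorem is easy to test numerically for a random Hermitian matrix. The sketch
below is an added illustration (not from the original notes): it computes the
maximum of ⟨ψ|Q|ψ⟩ over a given n-dimensional subspace F as the top eigenvalue of
the restricted matrix F†QF, checks that it never drops below λn for sampled
subspaces, and checks that the minimum, λn, is attained by the span of the lowest n
eigenvectors.

import numpy as np

rng = np.random.default_rng(0)
N, n = 6, 3
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
Q = (A + A.conj().T) / 2                        # random Hermitian Q
lam, vecs = np.linalg.eigh(Q)                   # lambda_1 <= ... <= lambda_N

def max_in_subspace(F):
    """Max of <psi|Q|psi> over unit psi in the span of the columns of F:
    the top eigenvalue of the restricted matrix F^dagger Q F."""
    return np.linalg.eigvalsh(F.conj().T @ Q @ F)[-1]

# Random n-dimensional subspaces: the max never drops below lambda_n ...
samples = []
for _ in range(2000):
    F, _ = np.linalg.qr(rng.normal(size=(N, n)) + 1j * rng.normal(size=(N, n)))
    samples.append(max_in_subspace(F))
print(lam[n - 1], min(samples))

# ... and the minimum, lambda_n, is attained by the span of the lowest n eigenvectors.
print(max_in_subspace(vecs[:, :n]))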
From the minmax theorem, it follows that:
Theorem: Let Q and V be N × N Hermitian matrices, and let:
Q̂ ≡ Q + V. (287)
Also, let the spectra of these operators be denoted:
Σ(Q) = {λ1 ≤ λ2 ≤ . . . ≤ λN }, (288)
Σ(Q̂) = {λ̂1 ≤ λ̂2 ≤ . . . ≤ λ̂N }, (289)
Σ(V ) = {V1 ≤ V2 ≤ . . . ≤ VN }. (290)

Then
λn + VN ≥ λ̂n ≥ λn + V1 . (291)

It also follows that:

Theorem: Let Q be an N × N Hermitian matrix with spectrum

Σ(Q) = {λ1 ≤ λ2 ≤ . . . ≤ λN }. (292)

Let Pk be any k-dimensional Hermitian projection, 1 ≤ k ≤ N. We


may regard the operator Pk QPk as a Hermitian k × k matrix, when
restricted to the subspace Pk . Let the eigenvalues of this restricted
matrix be
{λ̂1 ≤ λ̂2 ≤ . . . ≤ λ̂k }. (293)
Then
λ̂n ≥ λn , for n = 1, 2, . . . , k. (294)

We may see a classical analog of these theorems in terms of a system of


coupled oscillators: First, if we add more springs to a system, none of the
normal-mode frequencies decrease. Second, if we remove degrees of freedom
by clamping, then the k remaining frequencies will be at least as large as the
k lowest frequencies of the original unclamped system.

9 Exercises
1. Prove the theorem quoted in section 2.3:

Theorem: If we have a normalized function |ψ⟩ such that

E0 ≤ ⟨ψ|H|ψ⟩ ≤ E1,   (295)

then

E0 ≥ ⟨ψ|H|ψ⟩ − (⟨Hψ|Hψ⟩ − ⟨ψ|H|ψ⟩²)/(E1 − ⟨ψ|H|ψ⟩).   (296)

2. Let us pursue our variational approach to the estimation of ground state


energy levels of atoms to the “general” case. We consider an atom with

nuclear charge Z, and N electrons. The Hamiltonian of interest is:

H(Z, N) = Hkin − Z Vc + Ve   (297)
Hkin = Σ_{n=1}^{N} pn²/2m,   (298)
where   (299)
Vc = α Σ_{n=1}^{N} 1/|xn|   (300)
Ve = α Σ_{N≥j>k≥1} 1/|xk − xj|   (301)
m = electron mass   (302)
α = fine structure constant.   (303)
Denote the ground state energy of H(Z, N ) by −B(Z, N ), with B(Z, 0) =
0.

(a) Generalize the variational calculation we performed for the ground


state of helium to the general Hamiltonian H(Z, N ). Thus, se-
lect your “trial function” to be a product of N identical “hydro-
gen atom ground state” functions. Determine the resulting lower
bound B̂(Z, N ) on B(Z, N ) (i.e., an upper limit on the ground
state energies).
(b) Make a simple table comparing your variational bounds with the
observed ground state energies for lithium, beryllium, and nitro-
gen. Note that a simple web search for “ionization potentials” will
get you a multitude of tables of observed values, or you can look
at a reference such as the CRC Press’s Handbook of Chemistry
and Physics. The table entries are typically of the form:
B(Z, N ) − B(Z, N − 1).

(c) Do your results make sense? If not, can you figure out what is
wrong, and whether the calculation we did for He is to be trusted?

3. We consider the quantum mechanics of a particle in the earth’s gravi-


tational field:
Mm
V (r) = −G (304)
r
46
Mm
= −G (305)
R+z
Mm
≈ −G + mgz (306)
R
where (307)
M = mass of earth (308)
m = mass of particle (309)
r = distance from center of earth (310)
G = Newton’s gravitational constant (311)
R = radius of earth (312)
z = height of particle above surface of earth (313)
g = GM/R2 . (314)
We may drop the constant term in our discussion, and consider only the
mgz piece, with z  R. We further assume that no angular momen-
tum is involved, and treat this as a one dimensional problem. Finally,
assume that the particle is unable to penetrate the earth’s surface.
(a) Make a WKB calculation for the energy spectrum of the particle.
(b) If the particle is an atom of atomic weight A ∼ 100, use the result
of part (a) to estimate the particle’s ground state energy (in eV).
Is sunlight likely to move the particle into excited states?
(c) Now make a variational calculation for the ground state energy
(i.e., an upper bound thereon). Pick a “sensible” trial wave func-
tion, at least in the sense that it satisfies the right boundary con-
ditions. Compare your result with the ground state level from the
WKB approximation.
4. We discussed the method of stationary phase in section 4. Recall that
the problem it addresses is to evaluate integrals of the form:
I(ε) = ∫_{−∞}^{∞} f(x) e^{iθ(x)/ε} dx,   (315)

where f and θ are real, and ε > 0. We showed that, in the situation
where ε is very small, and θ has a stationary point at x = x0, this
integral is approximately:

I(ε) = f(x0) e^{iθ(x0)/ε} e^{i(π/4) sign[θ″(x0)]} √(2πε/|θ″(x0)|) [1 + O(ε)].   (316)

If there is more than one stationary point, then the contributions are
to be summed.
To get a little practice applying this method, evaluate the following
integral for large t:
J(t) = ∫_0^1 cos[t(x³ − x)] dx.   (317)

5. I suggested in section 4.3 that you consider the classical correspondence


for the time delay (or advance) of the asymptotic motion due to scatter-
ing on a potential. Let us pursue this here. Consider one-dimensional
motion. A particle of mass m is incident from the left on a potential:

V(x) = −K for x ∈ (−∆/2, ∆/2);  0 otherwise.   (318)
We wish to solve for the motion for large x at large times.
(a) Let’s do the quantum mechanics calculation first. Suppose that
our momentum space wave function at early time is a gaussian
wave packet:
" #1/2
1 1 p−q 2
ψ̂(p) = √ e− 2 ( σ ) . (319)
2πσ
What is ψ(x, t) for large times and large x? Describe the motion,
relative to what it would be if K = 0.
(b) Now do the same problem classically. That is, solve for the motion
at large times and large x. Again, compare the result with what
it would be for K = 0. Contrast with the quantum result.
6. We have solved the Schrödinger equation for the Hydrogen atom with
Hamiltonian:
p2 e2
H0 = − .
2m r
The kinetic energy term is non-relativistic – the actual kinetic energy
will have relativistic corrections.
(a) Obtain an expression for the next order relativistic (kinetic en-
ergy) correction to the energy spectrum of hydrogen. It is con-
venient to avoid taking multiple derivatives by using the unper-
turbed Schrödinger equation to eliminate them. Thus, write your

expression in terms of the unperturbed energies and expectation
values of e²/r and (e²/r)². Do not actually do the integration over
r here, but reduce the problem to such integrals. Make sure you
understand all of your steps.
(b) Now apply your formula to obtain the first-order relativistic ki-
netic energy correction to the ground state energy of hydrogen.
Express your answer as a multiple of the unperturbed ground state
energy, and also calculate the size of the correction in eV.

7. Let us consider an example of the use of degenerate stationary state


perturbation theory. Thus, let us take the hydrogen atom, with unper-
turbed Hamiltonian H0 = P²/2m − α/r, and consider the effect of putting this
atom in a uniform external electric field: E = Eêz . We are interested
in calculating, to first order in perturbation theory, the shifts in the
n = 2 energy levels. Note that the n = 2 level is four-fold degenerate,
corresponding to the eigenstates: |2S0 i, |2P1 i, |2P0 i, |2P−1 i, neglecting
spins.

(a) Write down the perturbing potential, V . [Note that we need only
consider the electron’s coordinates, relative to the nucleus – why?]
Calculate the commutator [V, Lz ], and hence determine the matrix
elements of V between states with different eigenvalues of Lz .
(b) You should have found a “selection rule” which simplifies the prob-
lem. What is the degeneracy that needs to be addressed in the
problem now that you have made this calculation?
(c) Using the invariance of the hydrogen atom Hamiltonian under
parity, write down the remaining matrix elements of V which need
to be determined, and compute their values.
(d) Now complete your degenerate perturbation theory calculation to
determine the splitting of the states in the applied electric field.
Calculate numerical splittings (in eV) for an applied field of 100
kV/cm. Also, estimate the “typical” electric field felt by the elec-
tron, due to the nucleus, in a hydrogen atom. Was the use of
perturbation theory reasonable for this problem?

8. It may happen that we encounter a situation where the eigenvalues of


H0 , call them εn and εm , are nearly, but not quite equal. In this case,

we cannot use degenerate perturbation theory, and ordinary perturba-
tion theory looks unreliable. Let us try to deal with such a situation:
Suppose the two eigenstates |ni and |mi of H0 have nearly the same en-
ergy (and all other eigenstates don’t suffer this disease, for simplicity).
Let H = H0 + V, and write

V = Σ_{i,j} |i⟩⟨i|V|j⟩⟨j|   (320)
H0 |i⟩ = εi |i⟩,   (321)

where

⟨i|j⟩ = δij.   (322)


Let

V = V1 + V2,   (323)

with

V1 ≡ |m⟩⟨m|V|m⟩⟨m| + |n⟩⟨n|V|n⟩⟨n|   (324)
     + |m⟩⟨m|V|n⟩⟨n| + |n⟩⟨n|V|m⟩⟨m|,   (325)
and V2 is everything else.
If we can solve exactly the problem with H1 = H0 + V1 , then the trou-
blesome 1/(εn − εm ) terms are avoided by the exact treatment, and we
may treat V2 as a perturbation in ordinary perturbation theory (since
hi|V2 |ji = 0 for i, j = n, m). All states |ii, i 6= n, m, are eigenstates
of H1 , since V1 |ii = 0 in this case. However, |ni and |mi are not in
general eigenstates of H1 .
(a) Solve exactly for the eigenstates and eigenvalues of H1 , in the
subspace spanned by |ni, |mi. Express your answer in terms of
εn, εm, ⟨m|V|n⟩, ⟨n|V|n⟩, ⟨m|V|m⟩.
(You may also use the shorthand
E^{(1)}_{n,m} = ε_{n,m} + ⟨n, m|V|n, m⟩
if you find it convenient.)

(b) As an application, consider an electron in a weak one-dimensional
periodic potential (“lattice”) V(x) = V(x + d). Assume the lat-
tice has a size L = Nd, and that we have periodic boundary
conditions on our wave functions: ψ(x) = ψ(x + L). With this
boundary condition, the unperturbed wave functions are plane
waves, ψp(x) = (1/√L) e^{ipx}, where p = 2πn/L, n = integer, and the
unperturbed eigenenergies are εn = p²/2m = (2πn/L)²(1/2m). We expand
the potential in a Fourier series:

V(x) = Σ_{n=−∞}^{∞} e^{in2πx/d} Vn.

If we label our eigenfunctions by |p⟩ = (1/√L) e^{2πi n_p x/L}, determine all
nonvanishing matrix elements of V:

⟨q|V|p⟩.

Express your answer in terms of Vn.


(c) Suppose εnp and εnq are not close to each other ∀nq , given some
np . Calculate the perturbed wave function in ordinary first order
perturbation theory corresponding to unperturbed wave function
ψp (x). Also, calculate the energy to 2nd order. Express your
answer in terms of Vn and the unperturbed energies.
(d) What is the condition on np (and hence on p) so that |pi will be
nearly degenerate in energy with another eigenstate of H0 ?
(e) Assume that the condition in (d) exists, and use part (a) to solve
this “almost degenerate” case for the eigenenergies. Complete the
graph in Fig. 8 for higher values of |p|.

9. When we calculated the density of states for a free particle, we used a


“box” of length L (in one dimension), and imposed periodic boundary
conditions to ensure no net flux of particles into or out of the box.
We have in mind that we can eventually let L → ∞, and are really
interested in quantities per unit length (or volume). Let us justify
more carefully the use of periodic boundary conditions, i.e., we wish to
convince ourselves that the intuitive rationale given above is correct.
To do this, consider a free particle in a one-dimensional “box” from

Figure 8: Energy versus momentum for the one-dimensional lattice problem
(6).

−L/2 to L/2. Remembering that the Hilbert space of allowed states is


a linear space, show that the periodic boundary condition:
ψ(−L/2) = ψ(L/2), (326)
ψ 0 (−L/2) = ψ 0 (L/2) (327)
gives acceptable wave functions. “Acceptable” here means that the
probability to find a particle in the box must be constant. Are there
other acceptable choices?
10. See if you can generalize the result for the first Born approximation:
dσ/dΩ′ = [m²/(2π)²] |V̂(p′ − p)|²   (328)
to the case where the scattered particle (mass mf ) may have a different
mass than the incident particle (mass mi ).
11. We consider the potential (called the “Yukawa potential”):
V(x) = K e^{−µr}/r,  r = |x|,
with real parameters K and µ > 0. The parameter K can be regarded
as the “strength” of the potential (“interaction”), and µ1 is effectively
the “range” of distance over which the potential is important. µ itself
has units of mass – note that as µ → 0 we obtain the Coulomb potential:
µ can be thought of as the mass of an “exchanged particle” which
mediates the force. In electromagnetism, this is the photon, hence
µ → mγ = 0

(a) Find a condition on K and µ which guarantees that there are at


least n bound states in this potential. You will likely fashion and
use some kind of “comparison” theorem in arriving at your result.
You should give at least a “heuristically convincing” argument, if
you don’t actually prove it.
(b) Using the Born approximation for the differential cross section
that we developed in our discussion of time-dependent perturba-

tion theory, calculate the differential cross section, dΩ , for scatter-
ing on this potential. Consider the limit µ → 0 and compare with
the Coulomb differential cross section we obtained in the notes.
(c) Integrate your differential cross section over all solid angles to
obtain the “total cross section”. Again, consider the limit µ → 0.
Hence, what is the total cross section for scattering on a Coulomb
potential?

53
Physics 125c
Course Notes
Approximate Methods
Solutions to Problems
040415 F. Porter

1 Exercises
1. Prove the theorem quoted in section ??:
Theorem: If we have a normalized function |ψi such that
E0 ≤ hψ|H|ψi ≤ E1 , (1)
then
hHψ|Hψi − hψ|H|ψi2
E0 ≥ hψ|H|ψi − . (2)
E1 − hψ|H|ψi
Solution: The theorem is equivalent to the statement
hψ|H 2 ψi − hψ|H|ψi2 ≥ (hψ|H|ψi − E0 )(E1 − hψ|H|ψi). (3)
Notice that if we add a constant A to H, obtaining H 0 = H + A (hence
also En → En0 = En + A), both sides of this inequality are unaltered.
The left hand side is a measure of the width of the energy distribution
and is not altered by shifting the energy scale. Likewise, the right hand
side only depends on energy differences. Thus, as long as the spectrum
of H is bounded below, the problem is equivalent to a problem where
the spectrum is non-negative. In particular, we may simplify by taking
E0 = 0.
Hence, consider:
hψ|H 2 ψi − hψ|H|ψi2 − hψ|H|ψi(E1 − hψ|H|ψi)
= hψ|H 2 ψi − E1 hψ|H|ψi (4)
X X
= |cn |2 En2 − E1 |cn |2 En (5)
n=0 n=0
X
2
= |cn | En (En − E1 ) (6)
n=0
X
= |cn |2 En (En − E1 ), since E0 = 0, (7)
n=1
≥ 0, since each term in the sum is non-negative.

1
2. Let us pursue our variational approach to the estimation of ground state
energy levels of atoms to the “general” case. We consider an atom with
nuclear charge Z, and N electrons. The Hamiltonian of interest is:
H(Z, N ) = Hkin − ZVc + Ve (8)
N
X p2n
Hkin = , (9)
n=1 2m
where (10)
N
X 1
Vc = α (11)
n=1 |xn |
X 1
Ve = α (12)
N ≥j>k≥1|xk − xj |
m = electron mass (13)
α = fine structure constant. (14)
Denote the ground state energy of H(Z, N ) by −B(Z, N ), with B(Z, 0) =
0.

(a) Generalize the variational calculation we performed for the ground


state of helium to the general Hamiltonian H(Z, N ). Thus, se-
lect your “trial function” to be a product of N identical “hydro-
gen atom ground state” functions. Determine the resulting lower
bound B̂(Z, N ) on B(Z, N ) (i.e., an upper limit on the ground
state energies).
Solution: Let Ze be the fixed nuclear charge in the Hamiltonian,
and let z be the effective Z variational parameter. The trial wave
function we are told to use is thus:
s
N
Y z 3 − az (rn ) 1
ψZN (x1 , . . . , xN ) = e 0 , a0 = . (15)
n=1 πa30 mα
The expectation value of the total kinetic energy for this trial
function is:
p2 1
Hkin = N hψ| 1 |ψi = N z 2 mα2 . (16)
2m 2
The expectation value of the potential energy of the electrons in
the nuclear electric field is:
−ZVc = −N Zzmα2 . (17)

2
The expectation value of the potential energy of the electrons in
the fields of the other electrons is:
N (N − 1) 1 5
Ve = z mα2 . (18)
2 24
Putting these terms together, we have
 
1 5
hH(Z, N )iz = mα2 N z z − 2Z + (N − 1) (19)
2 8
We minimize with respect to z:
 
d 2 5 5
0= z − 2Zz + (N − 1)z = 2z − 2Z + (N − 1). (20)
dz 8 8
Thus, the minimum occurs at
5
z=Z− (N − 1) (21)
16
The variational bound on the (negative of the) ground state ener-
gies is then:
 2 
1 2 5
B̂(Z, N ) = −hH(Z, N )imin = mα N Z − (N − 1) . (22)
2 16
(b) Make a simple table comparing your variational bounds with the
observed ground state energies for lithium, beryllium, and nitro-
gen. Note that a simple web search for “ionization potentials” will
get you a multitude of tables of observed values, or you can look
at a reference such as the CRC Press’s Handbook of Chemistry
and Physics. The table entries are typically of the form:
B(Z, N ) − B(Z, N − 1).

Solution: A Google search on “ionization potentials” results in


many suitable hits, including:
http://www.chemistrycoach.com/ionization potentials f.htm
The ionization potentials for lithium, beryllium, and nitrogen are
reproduced in Table 2b.

The predicted bounds on the energies, according to Eqn. 22, are


compared with the observed values in Table 2b.

3
Ionization Potentials for Lithium, Beryllium, and Nitrogen
(from http://www.chemistrycoach.com/ionization potentials f.htm)
Lithium Beryllium Nitrogen
1st I.P. 5.4 9.3 14.5
2nd I.P. 75.6 18.2 29.6
3rd I.P. 122 154 47.5
4th I.P. 218 77.5
5th I.P. 97.9
6th I.P. 552
7th I.P. 667

Comparison of Variational Prediction with Measured Energies


Lithium Beryllium Nitrogen
N Pred. Meas. Pred. Meas. Pred. Meas.
1 122.5 122 217.7 218 666.7 667
2 196.5 198 370.0 372 1217.0 1219
3 230.2 203 464.9 390 1658.8 1317
4 510.4 400 2000.2 1394
5 2249.2 1442
6 2413.6 1472
7 2501.5 1486

4
(c) Do your results make sense? If not, can you figure out what is
wrong, and whether the calculation we did for He is to be trusted?

Solution: Note that for N = 1, the calculation is “exact”, up to the


approximations made and effects neglected. The differences between
predicted and observed values for N = 1 are an indication of the un-
certainty due to the neglect in these matters, and possibly experimental
uncertainties.
We see that for N > 2, the computed bounds are always violated by the
data. The trial wave function for N > 2 is not properly antisymmetric
under interchange of the electrons, hence is not in the Hilbert space.
The calculation for He is still all right, since the electron has a spin
degree of freedom – A symmetric spatial wave function for two electrons
is permitted, as the spin wave function can be antisymmetric.
The computation for N = 2 should be all right, and we see that the
bound is always on the side it is supposed to be. Indeed, we do rather
well, always getting within a percent of the actual energy.
3. We consider the quantum mechanics of a particle in the earth’s gravi-
tational field:
Mm
V (r) = −G (23)
r
Mm
= −G (24)
R+z
Mm
≈ −G + mgz (25)
R
where (26)
M = mass of earth (27)
m = mass of particle (28)
r = distance from center of earth (29)
G = Newton’s gravitational constant (30)
R = radius of earth (31)
z = height of particle above surface of earth (32)
2
g = GM/R . (33)
We may drop the constant term in our discussion, and consider only the
mgz piece, with z  R. We further assume that no angular momen-

5
tum is involved, and treat this as a one dimensional problem. Finally,
assume that the particle is unable to penetrate the earth’s surface.

(a) Make a WKB calculation for the energy spectrum of the particle.
Solution: The potential is

V (z) = mgz. (34)

We need the turning points z1 and z2 :

V (z1 ) = V (z2 ) = E (35)

In this case, since we hit the ground at z = 0, and can go no


further, z1 = 0. The other turning point is at
E
z2 = . (36)
mg

Now we compute the function:


Z z2 (En ) q
f (En ) = 2m [En − V (z)]dz (37)
z1 (En )
Z En /mg q
= 2m(En − mgz)dz (38)
0
 
1
= n+ π. (39)
2
That is,
  q Z
1 En 1 √
n+ π = 2mEn ydy (40)
2 mg 0
s
2 2 3/2
= E . (41)
3g m n
Solving for En , we obtain the estimated bound state energy spec-
trum: !1  2
9π 2 2
3
1 3
En = mg n+ . (42)
8 2

6
(b) If the particle is an atom of atomic weight A ∼ 100, use the result
of part (a) to estimate the particle’s ground state energy (in eV).
Is sunlight likely to move the particle into excited states?
Solution: If A = 100, then m ∼ 100 × 109 eV. Also,
g ∼ 10m/s2 (43)
1
∼ 10m/s2 × × 200MeV-fm × 10−15 m/fm
(3 × 108 m/s)2
∼ 2 × 10−23 eV. (44)
The ground state energy is (n = 0):
!1  2
9π 2 3
1 3
E0 = mg 2 (45)
8 2
−12
∼ 10 eV. (46)
Since photons in sunlight have energies of order eV, they will read-
ily excite such atoms into highly excited states in the gravitational
potential.
(c) Now make a variational calculation for the ground state energy
(i.e., an upper bound thereon). Pick a “sensible” trial wave func-
tion, at least in the sense that it satisfies the right boundary con-
ditions. Compare your result with the ground state level from the
WKB approximation.
Solution: The wave function must vanish at z = 0 and at z = ∞.
A simple trial function which satisfies these boundary conditions
is:
z
ψ(z) = √ e−z/2R , (47)
2R 3

where the variational parameter is R.


We must evaluate the expectation value of the Hamiltonian for
our trial wave function. The kinetic energy part is:
Z ∞ !
z −z/2R 1 d2 z
hT i = √ e − 2
√ e−z/2R dz (48)
0 2R 3 2m dz 2R 3
Z ∞ !
2
1 1 z
= 3
z− e−z/R dz (49)
4mR 0 R 4R
1
= . (50)
8mR2
7
The potential energy part is:
Z ∞
z z
hV i = e−z/2R mgz √
√ e−z/2R dz (51)
0 2R 3 2R3
= 3mgR. (52)
1
Thus, we wish to minimize the quantity 3mgR+ 8mR 2 as a function

of R. The minimum occurs at


!1
1 3
R= . (53)
12m2 g
Thus, the variational bound on the ground state energy is
" 5 #1/3
3
E0 ≤ mg 2 . (54)
2
We note that this bound is slightly larger than the WKB estimate:
" 5 #1/3 !1/3  1/3
3 2 9π 2 27
mg / mg 2 = . (55)
2 32 π2

4. We discussed the method of stationary phase in section ??. Recall that


the problem it addresses is to evaluate integrals of the form:
Z ∞
I() = f (x)eiθ(x)/ dx, (56)
−∞

where f and θ are real, and  > 0. We showed that, in the situation
where  is very small, and θ has a stationary point at x = x0 , this
integral is approximately:
s
√ iθ(x0 )/ i π4 sign[θ 00 (x0 )] 2π
I() = f (x0 )e e [1 + O()] . (57)
|θ00 (x 0 )|

If there is more than one stationary point, then the contributions are
to be summed.
To get a little practice applying this method, evaluate the following
integral for large t:
Z 1 h i
J(t) = cos t(x3 − x) dx. (58)
0

8
Solution: To start to get it into the desired form, write
Z 1
ei(t(x ) dx.
3 −x)
J(t) = < (59)
0

Thus, f (x) = 1, θ(x) = x3 − x, and  = 1/t. The first two derivatives


are θ0 (x) √
= 3x2 − 1 and θ00 (x) = 6x. √ The first derivative is zero at
x = ±1/ 3. The zero at x0 = 1/ 3 falls within the range of the
integral, so this is the only stationary point
√ of interest. The value of θ
at the stationary point is θ(x0 ) =√−2/3 3. The second derivative at
the stationary point is θ00 (x0 ) = 2 3.
Plugging into our stationary phase formula, we get:
s
1 − 2it
√ 2π
J(t) ≈ < √ e 3 3 eiπ/4 √ (60)
t 2 3
s  
π i π − 2t

= < √ e 4 3 3 (61)
t 3
s !
π π 2t
= √ cos − √ . (62)
t 3 4 3 3

5. I suggested in section ?? that you consider the classical correspondence


for the time delay (or advance) of the asymptotic motion due to scatter-
ing on a potential. Let us pursue this here. Consider one-dimensional
motion. A particle of mass m is incident from the left on a potential:

V (x) = −K x ∈ (−∆/2, ∆/2) (63)
0 otherwise.
We wish to solve for the motion for large x at large times.

(a) Let’s do the quantum mechanics calculation first. Suppose that


our momentum space wave function at early time is a gaussian
wave packet:
" #1/2
1 − 12 ( p−q )
2
ψ̂(p) = √ e σ . (64)
2πσ
What is ψ(x, t) for large times and large x? Describe the motion,
relative to what it would be if K = 0.

9
(b) Now do the same problem classically. That is, solve for the motion
at large times and large x. Again, compare the result with what
it would be for K = 0. Contrast with the quantum result.

6. We have solved the Schrödinger equation for the Hydrogen atom with
Hamiltonian:
p2 e2
H0 = − .
2m r
The kinetic energy term is non-relativistic – the actual kinetic energy
will have relativistic corrections.

(a) Obtain an expression for the next order relativistic (kinetic en-
ergy) correction to the energy spectrum of hydrogen. It is con-
venient to avoid taking multiple derivatives by using the unper-
turbed Schrödinger equation to eliminate them. Thus, write your
expression in terms of the unperturbed energies and expectation
2 2
values of er and ( er )2 . Do not actually do the integration over
r here, but reduce the problem to such integrals. Make sure you
understand all of your steps.
Solution: The relativistic kinetic energy is
q
T = p2 + m2 − m (65)
q 
= m 1 + (p/m)2 − 1 (66)
 
1 1
= m (p/m)2 − (p/m)4 + O((p/m)6 ) . (67)
2 8
Thus, the next order relativistic correction to H0 is

1 p4
Hr = − . (68)
8 m3
Following the hint, notice that p4 = 4m2 (H0 − V )2 , where V =
−e2 /r. Thus, the perturbation Hamiltonian may be written
1 1  2 
Hr = − (H0 − V )2 = − H0 − H0 V − V H0 + V 2 . (69)
2m 2m
To determine how the energy levels change in first order pertur-
bation theory, we take the expectation value of Hr with respect

10
to the unperturbed stationary state wave functions:
Er = hψn`m |Hr |ψn`m i (70)
1
= − hψn`m |H02 − H0 V − V H0 + V 2 |ψn`m i (71)
2m  
1 2 2 1 4 1
= − E + 2En e h i + e h 2 i (72)
2m n r r
 
1 2 2 1 4 1
= − E + 2En e h i + e h 2 i . (73)
2m n r r
(b) Now apply your formula to obtain the first-order relativistic ki-
netic energy correction to the ground state energy of hydrogen.
Express your answer as a multiple of the unperturbed ground state
energy, and also calculate the size of the correction in eV.
Solution: The ground state wave function of hydrogen is:
2
ψ100 (x) = q e−r/a0 (74)
3
4πa0
As there is no dependence anywhere on the angular coordinates,
we are interested in the radial wave function
2
R10 (r) = q e−r/a0 . (75)
a30
Let us evaluate the integral:
Z ∞
Ik ≡ xk e−x dx, k≥0 (76)
0
Z ∞
dk
= lim (−)k e−ax dx (77)
a→1 dxk 0
= k!. (78)
Thus, in the hydrogen atom ground state:
Z
1 4 ∞ 1 −2r/a0 2
h ki = 3 e r dr (79)
r a0 0 rk
 
1 2 k
= (2 − k)!, k ≤ 2 (80)
2 a0
1 1
h i = (81)
r a0
1 2
h 2i = 2 (82)
r a0

11
Thus, the first-order relativistic energy correction to the ground
state enrgy of hydrogen is (noting that E0 = − 12 mα2 is the un-
perturbed ground state energy, and that a0 = 1/mα):
!
1 1 2
Er = − E02 + 2E0 e2 + e4 2 (83)
2m a0 a0
 
1 1 2 4
= − m α − m2 α4 + 2m2 α4 (84)
2m 4
 
5 2 1 5
= − α mα2 = α2 E0 (85)
4 2 4
= 0.91 meV. (86)

7. Let us consider an example of the use of degenerate stationary state


perturbation theory. Thus, let us take the hydrogen atom, with unper-
P2
turbed Hamiltonian H0 = 2m − αr , and consider the effect of putting this
atom in a uniform external electric field: E = Eêz . We are interested
in calculating, to first order in perturbation theory, the shifts in the
n = 2 energy levels. Note that the n = 2 level is four-fold degenerate,
corresponding to the eigenstates: |2S0 i, |2P1 i, |2P0 i, |2P−1 i, neglecting
spins.

(a) Write down the perturbing potential, V . [Note that we need only
consider the electron’s coordinates, relative to the nucleus – why?]
Calculate the commutator [V, Lz ], and hence determine the matrix
elements of V between states with different eigenvalues of Lz .
Solution: We are interested in computing the shift in the n = 2
energy levels (note that n is the principal quantum number here).
These energy levels are computed with the center-of-mass motion
separated out. Since the hydrogen atom is neutral, there is no
effect on the center-of-mass motion of turning on this electric field.
Hence, we only need consider the relative motion between the
electron and the nucleus. Effectively, we have an electric dipole
interacting with the electric field. The perturbing potential is

V (x) = −eEz = −eEr cos θ. (87)

The commutator [V, Lz ] = −eE[z, xpy − ypx ] = 0. Hence,

0 = h`0 m0 |[V, Lz ]|`mi = −eE(m − m0 )h`0 m0 |z|`mi. (88)

12
Thus, h`0 m0 |V |`mi = δmm0 h`0 m|V |`mi, that is, the matrix element
is zero unless m = m0 .
(b) You should have found a “selection rule” which simplifies the prob-
lem. What is the degeneracy that needs to be addressed in the
problem now that you have made this calculation?
Solution: The degeneracy that concerns us is the one between
states with like eigenvalues of Lz . These are the states |2S0 i and
|2P0 i.
(c) Using the invariance of the hydrogen atom Hamiltonian under
parity, write down the remaining matrix elements of V which need
to be determined, and compute their values.
Solution: Because V = eEr cos θ is odd under parity, the ex-
pectation value of V is zero between states of like parity. As the
unperturbed Hamiltonian commutes with parity, its eigenstates
may be expressed as eigenstates of parity. In particular, the P
states have odd parity, and the S states have even parity. Hence
the only (potentially) non-zero matrix elements of V in the four-
dimensional subspace we are considering here are h2S0 |V |2P0 i and
h2P0 |V |2S0 i. Since our wave functions are real (by convention),
these two matrix elements are equal.
The hydrogenic wave functions we need are (from solutions to
problem 40):
1 1 1
ψ200 = √ √ 3/2 (2 − r/a0 )e−r/2a0 (89)
4π 2 2 a0
s
3 1 1 r
ψ210 = cos θ √ 3/2 2 e−r/2a0 . (90)
4π 4 6 a0 a0

Recall also, from the solution to exercise 4:


Z ∞
Ik ≡ xk e−x dx, k≥0 (91)
0
= k!. (92)

Thus, the desired matrix element is:


√ Z Z 1
3 2π
h2P0 |V |2S0 i = −eE dφ cos2 θd cos θ
4π 0 −1

13
1 1 Z∞ 2 r
√ 3 r drr(2 − r/a0 ) e−r/a0 (93)
8 3 a0 0 a0
a0 Z ∞ 4
= −eE x (2 − x)e−x dx (94)
3·8 0
a0
= −eE (2 · 4! − 5!) (95)
3·8
= 3eEa0 . (96)
The perturbing Hamiltonian may thus be written, in the |2S0 i, |2P1 i, |2P0 i, |2P−1 i
basis:  
0 0 1 0
0 0 0 0
 
V = 3eEa0  . (97)
1 0 0 0
0 0 0 0
(d) Now complete your degenerate perturbation theory calculation to
determine the splitting of the states in the applied electric field.
Calculate numerical splittings (in eV) for an applied field of 100
kV/cm. Also, estimate the “typical” electric field felt by the elec-
tron, due to the nucleus, in a hydrogen atom. Was the use of
perturbation theory reasonable for this problem?
Solution: The eigenstates of the perturbing Hamiltonian are:
1
√ (|2S0 i + |2P0 i) (98)
2
1
√ (|2S0 i − |2P0 i) (99)
2
|2P1 i (100)
|2P1 i, (101)
with eigenvalues 3eEa0 , −3eEa0 , 0, 0, respectively. In a 100
kV/cm field,
3eEa0 ∼ 3 × 100 keV/cm0.5 × 10−8 cm (102)
= 1.5 × 10−3 eV. (103)
The “typical” electric field felt by the electron in the field of the
proton is:
e2 /r 2 × −13.6 eV
Ep ∼ = (104)
er e × 0.5 × 10−8 cm
9
∼ 5 × 10 kV/cm. (105)

14
The use of perturbation theory for this problem appears justified.

8. It may happen that we encounter a situation where the eigenvalues of


H0 , call them εn and εm , are nearly, but not quite equal. In this case,
we cannot use degenerate perturbation theory, and ordinary perturba-
tion theory looks unreliable. Let us try to deal with such a situation:
Suppose the two eigenstates |ni and |mi of H0 have nearly the same en-
ergy (and all other eigenstates don’t suffer this disease, for simplicity).
Let H = H0 + V , and write
X
V = |iihi|V |jihj| (106)
i,j
H0 |ii = εi |ii, (107)
where

hi|ji = δij· (108)


Let

V = V1 + V2 , (109)
with

V1 ≡ |mihm|V |mihm| + |nihn|V |nihn| + (110)


+|mihm|V |nihn| + |nihn|V |mihm| (111)

and V2 is everything else.


If we can solve exactly the problem with H1 = H0 + V1 , then the trou-
blesome 1/(εn − εm ) terms are avoided by the exact treatment, and we
may treat V2 as a perturbation in ordinary perturbation theory (since
hi|V2 |ji = 0 for i, j = n, m). All states |ii, i 6= n, m, are eigenstates
of H1 , since V1 |ii = 0 in this case. However, |ni and |mi are not in
general eigenstates of H1 .

(a) Solve exactly for the eigenstates and eigenvalues of H1 , in the


subspace spanned by |ni, |mi. Express your answer in terms of

εn , εm , hm|V |ni, hn|V |ni, hm|V |mi.

15
(You may also use the shorthand
(1)
En,m = εn,m + hn, m|V |n, mi
if you find it convenient.)
Solution: A vector |ii in our restricted subspace is of the form:
|ii = α|mi + β|ni, (112)
where |mi and |ni are eigenstates of H0 with eigenvalues εm and
εn , respectively. Schrödinger’s equation with Hamiltonian H1 =
H0 + V1 is:
    
εm + hm|V |mi hm|V |ni α α
=E . (113)
hn|V |mi εn + hn|V |ni β β
We find eigenvalues E by taking the determinant:

εm + hm|V |mi − E hm|V |ni
= 0. (114)
hn|V1 |mi εn + hn|V |ni − E
Letting Vij ≡ hi|V |ji = Vji∗ , we find eigenvalues E± :
 q 
1 (1) (1) (1)
E± = E + En(1) ± (Em − En )2 + 4|Vmn |2 . (115)
2 m
The corresponding eigenvectors are determined by:
(1)
Em α± + Vmn β± = E± α± (116)
(1)
En β± + Vnm α± = E± β± (117)
Hence,
(1)
E± − Em
β± = α± . (118)
Vmn
Imposing the normalization |α± |2 + |β± |2 = 1, and choosing α±
real and positive, we have:
v
u (1)
u (E± − Em )2
t
α± = 1/ 1 + (119)
|Vmn |2
v !2
u
u Vmn 2
Vmn
β± = 1/t + (1)
(120)
|Vmn | (E± − Em )2
The eigenvectors are thus,
|±i = α± |mi + β± |ni. (121)

16
(b) As an application, consider an electron in a weak one-dimensional
periodic potential (“lattice”) V (x) = V (x + d). Assume the lat-
tice has a size L = N d, and that we the have periodic boundary
condition on our wave functions: ψ(x) = ψ(x + L). With this
boundary condition, the unperturbed wave functions are plane
waves, ψp (x) = √1L eipx , where p = 2πn/L, n=integer, and the
 2
p2 2πn 1
unperturbed eigenenergies are εn = 2m
= L 2m
. We expand
the potential in a Fourier series:

X
V (x) = ein2πx/d Vn
n=−∞

If we label our eigenfunctions by |pi = √1 e2πinp x/L , determine all


L
nonvanishing matrix elements of V :

hq|V |pi

Express your answer in terms of Vn .


Solution:

X Z L
1
hq|V |pi = Vn ein2πx/d √ e2πi(np −nq )x/L dx (122)
n=−∞ 0 L
X∞ Z 1
= Vn ei2π(np −nq +N n)u du (123)
n=−∞ 0

X∞
= Vn δN n,(nq −np ) . (124)
n=−∞

Thus, hq|V |pi = 6 0 if and only if (nq − np )/N is an integer, and


V(nq −np )/N 6= 0.
(c) Suppose εnp and εnq are not close to each other ∀nq , given some
np . Calculate the perturbed wave function in ordinary first order
perturbation theory corresponding to unperturbed wave function
ψp (x). Also, calculate the energy to 2nd order. Express your
answer in terms of Vn and the unperturbed energies.
Solution: The first order perturbed wave function is:
X hq|V |pi
Ψp (x) = ψp (x) + |qi . (125)
q6=p εnp − εnq

17
2π 2
2 2
Now εnp − εnq = mL 2 (np − nq ), and hq|V |pi = 0 unless nq =

np + Nn. Thus, letting |P i = Ψp (x):


X Vn
|P i = |pi + |np + nNi (126)
nq =np +nN εnp − εnq

X Vn
|P i = |pi + |np + nNi . (127)
n=−∞,6=0
εnp − εnp +nN
Alternatively,

X Vn
|P i = |pi + |np + nNi 2π 2
(128)
n=−∞,6=0 mL2
(n2p − n2q )
2 X∞
md Vn
= |pi + |np + nNi (129)
2π 2 n=−∞,6=0 (n2p − n2q )/N 2
2 X∞
md Vn
= |pi − |np + nNi (130)
2π 2 n=−∞,6=0
n(n + 2np /N )
2 X∞
md Vn
= |pi − |np + nNi (131)
2π 2 n=−∞,6=0 n(n + 2np /N )
 

md2 X Vn
= |pi 1 − e−i2πnp x/L 2 exp i2π(np + nN)x/L
2π n=−∞,6=0 n(n + 2np /N )
 

md2 X Vn
= |pi 1 − exp i2πnx/d . (132)
2π 2 n=−∞,6=0 n(n + 2np /N )

The first order energy correction is


Ep1 = hp|V |pi = V0 . (133)
The second order energy correction is
X |hq|V |pi|2 X |Vn |2
Ep2 = = (134)
q6=p εnp − εnq q6=p εnp − εnq

X md2 |Vn |2
= − 2 n(n + 2n /N )
. (135)
n=−∞,6=0
2π p

Thus, the energy to second order is:



md2 X |Vn |2
Ep = εnp + V0 − (136)
2π 2 n=−∞,6=0 n(n + 2np /N )

18
2π 2 n2p md2 ∞
X |Vn |2
= + V0 − (137)
mL2 2π 2 n=−∞,6=0
n(n + 2np /N )

(d) What is the condition on np (and hence on p) so that |pi will be


nearly degenerate in energy with another eigenstate of H0 ?
Solution: From part (c), we see that the coefficient in the ex-
panison for |P i blows up when n + 2np /N = 0, i.e., at
Nn
np = − , n = ±1, ±2, . . . . (138)
2
When np takes on such a value, p = 2πnp /L = −πn/d. This is
degenerate with q = πn/d = p + 2πn/d.
(e) Assume that the condition in (d) exists, and use part (a) to solve
this “almost degenerate” case for the eigenenergies. Complete the
graph in Fig. 1 for higher values of |p|.

Ep

|V0 |

p
- π /d 0 π /d

Figure 1: Energy versus momentum for the one-dimensional lattice problem


(6).

Solution: Let |pi be nearly degenerate with |qi:


(n + )π 2πn (n − )π
p=− , q =p+ = , (139)
d d d
19
where 0 < ||  1. The nearly degenerate energies are:

π2 π2
εnp = (n + )2 , εn q = (n − )2 . (140)
2md2 2md2
The matrix elements we need are:

hp|V |pi = hq|V |qi = V0 , (141)


hp|V |qi = hq|V |pi = Vn . (142)

The energies, using part (a), are


1h q i
E± = εnp + V0 + εnq + V0 ± (εnp − εnq )2 + 4|Vn |2 (143)
2 v
u !2
π2 u π2n
2 2 t
= 2
(n +  ) + V 0 ± 2 + |Vn |2 . (144)
2md md2

The difference in energy between these two values is


v
u !2
u π2n
t
E+ − E1 = 2 2 + |Vn |2 (145)
md2
!2
1 π2n
≈ 2|Vn | + 2 . (146)
|Vn | md2

9. When we calculated the density of states for a free particle, we used a


“box” of length L (in one dimension), and imposed periodic boundary
conditions to ensure no net flux of particles into or out of the box.
We have in mind that we can eventually let L → ∞, and are really
interested in quantities per unit length (or volume). Let us justify
more carefully the use of periodic boundary conditions, i.e., we wish to
convince ourselves that the intuitive rationale given above is correct.
To do this, consider a free particle in a one-dimensional “box” from
−L/2 to L/2. Remembering that the Hilbert space of allowed states is
a linear space, show that the periodic boundary condition:

ψ(−L/2) = ψ(L/2), (147)


ψ 0 (−L/2) = ψ 0 (L/2) (148)

20
gives acceptable wave functions. “Acceptable” here means that the
probability to find a particle in the box must be constant. Are there
other acceptable choices?
Solution: The Schrödinger equation for a free particle is
1 2
−i∂t ψ(x, t) = − ∂ ψ(x, t). (149)
2m x
We suppose that an “acceptable” wave function is one which has a
constant probability to be in the “box” (−L/2, L/2):
Z
d L/2
|ψ(x, t)|2 dx = 0. (150)
dt −L/2
It is readily verified that the function
2π 2 2π
φ(x, t) = ei mL2 t sin x (151)
L
has the desired property.
If we admit φ(x, t) as an acceptable solution, and if ψ(x, t) is any other
acceptable solution, then φ + ψ must be acceptable, since any linear
combination of acceptable solutions must be acceptable. Hence, we
must have:
Z
d L/2
|ψ(x, t)|2 dx = 0; (152)
dt −L/2
Z
d L/2
|φ(x, t)|2 dx = 0; (153)
dt −L/2
Z
d L/2
|ψ(x, t) + φ(x, t)|2 dx = 0. (154)
dt −L/2
Then we may write (assuming Eqns 152 and 153):
d Z L/2
0 = [ψ(x, t)φ∗ (x, t) + ψ ∗ (x, t)φ(x, t)] dx (155)
dt −L/2
Z L/2
= ∂t [ψ(x, t)φ∗ (x, t) + ψ ∗ (x, t)φ(x, t)] dx (156)
−L/2
i Z L/2 h 2  ∗       i
= ∂x ψ φ − ψ ∂x2 φ + ψ ∗ ∂x2 φ − ∂x2 ψ ∗ φ dx
(157)
2m −L/2
Z L/2
= ∂x [(∂x ψ) φ∗ − ψ (∂x φ) + ψ ∗ (∂x φ) − (∂x ψ ∗ ) φ] dx (158)
−L/2
L/2
= [(∂x ψ) φ∗ − ψ (∂x φ) + ψ ∗ (∂x φ) − (∂x ψ ∗ ) φ]−L/2 . (159)

21
But φ(±L/2, t) = 0, so
L/2
0 = [−ψ(∂x φ∗ ) + ψ ∗ (∂x φ)]−L/2 . (160)
Further, since
2π i 2π22 t
∂x φ(±L/2, t) = − e mL , (161)
L
we obtain
2π 2 2π 2 2π 2 2π 2
0 = ψ(L/2, t)e−i mL2 t −ψ(−L/2, t)e−i mL2 t +ψ ∗ (L/2, t)ei mL2 t −ψ ∗ (−L/2, t)ei mL2 t .
(162)

This must be true for all times; also if ψ is acceptable, then e ψ must
be acceptable, for real θ. Hence, ψ is acceptable if and only if Eqn. 152
holds, and:
ψ(L/2, t) = ψ(−L/2, t). (163)
2π 2
We note that the function ei mL2 t cos 2π
L
x satisfies these criteria. Thus,
we could also have picked

2π 2
φ(x, t) = ei mL2 t cos
x (164)
L
as an acceptable solution. Then the same argument reveals that any
other acceptable solution ψ must satisfy the boundary condition:
∂x ψ(L/2, t) = ∂x ψ(−L/2, t). (165)
 2 2π 2 2 2

in t n 2π
We finally remark that the set of functions e mL2 sin 2πn
L
x, ei mL2 t cos 2πn
L
x; n = 0, 1, . . .
is a complete set of functions with the required boundary conditions.
10. See if you can generalize the result for the first Born approximation:
dσ m2
= |V̂ (p0 − p)|2 . (166)
dΩ0 (2π)2
to the case where the scattered particle (mass mf ) may have a different
mass than the incident particle (mass mi ).
Solution:

dσ mi mf pf
0
= 2
|V̂ (pf − pi )|2 . (167)
dΩ (2π) pi

22
11. We consider the potential (called the “Yukawa potential”):
Ke−µr
V (x) = , r = |x|,
r
with real parameters K and µ > 0. The parameter K can be regarded
as the “strength” of the potential (“interaction”), and µ1 is effectively
the “range” of distance over which the potential is important. µ itself
has units of mass – note that as µ → 0 we obtain the Coulomb potential:
µ can be thought of as the mass of an “exchanged particle” which
mediates the force. In electromagnetism, this is the photon, hence
µ → mγ = 0

(a) Find a condition on K and µ which guarantees that there are at


least n bound states in this potential. You will likely fashion and
use some kind of “comparison” theorem in arriving at your result.
You should give at least a “heuristically convincing” argument, if
you don’t actually prove it.
Solution: We assume K < 0, so that the potential is attractive.
Consider !
e−µr 1
D(r) ≡ K − . (168)
r r
For r > 0, 0 < e−µr ≤ 1, hence 0 ≤ D(r). Also
dD(r) Kh i
= − 2 (1 + µr)e−µr − 1 < 0. (169)
dr r
Hence, D(r) decreases monotonically for r > 0, with a maximum
at r = 0:
lim D(r) = −Kµ. (170)
r→0

Thus, 0 ≤ D(r) ≤ −Kµ.


We may write
!
K e−µr 1
V (r) = +K − . (171)
r r r
Hence, E ≤ E(hydrogen) + (−Kµ). So, the nth level of V is a
bound state, En < 0, if

En (hydrogen) − Kµ < 0 (172)

23
mK 2
− − Kµ < 0 (173)
2n2
µ m
− < 2. (174)
K 2n

(b) Using the Born approximation for the differential cross section
that we developed in our discussion of time-dependent perturba-

tion theory, calculate the differential cross section, dΩ , for scatter-
ing on this potential. Consider the limit µ → 0 and compare with
the Coulomb differential cross section we obtained in the notes.
Solution:

4πK
V̂ (p) = . (175)
µ2 + p2
dσ m2 0 2 4m2 K 2
= |V (p − p)| =  2 . (176)
dΩ (2π)2 µ2 + 4p2 sin2 2θ

(c) Integrate your differential cross section over all solid angles to
obtain the “total cross section”. Again, consider the limit µ → 0.
Hence, what is the total cross section for scattering on a Coulomb
potential?
Solution:

16πm2 K 2
σT = . (177)
µ2 (µ2 + 4p2 )

24
Physics 125
Course Notes
Scattering
040407 F. Porter

Contents
1 Introduction 1

2 The S Matrix 2

3 The Differential Cross Section 5

4 Partial Wave Expansion 11

5 Optical Theorem 13

6 Interim Remarks 15

7 Resonances 16

8 The Phase Shift 18


8.1 Relation of Phase Shift to Logarithmic Derivative of the Radial
Wave Function . . . . . . . . . . . . . . . . . . . . . . . . . . 23
8.2 Low Energy Limit . . . . . . . . . . . . . . . . . . . . . . . . . 24
8.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

9 The Born Expansion, Born Approximation 31


9.1 Born Approximation for Phase Shifts . . . . . . . . . . . . . . 36
9.2 Born Approximation and Example of the “Soft Sphere” Potential 39

10 Angular Distributions 40

11 Exercises 44

1 Introduction
Much of what we learn in physical investigations is performed by “scattering”
objects off of one another. The quantum mechanical theory for scattering
1
processes is thus an important subject. We have already gotten a glimpse
into how to deal with scattering in our discussion of time-dependent pertur-
bation theory, and in our investigation of angular momentum. We carry this
development further in this note and derive more of the important concepts
and tools for dealing with scattering. Many of the results we obtain will in
fact be applicable to relativistic processes.
We start by developing the basic physical picture and mathematical for-
malism for a simple situation with the essential features. That is, we begin
by considering the problem of the scattering of a spinless particle of mass
m > 0 from a fixed, static, spherically symmetric center of force, centered at
the origin. Assume further that the force is of “finite range” – far enough
away, any wave packet acts as if there is no force.1 In another view, consid-
ering the spreading of the wave function, as t → ±∞, the wave packet is so
spread out that little of it is affected by the force field.

2 The S Matrix
Let φ(p, t) be the momentum space wave function of the scattering wave
packet. Assume it is normalized: hφ|φi = 1. The time dependent Schrödinger
equation is:
i∂t φ(p, t) = Hφ(p, t), (1)
√ 2
where H includes the influence of the force. Let E(p) = m + p2 , or p2 /2m
in the non-relativistic case, where p = |p|. We restrict consideration to the
subspace in which φ(p, t) is orthogonal to any bound state wave functions,
if any exist. That is, we restrict ourselves to “scattering”, in which the wave
function can be considered to be far away from the center of force at large
times.
Corresponding to φ(p, t) (an exact solution to the Schrödinger equation
for a possible scattering event), define the notion of an “initial” wave function,
φi (p), by taking the following limit at asymptotically early times:

φi (p) = lim φ(p, t)eitE(p) . (2)


t→−∞

1
The Coulomb force is thus excluded by this criterion. As seen in the exercises of
the Approximate Methods note, the cross section for the Coulomb interaction is infinite.
However, we have also seen how the Coulomb force may be treated as a limiting case.

2
That is, φi (p)e−itE(p) is a solution to the Schrödinger equation in the absence
of the scattering force. Likewise, we define a “final” wave function according
to:
φf (p) = lim φ(p, t)eitE(p) . (3)
t→∞

We should perhaps remark a bit further on the choice of phase factor in


these asymptotic limits. Put most simply, these choices just correspond to
the interaction picture. However, we may gain a bit of additional insight as
follows: Recall that for a free particle the momentum space wave function
behaves under a time translation by t0 as:
p2
M(t0 )ψ(p, t) = eit0 2m ψ(p, t). (4)

Thus, our initial asymptotic limit, Eqn. 2, is like taking

φi (p) = M (t0 → −∞)φ(p, t = 0), (5)

if the motion were strictly force-free for all times. A similar correspondence
holds for the asymptotic final state wave function. Thus, our initial and final
phases are “matched” to the t = 0 phase for the case of no force. Of course,
even with a force, φi (p) determines φ(p, t), and hence φf (p). We note that it
is convenient to deal with momentum space wave functions in this discussion
of scattering, because the coordinate space wave functions in general become
infinitely “spread out” as t → ∞.
We are couching our discussion in terms of physically realizable wave
packets here. Typically, textbooks resort to the use of (“unphysical”) plane-
wave states, which simplifies the treatment somewhat, at the expense of
requiring further justification. It is reassuring to find that we can obtain
the same results by considering only physical states. Thus, our development
follows closely the nice pedagogical treatment in Ref. [1].
We are interested in the problem of determining φf , given φi . In principle,
this requires solving the Schrödinger equation with φi specifying the initial
boundary condition. Usually, however, we do not require the explicit detailed
solution – we are interested in only certain aspects of the relation between
φi and φf , such as the force-induced phase difference.
The transformation giving φf , in terms of φi is a linear transformation,
which also preserves the normalization of the wave packet. Hence, it is a
unitary transformation:
|φf i = S|φi i, (6)

3
where
SS † = I. (7)
The unitary operator S is called the S matrix.
In general, we may represent the S matrix as an integral transform:
Z
φf (p) = d3 (q)S(p, q)φi (q), (8)
(∞)

and the unitarity condition is represented as:


Z Z
d3 (q)S(p0 , q)S † (q, p00 ) = d3 (q)S(p0 , q)S ∗ (p00 , q)
(∞) (∞)
Z
= d3 (q)S ∗ (q, p0 )S(q, p00 )
(∞)

= δ (3) (p0 − p00 ). (9)

The scattering matrix S(p, q) contains precisely all the information which is
available in an “asymptotic” experiment, as described by φi , φf , where the
details of what happens near the origin are not observed.
Consider briefly the classical analog, for the classical scattering of a parti-
cle of mass m on a static, spherically symmetric force centered at the origin.
The motion is described by canonical variables x(t) and p(t), which may be
obtained by solving the Hamilton equations of motion:

ẋi = ∂pi H (10)


ṗi = −∂xi H. (11)

The quantities x(t) and p(t) are the classical analog of φ(p, t), since they
describe the detailed behavior of the particle. As long as the force falls off
sufficiently rapidly at large distances, the motion at large distances and times
(presuming no bound states are involved) is uniform:
(
xi + pi t/m, as t → −∞,
x(t) = (12)
xf + pf t/m, as t → ∞.

Thus, we can define the “initial” and “final” asymptotic values as:
 
t
pi = lim p(t), xi = lim x(t) − pi (13)
t→−∞ t→−∞

m
t
pf = lim p(t), xf = lim x(t) − pf . (14)
t→∞ t→∞ m

4
In particular, xi , xf are (or rather, is) the positions that would occur at
t = 0 if the motion were strictly uniform for all times, with the asymptotic
momenta. Thus, (xi , pi ) is the analog of φi (p), and (xf , pf ) is the analog of
φf (p).
Once again, specification of (xi , pi ) uniquely determines (xf , pf ), and the
transformation giving this relationship is the analog of the S matrix. Note
that, in the absence of a force field,

x i = xf , p i = pf . (15)

The analogous statement in quantum mechanics is

φi (p) = φf (p), (16)

since φ(p, t) = φi (p)e−itE(p) is the solution of the Schrödinger equation for


free particle motion:

φf (p) = lim φ(p, t)eitE(p) (17)


t→∞

= lim φi (p)e−itE(p) eitE(p) , (free particle) (18)


t→∞
= φi (p). (19)

3 The Differential Cross Section


In the Approximate Methods note, we defined a differential cross section
in the discussion of time-dependent perturbation theory. We now return to
this, and develop the notion of a cross section with more care. We start with
the S matrix formalism just introduced. Given φi (p) and S, we obtain φf (p)
according to: Z
φf (p) = d3 (q)S(p, q)φi (q). (20)
(∞)

Now define the scattered wave by:

φs (p) = φf (p) − φi (p), (21)

or

|φs i = |φf i − |φi i (22)


= (S − I)|φi i. (23)

5
In the absence of a force, φf (p) = φi (p), hence S = I and φs (p) = 0. Thus,
the name “scattered wave” is appropriate for φs (p).
Since the scattering force is assumed to be stationary, energy is conserved
in the scattering event (e.g., for an infinitely heavy source of force, any finite
momentum transfer to the source yields zero kinetic energy transfer). Hence,
the asymptotic initial and final magnitudes of momenta will be equal. We
may thus write:

S(p, q) − δ (3) (p − q) = δ(p − q)T (p, q), (24)

where p = |p|, q = |q|, and we have extracted the factor δ(p − q) from the
integral transform for S − I. Thus, we have:
Z
φs (p) = d3 (q)δ(p − q)T (p, q)φi (q). (25)
(∞)

Note that T (p, q) is “physical” only for p = q (“on the energy shell”), though
it is sometimes useful to extend its domain of definition analytically “off-
shell”.
Let {φ0 (p; α) | α ∈ some index set} be a set of potential wave packets,
such as might be available in a given experimental arrangement (e.g., a beam
delivered by an accelerator). We assume that the particles in our “beam” all
have approximately the same momentum pi , i.e.,

φ0 (p; α) ≈ 0, unless p ≈ pi . (26)

Assume further that hφ0 |φ0 i = 1 for all α, and that pi is in the 3-direction:
pi = pi ê3 . The index α is a “shape parameter”, since we use it to specify
the shape of the incident wave packet of a particle in the beam. note that
different particles may have different wave packets. We consider the packet

φi (p; α; x) = φ0 (p; α)e−ix·p , (27)

obtained by translating the packet φ0 (p; α) by amount x in space (assumed


to be transverse to the three direction, x3 = 0). Fixing x and α, and taking
the packet as our initial wave, we obtain the scattered wave as:
Z
φs (p; α; x) = d3 (q)δ(p − q)T (p, q)φ0 (q; α)e−ix·q . (28)
(∞)

We are interested in the probability, P (u; α; x)dΩu , that the particle is in


the scattered wave with momentum in dΩu around unit vector u. At least if

6
u is not too close to the forward direction defined by pi , this probability is
just the probability that the particle’s momentum after scattering is in dΩu :
Z ∞
P (u; α; x) = p2 dp |φs (pu; α; x)|2 (29)
Z0 ∞ Z Z
= p2 dp d3 (q) d3 (q0 )δ(p − q)δ(p − q 0 ) (30)
0 (∞) (∞)
0
T (pu, q)T ∗ (pu, q0 )φ0 (q; α)φ∗0 (q0 ; α)e−ix·(q−q ) . (31)

We may integrate over p, setting p = q 0 :


Z Z
3
P (u; α; x) = d (q) d3 (q0 )q 2 δ(q − q 0 ) (32)
(∞) (∞)
0
T (qu, q)T ∗ (q 0 u, q0 )φ0 (q; α)φ∗0 (q0 ; α)e−ix·(q−q ) , (33)

where we have made use of the fact that q = q 0 for non-vanishing contribu-
tions.
The point of the parameters α, x is the following: We wish to make sure
that our formalism is valid for real experimental situations, i.e., that we have
not left out some essential feature in our theory for describing experimen-
tal situations. Thus, we assume that the real exdperimental “beam” is a
statistical ensemble of wave packets φ0 (p; α; x). Thus, the parameter x de-
scribes the displacement of a wave packet in this ensemble from some “ideal”
(i.e., x = 0) position, and α describes the shape of the wave packet in the
ensemble of shapes. The ensemble is described by specifying the probabil-
ity distributions for x and α, which depends on the particular experimental
arrangement. Note that we are interested here in scattering of a “one particle
beam” on a “one particle target”, i.e., correlations among particles in the
beam or in the target are assumed to be zero.
Let P (u)dΩu be the probability that a beam particle comes out in dΩu
about u. We compute P (u) by averaging P (u; α; x) over the ensemble prob-
ability distribution:
Z Z
1
P (u) = f (α)dα 2 d2 (x)P (u; α; x), (34)
{α} πR |x|≤R

where f (α) is the probability distribution for α, with


Z
f (α)dα = 1. (35)
{α}

7
The other integral is evaluated on a circular disk centered on the origin
and perpendicular to the beam direction. We have assumed here that the
probability distribution for x is uniform over the disk, and that the size of
the disk (or, size of the uniform region of the beam) is large compared with
relevant target dimensions (e.g., atomic or nuclear sizes). These assumptions
may be modified as the situation warrants.

beam
R
^
e 3

Figure 1: Illustration of a beam setup, with uniformity over a disk of radius


R.

We define an “effective” differential cross section by:


dσeff (u) = πR2 P (u)dΩu (36)
Z Z
= f (α)dα d2 (x)P (u; α; x)dΩu . (37)
{α} |x|≤R

This cross section is the differential cross section, per scattering center, which
we would observe experimentally with our given beam (e.g., the observed
event rate for a scattering is equal to dσeff times the “luminosity”). It de-
pends on the properties of our beam (i.e., the probability distributions for α
and x) and hence, is an “effective” cross section rather than a “fundamental”
cross section depending only on the interaction. We wish to define such a
fundamental cross section and relate it to the S-matrix.
Consider the limit in which R → ∞, and the momentum in the beam is
sharply defined. First, for R → ∞:
Z Z
dσeff (u) = f (α)dα d2 (x)P (u; α; x) (38)
{α} (∞)
Z Z Z Z
= f (α)dα d2 (x) d3 (q) d3 (q0 )q 2 δ(q − q 0 ) (39)
{α} (∞) (∞) (∞)

8
0
T (qu, q)T ∗ (q 0 u, q0 )φ0 (q; α)φ∗0 (q0 ; α)e−ix·(q−q ) . (40)

Noting that
Z
0
d2 (x)e−ix·(q−q ) = (2π)2 δ(q1 − q10 )δ(q2 − q20 ), (41)
(∞)

we have:
Z Z Z
3
dσeff (u) = f (α)dα d (q) d3 (q0 )q 2 δ(q1 − q10 )δ(q2 − q20 )δ(q − q 0 )
{α} (∞) (∞)

(2π) T (qu, q)T (q u, q0 )φ0 (q; α)φ∗0 (q0 ; α).


2 ∗ 0
(42)

We next use the identity:


q h (3) i
δ(q1 −q10 )δ(q2 −q20 )δ(q−q 0 ) = δ (q − q0 ) + δ(q1 − q10 )δ(q2 − q20 )δ(q3 + q30 ) .
|q3 |
(43)
While substituting this in, let us also apply the assumption that the momen-
tum in the (“incident”) beam is well-defined, i.e.,

φ0 (p; α) ≈ 0, unless p ≈ pi = pi ê3 . (44)

Thus, the product

φ0 (q; α)φ∗0 (q0 ; α) ≈ 0, unless both q ≈ pi and q0 ≈ pi . (45)

In particular, this product is small unless q ≈ q0 , and q3 ≈ q30 . Thus, we


may neglect the contribution in the integrand from the δ(q3 + q30 ) term, and
obtain:
Z Z Z
q 3 (3)
dσeff (u) = (2π)2 f (α)dα d3 (q) d3 (q0 )
δ (q − q0 )
{α} (∞) (∞) |q3 |
T (qu, q)T ∗ (q 0 u, q0 )φ0 (q; α)φ∗0 (q0 ; α) (46)
Z Z 3
q
= (2π)2 f (α)dα d3 (q) |T (qu, q)|2 |φ0 (q; α)|2 . (47)
{α} (∞) |q3 |
Now, let us again impose our assumption of a well-defined beam momen-
tum, in a more stringent way. Let us require that |φ0 (q; α)|2 is sufficiently
sharply peaked about q = pi that the function
q3
|T (qu, q)|2 (48)
|q3 |

9
does not vary significantly over the region where |φ0 (q; α)|2 is significant.
With this assumption, we may take:

q3 p3
|T (qu, q)|2 → i |T (pi u, pi )|2 = p2i |T (pi u, pi )|2 . (49)
|q3 | |pi3 |

Then
Z Z
dσeff (u) = (2πpi )2 |T (qu, q)|2 f (α)dα d3 (q)|φ0 (q; α)|2 (50)
{α} (∞)
2 2
= (2πpi ) |T (qu, q)| . (51)

Thus, in this limit, the effective cross section no longer depends on the precise
shape of the beam distribution. We interpret this limiting form as the desired
“fundamental” (differential) cross section, σ(pf ; pi ).
We may rewrite the S matrix, as an integral transform, in the form:

S(pf ; pi ) = δ (3) (pf − pi ) + δ(pf − pi )T (pf ; pi ) (52)


i
= δ (3) (pf − pi ) + δ(pf − pi )f (pf , pi ), (53)
2πpi
where we have defined the scattering amplitude:

f (pf , pi ) ≡ −2πipi T (pf ; pi ). (54)

The cross section is given by:

σ(pf ; pi ) = (2πpi )2 |T (pf ; pi )|2 (55)


= |f (pf , pi )|2 . (56)

That is, the scattering cross section, appropriately, is the square of the scat-
tering amplitude. Note that the scattering amplitude has units of length.
We have defined a differential cross section, and related the S matrix to
the scattering amplitude. We have made some assumptions concerning the
physical situation, and the results may need to be modified in certain cases.
For example, there may be narrow “resonances” where the assumption that
the beam is well-defined compared with the variation in |T |2 breaks down,
and hence the effective and fundamental cross sections differ (Fig. 2).

10
σ

Figure 2: The fundamental cross section (narrow peak) may become smeared
out to the effective cross section (wide peak).

4 Partial Wave Expansion


For a spherically symmetric center-of-force, the symmetry may be used to
obtain a useful expression for the scattering amplitude (and, hence, the S
matrix), called the partial wave expansion. This form is useful in practice
because it is often the case that only a few terms in the expansion contribute
significantly to the description of the scattering process. In particular, if
the wavelength of the scattered particle is large compared with the “size”
of the force center, only a few phase shifts dominate – this is “low energy”
scattering.
We may obtain the partial wave expansion as follows: By spherical sym-
metry, S(p, q) can only depend on rotationally invariant quantities. There
are three such invariants in spinless scattering: p2 , q2 and p · q. Since
|p| = |q|, we are left with only two distinct invariants. Thus, we may write
the S matrix in the form:
1
S(p, q) = δ(p − q)B(p; up · uq ), (57)
pq
where
p·q
up · uq ≡ . (58)
p2
Notice that this form is such that B is dimensionless.

11
Now consider an eigenstate of J2 and J3 with eigenvalues j(j + 1) and
m = 0 (the initial momentum is along ê3 , hence hJ3 i = hL3 i = 0):

φj,0(p) = φ(p)Pj (up · u3 ), (59)

where Pj is a Legendre polynomial. Let S operate on this initial wave func-


tion:
Z
φ0j,0(p) = d3 (q)S(p, q)φj,0(q) (60)
Z
1
= d3 (q) δ(p − q)B(p; up · uq )φ(q)Pj (uq · u3 ) (61)
pq
Z
= φ(p) dΩu0 B(p; up · u0 )Pj (u0 · u3 ). (62)

By conservation of angular momentum, φ0j,0 (p) is an eigenstate of J2 and J3


with eigenvalues j(j + 1) and m = 0. Since S is unitary, φ0j,0(p) is normalized
to one if φj,0 (p) is. Thus, φ0j,0(p)/φj,0 (p) must be a function of p of modulus
one, and
φ0j,0(p) = φ0 (p)Pj (up · u3 ). (63)
Hence, we may write

φ0j,0 (p) = e2iδj (p) φj,0(p) (64)


= e2iδj (p) φ(p)Pj (up · u3 ), (65)

where the phase shifts, δj (p), are real functions of p. Substituting into
Eqn. 62, we have
Z
e2iδj (p) φ(p)Pj (up · u3 ) = dΩu0 B(p; up · u0 )Pj (u0 · u3 ). (66)

We obtain an expression for δj (p) by letting up = u3 and using the fact


that Pj (1) = 1:
Z
2iδj (p)
e φ(p) = dΩu0 B(p; u3 · u0 )Pj (u0 · u3 ) (67)

Z 1
= 2π d cos θB(p, cos θ)Pj (cos θ). (68)
−1

The Legendre polynomials are complete, and obey the orthogonality relation:
Z 1
2
dzPj (z)Pj 0 (z) = δjj 0 . (69)
−1 2j + 1

12
Hence, we may expand:

X 2j + 1 2iδj (p)
B(p, cos θ) = e Pj (cos θ), (70)
j=0 4π
or ∞
δ(p − q) X 2j + 1 2iδj (p)
S(p, q) = e Pj (cos θ), (71)
pq j=0 4π
where cos θ = up · uq . This result, Eqn. 71, is the partial wave expansion
for the S matrix.
When all phase shifts δj = 0, S = I, so we may write:

δ(p − q) X h i
S(p, q) − δ (3) (p − q) = (2j + 1) e2iδj (p) − 1 Pj (cos θ). (72)
4πpq j=0

But we also have,


i
S(p, q) − δ (3) (p − q) = δ(p − q)f (p, q). (73)
2πp
Comparing these, we obtain the partial wave expansion for the scatter-
ing amplitude:
1 X ∞ h i
f (p, q) = f (p; cos θ) = (2j + 1) e2iδj (p) − 1 Pj (cos θ). (74)
2ip j=0

In order to have convergence, we suspect we’ll have δj (p) → 0 as j → ∞.


The unitarity of the S matrix is contained in the reality of the functions
δj (p).

5 Optical Theorem
Let us consider more generally than the partial wave expansion the unitarity
of the S matrix:
Z
d3 (q)S(p0 , q)S ∗ (p00 , q) = δ (3) (p0 − p00 ). (75)
(∞)

Substitute in
i
S(p, q) = δ (3) (p − q) + δ(p − q)f (p, q), (76)
2πp

13
obtaining

δ (3) (p0 − p00 ) = δ (3) (p0 − p00 ) (77)


i i
+ 0
δ(p0 − p00 )f (p0 , p00 ) − 00
δ(p0 − p00 )f ∗ (p00 , p0 )
2πp 2πp
Z
1
+ 2 0 00 q 2 dqdΩu δ(p0 − q)δ(p00 − q)f (p0 , q)f ∗ (p00 , q).
4π p p (∞)
This simplifies to
Z
i δ(p0 − p00 ) 0 00 ∗ 00 0 δ(p0 − p00 )
− [f (p , p ) − f (p , p )] = dΩu f (p0 , q)f ∗ (p00 , q),
2π p0 4π 2 (4π)
(78)
0
with q = p . But for a symmetric central force we must have

f (p0 , p00 ) = f (p00 , p0 ) = f (p0 ; u0 · u00 ). (79)

Thus, letting p0 = p00 = p, we find


Z
0 p
00
=f (p; u · u ) = dΩu f (p; u0 · u)f ∗ (p; u00 · u). (80)
4π 4π

In particular, if we take u0 = u00 , then


Z
p
=f (p; 1) = dΩu |f (p; u0 · u)|2 (81)
4π Z4π
p
= dΩu σ(pu, pu0 ) (82)
4π 4π
p
= σT (p), (83)

where σT is the total cross section.
We have just derived what is known as the optical theorem:

σT (p) = =f (p; 1). (84)
p
The total cross section is equal to 4π/p times the imaginary part of the
forward scattering amplitude.
The total cross section may also be readily obtained in terms of the partial
wave expansion:
Z
dσ(p, cos θ)
σT (p) = dΩ (85)
(4π) dΩ

14
Z
= |f (p, cos θ)|2 dΩ (86)
(4π)
Z 1 ∞ X ∞ h ih i
1 X 0 2iδj (p) −2iδj 0 (p)
= 2π dz (2j + 1)(2j + 1) e − 1 e − 1 Pj (z)Pj 0 (z)
−1 4p2 j=0 j 0 =0
4π X∞ h ih i
2iδj (p) −2iδj (p)
= (2j + 1) e − 1 e − 1 (87)
p2 j=0

4π X
= (2j + 1) sin2 δj (p). (88)
p2 j=0

It is left as an exercise for the reader to show that this result agrees with the
optical theorem.
We conclude this section by remarking that the optical theorem is a rather
general consequence of wave scattering in which there is a conservation prop-
erty akin to the conservation of probability in quantum mechanics. For exam-
ple, it also holds in the scattering of electromagnetic waves where energy and
power flow are conserved. For a discussion, see, for example, J. D. Jackson’s
text “Classical Electrodynamics”.

6 Interim Remarks
We pause here to make a few remarks concerning the nature of our results
and possible extensions to them:
1. Our discussion has been pretty general, up to assumptions of spherical
symmetry, and of the force falling off rapidly enough with distance.
2. In particular, the results are valid whether the particles are relativis-
tic or non-relativistic. We nowhere made any assumptions concerning
speed, only using general properties of waves and quantum mechanics.
Any “wave equation” suffices, since we only use it in the discussion of
asymptotic states.
3. The discussion also applies to the elastic collision of two particles in
their center-of-mass system. Note that this system has spherical sym-
metry in this frame. Relativity is unimportant here as well (though
we may need it to transform to a different frame of reference). We
may also consider inelastic scattering in the CM frame, as long as we
properly formulate our conservation of energy.

15
4. We may further extend the discussion to the elastic scattering of two
particles with any spins in the CM frame. The essential change is
that the asymptotic wave function for particles of spin s are (2s + 1)-
component momentum space wave functions, and the scattering am-
plitude f (pf , pi ) now becomes also a matrix operator in spin space.
Ref. [2] develops this theory for scattering of the form a + b → c + d.
We have actually already seen the essential aspects in our class dis-
cussions of the consequence of angular momentum conservation on the
description of scattering scattering in the helicity basis.
Note that we have not yet said much about how to determine the phase
shifts δj (p) in a problem of interest – we have only shown that they are
the key ingredients in the scattering problem. Our discussion will now turn
towards this issue.

7 Resonances
Consider the partial wave expansion:
1 X ∞
2j + 1 h 2iδj (p) i
f (p, q) = f (p; cos θ) = e − 1 Pj (cos θ). (89)
2ip j=0 4π

We note that
1 h 2iδj (p) i 1
e −1 = . (90)
2i cot δj − i
This function has a maximum magnitude of one whenever
π
δj = + nπ, n = integer. (91)
2
When this occurs, we are said to have a resonance in the jth partial wave.
Suppose now that the jth channel exhibits a resonance at energy E = E0 .
Then cot δj (E) (written as a function of energy) vanishes at E = E0 . To
describe the scattering in the neighborhood of this resonance, we make an
expansion of cot δj (E) to linear order:
2 h i
cot δj (E) = − (E − E0 ) + O (E − E0 )2 . (92)
Γ
Thus, in the neighborhood of a resonance, the contribution of this channel

16
5

)j 1
_a
tl
e
d
t( 0 0.2 0.4 0.6 0.8 1
o
c -1

-3

-5

delta_j(E)/pi

Figure 3: Graph of cot δj (E), showing linear behavior near π/2, with negative
slope.

to the scattering amplitude is:


2j + 1 1
fj (p; cos θ) = 2 Pj (cos θ) (93)
p − Γ (E − E0 ) − i
−Γ/2p
= (2j + 1) Pj (cos θ). (94)
E − E0 + iΓ/2
The contribution to the total cross section from this channel alone is (using
the optical theorem):

σT j (p) = =fj (p; 1) (95)
p
4π −Γ/2p
= (2j + 1) = (96)
p E − E0 + iΓ/2
4π(2j + 1) Γ E − E0 − iΓ/2
= − 2
= (97)
p 2 (E − E0 )2 + Γ2 /4
4π Γ2 /4
= (2j + 1) . (98)
p2 (E − E0 )2 + Γ2 /4
We arrive at a Breit-Wigner form for the cross section in the neighborhood
of a resonance. The choice of parameterization is such that Γ is the full

17
width at half maximum of the Breit-Wigner distribution. The cross section
decreases from its peak value by 1/2 at E = E0 ± Γ/2.
Note, finally, that the maximum contribution to the cross section in par-
tial wave j is just:

σT j (E)max = σT j (E0 ) = (2j + 1). (99)
p2
This maximum follows from the unitarity bound:



1 h 2iδj i

e − 1 = eiδj sin δj ≤ 1. (100)
2i

8 The Phase Shift


Intuitively, we may think of the phase shift as follows: Suppose, for example,
we have a “dominantly” attractive potential. The wave will oscillate more
rapidly in the region of the potential than it would if the potential were not
present. A wave starting at phase zero at the origin will accumulate phase
as it propagates out to large distances. The phase is accumulated faster
in the presence of an attractive potential than if no potential is present.
Thus, there will be a phase shift of the wave in the potential relative to the
wave in no potential at large distances (and once outside the region of the
potential, this shift is independent of distance). Fig. 4 provides an illustration
of this effect. Asymptotically, the scattered wave will be positively phase
shifted with respect to an unscattered wave. Similarly, a dominantly repulsive
potential will yield negative phase shifts.
To see how we may get at the phase shifts, let us consider plane wave solu-
tions, and their partial wave expansion. The force-free Schrödinger equation
is
∇2 ψ + k 2 ψ = 0, (E = k2 /2m). (101)
The plane wave solution for a wave traveling in the +z direction (if k > 0)
is:
ψ(x) = eikz = eikr cos θ . (102)
The separation of variables expansion of such a function in spherical polar
coordinates is:
∞ X̀
X
ψ(x) = [A`m j` (kr) + B`m n` (kr)] Plm (θ)eimφ , (103)
`=0 m=−`

18
r

V(r)

Figure 4: Illustration of the production of a phase shift due to a potential.


The dashed line illustrates the wave in the absence of a potential. At large
r the accumulated phase (starting from r = 0) of the wave in the potential
is larger than the accumulated phase of the wave without the potential.

where r r
π π
j` (x) = J`+ 1 (x); n` (x) = (−)`+1 J 1 (x) (104)
2x 2 2x −`− 2
are spherical Bessel functions.
Since eikr cos θ does not depend on φ (beam along the z-axis, hence no
z-component of L), only the m = 0 terms contribute. Also, n` (kr) is not

19
regular at r = 0,2 so B`m = 0. Hence,

X
eikr cos θ = A` j` (kr)Pl (cos θ). (107)
`=0

It remains to determine the coefficients A` . We use the orthogonality


relation Z 1
2
dxP` (x)P`0 (x) = δ``0 , (108)
−1 2` + 1
to obtain Z 1
2
A` j` (kr) = dxeikrx P` (x). (109)
2` + 1 −1
We can match the two sides by considering large kr. For the left hand side,
!
1 `π
j` (x) −→x→∞ sin x − . (110)
x 2
On the right side, we note that by partial integrations we may obtain an
expansion in powers of 1/kr, e.g., after two such integrations:
Z 1
1 ikrx
dxeikrx P` (x) = e P`(x)|1−1 (111)
−1 ikr
 
1 1 ikrx 0 1 1 Z1 ikrx 00
− e P` (x)|−1 − dxe P` (x) .
ikr ikr ikr −1
It appears (left to the reader to prove) that the first term is dominant for
large kr. Using also P` (±1) = (±)` , we thus have:
Z 1
1 h ikr i
dxeikrx P`(x) →kr→∞ e − (−)` e−ikr . (112)
−1 ikr
So, we must compare:
!
1 h ikr i 1 2 `π
e − (−)` e−ikr = A` sin kr − (113)
ikr kr 2` + 1 2
1 1 h i
= A` e−iπ`/2 eikr − (−)` e−ikr . (114)
ikr 2` + 1
2
For small x:

− x1 , `=0
n` (x) →x→0 (105)
− (2`−1)!!
x`+1
, ` > 0;
x`
j` (x) →x→0 . (106)
(2` + 1)!!

20
Solving for A` = (i)` (2` + 1), and therefore

X
eikr cos θ = (i)` (2` + 1)j` (kr)Pl (cos θ). (115)
`=0

Now suppose we have a potential V (r) which vanishes everywhere outside


a sphere of radius R. The scattering amplitude, in terms of the phase shifts,
as we have seen, is:

1 X ∞  
f (k; θ) = (2` + 1) e2iδ` − 1 P` (cos θ). (116)
2ik `=0

Let us relate this to the scattered wave function. For r < R, we may write:

X
ψ= (i)` (2` + 1)R`(k; r)P` (cos θ), (117)
`=0

where χ` = rR` , and


" #
`(` + 1)
χ00` + k − 2
− 2mV (r) χ` = 0. (118)
r2

Outside the sphere r = R we may write:



X  
1 (1)
ψ= (i)` (2` + 1) j` (kr) + α` h` (kr) P` (cos θ), (119)
`=0 2
(1)
where h` is the spherical Hankel function of the first kind:
(1)
h` (kr) = j` (kr) + in` (kr) (120)
1 eikr
→kr→∞ . (121)
(i)`+1 kr

The asymptotic form shows that this corresponds to an asymptotically out-


(2)
going spherical wave. The spherical Hankel function of the second kind, h` ,
would correspond to an asymptotically incoming spherical wave. We don’t
include such a term, because we have already included the incoming plane
wave in the j` (kr) term.

21
The asymptotic behavior of the wave function is therefore:
   


X sin kr − 2 α` 1 eikr 
ψ ∼ (2` + 1)i`  + P` (cos θ) (122)
`=0 kr 2 (i)`+1 kr
1 X ∞ h i
∼ (2` + 1) eikr (1 + α` ) − (−)` e−ikr P` (cos θ). (123)
2ikr `=0

This is now expressed in terms of an incoming spherical wave and an outgoing


spherical wave.
For elastic scattering, we must have conservation of the number of parti-
cles (unitarity), so
|1 + α` |2 = |(−)` |2 , (124)
where the left hand side corresponds to the outgoing wave, and the right
hand side is the incoming. This is satisfied if α` is of the form:

α` = e2iδ` − 1, (125)

where δ` is real.
Now consider the scattered wave function in the asymptotic limit:

ψS = ψ − eikr cos θ (126)


1 X ∞ h i
= (2` + 1) (1 + α` )eikr − (−)` e−ikr P`(cos θ)
2ikr `=0

!
X 1 `π
`
− (2` + 1)(i) sin kr − P`(cos θ)
`=0 kr 2
1 X ∞ h i
= (2` + 1) (1 + α` )eikr − (−)` e−ikr − eikr + (−)` e−ikr P` (cos θ)
2ikr `=0
(127)
X∞
1
= (2` + 1)α` eikr P` (cos θ) (128)
2ikr `=0
eikr
= f (k; θ) . (129)
r
We see that the scattering amplitude has the interpretation as the coefficient
of the outgoing (scattered) spherical wave.

22
Notice that the outgoing part of the wave in Eqn. 127 is
h i 1 X ∞
eikr cos θ ∼ (2` + 1)eikr P` (cos θ), (130)
out 2ikr `=0

while the outgoing part of the actual wave (which includes in addition the
outgoing part of the unscattered wave) is

1 X
ψout ∼kr→∞ (2` + 1)e2iδ` eikr P` (cos θ), (131)
2ikr `=0

where e2iδ` = 1 + α` . Comparing Eqns. 130 and 131, we see that 2δ` is the
difference in phase between the outgoing parts of the actual wave function
and the eikr cos θ plane wave.
It is perhaps useful to make a comment here concering interference among
partial waves. Note that in the differential cross section,

= |f (k; θ)|2 , (132)
dΩ
the different partial waves can interfere, i.e., in general we cannot distinguish
which partial angular momentum states contribute to the scattering in a
given direction. On the other hand, in the total cross section:
Z ∞
4π X
σT = dΩ|f (k; θ)|2 = (2` + 1) sin2 δ` ; (133)
(4π) k 2 `=0

by virtue of the orthogonality of the P`(cos θ), there is no interference among


the partial waves. The cross section decomposes into a sum of partial cross
sections of definite angular momenta.

8.1 Relation of Phase Shift to Logarithmic Derivative


of the Radial Wave Function
Recall the radial wave equation for r < R:
" #
`(` + 1)
χ00` + k 2 − − 2mV (r) χ` = 0, χ` = rR`. (134)
r2

23
We must have continuity of the wave function and its derivative on the sphere
r = R. Hence,
1 (1)
R` (k, R) = j` (kR) + α` h` (kR) (135)
2
 
dR` (k, r) 0 1 (1)0
= k j` (kR) + α` h` (kR) . (136)
dr 2
r=R

The factor of k on the right side of Eqn: 136 is because the prime notation
denotes derivatives with respect to the argument of the function, i.e., with
respect to kr.
We divide Eqn. 136 by Eqn. 135, and let x = kR:
h i
(1)0
dR` (k,r)
j`0 (x) + 12 α` h` (x)
dr
=k . (137)
R` (k, r) r=R
(1)
j` (x) + 12 α` h` (x)
With dr = rd log r, we may write this in the form:
h i
(1)0
d log R` (k, r) j`0 (x) + 12 α` h` (x)
L` ≡ =x . (138)
d log r (1)
j` (x) + 12 α` h` (x)
r=R

Solving for α` = e2iδ` − 1:


L` j` (x) − xj`0 (x)
α` = −2 (1) (1)0
, (139)
L` h` (x) − xh`
where x = kR. Thus, if the logarithmic derivative L` is determined, then the
phase shift is known.

8.2 Low Energy Limit


Let us use the result just obtained to demonstrate that the partial wave
expansion normally converges faster in the low energy limit, x = kR  1.
We’ll start with the power series expansions for the spherical Bessel/Hankel
functions:
(1)
h` (x) = j` (x) + in` (x), (140)

X (−)n (n + `)! 2n
j` (x) = 2` x` x , (141)
n=0 n!(2n + 2` + 1)!

1 X (−)n−` (n − `)! 2n
n` (x) = (−)`+1 x . (142)
2` x`+1 n=0 n!(2n − 2`)!

24
As a technical aside, we see that we really need to make sense of the last
formula above in the case ` > n. Guided by the identity:
πz
z!(−z)! = , (143)
sin πz
we define
(n − `)! (2` − 2n)!
= (−)n−` , if n < `. (144)
(2n − 2`)! (` − n)!
For x  1 (the low energy limit), we keep only the n = 0 term in these
expansions:
`!
j` (x) = 2` x` + O(x`+1) (145)
(2` + 1)!
(−)`+1 (−)−` (2`)!
n` (x) = + O(x−`+1 ). (146)
2` x`+1 `!
Thus,
`!
L` j` (x) − xj`0 (x) = 2` (L` − `)x` + O(x`+2 ) (147)
(2` + 1)!
(1) (1)0
L` h` (x) − xh` = i [L` n` (x) − xn0` (x)] + O(x` ) (148)
(2`)! 1
= −i ` (L` + ` + 1) `+1 + O(x` , x−`+1 ). (149)
2 `! x
Therefore, " #2
2i 2` `! L` − ` 2`+1
α` ≈ − x . (150)
2` + 1 (2`)! L` + ` + 1
Let us use this result to check the convergence of the partial wave ex-
pansion for small x. The coefficient of the P`+1 (cos θ) term divided by the
coefficient of the P` (cos θ) term is:

(2` + 3)α`+1 1 L`+1 − ` − 1 L` + ` + 1 2


= 2
x . (151)
(2` + 1)α` (2` + 1) L`+1 + ` + 2 L` − `

In general then, we should have very good convergence for x  1.


In the low energy limit the `th scattering partial wave is proportional to:

α` ∝ k2`+1 . (152)

25
That is,
α` = e2iδ` − 1 = 2i sin δ` eiδ` , (153)
or
| sin δ` | ∼ k 2`+1 , as kR → 0. (154)
Using
1 h (1) (2)
i
j` (x) = h` (x) + h` (x) , (155)
2
we may also write:
(2) (2)0
L` h` (x) − xh` (x)
1 + α` = − (1) (1)0
. (156)
L` h` (x) − xh` (x)
We may obtain exact expressions for the two lowest phase shifts with,
(2) (1)∗
h` (x) = h` (x), for real x, (157)
(1) i
h0 (x) = − eix , (158)
x  
(1) 1 i
h1 (x) = − eix 1 + . (159)
x x
Hence,

1 + ix/L̂0
1 + α0 = e−2ix (160)
1 − ix/L̂0
1 + ix − x2 /(L̂1 + 1)
1 + α1 = e−2ix , (161)
1 − ix − x2 /(L̂1 + 1)
where x = kR and

d log χ`
L̂` ≡ , (162)
d log r r=R

d log R`
= + 1. (163)
d log r r=R

8.3 Example
Suppose we scatter from the potential (see Fig. 5):

V0 > 0 r < R,
V (r) = (164)
0, r ≥ R.

26
V(r)

V0

r
R
Figure 5: Graph of the example scattering potential.

We wish to determine, for example, the phase shift and partial cross
section for ` = 0 (S-wave) scattering. Since the potential is repulsive, we
should find δ0 < 0. Let

k2 = 2mE (165)
k02 = 2mV0 (166)
q
∆ = k02 − k2 . (167)

The radial wave equation for r < R is


" #
`(` + 1)
χ00` + k − 2
− k02 χ` = 0. (168)
r2

In particular, for ` = 0:3

χ000 + (k 2 − k02 )χ0 = 0. (169)


3
Recall that we may express the wave function for the spherically-symmetric central
force problem in general in the form
X
ψ(x) = An`m Rn` (r)Y`m (θ, φ),
n,`,m

27
If E < V0 , then ∆ > 0, and χ000 = ∆2 χ0 , or

χ0 = A sinh ∆r, r < R, E < V0 . (170)

If instead E > V0 , then ∆ = i∆0 is purely imaginary, hence

χ0 = A0 sin ∆0 r, r < R, E > V0 . (171)

Outside the sphere bounded by r = R, we have χ000 = −k2 χ0 , hence

χ0 (r) = sin(kr + δ), r>R (172)

where we have chosen an arbitrary normalization. We compare this with our


expansion:

X  
1 (1)
ψ(x) = i` (2` + 1) j` (kr) + α` H` (kr) P` (cos θ), r > R. (173)
`=0 2

Consider the ` = 0 term in particular:


 
1 (1)
χ0 (r) = r j0 (kr) + α0 h0 (kr) . (174)
2
(1)
Using, j0 (x) = sin x/x, h0 (x) = −ieix /x, and α0 = e2iδ0 − 1, we obtain (not
worrying about the overall normalization),
i
χ0 (r) = sin kr − α0 eikr (175)
2
1  ikr 
= e − e−ikr + eikr+2iδ0 − eikr (176)
2i
= eiδ0 sin(kr + δ0 ), (177)

or, simply χ0 (r) = sin(kr + δ0 ), absorbing the phase factor into the normal-
iztion. Comparing with Eqn. 172, we find δ = δ0 .
where  
1 ∂2 `(` + 1)
[rRn` (r)] + − V (r) Rn` (r) = En Rn` (r)
2mr ∂r2 2mr2
(n may be a continuous index in general), and we have defined χn` (r) ≡ rRn` (r). At least
as long as V (r) is finite as r → 0, Rn` (0) must be finite, hence the boundary condition
χn` (0) = 0.

28
We determine δ0 by matching χ0 and χ00 at r = R. The logarithmic
derivative L0 = Rχ00 (R)/χ0 (R) contains the information about the parameter
of interest, δ0 , and we need never determine A (or A0 ). For E < V0 :
cosh ∆R
L0 = ∆R = ∆R coth ∆R (178)
sinh ∆R
cos(kR + δ0 )
= kR = kR cot(kR + δ0 ). (179)
sin(kR + δ0 )
That is,
L0 = kR cot(kR + δ0 ) = ∆R coth(∆R), E < V0 . (180)
Similarly, for E > V0 :
L0 = kR cot(kR + δ0 ) = ∆0 R cot(∆0 R), E > V0 . (181)
Given V0 and R, we can thus solve these equations for δ0 (E).
Noting our earlier result:
1 + ikR/L0
1 + α0 = e2iδ0 = e−2ikR , (182)
1 − ikR/L0
we have, for E < V0 :
1 + i(k/∆) tanh ∆R
e2iδ0 = e−2ikR , E < V0 . (183)
1 − i(k/∆) tanh ∆R
 
1+ia
This is of the form e−2ikR 1−ia
= e−2ikR eiθ , where θ = tan−1 [2a/(1 − a2 )].
Therefore, we may write:
 
1  2 ∆k
tanh ∆R 
δ0 = −kR + tan−1   2 , E < V0 , (184)
2 1− k 2
tanh ∆R ∆
 
1  2 ∆k0
tan ∆ R 
0
δ0 = −kR + tan−1   2 , E > V0 . (185)
2 1 − ∆k0 tan2 ∆0 R
Consider the low energy limit, k → 0 (∆ → k0 ), E < V0 :
 
k
1  2 k0 tanh ∆R 
δ0 → −kR + tan−1   2  (186)
2 1− k 2
tanh ∆R k0
 !2 
k k
= −kR + tanh k0 R + O  . (187)
k0 k0

29
Alternatively, we may consider the “hard sphere” limit, k0 → ∞ (i.e., V0 →
∞). This is nearly the same limit, for any fixed energy. To lowest order in
k/k0 ,
δ0 = −kR + O(k/k0 ). (188)
Thus, for the hard sphere the lowest partial wave phase shift depends linearly
on k, as shown in Fig. 6.

δ0

0 kR

Figure 6: The S wave phase shift for the hard sphere potential.

Consider now the total cross section in the ` = 0 channel:



σ0 = 2 sin2 δ0 . (189)
k
In the low energy limit, this is
!
4π k
σ0 → 2
sin2 −kR + tanh k0 R (190)
k k0
!2
4π k
→ 2
−kR + tanh k0 R (191)
k k0

30
!2
tanh k0 R
= 4πR2 1− . (192)
k0 R
In the hard sphere limit (k0 → ∞), this becomes
σ0 = 4πR2 . (193)
In the low energy limit, this partial wave dominates, so this is the total
cross section as well. This result is larger than the result we might expect
purely geometrically, i.e., for the scattering of a well-localized wave packet.
The difference must be attributed to the wave nature of our probe, and the
phenomenon of diffraction.
We note in passing that when writing
!
tanh k0 R
σ0 = 4πa20 , a0 = R 1 − , (194)
k0 R
a0 is referred to as the “scattering length” of the potential.
Now let us consider the case of finite V0 (a “soft sphere”?). For example,
in the high energy limit, k  k0 , and ∆0 → k, hence:
1 2 tan kR
δ0 → −kR + tan−1 (195)
2 1 − tan2 kR
= 0. (196)
This should be the expected result in the high energy limit, since the effect
of a finite fixed potential becomes negligible as the energy increases.
Thus, at very low energies and at very high energies, the phase shift
approaches zero. In between, it must reach some maximum (negative) value.
Fig. 7 shows the S wave phase shift for scattering on the soft sphere potential,
and Fig. 8 shows the S wave cross section for scattering on the soft sphere
potential.

9 The Born Expansion, Born Approximation


We have mentioned the Born expansion in our note on resolvents and Green’s
functions, and the Born approximation in our note on approximate methods.
We now revisit it explicitly in the context of scattering. We start by de-
veloping the approach in general, and then apply it to the example of the
preceding section.

31
kR
0
5 10 15 20
-0.5

-1.0

-1.5

-2.0

-2.5

-3.0
δ0

Figure 7: The S wave phase shift for the soft sphere potential, with R = 1,
k0 R = 4.

In the resolvent note, we saw the Born expansion as the Liouville-Neumann


expansion of a perturbed Green’s function. We will see this again, except
that it is convenient (eliminates a minus sign) here to redefine the Green’s
function with the opposite sign from the resolvent note. Thus, we here take
1
G(z) = − . (197)
H −z
The Schrödinger equation we wish to solve is of the form:
!
∇2
− +V ψ = Eψ. (198)
2m
It is convenient to let E = k2 /2m and V = U/2m, so that this equation is
of the form:  
∇2 + k2 ψ(x) = U (x)ψ(x). (199)
We already found the Green’s function for the free particle wave equation
(Helmholtz equation), (∇2 + k2 )ψ = 0 in the resolvent note:
e±ik|x−y|
G(x, y; k) = − . (200)
4π|x − y|

32
8
7
6

σ0 5
4
3
2
1
0
0 2 4 6 8 10 12 14 16 18 20
kR

Figure 8: The S wave cross section for the soft sphere potential, with R = 1,
k0 R = 4.

The solution to the desired Schrödinger equation is then


Z
ψ(x) = ψ0 (x) + d3 (x0 )G(x, x0 ; k)U (x0 ψ(x0 ), (201)
(∞)

where ψ0 (x) is a solution to the Helmholtz equation. The reader is encour-


aged to verify that ψ is indeed a solution, by substituting into the Schrödinger
equation.4
Now we may apply this to the scattering of an incident plane wave, ψ0 =
eik·x , using the “outgoing” Green’s function,

e+ik|x−y|
G+ (x, y; k) = − , (202)
4π|x − y|
to obtain the scattered wave:

ψS (x) ≡ ψ(x) − ψ0 (x) (203)


Z
= d3 (x0 )G+ (x, x0 ; k)U (x0 ψ(x0 ). (204)
(∞)

4
Recall that (∇2x + k 2 )G(x, x0 ; k) = δ (3) (x − x0 ).

33
Since
eikr
ψS (x) = f (k; θ) , (205)
r
where θ is the polar angle to the observation point, hence equal to the po-
lar angle of the scattered k-vector, we can extract the scattering amplitude
f (k; θ) by considering the asymptotic situation, where the source “size” (non-
negligible region of U (x)) is small:

1 eik|x| −ik0 ·x0


G+ (x, x0 ; k) →|x|→∞ − e . (206)
4π |x|
In obtaining this expression, we have held x0 finite, and used
x · x0
|x − x0 | ≈ |x| − . (207)
x
Thus, Z
1 0 0
f (k; θ) = − d3 (x0 )e−ik ·x U (x0 )ψ(x0 ). (208)
4π (∞)
Let us return to the general integral equation for the wave function:
Z
ψ(x) = ψ0 (x) + d3 (x0 )G(x, x0 ; k)U (x0 )ψ(x0 ). (209)
(∞)

We plug this expression into the integrand, obtaining:


Z
ψ(x) = ψ0 (x) + d3 (x)G(x, x0 ; k)U (x0 )ψ0 (x0 ) (210)
(∞)
Z
+ d3 (x0 )d3 (x00 )G(x, x0 ; k)U (x0 )G(x0 , x00 ; k)U (x00 )ψ(x00 ).
(∞)

Iteration gives the Neumann series:


∞ Z
X
ψ(x) = ψ0 (x) + d3 (x0 )d3 (xn0 )G(x, x0 ; k)U (x0 ) (211)
n=1 (∞)
G(x0 , x ; k)U (x ) . . . G(xn−10 , xn0 ; k)U (xn0 )ψ0 (xn0 ).
00 00
(212)

Typically, we use this by substituting a plane wave for ψ0 (x). If the


potential is “weak”, we expect this series to converge quickly. This expansion
is sometimes called the Born expansion. To first order in U (x) we have:
Z
ψ(x) = ψ0 (x) + d3 (x0 )G(x, x0 ; k)U (x0 )ψ0 (x0 ) + O(U 2 ). (213)
(∞)

34
With ψ0 (x) = eik·x , and

1 eikr −ik0 ·x0


G(x, x0 ; k) ∼ − e , (214)
4π r
we find:
1 eikr Z 0 0 0
ψ(x) ∼ eik·x − d3 (x0 )e−ik ·x U (x0 )eik·x . (215)
4π r (∞)
The interpretation of k is as the momentum vector of the incident plane
wave, and of k0 is as the momentum vector of the scattered wave. Hence,
Z
1 eikr 0 0
ψS (x) ≈ − d3 (x0 )ei(k−k )·x U (x0 ), (216)
4π r (∞)

or Z
1 0 0
f (k; Ω0 ) ≈ − d3 (x0 )ei(k−k )·x U (x0 ), (217)
4π (∞)

where Ω0 is the direction of the scattered wave, i.e., the direction of k0 . This
result is known as the Born approximation.
With this result, we have the differential cross section:
dσ 2
= |f (k; Ω0 )| (218)
dΩ0
Z 2
1 3 0 i(k−k0 )·x0 0

= d (x )e U (x ) (219)
16π 2 (∞)
2 2
m 0
= 2
V̂ (k − k) , (220)

where we have used U (x) = 2mV (x), and V̂ is the Fourier transform of V .
Recall that we already derived this result in our discussion of time-dependent
perturbation theory (Approximate Methods course note).
If U (x) = U (r) only, i.e., we have a spherically symmetric scattering
potential, we may incorporate this into our result. Let θ be the “scattering
angle”, i.e., the angle between k and k0 . Then, noting that |k| = |k0 | = k,
we have |k − k0 | = 2k sin θ2 . Since U (x) = U (r), f (k; Ω0 ) = f (k, θ) only, and
Z ∞ Z
1 02 0 0 θ 0 cos θ0
f (k; θ) = − r dr U (r ) d cos θ0 dφ0 ei2k(sin 2 )r , (221)
4π 0 4π

35
where θ 0 is the angle between x0 and k − k0 . Now let K = |k − k0 | = 2k sin 2θ .
We can do the integrals over angles, to obtain:
Z ∞
sin Kr 0
f (k; θ) = − r02 dr0 U (r 0 )
. (222)
0 Kr 0
This is the Born approximation for a spherically symmetric potential.

9.1 Born Approximation for Phase Shifts


Now let us consider what the Born approximation yields for the phase shifts
in a partial wave expansion. Start with our earlier (exact asymptotic) result:
ψ(x) = ψ0 (x) + ψS (x) (223)
eikr
= ψ0 (x) + f (k; θ), (224)
r
where Z
1 0 0
f (k; θ) = − d3 (x0 )e−ik ·x U (x0 )ψ(x0 ). (225)
4π (∞)
The expansion of ψ(x) in partial waves is

X
ψ(x) = i` (2` + 1)R` (r)P` (cos θ). (226)
`=0

We insert this into the expression for f :


∞ Z ∞ Z 1 Z 2π
1 X ` 02 0 0 0 0 0
f (k; θ) = − i (2`+1) r dr R` (r ) d cos θ dφ0 e−ik ·x U (x0 )P`(cos θ0 ).
4π `=0 0 −1 0
(227)
If, again, we have a spherical potential, U (x) = U (r) only, then we can
perform the angular integrals. Note that θ is the scattering angle, i.e., the
angle between k and k0 . If k is along the z axis, then θ is the polar angle of
k0 . Thus, k0 is not necessarily along the z-axis, and we have:
k0 · x0 = kr0 [cos θ cos θ0 + sin θ sin θ 0 cos(φk0 − φx0 )] . (228)
Now consider our partial wave expansion of a plane wave:

X
eik·x = eikr cos θkx = i` (2` + 1)j` (kr)P` (cos θkx ) (229)
`=0

X q
= i` 4π(2` + 1)j` (kr)Y`0 (θkx , φkx ), (230)
`=0

36
where θkx is the angle between k and x, and we have used the identity
s

P` (cos θ) = Y`0 (θ, φ). (231)
2` + 1

We may re-express this in terms of the polar angles (θx , φx ) of x and (θk , φk )
of k using the addition theorem for spherical harmonics:
q X̀

4π(2` + 1)Y`0 (θkx , φkx ) = Y`m (θk , φk )Y`m (θx , φx ), (232)
m=−`

where
cos θkx = cos θk cos θx + sin θk sin θx cos(φk − φx ). (233)
Thus,

X X̀
eik·x = 4π i` j` (kr) ∗
Y`m (θk , φk )Y`m (θx , φx ), (234)
`=0 m=−`

and hence,
∞ Z ∞
1 X `
f (k; θ) = − i (2` + 1) r02 dr0 R` (r 0 ) (235)
4π `=0 0

X 0
4π i` j`∗0 (k 0 r0 )U (r 0 ) (236)
`0 =0
Z 1 Z 2π X̀
0
d cos θ dφ0 Y`0 m (θ, φk0 )Y`∗0 m (θ 0 , φ0 )P`(cos θ0 ).(237)
−1 0 m=−`

We may simplify this in several ways:

• j` (x) is real if x is real.

• We may relabel integration variable r 0 as r. Also, k 0 is equal to k.

• The only dependence on φ0 is in Y`∗0 m (θ 0 , φ0 ), hence


Z 2π
dφ0 Y`∗0 m (θ 0 , φ0 ) = 2πδm0 Y`∗0 0 (θ 0 , 0) (238)
0
s
2`0
= 2πδm0 P`0 (cos θ0 ). (239)

37
• The integral over cos θ0 may then be accomplished with:
Z 1
2
dxP` (x)P`0 (x) = δ``0 . (240)
−1 2` + 1

• Finally, we may use


s
2` + 1
Y`0 (θ, 0) = P`(cos θ). (241)

Including all of these points leads to:



X Z ∞
f (k; θ) = − (2` + 1)P` (cos θ) r2 drR` (r)j` (kr)U (r). (242)
`=0 0

So far, we have made no approximation – this result is “exact” as long


as the potential falls off rapidly enough as r → ∞. However, it depends on
knowing R` (r), and this is where we shall now make an approximation. Note
first, that by comparison with the expansion in phase shifts:

1 X ∞  
f (k; θ) = (2` + 1) e2iδ` − 1 P` (cos θ), (243)
2ik `=0

we obtain:
Z ∞
1  2iδ` 
e − 1 = eiδ` sin δ` = −k r2 drR` (r)j` (kr)U (r). (244)
2i 0

The Born approximation consists in approximating the wave function by its


value for no potential, which should be a good approximation as long as the
potential has a small effect on it. Here, this means we replace R` (r) with
j` (kr), valid as long as R` (r) ≈ j` (kr):
Z ∞
eiδ`
sin δ` ≈ −k r2 dr [j` (kr)]2 U (r). (245)
0

Note that the integral is real in this approximation. In this approximation,


δ` is small, hence we have the Born approximation for the phase shifts:
Z ∞
δ` ≈ −k r2 dr [j` (kr)]2 U (r). (246)
0

38
9.2 Born Approximation and Example of the “Soft
Sphere” Potential
As an example, let us return to our earlier example of the “soft sphere”:

V0 , r < R,
V (r) = (247)
0, r > R.

Now U (r) = 2mV (r), and let k02 = 2mV0 . Then,


Z R
δ` ≈ −kk02 r2 dr [j` (kr)]2 (248)
0
!2 Z
k0 kR
= − [xj` (x)]2 dx. (249)
k 0

sin x
For instance, consider δ0 , with j0 (x) = x
:
!2 Z
k0 kR
δ0 ≈ − sin2 xdx (250)
k 0
!2
k0 1
= − (2kR − sin 2kR). (251)
k 4

Note that the Born approximation is a “high energy” approximation –


there must be little effect on the scattering wave. Consider what happens at
low energy, kR  1:
!2
k0 1 (k0 R)3
δ0 = − (2kR − sin 2kR) ≈ −kR . (252)
k 4 3

This may be compared with our earlier low energy result:


!
tanh k0 R
δ0 ≈ −kR 1 − . (253)
k0 R

Other than the linear dependence, they are not very similar. Fig. 9 illustrates
this. At low kR, the agreement with the exact calculation is poor, as the
energy increases, the agreement improves.

39
kR
2 4 6 8 10 12 14 16 18 20
0

-1

-2
δ0
-3

-4

-5

-6

Figure 9: The S wave phase shift for the soft sphere potential, with R = 1,
k0 R = 4. The curve which goes further negative is the Born approximation;
the other curve is the exact result.

10 Angular Distributions
If the laws of physics are rotationally invariant, we have angular momentum
conservation. As a consequence, the angular distribution of particles in a
scattering process is constrained. Here, we consider how this may be applied
in the case of a scattering reaction of the form:

a + b → c + d. (254)

These particles may carry spins – let ja denote the spin of particle a, etc.
It is convenient to work in the center-of-mass frame, and to pick particle
a to be incoming along the z axis. If a different frame is desired, then it
may be reached from the center-of-mass frame with a Gallilean or Lorentz
transformation, whichever is appropriate.
It is furthermore convenient to work in a helicity basis for the angular
momentum states. Along the helicity axis, the component of orbital angular
momentum of a particle is zero. We’ll typically label the helicity of a particle
with the symbol λ. For example, the helicity of particle a is denoted λa .

40
Let us also work with a plane wave basis for our particles. Physical
states may be constructed as a superposition of such states. We’ll omit the
magnitude of momentum in our labelling, and concentrate on the angles
(polar angles denoted by θ, and azimuth angles denoted by φ). We’ll also
omit the spins of the scattering particles in the labels, to keep things compact.
Thus, we label the initial state according to:

|ii = |θa = 0, φa , λa , λb i, (255)

and the final state by,


|f i = |θc , φc, λc, λd i, (256)
Note that θd = π − θc and φd = π + φc in the center-of-mass. This basis may
be called the “plane wave helicity basis”.
We may write the transition amplitude from the initial to final state in
the form:
hf |T |ii ∝ hθc , φc, λc, λd |T |0, φa , λa , λb i, (257)
and the scattering cross section is:

dσλa λb λc λd (θ ≡ θc , φ ≡ φc) ∝ |hθ, φ, λc, λd |T |0, φa , λa , λb i|2 . (258)

This gives the scattering cross section for the specified helicity states.
If we do not measure the final helicities, then we must sum over all possible
λc and λd . Similarly, if the initial beams (a and b) are unpolarized, we must
average over all possible λa and λb . Thus, to get the scattering cross section
in such a case, we must evaluate
1 X 1 XXX
|hθ, φ, λc, λd |T |0, φa , λa , λb i|2 . (259)
2ja + 1 λa 2jb + 1 λb λc λd

If a(or b) is massless and not spinless, then the 1/(2ja + 1) factor must be
replaced by 1/2.
The scattering process may go via intermediate states of definite angular
momentum, for example, scattering through an intermediate resonance. For
example. in e+ e− scattering just below µ+ µ− threshold, we may expect
a resonance due to the presence of “muonium” bound states. Such states
have definite angular momenta. In particular, the resonance for scattering
through the lowest 3 S1 bound state of muonium might be expected to be a
large contribution in the scattering amplitude. Thus, we are led to describing

41
the transition amplitude most naturally in a basis which specifies the total
angular momentum state. We may label a basis state as:
|j, m, λa , λb i, or |j, m, λc , λd i, (260)
where j specifies the total angular momentum, and m is its component along
a chosen quantization axis. This basis is sometimes called the “spherical
helicity basis”.
For a given pair of initial and final helicity states, we are thus interested
in computing matrix elements of the form:
X X
hf |T |ii ∝ hθc , φc, λc, λd |j, m, λc, λd ihj, m, λc , λd |T |j 0 , m0 , λa , λb ihj 0 , m0 , λa , λb |0, φa , λa , λb i.
j,m j 0 ,m0
(261)
The terms of the form hθ, φ, λ1 , λ2 |j, m, λ01 , λ02 i
are just scalar products of
vectors expressed in different bases. We thus need to learn how to do the
basis transformation.
We may write the basis transformation in the form:
X
|θ, φ, λ1 , λ2 i = |j, m, λ01 , λ02 ihj, m, λ01 , λ02 |θ, φ, λ1 , λ2 i (262)
j,m,λ01 ,λ02
X
= |j, m, λ1 , λ2 ihj, m, λ1 , λ2 |θ, φ, λ1 , λ2 i, (263)
j,m

where we have used the fact that states with different helicities are orthogo-
nal:
hj, m, λ01 , λ02 |θ, φ, λ1 , λ2 i ∝ δλ01 λ1 δλ02 λ2 . (264)
Consider now the state
X
|θ = 0, φ, λ1 , λ2 i = |j, m, λ1 , λ2 ihj, m, λ1 , λ2 |θ = 0, φ, λ1 , λ2 i. (265)
j,m

For this state, the helicity axis is parallel(or antiparallel) to the quantization
axis for the third component of angular momentum, and hence λ1 = m1 ,
λ2 = −m2 . Thus, m = λ1 − λ2 ≡ α. The coefficient
hj, m, λ01 , λ02 |θ, φ, λ1 , λ2 i = cj δmα , (266)
where cj is a constant depending only on j, to be determined (see exercises).
The expansion of this state in the spherical helicity basis is thus:
X
|θ = 0, φ, λ1 , λ2 i = cj |j, α, λ1 , λ2 i. (267)
j,m

42
Let us now perform a rotation on our special state to an arbitrary state:
|θ, φ, λ1 , λ2 i = R3 (φ)R2 (θ)R3 (−φ)|θ = 0, φ, λ1 , λ2 i. (268)
This product of rotations on a state of angular momentum j is given by the
Euler angle parameterization of the Dj rotation matirces:
R3 (φ)R2 (θ)R3 (−φ) = Dj (φ, θ, −φ). (269)
The matrix elements for rotation u are:
j
Dm 1 m2
(u) = hj, m1 |D j (u)|j, m2 i. (270)
Hence,
X
Dj (u)|j, α, λ1 , λ2 i = |j, m, λ1 , λ2 ihj, m, λ1 , λ2 |D j (u)|j, α, λ1 , λ2 i, (271)
m

and X
j
ketθ, φ, λ1 , λ2 = |j, m, λ1 , λ2 iDmα (φ, θ, −φ). (272)
j,m
q
If we anticipate the result of the exercises for cj , namely cj = 2j + 1/4π,
we have the desired result:
s
2j + 1 j∗
hθ, φ, λ1 , λ2 |j, m, λ01 , λ02 i = Dmα (φ, θ, −φ)δλ01 λ1 δλ02 λ2 , (273)

where α ≡ λ1 − λ2 . Putting this into our expression for the transition
amplitude, we obtain (defining αf ≡ λc − λd and αi ≡ λa − λb ):
s s
X X 2j 0 + 1 j∗
2j + 1 j0 0 0
hf |T |ii ∝ Dmαf (φ, θ, −φ)Dm 0 α (φa , 0, −φa )hj, m, λc , λd |T |j , m , λa , λb i.
0
j,m j ,m 0 4π4π i

(274)
Suppose the interaction is one which conserves angular momentum. In
this case,
hj, m, λc, λd |T |j 0 , m0 , λa , λb i = δjj 0 δmm0 hj, m, λc , λd |T |j, m, λa , λb i, (275)
and further m = λa −λb ≡ αi . Thus, the non-zero transition matrix elements
are numbers of the form Tλja λb λc λd , which may be called “helicity amplitudes”.
We have:
X 2j + 1 j
hf |T |ii ∝ Tλa λb λc λd Dαj∗i αf (φ, θ, −φ)Dαj i αi (φa , 0, −φa ). (276)
j 4π

43
The “big-D” functions give us the angular distribution for each intermediate
j value and helicity state. Using
j
Dm 1 m2
(φ, θ, −φ) = e−i(m1 −m2 )φ djm1 m2 (θ), (277)

and djmm (0) = 1, we have


X 2j + 1 j
hf |T |ii ∝ T ei(αi −αf )φ djαi αf (θ). (278)
j 4π λa λb λc λd

Note that the azimuth dependence is a phase, depending only on the helici-
ties.
Squaring, we obtain the scattering angular distribution:
2

dσλa λb λc λd X j
(θ, φ) ∝ (2j + 1)Tλa λb λc λd dαi αf (θ) ,
j
(279)

dΩ j

where αf ≡ λc − λd and αi ≡ λa − λb .

11 Exercises
1. Show that the total cross section we computed in the partial wave
expansion,

4π X
σT (p) = 2 (2j + 1) sin2 δj (p), (280)
p j=0
is in agreement with the optical theorem.
2. We have discussed the “central force problem”. Consider a particle of
mass m under the influence of the following potential:

V0 , 0 ≤ r ≤ a
V (r) = (281)
0, a < r,
where V0 is a constant.

(a) Write down the Schrödinger equation for the wave function ψ(xx).
Consider solutions which are simultaneous eigenvectors of H, L2 ,
and Lz . Solve the angular dependence, and reduce the remaining
problem to a problem in one variable. [You’ve done this already
first quarter, so you may simply retrieve that result here.]

44
(b) Let E be the eigenvalue of the Hamiltonian, H. Consider the case
where E > V0 . Solve the Schrödinger equation for eigenstates
ψ(xx). It will probably be convenient to use the quantity k =
q
2m(E − V0 ). Consider the limit as r → ∞ for your solutions,
and give an interpretation in terms of spherical waves.
E < V0 . It will probably be
(c) Repeat the solution for the case whereq
convenient to use the quantity K = 2m(V0 − E). Again, con-
sider the limit as r → ∞ and give an interpretation, contrasting
with the previous case.

Hint: You will probably benefit by thinking about solutions in the


form of spherical Bessel/Neumann functions, and/or spherical Hankel
functions.

3. When we calculated the density of states for a free particle, we used a


“box” of length L (in one dimension), and imposed periodic boundary
conditions to ensure no net flux of particles into or out of the box.
We have in mind, of course, that we can eventually let L → ∞, and
are really interested in quantities per unit length (or volume). Let us
justify more carefully the use of periodic boundary conditions, i.e., we
wish to convince ourselves that the intuitive rationale given above is
correct. To do this, consider a free particle in a one-dimensional “box”
from −L/2 to L/2. Remembering that the Hilbert space of allowed
states is a linear space, show that the periodic boundary condition:

ψ(−L/2) = ψ(L/2),

is required for acceptable wave functions. “Acceptable” here means


that the probability to find a particle in the box must be constant.

4. In our discussion of scattering theory, we supposed we had a beam of


particles from some ensemble of wave packets, and obtained an “effec-
tive” (observed) differential cross-section:
Z Z
σeff (u) = f (α)dα d2 (x)P (µ; ∝; x)
{α} |x|≤R

This formula assumed that the beam particles were distributed uni-
formly in a disk of radius R centered at the origin in the ê1 − ê2 plane,

45
and that the distribution of the shape parameter was uncorrelated with
position in this disk.

(a) Try to obtain an expression for σeff (u) without making these as-
sumptions.
(b) Using part (a), write down an expression for σeff (u) appropriate
to the case where the beam particles are distributed according to
a Gaussian of standard deviation ρ in radial distance from the
origin (in the ê2 − ê3 plane), and where the wave packets are also
drawn from a Gaussian distribution in the expectation value of
the magnitude of the momentum. Let the standard deviation of
this momentum distribution be α = α(x), for beam position x.
(c) For your generalized result of part (a), try to repeat our limiting
case argument to obtain the “fundamental” cross section. Discuss.

5. Let us briefly consider the consequences of reflection invariance (parity


conservation) for the scattering of a particle with spin s on a spinless
target. [We consider elastic scattering only here]. Thus, assume the
interaction is reflection invariant:

(a) How does the S matrix transform under parity, i.e., what is P −1 SP ,
where P is the parity operator?
(b) What is the condition on the helicity amplitudes Ajλµ (pi ) (corre-
sponding to scattering with total angular momentum j) imposed
by parity conservation?
(c) What condition is imposed on the orbital angular momentum am-
j
plitudes B``0 (pi )? You may use “physical intuition” if you like,

but it should be convincing. In any event, be sure your answer


makes intuitive sense.

6. We consider the resonant scattering of light by an atom. In particular,


let us consider sodium, with 2 P1/2 ↔2 S1/2 resonance at λ = λ0 =
5986Å. Let σ0T be the total cross section at resonance, for a mono-
chromatic light source (i.e., σ0T is the “fundamental” cross section).

(a) Ignoring spin, estimate σ0T , first in terms of λ0 /2π, and then
numerically in cm2 . Compare your answer with a typical atomic
size.

46
(b) Suppose that we have a sodium lamp source with a line width
governed by the mean life of the excited 2 P1/2 state (maybe not
easy to get this piece of equipment!). The mean life of this state
is about 10−8 second. Suppose that this light is incident on an ab-
sorption cell, containing sodium vapor and an inert (non-resonant)
buffer gas. Let the temperature of the gas in the absorption cell
be 200◦ C. Obtain an expression for the effective total cross sec-
tion, σeffT which an atom in the cell presents to the incident light.
Again, make a numerical calculation in cm2 .
(c) Using your result above, find the number density of Na atoms (#
of atoms/cm3 ) which is required in the cell in order that intensity
of the incident light is reduced by a factor of two in a distance
of 1 cm. It should be noted (and your answer should be plausible
here) that such a gas will be essentially completely transparent to
light of other (non-resonant) wavelengths.

7. Consider scattering from the simple potential:



V (x) = V0 r = |x| < R
0 r > R.
In the low energy limit, we might only look at S-wave ` = 0 scatter-
ing. However, in the high energy limit, we expect scattering in other
partial waves to become significant. For simplicity, let us here consider
scattering on a hard sphere, V0 → ∞.

(a) For a hard sphere potential, calculate the total cross section in
partial wave `. Give the exact result, i.e., don’t take the high
energy limit yet. You may quote your answer in terms of the
spherical Bessel functions.
(b) Find a simple expression for the phase shift δ` in the high energy
limit (kR  `). Keep terms up to O(1) in your result.
(c) Determine the total cross section (including all partial waves) in
the high energy limit, kR → ∞. [This is the only somewhat tricky
part of this problem to calculate. One approach is as follows:
Write down the total cross section in terms of your results for part
(a). Then, for fixed k, consider which values of ` may be important
in the sum. Neglect the other values of `, and make the high energy

47
approximation to your part (a) result. Finally, evaluate the sum,
either directly, or by turning it into an appropriate integral.]

8. Consider the graph in Fig. 10.

Phase
Shift
(degrees)

160

δ1

120

80

40
δ0

0
0 50 100 150 200
Tπ (MeV)

Figure 10: Made-up graph of phase shifts δ0 and δ1 for elastic π + p scattering
(neglecting spin).

Assume that the other phase shifts are negligible (e.g., “low energy”
is reasonably accurate). The pion mass and energy here are suffi-
ciently small that we can at least entertain the approximation of an
infinitely heavy proton at rest – we’ll assume this to be the case,
in any event.
q Note that Tπ is the relativistic kinetic energy of the
+
π : Tπ = Pπ + m2π − mπ .
2

(a) Is the π + p force principally attractive or repulsive (as shown in


this figure)?
(b) Plot the total cross section in mb (millibarns) as a function of
energy, from Tπ =40 to 200 MeV.

48
(c) Plot the angular distribution of the scattered π + at energies of
120, 140 and 160 MeV.
(d) What is the mean free path of 140 MeV pions in a liquid hydrogen
target, with these “protons”?

9. We now start to consider the possibility of “inelastic scatting”. For


example, let us suppose there is a “multiplet” of N non-identical parti-
cles, all of mass m. We consider scattering on a spherically symmetric
center-of-force, with the properly that the interaction can change a
particle from one number of the multiplet to another member. We
may in this case express the scattering amplitude by fαβ (k; cos θ), with
α, β = 1, . . . , N, corresponding to a unitary S-matrix on Hilbert space
L2 (R3 ) ⊗ VN :
i
Sαβ (kf ; ki ) = δαβ δ (3) (kf − ki ) + δ(kf − ki )fαβ (kf ; ki )
2πki
β is here to be interpreted as identifying the initial particle, and α the
final particle. The generalization of our partial wave expansion to this
situation is clearly:

1 X (`)
fαβ (k; cos θ) = (2` + 1)(Aαβ − δαβ )P`(cos θ)
2ik `=0

Where A(`) is an N × N unitary matrix:

A(`) = exp[2i4` (k)]

with 4` an N × N Hermitian “phase shift matrix” note that fαα is the


elastic scattering amplitude for particle α.

(a) Find expressions, in terms of A(`) (k), for the following total cross
sections, for an incident particle α: (integrated over angles)
el
i. σαTOT , the total elastic cross section
inel
ii. σαTOT , the total inelastic cross section (sometimes called the
“reaction” cross section).
iii. σαTOT , the total cross section.
(b) Try to give the generalization of the optical theorem for this scat-
tering of particles in a multiplet.

49
10. In the previous problem you considered the scattering of particles in a
multiplet. You determined the total elastic (sometimes called “scatter-
ing”) cross section and the total inelastic (“reaction”) cross sections in
(`)
terms of the Aαβ matrix in the partial wave expansion. Consider now
the graph in Fig. 11.

4
el(l) Not
k2 σ
α TOT Allowed
π (2l+1)

Allowed

0
0 1
k 2 σ inel(l)
α TOT
π (2l+1)

Figure 11: The allowed and forbidden regions for possible elastic and inelastic
cross sections for the scattering of particles in a multiplet.

This graph purports to show the allowed and forbidden regions for the
total elastic and inelastic cross sections in a given partial wave `. Derive
the formula for the allowed region of this graph. Make sure to check
the extreme points.

50
11. In the angular distribution section, we discussed the transformation
between two different types of “helicity bases”. In particular, we con-
sidered a system of two particles, with spins j1 and j2 , in their CM
frame.
One basis is the “spherical helicity basis”, with vectors of the form:
|j, m, λ1 , λ2 i, (282)
where j is the total angular momentum, m is the total angular mo-
mentum projection along the 3-axis, and λ1 , λ2 are the helicities of the
two particles. We assumed a normalization of these basis vectors such
that:
hj 0 , m0 , λ01 , λ02 |j, m, λ1 , λ2 i = δjj 0 δmm0 δλ1 λ01 δλ2 λ02 . (283)

The other basis is the “plane-wave helicity basis”, with vectors of the
form:
|θ, φ, λ1 , λ2 i, (284)
where θ and φ are the spherical polar angles of the direction of particle
one. We did not specify a normalization for these basis vectors, but an
obvious (and conventional) choice is:
hθ 0 , φ0 , λ01 , λ02 |θ, φ, λ1 , λ2 i = δ (2) (Ω0 − Ω)δλ1 λ01 δλ2 λ02 , (285)

where d(2) Ω refers to the element of solid angle for particle one.
In the section on angular distributions, we obtained the result for the
transformation between these bases in the form:
X
j
|θ, φ, λ1 , λ2 i = cj |j, m, λ1 , λ2 iDmα (φ, θ, −φ), (286)
j,m

where α ≡ λ1 − λ2 . Determine the numbers cj .

References
[1] Eyvind H. Wichmann, “Scattering of Wave Packets”, American Journal
of Physics, 33 (1965) 20-31.
[2] M. Jacob and G. C. Wick, “On the General Theory of Collisions for
Particles with Spin”, Annals of Physics, 7 (1959) 404.

51
Physics 125
Course Notes
Scattering
Solutions to Problems
040416 F. Porter

1 Exercises
1. Show that the total cross section we computed in the partial wave
expansion,

4π X
σT (p) = 2 (2j + 1) sin2 δj (p), (1)
p j=0
is in agreement with the optical theorem.
Solution: Starting with the optical theorem,

σT (p) = =f (p; 1) (2)
p
4π 1 X ∞ h i
= = (2j + 1) e2iδj (p) − 1 Pj (1) (3)
p 2ip j=0
∞   h i
4π X 1 2iδj (p)
= (2j + 1) − < e − 1 (4)
p2 j=0 2

4π X
= (2j + 1) [cos 2δj (p) − 1] (5)
p2 j=0

4π X
= (2j + 1) sin2 δj (p). (6)
p2 j=0

2. We have discussed the “central force problem”. Consider a particle of


mass m under the influence of the following potential:

V0 , 0 ≤ r ≤ a
V (r) = (7)
0, a < r,
where V0 is a constant.

(a) Write down the Schrödinger equation for the wave function ψ(xx).
Consider solutions which are simultaneous eigenvectors of H, L2 ,
and Lz . Solve the angular dependence, and reduce the remaining

1
problem to a problem in one variable. [You’ve done this already
first quarter, so you may simply retrieve that result here.]
Solution: The Schrödinger equation is
 
1 2
− ∇ + V (r) ψ(x) = Eψ(x). (8)
2m
The wave function for a state of definite L2 = `(`+1), and Lz = M
is Rε` (r)Y`M (θ, φ. The radial wave equation is:
" #
`(` + 1)
χ00` + k 2 − − k02 χ` = 0, r < a, (9)
r2
" #
00 2 `(` + 1)
χ` + k − χ` = 0, r > a, (10)
r2
where χ` = rR` (suppressing the radial index ε), k 2 = 2mE, and
k02 = 2mV0 .
(b) Let E be the eigenvalue of the Hamiltonian, H. Consider the case
where E > V0 . Solve the Schrödinger equation for eigenstates
ψ(xx). It will probably be convenient to use the quantity κ =
q
2m(E − V0 ). Consider the limit as r → ∞ for your solutions,
and give an interpretation in terms of spherical waves.
q
Solution: Let’s use K = 2m(V0 − E), and do parts (b) and (c)
together (hence K = iκ in part (b)). For r < a we need a solution
for the wave function which is finite at r = 0, and for r > a we
need something finite at r = ∞:

(
X A` j` (iKr), r < a,
`
ψ(x) = i (2`+1)P`(cos θ) α (1) (11)
`=0
j` (kr) + 2 h (kr), r > a.

The constants A` and α` may be determined by satisfying the


continuity conditions at r = a:
lim ψ(x) =
r→a−
lim ψ(x),
r→a+
(12)
∂ψ(x) ∂ψ(x)
lim = lim . (13)
r→a− ∂r r→a+ ∂r
The result is:
(1)
j` (ka) + α2 h` (ka)
A` = , (14)
j` (iKa)

2
and
L` j` (ka) − kaj`0 (ka)
α` = −2 (1) (1)0
, (15)
L` h` (ka) − kah` (ka)
where
j`0 (iKa)
L` = iKa . (16)
j` (iKa)
Asymptotically,

sin(x − `π/2)
j` (x) ∼x→∞ , (17)
x
(1) 1 eix
h` (x) ∼x→∞ . (18)
i`+1 x
See the discussion in section 9 for further interpretation.
(c) Repeat the solution for the case whereq
E < V0 . It will probably be
convenient to use the quantity K = 2m(V0 − E). Again, con-
sider the limit as r → ∞ and give an interpretation, contrasting
with the previous case.
Solution: See part (b).

Hint: You will probably benefit by thinking about solutions in the


form of spherical Bessel/Neumann functions, and/or spherical Hankel
functions.

3. When we calculated the density of states for a free particle, we used a


“box” of length L (in one dimension), and imposed periodic boundary
conditions to ensure no net flux of particles into or out of the box.
We have in mind, of course, that we can eventually let L → ∞, and
are really interested in quantities per unit length (or volume). Let us
justify more carefully the use of periodic boundary conditions, i.e., we
wish to convince ourselves that the intuitive rationale given above is
correct. To do this, consider a free particle in a one-dimensional “box”
from −L/2 to L/2. Remembering that the Hilbert space of allowed
states is a linear space, show that the periodic boundary condition:

ψ(−L/2) = ψ(L/2), (19)


ψ 0 (−L/2) = ψ 0 (L/2) (20)

3
is required for acceptable wave functions. “Acceptable” here means
that the probability to find a particle in the box must be constant.
Solution: The Schrödinger equation for a free particle is
1 2
−i∂t ψ(x, t) = −∂ ψ(x, t). (21)
2m x
We suppose that an “acceptable” wave function is one which has a
constant probability to be in the “box” (−L/2, L/2):
d Z L/2
|ψ(x, t)|2 dx = 0. (22)
dt −L/2
It is readily verified that the function
2π 2 2π
φ(x, t) = ei mL2 t sin x (23)
L
has the desired property.
If we admit φ(x, t) as an acceptable solution, and if ψ(x, t) is any other
acceptable solution, then φ + ψ must be acceptable, since any linear
combination of acceptable solutions must be acceptable. Hence, we
must have:
d Z L/2
|ψ(x, t)|2 dx = 0; (24)
dt −L/2
d Z L/2
|φ(x, t)|2 dx = 0; (25)
dt −L/2
d Z L/2
|ψ(x, t) + φ(x, t)|2 dx = 0. (26)
dt −L/2
Then we may write (assuming Eqns 24 and 25):
d Z L/2
0 = [ψ(x, t)φ∗ (x, t) + ψ ∗ (x, t)φ(x, t)] dx (27)
dt −L/2
Z L/2
= ∂t [ψ(x, t)φ∗ (x, t) + ψ ∗ (x, t)φ(x, t)] dx (28)
−L/2
i Z L/2 h 2  ∗       i
= ∂x ψ φ − ψ ∂x2 φ + ψ ∗ ∂x2 φ − ∂x2 ψ ∗ φ dx(29)
2m −L/2
Z L/2
= ∂x [(∂x ψ) φ∗ − ψ (∂x φ) + ψ ∗ (∂x φ) − (∂x ψ ∗ ) φ] dx (30)
−L/2
L/2
= [(∂x ψ) φ∗ − ψ (∂x φ) + ψ ∗ (∂x φ) − (∂x ψ ∗ ) φ]−L/2 . (31)

4
But φ(±L/2, t) = 0, so
L/2
0 = [−ψ(∂x φ∗ ) + ψ ∗ (∂x φ)]−L/2 . (32)
Further, since
2π i 2π22 t
∂x φ(±L/2, t) = − e mL , (33)
L
we obtain
2π 2 2π 2 2π 2 2π 2
0 = ψ(L/2, t)e−i mL2 t −ψ(−L/2, t)e−i mL2 t +ψ ∗ (L/2, t)ei mL2 t −ψ ∗ (−L/2, t)ei mL2 t .
(34)

This must be true for all times; also if ψ is acceptable, then e ψ must
be acceptable, for real θ. Hence, ψ is acceptable if and only if Eqn. 24
holds, and:
ψ(L/2, t) = ψ(−L/2, t). (35)
2π 2
We note that the function ei mL2 t cos 2π
L
x satisfies these criteria. Thus,
we could also have picked
2π 2 2π
φ(x, t) = ei mL2 t cos x (36)
L
as an acceptable solution. Then the same argument reveals that any
other acceptable solution ψ must satisfy the boundary condition:
∂x ψ(L/2, t) = ∂x ψ(−L/2, t). (37)
 2 2π 2 2 2

in t n 2π
We finally remark that the set of functions e mL2 sin 2πn
L
x, ei mL2 t cos 2πn
L
x; n = 0, 1, . . .
is a complete set of functions with the required boundary conditions.
4. In our discussion of scattering theory, we supposed we had a beam of
particles from some ensemble of wave packets, and obtained an “effec-
tive” (observed) differential cross-section:
Z Z
σeff (u) = f (α)dα d2 (x)P (µ; ∝; x)
{α} |x|≤R

This formula assumed that the beam particles were distributed uni-
formly in a disk of radius R centered at the origin in the ê1 − ê2 plane,
and that the distribution of the shape parameter was uncorrelated with
position in this disk.

5
(a) Try to obtain an expression for σeff (u) without making these as-
sumptions.
Solution: We start with Eqn. 33 from the note:
Z Z
P (u; α; x) = d3 (q) d3 (q0 )q 2 δ(q − q 0 ) (38)
(∞) (∞)
0
T (qu, q)T ∗ (q 0 u, q0 )φ0 (q; α)φ∗0 (q0 ; α)e−ix·(q−q(39)
)
,

In general, if f (α, x) describes the beam position and shape dis-


tribution (possibly correlated), with
Z Z
f (α, x)dαd2 (x) = 1, (40)
{α} (∞))

then the effective differential cross section is:


Z Z
dσeff (u)
= Aeff f (α, x)P (u; α; x)dαd2(x), (41)
dω {α} (∞))

where Aeff is an “effective” area of the beam.


The effective area of the beam may be computed by requiring that
we get a consistent answerqfor a small “hard” target, of area a. In
this case, P = 1 for |x| < a/π. Thus,
Z Z
a = Aeff √ f (α, x)dαd2 (x). (42)
{α} |x|< a/π

We want this equality to hold in the limit as a → 0:


a
Aeff = lim √ . (43)
a→0 R R a/π R 2π
{α} dα 0 rdr 0 f (α, x)

(b) Using part (a), write down an expression for σeff (u) appropriate
to the case where the beam particles are distributed according to
a Gaussian of standard deviation ρ in radial distance from the
origin (in the ê2 − ê3 plane), and where the wave packets are also
drawn from a Gaussian distribution in the expectation value of
the magnitude of the momentum. Let the standard deviation of
this momentum distribution be α = α(x), for beam position x.

6
Solution: We have a beam distribution:
1 −r2 /2ρ2 1 2 2
f (p, x) = 2
e √ e−(p−p0 ) /2α (x) . (44)
2πρ 2πα(x)
The effective area is:
a
Aeff = lim √
a→0 R a/π R 2π R∞ 1
rdr dφ dp 2πρ −r 2 /2ρ2 √ 1 e−(p−p0 )2 /2α2 (x)
0 0 −∞ 2e
2πα(x)
a
= lim √
a→0 R a/π 1
rdr2π 2πρ −r 2 /2ρ2
0 2e

2a
= lim √
a→0 R a/π
0 dr2 ρ12 e−r2 /2ρ2
2a
= lim R a/π
a→0 dr2 ρ12 e−r2 /2ρ2
0
a
= lim 2 2
a→0 1 − e−a /2πρ

= 2πρ2 . (45)

Thus,
Z Z
dσeff (u) 2
= 2πρ d (x) dpf (p, x)P (u; p; x)d2(x).
2
(46)
dω (∞))

(c) For your generalized result of part (a), try to repeat our limiting
case argument to obtain the “fundamental” cross section. Discuss.
Solution: The limiting case corresponds to the beam being spread
out over a size large compared with the target, and with a sharply
defined momentum. The same arguments as in the note will hence
apply.

5. Let us briefly consider the consequences of reflection invariance (parity


conservation) for the scattering of a particle with spin s on a spinless
target. [We consider elastic scattering only here]. Thus, assume the
interaction is reflection invariant:

(a) How does the S matrix transform under parity, i.e., what is P −1 SP ,
where P is the parity operator?

7
Solution: If the interaction is invariant under reflection, then

P −1 SP = S. (47)

(b) What is the condition on the helicity amplitudes Ajλµ (pi ) (corre-
sponding to scattering with total angular momentum j) imposed
by parity conservation?
Solution: Under parity, the helicity λ reverses sign to −λ. Hence,

P |p; jλi = (−)ηintrinsic |p; jλi. (48)

The helicity amplitude is:

Ajλµ (pi ) = hpi ; jm, λ|S|pi ; jm, µi (49)

Under parity,
Ajλµ (pi ) → Aj−λ−µ (pi ) (50)
Thus, parity conservation requires

Ajλµ (pi ) = Aj−λ−µ (pi ) (51)

(c) What condition is imposed on the orbital angular momentum am-


j
plitudes B``0 (pi )? You may use “physical intuition” if you like,

but it should be convincing. In any event, be sure your answer


makes intuitive sense.
Solution: Under a parity transformation, a wave function corre-
sponding to definite orbital angular momentum ` transforms as

P ψ` (x) = (−)` ψ` (x). (52)

The orbital angular momentum amplitude is:


j
B``0 (pi ) = hpi ; jm, `|S|pi ; jm, `0 i (53)
= hpi ; jm, `|P † P SP †P |pi ; jm, `0 i (54)
0
= (−)`−` hpi ; jm, `|S|pi ; jm, `0 i. (55)
j 0
If parity is conserved, then B``0 (pi ) = 0 if ` − ` is odd.

8
6. We consider the resonant scattering of light by an atom. In particular,
let us consider sodium, with 2 P1/2 ↔2 S1/2 resonance at λ = λ0 =
5986Å. Let σ0T be the total cross section at resonance, for a mono-
chromatic light source (i.e., σ0T is the “fundamental” cross section).

(a) Ignoring spin, estimate σ0T , first in terms of λ0 /2π, and then
numerically in cm2 . Compare your answer with a typical atomic
size.
Solution: The wavelength and photon momentum are related by
λ = 2π/p, or λ̄ = λ/2π = 1/p. The total cross section on a
resonance in partial wave ` is:

4π Γ2 /4
σ`T (E) = (2` + 1) . (56)
p2 (E − E0 )2 + Γ2 /4

The wavelength here is much larger than the atom, so we presume


this to be an S-wave resonance (as suggested by the “0” in the
problem statement). Hence, the cross section at the resonance
peak is:

σ0T (E0 ) = = λ2 /π = 4πλ̄2 (57)
p2
= 1.14 × 10−9 cm2 . (58)

A typical atomic size is of order Å2 , or ∼ 10−15 cm2 , which is much


smaller than this resonant cross section.
(b) Suppose that we have a sodium lamp source with a line width
governed by the mean life of the excited 2 P1/2 state (maybe not
easy to get this piece of equipment!). The mean life of this state
is about 10−8 second. Suppose that this light is incident on an ab-
sorption cell, containing sodium vapor and an inert (non-resonant)
buffer gas. Let the temperature of the gas in the absorption cell
be 200◦ C. Obtain an expression for the effective total cross sec-
tion, σeffT which an atom in the cell presents to the incident light.
Again, make a numerical calculation in cm2 .
Solution: The line width of our light source is

Γ ≈ 1/10−8 = 108 Hz. (59)

9
The sodium atoms will move thermally according to a Maxwell-
Boltzmann distribution:

p(E) = Ae−E/kB T (60)

The Doppler broadening of the absorption line in the sodium vapor


cell, due to thermal motion, is

∆ν = 2ν0 (2kB T ln 2/m)1/2 (61)


= 1.5 × 109 Hz. (62)

The atoms thus see a gaussianly distributed line-width:

2(ln 2)1/2 −4 ln 2(ν−ν0 )2 /∆ν02


f (ν) = √ e . (63)
π∆ν0
In principle, we take a convolution of this with the resonant line
shape. However, the gaussian is relatively wide, so we may ap-
proximate it with its value at ν0 :

σeffT ≈ 2
f (ν0 )Γ ≈ 6.6 × 10−11 cm2 . (64)
k
(c) Using your result above, find the number density of Na atoms (#
of atoms/cm3 ) which is required in the cell in order that intensity
of the incident light is reduced by a factor of two in a distance
of 1 cm. It should be noted (and your answer should be plausible
here) that such a gas will be essentially completely transparent to
light of other (non-resonant) wavelengths.
Solution: The attenuation of the light is exponential:
−N σ
I(L) = I(0)e eff L , (65)

where N is the number density. We want the density such that


I(1 cm)/I(0) = 1/2:
ln 2
N= . (66)
σeff L
7. Consider scattering from the simple potential:

V (x) = V0 r = |x| < R
0 r > R.

10
In the low energy limit, we might only look at S-wave ` = 0 scatter-
ing. However, in the high energy limit, we expect scattering in other
partial waves to become significant. For simplicity, let us here consider
scattering on a hard sphere, V0 → ∞.

(a) For a hard sphere potential, calculate the total cross section in
partial wave `. Give the exact result, i.e., don’t take the high
energy limit yet. You may quote your answer in terms of the
spherical Bessel functions.
Solution: The total cross section is

X
σT = σl , (67)
`=0

where σ` is the total cross section in partial wave `.


The problem has azimuthal symmetry, taking the incident wave
to be along the z axis, so the angular solutions may be expressed
in terms of the Legendre polynomials. The radial wave equation,
for r > R, may be expressed as
" #
00 `(` + 1)
2
χ + k − χ` = 0. (68)
r2
The solution to the Schrödinger equation for r > R is:

X  
` 1 (1)
ψ(x) = i (2` + 1) j` (kr) + α` h` (kr) P` (cos θ). (69)
`=0 2

If we have a hard sphere potential, then the boundary condition


is that ψ(r = R, Ω) = 0. Hence,
1 (1)
j` (kR) + α` h` (kR) = 0, ` = 0, 1, . . . (70)
2
Therefore, for the hard sphere potential,
j` (kR)
α` (k) = e2iδ` − 1 = −2 (1)
. (71)
h` (kR)

For ` = 0, this reduces to α0 = e−2ikR − 1, or δ0 = −kR. This is


also the result for ` = 1.

11
The total cross section in partial wave ` is

σ` = 2
(2` + 1) sin2 δ` (72)
k

4π e2iδ` − 1 2

= (2` + 1) (73)
k 2 2i

= 2
(2` + 1)|α`/2|2 (74)
k

4π j (kR) 2
`
= (2` + 1) (1) (75)
k 2 h` (kR)
4π [j` (kR)]2
= (2` + 1) , (76)
k2 [j` (kR)]2 + [n` (kR)]2
(1)
where, in the final step we have used h` (x) = j` (x) + in` (x).
(b) Find a simple expression for the phase shift δ` in the high energy
limit (kR  `). Keep terms up to O(1) in your result.
Solution: At high energies, letting x = kR:

j` (kR)
α` (k) = e2iδ` − 1 = −2 (1)
(77)
h` (kR)
 
π π −i(x−` π2 − π2 )
≈ −2 cos x − ` − e (78)
2 2
π π
= e−2i(x−` 2 − 2 ) − 1. (79)

Thus, for kR  `,
π π
δ` = −kR + ` + . (80)
2 2

(c) Determine the total cross section (including all partial waves) in
the high energy limit, kR → ∞. [This is the only somewhat tricky
part of this problem to calculate. One approach is as follows:
Write down the total cross section in terms of your results for part
(a). Then, for fixed k, consider which values of ` may be important
in the sum. Neglect the other values of `, and make the high energy
approximation to your part (a) result. Finally, evaluate the sum,
either directly, or by turning it into an appropriate integral.]

12
Solution: We must evaluate:
2 1 X ∞
[j` (x)]2
σT = 4πR lim (2` + 1) . (81)
x=kR→∞ x2
`=0 [j` (x)]2 + [n` (x)]2
We make use of the following facts:
r
π
j` (x) = J 1 (x) (82)
r
2x `+ 2
π
n` (x) = Y 1 (x) (83)
r
2x `+ 2
(1) π (1)
h` (x) = H 1 (x) (84)
s
2x `+ 2
2 π π
Jν (x) ∼ cos(x − ν − ) (85)
πx 2 4
s
2 i(x−ν π − π )
Hν(1) (x) ∼ e 2 4 . (86)
πx
Thus, with ν = ` + 1/2,

j (x) 2 cos(x − ` π − π ) 2
` 2 2
(1) ∼ π π (87)
h (x) ei(x−` 2 − 2 )
`
π π
∼ cos2 (x − ` − ) (88)
2 2
2 π
∼ sin (x − ` ), for x  `. (89)
2
We further note that, for fixed x, j` (x) approaches zero for large
`, and h` (x) approaches infinity for large `. Let us argue that
we may cut off the sum at ` = kR on physical grounds: At high
energy, 1/k  R. Now ` ∼ kr, since ` is the orbital angular mo-
mentum quantum number. If ` > kR, then r > R, and the short
wavelength beam misses the target, hence there is no contribution
to the scattering cross section. Thus, in the high energy limit, for
scattering on a hard sphere:
x
1 X π
σT = 4πR2 lim 2
(2` + 1)sin2 (x − ` ). (90)
x=kR→∞ x 2
`=0

Now (
2 π sin2 x, ` even,
sin (x − ` ) = (91)
2 cos2 x, ` odd.

13
We evaluate the sum in this limit:
x
1 X 2 π
(2` + 1)sin (x − ` ) = sin2 x + cos2 x + 2 sin2 x + 2 cos2 x
x2 `=0 2
+3 sin2 x + 3 cos2 x + . . . (92)
Xx
1
= ` = x(x + 1). (93)
`=0 2
Hence, !
kR
4π X 4π k 2 R2
σT ∼ 2 `= 2 = 2πR2 . (94)
k `=0 k 2
8. Consider the graph in Fig. 1.
Phase
Shift
(degrees)

160

δ1

120

80

40
δ0

0
0 50 100 150 200
Tπ (MeV)

Figure 1: Made-up graph of phase shifts δ0 and δ1 for elastic π + p scattering


(neglecting spin).

Assume that the other phase shifts are negligible (e.g., “low energy”
is reasonably accurate). The pion mass and energy here are suffi-
ciently small that we can at least entertain the approximation of an

14
infinitely heavy proton at rest – we’ll assume this to be the case,
in any event.
q Note that Tπ is the relativistic kinetic energy of the
+
π : Tπ = Pπ2 + m2π − mπ .

(a) Is the π + p force principally attractive or repulsive (as shown in


this figure)?
Solution: The phase shifts are positive, indicating a dominantly
attractive potential.
(b) Plot the total cross section in mb (millibarns) as a function of
energy, from Tπ =40 to 200 MeV.
Solution: The total cross section in terms of the partial wave
phase shifts is:

4π X
σT = (2` + 1) sin2 δ` (95)
k 2 `=0

= (sin2 δ0 + 3 sin2 δ1 ). (96)
k2
q
The kinetic energy Tπ is related to k by Tπ = m2π + k2 − mπ , or
q
k= T (T + 2mπ ). (97)
To convert to millibarns, we multiply by:
1 = (197 MeV-fm)2 10 mb/fm2 = 3.88 × 105 MeV2 mb. (98)
Figure 2 shows the result.
(c) Plot the angular distribution of the scattered π + at energies of
120, 140 and 160 MeV.
Solution:

dσ 1 X ∞ h i
= | (2j + 1) e2iδj (k) − 1 Pj (cos θ)|2 (99)
dΩ 2ik j=0
1 2iδ0 (k)
= |e − 1 + 3(e2iδj (k) − 1) cos θ|2 (100)
4k2
1 n 2 2
o
= [cos δ0 − 1 + 3(cos δ 1 − 1) cos θ] + [sin δ0 + 3 sin δ1 cos θ] .
4k2
The result is shown in Fig. 3.

15
300

250

(mb)
200

150
σT

100

50

0
0 50 100 150 200 250
Tπ (MeV)

Figure 2: Total (made-up) cross section for elastic π + p scattering (neglecting


spin).

(d) What is the mean free path of 140 MeV pions in a liquid hydrogen
target, with these “protons”?
Solution: The cross section for 140 MeV pions is ∼ 260 mb. The
density of liquid hydrogen is 0.0708 g/cm3 . The number density
is ρ == 4.2 × 102 8 m−3 . The mean free path is thus
1
λ= = 0.9 m. (101)
σT ρ
9. We now start to consider the possibility of “inelastic scatting”. For
example, let us suppose there is a “multiplet” of N non-identical parti-
cles, all of mass m. We consider scattering on a spherically symmetric
center-of-force, with the properly that the interaction can change a
particle from one number of the multiplet to another member. We
may in this case express the scattering amplitude by fαβ (k; cos θ), with
α, β = 1, . . . , N, corresponding to a unitary S-matrix on Hilbert space
L2 (R3 ) ⊗ VN :
i
Sαβ (kf ; ki ) = δαβ δ (3) (kf − ki ) + δ(kf − ki )fαβ (kf ; ki )
2πki

16
60

120
140
50 160

dcos θ

40

30

20

10

0
-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1

cos θ

Figure 3: Differential (made-up) cross section for elastic π + p scattering (ne-


glecting spin), at three values of Tπ .

β is here to be interpreted as identifying the initial particle, and α the


final particle. The generalization of our partial wave expansion to this
situation is clearly:

1 X (`)
fαβ (k; cos θ) = (2` + 1)(Aαβ − δαβ )P`(cos θ)
2ik `=0

Where A(`) is an N × N unitary matrix:

A(`) = exp[2i4` (k)]

with 4` an N × N Hermitian “phase shift matrix” note that fαα is the


elastic scattering amplitude for particle α.

(a) Find expressions, in terms of A(`) (k), for the following total cross
sections, for an incident particle α: (integrated over angles)

17
el
i. σαTOT , the total elastic cross section
Solution: We’ll use
Z 1
2
dxP` (x)P`0 (x) = δ``0 . (102)
−1 2` + 1
Z 1
el
σαTOT = 2π d cos θ|fαα (k; cos θ)|2 (103)
−1
Z
π X 2
1
= 2
(2` + 1) dx|P`(x)|2 |A(`) 2
αα − 1| (104)
2k ` −1
π X
= 2 (2` + 1)|A(`) 2
αα − 1| . (105)
k `
inel
ii. σαTOT , the total inelastic cross section (sometimes called the
“reaction” cross section).
Solution:
XZ 1
inel
σαTOT = 2π d cos θ|fβα (k; cos θ)|2 (106)
β6=α −1
π XX (`)
= (2` + 1)|Aβα |2 . (107)
k 2 β6=α `

iii. σαTOT , the total cross section.


Solution:
el inel
σαTOT = σαTOT + σαTOT (108)
π XX (`)
= 2 (2` + 1)|Aβα − δαβ |2 . (109)
k β `

(b) Try to give the generalization of the optical theorem for this scat-
tering of particles in a multiplet.
Solution: Start with the unitarity os the S matrix:
XZ †
d3 (q)Sαβ (p0 , q)Sγβ (q, p00 ) = δαγ δ (3) (p0 − p00 ). (110)
β (∞)

Substitute in our form for S in terms of the scattering amplitude


f , and arrive at:
X p
=fαγ (p; 1) = σ . (111)
γ 4π αTOT

18
10. In the previous problem you considered the scattering of particles in a
multiplet. You determined the total elastic (sometimes called “scatter-
ing”) cross section and the total inelastic (“reaction”) cross sections in
(`)
terms of the Aαβ matrix in the partial wave expansion. Consider now
the graph in Fig. 4.

4
el(l) Not
k2 σ
α TOT Allowed
π (2l+1)

Allowed

0
0 1
k 2 σ inel(l)
α TOT
π (2l+1)

Figure 4: The allowed and forbidden regions for possible elastic and inelastic
cross sections for the scattering of particles in a multiplet.

This graph purports to show the allowed and forbidden regions for the
total elastic and inelastic cross sections in a given partial wave `. Derive
the formula for the allowed region of this graph. Make sure to check
the extreme points.

19
Solution: For simplicity, let the vertical axis be v, and the horizontal
axis u:
inel(`) el(`)
k 2 σαTOT k2 σαTOT
u= ;v = . (112)
π(2` + 1) π(2` + 1)
From the solution to the previous problem, and unitarity of the A(`)
matrix, we thus have
X (`)
u = |Aβα |2 = 1 − |A(`) 2
αα | , (113)
β6=α

v = |A(`) 2 (`) 2 (`)


αα − 1| = 1 + |Aαα | − 2<Aαα . (114)
The constraint imposed by unitarity is that |A(`) 2 (`) iθ
αα | ≤ 1. Let Aαα = re .
Then r ≤ 1 and 0 ≤ θ < 2π gives the allowed region. In terms of the
plotted quantities, u = 1 − r 2 and v = 1 + r 2 − 2r cos θ. Thus
0 ≤ u ≤ 1, (115)
and for given u, v must be in the range
(1 − r)2 ≤ y ≤ (1 + r)2 , (116)

where r = 1 − x. If r = 0 then (u, v) = (1, 1). If r = 1 then u = 0
and 0 ≤ v ≤ 4.
11. In the angular distribution section, we discussed the transformation
between two different types of “helicity bases”. In particular, we con-
sidered a system of two particles, with spins j1 and j2 , in their CM
frame.
One basis is the “spherical helicity basis”, with vectors of the form:
|j, m, λ1 , λ2 i, (117)
where j is the total angular momentum, m is the total angular mo-
mentum projection along the 3-axis, and λ1 , λ2 are the helicities of the
two particles. We assumed a normalization of these basis vectors such
that:
hj 0 , m0 , λ01 , λ02 |j, m, λ1 , λ2 i = δjj 0 δmm0 δλ1 λ01 δλ2 λ02 . (118)
The other basis is the “plane-wave helicity basis”, with vectors of the
form:
|θ, φ, λ1 , λ2 i, (119)

20
where θ and φ are the spherical polar angles of the direction of particle
one. We did not specify a normalization for these basis vectors, but an
obvious (and conventional) choice is:

hθ 0 , φ0 , λ01 , λ02 |θ, φ, λ1 , λ2 i = δ (2) (Ω0 − Ω)δλ1 λ01 δλ2 λ02 , (120)

where d(2) Ω refers to the element of solid angle for particle one.
In the section on angular distributions, we obtained the result for the
transformation between these bases in the form:
X
j
|θ, φ, λ1 , λ2 i = cj |j, m, λ1 , λ2 iDmα (φ, θ, −φ), (121)
j,m

where α ≡ λ1 − λ2 . Determine the numbers cj .


Solution: To select a particular cj , i.e., a particular j, let us invert
the basis transformation:
Z
j∗
dΩDmδ (φ, θ, −φ)|θ, φ, λ1 , λ2 i = (122)
4π Z
X j 0
cj 0 |j 0 , m0 , λ1 , λ2 i j∗
dΩDmα (φ, θ, −φ)Dm 0 α (φ, θ, −φ)
j 0 ,m0 4π
X Z
0
= cj 0 |j 0 , m0 , λ1 , λ2 i dΩdjmδ (θ)djm0 α (θ) exp [−i(m0 φ − αφ) + i(mφ − αφ)]
j 0 ,m0 4π

X Z 1 Z 2π
0 0
= cj 0 |j 0 , m0 , λ1 , λ2 i d cos θdjmα (θ)djm0 δ (θ) dφei(m−m )φ (123)
j 0 ,m0 −1 0

X Z 1
0 0
= 2π cj 0 |j , m, λ1 , λ2 i d cos θdjmα (θ)djmα (θ) (124)
j0 −1

X 2αjj 0
= 2π cj 0 |j 0 , m, λ1 , λ2 i (125)
j0
2j + 1

= cj |j, m, λ1 , λ2 i. (126)
2j + 1
Note that we should perhaps justify the interchange of the order of
summation and integration in the very first step above. Thus,
2j + 1 Z j∗
|j, m, λ1 , λ2 i = dΩDmα (φ, θ, −φ)|θ, φ, λ1 , λ2 i. (127)
4πbj 4π

21
Now,
1 = hj, m, λ1 , λ2 |j, m, λ1 , λ2 i (128)
" #2 Z Z
2j + 1 j∗ j
= dΩDmα (φ, θ, −φ) dΩ0 Dmδ (φ0 , θ0 , −φ0 )hθ 0 , φ0 , λ1 , λ2 |θ, φ, λ1 , λ2 i
4π|cj | 4π 4π
" #2 Z Z
2j + 1 j∗
= dΩDmα (φ, θ, −φ) dΩ0 Dmα
j
(φ0 , θ0 , −φ0 )δ(cos θ0 − cos θ)δ(φ0 − φ)
4π|cj | 4π 4π
" #2 Z
2j + 1 j∗ j
= dΩDmα (φ, θ, −φ)Dmα (φ, θ, −φ) (129)
4π|cj | 4π
" #2 Z
2j + 1 1 h i2
= 2π d cos θ djmα (θ) (130)
4π|cj | −1
" #2
4π 2j + 1
= . (131)
2j + 1 4π|bj |
Therefore, |cj |2 = (2j + 1)/4π, or picking a phase convention,
s

cj = . (132)
2j + 1
where we assume that it is all right to interchange the summation and
integration. Since each term is non-negative (and each finite), there is
no potential for cancellations. Hence, if we find convergence for one
ordering of the operations, we will for the other as well.
Note that we have used the result of Eqn. 348 of my angular momentum
notes to obtain:
Z 1 h i2 2
d cos θ djmα (θ) = . (133)
−1 2j + 1
12. In the notes we derived the optical theorem assuming that we had a
“symmetric central force”. Show that this assumption is unnecessary.
Hint: This is trivial, except for one piece of the assumption which you
will have to retain.
Solution: Start with the step prior to making the assumption in the
notes:
i δ(p0 − p00 ) 0 00 ∗ 00 0 δ(p0 − p00 ) Z
− [f (p , p ) − f (p , p )] = dΩu f (p0 , q)f ∗ (p00 , q).
2π p0 4π 2 (4π)
(134)

22
Note that we must have p0 = p00 = q ≡ p. Thus, write:
i 0 00 ∗ 00 0 1 Z
− [f (pu , pu ) − f (pu , pu )] = dΩu f (pu0 , pu)f ∗ (pu00 , pu).
p 2π (4π)
(135)
Now consider foward scattering: u00 = u0 :
Z
i 1
− [f (pu0 , pu0 ) − f ∗ (pu0 , pu0 )] = dΩu f (pu0 , pu)f ∗ (pu0 , pu).
p 2π (4π)
(136)
0 0
With the assumption that f (pu , pu) = f (pu, pu ), we immediately see
that we have once again the optical theorem:

σT (p) = =f (p; 1). (137)
p
Note that the assumption we retained was that the scattering amplitude
is invariant (up to a phase) under interchange of incoming and outgoing
directions.

References
[1] Eyvind H. Wichmann, “Scattering of Wave Packets”, American Journal
of Physics, 33 (1965) 20-31.

[2] M. Jacob and G. C. Wick, “On the General Theory of Collisions for
Particles with Spin”, Annals of Physics, 7 (1959) 404.

23
Physics 125
Course Notes
Identical Particles
040422 F. Porter

1 Introduction
We briefly summarize the issues of dealing with systems of identical particles
in this note.
A classical particle, such as a ping pong ball, may be labelled and followed
without affecting its dynamics. A collection of identical quantum mechani-
cal particles, such as electrons, cannot be similarly “labelled” and followed
without affecting their dynamics. The key distinction is encapsulated as the
“overlap of wave functions”.
A system with one electron at point x1 and spin s1 and another electron
at point x2 and spin s2 is completely indistinguishable, and hence arguably
the same as, the system with the positions and spins switched. Let us develop
this idea, for spinless particles initially.

2 Symmetry Under Interchange


Let
ψ(x1 , x2 ) = |x1 , x2 i (1)
be the wave function for one particle to be at x1 and an identical particle to
be at x2 . Let P12 be an operator which interchanges the positions of the two
particles:
P12 |x1 , x2 i = |x2 , x1 i. (2)
Note that
2
P12 |x1 , x2 i = P12 |x2 , x1 i (3)
= |x1 , x2 i. (4)
2
That is, P12 = I. The pair {P12 , I} forms an operator group of order two. It
is the simplest non-trivial “permutation” group.
Consider the Hamiltonian H for the two particles:
1
H=− (∇2 + ∇22 ) + V (x1 , t) + V (x2 , t) + U (|x1 − x2 |, t). (5)
2m 1
1
Note that I have written this in a form such that the interchange of the two
particles leaves the Hamiltonian unchanged. If this were not the case, then
the two particles would not be “identical”. Thus,

[P12 , H] = 0, (6)

and we can find simultaneous eigenfunctions of P12 and H. What are the
eigenfunctions of P12 ? Let ψλ (x1 , x2 ) be an eigenstate:

P12 ψλ (x1 , x2 ) = λψλ (x1 , x2 ) = ψλ (x2 , x1 ) (7)


2
P12 ψλ (x1 , x2 ) = λ2 ψλ (x1 , x2 ) = ψλ (x1 , x2 ). (8)

Thus, λ2 = 1, or λ = ±1.
If λ = +1, ψλ (x1 , x2 ) = ψλ (x2 , x1 ); the state is symmetric. If λ = −1,
ψλ (x1 , x2 ) = −ψλ (x2 , x1 ); the state is antisymmetric. Note that this implies
a degeneracy in the energy levels, since there are two states with the same
energy. This is referred to as an “exchange degeneracy”. However, it is exper-
imentally observed that a pair of identical particles is always in an eigenstate
of P12 , and that eigenstate depends only on the kind of particle. From our
present perspective, this has the status of a fundamental principle. However,
in quantum field theory, it is a theorem – see, for example, Feynman’s 1986
Dirac Memorial Lecture, “The Reason for Antiparticles”, Cambridge Univer-
sity Press, or, for a more mathematical treatment, Streater and Wightman’s
book, “PCT Spin&Statistics and All That”.
To describe this observation, we need to add spin to the discussion. When
we add spin, P12 has the effect (by definition):

P12 |x1 , s1 ; x2 , s2 i = |x2 , s2 ; x1 , s1 i. (9)

That is, the spin quantum numbers are also interchanged. Particles which
are symmetric under this interchange are called “bosons”. Particles which
are antisymmetric under this interchange are called “fermions”. It is found
that particles with integer spin (s=0,1,2,. . . ) are always bosons, and particles
with half-integer spin ( 12 , 32 , . . .) are always fermions. This is the celebrated
connection between spin and statistics.
An important consequence for fermions is the Pauli Exclusion Princi-
ple: Two identical fermions cannot be in the same quantum state. We may
“demonstrate” this as follows: Let {φk (x, s)|k = 1, 2, . . .} be an orthonormal

2
basis for a single particle state. Then a basis for a two particle identical
particle state is formed out of the product space:

{φk (x1 , s1 )} ⊗ {φk (x2 , s2 )} (10)

Any two-particle wave function can be written in such a basis as:


XX
ψ(x1 , s1 , x2 , s2 ) = Ajk φj (x1 , s1 )φk (x2 , s2 ). (11)
j k

If the two particles are in the same quantum state, then the coefficients must
be symmetric:
Ajk = Akj . (12)
In that case,

−ψ(x2 , s2 , x1 , s1 ) = ψ(x1 , s1 , x2 , s2 ) (13)


XX
= Ajk φj (x1 , s1 )φk (x2 , s2 ) (14)
j k
XX
= Akj φj (x1 , s1 )φk (x2 , s2 ) (15)
j k
XX
= Ajk φk (x1 , s1 )φj (x2 , s2 ) (16)
j k
= ψ(x2 , s2 , x1 , s1 ). (17)

Thus, ψ = −ψ, showing that the two particles cannot be in the same state.
The Pauli exclusion principle is crucial to atomic physics: once an energy
level in an atom is filled, any additional electrons must go into a different
level.

3 Example: Helium
Consider helium, with two electrons. The wave function of these two electrons
must be antisymmetric under interchange. We may use this to construct ap-
propriate ground state quantum numbers. As validated in class discussion,
we assume that the spin-dependent forces (“fine structure”) may be regarded
as perturbations, and look for states with the smallest orbital angular mo-
mentum, and lowest radial quantum numbers. Thus, we look for a ground

3
Symmetric (S = 1) Antisymmetric (S = 0)
| + +i
√1 (| + −i + | − +i √1 (| + −i − | − +i
2 2
| − −i

state where both electrons have ` = 0 and both are in the lowest radial state
(principal quantum numbers n1 = n2 = 1):

ψ(x1 , x2 ) ∼ R(r1 , r2 , |x1 − x2 |), (18)

where R(r1 , r2 , |x1 − x2 |) = R(r2 , r1 , |x1 − x2 |). That is, the spatial wave
function is symmetric under interchange: ψ(x1 , x2 ) = ψ(x2 , x1 ). For this
to be possible, the part of the wave function involving the spins must be
antisymmetric. Labelling the spin states by z-projections, i.e., for single-
particle states:
1
|m = + i = |+i, (19)
2
1
|m = − i = |−i, (20)
2
we have states for two electrons shown in the table above.
We conclude that the ground state of helium has S = 0, in order that
the overall wave function be antisymmetric with respect to interchange of
the two electrons. In spectroscopic (L-S) notation, the ground state is a 1 S0
state. We see that the ground state is, in fact, non-degenerate. There is only
one state, instead of the possible four spin arrangements otherwise.

4 Example: Isospin and Extended Pauli Prin-


ciple
We consider an example in nuclear physics, which will lead us to formulating
a convenient “extended Pauli principle”. Consider the deuteron, made of a
proton and a neutron. We guess that this ground state of a proton and a
neutron is S-wave (` = 0), hence the spatial part of the wave function is
symmetric under neutron-proton interchange.

4
Experimentally, we observe no corresponding nn or pp bound states,
which we might expect form the charge independence of the nuclear force.
Hence, we suspect that the corresponding nn and pp states are forbidden
by the Pauli principle. From this inference, we shall deduce the spin of the
deuteron. Consider nn or pp in an ` = 0 state. If S = 1, the state is symmet-
ric, hence not allowed. If S = 0 the state is antisymmetric, hence allowed.
As the deuteron does not have nn or pp analogs, we conclude that the spin
of the deuteron is S = 1.
With this example in mind, we can generalize our discussion of identi-
cal particles and obtain a useful bookkeeping tool. Thus, consider isotopic
spin, and regard the proton and neutron as identical particles with different
isotopic spin projections: The states of a nucleon, N are
1 1
|pi = |I = , I3 = i (21)
2 2
1 1
|ni = |I = , I3 = − i. (22)
2 2
Since the deuteron does not have nn and pp partners, it is an isotopic spin
singlet state (I(d) = 0):
1
|di = √ (|pni − |npi). (23)
2
Regarding the p and n as states of identical fermions, we require that the
wave function be antisymmetric under interchange of all quantum numbers
(space, spin, and isotopic spin). A state of zero orbital angular momentum
and I = 0 must thus be even under spin interchange. Again the conclusion
is S(d) = 1.
Note that we haven’t actually introduced any new physical principle here,
just a bookkeeping device to arrive at the same result as before. Note that
the I = 1 combination, which is symmetric in isospin space, must have s = 0
(antisymmetric), if ` is even. We quote our bookkeeping device in the form
of an “Extended Pauli Principle”:

A state of two fermions, identical in the sense that they belong


to the same isotopic spin multiplet, must be antisymmetric under
interchange of all quantum numbers (space, spin, and isospin).

Likewise, we quote the corresponding “Extended Boson Principle”:

5
A state of two bosons, identical in the same sense, must be sym-
metric under interchange of all quantum numbers.

The identical particle symmetries have consequences also in scattering,


and also are manifest in the discussion of second quantization.

5 Exercises
1. Let us use the Pauli exclusion principle, and the combination of angular
momenta, to find the possible states which may arise when more than
one electron in an atom are in the same p-shell. Express your answers
for the allowed states in the spectroscopic notation: 2S+1 LJ , where S
is the total spin of the electrons under consideration, L is the total
orbital angular momentum, and J is the total angular momentum of
the electrons.
(a) List the possible states for 2 electrons in the same p-shell.
(b) List the possible states for 3 electrons in the same p-shell.
(c) List the possible states for 4 electrons in the same p-shell. Hint:
before you embark on something complicated for this part, think
a bit!
2. The pion (π) is a boson (with spin zero) with isotopic spin I = 1.
(a) Use our “extended “identical” boson symmetry principle to clas-
sify the allowed (I, J) values for a system of two pions. Here,
J refers to the relative angular momentum, and I to the total
isotopic spin, of the two pion state.
(b) Look up the experimental situation for particles which do and
don’t decay into two pions. [See, for example:
http://pdg.lbl.gov/2002/mxxx.pdf] Discuss what you find. Try to
resolve any puzzles, e.g., do you find particles which “ought” to
decay to two pions, but don’t? Do some decay to two pions when
they “shouldn’t”?
3. The magnetic dipole moment of the proton is:
e
µp = gp sp , (24)
2mp

6
with a measured magnitude corresponding to a value for the gyro-
magnetic ratio of gp = 2 × (2.792847337 ± 0.000000029). We haven’t
studied the Dirac equation yet, but the prediction of the Dirac equa-
tion for a point spin-1/2 particle is g = 2. We may understand the
fact that the proton gyromagnetic ratio is not two as being due to its
compositeness: In the simple quark model, the proton is made of three
quarks, two “ups” (u), and a “down” (d). The quarks are supposed
to be point spin-1/2 particles, hence, their gyromagnetic ratios should
be gu = gd = 2 (up to higher order corrections, as in the case of the
electron). Let us see whether we can make sense out of the proton
magnetic moment.
The proton magnetic moment should be the sum of the magnetic mo-
ments of its constituents, and any moments due to their orbital motion
in the proton. The proton is the ground state baryon, so we assume
that the three quarks are bound together (by the strong interaction)
in a state with no orbital angular momentum. By Fermi statistics, the
two identical up quarks must have an overall odd wave function under
interchange of all quantum numbers. We must apply this with a bit of
care, since we are including “color” as one of these quantum numbers
here.
Let us look a little at the property of “color”. It is the strong interaction
analog of electric charge in the electromagnetic interaction. However,
instead of one fundamental dimension in charge, there are three color
directions, labelled as “red” (r), “blue” (b), and “green” (g). Unitary
transformations in this color space, up to overall phases, are described
by elements of the group SU (3), the group of unitary unimodular 3 ×
3 matrices. Just like combining spins, we may combine three colors
according to a Clebsch-Gordan series, with the result:

3 × 3 × 3 = 10 + 8 + 8 + 1. (25)

We haven’t studied this group, so this decomposition into irreducible


representations of the product representation is probably new to you.
However, the essential aspect here is that there is a singlet in the de-
composition. That is, it is possible to combine three colors in such a
way as to get a color-singlet state, i.e., a state with no net color charge.
These are the states of physical interest for our observed baryons, ac-
cording to a postulate of the quark model.

7
(a) After some thought (perhaps involving raising and lowering op-
erators along different directions in this color space), you could
probably convince yourself that the singlet state in the decom-
position above must be antisymmetric under the interchange of
any two colors. Assuming this is the case, write down the color
portion of the proton wave function.
(b) Now that you know the color wave function of the quarks in the
proton, write down the spin wave function.
(c) Since the proton is uud and its isospin partner the neutron is
ddu, and mp ≈ mn , let us make the simplfying assumption that
mu = md . Given the measured value of gp , what does your model
give for mu ? Recall that the up quark has electric charge 2/3, and
the down quark has electric charge −1/3, in units of the positron
charge.
(d) Finally, use your results to predict the gyromagnetic moment of
the neutron, and compare with observation.

8
Physics 125
Course Notes
Identical Particles
Solutions to Problems
040422 F. Porter

1 Exercises
1. Let us use the Pauli exclusion principle, and the combination of angular
momenta, to find the possible states which may arise when more than
one electron in an atom are in the same p-shell. Express your answers
for the allowed states in the spectroscopic notation: 2S+1 LJ , where S
is the total spin of the electrons under consideration, L is the total
orbital angular momentum, and J is the total angular momentum of
the electrons.
(a) List the possible states for 2 electrons in the same p-shell.
Solution: The possible total orbital angular momenta are
L = 1 ⊗ 1 = 0 ⊕ 1 ⊕ 2, (1)
and the possible total spin states are
1 1
S= ⊗ = 0 ⊕ 1. (2)
2 2
We want the overall wave function to be antisymmetric. The odd
L states are antisymmetric, while the S = 0 state is antisymmet-
ric. Thus, the possible overall antisymmetric states are:
J = (S = 0) ⊗ (L = 0) = 0 (3)
J = 0⊗2=2 (4)
J = 1 ⊗ 1 = 0 ⊕ 1 ⊕ 2. (5)
That is, the possible states are:
1
S 0 ,1 D 2 ,3 P 0 ,3 P 1 ,3 P 2 . (6)
(b) List the possible states for 3 electrons in the same p-shell.
Solution: We note that we cannot give all three electrons Lz =
+1, hence there is no F state – we’ll only be able to make S, P, D

1
states. Likewise, we can only make an S = 3/2 state in S-wave.
All other states will have S = 1/2. Proceeding in this manner to
count all possible states, we have the possibilities:
4
S 3 , 2 D 5 , 2 D 3 , 2 P 3 ,2 P 1 , 2 S 1 . (7)
2 2 2 2 2 2

(c) List the possible states for 4 electrons in the same p-shell. Hint:
before you embark on something complicated for this part, think
a bit!
Solution: If all six states were occupied, then we would have only
one possible state, with

L = S = J = 0. (8)

Imagine breaking this state up into the contributions from four


electrons and two electrons. Then:

0 = L = L(2) + L(4) (9)


0 = S = S(2) + S(4) (10)
0 = J = J(2) + J(4) (11)

Hence, the same angular momentum states are available to the


four electrons as the two electrons. That is, the possible states
are:
1
S 0 ,1 D 2 ,3 P 0 ,3 P 1 ,3 P 2 . (12)

2. The pion (π) is a boson (with spin zero) with isotopic spin I = 1.

(a) Use our “extended “identical” boson symmetry principle to clas-


sify the allowed (I, J) values for a system of two pions. Here,
J refers to the relative angular momentum, and I to the total
isotopic spin, of the two pion state.
Solution: The pion has spin zero, hence J = L. Odd J is anti-
symmetric, even symmetric. The pion has isospin 1. Combining
two pions can give states with I = 0, 1, 2. Even I states are sym-
metric, odd antisymmetric. To obtain an overall symmetric state,
we must have I + J even. These are the allowed possibilities, i.e.,
if I = 0 or I = 2, then J must be even, and if I = 1 then J must
be odd.

2
(b) Look up the experimental situation for particles which do and
don’t decay into two pions. [See, for example:
http://pdg.lbl.gov/2002/mxxx.pdf] Discuss what you find. Try to
resolve any puzzles, e.g., do you find particles which “ought” to
decay to two pions, but don’t? Do some decay to two pions when
they “shouldn’t”?
Solution: The lowest mass non-strange meson above the pion is
the η, with I = 0 and J = 0. According to our rule, it ought to
decay to two pions. But it doesn’t! The problem is parity, which
is conserved in strong and electromagnetic interactions, and which
is odd for the η. Two pions in even L have even spatial parity.
Thus, we may augment our rule to include that P = (−)J .
The next state is the f0 (600). It also has I = J = 0, but P = +.
It ought to decay to two pions, and it does!
The ρ(770) has I = J = 1, and negative parity. It should decay
to two pions and it does. Another success!
The ω(782) has I = 0 and J = 1. It shouldn’t decay to two pions.
It does though. However, it dominantly decays to three pions,
which is kinematically less likely than two pions. Thus, the two
pion decay is at least suppressed. Indeed, it is smaller than the
clearly electromagnetic π 0 γ decay. Hence, we conclude that our
prediction is still all right, up to the electromagnetic interaction.
The electromagnetic interaction clearly violates isospin conserva-
tion (e.g., the proton and neutron interact differently electromag-
netically).
The KS0 meson has I = 1/2, so it shouldn’t decay to two pions
according to our rule. But it does! This decay is occurs via the
weak interaction; isospin is not conserved in the weak interaction.

3. The magnetic dipole moment of the proton is:


e
µp = gp sp , (13)
2mp

with a measured magnitude corresponding to a value for the gyro-


magnetic ratio of gp = 2 × (2.792847337 ± 0.000000029). We haven’t
studied the Dirac equation yet, but the prediction of the Dirac equa-
tion for a point spin-1/2 particle is g = 2. We may understand the

3
fact that the proton gyromagnetic ratio is not two as being due to its
compositeness: In the simple quark model, the proton is made of three
quarks, two “ups” (u), and a “down” (d). The quarks are supposed
to be point spin-1/2 particles, hence, their gyromagnetic ratios should
be gu = gd = 2 (up to higher order corrections, as in the case of the
electron). Let us see whether we can make sense out of the proton
magnetic moment.
The proton magnetic moment should be the sum of the magnetic mo-
ments of its constituents, and any moments due to their orbital motion
in the proton. The proton is the ground state baryon, so we assume
that the three quarks are bound together (by the strong interaction)
in a state with no orbital angular momentum. By Fermi statistics, the
two identical up quarks must have an overall odd wave function under
interchange of all quantum numbers. We must apply this with a bit of
care, since we are including “color” as one of these quantum numbers
here.
Let us look a little at the property of “color”. It is the strong interaction
analog of electric charge in the electromagnetic interaction. However,
instead of one fundamental dimension in charge, there are three color
directions, labelled as “red” (r), “blue” (b), and “green” (g). Unitary
transformations in this color space, up to overall phases, are described
by elements of the group SU (3), the group of unitary unimodular 3 ×
3 matrices. Just like combining spins, we may combine three colors
according to a Clebsch-Gordan series, with the result:

3 × 3 × 3 = 10 + 8 + 8 + 1. (14)

We haven’t studied this group, so this decomposition into irreducible


representations of the product representation is probably new to you.
However, the essential aspect here is that there is a singlet in the de-
composition. That is, it is possible to combine three colors in such a
way as to get a color-singlet state, i.e., a state with no net color charge.
These are the states of physical interest for our observed baryons, ac-
cording to a postulate of the quark model.

(a) After some thought (perhaps involving raising and lowering op-
erators along different directions in this color space), you could

4
probably convince yourself that the singlet state in the decom-
position above must be antisymmetric under the interchange of
any two colors. Assuming this is the case, write down the color
portion of the proton wave function.
(b) Now that you know the color wave function of the quarks in the
proton, write down the spin wave function.
(c) Since the proton is uud and its isospin partner the neutron is
ddu, and mp ≈ mn , let us make the simplfying assumption that
mu = md . Given the measured value of gp , what does your model
give for mu ? Recall that the up quark has electric charge 2/3, and
the down quark has electric charge −1/3, in units of the positron
charge.
(d) Finally, use your results to predict the gyromagnetic moment of
the neutron, and compare with observation.
Solution: Note that there are six permutations of the three colors
among the three quarks, if no two quarks have the same color. The
completely antisymmetric combination of three colors is:
1
√ (|rgbi − |rbgi + |brgi − |bgri + |gbri − |grbi). (15)
6

To construct the spin wave function, we first note that the three
spin-1/2 quarks must combine in such a way as to give an overall
spin-1/2 for the proton. Second, since the space wave function is
symmetric, and the color wave function is antisymmetric, the spin
wave function of the two up quarks must be symmetric. Thus,
the two up quarks are in a spin 1 state. To give the spin wave
function of the proton, let us pick the z axis to be along the spin
direction. Then the spin state is:
s
11 2 1 1 1 11
| i= |11; − i − √ |10; i. (16)
22 3 2 2 3 22

The magnetic moment of the proton in this model is thus:


2 1 4 1
µp = (2µu − µd ) + µd = µu − µd . (17)
3 3 3 3

5
Hence,  
e 4 2 e 1 1 e
gp = 2 − 2 − . (18)
2mp 3 3 2mu 3 3 2md
With gp = 5.58, mp = 938 MeV, and mu = md , we obtain

mu = 2mp /gp = 336 MeV. (19)

The neutron wave function may be obtained from the proton wave
function by interchanging the u and d quark labels. Thus,
2 1 4 1
µn = (2µd − µu ) + µu = µd − µu . (20)
3 3 3 3
We predict the gyromagnetic moment of the nuetron to be:
4
µn µ − 13 µu
3 d
= 4 (21)
µp µ − 13 µd
3 u
 
4 1
3
− 3
− 13 23
=   (22)
42 1 1
33
− 3
− 3
2
= − . (23)
3
That is, we predict (neglecting the mass difference) gn = −3.72.
This may be compared with the observed value of −3.83.

6
Physics 125
Course Notes
Electromagnetic Interactions
040512 F. Porter

1 Introduction
We discuss in this note the interaction of electromagnetic radiation with
matter. The framework remains the Schrödinger equation. However, we
develop the beginnings of the notion of second quantization and field theory
here.

2 Charged Particle Interaction with Electro-


magnetic Field
Classically, the Hamiltonian for a charged particle, with charge q, in an
electromagnetic field, is
1
H= [p − qA(x, t)]2 + qΦ(x, t) + U (x, t), (1)
2m
where U represents any other (non-electromagnetic) potentials the particle
may experience. The corresponding Schrödinger equation is:
 
∂ψ(x, t) 1
i = [−i∇ − qA(x, t)]2 + qΦ(x, t) + U (x, t) ψ(x, t). (2)
∂t 2m
In classical electromagnetism, the physics is unaltered by a gauge trans-
formation:
A(x, t) → A0 (x, t) = A(x, t) + ∇χ(x, t) (3)
Φ(x, t) → Φ0 (x, t) = Φ(x, t) − ∂t χ(x, t), (4)
where χ(x, t) is a scalar function of position and time. In particular, the elec-
tric and magnetic fields are unchanged under such a gauge transformation.
We may investigate how the Schrödinger equation is altered under a gauge
transformation:
∂ψ 0 (x, t)
i = H 0 ψ 0 (x, t) (5)
∂t
1
 
1 2
= [−i∇ − qA0 (x, t)] + qΦ0 (x, t) + U (x, t) ψ 0 (x, t) (6)
2m
 
1
= [−i∇ − qA − q∇χ]2 + qΦ − q∂t χ + U ψ 0 (7)

2m 
q h i
= H+ 2(i∇2 χ + ∇χ · ∇ + qA · ∇χ) + q(∇χ)2 − q∂t χ ψ 0 .
2m
We offer the following theorem:

Theorem: If i∂t ψ = Hψ and i∂t ψ 0 = H 0 ψ 0 , then

ψ 0 (x, t) = eiqχ(x,t) ψ(x, t). (8)

It is left as an exercise for the reader to prove this theorem.


Thus, under a gauge transformation, the wave function transforms ac-
cording to
ψ(x, t) → ψ 0 (x, t) = eiqχ(x,t) ψ(x, t). (9)
Note that expectation of the momentum, p = −i∇, is not gauge invariant:
Z Z
d3 (x)ψ ∗ (−i∇)ψ 6= d3 (x)e−iqχ ψ ∗ (−i∇)eiqχ ψ (10)

The second integral gives an extra term:


Z
d3 (x)ψ ∗ (q∇χ)ψ. (11)

But q∇χ = q(A0 −A), so the quantity p −qA is gauge invariant. Classically,
this corresponds to the fact that this is just m dx
dt
. In the Heisenberg picture:

dx(t)
m = p(t) − qA(x, t). (12)
dt
This corresponds to the physical observable of position. While the potentials
and wave functions change under a gauge transformation, physical observ-
ables are gauge invariant quantum mechanically, as classically.
As an aside, let us remark a little further on our gauge transforma-
tion. The quantity χ(x, t) is an arbitrary scalar function (required to be
differentiable). Thus, our electromagnetic theory is invariant under gauge
transformations where the wave function transforms according to ψ → eiχ ψ.
The eiχ factor may be regarded as a one-by-one unitary matrix. We have

2
a group symmetry, where the transformation group is just U (1), called a
(local) “gauge group”. The word “local” is used because χ may vary with
position. A “global” U (1) symmetry corresponds to constant χ. In quantum
field theory, one often starts with the symmetry (for example, the symmetry
group SU (3) for quantum chromodynamics) and works out the transforma-
tion properties, and hence the interactions, of the gauge fields.
Returning to electromagnetic interactions, let us rewrite the Hamiltonian
in the form H = H0 + Hint , where H0 is the Hamiltonian in the absence of
the electromagnetic fields,
p2
H0 = + V, (13)
2m
and hence,
q q2 2
Hint = − (p · A + A · p) + A + qΦ. (14)
2m 2m
If there are N particles interacting with the field, we have:
N
( )
X qi q2
Hint = − [pi · A(xi , t) + A(xi , t) · pi ] + i A2 (xi , t) + qi Φ(xi , t) ,
i=1 2mi 2mi
(15)
where any interactions (including electromagnetic) between the particles are
included separately in V .
Suppose all of the particles have mass m and charge q. Then,
X XZ
qΦ(xi , t) = d3 (x)qδ (3) (x − xi )Φ(x, t) (16)
i i
Z
= d3 (x)qρ(x)Φ(x, t), (17)
P
where ρ(x) ≡ i δ (3) (x − xi ). Note that
Z
d3 (x)ρ(x) = N. (18)

We interpret ρ as the number density operator. Likewise, we define a “num-


ber current density” operator according to:
1 X h (3) i
j(x) ≡ pi δ (x − xi ) + δ (3) (x − xi )pi , (19)
2m i

3
where we have been careful to define it such that it is a hermitian operator.
Thus, we may write:
Z " #
3 q2
Hint = d (x) −qj(x) · A(x, t) + ρ(x)A2 (x, t) + qρ(x)Φ(x, t) .
(∞) 2m
(20)
We remark that p/m is not the velocity of the particle in the presence of
the electromagnetic field. Instead,
1 q
v= p − A. (21)
m m
Thus, the operator for the particle current density is actually
q
J(x) = j(x) − A(x, t)ρ(x). (22)
m

3 Example: Absorption of Light


We consider the absorption of light by a system of charged particles (for
example, an atom). We’ll make the following assumption: The fields A are
small compared with the “atomic” fields (∼ e/a20 for atoms) in the problem.
Thus, we may neglect the ρA2 term in comparison with the j · A term linear
in A.
Also, we pick a convenient gauge to work in, namely the “transverse”
gauge, in which:
Φ = 0, and ∇ · A = 0. (23)
Thus, we have: Z
Hint = −q d3 (x)j(x) · A(x, t). (24)
(∞)

We’ll now expand the external field in plane waves in a (large) box of
volume V with periodic boundary conditions:
!
X eik·x−iωt e−ik·x+iωt ∗
A(x, t) = Ak √  + A∗k √  , (25)
k, V V
P
where  is the polarization vector,  sums over two orthogonal polariza-
tions for given k, and  is orthogonal to k in the transverse gauge. Where
convenient, we will take the continuum limit:
Z 2
1 X k dkdΩ
→ . (26)
V k (2π)3

4
We have:
!
XZ eik·x−iωt e−ik·x+iωt
3
Hint = −q d (x)j(x) · Ak  √ + A∗k  ∗ √
k, (∞) V V
!
X −iωt iωt
e e
= −q Ak ĵ(−k) ·  √ + A∗k ĵ(k) ·  ∗ √ , (27)
k,  V V
where
Z
ĵ(k) = d3 (x)j(x)e−ik·x (28)
(∞)
1 X  −ik·xi 
= pi e + e−ik·xi pi . (29)
2m i
Let us calculate the absorption rate of a beam of light with this superpo-
sition of plane waves, by an atom in some state |0i. We assume that the light
is incoherent – here meaning that there aren’t phase correlations among the
different Fourier components. For example, the light source could be a hot
gas (e.g., sodium vapor), with atoms emitting independently. In this case,
we can compute the result for each Fourier component separately, and then
sum over the components.
Recall Fermi’s golden rule:
Γ0→n = 2π|hn|Hint |0i|2 δ(En − E0 − ω). (30)
In this case, conservation of energy includes the emitted photon, so the delta-
function has the photon energy ω = k in it. Note that the atomic states may
not be part of a continuum (as we assumed when we obtained the golden rule),
but we will still get transition probabilities proportional to time (golden rule)
as long as the incident light beam has a continuum of frequencies.
The Fourier components in our beam induce both upward and downward
transitions of the atom. The upward transitions are caused by the “positive”
frequency component of the perturbation, and the downward transitions by
the “negative” frequency component. From the golden rule, then, the rate
of upward transitions is:
q2
Γk (abs; 0 → n) = 2πδ(En − E0 − ω) |Ak |2 |hn|ĵ(−k) · |0i|2 . (31)
V
Summing over k,  (with two orthogonal polarizations for each k):
1 X
Γ(abs; 0 → n) = 2πδ(En − E0 − ω)q 2 |Ak |2 |hn|ĵ(−k) ·  |0i|2 . (32)
V k,

5
Changing the sum to an integral (with k = ω) and doing the delta function
integral (hence ω = En − E0 ) yields:
Z
ω2 X
Γ(abs; 0 → n) = 2πq 2 dΩ |Ak |2 |hn|ĵ(−k) · |0i|2 .
(33)
(2π)3 (4π)

Suppose, for example, that the incident light beam subtends a solid angle
∆Ω, and is polarized, with polarization vector . According to Maxwell’s
equations,
E(x, t) = −∂t A(x, t) (34)
B(x, t) = ∇ × A(x, t), (35)
and the energy in an electromagnetic field is
1 Z 3
E= d (x)(E2 + B2 ). (36)

We thus have the energy in our superposition of plane waves, averaged over
a few cycles:
X ω2
E= |Ak |2 . (37)
k,

With c = 1, this gives the rate of energy transport of our beam (in the given
polarization):
Z
1 X ω2 ω4
|Ak |2 = ∆Ω dω |Ak |2 , (38)
V k 2π (2π)4
with units of energy per unit area per unit time. The intensity of the incident
beam per unit frequency is thus
ω 4 |Ak |2
I(ω) = ∆Ω, (39)
(2π)4
and we may write the absorption rate in terms of this intensity:
(2π)2 q 2
Γ (abs; 0 → n) = I(ω)|hn|ĵ(−k) ·  |0i|2 . (40)
ω2
The rate of downward transitions (“induced emission”), from |ni to |0i
is similarly calculated to be (for polarized beam):
1 X
Γ (ind em; n → 0) = 2πδ(En − E0 − ω)q 2 |Ak |2 |h0|ĵ(k) · ∗ |ni|2 .
V k
(2π)2 q 2
= I(ω)|h0|ĵ(k) ·  ∗ |ni|2 , (41)
ω2
6
where ω = En − E0 . Since
h0|ĵ(k) ·  ∗ |ni = (hn|ĵ(−k) ·  |0i)∗ , (42)
we see that
Γ(abs; 0 → n) = Γ(ind em; n → 0). (43)
Let us transform the absorption rate into a cross section. Suppose that
there are Nk photons in the k mode of the incident beam, and let ω = |k|.
Then the total energy in the incident beam is:
X
E= ωNk . (44)
k,

This must be equal to


X ω2
E= |Ak |2 . (45)
k,

Hence, we have the relation

|Ak |2 = Nk . (46)
ω
Thus, we may write the absorption and induced emission rates in terms of
the number of photons:
1 X (2π)2
Γ(abs; 0 → n) = Γ(ind em; n → 0) = δ(En −E0 −ω)q 2 Nk |hn|ĵ(−k)·|0i|2 .
V k, ω
(47)
Now to get a total absorption cross section, we note first that the total
absorption rate for a beam of Nk photons in mode k is:
Nk 4π 2 q 2 X
Γk (abs) = |hn|ĵ(−k) · |0i|2 δ(En − E0 − ω). (48)
V ω n

But Vk is just the density of incident photons per unit volume, in the spec-
N

ified mode, and hence, with c = 1, is the incident photon flux. Thus, we
define the total absorption cross section:
Γk (abs)
σk (abs) = (49)
incident flux
4π 2 q 2 X
= |hn|ĵ(−k) · |0i|2 δ(En − E0 − ω). (50)
ω n

7
4 Quantized Radiation Field
The discussion in terms of the number of photons in the beam suggests the
following approach, of a quantized radiation field. Instead of talking about the
absorption or induced emission of energy from/to a classical electromagnetic
field, we can think, for example, of the absorption process as the atom making
the |0i → |ni transition, while the electromagnetic field makes a transition
from a state with “N” photons to a state with “N − 1” photons.
Adopting this approach, we specify our incoherent beam by giving the
number of photons in each (k) mode. Hence, the normalized intitial state
of the electromagnetic field is:
|Nk1 1 , Nk22 , . . .i, (51)
where Nk1 1 is the number of photons in mode k1 1 , etc. Two states are
orthogonal if any of the Nk differ. The absorption by an atom (or other
“matter” sysytem) in state |0i of a photon in mode k, resulting in the atom
in state |ni, corresponds to the transition between states of the entire system:
|0; Nk1 1 , Nk2 2 , . . . , Nk , . . .i → |n; Nk1 1 , Nk2 2 , . . . , Nk − 1, . . .i. (52)
The initial energy is X
Ei = E0 + k 0 Nk00 , (53)
k0 , 0

and the final energy is


X
Ef = En + k 0 Nk0 0 − k. (54)

k0 , 0

Using the golden rule, the transition rate is:


Γi→f = 2πδ(En −E0 −k)|hn; Nk1 1 , Nk2 2 , . . . , Nk −1, . . . |Hint |0; Nk1 1 , Nk2 2 , . . . , Nk , . . .i|2 .
(55)
We determine the form of Hint in this “quantum” description by requiring
that we get the same result as our treatment in terms of a classical external
field. That is, we demand:
|hn; Nk1 1 , Nk2 2 , . . . , Nk − 1, . . . |Hint |0; Nk1 1 , Nk2 2 , . . . , Nk , . . .i|2 (56)
q2
= |Ak |2 |hn|ĵ(−k) ·  |0i|2 (57)
V
q 2 2π
= Nk |hn|ĵ(−k) ·  |0i|2 . (58)
V ω
8
Thus, Hint includes a ĵ(−k) ·  piece acting on the matter subspace, times a
piece that decreases the number of photons in the k mode by one. Referring
back to our original expression for the “classical” interaction Hamiltonian,
Eqn. 27, we see that our quantum version must have the corresponding form:
q Xh i
Hint = − √ ĵ(−k) · Âk + ĵ(k) ·  ∗ †k , (59)
V k
where Âk is an operator which operates on photon states, reducing the
number of photons in mode k by one. The †k term is required to make the
Hamiltonian Hermitian. It will shortly be seen that this term has the effect
of increasing the number of photons in mode k by one.
By the orthogonality of states with different numbers of photons in any
mode, we have:
hn; Nk11 , Nk2 2 , . . . , Nk − 1, . . . |Hint |0; Nk11 , Nk2 2 , . . . , Nk , . . .i (60)
q
= − √ hn|ĵ(−k) · |0ih. . . , Nk − 1, . . . |Âk | . . . , Nk , . . .i.
V
By comparison with Eqn. 58, we have:
s
2π q
h. . . , Nk − 1, . . . |Âk | . . . , Nk , . . .i = Nk , (61)
ω
up to a phase factor, which we choose to be one.1
If we take the complex conjugate of Eqn. 61, we obtain:
h. . . , Nk − 1, . . . |Âk | . . . , Nk , . . .i∗ (62)

= h. . . , Nk , . . . |Âk | . . . , Nk − 1, . . .i
s
2π q
= Nk . (63)
ω
That is, †k is an operator which increases the number of photons in mode
k by one.
We have:
s
2π q
Âk |Nk1 1 , . . . , Nk , . . .i = Nk |Nk1 1 , . . . , Nk − 1, . . .i (64)
ω
s
2π q
†k |Nk1 1 , . . . , Nk , . . .i = Nk + 1|Nk1 1 , . . . , Nk + 1, . . .i.(65)
ω
1
Note that we are free to choose the relative phases of states with different numbers of
photons, since they are orthogonal.

9
Notice the close similarity with the harmonic oscillator creation and destruc-
tion operators a† and a. We may regard the quantum mechanical description
of the elecromagnetic radiation field as an infinite number of hamonic oscil-
lators (one per mode), with the photon as the quntum of these oscillators.
Let us define a Hermitian “electromagnetic field operator”:
1 X 
Â(x) ≡ √ Âk  eik·x + †k ∗ e−ik·x . (66)
V k

In terms of this operator, we may write the operator, Ĥint , for the interaction
of matter with this quantum mechanical radiation field as:
Z ( )
q2 h i2
3
Ĥint = d (x) −qj(x) · Â(x) + ρ(x) Â(x) , (67)
2m

where we have now included the ρA2 term.


To lowest order, the description of absoprtion of electromagnetic radia-
tion in quantum mechanics is identical to the description of absorption of
classical radiation – this was by construction. Let us consider the quantum
description of emission. We want to determine the transition rate from a
state |n; Nk11 , . . . , Nk , . . .i with energy
X
Ei = En + k 0 Nk0 0 (68)
k0 ,0

to a state |0; Nk11 , . . . , Nk + 1, . . .i with energy


X
Ef = E0 + k 0 Nk0 0 + k. (69)
k0 ,
0

Using the golden rule, the desired rate (with ω = k) is:

Γk (em; n → 0) = 2πδ(En −E0 −ω)|h0; . . . , Nk +1, . . . |Ĥint |n; . . . , Nk , . . .i|2 .
(70)
The matrix element is:

h0; . . . , Nk + 1, . . . |Ĥint |n; . . . , Nk , . . .i (71)


q
= − √ h0|ĵ(k) ·  ∗ |nih. . . , Nk + 1, . . . |†k | . . . , Nk , . . .i
V
s
2π q
= −q h0|ĵ(k) ·  ∗ |ni Nk + 1. (72)
ωV

10
Hence, the emission rate is

4π 2 q 2
Γk (em; n → 0) = δ(En − E0 − ω)|h0|ĵ(k) ·  ∗ |ni|2 (Nk + 1). (73)
ωV
This is not quite the same as the “induced emission” rate we calculated
earlier in our classical correspondence treatment,

4π 2 q 2
Γk (ind em; n → 0) = δ(En − E0 − ω)|h0|ĵ(k) ·  |ni|2 Nk . (74)
ωV
We now have an additional “+1” in the Nk +1 term. That is, even if Nk = 0,
we can have the emission take place. This is “spontaneous emission”, and the
total emission rate is just the sum of the induced and spontaneous emission
rates. Spontaneous emission may be regarded as the quantum mechanical
version of classical radiation from an accelerating charge.

4.1 Vacuum Fluctuations


Because the electromagnetic field is quantized, we have “vacuum fluctua-
tions” in the field, analogous to the “zero point” motion of a harmonic oscil-
lator. The vacuum state of the electromagnetic field is, of course:

|Ωi = |0, 0, . . . , 0, . . .i. (75)

The expectation value of Â(x) is

hΩ|Â(x)|Ωi = 0. (76)

However, the expectation value of Â(x)Â(x0 ) is not zero, since it contains


terms of the form
s

hΩ|Âk †k |Ωi = hΩ|Âk |0, 0, . . . , 1, 0, . . .i
ω
2π 2π
= hΩ|Ωi = 6= 0. (77)
ω ω
Hence, for example, the product of the electric fields at two different points
will be non-zero. One may interpret spontaneous emission as “induced emis-
sion”, due to the vacuum fluctuations of the electromagnetic field.

11
4.2 Einstein’s A and B Coefficients
Let us formulate a statistical argument for the rate of spontaneous emission:
The probability of finding a system at temperature T with energy E is given
by the Boltzmann distribution, i.e., is proportional to e−E/T . Apply this to
the problem of photons in a cavity with walls at temperature T . The relative
probability of having N photons in mode k, since E = N k, is e−N k/T .
Therefore, the average number of photons in mode k is:
P∞
N e−N k/T
N =0
hNk i = P∞ −N k/T
(78)
N =0 e
d P∞ −N k/T
− d(k/T ) N =0 e
= P∞ (79)
e−N k/T
N =0
d
− d(k/T )
(1 − e−k/T )−1
= (80)
(1 − e−k/T )−1
1
= k/T , (81)
e −1
a result known as “Planck’s distribution law”. The average energy per mode
is thus,
k
hEk i = khNk i = k/T . (82)
e −1
Now, in a cavity, photons are constantly being absorbed and emitted on
the walls. How must the absorption and emission rates be related in order
to give the above average for Nk ? Let us consider a simplified model for
the walls: Suppose that the atoms of the walls have two levels, with energies
E0 < En . According to the Boltzmann law, the probability to have an atom
in the upper state is Pn ∝ e−En /T , and hence:
Pn
= e−(En −E0 )/T . (83)
P0
Consider a state of equilibrium between these atoms and radiation of fre-
quency ω = En − E0 in the cavity. Photons of this energy are absorbed at
a rate proportional to the number, N , of such photons present, and propor-
tional to the probability that an atom is in its lower level, P0 :
!
dN
= −BNP0 , (84)
dt abs

12
where the minus sign is for the absorption of photons (decrease in N), and
B is a proportionality constant. Likewise, emission is induced at the rate:
!
dN
= BNPn . (85)
dt ind em

Note that the same constant of proportionality appears.


Since Pn < P0 , if these were the only processes involved, all of the photons
would eventually be absorbed. That is, we would have an exponential decay:

N(t) = N (0)eB(Pn −P0 )t → 0, as t → ∞. (86)

But there is an additional process, spontaneous emission, with a rate:


!
dN
= APn , (87)
dt spon em

where A is another constant of proportionality. In equilibrium, dN/dt = 0,


and N = hN i. Thus,

−BhN iP0 + BhN iPn + APn = 0, (88)

or, with Pn /P0 = e−(En −E0 )/T ,

A/B
hN i = . (89)
e(En −E0 )/T −1
Comparing this result with Planck’s law, we find that A = B. Thus,
!
dN
= BPn (N + 1), (90)
dt em

in agreement with our earlier result for the quantum emission rate. We
have deduced this without knowing B, simply by using “detailed balance”
arguments.

5 Exercises
1. Prove the theorem:

13
Theorem: Let the Hamiltonian for a charged particle interacting with
an electromagnetic field be H:
1
H= [p − qA(x, t)]2 + qΦ(x, t) + U (x, t), (91)
2m
Let H 0 be the Hamiltonian obtained from H by a gauge transfor-
mation:

A(x, t) → A0 (x, t) = A(x, t) + ∇χ(x, t) (92)


Φ(x, t) → Φ0 (x, t) = Φ(x, t) − ∂t χ(x, t), (93)

If i∂t ψ = Hψ and i∂t ψ 0 = H 0 ψ 0 , then

ψ 0 (x, t) = eiqχ(x,t) ψ(x, t). (94)

2. We stated that the actual number current density operator is:


q
J(x) = j(x) − A(x, t)ρ(x), (95)
m
where ρ(x) is the number density operator,
X
ρ(x) = δ (3) (x − xi ), (96)
i

and
1 X h (3) i
j(x) = pi δ (x − xi ) + δ (3) (x − xi )pi . (97)
2m i

(a) Show that J(x) is a gauge invariant operator (i.e., that its matrix
elements are gauge invariant).
(b) Show, in the Heisenberg representation, the familiar law:

∂ρ(x, t)
+ ∇ · J(x, t) = 0 (98)
∂t
3. We defined the quantum mechanical electromagnetic field operators
Âk and †k .

(a) Determine the commutation relations among these operators.

14
(b) We may define the quantum mechanical electric field operator
according to:
1 X 
Ê(x) = √ −iω Âk eik·x + iω †k  ∗ e−ik·x . (99)
V k

Make sure this definition makes sense to you. Compute the ex-
pectation value:
hΩ|Ê(x) · Ê(x0 )|Ωi (100)
ˆ (x), of Ê(x) over a small volume V. What
(c) Consider the average, Ē
is  2
ˆ
hΩ| Ē(x) |Ωi, (101)

and what happens as V → 0?

15
Physics 125
Course Notes
Electromagnetic Interactions
Solutions to Problems
040520 F. Porter

1 Exercises
1. Prove the theorem:
Theorem: Let the Hamiltonian for a charged particle interacting with
an electromagnetic field be H:
1
H= [p − qA(x, t)]2 + qΦ(x, t) + U (x, t), (1)
2m
Let H 0 be the Hamiltonian obtained from H by a gauge transfor-
mation:
A(x, t) → A0 (x, t) = A(x, t) + ∇χ(x, t) (2)
Φ(x, t) → Φ0 (x, t) = Φ(x, t) − ∂t χ(x, t), (3)
If i∂t ψ = Hψ and i∂t ψ 0 = H 0 ψ 0 , then
ψ 0 (x, t) = eiqχ(x,t) ψ(x, t). (4)
Solution: In the original gauge, the Schrödinger equation is:
∂ψ(x, t)
i = Hψ(x, t) (5)
∂t  
1
= [−i∇ − qA(x, t)]2 + qΦ(x, t) + U (x, t) ψ(x, t).
2m
In the new gauge, the Schrödinger equation is:
∂ψ 0 (x, t)
i = H 0 ψ 0 (x, t) (6)
∂t  
1 2
= [−i∇ − qA0 (x, t)] + qΦ0 (x, t) + U (x, t) ψ 0 (x, t)
2m
Suppose ψ 0 (x, t) = eiqχ(x,t) ψ(x, t). Consider taking e−iqχ times equa-
tion 7, and subtracting Eqn. 6. The left hand side of the result is:
h i
e−iqχ (i∂t ψ 0 ) − i∂t ψ = i e−iqχ (eiqχ ∂t ψ + ψ∂t eiqχ ) − ∂t ψ (7)
= −q(∂t χ)ψ. (8)

1
The right hand side is:

e−iqχ H 0 ψ 0 − Hψ (9)
 
1
= e−iqχ (i∇ + qA + q∇χ)2 + qΦ − q∂t χ + U eiqχ ψ
2m
 
1 2
− (i∇ + qA) + qΦ + U ψ
2m
 
1 1
= e−iqχ (i∇ + qA + q∇χ)2 eiqχ − (i∇ + qA)2 ψ − q(∂t χ)ψ.
2m 2m
Comparing the two sides, we find that the theorem will be verified if
we can show that
h i
e−iqχ (i∇ + qA + q∇χ)2 eiqχ − (i∇ + qA)2 ψ = 0. (10)

For convenience, absorb the charge into A and into χ. Then we want
to evaluate:
h i
e−iχ (i∇ + A + ∇χ)2 eiχ − (i∇ + A)2 ψ (11)
(
h  i
= e−iχ −∇2 + (A + ∇χ)2 + i 2(A + ∇χ) · ∇ + ∇ · A + ∇2 χ eiχ
)
2 2
+∇ − A − i(2A · ∇ + ∇ · A) ψ
(
h  i
= e−iχ −∇2 + 2A · ∇χ + (∇χ)2 + i 2(A + ∇χ) · ∇ + ∇2 χ eiχ
)
2
+∇ − 2iA · ∇ ψ

= −i∇χ · (∇ + i∇χ) − i∇2 χ − i∇χ · ∇ + 2A · ∇χ + (∇χ)2
 
+i 2iA · ∇χ + 2i(∇χ)2 + 2∇χ · ∇ + ∇2 χ ψ
= 0, (12)

as desired.

2. We stated that the actual number current density operator is:


q
J(x) = j(x) − A(x, t)ρ(x), (13)
m
2
where ρ(x) is the number density operator,
X
ρ(x) = δ (3) (x − xi ), (14)
i

and
1 X h (3) i
j(x) = pi δ (x − xi ) + δ (3) (x − xi )pi . (15)
2m i

(a) Show that J(x) is a gauge invariant operator (i.e., that its matrix
elements are gauge invariant).
Solution:
1 X h (3) i q X
J(x) = pi δ (x − xi ) + δ (3) (x − xi )pi − A(x, t) δ (3) (x − xi ),
2m i m i
1 Xh i
= (pi − qA(xi ))(δ (3) (x − xi ) + δ (3) (x − xi )(pi − qA(xi )) (. 16)
2m i

Under a gauge transformation ψ → eiqχ ψ and A → A+∇χ. Also,

(pi − qA(xi ) − q∆χ)(eiqχ ψ) = eiqχ (pi − qA(xi ) − q∆χ + q∆χ)ψ


= eiqχ (pi − qA(xi ))ψ. (17)

Thus, if J → J0 , then

ψ ∗ Jψ → ψ ∗ e−iqχ J0 eiqχ ψ (18)


= ψ ∗ e−iqχ eiqχ Jψ (19)
= ψ ∗ Jψ. (20)

(b) Show, in the Heisenberg representation, the familiar law:

∂ρ(x, t)
+ ∇ · J(x, t) = 0 (21)
∂t
Solution: In the Schrödinger picture, we have:
X
ρS (x) = δ (3) (x − xi ), (22)
i
1 Xh i
JS (x) = (pi − qA(xi ))(δ (3) (x − xi ) + δ (3) (x − xi )(pi − qA(xi(23)
)) .
2m i

3
The Heisenberg operators may be obtained from the Schrödinger
operators according to:

ρH (x, t) = e−iHt ρS (x)eiHt , (24)


JH (x, t) = e−iHt JS (x)eiHt . (25)

The Hamiltonian is:


1
H= [−i∇ − qA(x, t)]2 + qΦ(x, t) + U (x, t). (26)
2m
In the case where A and φ are independent of time, the proof is a
bit easier. Let’s assume this is the case for now. Let us also work
in the gauge where ∇ · A = 0.
The partial derivative of the number density operator with respect
to time is:
∂ρH
= e−iHt [−iHρS (x) + iρS (x)H] eiHt (27)
∂t
Xh i
= −ie−iHt H, δ (3) (x − xj ) eiHt (28)
j
1 −iHt X h 2 i
= −i e ∇ + 2iqA(x) · ∇, δ (3) (x − xj ) eiHt
(29)
2m j

The remainder of the proof is to show that this is cancelled by


the divergence of the current density operator. With our choice
of gauge,
∇ · JH (x, t) = e−iHt [∇ · JS (x)] eiHt . (30)

3. We defined the quantum mechanical electromagnetic field operators


Âk and †k .

(a) Determine the commutation relations among these operators.


Solution: We start with:
s
2π q
Âk |Nk1 1 , . . . , Nk , . . .i = Nk |Nk1 1 , . . . , Nk − 1, . . .i (31)
ω
s
2π q
†k |Nk1 1 , . . . , Nk , . . .i = Nk + 1|Nk1 1 , . . . , Nk + 1, .(32)
. .i.
ω

4
It is obvious that:
h i
Âk , Âk00 = 0 (33)
h i
†k , †k00 = 0. (34)
Also, h i
†k , Âk0 0 = 0, if k 6= k0 or  6=  0 . (35)
It remains to consider:
h i
†k , Âk0 0 |Nk1 1 , . . . , Nk , . . .i = (36)
2π 2π
Nk |Nk1 1 , . . . , Nk , . . .i − (Nk + 1)|Nk11 , . . . , Nk , . . .i.
ω ω
Thus,
h i 2π
†k , Âk0 0 = − . (37)
ω
(b) We may define the quantum mechanical electric field operator
    according to:

        Ê(x) = (1/√V) Σ_kε ( −iω Â_kε ε e^{ik·x} + iω Â†_kε ε∗ e^{−ik·x} ).          (38)

    Make sure this definition makes sense to you. Compute the expectation value:

        ⟨Ω| Ê(x) · Ê(x′) |Ω⟩                                                         (39)

    Solution:

        ⟨Ω| Ê(x) · Ê(x′) |Ω⟩ = (1/V) Σ_kε Σ_{k′ε′} ωω′
            ⟨Ω| ( −iÂ_kε ε e^{ik·x} + iÂ†_kε ε∗ e^{−ik·x} ) · ( −iÂ_{k′ε′} ε′ e^{ik′·x′} + iÂ†_{k′ε′} ε′∗ e^{−ik′·x′} ) |Ω⟩   (40)
          = (1/V) Σ_kε ω² (2π/ω) e^{ik·(x−x′)}                                        (41)
          = 2 ∫_{(∞)} [d³k/(2π)³] 2πω e^{ik·(x−x′)}.                                  (42)

    We have:

        ⟨Ω| Ê(x) · Ê(x′) |Ω⟩ = (4π/V) Σ_k ω e^{ik·(x−x′)} = (1/2π²) ∫_{(∞)} d³k ω e^{ik·(x−x′)}.   (43)

    Let's work further with the integral form. Let r ≡ x − x′, r = |r|,
    and ω = k = |k|. Let θ be the angle between k and r. Then,

        ⟨Ω| Ê(x) · Ê(x′) |Ω⟩ = (1/π) ∫_0^∞ k³ dk ∫_{−1}^{1} d cos θ e^{ikr cos θ}     (44)
                             = (2/πr) ∫_0^∞ k² sin kr dk                              (45)
                             = (2/πr⁴) ∫_0^∞ x² sin x dx.                             (46)

(c) Consider the average, \bar{Ê}(x), of Ê(x) over a small volume V. What is

        ⟨Ω| [\bar{Ê}(x)]² |Ω⟩,                                                       (47)

    and what happens as V → 0?

    Solution: First, find the average of Ê(x) over a small volume V.
    Suppose the volume is cubic (it might actually be more convenient
    to use a spherical volume), with V = ℓ³. We need:

        (1/V) ∫_V e^{±ik·x} d³(x) = Π_{j=1}^{3} (1/ℓ) ∫_{−ℓ/2}^{ℓ/2} e^{±ik_j x_j} dx_j   (48)
                                  = Π_{j=1}^{3} (1/ℓ)(1/ik_j) 2i sin(k_j ℓ/2)             (49)
                                  = Π_{j=1}^{3} (2/k_j ℓ) sin(k_j ℓ/2)                    (50)
                                  ≈ 1   for k_j ℓ small.                                  (51)

    Hence,

        \bar{Ê}(x) = (1/V) ∫_V Ê(x) d³(x)                                            (52)
              = (1/√V) Σ_kε iω [ −Â_kε ε (1/V)∫_V e^{ik·x} d³(x) + Â†_kε ε∗ (1/V)∫_V e^{−ik·x} d³(x) ]
              = (1/√V) Σ_kε iω [ −Â_kε ε + Â†_kε ε∗ ] Π_{j=1}^{3} (2/k_j ℓ) sin(k_j ℓ/2).   (53)

    The vacuum expectation of the square of this averaged operator is:

        ⟨Ω| [\bar{Ê}(x)]² |Ω⟩ = (1/V) Σ_kε (−ω²)(ε · ε∗)(−2π/ω) [ Π_{j=1}^{3} (2/k_j ℓ) sin(k_j ℓ/2) ]²
                              = (4π/V) Σ_k ω [ Π_{j=1}^{3} (2/k_j ℓ) sin(k_j ℓ/2) ]².        (54)

    Thus, for k ≪ 1/ℓ we sum the photon energies, and for k ≫ 1/ℓ the
    terms in the sum fall off rapidly with increasing k. As the volume
    V becomes smaller and smaller, the sum gets larger and larger,
    with the “limit”:

        ⟨Ω| [\bar{Ê}(x)]² |Ω⟩ → (4π/V) Σ_k ω.                                                (55)

Note that the energy density in an electromagnetic field E is


E²/8π. Thus, our result corresponds to an energy density of

        (1/V) Σ_k ω/2.                                                               (56)

This gives a total energy divergent as k 4 . The result is readily in-


terpreted in terms of the harmonic oscillator picture of our second
quantization: We have an infinite number of harmonic oscillators
corresponding to the different modes of the photon field. Each
oscillator has a zero point energy of ω/2 where ω = |k| for that
oscillator. Summing over the zero point energies gives an infinite
result.
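The divergence can also be seen numerically. The following rough sketch is an added illustration (not from the notes): the box size L and cutoffs are arbitrary, and polarization factors and the precise normalization of Eq. (56) are not tracked; the point is only that the total grows like the fourth power of the momentum cutoff.

```python
import numpy as np

L = 10.0     # assumed box side, natural units
for nmax in (5, 10, 20, 40):
    n = np.arange(-nmax, nmax + 1)
    nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
    omega = 2*np.pi/L * np.sqrt(nx**2 + ny**2 + nz**2)   # omega = |k| on the lattice
    u = 0.5 * omega.sum() / L**3                          # crude (1/V) sum_k omega/2
    kmax = 2*np.pi*nmax/L
    print(nmax, u, u/kmax**4)   # u/kmax^4 levels off: the sum diverges as kmax^4
```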

Physics 195
Course Notes
Second Quantization
030304 F. Porter

1 Introduction
This note is an introduction to the topic of “second quantization”, and hence
to quantum “field theory”. In the Electromagnetic Interactions note, we have
already been exposed to these ideas in our quantization of the electromagnetic
field in terms of photons. We develop the concepts more generally here, for
both bosons and fermions. One of the uses of this new formalism is that it
provides a powerful structure for dealing with the symmetries of the states
and operators for systems with many identical particles.

2 Creation and Annihilation Operators


We begin with the idea that emerged in our quantization of the electro-
magnetic field, and introduce operators that add or remove particles from a
system, similar to the changing of excitation quanta of a harmonic oscillator.
To follow an explicit example, suppose that we have a potential well,
V(x), with single particle eigenstates φ0(x), φ1(x), . . . Suppose we have an
n (identical) boson system, where all n bosons are in the lowest, φ0, level.
Denote this state by |n⟩. We assume that |n⟩ is normalized: ⟨n|n⟩ = 1. Since
the particles are bosons, we can have n = 0, 1, 2, . . ., where |0⟩ is the state
with no particles (referred to as the “vacuum”).
Now define “annihilation” (or “destruction”) and “creation” operators according to:

    b0 |n⟩ = √n |n − 1⟩                                                              (1)
    b†0 |n⟩ = √(n + 1) |n + 1⟩.                                                      (2)

Note that these operators subtract or add a particle to the system, in the state
φ0. They have been defined so that their algebraic properties are identical
to the raising/lowering operators of the harmonic oscillator. For example,
consider the commutator:

    [b0, b†0] |n⟩ = (b0 b†0 − b†0 b0) |n⟩                                            (3)
                 = [(n + 1) − (n)] |n⟩                                               (4)
                 = |n⟩.                                                              (5)

Thus [b0, b†0] = 1. With these operators, we may write the n-particle state in
terms of the vacuum state by:

    |n⟩ = (b†0)ⁿ/√(n!) |0⟩.                                                          (6)

As in the case of the harmonic oscillator, b†0 is the hermitian conjugate
of b0. To see this, consider the following: We have b†0 |n⟩ = √(n + 1) |n + 1⟩.
Thus,
    ⟨n + 1| b†0 |n⟩ = √(n + 1),                                                      (7)

and hence ⟨n + 1| b†0 = √(n + 1) ⟨n|, or

    ⟨n| b†0 = √n ⟨n − 1|.                                                            (8)

Likewise, b0 acts as a creation operator when acting to the left:

    ⟨n| b0 = √(n + 1) ⟨n + 1|.                                                       (9)

We may write the n-particle state in terms of the vacuum state by:

    |n⟩ = (b†0)ⁿ/√(n!) |0⟩.                                                          (10)

Finally, we have the “number of particles” operator: B0 ≡ b†0 b0, with

    B0 |n⟩ = n |n⟩.                                                                  (11)
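These relations are easy to check with truncated matrices. The sketch below is an added illustration (the truncation nmax is an assumption, and the state at the truncation edge is necessarily spoiled): it builds b0 and b†0 in the number basis and verifies the commutator, the number operator, and Eq. (10).

```python
import numpy as np
from math import factorial

nmax = 8
b = np.diag(np.sqrt(np.arange(1, nmax + 1)), k=1)   # b|n> = sqrt(n)|n-1>
bdag = b.T                                          # real matrix, so transpose = adjoint

comm = b @ bdag - bdag @ b
print(np.allclose(np.diag(comm)[:-1], 1.0))         # True: [b, b^dag] = 1 away from the edge

vac = np.zeros(nmax + 1); vac[0] = 1.0
n = 5
state = np.linalg.matrix_power(bdag, n) @ vac / np.sqrt(factorial(n))
print(np.allclose(state, np.eye(nmax + 1)[n]))      # True: (b^dag)^n/sqrt(n!)|0> = |n>
print((bdag @ b @ state)[n])                        # 5.0: eigenvalue of B0 = b^dag b
```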

Now suppose that the particles are fermions, and define fermion annihi-
lation and creation operators:

f0 |1 = |0, f0 |0 = 0; (12)


f0† |1 = 0, †
f0 |0 = |1. (13)

In the |0, |1 basis, these operators are the 2 × 2 matrices:


   
0 1 0 0
f0 = , f0† = . (14)
0 0 1 0
With this explicit representation, we see that they are hermitian conjugate
to each other. By construction, we cannot put two fermions in the same state
with these operators.

The algebraic properties of the fermion operators are different from those
of the boson operators. The commutator, in the |0⟩, |1⟩ basis, is

    [f0, f0†] = ( 1  0 ; 0  −1 ) ≠ I.                                                (15)
Consider the anticommutator:

{f0 , f0†}|1 = (f0 f0† + f0† f0 )|1 = |1, (16)


{f0 , f0†}|0 = |0 (17)

That is, {f0 , f0† } = 1. Also,

{f0 , f0 } = 0, (18)
{f0† , f0† } = 0. (19)

The number of particles operator is F0 = f0† f0 .
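The 2 × 2 representation above makes these relations easy to verify directly; the following minimal sketch (an added illustration) does so numerically.

```python
import numpy as np

f0 = np.array([[0., 1.], [0., 0.]])   # f0|1> = |0>, f0|0> = 0
f0dag = f0.T

print(np.allclose(f0 @ f0dag + f0dag @ f0, np.eye(2)))   # True: {f0, f0^dag} = 1
print(np.allclose(f0 @ f0, 0))                            # True: {f0, f0} = 0
print(np.diag(f0dag @ f0))                                # [0. 1.]: the number operator F0
```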


Now return to bosons, and consider two levels, φ0 and φ1 . Let |n0 , n1  be
the state with n0 bosons in φ0 and n1 bosons in φ1 . As before, define,

    b0 |n0, n1⟩ = √n0 |n0 − 1, n1⟩                                                   (20)
    b†0 |n0, n1⟩ = √(n0 + 1) |n0 + 1, n1⟩,                                            (21)

and also,

    b1 |n0, n1⟩ = √n1 |n0, n1 − 1⟩                                                   (22)
    b†1 |n0, n1⟩ = √(n1 + 1) |n0, n1 + 1⟩.                                            (23)

In addition to the earlier commutation relations, we have that the anni-


hilation and creation operators for different levels commute with each other:

[b0 , b1 ] = 0; [b†0 , b1 ] = 0 (24)


 
b0 , b†1 = 0; [b†0 , b†1 ] =0 (25)

We can construct an arbitrary state from the vacuum by:

(b† )n0 (b†1 )n1


|n0 , n1  = √0 √ |0, 0 (26)
n0 ! n1 !
The total number operator is now

B = B0 + B1 = b†0 b0 + b†1 b1 , (27)

so that
B|n0 , n1  = (n0 + n1 )|n0 , n1 . (28)
In the case of fermions, we now have four possible states: |0, 0, |1, 0, |0, 1,
and |1, 1. We define:

f0† |0, 0 = |1, 0; f0† |1, 0 = 0, (29)


f0 |1, 0 = |0, 0; f0 |0, 0 = 0, (30)
f0 |0, 1 = 0; f0† |1, 1 = 0, (31)
f1† |0, 0 = |0, 1; f1 |0, 0 = f1 |1, 0 = 0, (32)
f1† |1, 0 = |1, 1; f1 |0, 1 = |0, 0, (33)
f1† |0, 1 = f1† |1, 1 = 0; f1 |1, 1 = |1, 0. (34)

But we must be careful in writing down the remaining actions, of f0 , f0† on


the states with n1 = 1. These actions are constrained by consistency with
the exclusion principle. We must get a sign change if we interchange the
two fermions in a state. Thus, consider using the f and f † operators to
“interchange” the two fermions in the |1, 1 state: First, take the fermion
away from φ1 ,
|1, 1 → |1, 0 = f1 |1, 1. (35)
Then “move” the other fermion from φ0 to φ1 ,

|1, 0 → |0, 1 = f1† f0 |1, 0. (36)

Finally, restore the other one to φ0 ,

|0, 1 → f0† |0, 1 = f0† f1† f0 f1 |1, 1 (37)

We require the result to be a sign change, i.e.,

f0† |0, 1 = −|1, 1. (38)

Since f0 is the hermitian conjugate of f0† , we also have f0 |1, 1 = −|0, 1.
We therefore have the anticommutation relations:

{f0 , f0† } = {f1 , f1† } = 1. (39)

All other anticommutators are zero, including {f0 , f1 } = {f0 , f1† } = 0, fol-
lowing from the antisymmetry of fermion states under interchange.

We may generalize these results to spaces with an arbitrary number of
single particle states. Thus, let |n0 , n1 , . . . be a vector in such a space. For
the case of bosons, we have, in general:

[bi , b†j ] = δij , (40)


[bi , bj ] = [b†i , b†j ] = 0, (41)
    |n0, n1, . . .⟩ = · · · (b†1)^{n1}/√(n1!) · (b†0)^{n0}/√(n0!) |0⟩,                (42)

where |0⟩ represents the vacuum state, with all ni = 0. Note that these
are the same as the photon annihilation and creation operators Â†_kε, Â_kε that
we defined in the Electromagnetic Interactions note, except for the √(2π/ω)
factor.
For the fermion case, we have the generalization:

{fi , fj† } = δij , (43)


{fi , fj } = {fi† , fj† } = 0, (44)
|n0 , n1 , . . . = · · · (f1† )n1 (f0† )n0 |0. (45)

The number operators are similar in both cases:

    B = Σ_i B_i = Σ_i b†_i b_i,                                                      (46)
    F = Σ_i F_i = Σ_i f†_i f_i,                                                      (47)

and [Bi , Bj ] = [Fi , Fj ] = 0.
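For two (or more) modes, one conventional way to realize these anticommutation relations with explicit matrices is to attach a (−1)^n "string" factor to one of the modes (a Jordan-Wigner-type construction). The sketch below is an added illustration, closely related to Exercise 1; the placement of the string and the basis ordering |n0, n1⟩ ↔ index 2n0 + n1 are my assumptions, chosen so as to reproduce the sign convention of Eq. (38).

```python
import numpy as np

f = np.array([[0., 1.], [0., 0.]])   # single-mode annihilator: f|1> = |0>
Z = np.diag([1., -1.])               # (-1)^n "string" operator
I = np.eye(2)

# Basis order |n0, n1> -> index 2*n0 + n1, i.e. |0,0>, |0,1>, |1,0>, |1,1>.
f0 = np.kron(f, Z)   # annihilates mode 0; the string on mode 1 carries the interchange sign
f1 = np.kron(I, f)   # annihilates mode 1

anti = lambda a, b: a @ b + b @ a
print(np.allclose(anti(f0, f0.T), np.eye(4)),   # {f0, f0^dag} = 1
      np.allclose(anti(f0, f1), 0),             # {f0, f1} = 0
      np.allclose(anti(f0, f1.T), 0))           # {f0, f1^dag} = 0

ket01 = np.array([0., 1., 0., 0.])   # |0,1>
print(f0.T @ ket01)                  # [0, 0, 0, -1] = -|1,1>, as in Eq. (38)
```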

3 Field Operators
Consider now plane wave states in a box (rectangular volume V , sides Li , i =
1, 2, 3), with periodic boundary conditions:

eik·x
φk (x) = √ , (48)
V

where ki = 2πnj /Li , nj = 0, ±1, ±2, . . .. The creation operator a†ks (a is


either b or f , for bosons or fermions, respectively), adds a particle with

momentum k and spin projection s; the annihilation operator a_ks removes
one. Note that φk (x) is the amplitude at x to find a particle added by a†ks .
Now consider the operator:

    ψ†_s(x) ≡ Σ_k (e^{−ik·x}/√V) a†_ks.                                              (49)

This operator adds a particle in a superposition of momentum states with
amplitude e^{−ik·x}/√V, so that the amplitude for finding the particle at x′ added by
ψ†_s(x) is a coherent sum of amplitudes e^{ik·x′}/√V, with coefficients e^{−ik·x}/√V.
That is, the amplitude at x′ is

    Σ_k (e^{−ik·x}/√V)(e^{ik·x′}/√V) = δ^{(3)}(x′ − x)                               (50)

[by Fourier series expansion of δ^{(3)}(x′ − x):

    g(x′) = (1/V) Σ_k e^{ik·x′} ∫_V d³(x″) e^{−ik·x″} g(x″),                          (51)

with g(x″) = δ^{(3)}(x″ − x)].


The operator ψs† (x) thus adds a particle at x – it creates a particle at
point x (with spin projection s). Likewise, the operator

    ψ_s(x) ≡ Σ_k (e^{ik·x}/√V) a_ks                                                  (52)

removes a particle at x. The operators ψs† (x) and ψs (x) are called “field op-
erators”. They have commutation relations following from the commutation
relations for the a and a† operators:

ψs (x)ψs (x ) ± ψs (x )ψs (x) = 0 (53)


ψs† (x)ψs† (x ) ± ψs† (x )ψs† (x) = 0, (54)

where the upper sign is for fermions, and the lower sign is for bosons. For
bosons, adding (or removing) a particle at x commutes with adding one at
x . For fermions, adding (or removing) a particle at x anticommutes with

adding one at x′. Also,

    ψ_s(x) ψ†_{s′}(x′) ± ψ†_{s′}(x′) ψ_s(x)
        = Σ_{k,k′} (e^{ik·x} e^{−ik′·x′}/V) {f_ks, f†_{k′s′}}  (fermions), or [b_ks, b†_{k′s′}]  (bosons)   (55)
        = Σ_{k,k′} (e^{ik·x} e^{−ik′·x′}/V) δ_{kk′} δ_{ss′}                           (56)
        = δ_{ss′} Σ_k e^{ik·(x−x′)}/V                                                 (57)
        = δ(x − x′) δ_{ss′}.                                                          (58)

Thus, adding particles commutes (bosons) or anticommutes (fermions) with


removing them, unless it is at the same point and spin projection. If it is
at the same point (and spin projection) we may consider the case with no
particle originally there – the ψ † ψ term gives zero, but the ψψ † term does
not, since it creates a particle which it then removes.
If we suppress the spin indices, we construct a state with n particles at
x1 , x2 , x3 , . . . , xn by:
1
|x1 , x2 , x3 , . . . , xn  = √ ψ † (xn ) . . . ψ † (x1 )|0. (59)
n!
Note that such states form a useful basis for systems of many identical par-
ticles, since, by the commutation relations of the ψ † ’s, they have the desired
symmetry under interchanges of xi ’s.1 For example, for fermions,

ψ † (x2 )ψ † (x1 ) = −ψ † (x1 )ψ † (x2 ) (60)

gives
|x2 , x1 , x3 , . . . , xn  = −|x1 , x2 , x3 , . . . , xn . (61)
Note also that we can add another particle, and automatically maintain the
desired symmetry:

    ψ†(x) |x1, x2, x3, . . . , xn⟩ = √(n + 1) |x1, x2, x3, . . . , xn, x⟩.            (62)
1
These Hilbert spaces of multiple, variable numbers of particles, are known as Fock
spaces.

Now let us evaluate:
1
ψ(x)|x1 , x2 , x3 , . . . , xn  = √ ψ(x)ψ † (xn ) . . . ψ † (x1 )|0
n!
1  (3) 
= √ δ (x − xn ) ± ψ † (xn )ψ(x) ψ † (xn−1 ) . . . ψ † (x1 )|0
n!
1  (3)
= √ δ (x − xn )|x1 , x2 , . . . , xn−1 
n!
±δ (3) (x − xn−1 )|x1 , x2 , . . . , xn−2 , xn 
+...+ 
    (±)^{n−1} δ^{(3)}(x − x1) |x2, x3, . . . , xn⟩ ],                                 (63)
where the upper sign is for bosons and the lower for fermions. This quantity
is non-zero if and only if x = xj (and the corresponding suppressed spin
projections are also the same). If this is the case, the n − 1 particle state
which remains after performing the operation has the correct symmetry.
Note that
1  † †
x1 , x2 , . . . , xn | = √ ψ (xn )ψ † (xn−1 ) . . . ψ † (x1 )|0
n!
1
= 0|ψ(x1 ) . . . ψ(xn ) √ . (64)
n!
Thus, by iterating the above repeated commutation process we calculate:

x1 , x2 , . . . , xn |x1 , x2 , . . . , xn  = δnn (±)P P [δ(x1 − x1 )δ(x2 − x2 ) . . . δ(xn − xn )] ,
P
(65)

where P is a sum over all permutations, P , of x1 , x2 , . . . , xn and the (−)P
factor for fermions inserts a minus sign for odd permutations.
Suppose we wish to create an n particle state φ(x1 , x2 , . . . , xn ) which has
the desired symmetry, even if φ itself does not. The desired state is:

|Φ = d3 (x1 ) . . . d3 (xn )φ(x1 , x2 , . . . , xn )|x1 , x2 , . . . , xn . (66)

We can calculate the amplitude for observing the particles at x1 , x2 , . . . , xn


by:

x1 , x2 , . . . , xn |Φ = d3 (x1 ) . . . d3 (xn )φ(x1 , x2 , . . . , xn )x1 , x2 , . . . , xn |x1 , x2 , . . . , xn 
1 
= (±)P P φ(x1, x2 , . . . , xn ). (67)
n! P

That is, x1 , x2 , . . . , xn |Φ is properly symmetrized. If φ is already properly

symmetrized, then all n! terms in P are equal and x1 , x2 , . . . , xn |Φ =
φ(x1 , x2 , . . . , xn ). If φ is normalized to one, and symmetrized, we have:

Φ|Φ = d3 (x1 ) . . . d3 (xn )φ∗ (x1 , x2 , . . . , xn )x1 , x2 , . . . , xn |

d3 (x1 ) . . . d3 (xn )φ(x1 , x2 , . . . , xn )|x1 , x2 , . . . , xn 
 
= d3 (x1 ) . . . d3 (xn ) d3 (x1 ) . . . d3 (xn )
1 
φ∗ (x1 , x2 , . . . , xn )φ(x1 , x2 , . . . , xn ) (±)P P [δ(x1 − x1 )δ(x2 − x2 ) . . . δ(xn − xn )]
n! P

= d3 (x1 ) . . . d3 (xn )|φ(x1 , x2 , . . . , xn )|2 (68)
= 1. (69)

We may write the state |Φ in terms of an expansion in the amplitudes


x1 , x2 , . . . , xn |Φ for observing the particles at x1 , x2 , . . . , xn :

|Φ = d3 (x1 ) . . . d3 (xn )|x1 , x2 , . . . , xn x1 , x2 , . . . , xn |Φ. (70)

That is, we have the identity operator on symmetrized n particle states:



In = d3 (x1 ) . . . d3 (xn )|x1 , x2 , . . . , xn x1 , x2 , . . . , xn |. (71)

If |Φ is an n particle state, then

In |Φ = δnn |Φ. (72)

Summing the n particle identity operators gives the identity on the sym-
metrized states of any number of particles: I = Σ_{n=0}^{∞} I_n, where I_0 = |0⟩⟨0|.
metrized states of any number of particles: I = ∞n=0 In , where I0 = |00|.

4 Exercises
1. Consider a two-level fermion system. With respect to the basis |0, 0⟩, |0, 1⟩, |1, 0⟩, |1, 1⟩,
   construct the explicit 4 × 4 matrices representing the creation and an-
   nihilation operators f0, f1, f†0, f†1. Check that the desired anticommu-
   tation relations are satisfied. Form the explicit matrix representation
   of the total number operator.

2. You showed in Exercise 1 of the Electromagnetic Interactions course
note that under a gauge transformation:

A(x, t) → A (x, t) = A(x, t) + ∇χ(x, t) (73)


Φ(x, t) → Φ (x, t) = Φ(x, t) − ∂t χ(x, t), (74)

that the wave function (the solution to the Schrödinger equation) has
the corresponding transformation:

ψ  (x, t) = eiqχ(x,t) ψ(x, t). (75)

Generalize this result to the case of an N particle system.

Complex Variables
020701 F. Porter
Revision 091006

1 Introduction
This note is intended as a review and reference for the basic theory of complex
variables. For further material, and more rigor, Whittaker and Watson is
recommended, though there are very many sources available, including a
brief review appendix in Matthews and Walker.

2 Complex Numbers
Let z be a complex number, which may be written in the forms:

z = x + iy (1)
= reiθ , (2)

where x, y, r, and θ are real numbers. The quantities x and y are referred
to as the real and imaginary parts of z, respectively:

x = (z), (3)
y = (z). (4)

The quantity r is referred to as the modulus or absolute value of z,



r = |z| = x2 + y 2 , (5)

and θ is called the argument, θ = arg(z), or the phase, or simply the angle
of z. We have the transformation between these two representations:

x = r cos θ, (6)
y = r sin θ, (7)

and finally also


θ = tan−1 (y/x), (8)
with due attention to quadrant. Noticing that eiθ = ei(θ+2nπ) , where n is any
integer, we say that the principal value of arg z is in the range:

−π < arg z ≤ π. (9)

1
y

z
r

θ
x
−θ

z*

Figure 1: Complex number and its complex conjugate.

The complex conjugate, z ∗ , of z is obtained from z by changing the


sign of the imaginary part:

    z∗ = x − iy = re^{−iθ}.                                                          (10)
The product of two complex numbers, z1 and z2, is given by:

z1 z2 = r1 eiθ1 r2 eiθ2 = r1 r2 ei(θ1 +θ2 )


= (x1 + iy1 )(x2 + iy2 ) = (x1 x2 − y1 y2 ) + i(x1 y2 + x2 y1 ). (11)

Notice that
zz ∗ = x2 + y 2 = |z|2 . (12)
It is also interesting to notice that in the product:

z1 z2∗ = (x1 x2 + y1 y2 ) − i(x1 y2 − x2 y1 ), (13)

the real part looks something like a “scalar product” of two vectors, and the
imaginary part resembles a “cross product”.

3 Complex Functions of a Complex Variable


We are interested in (complex-valued) functions of a complex variable z. In
particular, we are especially interested in functions which are single-valued,
continuous, and possess a derivative in some region.

2
Defining a suitable derivative requires some care. Start with the definition
for real functions of a real number:
df f (x + Δx) − f (x)
f  (x) = (x) = lim . (14)
dx Δx→0 Δx
But in the complex case we have real and imaginary parts to worry about.
First, define what we mean by a limit. Let f (z) be a single-valued function
defined at all points in a neighborhood of z0 (except possibly at z0 ). Then
we say that f (z) → w0 as z → z0 , or limz→z0 f (z) = w0 , if, for every  > 0,
there exists a δ > 0 such that (Fig. 2):
|f (z) − w0 | <  ∀z satisfying 0 < |z − z0 | < δ. (15)
Note that we have not required “f (z0 )” to be defined, in order to define the
limit (Fig. 3).
y

d
z0

Figure 2: Circle of radius δ about z0 .

"f(z)"

z0 "z"

Figure 3: Function not defined at z0 .

However, in order to define the derivative at z0 , we require f (z0 ) to be


defined. If lim_{z→z0} f(z) = f(z0), where the limit exists, then we say that f (z) is
continuous at z0 . In general f (z) is complex, and we may write:
f (z) = u(x, y) + iv(x, y), (16)

3
where u and v are real. Then lim_{z→z0} f(z) = f(z0) implies

lim u(x, y) = u(x0 , y0 ), (17)


x→x0 , y→y0
lim v(x, y) = v(x0 , y0 ), (18)
x→x0 , y→y0

where the path of approach to the limit point must lie within the region of
definition. We may thus define continuity to the boundary of a closed region,
if the path is within the region.
Now, in our definition of f  (z), we note that there are an infinite number
of possible paths along which we can make Δz = Δx + iΔy → 0.

x
Figure 4: Various paths along which to approach a point.

For our derivative to be well-defined, we demand that the value of f  (z)


be independent of the way in which Δz → 0. Thus, if we approach along the
path Δx = 0:
f (z + iΔy) − f (z)
f  (z) = lim
Δy→0 iΔy
 
u(x, y + Δy) − u(x, y) i [v(x, y + Δy) − v(x, y)]
= lim +
Δy→0 iΔy iΔy
∂u ∂v
= −i + . (19)
∂y ∂y
If instead we make our approach along the path Δy = 0, we obtain:
f (z + Δx) − f (z)
f  (z) = lim
Δx→0 Δx
∂u ∂v
= +i (20)
∂x ∂x

4
The two expressions are equal if and only if the real and imaginary parts are
separately equal:
∂u ∂v
= (21)
∂x ∂y
∂u ∂v
= − . (22)
∂y ∂x
These important conditions are known as the Cauchy Riemann equations,
or C-R equations, for short. We may state this in the following theorem:
Theorem: If u, v possess first derivatives throughout a neighborhood of z0 ,
which are continuous at z0 , then the Cauchy Riemann equations, if
df
satisfied, guarantee the existence of dz (z0 ).
Proof: Write:

    Δu = (∂u/∂x)Δx + (∂u/∂y)Δy + ε_{ux}Δx + ε_{uy}Δy                                 (23)
    Δv = (∂v/∂x)Δx + (∂v/∂y)Δy + ε_{vx}Δx + ε_{vy}Δy,                                (24)

where the correction terms for non-linearities, ε_{ij}, approach zero as
Δx, Δy → 0.
Using the Cauchy Riemann equations, we obtain:

    Δu = (∂u/∂x)Δx − (∂v/∂x)Δy + ε_{ux}Δx + ε_{uy}Δy                                 (25)
    Δv = (∂v/∂x)Δx + (∂u/∂x)Δy + ε_{vx}Δx + ε_{vy}Δy.                                (26)

Thus,

    Δf/Δz = (Δu + iΔv)/Δz
          = [ (∂u/∂x)(Δx + iΔy) + i(∂v/∂x)(Δx + iΔy) + ε_x Δx + ε_y Δy ] / (Δx + iΔy),   (27)

where ε_x ≡ ε_{ux} + iε_{vx} → 0, ε_y ≡ ε_{uy} + iε_{vy} → 0 as Δx, Δy → 0.
Furthermore,

    |Δx/(Δx + iΔy)| ≤ 1,    |Δy/(Δx + iΔy)| ≤ 1.                                     (28)

Therefore,

    df/dz = lim_{Δz→0} (Δu + iΔv)/Δz = ∂u/∂x + i ∂v/∂x,                              (29)

independent of path. This completes the proof.

5
We have the following equivalent ways of expressing the derivative:

    df/dz = ∂u/∂x + i ∂v/∂x = ∂v/∂y + i ∂v/∂x = ∂v/∂y − i ∂u/∂y = ∂u/∂x − i ∂u/∂y.   (30)
It is of interest to also consider this discussion in terms of the polar form.
In this case, we may consider the Δr = 0 path:
y

z+D z Dz
Dq z
q
x

Figure 5: Polar path description.

f (zeiΔθ ) − f (z)
f  (z) = lim
Δθ→0 z(eiΔθ − 1)
 
u(r, θ + Δθ) − u(r, θ) i [v(r, θ + Δθ) − v(r, θ)]
= lim +
Δθ→0 izΔθ izΔθ
i ∂u 1 ∂v
= − + . (31)
z ∂θ z ∂θ
Similarly, for the Δθ = 0 path:
 
f (r + Δr)eiθ − f (z)
f  (z) = lim
Δr→0
 eiθ Δr 
u(r + Δr, θ) − u(r, θ) i [v(r + Δr, θ) − v(r, θ)]
= lim +
Δr→0 eiθ Δr eiθ Δr
1 ∂u i ∂v
= iθ + iθ . (32)
e ∂r e ∂r
Hence,
∂u ∂v i ∂u 1 ∂v
eiθ f  (z) =
+i =− + . (33)
∂r ∂r r ∂θ r ∂θ
We have thus obtained the Cauchy-Riemann relations in polar form:
∂u 1 ∂v
= (34)
∂r r ∂θ
∂v 1 ∂u
= − . (35)
∂r r ∂θ

6
A function f (z) of complex variable z is called analytic at the point z0 if
it is single-valued and possesses a derivative at every point in a neighborhood
of z0 . Otherwise, z0 is a singular point of f (z). If f (z) is analytic at every
point in a simply connected open region (“domain”) D, then it is referred
to as analytic throughout D. Other terms that are often used for this (with
some variation of meaning) are regular and holomorphic. Sometimes the
term “analytic” is not required to be single-valued, that is, single-valuedness
in a domain D means that, after following any closed path in D, the function
f (z) returns to its initial value. If f (z) is analytic for all finite z, then f (z)
is an entire function.
Examples:
• f (z) = z 3 is an entire function.
• f (z) = 1/z 2 is analytic everywhere except at z = 0, where it is not
defined. We note that for this function,
x2 − y 2 2xy
u= , v=− . (36)
(x2 + y 2 )2 (x2 + y 2)2

• f (z) = z 3/2 is analytic everywhere except at z = 0. Let’s look at why


this is the case in some detail. We may write
z 3/2 = r 3/2 e3iθ/2 = r 3/2 (cos 3θ/2 + i sin 3θ/2). (37)
Hence,
∂u 3 1/2
= r cos 3θ/2 (38)
∂r 2
∂v 3 1/2
= r sin 3θ/2 (39)
∂r 2
1 ∂u 3
= − r 1/2 sin 3θ/2 (40)
r ∂θ 2
1 ∂v 3 1/2
= r cos 3θ/2. (41)
r ∂θ 2
Comparison with Eqns. 34 and 35 shows that the C-R conditions are
satisfied everywhere. Now consider a path containing the origin as an
interior point (see Fig. 6). We’ll start at z = εe^{i0}, with ε real. Table 1
shows the values of f (z) as we traverse the path once around the origin.
We see that f (z) = z 3/2 is multi-valued in any neighborhood of the
origin, and hence is not analytic at z = 0.

7
y

Figure 6: Circular path around origin, radius .

Table 1: Evaluation of the function z^{3/2} at various points on a circle.

    θ        z        f(z) = z^{3/2}
    0        ε        ε^{3/2}
    π/2      iε       ε^{3/2} e^{i3π/4}
    π        −ε       ε^{3/2} e^{i3π/2} = −iε^{3/2}
    3π/2     −iε      ε^{3/2} e^{i9π/4}
    2π       ε        ε^{3/2} e^{i3π} = −ε^{3/2}
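The jump recorded in the last row of Table 1 is easy to reproduce numerically (an added illustration; the radius ε = 0.5 is an arbitrary choice):

```python
import numpy as np

eps = 0.5                                          # assumed small radius
f = lambda theta: eps**1.5 * np.exp(1.5j*theta)    # z^(3/2) with z = eps*e^{i theta}
print(f(0.0), f(2*np.pi), f(4*np.pi))              # one circuit flips the sign; two restore it
```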

4 Riemann Surfaces
Let us continue to think about the interesting f (z) = z 3/2 example. Note
that if θ = 4π and r = ε, then z^{3/2} = ε^{3/2} e^{i6π} = ε^{3/2}, so we come back to
the θ = 0 value after two circuits. Thus z 3/2 is a double-valued function. We
may visualize this behavior via the use of Riemann surfaces, or sheets.
For z 3/2 , we have two sheets (Fig. 7).
The point z = 0 is called a branch point. Since there are only a finite
number of branches (2) for this function, the origin is called an algebraic
branch point.
For another example, the function f (z) = z 1/4 will have four branches,
see Table 2 and Fig. 8.
Now consider the function f (z) = ln z, defined by ef (z) = z:
f (z) = ln z = ln r + iθ. (42)
This function has an infinite number of branches. In this case, the point
z = 0 is called a logarithmic branch point.
We note a couple of things about branches:
• There may be many branch points for a function.

8
Figure 7: Two Riemann sheets for the double-valued function z 3/2 . The
lower sheet is for 0 ≤ θ < 2π, 4π ≤ θ < 6π, etc., and the top sheet is for
2π ≤ θ < 4π, etc. The branch cuts are indicated by the cuts in the planes.

Table 2: The function f (z) = z 1/4 , evaluated at multiples of 2π.

θ eiθ/4
0 1
2π eiπ/2 = i
4π eiπ = −1
6π ei3π/2 = −i
8π ei2π = 1

• There are many ways to make branch cuts, but they can only terminate
at a branch point, they cannot intersect themselves, and they must have
the same form on all sheets.

For a slightly more complicated example illustrating these ideas, consider


the function:
    f(z) = √(1 − z²) = (1 − z)^{1/2} (1 + z)^{1/2}.                                  (43)
This function has singularities (branch points) at z = ±1. There are two
sheets, and various possible ways of choosing the branch cuts, as illustrated
in Fig. 9.
There is a choice in how to take branch cuts – one makes cuts that are
convenient to the problem at hand (for example, when we integrate along a
path, we arrange it so that the path does not cross a cut). Branch cuts are
used in effect to make multi-valued functions “single-valued” – if you don’t
cross a branch, you stay on the same sheet.

9
Sheet
4

3

2

1
0,8π

Figure 8: Four Riemann sheets for the quadruple-valued function z 1/4 . The
view is edge-on, with the branch cut at the transitions among the sheets.

Figure 9: Some possible choices of branch cuts for the function √(1 − z²).

5 Integration of Complex Functions


As with the derivative, we must face the problem of forming an integral that
makes sense in some correspondence with the integral for real functions. For
real functions, the indefinite integral may be “defined” as the inverse of dif-
ferentiation (i.e., as the limit of a sum, rather than the limit of a difference).
For a complex function, such an indefinite integral may not always exist.
Consider f (z) = z ∗ = x − iy. Suppose

F (z) = f (z) dz = U + iV, (44)

and (if integration is inverse of differentiation)


dF
= f (z) = x − iy. (45)
dz
If the derivative exists, we must be able to use the Cauchy-Riemann equa-
tions, hence,
dF ∂U ∂U
= −i = x − iy. (46)
dz ∂x ∂y
Thus,
∂U ∂U
= x, = y, (47)
∂x ∂y

10
and
∂2U ∂2U
+ = 2. (48)
∂x2 ∂y 2
Let us see what the Cauchy-Riemann equations imply for this quantity:
∂  ∂U ∂V 
= (49)
∂x ∂x ∂y
∂  ∂U ∂V 
= − . (50)
∂y ∂y ∂x
Therefore:
∂2U ∂2U
+ = 0, (51)
∂x2 ∂y 2
which may be recognized as Laplace’s Equation in two dimensions. Thus,
F (z) cannot be an analytic function; it does not possess a derivative, and
there exists no function with derivative x − iy. This suggests that we should
restrict consideration to functions which are analytic in the region of interest.
Referring to Fig. 10, let us consider the definite integral:
 β
f (z)dz. (52)
α

y
β

Figure 10: Possible paths of integration from α to β.

There are an infinite number of possible paths to integrate along. In general,


we must specify the path, e.g.,
 β
f (z) dz. (53)
α
C
To define this integral, first divide path C into n intervals by points
z0 = α, z1 , z2 , . . . , zn = β, as in Fig. 11. Let Δj z ≡ zj − zj−1 , and let zj be

11
a point on C between zj−1 and zj . Then we define the line integral along C
as:  β n
f (z)dz = lim f (zj )Δj z, (54)
α n→∞
j=1
C
where we require the intervals to satisfy:
n
lim max |Δj z| = 0, (55)
n→∞ j=1

and the limit must exist, of course. Note that this definition is compatible
with the usual definition for real variables.

β = zn
z 'j zj
z j-1

α= z0
Figure 11: Dividing a path into intervals to obtain an approximate integral.

We list some immediate consequences of our definition:


1. Considering Δj z → −Δj z, we have the path-reversed integral:
 α  β
f (z)dz = − f (z)dz. (56)
β α
C C
2. If k is any complex constant, then
 β  β
kf (z)dz = k f (z)dz. (57)
α α
C C
3. If the integrals of f and g separately exist, then the integral of their
sum exists, and:
 β  β  β
[f (z) + g(z)] dz = f (z)dz + g(z)dz. (58)
α α α
C C C

12
4. If γ is a point on C (between α and β), then
 γ  β  β
f (z)dz + f (z)dz = f (z)dz. (59)
α γ α
C C C
Toward proving this, note that we can always arrange our subintervals
such that γ is a dividing point.
5. If M = maxβα |f (z)| (including the endpoints) then:
C
 β  
n 
   
 f (z)dz  = n→∞
lim f (zj )Δj z 
α j=1
C
n 

 
≤ lim f (zj )Δj z  (follows from triangle inequality)
n→∞
j=1
 β
≤ M |dz| = MLC , (60)
α
C
where LC is the length of the integration path (in the usual Euclidean
sense).

6 Cauchy’s Theorem
If a function f (z) is analytic at all points on and inside a contour C, then

f (z)dz = 0. (61)
C

Note that by
“contour”, we mean a simple closed curve. We could also use
the notation to stress this. Our assertion is known as Cauchy’s Theorem.
Let us prove the theorem: Assume f (z) is analytic as stated. Write
f (z) = u(x, y) + iv(x, y). Then:
 
f (z) dz = (u + iv)(dx + idy)
C C 
= (udx − vdy) + i (udy + vdx) (62)
C C

Let S stand for the region enclosed by contour C. Green’s theorem states
that, for functions α and β:
 
∂β ∂α
αdx + βdy = − dxdy. (63)
C S ∂x ∂y

13
C
S

Figure 12: Contour and surface of integration in Green’s theorem.

Therefore,
  
∂v ∂u ∂u ∂v
f (z) dz = − + dxdy + i − dxdy
C S ∂x ∂y S ∂x ∂y
= 0, by the Cauchy-Riemann relations. (64)
Cauchy’s theorem tells us that the integral of an analytic function is
path-independent in a domain of analyticity:
 β  β
f (z) dz = f (z) dz
α C α C
 β
= f (z) dz, (65)
α

where the latter equality is without ambiguity, due to Cauchy’s theorem.


y
β
C'

α C

Figure 13: Equivalent paths of integration in a region of analyticity.

Note that the way we have stated Cauchy’s theorem, it holds for func-
tions which have singularities, provided our contours do not “encircle” the
singularities:

14
C2

C1
(a) (b)


Figure 14: (a) C1 +C2 f (z) dz = 0, where C2 encircles a singularity, but the
“contour” C1 + C2 does not. Integrals along the portions joining C1 and C2
cancel out. (b) A branch cut may be chosen for convenience, so that the
contour does not cross it.

7 Indefinite Integral of an Analytic Function


Let f (z) be analytic in simply connected domain D. Then
 z
F (z) = f (z  ) dz  (66)
z0

depends only on z and z0 (and not the path), as long as the path is entirely
in D.
What is F  (z) = dF
dz
(z) (for z ∈ D)?
  
F (z + Δz) − F (z) 1 z+Δz z
= f (z  )dz  − f (z  ) dz 
Δz Δz z0 z0
 z+Δz
1
= f (z  )dz 
Δz z
 z+Δz
1
= f (z) + [f (z  ) − f (z)] dz  . (67)
Δz z
The last integral is path independent in D, so chose for path the straight
line segment joining z and z + Δz (noting that, for Δz small enough, such
a path must exist). Thus, by the continuity of f (z), given an  > 0, we can
always find |Δz| small enough such that |f (z  ) − f (z)| <  for any z  on the
path. Thus,  
 z+Δz 
 
 [f (z  ) − f (z)] dz   < |Δz|. (68)
 z 

Given any  > 0 then, we can find a Δz > 0 such that


 
 F (z + Δz) − F (z) 
 


− f (z) < . (69)
Δz

15
Therefore, 
 d z
F (z) = f (z  ) dz  = f (z). (70)
dz z0
The indefinite integral of an analytic function is an analytic function.
It is important, when performing integrations, to be careful about sin-
gularities and regions of non-analyticity. For example, consider the integral
∫_{−1}^{+1} (1/z) dz. We might try an integration path along a semi-circle in the positive
y plane – 1/z is analytic there.


e -1 +1
-1 +1 x x

(a) (b)

Figure 15: Two possible semi-circular paths from −1 to +1.

We let z = eiθ , and hence dz = ieiθ dθ. Alternatively, we could choose to


integrate along a semicircle in the negative y plane – 1/z is analytic there as
well. The two choices yield:
 0
I+ = i e−iθ eiθ dθ = −iπ (71)
π
 2π
I− = i e−iθ eiθ dθ = iπ. (72)
π

The two answers are different! The path-dependence is a result of the fact
that we have chosen paths which lie in different simply-connected domains
of analyticity. There is a branch cut from the origin, a singular point. Note
that, while 1/z is not multi-valued, its integral (ln z) is.

8 Cauchy Integral Formula


Suppose f (z) is analytic everywhere in some domain D. Consider the inte-
gral:

f (z)
dz, (73)
C z − z0

16
f (z)
where C is contained in D, and z0 is interior to C. Thus, z−z 0
is analytic
everywhere on and inside C, except at the point z = z0 . The integral is
unchanged if we deform the contour to the circle C0 with center at z0 :

C
C0

z0

Figure 16: Domain D, contour C and deformed contour C0 about point z0 .

 
f (z) f (z)
dz = dz
C z − z0 C0 z − z0
 
f (z0 ) f (z) − f (z0 )
= dz + dz. (74)
C0 z − z0 C0 z − z0
Consider the second of the two integrals in the above expression:
  
 f (z) − f (z0 )  |f (z) − f (z0 )|

 dz  ≤ |dz|. (75)
 C0 z − z0  C0 |z − z0 |

Since f (z) is analytic at z0 , it must be continuous there. Hence, given any


 > 0, there exists a δ > 0 such that |f (z) − f (z0 )| <  whenever |z − z0 | < δ.
We pick an , and let δ = |z − z0 |, i.e., we pick a circle of small enough radius
such that |f (z) − f (z0 )| <  on the circle. Remember that the value of the

17
integral does not depend on the radius of the circle. Thus,
 

    | ∮_{C0} [f(z) − f(z0)]/(z − z0) dz | ≤ (ε/δ) ∮_{C0} |dz|
                                          ≤ 2πε.                                    (76)
The integral is smaller than any positive number, i.e., is equal to zero. There-
fore,
 
f (z) dz
dz = f (z0 )
C0 z − z0 C0 z − z0
 2π
ireiθ dθ
= f (z0 ) (letting z − z0 = reiθ )
0 reiθ
= 2πif (z0 ). (77)
We have derived Cauchy’s Integral Formula: For any function f (z) which
is analytic on and inside the contour C,
1  f (z)
f (z0 ) = dz. (78)
2πi C z − z0
Note that the Cauchy integral formula tells us that if we know the value of a
function everywhere along a closed contour, then we know its value at every
point inside the contour, provided the function is analytic on and inside the
contour.
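The formula is also easy to check numerically. The sketch below is an added illustration (not part of the notes): it discretizes a circular contour and compares the integral with f(z0) for an entire function; the contour, the test function, and z0 are arbitrary choices.

```python
import numpy as np

def cauchy_value(f, z0, center=0.0, radius=2.0, n=2000):
    t = np.linspace(0.0, 2*np.pi, n, endpoint=False)
    z = center + radius*np.exp(1j*t)
    dz = 1j*radius*np.exp(1j*t) * (2*np.pi/n)       # dz along the discretized circle
    return np.sum(f(z)/(z - z0) * dz) / (2j*np.pi)

f = lambda z: np.exp(z) + z**2
z0 = 0.3 - 0.5j
print(cauchy_value(f, z0))   # agrees with f(z0) below to high accuracy
print(f(z0))
```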

8.1 Cauchy Integral Formula and Derivatives of an An-


alytic Function
Start with Cauchy’s integral formula (assuming f (z) appropriately analytic),
and take derivatives:

1 f (z  ) 
f (z) = dz (79)
2πi C z − z

df 1 f (z  )
= dz  (80)
dz 2πi C (z − z)
 2

d2 f 2 f (z  )
= dz  (81)
dz 2 2πi C (z  − z)3
···

dn f n! f (z  )
= dz  . (82)
dz n 2πi C (z  − z)n+1
Is this procedure justified? If so, then we have evidently shown that the
derivative of an analytic function is analytic, at least at all points inside C.

18
If f (z) is analytic, we know its derivative exists:
f (z + h) − f (z)
f  (z) = lim
h
h→0
 
1 f (z  ) dz  f (z  ) dz 
= lim −
h→0 2πih C z − z − h C z − z
 
1 f (z  ) dz 
= lim (83)
2πi h→0 C (z  − z − h)(z  − z)
Adding and subtracting f (z  )/(z  − z)2 to the integrand, we obtain:
 
 1 f (z  ) dz  h f (z  ) dz 
f (z) = + lim . (84)
2πi C (z − z)
 2 h→0 2πi C (z  − z − h)(z  − z)2
By assumption, f (z  ) is continuous on C, hence it is bounded. Likewise, (z  −
z)−2 is bounded on C. Furthermore, take h < minC 21 |z  − z|, guaranteeing
that |z  − z − h| > 0. Therefore,
 
 f (z  )

  ≤ K < ∞, (85)
 (z − z)2 (z  − z − h)
i.e., the integrand is bounded for z  on C by some finite number K. Then,
  
 h f (z  ) dz   K
 
 lim ≤ lim |h|LC = 0, (86)
h→0 2πi C (z − z − h)(z − z)
  2  2π h→0
where LC is the length of contour C. Hence,

 1  f (z  )
f (z) = dz  , (87)
2πi C (z  − z)2
as desired.
Then we may similarly consider:
  
f  (z + h) − f  (z) 1 1 1
lim = lim f (z  ) dz  − 
h→0 h h→0 2πih C (z − z − h)
 2 (z − z)2

1 2(z  − z − h/2)
= lim f (z  ) dz  
h→0 2πi C (z − z)2 (z  − z − h)2

2 f (z  )
= dz  + lim hAh , (88)
2πi C (z  − z)3 h→0

where Ah is a bounded function of z when h < 12 |z  − z|. Hence, f  exists,


and
 2  f (z  )
f = dz  . (89)
2πi C (z  − z)3

19
The same argument may be continued indefinitely, since the integral repre-
sentation has f (z), which we know is continuous, hence bounded. Thus, we
have established the result for the nth derivative:

(n) n! f (z  )
f = dz  , (90)
2πi C (z − z)
 n+1

as hoped.

8.2 Mean Value Theorem from the Cauchy Integral


Formula
If f (z) is analytic on and within contour C, we know that

1 f (z  ) dz 
f (z) = . (91)
2πi C z − z
Consider contour C that is a circle of radius r with center at z0 :

1 f (z  ) dz 
f (z0 ) =
2πi C z  − z0
 2π  2π
1 f (z  )ireiθ dθ 1
= = f (z  )rdθ
2πi 0 reiθ 2πr 0

1
= f (z  ) ds, (92)
2πr C
where ds is an element of circular arc. Thus, f (z0 ) is given by the aver-
age value of f (z) on a circle centered at z0 (entirely within the domain of
analyticity).

9 Taylor Series
Let f (z) be analytic in domain D with z0 ∈ D, and circle C ⊂ D centered
at z0 [hence, f (z) is analytic within and on C]. Let z = z0 + h be interior to
C. Then, use Cauchy’s integral:

1  f (z  ) dz 
f (z) = f (z0 + h) =
2πi C z  − z0 − h
 
1 1 h
= f (z  ) dz   +  +···
2πi C z − z0 (z − z0 )2
hn hn+1 
+  + , (93)
(z − z0 )n+1 (z  − z0 )n+1 (z  − z0 − h)

20
R
z0 z
C
D
Figure 17: Illustration for Taylor series discussion.

where we have used


1 1 h
= + . (94)
(z  − z0 − h) (z  − z0 (z  − z0 )(z  − z0 − h)
We also know that:

(n) n! f (z  )
f = dz  , (95)
2πi C (z − z)
 n+1

Comparing, we have:

    f(z) = f(z0) + h f′(z0) + (h²/2!) f″(z0) + · · · + (hⁿ/n!) f^{(n)}(z0)
           + (h^{n+1}/2πi) ∮_C f(z′) dz′ / [ (z′ − z0)^{n+1} (z′ − z0 − h) ].         (96)

Thus, we have

    f(z) = Σ_{k=0}^{n} [ f^{(k)}(z0)/k! ] (z − z0)^k + R_n,                          (97)

where

    R_n = [ (z − z0)^{n+1} / 2πi ] ∮_C f(z′) dz′ / [ (z′ − z0)^{n+1} (z′ − z) ].      (98)
We see that term by term this is the same form as the Taylor series expansion
for a real function of a real variable.
Let us investigate the remainder term, Rn . In particular, how big is it?
We first notice that f (z  ) and 1/|z  − z| are continuous, hence bounded, on
C:  
 f (z  ) 
 
   ≤ M, z  ∈ C, z inside C. (99)
z − z 

Let R be the radius of C. Then:


  
1  f (z  ) dz  

|Rn | = (z − z0 ) n+1

2π  C (z − z0 )
 n+1 (z − z) 


M 1
≤ |z − z0 |n+1 n+1 2πR
2π R
 
 z − z0 n+1
≤ MR   . (100)
R 
 
 
Since  z−z
R
0
 < 1, we can approximate f (z) to any desired accuracy with our
finite Taylor series expansion.

10 Bolzano-Weierstrass Theorem
We wish to consider infinite series next, which means we must concern our-
selves with issues of convergence. Let us begin with sequences. Given any
sequence of complex numbers, z1 , z2 , . . . ≡ {zn }, we say that the sequence
{zn } tends to the limit L as n → ∞:

lim zn = L,
n→∞
(101)

if, for every  > 0, there exists N such that |zN +k − L| <  for all positive
integers k. If {zn } is such that for any real number G, we can find N so that
|z_{N+k}| > G for all positive integers k, then we say that |z_n| tends to ∞ as
n → ∞.
Finally, if a sequence does not tend to a unique limit, and does not tend
to plus or minus infinity, then the sequence is said to oscillate.

Definition: A limit point of a set S is a point such that there are an


unlimited number of elements of S which are arbitrarily close to the
limit point.

22
For example, 1 is a limit point for the sequence 1 + 1/n (even though 1 is
not an element of the sequence). For another example, 1 is a limit point for
the sequence 1, 2, 1, 2, 1, 2, 1, 2, . . .

Theorem: (Bolzano-Weierstrass) If {xn } is an infinite sequence of real num-


bers, and there exists a, b such that a ≤ xn ≤ b for all n (where a and
b are independent of n), then {xn } has at least one limit point.

Proof: Let G be a real number such that G > |a|, G > |b|. Then, G > |xn |
for all n. Consider the interval I0 = (−G, G). Cut it in half (say, to
(−G, 0) and [0, G)): At least one subinterval must contain an infinite
number of members of the sequence {xn }. Call the rightmost such
interval I1 . Now cut I1 in half. Again, at least one subinterval must
contain an infinite number of members of the sequence {xn }. Call the
rightmost such interval I2 . We may continue this interval subdivision
indefinitely, making our interval as small as we please. In the nested set
of intervals I1 , I2 , I3 , . . . there exists a point L which belongs to all the
intervals of the nest. Choose k sufficiently large such that the length
of Ik is less than any given  > 0. Then if {xn }k is the infinite set of
members of {xn } which lies in Ik , we have that |xn − L| <  for all
members of {xn }k . Hence L is a limit point of the sequence.

11 Cauchy’s Condition for the Existence of a


Limit, or, Cauchy’s Principle of Conver-
gence
Theorem: A sequence of complex numbers z1 , z2 , . . . has a limiting value if
and only if, given any  > 0 there is an N such that |zN +k − zN | < 
for all positive integers k.

This convergence condition is referred to as Cauchy’s condition. Note the


distinction between this theorem and the definition of the limiting value. To
apply this test, one does not need to know, a priori, what the limit is.

Proof: Necessity: We suppose a limit, L, exists. Then, given any  > 0,


there exists an N such that |zN − L| < /2, and |zN +k − L| < /2 for
all positive integers k. By the triangle inequality:

|zN +k − zN | ≤ |zN +k − L| + |zN − L| < . (102)

23
Sufficiency: We suppose that given an ε > 0 there exists an N such
that |z_{N+k} − z_N| < ε for all positive integers k. But the hypotenuse
of a right triangle is longer than either other leg, and hence:
    ε > |z_{N+k} − z_N| ≥ |x_{N+k} − x_N|                                            (103)
      ≥ |y_{N+k} − y_N|.                                                             (104)
Thus, we may consider a real sequence {xn } which satisfies the Cauchy
condition. Consider  = 1, and pick an N = M such that:
|xM +k − xM | < 1 ∀k = 1, 2, 3, . . . (105)
Let a1 , b1 be the least and greatest values, respectively, of the finite
sequence x1 , x2 , . . . , xM . Let a = a1 −1 and b = b1 +1. Then a < xn < b
for all n. By the Bolzano-Weierstrass theorem, {xn } has at least one
limit point, G.
Now we must demonstrate that there is only one limit point: Suppose
there are at least two, G and H. Then, given  > 0, there exists an
n such that |xn+p − xn | < , by hypothesis, and there exists positive
integers q and r such that |G − xn+q | <  and |H − xn+r | < , since G
and H are limit points. Thus,
|G − H| = |G − xn+q + xn+q − xn + xn − xn+r + xn+r − H|
≤ |G − xn+q | + |xn+q − xn | + |xn+r − xn | + |H − xn+r |
< 4. (106)
Hence G=H, and there is only one limit point. Thus, given δ > 0,
there are at most a finite number of terms of the sequence outside the
interval (G − δ, G + δ), so G is the limit of {xn }.
Similarly, the imaginary part sequence has a limit, hence {zn } has
a limit [noting that if limn→∞ zn = L, and limn→∞ zn = L , then
limn→∞ (zn + zn ) = L + L ].

12 Infinite Series
Given a sequence {un }, we can construct a sequence:
S0 = u 0 (107)
S1 = u0 + u1 (108)
..
.

n
Sn = uk . (109)
k=0

24
These are the “partial sums” of the infinite series:


S= uk . (110)
k=0

The infinite series is said to converge if, given  > 0 there exists S and
n0 such that:
|S − Sn | < , ∀n > n0 . (111)

If the series ∞ n=0 un converges: It is said to be absolutely convergent if
∞
n=0 |un | converges; otherwise it is conditionally convergent. Note that an
absolutely convergent series may be rearranged at will, with identical results,
but this doesn’t hold for a conditionally convergent series.
We give some tests for convergence, leaving the proofs to the reader:
1. Cauchy Integral test for convergence: If f (x) is a positive, real,

decreasing function of x for real x ≥ 1, then the series S = ∞n=1 f (n)
converges or diverges, depending on whether the integral
 n
lim
n→∞
f (x)dx (112)
1

converges or diverges.
∞
2. Comparison test for absolute convergence: S = n=0 un is absolutely
convergent if
|un | < c|vn |, ∀n > N, (113)
∞
c is independent of n, and n=0 vn is known to be absolutely convergent.

3. d’Alembert’s ratio test for absolute convergence: ∞ n=0 un converges
absolutely if  
 un+1 
limn→∞   < 1, (114)
un 
where lim is the “limit superior”, or least upper bound of all convergent
subsequences of {un }. The sum diverges if
 
 un+1 

limn→∞   > 1. (115)

un
4. Raabe’s test for absolute convergence: If
 
 un+1 
lim   = 1, (116)
n→∞  u 
n

and   
 un+1 

limn→∞ n  −1 < −1, (117)

un
∞
then n=0 un converges absolutely.

25
5. Cauchy’s test for absolute convergence: If

limn→∞ |un |1/n < 1, (118)


∞
then n=0 un converges absolutely.

13 Series of Functions
If the terms of an infinite series are functions of complex variable z, then the
series may converge or not, depending on the value of z. We are interested
in the region of convergence of such a series. We are also interested in
continuity, integrability, and differentiability of such a series (especially of
analytic functions, including power series).
 N
If S(z) = ∞ n=0 un (z) and SN (z) = n=0 un (z), then S(z) is said to be
uniformly convergent over the set of points {z|z ∈ R} = R if, given any
 > 0, there exists an N such that:

|S(z) − SN +k (z)| < , ∀k = 0, 1, 2, . . . , and ∀z ∈ R. (119)

Note that the condition of uniform convergence is in a sense stronger than


simple convergence – S(z) may converge for all z ∈ R, without being uni-
formly convergent. As an example, consider f (z) = 1/(1 − z), for R = {z :
|z| < 1}.
A necessary and sufficient condition for uniform convergence is “Cauchy’s

principle for uniform convergence: Given S(z) = ∞ n=0 un (z) which converges
for all z ∈ R, where R is a closed region, and any  > 0, then S(z) converges
uniformly in R if there exists an N such that

|SN (z) − SN +k (z)| < , ∀k = 0, 1, 2, . . . , and ∀z ∈ R. (120)

It is left to the reader to prove this, using techniques similar to methods


already encountered.
Another, sufficient, test for uniform convergence is the “Weierstrass M
test”: If |un (z)| ≤ Mn , where Mn is a positive real number, independent

of z ∈ R, and if ∞ n=0 Mn converges, then S(z) is uniformly convergent on
z ∈ R.
Let us consider the following example: Suppose we have the real series
x2 x2
S(x) = x2 + + +··· (121)
1 + x2 (1 + x2 )2

x2
= 2 n
. (122)
n=0 (1 + x )

26
We see that S(x) converges absolutely for all real x, since:

    S_N(x) = Σ_{n=0}^{N} x²/(1 + x²)ⁿ                                                (123)
           = 0                              if x = 0,
           = 1 + x² − 1/(1 + x²)^N          if x ≠ 0.                                (124)

Thus, S(x) converges absolutely for all possible real x values.

But does this series converge uniformly? We suspect trouble because of
the peculiar behavior at x = 0:

    S(x) = 0          at x = 0,
         = 1 + x²     for x ≠ 0.                                                     (125)
That is, S(x) is discontinuous at x = 0. For uniform convergence, we must
have the case that, given any  > 0 there exists an N, independent of x, such
that
|SN (x) − SN +k (x)| < , ∀k = 0, 1, 2, . . . , and ∀x. (126)
Assume x > 0. Then:
1
|SN (x) − SN +k (x)| = . (127)
(1 + x2 )N +k
Let’s choose ε = 1/2. Notice that for any fixed N, and any chosen k, we can
always pick x > 0 small enough so that

    1/(1 + x²)^{N+k} > 1/2 = ε.                                                      (128)
Hence the convergence is not uniform near x = 0.
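Numerically (an added illustration), the failure of uniformity shows up as a supremum of the tail that does not shrink with N:

```python
import numpy as np

x = np.logspace(-6, 1, 2000)              # positive x values, reaching very small x
S = 1 + x**2                              # the limit function for x != 0
for N in (10, 100, 1000):
    S_N = 1 + x**2 - (1 + x**2)**(-N)     # partial sum from Eq. (124)
    print(N, np.abs(S - S_N).max())       # stays ~ 1, attained as x -> 0
```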
The following theorem addresses the question of continuity:

Theorem: If S(z) = ∞ n=0 un (z) is a uniformly convergent series of continu-
ous functions un (z) for all z ∈ R, where R is a closed region, then S(z)
is a continuous function of z, for all z ∈ R.

Proof: Write S(z) = S_n(z) + R_n(z), where R_n(z) = Σ_{k=1}^{∞} u_{n+k}(z). S_n(z) is a
    finite sum of continuous functions and hence is continuous throughout
    R. By uniform convergence, given any ε > 0, we can find N such that
    |R_N(z)| < ε/3, for all z ∈ R. Furthermore, since S_N(z) is continuous for
    all z ∈ R, there exists a ρ > 0 such that |S_N(z + δ) − S_N(z)| < ε/3
    whenever |δ| < ρ and z + δ ∈ R. Therefore:

        |S(z + δ) − S(z)| = |S_N(z + δ) − S_N(z) + R_N(z + δ) − R_N(z)|
                          ≤ |S_N(z + δ) − S_N(z)| + |R_N(z + δ)| + |R_N(z)|
                          < ε.                                                       (129)

    Hence, S(z) is continuous for all z ∈ R.

27
We may also be concerned with the question of multiplication of series. If
two series are absolutely convergent, then the series formed of product terms
is absolutely convergent independent of order, and the product series is equal
to the product of the individual series.
Next, let us consider the integration of a series:

Theorem: Let S(z) = Σ_{n=0}^{∞} u_n(z) be a uniformly convergent series of continuous
functions in a domain D. Then, if C ⊂ D, where C is a finite path in
D, we have:
 ∞ 

S(z)dz = un (z)dz. (130)
C n=0 C

The order of integration and summation may be interchanged for a


series of continuous functions in its domain of uniform convergence.
n ∞
Proof: Write S(z) = k=0 uk (z) + Rn (z), where Rn (z) = k=1 un+k (z).
Then  n  

S(z)dz = uk (z)dz + Rn (z)dz. (131)
C k=0 C C

Since S(z) is uniformly convergent, for any given  > 0, there exists
an N such that Rn (z) <  for all n ≥ N and all z ∈ D. Now, if

LC = C |dz| < ∞ is the “path length”, then
 
 
 Rn (z)dz  < LC , ∀n ≥ N. (132)

C

Hence,  
 n 

 
 S(z)dz − uk (z)dz  < LC , ∀n ≥ N, (133)
 C C 
k=0

which can be made arbitrarily small.

Next, we investigate the differentiation of a series.



Theorem: If S(z) = ∞ n=0 un (z) is a series of functions which are analytic
on and inside a contour C, and if S(z) converges uniformly on C, then
S(z) is analytic everywhere inside C, with derivative:

dS dun
(z) = (z). (134)
dz n=0 dz

That is, The order of differentiation and summation may be reversed.

28
Proof: Let z0 be a point inside C.
 
1  S(z)dz 1  ∞
dz
= un (z)
2πi C z − z0 2πi C n=0 z − z0
 
1  n
uk (z) 
Rn (z)
= dz + dz
2πi C k=0 z − z0 C z − z0


n
1  Rn (z)
= uk (z0 ) + dz. (135)
k=0 2πi C z − z0

The series converges uniformly on C, so given any  > 0 there exists


an N such that |Rk (z)| <  for all k ≥ n and for all z ∈ C. Thus,
   
 Rn (z)   dz 
 
 dz  <    < 2π. (136)
 C z − z0   C z − z0 
Therefore, the series converges, and we have:
 ∞
1 S(z)
dz = un (z0 ) ≡ S(z0 ), (137)
2πi C z − z0 n=0

where we take the latter as the definition of S(z0 ) interior to C.



Thus, S(z) = ∞ n=0 un (z) is defined on and inside C. To prove analyt-
icity inside C, we show that the derivative exists:
S(z0 + h) − S(z0 )
S  (z0 ) = lim
h→0 h
  
1 1 S(z) S(z)
= lim − dz
h→0 2πi h C z − z0 − h z − z0

1 S(z)
= lim dz
h→0 2πi C (z − z0 − h)(z − z0 )
 n   
1 uk (z) Rn (z)
= dz + dz . (138)
2πi k=0 C (z − z0 )2 C (z − z0 )2

The first term is the form of the derivative of an analytic function we


saw earlier. The second term can be made arbitrarily small by taking
n large enough, by the uniform convergence of S on C. Hence,

dun
S  (z) = (z). (139)
n=0 dz

Let us now turn to the special case of power series, of which the Taylor
series is an important example.

29

Theorem: If S(z) = ∞ n
n=0 an z converges for z = z1 , then it is absolutely
convergent for all |z| < |z1 |.

Proof: Since the series converges for z = z1 , an z1n must be bounded: |an z1n | <
M for all n. Pick any z such that |z| < |z1 |. Let r = |z|/|z1 | < 1. Then
 
 z n
|an z | =
n
|an z1n |  n
 < Mr , (140)
z1
and ∞ ∞ ∞

|an z n | < Mr n = M rn . (141)
n=0 n=0 n=0

Since r < 1, this is convergent, hence S(z) is absolutely convergent for


all |z| < |z1 |.

A similar argument can be used to show that, if S(z) = ∞ n
n=0 an z di-
verges for z = z1 , then it diverges for all |z| > |z1 |. Thus, the region of
convergence of a power series is a circle: Inside the circle there is absolute
convergence, and outside there is divergence. On the circle, we cannot say
in general. For example,

    S(z) = Σ_{n=1}^{∞} zⁿ/n    { absolutely converges for |z| < 1;
                                 diverges for |z| > 1;
                                 converges for z = −1;
                                 diverges for z = +1. }                              (142)
We state and leave it for the reader to prove the following:

Theorem: A power series is uniformly convergent in any closed region inside


the circle of convergence.

We have the following uniqueness theorem for power series:



Theorem: If S(z) = ∞ n=0 an (z − z0 ) converges for all points inside the
n

circle |z − z0 | = r0 , then the series is the Taylor series for S(z) (about
z0 ).

Proof: The proof consists in differentiating k times, and showing that an =


S (n) (z0 )/n!.

30
z
C1
z0×
C2

Figure 18: Illustration for Laurent series discussion.

14 Laurent Series
We now introduce a generalizaton of the Taylor series, the Laurent series.
Consider a function f (z) which is analytic in a region containing two con-
centric circles (but not necessarily in the interior of the smaller circle).
“Contour” C2 −C1 (Fig. 18 represents a closed path in a “simply-connected”
domain, so we can use the Cauchy Integral Formula (for z in the annulus):
 
1 f (z  )  1 f (z  ) 
f (z) = dz − dz . (143)
2πi C2 z −z
 2πi C1 z − z
Now,
1 1 1
=  · . (144)
z −z z − z0 1 − (z − z0 )/(z  − z0 )
For z  on C2 , z in the annulus, and with z0 the center of the circles, |(z −
z0 )/(z  − z0 )| < 1. We may thus write, for z  on C2 :

1 (z − z0 )n
= . (145)
z  − z n=0 (z  − z0 )n+1

Similarly, for z  on C1 :

1 (z  − z0 )n
− = . (146)
z  − z n=0 (z − z0 )n+1

31
Putting this back into 143:

   
(z − z0 )n f (z  ) 1 1
f (z) = dz  −  n  
(z − z0 ) f (z ) dz .
n=0 2πi C2 (z − z0 )
 n+1 2πi (z − z0 )n+1 C1
(147)
Thus, we can write:


1
f (z) = an (z − z0 ) +
n
bn , (148)
n=0 n=1 (z − z0 )n

where:

1 f (z  )
an = dz  , n = 0, 1, 2, . . . (149)
2πi C2 (z − z0 )
 n+1

1 f (z  )
bn = dz  , n = 1, 2, . . . (150)
2πi C1 (z − z0 )
 −n+1

Or, we may combine the series:




f (z) = An (z − z0 )n , (151)
n=−∞

where,
1  f (z  )
An = dz  , (152)
2πi C (z  − z0 )n+1
where C is any contour which makes one counter-clockwise passage around
z0 , and lies in the region bounded by C1 and C2 . This is called the Laurent
series.
If we express f (z) = φ(z) + ψ(z), where


φ(z) = An (z − z0 )n , (153)
n=0

ψ(z) = A−n (z − z0 )−n , (154)
n=1

then ψ(z) is called the principal part of f (z). Note that φ(z) converges
uniformly in any closed region interior to the outer edge of the annulus.
Hence, f (z) = φ(z) + ψ(z) converges uniformly in any closed region within
the annulus.
If z = z0 is a singularity of f (z), and there exists a neighborhood of z0
which contains no other singularity, then z0 is called an isolated singularity
of f (z). For example, z = 1 is an isolated singularity of f (z) = 1/(z − 1). If

32
all the coefficients of the principal part vanish, then an isolated singularity z0
is called a removable singularity. For example, the origin is a removable
singularity of f (z) = sin z/z. The singularity in this case may be “removed”
by defining
sin z
f (0) ≡ lim = 1. (155)
z→0 z

If the principal part terminates after a finite number of terms, say

A_{−m} ≠ 0,                                                                          (156)
A−(m+k) = 0, ∀k = 1, 2, 3, . . . , (157)

then f (z) is said to have a pole of order m at z0 . For example, f (z) =


1/(z − z0 )2 has a pole of order 2 at z0 .
If the principal part has an infinite number of non-vanishing coefficients,
then z0 is called an essential singularity of f (z). An essential singularity
need not be isolated. For example, z = 0 is an essential singularity of f (z) =
1/ sin(1/z). It is also the limit point of a sequence of poles, and hence is
not an isolated singularity. On the other hand, z = 0 is an isolated essential
singularity of f (z) = e1/z .
If the Laurent series is not known, the order of a pole may be determined
by examining limits. Consider the limits:

lim (z − z0 )n f (z), n = 1, 2, 3, . . .
z→z0
(158)

The lowest n for which the limit exists is the order of the pole at z0 .

15 Residues
Consider the integral: 
In = (z − z0 )n dz, (159)
C
where C is a closed contour surrounding z = z0 and n is an integer. Since
(z − z0 )n is analytic, except possibly at z0 , we may deform the contour into a
circle centered at z0 without affecting In . Then we may write z − z0 = Reiθ ,
and hence
 2π
n+1
In = iR ei(n+1)θ dθ. (160)
0

    I_n = 0,     n ≠ −1
        = 2πi,   n = −1.                                                             (161)

33
Now suppose we have a function, f (z), which is analytic in a region except
at the point z0 in the region. Then we can make the Laurent expansion about
z0 :


f (z) = An (z − z0 )n . (162)
n=−∞

Take a contour C around z0 :


  ∞

1 1
f (z) dz = An (z − z0 )n dz (163)
2πi C 2πi C n=−∞

= A−1 . (164)

Thus the coefficient of 1/(z − z0 ) in the Laurent series is given by A−1 =


1
2πi C
f (z)dz. This coefficient is called the residue of f (z) at z0 . Notice
that the residue is zero if f (z) is analytic at z0 , or if the coefficient A−1 is
zero (even if z0 is a pole or isolated essential singularity).
We now come to the important and useful residue theorem. Consider
contour C in a region where f (z) is analytic except at isolated singularities
(poles or essential singularities).

× ×
b a Ca
Cb

×
c Cc

Figure 19: Contours to illustrate residue theorem. Singularities are at a, b,


and c.


We want to determine C f (z) dz. We can write this in terms of the sum
of the integrals around each singularity:
  
f (z) dz = f (z) dz + f (z) dz + · · ·
C Ca Cb

34
= 2πi(a−1 + b−1 + c−1 + · · ·)

= 2πi R, (165)
singularities

where a−1 , b−1 , c−1 , . . . are the residues at a, b, c, . . ., respectively, and R is
the sum of the residues of f (z) interior to the countour C.
The computation of the residues is thus often an important part of eval-
uating integrals. At a simple pole, the Laurent series is


    f(z) = A_{−1}/(z − z0) + Σ_{n=0}^{∞} A_n (z − z0)^n,    (166)

and hence

    A_{−1} = lim_{z→z0} [(z − z0) f(z)].    (167)

For a pole of order m,

    f(z) = A_{−m}/(z − z0)^m + A_{−m+1}/(z − z0)^{m−1} + · · · + A_{−1}/(z − z0) + Σ_{n=0}^{∞} A_n (z − z0)^n.    (168)

If we multiply both sides by (z − z0)^m, we have:

    (z − z0)^m f(z) = A_{−m} + A_{−m+1}(z − z0) + · · · + A_{−1}(z − z0)^{m−1} + Σ_{n=0}^{∞} A_n (z − z0)^{n+m}.    (169)

Now differentiate m − 1 times and evaluate at z = z0:

    d^{m−1}/dz^{m−1} [(z − z0)^m f(z)] |_{z=z0} = (m − 1)! A_{−1},    (170)

and hence,

    A_{−1} = [1/(m − 1)!] d^{m−1}/dz^{m−1} [(z − z0)^m f(z)] |_{z=z0}.    (171)

But sometimes it is easier to just carry out the expansion sufficiently to find
A−1 directly.
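
As a concrete illustration (an added aside, not part of the original notes), the short sympy sketch below evaluates the residue of a function with a second-order pole, f(z) = e^z/(z − 1)², once from the derivative formula (171) and once with sympy's built-in residue routine; this particular f is just an illustrative choice.

import sympy as sp

z = sp.symbols('z')
z0 = 1
f = sp.exp(z) / (z - z0)**2      # pole of order m = 2 at z0 = 1
m = 2

# Eqn. 171: A_{-1} = [1/(m-1)!] d^{m-1}/dz^{m-1} [ (z - z0)^m f(z) ] at z = z0
A_minus1 = sp.limit(sp.diff((z - z0)**m * f, z, m - 1), z, z0) / sp.factorial(m - 1)

# Compare with sympy's built-in residue computation
print(A_minus1, sp.residue(f, z, z0))    # both print E, i.e. exp(1)

Both routes give the residue exp(1), as expected, since (z − 1)² f(z) = e^z and its first derivative at z = 1 is e.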

16 Cauchy Principal Value Integral


Suppose f (z) has a simple pole at z = z0 = x0 + i0 on the real axis. We may
define an integral along the real axis through this pole according to:

    P ∫_α^β f(z) dz ≡ lim_{ϵ→0} [ ∫_α^{x0−ϵ} f(x) dx + ∫_{x0+ϵ}^β f(x) dx ],    (172)

where α < x0 < β. This is known as the Cauchy Principal Value Integral.
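
As a numerical aside (not in the original notes), principal value integrals of this type can be checked with scipy, whose quad routine supports a Cauchy weight. The sketch below evaluates P ∫_0^2 dx/(x − 1), which vanishes by symmetry, both with that option and directly from the definition (172) with a small cutoff ϵ.

from scipy.integrate import quad

x0 = 1.0   # location of the simple pole on the real axis

# With weight='cauchy', quad computes P of the integral of g(x)/(x - wvar); here g(x) = 1.
pv, _ = quad(lambda x: 1.0, 0.0, 2.0, weight='cauchy', wvar=x0)

# Direct check from the definition, with a small symmetric cutoff epsilon.
eps = 1.0e-6
left, _ = quad(lambda x: 1.0 / (x - x0), 0.0, x0 - eps)
right, _ = quad(lambda x: 1.0 / (x - x0), x0 + eps, 2.0)

print(pv, left + right)    # both are ~0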

Hilbert Spaces
Physics 195
Supplementary Notes
020922 F. Porter
These notes collect and remind you of several definitions, connected with the
notion of a Hilbert space.
Def: A (nonempty) set V is called a (complex) linear (or vector) space,
and its elements are called vectors if:
(a) An operation of "addition" is defined for every pair of elements
x ∈ V and y ∈ V, such that the "sum" x + y ∈ V.
(b) There exists a "zero vector," 0 ∈ V such that
    x + 0 = x for all x ∈ V.
(c) An operation of "multiplication" by a complex number ("scalar")
is defined so that, if c is any complex number, and x any vector
in V, then the "product" cx ∈ V.
(d) The following properties (of ordinary vector algebra) are satisfied
(x, y, and z are any elements of V, and c, c1, and c2 are any
complex numbers):
    1) x + y = y + x                      (commutativity)
    2) (x + y) + z = x + (y + z)          (associativity)
       c1(c2 x) = (c1 c2)x
    3) x + (−x) = 0                       (inverse)
    4) 1x = x                             (multiplication by the scalar 1)
    5) (c1 + c2)x = c1 x + c2 x           (distributivity)
       c(x + y) = cx + cy.
If we restrict to real scalars, we have a “real vector space.” Note that the
0 element is unique by virtue of the fact that the vectors under the operation
of addition form an abelian group.
Def: m vectors x(1) , x(2) , · · · , x(m) are linearly dependent if there exist m
constants, not all zero, such that

    Σ_{i=1}^{m} c_i x^{(i)} = 0.
Otherwise, the vectors are linearly independent.
Def: A linear space V is n-dimensional if it contains a set of n linearly
independent vectors, but no set of more than n linearly independent
vectors.

Def: A set of linearly independent vectors, e1 , e2 , e3 , · · · in a vector space


V forms a basis for V if, for any vector x ∈ V, we can find scalars
c1 (x), c2 (x), · · · such that:

x = c1 (x)e1 + c2 (x)e2 + · · ·

Def: A relation is a set of ordered pairs.

Def: A function is a relation such that no two distinct members have the
same first coordinate. (To a mathematician, the following terms are
really synonymous: function, map, operator, transformation, corre-
spondence).

Def: A linear space V is called pre-Hilbert, or Euclidean, if a function
is defined which assigns to every pair of vectors x, y ∈ V a complex
number ⟨x|y⟩, called the scalar product (or inner product) of x
and y, which satisfies the following properties:

1. ⟨x|x⟩ ≥ 0; ⟨x|x⟩ = 0 iff x = 0

2. ⟨x|y⟩ = ⟨y|x⟩*

3. ⟨x|cy⟩ = c⟨x|y⟩ (c is any complex number)

4. ⟨x|y1 + y2⟩ = ⟨x|y1⟩ + ⟨x|y2⟩

Def: A non-empty set M is called a metric space if to every pair of elements


x, y ∈ M there is assigned a real number d(x, y) called the distance
between x and y, such that:

1. 0 ≤ d(x, y) (and < ∞)

2. d(x, y) = 0 iff x = y

3. d(x, y) = d(y, x)

4. d(x, y) + d(y, z) ≥ d(x, z) (triangle inequality)

Note that a metric space need not be a linear space. However, if we have
a pre-Hilbert space, we may define a suitable distance according to:

    d(x, y) = √⟨(x − y)|(x − y)⟩

We will typically deal only with metric spaces which are also pre-Hilbert
spaces. We will also use the notation

|x − y| = d(x, y)

for the distance function. We will furthermore define the “length,” or “norm,”
of a vector by its distance from the zero vector: |x| = |x − 0| = d(x, 0) =
√⟨x|x⟩. Also, if ⟨x|y⟩ = 0, we say that x is "orthogonal" to y.
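
To make these definitions concrete (an illustrative addition, not part of the original note), here is a minimal numpy sketch for the finite-dimensional case C^n, where ⟨x|y⟩ = Σ_i x_i* y_i; the helper names inner, norm, and dist are ours, not standard library functions.

import numpy as np

def inner(x, y):
    # <x|y> = sum_i conj(x_i) y_i; antilinear in the first argument
    return np.vdot(x, y)

def norm(x):
    # |x| = sqrt(<x|x>)
    return np.sqrt(inner(x, x).real)

def dist(x, y):
    # d(x, y) = |x - y|
    return norm(x - y)

x = np.array([1.0 + 1.0j, 0.0])
y = np.array([0.0, 2.0 - 1.0j])

print(inner(x, y), np.conj(inner(y, x)))   # property 2: <x|y> = <y|x>*
print(norm(x), dist(x, y))                 # induced norm and distance
print(abs(inner(x, y)))                    # 0 here, so x and y are orthogonal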
Def: A sequence of elements in a metric space, x1, x2, · · · is said to converge
to an element x if, given ϵ > 0, there exists a number N such that:
|x − xn| < ϵ whenever n > N. In this case, we write x = lim_{n→∞} xn.

Def: A sequence of elements in a metric space is called a Cauchy sequence


if, given ϵ > 0, there exists N such that:

    |xn − xm| < ϵ for all n, m > N


(“Cauchy Convergence Criterion”)
Theorem: Every convergent sequence of elements in a metric space is a
Cauchy sequence.

Proof: Suppose x1, x2, · · · is a convergent sequence of elements such that
lim_{n→∞} xn = x. Then, given any ϵ > 0, we may find a number N such
that:

    |x − xn| < ½ϵ for all n > N.

Consider

    |xn − xm| = |xn − x + x − xm|    (1)
              ≤ |xn − x| + |x − xm|   (triangle inequality)    (2)
              < ½ϵ + ½ϵ for all n, m > N    (3)
              = ϵ.    (4)

QED

Def: A metric space (V, d) is said to be complete if every Cauchy sequence
of points in V converges to a point in V .

Theorem: (and Definition) If (V, d) is an incomplete metric space, there


exists a complete metric space (V ∗ , d∗ ) called the completion of V
which corresponds to an isometric (i.e., distance preserving) mapping
of V into V ∗ such that the closure (i.e., the intersection of all closed
sets containing V ) of the image of V coincides with V ∗ . (For an in-
structive, but mildly lengthy proof, see: Fano, Mathematical Methods
of Quantum Mechanics.)

Note that this theorem tells us that every pre-Hilbert space admits a
completion, which we call a Hilbert space. If we wish to continue this
discussion, for example to construct a suitable Hilbert space of functions for
quantum mechanical problems, we will properly need to consider measure
theory, etc. However, let us turn our attention now to other matters.

Def: Suppose that we have two vector spaces, V and V′, and that there
exists a correspondence which assigns to every vector x ∈ DA ⊂ V, a
vector x′ ∈ V′. We say that this correspondence defines an operator A
from V into V′ with domain DA, and write x′ = Ax.

The subset RA of V′ defined by

    RA = {x′ | x′ = Ax for some x ∈ DA}

is called the range of A.

If V′ = V, then we say that A is defined in V.

If DA = V, then we say that A is defined on V.

If V′ is the vector space formed by the complex numbers (with ordinary
operations of addition and multiplication by a complex number), we
often use the term functional instead of operator, and write x′ = f(x)
rather than x′ = Ax.

Def: Two operators A and B from V into V′ are said to be equal if DA = DB
and

    Ax = Bx for all x ∈ DA.

If, on the other hand, DB is a proper subset of DA, and Ax = Bx for all
x ∈ DB, we call A an extension of B, and B a restriction of A. We may
denote this by writing B ⊂ A.

Def: An operator L is said to be linear if its domain DL is a subspace of V


(i.e. DL is a vector space) and

    L(x + y) = Lx + Ly for all x, y ∈ DL

    L(cx) = c Lx for all x ∈ DL and for all scalars c.
Henceforth, we will typically mean “linear operator” whenever we say “op-
erator”.

Physics 195
Supplementary Notes
Groups, Lie algebras, and Lie groups
020922 F. Porter

This note defines some mathematical structures which are useful in the
discussion of angular momentum in quantum mechanics (among other things).
Def: A pair (G, ◦), where G is a non-empty set, and ◦ is a binary operation
defined on G, is called a group if:
1. Closure: If a, b ∈ G, then a ◦ b ∈ G.
2. Associativity: If a, b, c ∈ G, then a ◦ (b ◦ c) = (a ◦ b) ◦ c.
3. Existence of right identity: There exists an element e ∈ G such
that a ◦ e = a for all a ∈ G.
4. Existence of right inverse: For some right identity e, and for any
a ∈ G, there exists an element a−1 ∈ G such that a ◦ a−1 = e.
The ◦ operation is typically referred to as “multiplication”.
The above may be termed a “minimal” definition of a group. It is amusing
(and useful) to prove that:
1. The right identity element is unique.
2. The right inverse element of any element is unique.
3. The right identity is also a left identity.
4. The right inverse is also a left inverse.
5. The solution for x ∈ G to the equation a ◦ x = b exists and is unique,
for any a, b ∈ G.
We will usually drop the explicit ◦ symbol, and merely use juxtaposition to
denote group multiplication. Note that both G (the set) and ◦ (the “mul-
tiplication table”) must be specified in order to specify a group. Where the
operation is clear, we will usually just refer to “G” as a group.
Def: An abelian (or commutative) group is one for which the multiplica-
tion is commutative:
ab = ba ∀ a, b ∈ G. (1)

Def: The order of a group is the number of elements in the set G. If this
number is infinite, we say it is an “infinite group”.

In the discussion of infinite groups of relevance to physics (in particular,


Lie groups), it is useful to work in the context of a richer structure called an
algebra. For background, we start by giving some mathematical definitions
of the underlying structures:

Def: A ring is a triplet ⟨R, +, ◦⟩ consisting of a non-empty set of elements


(R) with two binary operations (+ and ◦) such that:

1. ⟨R, +⟩ is an abelian group.


2. (◦) is associative.
3. Distributivity holds: for any a, b, c ∈ R

a ◦ (b + c) = a ◦ b + a ◦ c (2)
and
(b + c) ◦ a = b ◦ a + c ◦ a (3)

Conventions:
We use 0 ("zero") to denote the identity of ⟨R, +⟩. We speak of (+) as ad-
dition and of (◦) as multiplication, typically omitting the (◦) symbol entirely
(i.e., ab ≡ a ◦ b).

Def: A ring is called a field if the non-zero elements of R form an abelian


group under (◦).

Def: An abelian group ⟨V, ⊕⟩ is called a vector space over a field ⟨F, +, ◦⟩
by a scalar multiplication (∗) if for all a, b ∈ F and v, w ∈ V :

1. a ∗ (v ⊕ w) = (a ∗ v) ⊕ (a ∗ w) distributivity
2. (a + b) ∗ v = (a ∗ v) ⊕ (b ∗ v) distributivity
3. (a ◦ b) ∗ v = a ∗ (b ∗ v) associativity
4. 1 ∗ v = v unit element (1 ∈ F )

Conventions:
We typically refer to elements of V as “vectors” and elements of F as
“scalars.” We typically use the symbol + for addition both of vectors and
scalars. We also generally omit the ∗ and ◦ multiplication symbols. Note
that this definition is an abstraction of the definition of vector space given
in the note on Hilbert spaces, page 1.

Def: An algebra is a vector space V over a field F on which a multiplication


(◦) between vectors has been defined (yielding a vector in V ) such that
for all u, v, w ∈ V and a ∈ F :

1. (au) ◦ v = a(u ◦ v) = u ◦ (av)


2. (u + v) ◦ w = (u ◦ w) + (v ◦ w) and w ◦ (u + v) = (w ◦ u) + (w ◦ v)

(Once again, we often omit the multiplication sign, and hope that it is
clear from context which quantities are scalars and which are vectors.)
We are interested in the following types of algebras:

Def: An algebra is called associative if the multiplication of vectors is as-


sociative.

We note that an associative algebra is, in fact, a ring. Note also that
the multiplication of vectors is not necessarily commutative. An important
non-associative algebra is:

Def: A Lie algebra is an algebra in which the multiplication of vectors


obeys the further properties (letting u, v, w be any vectors in V ):

1. Anticommutativity: u ◦ v = −v ◦ u.
2. Jacobi Identity: u ◦ (v ◦ w) + w ◦ (u ◦ v) + v ◦ (w ◦ u) = 0.
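
A standard concrete example (added here as an aside, not from the original note) is the vector space of n × n matrices with the commutator u ◦ v ≡ [u, v] = uv − vu as the Lie product. The short numpy check below verifies anticommutativity and the Jacobi identity numerically for random 3 × 3 matrices.

import numpy as np

rng = np.random.default_rng(0)

def comm(a, b):
    # Lie product: the matrix commutator [a, b] = ab - ba
    return a @ b - b @ a

a, b, c = (rng.standard_normal((3, 3)) for _ in range(3))

# Anticommutativity: [a, b] = -[b, a]
print(np.allclose(comm(a, b), -comm(b, a)))

# Jacobi identity: [a, [b, c]] + [c, [a, b]] + [b, [c, a]] = 0
jacobi = comm(a, comm(b, c)) + comm(c, comm(a, b)) + comm(b, comm(c, a))
print(np.allclose(jacobi, 0.0))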

We may construct the idea of a “group algebra”: Let G be a group,


and V be a vector space over a field F , of dimension equal to the order of
G (possibily ∞). Denote a basis for V by the group elements. We can now
define the multiplication of two vectors in V by using the group multiplication
table as “structure constants”: Thus, if the elements of G are denoted by gi ,
a vector u ∈ V may be written:

    u = Σ_i a_i g_i.

We require that, at most, a finite number of coefficients a_i are non-zero. The
multiplication of two vectors is then given by:

    ( Σ_i a_i g_i ) ( Σ_j b_j g_j ) = Σ_k ( Σ_{g_i g_j = g_k} a_i b_j ) g_k.


[Since only a finite number of the a_i b_j can be non-zero, the sum Σ_{g_i g_j = g_k} a_i b_j
presents no problem, and furthermore, we will have closure under multipli-
cation.]
Since group multiplication is associative, our group algebra, as we have
constructed it, is an associative algebra.
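
As an illustrative sketch (not part of the original note), the construction above can be coded directly for a small group. Here the cyclic group Z3 = {g0, g1, g2}, with multiplication table g_i g_j = g_{(i+j) mod 3}, supplies the structure constants, and algebra vectors are stored as coefficient dictionaries; the function names are ours.

from collections import defaultdict

def gmul(i, j):
    # Group multiplication table for Z3: g_i g_j = g_{(i + j) mod 3}
    return (i + j) % 3

def algebra_mul(u, v):
    # u, v: dicts {group element: coefficient}.  Product in the group algebra:
    # (sum_i a_i g_i)(sum_j b_j g_j) = sum_k ( sum_{g_i g_j = g_k} a_i b_j ) g_k
    w = defaultdict(float)
    for gi, ai in u.items():
        for gj, bj in v.items():
            w[gmul(gi, gj)] += ai * bj
    return dict(w)

u = {0: 1.0, 1: 2.0}      # u = g0 + 2 g1
v = {1: 1.0, 2: -1.0}     # v = g1 - g2
w = {0: 0.5, 2: 3.0}

print(algebra_mul(u, v))  # -> {1: 1.0, 2: 1.0, 0: -2.0}, i.e. uv = -2 g0 + g1 + g2
# Associativity, as expected since the group multiplication is associative:
print(algebra_mul(algebra_mul(u, v), w) == algebra_mul(u, algebra_mul(v, w)))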

Physics 195a
Problem set number 1
Due 2 PM, Thursday, October 10, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read the “Preliminaries” course note. Read sections 1,2, and
3 of the “Ideas of Quantum Mechanics” course note.
PROBLEMS:

0. Tell me all the typos and other errors in the course notes. I’ll at least
be grateful, and if you find enough issues of substance, I’ll add bonus
points to your total score.

1. Gravitational Bohr “atom”: Exercise 1 in the “Preliminaries” course


note.

2. Resonances I: Exercise 2 in the “Preliminaries” course note.

3. Plum pudding model: Exercise 3 in the “Preliminaries” course note.

4. Hilbert space: Exercise 3 in the “Ideas of Quantum Mechanics” course
note.

5. Time reversal: Exercise 6 in the “Ideas of Quantum Mechanics” course


note.

6. Action of Galilean transformations: Exercise 7 in the “Ideas of Quan-


tum Mechanics” course note.

Physics 195a
Problem set number 2
Due 2 PM, Thursday, October 17, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Finish reading the “Ideas of Quantum Mechanics” course note.


Read the “Path Integrals: An Example” course note.
PROBLEMS:

7. A little handy math:

(a) If you have never gone through a proof of the Schwarz inequality
before, do it now! Otherwise, you may write “been there, done
that”, and receive credit.
(b) Exercise 5 in the “Ideas of Quantum Mechanics” course note.

8. Fourier series: Exercise 10 in the “Ideas of Quantum Mechanics” course


note.

9. Tidying up the Aharonov-Bohm discussion: Exercise 1 in the “Path
Integrals: An Example” course note.

10. Tidying up the Aharonov-Bohm discussion: Exercise 2 in the “Path


Integrals: An Example” course note.

11. [Worth two problems] Extensions to Aharonov-Bohm discussion: Exer-


cise 4 in the “Path Integrals: An Example” course note. I hope you
find this problem amusing/stimulating!

Physics 195a
Problem set number 3
Due 2 PM, Thursday, October 24, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read sections 1-6 of the “Density Matrix Formalism” course


note.
PROBLEMS:

12. Illustration of definition of a scalar product: Exercise 8 of the “Ideas


of Quantum Mechanics” course note.

13. Thinking about operators: Exercise 9 of the “Ideas of Quantum Me-


chanics” course note.

14. Resonances in quantum mechanics: Exercise 12 of the “Ideas of Quan-


tum Mechanics” course note.

15. Application of the uncertainty principle: Exercise 14 of the “Ideas of
Quantum Mechanics” course note.

16. Linear operators as n-term dyads: Exercise 1 of the “Density Matrix


Formalism” course note.

17. Some practice on the density matrix mathematics: Exercise 2 of the


“Density Matrix Formalism” course note.

Physics 195a
Problem set number 4
Due 2 PM, Thursday, October 31, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.
• Collaboration policy: OK to work together in small groups, and to help
with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.
• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/
• TA: Anura Abeyesinghe, anura@caltech.edu
• If you think a problem is completely trivial (and hence a waste of your
time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Finish reading the “Density Matrix Formalism” course note.


PROBLEMS:

18. Let us try another example of the discussion we have been having in
class concerning the use of the uncertainty relation on “localized” wave
functions. Consider the three dimensional generalization. Hence, let
P (a) be the probability to find the particle, of mass m, in a sphere of
radius a centered at the origin.
(a) Recall that in the one dimensional case, if the probability of finding
the particle in the interval (−a, a) was α, then a simple lower
bound on the kinetic energy was obtained as:
    T ≥ (1/8m) (α²/a²).    (1)
Make a simple, but rigorous, generalization of this result to the
three dimensional case. Don’t worry about finding the “best”
bound; even a “conservative” bound may be good enough to an-
swer some questions of interest.
(b) We know that an atomic size is of order 10−10 m. Suppose that
we have an electron which is known to be in a sphere of radius
10−10 m with 50% probability. What lower bound can you put on
its kinetic energy? Is the result consistent with expectation; e.g.,
with what you know about the kinetic energy of the electron in
hydrogen?
(c) In ancient times, before the neutron was discovered, it was sup-
posed that the nucleus contained both electrons and protons. A
comfortable nuclear size is 5 × 10−15 m. Find a lower bound on
the kinetic energy of an electron if the probability to be within
this radius is 90%. If there is a problem with the validity of your
bound, see if you can fix it.
(d) Now find a lower bound for a proton in the nucleus, if it has a
probability of 90% to be within a region of radius 5 × 10−15 m.

19. Some more thoughts about time reversal: Exercise 13 of the “Ideas of
Quantum Mechanics” course note.

20. The von Neumann mixing theorem: Exercise 3 of the “Density Matrix
Formalism” course note.

21. Operators in product spaces: Exercise 4 of the “Density Matrix For-


malism” course note.

22. Entropy in a two-state system: Exercise 5 of the “Density Matrix For-


malism” course note.

23. Measuring the density matrix: Exercise 6 of the “Density Matrix For-
malism” course note.

Physics 195a
Problem set number 4 – Solution to Problem 18
Due 2 PM, Thursday, October 31, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.
• Collaboration policy: OK to work together in small groups, and to help
with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.
• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/
• TA: Anura Abeyesinghe, anura@caltech.edu
• If you think a problem is completely trivial (and hence a waste of your
time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Finish reading the “Density Matrix Formalism” course note.


PROBLEMS:

18. Let us try another example of the discussion we have been having in
class concerning the use of the uncertainty relation on “localized” wave
functions. Consider the three dimensional generalization. Hence, let
P (a) be the probability to find the particle, of mass m, in a sphere of
radius a centered at the origin.
(a) Recall that in the one dimensional case, if the probability of finding
the particle in the interval (−a, a) was α, then a simple lower
bound on the kinetic energy was obtained as:
    T ≥ (1/8m) (α²/a²).    (1)
Make a simple, but rigorous, generalization of this result to the
three dimensional case. Don’t worry about finding the “best”
bound; even a “conservative” bound may be good enough to an-
swer some questions of interest.
Solution: The limit on T in Eqn. 1 corresponds to a limit on the
momentum of

    ⟨p²⟩ ≥ (1/4) (α²/a²).    (2)

In three dimensions the kinetic energy is

    T = (1/2m)(p_x² + p_y² + p_z²).    (3)

If the probability to find the particle within radius a is P(a), then
in each dimension we certainly have that the probability to be in
the interval (−a, a) is at least P(a). Thus, we can apply Eqn. 2
in each dimension, and hence, in sum:

    T ≥ (3/8m) P(a)²/a².    (4)

(b) We know that an atomic size is of order 10−10 m. Suppose that


we have an electron which is known to be in a sphere of radius
10−10 m with 50% probability. What lower bound can you put on
its kinetic energy? Is the result consistent with expectation; e.g.,
with what you know about the kinetic energy of the electron in
hydrogen?
Solution:

    T ≥ (3/8)(1/2)² (200 MeV-fm)² / (0.5 MeV × 10^{−20} m²)
      ≈ 0.8 eV.    (5)

In the ground state, the expectation value of the kinetic energy


of the electron in hydrogen is 13.6 eV, consistent with our bound,
although our bound is not especially good.
(c) In ancient times, before the neutron was discovered, it was sup-
posed that the nucleus contained both electrons and protons. A
comfortable nuclear size is 5 × 10−15 m. Find a lower bound on

the kinetic energy of an electron if the probability to be within
this radius is 90%. If there is a problem with the validity of your
bound, see if you can fix it.
Solution:

    T ≥ (3/8)(0.9)² (200 MeV-fm)² / (0.5 MeV × 25 × 10^{−30} m²)
      ≈ 1 GeV.    (6)
The electron is relativistic, inconsistent with our assumption in
computing the kinetic energy with a non-relativistic equation. Our
derivation that ⟨p²⟩ ≥ 1/(4a²) in the one-dimensional case should
still be valid. The relativistic kinetic energy is T = E − m ≈ |p|,
where the approximation is in the limit E ≫ m. Let's presume
we can estimate T with √⟨p²⟩. Then

    T ≥ √3/(2a) ∼ (200 MeV-fm)/(5 fm) = 40 MeV.    (7)
Let’s compare this with an estimated order of magnitude for the
electrostatic potential energy of an electron and a proton sepa-
rated by 5 fm:
    |V| = e²/a ≈ (200 MeV-fm)/(100 × 5 fm) = 0.4 MeV.    (8)
This is much smaller than our limit on the kinetic energy, pre-
senting a theoretical problem with binding and with expectations
from the virial theorem.
(d) Now find a lower bound for a proton in the nucleus, if it has a
probability of 90% to be within a region of radius 5 × 10−15 m.
Solution:

    T ≥ (3/8)(0.9)² (200 MeV-fm)² / (900 MeV × 25 × 10^{−30} m²)
      ≈ 0.5 MeV.    (9)

(These order-of-magnitude bounds are checked numerically in the
short sketch at the end of this problem list.)

19. Some more thoughts about time reversal: Exercise 13 of the “Ideas of
Quantum Mechanics” course note.

20. The von Neumann mixing theorem: Exercise 3 of the “Density Matrix
Formalism” course note.

21. Operators in product spaces: Exercise 4 of the “Density Matrix For-


malism” course note.

22. Entropy in a two-state system: Exercise 5 of the “Density Matrix For-


malism” course note.

23. Measuring the density matrix: Exercise 6 of the “Density Matrix For-
malism” course note.
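
The following short numerical check (an addition to these solutions, not part of the original problem set) evaluates the kinetic-energy bounds used in problem 18, parts (b) through (d). It uses ħc ≈ 197.3 MeV·fm rather than the rounded 200 MeV·fm of the in-text estimates, so the numbers differ slightly; the function name t_bound_mev is ours.

# Bound from Eqn. 4 of the solution: T >= (3/8m) P(a)^2 / a^2, in natural units (hbar = c = 1)
hbar_c = 197.327     # MeV * fm
m_e = 0.511          # electron mass, MeV
m_p = 938.3          # proton mass, MeV

def t_bound_mev(m_mev, a_fm, prob):
    # Convert to MeV by inserting the factor (hbar c)^2
    return 3.0 / (8.0 * m_mev) * prob**2 * (hbar_c / a_fm)**2

print(t_bound_mev(m_e, 1.0e5, 0.5) * 1.0e6, "eV")   # (b): electron, a = 1e-10 m, ~0.7 eV
print(t_bound_mev(m_e, 5.0, 0.9) / 1.0e3, "GeV")    # (c): electron, a = 5 fm, ~0.9 GeV (relativistic)
print(t_bound_mev(m_p, 5.0, 0.9), "MeV")            # (d): proton, a = 5 fm, ~0.5 MeV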

Physics 195a
Problem set number 5
Due 2 PM, Thursday, November 7, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read the “The K 0 : An Interesting Example of a ‘Two-State’


System” course note.
PROBLEMS:

24. Suppose we have a system with total angular momentum 1. Pick a


basis corresponding to the three eigenvectors of the z-component of
angular momentum, Jz , with eigenvalues +1, 0, −1, respectively. We
are given an ensemble described by density matrix:

              ( 2  1  1 )
    ρ = (1/4) ( 1  1  0 ) .
              ( 1  0  1 )

(a) Is ρ a permissible density matrix? Give your reasoning. For the
remainder of this problem, assume that it is permissible. Does it
describe a pure or mixed state? Give your reasoning.
(b) Given the ensemble described by ρ, what is the average value of
Jz ?
(c) What is the spread (standard deviation) in measured values of Jz ?

25. Coherent states with density matrices: Exercise 7 of the “Density Ma-
trix Formalism” course note.

26. Density matrix for a spin 1/2 system in a magnetic field: Exercise 8 of
the “Density Matrix Formalism” course note.

27. Entropy for a system of spin 1/2 particles in a magnetic field: Exercise
9 of the “Density Matrix Formalism” course note.

28. Hamiltonian in the particle-antiparticle basis: Exercise 1 of the K 0


course note.

29. Review of Schrödinger equation in three dimensions: Central potential


problem. There are some areas of elementary quantum mechanics that
I want to make sure don’t fall through the cracks in your education,
in particular, the central force problem and the specific case of the
one-electron atom.
Suppose we have two particles, of masses m1 and m2 , described by
position coordinates x1 and x2 . Assume that they interact with each
other via a potential V (x1 , x2 ).

(a) Write down the Hamiltonian for this system. Show that it may be
transformed to a description in terms of center-of-mass and rela-
tive coordinates. Show that the problem then reduces to two prob-
lems: one for the center-of-mass motion, and one for the relative
motion, if the potential can be separated into a term depending
only on the position of the center-of-mass, plus a term depending
only on the relative locations of the particles. Now assume that
the potential does not depend on the center-of-mass position, and
solve for the center-of-mass motion. Is your solution sensible?

(b) Suppose V is a function of the separation between the two particles
only, V = V (|x|), where x ≡ x1 − x2 . Solve the Schrödinger equa-
tion for the angular dependence and show that the Schrödinger
equation may be reduced to an equivalent one dimensional prob-
lem. Give the “effective potential” for this equivalent one dimen-
sional problem.

Physics 195a
Problem set number 5 – Solutions to Problems 24 and 29
Due 2 PM, Thursday, November 7, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read the “The K 0 : An Interesting Example of a ‘Two-State’


System” course note.
PROBLEMS:

24. Suppose we have a system with total angular momentum 1. Pick a


basis corresponding to the three eigenvectors of the z-component of
angular momentum, Jz , with eigenvalues +1, 0, −1, respectively. We
are given an ensemble described by density matrix:

              ( 2  1  1 )
    ρ = (1/4) ( 1  1  0 ) .
              ( 1  0  1 )

(a) Is ρ a permissible density matrix? Give your reasoning. For the
remainder of this problem, assume that it is permissible. Does it
describe a pure or mixed state? Give your reasoning.
Solution: Clearly ρ is hermitian. It is also trace one. This is
almost sufficient for ρ to be a valid density matrix. We can see
this by noting that, given a hermitian matrix, we can make a
transformation of basis to one in which ρ is diagonal. Such a
transformation preserves the trace. In this diagonal basis, ρ is of
the form:
ρ = a|e1⟩⟨e1| + b|e2⟩⟨e2| + c|e3⟩⟨e3|,
where a, b, c are real numbers such that a + b + c = 1. This is
clearly in the form of a density operator. Another way of arguing
this is to consider the n-term dyad representation for a hermitian
matrix.
However, we must also have that ρ is positive, in the sense that
a, b, c cannot be negative. Otherwise, we would interpret some
probabilities as negative. There are various ways to check this.
For example, we can check that the expectation value of ρ with
respect to any state is not negative. Thus, let an arbitrary state
be: |ψ⟩ = (α, β, γ). Then

    ⟨ψ|ρ|ψ⟩ = (1/4) [ 2|α|² + |β|² + |γ|² + 2ℜ(α*β) + 2ℜ(α*γ) ].    (10)

This quantity can never be negative, by virtue of the relation:

    |x|² + |y|² + 2ℜ(x*y) = |x + y|² ≥ 0.    (11)

Therefore ρ is a valid density operator.


To determine whether ρ is a pure or mixed state, we consider:
    Tr(ρ²) = (1/16)(6 + 2 + 2) = 5/8.

This is not equal to one, so ρ is a mixed state. Alternatively, one
can show explicitly that ρ² ≠ ρ. (These properties, together with
the results of parts (b) and (c) below, are verified numerically in
the short sketch following this problem list.)
(b) Given the ensemble described by ρ, what is the average value of
Jz ?

Solution: We are working in a diagonal basis for Jz:

         ( 1  0  0 )
    Jz = ( 0  0  0 ) .
         ( 0  0 −1 )

The average value of Jz is:

    ⟨Jz⟩ = Tr(ρ Jz) = (1/4)(2 + 0 − 1) = 1/4.

(c) What is the spread (standard deviation) in measured values of Jz ?


Answer: We'll need the average value of Jz² for this:

    ⟨Jz²⟩ = Tr(ρ Jz²) = (1/4)(2 + 0 + 1) = 3/4.

Then:

    ΔJz = √(⟨Jz²⟩ − ⟨Jz⟩²) = √11/4.
25. Coherent states with density matrices: Exercise 7 of the “Density Ma-
trix Formalism” course note.

26. Density matrix for a spin 1/2 system in a magnetic field: Exercise 8 of
the “Density Matrix Formalism” course note.

27. Entropy for a system of spin 1/2 particles in a magnetic field: Exercise
9 of the “Density Matrix Formalism” course note.

28. Hamiltonian in the particle-antiparticle basis: Exercise 1 of the K 0


course note.

29. Review of Schrödinger equation in three dimensions: Central potential


problem. There are some areas of elementary quantum mechanics that
I want to make sure don’t fall through the cracks in your education,
in particular, the central force problem and the specific case of the
one-electron atom.
Suppose we have two particles, of masses m1 and m2 , described by
position coordinates x1 and x2 . Assume that they interact with each
other via a potential V (x1 , x2 ).

(a) Write down the Hamiltonian for this system. Show that it may be
transformed to a description in terms of center-of-mass and rela-
tive coordinates. Show that the problem then reduces to two prob-
lems: one for the center-of-mass motion, and one for the relative
motion, if the potential can be separated into a term depending
only on the position of the center-of-mass, plus a term depending
only on the relative locations of the particles. Now assume that
the potential does not depend on the center-of-mass position, and
solve for the center-of-mass motion. Is your solution sensible?
Solution: Let pi = −i∇i denote the momentum of particle i,
with magnitude pi . The Hamiltonian is:

p21 p2 ∇2 ∇2
H= + 2 + V (x1 , x2 ) = − 1 − 2 + V (x1 , x2 ) (12)
2m1 2m2 2m1 2m2

Define total mass and center-of-mass position and momentum


variables:

    M = m1 + m2    (13)
    X = (m1 x1 + m2 x2)/M    (14)
    P = M Ẋ = p1 + p2.    (15)

Define the reduced mass and relative position and momentum


variables:
    m = m1 m2/(m1 + m2)    (16)
    x = x1 − x2    (17)
    p = m ẋ = (m2 p1 − m1 p2)/M.    (18)
We may solve for x1 and x2 in terms of X and x:
    x1 = X + (m2/M) x    (19)
    x2 = X − (m1/M) x.    (20)

Then the Hamiltonian can be written in terms of center-of-mass
and relative coordinates according to (letting P = |P| and p = |p|):

    H = P²/(2M) + p²/(2m) + V(X + (m2/M)x, X − (m1/M)x).    (22)

Now let U_T(X, x) = V(X + (m2/M)x, X − (m1/M)x), and assume that it is
of the form:

    U_T(X, x) = U_CM(X) + U(x).    (23)
Let ∇CM be the gradient operator with respect to X, and ∇ be the
gradient with respect to x. Then we may write the Schrödinger
equation as:

    [ −∇²_CM/(2M) − ∇²/(2m) + U_CM(X) + U(x) ] ψ(X, x) = E ψ(X, x).    (24)

We expand ψ in a series of terms of the form Φ(X)φ(x), and apply


the technique of separation of variables to obtain:

    (1/Φ) [ −∇²_CM/(2M) + U_CM(X) ] Φ = E_CM    (25)

    (1/φ) [ −∇²/(2m) + U(x) ] φ = E − E_CM.    (26)

Now we assume UCM (X) = 0. We may solve for the center-of-mass


motion:

    −∇²_CM Φ(X)/(2M) = E_CM Φ(X).    (27)

The solution is

    Φ(X) = A e^{iP·X} + B e^{−iP·X},    (28)

with E_CM = P²/(2M). This is simply the motion of a free particle of
mass M.
(b) Suppose V is a function of the separation between the two particles
only, V = V (|x|), where x ≡ x1 − x2 . Solve the Schrödinger equa-
tion for the angular dependence and show that the Schrödinger
equation may be reduced to an equivalent one dimensional prob-
lem. Give the “effective potential” for this equivalent one dimen-
sional problem.

Solution: We have the Schrödinger equation for the relative mo-
tion (letting |x| ≡ r):

    [ −∇²/(2m) + V(r) ] ψ(x) = E ψ(x).    (29)

Since the problem clearly has spherical symmetry, we adopt spher-


ical polar coordinates. The Laplacian in spherical polar coordi-
nates is:
1 ∂ 2∂ 1 ∂ ∂ 1 ∂2
∇2 = r + sin θ + . (30)
r 2 ∂r ∂r r 2 sin θ ∂θ) ∂θ (r sin θ)2 ∂φ2

Once again we use the method of separation of variables. The


solution to the angular portion is Y_{ℓm}(θ, φ):

    [ −(1/sin θ) ∂/∂θ (sin θ ∂/∂θ) − (1/sin²θ) ∂²/∂φ² ] Y_{ℓm}(θ, φ) = ℓ(ℓ + 1) Y_{ℓm}(θ, φ).    (31)

The remaining dependence is on r. Letting ψ(x) = R(r) Y_{ℓm}(θ, φ),
we have:

    [ −(1/2mr²) d/dr (r² d/dr) + V(r) + ℓ(ℓ + 1)/(2mr²) ] R(r) = E R(r).    (32)

Write R(r) = u(r)/r to obtain:



    [ −(1/2m) d²/dr² + V(r) + ℓ(ℓ + 1)/(2mr²) ] u(r) = E u(r).    (33)

Thus, we have reduced our problem to one of solving a one-
dimensional Schrödinger equation, with an "effective potential"
V(r) + ℓ(ℓ + 1)/(2mr²). The ℓ(ℓ + 1) term may be interpreted as a "centrifu-
gal barrier" due to angular motion.
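
A quick numerical cross-check of problem 24 above (an added aside, not part of the original solutions): the numpy sketch below confirms that ρ is Hermitian, has unit trace, and is positive, and reproduces Tr(ρ²) = 5/8, ⟨Jz⟩ = 1/4, and ΔJz = √11/4.

import numpy as np

rho = np.array([[2.0, 1.0, 1.0],
                [1.0, 1.0, 0.0],
                [1.0, 0.0, 1.0]]) / 4.0
Jz = np.diag([1.0, 0.0, -1.0])

print(np.allclose(rho, rho.conj().T), np.isclose(np.trace(rho), 1.0))  # Hermitian, trace one
print(np.linalg.eigvalsh(rho))        # eigenvalues are >= 0, so rho is a valid density matrix
print(np.trace(rho @ rho))            # 0.625 = 5/8 < 1, hence a mixed state

jz_avg = np.trace(rho @ Jz)           # <Jz> = 0.25
jz2_avg = np.trace(rho @ Jz @ Jz)     # <Jz^2> = 0.75
print(jz_avg, jz2_avg, np.sqrt(jz2_avg - jz_avg**2))   # last number is sqrt(11)/4, about 0.829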

Physics 195a
Problem set number 6
Due 2 PM, Thursday, November 14, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read the “The Simple Harmonic Oscillator: Creation and De-
struction Operators” course note.
PROBLEMS:

30. K 0 system in density matrix formalism: Exercise 2 of the K 0 course


note.

31. [Worth two problems] “Regeneration”: Exercise 3 of the K 0 course


note.

32. Qualitative features of wave functions: Exercise 1 of the Harmonic


Oscillator course note.

33. One of the failings of classical mechanics is that matter should be
“unstable”. Let us investigate this in the following system: Consider
a system consisting of N particles with masses mk and charges qk ,
k = 1, 2, . . . , N, where we suppose some of the charges are positive and
some negative. The Hamiltonian of this multiparticle system is:
    H = Σ_{k=1}^{N} p_k²/(2m_k) + Σ_{N≥j>k≥1} q_k q_j/|x_k − x_j|,

where pk = |pk | is the magnitude of the momentum of the particle


labelled “k”.

(a) Assume we have solved the equations of the motion, with solutions
xk = sk (t). Show that for any ω > 0 we can select a number c > 0
such that xk = csk (ωt) is also a solution of the equations of motion.
Remember, we are dealing with the classical equations of motion
here.
(b) Find scaling laws relating the total energy, total momentum, to-
tal angular momentum, position of an individual particle, and
momentum of an individual particle for the original sk (t) solution
and the scaled csk (ωt) solution. The only parameter in your scal-
ing laws should be ω. Make sure that any time dependence is
clearly stated.
(c) Hence, draw the final conclusion that there does not exist any
stable “ground state” of lowest energy. As an aside, what Kepler’s
law follows from your analysis?
(d) We assert that quantum mechanics does not suffer from this dis-
ease, but this must be proven. You have seen (or, if not, see the
following problem) the analysis for the hydrogen atom in quan-
tum mechanics, and know that it has a ground state of finite
energy. However, it might happen for larger systems that sta-
bility is lost in quantum mechanics – there are typically several
negative terms in the potential function which could win over the
positive kinetic energy terms. We wish to prove that this is, in fact,
not the case. The Hamiltonian is as above, but now pk = −i∂k
(∂_k = (∂/∂x_k, ∂/∂y_k, ∂/∂z_k)).
Find a rigorous lower bound on the expectation value of H. It
doesn’t have to be very “good”– any lower bound will settle this

question of principle. You may take it as given that the lower
bound exists for the hydrogen atom, since we have already demon-
strated this. You may also find it convenient to consider center-
of-mass and relative coordinates between particle pairs.

34. The one-electron atom (review?): Continuing from problem 29, now
consider the case of the one-electron atom, with an electron under the
influence of a Coulomb field due to the nucleus of charge Ze:

    V(r) = −Ze²/r.    (10)
(a) Without knowing the details of the potential, we may evaluate
the form of the radial wave function (Rn
(r) = un
(r)/r, where
ψn
m (x) = Rn
(r)Y
m (θ, φ)) for small r, as long as the potential
depends on r more slowly than 1/r2. Here, n is a quantum number
for the radial motion. Likewise, we find the asymptotic form of
the wave function for large r, as long as the potential approaches
zero as r becomes large. Find the allowable forms for the radial
wave functions in these two limits.
(b) Find the bound state eigenvalues and eigenfunctions of the one-
electron atom. [Hint: it is convenient to express the wave function,
or rather u_{nℓ}, with its asymptotic dependence explicit, so that it
may be "divided out" in solving the rest of the problem.] You
may express your answer in terms of the Associated Laguerre
Polynomials:

    L^{2ℓ+1}_{n+ℓ}(x) = Σ_{k=0}^{n−ℓ−1} (−1)^{k+1} [(n + ℓ)!]² / [(n − ℓ − 1 − k)! (2ℓ + 1 + k)! k!] x^k.    (11)

Physics 195a
Problem set number 6 – Solutions to Problems 33 and 34
Due 2 PM, Thursday, November 14, 2002

READING: Read the “The Simple Harmonic Oscillator: Creation and De-
struction Operators” course note.
PROBLEMS:

30. K 0 system in density matrix formalism: Exercise 2 of the K 0 course


note.

31. [Worth two problems] “Regeneration”: Exercise 3 of the K 0 course


note.

32. Qualitative features of wave functions: Exercise 1 of the Harmonic


Oscillator course note.

33. One of the failings of classical mechanics is that matter should be


“unstable”. Let us investigate this in the following system: Consider
a system consisting of N particles with masses mk and charges qk ,
k = 1, 2, . . . , N, where we suppose some of the charges are positive and
some negative. The Hamiltonian of this multiparticle system is:
    H = Σ_{k=1}^{N} p_k²/(2m_k) + Σ_{N≥j>k≥1} q_k q_j/|x_k − x_j|,

where pk = |pk | is the magnitude of the momentum of the particle


labelled “k”.

(a) Assume we have solved the equations of the motion, with solutions
xk = sk (t). Show that for any ω > 0 we can select a number c > 0
such that xk = csk (ωt) is also a solution of the equations of motion.
Remember, we are dealing with the classical equations of motion
here.
Solution: The solutions s_k(t) must satisfy "F = ma", that is:

    Σ_{j≠i} [q_i q_j/|s_i − s_j|³] (s_i − s_j) = m_i d²s_i/dt².    (10)

Let x_i(t) = c s_i(ωt). Then,

    d²x_i/dt² (t) = ω²c (d²s_i/dt²)(ωt).    (11)

Also,

    Σ_{j≠i} [q_i q_j/|x_i − x_j|³] (x_i − x_j) = (1/c²) Σ_{j≠i} [q_i q_j/|s_i − s_j|³] (s_i − s_j).    (12)

Thus, if ω²c = 1/c², i.e. c = ω^{−2/3}, then x_i(t) = c s_i(ωt) is also a solution.


(b) Find scaling laws relating the total energy, total momentum, to-
tal angular momentum, position of an individual particle, and
momentum of an individual particle for the original sk (t) solution
and the scaled csk (ωt) solution. The only parameter in your scal-
ing laws should be ω. Make sure that any time dependence is
clearly stated.
Solution: With c = ω −2/3 the position of a particle scales as:

xk (t) = ω −2/3 sk (ωt). (13)

The momentum of a particle scales as:

    p_k(t) → p_k′(t) = m_k (d/dt) x_k(t) = m_k ω^{−2/3} (d/dt) s_k(ωt) = ω^{1/3} p_k(ωt).    (14)

The total momentum is a constant of the motion, and scales as:

    P → P′ = Σ_k p_k′(t) = ω^{1/3} P.    (15)

The total energy is a constant of the motion, and scales as:

    E → E′ = Σ_k [p_k′(t)]²/(2m_k) + (1/2) Σ_{j,k; j≠k} q_j q_k/|x_k′ − x_j′| = ω^{2/3} E.    (16)

The total angular momentum is a constant of the motion, scaling
like r × p:

    L → L′ = ω^{−1/3} L.    (17)

(c) Hence, draw the final conclusion that there does not exist any
stable “ground state” of lowest energy. As an aside, what Kepler’s
law follows from your analysis?
Solution: If we have any bound state with E < 0, such as for
two particles when qj = −qk , then we have another solution with
energy ω 2/3 E. Since ω can be taken arbitrarily large, there is no
lowest energy solution.
The Kepler’s law that follows from this analysis is the third: The
size of a trajectory scales as ω −2/3 . The period of the trajectory
scales as 1/ω. Thus, if a is the semi-major axis of the orbit, and
τ is the period,

a ∝ ω −2/3 (18)
τ ∝ ω −1 (19)
∝ a3/2 . (20)

(d) We assert that quantum mechanics does not suffer from this dis-
ease, but this must be proven. You have seen (or, if not, see the
following problem) the analysis for the hydrogen atom in quan-
tum mechanics, and know that it has a ground state of finite
energy. However, it might happen for larger systems that sta-
bility is lost in quantum mechanics – there are typically several
negative terms in the potential function which could win over the
positive kinetic energy terms. We wish to prove that this is, in fact,
not the case. The Hamiltonian is as above, but now pk = −i∂k
(∂_k = (∂/∂x_k, ∂/∂y_k, ∂/∂z_k)).
Find a rigorous lower bound on the expectation value of H. It
doesn’t have to be very “good”– any lower bound will settle this
question of principle. You may take it as given that the lower
bound exists for the hydrogen atom, since we have already demon-
strated this. You may also find it convenient to consider center-
of-mass and relative coordinates between particle pairs.
Solution: We know that the energy spectrum of the one-electron
atom is bounded below. Thus, the Hamiltonian

    H1 = p²/(2m) − Ze²/|x|,    (21)

has a lower bound on the energy. For any acceptable wave function
|f1⟩, we have

    ⟨f1|H1|f1⟩ ≥ −Z²α²m/2.    (22)
We will make use of this in analyzing our more complicated sys-
tem.
We must consider the Hamiltonian
    H = Σ_{k=1}^{N} p_k²/(2m_k) + Σ_{N≥j>k≥1} q_k q_j/|x_k − x_j|.    (23)

Let f (x) be any wave function in the allowed Hilbert space, nor-
malized so that ⟨f|f⟩ = 1. We wish to show that ⟨f|H|f⟩ > −∞.
We already know that some of the individual terms are bounded
below by zero:
    ⟨f| Σ_{k=1}^{N} p_k²/(2m_k) |f⟩ ≥ 0    (24)

    ⟨f| Σ_{N≥j>k≥1} q_k q_j/|x_k − x_j| |f⟩ ≥ 0,   for q_k q_j ≥ 0.    (25)

We must concentrate our energy on those terms in the potential


energy with opposite charge particles.
The total energy is the sum of the individual expectation values.
If we can show that no term goes to −∞, then the sum will also
be bounded below (since the number of terms is finite). Con-
sider particle j. Suppose there are nj particles with sign opposite
to particle j. Suppose that particle k is one such particle. We
can thus write each of the potentially troublesome terms in the
Hamiltonian in the form of a two-particle problem with Hamil-
tonian:
    H_jk = (1/n_j) p_j²/(2m_j) + (1/n_k) p_k²/(2m_k) − |q_j q_k|/|x_j − x_k|.    (26)
We have arranged it such that the total Hamiltonian includes a
term of this form for each pair of oppositely charged particles,
and all of the potentially troublesome terms are included as terms
of this form. Note that this piece of the total Hamiltonian is
the Hamiltonian for two particles of masses nj mj and nk mk , and
charges qj and qk .

We may rewrite Hjk in terms of the relative and center-of-mass
motion of the two-particle subsystem: Hjk = Hjk;CM + Hjk;rel,
where
    H_{jk;CM} = (p_j + p_k)² / [2(n_j m_j + n_k m_k)]    (27)
    H_{jk;rel} = p²/(2m) − |q_j q_k|/|x|,    (28)

where x ≡ xj − xk , m = nj nk mj mk /(nj mj + nk mk ), and p =


(nk mk pj − nj mj pk )/(nj mj + nk mk ).
Since Hjk;CM involves the square of a Hermitian operator, its
spectrum is non-negative. If we can show that the spectrum of
Hjk;rel is bounded below, then we will have completed our task.
We achieve this by noting the similarity with the Hamiltonian for
the one-electron atom:
    ⟨f|H_{jk;rel}|f⟩ ≥ −(q_j q_k)² m/2.    (29)

There are finitely many terms of this form, and all other contributions
are non-negative. Therefore,

    ⟨f|H|f⟩ > −∞.    (30)

34. The one-electron atom (review?): Continuing from problem 29, now
consider the case of the one-electron atom, with an electron under the
influence of a Coulomb field due to the nucleus of charge Ze:

    V(r) = −Ze²/r.    (31)
(a) Without knowing the details of the potential, we may evaluate
the form of the radial wave function (R_{nℓ}(r) = u_{nℓ}(r)/r, where
ψ_{nℓm}(x) = R_{nℓ}(r) Y_{ℓm}(θ, φ)) for small r, as long as the potential
depends on r more slowly than 1/r². Here, n is a quantum number
for the radial motion. Likewise, we find the asymptotic form of
the wave function for large r, as long as the potential approaches
zero as r becomes large. Find the allowable forms for the radial
wave functions in these two limits.

Solution: In problem 29, we showed that we could write the
Schrödinger equation for the relative motion in the form:
 
    [ −(1/2m) d²/dr² + V(r) + ℓ(ℓ + 1)/(2mr²) ] u_{nℓ}(r) = E u_{nℓ}(r).    (32)

We wish to find the solution for small r. If we multiply the equa-


tion through by r 2 , and assume that r 2 V (r) → 0 as r → 0, then
we have the approximate equation for small r:
 
    [ −(1/2m) d²/dr² + ℓ(ℓ + 1)/(2mr²) ] u_{nℓ}(r) = 0.    (33)

The solutions are u_{nℓ}(r) ∝ r^{ℓ+1}, and u_{nℓ}(r) ∝ r^{−ℓ}. The normal-
ization condition is:

    ∫_{(∞)} |ψ(x)|² d³(x) = 1.    (34)

Since the Y_{ℓm} functions are normalized to one themselves, the
radial portion of the wave function is normalized as:

    1 = ∫_0^∞ r² |R_{nℓ}|² dr = ∫_0^∞ |u_{nℓ}(r)|² dr.    (35)

For ℓ > 0, the r^{−ℓ} solution diverges too rapidly near r = 0 for this
normalization condition.
The situation for ℓ = 0 is more subtle. Often, this is glossed over,
and people just lump it in with the ℓ > 0 case, but the same
argument for excluding the r^0 solution really doesn't work. For
this case, u_{n0}(r) = constant, and hence, ψ_{n0}(x) ∝ 1/r at small r.
But

    ∇²(1/r) = −4π δ³(x),    (36)

so this solution doesn't satisfy the Schrödinger equation at r = 0.
Thus, the physical solution is u_{nℓ}(r) ∝ r^{ℓ+1} as r → 0. Or,
ψ_{nℓ}(x) ∝ r^ℓ as r → 0. For ℓ > 0, ψ_{nℓ} → 0, and for ℓ = 0,
ψ_{nℓ} → constant, as r → 0.
Now consider large r, and assume V (r) → 0 as r → ∞. Then the
asymptotic Schrödinger equation is:
    −(1/2m) d²u_{nℓ}(r)/dr² = E u_{nℓ}(r).    (37)
The solutions to this equation are:
    u_{nℓ}(r) ∝ e^{±ikr},   E = k²/(2m),   for E > 0;    (38)
    u_{nℓ}(r) ∝ e^{−κr},   E = −κ²/(2m),   for E < 0.    (39)
The E < 0 solutions are the bound states (κ > 0). Note that the
e+κr solutions are unnormalizable. The asymptotic wave functions
are thus of the form e±ikr /r (spherical waves outgoing or incoming)
for the unbound states, and of the form e−κr /r for the bound
states.
One final remark: in the asymptotic limit, we can multiply these
solutions by r a , correct to leading order in r.
(b) Find the bound state eigenvalues and eigenfunctions of the one-
electron atom. [Hint: it is convenient to express the wave function,
or rather u_{nℓ}, with its asymptotic dependence explicit, so that it
may be "divided out" in solving the rest of the problem.] You
may express your answer in terms of the Associated Laguerre
Polynomials:

    L^{2ℓ+1}_{n+ℓ}(x) = Σ_{k=0}^{n−ℓ−1} (−1)^{k+1} [(n + ℓ)!]² / [(n − ℓ − 1 − k)! (2ℓ + 1 + k)! k!] x^k.    (40)

Solution: We look for solutions of the form

    ψ_{nℓm}(r, θ, φ) = [u_{nℓ}(r)/r] Y_{ℓm}(θ, φ),    (41)

where u_{nℓ}(r) satisfies the equivalent one-dimensional Schrödinger
equation:

    [ −(1/2m) d²/dr² − Ze²/r + ℓ(ℓ + 1)/(2mr²) ] u_{nℓ}(r) = E_n u_{nℓ}(r).    (42)

It is generally convenient to put such problems into dimensionless
form. Here, define

    κ = √(−8mE)    (43)
    ρ = κ r    (44)
    v(ρ) = u_{nℓ}(ρ/κ).    (45)

With these substitutions (the “8” is chosen for later convenience. . . ),
we have the differential equation:
 
    [ d²/dρ² + λ/ρ − ℓ(ℓ + 1)/ρ² − 1/4 ] v(ρ) = 0,    (46)

where we have defined the dimensionless constant:

    λ ≡ 2mZe²/κ.    (47)
We take the hint, and put in explicit asymptotic dependence. The
asymptotic equation is:
 
    [ d²/dρ² − 1/4 ] v(ρ) = 0.    (48)

The asymptotic solution is thus:

    v(ρ) = ρ^n e^{−ρ/2},    (49)

where n is any number. We'll therefore look for solutions of the
form:

    v(ρ) = F(ρ) e^{−ρ/2},    (50)

where F is a power series in ρ of finite order:

    F(ρ) = ρ^{ℓ+1} (c0 + c1 ρ + c2 ρ² + · · · + c_M ρ^M)    (51)
         = ρ^{ℓ+1} f(ρ),    (52)

where the second relation defines f(ρ). The lowest power ρ^{ℓ+1} is
required to give the right dependence at r = 0, as we determined
in part (a).
The differential equation for F is:

    F′′ − F′ + [ λ/ρ − ℓ(ℓ + 1)/ρ² ] F = 0.    (53)

This gives the following differential equation satisfied by f(ρ):

    ρ f′′ + [2(ℓ + 1) − ρ] f′ + (λ − ℓ − 1) f = 0.    (54)

If we plug our series form for f(ρ) into this equation, we find the
recurrence relation for the coefficients:

    c_{k+1} = [ (k − λ + ℓ + 1) / ((k + 1)(k + 2ℓ + 2)) ] c_k.    (55)

At this point, it may be demonstrated that the series must indeed
terminate, or else we would obtain an unacceptable asymptotic
form of e^{ρ/2} instead of e^{−ρ/2}.
Assuming k = M is the highest term, we must have, from our
recurrence relation:

    c_{M+1} = 0 = [ (M − λ + ℓ + 1) / ((M + 1)(M + 2ℓ + 2)) ] c_M.    (56)

Thus, M = λ − ℓ − 1. Since ℓ and M are non-negative integers,
λ must also be an integer, with λ ≥ 1. Lambda is known as
the principal quantum number (n_p), and M is known as the
radial quantum number (n_r). Since λ can only take on discrete
values, we have the quantization of the energy levels:

    E_λ = −Z²e⁴m/(2λ²),    λ = 1, 2, . . . .    (57)

For Z = 1 and λ = 1 this gives the familiar E_1 = −13.6 eV ground
state energy of hydrogen.
The differential equation for f is known as the Associated La-
guerre Equation. It may readily be verified that the coefficients
in the associated Laguerre polynomials given in the problem state-
ment satisfy the desired recurrence relation, hence the radial wave
function is given by (after properly normalizing):

    R_{n_p ℓ}(r) = [2Z/(n_p a_0)]^{3/2} √{ (n_p − ℓ − 1)! / (2 n_p [(n_p + ℓ)!]³) } ρ^ℓ e^{−ρ/2} L^{2ℓ+1}_{n_p+ℓ}(ρ),    (58)

where

    ρ = [2Z/(n_p a_0)] r,    (59)
    a_0 = 1/(m e²) = 1/(m α), is the Bohr radius.    (60)
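
As a cross-check of this result (an added aside, not part of the original solutions), the sympy sketch below verifies symbolically that u_{10}(r) ∝ r e^{−Zr/a0} solves the ℓ = 0 radial equation (42) with E_1 = −Z²e⁴m/2, and then evaluates E_1 numerically for hydrogen.

import sympy as sp

r, m, Z, e = sp.symbols('r m Z e', positive=True)
a0 = 1 / (m * e**2)               # Bohr radius (hbar = 1), Eqn. 60
u10 = r * sp.exp(-Z * r / a0)     # u_{10}(r), i.e. rho^{l+1} e^{-rho/2} with l = 0, n_p = 1 (unnormalized)
E1 = -Z**2 * e**4 * m / 2         # Eqn. 57 with lambda = 1

# l = 0 radial equation (42): the residual should vanish identically
residual = -sp.diff(u10, r, 2) / (2 * m) - (Z * e**2 / r) * u10 - E1 * u10
print(sp.simplify(residual))      # -> 0

# Numerical value for hydrogen (Z = 1): E_1 = -alpha^2 m_e / 2
alpha, me_eV = 1 / 137.036, 0.511e6
print(-alpha**2 * me_eV / 2)      # about -13.6 (eV)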

Physics 195a
Problem set number 7
Due 2 PM, Thursday, November 21, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read sections 1-5 of the “Solving the Schrödinger Equation:


Resolvents” course note.
PROBLEMS:

35. Harmonic oscillator in three dimensions: Exercise 2 of the Harmonic


Oscillator course note.

36. Resolvent mathematics: Exercise 1 of the Resolvent course note.

37. More resolvent mathematics: Exercise 2 of the Resolvent course note.

38. Still more resolvent mathematics: Exercise 3 of the Resolvent course


note.

39. Green’s function solution of the infinite square well: Exercise 4 of the
Resolvent course note.

40. The one-electron atom (review?): We have had a couple of examples


of looking at the qualitative features of wave functions. Now apply the
same reasoning to the one-electron atom. Thus, sketch the effective
potential and the lowest three radial wave functions (do both R(r) and
u(r) = rR(r)) for the 1-electron atom for ℓ = 0. Now do the same for
the qualitative solutions for ℓ = 1. Pay attention to the turning points,
and to the dependence at r = 0. Since you have already computed the
actual wave functions, you may produce graphs of the functions you
obtained. If you do this, however, you should look carefully at your
graphs and make sure you understand at a physically intuitive level the
qualitative features.

Physics 195a
Problem set number 7 – Solution to Problem 40
Due 2 PM, Thursday, November 21, 2002

READING: Read sections 1-5 of the “Solving the Schrödinger Equation:


Resolvents” course note.
PROBLEMS:

35. Harmonic oscillator in three dimensions: Exercise 2 of the Harmonic


Oscillator course note.

36. Resolvent mathematics: Exercise 1 of the Resolvent course note.

37. More resolvent mathematics: Exercise 2 of the Resolvent course note.

38. Still more resolvent mathematics: Exercise 3 of the Resolvent course


note.

39. Green’s function solution of the infinite square well: Exercise 4 of the
Resolvent course note.

40. The one-electron atom (review?): We have had a couple of examples


of looking at the qualitative features of wave functions. Now apply the
same reasoning to the one-electron atom. Thus, sketch the effective
potential and the lowest three radial wave functions (do both R(r) and
u(r) = rR(r)) for the 1-electron atom for ℓ = 0. Now do the same for
the qualitative solutions for ℓ = 1. Pay attention to the turning points,
and to the dependence at r = 0. Since you have already computed the
actual wave functions, you may produce graphs of the functions you
obtained. If you do this, however, you should look carefully at your
graphs and make sure you understand at a physically intuitive level the
qualitative features.
Solution: Since we have solved for the wave functions, we’ll plot them.
The effective equivalent one-dimensional potential is:

    V_{eff}(r) = -\frac{Ze^2}{r} + \frac{\ell(\ell+1)}{2mr^2}.   (12)

We found the solutions:
    R_{n_p\ell}(r) = -\left(\frac{2Z}{n_p a_0}\right)^{3/2} \sqrt{\frac{(n_p-\ell-1)!}{2n_p\,[(n_p+\ell)!]^3}}\; (\rho/n_p)^\ell\, e^{-\rho/2n_p}\, L^{2\ell+1}_{n_p+\ell}(\rho/n_p),   (13)

where

    \rho = \frac{2Z}{a_0}\, r,   (14)

    a_0 = \frac{1}{me^2} = \frac{1}{m\alpha}, \quad \mbox{the Bohr radius},   (15)

    n_p = n_r + \ell + 1.   (16)

The Associated Laguerre polynomials are given by:

    L^{2\ell+1}_{n+\ell}(x) = \sum_{k=0}^{n-\ell-1} \frac{(-)^{k+1}\,[(n+\ell)!]^2}{(n-\ell-1-k)!\,(2\ell+1+k)!\,k!}\, x^k.   (17)

The first few of these polynomials are:

    L^1_1(x) = -1   (18)
    L^1_2(x) = 2(-2 + x)   (19)
    L^1_3(x) = 6(-3 + 3x - \tfrac{1}{2}x^2)   (20)
    L^3_3(x) = -6   (21)
    L^3_4(x) = 24(-4 + x)   (22)
    L^3_5(x) = 120(-10 + 5x - \tfrac{1}{2}x^2).   (23)

Hence, the first few radial wave functions are:


 
    R_{10}(\rho) = 2\left(\frac{Z}{a_0}\right)^{3/2} e^{-\rho/2}   (24)

    R_{20}(\rho) = \frac{1}{2\sqrt{2}}\left(\frac{Z}{a_0}\right)^{3/2} (2 - \rho/2)\, e^{-\rho/4}   (25)

    R_{30}(\rho) = \frac{2}{9\sqrt{3}}\left(\frac{Z}{a_0}\right)^{3/2} (3 - \rho + \rho^2/18)\, e^{-\rho/6}   (26)

    R_{21}(\rho) = \frac{1}{4\sqrt{6}}\left(\frac{Z}{a_0}\right)^{3/2} \rho\, e^{-\rho/4}   (27)

    R_{31}(\rho) = \frac{1}{27\sqrt{6}}\left(\frac{Z}{a_0}\right)^{3/2} \rho\,(4 - \rho/3)\, e^{-\rho/6}   (28)

    R_{41}(\rho) = \frac{1}{64\sqrt{15}}\left(\frac{Z}{a_0}\right)^{3/2} \rho\left(10 - \frac{5}{4}\rho + \frac{\rho^2}{32}\right) e^{-\rho/8}.   (29)

In terms of ρ, we may express the effective potential as:

    U_{eff}(\rho) \equiv \frac{a_0 V_{eff}(r(\rho))}{2Z^2 e^2} = -\frac{1}{\rho} + \frac{\ell(\ell+1)}{\rho^2}.   (30)

The bound state energies are:

    E_{n_p} = -\frac{Z^2 e^4 m}{2}\,\frac{1}{n_p^2}.   (31)

In terms of our scaled U_{eff}(\rho) potential, they appear at:

    E_{n_p} = -\frac{1}{4n_p^2}.   (32)

We’ll make graphs of U_{eff}(\rho) and of R_{n_p\ell}(\rho)\,(a_0/Z)^{3/2} (as well as u).
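For those who want to reproduce the plots, here is a minimal Python sketch (using only numpy and matplotlib; the variable names are mine, not from the course note) of the scaled potential U_eff(ρ) and the ℓ = 0 radial functions in units where Z/a_0 = 1:

import numpy as np
import matplotlib.pyplot as plt

rho = np.linspace(1e-3, 30, 1000)

def U_eff(rho, ell):
    # Scaled effective potential, Eqn. (30): -1/rho + ell(ell+1)/rho^2
    return -1.0 / rho + ell * (ell + 1) / rho**2

# l = 0 radial functions in units with Z/a0 = 1, i.e. R*(a0/Z)^(3/2), Eqns. (24)-(26)
R10 = 2.0 * np.exp(-rho / 2)
R20 = (1 / (2 * np.sqrt(2))) * (2 - rho / 2) * np.exp(-rho / 4)
R30 = (2 / (9 * np.sqrt(3))) * (3 - rho + rho**2 / 18) * np.exp(-rho / 6)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(rho, U_eff(rho, 0), label="l=0")
ax1.plot(rho, U_eff(rho, 1), label="l=1")
ax1.set_ylim(-0.3, 0.3)
ax1.set_xlabel("rho"); ax1.set_ylabel("U_eff"); ax1.legend()

for R, lab in [(R10, "R10"), (R20, "R20"), (R30, "R30")]:
    ax2.plot(rho, R, label=lab)
    ax2.plot(rho, rho * R, "--", label="u = rho*" + lab)
ax2.set_xlabel("rho"); ax2.legend()
plt.show()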

Physics 195a
Problem set number 8
Due 5 PM, Monday, December 2, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Finish reading the “Solving the Schrödinger Equation: Resol-


vents” course note.
PROBLEMS:

41. Green’s function solution of the finite square well/square bump: Exer-
cise 5 of the Resolvent course note. Note that there is a typo in the
statement of this problem, in Eqn. 128. The “ρ” in the denominator
should be a “ρ0 ”.

42. Time development transformation for the one-dimensional harmonic


oscillator: Exercise 6 of the Resolvent course note.

43. Problem moved to PS 10.

Physics 195a
Problem set number 9
Due 2 PM, Thursday, December 5, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read sections 1 through 5 of the “Angular Momentum” course


note.
PROBLEMS:

44. Application of the time development transformation for the one-dimensional


harmonic oscillator: Exercise 7 of the Resolvent course note.

45. Basic theorem on rotations: Exercise 1 of the Angular Momentum


course note.

46. Generalized addition law for tangents: Exercise 2 of the Angular Mo-
mentum course note.

Physics 195a
Problem set number 10
Due 2 PM, Thursday, December 12, 2002

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read sections 6 through 9 of the “Angular Momentum” course


note.
PROBLEMS:

47. An important lemma in group theory: Exercise 3 of the Angular Mo-


mentum course note.

48. The Baker-Campbell-Hausdorff theorem: Exercise 9 of the Angular


Momentum course note. This important theorem, or a piece of it, is
usually quoted without proof in quantum mechanics texts. This is
perhaps not surprising, even if not really excusable – the proof is not
trivial. You are given a number of hints which should help to guide
you through to the desired result.

49. Application of symmetry principles for selection rules: Exercise 4 of
the Angular Momentum course note.

50. Oivil and Livio are up to their old tricks again (sometimes they use their
undercover aliases “Alice” and “Bob”, so be leery of anyone bearing
these pseudonyms. You have been warned. . . ). As usual, they conspire
to produce two-photon states. The polarization states of a photon may
be described as a two-state system; we’ll work in the helicity basis here,
with    
1 0
|R = ; |L = (90)
0 1
The Pauli matrix σ3 is just the helicity operator in this basis, if we
adopt the convention that a right-handed particle has positive helicity.
If the photon is directed along the positive z-axis, we also have

    J_z |R\rangle = +|R\rangle,   (91)
    J_z |L\rangle = -|L\rangle.   (92)

The Hermitian matrices σ1 and σ2 correspond to other measures of


polarization in this state space.
The sneaky Oivil decides to have some fun with his hapless twin. He
says: “I want you to measure your photons’ polarization in two ways,
with operators having eigenvalues of ±1. Being fair, I’ll do the same
thing with my photons.”

(a) Let O1 = ±1 stand for the random variable corresponding to the


result of Oivil’s measurement with one operator, and O2 stand for
the random variable corresponding to his measurement with his
other operator. Likewise, let L1 and L2 stand for Livio’s random
variables. Consider the quantity:

T ≡ L1 O1 + L2 O1 + L2 O2 − L1 O2 . (93)

What are the possible values which random variable T can take
on?
Let p(\ell_1, \ell_2, o_1, o_2) be the probability of sampling a set of values:

    L_1 = \ell_1; \quad L_2 = \ell_2; \quad O_1 = o_1; \quad O_2 = o_2.   (94)

Compute the expectation value, according to p(\ell_1, \ell_2, o_1, o_2), of T:

E(T ) = E(L1 O1 + L2 O1 + L2 O2 − L1 O2 ), (95)

where the E(T ) notation is used for “expectation value of T ”.


You don’t know p(\ell_1, \ell_2, o_1, o_2), of course, so express your answer
in terms of it. Try to find a bound on how large the expectation
value can be, independent of p(\ell_1, \ell_2, o_1, o_2). Oivil is a bit too
pleased in pointing this bound out to Livio. . .
(b) Now Oivil says, “OK, let’s try it. We’ll produce photons according
to the two-photon state:
    |\psi\rangle = \frac{1}{\sqrt{2}}\left(|RR\rangle + |LL\rangle\right),   (96)
where L stands for left-handed polarization, and R stands for
right-handed polarization. You go measure your photons with the
following operators:

L1 = σ3 ; L2 = σ1 .” (97)

Slyly, he adds, “I’ll measure mine with linear combinations of your


operators:
    O_1 = \frac{1}{\sqrt{2}}(\sigma_1 + \sigma_3); \qquad O_2 = \frac{1}{\sqrt{2}}(\sigma_1 - \sigma_3).”   (98)

Let’s see what they might find: What is the expectation value of
the operator

T = L1 O1 + L2 O1 + L2 O2 − L1 O2 ? (99)

[Livio was last observed meandering down Venice Beach, shak-


ing his head with a sad demeanor. Do you understand what his
problem is, and can you cheer him up?]

Physics 195a
Problem set number 10 – Solution to Problem 50
Due 2 PM, Thursday, December 12, 2002

READING: Read sections 6 through 9 of the “Angular Momentum” course


note.
PROBLEMS:

47. An important lemma in group theory: Exercise 3 of the Angular Mo-


mentum course note.

48. The Baker-Campbell-Hausdorff theorem: Exercise 9 of the Angular


Momentum course note. This important theorem, or a piece of it, is
usually quoted without proof in quantum mechanics texts. This is
perhaps not surprising, even if not really excusable – the proof is not
trivial. You are given a number of hints which should help to guide
you through to the desired result.

49. Application of symmetry principles for selection rules: Exercise 4 of


the Angular Momentum course note.

50. Oivil and Livio are up to their old tricks again (sometimes they use their
undercover aliases “Alice” and “Bob”, so be leery of anyone bearing
these pseudonyms. You have been warned. . . ). As usual, they conspire
to produce two-photon states. The polarization states of a photon may
be described as a two-state system; we’ll work in the helicity basis here,
with    
    |R\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \qquad |L\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.   (90)
The Pauli matrix σ3 is just the helicity operator in this basis, if we
adopt the convention that a right-handed particle has positive helicity.
If the photon is directed along the positive z-axis, we also have

    J_z |R\rangle = +|R\rangle,   (91)
    J_z |L\rangle = -|L\rangle.   (92)

The Hermitian matrices σ1 and σ2 correspond to other measures of


polarization in this state space.

The sneaky Oivil decides to have some fun with his hapless twin. He
says: “I want you to measure your photons’ polarization in two ways,
with operators having eigenvalues of ±1. Being fair, I’ll do the same
thing with my photons.”
(a) Let O1 = ±1 stand for the random variable corresponding to the
result of Oivil’s measurement with one operator, and O2 stand for
the random variable corresponding to his measurement with his
other operator. Likewise, let L1 and L2 stand for Livio’s random
variables. Consider the quantity:

T ≡ L1 O1 + L2 O1 + L2 O2 − L1 O2 . (93)

What are the possible values which random variable T can take
on?
Let p(\ell_1, \ell_2, o_1, o_2) be the probability of sampling a set of values:

    L_1 = \ell_1; \quad L_2 = \ell_2; \quad O_1 = o_1; \quad O_2 = o_2.   (94)

Compute the expectation value, according to p(\ell_1, \ell_2, o_1, o_2), of T:

E(T ) = E(L1 O1 + L2 O1 + L2 O2 − L1 O2 ), (95)

where the E(T ) notation is used for “expectation value of T ”.


You don’t know p(\ell_1, \ell_2, o_1, o_2), of course, so express your answer
in terms of it. Try to find a bound on how large the expectation
value can be, independent of p(\ell_1, \ell_2, o_1, o_2). Oivil is a bit too
pleased in pointing this bound out to Livio. . .
Solution: First, let’s see what the allowed values of T are:

T = (L1 + L2 )O1 + (L2 − L1 )O2 = ±2 (96)

The expectation value of T is:



    E(T) = \sum_{\ell_1,\ell_2,o_1,o_2} p(\ell_1,\ell_2,o_1,o_2)\left[(\ell_1+\ell_2)o_1 + (\ell_2-\ell_1)o_2\right]   (97)
         \leq 2 \sum_{\ell_1,\ell_2,o_1,o_2} p(\ell_1,\ell_2,o_1,o_2)
         \leq 2.   (98)

More completely, −2 ≤ E(T ) ≤ 2.

(b) Now Oivil says, “OK, let’s try it. We’ll produce photons according
to the two-photon state:
    |\psi\rangle = \frac{1}{\sqrt{2}}\left(|RR\rangle + |LL\rangle\right),   (99)
where L stands for left-handed polarization, and R stands for
right-handed polarization. You go measure your photons with the
following operators:

L1 = σ3 ; L2 = σ1 .” (100)

Slyly, he adds, “I’ll measure mine with linear combinations of your


operators:
    O_1 = \frac{1}{\sqrt{2}}(\sigma_1 + \sigma_3); \qquad O_2 = \frac{1}{\sqrt{2}}(\sigma_1 - \sigma_3).”   (101)
Let’s see what they might find: What is the expectation value of
the operator

T = L1 O1 + L2 O1 + L2 O2 − L1 O2 ? (102)

[Livio was last observed meandering down Venice Beach, shak-


ing his head with a sad demeanor. Do you understand what his
problem is, and can you cheer him up?]
Solution:

    \langle L_1 O_1 \rangle = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,\sigma_3^L(\sigma_3^O + \sigma_1^O)\,(|RR\rangle + |LL\rangle)   (103)
                            = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,\sigma_3^L\,(|RR\rangle - |LL\rangle + |RL\rangle + |LR\rangle)
                            = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,(|RR\rangle + |LL\rangle + |RL\rangle - |LR\rangle)   (104)
                            = \frac{1}{\sqrt{2}}.   (105)

    \langle L_2 O_1 \rangle = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,\sigma_1^L(\sigma_3^O + \sigma_1^O)\,(|RR\rangle + |LL\rangle)   (106)
                            = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,\sigma_1^L\,(|RR\rangle - |LL\rangle + |RL\rangle + |LR\rangle)
                            = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,(|LR\rangle - |RL\rangle + |LL\rangle + |RR\rangle)   (107)
                            = \frac{1}{\sqrt{2}}.   (108)

    \langle L_2 O_2 \rangle = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,\sigma_1^L(-\sigma_3^O + \sigma_1^O)\,(|RR\rangle + |LL\rangle)   (109)
                            = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,\sigma_1^L\,(-|RR\rangle + |LL\rangle + |RL\rangle + |LR\rangle)
                            = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,(-|LR\rangle + |RL\rangle + |LL\rangle + |RR\rangle)   (110)
                            = \frac{1}{\sqrt{2}}.   (111)

    \langle L_1 O_2 \rangle = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,\sigma_3^L(-\sigma_3^O + \sigma_1^O)\,(|RR\rangle + |LL\rangle)   (112)
                            = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,\sigma_3^L\,(-|RR\rangle + |LL\rangle + |RL\rangle + |LR\rangle)
                            = \frac{1}{2\sqrt{2}}\,(\langle RR| + \langle LL|)\,(-|RR\rangle - |LL\rangle + |RL\rangle - |LR\rangle)   (113)
                            = -\frac{1}{\sqrt{2}}.   (114)

Therefore,

    \langle T \rangle = 2\sqrt{2}.   (115)
Our quantum mechanical result violates the bound we obtained
in part (a)!
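As a numerical cross-check of both parts (not part of the original solution), the following short numpy sketch enumerates the classical values of T and evaluates the quantum expectation value for the state and operators above:

import itertools
import numpy as np

# Part (a): classically, T = (L1+L2)O1 + (L2-L1)O2 can only take the values +/-2.
vals = {(l1 + l2) * o1 + (l2 - l1) * o2
        for l1, l2, o1, o2 in itertools.product([1, -1], repeat=4)}
print("classical values of T:", sorted(vals))          # [-2, 2]

# Part (b): quantum expectation value for |psi> = (|RR> + |LL>)/sqrt(2).
s1 = np.array([[0, 1], [1, 0]], dtype=complex)          # sigma_1
s3 = np.array([[1, 0], [0, -1]], dtype=complex)         # sigma_3
R = np.array([1, 0], dtype=complex)
L = np.array([0, 1], dtype=complex)
psi = (np.kron(R, R) + np.kron(L, L)) / np.sqrt(2)

L1, L2 = s3, s1                                         # Livio's operators (first photon)
O1 = (s1 + s3) / np.sqrt(2)                             # Oivil's operators (second photon)
O2 = (s1 - s3) / np.sqrt(2)

def expval(A, B):
    # <psi| A (x) B |psi>, A acting on the first photon, B on the second.
    return np.real(psi.conj() @ (np.kron(A, B) @ psi))

T = expval(L1, O1) + expval(L2, O1) + expval(L2, O2) - expval(L1, O2)
print("quantum <T> =", T, "; 2*sqrt(2) =", 2 * np.sqrt(2))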

Physics 195b
Problem set number 11
Due 2 PM, Thursday, January 16, 2003

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read sections 10 through 12 of the “Angular Momentum”


course note.
PROBLEMS:

51. More on application of symmetry principles for selection rules: Charge


Conjugation. Do Exercise 5 of the Angular Momentum course note.

52. Rotational and inversion symmetry, applied to determining selection


rules: Exercise 10 of the Angular Momentum course note.

53. Reduction of a representation into irreducible representations: Exercise


11 of the Angular Momentum course note.

54. Little-d functions: Exercise 13 of the Angular Momentum course note.

55. We discussed Berry’s phase in class, including the example of the
Aharonov-Bohm effect. Now that we know more about rotations, let
us try another example.
Consider a spin-1/2 particle of magnetic moment µ in a magnetic field,
B. Let the strength of the magnetic field be a constant, but suppose
that its direction is slowly changing. Suppose the magnetic field vector
is swept through a closed curve (i.e., think of the tip of its vector as
being varied along a closed curve on the surface of a sphere). Thus, two
parameters describing the direction of the field are being varied, which
we might take to be the polar angles (θ, φ).

(a) What does “slow” mean? That is, on what time scale should the
variation be slow, in order for the adiabatic approximation to be
reasonable?
(b) What is the expectation value of the spin vector in the adiabatic
ground state?
(c) Express the adiabatic ground state wave function, \psi^{(0)}_{\theta\phi}, in terms
of the adiabatically-varied parameters. You can achieve this by
inspection from your result to part (b), but I urge you to try doing
it instead (or, in addition, as a check) by performing a rotation
on a vector to the desired polar angles, now that we know how to
do this. I hope you’ll even try it with our general rotation matrix
formalism (the D matrices), even though you can short-cut this
for spin-1/2.
(d) Suppose we let the B field direction rotate slowly around the 3-
axis at constant θ. Calculate Berry’s phase for this situation.
[Recall, from first quarter, that Berry’s phase is the change in the
phase of the adiabatic wave function (ground state here) over one
complete circuit in parameter space. We found that the Berry
phase is given by:

    \gamma_B = i \oint d\boldsymbol{\alpha} \cdot \langle \psi^{(0)}_{\boldsymbol{\alpha}} | \nabla_{\boldsymbol{\alpha}} \psi^{(0)}_{\boldsymbol{\alpha}} \rangle,   (116)

where α is a vector in parameter space.] See if you can give a


geometric interpretation to your answer, in terms of an amount of
solid angle swept out by the circuit in parameter space.

Physics 195b
Problem set number 11 – Solution to Problem 55
Due 2 PM, Thursday, January 16, 2003

READING: Read sections 10 through 12 of the “Angular Momentum”


course note.
PROBLEMS:

51. More on application of symmetry principles for selection rules: Charge


Conjugation. Do Exercise 5 of the Angular Momentum course note.

52. Rotational and inversion symmetry, applied to determining selection


rules: Exercise 10 of the Angular Momentum course note.

53. Reduction of a representation into irreducible representations: Exercise


11 of the Angular Momentum course note.

54. Little-d functions: Exercise 13 of the Angular Momentum course note.

55. We discussed Berry’s phase in class, including the example of the


Aharonov-Bohm effect. Now that we know more about rotations, let
us try another example.
Consider a spin-1/2 particle of magnetic moment µ in a magnetic field,
B. Let the strength of the magnetic field be a constant, but suppose
that its direction is slowly changing. Suppose the magnetic field vector
is swept through a closed curve (i.e., think of the tip of its vector as
being varied along a closed curve on the surface of a sphere). Thus, two
parameters describing the direction of the field are being varied, which
we might take to be the polar angles (θ, φ).

(a) What does “slow” mean? That is, on what time scale should the
variation be slow, in order for the adiabatic approximation to be
reasonable?
Solution: The Hamiltonian is:

H = −µS · B(t), (116)

where \mathbf{S} = \frac{1}{2}\boldsymbol{\sigma} is the spin operator. The difference in energy


between the ground state (spin along µB) and the first excited

state (spin opposite µB) is just |µB|. The rate at which the
Hamiltonian changes must be slow compared with this energy gap,
for the adiabatic approximation. A suitable measure of the rate
of change of the Hamiltonian is
 
    \frac{1}{H}\frac{dH}{dt} \sim \frac{1}{|B|}\left|\frac{d\mathbf{B}}{dt}\right|,   (117)
where the interpretation of the left hand side should be in terms
of expectation values. Thus, for the adiabatic approximation, we
require:

    \frac{1}{|B|}\left|\frac{d\mathbf{B}}{dt}\right| \ll |\mu B|.   (118)
Note that we can also never have the strength of the magnetic field
go to zero, since then we would have degenerate ground states.
(b) What is the expectation value of the spin vector in the adiabatic
ground state?
Solution: Let θ and φ be the polar angles of the magnetic field.
These are the parameters which are being varied slowly. Hence,
denote the adiabatic ground state by \psi^{(0)}_{\theta\phi}. The ground state is
the state in which the spin direction is aligned with the magnetic
field. The expectation value of the spin vector is therefore:
    \langle \psi^{(0)}_{\theta\phi} | \mathbf{S} | \psi^{(0)}_{\theta\phi} \rangle = \frac{\mathbf{B}}{|B|}\,|S| = \frac{1}{2}\,\frac{\mathbf{B}}{|B|}.   (119)
(c) Express the adiabatic ground state wave function, \psi^{(0)}_{\theta\phi}, in terms
of the adiabatically-varied parameters. You can achieve this by
inspection from your result to part (b), but I urge you to try doing
it instead (or, in addition, as a check) by performing a rotation
on a vector to the desired polar angles, now that we know how to
do this. I hope you’ll even try it with our general rotation matrix
formalism (the D matrices), even though you can short-cut this
for spin-1/2.
 
Solution: We would like to rotate a vector \begin{pmatrix} 1 \\ 0 \end{pmatrix}, representing a
spin along the 3-axis, to a vector oriented with polar angles (θ, φ).
We may perform such a rotation by first rotating about the 2-axis

by angle θ, followed by a rotation about the 3-axis by angle φ. We
could do this directly, since this is a spin-1/2 system, by computing
the SU(2) element:

    u(\theta, \phi) = e^{\phi J_3} e^{\theta J_2},   (120)

where \mathbf{J} = -\frac{i}{2}\boldsymbol{\sigma}. Let us instead use our general rotation formal-


ism, to give it a try. The rotation matrix we desire is (in the Euler
angle parameterization):
    D^{1/2}_{m_1 m_2}(\phi, \theta, 0) = e^{-im_1\phi}\, d^{1/2}_{m_1 m_2}(\theta).   (121)

The little-d matrix is given in the notes:


 
    d^{1/2}(\theta) = \begin{pmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{pmatrix}.   (122)
Thus, the full rotation matrix is:
    D^{1/2}_{m_1 m_2}(\phi, \theta, 0) = \begin{pmatrix} e^{-\frac{i}{2}\phi}\cos\frac{\theta}{2} & -e^{-\frac{i}{2}\phi}\sin\frac{\theta}{2} \\ e^{\frac{i}{2}\phi}\sin\frac{\theta}{2} & e^{\frac{i}{2}\phi}\cos\frac{\theta}{2} \end{pmatrix}.   (123)
 
Finally, operating this on our vector \begin{pmatrix} 1 \\ 0 \end{pmatrix}, we obtain:

    D^{1/2}_{m_1 m_2}(\phi, \theta, 0) \begin{pmatrix} 1 \\ 0 \end{pmatrix} = e^{-\frac{i}{2}\phi} \begin{pmatrix} \cos\frac{\theta}{2} \\ \sin\frac{\theta}{2}\, e^{i\phi} \end{pmatrix}.   (124)
We may drop the overall phase factor, and write:
 
    \psi^{(0)}_{\theta\phi} = \begin{pmatrix} \cos\frac{\theta}{2} \\ \sin\frac{\theta}{2}\, e^{i\phi} \end{pmatrix}.   (125)

(d) Suppose we let the B field direction rotate slowly around the 3-
axis at constant θ. Calculate Berry’s phase for this situation.
[Recall, from first quarter, that Berry’s phase is the change in the
phase of the adiabatic wave function (ground state here) over one
complete circuit in parameter space. We found that the Berry
phase is given by:

(0) (0)
γB = i α · ψα
dα |∇α ψα , (126)

where α is a vector in parameter space.] See if you can give a
geometric interpretation to your answer, in terms of an amount of
solid angle swept out by the circuit in parameter space.
Solution: Our parameter space is two-dimensional, but only one
is varied along our circuit, so Berry’s phase is:
    \gamma_B = i \int_0^{2\pi} d\phi \, \langle \psi^{(0)}_{\theta\phi} | \frac{\partial}{\partial\phi}\psi^{(0)}_{\theta\phi} \rangle   (127)
             = i \int_0^{2\pi} d\phi \left( \cos\frac{\theta}{2},\; \sin\frac{\theta}{2}\, e^{-i\phi} \right) \begin{pmatrix} 0 \\ i\sin\frac{\theta}{2}\, e^{i\phi} \end{pmatrix}   (128)
             = -2\pi \sin^2\frac{\theta}{2}.   (129)
The geometric interpretation of this result is perhaps not immedi-
ately obvious, although the presence of the 2π and the trigonomet-
ric function of θ is suggestive. It becomes clearer when we use the
identity \sin^2\frac{\theta}{2} = \frac{1}{2}(1-\cos\theta). Further, \frac{1}{2}(1-\cos\theta) = \frac{1}{4\pi}\int_0^{2\pi} d\phi \int_{\cos\theta}^1 d\cos\theta'
is just the fraction of the total solid angle in parameter space pre-
sented by the region encircled by our path. Thus, with S = \frac{1}{2}, we
may write:

    \gamma_B = -S \int_0^{2\pi} d\phi \int_{\cos\theta}^1 d\cos\theta' = -S\Omega,   (130)
where Ω is the solid angle encircled by our path, as viewed from
the origin in parameter space.
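A quick numerical check of Eqn. (129) (my own sketch, with the wave function of Eqn. (125) hard-coded):

import numpy as np

def berry_phase(theta, n=2000):
    # gamma_B = i * integral_0^{2pi} <psi | d/dphi psi> dphi,
    # with psi = (cos(theta/2), sin(theta/2) e^{i phi}).
    phi = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    dphi = 2 * np.pi / n
    psi = np.array([np.full_like(phi, np.cos(theta / 2)),
                    np.sin(theta / 2) * np.exp(1j * phi)])
    dpsi = np.array([np.zeros_like(phi),
                     1j * np.sin(theta / 2) * np.exp(1j * phi)])
    integrand = np.sum(psi.conj() * dpsi, axis=0)
    return np.real(1j * np.sum(integrand) * dphi)

for theta in (0.3, 1.0, 2.0):
    print(theta, berry_phase(theta), -2 * np.pi * np.sin(theta / 2) ** 2)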

Physics 195b
Problem set number 12
Due 2 PM, Thursday, January 23, 2003

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Finish reading the “Angular Momentum” course note.


PROBLEMS:

56. Application of SU(2) to nuclear physics: Isospin. Do Exercise 12 of


the Angular Momentum course note. This is not a problem on angular
momentum, but it demonstrates that the group theory we developed for
angular momentum may be applied in a formally equivalent context.
The problem statement claims that there is an attached picture. This
is clearly false. You may find an appropriate level scheme via a google
search (you want a level diagram for the nuclear isobars of 6 nucleons),
e.g., at:
http://www.tunl.duke.edu/nucldata/figures/06figs/06 is.pdf
For additional reference, you might find it of interest to look up:

F. Ajzenberg-Selove, “Energy Levels of Light Nuclei, A = 5-10,” Nucl.
Phys. A490 1-225 (1988)
(see also http://www.tunl.duke.edu/nucldata/fas/88AJ01.shtml).
57. Symmetry and broken symmetry: Application of group theory to level
splitting in a lattice with reduced symmetry. Do exercise 14 of the
Angular Momentum course note. This is an important problem – it
illustrates the power of group theoretic methods in addressing certain
questions. I hope you will find it fun to do.
58. In class, we have discussed the transformation between two different
types of “helicity bases”. In particular, we have considered a system of
two particles, with spins j1 and j2 , in their CM frame.
One basis is the “spherical helicity basis”, with vectors of the form:

|j, m, λ1 , λ2 , (131)

where j is the total angular momentum, m is the total angular mo-


mentum projection along the 3-axis, and λ1 , λ2 are the helicities of the
two particles. We assumed a normalization of these basis vectors such
that:
    \langle j', m', \lambda_1', \lambda_2' | j, m, \lambda_1, \lambda_2 \rangle = \delta_{jj'}\,\delta_{mm'}\,\delta_{\lambda_1\lambda_1'}\,\delta_{\lambda_2\lambda_2'}.   (132)

The other basis is the “plane-wave helicity basis”, with vectors of the
form:
|θ, φ, λ1, λ2 , (133)
where θ and φ are the spherical polar angles of the direction of particle
one. We did not specify a normalization for these basis vectors, but an
obvious (and conventional) choice is:

    \langle \theta', \phi', \lambda_1', \lambda_2' | \theta, \phi, \lambda_1, \lambda_2 \rangle = \delta^{(2)}(\Omega' - \Omega)\,\delta_{\lambda_1\lambda_1'}\,\delta_{\lambda_2\lambda_2'},   (134)

where d(2) Ω refers to the element of solid angle for particle one.
In class, we have obtained the result for the transformation between
these bases in the form:
    |\theta, \phi, \lambda_1, \lambda_2\rangle = \sum_{j,m} b_j\, |j, m, \lambda_1, \lambda_2\rangle\, D^j_{m\delta}(\phi, \theta, -\phi),   (135)

where δ ≡ λ1 − λ2 . Determine the numbers bj .

59. Clebsch-Gordan coefficients, an alternate practical approach: Exercise
16 of the Angular Momentum course note.

60. Application to angular distribution: Exercise 18 of the Angular Mo-


mentum course note. While you may apply the formula we derived in
class, I urge you to do this problem by thinking about it “from the
beginning” – what should be the angular dependence of the spatial
wave function? That is, I hope you will try using some “physical intu-
ition” first, and use the formula as a check if you wish. Note that you
are intended to assume that the frame is the rest frame of the spin-1
particle.

Physics 195b
Problem set number 12 – Solution to Problem 58
Due 2 PM, Thursday, January 23, 2003

READING: Finish reading the “Angular Momentum” course note.


PROBLEMS:

56. Application of SU(2) to nuclear physics: Isospin. Do Exercise 12 of


the Angular Momentum course note. This is not a problem on angular
momentum, but it demonstrates that the group theory we developed for
angular momentum may be applied in a formally equivalent context.
The problem statement claims that there is an attached picture. This
is clearly false. You may find an appropriate level scheme via a google
search (you want a level diagram for the nuclear isobars of 6 nucleons),
e.g., at:
http://www.tunl.duke.edu/nucldata/figures/06figs/06 is.pdf
For additional reference, you might find it of interest to look up:
F. Ajzenberg-Selove, “Energy Levels of Light Nuclei, A = 5-10,” Nucl.
Phys. A490 1-225 (1988)
(see also http://www.tunl.duke.edu/nucldata/fas/88AJ01.shtml).
57. Symmetry and broken symmetry: Application of group theory to level
splitting in a lattice with reduced symmetry. Do exercise 14 of the
Angular Momentum course note. This is an important problem – it
illustrates the power of group theoretic methods in addressing certain
questions. I hope you will find it fun to do.
58. In class, we have discussed the transformation between two different
types of “helicity bases”. In particular, we have considered a system of
two particles, with spins j1 and j2 , in their CM frame.
One basis is the “spherical helicity basis”, with vectors of the form:
|j, m, λ1 , λ2 , (116)
where j is the total angular momentum, m is the total angular mo-
mentum projection along the 3-axis, and λ1 , λ2 are the helicities of the
two particles. We assumed a normalization of these basis vectors such
that:
    \langle j', m', \lambda_1', \lambda_2' | j, m, \lambda_1, \lambda_2 \rangle = \delta_{jj'}\,\delta_{mm'}\,\delta_{\lambda_1\lambda_1'}\,\delta_{\lambda_2\lambda_2'}.   (117)

The other basis is the “plane-wave helicity basis”, with vectors of the
form:
|θ, φ, λ1, λ2 , (118)
where θ and φ are the spherical polar angles of the direction of particle
one. We did not specify a normalization for these basis vectors, but an
obvious (and conventional) choice is:
    \langle \theta', \phi', \lambda_1', \lambda_2' | \theta, \phi, \lambda_1, \lambda_2 \rangle = \delta^{(2)}(\Omega' - \Omega)\,\delta_{\lambda_1\lambda_1'}\,\delta_{\lambda_2\lambda_2'},   (119)
where d(2) Ω refers to the element of solid angle for particle one.
In class, we have obtained the result for the transformation between
these bases in the form:
    |\theta, \phi, \lambda_1, \lambda_2\rangle = \sum_{j,m} b_j\, |j, m, \lambda_1, \lambda_2\rangle\, D^j_{m\delta}(\phi, \theta, -\phi),   (120)

where δ ≡ λ1 − λ2 . Determine the numbers bj .


Solution: To select a particular bj , i.e., a particular j, let us invert
the basis transformation:

    \int_{4\pi} d\Omega\, D^{j*}_{m\delta}(\phi, \theta, -\phi)\, |\theta, \phi, \lambda_1, \lambda_2\rangle   (121)
      = \sum_{j',m'} b_{j'}\, |j', m', \lambda_1, \lambda_2\rangle \int_{4\pi} d\Omega\, D^{j*}_{m\delta}(\phi, \theta, -\phi)\, D^{j'}_{m'\delta}(\phi, \theta, -\phi)
      = \sum_{j',m'} b_{j'}\, |j', m', \lambda_1, \lambda_2\rangle \int_{4\pi} d\Omega\, d^j_{m\delta}(\theta)\, d^{j'}_{m'\delta}(\theta)\, \exp\left[-i(m'\phi - \delta\phi) + i(m\phi - \delta\phi)\right]
      = \sum_{j',m'} b_{j'}\, |j', m', \lambda_1, \lambda_2\rangle \int_{-1}^1 d\cos\theta\, d^j_{m\delta}(\theta)\, d^{j'}_{m'\delta}(\theta) \int_0^{2\pi} d\phi\, e^{i(m-m')\phi}   (122)
      = 2\pi \sum_{j'} b_{j'}\, |j', m, \lambda_1, \lambda_2\rangle \int_{-1}^1 d\cos\theta\, d^j_{m\delta}(\theta)\, d^{j'}_{m\delta}(\theta)   (123)
      = 2\pi \sum_{j'} b_{j'}\, |j', m, \lambda_1, \lambda_2\rangle\, \frac{2\delta_{jj'}}{2j+1}   (124)
      = \frac{4\pi}{2j+1}\, b_j\, |j, m, \lambda_1, \lambda_2\rangle.   (125)

Note that we should probably justify the interchange of the order of
summation and integration in the very first step above. Thus,

    |j, m, \lambda_1, \lambda_2\rangle = \frac{2j+1}{4\pi b_j} \int_{4\pi} d\Omega\, D^{j*}_{m\delta}(\phi, \theta, -\phi)\, |\theta, \phi, \lambda_1, \lambda_2\rangle.   (126)

Now,

    1 = \langle j, m, \lambda_1, \lambda_2 | j, m, \lambda_1, \lambda_2 \rangle   (127)
      = \left(\frac{2j+1}{4\pi|b_j|}\right)^2 \int_{4\pi} d\Omega\, D^{j*}_{m\delta}(\phi, \theta, -\phi) \int_{4\pi} d\Omega'\, D^j_{m\delta}(\phi', \theta', -\phi')\, \langle \theta', \phi', \lambda_1, \lambda_2 | \theta, \phi, \lambda_1, \lambda_2 \rangle
      = \left(\frac{2j+1}{4\pi|b_j|}\right)^2 \int_{4\pi} d\Omega\, D^{j*}_{m\delta}(\phi, \theta, -\phi) \int_{4\pi} d\Omega'\, D^j_{m\delta}(\phi', \theta', -\phi')\, \delta(\cos\theta' - \cos\theta)\, \delta(\phi' - \phi)
      = \left(\frac{2j+1}{4\pi|b_j|}\right)^2 \int_{4\pi} d\Omega\, D^{j*}_{m\delta}(\phi, \theta, -\phi)\, D^j_{m\delta}(\phi, \theta, -\phi)   (128)
      = \left(\frac{2j+1}{4\pi|b_j|}\right)^2 2\pi \int_{-1}^1 d\cos\theta \left[ d^j_{m\delta}(\theta) \right]^2   (129)
      = \frac{4\pi}{2j+1} \left(\frac{2j+1}{4\pi|b_j|}\right)^2.   (130)

Therefore, |b_j|^2 = (2j+1)/4\pi, or picking a phase convention,

    b_j = \sqrt{\frac{2j+1}{4\pi}},   (131)

where we assume that it is all right to interchange the summation and


integration. Since each term is non-negative (and each finite), there is
no potential for cancellations. Hence, if we find convergence for one
ordering of the operations, we will for the other as well.
Note that we have used the result of Eqn. 348 of the notes to obtain:
    \int_{-1}^1 d\cos\theta \left[ d^j_{m\delta}(\theta) \right]^2 = \frac{2}{2j+1}.   (132)
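The normalization integral (132) can be spot-checked numerically for low j; a minimal sketch (the explicit small-d functions below are standard results, quoted here only for the check):

import numpy as np
from scipy.integrate import quad

# d^{1/2}_{1/2,1/2}(theta) = cos(theta/2);  d^{1}_{1,0}(theta) = -sin(theta)/sqrt(2)
cases = {
    (0.5, "d^{1/2}_{1/2,1/2}"): lambda th: np.cos(th / 2),
    (1.0, "d^{1}_{1,0}"):       lambda th: -np.sin(th) / np.sqrt(2),
}

for (j, name), d in cases.items():
    # Integrate [d(theta)]^2 over d(cos theta), substituting x = cos(theta).
    val, _ = quad(lambda x: d(np.arccos(x)) ** 2, -1, 1)
    print(name, val, "expected", 2 / (2 * j + 1))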

59. Clebsch-Gordan coefficients, an alternate practical approach: Exercise


16 of the Angular Momentum course note.
60. Application to angular distribution: Exercise 18 of the Angular Mo-
mentum course note. While you may apply the formula we derived in
class, I urge you to do this problem by thinking about it “from the
beginning” – what should be the angular dependence of the spatial
wave function? That is, I hope you will try using some “physical intu-
ition” first, and use the formula as a check if you wish. Note that you

are intended to assume that the frame is the rest frame of the spin-1
particle.

Physics 195b
Problem set number 13
Due 2 PM, Thursday, January 30, 2003

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read sections 1-5 of the “Approximate Methods” course note.


PROBLEMS:

61. Exercise 15 of the Angular Momentum course note.

62. Exercise 17 of the Angular Momentum course note.

63. More on decay angular distributions: Do exercise 19 of the Angular


Momentum course note.

64. Let us discuss some issues relevant to the proof you gave of Yang’s
theorem (by the way, the original reference is: C. N. Yang, Physical
Review 77 (1950) 242) in exercise 10 of the Angular Momentum course
note.

(a) Let us consider the parity again, and try to establish the connec-
tion between how I expected you to do the homework, and an
equivalent argument based on the identical boson symmetry.
In general, the parity of a system of two particles, when their state
is an eigenstate of the parity operator, may be expressed in the
form:
P = ηintrinsic ηspatial ,
where P 2 = 1, ηintrinsic refers to any intrinsic parity due to the
interchange of the positions of the particles, and ηspatial refers to
the effect of parity on the spatial part of the wave function. Given
a system of two identical particles, and considering the action of
parity on YL0 (θ, φ), determine the parity of the system for given
orbital angular momentum L.
For a two-photon system, such as we considered in our discussion
of Yang’s theorem, explicitly consider boson symmetry (i.e., that
the wave function is invariant under interchange of all quantum
numbers for two identical bosons) to once again determine the
effect of parity of the states:

|↑↑, |↓↓, |↑↓ + |↓↑, |↑↓ − |↓↑

Also, give the allowed possibilities for the orbital angular momen-
tum for these states, at least at the level of our current discussion.
You may now wish to go back to your original derivation of Yang’s
theorem, and determine where you implicitly made use of the
boson symmetry.
(b) Continuing our discussion of Yang’s theorem, there may be some
concern about the total spin angular momentum of the two photon
states, and whether the appropriate values are possible to give the
right overall angular momentum when combined with even or odd
orbital angular momenta. Using a table of Clebsch-Gordan coef-
ficients or otherwise, let us try to alleviate this concern. Thus,
decompose our four 2-photon helicity states (with Jz values in-
dicated by |↑↑⟩, |↓↓⟩, (1/√2)[|↑↓⟩ + |↓↑⟩], (1/√2)[|↑↓⟩ − |↓↑⟩], where the
photons are travelling along the + and −z axis) into states of
total spin angular momenta and spin projection along the z-axis:
| S, Sz . Hence, show that a J P = 0+ particle may decay into

two photons with relative orbital angular momentum L = 2 or 0,
and a J P = 0− particle may decay into two photons with relative
angular momentum L = 1.

65. Do exercise 1 of the Approximate Methods course note.

Physics 195b
Problem set number 13 – Solution to Problem 64
Due 2 PM, Thursday, January 30, 2003

READING: Read sections 1-5 of the “Approximate Methods” course note.


PROBLEMS:

61. Exercise 15 of the Angular Momentum course note.


62. Exercise 17 of the Angular Momentum course note.
63. More on decay angular distributions: Do exercise 19 of the Angular
Momentum course note.
64. Let us discuss some issues relevant to the proof you gave of Yang’s
theorem (by the way, the original reference is: C. N. Yang, Physical
Review 77 (1950) 242) in exercise 10 of the Angular Momentum course
note.

(a) Let us consider the parity again, and try to establish the connec-
tion between how I expected you to do the homework, and an
equivalent argument based on the identical boson symmetry.
In general, the parity of a system of two particles, when their state
is an eigenstate of the parity operator, may be expressed in the
form:
P = ηintrinsic ηspatial ,
where P 2 = 1, ηintrinsic refers to any intrinsic parity due to the
interchange of the positions of the particles, and ηspatial refers to
the effect of parity on the spatial part of the wave function. Given
a system of two identical particles, and considering the action of
parity on YL0 (θ, φ), determine the parity of the system for given
orbital angular momentum L.
For a two-photon system, such as we considered in our discussion
of Yang’s theorem, explicitly consider boson symmetry (i.e., that
the wave function is invariant under interchange of all quantum
numbers for two identical bosons) to once again determine the
effect of parity of the states:
|↑↑, |↓↓, |↑↓ + |↓↑, |↑↓ − |↓↑

Also, give the allowed possibilities for the orbital angular momen-
tum for these states, at least at the level of our current discussion.
You may now wish to go back to your original derivation of Yang’s
theorem, and determine where you implicitly made use of the
boson symmetry.
Solution: The action of parity on a system of given orbital an-
gular momentum L is:

P YL0 (θ, φ) = YL0 (π − θ, π + φ) (133)


= (−)L YL0 (θ, φ), (134)

since Y_{L0} does not depend on φ, and the effect of the transformation
on θ is determined by the even/odd properties of the Legendre
polynomials. Note that it is sufficient to consider YL0 , since if
M = 0, we can always rotate to a basis in which M = 0, and
[J, P ] = 0. Thus, the parity of a system of two identical particles
of orbital angular momentum L is (−)L .
The | ↑↑ is symmetric under interchange of the spins. Therefore,
it must be symmetric under interchange of spatial coordinates
in order for the total wave function to be symmetric under in-
terchange of the two photons. Thus, the parity of this state is
even. The same argument applies for | ↓↓ and | ↑↓ + ↓↑. The
| ↑↓ − ↓↑ state is odd under spin interchange, hence must be odd
under space interchange; the parity of this state is odd.
The even parity states can have even orbital angular momenta,
and the odd parity state can have odd orbital angular momenta.
(b) Continuing our discussion of Yang’s theorem, there may be some
concern about the total spin angular momentum of the two photon
states, and whether the appropriate values are possible to give the
right overall angular momentum when combined with even or odd
orbital angular momenta. Using a table of Clebsch-Gordan coef-
ficients or otherwise, let us try to alleviate this concern. Thus,
decompose our four 2-photon helicity states (with Jz values in-
dicated by |↑↑⟩, |↓↓⟩, (1/√2)[|↑↓⟩ + |↓↑⟩], (1/√2)[|↑↓⟩ − |↓↑⟩], where the
photons are travelling along the + and −z axis) into states of
total spin angular momenta and spin projection along the z-axis:
| S, Sz . Hence, show that a J P = 0+ particle may decay into

two photons with relative orbital angular momentum L = 2 or 0,
and a J P = 0− particle may decay into two photons with relative
angular momentum L = 1.
Solution: We are asked to combine the spins of the two photons
to determine the given states in terms of the total spin and its
projection along the z axis. Using a table of Clebsch-Gordan
coefficients, we find:

    |↑↑⟩ = |2\,2\rangle   (135)
    |↑↓⟩ = \frac{1}{\sqrt{6}}|2\,0\rangle + \frac{1}{\sqrt{2}}|1\,0\rangle + \frac{1}{\sqrt{3}}|0\,0\rangle   (136)
    |↓↑⟩ = \frac{1}{\sqrt{6}}|2\,0\rangle - \frac{1}{\sqrt{2}}|1\,0\rangle + \frac{1}{\sqrt{3}}|0\,0\rangle   (137)
    |↓↓⟩ = |2\,{-2}\rangle.   (138)

Hence, we also have the desired linear combinations:

    \frac{1}{\sqrt{2}}\left(|↑↓⟩ + |↓↑⟩\right) = \frac{1}{\sqrt{3}}|2\,0\rangle + \sqrt{\frac{2}{3}}|0\,0\rangle   (139)
    \frac{1}{\sqrt{2}}\left(|↑↓⟩ - |↓↑⟩\right) = |1\,0\rangle.   (140)

The J P = 0+ state, with even parity, can only decay to the even
parity state | ↑↓ + | ↓↑. Note that we must have L = S, in order
to couple to angular momentum 0. We see that this is possible
with L = 0, coupling to the |00 spin state, or L = 2, coupling to
the |20 spin state. The J P = 0− state, with odd parity, can only
decay to the odd parity state | ↑↓ − | ↓↑. This is possible only
with L = 1.
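The Clebsch-Gordan coefficients quoted in Eqn. (136) can be spot-checked with sympy, assuming its sympy.physics.quantum.cg module is available; a minimal sketch:

from sympy.physics.quantum.cg import CG

# Coefficients <S, Sz=0 | j1=1, m1=+1; j2=1, m2=-1>, i.e. the decomposition
# of |up down> into total-spin states, as in Eqn. (136).
for S in (2, 1, 0):
    print(S, CG(1, 1, 1, -1, S, 0).doit())
# Expected (Condon-Shortley convention): sqrt(1/6), sqrt(1/2), sqrt(1/3).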

65. Do exercise 1 of the Approximate Methods course note.

Physics 195b
Problem set number 14
Due 2 PM, Thursday, February 6, 2003

Notes about course:


• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.
• Collaboration policy: OK to work together in small groups, and to help
with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.
• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/
• TA: Anura Abeyesinghe, anura@caltech.edu
• If you think a problem is completely trivial (and hence a waste of your
time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .
READING: Finish reading the “Approximate Methods” course note.
PROBLEMS:
66. Variational computation: Do exercise 2 of the Approximate Methods
course note.
67. WKB and variational calculation of gravity levels: Do exercise 3 of the
Approximate Methods course note.
68. Perturbation calculation of relativistic correction in hydrogen: Do ex-
ercise 4 of the Approximate Methods course note.
69. Degenerate stationary state perturbation theory, applied to hydrogen
in an electric field: Do exercise 5 of the Approximate Methods course
note.

Physics 195b
Problem set number 15
Due 2 PM, Thursday, February 13, 2003
Notes about course:
• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.
• Collaboration policy: OK to work together in small groups, and to help
with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.
• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/
• TA: Anura Abeyesinghe, anura@caltech.edu
• If you think a problem is completely trivial (and hence a waste of your
time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .
READING: Read sections 1–7 of the “Scattering” course note.
PROBLEMS:
70. Almost degenerate perturbation theory: Do exercise 6 of the Approxi-
mate Methods course note.
71. Density of states: Do exercise 7 of the Approximate Methods course
note.
72. First Born approximation: Do exercise 8 of the Approximate Methods
course note.
73. Yukawa potential: Do exercise 9 of the Approximate Methods course
note.
74. Partial wave expansion; Optical theorem: Do Exercise 1 of the Scat-
tering course note.

Physics 195b
Problem set number 16
Due 2 PM, Thursday, February 20, 2003

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Finish reading the “Scattering” course note.


PROBLEMS:

75. Potential well/bump and spherical solutions: Do Exercise 2 of the Scat-


tering course note.

76. The “fundamental” and “effective” cross sections: Do Exercise 4 of the


Scattering course note.

77. Parity conservation and scattering amplitudes: Do Exercise 5 of the


Scattering course note.

78. Resonant scattering of light on an atom: Do Exercise 6 of the Scattering


course note.

79. When we talked about the hyperfine splitting in atoms, we mentioned
that the magnetic dipole moment of the proton is:
    \boldsymbol{\mu}_p = g_p \frac{e}{2m_p}\, \mathbf{s}_p,   (139)

with a measured magnitude corresponding to a value for the gyromag-


netic ratio of gp = 2 × (2.792847337 ± 0.000000029). Recall also that
I mentioned that the prediction of the Dirac equation for a point spin-
1/2 particle is g = 2. We may understand the fact that the proton
gyromagnetic ratio is not two as being due to its compositeness: In
the simple quark model, the proton is made of three quarks, two “ups”
(u), and a “down” (d). The quarks are supposed to be point spin-1/2
particles, hence, their gyromagnetic ratios should be gu = gd = 2 (up
to higher order corrections, as in the case of the electron). Let us see
whether we can make sense out of the proton magnetic moment.
The proton magnetic moment should be the sum of the magnetic mo-
ments of its constituents, and any moments due to their orbital motion
in the proton. The proton is the ground state baryon, so we assume
that the three quarks are bound together (by the strong interaction)
in a state with no orbital angular momentum. By Fermi statistics, the
two identical up quarks must have an overall odd wave function under
interchange of all quantum numbers. We must apply this with a bit of
care, since we are including “color” as one of these quantum numbers
here.
Let us look a little at the property of “color”. It is the strong interaction
analog of electric charge in the electromagnetic interaction. However,
instead of one fundamental dimension in charge, there are three color
directions, labelled as “red” (r), “blue” (b), and “green” (g). Unitary
transformations in this color space, up to overall phases, are described
by elements of the group SU(3), the group of unitary unimodular 3 ×
3 matrices. Just like combining spins, we may combine three colors
according to a Clebsch-Gordan series, with the result:

3 × 3 × 3 = 10 + 8 + 8 + 1. (140)

We haven’t studied this group, so this decomposition into irreducible


representations of the product representation is probably new to you.

However, the essential aspect here is that there is a singlet in the decom-
position. That is, it is possible to combine three colors in such a way as
to get a color-singlet state, i.e., a state with no net color charge. These
are the states of physical interest for our observed baryons, according to
a postulate of the quark model. After some thought (perhaps involving
raising and lowering operators along different directions in this color
space), you could probably convince yourself that the singlet state in
the decomposition above must be antisymmetric under the interchange
of any two colors. Assuming this is the case, write down the color
portion of the proton wave function.
Now that you know the color wave function of the quarks in the proton,
write down the spin wave function.
Since the proton is uud and its isospin partner the neutron is ddu, and
mp ≈ mn, let us make the simplifying assumption that mu = md. Given
the measured value of gp , what does your model give for mu ? Recall
that the up quark has electric charge 2/3, and the down quark has
electric charge −1/3, in units of the positron charge.
Finally, use your results to predict the gyromagnetic moment of the
neutron, and compare with observation.

Physics 195b
Problem set number 16 – Solution to Problem 79
Due 2 PM, Thursday, February 20, 2003

READING: Finish reading the “Scattering” course note.


PROBLEMS:

75. Potential well/bump and spherical solutions: Do Exercise 2 of the Scat-


tering course note.

76. The “fundamental” and “effective” cross sections: Do Exercise 4 of the


Scattering course note.

77. Parity conservation and scattering amplitudes: Do Exercise 5 of the


Scattering course note.

78. Resonant scattering of light on an atom: Do Exercise 6 of the Scattering


course note.

79. When we talked about the hyperfine splitting in atoms, we mentioned


that the magnetic dipole moment of the proton is:
    \boldsymbol{\mu}_p = g_p \frac{e}{2m_p}\, \mathbf{s}_p,   (139)

with a measured magnitude corresponding to a value for the gyromag-


netic ratio of gp = 2 × (2.792847337 ± 0.000000029). Recall also that
I mentioned that the prediction of the Dirac equation for a point spin-
1/2 particle is g = 2. We may understand the fact that the proton
gyromagnetic ratio is not two as being due to its compositeness: In
the simple quark model, the proton is made of three quarks, two “ups”
(u), and a “down” (d). The quarks are supposed to be point spin-1/2
particles, hence, their gyromagnetic ratios should be gu = gd = 2 (up
to higher order corrections, as in the case of the electron). Let us see
whether we can make sense out of the proton magnetic moment.
The proton magnetic moment should be the sum of the magnetic mo-
ments of its constituents, and any moments due to their orbital motion
in the proton. The proton is the ground state baryon, so we assume
that the three quarks are bound together (by the strong interaction)

in a state with no orbital angular momentum. By Fermi statistics, the
two identical up quarks must have an overall odd wave function under
interchange of all quantum numbers. We must apply this with a bit of
care, since we are including “color” as one of these quantum numbers
here.
Let us look a little at the property of “color”. It is the strong interaction
analog of electric charge in the electromagnetic interaction. However,
instead of one fundamental dimension in charge, there are three color
directions, labelled as “red” (r), “blue” (b), and “green” (g). Unitary
transformations in this color space, up to overall phases, are described
by elements of the group SU(3), the group of unitary unimodular 3 ×
3 matrices. Just like combining spins, we may combine three colors
according to a Clebsch-Gordan series, with the result:

3 × 3 × 3 = 10 + 8 + 8 + 1. (140)

We haven’t studied this group, so this decomposition into irreducible


representations of the product representation is probably new to you.
However, the essential aspect here is that there is a singlet in the decom-
position. That is, it is possible to combine three colors in such a way as
to get a color-singlet state, i.e., a state with no net color charge. These
are the states of physical interest for our observed baryons, according to
a postulate of the quark model. After some thought (perhaps involving
raising and lowering operators along different directions in this color
space), you could probably convince yourself that the singlet state in
the decomposition above must be antisymmetric under the interchange
of any two colors. Assuming this is the case, write down the color
portion of the proton wave function.
Now that you know the color wave function of the quarks in the proton,
write down the spin wave function.
Since the proton is uud and its isospin partner the neutron is ddu, and
mp ≈ mn, let us make the simplifying assumption that mu = md. Given
the measured value of gp , what does your model give for mu ? Recall
that the up quark has electric charge 2/3, and the down quark has
electric charge −1/3, in units of the positron charge.
Finally, use your results to predict the gyromagnetic moment of the
neutron, and compare with observation.

Solution: Note that there are six permutations of the three colors
among the three quarks, if no two quarks have the same color. The
completely antisymmetric combination of three colors is:
    \frac{1}{\sqrt{6}}\left( |rgb\rangle - |rbg\rangle + |brg\rangle - |bgr\rangle + |gbr\rangle - |grb\rangle \right).   (141)
To construct the spin wave function, we first note that the three spin-
1/2 quarks must combine in such a way as to give an overall spin-1/2
for the proton. Second, since the space wave function is symmetric,
and the color wave function is antisymmetric, the spin wave function
of the two up quarks must be symmetric. Thus, the two up quarks are
in a spin 1 state. To give the spin wave function of the proton, let us
pick the z axis to be along the spin direction. Then the spin state is:

    \left|\tfrac{1}{2}\tfrac{1}{2}\right\rangle = \sqrt{\frac{2}{3}}\left|1\,1; -\tfrac{1}{2}\right\rangle - \frac{1}{\sqrt{3}}\left|1\,0; \tfrac{1}{2}\right\rangle.   (142)
The magnetic moment of the proton in this model is thus:
    \mu_p = \frac{2}{3}(2\mu_u - \mu_d) + \frac{1}{3}\mu_d = \frac{4}{3}\mu_u - \frac{1}{3}\mu_d.   (143)
Hence,

    g_p \frac{e}{2m_p} = 2\cdot\frac{4}{3}\cdot\frac{2}{3}\,\frac{e}{2m_u} - 2\cdot\frac{1}{3}\cdot\left(-\frac{1}{3}\right)\frac{e}{2m_d}.   (144)
With gp = 5.58, mp = 938 MeV, and mu = md , we obtain
mu = 2mp /gp = 336 MeV. (145)

The neutron wave function may be obtained from the proton wave
function by interchanging the u and d quark labels. Thus,
    \mu_n = \frac{2}{3}(2\mu_d - \mu_u) + \frac{1}{3}\mu_u = \frac{4}{3}\mu_d - \frac{1}{3}\mu_u.   (146)
We predict the gyromagnetic moment of the neutron to be:
    \frac{\mu_n}{\mu_p} = \frac{\frac{4}{3}\mu_d - \frac{1}{3}\mu_u}{\frac{4}{3}\mu_u - \frac{1}{3}\mu_d}   (147)
                        = \frac{\frac{4}{3}\left(-\frac{1}{3}\right) - \frac{1}{3}\cdot\frac{2}{3}}{\frac{4}{3}\cdot\frac{2}{3} - \frac{1}{3}\left(-\frac{1}{3}\right)}   (148)
                        = -\frac{2}{3}.   (149)
That is, we predict (neglecting the mass difference) gn = −3.72. This
may be compared with the observed value of −3.83.
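The arithmetic of Eqns. (144)-(149) is easy to check numerically; a small sketch (using the values of g_p and m_p quoted above):

# Numerical check of the quark-model estimates in Eqns. (144)-(149).
g_p = 2 * 2.792847337          # measured proton gyromagnetic ratio
m_p = 938.0                    # MeV, as used above

m_u = 2 * m_p / g_p            # Eqn. (144) with m_u = m_d reduces to g_p e/(2 m_p) = 2 e/(2 m_u)
print("m_u =", round(m_u), "MeV")            # ~336 MeV

mu_u, mu_d = (2 / 3) / m_u, (-1 / 3) / m_u   # quark moments ~ charge/mass, common factors dropped
ratio = (4 / 3 * mu_d - 1 / 3 * mu_u) / (4 / 3 * mu_u - 1 / 3 * mu_d)
print("mu_n/mu_p =", ratio)                  # -2/3
print("g_n prediction =", ratio * g_p)       # ~ -3.72 (observed: -3.83)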

Physics 195b
Problem set number 17
Due 2 PM, Thursday, February 27, 2003

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/˜fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read the “Identical Particles” course note.


PROBLEMS:

80. High energy limit: Do Exercise 7 of the Scattering course note.

81. Consider the graph in Fig. 1.


Assume that the other phase shifts are negligible (e.g., “low energy”
is reasonably accurate). The pion mass and energy here are suffi-
ciently small that we can at least entertain the approximation of an
infinitely heavy proton at rest – we’ll assume this to be the case,
in any event. Note that T_\pi is the relativistic kinetic energy of the
\pi^+: T_\pi = \sqrt{P_\pi^2 + m_\pi^2} - m_\pi.

[Figure 1: Made-up graph of the phase shifts δ0 and δ1 for elastic π+p scattering (neglecting spin); vertical axis: phase shift in degrees, 0 to 160; horizontal axis: Tπ (MeV), 0 to 200.]

(a) Is the π + p force principally attractive or repulsive (as shown in


this figure)?
(b) Plot the total cross section in mb (millibarns) as a function of
energy, from Tπ =40 to 200 MeV.
(c) Plot the angular distribution of the scattered π + at energies of
120, 140 and 160 MeV.
(d) What is the mean free path of 140 MeV pions in a liquid hydrogen
target, with these “protons”?

82. Inelastic scattering: Do Exercise 8 of the Scattering course note.

83. Exclusion principle and atomic states: Do Exercise 1 of the Identical


Particles course note.

Physics 195b
Problem set number 17 – Solution to Problem 81
Due 2 PM, Thursday, February 27, 2003

READING: Read the “Identical Particles” course note.


PROBLEMS:
80. High energy limit: Do Exercise 7 of the Scattering course note.
81. Consider the graph in Fig. 1.
[Figure 1: Made-up graph of the phase shifts δ0 and δ1 for elastic π+p scattering (neglecting spin); the two curves are labelled δ1 and δ0; vertical axis: phase shift in degrees, 0 to 160; horizontal axis: Tπ (MeV), 0 to 200.]

Assume that the other phase shifts are negligible (e.g., “low energy”
is reasonably accurate). The pion mass and energy here are suffi-
ciently small that we can at least entertain the approximation of an
infinitely heavy proton at rest – we’ll assume this to be the case,
in any event. Note that T_\pi is the relativistic kinetic energy of the
\pi^+: T_\pi = \sqrt{P_\pi^2 + m_\pi^2} - m_\pi.

(a) Is the π + p force principally attractive or repulsive (as shown in
this figure)?
Solution: The phase shifts are positive, indicating a dominantly
attractive potential.
(b) Plot the total cross section in mb (millibarns) as a function of
energy, from Tπ =40 to 200 MeV.
Solution: The total cross section in terms of the partial wave
phase shifts is:

    \sigma_T = \frac{4\pi}{k^2} \sum_{\ell=0}^{\infty} (2\ell+1) \sin^2\delta_\ell   (150)
             = \frac{4\pi}{k^2} \left( \sin^2\delta_0 + 3\sin^2\delta_1 \right).   (151)

The kinetic energy T_\pi is related to k by T_\pi = \sqrt{m_\pi^2 + k^2} - m_\pi, or

    k = \sqrt{T_\pi(T_\pi + 2m_\pi)}.   (152)

To convert to millibarns, we multiply by:

    1 = (197\ \mathrm{MeV\,fm})^2 \times 10\ \mathrm{mb/fm^2} = 3.88 \times 10^5\ \mathrm{MeV^2\,mb}.   (153)
(c) Plot the angular distribution of the scattered π + at energies of
120, 140 and 160 MeV.
Solution:

    \frac{d\sigma}{d\Omega} = \left| \frac{1}{2ik} \sum_{j=0}^{\infty} (2j+1)\left( e^{2i\delta_j(k)} - 1 \right) P_j(\cos\theta) \right|^2   (154)
                            = \frac{1}{4k^2}\left| e^{2i\delta_0(k)} - 1 + 3\left( e^{2i\delta_1(k)} - 1 \right)\cos\theta \right|^2   (155)
                            = \frac{1}{4k^2}\left\{ \left[ \cos 2\delta_0 - 1 + 3(\cos 2\delta_1 - 1)\cos\theta \right]^2 + \left[ \sin 2\delta_0 + 3\sin 2\delta_1 \cos\theta \right]^2 \right\}.
(d) What is the mean free path of 140 MeV pions in a liquid hydrogen
target, with these “protons”?
Solution: The cross section for 140 MeV pions is ∼ 260 mb. The
density of liquid hydrogen is 0.0708 g/cm3 . The number density
is \rho = 4.2 \times 10^{28}\ \mathrm{m}^{-3}. The mean free path is thus

    \lambda = \frac{1}{\sigma_T \rho} = 0.9\ \mathrm{m}.   (156)
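A short Python sketch tying parts (b)-(d) together; the phase-shift values at 140 MeV are placeholders read off the made-up figure, so only the rough size of the numbers should be taken seriously:

import numpy as np

m_pi = 139.6          # MeV
CONV = 3.88e5         # (197 MeV fm)^2 * 10 mb/fm^2, Eqn. (153): converts 1/MeV^2 to mb

def k2(T):
    # k^2 from the relativistic kinetic energy, Eqn. (152)
    return T * (T + 2 * m_pi)

def sigma_total_mb(T, d0, d1):
    # Eqn. (151), converted to millibarns
    return 4 * np.pi / k2(T) * (np.sin(d0) ** 2 + 3 * np.sin(d1) ** 2) * CONV

def dsigma_dOmega_mb(T, d0, d1, costh):
    # Eqn. (155), converted to millibarns per steradian
    amp = (np.exp(2j * d0) - 1) + 3 * (np.exp(2j * d1) - 1) * costh
    return np.abs(amp) ** 2 / (4 * k2(T)) * CONV

# Placeholder phase shifts at T_pi = 140 MeV (e.g. delta_0 ~ 10 deg, delta_1 ~ 90 deg):
d0, d1 = np.radians(10.0), np.radians(90.0)
sigma = sigma_total_mb(140.0, d0, d1)
print("sigma_T(140 MeV) ~", round(float(sigma)), "mb")   # compare with ~260 mb quoted above
print("dsigma/dOmega at cos(theta)=1 ~",
      round(float(dsigma_dOmega_mb(140.0, d0, d1, 1.0)), 1), "mb/sr")

# Part (d): mean free path in liquid hydrogen, n ~ 4.2e28 protons/m^3, 1 mb = 1e-31 m^2
n = 4.2e28
print("mean free path ~", round(1.0 / (sigma * 1e-31 * n), 2), "m")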

82. Inelastic scattering: Do Exercise 8 of the Scattering course note.

83. Exclusion principle and atomic states: Do Exercise 1 of the Identical


Particles course note.

Physics 195b
Problem set number 18
Due 2 PM, Thursday, March 6, 2003

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/~fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read the “Electromagnetic Interactions” course note.


PROBLEMS:

84. Extended boson principle and decays to two pions: Do Exercise 2 of


the Identical Particles course note.

85. Gauge transformation in electromagnetism: Do Exercise 1 of the Elec-


tromagnetic Interactions course note.

86. We discussed the method of stationary phase in class. Recall that the
problem it addresses is to evaluate integrals of the form:
I(ε) = ∫_{−∞}^{+∞} f(x) e^{iθ(x)/ε} dx,   (150)

where f and θ are real, and ε > 0. We showed that, in the situation where ε is very small, and θ has a stationary point at x = x₀, this integral is approximately:

I(ε) = f(x₀) e^{iθ(x₀)/ε} e^{i(π/4) sign[θ″(x₀)]} √( 2πε / |θ″(x₀)| ) [1 + O(ε)].   (151)

If there is more than one stationary point, then the contributions are to be summed.
To get a little practice applying this method, evaluate the following integral for large t:

J(t) = ∫₀¹ cos[ t(x³ − x) ] dx.   (152)

87. In problem 82 you considered the scattering of particles in a multiplet.


You determined the total elastic (sometimes called “scattering”) cross
section and the total inelastic (“reaction”) cross sections in terms of
the A^(ℓ)_αβ matrix in the partial wave expansion. Consider now the graph
in Fig. 2.
This graph purports to show the allowed and forbidden regions for the
total elastic and inelastic cross sections in a given partial wave . Derive
the formula for the allowed region of this graph. Make sure to check
the extreme points.

[Figure 2: vertical axis k² σ^el(ℓ)_αTOT / [π(2ℓ+1)] from 0 to 4; horizontal axis k² σ^inel(ℓ)_αTOT / [π(2ℓ+1)] from 0 to 1; regions labeled "Allowed" and "Not Allowed".]

Figure 2: The allowed and forbidden regions for possible elastic and inelastic cross sections for the scattering of particles in a multiplet.

Physics 195b
Problem set number 18 – Solution to Problems 86 and 87
Due 2 PM, Thursday, March 6, 2003

READING: Read the “Electromagnetic Interactions” course note.


PROBLEMS:

84. Extended boson principle and decays to two pions: Do Exercise 2 of


the Identical Particles course note.

85. Gauge transformation in electromagnetism: Do Exercise 1 of the Elec-


tromagnetic Interactions course note.

86. We discussed the method of stationary phase in class. Recall that the
problem it addresses is to evaluate integrals of the form:
I(ε) = ∫_{−∞}^{+∞} f(x) e^{iθ(x)/ε} dx,   (157)

where f and θ are real, and ε > 0. We showed that, in the situation where ε is very small, and θ has a stationary point at x = x₀, this integral is approximately:

I(ε) = f(x₀) e^{iθ(x₀)/ε} e^{i(π/4) sign[θ″(x₀)]} √( 2πε / |θ″(x₀)| ) [1 + O(ε)].   (158)

If there is more than one stationary point, then the contributions are to be summed.
To get a little practice applying this method, evaluate the following integral for large t:

J(t) = ∫₀¹ cos[ t(x³ − x) ] dx.   (159)

Solution: To start to get it into the desired form, write

J(t) = Re ∫₀¹ e^{it(x³ − x)} dx.   (160)

Thus, f(x) = 1, θ(x) = x³ − x, and ε = 1/t. The first two derivatives are θ′(x) = 3x² − 1 and θ″(x) = 6x. The first derivative is zero at x = ±1/√3. The zero at x₀ = 1/√3 falls within the range of the integral, so this is the only stationary point of interest. The value of θ at the stationary point is θ(x₀) = −2/(3√3). The second derivative at the stationary point is θ″(x₀) = 2√3.
Plugging into our stationary phase formula, we get:

J(t) ≈ Re { (1/√t) e^{−2it/(3√3)} e^{iπ/4} √( 2π/(2√3) ) }   (161)
     = Re { √( π/(√3 t) ) e^{i[π/4 − 2t/(3√3)]} }   (162)
     = √( π/(√3 t) ) cos( π/4 − 2t/(3√3) ).   (163)
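As an aside (not part of the assigned problem), the asymptotic form (163) is easy to check against a direct numerical evaluation of J(t). The sketch below assumes Python with NumPy and SciPy; agreement is only to leading order in 1/t, since the O(1/t) endpoint contributions are not included in Eq. (163).

    import numpy as np
    from scipy.integrate import quad

    def J_numeric(t):
        # Direct numerical integration of J(t) = int_0^1 cos[t(x^3 - x)] dx.
        val, _ = quad(lambda x: np.cos(t * (x**3 - x)), 0.0, 1.0, limit=500)
        return val

    def J_asymptotic(t):
        # Leading stationary-phase term, Eq. (163).
        return np.sqrt(np.pi / (np.sqrt(3.0) * t)) * np.cos(np.pi / 4.0 - 2.0 * t / (3.0 * np.sqrt(3.0)))

    for t in (50.0, 200.0, 1000.0):
        print(t, J_numeric(t), J_asymptotic(t))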

87. In problem 82 you considered the scattering of particles in a multiplet.


You determined the total elastic (sometimes called “scattering”) cross
section and the total inelastic (“reaction”) cross sections in terms of
the A^(ℓ)_αβ matrix in the partial wave expansion. Consider now the graph
in Fig. 2.
This graph purports to show the allowed and forbidden regions for the
total elastic and inelastic cross sections in a given partial wave . Derive
the formula for the allowed region of this graph. Make sure to check
the extreme points.
Solution: For simplicity, let the vertical axis be v, and the horizontal
axis u:
u = k² σ^inel(ℓ)_αTOT / [π(2ℓ + 1)],   v = k² σ^el(ℓ)_αTOT / [π(2ℓ + 1)].   (164)

From the solution to problem 82, and unitarity of the A^(ℓ) matrix, we thus have

u = Σ_{β≠α} |A^(ℓ)_βα|² = 1 − |A^(ℓ)_αα|²,   (165)

v = |A^(ℓ)_αα − 1|² = 1 + |A^(ℓ)_αα|² − 2 Re A^(ℓ)_αα.   (166)

The constraint imposed by unitarity is that |A^(ℓ)_αα|² ≤ 1. Let A^(ℓ)_αα = r e^{iθ}. Then r ≤ 1 and 0 ≤ θ < 2π gives the allowed region. In terms of the plotted quantities, u = 1 − r² and v = 1 + r² − 2r cos θ. Thus

0 ≤ u ≤ 1,   (167)

and for given u, v must be in the range

(1 − r)² ≤ v ≤ (1 + r)²,   (168)

where r = √(1 − u). If r = 0 then (u, v) = (1, 1). If r = 1 then u = 0 and 0 ≤ v ≤ 4.
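The boundary just derived is easy to trace out numerically; the following is a minimal sketch (assuming Python with NumPy, not part of the original solution) that confirms the extreme points quoted above:

    import numpy as np

    # Allowed region of Eqs. (164)-(168): u = 1 - r^2, (1 - r)^2 <= v <= (1 + r)^2, 0 <= r <= 1.
    r = np.linspace(0.0, 1.0, 201)
    u = 1.0 - r**2
    v_min = (1.0 - r)**2      # theta = 0 boundary
    v_max = (1.0 + r)**2      # theta = pi boundary

    print(u[0], v_min[0], v_max[0])      # r = 0: (u, v) = (1, 1)
    print(u[-1], v_min[-1], v_max[-1])   # r = 1: u = 0, v ranges from 0 to 4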

[Figure 2: vertical axis k² σ^el(ℓ)_αTOT / [π(2ℓ+1)] from 0 to 4; horizontal axis k² σ^inel(ℓ)_αTOT / [π(2ℓ+1)] from 0 to 1; regions labeled "Allowed" and "Not Allowed".]

Figure 2: The allowed and forbidden regions for possible elastic and inelastic cross sections for the scattering of particles in a multiplet.

Physics 195b
Problem set number 19
Due 2 PM, Thursday, March 13, 2003

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/~fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

READING: Read the “Second Quantization” course note.


PROBLEMS:

88. Probability current of a charged particle in an electromagnetic field:


Do Exercise 2 of the Electromagnetic Interactions course note.

89. The electric field in quantum mechanics: Do Exercise 3 of the Electro-


magnetic Interactions course note.

90. Two-level fermion system, second quantized: Do Exercise 1 of the Sec-


ond Quantization course note.

91. We have been discussing superconductivity as an application of quan-
tum mechanics. We considered a very simple model at first, in order
to demonstrate the plausibility of the formation of Cooper pairs. The
scanned notes are available as a link from the Ph 195 home page. The
first seven pages are of present interest.

(a) Try to make an estimate for how large the Cooper pair is, perhaps by evaluating ⟨r⟩ or ⟨r²⟩.
(b) Turn your estimate into a number, e.g., comparing the size with
the lattice spacing of the superconductor. You will no doubt need
to make plausible estimates (guesses?) of unknown parameters.

Physics 195b
Problem set number 19 – Solution to Problem 91
Due 2 PM, Thursday, March 13, 2003

READING: Read the “Second Quantization” course note.


PROBLEMS:

88. Probability current of a charged particle in an electromagnetic field:


Do Exercise 2 of the Electromagnetic Interactions course note.

89. The electric field in quantum mechanics: Do Exercise 3 of the Electro-


magnetic Interactions course note.

90. Two-level fermion system, second quantized: Do Exercise 1 of the Sec-


ond Quantization course note.

91. We have been discussing superconductivity as an application of quan-


tum mechanics. We considered a very simple model at first, in order
to demonstrate the plausibility of the formation of Cooper pairs. The
scanned notes are available as a link from the Ph 195 home page. The
first seven pages are of present interest.

(a) Try to make an estimate for how large the Cooper pair is, perhaps by evaluating ⟨r⟩ or ⟨r²⟩.
Solution: Let’s try a simple uncertainty principle argument for
a starter: We know the temperature scale, Tc at which materials
normally become superconducting. The typical interaction energy
in a superconductor may be expected to be of this same scale.
Thus, the uncertainty in the momentum is of order
Δk ≈ (T_c/ε_F) k_F,   (169)

where k_F is the momentum at the Fermi surface, and ε_F = k_F²/2m. Thus, we anticipate a typical spatial extent of order

Δx ≈ 1/Δk ≈ (ε_F/T_c) (1/k_F).   (170)

(b) Turn your estimate into a number, e.g., comparing the size with
the lattice spacing of the superconductor. You will no doubt need
to make plausible estimates (guesses?) of unknown parameters.
Solution: We need to estimate the energy of the Fermi surface.
We obtained in class that the momentum is given by:

k_F = (3π² n)^{1/3},   (171)

where n is the number density of conduction electrons. What is this density, numerically? Well, in a conductor, we have of order one electron contributed to the conduction band by each atom. Consider niobium, with atomic weight 93 and density 8.6 g/cc. We estimate the number density as:

n ≈ (8.6/93) × 6 × 10²³ ≈ 6 × 10²² electrons/cc.   (172)

Thus,

k_F ≈ (180 × 10²²)^{1/3} ≈ 1.2 × 10⁸ cm⁻¹.   (173)

In other units (using 1 = 2 × 10⁻⁵ eV·cm), this is k_F ≈ 2 × 10³ eV. This corresponds to an energy of

ε_F ≈ (2 × 10³)² / (2 × 0.5 × 10⁶) ≈ 4 eV.   (174)

Let us take a value of T_c = 10 K as a superconducting transition temperature for our estimate. This is approximately (10/300) × (1/40) ≈ 10⁻³ eV. Thus,

Δx ≈ (4/10⁻³) × (1/10⁸) ≈ 4 × 10⁻⁵ cm.   (175)

If the lattice spacing is of order 10⁻⁸ cm, then this distance corresponds to roughly 4000 lattice spacings.
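The chain of estimates in Eqs. (171)–(175) can be redone numerically; the sketch below (assuming Python with NumPy) uses the same rough inputs as above (one conduction electron per niobium atom, T_c = 10 K) and lands on the same ~10⁻⁵ cm scale, i.e. thousands of lattice spacings, with small differences due to rounding.

    import numpy as np

    HBARC_EV_CM = 1.973e-5    # hbar*c in eV*cm
    M_E_EV = 0.511e6          # electron mass in eV
    N_A = 6.022e23            # Avogadro's number

    n = 8.6 / 93.0 * N_A                          # conduction electrons per cm^3, Eq. (172)
    kF = (3.0 * np.pi**2 * n)**(1.0 / 3.0)        # Fermi momentum in cm^-1, Eqs. (171), (173)
    eF = (kF * HBARC_EV_CM)**2 / (2.0 * M_E_EV)   # Fermi energy in eV, Eq. (174)
    Tc_eV = (10.0 / 300.0) * (1.0 / 40.0)         # 10 K in eV (kT ~ 1/40 eV at 300 K)
    dx = (eF / Tc_eV) / kF                        # Cooper-pair size estimate in cm, Eq. (175)
    print(n, kF, eF, dx, dx / 1.0e-8)             # last entry: size in units of a 1e-8 cm lattice spacing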

Physics 195b
Problem set number 20
Due 2 PM, Thursday, March 20, 2003

Notes about course:

• Homework should be turned in to the TA’s mail slot on the first floor
of East Bridge.

• Collaboration policy: OK to work together in small groups, and to help


with each other’s understanding. Best to first give problems a good try
by yourself. Don’t just copy someone else’s work – whatever you turn
in should be what you think you understand.

• There is a web page for this course, which should be referred to for the
most up-to-date information. The URL:
http://www.hep.caltech.edu/~fcp/ph195/

• TA: Anura Abeyesinghe, anura@caltech.edu

• If you think a problem is completely trivial (and hence a waste of your


time), you don’t have to do it. Just write “trivial” where your solution
would go, and you will get credit for it. Of course, this means you are
volunteering to help the rest of the class understand it, if they don’t
find it so simple. . .

PROBLEMS:

92. We obtained the following Hamiltonian, up to a constant, for our superconductor:

    H_S = ∫ d³(x) [ (1/2m) ∇ψ_s†(x) · ∇ψ_s(x) − µ ψ_s†(x)ψ_s(x) ]
        + (1/2) ∫ d³(x) ∫ d³(y) [ Δ_ss′(x, y) ψ_s′(y)ψ_s(x) + Δ*_ss′(x, y) ψ_s†(x)ψ_s′†(y) ],   (169)
where

(a) ψs† (x) is a field operator creating an electron with spin projection
s at point x,

(b) repeated indices are summed over,
(c) µ is the “chemical potential” (energy of the Fermi surface here),
(d) the “gap function” is:

    Δ_ss′(x, y) = ⟨ψ_s†(x) ψ_s′†(y)⟩ U(x − y),   (170)

(e) and U(x−y) is the (weakly attractive) electron-electron potential.

As a first step towards the Bogoliubov transformation, we defined the


two-component vector field according to:

    Υ_α = ( ψ_s(x)
            ψ_s†(x) ),   (171)

where α subsumes the s and x indices in one symbol.


Show that our Hamiltonian may be written in the form:

    H_S = (1/2) Υ_α† H_αβ Υ_β + H_0,   (172)

where

    H_0 = (1/2) Tr[ −(1/2m)∇² − µ ],   (173)

and

    H_αβ = (  [ (1/2m) ∇_x · ∇_y − µ ] δ_αβ          Δ*_αβ
              −Δ_αβ           −[ (1/2m) ∇_x · ∇_y − µ ] δ_αβ  ).   (174)

93. Let us investigate the very low energy scattering limit somewhat fur-
ther. In this limit, we expect S-wave scattering to dominate, so let us
look at the S-wave term (considering spinless case for now):

    f_0(k) = (1/2ik) (e^{2iδ_0} − 1).   (175)

In the low energy limit, we can expand k cot δ_0 in a series in powers of k²:

    k cot δ_0 = −1/a + (1/2) r_0 k² + O(k⁴),   (176)

where a is called the (zero-energy) “scattering length”,1 and r0 is called
the “effective range”. In principle, we need to establish this formula,
and relate a and r0 to the properties of the scattering center. However,
let us assume that the energy is sufficiently low that we may neglect
all but the first term.

(a) Show that, in the low energy limit, we may write δ0 = −ak, and
find a simple “physical” picture for a. That is, I want you to relate
a to some value of radius r. It is possible to do this by considering
the wavefunctions only in the region where the potential vanishes,
although to actually compute a requires looking at the potential,
of course. What is the total cross section, in terms of the scattering
length?
(b) We continue by considering specifically very low energy neutron-
proton scattering. These are not spinless particles, and indeed it is
observed that the potential is spin-dependent (can you think why
you already know this?). Thus, for neutron-proton scattering at
very low energies, we have two potentials to think about (more,
if spin flips happen, but we assume that we are at sufficiently low
energy so that everything is S-wave, with total spin conserved):
An effective potential for scattering in the spin-singlet state, and
another for the spin-triplet state. Corresponding to these two
potentials, we introduce two scattering lengths, at for the triplet
interaction, and as for the singlet.
With the sign convention of part (a), and using what you know
about the neutron-proton interaction, give a simple discussion of
what you expect for the signs of at and as .
(c) Continuing with low energy neutron-proton scattering, show that
the total cross section for low energy scattering of neutrons on a
target of randomly polarized protons is:

    σ_0 = 4π [ (3/4) a_t² + (1/4) a_s² ].   (177)
94. In practice, the result of the preceding problem may be applied to
neutron scattering on a hydrogen target (as our source of protons),
when the neutron wavelength is short compared with the hydrogen molecular size, but still long enough to be in the "low-energy" regime, where we may neglect both higher partial waves and the effective range term.

¹ Not everyone uses the same sign convention for a!

(a) Consider a possible target with hydrogen gas at room tempera-


ture. Assuming thermal equilibrium, such a gas is a mixture of
parahydrogen (nuclear spins antiparallel) and orthohydrogen (nu-
clear spins parallel).2 What is the fraction of parahydrogen?
(b) Suppose now that we have constructed a target (at 20 K, say)
consisting entirely of parahydrogen. We consider the scattering
of “cold” neutrons from this target. By “cold” we mean neutrons
with an energy corresponding also to T ≈ 20 K. In this case
(as you should convince yourself with a quick computation), the
neutrons scatter elastically from the parahydrogen molecule as a
whole. This may be thought of as coherent scattering from the
two protons. Show that the total cross section for the scattering
of cold neutrons on parahydrogen is:

    σ_P = 4π (64/9) [ (3/4) a_t + (1/4) a_s ]².   (178)
You should assume: (i ) The neutron wavelength is much larger
than the molecular size; (ii ) The scattered wave is the sum of the
waves scattered from each proton – i.e., there is no “double scat-
tering” where the wave scattered from one proton subsequently
scatters on the second proton; (iii ) The scattering is strictly elas-
tic.
Experimentally (see e.g., R. B. Sutton et al., Phys. Rev. 72 (1947)
1147),

σ_0 = 20.4 × 10⁻²⁴ cm²,   (179)

σ_P = 3.9 × 10⁻²⁴ cm².   (180)

Thus, determine at and as . Are your results sensible?

² How do you remember which is which? Life is complicated by paradoxical parallelisms, to say nothing of orthogonal orthodoxisms. But in the end, orthodoxy is paramount.
