
Quantum Physics Lecture Notes

Understanding the Schrödinger Equation

Sebastian de Haro

Amsterdam University College, Fall Semester, 2014


Cover illustration: Wikipedia
Contents

Introduction

1 Motivating the Schrödinger Equation
  1.1 Classical waves
  1.2 Enter the quantum
  1.3 The wave function

2 Four Steps to Solve the Schrödinger Equation
  2.1 The Problem
  2.2 Step 1: Reduce to TISE
  2.3 Step 2: General Solution of TDSE
  2.4 Step 3: Impose Initial Condition Ψ0(x)
  2.5 Step 4: Plug Back into Ψ(x, t)
  2.6 Example: Gaussian Wave Function

3 One-Dimensional Potentials
  3.1 General Theorems
  3.2 The TISE with one-dimensional potentials

4 Fourier Integrals and the Dirac Delta
  4.1 The Dirac delta
  4.2 Fourier transformations

5 The Formalism of Quantum Mechanics
  5.1 Why?
  5.2 The Postulates of Quantum Mechanics
  5.3 Linear Algebra
  5.4 Continuous spectra

6 Dirac Notation
  6.1 Base-free notation
  6.2 Closure
  6.3 Bras and kets

7 The Interpretation of Quantum Mechanics
  7.1 EPR and Hidden Variables
  7.2 Bohr's Reply to EPR
  7.3 The Measurement Problem
  7.4 Other Approaches
    7.4.1 Decoherence
    7.4.2 Many Worlds

A Mathematical Formulas and Tricks
  A.1 Gaussian integration

B Technicalities of Quantum Mechanical Measurements
  B.1 Time Evolution Operators
  B.2 The Measurement Operator U

References

Introduction

Welcome to the fascinating world of quantum mechanics! We are going to learn how to do computations in quantum mechanics and how to interpret our results physically. I say quantum mechanics is fascinating because it is so different from any other physical theory we have ever seen before. In classical physics, particles travel along trajectories that can be drawn in space and time. In quantum mechanics, particles don't have trajectories, and sometimes we can't even say where they are located in space! In fact, quantum mechanics is so weird that some of the scientists who made major contributions to it (notoriously, Albert Einstein and Erwin Schrödinger) never believed the interpretation of quantum theory that has now become standard. Richard Feynman famously said, "I think I can safely say that nobody understands quantum mechanics." So, remember: if you get confused, you are in good company. However, I don't think Feynman was entirely right on this one. We don't seem to have quantum mechanical intuition wired into our brains and bodies (as we do learn to appreciate weight, height, etc., intuitively). Feynman was certainly right about that. But we can learn to work with the theory and progressively develop a rational intuition for it, based on what we learn from the formulas and the physical principles they embody. By logical thinking and some guesswork, by looking at the experimental facts (black body radiation, photoelectric effect, double slit experiment, etc.), we can try to find the necessary physical principles that will allow us to construct the theory of quantum mechanics. We can't derive quantum mechanics, let me be clear about that, just like we can't derive, from sheer thinking about the concept of mass, the fact that Newton's law of gravity decreases with the square of the distance. But we can look for the minimal set of principles and equations that will reproduce and unify all of the facts we know about quantum mechanics. This is what Schrödinger's equation does for us. Once we have motivated why it should be true, we have to learn how to work with it and to interpret it. Indeed this is the right order: first learn to work with it, then gain a deeper understanding of what it means. Remember that in the short period in which the 'new' quantum theory was developed (basically, Christmas of 1925 to the summer of 1926) Heisenberg and Schrödinger worked on the math, and only later on did they focus on its meaning. We will be doing both things as simultaneously as possible: developing the theory while discussing its interpretation.

These lecture notes are to be used as a complement to a textbook on quantum mechanics such as Griffiths' [1]. I will focus on a number of selected issues that I think are important to understand quantum mechanics. In the first chapter I motivate how the Schrödinger equation comes to be in the first place, rather than just throwing it at you and asking you to trust quantum mechanics. Whereas the Schrödinger equation cannot be derived from classical mechanics, it can be motivated from semi-classical considerations. I also expand, in later sections, on mathematical explanations and include study tools. Writing down a mathematical theory of quantum mechanics assumes knowledge of the basic experiments that led to this theory and the broad principles that were derived from them. Therefore I recommend that, before you start reading these lecture notes, you refresh your memory (or catch up, whichever may be the case) on the most relevant historical experiments and the physical principles that were drawn from them. I recommend chapter 37 of Giancoli [2]. If you want a more thorough exposition, you can read chapter 1 of [3]. I will refer to these experiments regularly.

1 Motivating the Schrödinger Equation

1.1 Classical waves

We recall some concepts from classical physics that will both serve to fix notation and motivate the way we introduce waves in quantum mechanics. Consider a one-dimensional wave, say:

y(x) = A sin kx .   (1)

This is a static, sinusoidal wave extending in the x-direction. A, the maximum height, is called the amplitude of the wave. The wavelength is the distance between two identical points on the wave (e.g. two crests, two troughs, two nodes), and in the above example it is given by λ = 2π/k, because this is the smallest number for which the wave repeats itself, i.e. y(x + λ) = y(x). k is usually called the wave number. Consider now a wave traveling at speed v:

y(x, t) = A sin(k(x − vt)) = A sin((2π/λ)(x − vt)) .   (2)

The period T of this wave is the time it takes for the wave to go back to itself in time, analogously to the wavelength: y(x, t + T) = y(x, t), hence T = λ/v. You see that the interpretation of v as the speed is right, as v is indeed the wavelength divided by the period. The frequency is defined as ν = 1/T, and the angular frequency is ω = 2πν = 2π/T. The advantage of using the wave number and the angular frequency is that they include the periodicity of the sine and cosine, and the factors of 2π nicely disappear:

y(x, t) = A sin(kx − ωt) .   (3)

This is a notation that we will use quite often. I am deliberately using ν for the frequency here and not f, because this is the standard notation in more advanced texts. We will use the Greek alphabet extensively in this course, so you had better get used to it!: α, β, γ, δ, ε, ...

Complex waves. In quantum mechanics we will always use complex waves, for reasons that will become clear later. A complex wave looks like:

Ψ(x, t) = A e^{i(kx−ωt)} .   (4)

Now recall the following result:

Euler's formula. e^{iϕ} = cos ϕ + i sin ϕ.


Proof. This can be proven by using the following relations for the sine and cosine, which you should know:

sin ϕ = (e^{iϕ} − e^{−iϕ})/(2i) ,
cos ϕ = (e^{iϕ} + e^{−iϕ})/2 .   (5)

From this, we get:

cos ϕ + i sin ϕ = (1/2) e^{iϕ} + (1/2) e^{−iϕ} + i (1/2i) e^{iϕ} − i (1/2i) e^{−iϕ} = e^{iϕ} ,   (6)

which proves Euler's formula.
Remark. If you don't recognise formula (5) at all, try to prove it using the Taylor expansions (Taylor series) of the sine, cosine, and exponential functions, which you should know:

sin x = Σ_{n=0}^{∞} (−1)^n x^{2n+1}/(2n+1)! ,
cos x = Σ_{n=0}^{∞} (−1)^n x^{2n}/(2n)! ,
e^x = Σ_{n=0}^{∞} x^n/n! .   (7)

As you can see, the cosine is an even function of x, hence it only has terms with even powers, x^{2n}, whereas the sine is odd and only has odd powers x^{2n+1}. By taking combinations of e^{iϕ} and e^{−iϕ} with plus and minus sign, respectively, in (5), we cancel the odd/even terms and are left with the cosine or sine, respectively.
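If you want to see the cancellation happen explicitly, the following one-liner (a quick check in Mathematica, the language used for the plots later in these notes) expands both sides of Euler's formula as Taylor series and confirms that all terms cancel:

Series[Exp[I phi] - (Cos[phi] + I Sin[phi]), {phi, 0, 12}]
(* -> O[phi]^13: every Taylor coefficient of the difference vanishes *)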
Because of Euler's formula, we see that the cosine is the real part of the wave while the sine is its imaginary part:

cos ϕ = Re(e^{iϕ}) ,
sin ϕ = Im(e^{iϕ}) .   (8)

Hence, we can always replace sinusoidal waves such as (3) by complex waves (4) and take the real or imaginary part as needed.
Having made these remarks on complex waves, we now go back to the physical meaning of the quantities involved in a wave. I want to make the following point about the amplitude A. When we have a wave of the type A sin x or A e^{ix}, the quantity that is of physical interest is not A itself, but |A|² ≡ AA*. Mathematically, the reason is that A could be made negative by simply changing our coordinate by x → −x in sin x, and obviously that shouldn't affect the amplitude, which is independent of the orientation of the coordinate. Also, in a complex wave e^{iϕ} we could shift ϕ → ϕ + α by some constant α and generate a complex phase e^{iα}, which would then become part of A. But that too is a change of variables (namely, choosing a different zero point for ϕ) which should not affect the amplitude. For these reasons, the physical quantities of interest depend on the absolute value (squared) of the amplitude, |A|², and not on A itself.

1.2 Enter the quantum

Experiments such as the two-slit experiment make clear that particles sometimes behave
as waves. This is true not only for light, but also for material particles such as electrons.

Indeed, in a flash of genius Louis de Broglie hypothesized that not only does electromagnetic radiation have a dual nature as waves and as particles (photons), but that matter too, usually believed to be of corpuscular nature, should possess wave-like properties.

For photons, Planck's 1900 hypothesis (clarified and generalized by Einstein's 1905 description of the photoelectric effect) had been that the energy is quantized in units of h (the Planck constant): E = hν = hc/λ. The quantity p = E/c = h/λ had been identified as the photon's momentum, which played an important role in the impressive Compton effect, where a single photon could hit an electron at rest and impart it non-zero momentum.
De Broglie now proposed that, analogously to the case of light, matter waves also have a frequency ν and a wavelength λ, given by:

ν = E/h ,   λ = h/p .   (9)
Indeed it is one of the great features of Planck's constant h that, by introducing a fundamental constant with units J·s, such relations between frequency and energy and between wavelength and momentum can be written down. For the same reasons for which one introduces the angular frequency and the wave number, we will often use the reduced Planck constant

ℏ = h/2π ,   (10)

which is usually simply referred to as 'Planck's constant'. In terms of these, the above read:

E = ℏω ,   p = ℏk .   (11)

Furthermore, remember the classical relation E = T + V. If we ignore the potential, we have, classically:

E = p²/2m = ℏ²k²/2m ,   (12)

where m is the mass of the electron. We will use this expression a lot in what follows.

Remark. The above formulas (11)-(12) also imply a relation between the frequency and the wavelength of the wave (viz. between the angular frequency and the wave number): ω(k) = ℏk²/2m. Such a relation is called a dispersion relation, because it determines the effect of dispersion of the wave as it travels through a medium (see footnote 1). The above dispersion relation holds for a matter particle. A photon, on the other hand, is massless and has the dispersion relation E = pc (see footnote 2).

Footnote 1: Namely, by giving the speed as a function of the wavelength.
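As a quick numerical illustration of (9) (my own example, using standard SI values for the constants): an electron accelerated through 100 V acquires a de Broglie wavelength of about an ångström, which is atomic-lattice scale and hence why electrons diffract off crystals.

(* de Broglie wavelength of an electron accelerated through 100 V; SI units *)
h = 6.626*10^-34; me = 9.109*10^-31; eV = 1.602*10^-19;
lambda = h/Sqrt[2 me (100 eV)]
(* -> about 1.23*10^-10 m, i.e. roughly 1.2 Angstrom *)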
The central question that Schrödinger asked, then, is as follows. We know that, at least semi-classically, electrons have energy given by (12). But they are also waves of the type:

Ψ(x, t) = A e^{i(kx−ωt)} .   (13)

So how can such a wave have energy (12)? The answer from optics would be: write down a wave equation! So, following de Broglie's lead, Schrödinger had the idea of writing down a wave equation that would reproduce (12). The equation in question is the following:

iℏ ∂Ψ/∂t = −(ℏ²/2m) ∂²Ψ/∂x² .   (14)

As you can see by filling (13) into this equation, (14) becomes:

iℏ(−iω) A e^{i(kx−ωt)} = −(ℏ²/2m)(−k²) A e^{i(kx−ωt)} ,   (15)

which holds if ω = ℏk²/2m, which is precisely the condition (12). So the expression for the energy does appear as a consequence of Schrödinger's wave equation (14): the wave equation forces on us a relation between the angular frequency ω and the wave number k, and this relation is nothing but the classical energy relation (12). This is nice: waves seem to describe electrons too!
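You can let Mathematica do the filling-in of (13) for you; this sketch simply confirms that the plane wave solves (14) exactly when the dispersion relation holds (hbar, m, k, w are kept symbolic):

Psi[x_, t_] := A Exp[I (k x - w t)];
Simplify[(I hbar D[Psi[x, t], t] == -hbar^2/(2 m) D[Psi[x, t], {x, 2}]) /.
  w -> hbar k^2/(2 m)]
(* -> True: the wave equation enforces exactly the energy relation (12) *)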
This is the one-dimensional Schrödinger equation for a free particle (free electron), i.e. one with zero potential energy, hence satisfying (12). Now if there is potential energy V around, (12) will change to:

E = p²/2m + V ,   (16)

again a result from classical mechanics. So it is natural to modify the Schrödinger equation as follows:

iℏ ∂Ψ/∂t = −(ℏ²/2m) ∂²Ψ/∂x² + V(x)Ψ(x) ≡ HΨ(x) .   (17)

Here, V(x) is the potential function, that is, the function from which (in classical mechanics) the force can be derived. And the Hamiltonian H is defined as:

H = −(ℏ²/2m) ∂²/∂x² + V(x) .   (18)
Footnote 2: Ironically, quantum theory originated from considerations of the wave vs. particle nature of photons, but the Schrödinger equation only describes matter particles such as electrons and protons which, because they are massive, travel at low speeds. Photons travel at the speed of light, and one needs to take relativistic effects into account in order to describe them quantum mechanically, which in turn means that one needs to generalize the Schrödinger equation and replace it by an appropriate equation incorporating special relativity. The reason the Schrödinger equation works for massive particles such as the electron is that it is based on the non-relativistic limit of the energy (12), which can be conveniently 'quantized', as I show below.

The Hamiltonian is an operator (a kind of derivative) that represents energy in quantum mechanics. If you have studied the Hamiltonian formalism in classical mechanics, you might remember that classically the Hamiltonian is the sum of the kinetic and the potential energy functions, T + V. The Hamiltonian we just defined is not a function, but an operator: just like the derivative ∂/∂x (also an operator), it acts on functions: Hψ(x) ≡ −(ℏ²/2m) ∂²ψ/∂x² + V(x)ψ(x). That is, given a (wave) function ψ(x), an operator assigns to it a new wave function Hψ(x), defined by the equation above. In quantum mechanics, classical quantities such as the energy are replaced by operators. A quantum system may not always have a definite energy, but the energy operator can always be defined. We will see later on under what conditions we reproduce classical formulas like (16).

The above is not a derivation of the Schrödinger equation. All I have done is motivate that, given classical formulas such as (12), and given the assumption that particles are also waves like (13), we can write down a formula that reproduces (12) and hence encompasses both principles. But I have not derived it. Then, based on classical mechanics, I made it plausible that (17) is the right generalization to the case of non-vanishing potential. Much of what we will do in this course is solving the Schrödinger equation (17) for various forms of the potential V(x), and finding out precisely under what conditions we can make predictions of the type (16).

1.3 The wave function

We will now look in more detail at the interpretation of the wave function. Before we proceed, let me remark that the wave function Ψ is necessarily a complex quantity. We cannot stick to its real or to its imaginary part (as we would do with, for instance, classical electromagnetic waves) because the time evolution will normally develop an imaginary part, even if we start with a purely real wave function. The reason is in the form of the Schrödinger equation (17). Taking the complex conjugate of this equation, we get a different equation:

−iℏ ∂Ψ*/∂t = −(ℏ²/2m) ∂²Ψ*/∂x² + VΨ* .   (19)

Notice the difference in sign on the left-hand side compared to (17). Thus, Ψ and Ψ* satisfy different equations, which means that the real and the imaginary part of the wave function contain different information. Roughly speaking, Ψ* is the mirror image of Ψ under t → −t, and if Ψ propagates 'forward' in time, we can say that Ψ* propagates 'backward'. In other words, it is not enough to look for real solutions of the Schrödinger equation, as we would be losing essential information.

Interpretation. Schrödinger originally thought (although he was careful enough not to write this in his paper) that the absolute value squared of the wave function, |Ψ|² = ΨΨ*, could be interpreted as some kind of charge density or particle density distributed over space. This was analogous to the classical theory of light, where the intensity of the light is proportional to the square of the field. As it turns out, this interpretation does not fit well with quantum mechanics. Consider doing the double-slit experiment with a single particle. Repeating the experiment many times, the pattern that appears is described by the distribution |Ψ(x, t)|². Since every time we do the experiment there is just one particle in the set-up, this means that the wave function is not the average particle density in a single experiment, but rather an average over many experiments. This is usually called an ensemble average. It is Max Born who is responsible for the interpretation of |Ψ|² as a probability density: |Ψ(x, t)|² dx is the probability to find, upon detection, a particle within a small neighbourhood (x, x + dx) of x at time t (see footnote 3). Notice the insistence on detection: this is necessary because before we measure the particle we cannot make any assumptions about its location. In the double slit experiment that we discussed earlier, we cannot say which slit the particle went through unless we measure it. This is another reason why |Ψ(x, t)|² cannot be interpreted as a 'particle density': the particles cannot be localized until we measure them.
Let me now further motivate why the probability grows like the square of the wave function, instead of the wave function itself. I have already mentioned that, for waves, only amplitudes such as |Ψ|, and not Ψ itself, can have a physical meaning. This is because Ψ can become negative, even complex (something we don't want for a probability). Now we must also explain why |Ψ|², and not |Ψ|, is the relevant quantity. This follows from the wavelike nature of the wave function, and I will motivate it with four examples:

1. The electromagnetic field. The intensity of radiation (the intensity of the radiation emitted by, say, an antenna, or the amount of light emitted by a light bulb, both manifestations of the electromagnetic field) grows proportionally to the square of the electromagnetic field E, not linearly with the field (see footnote 4). And it is the electromagnetic field which satisfies the differential equations of electromagnetism (called Maxwell's equations), something analogous to our Schrödinger equation. This intensity of light is proportional to the number of photons with the given frequency (and, hence, to the density of these photons). It makes sense that the probability for a particle to have a given frequency be the quantum mechanical quantity corresponding to the number of particles with a given frequency in classical electromagnetism.

2. The harmonic oscillator. We have seen that the Schrödinger equation could be interpreted as reproducing the relation between the energy and the momentum (12) (for the case of a simple wave like (13)). We extended it to (17) in the case of non-zero potential. So let us actually compute this energy for a familiar classical system with non-zero potential, the harmonic oscillator. The potential is V = (1/2)kx², and the energy function is H = (1/2)mẋ² + (1/2)kx². Solving Newton's equation gives x = A sin(ωt + φ); plugging this back into the energy function and using ω = √(k/m), we get the familiar result E = (1/2)kA², indeed the square of the amplitude. So, again, classical energies behave like squares of amplitudes.

Footnote 3: It is incorrect to say that |ψ(x)|² is the probability to find the particle at point x. The probability to find a particle exactly at point x is in fact zero. On the other hand, the probability to find the particle in the neighborhood (x, x + dx) is infinitesimal and non-zero. The probability to find a particle in a finite interval (a, b) is P(a, b) = ∫_a^b |ψ(x)|² dx. From the latter formula we see that, for any point x, P(a, x) = ∫_a^x |ψ(y)|² dy (notice the different name for the dummy variable y), from which it follows that ∂P/∂x = |ψ(x)|². Hence |ψ(x)|² itself can be interpreted as the spatial rate of change, or gradient, of the probability at point x.

Footnote 4: The intensity of radiation is defined as the amount of energy emitted per unit time per unit area, that is, the power transported across a given unit area perpendicular to the flow. It is measured in units of W/m².

3. Conservation of probability. Taking P ∝ |Ψ|², one can prove a 'probability conservation theorem' completely analogous to the charge conservation theorem in electrodynamics (if charge disappears, it must be taken away by a current, i.e. the time derivative of the charge density is minus the divergence of the current, ∂ρ/∂t = −∇·J). The analogous result for probabilities involves the time derivative of the probability (see problem 1.14 of [1]). This only works if one takes the square of the amplitude to be the probability.

4. Interference. If, in view of these arguments, we accept the identification of the probability distribution with the square of the wave function, we still have to explain how interference appears. What I will show now is that interference appears when we add up the wave functions, ψ = ψ1 + ψ2, rather than the probabilities. Consider again the two-slit experiment, where we now close one of the slits. Call ψ1(x) the wave function at point x on the screen when only one slit is open (the left one, say) and ψ2(x) when only the other is open. According to the superposition principle of waves, when both slits are open the wave function is ψ(x) = ψ1(x) + ψ2(x). Now because P ∝ |ψ1(x) + ψ2(x)|², this is different from the sum of the probabilities. There is a cross term in the square, and this cross term is responsible for interference. You can easily simulate this yourself by writing the following Mathematica program:

a = 0.5; b = 5;
psi1[x_] := Exp[-(x + a)^2/2];
psi2[x_] := Exp[-(x - a)^2/2];
psi[x_] := psi1[x] - psi2[x]; (* the relative minus sign is a choice of relative phase *)
Plot[psi1[x], {x, -b, b}]
Plot[psi2[x], {x, -b, b}]
Plot[psi[x]^2, {x, -b, b}] (* both slits open: the cross term survives in the square *)
Plot[psi1[x]^2 + psi2[x]^2, {x, -b, b}] (* classical sum of probabilities: no interference *)
The outcome of the program is depicted in Figure 1. As you see from the picture, an interference pattern appears when we add up the wave functions rather than the probabilities themselves. The result of adding up the probabilities is in the last picture: you see that in that case the interference completely disappears. Hence, it is important that we add the wave functions, ψ = ψ1 + ψ2 (corresponding to P ∝ |ψ1 + ψ2|²), rather than the probabilities (P ≠ P1 + P2).
Hopefully the above arguments have convinced you that we should interpret |Ψ(x, t)|² as the probability of finding the particle within a region (x, x + dx) at time t. These arguments of course do not provide a derivation; they only make the assertion plausible. As when we set up the Schrödinger equation, we are now constructing a new theory, and we cannot give derivations from first principles. Otherwise this would be no new theory at all! So we write down tentative equations and try to uncover the fundamental principles. In the end, of course, we will know that the structure is right because it works: it solves many problems that would otherwise be intractable, and it agrees with the results of experiment. Both things tell us that quantum mechanics is right!


Figure 1: Two-slit experiment (from left to right, and from top to bottom): 1) with only the left slit open; 2) with only the right slit open; 3) interference pattern when both slits are open; 4) classical result one obtains by simply adding up the probabilities.

2 Four Steps to Solve the Schrödinger Equation

In this section I will summarize and further explain the steps you have to take to solve
the time-dependent Schrödinger equation as outlined in [1].

2.1 The Problem

The basic problem is to solve the time-dependent Schrödinger equation

iℏ ∂Ψ(x, t)/∂t = H Ψ(x, t) ,   H ≡ −(ℏ²/2m) ∂²/∂x² + V(x) ,   (20)

given a known initial wave function (also called the 'initial condition') at time t = 0 (see footnote 5):

Ψ(x, 0) = Ψ0(x) ,   (21)

where Ψ0(x) is a given function. I make the distinction between Ψ and Ψ0 because Ψ(x, t) is the function we want to solve for (we only know it at one particular point in time, t = 0), whereas Ψ0 is a given function of x. Griffiths doesn't use Ψ0, only Ψ(x, 0).

Footnote 5: One may as well choose the initial condition at any other initial time t = t_in.

The solution to the above problem consists in a four-step procedure.

2.2 Step 1: Reduce to TISE

Look for special solutions of the type

Ψ(x, t) = ϕ(t) ψ(x) .   (22)
These solutions are called separable. Filling this ansatz into the Schrödinger equation, we completely solve for ϕ and find another equation for ψ:

ϕ(t) = e^{−iEt/ℏ} ,   (23)

H ψ(x) = E ψ(x)  ⇒  −(ℏ²/2m) d²ψ(x)/dx² + V(x) ψ(x) = E ψ(x) .   (24)

Notice that I write partial derivatives ∂/∂t for functions such as Ψ(x, t) that depend on several variables, and regular derivatives d/dx for functions ψ(x) that depend on a single variable.

Equation (24) is called the time-independent Schrödinger equation. Filling the ansatz (22) into the time-dependent equation, we were able to solve for ϕ in (23). This gave us an integration constant E, which reappears in (24) and can be interpreted as an energy (though not 'the energy of the system', a concept which may not always be well defined). Solving (24) will give us ψ(x) as well as the allowed values of the energy E, which is called the spectrum of the Hamiltonian H. The spectrum depends on the potential V(x). In general, (24) has many solutions (usually an infinite number of them). For instance, ψ(x) = sin nx for positive integers n = 1, 2, 3, ... In this case, the spectrum is called discrete (as there are discrete energy levels labeled by n = 1, 2, 3, ...). Accordingly, we label both the energies and the wave functions by n in this case:

H ψn(x) = En ψn(x) .   (25)

The main problem now is to solve this equation, imposing the appropriate boundary conditions (this will be dealt with in chapter 3 for different types of potentials). For instance, for the infinite square well it turns out that En = n²π²ℏ²/(2ma²).
2ma2
If the spectrum is continuous rather than discrete, the energies are labeled by a continuous variable that we usually call k, the wave number. For instance, for the free particle: E_k = ℏ²k²/2m, and the wave functions are ψk(x) = e^{ikx}. In this case, k is the wave number, with k > 0 for right-moving and k < 0 for left-moving waves. Remember that the wave number relates to the momentum of the wave by the de Broglie formula (11). In this case, the energy is given by (12).

To summarize this step, we have reduced the time-dependent Schrödinger equation (a partial differential equation in t and x) to the time-independent Schrödinger equation (a second-order differential equation in x). It is useful to think of (25) as an eigenvalue problem. H is an operator (an object which, like a matrix, acts on a vector space) and the ψn are the eigenfunctions, with corresponding eigenvalues En. Solving this eigenvalue problem gives us, as in linear algebra, both the eigenvalues and the eigenfunctions. We will develop this point of view further when we discuss the formalism.

2.3 Step 2: General Solution of TDSE

Once you have the solutions ψn(x) and En of the TISE (25), you can move on to get the general solution of the TDSE (20). It is given by:

Ψ(x, t) = Σ_n cn e^{−iEn t/ℏ} ψn(x) .   (26)

It is straightforward to show that, if the individual solutions (22) solve (20), then a superposition of them like (26) is also a solution. The reason is that (20) is a linear equation, and its solutions obey the superposition principle: if we have a set of solutions (22), we can add them together with arbitrary coefficients cn and the result is also a solution. But the key point to understand here is that (26) in fact gives us the most general solution of the TDSE (20). Indeed, whereas the separation of variables (22) was an ansatz (an assumption; in other words, we only obtained a specific solution), one can show that any solution of the TDSE is in fact of the form (26).

To show this in general is a little tricky, because it actually depends on the form of the potential function V(x). But the idea, which we will later on substantiate for specific potentials V(x), is as follows. (20) is a first-order partial differential equation in t and x which we are solving in a two-step procedure of first solving for t and then for x. So we regard this equation as an ordinary, linear differential equation in t (i.e. we treat x as a constant). Since the equation is first order (only one t derivative), it depends on a single 'integration constant', which is our initial function Ψ0(x) (which I have not yet specified; therefore, for the time being Ψ0 is a generic function). The theory of first-order differential equations tells us that, if I can choose the cn in (26) such that Ψ(x, t) satisfies the initial condition Ψ0(x), then (26) is the unique solution associated with that Ψ0(x). Remember, a first-order linear differential equation has only one integration constant/boundary condition (in this case, Ψ0). Since Ψ0(x) was generic to start with, this is the same as saying that (26) is the most general solution (i.e. for any boundary condition) of the Schrödinger equation regarded as a first-order differential equation in t.
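The superposition principle itself is easy to check symbolically; the following sketch (free particle, V = 0, with hbar, m, c1, c2, k1, k2 kept symbolic) verifies that a sum of two separable solutions again solves (20):

psi[k_][x_, t_] := Exp[I (k x - hbar k^2/(2 m) t)];
Psi[x_, t_] := c1 psi[k1][x, t] + c2 psi[k2][x, t];
FullSimplify[I hbar D[Psi[x, t], t] + hbar^2/(2 m) D[Psi[x, t], {x, 2}]]
(* -> 0: the superposition solves the free TDSE for any c1, c2 *)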
In the continuous case, we replace n → k, cn → (1/√(2π)) φ(k), and the sum becomes an integral:

Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{−iE_k t/ℏ} ψk(x) .   (27)

For the simple case of a free particle, ψk(x) = e^{ikx} and this becomes:

Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{i(kx − ℏk²t/2m)} .   (28)

2.4 Step 3: Impose Initial Condition Ψ0 (x)


In this step, we choose the coefficients cn in (26) such that our initial condition (21) is satisfied. If we can do this, then we are left with the unique solution of the Schrödinger equation. Physically, what we are doing is imposing that the most general solution of the Schrödinger equation Ψ(x, t) agrees with our experimental situation at time t = 0. We have prepared our system (e.g. a system of electrons with spin up) in a particular state Ψ0(x) at time t = 0. Given this particular state, Ψ(x, t) tells us, via the Schrödinger equation, how the system evolves in time. For instance, H might contain an interaction between the spins and an external magnetic field, and so some of the spin states of the electrons may change in time. So, H encodes the dynamics of the system.

All we have to do is set t = 0 in (26) or (27) for the discrete/continuous case, respectively. We get:

Ψ0(x) = Σ_n cn ψn(x) ,   (discrete spectrum)
Ψ0(x) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) ψk(x) .   (continuous spectrum)   (29)

So in this step, the task is to find cn or φ(k) such that (29) is satisfied. It is here where we have to use the specific knowledge about ψn and ψk. Once we have done this, we have shown that (26)-(27) satisfy the given boundary condition.
Before I show how to do this in practice, we must reflect on the following question: does (29) always have a solution? Can we always be sure that this is somehow the case? Can I always find cn and φ(k) to match Ψ0(x) on the l.h.s.? This is the same as asking whether Ψ(x, t) was in fact the most general solution of the TDSE, and I already anticipated that the answer was yes. Now we see how this works: indeed, the reason I can always find cn and φ(k) to solve (29) is that the ψn(x) and ψk(x) form what is called a complete set of functions. This means that any (nice enough) function Ψ0(x) can be written as a linear superposition of them, precisely in the form (29). Mathematicians have shown this (see footnote 6), and we will see it in some specific cases as well during the course.

Remember that to solve (29) explicitly we need to know ψn(x) and ψk(x), which means we have made a choice of potential function V(x) and solved (24). A very useful example to work out is the free particle, where ψk(x) = e^{ikx}. Equation (29) then becomes the Fourier decomposition of Ψ0(x):

Ψ0(x) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{ikx} .   (30)

We say that φ(k) is the Fourier transform of Ψ0(x). Fourier theory now tells us, via Plancherel's theorem, how to find φ(k) for given Ψ0(x):

Plancherel's Theorem. Let f(x) and F(k) be two square-integrable functions over the real line, i.e. the integrals ∫_{−∞}^{∞} dx |f(x)|² and ∫_{−∞}^{∞} dk |F(k)|² are finite. Then the following holds:

f(x) = (1/√(2π)) ∫_{−∞}^{∞} dk F(k) e^{ikx}   ⇔   F(k) = (1/√(2π)) ∫_{−∞}^{∞} dx f(x) e^{−ikx} .   (31)

This theorem precisely allows us to solve for φ(k) in (30): assuming that Ψ0 and φ(k) are square integrable (in fact, since |Ψ0(x)|² is a probability density, it should integrate to one!), we see that (30) is of the form of the l.h.s. of (31) if we identify f(x) = Ψ0(x) and φ(k) = F(k). Therefore Plancherel's theorem tells us that the r.h.s. is true:

φ(k) = (1/√(2π)) ∫_{−∞}^{∞} dx Ψ0(x) e^{−ikx} .   (32)

Footnote 6: This is valid under certain conditions for the potential V(x), which we will normally assume to be polynomial.

This is how step 3 looks for a free particle. Given any boundary condition Ψ0(x) at t = 0, we find the unique infinite set of coefficients φ(k) that solves the boundary condition (29).
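For a concrete instance, here is a sketch that computes φ(k) symbolically for the Gaussian initial condition used in section 2.6 below (the 1/√(2π) implements the convention of (32); the assumption a > 0 is needed for convergence):

Psi0[x_] := (2 a/Pi)^(1/4) Exp[-a x^2];
phi[k_] = Integrate[Psi0[x] Exp[-I k x], {x, -Infinity, Infinity},
    Assumptions -> a > 0 && Element[k, Reals]]/Sqrt[2 Pi]
(* -> equivalent to (1/(2 Pi a))^(1/4) Exp[-k^2/(4 a)], i.e. formula (41) below *)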

2.5 Step 4: Plug Back into Ψ(x, t)


In the previous step we found the coefficients cn or φ(k) such that the boundary condition (29) is satisfied. We now have to plug this back into the general solution of the Schrödinger equation, (26) or (27), to get the full solution. Of course, to obtain a completely explicit solution one has to carry out the summation or the integral.

Don't take the above four-step procedure as if it were written in stone. Sometimes you may simply be interested in the wave functions ψn(x) and will not bother to calculate the coefficients cn; sometimes step 4 is trivial because only a few terms contribute; or sometimes you will jump directly to step 3 because you already know the wave functions.

2.6 Example: Gaussian Wave Function

We will now apply this procedure to the case of a free particle, which means V (x) = 0.
We will take as our initial wave function a Gaussian distribution:

Ψ0(x) = (2a/π)^{1/4} e^{−ax²} .   (33)

Some comments about the physics before we start. This wave function represents a particle localized around x = 0 with standard deviation σ = 1/(2√a). So when a is large, the particle is well localized around x = 0, whereas if a → 0, the probability distribution flattens out and the particle becomes more and more delocalized in space. Using the basic Gaussian integral (85), we check that the wave function is indeed normalized to one: ∫_{−∞}^{∞} dx |Ψ0(x)|² = 1.

Step 1. The TISE (24) reduces to:

−(ℏ²/2m) d²ψ(x)/dx² = E ψ(x) .   (34)

This equation is readily solved. The solutions are sinusoidal. We can write them as ψ(x) = A sin kx + B cos kx, or alternatively as complex exponentials, as we saw in section 1.1. We will take exponentials:

ψk(x) = e^{ikx} ,   k = ±√(2mE_k)/ℏ  ⇔  E_k = ℏ²k²/2m .   (35)

The second line follows from filling the wave function into (34).

Step 2. We fill this into the general solution of the Schrödinger equation, which gives us the formula I wrote earlier:

Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{i(kx − ℏk²t/2m)} .   (36)

Here, we took both the k > 0 and the k < 0 solutions into account (see footnote 7) by integrating from minus infinity to plus infinity with arbitrary coefficients φ(k).

Step 3. Impose initial condition Ψ0(x). We use Plancherel's theorem directly for the initial wave function at hand:

φ(k) = (1/√(2π)) ∫_{−∞}^{∞} dx Ψ0(x) e^{−ikx} = (1/√(2π)) ∫_{−∞}^{∞} dx (2a/π)^{1/4} e^{−ax²} e^{−ikx}
     = (a/(2π³))^{1/4} ∫_{−∞}^{∞} dx e^{−ax² − ikx} .   (37)

The integral can once again be done using the basic Gaussian integral (85). To bring it to this form, we need to cancel the −ikx in the exponent. This can be done by a change of variables. One readily checks that the following does the job (see footnote 8):

x = y − ik/(2a) .   (38)

One way to see why this trick works is to rewrite the term in the exponent as follows:

−ax² − ikx = −ax(x + ik/a) ≡ −a(y² + C) .   (39)

We impose the last equality because we want to get a Gaussian integral, up to a constant C but with no linear term in x. From this form it is more or less obvious that we can complete the square as follows:

−ax(x + ik/a) = −a(x + ik/2a − ik/2a)(x + ik/2a + ik/2a) = −a(y − ik/2a)(y + ik/2a)
             = −a(y² − (ik/2a)²) = −a(y² + k²/4a²) = −ay² − k²/4a .   (40)

In other words, to find the trick (38) I just add a constant to x such that in the end I get (y − c)(y + c) = y² − c². This way of reducing integrals of exponentials containing a Gaussian piece to a pure Gaussian is an important trick. We also have dx = dy, and the integration range is the same. Hence:

φ(k) = (a/(2π³))^{1/4} ∫_{−∞}^{∞} dx e^{−ax² − ikx} = (a/(2π³))^{1/4} ∫_{−∞}^{∞} dy e^{−ay² − k²/4a} = (a/(2π³))^{1/4} e^{−k²/4a} √(π/a)

⇒ φ(k) = (1/(2πa))^{1/4} e^{−k²/4a} .   (41)

To summarize the trick, we have derived the following generalized Gaussian formula:

∫_{−∞}^{∞} dx e^{−αx² − iβx} = √(π/α) e^{−β²/(4α)} .   (42)

Footnote 7: As in the sinusoidal representation, the general solution for given k is actually ψk(x) = Ae^{ikx} + Be^{−ikx}. However, as mentioned in the text, the negative solution is automatically taken into account by the fact that we integrate over both positive and negative k. It is also unnecessary to include the normalization constant A, as overall normalizations are taken care of by φ(k), with which this wave function is multiplied (in other words, A can always be reabsorbed in φ(k)).

Footnote 8: One should worry about the fact that this change of variables involves an imaginary shift of the integration variable, by −ik/(2a). However, for real and positive a the integral is everywhere finite, and it follows from complex analysis that this change of variables can be done.
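As a quick consistency check of (42), a sketch using Mathematica's symbolic integrator, which reproduces the formula under the stated assumptions:

Integrate[Exp[-alpha x^2 - I beta x], {x, -Infinity, Infinity},
  Assumptions -> alpha > 0 && Element[beta, Reals]]
(* -> Sqrt[Pi/alpha] Exp[-beta^2/(4 alpha)], which is formula (42) *)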

Step 4. Plug back into Ψ(x, t). Now we can plug this back into the solution of the TDSE (28):

Ψ(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dk φ(k) e^{i(kx − ℏk²t/2m)} = (1/(8aπ³))^{1/4} ∫_{−∞}^{∞} dk e^{−k²/4a} e^{i(kx − ℏk²t/2m)}
        = (1/(8aπ³))^{1/4} ∫_{−∞}^{∞} dk e^{ikx − k²(1/4a + iℏt/2m)} .   (43)

This last integral is of the same type as the one before: it is a generalized Gaussian integral. The only difference is that we now integrate over k instead of x, but we can apply formula (42) all the same, since x and k are dummy variables. We get (see exercise 3.2):

Ψ(x, t) = (2a/π)^{1/4} (1/√(1 + 2aiℏt/m)) e^{−ax²/(1 + 2aiℏt/m)} .   (44)

This is the final result for the wave function for the given boundary condition (33). As you see, it is a completely explicit function of time. For its physical interpretation, see exercise 3.2.
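A quick numerical sanity check of (44) (a sketch with illustrative values a = ℏ = m = 1): the total probability stays equal to one at all times, while plotting Abs[Psi[x, t]]^2 for increasing t shows the packet spreading out.

a = 1; hbar = 1; m = 1;
Psi[x_, t_] := (2 a/Pi)^(1/4) Exp[-a x^2/(1 + 2 a I hbar t/m)]/
   Sqrt[1 + 2 a I hbar t/m];
Table[NIntegrate[Abs[Psi[x, t]]^2, {x, -Infinity, Infinity}], {t, {0, 1, 5}}]
(* -> {1., 1., 1.}: the normalization is preserved under time evolution *)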

3 One-Dimensional Potentials

3.1 General Theorems

There are some useful theorems that save you work when solving the time-dependent Schrödinger equation with one-dimensional potentials V(x). These theorems are presented in problems 2.1 and 2.2 of [1]. I summarize them here.

1) The assumption that (whenever this is possible) solutions of the Schrödinger equation should be normalizable implies the following:
a) E ∈ ℝ. That is, we can always take the energy to be real.
b) E > Vmin. The energy is bounded from below.

Remark. Normalizable solutions do not always exist. For instance, plane waves ψ(x) = e^{ikx} are not normalizable on the range x ∈ (−∞, ∞), because ∫_{−∞}^{∞} dx |ψ(x)|² = ∫_{−∞}^{∞} dx = ∞. But we use normalizable solutions whenever they exist. Whenever a non-normalizable solution is used, one must have a physical motivation for doing so. For instance, in the case of plane waves, it is not surprising that they are non-normalizable, because they fill all of space. We regard them as approximations to more realistic situations that can be built by superpositions of plane waves. Superposing plane waves we can get a normalizable solution, as in (28). Remember that we always only need to normalize the total wave function Ψ(x, t). In this case, plane waves are like 'fundamental building blocks' of more physical solutions of the TDSE, and as such they are useful.

2) In the TISE, ψ(x) can always be taken to be real. A complex solution of the TISE is always a linear combination of real solutions.

3) If the potential is even, V(x) = V(−x), then the even and odd solutions of the TISE can be analyzed separately. In this case, any solution ψ(x) of the TISE can be written as a linear combination of an even and an odd solution. This is useful when we consider bound states, because it means that we can solve the Schrödinger equation on one side of the potential (x > 0, say) and we automatically obtain the other side by use of the symmetry. On the other hand, we often cannot apply this to scattering states, because there is an explicit breaking of the symmetry by the boundary conditions (which are generically different at plus and minus infinity). See the next section.

3.2 The TISE with one-dimensional potentials

As we saw in chapter 2, the first step in solving the TDSE is reducing it to the TISE. Here we concentrate on this step. There are a few things to remember:

• Normalizability imposes E > Vmin (see 1b above).

Further, one distinguishes bound states and scattering states, because their physical behavior is completely different and they give independent contributions to the final wave function Ψ(x, t). Also, we solve the Schrödinger equation separately in each region in which the potential is a continuous, differentiable function (a smooth function).

• Bound states have the following generic behavior, which you should check:
  ✓ Exponential behavior (damped/exponential growth) in the exterior regions.
  ✓ The behavior in the interior regions (if there are any) depends on the details of the potential. It can be oscillatory (sine, cosine or plane wave) or exponential. See the example in Figure 2.
  ✓ Symmetric potential: in that case, it is useful to separate even/odd solutions, which leads to sines and cosines (or hyperbolic sines and cosines) in the interior region. Use exponentials in the exterior regions.
  ✓ Discrete spectrum: in that case, the values of E are limited (sometimes only a handful of solutions, or none, or an infinite number of them).
  ✓ Solve the energy equation graphically to obtain the spectrum (see the sketch at the end of this section).

Figure 2: Potential symmetric around x = 0, with narrow spikes at x = a and x = −a.

• Scattering states have the following generic properties:
  ✓ Oscillatory behavior. Use plane waves.
  ✓ Transmission and reflection coefficients are interesting quantities.
  ✓ There is an interesting physical interpretation in terms of waves with which one can probe a potential.
  ✓ Algebraic manipulations can be heavy if one has to match different regions.

• Impose (dis-)continuity conditions at each point at which the potential is non-differentiable (it has a kink or it is singular at those points):
  ✓ If the potential is everywhere finite, then ψ and dψ/dx are continuous. Such a potential can be non-smooth, though.
  ✓ If the potential has singularities (points where it is infinite), then ψ is continuous but dψ/dx will have discontinuities at the singular points. The discontinuities are known and can be obtained by integrating the Schrödinger equation around the singular points, as in the steps leading to [2.125]. Obviously, an infinite potential is only an idealization of a very large one.
  ✓ If the potential is infinite in a whole region (more than a single point), the wave function must vanish there. For example: the infinite potential well.
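To make the graphical-solution item concrete, here is a sketch for the even bound states of a finite square well (the transcendental energy equation tan z = √((z0/z)² − 1), with z = la and z0 = (a/ℏ)√(2mV0), is the one derived in [1] for that potential; the value z0 = 8 is an arbitrary illustrative choice):

z0 = 8; (* arbitrary illustrative well strength *)
Plot[{Tan[z], Sqrt[(z0/z)^2 - 1]}, {z, 0, z0},
 PlotRange -> {0, 10}, Exclusions -> Cos[z] == 0]
(* each intersection of the two curves is an allowed z, i.e. a bound-state energy *)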

4 Fourier Integrals and the Dirac Delta

4.1 The Dirac delta

Consider the following Gaussian function with unit area:

f(x) = (1/√(πε)) e^{−(x−x0)²/ε} .   (45)

Using (85), one easily sees that ∫_{−∞}^{∞} dx f(x) = 1. The standard deviation of this Gaussian is σ = √(ε/2) (see Exercise 1.3b) of [1]), so clearly ε determines the width of the distribution, and f(x0) = 1/√(πε) its height.
Consider what happens when we make ε smaller and smaller; the peak clearly becomes higher and higher (in order for the function to preserve its area 1) and the distribution becomes narrower and narrower. In the limit, the height of the distribution is infinite and its width goes to zero, but the area stays equal to one. We can picture what happens in the limit as follows. Replace the Gaussian by a square of width √ε and height 1/√ε, such that the total area under the square is √ε × (1/√ε) = 1, i.e. a finite limit. Now since in the limit the height of the distribution is infinite but the width goes to zero, we can think of a function δ(x) that takes the following values:

δ(x − x0) = 0 for x ≠ x0 ,  δ(x − x0) = ∞ for x = x0 ,   with   ∫_{−∞}^{∞} dx δ(x − x0) = 1 .   (46)

This is called the Dirac delta distribution. It is called a distribution because it is not a proper mathematical function (it has an explicit divergence at one point; this divergence, however, is mild enough that it can be treated in the more general theory of distributions, which goes beyond the scope of these notes). Our treatment here has been mathematically heuristic, but sufficient. We argued that this function is the following limit:

δ(x − x0) = lim_{ε→0} (1/√(πε)) e^{−(x−x0)²/ε} .   (47)

To check the integration property (46), we first perform the integral and then take the limit ε → 0. It now also follows that for any function f(x):

∫_{−∞}^{∞} dx δ(x − x0) f(x) = f(x0) .   (48)

From the fact that the Dirac delta vanishes everywhere except at x = x0, we have δ(x − x0) f(x) = δ(x − x0) f(x0) for all x. Now we can take f(x0) outside the integral, and evaluate the integral using (46). The result is (48).
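You can watch the sifting property (48) emerge numerically; this sketch integrates the Gaussian (45) (centered at x0 = 1, an arbitrary choice) against f(x) = cos x for shrinking ε:

Table[NIntegrate[Exp[-(x - 1)^2/eps]/Sqrt[Pi eps] Cos[x],
  {x, -Infinity, Infinity}], {eps, {1, 0.1, 0.01}}]
(* -> roughly {0.421, 0.527, 0.539}, approaching Cos[1] = 0.5403... as eps -> 0 *)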
The Dirac delta is paramount to many applications in physics, where we want to mimic situations with 'spikes'. If we identify ε with time t, it turns out that (45) satisfies the heat equation. This Gaussian, for instance, describes heating up an aluminium bar. Initially, at t = 0, the temperature is very high at the point where the material is being heated (x = x0), but as the material goes to thermal equilibrium the temperature quickly drops, and the temperature profile looks like a blob centered around that point but spreading in time, as heat exits the center. See Figure 3.

Figure 3: Heat distribution, all energy concentrated at x = 0 at t = 0, spreading in space as time increases according to a Gaussian distribution.
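To check the claim about the heat equation (a sketch: with the identification ε = t, the Gaussian (45) solves ∂f/∂t = D ∂²f/∂x² with diffusion constant D = 1/4, that value being an artifact of this parametrization):

f[x_, t_] := Exp[-(x - x0)^2/t]/Sqrt[Pi t];
FullSimplify[D[f[x, t], t] - 1/4 D[f[x, t], {x, 2}]]
(* -> 0: the spreading Gaussian solves the heat equation *)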

4.2 Fourier transformations

Plancherel's theorem (31) tells us that, for every square-integrable function f(x), there is a unique square-integrable function F(k) from which f(x) can be reconstructed. F(k) is called the Fourier transform of f(x). Its physical significance is that it gives an expansion of the function f(x) in terms of plane waves e^{ikx} with a well-defined wavelength λ = 2π/k. Thus, f(x) is written as a superposition of (an infinite number of) monochromatic waves, one for each frequency. Fourier analysis is used in many branches of physics. Decomposing a sound signal into a spectrum of frequencies of sound is an example of Fourier analysis. Filling the second of (31) into the first, we find the following relation:

f(x) = (1/2π) ∫_{−∞}^{∞} dk e^{ikx} ∫_{−∞}^{∞} dx′ f(x′) e^{−ikx′} = (1/2π) ∫_{−∞}^{∞} dk ∫_{−∞}^{∞} dx′ f(x′) e^{ik(x−x′)}
     = (1/2π) ∫_{−∞}^{∞} dx′ f(x′) ∫_{−∞}^{∞} dk e^{ik(x−x′)} .   (49)

The interchanges of the integrals are allowed because we were careful about distinguishing x from the dummy variable x′ that is integrated over (and anything to the right of the integral sign is integrated over).

Now comparing (49) with (48), we conclude that (see footnote 9)

δ(x − x′) = (1/2π) ∫_{−∞}^{∞} dk e^{ik(x−x′)} ,   (50)

where in comparing the two we have relabeled x ↔ x′, x0 ↔ x.

Let us evaluate this expression. When x = x′ the integrand is one, and the integral diverges. When x ≠ x′, we can evaluate the integral explicitly:

δ(x − x′) = (1/2π) [ e^{ik(x−x′)} / (i(x − x′)) ]_{k=−∞}^{k=∞} = 0 .   (51)

Hence this agrees with our previous definition (46).


The expression (50) we have obtained for the Dirac delta is a very useful one. Comparing it to Plancherel's theorem (31), taking f(x − x′) = δ(x − x′), we find from the left-hand side of (31) that:

F(k) = 1/√(2π) .   (52)

In other words, the Dirac delta is the Fourier transform of a constant! This agrees with our intuition about position vs. momentum space: a function that is completely localized at some point in space corresponds, under Fourier analysis, to a constant function, i.e. a superposition of all wavelengths, all with the same amplitude.

Footnote 9: This equation certainly makes (49) and (48) compatible. But is this the unique solution? In fact, it is. Assume that the two sides of (50) differed by some function g(x − x′). This function would have to be such that ∫_{−∞}^{∞} dx′ f(x′) g(x − x′) = 0 for all f ∈ L²(ℝ). But the only such function is g(x) = 0. Hence the solution (50) is unique (even if, for the reasons explained, the Dirac delta is not a proper function).

5 The Formalism of Quantum Mechanics

5.1 Why?

If you have read a text on the formalism of quantum mechanics, you might have been left with the question: how is this useful? Why study it?

There are some important reasons why it pays off to learn quantum mechanics in this high-brow language of Hilbert spaces:

1) For the simple case of finite Hilbert spaces, the formalism simply reduces to the algebra of matrices (diagonalization of matrices, etc.).
2) In general, the formalism allows for a unified conceptual treatment of continuous and discrete cases by means of linear algebra.
3) The formalism is independent of the particular basis you choose. Describing your system in terms of its positions or in terms of momenta does not make any difference, just as you can describe a wave either by specifying its spatial profile or by specifying the elementary frequencies that the wave is built up from. It is a choice of basis, and the formalism shows that quantum mechanics does not depend on a choice of basis.
4) It is conceptually simple: it allows us to formulate the postulates of quantum mechanics in a clear and concise way, and in particular it allows for a natural introduction of the measurement postulate. Some approaches to quantum mechanics do not need a measurement postulate (or they claim they don't), but in the standard Copenhagen interpretation this is how the connection between theory and experiment is made.

5.2 The Postulates of Quantum Mechanics

Von Neumann gave an axiomatic formulation of quantum mechanics, similarly to what Einstein had done for special relativity. The great advantage of this approach is that, if you want to generalize the theory, all you have to do is modify one or several of its postulates. Similarly, if something turns out to be 'wrong' with the theory, the axiomatic structure makes it easier to trace the inconsistency back to one or several of the axioms. By explicitly including a projection postulate we are not only making it possible to interpret measurements, but also making explicit something that might turn out to be a weakness of the theory, rather than hiding it under the rug of the theoretical machinery. So here are the postulates (see footnote 10) [3]:

Postulate 1. State of the system:
An ensemble of physical systems is completely described by a wave function or state function. The wave function lives in a Hilbert space, and it may be multiplied by an arbitrary complex number without altering its physical significance.

Postulate 2. Observables:
Observables are represented by Hermitian (or self-adjoint) operators, Q̂ = Q̂†.
Justification: Hermiticity is the natural notion of 'reality' for operators. A Hermitian operator has real eigenvalues, hence real expectation values (see the next postulate).

Postulate 3. Measurement postulate:
The only result of a precise measurement of Q̂ is one of the eigenvalues qn of Q̂.
Justification: when we measure any quantity in the laboratory, we never observe an actual superposition (see footnote 11). We always find a unique value for the observed outcome of the experiment (the energy, say). Since the eigenvalues are well-defined real numbers associated with a Hermitian operator, they are good candidates for measurement outcomes. The probabilities that, upon measurement, we find qn, are given by |cn|².

Footnote 10: Different authors sometimes rank the postulates in a different order.

Footnote 11: Occasionally you will find a text, even in prestigious journals, where it is claimed that a superposition has been observed. Beware of such metaphorical phrasing! So far no superposition has been observed as an actual experimental outcome. We observe interference patterns after repeating an experiment many times, from which we infer that the state of the system was a superposition. The difference is subtle, but important. The detector either clicks, or it doesn't!

Postulate 4. Schrödinger postulate:
The time evolution of the system is governed by the Schrödinger equation.

Postulate 5. Projection postulate:
After a precise measurement of Q̂ with outcome qn, the system is (shortly after measurement) in the state ψ_qn.
The justification of this postulate is that repeated measurements of Q̂ should give the same result if the time interval between them is small. The only state with P(qn) = 1 is ψ_qn itself; therefore after measurement the system must be in the state ψ_qn.
This postulate introduces what we call the 'collapse of the wave function'. If prior to measurement the system is in the state:

|Ψ(t)⟩ = Σ_n cn(t) |ψ_qn⟩ ,   (53)

and after measurement it is in the state:

|Ψ⟩ = |ψ_qn⟩ ,   (54)

we see that we have projected onto one of the components of (53) and lost all of the information about the cn's prior to measurement. We can neither predict with certainty which state we will find upon measurement, nor can we, after measurement, use the Schrödinger equation to 'trace back' the state (53).
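As a toy illustration of Postulates 3 and 5 (my own sketch, not part of the notes' development): repeatedly measuring Q̂ on identically prepared copies of a two-state superposition yields the eigenvalues q1, q2 with relative frequencies |c1|², |c2|². The amplitudes below are hypothetical, chosen only to be normalized.

{c1, c2} = {1/Sqrt[3], I Sqrt[2/3]}; (* hypothetical amplitudes, |c1|^2 + |c2|^2 = 1 *)
outcomes = RandomChoice[{Abs[c1]^2, Abs[c2]^2} -> {"q1", "q2"}, 10000];
N[Counts[outcomes]/10000]
(* -> roughly <|"q1" -> 0.33, "q2" -> 0.67|>, matching |c1|^2 = 1/3, |c2|^2 = 2/3 *)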

5.3 Linear Algebra

In the table below I summarize the main concepts from linear algebra that are used in the description of quantum systems, including a detailed comparison with inner products in C^N, with their respective notations. Each row lists a concept in three notations, in this order: linear algebra in C^N; wave functions; Hilbert space.

• vector:  |α⟩ ;  Ψ(x, t) ;  |Ψ(t)⟩

• basis:  e1 = (1, 0, ..., 0)^T, ..., eN = (0, ..., 0, 1)^T ;  ψn(x) (n = 1, 2, ..., ∞) ;  |ψn⟩ or |n⟩

• vector in a basis:  α = Σ_{i=1}^N e_i α_i = (α1, ..., αN)^T ;  Ψ(x, t) = Σ_n cn(t) ψn(x) ;  |Ψ(t)⟩ = Σ_n cn(t) |ψn⟩

• inner product:  ⟨α|β⟩ = Σ_{i=1}^N α_i^* β_i ;  ∫_a^b dx Ψ1^*(x) Ψ2(x) ;  ⟨Ψ1|Ψ2⟩

• orthonormal basis:  ⟨e_i|e_j⟩ = δ_ij ;  ∫_a^b dx ψn^*(x) ψm(x) = δ_nm ;  ⟨ψn|ψm⟩ = δ_nm

• linear transformation (matrix) / operator Q̂:  β = T·α ;  Ψ1 = Q̂ Ψ2 ;  |Ψ1⟩ = Q̂ |Ψ2⟩

• coefficients in a basis:  α_i = ⟨e_i|α⟩ ;  cn(t) = ∫_a^b dx ψn^*(x) Ψ(x, t) ;  cn(t) = ⟨ψn|Ψ(t)⟩

• dual vector:  ⟨α| ;  Ψ^*(x, t) ;  ⟨Ψ(t)|

• dual basis:  e_i^T = (0, ..., 0, 1, 0, ..., 0) (1 on the i-th place) ;  ψn^*(x) ;  ⟨ψn|

• vector in the dual basis:  α^† = Σ_{i=1}^N e_i^T α_i^* = (α1^*, ..., αN^*) ;  Ψ^*(x, t) = Σ_n cn^*(t) ψn^*(x) ;  ⟨Ψ(t)| = Σ_n cn^*(t) ⟨ψn|

The Hilbert space considered here is $L^2(a,b)$, that is, the square-integrable functions on $(a,b)$:

$\int_a^b dx\, |\Psi(x)|^2 < \infty$ ,   (55)

where $a$, $b$ can be finite or infinite $(-\infty, \infty)$.
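As an illustration of the middle column of the table, the following sketch (my own example; the basis and the trial wave function are arbitrary choices) computes the coefficients $c_n = \int_a^b dx\, \psi_n^*(x)\,\Psi(x)$ numerically in the sine basis $\psi_n(x) = \sqrt{2/a}\,\sin(n\pi x/a)$ of $L^2(0,a)$, and checks that $\sum_n |c_n|^2 \approx 1$ for a normalized $\Psi$:

```python
import numpy as np

a = 1.0
M = 4000
x = np.linspace(0.0, a, M, endpoint=False) + a / (2 * M)   # midpoint grid on (0, a)
dx = a / M

def psi(n):
    # Orthonormal basis of L^2(0, a): psi_n(x) = sqrt(2/a) sin(n pi x / a).
    return np.sqrt(2.0 / a) * np.sin(n * np.pi * x / a)

# A normalized trial wave function Psi(x), chosen for illustration.
Psi = x * (a - x)
Psi = Psi / np.sqrt(np.sum(np.abs(Psi) ** 2) * dx)

# Coefficients in the basis: c_n = <psi_n|Psi>.
c = np.array([np.sum(psi(n) * Psi) * dx for n in range(1, 60)])

print("sum |c_n|^2 =", np.sum(np.abs(c) ** 2))   # ~1: the basis is complete
print("c_1..c_4    =", np.round(c[:4], 4))       # even n vanish by symmetry
```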

5.4 Continuous spectra

With appropriate modifications, the above formulas also hold in the case of continuous spectra. For definiteness, we will consider the momentum operator $\hat{p} = -i\hbar\, d/dx$ and the associated wave functions of a free particle moving along $x \in (-\infty, \infty)$. This is a continuous spectrum labeled by $p = \hbar k$. The eigenfunctions of $\hat{p}$ are:

$\psi_p(x) = \frac{1}{\sqrt{2\pi\hbar}}\, e^{\frac{i}{\hbar} px}$ ,   (56)

where the reason for the normalization will become clear in a moment.

The above wave functions are not normalizable, because $\int_{-\infty}^{\infty} dx\, |\psi_p(x)|^2 = \infty$. However, as long as we consider different $p$'s, we do have the following orthonormality condition:

$\int_{-\infty}^{\infty} dx\, \psi_{p'}^*(x)\,\psi_p(x) = \frac{1}{2\pi\hbar} \int_{-\infty}^{\infty} dx\, e^{i(p-p')x/\hbar} = \delta(p - p')$ ,   (57)

where in the last equality we used the representation of the Dirac delta function introduced in (50), after interchanging $x$ and $k$ and introducing $p = \hbar k$ (this also explains the reason for the $1/\sqrt{\hbar}$ in the normalization of (56)). So we get:

$\langle\psi_{p'}|\psi_p\rangle = \delta(p - p')$ ,   (58)

which is analogous to the orthonormality condition in the discrete case.


Let us generalize this picture to the eigenfunctions of an arbitrary observable $\hat{Q}$. Let us assume that the eigenvalues are labeled by $q(z)$, where $z$ is a continuous variable ($z = k$ and $q(z) = \hbar k$ in the previous example). They come from solving the eigenvalue problem:

$\hat{Q}\,\psi_z(x) = q(z)\,\psi_z(x)$ .   (59)

The probability of finding a result $q(z)$ in the range $(z, z+dz)$ at time $t$ is then given by $|c(z,t)|^2\,dz$, where:

$\Psi(x,t) = \int dz\, c(z,t)\,\psi_z(x)\,, \qquad c(z,t) = \langle\psi_z|\Psi\rangle$ .   (60)

Notice that $c(z,t)$ contains as much information as $\Psi(x,t)$ does. There is a one-to-one correspondence between the two, which is brought out by Plancherel's theorem. In the specific case $z = p$, $c(z,t)$ has a special name, $\Phi(p,t)$, and it is called the `wave function in momentum space':
$\Psi(x,t) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} dp\; e^{ipx/\hbar}\,\Phi(p,t)$
$\Phi(p,t) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} dx\; e^{-ipx/\hbar}\,\Psi(x,t)$ .   (61)
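As a quick numerical check of (61) (a sketch of my own; the Gaussian and its width are arbitrary illustrative choices), one can compute the momentum-space wave function of a Gaussian by direct quadrature and compare it with the known analytic result that it is again a Gaussian, of width $\hbar/(2\sigma)$:

```python
import numpy as np

hbar = 1.0
sigma = 0.7                       # illustrative width; any value works

x = np.linspace(-12, 12, 4801)
dx = x[1] - x[0]
Psi = (2 * np.pi * sigma**2) ** (-0.25) * np.exp(-x**2 / (4 * sigma**2))

p = np.linspace(-4, 4, 161)
# Second line of (61): Phi(p) = (2 pi hbar)^(-1/2) int dx e^(-ipx/hbar) Psi(x)
Phi = np.array([np.sum(np.exp(-1j * pp * x / hbar) * Psi) * dx for pp in p])
Phi /= np.sqrt(2 * np.pi * hbar)

# Analytic result: a Gaussian in p with width sigma_p = hbar / (2 sigma).
sigma_p = hbar / (2 * sigma)
Phi_exact = (2 * np.pi * sigma_p**2) ** (-0.25) * np.exp(-p**2 / (4 * sigma_p**2))

print("max |Phi - Phi_exact| =", np.abs(Phi - Phi_exact).max())   # tiny
```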
Examples
a) If we take $\hat{Q} = H$ for a free particle, then $z = p$ (the momentum) and $q(z) = E(p) = \frac{p^2}{2m}$. So these wave functions are the usual plane waves, $\psi_p(x) = \frac{1}{\sqrt{2\pi\hbar}}\, e^{ipx/\hbar}$. Alternatively, we use the wave number instead of the momentum, $z = k$ and $E(k) = \frac{\hbar^2 k^2}{2m}$. The result is the same.
b) If we take $\hat{Q} = \hat{p}$ for a particle allowed to move with any momentum $p$ (for instance, a free particle), then the eigenvalue equation is $\hat{p}\,\psi_p(x) = p\,\psi_p(x)$. Hence $z = p$ and the eigenvalues are the momenta themselves, $q(z) = p$. Hence also $\psi_p(x) = \frac{1}{\sqrt{2\pi\hbar}}\, e^{ipx/\hbar}$.
c) If we take $\hat{Q} = \hat{x}$, the eigenvalue equation is $\hat{x}\,\psi_y(x) = y\,\psi_y(x)$. Then $z = y$ and $q(z) = y$. The solutions of the eigenvalue equation are delta functions, $\psi_y(x) = \delta(x - y)$.

Once we have the basis ψz (x) that solves (59), we can expand the wave function in
this basis:
$\Psi(x,t) = \int dz\, c(z,t)\,\psi_z(x)$ .   (62)

In the above examples, this formula reproduces known results:


a) $\Psi(x,t) = \int dp\, c(p,t)\,\psi_p(x) = \frac{1}{\sqrt{2\pi\hbar}} \int dp\, c(p,t)\, e^{ipx/\hbar}$, which is the Fourier transform (61).
b) Idem.
c) $\Psi(x,t) = \int dy\, c(y,t)\,\psi_y(x) = \int dy\, c(y,t)\,\delta(x-y) = c(x,t)$. This is a `diagonal' basis: the expansion of $\Psi(x,t)$ contains just one term, namely $c(x,t) = \Psi(x,t)$ itself.

6 Dirac Notation

6.1 Base-free notation

In the above examples we have been expanding the same wave function in different bases of wave functions (eigenfunctions of some Hermitian operator). The Hilbert space formalism allows us to do this in a basis-free way, as announced. To that end we introduce the vector $|S(t)\rangle \in \mathcal{H}$, where we now think of the Hilbert space as an abstract space without the need to specify a basis. The only requirement is that $|S(t)\rangle$ satisfies the TDSE. We define for this vector space the $L^2(-\infty,\infty)$ inner product:

$\langle S(t)|S'(t)\rangle = \int_{-\infty}^{\infty} dx\, \Psi^*(x,t)\,\Psi'(x,t)$ ,   (63)

where $\Psi(x,t)$ is the wave function that corresponds to the state $|S(t)\rangle$ in the position representation. We can show, using Fourier transformation, that the above inner product is independent of the basis. Filling in the first of (61), we can rewrite the above inner product as (try to show this):

$\langle S(t)|S'(t)\rangle = \int_{-\infty}^{\infty} dp\, \Phi^*(p,t)\,\Phi'(p,t)$ .   (64)

In the discrete case (for example, the harmonic oscillator), we find:

$\langle S(t)|S'(t)\rangle = \sum_n c_n^*(t)\,c_n'(t)$ .   (65)

So this inner product is independent of the basis. Using this, we also find:

$\langle\psi_x|S(t)\rangle = \int_{-\infty}^{\infty} dy\, \psi_x^*(y)\,\Psi(y,t) = \Psi(x,t)$
$\langle\psi_p|S(t)\rangle = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} dx\, e^{-ipx/\hbar}\,\Psi(x,t) = \Phi(p,t)$
$\langle\psi_n|S(t)\rangle = c_n\, e^{-\frac{i}{\hbar}E_n t} = c_n(t)$ .   (66)

This gives a nice interpretation: $\Psi(x,t)$, $\Phi(p,t)$ and $c_n(t)$ are the overlaps of the abstract wave function $|S(t)\rangle$ with the respective basis, i.e. the projections of that vector onto a particular basis. These bases are often simply denoted $|x\rangle$, $|p\rangle$, $|n\rangle$, hence we can write:

$\Psi(x,t) = \langle x|S(t)\rangle$
$\Phi(p,t) = \langle p|S(t)\rangle$
$c_n(t) = \langle n|S(t)\rangle$ .   (67)

In other words, these are the coefficients $c(z,t)$ that determine `how much $q(z)$ is in $|S(t)\rangle$' (for the continuous/discrete case).
With this new notation, we can also write $\psi_p(x) = \langle x|\psi_p\rangle = \langle x|p\rangle$ and $\psi_y(x) = \langle x|\psi_y\rangle = \langle x|y\rangle$. This makes the notation much more symmetric and simple. $\psi_p(x)$ is simply the overlap $\langle x|p\rangle$, `the amount of $x$ in the state $|p\rangle$'. $\langle x|y\rangle$ is `the amount of $x$ in $y$', given by a Dirac delta function.
Now it is clear that we also have $\psi_x(p) = \langle p|x\rangle = (\langle x|p\rangle)^* = \psi_p^*(x)$, as it should be (notice that in Plancherel's theorem we have $+ipx$ in one exponential, $-ipx$ in the other).
The overlaps $\langle x|p\rangle$ contain the information about the spectrum of the particular operator. For a free particle, $\langle x|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\, e^{ipx/\hbar}$. In other cases, however, this can take a completely different form.

Examples
a) Harmonic oscillator: in this case, the spectrum of the Hamiltonian is discrete, $E_n = \hbar\omega\left(n + \frac{1}{2}\right)$, whereas the spectrum of $\hat{x}$ is continuous and infinite, $x \in (-\infty, \infty)$. The overlaps are now $\langle x|n\rangle = \psi_n(x)$ (the harmonic oscillator wave function).
b) Free particle on a circle, $\phi \in [0, 2\pi]$. Here, the spatial coordinate is continuous ($x = a\phi$, with $a$ the radius of the circle) but the momenta are quantized, $p_n = \frac{n\hbar}{a}$. The overlaps are $\langle\phi|n\rangle = \frac{1}{\sqrt{2\pi}}\, e^{in\phi}$.
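As a quick sanity check of example b) (a sketch of my own, using the square-root normalization adopted above), one can verify the orthonormality $\int_0^{2\pi} d\phi\, \langle m|\phi\rangle\langle\phi|n\rangle = \delta_{mn}$ numerically:

```python
import numpy as np

Npts = 20000
phi = np.linspace(0.0, 2 * np.pi, Npts, endpoint=False)   # periodic grid
dphi = 2 * np.pi / Npts

def chi(n):
    # <phi|n> = e^{i n phi} / sqrt(2 pi)  (the normalization assumed above)
    return np.exp(1j * n * phi) / np.sqrt(2 * np.pi)

# Orthonormality on the circle: int_0^{2pi} dphi chi_m^* chi_n = delta_mn.
for m, n in [(2, 2), (2, 3), (-1, 4)]:
    overlap = np.sum(chi(m).conj() * chi(n)) * dphi
    print(m, n, np.round(overlap, 10))
```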

6.2 Closure

Since now we have a basis-free notation, we can notice that, for $f, g \in \mathcal{H}$, the following holds:

$\langle f|g\rangle = \int dx\, f^*(x)\,g(x) = \int dx\, \langle f|x\rangle\langle x|g\rangle = \Big\langle f \Big|\; \int dx\, |x\rangle\langle x|g\rangle \Big\rangle$ .   (68)

In the last step, we pulled the integral inside the inner product because $f$ is base-free, i.e. it does not depend on $x$; moreover, we regard $\int dx\, |x\rangle\langle x|g\rangle = \int dx\, g(x)\,|x\rangle$ as a new vector, which we obtain by multiplying the vector $|x\rangle$ with the number $g(x) = \langle x|g\rangle$ and integrating over $x$. Now let us define:

$\langle f|\hat{Q}|g\rangle \equiv \langle f|\hat{Q}\,g\rangle$ .   (69)

(68) can now be written as:

$\langle f|g\rangle = \langle f|\left(\int dx\, |x\rangle\langle x|\right)|g\rangle$ .   (70)

We can regard the part in the middle, $\int dx\, |x\rangle\langle x|$, as an operator that acts on $|g\rangle$ and produces another vector. Since this equation holds for all $f, g \in \mathcal{H}$, this operator must be the identity:

$\int dx\, |x\rangle\langle x| = 1$ .   (71)

This is called the closure relation and indicates that the basis is complete. Multiplying on the right and left with $|p\rangle$ and $\langle p'|$, this gives:

$\langle p'|\left(\int dx\, |x\rangle\langle x|\right)|p\rangle = \int dx\, \psi_{p'}^*(x)\,\psi_p(x) = \langle p'|p\rangle = \delta(p - p')$ ,   (72)

which is indeed true, for instance, for a free particle. If we multiply with $|y\rangle$, $\langle y'|$ instead, we get $\int dx\, \psi_{y'}^*(x)\,\psi_y(x) = \int dx\, \delta(x-y')\,\delta(x-y) = \delta(y-y')$, which is again true.
In the same way we can derive closure for p:
$\int dp\, |p\rangle\langle p| = 1$ .   (73)

Multiplying both sides with $|x\rangle$ and $\langle x'|$, this is the same as $\int dp\, \psi_{x'}^*(p)\,\psi_x(p) = \delta(x - x')$.
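The discrete analogue of (71) and (73) is easy to check numerically. The following sketch (my own illustration) builds a random orthonormal basis of $\mathbb{C}^N$ and verifies that the sum of the projectors $|e_i\rangle\langle e_i|$ is the identity:

```python
import numpy as np

N = 5
rng = np.random.default_rng(1)

# A random orthonormal basis of C^N: the Q factor of a QR decomposition.
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
Q, _ = np.linalg.qr(A)
basis = [Q[:, i] for i in range(N)]                 # the kets |e_i>

# Closure: sum_i |e_i><e_i| should be the identity operator.
closure = sum(np.outer(e, e.conj()) for e in basis)
print("closure = 1?", np.allclose(closure, np.eye(N)))   # True
```

In the continuous case, the sum becomes the integrals (71) and (73), and the Kronecker delta becomes a Dirac delta.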

6.3 Bras and kets


Something funny has happened. The notation (69) gave rise to operators such as $\int dx\, |x\rangle\langle x|$. These operators act on a state to give a new state. But what do we really mean by $\langle x|$ or, more generally, $\langle\alpha|$? The definition of these states was given in (69). We can in fact think of $\langle\Psi|$ as the `complex conjugate' of $|\Psi\rangle$. For obvious reasons, $|\Psi\rangle$ is called a `ket' and $\langle\Psi|$ is called a `bra', so that we can form an inner product (a `bra-ket') by multiplying the two as in a dot product. Here our analogy with $\mathbb{R}^3$ and $\mathbb{C}^3$ comes in handy. In usual $\mathbb{R}^3$ the standard inner product is in fact the dot product:

$\langle w|v\rangle = w_x v_x + w_y v_y + w_z v_z = (w_x, w_y, w_z) \cdot \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}$ .   (74)

In $\mathbb{C}^3$:

$\langle w|v\rangle = w_x^* v_x + w_y^* v_y + w_z^* v_z = (w_x^*, w_y^*, w_z^*) \cdot \begin{pmatrix} v_x \\ v_y \\ v_z \end{pmatrix}$ .   (75)

In the same way, via (69) (with $\hat{Q} = 1$, the identity operator), we regard $\langle\Psi_1|\Psi_2\rangle$ as the `dot product' of $\langle\Psi_1|$ and $|\Psi_2\rangle$: $\langle\Psi_1|\cdot|\Psi_2\rangle \equiv \langle\Psi_1|1|\Psi_2\rangle = \langle\Psi_1|\Psi_2\rangle$. The bra corresponds to the (complex-conjugate) row vector, the ket to the column vector. They are each other's `duals'.

Remark (optional). One can show that the bras form a vector space, in the same way the kets $|\alpha\rangle$ do. In fact, this is the vector space dual to $\mathcal{H}$, and it is often denoted $\mathcal{H}^*$. By definition, the dual vector space $\mathcal{H}^*$ of a vector space $\mathcal{H}$ consists of the linear (non-degenerate) mappings from $\mathcal{H}$ to $\mathbb{C}$. Indeed, $\langle\alpha|$ maps vectors in $\mathcal{H}$ to $\mathbb{C}$ by the inner product: $\langle\alpha|\beta\rangle \in \mathbb{C}$. This mapping itself forms a vector space.
Now when we put together $|\alpha\rangle\langle\beta|$, this is an operator, as it maps a vector $|\gamma\rangle$ to another vector: $(|\alpha\rangle\langle\beta|)\,|\gamma\rangle = |\alpha\rangle\langle\beta|\gamma\rangle = \langle\beta|\gamma\rangle\,|\alpha\rangle$. Here, $\langle\beta|\gamma\rangle$ is a number, so it doesn't matter if you write it left or right. On the other hand, the order does matter in $|\alpha\rangle\langle\beta|$: $|\alpha\rangle\langle\beta| \neq |\beta\rangle\langle\alpha|$!

7 The Interpretation of Quantum Mechanics

We have seen that quantum mechanics gives us a probabilistic interpretation of physical quantities: it is not always possible to determine the outcome of a measurement with absolute certainty, but we can predict the possible measurement outcomes and their respective probabilities in measurements on an ensemble of identical systems. The only case where the outcome of a measurement is unique is when the system is in a determinate state and we measure the corresponding observable. If the system is in an eigenstate of an operator $\hat{Q}$ with eigenvalue $q_m$, the formalism tells us that the probability to find the value $q_n$ upon measurement is 1 for $n = m$ and 0 for all other states. If $\hat{Q}$ is the position operator, then the particle is well localized; but the momentum is not well defined in that case, as the uncertainty principle expresses. This is because the operators $\hat{x}$ and $\hat{p}$ do not commute. So the formalism tells us that there is no state that describes all possible observables simultaneously, i.e. no state in which all possible measurable quantities have well-defined values.
We can get used to the above interpretation of quantum mechanics, but there is something unsatisfactory about it. Imagine carrying out a two-slit experiment with photons. Given knowledge of the initial wave function and of the interactions, we can predict the intensity distribution of light on the detection screen. If we decrease the intensity of the source so that only one photon at a time goes through the slits, we can predict the probability that the detection of the photon will take place in a particular region on the screen. But there is something counterintuitive about this. Which slit did the particle actually go through? Quantum mechanics does not tell us the answer. If we try to measure which slit the photon goes through, the interference pattern disappears: knowledge of which slit the particle went through destroys the quantum superposition. How can the act of measurement be decisive here? Is a measurement any different from other physical interactions? If the photon did not have a position before it was measured, but it does have a well-defined position when it is detected, it would seem as if the act of measurement is able to give particles properties they did not possess before. How can a measurement be so decisive as to the presence of a physical property? What makes it so unlike other physical interactions?
This set of questions, and their proposed solutions, which we will next turn to, is what people usually call the problem of the `interpretation of quantum mechanics'¹². It is seen

¹² In writing this chapter, I have drawn ideas from [4].

as problematic because it defies our classical intuitions about how physical systems are supposed to work.

7.1 EPR and Hidden Variables

In a 1935 article that was meant to be a death blow to quantum theory as a fundamental theory of reality, Einstein, Podolsky, and Rosen claimed that quantum mechanics was incomplete. Imagine a pair of particles that are initially in contact, at rest at the origin, and then go their separate ways until their mutual distance is very large. For instance, you could think of an isotope that decays into two sub-particles that shoot off to different sides, preserving the total momentum. Notice that, quantum mechanically, we cannot determine the position and momentum of the individual particles, because $[\hat{Q}_1, \hat{P}_1] = i\hbar$ and $[\hat{Q}_2, \hat{P}_2] = i\hbar$. However, we can determine the center-of-mass position of the system, $\hat{Q}_1 + \hat{Q}_2$, as well as the relative momentum, $\hat{P}_1 - \hat{P}_2$, because these operators do commute: $[\hat{Q}_1 + \hat{Q}_2,\, \hat{P}_1 - \hat{P}_2] = [\hat{Q}_1, \hat{P}_1] - [\hat{Q}_2, \hat{P}_2] = i\hbar - i\hbar = 0$ (where we used the fact that the operators of the two particles commute). Now imagine that we measure the position of the first particle, $q_1$, once they are far apart. Since we know the location of the center of mass, we can infer from this measurement $q_2$, the location of the second particle. But if we simultaneously measure $p_2$, the momentum of the second particle, we may also infer $p_1$, in virtue of the fact that we know the difference in momenta. But now, EPR concluded, since the two particles are very far apart, one measurement cannot influence the other: such an influence would violate the result of relativity that there can be no influences that travel faster than the speed of light (in Einstein's words, there can be no `spooky action at a distance'). Hence we can, by means of independent measurements, have complete knowledge of the positions and momenta of the two particles. Since the outcome of the location of particle 1 cannot affect (by relativity) the location of particle 2, this means that particle 2 must have had a well-defined position prior to measurement, even though quantum mechanics tells us it didn't. A similar argument can be made for the momentum. Of course this is all in contradiction with the predictions of quantum mechanics. So measurements can give us complete knowledge about these properties of particles, but the theory doesn't. Therefore, EPR concluded, quantum mechanics is an incomplete theory.
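As an aside, the compatibility of the two EPR observables is easy to check symbolically. The following sketch (my own check, using sympy, with $\hat{Q}_i$ acting as multiplication by $x_i$ and $\hat{P}_i = -i\hbar\,\partial/\partial x_i$ on a test function) confirms that $[\hat{Q}_1 + \hat{Q}_2,\ \hat{P}_1 - \hat{P}_2] = 0$:

```python
import sympy as sp

x1, x2, hbar = sp.symbols('x1 x2 hbar')
f = sp.Function('f')(x1, x2)          # arbitrary test function

# Position and momentum operators of the two particles:
Q1 = lambda g: x1 * g
Q2 = lambda g: x2 * g
P1 = lambda g: -sp.I * hbar * sp.diff(g, x1)
P2 = lambda g: -sp.I * hbar * sp.diff(g, x2)

A = lambda g: Q1(g) + Q2(g)           # sum of positions (tracks the center of mass)
B = lambda g: P1(g) - P2(g)           # relative momentum

# Commutator applied to the test function: [A, B] f = A(B f) - B(A f).
comm = sp.expand(A(B(f)) - B(A(f)))
print(comm)                           # 0: the EPR observables are compatible
```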
Before trying to rebut the EPR argument, it is natural to ask the following question: could it be that quantum mechanics is indeed missing some piece of information? Is it possible to extend quantum mechanics into a more predictive theory? Such attempts are called `hidden variable theories': assuming that the positions and momenta of particles have precise values before measurement amounts to introducing some variable that remains hidden to quantum mechanics, but determines those values before we measure them. One such attempt was carried out by David Bohm in 1952. In Bohm's theory, particles have well-defined values of positions and momenta, and the predictions of quantum mechanics are statistically reproduced. However, in order to achieve this, Bohm has to add a non-local interaction potential between the particles. This potential is called non-local because it acts at a distance; when a particle changes momentum or position, its interaction with all other particles changes. Whereas it is not clear whether there is a contradiction with special relativity (as the interactions assumed by Bohm cannot be directly used to transmit information faster than light), this is clearly an unwarranted feature of the theory.

For this reason, theorists have looked for local hidden variable theories, that is, hidden variable theories where interactions do not propagate faster than light. The main criticism of Bohm's theory, however, has been that it merely adds theoretical structure without any predictive gain, as its predictions are compatible with quantum mechanics, and all experimental results agree with quantum mechanics. Furthermore, in order for the theory to reproduce the results of quantum mechanics, ad hoc assumptions about the distribution of the hidden variables have to be added.
In 1964 John Bell showed that local hidden variable theories are inconsistent with the results of quantum mechanics: any hidden variable theory of the local type makes experimental predictions that subtly deviate from those of quantum mechanics. The corresponding experiments were carried out in 1982 by Alain Aspect, and the predictions of quantum mechanics, rather than those of hidden variable theories, were found to hold. These results have been confirmed in more refined experiments later on.
Hidden variable theories are sometimes identified with realist interpretations (see e.g. section 1.2 of [1]). This is a misnomer: whereas a hidden variable theory can certainly be regarded as an (extreme) realist position, there are realist positions that do not require particles to have well-defined properties (see the next section).

7.2 Bohr's Reply to EPR

Bohr's reply to the EPR article is anything but a clear-cut physical argument. Instead, it is a piece of philosophical discourse (and rather obscure at that), the main message of which seems to be that what we call `position' and `momentum' cannot be determined a priori, but essentially depends on the measurement context in which these concepts are defined. The measurement process has, in Bohr's view, an essential influence on the conditions for the definition of physical quantities. Since the conditions of measurement play an essential role in defining what we call `physical reality' (and, as part of that, the concepts of position and momentum), one cannot infer a conclusion about the supposed incompleteness of quantum mechanics: as far as the physical phenomena are concerned, there simply is nothing else to describe than either position or momentum (but not both).
In more practical terms, one could say that Bohr's position amounts to saying that the particular experimental context determines whether the concepts `position' or `momentum' make sense. If we measure the position, there is no sense in which we can meaningfully talk about the `momentum' of a particle: this concept is simply not defined.
Second, whereas Bohr denies that there are `spooky actions at a distance', he remarks that EPR's conclusion that one can infer simultaneously the position and the momentum of the particle is incoherent. Once position is measured on the first particle, this concept is applicable to the second particle as well. A measurement of the momentum of the second particle creates a new measurement context, which automatically introduces an uncertainty in the position, which is then no longer applicable. We have therefore not determined the position and momentum of particle two, but simply measured uncorrelated properties of the second particle.
By 1927, Bohr's position had blended with the views of his younger colleagues Heisenberg, Pauli, and Dirac into the dominant paradigm in the interpretation of quantum mechanics, known as the `Copenhagen interpretation'. The fact that Bohr's texts on this

matter are rather oracle-like and obscure, and that his pupils developed related but not entirely coinciding accounts of quantum mechanics¹³, has contributed to a lack of clarity as to how `the' Copenhagen interpretation should be understood.
Differences in interpretation between Bohr, Heisenberg, Pauli, and Dirac notwithstanding, the Copenhagen interpretation does seem to offer a genuine reply to the EPR argument (this being the reason that it is widely accepted). The interpretation, however, does not come without a cost. At least two issues in the Copenhagen interpretation require further thought:

1. As a philosophical interpretation, it talks about the measurement context as posing restrictions on the class of macroscopic concepts (such as position, momentum, etc.) that can be applied to a microscopic system at any given time. However, the philosophical interpretation does not tell us how this should work. This necessity of invoking the macroscopic context in defining the concepts that are applicable to the microscopic phenomena, whilst upholding that the quantum mechanical description is complete, seems to imply the existence of a fundamental boundary between the macroscopic and the microscopic. It is not at all clear what physical principle, if any, defines this boundary.

2. Bohr's sketched solution to the EPR paradox does not imply action at a distance, but does require an explanation of how measurement of particle 1 influences the applicability of the concept of `position' to particle 2, and the actual determination of this position. So measurement still seems to play a special role here: it is not simply regarded as an ordinary physical interaction.

7.3 The Measurement Problem

When trying to translate Bohr's philosophical account of quantum mechanics into a physically workable model, one runs, as mentioned, into the fact that it seems hard to regard measurement as an ordinary physical interaction. Indeed, in Bohr's view, a measurement does much more than simply `giving the value of a variable': it sets the conditions under which one can meaningfully talk about that variable. In order to get a grasp of how deep this problem runs, we will here give a model of classical and quantum measurements and compare them to each other¹⁴.

Measurement in Classical Mechanics


Consider a system $S$ (for instance, you), a measuring device $M$ (a scale to measure your weight), and the readings $R$ on this device (the value of the scale pointer). Consider also a quantity $A$ corresponding to $S$ (your mass) with possible values $a \in \{a_1, \cdots, a_n\} \subset \mathbb{R}$ ($a_i$
¹³ For instance, Heisenberg used the Aristotelian concepts of act and potency in his accounts of the measurement problem. This is quite different from the Kantian flavor of Bohr's remarks, as well as from the instrumentalist interpretation that the `Copenhagen interpretation' has often been given.
¹⁴ This model goes back to von Neumann, but I am very much indebted to Jos Uffink for the formulation presented here.

for short). Corresponding to these values there are readings ri on the scale which indicate
your weight. If the scale works properly, then there is an invertible function m that gives
the reading ri for given mass ai : ri = m(ai ).
You are standing in front of the scale. Since there is nothing on the scale, the pointer is in its rest position, which we call $r_0$. We have a pair of numbers describing this situation, your weight $a_i$ and the scale pointer at rest: $(a_i, r_0)$. Once you step on the scale, these numbers change. Your weight does not change, but the pointer position does: we have a pair $(a_i, r_i) = (a_i, m(a_i))$. Now the values of $a_i$ and $r_i$ are correlated via the function $m$. Reading off your weight $r_i$ from the scale, and applying the function $m^{-1}$ to it, you find your mass: $a_i = m^{-1}(r_i) = 70$ kg. Notice that, as an abstract property of you, your mass is unmeasurable; we measure your weight on the scale and use it to infer the mass.

Measurement in Quantum Mechanics


Consider now a quantum system $S$ with a property $A$ (e.g. energy, or spin) to be measured by a measuring apparatus $M$ (a detector, a phosphor plate, a photon camera). To $A$ corresponds an operator $A$ with eigenvalues $\{a_1, \cdots, a_n\}$. To these eigenvalues correspond states $|a_1\rangle, \cdots, |a_n\rangle \in \mathcal{H}_S$, the Hilbert space of the system. Since we want to model measurement physically, we are going to associate an operator $R$ to the measuring apparatus. $R$ gives the possible `readings' of the detector (numbers on a computer screen, dots on a phosphor surface, etc.), which can take values $r_0, r_1, \ldots, r_n$ (the value $r_0$ denoting, as before, `no detection'). So the Hilbert space $\mathcal{H}_M$ of the measuring apparatus contains one more basis state: $|r_0\rangle, |r_1\rangle, \cdots, |r_n\rangle$. We regard $|r_0\rangle$ as the ground state, the state that indicates that nothing is being measured. We assume $|a_i\rangle$ ($i = 1, \cdots, n$) and $|r_i\rangle$ ($i = 0, 1, \cdots, n$) to form orthonormal bases of $\mathcal{H}_S$ and $\mathcal{H}_M$, respectively.
Before detection, $S$ is in a state $|a_i\rangle$ (for some $i$) and $M$ is in the state $|r_0\rangle$. So the total state of the system is the product of states $|a_i\rangle|r_0\rangle$; this is the quantum analog of the statement that you and the scale are described by the pair $(a_i, r_0)$.
Now we want to measure $A$ using $R$. Through the measurement interaction, the system $M$ will undergo a change from $|r_0\rangle$ to $|r_i\rangle$, and the latter should be indicative of the state $|a_i\rangle$ of $S$. For an ideal measurement, the system $S$ itself should not change. So we have the transition:

$|a_i\rangle|r_0\rangle \;\to\; |a_i\rangle|r_i\rangle$ .   (76)

Our task of finding out `whether measurement is a physical interaction' now amounts to asking whether there is a Hamiltonian $H$ that describes (76) as a transition from an initial state $|a_i\rangle|r_0\rangle$ to a final state $|a_i\rangle|r_i\rangle$. In order to eliminate technical details, we will go about this question as follows. It turns out that it can be translated into the question whether there is a linear operator¹⁵ $U$ such that:

$U(|a_i\rangle|r_0\rangle) = |a_i\rangle|r_i\rangle$ .   (77)

We can think of this operator as effecting the Schrödinger time evolution with Hamiltonian $H$. If $U$ exists, there is a Hamiltonian $H$ that contains the interaction between the system $S$ and the measuring apparatus $M$.
¹⁵ As it turns out, such an operator also has to be unitary, meaning $UU^\dagger = 1$.

In fact, the answer to this question is: yes, such an operator $U$ exists (see the appendix for more details). In principle this is very nice, because it means that:

1. We have succeeded in describing measurement as a physical interaction via a unitary operator.

2. No measurement postulate is needed.

3. There is no distortion of the state $|a_i\rangle$ of the system.

However, the fact that $U$ is linear has some strange implications. The above really only works if $S$ is in an eigenstate $|a_i\rangle$ of $A$, as we have assumed above. What if the system $S$ is in a superposition,

$|\psi\rangle = \sum_j c_j\, |a_j\rangle$ ?   (78)

Let us apply our operator to this state. After measurement, we get a final state:

$|\psi_{\rm final}\rangle = U(|\psi\rangle|r_0\rangle) = U\Big(\sum_j c_j\, |a_j\rangle|r_0\rangle\Big) = \sum_j c_j\, U(|a_j\rangle|r_0\rangle) = \sum_j c_j\, |a_j\rangle|r_j\rangle$ ,   (79)

where we have used linearity of $U$. You see the important consequence: the system is no longer in a product state $|a_i\rangle|r_i\rangle$ with a definite value of $i$ (corresponding to one eigenvalue of $A$) after the measurement! Instead, the final state is a superposition of the measured system and the measuring apparatus! This means that you and the scale are forever entangled after you step on it! So which state is this system in? We simply cannot tell: it might be any of $|a_j\rangle|r_j\rangle$ with probabilities $|c_j|^2$.
Could we solve this problem by bringing in another measuring device that will find out what state you and the scale are in? It is not hard to see that this second measuring device will again become entangled with $S$ and $M$ in the same way. The inescapable conclusion is that, once we regard measurement interactions as described by operators $U$ acting as in (77), the whole universe becomes entangled. Of course, this is nothing but Schrödinger's cat in disguise. Once we start with a superposition somewhere in the universe, and if all interactions are given by `nice' linear operators, it will not be long before the whole visible universe finds itself in a superposition that includes $S$ and $M$. So this does not give us a way of explaining how measurements give definite values after all.
This further motivates von Neumann's description of measurement not as a `normal' unitary operator $U$, but as a `special' projection operator $P_i$ that projects onto some particular state: if we start with

$|\psi\rangle = \sum_j c_j\, |a_j\rangle$ ,   (80)

then after measurement of eigenvalue $a_i$ the system will project to the following state:

$P_i\, |\psi\rangle = c_i\, |a_i\rangle$ .   (81)

Since the $|a_j\rangle$ are linearly independent, we see that in order for this to work the projection operator $P_i$ has to act on a basis as:

$P_i\, |a_j\rangle = \delta_{ij}\, |a_i\rangle$ .   (82)

We then also get for our final state:

$P_i\, |\psi_{\rm final}\rangle = c_i\, |a_i\rangle|r_i\rangle$ .   (83)

Notice that the $P_i$'s are not unitary, because the final state is not normalized to 1: $c_i\,|a_i\rangle|r_i\rangle$ has squared norm $|c_i|^2$. Hence this is not a regular physical interaction of the type described by the Schrödinger equation, because it does not preserve total probability. We have to rescale the state again in order to normalize it to 1. It can also be shown that projections do not conserve energy.
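A small sketch of this (my own, with illustrative coefficients): applying $P_i$ to a superposition leaves the unnormalized state $c_i|a_i\rangle$, whose squared norm $|c_i|^2 < 1$ makes the non-unitarity explicit, and the state then has to be rescaled by hand:

```python
import numpy as np

# Illustrative superposition |psi> = sum_j c_j |a_j> in a 3-dimensional space.
c = np.array([0.6, 0.8j, 0.0])
psi = c.copy()                        # components in the eigenbasis {|a_j>}

def P(i, dim=3):
    # Projector onto |a_i>: P_i |a_j> = delta_ij |a_i>, cf. (82).
    op = np.zeros((dim, dim), dtype=complex)
    op[i, i] = 1.0
    return op

projected = P(1) @ psi
print("P_1 |psi> =", projected)                              # c_1 |a_1>
print("squared norm:", np.vdot(projected, projected).real)   # |c_1|^2 < 1

# The rescaling by hand that the main text mentions:
collapsed = projected / np.linalg.norm(projected)
print("renormalized state:", collapsed)
```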

7.4 Other Approaches

7.4.1 Decoherence
The basic idea of decoherence is quite simple: the reason for the seemingly non-unitary evolution (81) lies in the interactions with the environment. The evolution (81) looks as if it were non-unitary, but in reality (i.e. if we were able to include all the very complicated interactions with the environment) we would see that it is unitary. The wave function appears to collapse, but in fact it satisfies the Schrödinger equation. Some parts of the wave function, which should be present in (81), have been projected out simply because their coefficients $c_j$ are very small. So we are dealing with a kind of thermodynamic irreversibility here, in which the `thermal bath' provided by the environment washes away some of the information.
Another way of saying this is that, after measurement, the wave function (after rescaling by $c_i$) does not look like (81), but actually looks like:

$|a_i\rangle|r_i\rangle + \ldots$ ,   (84)

where the dots refer to terms with low probability. The wave function is actually entangled, but for all practical purposes it looks as if just one term in the superposition contributes. The reason for this is that we neglect interactions with the environment in the Hamiltonian.
In my opinion (but some of my colleagues may disagree on this), this is not strictly speaking a solution to the fundamental problem we posed of why we never see superpositions. The magic words above are `for all practical purposes': decoherence is a solution that works in practice, but, in fact, the whole universe is entangled in the state (84) (since the environment includes the whole visible universe). So it does not solve the matter of principle we raised: how do we pass from a formalism which predicts universal entanglement to states in which particles have unique properties (determinate states)? Simple decoherence just tells us that the determinate states are the most probable ones, but it does not explain how to actually obtain them. Unfortunately, long discussions in texts on decoherence fail to address this fundamental point.

Perhaps the answer to this is that we should not interpret the wave function as describing the world as it actually is, but only as a collection of possibilities. The wave function tells us what is possible and probable, but not what is actual.

7.4.2 Many Worlds


I just suggested that the wave function maybe only describes possibilities, not actualities. At the other extreme of the spectrum there is the many worlds interpretation, which assigns objective actuality to all of the terms in the wave function (whether they decohere or not), i.e. to all the summands in (84). It replaces the collapse of the wave function by its branching off: at each single measurement, the world branches off into distinct possibilities, so that every possible outcome is realized in a different world. This reconciles the appearance of non-deterministic events (such as the random decay of an atom) with deterministic equations such as the Schrödinger equation.
This may sound crazy, but you would be surprised by the growing number of physicists who actually support this interpretation. There is some theoretical support for the many-worlds interpretation coming from the `histories approach' to quantum mechanics, as well as from some recent puzzles in quantum cosmology. Perhaps the fact that physicists are willing to support such a crazy idea is simply an indication of how badly the measurement problem sits with them.
I see a simple puzzle for the many worlds interpretation which never seems to be discussed in the literature. Consider a harmonic oscillator. It is intuitively clear how the many worlds interpretation would work for this system, since its spectrum is discrete. If the wave function is in a superposition $\sum_n c_n \psi_n$, there is a world with wave function $\psi_1$, a world with wave function $\psi_2$, etc. (up to infinity). But this only works because the spectrum is discrete. If the spectrum is continuous and labeled by, say, the wave number $k$, then there is a continuum of worlds between any two finite values of $k$. This seems nonsensical, because it is well known from mathematics that we cannot label the real numbers using the natural numbers. The latter form an infinite but countable set. The former form not only an infinite, but an uncountably infinite set! If we cannot even count the number of worlds, how can they all have separate existence? The concept of `continuum' seems counter to the idea of all these worlds having `separate existence'. So to me, this seems like nonsense. A way to resolve this puzzle could be to turn all continuous spectra in quantum mechanics into discrete ones, by introducing a new scale so small that we cannot see it, so that particles `seem to have continuous momentum $k$' but are actually always quantized, with energy levels so dense that we cannot tell them apart. But then again: 1) this is never discussed in the many worlds literature; and 2) it amounts to changing quantum mechanics into a different fundamental theory, which runs counter to our initial aim of simply interpreting quantum mechanics (which we assumed to be a complete theory)!

A Mathematical Formulas and Tricks

A.1 Gaussian integration

• The basic Gaussian integral is:


$\int_{-\infty}^{\infty} dx\; e^{-\alpha x^2} = \sqrt{\frac{\pi}{\alpha}}$ .   (85)
To show this, we compute the integral in two different ways. Consider the area under the function $e^{-r^2}$ on the plane, in plane polar coordinates:

$\int dA\; e^{-r^2} = \int_0^\infty dr \int_0^{2\pi} d\theta\; r\, e^{-r^2} = 2\pi \int_0^\infty \frac{du}{2}\, e^{-u} = \pi \int_0^\infty du\; e^{-u} = -\pi\, e^{-u}\Big|_0^\infty = \pi$ ,   (86)

where we defined $u = r^2$. Now compute this same integral in Cartesian coordinates $x = r\sin\theta$, $y = r\cos\theta$:

$\pi = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{-(x^2+y^2)} = \int_{-\infty}^{\infty} dx\; e^{-x^2} \int_{-\infty}^{\infty} dy\; e^{-y^2} = \left(\int_{-\infty}^{\infty} dx\; e^{-x^2}\right)^2$ .   (87)

Taking the square root, it follows that:


$\int_{-\infty}^{\infty} dx\; e^{-x^2} = \sqrt{\pi}$ .   (88)

To obtain the Gaussian integral with generic width, we simply rescale $x \to \sqrt{\alpha}\,x$ in the integrand, which gives back (85).

• From (85) we can also compute the following integral:


$\int_{-\infty}^{\infty} dx\; e^{-\alpha x^2 - i\beta x} = \sqrt{\frac{\pi}{\alpha}}\; e^{-\frac{\beta^2}{4\alpha}}$ .   (89)
For the proof of this, see section 2.4.

• Using (85), we can also compute integrals of the following type:


$\int_{-\infty}^{\infty} dx\; x^n\, e^{-\alpha x^2}$ .   (90)

First of all, we notice that this integral vanishes if n is odd. The reason is that in that
case the integrand is an odd function under x → −x. The integral of an odd function
over a range (−a, a) is zero. Since we integrate from (−∞, ∞), the result of this integral
is zero.
When n is even, we use the partial integration formula:
$\int_D u\, dv = uv\Big|_D - \int_D du\; v$ ,   (91)

38
where $D$ denotes the domain of integration. Noting that $d\!\left(e^{-\alpha x^2}\right) = -2\alpha x\, e^{-\alpha x^2}\, dx$, we can take $u = x$ and $v = e^{-\alpha x^2}/(-2\alpha)$. Applying (91), we get

$\int_{-\infty}^{\infty} dx\; x^2\, e^{-\alpha x^2} = \frac{x\, e^{-\alpha x^2}}{-2\alpha}\Bigg|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} dx\; \frac{e^{-\alpha x^2}}{-2\alpha} = \frac{1}{2\alpha}\sqrt{\frac{\pi}{\alpha}} = \frac{\sqrt{\pi}}{2\alpha^{3/2}}$ ,   (92)

where the boundary term vanished because the Gaussian function vanishes at infinity more rapidly than any polynomial. We can get the result for higher powers of $x$ by successive partial integrations. The result is:

$\int_{-\infty}^{\infty} dx\; x^{2n}\, e^{-\alpha x^2} = \frac{(2n)!}{n!}\, \frac{2\sqrt{\pi}}{(4\alpha)^{n+\frac{1}{2}}}$ .   (93)
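These formulas are easy to spot-check numerically. Here is a minimal sketch using scipy (my own check; the values of $\alpha$, $\beta$ and $n$ are arbitrary test choices):

```python
import numpy as np
from math import factorial
from scipy.integrate import quad

alpha, beta, n = 1.3, 0.7, 2     # arbitrary test values

# (85): int dx exp(-alpha x^2) = sqrt(pi/alpha)
val, _ = quad(lambda x: np.exp(-alpha * x**2), -np.inf, np.inf)
print(val, np.sqrt(np.pi / alpha))

# (89): int dx exp(-alpha x^2 - i beta x) = sqrt(pi/alpha) exp(-beta^2/(4 alpha)).
# The sine part of the integrand is odd and integrates to zero, so the
# cosine part carries the whole (real) result.
val, _ = quad(lambda x: np.exp(-alpha * x**2) * np.cos(beta * x), -np.inf, np.inf)
print(val, np.sqrt(np.pi / alpha) * np.exp(-beta**2 / (4 * alpha)))

# (93): int dx x^(2n) exp(-alpha x^2) = (2n)!/n! * 2 sqrt(pi) / (4 alpha)^(n+1/2)
val, _ = quad(lambda x: x**(2 * n) * np.exp(-alpha * x**2), -np.inf, np.inf)
print(val, factorial(2 * n) / factorial(n) * 2 * np.sqrt(np.pi) / (4 * alpha)**(n + 0.5))
```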

B Technicalities of Quantum Mechanical Measurements

B.1 Time Evolution Operators

Remember that the problem of quantum mechanics amounts to finding a solution $\Psi(x,t)$ of the TDSE (20), given an initial condition $\Psi_0(x)$. We can recast this problem in the language of operators, as follows: find a linear operator $U$ such that:

$U : \Psi_0(x) \mapsto \Psi(x,t) = U(t)\,\Psi_0(x)$   (94)

satisfies the TDSE. Here, $U$ is a time-dependent operator called the evolution operator. In a particular basis, and in the discrete case, it maps:

$\sum_n c_n\, \psi_n(x) \;\mapsto\; \sum_n c_n\, e^{-\frac{i}{\hbar} E_n t}\, \psi_n(x)$ .   (95)

The operator $U$ is easy to find:

$U(t) \equiv e^{-\frac{i}{\hbar} H t} \equiv \sum_{n=0}^{\infty} \frac{\left(-\frac{i}{\hbar} H t\right)^n}{n!}$ ,   (96)

where the exponential, which includes an operator (the $n$th power of the Hamiltonian), has been defined by its Taylor expansion. Although this requires some care, Taylor expansions can indeed be defined for linear operators (as for matrices) analogously to how they are defined for numbers:

$e^{\hat{Q}} \equiv \sum_{n=0}^{\infty} \frac{\hat{Q}^n}{n!}$ .   (97)

Now the claim is that $U(t)$, as defined in (96), does the job in (94): when $U(t)$ is applied to $\Psi_0(x)$, it gives the solution of the TDSE with that initial condition. Let us check this. If $\Psi(x,t)$ is to be given by (94) with $U$ as in (96), it must solve the TDSE:

$i\hbar\, \frac{\partial \Psi(x,t)}{\partial t} = i\hbar\, \frac{\partial U(t)}{\partial t}\, \Psi_0(x)$ .   (98)

Let us compute the time derivative:

$\frac{\partial U}{\partial t} = \frac{\partial}{\partial t}\left(e^{-\frac{i}{\hbar} H t}\right) = -\frac{i}{\hbar}\, H\, e^{-\frac{i}{\hbar} H t} = -\frac{i}{\hbar}\, H\, U(t)$ .   (99)
Filling this back into (98):

$i\hbar\, \frac{\partial \Psi(x,t)}{\partial t} = i\hbar \left(-\frac{i}{\hbar}\, H\right) U(t)\,\Psi_0(x) = H\, \Psi(x,t)$ ,   (100)

which is precisely the TDSE. Hence, we have shown that (96) does the job of solving the TDSE for us.
Let us now see what these formulas mean in practice. If we are given the initial condition $\Psi_0(x) = \sum_n c_n\, \psi_n(x)$, then:

$U(t)\, \Psi_0(x) = U(t) \left(\sum_n c_n\, \psi_n(x)\right) = \sum_n c_n\, \big(U(t)\, \psi_n(x)\big) = \sum_n c_n \left(e^{-\frac{i}{\hbar} H t}\, \psi_n(x)\right) = \sum_n c_n\, e^{-\frac{i}{\hbar} E_n t}\, \psi_n(x) = \Psi(x,t)$ ,   (101)

where we have used the fact that when H satises the TISE, Hψn = En ψn , and hence we
can replace any powers of H by powers of En when they act on ψn (also in the exponential,
by using the Taylor expansion we can see this term by term). Now the last expression in
(101) is indeed the usual solution of the TDSE.
$U$ happens to be a unitary operator, i.e. $UU^\dagger = U^\dagger U = 1$. This can be shown by computing the adjoint:

$U^\dagger = \left(e^{-\frac{i}{\hbar} H t}\right)^\dagger = e^{\left(-\frac{i}{\hbar} H t\right)^\dagger} = e^{+\frac{i}{\hbar} H t}$ ,   (102)

where we have used the fact that $H$ is Hermitian and that, again, to compute the adjoint of the exponential, we can use the Taylor expansion, compute the adjoint term by term, and resum the Taylor series. Since we get a plus sign in the exponential, when we multiply it with $U$, which contains a minus sign, the two cancel out and we get the unit operator. Because $U$ is unitary, it preserves amplitudes (and probabilities):

$\langle\Psi|\Psi\rangle = \langle U\Psi_0|U\Psi_0\rangle = \langle\Psi_0|U^\dagger U\,\Psi_0\rangle = \langle\Psi_0|\Psi_0\rangle$ .   (103)
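These properties are easy to verify numerically. The following sketch (my own, with a random Hermitian matrix standing in for the Hamiltonian in some finite basis) checks that $U(t) = e^{-\frac{i}{\hbar}Ht}$ is unitary and that it acts on the energy eigenbasis exactly as in (95):

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
rng = np.random.default_rng(2)

# A random Hermitian matrix standing in for H in some finite basis.
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2

t = 1.7
U = expm(-1j * H * t / hbar)          # U(t) = exp(-i H t / hbar), cf. (96)

# Unitarity: U U^dagger = 1, so probabilities are preserved, cf. (103).
print("unitary?", np.allclose(U @ U.conj().T, np.eye(4)))

# Evolution in the energy eigenbasis: c_n -> c_n exp(-i E_n t / hbar), cf. (95).
E, V = np.linalg.eigh(H)              # H psi_n = E_n psi_n
psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)

c0 = V.conj().T @ psi0                # coefficients c_n = <psi_n|Psi_0>
psi_t = V @ (np.exp(-1j * E * t / hbar) * c0)
print("matches expm?", np.allclose(psi_t, U @ psi0))   # True
```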

B.2 The Measurement Operator U


The measurement operator we were looking for in (77) can be explicitly written as follows:

$U = \sum_{jk} |a_k\rangle|r_{j+k}\rangle\, \langle r_j|\langle a_k|$ .   (104)

It is not hard to show that $U$ indeed satisfies (77). In fact, this operator is a combination of projectors onto $A$ and onto $R$:

$U = \sum_{jk} P^{(a)}_{k,k}\, P^{(r)}_{j,j+k}$   (105)

where

$P^{(r)}_{i,j}\, |r_l\rangle = \delta_{il}\, |r_j\rangle\,, \qquad P^{(a)}_{i,j}\, |a_l\rangle = \delta_{il}\, |a_j\rangle$ ,   (106)

so $P^{(a)}$ is defined in the same way as $P^{(r)}$, acting on the states $|a_l\rangle$.

Proof: We simply let $U$ act on the initial state of (79):

$U(|\psi\rangle|r_0\rangle) = \sum_j c_j\, U(|a_j\rangle|r_0\rangle)$ .   (107)

We compute the r.h.s. separately, using the decomposition into projectors (we do some relabeling of indices):

$\sum_j c_j\, U(|a_j\rangle|r_0\rangle) = \sum_l c_l \sum_{jk} P^{(a)}_{k,k}\, P^{(r)}_{j,j+k}\, |a_l\rangle|r_0\rangle = \sum_{jkl} c_l \left(P^{(a)}_{k,k}|a_l\rangle\right) \left(P^{(r)}_{j,j+k}|r_0\rangle\right) = \sum_k c_k\, |a_k\rangle|r_k\rangle$ ,   (108)

(r)
where in the second equality sign we used the fact that P acts only on eigenstates of R
(a)
and P only on eigenstates of A. In the last formula we used the property (106). This
is precisely what we wanted to show.
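A small numerical sketch of this construction (my own; note that, to keep $U$ unitary on the finite-dimensional space, I take the reading index $j + k$ modulo $n + 1$, an assumption that does not affect (77) for the physical initial states):

```python
import numpy as np

n = 3                                  # number of system eigenstates a_1..a_n
dS, dM = n, n + 1                      # dim H_S and dim H_M (readings r_0..r_n)

def ket(i, d):
    v = np.zeros(d)
    v[i] = 1.0
    return v

# U = sum_{jk} |a_k>|r_{j+k}><r_j|<a_k|, cf. (104); the a-index k = 0..n-1
# labels a_{k+1}, so the reading shift is by k+1, taken mod n+1 (assumption).
U = np.zeros((dS * dM, dS * dM))
for k in range(dS):
    for j in range(dM):
        U += np.outer(np.kron(ket(k, dS), ket((j + k + 1) % dM, dM)),
                      np.kron(ket(k, dS), ket(j, dM)))

# Check (77): U (|a_i>|r_0>) = |a_i>|r_i>, here for i = 2.
i = 2
after = U @ np.kron(ket(i - 1, dS), ket(0, dM))
print("(77) holds?", np.allclose(after, np.kron(ket(i - 1, dS), ket(i, dM))))

print("unitary?", np.allclose(U @ U.T, np.eye(dS * dM)))

# Applied to a superposition, U produces the entangled state (79):
c = np.array([0.6, 0.8, 0.0])
final = U @ np.kron(c, ket(0, dM))
print("entangled branches at indices:", np.nonzero(np.round(final, 12))[0])
```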

References

[1] D.J. Griffiths, Introduction to Quantum Mechanics, Pearson, 2nd Edition.

[2] D.C. Giancoli, Physics: Principles with Applications, Pearson, 6th Edition.

[3] B.H. Bransden and C.J. Joachain, Quantum Mechanics, Addison-Wesley, 2nd Edition.

[4] D. Dieks, Filosofie/grondslagen van de natuurkunde (Philosophy/Foundations of Physics), Utrecht University, 2008-2009.

