
Notes on PSO201A (Quantum Physics; 2019-20-II)

Anjan K. Gupta, Department of Physics, IIT Kanpur

1 The beginning of QM
I like to start this course with a remark by the famous Nobel laureate physicist Albert
Michelson (of Michelson-Morley fame) to an audience in 1894. He said, "It seems
probable that most of the grand underlying principles have been firmly established and that
further advances are to be sought chiefly in the rigorous applications of these principles
to all phenomena which come under our notice..... The future truths of physics are to be
looked for in the sixth place of decimals."
This was a very inappropriate time for such a statement: Röntgen discovered X-rays
in 1895, followed by Becquerel's discovery of radioactivity in 1896. In 1897, Thomson
discovered the electron, which changed the course of developments with immediate ques-
tions such as how a neutral atom could form. The plum pudding model of Thomson was
proved wrong by Rutherford by scattering α-particles from gold foils, leading to a nuclear
model with the electrons bound to a positive nucleus by the Coulomb force and orbiting it
like planets around the Sun. But this was problematic, as an accelerating charged particle
radiates, which raised questions about the stability of an atom. Bohr proposed a model
that could explain the spectral lines of hydrogen. Some of these excitements can be relived
by reading the first 5 chapters of the book "The Second Creation" by Crease and Mann. One can
easily google the chronology of the early developments in quantum mechanics that
began in the last decade of the 19th century.
The experiments and theories that eventually led to various aspects of quantum me-
chanics can be broadly divided into two categories, namely "particle nature of waves"
and "wave nature of particles". We review some of these experiments below to convey
the unease over wave-particle duality.

1.1 Particle nature of waves


There are several experimental results that could only be explained by incorporating
a particle nature of waves, in the sense that one can exchange energy (and momentum)
with these waves only in quanta of a certain amount, or that these waves carry energy as an
integer number of such quanta. These experiments include: 1) black body radiation, 2) the photo-
electric effect, 3) certain aspects of X-rays, 4) the heat capacity of solids, and 5) light detection
as quanta of energy and single-photon interference. Below we discuss these experiments
and how they are explained by the particle or quantum nature of waves.

1.1.1 Black body radiation
Heated objects emit black body radiation with a characteristic spectrum determined by
temperature. A black body is an object that absorbs all the em-radiation falling on it.
This is an idealization of practical situations, and the ideal black body spectrum can
be explained theoretically. The spectrum is defined by I(λ)dλ, the em-energy emitted
in the wavelength window λ to λ + dλ, or by ρ(ν)dν, the energy emitted
in the frequency range ν to ν + dν.
A blackbody can be constructed using highly absorbing (non-transmitting) walls.
Further, one can geometrically confine all em-radiation to within the cavity by ensuring all
the light scattered from the walls stays inside. One possible construction is as shown. This is an
idealization; in reality, if the cavity is made of a particular material with a characteristic
emission spectrum, the radiation inside will not have the typical blackbody characteristics.
We are not concerned here with such non-ideal situations.

Figure 1: The blackbody spectrum, i.e. energy density vs λ, at two different temperatures.
The peak wavelengths are marked by λm1 and λm2.

Blackbody spectrum has been measured in laboratories. When Planck worked on
his theory, several experimental data sets were available, and they disagreed with the
existing theories. Before Planck's theory, the blackbody spectrum, as shown, was known
to have certain characteristic features: a maximum at a particular wavelength λm that
depends on temperature as

λm T = 2.898 × 10−3 m-K. (1)

This is known as Wien's displacement law. For the Sun, λm ≈ 5100 Å, leading to T ≈ 5700 K.
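As a quick numerical illustration, Wien's displacement law fixes the temperature once the peak wavelength is known. A minimal Python sketch (the solar λm value is the one quoted above):

```python
# Wien's displacement law: lambda_m * T = 2.898e-3 m*K.
# Quick check of the Sun's surface temperature from its peak wavelength.
WIEN_B = 2.898e-3          # m*K, Wien's displacement constant
lam_m = 5100e-10           # m, peak wavelength of the solar spectrum (from the text)
T_sun = WIEN_B / lam_m     # K
print(f"T_sun = {T_sun:.0f} K")   # ~5700 K
```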
In fact Wien had already argued about the general features of the blackbody spectrum
using thermodynamics arguments. Another known law, again from thermodynamics ar-
guments, was Stefan-Boltzmann law of radiation, i.e. the power emitted by a black body
per unit surface area is given by

u = σT 4 (2)

with σ = 5.67 × 10−8 W/m2 -K4 . Also, before Planck two attempts were made to un-
derstand blackbody spectrum which were only partially successful. One was a theory by
Rayleigh-Jeans and another by Wien.
With this as background let us discuss how we understand the blackbody radiation
now. In order to find the energy density due to em-waves inside a cavity in thermal

equilibrium at temperature T we need to find two things: 1) the number of em-modes
per unit volume in the frequency range ν to ν + dν, i.e. N (ν)dν, and 2) the average energy
contained in each mode at temperature T . The net energy per unit volume in the frequency
range ν to ν + dν will be the product of the two, i.e.

ρ(ν)dν = N (ν)dν × (average energy in each mode at temperature T) (3)

So our first task is to find N (ν)dν. For this we consider stationary em-waves in a cube
shaped cavity. It turns out that the mode density is actually independent of the exact
shape for macroscopic objects.

Figure 2: Left: The stationary modes of a 1D cavity of length L in real space. Right:
The discrete allowed modes in k-space.

Before analyzing a three-dimensional cubic cavity, let's first discuss plane stationary
waves in one dimension, confined between x = 0 and L. The waves will exhibit nodes at
the two ends, and thus the wavelengths of the permitted modes will be given by λn = 2L/n,
see figure. The associated wave-numbers are kn = 2π/λn = nπ/L. Hence
the separation between the wave-numbers of neighboring modes is ∆k = π/L. We also
have νn λn = c, so νn = ckn /2π and ∆ν = c/2L. Since L is a large length scale,
these modes will be extremely close to each other. Thus the number of modes in a small
dk-interval will be given by 2 × dk/(π/L), the factor of two being due to the two different
polarizations of an em-wave. In 2D one gets kx = nx π/L and ky = ny π/L with nx , ny as
positive non-zero integers. If we look at the modes of a 2D square-shaped cavity in the kx -ky
plane, we get a square grid of discrete points with separation ∆kx = ∆ky = π/L. The
number of modes between k and k + dk can be found by counting the discrete
points in one quarter of a circular shell of radius k and thickness dk. This shell has
an area 2πkdk/4. Each discrete point in 2D k-space occupies an effective area (π/L)²,
giving the number of discrete points as (πkdk/2)/(π/L)².
In a 3D cubic cavity one gets kx = nx π/L, ky = ny π/L and kz = nz π/L, giving a
3D grid of points in 3D k-space with each point effectively occupying a volume (π/L)³.
For a given point the magnitude of the wave-vector is |k| = (kx² + ky² + kz²)^(1/2), which
determines the overall frequency and wavelength of this particular mode. To find the
total number of modes between k and k + dk, we need the volume of one octant of a
spherical shell of radius k and thickness dk, i.e. 4πk²dk/8.
With the volume of each discrete k-point as (π/L)³, we get the number of discrete k-points as
(πk²dk/2)/(π/L)³ = L³k²dk/2π². Now counting the two independent polarizations that each
discrete k-point exhibits, we get the number of modes per unit volume as N (k)dk = k²dk/π².

Figure 3: The discrete allowed modes in k-space for a 2D square cavity of side L. The ring-
like region of width dk and radius k encompasses a good number of such discrete points,
which can be found from (2πkdk/4)/(π/L)².

Using ν = ck/2π we get dk = (2π/c)dν and N (ν)dν = (2πν/c)²(2π/c)dν/π² = 8πν²dν/c³.
Similarly, in terms of λ, we get N (λ)dλ = (8π/λ⁴)dλ.
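The mode-counting argument above can be checked by brute force: count the discrete k-points of a cubic cavity that fall in a thin spherical shell and compare with L³k²dk/2π² (per polarization). A sketch in Python, with arbitrary illustrative values of L, k and dk:

```python
# Brute-force check of mode counting: count k-points (nx, ny, nz >= 1)
# of a cubic cavity with k <= |k| < k + dk, and compare with the
# thin-shell estimate L^3 k^2 dk / (2 pi^2) (per polarization).
# L, k, dk below are arbitrary test values, not from the text.
import math

L = 1.0
k, dk = 200.0, 2.0
n_max = int((k + dk) * L / math.pi) + 1

count = 0
for nx in range(1, n_max + 1):
    for ny in range(1, n_max + 1):
        for nz in range(1, n_max + 1):
            kmag = (math.pi / L) * math.sqrt(nx**2 + ny**2 + nz**2)
            if k <= kmag < k + dk:
                count += 1

predicted = L**3 * k**2 * dk / (2 * math.pi**2)
print(count, round(predicted))   # should agree to within a few percent
```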
Rayleigh-Jeans law: Now our task is to find the average energy contained, in thermal
equilibrium at temperature T , in each mode. This is where there is a discrepancy between
the classical expression for the average energy per mode and the quantum one. The classical
black body radiation law was derived by Rayleigh and Jeans using the equipartition theorem
to find the average energy contained in each mode. According to classical mechanics the energy in
each mode is proportional to the square of the amplitude of the electric field of that mode,
i.e. Energy = αE₀². The relative probability of E₀ taking a particular value is given
by the Boltzmann factor, i.e. exp(−Energy/kB T ). Thus the probability of
occupancy reduces exponentially with energy. Since classically E₀ can take any arbitrary
positive value (for the stationary mode), the average energy per mode will be given by
⟨E⟩ = ∫₀^∞ αE₀² exp(−αE₀²/kB T ) dE₀ / ∫₀^∞ exp(−αE₀²/kB T ) dE₀    (4)

which works out as kB T /2, i.e. independent of α. The mathematics here requires two
integrals, namely ∫₀^∞ exp(−βx²)dx = √(π/4β) and ∫₀^∞ x² exp(−βx²)dx = (1/2β)√(π/4β).
The second integral can be obtained by differentiating the first one wrt β.
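The α-independence of this average can also be verified numerically; a sketch using a plain Riemann sum (units with kB T = 1 are an arbitrary choice):

```python
# Numerical check of the equipartition average: for Energy = a*E0^2 with
# Boltzmann weight exp(-a*E0^2/kT), the average works out to kT/2 for
# any a. Arbitrary units with kT = 1; simple right-endpoint Riemann sum.
import math

def avg_energy(a, kT=1.0, upper=50.0, n=200000):
    dE = upper / n
    num = den = 0.0
    for i in range(1, n + 1):
        E0 = i * dE
        w = math.exp(-a * E0**2 / kT)
        num += a * E0**2 * w * dE
        den += w * dE
    return num / den

for a in (0.5, 1.0, 3.0):
    print(a, avg_energy(a))   # ~0.5 = kT/2 in each case
```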
This is the reasoning behind the so-called equipartition theorem: if the energy is
quadratic in a continuous system variable x, then the average energy at
temperature T associated with the x degree of freedom is kB T /2. This is what one uses when
writing the total energy of a monatomic ideal gas as 3kB T /2 per atom, which
comes from the three components of the momentum vector. For an em-wave mode there is
also the energy associated with the magnetic field B₀, which is again quadratic in B₀.
Thus we get the total energy per mode as kB T . This leads to the energy per unit volume in
the black body cavity from ν to ν + dν as ρ(ν)dν = (8πν²/c³)kB T dν. This is called the
classical Rayleigh-Jeans law. Although there is fair agreement between the Rayleigh-Jeans
law and experiments for small ν values, there are several problems: 1) ρ(ν) diverges with
ν, as opposed to experiments, which show a peak with an exponential decline at large ν.
2) The total energy density, obtained by integrating ρ(ν) over ν = 0 to ∞, diverges.
Wien's law: Wien argued from thermodynamic arguments [see M.K. Harbola, "The
genesis of quanta", Resonance, Feb 2008, p.134] that ρ(ν) must have a form like ν³ϕ(ν/T )
where ϕ(x) is some unknown function of x. In fact, people thinking of alternate theories
to classical thermodynamics believed in Boltzmann's theory very strongly and did
not want to sacrifice it. Wien further argued that the frequency of emitted em-radiation
from molecules must be proportional to their kinetic energy, i.e. mv²/2 = aν, and so the
intensity of radiation should be proportional to the number of molecules at that kinetic
energy, i.e. exp(−aν/kB T ). Thus he proposed ρ(ν) = (Aν³/c⁴) exp(−bν/kB T ) with A, b
as constants. One can also write this as ρ(λ) = (A′/λ⁵) exp(−b′/λT ). This function shows a
maximum at some λm , as seen in experiments; maximizing it leads to λm T = b′/5.
This gave a reasonable agreement with the experiments, with Wien's constant found from
experiments. Wien got the Nobel prize in 1911 for his contribution to the understanding of
black body radiation.
Planck’s theory of black body radiation: Planck proposed that em-radiation
comes in quanta and the E and B fields cannot take arbitrary values. It turns out that
the quantum theory of em-radiation is much more complex and it came much later as
quantum electrodynamics (QED) which quantized the em-fields. This is way beyond the
scope of this course.
Planck argued that for a given frequency-ν mode the energy can take only the values 0, hν, 2hν,
3hν, ...., i.e. nhν. The relative probability to have one quantum of energy is exp(−hν/kT ),
for two quanta it is exp(−2hν/kT ), and for n quanta it is exp(−nhν/kT ). So the average
energy associated with a frequency-ν mode will be
⟨u⟩ = Σ_{n=0}^∞ (nhν) exp(−nhν/kT ) / Σ_{n=0}^∞ exp(−nhν/kT )    (5)

The series in the numerator and denominator can be easily summed using Σ_{n=0}^∞ xⁿ =
1/(1 − x) and Σ_{n=0}^∞ nxⁿ = x/(1 − x)². The first is a geometric series and the second one
can be obtained by differentiating the first result wrt x. Here x = exp(−hν/kT ). Thus
the final average energy in the ν-frequency mode works out as hν/[exp(hν/kT ) − 1]. This
leads to the Planck’s law of black body radiation as:
ρ(ν)dν = (8πhν³/c³) · dν/[exp(hν/kT ) − 1].    (6)

This also leads to


ρ(λ)dλ = (8πhc/λ⁵) · dλ/[exp(hc/λkT ) − 1].    (7)
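The series summation behind Planck's average energy per mode, and its Rayleigh-Jeans limit, can be verified numerically; a sketch in Python with hν and kT as plain numbers:

```python
# Check of Planck's average energy per mode: the ratio of the two series
# equals h*nu / (exp(h*nu/kT) - 1), which tends to kT (Rayleigh-Jeans)
# for h*nu << kT and is exponentially suppressed for h*nu >> kT.
import math

def planck_avg_series(hnu, kT, n_terms=2000):
    """Direct summation of the two series in Eq. (5)."""
    x = math.exp(-hnu / kT)
    num = sum(n * hnu * x**n for n in range(n_terms))
    den = sum(x**n for n in range(n_terms))
    return num / den

def planck_avg_closed(hnu, kT):
    """Closed form h*nu / (exp(h*nu/kT) - 1)."""
    return hnu / (math.exp(hnu / kT) - 1.0)

print(planck_avg_series(0.5, 1.0), planck_avg_closed(0.5, 1.0))
print(planck_avg_closed(0.01, 1.0))   # ~1.0 = kT: Rayleigh-Jeans limit
```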

One can easily see that in the limit of large frequencies, i.e. hν ≫ kB T , one gets
Wien's law from Planck's expression, and in the opposite limit hν ≪ kB T it agrees with the
Rayleigh-Jeans law. One also gets the precise Wien's displacement law from Planck's
expression (homework) as hc/λm kB T = 4.965. From the then-known Wien's constant
one could easily find the value of h/kB . Another known law at that time was Stefan's
law, i.e. the energy radiated from a black body surface per unit area and per unit time is
σT⁴, with the experimentally known σ value of 5.67 × 10⁻⁸ W m⁻² K⁻⁴. Stefan's law can
be easily derived from Planck's law, giving σ = 2π⁵kB⁴/(15c²h³). This helped Planck deduce
h = 6.55 × 10⁻³⁴ J-s and kB = 1.346 × 10⁻²³ J/K. This did not help much, as neither of these
was known independently by any other method. But the gas constant R was known,
which helped Planck deduce the Avogadro number NA = R/kB as 6.175 × 10²³ per mole. Also
the Faraday constant F , i.e. the charge in one mole of singly charged ions, was known, so Planck
could find e = F/NA . This prediction was soon confirmed by Geiger and Rutherford
by finding the charge on α-particles. The agreement between the two was within about 1%.
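Planck's chain of deductions can be retraced numerically with his quoted values of h and kB (R and F below are the modern values of the gas and Faraday constants):

```python
# Reproducing Planck's deductions with his 1900 value of k_B (quoted in
# the text): N_A = R/k_B and e = F/N_A. R and F were known at the time;
# the modern value of e (~1.602e-19 C) is the comparison point.
R = 8.314               # J/(mol K), gas constant
F = 96485.0             # C/mol, Faraday constant
kB_planck = 1.346e-23   # J/K, Planck's deduced value

NA = R / kB_planck      # Avogadro number
e = F / NA              # elementary charge
print(f"N_A = {NA:.3e} /mol, e = {e:.3e} C")   # e close to 1.602e-19 C
```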

1.1.2 Photoelectric effect


This experiment used a simple setup, as shown in figure, with two electrodes inside an
evacuated glass bulb. Light falling on one electrode leads to photo-induced emission of
electrons with a range of kinetic energies. The applied voltage prevents electrons below a certain
kinetic energy from reaching the other electrode; thus the current reduces as the
voltage is increased, and beyond a threshold voltage Vth the current becomes zero. It was
found experimentally that: 1) the current (I) depends on both the light intensity and wavelength,
as well as on the applied voltage; 2) for a fixed wavelength λ, there is a threshold voltage
Vth above which there is no current. The general behaviour of the current is depicted in figure.
The threshold voltage was found to vary with λ but not with the light intensity.

Figure 4: Photo-electric effect.

The explanation for this behaviour was given by Einstein in 1905, for which he got the
Nobel prize. Einstein proposed that an electron absorbs only one quantum of energy, hν, from the
incident light and comes out of the metal as a free electron with kinetic energy
T = hν − ϕ, where the work function ϕ is the energy barrier that prevents the electrons
from coming out of the metal. The metal has a continuous distribution of electrons below a
certain energy, called the Fermi energy EF , which is an energy ϕ below the vacuum energy. Thus
one gets electrons with kinetic energies ranging from zero (i.e. for electrons excited from
hν − ϕ below EF ) up to hν − ϕ (i.e. for electrons excited right from
EF ). As a result, with increasing V fewer electrons make it to the other electrode,
and above eVth = hν − ϕ no electron makes it to the second electrode. Thus we get
Vth = (h/e)ν − ϕ/e. This linear relation between Vth and ν was found to be experimentally
correct, with its slope well described by h/e. This could also help in deducing the work
function of the metal. The critical part of the argument was that the absorption of light
happens only in one quantum hν of energy, indicating the quantization of
energy associated with a light wave.
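The stopping-voltage relation eVth = hν − ϕ is easy to evaluate; in the sketch below ϕ = 2.3 eV (roughly sodium) and λ = 400 nm are illustrative values, not from the text:

```python
# Stopping voltage e*V_th = h*nu - phi. Illustrative numbers:
# phi = 2.3 eV (roughly sodium, an assumed value) and light at 400 nm.
h = 6.626e-34      # J s
e = 1.602e-19      # C
c = 2.998e8        # m/s

lam = 400e-9               # m, illustrative wavelength
phi_eV = 2.3               # eV, assumed work function
nu = c / lam               # Hz
Vth = (h / e) * nu - phi_eV    # volts, since (h/e)*nu is numerically in eV
print(f"photon energy = {h*nu/e:.2f} eV, V_th = {Vth:.2f} V")
```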
Another interesting consequence of this experiment is the technique of ”photo-electron
spectroscopy” which helps us probe the number of electronic states available in a conductor
at different energies below the Fermi energy. This has been a very successful probe, as it
has helped us understand the physics of the electrons in many solids. In modern photo-
electron spectroscopy experiments one uses highly monochromatic light (from lasers or
more sophisticated sources) together with very sophisticated electron energy analyzers. This helps
one deduce the distribution of electronic states in solids with great energy resolution. In
fact, one can also get momentum information of the electrons with detectors capable of
fine angular resolution.

Figure 5: Photoemission spectroscopy.

1.1.3 X-rays
X-rays were discovered by Röntgen in 1895. They are generated by a process called
Bremsstrahlung, i.e. the sudden deceleration of electrons in a metal leading to em-radiation.
This is actually an inverse photoelectric effect, as high-energy electrons impinging on a metal
electrode lead to the emission of X-rays. X-ray emission is also related to atomic transitions.
There is one interesting fact that had consequences for the development of quantum mechanics.
This was about the detailed distribution of X-ray intensity as a function of wavelength, λ. It was found
that the emitted X-ray intensity vanished below a certain wavelength λth . Further, this λth
depended on the kinetic energy of the impinging electrons, or rather the voltage V
through which the electrons were accelerated before striking the metal electrode. Thus
λth was found to depend on V as,

λth = 12.4 Å / V (in kV).    (8)

Figure 6: X-rays generation and characteristics.

This behavior can be readily understood as the maximum energy that an X-ray photon
can carry will be limited by the electron’s kinetic energy. The photon energy can always be

less, as the electron can lose energy to other non-radiative processes. Thus the maximum
photon energy hc/λth = eV or
λth = hc/eV = (6.6 × 10⁻³⁴ × 3 × 10⁸)/(1.6 × 10⁻¹⁹ × V (in kV) × 10³) m = (12.4 × 10⁻¹⁰/V (in kV)) m = (12.4/V (in kV)) Å.    (9)
This agrees nicely with experiments. X-rays have many applications; the two noteworthy
ones are in medical diagnostics and in crystal structure analysis. In fact, the above
relation is useful to remember as a conversion formula between energy (in keV) and
wavelength (in Å).
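The conversion formula can be derived directly from the constants; a minimal Python check:

```python
# The handy conversion lambda(Angstrom) = 12.4 / E(keV), from lambda = hc/E.
h = 6.626e-34      # J s
c = 2.998e8        # m/s
e = 1.602e-19      # C

def kev_to_angstrom(E_keV):
    """Photon wavelength in Angstrom for photon energy in keV."""
    E = E_keV * 1e3 * e            # J
    return h * c / E * 1e10        # Angstrom

print(kev_to_angstrom(1.0))   # ~12.4: the constant in Eq. (8)
print(kev_to_angstrom(10.0))  # ~1.24 Angstrom, a typical X-ray wavelength
```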

1.1.4 Compton scattering


This experiment is about the scattering of X-rays by the nearly free electrons in a metal, which
initially have negligible kinetic energy. This effect brought out more particle-like aspects of
light, as one needs to use the quantized energy as well as the momentum of a photon in order to
understand Compton scattering. For the Compton shift to be observable, the energy exchange
between the X-ray photon and the electron must be large, taking the electron into the relativistic
regime. Thus one needs to use the relativistic energy-momentum relation for the electron, i.e. E² =
p²c² + m₀²c⁴. Here E is the total energy, including the rest mass energy m₀c², and p is
the momentum. For a photon the energy is given by hν = hc/λ while its momentum is
hν/c = h/λ.

Figure 7: The left schematic shows the X-ray photon and electron before scattering,
the middle one shows them after scattering. The right schematic is the experimental
setup showing the incident X-ray beam and the Compton scattered beam analyzed by the
detector at scattering angle θ.

One needs to write the energy and momentum conservation for a 2D scattering problem
as shown in figure. For electron initially at rest, the energy conservation gives:
hν + m₀c² = √(p²c² + m₀²c⁴) + hν′    (10)

The momentum conservation along the direction of initial photon momentum gives
h/λ = (h/λ′) cos θ + p cos ϕ.    (11)
The momentum conservation in the direction perpendicular to the initial photon momen-
tum gives
0 = −(h/λ′) sin θ + p sin ϕ.    (12)

We thus get only three equations, but there are four unknowns, namely p, ϕ, λ′ and θ.
We can eliminate two unknowns and find a relation between the remaining two. The choice
is dictated by the experimental setup, where one measures the scattered X-ray spectrum at
different scattering angles. Thus we eliminate the electron variables p and ϕ to find a
relation between λ′ and θ. From the last two equations, we get
p² cos²ϕ + p² sin²ϕ = [(h/λ′) sin θ]² + [(h/λ) − (h/λ′) cos θ]²
or p² = h²/λ′² + h²/λ² − (2h²/λλ′) cos θ
or p²c² = h²ν² + h²ν′² − 2h²νν′ cos θ, after using λ = c/ν.

Using the last relation in the first equation, we get

h²ν² + h²ν′² − 2h²νν′ cos θ + m₀²c⁴ = (hν − hν′ + m₀c²)²
or −2h²νν′ cos θ = −2h²νν′ − 2hν′m₀c² + 2hνm₀c²
or cos θ = 1 + (m₀c²/h)(1/ν − 1/ν′)    (13)
or λ′ − λ = (h/m₀c)(1 − cos θ) = λC (1 − cos θ).    (14)

Here λC = h/m₀c is called the Compton wavelength of the electron; its value is 0.024 Å.
Thus we see that the change in the wavelength of the X-rays is rather small. In order to
resolve this change beyond the linewidth of the X-rays used, one has to use X-rays of very
small wavelength, i.e. high energy. In the actual experiment the X-ray beam strikes a metal foil
containing nearly free electrons as well as atoms. One measures the X-ray spectrum, i.e. intensity as a
function of wavelength, at different scattering angle θ values. One sees two well separated
peaks in a typical spectrum: one elastic (i.e. at λ) and the other inelastic, at λ′ as
dictated by the Compton formula. The elastic peak arises from Compton scattering off
atoms, which are much heavier than electrons and thus have a much smaller Compton
wavelength; hence there is a negligible shift in λ for the X-rays that Compton scatter from
atoms.
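The Compton shift formula (14) gives very small wavelength changes, as noted above; a quick numerical sketch:

```python
# Compton shift lambda' - lambda = lambda_C * (1 - cos theta), with
# lambda_C = h/(m0 c) ~ 0.024 Angstrom for the electron.
import math

h = 6.626e-34       # J s
m0 = 9.109e-31      # kg, electron rest mass
c = 2.998e8         # m/s

lam_C = h / (m0 * c)            # Compton wavelength, m
for theta_deg in (45, 90, 180):
    dlam = lam_C * (1 - math.cos(math.radians(theta_deg)))
    print(theta_deg, dlam * 1e10)   # shift in Angstrom

# At theta = 90 deg the shift equals lambda_C itself, ~0.024 Angstrom.
```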

Figure 8: The measured X-ray intensity as a function of wavelength at different scattering
angle θ values.

1.2 Specific-heat of solids due to lattice
Solids get contributions to their specific heat from many internal degrees of freedom, includ-
ing lattice vibrations and electrons (in metals). For insulators the specific heat is dictated
primarily by the lattice contribution, which we discuss here as it also played some role in the
development of quantum mechanics. There was a certain understanding of the specific heat
derived from the classical equipartition theorem. For a three-dimensional solid containing N
atoms there are 6 bulk degrees of freedom (dof) that lead to bulk
translational and rotational kinetic energies arising from the three linear momenta and three
angular momenta. The remaining 3N − 6 dof dictate as many lattice vibration
modes. An example would be a triangular molecule with three atoms at its
corners; it exhibits 3 independent vibration modes.
Each vibration mode has two forms of energy associated with it, namely kinetic
energy associated with the mode-momentum (p) and elastic potential energy associated
with the mode-displacement (x), as is the case for a single oscillator of fixed frequency.
The energy is quadratic in both, i.e. E = p²/2m + kx²/2. Thus from the classical
equipartition theorem one gets kB T /2 from each of the x and p dof associated with each mode, leading
to a per-mode average energy at temperature T of kB T . This leads to an average total
energy ⟨U ⟩ = (3N − 6)kB T and thus the molar specific heat c = ∂⟨U ⟩/∂T = 3NA kB =
3R. Here R is the gas constant. This temperature-independent classical expectation
is known as the Dulong-Petit law. But this does not agree with experiments, as the low
temperature specific heat of insulating solids is much smaller than 3R and it vanishes as
T → 0. Einstein proposed a quantum theory asserting that the lattice vibration energy
is actually quantized: a frequency-ν vibration mode takes energy in quanta of
hν, just like photons. Such a quantum of lattice vibration energy is analogously
called a phonon. This is indeed the case, as we shall see from the discussion of the quantum
simple harmonic oscillator. Thus the relative probability of a mode having energy nhν
is exp (−nhν/kB T ), and, just as in the blackbody radiation discussion, the
average energy associated with a mode at temperature T is given by
⟨u⟩ = Σ_{n=0}^∞ nhν exp (−nhν/kB T ) / Σ_{n=0}^∞ exp (−nhν/kB T ) = hν/[exp (hν/kB T ) − 1]    (15)

Figure 9: Temperature dependent specific heat of lattice.

Einstein assumed all the 3N − 6 lattice modes to be of the same frequency ν; thus the
total energy of all the modes will be (3N − 6)⟨u⟩ and so the molar specific heat of the
solid will be given by (ignoring 6 relative to NA ),
 
c = 3NA ∂⟨u⟩/∂T = −3NA hν exp (hν/kB T )/[exp (hν/kB T ) − 1]² × (−hν/kB T ²)
or c = 3R (hν/kB T )² exp (hν/kB T )/[exp (hν/kB T ) − 1]²
or c = 3R (TE /T )² exp (TE /T )/[exp (TE /T ) − 1]².    (16)
Here TE = hν/kB is a material-dependent fitting parameter called the Einstein temperature.
We can easily check the result in the high and low temperature limits. This showed a
much better agreement with the experiments. For T ≫ TE one gets c = 3R, consistent
with the classical Dulong-Petit law. For T ≪ TE one gets c = 3R(TE /T )² exp (−TE /T ), so c
does approach zero as T → 0. However, the exp (−TE /T ) dependence is not what one sees
at low temperatures experimentally; one rather finds c ∝ T ³ at low temperature.
Debye gave a more satisfactory theory incorporating the fact that all modes do not have
the same frequency; rather, one gets a systematic frequency-dependent density of modes
from the dispersion of lattice vibrations in real solids.
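Einstein's expression (16) and its two limits can be checked numerically; a sketch, with TE = 300 K as an arbitrary illustrative value:

```python
# Einstein's molar specific heat, Eq. (16):
# c = 3R (TE/T)^2 exp(TE/T) / (exp(TE/T) - 1)^2, checked in both limits.
# TE = 300 K is an arbitrary illustrative fit parameter.
import math

R = 8.314   # J/(mol K)

def c_einstein(T, TE=300.0):
    x = TE / T
    return 3 * R * x**2 * math.exp(x) / (math.exp(x) - 1.0)**2

print(c_einstein(3000.0))   # ~3R = 24.9: Dulong-Petit limit (T >> TE)
print(c_einstein(30.0))     # tiny: c -> 0 as T -> 0
```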

1.2.1 Franck-Hertz experiment

Figure 10: Franck-Hertz experiment schematic and the voltage-dependent plate current.

This experiment, done in 1914, demonstrated that the internal energies of an atom are quan-
tized, in the sense that atoms can gain energy in inelastic collisions only in certain
quanta of energy. A vacuum triode-tube with three electrodes, namely cathode, grid
and plate, was used as shown in figure. In an evacuated tube the plate current rises mono-
tonically as a function of the voltage V between cathode and grid. This happens as more
of the electrons emitted by the heated cathode reach the plate when V is increased. In the
experiment the tube contained mercury vapor, achieved by heating the tube to near
170 °C to evaporate mercury sealed in the tube. The accelerating electrons collide
with mercury atoms and lose some of their energy. It was found that the plate current
showed a non-monotonic behavior with V : in fact it showed a periodic pattern with a
4.9 V period together with a rising background, as shown in figure. This was interpreted
using the excitation energy of the mercury atom of 4.9 eV; thus electrons lose energy
in collisions with mercury atoms only in multiples of 4.9 eV. A given electron can lose
energy to multiple Hg atoms within the limit of its total kinetic energy.

1.2.2 Single photon interference


This is a rather modern experiment using Young's double-slit interference setup, but
with a very faint light source and a very sensitive light detector, so as to detect photons one
by one. What one sees here is the same pattern, but built up from discrete photon counts: the
rate at which the photons arrive at a particular position on the screen matches
the expected interference pattern. This leads to a particle-like picture of the photon, but
with wave-like properties. There are some questions which are useful to ponder over from
this experiment: what is the interference between? Two different photons? A photon
with itself? The different paths that a given photon can take? These
eventually lead to better insights into quantum mechanics.

1.3 Wave nature of particles and de Broglie hypothesis


Louis de Broglie proposed in 1923 in his Ph.D. thesis that matter particles actually behave
as waves. EM-waves or photons have a dual nature, with photons having momentum
p = h/λ and energy E = hν, as proved by the Compton effect experiments. The matter
particles also have a wave-like character with p = h/λ and E = hν, or we can also write
p = ℏk and E = ℏω. Here ℏ = h/2π, k = 2π/λ and ω = 2πν. For example, for an electron
with kinetic energy E = 1 keV we get λ = h/√(2mE) = 0.39 Å and ν = 2.4 × 10¹⁷ Hz. For
matter objects with large mass, λ is so small that these objects follow the laws of classical
physics. Even for electrons λ is quite small, so one has to look for suitable experiments to
verify their wave nature.
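The numbers quoted for a 1 keV electron follow directly; a non-relativistic sketch in Python:

```python
# De Broglie wavelength and frequency for a 1 keV electron:
# lambda = h / sqrt(2 m E) and nu = E/h (non-relativistic).
import math

h = 6.626e-34      # J s
m = 9.109e-31      # kg, electron mass
e = 1.602e-19      # C

E = 1e3 * e                        # 1 keV in J
lam = h / math.sqrt(2 * m * E)     # m
nu = E / h                         # Hz
print(lam * 1e10, nu)              # ~0.39 Angstrom, ~2.4e17 Hz
```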

1.3.1 Davisson and Germer and Thomson experiment: electron interference


X-rays, with wavelengths near 1 Å, were used by Bragg and others in 1912-13 to probe
the crystal structure of solids. Elsasser suggested in 1926 that crystalline solids provide a periodic
lattice with a periodicity appropriate for probing the wave nature of electrons.
Davisson and Germer conducted the first experiment in 1927, where they scattered an
electron beam from a nickel target and found that a peak in the scattered electron intensity
occurs at a particular angle (ϕ = 50°) for a particular energy of the electron beam (54 eV).
This is shown in the polar plots together with the experimental geometry in the figure. They
analyzed this using the interplanar distance in Ni, d = 0.91 Å, and the Bragg formula
2d sin θ = nλ with n = 1 and θ = 90° − (ϕ/2), as found by comparing the Bragg geometry
with the experiment. As a result they found λ = 1.65 Å, which matched well with the de
Broglie wavelength for 54 eV kinetic energy.
Thomson carried out a similar experiment in a powder-diffraction geometry on powdered
Ni and found that the electrons make diffraction rings on a screen placed beyond the
sample and perpendicular to the incident electron beam. A good agreement with the de
Broglie hypothesis was thus found.
In modern times people have been able to do other diffraction and double slit inter-
ference with electrons and even atoms. In fact there are many applications of electron

Figure 11: Davisson-Germer experiment.

Figure 12: Thomson experiment.

beams that exist precisely because of their wave nature. For instance, the scanning electron
microscope (SEM) gets a resolution in the sub-nanometer range precisely because one can get
electron beams with sub-Å de Broglie wavelength, so that the diffraction limit is at the
sub-Å scale. In fact, the eventual resolution limit in SEM comes from other consid-
erations, including the energy spread of and the inter-electron interactions in the electron beam.
The transmission electron microscope (TEM) is another important tool that utilizes the wave
nature of electrons to image solids with atomic resolution. Two important surface probes
are low energy electron diffraction (LEED) and reflection high-energy electron diffraction
(RHEED), which are used to monitor and track the atomic structure of the surface of
thin films during film growth/deposition.

2 Formulation of quantum mechanics


From the experiments described above and de Broglie's hypothesis we see: 1) the energy
possessed by various atomic and sub-atomic objects and by atomic lattice vibrations is quantized,
2) em-waves behave as both particle and wave, and 3) particles also behave
as waves. The last fact endows particles with the peculiar properties of waves.
This leads to some very counter-intuitive phenomena, and one of
these is the uncertainty principle.

2.1 Heisenberg’s uncertainty principle
On the basis of wave-particle duality and the de Broglie hypothesis, Heisenberg proposed that
the uncertainties in a given particle's position and momentum are inversely related; more
specifically, he stated ∆p·∆x ≥ ℏ/2. We start this discussion by considering a wave,
y(x, t) = A sin(kx − ωt) with k = 2π/λ and ω = 2πν. As defined, this wave exists
over the whole space and at all times. Now, to get a space-localized wave or disturbance
one has to superpose waves of different λ values. These interfere to give a
rather localized entity that can describe a particle-like object. To get a feel for this, let's
superpose two waves of slightly different k. So we have y₁(x, t) = A sin(k₁x − ω₁t) and
y₂(x, t) = A sin(k₂x − ω₂t), and

y(x, t) = y₁(x, t) + y₂(x, t) = A sin(k₁x − ω₁t) + A sin(k₂x − ω₂t)
= 2A sin[(k₁ + k₂)x/2 − (ω₁ + ω₂)t/2] cos[(k₁ − k₂)x/2 − (ω₁ − ω₂)t/2]    (17)
This at a fixed time, say t = 0 shows a beat-like pattern if one plots it as a function of
x as shown in figure below. The width of each wave-group ∆x will be given by ∆x =
2π/(k1 − k2 ). The wave is made up of two different waves with k-values as k1 and k2
and thus one can see that the spread in k value is ∆ ∼ |k1 − k2 |. Using de Broglie’s
hypothesis, the spread in momentum is ∆p = ℏ|k1 − k2 |. Thus we get product of the
spread (or uncertainty) in x and p space as ∆px = [2π/(k1 − k2 )] × ℏ|k1 − k2 | = 2πℏ = h.

Figure 13: The top and middle panels show two waves with slightly different wavelengths and
the lower panel shows their superposition exhibiting a beats-like pattern in real space.

In the above illustration the beats-pattern still fills the whole space; in order to
really get a space-confined wave one needs to superpose many more k-waves. This is better
illustrated using Fourier transforms to superpose complex waves, exp(ikx), over a
continuous range of k. If the amplitude of the k-component is g(k), we get a real-space
wave y(x) = ∫ g(k) exp(ikx)dk. This is the inverse Fourier transform of g(k). For instance,
if we choose g(k) = exp(−|k|/k0), which in k-space has a width ∆k ∼ k0, we get y(x) ∝
[x² + k0⁻²]⁻¹. This is a Lorentzian of width ∆x ∼ k0⁻¹. Thus we get the uncertainty product
∆p.∆x ∼ ℏ. We see the connection of the uncertainty principle with the spreads of
a complex function in x-space and p-space. This seems natural once we associate p with
λ, as one needs many different λ- or k-waves in order to create a localized wave that can
describe a particle.
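This localization argument is easy to check numerically; a small Python/NumPy sketch (not part of the original notes; the grid sizes and tolerances are arbitrary choices) superposes the waves exp(ikx) with amplitudes g(k) = exp(−|k|/k0) and compares the result with the Lorentzian quoted above:

```python
import numpy as np

# Superpose plane waves exp(ikx) with amplitudes g(k) = exp(-|k|/k0).
# The analytic inverse transform y(x) = ∫ g(k) exp(ikx) dk is a Lorentzian
# y(x) = (2/k0) / (x**2 + 1/k0**2), of width ~ 1/k0.
k0 = 5.0
k, dk = np.linspace(-100, 100, 20001, retstep=True)   # k-range >> k0
x = np.linspace(-3, 3, 201)
g = np.exp(-np.abs(k) / k0)

# direct quadrature of the superposition integral at each x
y = (g[None, :] * np.exp(1j * k[None, :] * x[:, None])).sum(axis=1).real * dk

analytic = (2.0 / k0) / (x**2 + 1.0 / k0**2)
assert np.allclose(y, analytic, rtol=1e-3, atol=1e-3)  # localized Lorentzian
```

A broad g(k) (large k0) gives a narrow y(x) and vice versa, which is the uncertainty product at work.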

2.2 Group velocity


Another aspect of this wave superposition to get a localized wave, which I now
call a wave-packet, is the concept of group velocity as opposed to phase velocity. For a
given wave dispersion relation ω(k), the phase velocity is vp = ω(k)/k while the group
velocity is vg = dω/dk. This is relevant because the wave-packet actually
moves with the group velocity, which for matter waves is different from the phase velocity.
To get a feel for the group velocity we revisit the two-wave superposition, i.e. Eq. 17, which
led to the beat-like pattern. The velocity associated with the second, cos-term,
which is responsible for the beat-like behavior, is (ω1 − ω2)/(k1 − k2); in the limit of
close-by k-values this is precisely the group velocity vg = dω/dk. This is clearly
different from the phase velocity vp = ω/k. In fact the group velocity can even have a
sign opposite to that of the phase velocity. This was illustrated in
the lecture using Mathematica simulations.
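A minimal numerical sketch of the distinction, using the matter-wave dispersion ω(k) = ℏk²/2m in units where ℏ = m = 1 (the chosen k value is arbitrary):

```python
import numpy as np

hbar, m = 1.0, 1.0                 # work in units where ħ = m = 1
kc = 2.0                           # wave-vector at the packet's centre

omega = lambda k: hbar * k**2 / (2 * m)   # matter-wave dispersion ω(k)

v_phase = omega(kc) / kc                  # ω/k = ħk/2m
dk = 1e-6
v_group = (omega(kc + dk) - omega(kc - dk)) / (2 * dk)   # numerical dω/dk

assert np.isclose(v_phase, hbar * kc / (2 * m))
assert np.isclose(v_group, hbar * kc / m, atol=1e-6)   # vg = ħk/m = 2 vp
```

For this quadratic dispersion the packet moves exactly twice as fast as the individual wave crests.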

2.3 Experimental illustration of uncertainty principle


Above we discussed a somewhat mathematical approach to the uncertainty principle based on
the construction of a localized particle-like entity made up of waves of different wavelengths.
We now discuss two experimental illustrations of the uncertainty principle. The rigorous
mathematical justification of both will eventually follow from Fourier transforms.
In fact Heisenberg arrived at this hypothesis from the analysis of one such experiment.
Diffraction experiment: When one tries to localize a matter-wave in real space, it
inevitably leads to a spread or uncertainty in momentum. The attempt to localize is made
by putting a small slit of width ∆x = d in the path of a plane wave. In this case the wave
undergoes diffraction, resulting in a spread in the x-component of its momentum. From
the width of the central maximum in the diffraction pattern on a screen kept far away,
one finds that the angular spread (from −θ0 to +θ0) of the central maximum is given by
d sin θ0 = λ, with λ the wavelength of the plane wave. This amounts to a momentum
spread ∆px = (h/λ) sin θ0. Thus the uncertainty product is ∆px.∆x = (h/λ) sin θ0 × d.
Using d sin θ0 = λ we get ∆px.∆x = h. So we see that reducing ∆x leads to an increase
in ∆px, as expected from the uncertainty relation.
Microscope: Let's try to measure a particle's position using an optical mi-
croscope, where one shines light on the particle and part of the light scattered from the
particle is directed towards the objective lens of a simple microscope. We use light of
wavelength λ. Since we detect photons in an angular spread given by the size of the lens,
this results in a momentum uncertainty (or spread) in px of the detected photons

Figure 14: The left schematic shows the diffraction of a plane wave when one tries to
constrain it using a slit. The right figure shows the objective lens of an optical microscope
which focuses the photons scattered from a particle in order to measure the particle’s
position.

given by ∆px = (h/λ) sin θ. This means that the photons we detect would
cause this much uncertainty in the particle's momentum. The lens has a diffraction-limited
resolution of λ/ sin θ, so the particle will be detected with a minimum position uncer-
tainty of ∆x = λ/ sin θ. Thus we get the uncertainty product ∆px.∆x = h. This
means that an effort to measure the position of a particle with precision ∆x leads
to an uncertainty in its momentum of order h/∆x. This experiment makes the result
appear as an experimental limitation; however, the analysis points towards an inherent
limitation. Measurements complicate, and may overshadow, this inherent nature, which
actually arises from the wave nature of particles.

2.4 Complementarity principle


The uncertainty principle is actually more general and applies to many pairs of complementary
quantities, variables or observables. Some variables that complement each other to make
up a classical description are actually mutually exclusive. Some such complementary
pairs are:
1) position (x) and momentum (px): ∆x.∆px ≥ ℏ/2,
2) energy (E) and time (t): ∆E.∆t ≥ ℏ/2,
3) angular momentum (L) and angle (θ): ∆L.∆θ ≥ ℏ/2.
We'll discuss the time-energy uncertainty later. In the above relations the formal definition
of the uncertainty in a variable z is ∆z² = ⟨z²⟩ − ⟨z⟩², i.e. it represents the root-mean-square
deviation from the mean.
Invariably the product of these complementary quantities has dimensions of action, i.e.
of Planck's constant h, or Joule-second. Action is an important variable in classical mechanics
as well as quantum mechanics. In classical mechanics one can derive Newton's laws from the
least action principle, which is often found to be more convenient. In the path integral
formulation of quantum mechanics one starts with the action defined as S = ∫ L dt, with L
the Lagrangian. The probability amplitude for a given path is given by exp(iS/ℏ). From
this point of view Planck's constant is also called the quantum of action. In fact one can
state a rough criterion as follows:

If for a physical system any natural dynamical variable which has the dimensions of
action assumes a numerical value comparable to Planck’s constant h then the behavior of
the system must be described by quantum mechanics. On the other hand if every variable
with action dimensions is very large in comparison to h, then the laws of classical physics
are valid with good accuracy.

2.5 Minimum energy estimates using uncertainty principle


The uncertainty principle cannot be violated, and thus any system or experi-
ment will work within the limits dictated by this principle. For instance, a particle confined
to a certain region of space will have a finite momentum spread and thus a finite kinetic energy.
Also, if one uses a particle beam to probe the structure of various entities with a required
resolution, the probing particle needs to have a de Broglie wavelength smaller than the desired
resolution (for example in an SEM). Thus one can estimate the minimum energy the probing
particle should have. Here we discuss another example that uses the uncertainty principle
to estimate the minimum energy that a particle in a simple harmonic potential can have.

Figure 15: The probability distribution of the particle wave in x- and p-space together with
the potential and kinetic energies in the two spaces. A finite spread of probability
away from the origin leads to finite kinetic and potential energies in the minimum energy
state, which is also called the ground state.

We begin with the total energy expression, E = (mω²x²/2) + (p²/2m). The particle
has to have non-zero ⟨p²⟩ and ⟨x²⟩ in order to have a finite ∆x.∆p. What this means
is that the particle in its ground state will not be localized at one point in either x or p.
So the particle will have a probability to take up various x or p values, leading to a finite
spread. This aspect is driven by the uncertainty principle and the underlying physics, as we
shall see later, is basically the wave nature of the particle. From symmetry and
for minimum energy we can assert that ⟨p⟩ = 0 and ⟨x⟩ = 0. Remember, ⟨p²⟩ = ∆p² + ⟨p⟩²
assumes a minimum value (for minimum KE) when ⟨p⟩² is minimum, i.e. zero. The
same logic applied to the potential energy term gives ⟨x⟩ = 0 for minimum potential energy.
Thus we get ⟨p²⟩ = ∆p² and ⟨x²⟩ = ∆x². We minimize the total energy keeping the
minimum uncertainty product, i.e. ∆x.∆p = ℏ/2 or ∆p = ℏ/2∆x. These arguments lead
to

E = (1/2)mω²∆x² + ℏ²/(8m∆x²).

Minimizing with respect to ∆x, i.e. ∂E/∂∆x = 0, leads to ∆x² = ℏ/2mω and eventually
we get Emin = ℏω/2. This estimate of the minimum turns out
to be exact for the SHO, as we shall see later from the detailed solution.
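The minimization above can also be done numerically; a small sketch in Python (the choice of units ℏ = m = ω = 1 and the grid are arbitrary):

```python
import numpy as np

hbar, m, w = 1.0, 1.0, 1.0   # ħ, mass, oscillator frequency (ħ = m = ω = 1 units)

# E(Δx) = ½ m ω² Δx² + ħ²/(8 m Δx²), using Δp = ħ/(2Δx) at minimum uncertainty
dx = np.linspace(0.05, 5.0, 200001)
E = 0.5 * m * w**2 * dx**2 + hbar**2 / (8 * m * dx**2)

i = np.argmin(E)
assert abs(E[i] - 0.5 * hbar * w) < 1e-6          # E_min = ħω/2
assert abs(dx[i]**2 - hbar / (2 * m * w)) < 1e-3  # Δx² = ħ/(2mω)
```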

2.6 Matter waves and postulates of quantum mechanics
If we accept the wave description for matter particles, the natural questions to ask are:
what quantity or variable does this wave represent, what equation governs the dynamics
of this variable, and what is the prescription for deducing the experimentally
measurable quantities from it? For an em-wave we know that E⃗ and B⃗ describe
it, in the classical sense at least, and Maxwell's equations describe the dynamics of the E⃗
and B⃗ fields. Guided by em-waves, here are the starting postulates of quantum mechanics.
This part more or less follows the Cohen-Tannoudji book. Here I list these postulates.
1) The particle and wave aspects of light (or matter particles) are inseparable. Light
behaves simultaneously like a wave and like a flux of particles. The wave enables us to
calculate the probability of manifestation of a particle.
2) Predictions about the behavior of a photon (or matter particle) can only be prob-
abilistic.
3) The information about the photon at time t is given by the wave E(⃗r, t), which is
a solution of Maxwell's equations. We say that this wave characterizes the state of the
photons at time t. E(⃗r, t) is the amplitude for a photon to be found at time t and position
⃗r, which means that the corresponding probability is proportional to |E(⃗r, t)|².
4) This analogy between light and matter is a good background to introduce the wave-
function in analogy with E(⃗r, t). Particles behave like waves with λ = h/p and ν = E/h,
as suggested by de Broglie. This also gives p = ℏk, with k = 2π/λ the wave vector, and
E = ℏω.
5) The quantum state of a particle is characterized by a wave-function ψ(⃗r, t) which
contains all the information it is possible to obtain about the particle.
6) ψ(⃗r, t) is the probability amplitude of the particle's presence. So the probability of
finding the particle in a small volume dτ at position ⃗r and at time t is given by dP(⃗r, t) =
|ψ(⃗r, t)|²dτ, i.e. |ψ(⃗r, t)|² is the probability density. Since the probability of finding the
particle somewhere should be one, this means ∫|ψ(⃗r, t)|²dτ = 1. We can normalize ψ by
multiplying by a constant C such that C²∫|ψ(⃗r, t)|²dτ = 1. For this it is required
that any wave-function be square integrable, i.e. ∫|ψ(⃗r, t)|²dτ should be finite.
7) With this description, and as compared to classical mechanics, the particle is now
described by an infinite number of variables: ψ(⃗r, t) is a continuous function, while
classically we only need ⃗v and ⃗r at time t.
8) This is analogous to light and E(⃗r, t) in the sense that |E|² represents the photon's
probability density. Also, just like E, ψ follows the superposition principle: ψ1 + ψ2 describes
the interference of two wave-functions of the same particle, with |ψ1 + ψ2|² the
resulting probability density.
9) ψ(⃗r, t) allows one to find the probabilities of outcomes. The experimental verifica-
tion must then be founded on the repetition of a large number of identical experiments.
10) ψ(⃗r, t) is genuinely complex in quantum physics, unlike E for em-waves, where the
complex notation merely helps in solving the wave equation systematically. Actually, the
precise definition of the complex quantum state of radiation can only be given in the
framework of quantum electrodynamics, which is beyond the scope of this course.
11) Measurements are very important in quantum mechanics as they seem to interfere
with the natural state of the system. So let’s discuss them more systematically.
12) A given measurement yields results that belong to a certain set of characteristic
results called eigen-values, i.e. {a}. This set can be discrete or continuous depending on
the system and the type of measurement.
13) With each eigen-value there is an associated eigen-state wave-function ψa(⃗r). So
if a measurement yields a value 'a0', the wave-function afterwards is ψa0(⃗r). Any
further measurement, without any time evolution in between, will always give the same
value 'a0'. It is always possible to write a general wave-function as a linear combination
of eigen-functions, i.e. ψ(⃗r, t0) = Σa Ca ψa(⃗r). Then the probability of a measurement
giving the result 'a0' is Pa0 = |Ca0|²/Σa|Ca|².

Illustration: The idea of measurement can be illustrated using polarizers in the path
of a single photon. Consider two polarizers with their easy axes at an angle θ with respect
to each other. They are kept with their planes perpendicular to the light beam as shown
in the figure.

Figure 16: Photon transmission through two polarizers with easy axes at different angles.
See text for details.

The first polarizer polarizes the light beam along the x-direction, so if the original beam was
unpolarized, half of the intensity is lost. After the second polarizer the intensity
is further reduced by a factor cos²θ; the remaining fraction, proportional to sin²θ, is
absorbed by the polarizer.
Now we reduce the intensity so much that only one photon crosses the polarizers at a
time. After the photon crosses the first polarizer we know that its polarization is along
the x-axis. Now what happens when it reaches the second one? The probability that it
crosses is cos²θ and the probability of its absorption is sin²θ. But if we work with
one photon only, then only one of the two things can happen: absorption or transmission.
If it gets transmitted it will have a polarization êp along the second polarizer's easy axis,
i.e. if we put more polarizers after the second one with their axes along êp, the photon
will get transmitted through all of them with 100% probability.
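These single-photon statistics can be mimicked with a small Monte Carlo sketch (a hypothetical Python illustration; the angle and trial count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.pi / 6                 # angle between the two polarizer axes
n = 200000                        # number of single-photon trials

# A photon leaving polarizer 1 is polarized along x. At polarizer 2 it is either
# transmitted (probability cos²θ) or absorbed (probability sin²θ) - never "partly".
transmitted = rng.random(n) < np.cos(theta)**2
frac = transmitted.mean()
assert abs(frac - np.cos(theta)**2) < 0.01    # Malus's law, one photon at a time

# A transmitted photon is now polarized along the second axis: further
# polarizers with the same axis transmit it with certainty (cos²0 = 1).
again = rng.random(transmitted.sum()) < np.cos(0.0)**2
assert again.all()
```

Each individual photon gives a yes/no outcome; only the ensemble reproduces the classical cos²θ intensity law.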
Thus the result of a measurement is either one of the eigen-values of the measure-
ment, or (in this particular case) the state gets annihilated completely. So the measurement
disturbs the system in a fundamental way. If the result is a certain eigen-value, then the
resulting state after the measurement is the corresponding eigen-state. Knowing
the original state, one can predict the probabilities of the various eigen-values, which
can then be verified by doing the experiment on several identical particles or sys-
tems. In this context the Stern-Gerlach experiment is also relevant, which is discussed in
the Feynman Lectures, Vol. 3.
14) The equation describing the time evolution of ψ(⃗r, t) is the Schrödinger equation,
given by

iℏ ∂ψ/∂t = −(ℏ²/2m)∇²ψ + V(⃗r)ψ(⃗r, t).   (18)
We shall not attempt to prove it. It is one of the fundamental equations that cannot be
derived from some more fundamental physics. Historically it was proposed based on
de Broglie's wave ideas and verified experimentally later on.
This equation has the following properties:
(A) It is linear and homogeneous in ψ, which means that the superposition principle
is valid: if ψ1(x, t) and ψ2(x, t) are two solutions of this equation, then ψ(x, t) =
C1ψ1(x, t) + C2ψ2(x, t) is also a solution. Here C1 and C2 are two complex numbers.
(B) It is a first-order differential equation in time. This is necessary if the state of the
particle at t0 (alone) is to determine its state later. If it were a second-order equation in
time, then one would need both ψ|t0 and ∂ψ/∂t|t0 in order to find ψ(t).
(C) We also recall that for ψ(⃗r, t) to be a valid wave-function, it has to be square-
integrable.
If we compare with the usual wave equation, which is second order in time, here we have
genuinely complex solutions. But the similarities with the wave equation are quite striking
and the intuition derived from wave phenomena is often, but not always, correct in quantum
mechanics. For instance, Huygens' construction is not valid in quantum mechanics.

3 Free particle in one dimension


The time dependent Schrödinger equation for a free particle in one dimension is

iℏ ∂ψ/∂t = −(ℏ²/2m) ∂²ψ/∂x².   (19)
In order to solve this equation we use separation of variables, which works as the space and
time derivative terms above are decoupled. We write a possible solution as ψ(x, t) =
ψ1(x)χ(t), leading to

(iℏ/χ) ∂χ/∂t = −(ℏ²/2mψ1) ∂²ψ1/∂x² = E.   (20)

Here E is a constant, which is actually the energy as it relates to the frequency of the
wave-function's evolution. The last equality follows as the first and second expressions
depend only on t and x, respectively, and therefore both have to equal the same constant.
This leads to two equations, namely (iℏ/χ)dχ/dt = E and −(ℏ²/2mψ1)d²ψ1/dx² = E.
The first one has the solution χ(t) = C1 exp(−iEt/ℏ) while the second one has the solution
ψ1(x) = A1e^{ikx} + B1e^{−ikx} with k² = 2mE/ℏ². Here E must be a positive number
in order to have the expected wave-like solution for a free particle. Thus we see how E
gets related to the frequency and wave-vector, consistent with the de Broglie hypothesis,
i.e. ω = E/ℏ and E = ℏ²k²/2m. We can

eliminate E to get the dispersion relation ω = ℏk²/2m, which is different from the em-
wave dispersion ω = ck. The non-linear dispersion implies that matter waves
are dispersive. Moreover, for matter waves the phase velocity is vp = ω/k = ℏk/2m and the
group velocity is vg = dω/dk = dE/dp = ℏk/m = p/m, i.e. p = mvg.
For a given E the most general solution is

ψE (x, t) = Aei(kx−ωt) + Bei(−kx−ωt) (21)

with A and B as two independent constants. The first term represents a wave propagating
in +x direction while the second one is a wave moving along −x direction. In order to
proceed with a general solution for free particle let’s first discuss a special function called
Dirac delta function.

3.1 Dirac delta function: δ(x − x0 )


This is a rather special, singular-looking function. In fact, by rigorous mathematics
it is not a function but rather a distribution. Anyhow, we'll not worry about this jargon
much. The Dirac delta function has two defining properties:
1) It is zero everywhere except at one point x0 where it is singular, i.e.

δ(x − x0) = 0 for x ≠ x0, and δ(x − x0) → ∞ for x = x0.   (22)

2) Its integral over any range including the singular point is one, i.e.

∫ from x0−ϵ to x0+ϵ of δ(x − x0)dx = 1.   (23)

Here ϵ can be an arbitrarily small positive number. Thus a δ-function has dimensions of
the inverse of its argument, in the above case (Length)⁻¹. This is rather abstract and it is
better to look at this function as the limit of a sequence of functions. I use two such
constructions to illustrate this.
1) We take a box-like function of height 1/2a and width 2a centered at the origin, i.e.

f(x) = (1/2a)Θ(a − |x|) = 1/2a for |x| < a, and 0 for |x| > a.   (24)

Here Θ is the step-function. In the limit a → 0 this function assumes a very narrow
width and a very large value at the origin, see figure. At the same time its integral over a
range including the origin is one. Thus we can assert δ(x) = lim_{a→0} f(x).
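This construction is easy to test numerically: for a smooth g(x), the integral ∫f(x)g(x)dx should approach g(0) as a shrinks. A sketch (the test function, grid, and tolerances are arbitrary choices):

```python
import numpy as np

def box(x, a):
    """Box of height 1/(2a) and width 2a centred at the origin (Eq. 24)."""
    return np.where(np.abs(x) < a, 1.0 / (2 * a), 0.0)

# Defining property of δ(x): ∫ f(x) g(x) dx → g(0) as a → 0,
# checked here with g(x) = cos(x), so g(0) = 1.
x, dx = np.linspace(-1, 1, 2000001, retstep=True)
vals = [np.sum(box(x, a) * np.cos(x)) * dx for a in (0.2, 0.02)]

assert abs(vals[0] - 1.0) < 0.2**2 / 2           # error ~ a²/6 for g = cos
assert abs(vals[1] - 1.0) < 0.02**2
assert abs(vals[1] - 1.0) < abs(vals[0] - 1.0)   # narrower box → closer to g(0)
```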
2) We take the limit of a sinc-function. This is rather useful in the context of the Fourier
transform involved in the free-particle wave-function discussion below. We directly define
the δ-function in k-space as

g(k) = sin(kL)/(πk).   (25)

This function is oscillatory, though not periodic, with oscillation period ∆k = 2π/L. Its
magnitude away from the origin is bounded by 1/π|k| and its value at the origin is L/π.
Here we take the limit L → ∞. In this limit the oscillation period approaches zero and
the value at the origin diverges, so its average value in any small k-range away from the
origin will be zero. Thus it acquires a δ-function-like appearance. To prove the second
property we need ∫g(k)dk, which we evaluate over the full k-range with the understanding
that the integral gets zero net contribution from regions away from the origin, as the
integrand oscillates with a very small period. We use the residue method to evaluate this
integral.
We consider the following integral,

∫ from −∞ to ∞ of (e^{iz}/z) dz = (1/2) × 2πi Res_{z=0}[e^{iz}/z] = iπ.   (26)

The last step uses the fact that e^{iz}/z has only one pole, at the origin, with residue 1
(the principal value picks up half the residue). Finally, using ∫(sin x/x)dx =
Im[∫(e^{iz}/z)dz] over the full range, we get ∫ from −∞ to ∞ of (sin x/x)dx = π and hence
∫ from −∞ to ∞ of g(k)dk = 1.
A few other useful properties of the δ-function are:
1) (d/dx)Θ(x − x0) = δ(x − x0), or ∫ from −∞ to x of δ(x′ − x0)dx′ = Θ(x − x0),
2) δ(−x) = δ(x), i.e. it is a symmetric function,
3) ∫g(x)δ(x − x0)dx = g(x0) if the integration range encompasses x0, and zero otherwise,
4) δ(f(x)) = Σ_{xn} (1/|f′(xn)|)δ(x − xn).
Here the sum runs over the xn which are the zeroes of f(x), i.e. f(xn) = 0. The last
property also leads to δ(αx) = (1/|α|)δ(x).
The second construction of the δ-function discussed above is useful when one tries to
evaluate ∫ from −∞ to ∞ of e^{ikx}dx. We have ∫ from −L to L of e^{ikx}dx =
(1/ik)(e^{ikL} − e^{−ikL}) = 2 sin(kL)/k. So for L → ∞, we get

∫ from −∞ to ∞ of e^{ikx}dx = 2π lim_{L→∞} [sin(kL)/(πk)] = 2πδ(k).   (27)
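The unit-area property of this sinc construction can be checked numerically for a fairly large L (a sketch; L and the k-range are arbitrary choices):

```python
import numpy as np

# g(k) = sin(kL)/(πk) has unit area for any L and narrows into δ(k) as L → ∞.
L = 50.0
k, dk = np.linspace(-400.0, 400.0, 4000001, retstep=True)
g = np.sinc(k * L / np.pi) * (L / np.pi)   # np.sinc(t) = sin(πt)/(πt), so this is sin(kL)/(πk)

area = np.sum(g) * dk
assert abs(area - 1.0) < 0.01              # second defining property of δ(k)
assert g[len(k) // 2] == L / np.pi         # peak value L/π at k = 0
```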

3.2 General time dependent wave-function for free particle


By the superposition principle, any linear superposition of different-E solutions also satisfies
the time-dependent Schrödinger equation (TDSE). Thus the most general solution of the
TDSE for a free particle in one dimension can be written as

ψ(x, t) = Σ_p A_p exp[i(px/ℏ − p²t/2mℏ)].

Here A_p is a complex number which depends on p and the above sum is carried out
over all (positive and negative) values of p. In fact, for infinite space in 1-D, p can take
continuous values and thus a general wave function is more appropriately written as

ψ(x, t) = A1 ∫ from −∞ to ∞ of g(p) exp[i(px/ℏ − p²t/2mℏ)] dp.   (28)

Here A1 is a normalization constant and g(p) is a complex function of p. At t = 0 we
have

ψ(x, 0) = A1 ∫ from −∞ to ∞ of g(p) exp[ipx/ℏ] dp.   (29)

Actually, this g(p) is the complex function that represents the wave-function in momentum
space, and |g(p)|²dp gives the probability of finding the particle's momentum between
p and p + dp. Eq. 29 is basically the defining relation for the inverse Fourier transform,
which lets us evaluate the real-space wave-function from the k- or p-space wave-function.
Using this we can evaluate A1 such that both ψ(x, 0) and g(p) are normalized. We use eq.
29 to write the expression for the Fourier transform, i.e.

∫ ψ(x, 0)e^{−ip′x/ℏ}dx = A1 ∫ [∫ g(p)e^{ipx/ℏ}dp] e^{−ip′x/ℏ}dx
 = A1 ∫ g(p) [∫ exp[i(p − p′)x/ℏ]dx] dp
 = A1 ∫ g(p) 2πℏ δ(p − p′)dp = A1 2πℏ g(p′),

(all integrals from −∞ to ∞), leading to

g(p) = (1/2πℏA1) ∫ from −∞ to ∞ of ψ(x, 0)e^{−ipx/ℏ}dx.   (30)

The above is the defining relation for the Fourier transform. Our objective is to find A1 so
that both ψ(x, 0) and g(p) are normalized. For this (all integrals from −∞ to ∞),

∫ |g(p)|²dp = ∫ g(p)g*(p)dp
 = [1/(2πℏ|A1|)²] ∫ [∫ ψ(x, 0)e^{−ipx/ℏ}dx][∫ ψ*(x′, 0)e^{ipx′/ℏ}dx′] dp
 = [1/(2πℏ|A1|)²] ∫∫ ψ(x, 0)ψ*(x′, 0) [∫ exp[ip(x′ − x)/ℏ]dp] dx dx′
 = [1/(2πℏ|A1|)²] ∫∫ ψ(x, 0)ψ*(x′, 0) 2πℏ δ(x′ − x) dx′ dx
 = [1/(2πℏ|A1|²)] ∫ |ψ(x, 0)|²dx.

Now, for both g(p) and ψ(x, 0) to be normalized we should have |A1|² = 1/2πℏ. Thus we
write the full expressions for the Fourier transform and its inverse, respectively, as

g(p) = (1/√(2πℏ)) ∫ from −∞ to ∞ of ψ(x, 0)e^{−ipx/ℏ}dx   (31)

and

ψ(x, 0) = (1/√(2πℏ)) ∫ from −∞ to ∞ of g(p)e^{ipx/ℏ}dp.   (32)

We can also write these using the wave-vector k, instead of the momentum p, as

g(k) = (1/√(2π)) ∫ from −∞ to ∞ of ψ(x, 0)e^{−ikx}dx   (Fourier Transform)   (33)

and

ψ(x, 0) = (1/√(2π)) ∫ from −∞ to ∞ of g(k)e^{ikx}dk.   (Inverse Fourier Transform)   (34)

Please note that the Inverse Fourier Transform (IFT) of the Fourier Transform (FT) of a
given function leads back to the original function. In fact the FT (or IFT) is a linear
transformation, implying many simplifications such as FT[c1ψ1(x) + c2ψ2(x)] =
c1FT[ψ1(x)] + c2FT[ψ2(x)]. Moreover, as defined above with the 1/√(2π) pre-factor, it
preserves the norm of the function. It is in fact a unitary transformation, which will
become clearer later when we discuss linear vector spaces.
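These properties are easy to verify numerically with a discrete Fourier transform; the sketch below uses numpy's FFT scaled to mimic the 1/√(2π) convention (the sample function and grid are arbitrary choices, not from the notes):

```python
import numpy as np

N = 4096
x = np.linspace(-20, 20, N, endpoint=False)
dx = x[1] - x[0]
dk = 2 * np.pi / (N * dx)

psi = np.exp(-x**2) * np.exp(2j * x)          # some square-integrable ψ(x)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)   # enforce ∫|ψ|² dx = 1

# discrete stand-in for Eq. 33 with the symmetric 1/√(2π) convention
ft = lambda f: np.fft.fft(f) * dx / np.sqrt(2 * np.pi)
g = ft(psi)

# norm preserved: ∫|g(k)|² dk = ∫|ψ(x)|² dx = 1 (unitarity / Parseval)
assert abs(np.sum(np.abs(g)**2) * dk - 1.0) < 1e-9

# linearity: FT[c1 ψ1 + c2 ψ2] = c1 FT[ψ1] + c2 FT[ψ2]
c1, c2 = 2.0 - 1j, 0.5
assert np.allclose(ft(c1 * psi + c2 * psi**2), c1 * ft(psi) + c2 * ft(psi**2))
```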

3.3 Comparison of the wave equation with the free-particle Schrödinger equation

Here is an itemized comparison:
1) The wave equation, ∂²y/∂t² = v²∂²y/∂x², is second order in both x and t.

2) The wave equation has general solutions of the form f(x ± vt), which is not possible for
the Schrödinger equation.
3) The wave equation, being second order in time, requires two initial conditions to find
y(x, t). These can be y(x, 0) and ẏ(x, 0).
4) Boundary conditions with respect to x, for the time-independent equations, often have
a similar nature. For example, the solution vanishes at the boundaries both for a particle
in a well and for the electric field in a cavity. This leads to similar solutions in the two cases.
5) The wave equation also has plane wave solutions of the form e^{i(kx±ωt)}, with ω real
and positive and k taking any real value. This follows from separation of variables. Since
the wave equation is a linear homogeneous equation, a general solution can be written as
y(x, t) = ∫ A−(k)e^{i(kx−ωt)}dk + ∫ A+(k)e^{i(kx+ωt)}dk with ω = vk (integrals from −∞
to ∞). The initial conditions lead to y(x, 0) = ∫ [A+(k) + A−(k)]e^{ikx}dk and ẏ(x, 0) =
iω∫ [A+(k) − A−(k)]e^{ikx}dk. If Y(k) and Z(k) are the Fourier transforms of y(x, 0) and
ẏ(x, 0), respectively, then these give A+(k) = [Y(k) + Z(k)/iω]/2 and A−(k) = [Y(k) −
Z(k)/iω]/2, which can then be used to find y(x, t).
6) In the wave equation solutions y is always a real number, and the complex notation is
only a mathematical convenience; one eventually takes the real or imaginary part of the
complex solution as the physical solution. This is in contrast with the Schrödinger
equation, whose solutions are genuinely complex.

3.4 Free-particle Gaussian wave-packet
Here we construct a localized wave-packet solution of the TDSE to simulate how a particle-
like entity evolves with time according to the TDSE. We discuss a specific wave-packet,
the Gaussian wave-packet, given by

g(p) = A exp[−(a²/4ℏ²)(p − p0)²].   (35)
Here a is a constant with the dimension of length. A is found from the normalization
condition

∫ from −∞ to ∞ of |g(p)|²dp = |A|² ∫ from −∞ to ∞ of exp[−(a²/2ℏ²)(p − p0)²]dp = 1.   (36)
We use the integral

∫ from −∞ to ∞ of exp[−α(ξ + β)²]dξ = √(π/α).   (37)

This result is valid for complex α and β if Re[α] is positive, so that the integral
converges. We can also differentiate this integral with respect to α (at β = 0) to derive

∫ from −∞ to ∞ of ξ² exp[−αξ²]dξ = √π/(2α^{3/2}).   (38)
This helps us evaluate the integral in eq. 36 to get |A| = (2π)^{−1/4}√(a/ℏ). Thus we get

g(p) = (2π)^{−1/4} √(a/ℏ) exp[−(a²/4ℏ²)(p − p0)²].   (39)
Now, to work out the wave-function in x-space we need to find the inverse Fourier trans-
form of g(p) using eq. 32. This is a bit cumbersome but straightforward once we use the
Gaussian integration identities. Here are the main steps:

ψ(x, 0) = [a²/(8π³ℏ⁶)]^{1/4} ∫ from −∞ to ∞ of exp[−(a²/4ℏ²)(p − p0)² + ipx/ℏ]dp
 = [a²/(8π³ℏ⁶)]^{1/4} ∫ from −∞ to ∞ of exp[−(a²/4ℏ²)(p − p0 − 2iℏx/a²)²] exp[−x²/a² + ip0x/ℏ]dp
 = (2/πa²)^{1/4} exp[−x²/a² + ip0x/ℏ].   (40)
Remember, this is the wave-function at time t = 0; we are yet to work out the wave-function
at non-zero times. It is clear that this wave-packet is localized in both x- and p-space. In
p-space, g(p) is real and symmetric about p0, while in x-space the wave-function is complex;
its real and imaginary parts are oscillatory with wave-vector p0/ℏ and both have a
Gaussian envelope centered at x = 0. Thus the wave-function is spread over a finite x
and p range, leading to uncertainty in both.
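The Fourier pair of eqs. 39 and 40 can be verified by direct numerical integration of eq. 32 (a sketch with arbitrary parameter values, in ℏ = 1 units):

```python
import numpy as np

hbar, a, p0 = 1.0, 1.0, 3.0   # illustrative values (ħ = 1 units)

# normalized momentum-space Gaussian (Eq. 39) and claimed x-space result (Eq. 40)
g = lambda p: (2*np.pi)**-0.25 * np.sqrt(a/hbar) * np.exp(-a**2*(p - p0)**2/(4*hbar**2))
psi0 = lambda x: (2/(np.pi*a**2))**0.25 * np.exp(-x**2/a**2 + 1j*p0*x/hbar)

p, dp = np.linspace(p0 - 40, p0 + 40, 200001, retstep=True)
for x in (-1.0, 0.0, 0.7):
    # inverse Fourier transform, Eq. 32: ψ(x,0) = (2πħ)^(-1/2) ∫ g(p) e^{ipx/ħ} dp
    val = np.sum(g(p) * np.exp(1j*p*x/hbar)) * dp / np.sqrt(2*np.pi*hbar)
    assert abs(val - psi0(x)) < 1e-8
```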

3.5 Uncertainty product for Gaussian wave-packet
Let us evaluate these uncertainties explicitly. For ∆x² = ⟨x²⟩ − ⟨x⟩², we have
⟨x⟩ = ∫ x|ψ(x, 0)|²dx = √(2/πa²) ∫ x exp(−2x²/a²)dx = 0 (integrals from −∞ to ∞).
The integral gives zero as the integrand is an odd function of x.
Now,

⟨x²⟩ = ∫ x²|ψ(x, 0)|²dx = √(2/πa²) ∫ x² exp(−2x²/a²)dx = √(2/πa²) × √π/[2(2/a²)^{3/2}] = a²/4.

Thus we get ∆x = √(⟨x²⟩ − ⟨x⟩²) = a/2.
Now we go after ∆p² = ⟨p²⟩ − ⟨p⟩². For this,

⟨p⟩ = ∫ p|g(p)|²dp = [a/(√(2π)ℏ)] ∫ p exp[−(a²/2ℏ²)(p − p0)²]dp.

Let's make a change of variable from p to q = p − p0 to get

⟨p⟩ = [a/(√(2π)ℏ)] ∫ q exp[−(a²/2ℏ²)q²]dq + [ap0/(√(2π)ℏ)] ∫ exp[−(a²/2ℏ²)q²]dq.

Clearly the first term vanishes, and the second uses the Gaussian integral, eq. 37, to give

⟨p⟩ = [ap0/(√(2π)ℏ)] × √(2πℏ²/a²) = p0.

Finally, we need ⟨p²⟩. We write

⟨p²⟩ = [a/(√(2π)ℏ)] ∫ p² exp[−(a²/2ℏ²)(p − p0)²]dp.

Again making the change of variable q = p − p0,

⟨p²⟩ = [a/(√(2π)ℏ)] ∫ (p0² + q² + 2p0q) exp[−(a²/2ℏ²)q²]dq
 = [a/(√(2π)ℏ)] [p0²√(2π)ℏ/a + √π/(2(a²/2ℏ²)^{3/2}) + 0] = p0² + ℏ²/a².

Here again use has been made of eqs. 37 and 38 in evaluating the integrals. Thus
we get ∆p² = ⟨p²⟩ − ⟨p⟩² = p0² + (ℏ²/a²) − p0² = ℏ²/a². Hence, for this Gaussian wave-
packet, the uncertainty product is ∆p.∆x = (ℏ/a).(a/2) = ℏ/2. This is the minimum
possible uncertainty product. In fact, one can prove rigorously that a Gaussian is the
only wave-packet shape that attains the minimum uncertainty product; note, however,
that not every Gaussian wave-packet does so.
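A direct numerical evaluation of these moments from |ψ(x, 0)|² and |g(p)|² reproduces the minimum product (a sketch; the parameter values are arbitrary, ℏ = 1 units):

```python
import numpy as np

hbar, a, p0 = 1.0, 2.0, 1.5   # arbitrary illustrative values (ħ = 1 units)

x, dx = np.linspace(-30, 30, 600001, retstep=True)
p, dp = np.linspace(p0 - 30, p0 + 30, 600001, retstep=True)

prob_x = np.sqrt(2/(np.pi*a**2)) * np.exp(-2*x**2/a**2)                     # |ψ(x,0)|²
prob_p = a/(np.sqrt(2*np.pi)*hbar) * np.exp(-a**2*(p - p0)**2/(2*hbar**2))  # |g(p)|²

mean = lambda f, w, d: np.sum(f * w) * d           # ⟨f⟩ over distribution w
dx_u = np.sqrt(mean(x**2, prob_x, dx) - mean(x, prob_x, dx)**2)
dp_u = np.sqrt(mean(p**2, prob_p, dp) - mean(p, prob_p, dp)**2)

assert abs(dx_u - a/2) < 1e-9             # Δx = a/2
assert abs(dp_u - hbar/a) < 1e-9          # Δp = ħ/a
assert abs(dx_u*dp_u - hbar/2) < 1e-9     # minimum product ħ/2
```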

3.6 Time evolution of the Gaussian wave-packet
We follow the general strategy of solving the TDSE for a free particle with a given initial
ψ(x, 0). In fact we started with a g(p) and worked out ψ(x, 0). Now, according to the
earlier discussion, eq. 28, the corresponding time-dependent wave-function is given by

ψ(x, t) = (1/√(2πℏ)) ∫ from −∞ to ∞ of g(p) exp[ipx/ℏ − ip²t/2mℏ]dp.   (41)

Here g(p) = A exp[−(a²/4ℏ²)(p − p0)²] with A = (2π)^{−1/4}√(a/ℏ). The evaluation of
this integral is rather tedious, though doable analytically. It involves a few
integrals that use eqs. 37 and 38 with complex α and β. You can work it out if you feel
up to it. I'll give the final result here,
ψ(x, t) = (2/π)^{1/4} √[a/(a² + 2iℏt/m)] exp[−a²p0²/4ℏ²] exp[−(x − ia²p0/2ℏ)²/(a² + 2iℏt/m)].   (42)

This looks rather complicated. We can work out the time dependence of the probability
distribution in real space, which gives

P(x, t) = ψ(x, t)ψ*(x, t) = √[2/(πa(t)²)] exp[−2(x − p0t/m)²/a(t)²].   (43)

Here,

a(t)² = a² + 4ℏ²t²/m²a².   (44)

We notice that |g(p)|², i.e. the probability distribution in p-space, does not change, while
P(x, t) changes with time. This also means that the uncertainty ∆p = ℏ/a remains the
same, while the uncertainty ∆x = a(t)/2 is time dependent. Thus the product is

∆x.∆p = (ℏ/2)[1 + 4ℏ²t²/m²a⁴]^{1/2} ≥ ℏ/2.   (45)

The uncertainty product is minimum at t = 0. For large t, ∆x ≈ (ℏt/ma) = (∆p/m)t =
∆v·t. The wave-packet disperses as different p-components of the wave move with different
velocities.
A few remarks: The Mathematica simulation shown in the lecture on the evolution of a
Gaussian wave-packet illustrates many aspects of Gaussian wave-packets of matter waves.
Some of these are:
1) The wave-packet moves with the group velocity vg = dω/dk = p/m and not with the
phase velocity vp.
2) The spread of the wave-packet decreases as time progresses for negative times, reaches
a minimum at t = 0, and then increases again for positive times.
3) There is no change in the momentum uncertainty ∆p with time.

4) One can also make out how the small-λ plane-waves move faster than the large-λ ones, leading to dispersion, or spread, of the wave-packet.
5) One can understand the increase in spread easily by making a parallel with an athletic race. All runners start from the same start line, but after the race starts their positions acquire a spread since the runners run at different speeds; in other words, there is a spread in velocity. One can easily argue that the real-space spread in the runners' positions, i.e. ∆x, is given by ∆x = ∆v·t. This is the case for the Gaussian wave packet at large times, where ∆x = (∆p/m)t with ∆p = ℏ/a.
6) We can estimate this rate of spread in x-space for macroscopic objects. For m = 1 mg and a = 1 µm we get d∆x/dt = ∆p/m = ℏ/ma ∼ 10⁻²² m/s. This is extremely small compared to the particle size and to macroscopic velocity scales (m/s). It would take about 10¹⁶ s, i.e. a few hundred million years, to see a doubling of the size due to dispersion.
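The estimate in remark 6 follows directly from eqs. 44 and 45; here is a minimal numeric sketch (the mass and width values are assumed, SI units):

```python
import math

hbar = 1.054e-34       # J s
m = 1e-6               # 1 mg in kg (assumed macroscopic mass)
a = 1e-6               # 1 micron width parameter (assumed)

# asymptotic rate of spread: d(Delta x)/dt -> hbar/(m a) for large t (eq. 45)
rate = hbar / (m * a)

# width parameter a(t)^2 = a^2 + 4 hbar^2 t^2 / (m^2 a^2), eq. 44
def width(t):
    return math.sqrt(a**2 + 4 * hbar**2 * t**2 / (m**2 * a**2))

# time for a(t) to double: a(t) = 2a  =>  t = sqrt(3) m a^2 / (2 hbar)
t_double = math.sqrt(3) * m * a**2 / (2 * hbar)
```

With these numbers the rate comes out near 10⁻²² m/s and the doubling time near 10¹⁶ s, consistent with the estimate above.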
Similarity to diffusion: It turns out that the Schrödinger equation for a free particle, i.e. ∂²ψ/∂x² = −i(2m/ℏ) ∂ψ/∂t, has a structure similar to the diffusion equation, i.e. ∂²T/∂x² = (1/D) ∂T/∂t. Here T, for instance, can be temperature, with thermal diffusivity D = κ/ρc, where κ is the thermal conductivity, c the specific heat and ρ the density. The diffusion equation applies to many phenomena, such as diffusion of atoms or chemical species, charge and heat. It has similar "diffusive" (analogous to "dispersive") solutions in which a localized distribution at t = 0 spreads as time progresses. In fact the mathematics of solving the diffusion equation is similar to that of the Schrödinger equation for a free particle. One difference is that in the diffusion case the spatial spread always increases as time progresses, while for matter waves it can either increase or decrease, as we saw above.

4 Some more formalism


Before we move on to the topic of one-dimensional potentials and how one can approach
the quantum dynamics through the solutions of time-independent Schrödinger equation
(TISE) we need to discuss a few more topics that are related to the postulates.

4.1 Expectation values, Hermitian operators and matrix ele-


ments
We see that the wave function ψ(x) leads to a probability distribution P(x) = |ψ(x)|² in real space, and thus one can find the average values of space-dependent quantities of interest, such as ⟨x⟩ = ∫ x|ψ(x)|² dx = ∫ x ψ*(x)ψ(x) dx or ⟨f(x)⟩ = ∫ f(x)|ψ(x)|² dx = ∫ f(x) ψ*(x)ψ(x) dx. Thus one can find the average potential ⟨V⟩ = ∫ ψ*(x) V(x) ψ(x) dx. We call ⟨f(x)⟩_ψ the "expectation value" of f over the wave-function ψ(x).
Similarly, we could find the wave function g(p) in p-space by Fourier transform, leading to the probability distribution in p-space and thus ⟨p⟩ = ∫ p|g(p)|² dp = ∫ g*(p) p g(p) dp. It turns out that we can also write this in x-space as ⟨p⟩ = ∫ ψ*(x)(−iℏ)(∂ψ(x)/∂x) dx. Thus we surmise that in x-space we can write p = −iℏ(∂/∂x); hence −iℏ(∂/∂x) is called the momentum operator. Similarly we can write the total energy operator, i.e. the Hamiltonian, as H = (p²/2m) + V(x), and so one can find the expectation value of the total energy (kinetic + potential) as ⟨H⟩ = ∫ ψ*(x, t)[(p²/2m) + V(x)]ψ(x, t) dx = ∫ ψ*(x, t)(iℏ ∂/∂t)ψ(x, t) dx.

The last equality follows from Schrödinger equation. So iℏ(∂/∂t) is equivalent to the
energy operator.
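These expectation values are easy to check on a grid; here is a minimal numeric sketch (in units with ℏ = 1; the packet parameters p0 and a are assumed values) verifying that the operator −iℏ ∂/∂x applied to a Gaussian wave-packet exp(−x²/a²) exp(ip₀x/ℏ) indeed returns ⟨p⟩ = p₀:

```python
import numpy as np

hbar = 1.0
p0, a = 2.0, 1.0                      # packet momentum and width (assumed values)
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]

# Gaussian wave-packet, normalized numerically
psi = np.exp(-x**2 / a**2) * np.exp(1j * p0 * x / hbar)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

# <x> = int x |psi|^2 dx
x_avg = np.sum(x * np.abs(psi)**2) * dx

# <p> = int psi* (-i hbar d/dx) psi dx, derivative by central differences
dpsi = np.gradient(psi, dx)
p_avg = (np.sum(np.conj(psi) * (-1j * hbar) * dpsi) * dx).real
```

The same recipe gives ⟨f(x)⟩ for any f(x) by replacing x with f(x) in the x_avg line.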
Another entity that we introduce here is the matrix element of an operator. We’ll
discuss a more detailed approach to such entities later when we discuss linear vector
spaces. However, we need these to develop the machinery of quantum mechanics further.
We define the matrix element of an operator 'A' between two states described by wave-functions ψ(x) and ϕ(x) as ⟨ψ|A|ϕ⟩ = ∫ ψ*(x) A ϕ(x) dx. This is different from the matrix element ⟨ϕ|A|ψ⟩ = ∫ ϕ*(x) A ψ(x) dx, as in the former case 'A' operates on ϕ(x) while in the latter it operates on ψ(x). The expectation value defined above is a special matrix element, i.e. ⟨A⟩_ψ = ⟨ψ|A|ψ⟩.
Another concept is that of Hermitian operators. Again this concept will be discussed
in more detail later. It turns out that all eigen-values of a Hermitian operator are real
or vice-versa, i.e. any operator with all eigen-values real is Hermitian. Thus in quantum
mechanics any observable, such as position, momentum, energy, is associated with a
Hermitian operator. We need to define this Hermitian property mathematically. An
operator ‘A’ is called Hermitian if its matrix element over any arbitrary wave-functions
ϕ(x) and ψ(x) satisfy: ⟨ψ|A|ϕ⟩ = ⟨ϕ|A|ψ⟩∗ or in another way
∫ ψ*(x) A ϕ(x) dx = (∫ ϕ*(x) A ψ(x) dx)* = ∫ [Aψ(x)]* ϕ(x) dx.   (46)

4.2 Time independent Schrödinger equation


We have already seen this for the free particle. Now for V (x) ̸= 0, the TDSE is
iℏ∂ψ(x, t)/∂t = Hψ(x, t) with H = (p2 /2m) + V (x). For time-independent potentials
V (x), we can use separation of variables. So we use ψ(x, t) = ϕ(x)χ(t) to write
(1/ϕ(x)) Hϕ(x) = (iℏ/χ) dχ/dt = constant = E
Therefore, we get Hϕ_E(x) = Eϕ_E(x) and χ_E(t) = exp(−iEt/ℏ). Here ϕ_E(x) is a specific solution of the TISE Hϕ_E(x) = Eϕ_E(x) with energy eigenvalue E. The TISE expands into
−(ℏ²/2m) d²ϕ/dx² + V(x)ϕ(x) = Eϕ(x)   in 1D
−(ℏ²/2m) ∇²ϕ(r⃗) + V(r⃗)ϕ(r⃗) = Eϕ(r⃗)   in 3D
In fact there may be more than one independent solution for ϕE (x) for a given energy E.
An example of this is the ±p plane wave solutions for free-particle which have the same
energy p²/2m. These are called degenerate (same energy) eigen-states. We use an extra
index α in ϕEα to denote all such independent solutions that are degenerate. The most
general solution of the TDSE will be
ψ(x, t) = Σ_{E,α} A_Eα ϕ_Eα(x) exp(−iEt/ℏ)   (47)

The eigen state ϕE (x) is also called a stationary state as it does not evolve with time except
for a phase factor. For a given initial state ψ1 (x, 0) at t = 0 if one wants to find its time

evolution, one has to find the complex coefficients A_Eα such that ψ₁(x, 0) = Σ_{E,α} A_Eα ϕ_Eα(x).
One can then use these AEα in eq. 47 to write the time-dependent solution.
The fact that any arbitrary ψ1 (x) (physically acceptable) can be written as a linear
superposition of ϕEα for a general Hamiltonian is not trivial. But it turns out that
these ϕEα always form a complete set. Thus these ϕEα are called Eigen (characteristic)
functions.
The fixed energy states have special significance because one can observe transitions
between these states and the energy difference between these states is easily observed as
emitted (or absorbed) radiation at specific frequencies.

Some remarks on TISE:
1) TISE has no ‘i’ so its solution need not be complex but it can be if convenience
dictates. For example for free particle we chose exp (±ikx) but we could have chosen
sin kx and cos kx as well.
2) An eigen-state wave-function ϕ(x) must be finite, single valued, continuous.
3) dϕ/dx must also be finite, single valued and continuous.
These properties of the TISE and of ϕ_E(x) sometimes (for bound states) force the solutions of the TISE to exist only at certain discrete energies. This leads to quantized energies for bound states, as we'll see soon. A good reference to see how this happens is sec. 5.7 of
Eisberg and Resnick.
It should also be pointed out that some of the mathematical restrictions are marginally
violated for ideal (unphysical) potentials that go to infinity or have a step jump. For
instance if potential abruptly jumps to infinity at some point, as for infinite well, dψ/dx
can be discontinuous.

4.3 Probability Current


The TDSE in 1-D is,

Hψ = −(ℏ²/2m) ∂²ψ/∂x² + V(x)ψ(x) = iℏ ∂ψ/∂t
The probability density ρ(x, t) = |ψ(x, t)|2 assuming the wave-function ψ(x, t) to be nor-
malized. As a result of time evolution of ψ the probability density will change with time.
However, when the probability in certain given region changes we expect to see a flow
of this probability at the boundaries. This is a local conservation of probability. This
is similar to charge conservation in the sense that when the charge contained in certain
region of space changes with time we expect to see a charge current at the boundary of
this region. This leads to the continuity equation ∂ρ_Q/∂t + ∇·J⃗_Q = 0. We are after a similar
concept here. We have,
∂ρ/∂t = ∂(ψψ*)/∂t = ψ* ∂ψ/∂t + ψ ∂ψ*/∂t
Using the TDSE for time derivatives this gives,
 
∂ρ/∂t = (1/iℏ) ψ* Hψ − (1/iℏ) ψ Hψ*
Thus the probability of finding the particle between x₁ and x₂, i.e. P(t) = ∫_{x₁}^{x₂} ρ(x, t) dx, will change with time as

∂P/∂t = ∫_{x₁}^{x₂} (∂ρ/∂t) dx = (1/iℏ) ∫_{x₁}^{x₂} [ ψ*(−(ℏ²/2m) ∂²ψ/∂x² + Vψ) − ψ(−(ℏ²/2m) ∂²ψ*/∂x² + Vψ*) ] dx
       = −(ℏ/2im) ∫_{x₁}^{x₂} [ ψ* ∂²ψ/∂x² − ψ ∂²ψ*/∂x² ] dx

We use integration by parts to get (the ∂ψ*/∂x · ∂ψ/∂x cross terms cancel)

∂P/∂t = −(ℏ/2im) [ ψ* ∂ψ/∂x − ψ ∂ψ*/∂x ]_{x₁}^{x₂}
       = −[S(x₂) − S(x₁)]

Here we have defined

S(x) = (ℏ/2im) [ ψ* ∂ψ/∂x − ψ ∂ψ*/∂x ] = (ℏ/m) Im[ ψ* ∂ψ/∂x ]   (48)
We call the above S(x) the probability current density. This leads to ∂P/∂t = −∫_{x₁}^{x₂} (∂S/∂x) dx = ∫_{x₁}^{x₂} (∂ρ/∂t) dx, or in other words,

∂ρ/∂t + ∂S/∂x = 0   in 1-D
∂ρ/∂t + ∇·S⃗ = 0   in 3-D   (49)
In 3D the probability current is a vector quantity representing the probability flux, i.e. the probability flow per unit area per unit time: S⃗ = (ℏ/m) Im[ψ* ∇ψ]. Eq. 49 is the continuity equation describing local probability conservation, just like the continuity equation for charge conservation in em-theory.
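Eq. 48 is easy to check numerically: for a plane wave ψ = e^{ikx} the current should come out as S = ℏk/m = p/m everywhere. A minimal sketch in units ℏ = m = 1 (the wave-vector k is an assumed value):

```python
import numpy as np

hbar, m, k = 1.0, 1.0, 1.5              # units and wave-vector (assumed values)
x = np.linspace(0, 10, 2001)
dx = x[1] - x[0]

psi = np.exp(1j * k * x)                # plane wave, |psi|^2 = 1

# probability current density, eq. 48: S = (hbar/m) Im[psi* dpsi/dx]
dpsi = np.gradient(psi, dx)
S = (hbar / m) * np.imag(np.conj(psi) * dpsi)

# away from the grid edges S should equal hbar*k/m, i.e. p/m
interior = S[10:-10]
```

The few points next to the grid boundaries are trimmed only because the finite-difference derivative is less accurate there.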

4.4 Time-energy uncertainty


Just like p and x, E and t are also complementary variables: just as ∆x·∆p ≥ ℏ/2 is a consequence of associating λ with p, the time-energy uncertainty ∆E·∆t ≥ ℏ/2 is a consequence of associating ν with E. Unlike ∆x, it is difficult to define ∆t precisely, as t is a parameter rather than an observable. However, examples with ∆x·∆p ≥ ℏ/2 can be translated into ∆E·∆t ≥ ℏ/2, as discussed in one example below. One can illustrate the time-energy uncertainty in several ways:
1) We recall from classical dynamics of a damped harmonic oscillator that its oscilla-
tions decay over a time scale τ dictated by damping. If the same oscillator is driven with
frequency ν near resonance, the response of the oscillator as a function of drive frequency
exhibits a peak at resonance with width ∆ν, which again is dictated by damping that
leads to certain quality factor. Fig. 17 illustrates how these τ and ∆ν are inversely related
such that τ·∆ν ∼ 1. This simple example already captures the time-energy uncertainty. In QM one associates E with ν, i.e. E = hν.
2) A wave-function made up of different energy eigen-states spanning an energy range
∆E will evolve significantly over a time ∆t given by ∆E.∆t ∼ ℏ. For stationary states,
i.e. eigen-energy states, this time evolution scale is ∞ and ∆E = 0. In QM, we write the
general solution of TDSE as per eq. 28. Let’s look at a state made by superposing two
energy eigen-states, i.e. ψ(x, t) = ϕE1 (x) exp (−iE1 t/ℏ)+ϕE2 (x) exp (−iE2 t/ℏ). This state

Figure 17: Illustration of the time-frequency uncertainty in a harmonic oscillator. The damped motion in the time domain (left panel) and the driven motion in the frequency domain (right panel) are consistent with ∆t·∆ν ∼ 1.

evolves significantly over a time scale ℏ/|E₁ − E₂|, i.e. ℏ/∆E. For instance, the probabilities of measuring the system in the two orthogonal states ϕ_E1(x) + ϕ_E2(x) and ϕ_E1(x) − ϕ_E2(x) work out to be cos²(∆Et/2ℏ) and sin²(∆Et/2ℏ), respectively. These show significant evolution
over time scale ∆t ∼ ℏ/∆E. This idea can also be mapped to a classical coupled harmonic
oscillator system.
3) In nature when atoms emit em-radiation at characteristic frequency the emission
line has a natural line-width ∆λ. This measured line-width can arise from extrinsic factors
such as Doppler broadening and from the intrinsic lifetime τ of the excited states. This τ and ∆E = hc∆λ/λ² are related by ∆E·τ ∼ ℏ. Typical values of τ are in the 10 ns range, giving ∆λ/λ ∼ 10⁻⁷ for ν ∼ 10¹⁴ Hz.
4) Let’s try to measure the momentum of a free particle of mass m by observing its
displacement over time ∆t, i.e. ∆t = ∆x/v = m∆x/p assuming that the particle moves
by ∆x in ∆t. Now suppose the uncertainty in momentum measured in this way is ∆p
then ∆E = ∆(p2 /2m) = p∆p/m. Thus ∆E.∆t = (p∆p/m).(m∆x/p) = ∆p.∆x ≥ ℏ/2.
5) Suppose if one tries to measure E by looking at the time evolution of the wave
function and finding the frequency with which it is evolving. This sounds difficult in QM
but in classical oscillators one does measure the frequency in this way. If one measures
the behavior over time ∆t then the accuracy (or uncertainty) in frequency measurement
will be ∼ 1/∆t, as dictated by the Fourier transform in the time domain. This also leads to ∆E·∆t ∼ ℏ.
6) Another interpretation relates to an explanation of quantum tunneling where there
is an apparent violation of energy conservation, at least for a short time, when particle
crosses the barrier through quantum tunneling. It seems that one can actually create or
destroy an energy ∆E for a time period ∼ ℏ/∆E. The fact that a particle goes across a barrier seems to violate energy conservation. A reconciliation is that an energy ∆E can temporarily be borrowed for a time ∆t ≤ ℏ/∆E if this time is enough for going across. We should understand that this extra energy is temporary and tunneling is a perfectly elastic process.
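The two-eigenstate superposition of illustration 2 above is easy to simulate abstractly; with the convention ∆E = |E₁ − E₂|, the two probabilities oscillate as cos²(∆Et/2ℏ) and sin²(∆Et/2ℏ). A minimal sketch (the energies E1, E2 are assumed values):

```python
import numpy as np

hbar = 1.0
E1, E2 = 0.3, 1.1                        # two eigen-energies (assumed values)
dE = abs(E2 - E1)

def probs(t):
    # start in (phi_1 + phi_2)/sqrt(2); each component evolves as exp(-i E t / hbar)
    c = np.array([np.exp(-1j * E1 * t / hbar),
                  np.exp(-1j * E2 * t / hbar)]) / np.sqrt(2)
    plus = np.array([1, 1]) / np.sqrt(2)     # state (phi_1 + phi_2)/sqrt(2)
    minus = np.array([1, -1]) / np.sqrt(2)   # state (phi_1 - phi_2)/sqrt(2)
    return abs(plus.conj() @ c)**2, abs(minus.conj() @ c)**2

t = 0.7
p_plus, p_minus = probs(t)               # oscillate on the time scale hbar/dE
```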

4.5 Heisenberg equation of motion
In QM any given measurable quantity corresponds to an operator, say B, whose expectation value ⟨B⟩ = ∫ ψ*(x, t) B ψ(x, t) dx is a useful and measurable quantity. Its time evolution is also important. It turns out that finding how this expectation value evolves does not necessarily require solving for the time-dependent wave-function; it can be obtained using the Heisenberg equation of motion, as we discuss here. ⟨B⟩ for a time-independent operator will change with time since a general state ψ(x, t) changes with time. As mentioned earlier, any measurable quantity corresponds to a Hermitian operator as defined by eq. 46. We have,
∂⟨B⟩/∂t = ∫ ∂(ψ*Bψ)/∂t dx = ∫ [ (∂ψ*/∂t) Bψ + ψ*(∂B/∂t)ψ + ψ* B (∂ψ/∂t) ] dx
Using the TDSE for the first and third term, we get
∂⟨B⟩/∂t = ⟨∂B/∂t⟩ + ∫ [ −(1/iℏ)(Hψ)* Bψ + ψ* B (1/iℏ)Hψ ] dx
Now, since H is Hermitian, we use eq. 46, i.e. ∫ (Hψ)* ϕ dx = ∫ ψ* Hϕ dx with ϕ = Bψ, to get ∫ (Hψ)* Bψ dx = ∫ ψ* HBψ dx. We also assume B to be time independent. Thus the above expression simplifies to
∂⟨B⟩/∂t = (1/iℏ) ∫ [ψ* BHψ − ψ* HBψ] dx = (1/iℏ) ∫ ψ* [BH − HB] ψ dx
Defining the commutator [B, H] = BH − HB, we get the Heisenberg equation of motion:
∂⟨B⟩/∂t = (1/iℏ) ⟨[B, H]⟩   (50)
Here are a few consequences and remarks related to this equation:
1) When [B, H] = 0 we get ∂⟨B⟩/∂t = 0, i.e. B is a constant of motion. Thus the operators that commute with the Hamiltonian are important as they define constants of motion.
ators that commute with Hamiltonian are important as they define constants of motion.
Usually such an operator describes certain symmetry of the system (or Hamiltonian). It
often helps to find such operators first to solve for eigen-states of H. Since H commutes
with itself so energy expectation value is a constant of motion.
2) We can also prove that the expectation value of any operator over an eigen-state
of H is independent of time.
3) Another consequence of a vanishing commutator is that one can always find functions which are simultaneous eigen-functions of the commuting operators. This, as we shall see, helps in accounting for the degenerate eigen-states of H.
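Eq. 50 can be verified numerically for B = x and a free particle, where it reduces to d⟨x⟩/dt = ⟨p⟩/m (Ehrenfest's theorem). A minimal grid-based sketch, with ℏ = m = 1 and an assumed Gaussian packet:

```python
import numpy as np

hbar, m, p0, a = 1.0, 1.0, 1.0, 1.0    # units and Gaussian packet parameters (assumed)
x = np.linspace(-15, 15, 1201)
dx = x[1] - x[0]

psi = np.exp(-x**2 / a**2 + 1j * p0 * x / hbar)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

def H(f):
    # free-particle Hamiltonian; np.roll wraps at the edges, which is
    # harmless here because psi vanishes there
    d2f = (np.roll(f, -1) - 2 * f + np.roll(f, 1)) / dx**2
    return -hbar**2 / (2 * m) * d2f

def expect(op_psi):
    # <psi| O |psi>, given the array O|psi>
    return np.sum(np.conj(psi) * op_psi) * dx

# eq. 50 with B = x: d<x>/dt = <[x, H]>/(i hbar), expected to equal <p>/m
comm = x * H(psi) - H(x * psi)          # [x, H] acting on psi
dxdt = (expect(comm) / (1j * hbar)).real

p_avg = expect(-1j * hbar * np.gradient(psi, dx)).real
```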

5 Bound state solutions of TISE in 1D


The bound state solutions are found by solving the TISE for ψ(x). The admissible solutions must ensure that ψ(x) and dψ/dx are continuous everywhere and that ψ(x) is square integrable. The latter implies that ψ(x) → 0 as |x| → ∞. As discussed earlier, the continuity of dψ/dx is sometimes violated for ideal mathematical potentials that have an infinite jump at a point. These potentials are only to be looked at as limiting cases of real
potentials and dψ/dx is never discontinuous for physically meaningful potentials. The
discontinuity of dψ/dx at such points can be easily understood if one writes TISE as,
d²ψ/dx² = (2m[V(x) − E]/ℏ²) ψ(x).
We can see that when V (x) jumps to ∞ at a point for finite E, d2 ψ/dx2 is infinite at such
points which indicates a discontinuity in dψ/dx.

5.1 Infinite potential well


The infinite potential well is defined by,
V(x) = 0 for 0 < x < L, and ∞ otherwise.

Going with the general strategy stated above ψ(x) must be zero for x < 0 and x > L.
For 0 < x < L, we have
d²ψ/dx² = −(2mE/ℏ²) ψ(x) = −k² ψ(x)
with k 2 = 2mE/ℏ2 as a positive quantity. One cannot find any meaningful solutions for
E < 0. The solution for this is ψ(x) = A sin kx + B cos kx. For ψ(x) to be continuous,
we should have:
(i) ψ(0) = 0, which leads to B = 0,
(ii) ψ(L) = 0 implying sin kL = 0 as A ̸= 0 for non-trivial solution. This implies
kL = nπ.
Thus we get ψ(x) = A sin(nπx/L). To normalize ψ(x), we have |A|² ∫₀^L sin²(nπx/L) dx = 1, which leads to |A| = √(2/L). As stated earlier, the phase of A cannot be determined and we take it to be zero by convention. We also see that the values of k are quantized as integer multiples of π/L, leading to permitted energy values En = n²π²ℏ²/2mL². To summarize:
ψn(x) = √(2/L) sin(nπx/L)   and   En = n²π²ℏ²/2mL²   (51)
Here are a few remarks on this problem:
1) Energies are discrete, which is different from the classical expectation of continuous
energies.
2) Minimum energy, i.e. π 2 ℏ2 /2mL2 is non-zero which is different from the classical
expectation of zero.
3) En ∝ n², so (En+1 − En) ∝ n, i.e. the energy separation increases with n.
4) The ψn(x) are orthonormal, i.e. ∫ ψn*(x)ψm(x) dx = δnm, where δnm is the Kronecker delta.
5) The plot of ψn(x) has n − 1 nodes (other than the two extreme ones). This is a general feature of bound states, i.e. as E increases the number of nodes increases.
6) The uncertainties for ψn(x) work out as ∆pn = nπℏ/L and ∆xn = L√(1/12 − 1/(2n²π²)). Thus ∆pn·∆xn = nπℏ√(1/12 − 1/(2n²π²)), so the uncertainty product increases with n. For n = 1, ∆p₁·∆x₁ ≈ 0.57ℏ > ℏ/2.

7) Since V (x) is symmetric about x = L/2 so ψn are either symmetric or antisymmetric
about x = L/2. This happens as H commutes with the ‘inversion about L/2’ operator
and there is no degeneracy.
8) One can write any general wave-function that vanishes at x = 0 and x = L as
a linear superposition of ψn (x). This can be seen easily here using the idea of Fourier
series. Let's discuss an example wave-function: ψ(x) = A sin³(πx/L), where A is the normalization factor. Using the identity sin 3θ = 3 sin θ − 4 sin³ θ, we can easily see that the normalized wave-function is ψ(x) = (1/√10)[3ψ₁(x) − ψ₃(x)]. Thus for this case the probabilities of finding the particle in the E₁ and E₃ states will be 9/10 and 1/10, respectively. This also leads to an energy expectation value of (9E₁ + E₃)/10 = 9π²ℏ²/10mL².
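This decomposition is easy to verify numerically; a minimal sketch (units L = 1) expanding sin³(πx/L) in the eigenfunctions of eq. 51:

```python
import numpy as np

L = 1.0
x = np.linspace(0, L, 20001)
dx = x[1] - x[0]

def phi(n):
    # normalized infinite-well eigenfunction of eq. 51
    return np.sqrt(2 / L) * np.sin(n * np.pi * x / L)

psi = np.sin(np.pi * x / L)**3
psi /= np.sqrt(np.sum(psi**2) * dx)          # normalize numerically

# overlap coefficients c_n = integral of phi_n(x) psi(x) dx
c = {n: np.sum(phi(n) * psi) * dx for n in (1, 2, 3)}
p1, p2, p3 = c[1]**2, c[2]**2, c[3]**2       # probabilities |c_n|^2
```

Here p1 and p3 come out as 9/10 and 1/10, while p2 vanishes by symmetry about x = L/2.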
Simulating a classical particle: Clearly, the above stationary eigen-states are far
from classical behavior. A natural question then is: how do we construct a classical-like
solution using these eigen-state wave-functions? A classical particle in such box-potential
is expected to elastically bounce from the two hard walls of the potential well with constant
speed in-between. We construct a superposition wave-function which is localized at a given
time and its time evolution is consistent with the classical expectation. Here is such a
wave-function:
ψ±(x, t) = Σ_{n=N−(∆N/2)}^{N+(∆N/2)} sin(nπx/L) exp(±inπa/L) exp(−i n^α π²ℏ t/2mL²).

Here α = 2, so the last exponent is just −iEn t/ℏ. This gives a wave packet centered at x = a and moving to the left or right, depending on the sign chosen in the middle exponential term. We choose the unit of time as τ = 4mL²/h. This makes the exponent of the last exp term equal to −iπn^α t/τ. For L = 1 cm and m = 1 gm we get τ ∼ 6 × 10²⁶ s. The center of the wave-function moves
at speed vN = N ℏπ/mL. This just means that the magnitude of speed v0 corresponding
to N = 1, i.e. v0 = ℏπ/mL, is extremely small. For above parameters (1 gm and 1 cm)
we get v0 = 3.3 × 10−29 m/s. So for a classical particle to move at noticeable speed, say
10−3 m/s, the principal quantum number (N ) will have to be ∼ 3 × 1025 , which is huge.
The parameter ∆N is the spread in N corresponding closely to uncertainty in momen-
tum. This is required to make a localized wave-packet representing a classical particle. In
fact the magnitude of the width of the wave-packet in real-space is proportional to 1/∆N .
As time passes this wave-packet’s evolution consists of its motion with reflection at the
walls and dispersion (spread). The time scale of wave-packet motion over the width of
the well is given by L/N v0 while time taken to disperse it in the whole well would be
∼ L/(∆N v0 ). The latter can be very large as compared to the former if ∆N/N is very
small, which would be the case for a classical particle.
All the above features, namely, motion, reflection, dispersion, etc., can be easily
demonstrated with the above wave-packet using Mathematica. One can also see the
effect of various parameters on motion and dispersion. It is worthwhile to notice how the
small wavelength (or sharper) features move faster than the large wavelength components.
It is also interesting to see the effect of the value of α. The wave-packet becomes non-dispersive for α = 1. Further, one can see the change in the nature of dispersion when α is changed from 1.01 to 0.99: in the former case the sharp features lead while the broad tail lags, and vice versa in the latter case. It is also interesting to see the time evolution from a negative t.

In the Mathematica code one has to run the first half of the code to create the matrix
that stores the complex time dependent wave-packet at different times. This is computa-
tion intensive and for large ∆N it takes longer as the series to be summed has more terms.
After this matrix is created one can run the animation part to get the time evolution of
the wave-function.
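The lecture demo uses Mathematica; an equivalent minimal sketch in Python (parameters N, ∆N and a are assumed values, units ℏ = m = L = 1) builds the same packet and checks that at t = 0 it is localized near x = a and that it subsequently moves:

```python
import numpy as np

hbar, m, L = 1.0, 1.0, 1.0
N, dN, a = 100, 20, 0.5            # center index, spread and initial position (assumed)
x = np.linspace(0, L, 2001)
dx = x[1] - x[0]

def packet(t, sign=+1):
    # superposition of eigenstates n = N - dN/2 ... N + dN/2 with alpha = 2
    psi = np.zeros_like(x, dtype=complex)
    for n in range(N - dN // 2, N + dN // 2 + 1):
        E_n = n**2 * np.pi**2 * hbar**2 / (2 * m * L**2)
        psi += (np.sin(n * np.pi * x / L)
                * np.exp(sign * 1j * n * np.pi * a / L)
                * np.exp(-1j * E_n * t / hbar))
    return psi / np.sqrt(np.sum(np.abs(psi)**2) * dx)

rho0 = np.abs(packet(0.0))**2
x_peak = x[np.argmax(rho0)]                 # should lie close to a
x_mean0 = np.sum(x * rho0) * dx

rho1 = np.abs(packet(1e-4))**2
x_mean1 = np.sum(x * rho1) * dx             # packet moves at speed ~ N*pi*hbar/(m*L)
```

Animating |packet(t)|² over a range of t reproduces the motion, wall reflections and dispersion described above.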

5.2 Piecewise constant potentials in 1D


Solving for the bound states of an arbitrary 1D potential requires solving the TISE, for which there is no general method, and thus there are only a few 1D potentials that are exactly analytically solvable for their bound states. The harmonic oscillator is one of them, but that
also involves special functions. We discuss here a potential which is piecewise constant,
i.e. V (x) is made up of constant value segments as illustrated in Figure 18. In such case
we can formulate a general strategy to put down the solution and the boundary conditions
that eventually lead to (transcendental) equations which lead to the eigen-energies of the
bound states and corresponding wave-functions. These transcendental equations may or
may not be analytically solvable.

Figure 18: Piecewise constant potential V (x) with a bound state of eigne-energy E.

The bound states will exist for energy E values which are between the minimum of
potential and the maximum or the potential at infinity. For some x-ranges V (x) < E while
for others V (x) > E. These two types of regions will have different types of solutions.
Suppose V (x) = V1 < E for x1 < x < x2 then we get the TISE as,
ψ'' = −(2m/ℏ²)(E − V₁) ψ(x) = −k₁² ψ(x)   with constant k₁² = (2m/ℏ²)(E − V₁) > 0.
ℏ2
Thus we can write the solution for this x-range in three equivalent ways with two un-
knowns,
ψ(x) = A1 exp(ik1 x) + B1 exp(−ik1 x) OR
ψ(x) = C1 sin k1 x + D1 cos k1 x OR
ψ(x) = F1 sin (k1 x + δ1 ).
For region, say x2 < x < x3 , V (x) = V2 > E then we get the TISE as,
ψ'' = (2m/ℏ²)(V₂ − E) ψ(x) = κ₂² ψ(x)   with constant κ₂² = (2m/ℏ²)(V₂ − E) > 0.

In such regions we can write the solution in the following three equivalent ways with two
unknowns,
ψ(x) = A2 exp(κ2 x) + B2 exp(−κ2 x) OR
ψ(x) = C2 sin κ2 x + D2 cos κ2 x OR
ψ(x) = F2 sin (κ2 x + δ2 ).
Here A1 , A2 , B1 , B2 ,... etc. are complex constants and δ1 , δ2 are real constants. These are
to be found from the boundary conditions and normalization conditions. In general one
of these constants cannot be found from boundary conditions while others can be found
in terms of this one unknown, which has to be found from the normalization condition.
One can also choose to work with either k or κ for all regions and in this case these can
admit imaginary values also. With these one has to satisfy the boundary conditions at
the boundaries between various constant potential regions. For the above case, one such
boundary is x = x2 . Each boundary leads to two equations from the continuity of ψ(x)
and ψ ′ (x).
Three remarks need to be added:
1) For the bound states, the wave function at far left, i.e. x → −∞, must either be zero, if the potential is infinite in this region, or, if the potential is finite, have the form exp(κ_L x) with appropriate κ_L. Similarly at far right, i.e. x → +∞, the solution can be either zero or of the form exp(−κ_R x). Also, if V(x) becomes infinite in certain regions then ψ(x) must vanish in such regions.
2) zero of energy: We have the freedom to choose the zero of energy and potential,
as long as we choose the same for both. Different problems may make a particular choice
of this zero and one choice may be more convenient than others. This change in choice of
zero offsets both V (x) and E equally keeping their difference the same. Thus the TISE
and its solutions remain unchanged.
3) In case the potential is symmetric about some point, say x = 0, then the solutions
of TISE can always be chosen to be either symmetric [i.e. ψ(−x) = ψ(x)] or anti-
symmetric [i.e. ψ(−x) = −ψ(x)]. In 1D the bound states are always non-degenerate and
then the associated wave-functions are always either symmetric or antisymmetric. The
scattering states are usually degenerate and thus one can choose non-symmetric solutions
if convenience dictates. This aspect will become more clear when we discuss the linear
vector spaces and symmetry operators.

5.3 1D symmetric finite potential-well


This potential is shown in Fig. 19 and is defined as
V(x) = V₀ for |x| ≥ a/2, and 0 for |x| < a/2.
Here for E < V0 we have bound-state solutions which are confined near x = 0. For E > V0
we’ll have unbound or free-particle like solutions that are also called scattering states. In
the former case acceptable solutions exist only for certain discrete values of E while for
the scattering states solutions can be found for all E > 0 values. Here we discuss the
bound state solutions.

Figure 19: Symmetric finite well potential V (x) of width a and depth V0 .

This is clearly a piecewise constant potential and following the general strategy dis-
cussed above, we write:
For |x| < a/2,   ψ''(x) = −(2mE/ℏ²) ψ(x) = −k² ψ(x),   and
for |x| ≥ a/2,   ψ''(x) = (2m(V₀ − E)/ℏ²) ψ(x) = α² ψ(x).

Here,

k = √(2mE/ℏ²)   and   α = √(2m(V₀ − E)/ℏ²)   (52)
Thus, we get for the three regions I, II and III as marked in the figure,

ψ_I(x) = C e^{αx}   for x < −a/2
ψ_II(x) = A cos kx + B sin kx   for |x| < a/2
ψ_III(x) = D e^{−αx}   for x > a/2.

Note that for bound states, ψ = 0 as x → ±∞. Also since the potential here is symmetric
about x = 0 we can choose the wave-functions to be either symmetric or anti-symmetric
about x = 0. Below we discuss these two types separately.
Symmetric Solutions: The even parity (or symmetric) solution is ψ(x) = A1 cos kx
for |x| < a/2 and ψ(x) = C1 e−α|x| for |x| ≥ a/2. The continuity of ψ and ψ ′ at x = a/2
lead to,
A₁ cos(ka/2) = C₁ e^{−αa/2}
−kA₁ sin(ka/2) = −αC₁ e^{−αa/2}   (53)
2
The boundary conditions at the other boundary, i.e. x = −a/2, lead to identical equations.
Dividing these we get the relation k tan(ka/2) = α. Defining η = ka/2 and ξ = αa/2 we get

ξ = η tan η   (54)

Using Eq. 52, we also get

η² + ξ² = mV₀a²/2ℏ²   (55)
These last two equations give solutions for η and ξ or in other words for k and α. These
lead to allowed E values. These equations are transcendental and not analytically solvable.
One can solve them numerically or one can describe the graphical solutions which is quite
insightful. After knowing an allowed E value one can use one of the relations in Eq. 53 to obtain C₁ in terms of A₁, while A₁ can be found from the normalization condition. Before
discussing the graphical solutions, let’s first discuss the anti-symmetric solutions.
Antisymmetric Solutions: The odd parity (or antisymmetric) solution consists of
ψ(x) = C2 eαx for x < −a/2, ψ(x) = A2 sin kx for |x| < a/2, ψ(x) = −C2 e−α|x| for
x > a/2. The continuity of ψ and ψ ′ at x = a/2 lead to,

A₂ sin(ka/2) = C₂ e^{−αa/2}
kA₂ cos(ka/2) = −αC₂ e^{−αa/2}   (56)
2
Dividing these we get the relation −k cot(ka/2) = α, and with η = ka/2 and ξ = αa/2 we get

ξ = −η cot η   (57)

As discussed for the symmetric case, this and Eq. 55 have to be solved simultaneously to
obtain allowed E values and the unknowns A2 and C2 can be found in a similar way.

Figure 20: Graphical solutions for η for a symmetric finite well potential. The black lines
diverging at π/2, 3π/2, etc. correspond to ξ = η tan η and the red lines diverging at π, 2π
correspond to ξ = −η cot η. The black circular arcs correspond to ξ 2 + η 2 = mV0 a2 /2ℏ2
for different V0 values.

Graphical Solutions for E: Figure 20 shows the plots of Eqs. 54, 57 and 55. The
last equation is plotted for different values of V0 . We can find the solutions for η and thus
E for a given V0 from the intersection points of the circles with the two transcendental
equations. We choose to plot these equations in first quadrant of η − ξ plane. This is
justified as we defined k and α as positive square-roots.
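As a cross-check on the graphical construction, the transcendental system (Eqs. 54 and 55) can also be solved numerically by bisection. A minimal sketch with assumed well parameters (ℏ = m = 1, a = 1, V₀ = 50) that finds the even-parity ground state:

```python
import math

hbar, m, a, V0 = 1.0, 1.0, 1.0, 50.0    # assumed well parameters
R2 = m * V0 * a**2 / (2 * hbar**2)      # squared circle radius of eq. 55

def f_even(eta):
    # even-parity condition: eta*tan(eta) - sqrt(R2 - eta^2) = 0 (eqs. 54, 55)
    return eta * math.tan(eta) - math.sqrt(max(R2 - eta**2, 0.0))

# the lowest root lies in (0, pi/2); f_even goes from negative to positive there
lo, hi = 1e-9, min(math.pi / 2 - 1e-9, math.sqrt(R2))
for _ in range(200):
    mid = (lo + hi) / 2
    if f_even(mid) > 0:
        hi = mid
    else:
        lo = mid
eta = (lo + hi) / 2

k = 2 * eta / a
E0 = hbar**2 * k**2 / (2 * m)           # ground-state energy above the well bottom
```

Higher states follow by bisecting in the subsequent branches of tan η (and cot η for odd parity).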
Finally, a few remarks on this problem:
1) ξ = η tan η is quadratic near first zero, i.e. η = 0, but it is linear with increasing
slope near other zeroes. In fact, the slope at the η = nπ zero is nπ.
2) We see that for (mV0 a2 /2ℏ2 ) < (π/2)2 there is no odd parity solution possible.
However for even parity there is at least one possible solution for arbitrary V0 . So there
is at least one bound state for any 1D symmetric potential well.
3) For V0 → ∞ the circle ξ 2 + η 2 = mV0 a2 /2ℏ2 intersects the ξ = η tan η lines
at η = π/2, 3π/2, 5π/2... and it intersects the ξ = −η cot η lines at η = π, 2π, 3π....
Combining the two we get, for allowed k values, ka = 2η = nπ, which leads to the
energies same as that of infinite well potential as expected.
4) The wave functions of the lowest-energy symmetric and antisymmetric bound states are shown in Fig. 21. Clearly the ground state is described by the symmetric wave-function. These can be easily obtained by finding the Ai/Ci ratio (Eqs. 56 and 53) after we know the
admissible η and ξ values from the eigen-energies. The other unknown is dictated by the
normalization condition.

Figure 21: Wave-functions of the lowest-energy symmetric and antisymmetric bound states of the finite well potential.

5) Another interesting limit of this problem is when this potential approaches a neg-
ative δ-function potential. This can be realized by taking the limit V0 → ∞ and a → 0
such that V0 a = λ, i.e. a constant. We also need to shift the zero of energy to V0 so
R ϵ the potential is −V0 for |x| < a/2 and it is zero otherwise, see Fig. 22. This way
that
−ϵ
V (x)dx = −λ. With this there will be exactly one bound state as the radius of the cir-
cle ξ 2 + η 2 = mV0 a2 /2ℏ2 = mλa/2ℏ2 will approach zero in the δ-function limit. This will
be a symmetric bound state pso we need to look at the intersection point of ξ = η tan η with
a diminishing radius, i.e. mλa/2ℏ2 , circle. Also with the new choice for zero √ of energy,
the bound state p will have negative energy, say −E0 , such that ξ = αa/2 = 2mE0 a2 /ℏ
and η = ka/2 = 2m(E0 − V0 )a2 /ℏ.
For small η, η tan η ≈ η², so we need to find the intersection of ξ = η² with the diminishing-radius circle ξ² + η² = mλa/2ℏ². This gives ξ² + ξ = mλa/2ℏ², and in the small-ξ limit we omit the quadratic term to get ξ = mλa/2ℏ², or mE0a²/2ℏ² = m²λ²a²/4ℏ⁴, leading to E0 = mλ²/2ℏ². Thus the bound state energy is −mλ²/2ℏ². Furthermore, the bound state wave-function works out as ψ(x) = Ae^{−α|x|} with α = mλ/ℏ². We also see that the derivative of this wave-function is discontinuous at x = 0. The discontinuity is given by ∆ψ′|_{x=0} = −A(2mλ/ℏ²). One can also solve the δ-function bound state problem directly by finding this discontinuity starting from the TISE. This is to be done as a HW.

Figure 22: Negative δ-function limit of the finite well potential. The middle plot illustrates the intersection of ξ = η² with the circle ξ² + η² = mλa/2ℏ² of diminishing radius. The rightmost plot is the wave function of the only permitted bound state.

6 Unbound solutions of TISE in 1D: scattering states


The unbound, free-particle-like, or scattering state solutions of the TISE are also very important for many observed phenomena. Scattering experiments are carried out in high energy accelerators to probe the structure of matter at the smallest scales, so understanding scattering from the basics is important. Scattering solutions also describe one of the very important quantum phenomena that distinguish quantum from classical physics, namely quantum tunneling. We shall discuss two applications of quantum tunneling: the scanning tunneling microscope (STM) and α-decay.
For scattering state solutions of TISE, the energy is not discretized. One can find such
solutions for a continuous range of energies that are classically permitted. In fact, there
is a degeneracy of two for the scattering state solutions in 1D which is systematically
handled by looking for solutions for left and right propagating waves, i.e. with wave-
vectors ±k. There can be other ways of handling this degeneracy. For instance, one can look for two orthogonal solutions of the TISE that are both real, as is always possible, at least for simple 1D potentials. This is similar to the degeneracy of two for a free particle in 1D, with the two degenerate states corresponding to opposite momenta. A general scattering wave-function can be constructed by a linear superposition of different-k solutions. The time dependent wave-function can then be found by including the appropriate phase factors, i.e. e^{−iEt/ℏ}, in this superposition.
Another important fact about scattering-state solutions is that the wave-function in the classically forbidden region must not diverge at ∞ in case this region extends to ∞. Such divergence would make any superposition wave-function non-normalizable. The wave-function in the classically permitted region at ∞ need not vanish for the eigen-states, although a physically permitted wave-function made by superposition of such eigen-state solutions must be normalizable and thus should vanish at ∞.

6.1 Scattering from a step-potential

Figure 23: Step potential with a particle wave striking with E < V0 .

Let’s consider the step potential as shown in Fig. 23 and given by,
V(x) = { 0 for x < 0,  V0 for x > 0 }.

We consider two relevant cases separately, E < V0 and E > V0. The V0 < 0 scenario is also effectively covered by these two cases. We look for a solution where a particle wave with wave-vector +k and energy E = ℏ²k²/2m is incident from the left. This wave can, in general, get partially reflected and partially transmitted.
Case I: 0 < E < V0. The acceptable solution of the TISE for this case is given by

ψ(x) = { Ae^{ikx} + Be^{−ikx} for x < 0,  De^{−αx} for x > 0 }.

Here Ae^{ikx} represents the incident wave and Be^{−ikx} the reflected one. To the right of the step is the classically forbidden region, which extends to x → ∞, thus e^{αx} is ruled out. Note also k² = 2mE/ℏ² and α² = 2m(V0 − E)/ℏ². The continuity of ψ and ψ′ at x = 0
leads to two equations:

A + B = D
ik(A − B) = −αD,  or  A − B = (iα/k) D.

By adding and subtracting these two equations we get

A = (D/2)(1 + iα/k)  and  B = (D/2)(1 − iα/k).
Thus, in terms of A, we get

D = (2k/(k + iα)) A  and  B = ((k − iα)/(k + iα)) A.

This leads to the final wave-function,

ψ_k(x) = { Ae^{ikx} + A (k − iα)/(k + iα) e^{−ikx} for x < 0,  A 2k/(k + iα) e^{−αx} for x > 0 }.
Here, E, k and α take continuous values. One can superpose many such different-k wave-functions to create localized particle-like states. The time evolution of such a wave function is easily captured by incorporating the phase factors exp(−itℏk²/2m) in the superposition sum. This was illustrated in the lecture by looking at the time evolution of the wave-function ψ(x, t) = Σ_k A_k ψ_k(x) e^{−itℏk²/2m} with |A_k| = exp[−(k − k0)²/δk²]. In general the sum will be replaced by an integral over k.


The actual flux of particles is represented by the probability current S, and eventually the reflectance and transmittance of the incident matter waves are given in terms of the probability currents. We have seen earlier that the probability current for a plane wave Ae^{ikx} works out as (ℏk/m)|A|², and for a superposition of waves, i.e. Ae^{ikx} + Be^{−ikx}, it is given by (ℏk/m)(|A|² − |B|²). From the above we see that |B|² = |A|², and thus the net probability current for x < 0 is zero. This follows as the probability current in the forward direction is the same as that in the reverse direction. In other words SI = SR, and thus the reflectance R = SR/SI = 1 and the transmittance T = ST/SI = 1 − R = 0. The latter can also be seen more directly as

ST = (ℏ/m) Im(ψ* ψ′) = (ℏ/m) Im[D* e^{−αx} (−Dα) e^{−αx}] = (ℏ/m) Im[−α|D|² e^{−2αx}] = 0.
Case II: E > V0. The acceptable solution of the TISE for this case is given by

Figure 24: Step potential with a particle wave striking with E > V0.

ψ(x) = { A1 e^{ikx} + B1 e^{−ikx} for x < 0,  D1 e^{ik′x} for x > 0 }.

Here A1 eikx represents the incident wave and B1 e−ikx is the reflected one. On the right of
the step is the transmitted wave.
We can again write the boundary conditions at x = 0 and get linear equations to find
B1 and D1 in terms of A1 as we did earlier. This is easily achieved after we recognize
that all the steps of Case-I are repeated after replacing α by −ik ′ . With this we get the
following wave-function for this case,
ψ_k(x) = { A1 e^{ikx} + A1 (k − k′)/(k + k′) e^{−ikx} for x < 0,  A1 2k/(k + k′) e^{ik′x} for x > 0 }.

Thus we get SI = (ℏk/m)|A1|², SR = −(ℏk/m)(k − k′)²/(k + k′)² |A1|² and ST = (ℏk′/m) 4k²/(k + k′)² |A1|². You can easily verify that |ST| + |SR| = |SI|. Finally, we get

T = ST/SI = 4k′k/(k + k′)²  and  R = |SR|/|SI| = (k − k′)²/(k + k′)² = 1 − T.

For V0 < 0 we can easily extrapolate Case-II, with the plane wave incident from the right. The variation of reflectance and transmittance with energy is shown in Fig. 25.
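These expressions are easy to cross-check numerically. A minimal sketch (units ℏ = m = 1; `step_RT` is just an illustrative name):

```python
import math

def step_RT(E, V0):
    """Reflectance and transmittance of a step of height V0 for E > V0,
    using k = sqrt(2mE)/hbar and k' = sqrt(2m(E - V0))/hbar with hbar = m = 1."""
    k = math.sqrt(2 * E)
    kp = math.sqrt(2 * (E - V0))
    R = (k - kp) ** 2 / (k + kp) ** 2
    T = 4 * kp * k / (k + kp) ** 2
    return R, T

for E in (1.1, 2.0, 10.0):
    R, T = step_RT(E, V0=1.0)
    print(E, R, T, R + T)      # R + T = 1; T approaches 1 for E >> V0
```

Note that even for E > V0 the reflectance is non-zero, a purely wave-mechanical effect.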

Figure 25: Transmittance across a step potential of height V0 as a function of the incident particle's energy E.

6.2 Quantum Tunneling


Quantum mechanically, matter waves are found to penetrate the classically forbidden region up to a certain distance. Thus if we make the width of this region finite there is a non-zero probability of transmitting particles beyond it. Let's consider a potential barrier of height V0 and width a as shown in Fig. 26. We solve the TISE corresponding to an incident wave Ae^{ikx} striking the barrier from the left. The TISE solution can be written as

Figure 26: The quantum tunneling across a barrier of height V0 and width a.


ψ(x) = { Ae^{ikx} + Be^{−ikx} for x < 0,  Ce^{αx} + De^{−αx} for 0 < x < a,  Fe^{ikx} for x > a }.

Here k² = 2mE/ℏ² and α² = 2m(V0 − E)/ℏ². The boundary conditions dictate that ψ and ψ′ be continuous at x = 0 and x = a, and this leads to the following four linear equations:

A + B = C + D
A − B = (α/ik)(C − D)
Ce^{αa} + De^{−αa} = Fe^{ika}
Ce^{αa} − De^{−αa} = (ik/α) Fe^{ika}
α
One has to solve these to find B, C, D and F in terms of A in order to find the complete wave function. For finding the transmittance, which corresponds to the probability of tunneling for a given particle, we mainly need to find F. One can solve these by various linear manipulations. For instance, addition and subtraction of the last two equations leads to C and D in terms of F. Addition of the first two eliminates B, and then one can use the C and D found earlier to find a relation between F and A. This works out as
F = A 2kα e^{−ika} / [2kα cosh αa + i(α² − k²) sinh αa].
The others, i.e. C, D, and B, can be worked out in terms of A using

C = (F/2) e^{−αa}(1 + ik/α) e^{ika},  D = (F/2) e^{αa}(1 − ik/α) e^{ika}  and  B = C + D − A.
The relation between F and A can be used to find the transmittance T = ST/SI = |F|²/|A|² as

T = [cosh² αa + (1/4)(α/k − k/α)² sinh² αa]^{−1}.

This, on eliminating α and k in favor of E and V0, leads to

T = [1 + V0²/(4E(V0 − E)) sinh² αa]^{−1}.

For the case αa ≫ 1, i.e. V0 ≫ ℏ²/2ma² (a high barrier), we get T ≈ [16E(V0 − E)/V0²] e^{−2αa}.
The pre-factor of the exponential term is of order 1, and thus the commonly used expression for the tunneling probability is

P ∼ exp[−2 √(2m(V0 − E)/ℏ²) a].   (58)

In the above, E > V0 is also an interesting case of transmission over a barrier. This can easily be worked out by replacing α by ik′ with k′² = 2m(E − V0)/ℏ². This leads to the expression for the transmission,

T = [1 + ((k² − k′²)²/4k²k′²) sin² k′a]^{−1}.

We see that T = 1 for k′a = nπ, i.e. full transmission, as shown in Fig. 27. This is a resonance condition arising from the constructive interference between waves scattered at x = 0 and x = a. This is similar to the physics of anti-reflection coatings, where one chooses the thickness of the coating and its refractive index such that k′a = nπ, leading to zero reflection. This kind of resonant transmission is seen in the scattering of electrons by atoms and nuclei.
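The exact transmittance above can be evaluated for all energies at once by working with a complex α, since sinh(ik′a) = i sin k′a. A short sketch (units ℏ = m = 1; the barrier parameters V0 = 1, a = 5 are arbitrary illustrative values):

```python
import cmath
import math

def barrier_T(E, V0=1.0, a=5.0):
    """T = [1 + V0^2 sinh^2(alpha*a) / (4E(V0-E))]^(-1); for E > V0 the complex
    alpha = i*k' turns sinh^2 into -sin^2(k'a) automatically (avoid E == V0)."""
    alpha = cmath.sqrt(2 * (V0 - E))
    s2 = (cmath.sinh(alpha * a) ** 2).real
    return 1.0 / (1.0 + V0**2 * s2 / (4 * E * (V0 - E)))

E_res = 1.0 + math.pi**2 / 50   # first resonance: k'*a = pi for V0 = 1, a = 5
print(barrier_T(0.5))           # deep tunneling, T << 1
print(barrier_T(E_res))         # resonant transmission, T ~ 1
```

The resonances T = 1 appear exactly at the energies where k′a = nπ, as stated above.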

Figure 27: Transmittance and reflectance across a barrier of height V0 as a function of the particle energy E.

In realistic situations the barrier is never rectangular, and solving a general tunneling problem with an arbitrary V(x) is not possible. In certain limits, one can make a useful approximation called the WKB approximation, named after three physicists: Wentzel, Kramers and Brillouin. According to this approximation the tunneling probability across
a potential barrier described by V(x) is given by

P ≈ exp[−2 ∫_{x1}^{x2} √(2m(V(x) − E)/ℏ²) dx].   (59)

Here, x1 and x2 are the classical turning points, and the region between x1 and x2,

Figure 28: Tunneling across an arbitrary shaped barrier with classical turning points as
x1,2 dictated by energy E.

as shown in Fig. 28, is classically forbidden. Naively, one can justify the above expression

using Eq. 58 and by dividing the barrier into regions of width dx between the two turning
points. The overall tunneling probability will then be a product of the probability of
tunneling across each of these width-dx regions, where the potential is V(x).
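The WKB integral is straightforward to evaluate numerically for a smooth barrier. The sketch below uses units ℏ = m = 1; the sech²-shaped barrier and all parameter values are purely illustrative assumptions, not taken from the notes:

```python
import math

def wkb_P(V, E, x1, x2, n=4000):
    """Eq. 59 with hbar = m = 1: P = exp(-2 * int_{x1}^{x2} sqrt(2(V(x)-E)) dx)."""
    h = (x2 - x1) / n
    s = 0.0
    for i in range(n):                  # midpoint rule; avoids the turning points,
        x = x1 + (i + 0.5) * h          # where V(x) - E vanishes
        s += math.sqrt(max(2 * (V(x) - E), 0.0)) * h
    return math.exp(-2 * s)

V0 = 5.0
V = lambda x: V0 / math.cosh(x) ** 2    # hypothetical smooth barrier
E = 2.0
xt = math.acosh(math.sqrt(V0 / E))      # turning points from V(x) = E
print(wkb_P(V, E, -xt, xt))
```

The midpoint rule is a deliberate choice here: the integrand vanishes at the turning points, so the open quadrature behaves well there.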

6.2.1 Scanning Tunneling Microscope (STM)


STM is a remarkable application of quantum tunneling in which one can image surfaces with atomic resolution. It was built for the first time by G. Binnig and H. Rohrer in 1981, for which they received the Nobel prize in 1986. It is based on the tunneling of electrons between two metals across the barrier formed by their work functions. A typical value of the work function barrier is ϕ0 = 4 eV, and thus the tunneling probability for an electron at the Fermi energy to go across the barrier, according to Eq. 58, is P ≈ exp(−2αa). Here a is the barrier width and α = √(2mϕ0)/ℏ. For ϕ0 = 4 eV, we get α = 1.1 × 10¹⁰ m⁻¹. Thus we see that P increases by a factor of e^{2.2} ≈ 10 if a reduces by 1 Å. This sensitivity of P to a is what is responsible for the atomic resolution imaging in an STM.
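These numbers are easy to reproduce; a quick sketch with standard constants (the exact figures depend on rounding):

```python
import math

m_e = 9.109e-31      # electron mass, kg
hbar = 1.055e-34     # J*s
eV = 1.602e-19       # J
phi0 = 4 * eV        # typical work-function barrier

alpha = math.sqrt(2 * m_e * phi0) / hbar
gain_per_angstrom = math.exp(2 * alpha * 1e-10)   # P = exp(-2*alpha*a)
print(alpha)                 # ~1e10 m^-1
print(gain_per_angstrom)     # factor by which P grows when a shrinks by 1 Angstrom
```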

Figure 29: The left figure shows the schematics of the barrier and the electron energies
between two metal surfaces at distance a and with a potential difference V . The latter
lowers the Fermi energy of one metal relative to the other by energy eV . The right figure
shows the schematics of a STM with a sharp metal tip and a conducting sample surface.

In more detail, one actually uses a very sharp metal (gold or platinum) tip and a flat metal sample surface for STM, as depicted in Fig. 29. A potential difference of 100 mV or less is applied between the tip and sample to slightly misalign the Fermi energies of the two metals so that there is a biased flow due to tunneling between the two. At zero potential difference, equal numbers of electrons tunnel in each direction at finite temperature, resulting in zero net current. Also, since electrons obey the Pauli exclusion principle, electrons can only tunnel from the filled states of one electrode to the empty states of the other electrode. At a fixed bias the tunnel current is directly proportional
to tunneling probability which is exponentially dependent on the tip-sample separation.
Thus when the tip is scanned over the surface which has some height variations the tunnel
current reflects the local surface height (z) and thus one can obtain a topographic image
of the surface. The exponential dependence of tunnel current on ‘a’ also helps in achieving
very good lateral (xy) resolution. In a sharp metal tip one atom sticks out the most and

other atoms that are behind this one by 0.5 Å will contribute a much smaller tunnel current.

6.2.2 Radioactive alpha-particle decay


It is known that for atomic mass (A) more than 200, atomic nuclei are unstable and they emit α, β or γ particles. The α particle is seen to come out with an energy E in the range 4 to 10 MeV for different nuclei. The lifetime τ of these radioactive nuclei ranges from µs to 10¹⁰ years, spanning about 24 decades, i.e. a huge range. In fact, there is an inverse correlation between τ and E, with smaller τ corresponding to higher E values. The α-particle emission can be described by the nuclear reaction

^A_Z X → ^{A′}_{Z′} Y + ^4_2 He.

Here, Z′ = Z − 2 and A′ = A − 4. E is dictated by the binding energy (BE) difference between the resulting nuclei, i.e. Y and He, and the mother nucleus X. The BE of the He nucleus is 28 MeV, exceptionally high for such a light nucleus, indicating that the α-particle is a particularly stable nucleus.

Figure 30: Potential barrier formed by strong forces and Coulomb repulsive forces between
two nuclei.

The tunneling theory of α-decay was given by George Gamow. In this theory it was assumed that protons and neutrons are bound to each other through the strong force, which is effective only at very small distances of about a fermi (femtometre, 10⁻¹⁵ m). Beyond this separation the interaction between different nuclei is dominated by the repulsive Coulomb potential energy,

V(r) = Z1 Z2 e²/(4πϵ0 r),
with Z1,2 the atomic numbers of the two nuclei and r their separation. In this theory it is assumed that the α-particle, with energy E equal to that of the emitted α-particle, attempts to tunnel across the barrier formed by the strong and Coulomb forces, as shown in Fig. 30. The Coulomb potential here corresponds to V(r) with Z1 = 2 and Z2 = Z − 2. One does not know much about the dependence of the strong interaction on r, but it leads to a large negative potential drop over an extremely short distance ≤ 1 fm. The range of the Coulomb potential is much longer.

Thus the classical turning point corresponding to the strong force is R ∼ 1 fm. The other turning point r1 can be estimated for Z2 = 84, Z1 = 2 and E = 4 MeV by using

V(r1) = Z1 Z2 e²/(4πϵ0 r1) = E,  or  r1 = Z1 Z2 e²/(4πϵ0 E).   (60)

This gives r1 = 40 fm. It is useful to know that e²/(4πϵ0 R) = 0.96 MeV for R = 1 fm. The
barrier height would be

∆V = V(R) = Z1 Z2 e²/(4πϵ0 R).   (61)
This for Z2 = 84, Z1 = 2 works out to be about 160 MeV, which is much higher than E.
Therefore, in order to find the rate at which an α-particle gets emitted from a given nucleus, we need to find 1) the probability P of tunneling across the barrier depicted in Fig. 30 and 2) the rate at which the α-particle strikes the barrier from inside the parent nucleus. For finding P we use Eq. 59 with the turning points R and r1 to get

P = exp[−2 (√(2m)/ℏ) ∫_R^{r1} √(V(r) − E) dr] = exp[−2 (√(2m)/ℏ) ∫_R^{r1} √(Z1Z2e²/(4πϵ0 r) − E) dr]
  = exp[−2 (√(2mE)/ℏ) ∫_R^{r1} √(r1/r − 1) dr] = exp[−2 (r1 √(2mE)/ℏ) I].

Here m is the mass of the α-particle. In the above we have also used Eq. 60. The integral I is given by

I = (1/r1) ∫_R^{r1} √(r1/r − 1) dr = ∫_{R/r1}^{1} √(x⁻¹ − 1) dx
  = ∫_0^1 √(x⁻¹ − 1) dx − ∫_0^{R/r1} √(x⁻¹ − 1) dx.

The first term can be evaluated using x = sin²θ; it works out as π/2. The second term can be found with the approximation R/r1 ≪ 1, so that x⁻¹ − 1 ≈ x⁻¹ over the range of integration. This eventually leads to

I = π/2 − 2√(R/r1).
Thus we finally get

P = exp[−2 (r1 √(2mE)/ℏ)(π/2 − 2√(R/r1))].

By eliminating r1 in favor of E we get

P = exp[−2 (√(2m) Z1Z2e²/(4πϵ0ℏ)) (π/(2√E) − 2√(4πϵ0 R/(Z1Z2e²)))] = C1 e^{−B/√E}.

Here B = √(2m) Z1Z2e²/(4ϵ0ℏ) and ln C1 = 4√(2m Z1Z2e² R/(4πϵ0))/ℏ = 4R√(2mV(R))/ℏ.
The strike rate, or attempt rate, of the α-particle inside the nucleus can be estimated as v/2R, where v is its speed, leading to the net emission rate (v/2R)P. The lifetime, which is the inverse of the emission rate, is given by τ = (2R/v)P⁻¹. We expect v = √(2E/m). Thus we can write

τ = (2R/(vC1)) e^{B/√E}

or  log τ = log(2R√m/√(2E)) − log C1 + (B log e)/√E.
In the above, the E dependence of the last term is the most drastic compared to that of the first term; basically v does not change significantly over the energy range of interest, 4 to 10 MeV. Moreover the Z dependence of B, C1 or R is much weaker. Thus we write

log τ = C2 + (B log e)/√E.   (62)

Using a typical Z = 86 we obtain B log e ≈ 150 √MeV and C2 ≈ −53 for τ in seconds.
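Eq. 62 with these numbers already reproduces the enormous spread of lifetimes. A small sketch, taking the quoted values B log e ≈ 150 √MeV and C2 ≈ −53 at face value:

```python
import math

B_log_e, C2 = 150.0, -53.0       # values quoted above; tau in seconds, E in MeV

def log10_tau(E_MeV):
    """Geiger-Nuttall trend, Eq. 62: log10(tau) = C2 + B*log10(e)/sqrt(E)."""
    return C2 + B_log_e / math.sqrt(E_MeV)

for E in (4.0, 6.0, 8.0, 10.0):
    print(E, log10_tau(E))       # from ~10^22 s at 4 MeV down to microseconds at 10 MeV
```

A modest factor of ~2.5 in E thus changes τ by more than twenty orders of magnitude.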
The success of this theory becomes clear when one looks at the experimentally found
τ and E for a variety of nuclei and compares it with the theoretical plot as per Eq. 62.
This is shown in Fig. 31.

7 Nuclear fusion
Nuclear fusion involves the same physics, except this time two positively charged nuclei have to come within the range of the strong force to bind with each other. The nuclei again have to overcome the Coulomb barrier given by Eq. 61. For instance, if we consider the fusion reaction between two protons, i.e.
p + p → D + e+ + νe ,
the energy barrier (Z1 = Z2 = 1) works out to be about 1 MeV. Such fusion reactions actually occur in the Sun, whose core temperature is 10⁶ K and surface temperature is 6000 K. The corresponding thermal energies (i.e. k_B T) are about 100 eV and 0.5 eV. Thus the probability of overcoming the barrier by thermal activation, i.e. e^{−∆V/k_B T}, in the core of the Sun would be about e^{−10000}. If the two nuclei approach each other with kinetic energy (in the center of mass frame) E = k_B T, the tunneling probability, according to the Gamow factor e^{−B/√E} (with B ≈ 1 √MeV), will be about e^{−100}. We thus see that these nuclear reactions in the Sun actually occur due to quantum tunneling.
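The two probabilities can be compared side by side; a rough sketch using the order-of-magnitude numbers quoted above:

```python
import math

kT = 100e-6          # core thermal energy, ~100 eV expressed in MeV
barrier = 1.0        # Coulomb barrier for p + p, MeV
B = 1.0              # rough Gamow constant for p + p, sqrt(MeV)

log10_thermal = -(barrier / kT) * math.log10(math.e)        # activation over the barrier
log10_tunnel = -(B / math.sqrt(kT)) * math.log10(math.e)    # Gamow tunneling factor
print(log10_thermal, log10_tunnel)   # ~ -4343 vs ~ -43
```

Tunneling is absurdly improbable per encounter, yet still some 4300 orders of magnitude more likely than classical barrier hopping.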

8 Linear Vector Spaces (LVS):


Let’s start with a concrete example of linear vector space (LVS) that most of us are
already familiar with, i.e. the three dimensional LVS of vectors. We know how a vector

Figure 31: Experimentally measured half-lives of different nuclei as a function of emitted α-particle energy E. The dotted line shows the expectation as per the tunneling theory, Eq. 62, with B log e = 148 √MeV and C2 = −53.5. Note the log scale on the vertical axis and the −1/√E scale on the horizontal axis. Also note that the half-life and decay time τ differ by a factor ln 2.

is an abstract entity, and we choose a set of axes, or basis vectors, in order to work with vectors mathematically. The exact mathematical form of a given vector depends on the basis (or axes) that we choose in 3D space. These same concepts will be generalized for quantum mechanics, as the LVS plays a very important role in QM. To define an LVS formally:


it is a set of elements, say {V⃗1, V⃗2, V⃗3, V⃗4, . . . }, satisfying the following:

1. If V⃗1 & V⃗2 are two vectors, i.e. elements of this set, then λ1 V⃗1 + λ2 V⃗2 is also a vector. Here λ1, λ2 are real numbers, and so this LVS is called a real LVS. We similarly have complex LVS also.

2. There is a scalar product defined, i.e. V⃗1 · V⃗2, which is a scalar, i.e. a real number in the case of a real LVS. We know what this scalar product means for the LVS of vectors, i.e. it is the scalar or dot product of two vectors. This scalar product satisfies a few properties, namely,

(a) The scalar product is linear with respect to both V⃗1 & V⃗2, i.e.

(λ1a V⃗1a + λ1b V⃗1b) · V⃗2 = λ1a V⃗1a · V⃗2 + λ1b V⃗1b · V⃗2.   (63)

Similarly it is linear wrt V⃗2.

(b) If V⃗1 · V⃗2 = 0, then either V⃗1 or V⃗2 is zero, i.e. a null vector, or V⃗1 & V⃗2 are orthogonal, i.e. mutually perpendicular in the case of vectors in 3D space.

(c) V⃗ · V⃗ = 0 if and only if V⃗ is a null vector. The norm of the vector, i.e. ||V⃗|| (or simply |V⃗|), gets defined through ||V⃗||² = V⃗ · V⃗. The norm represents the magnitude of the vector and is a non-negative real number.

(d) Schwarz inequality: the scalar product satisfies

(V⃗1 · V⃗2)² ≤ ||V⃗1||² ||V⃗2||².   (64)

Here the equality holds if V⃗1 ∝ V⃗2, i.e. if V⃗1 & V⃗2 are parallel or co-linear.

With the scalar product defined one can also define the angle between two vectors, and this can be generalized to other LVS as well. These vector spaces are also called Hilbert spaces, a common terminology for any quantum system, as the Hilbert space encompasses all possible states wrt a particular degree of freedom, such as the spin degree of freedom or the spatial (3D) degree of freedom.

8.1 Basis Vectors:


In the LVS of vectors in 3D, a vector (an abstract entity) is mathematically written down after choosing a set of axes and the basis vectors along these axes, i.e. û1, û2, û3. These satisfy the orthonormality (orthogonality & normalization to one) condition, written as

ûi · ûj = δij,   (65)

with δij the Kronecker delta, i.e. δij = 1 if i = j and zero otherwise. Now any vector V⃗ can be written as V⃗ = V1 û1 + V2 û2 + V3 û3 = Σ_i Vi ûi. Representability of any vector (or element of the LVS) in terms of the ûi is very important for an acceptable basis-set. It is also called completeness of the basis, or the closure condition. Here Vi is a number and it



represents the i-th component of V⃗. These {Vi}, i.e. their values, depend on which basis we choose. In 3D we are free to choose any set of three orthogonal axes and the corresponding unit vectors as a basis. The number of independent basis vectors required for completeness is called the dimensionality of the LVS, which is 3 for the LVS of vectors in 3D.


After choosing this representation (or basis), a particular component of V⃗ is the scalar product with the corresponding basis vector, i.e. Vi = V⃗ · ûi. In terms of these components, we can write the scalar product between two vectors as

U⃗ · V⃗ = Σ_{ij} Ui ûi · Vj ûj = Σ_{ij} Ui Vj δij = Σ_i Ui Vi.   (66)


Further, if we look at U⃗ and V⃗ as two column matrices, then the scalar product U⃗ · V⃗ is equivalent to the matrix product (U⃗)ᵀ V⃗, where the first factor is the transpose of the column matrix U⃗, giving a row matrix. This will also help us appreciate the Dirac notation discussed later. This can be looked at more explicitly as

U⃗ · V⃗ = Σ_i Ui Vi = (U1 U2 U3)(V1; V2; V3) = (U⃗)ᵀ V⃗.

While we are all familiar with the above for the LVS of vectors in 3D space, it's useful to keep it in mind as a specific example of an LVS. This gives some intuition and is good for remembering certain relations for other LVS that are more complicated and indispensable in QM.

8.2 Linear Operators:


Again we start with operators in the 3D LVS of vectors. We have already come across operators that map a given vector onto another one, such as the rotation operator or the moment of inertia tensor. After all, a linear operator defines a one-to-one linear mapping from one element of the LVS onto another. So the rotation operator gives the vector obtained after rotating a given vector by a certain angle about a certain axis. Naturally this is a one-to-one mapping and it also satisfies linearity. Operators are also abstract entities that can be written concretely after choosing a basis, and their explicit mathematical form, just like that of vectors, depends on the chosen basis. One can find how an operator (or vector) changes once we go from one basis to another. We shall see this later. A few properties satisfied by these linear operators are:

1. Linearity: If A is an operator such that A V⃗1 = V⃗1′ & A V⃗2 = V⃗2′, then A(λ1 V⃗1 + λ2 V⃗2) = λ1 V⃗1′ + λ2 V⃗2′.

2. Operators’ product satisfies the associative law, i.e. (AB)C = A(BC) but not the
commutative law (in general), i.e. AB ̸= BA. There may be some special operators
that commute with each other and this is of special significance in QM.

To write an operator A in concrete mathematical form using a certain basis {ûi}, we analyze its operation on a given vector V⃗ in the same basis. Suppose V⃗ = Σ_i Vi ûi, V⃗′ = Σ_i Vi′ ûi and A V⃗ = V⃗′. Taking the scalar product of the last relation with ûj, we get ûj · (A V⃗) = ûj · V⃗′ = Vj′. This gives Vj′ = ûj · (A Σ_i Vi ûi), or Vj′ = Σ_i (ûj · A ûi) Vi. Thus we see that we can find the result of A operating on any given vector if we know the entities {ûj · A ûi} corresponding to a given operator A. These entities are called the matrix elements of A, i.e. Aji = ûj · A ûi, and so we get Vj′ = Σ_i Aji Vi.
It is more convenient to work with the matrix forms of operators and vectors. With the matrix elements defined above, we can write V⃗′ = A V⃗ as
 ′   
V1 A11 A12 A13 V1
V2′  = A21 A22 A23  V2  (67)
V3′ A31 A32 A33 V3

The above defines the matrix forms of vectors, i.e. elements of the LVS, and of operators after choosing a basis, i.e. a complete set of basis vectors. One can also define the general matrix element of an operator as V⃗1 · A V⃗2, which can be written in matrix form and eventually simplifies to Σ_{ij} V1j Aji V2i. We should realize that the linearity of operators and vectors plays an important role in all these simplifications.

8.3 Change of bases:


A change of basis is quite important in QM, since finding the eigen-values and eigen-states of an operator, such as the Hamiltonian, basically amounts to finding a basis (or basis-set) in which the operator is diagonal. We emphasize that a change of basis does not actually change the state or the operator physically, but its elements, and thus its appearance, change. We consider two different orthonormal bases {ûi} and {v̂i}, which in the case of the LVS of vectors in 3D can be associated with two different sets of orthogonal axes rotated wrt each other. Given a vector or operator in one basis, our objective is to find them in the other basis.


Suppose in the {ûi} basis a vector V⃗ has components {Vi}, while in the {v̂i} basis the components are {Vi′}. We have to find how the two sets of components are related to each other. Clearly the two represent the same vector, but in two different bases. Writing the two forms of the same vector as equal, i.e. Σ_i Vi ûi = Σ_j Vj′ v̂j, and taking its scalar product with v̂k, we get Vk′ = Σ_i Vi v̂k · ûi. Defining

Ski = v̂k · ûi   (68)

we get¹ the linear relation Vk′ = Σ_i Ski Vi. Thus we see that

V⃗′ = S V⃗   (69)
¹ One can also choose an alternative convention where vectors are represented by row matrices in place of columns, and then Vk′ = Σ_i Vi S′ik with S′ik = ûi · v̂k. We see clearly that S′† = S, or S′ᵀ = S, since all elements of S or S′ are real here. These spaces of row vectors and column vectors are conjugate (or dual) to each other, and we stick to the column-vector convention. The scalar product, as discussed earlier, is the product of the row vector and the column vector.

gives the vector in the new basis. We can see that for a real LVS the transformation matrix S is a real matrix with elements Ski = v̂k · ûi, the scalar products between the elements of the two bases. Given the orthonormality of the two bases, one can verify that SSᵀ is the unit matrix, i.e. S is unitary (orthogonal, being real). Thus one can also write the inverse transformation, i.e. Vk = Σ_i Sᵀki Vi′, where Sᵀ is the transpose of S. Using this S we can also write v̂k = Σ_i ûi (ûi · v̂k), i.e.

v̂k = Σ_i Ski ûi.   (70)

Similarly for an operator A: in the {ûi} basis its matrix elements are {Aij}, while in the {v̂i} basis they are {A′ij}. We have to find how the two are related to each other. For this we use Eq. 70 to write A′kl = v̂k · A v̂l = Σ_i Ski ûi · A Σ_j Slj ûj = Σ_{ij} Ski Aij Slj = Σ_{ij} Ski Aij Sᵀjl. Thus in the {v̂i} basis the operator A will be given by

A′ = S A Sᵀ.   (71)

We see that the change of basis is captured, for both vectors and operators, by the unitary matrix S, and thus this is also called a unitary (or, for a real LVS, similarity) transformation.

We can also see that a scalar expression such as V⃗1 · A V⃗2 works out the same no matter which basis is chosen. For this it is convenient to look at the scalar product in matrix form. V⃗1 · A V⃗2 in the second basis is V⃗1′ · A′ V⃗2′, which in matrix notation is (V⃗1′)ᵀ A′ V⃗2′. Using the change-of-basis matrix S, this gives (S V⃗1)ᵀ (S A Sᵀ)(S V⃗2) = V⃗1ᵀ Sᵀ S A Sᵀ S V⃗2 = V⃗1ᵀ A V⃗2 = V⃗1 · A V⃗2, using the unitary property of S. This illustrates that such scalar expressions are independent of the basis. It is useful to work out some such examples to gain experience.
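This bookkeeping is easy to verify explicitly; a numpy sketch for the 3D real LVS (the rotation angle and the matrices are arbitrary examples):

```python
import numpy as np

t = 0.7                                        # arbitrary rotation angle about z
S = np.array([[ np.cos(t), np.sin(t), 0.0],
              [-np.sin(t), np.cos(t), 0.0],
              [ 0.0,       0.0,       1.0]])   # rows: new basis vectors in the old basis

V1 = np.array([1.0, 2.0, 3.0])
V2 = np.array([-1.0, 0.5, 2.0])
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 0.5],
              [0.0, 0.5, 1.0]])

assert np.allclose(S @ S.T, np.eye(3))         # S S^T = 1: S is unitary (orthogonal)
lhs = V1 @ (A @ V2)                            # V1 . (A V2) in the old basis
rhs = (S @ V1) @ (S @ A @ S.T) @ (S @ V2)      # same expression after Eqs. 69 and 71
print(lhs, rhs)                                # identical up to rounding
```

Changing the angle t leaves lhs and rhs equal, which is precisely the basis independence argued above.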

9 Complex LVS of wave-functions:


This is the LVS of well behaved, i.e. normalizable and analytic, complex wave-functions. Unlike the LVS of vectors in 3D, where any vector can be written in terms of three independent numbers, this LVS is infinite dimensional. Thus the set of required basis wave-functions (also called basis vectors) is infinite. We re-state the definition of this LVS, as there are differences in detail:

1. If ψ1(x) and ψ2(x) are two valid wave-functions, i.e. elements of this LVS, then c1ψ1(x) + c2ψ2(x) is also a valid wave-function. Here c1 and c2 are complex numbers.

2. The scalar product is defined as

⟨ψ1|ψ2⟩ = ∫_{−∞}^{∞} ψ1*(x) ψ2(x) dx.   (72)

The notation used on the left of this equation for the scalar product is called the Dirac notation; we'll discuss it in more detail later. The scalar product satisfies the following:

(a) This scalar product is linear wrt ψ2 and it is anti-linear wrt ψ1 , i.e. ⟨(c1a ψ1a +
c1b ψ1b )|ψ2 ⟩ = c∗1a ⟨ψ1a |ψ2 ⟩+c∗1b ⟨ψ1b |ψ2 ⟩ and ⟨ψ1 |(c2a ψ2a +c2b ψ2b )⟩ = c2a ⟨ψ1 |ψ2a ⟩+
c2b ⟨ψ1 |ψ2b ⟩
(b) If ⟨ψ1 |ψ2 ⟩ = 0 then either ψ1 (x) or ψ2 (x) is zero or ψ1 (x) and ψ2 (x) are or-
thogonal.
(c) The norm of a wave function ψ(x) gets defined using the scalar product as

||ψ||² = ⟨ψ|ψ⟩ = ∫_{−∞}^{∞} |ψ(x)|² dx.   (73)

Thus the norm is real and non-negative, with a well defined square-root; it gives the overall probability in QM.
(d) The scalar product satisfies the Schwarz inequality: |⟨ψ1 |ψ2 ⟩|2 ≤ ||ψ1 ||2 ||ψ2 ||2 .

One can easily generalize the notion of linear operators to this LVS as well. An operator is still a linear mapping from one wave-function to another, i.e. ϕ(x) = Aψ(x). The linearity implies A[c1ψ1(x) + c2ψ2(x)] = c1[Aψ1(x)] + c2[Aψ2(x)]. One can re-examine this with the already encountered operators, such as the position operator x, p, H, etc. A few others are the parity operator, defined by Πψ(x) = ψ(−x), and the translation-by-a operator, i.e. Ta ψ(x) = ψ(x + a). The product of operators is also easily generalized. The products again follow the associative law but not, in general, the commutative law. The latter is important for QM, and one defines the commutator between two operators A & B as [A, B] = AB − BA; e.g. [x, px] = iℏ.
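The commutator [x, px] = iℏ can be checked numerically on any smooth test function; a sketch (ℏ = 1, with p = −i d/dx approximated by a central difference; the test function is an arbitrary choice):

```python
import cmath

hbar = 1.0
h = 1e-5                                   # finite-difference step

def d(f, x):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

psi = lambda x: cmath.exp(-x**2 + 1j * x)  # arbitrary smooth test function

def commutator_on_psi(x):
    xp = x * (-1j * hbar) * d(psi, x)             # x (p psi)
    px = -1j * hbar * d(lambda y: y * psi(y), x)  # p (x psi)
    return xp - px                                # should equal i*hbar*psi(x)

x0 = 0.3
print(commutator_on_psi(x0), 1j * hbar * psi(x0))
```

The two printed numbers agree to finite-difference accuracy, illustrating that [x, p] acts as multiplication by iℏ on any state.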

9.1 Dirac notation:


The quantum mechanical state of a system, specified by a wave-function, is an abstract entity, and once we put it down on paper as a wave-function we have already chosen a basis. That basis is the x-basis, i.e. {|x⟩}. We have seen that we can represent the same quantum state in the p-basis, i.e. {|p⟩}, by using the Fourier transform (FT) as
\[ g_\psi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \psi(x)\, e^{-ipx/\hbar}\, dx \tag{74} \]
or we can transform it back using the inverse Fourier transform (IFT) as
\[ \psi(x) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} g_\psi(p)\, e^{ipx/\hbar}\, dp. \tag{75} \]
Thus we can choose a representation (or basis) to write the abstract state in a mathematical
form. Dirac introduced a notation to handle these abstract states without choosing a
specific basis. This notation consists of three expressions:
1. |ψ⟩, pronounced as ket-ψ,

2. ⟨ϕ|, i.e. bra-ϕ, and

3. ⟨ϕ|ψ⟩, i.e. bra-ϕ-times-ket-ψ.

The last quantity is the scalar product, in abstract form (without a basis), which results
in a complex number. This scalar product can be expanded as per Eq. 72 once we choose
the x-basis. A useful identity that can be verified easily using Eq. 72 is ⟨ϕ|ψ⟩* = ⟨ψ|ϕ⟩. In
mathematical jargon there are two LVSs, conjugate (or dual) of each other, with element
|ψ⟩ having conjugate as ⟨ψ|. When we work with matrix algebra, it turns out that these
two elements are actually Hermitian conjugates of each other, i.e. (|ψ⟩)† = ⟨ψ|, with the
ket being equivalent to a column matrix (or column vector) and a bra being equivalent to
a row matrix.
The wave-function ψ(x) actually gives a complex number at a specific value of x, which
basically is the scalar product of the state |ψ⟩ with the basis-state corresponding to x, i.e.
|x⟩. So ψ(x) = ⟨x|ψ⟩. One can easily draw an analogy with the LVS of vectors in 3D, where
Vi, i.e. the ith component of a vector V, is the scalar product ûi · V. Similarly in the p-representation
gψ(p) = ⟨p|ψ⟩. Obviously, gψ(p) and ψ(x) are two different mathematical functions, but
they represent the same abstract quantum state. This is analogous to the components {Vi} and {Vi′} being
different in two different bases while representing the same vector in 3D space.
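As a quick numerical sanity check of the two representations, the sketch below computes gψ(p) from a Gaussian ψ(x) via Eq. 74 on a finite grid and verifies that the total probability is the same in both representations. This is only an illustration; the Gaussian width, the grids, and ℏ = 1 are my own choices, not from the notes.

```python
import numpy as np

hbar = 1.0
s = 1.0                                   # illustrative Gaussian width
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
psi = (np.pi * s**2) ** -0.25 * np.exp(-x**2 / (2 * s**2))  # normalized Gaussian

p = np.linspace(-5, 5, 201)
dp = p[1] - p[0]
# g(p) = (1/sqrt(2*pi*hbar)) * integral psi(x) exp(-i p x / hbar) dx  (Eq. 74)
g = np.array([np.sum(psi * np.exp(-1j * pk * x / hbar)) * dx
              for pk in p]) / np.sqrt(2 * np.pi * hbar)

# The norm is basis-independent: both sums approximate <psi|psi> = 1
norm_x = np.sum(np.abs(psi) ** 2) * dx
norm_p = np.sum(np.abs(g) ** 2) * dp
print(norm_x, norm_p)
```

The agreement of the two norms is the discrete counterpart of ⟨ψ|ψ⟩ being the same number whichever basis we expand in.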
We saw earlier that an abstract operator, defined over a LVS, gets a concrete form in
terms of its matrix elements. The same idea holds here with a general matrix element
of an operator A being ⟨ϕ|A|ψ⟩. It is insightful to recall ûj · Aûi for the LVS of vectors in
3D. In the x-representation, the general matrix element for an operator A would be ⟨x′|A|x⟩.
This is non-trivial since x is a continuous basis, which we postpone for later; still we can
write the matrix element for the Hamiltonian operator for a particle in a potential V(x) as
\[ \langle x'|H|x\rangle = \delta(x'-x)\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right] \quad\text{or}\quad \langle x'|p|x\rangle = -i\hbar\,\delta(x'-x)\frac{\partial}{\partial x}. \]
Please do not get misled into believing that such H or p are diagonal operators in the x-basis,
as the derivative terms make them non-local and thus non-diagonal operators.
One can get certain insights into this aspect by analyzing the finite-element implemen-
tation of these operators for numerical calculations. In this case ψ(x) will take complex
values at discrete x locations, say . . .−3a, −2a, −a, 0, a, 2a, 3a . . ., making ψ(x) a column
matrix. Numerically the wave function ϕ(x) that would result from pψ(x) can easily be
imagined as a square matrix operating on the column matrix. Clearly this square matrix
will not be diagonal. More precisely, the equation |ϕ⟩ = p|ψ⟩ can be written as,
 .  . .. .. . .   ..   
.. . . ... ..
.
..
.
..
.
..
. . . . .
..
.
 ′      
c−3  · · · 0 1 0 0 0 0 0 · · · c−3  c−2 − c−4 
 ′      
c−2  · · · −1 0 1 0 0 0 0 · · · c−2  c−1 − c−3 
 ′      
c−1  ℏ · · · 0 −1 0 1 0 0 0 · · · c−1 
ℏ  c0 − c−2 
 ′      
 c0  = −i · · · 0 0 −1 0 1 0 0 · · ·  c0  = −i  c1 − c−1 
 ′  2a     2a  
 c1  · · · 0 0 0 −1 0 1 0 · · ·   c1   c2 − c0 
 ′      
 c2  · · · 0 0 0 0 −1 0 1 · · ·  c2   c3 − c1 
 ′      
 c3  · · · 0 0 0 0 0 −1 0 · · ·  c3   c4 − c2 
.. . .. .. .. .. .. .. . . .. ..
. . . . .. . . . . . . . . .

In the above matrix form it is also easy to see that p is Hermitian: the real tridiagonal
matrix is antisymmetric, and it is multiplied by −i. Also one can write the discrete
matrix elements of p as pij = −i(ℏ/2a)(δi,j−1 − δi,j+1), which leads to
⟨ϕ|p|ψ⟩ = −iℏ Σn ϕ*(na)[ψ((n+1)a) − ψ((n−1)a)]/2a. This form
helps in better comprehending the form ⟨x′|p|x⟩ = −iℏ δ(x′−x) ∂/∂x. Now it is also quite
straightforward to conclude that an operator of the form δ(x′−x)V(x) is diagonal
in the x-basis.
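The finite-difference momentum matrix above can be sketched in a few lines. The grid size N, the spacing a, and the open (truncated) boundaries are illustrative assumptions, not part of the notes.

```python
import numpy as np

hbar, a, N = 1.0, 0.1, 8          # illustrative values
p = np.zeros((N, N), dtype=complex)
for i in range(N):
    if i + 1 < N:
        p[i, i + 1] = -1j * hbar / (2 * a)   # +1 on the first superdiagonal
    if i - 1 >= 0:
        p[i, i - 1] = +1j * hbar / (2 * a)   # -1 on the first subdiagonal

# Hermiticity: an antisymmetric real matrix times -i is Hermitian
print(np.allclose(p, p.conj().T))            # True

# p is clearly not diagonal, i.e. a non-local operator in the x-basis
print(np.count_nonzero(p - np.diag(np.diag(p))))  # 2*(N-1) off-diagonal entries
```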
We'll be working with state wave-functions and operators in the specific orthonormal bases
discussed in the next section. As we shall see, a wave function |ψ⟩ in a discrete
basis reduces to an infinite set of discrete complex numbers, ci = ⟨ui|ψ⟩, which we can also
imagine as an infinite-size column matrix. The Hermitian conjugate of this wave function,
i.e. ⟨ψ|, would then be a row matrix consisting of the ci*. Similarly an operator A in such a
discrete basis gets defined by its matrix elements Aij = ⟨ui|A|uj⟩. Thus an operator can
be easily imagined as an infinite-dimensional square matrix with elements Aij, and we can
easily define its Hermitian conjugate, A†, consisting of elements (A†)ij = Aji*.
This also gives an idea about the algebraic expressions involving bras, kets, operators
and complex numbers. One can, in principle come across four valid types of entities that
are eventually equivalent to

1. a complex number, like c1 ⟨ϕ|A|ψ⟩.

2. a column matrix, like c1 A|ψ⟩.

3. a row matrix, like c1 ⟨ψ|A.

4. a square matrix, like c1 A|ψ⟩⟨ϕ|.

The expressions equivalent to product of two kets or two bras (like |ψ⟩|ϕ⟩ or ⟨ψ|⟨ϕ|) are
unphysical unless one is looking at the direct product states, which is beyond this course.
One often has to convert such a composite expression, e.g. c1|ψ1⟩⟨ψ2|A|ψ3⟩⟨ψ4|, into its
Hermitian conjugate, i.e. [c1|ψ1⟩⟨ψ2|A|ψ3⟩⟨ψ4|]†. For this, the prescription is:

1. The order of entities in the product gets reversed and the Hermitian conjugation gets
transferred to the individual entities, e.g. the operator A becomes A†.

2. The placement of complex numbers (or equivalent expressions) can be arbitrarily
changed as they commute with everything else.

3. For bra and kets we use (|ψ⟩)† = ⟨ψ| or vice versa.

So the expression (c1 |ψ1 ⟩⟨ψ2 |A|ψ3 ⟩⟨ψ4 |)† would simplify to (c∗1 |ψ4 ⟩⟨ψ3 |A† |ψ2 ⟩⟨ψ1 |) which
is also equivalent to (c∗1 ⟨ψ3 |A† |ψ2 ⟩|ψ4 ⟩⟨ψ1 |). A useful exercise at this stage is to analyze
the Hermitian conjugates of A† , cA, AB, A + B, ⟨ψ|A, A|ϕ⟩.
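A minimal numerical check of these conjugation rules, with random 3×3 matrices standing in for operators and a column vector for a ket (the dimension and seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
c = 2.0 - 3.0j

dag = lambda M: M.conj().T                 # Hermitian conjugate

assert np.allclose(dag(dag(A)), A)                       # (A†)† = A
assert np.allclose(dag(c * A), c.conjugate() * dag(A))   # (cA)† = c* A†
assert np.allclose(dag(A @ B), dag(B) @ dag(A))          # (AB)† = B† A†
assert np.allclose(dag(A + B), dag(A) + dag(B))          # (A+B)† = A† + B†

ket = rng.normal(size=(3, 1)) + 1j * rng.normal(size=(3, 1))
assert np.allclose(dag(A @ ket), dag(ket) @ dag(A))      # (A|ϕ⟩)† = ⟨ϕ|A†
print("all conjugation rules verified")
```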
When looking at composite expressions consisting of sums of such terms, one has to
realize that we can only add similar type expressions, i.e. column (row) matrix with a
column (row) matrix, a square matrix with a square matrix and complex number with
another complex number. Thus expressions such as c1 |ψ⟩ + c2 ⟨ϕ| or c1 |ψ⟩ + c2 or c1 |ψ⟩ + A
are meaningless.

9.2 Orthonormal bases for LVS of wave-functions:


The complex LVS of wave-functions is infinite dimensional as it requires an infinite number
of basis-vectors (or basis functions) to represent a general wave-function. In fact, two
types of basis-sets are possible for this LVS, namely discrete and continuous.

We have already seen examples of two continuous basis-sets, namely x and p. The
continuous bases are more often used, though they pose more conceptual difficulty. In
fact, it turns out that the complex-functions corresponding to basis-sets for continuous
bases, such as exp(ipx/ℏ) or δ(x − x0 ) are not normalizable and thus they are not valid
wave-functions themselves. This means that these continuous basis-sets do not belong to
the LVS itself. However, if we take a careful superposition of these basis-functions we can
ensure that the resulting complex functions are valid wave-functions. On the other hand
these basis-functions (like plane-waves) are often used, like in scattering, despite their
non-normalizability. However, one can justify them by putting some envelope (such as a
box function or some other function vanishing at infinity) in real or k-space. Thus such
functions essentially retain their plane wave character but they become normalizable. In
this case one can look at the basis-functions as some sort of limit of a series of wave-
functions.
The discrete bases are also encountered in QM and these are easier to comprehend.
An example is the set of eigen states for a particle in an infinite potential well, i.e.
ψn(x) = √(2/L) sin(nπx/L). This set {ψn(x)} gives a discrete set of functions whose superposition
will give any arbitrary wave-function that vanishes at x = 0 & x = L. Another example
is the set of eigen state wave-functions of a simple harmonic oscillator.
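As a sketch, one can check the orthonormality ∫ψm(x)ψn(x)dx = δmn of the infinite-well set numerically (L = 1 and the grid are illustrative choices; the functions are real so no conjugate is needed):

```python
import numpy as np

L = 1.0
x = np.linspace(0, L, 20001)
dx = x[1] - x[0]
u = lambda n: np.sqrt(2 / L) * np.sin(n * np.pi * x / L)

# Discrete approximation of the scalar product <u_m|u_n>
overlap = lambda m, n: np.sum(u(m) * u(n)) * dx

print(round(overlap(1, 1), 6), round(overlap(1, 2), 6), round(overlap(3, 3), 6))
# diagonal overlaps come out 1, off-diagonal ones 0, i.e. <u_m|u_n> = delta_mn
```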
In fact, the complete set of eigen-states corresponding to a given Hamiltonian forms a
basis. However, we may end up with a mixed basis-set in case the Hamiltonian has both
bound states and scattering states. The two examples (box and SHO) that I gave do not
have scattering states. It is much easier to discuss the formalism for discrete bases and we
can generalize to continuous bases by replacing the discrete summations by appropriate
integrals. The summations will also span infinite range as this is an infinite-dimensional
LVS. We shall use the Dirac notation and I hope you’ll get more comfortable with it as
we move forward. The Dirac notation is so convenient and popular that I cannot imagine
doing any QM without it.
We start with a discrete but infinite set {ui (x)} of orthonormal basis-functions, i.e.
{u1(x), u2(x), u3(x), u4(x), ...}. Now these are already written in the x-representation;
however, we can bring in the Dirac notation to represent an abstract basis-state as |ui⟩ and its
Hermitian conjugate as ⟨ui |. This basically means ui (x) = ⟨x|ui ⟩ and u∗i (x) = ⟨ui |x⟩.
This (or any) basis-set satisfies the following:
1. {|uj⟩} or {ui(x)} forms a complete set, i.e. any valid state |ψ⟩ with wave-function
ψ(x) can be written as a linear superposition of these basis states. We can write this in
both Dirac notation and otherwise: |ψ⟩ = Σi ci|ui⟩ or ψ(x) = Σi ci ui(x). The ci
values will be the same in both ways of writing. Here the ci are complex numbers and the
sum, in general, runs over an infinite range of i.

2. Orthonormality:
\[ \langle u_i|u_j\rangle = \int_{-\infty}^{\infty} u_i^*(x)\,u_j(x)\,dx = \delta_{ij}. \tag{76} \]

The coefficients ci represent the ui(x)-component in ψ(x). To find ci we take the
scalar product of |ψ⟩ with |ui⟩ to get ⟨ui|ψ⟩ = ∫⟨ui|x⟩⟨x|ψ⟩ dx = ∫ui*(x)ψ(x) dx =
∫ui*(x) Σj cj uj(x) dx = Σj cj ∫ui*(x)uj(x) dx = Σj cj δij = ci. Thus ci = ⟨ui|ψ⟩ and we
can write ψ(x) = ⟨x|ψ⟩ = Σi ⟨x|ui⟩⟨ui|ψ⟩ = Σi ci ui(x). Effectively, what we have done
here is to introduce a unity operator, i.e.
\[ \sum_i |u_i\rangle\langle u_i| = 1 \tag{77} \]
\[ \text{or}\quad \int_{-\infty}^{\infty} |x\rangle\langle x|\,dx = 1 \tag{78} \]

in between the bra and ket in ⟨x|ψ⟩ or ⟨ui |ψ⟩. The operator in Eq.77 is a sum of operators
|ui ⟩⟨ui |, which is called a projection operator as it projects out |ui ⟩ component from any
given state |ψ⟩, i.e. |ui ⟩⟨ui |[|ψ⟩] = |ui ⟩⟨ui |ψ⟩ = |ui ⟩ci . One can similarly define a general
projection operator Pϕ = |ϕ⟩⟨ϕ| to project out |ϕ⟩ from any given state.
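A small concrete sketch of projectors and closure in a 3-dimensional discrete basis (the finite dimension is purely illustrative; the LVS discussed here is infinite dimensional):

```python
import numpy as np

u1, u2, u3 = np.eye(3)                    # standard basis vectors as |u_i>
psi = np.array([1.0, 2.0, 3.0j])          # components c_i = <u_i|psi>

P1 = np.outer(u1, u1.conj())              # projector |u1><u1|
print(P1 @ psi)                           # projects out c_1 |u1>, i.e. [1, 0, 0]

# Summing all projectors gives the identity (closure, Eq. 77)
I = sum(np.outer(u, u.conj()) for u in (u1, u2, u3))
print(np.allclose(I, np.eye(3)))          # True

# A projector is idempotent: applying it twice changes nothing
print(np.allclose(P1 @ P1, P1))           # True
```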
Also we can work out the scalar product between the above ψ(x) and another wave-function
ϕ(x), which in the {ui(x)} basis is ϕ(x) = Σi bi ui(x) = Σi ⟨ui|ϕ⟩⟨x|ui⟩. Using the
unity operator from Eq. 77, the scalar product will give ⟨ϕ|ψ⟩ = Σi ⟨ϕ|ui⟩⟨ui|ψ⟩ = Σi bi*ci.
Or we can work it out using Eq. 78 as ⟨ϕ|ψ⟩ = ∫⟨ϕ|x⟩⟨x|ψ⟩ dx = ∫ϕ*(x)ψ(x) dx =
∫ Σij bi*ui*(x) cj uj(x) dx, which leads to the same final expression after using the orthonormality
of {ui(x)}. Also we can see ⟨ψ|ψ⟩ = Σi |ci|² = ∫|ψ|² dx.
The identity operators in Eq.77 & 78 are statements of closure (or completeness) for
the basis sets {|x⟩} and {|ui ⟩} in the sense that any wave-function can be written as linear
superposition of these basis functions.
There is another relation that follows from this
completeness: ψ(x) = ⟨x|ψ⟩ = Σi ⟨x|ui⟩⟨ui|ψ⟩ = Σi ui(x) ∫ui*(x′)ψ(x′) dx′. Here we have
written the scalar product ⟨ui|ψ⟩ using the integral form in x-representation. This yields
ψ(x) = ∫[Σi ui(x)ui*(x′)] ψ(x′) dx′ and for this to be true for any arbitrary ψ we must
have
\[ \sum_i u_i(x)\,u_i^*(x') = \sum_i \langle x|u_i\rangle\langle u_i|x'\rangle = \delta(x - x'). \tag{79} \]

This is equivalent to Eq. 77 (and Eq. 78) if one inserts it in ⟨x|x′⟩, which is δ(x − x′). This
can be thought of as the orthonormality condition for the continuous x-basis.
As discussed in the previous section, an operator gets its concrete form in a given basis
as a square matrix, which one can use to find the operator's matrix elements between
two general wave-functions. For instance, one can write ⟨ϕ|A|ψ⟩ by inserting the unity
operator of Eq. 77 in two places to get ⟨ϕ| [Σi |ui⟩⟨ui|] A [Σj |uj⟩⟨uj|] |ψ⟩. This gives
Σij ⟨ϕ|ui⟩⟨ui|A|uj⟩⟨uj|ψ⟩, i.e. Σij bi* Aij cj, i.e. the product of three matrices similar
to Eq. 67 except for the infinite number of rows and columns in each. We can also use
Eq. 78 in ⟨ϕ|A|ψ⟩ to write it as ⟨ϕ| [∫|x⟩⟨x| dx] A [∫|x′⟩⟨x′| dx′] |ψ⟩ to get
∫∫ ϕ*(x)⟨x|A|x′⟩ψ(x′) dx dx′, which becomes ∫ϕ*(x)Aψ(x) dx after we use ⟨x|A|x′⟩ =
Aδ(x′ − x).
Similarly, one can analyze other entities, such as A|ψ⟩ which will be a column matrix
resulting from the product of a square matrix A and column matrix for |ψ⟩, or ⟨ψ|A which
will be a row matrix resulting from the product of a row matrix corresponding to ⟨ψ| with
square matrix A. One can show that in the former case the ith element of the column
matrix will be given by Σj Aij cj, while in the latter case the ith element of the row matrix
will be Σj cj* Aji. It is a good exercise to think of the Hermitian conjugates of these two
results.

9.3 Basis change using Dirac notation:
In the previous section we basically practiced a change of basis between the |x⟩-basis
and the discrete |ui⟩-basis. We can also look at a change of basis between two discrete
bases, i.e. from |ui⟩ to |vi⟩, and how it transforms the wave functions and operators. In
fact, one example is the change from the x to the p basis, and we know that the wave functions
transform according to Eqs. 74 & 75, which can be revisited using the Dirac notation
with ⟨x|p⟩ = (1/√(2πℏ)) exp(ipx/ℏ) or ⟨p|x⟩ = (1/√(2πℏ)) exp(−ipx/ℏ). We also discussed
in section 8.3 the discrete change of basis, but without using the Dirac notation and for
a real LVS. The Dirac notation makes the basis change very convenient, at least notation-wise.
We start with |ψ⟩ = Σi ci|ui⟩, ci = ⟨ui|ψ⟩, and |ψ⟩ = Σi c′i|vi⟩, c′i = ⟨vi|ψ⟩. Taking
the scalar product of the first form with |vj⟩ we get ⟨vj|ψ⟩ = Σi ci⟨vj|ui⟩ and then, defining
Sji = ⟨vj|ui⟩, we get c′j = Σi Sji ci. This transformation matrix S thus defines the linear
relation between the c′i and the ci; the state vector in the |vi⟩ basis is given by the column
matrix that results when one multiplies S with the column matrix of the |ui⟩ basis. We
can also think of the inverse transformation, in which case one has to multiply S⁻¹
with the column matrix in the |vi⟩ basis. For two orthonormal bases, S works out to be
unitary, i.e. SS† = 1, and so S⁻¹ = S†. It is left as an exercise to prove the unitarity
starting from the elements of S, i.e. Sji = ⟨vj|ui⟩, and using the orthonormality of the two
bases.
We can also look at how an operator A transforms, as follows. We have,

\[
\begin{aligned}
A'_{ij} = \langle v_i|A|v_j\rangle
&= \langle v_i|\Big[\sum_k |u_k\rangle\langle u_k|\Big]\,A\,\Big[\sum_l |u_l\rangle\langle u_l|\Big]|v_j\rangle \\
&= \sum_{kl} \langle v_i|u_k\rangle\langle u_k|A|u_l\rangle\langle u_l|v_j\rangle \\
&= \sum_{kl} S_{ik}\,A_{kl}\,S^{\dagger}_{lj}.
\end{aligned}
\]

Thus we can see that the operator in the |vi⟩ basis is given by A′ = SAS†. We can also argue
that the trace and determinant of the operator remain independent of the basis by using
the fact that, for a product of two matrices, the trace and determinant are independent
of the order of the product.
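These basis-change relations can be sketched numerically. Below, {|ui⟩} is the standard basis of C³ and {|vi⟩} comes from the columns of a random unitary; the dimension and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
V, _ = np.linalg.qr(M)          # columns of V: orthonormal basis {|v_i>}
U = np.eye(3)                   # columns of U: standard basis {|u_i>}

S = V.conj().T @ U              # S_ji = <v_j|u_i>
print(np.allclose(S @ S.conj().T, np.eye(3)))   # S is unitary

A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A_new = S @ A @ S.conj().T      # A' = S A S†
# Trace and determinant are basis-independent
print(np.isclose(np.trace(A_new), np.trace(A)))
print(np.isclose(np.linalg.det(A_new), np.linalg.det(A)))
```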

10 Quantum Operators:
We review some of the properties of operators that are useful for QM. We begin by stating
that the operators associated with physically measurable quantities are always Hermitian
as they are guaranteed to have real eigen values. We recall that a measurement always
yields an eigen value. We have already seen in context of Hamiltonian and Schrödinger
equation what eigen values and eigen states mean. We shall be using the Dirac notation
for further discussions of this section.

10.1 Eigen values and eigen vectors:


A non-trivial (i.e. not null) state |ψλ⟩ is called an eigen state with eigen value λ of an
operator A if

A|ψλ ⟩ = λ|ψλ ⟩. (80)

When we solved the TISE, we saw how the eigen-energies and eigen-functions are calculated
by solving the differential equation. That was the case in the continuous x-basis. In a discrete
basis for N-dimensional systems, the eigen value equation, i.e. Eq. 80, leads to N linear
homogeneous equations for the N elements of the eigen vector |ψλ⟩. This set of N
equations admits non-trivial solutions if and only if Det[A − λI] = 0. This leads to an
Nth order polynomial equation in λ which will have N roots, say {λi}. In case of equal
roots we get degenerate eigen vectors, i.e. a set of linearly independent eigen-vectors having
the same eigen value. The ith eigen vector can then be found by using λi in Eq. 80 to find
the elements of |ψλi⟩.
Theorem-1: If A is Hermitian then all its eigen values λ are real and, conversely, if
all eigen values of A are real (with a complete orthonormal set of eigen vectors) then it is Hermitian.

Proof: Given Hermitian A, i.e. A† = A, and A|ψλ⟩ = λ|ψλ⟩, consider
⟨ψλ|A|ψλ⟩ = λ⟨ψλ|ψλ⟩. On the one hand ⟨ψλ|A|ψλ⟩* = λ*⟨ψλ|ψλ⟩, and on the other hand
⟨ψλ|A|ψλ⟩* = ⟨ψλ|A†|ψλ⟩ = λ⟨ψλ|ψλ⟩ since A† = A. Thus we
get λ⟨ψλ|ψλ⟩ = λ*⟨ψλ|ψλ⟩, which implies λ* = λ since |ψλ⟩ is not a null state.
Now for the converse, i.e. given all eigen values to be real, we look at A in its eigen
basis. In this basis A is a diagonal operator with the real eigen values as the diagonal
elements, and thus A is clearly Hermitian in its eigen basis. Now we go back to
the original (non-eigen) basis using a unitary transformation, so that we get SAS†. The
task then is to prove that if A is Hermitian then SAS† is also Hermitian, which we can see
trivially as (SAS†)† = (S†)†A†S† = SAS†.

Theorem-2: Two eigen-vectors, corresponding to different eigen values, of a Hermi-


tian operator are orthogonal.

Proof: We have A|ϕ1⟩ = λ1|ϕ1⟩ and A|ϕ2⟩ = λ2|ϕ2⟩, with λ1, λ2 real by Theorem-1. Now consider
⟨ϕ2|A|ϕ1⟩* = λ1*⟨ϕ2|ϕ1⟩* = λ1⟨ϕ1|ϕ2⟩. This is also equal to ⟨ϕ1|A†|ϕ2⟩ = ⟨ϕ1|A|ϕ2⟩, which gives λ2⟨ϕ1|ϕ2⟩. Subtracting the two, we
get (λ2 − λ1)⟨ϕ1|ϕ2⟩ = 0. Thus we get the required result that if λ1 ≠ λ2 then ⟨ϕ1|ϕ2⟩ = 0,
i.e. |ϕ1⟩ and |ϕ2⟩ are orthogonal.
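Both theorems can be checked numerically with a random Hermitian matrix (the size and seed below are illustrative; a generic random Hermitian matrix has distinct eigen values, so the eigen vectors come out mutually orthogonal):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
A = M + M.conj().T                       # A† = A by construction

w, v = np.linalg.eig(A)                  # generic eigen solver (no Hermiticity assumed)
print(np.allclose(w.imag, 0, atol=1e-8))            # eigen values are real
# Columns of v: normalized eigen vectors; distinct eigen values => orthogonal set
print(np.allclose(v.conj().T @ v, np.eye(4), atol=1e-6))
```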

The above theorem-2 will also apply to unitary operators which are also very important
in QM. The proof will follow the same logic together with the fact that U † and U have
common eigen states but with eigen values that are complex conjugate of each other. This
is easy to comprehend as a unitary operator U can always be written as U = exp(iλA)
with A as a Hermitian operator and λ as a real number. Can you prove the last sentence?

10.2 Commuting Operators:


Two operators, A & B, that commute with each other, i.e. AB = BA, play an important
role in QM for two reasons:
1. They both can be simultaneously diagonalized, i.e. one can find eigen-states com-
mon to both and thus they will be diagonal in the bases consisting of these common
eigen-states and,
2. According to the Heisenberg equation of motion, if A commutes with the Hamiltonian H
then the expectation value of A, over any state, will be time independent; thus A
represents a time-invariant observable, i.e. a constant of motion or a conserved quantity.
Actually, the operators that commute with H represent certain symmetry of H and this
leads to degeneracy which can be systematically handled by finding common eigen states
of H and the operators that commute with it. For instance, invariance of H under space
inversion, i.e. x → −x (or r → −r in 3D), implies that H commutes with the parity operator
Π. In this case one can always find eigen-state wave functions of H that are also
eigen states of Π. The eigen states of Π are functions that are either even or odd under
inversion. In one dimension, usually there is no degeneracy for bound states and thus we
get the eigen states that are either symmetric or anti-symmetric. Another example that
we’ll extensively discuss is the periodic potential where H commutes with translation-by-a
operator, i.e. Ta , with a being the periodicity. Thus we can find the eigen states of H
that are also eigen states of Ta so that these states only change phase when one changes
x by a. So let’s study these commuting operators a bit more systematically.

Theorem-1: If A and B are two commuting operators and if |ψλ⟩ is an eigen state
of A with eigen value λ, then the state B|ψλ⟩ is also an eigen state of A with the same
eigen value.
Proof: The proof is almost trivial but the implications are very important. We have
AB = BA and A|ψλ⟩ = λ|ψλ⟩. Thus we get A(B|ψλ⟩) = BA|ψλ⟩ = λ(B|ψλ⟩), which implies
that B|ψλ⟩ is an eigen state of A with the same eigen value λ. Also, please note that the
proof does not require A & B to be Hermitian.
This leads to two possible cases:
1. λ is a non-degenerate eigen value of A, i.e. there is only one state of A with eigen
value λ. In this case B|ψλ ⟩ and |ψλ ⟩ can differ at most by a multiplicative constant,
i.e. B|ψλ ⟩ ∝ |ψλ ⟩ or B|ψλ ⟩ = µ|ψλ ⟩, with µ as a complex number. This implies that
|ψλ ⟩ is also an eigen state of B (with certain eigen value µ).
2. λ is a degenerate eigen value of A, i.e. there are many eigen states, say |ψλi⟩, of A
with the same eigen value λ. In this case we can only assert that B|ψλi⟩ will be a
linear combination of all such |ψλi⟩ states; it cannot be an eigen state of A with a
different eigen value. In fact B|ψλi⟩ will be orthogonal to all eigen states of A
having different eigen values, i.e. ⟨ψµi|B|ψλi⟩ = 0 with |ψµi⟩ an eigen state of A
with eigen value µ (≠ λ).

Theorem-2: If A & B commute, one can construct an orthonormal basis-set of the state
space with the basis vectors being eigen vectors of both A & B. Conversely, if there exists
a basis of eigen vectors common to A and B, then A and B commute.

We shall not prove this theorem but illustrate it with an example. The proof of
the converse is easy, though: both A and B are diagonal operators in the referred common
basis-set of eigen vectors, and diagonal operators always commute. We can easily
construct a simple illustration using A and B in the basis |u1⟩, |u2⟩ & |u3⟩ as follows,
\[ A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad |u_1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \; |u_2\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \; |u_3\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \]

Clearly, AB = BA and both have degeneracies. A has two independent eigen states |u1 ⟩
& |u2 ⟩ with eigen value 1. Thus a general linear combination of the two, i.e. c1 |u1 ⟩+c2 |u2 ⟩
is also an eigen state of A with eigen value 1. The third eigen state of A is |u3 ⟩ with eigen
value 2. For B, b1 |u1 ⟩ + b3 |u3 ⟩ gives the doubly degenerate eigen state with eigen value
1 and |u2 ⟩ is the third eigen state with eigen value 2. As given we already have chosen a
basis encompassing the eigen states of both A & B.
Now one can refer to the three eigen states uniquely by labeling them as (a, b) with a
and b as eigen values wrt A & B. This gives |u1 ⟩ → (1, 1), |u2 ⟩ → (1, 2) and |u3 ⟩ → (2, 1).
Thus B helps in lifting the degeneracy of A completely, or vice-versa, with a systematic
labeling and mutually orthogonal states. The orthogonality definitely holds in case both A
and B are Hermitian or unitary. In general one may need more than two commuting operators to
lift the degeneracy completely.
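The 3×3 illustration above can be sketched directly:

```python
import numpy as np

A = np.diag([1.0, 1.0, 2.0])
B = np.diag([1.0, 2.0, 1.0])
print(np.allclose(A @ B, B @ A))   # True: [A, B] = 0

# Each standard basis vector is a common eigen vector; print its (a, b) label
for i, u in enumerate(np.eye(3), start=1):
    print(f"|u{i}> -> ({int(u @ A @ u)}, {int(u @ B @ u)})")
# All three labels come out distinct, so the pair (a, b) lifts the degeneracy
```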
This was an illustration using Dirac notation and simple 3 dimensional Hilbert space.
We can think of more realistic and physical examples (listed below) in QM where one
takes help of other operators, describing certain symmetry, to lift all the degeneracies and
to systematically label the eigen states.

1. Here we use p for the momentum operator and p for a number representing an eigen
value of p. The free particle, with Hamiltonian H = p²/2m, gives eigen states of H
with eigen value E as ψE(x) = c1 exp(i√(2mE) x/ℏ) + c2 exp(−i√(2mE) x/ℏ). Thus
there is a two-fold degeneracy. Two ways to lift this degeneracy are:
1) H commutes with p, 2) H commutes with Π (parity). In the former case we
get eigen states common to H and p as exp(ipx/ℏ), with eigen values p and
p²/2m wrt the two operators. In the latter case, we can choose the eigen states
common to H and Π as sin(√(2mE) x/ℏ) and cos(√(2mE) x/ℏ), with eigen values
(E, −1) and (E, +1) wrt the two operators. We note that p and Π do not
commute, so we cannot find common eigen-states for these two. More explicitly,
sin(√(2mE) x/ℏ) and cos(√(2mE) x/ℏ) are not eigen states of p, and exp(ipx/ℏ) is
not an eigen state of Π.

2. Another example that we are familiar with is the hydrogen atom. In this case H
commutes with L², i.e. the squared total angular momentum, and Lz, i.e. the z-component of
angular momentum. Thus one can lift all the degeneracies of H (excluding spin)
using L² and Lz and label the states uniquely using the eigen values corresponding to
the three operators, i.e. (n, l, m). Here, the eigen value wrt H is −E0/n², wrt L² it is
l(l + 1)ℏ², and wrt Lz it is mℏ. In fact L² and Lz commute with H due to some
underlying symmetries of H.

The set of operators that commute with each other and lift the degeneracy completely
is called a complete set of commuting operators (CSCO). This is important in QM as
it gives a systematic way to keep track of all the eigen states; the total number of states
at a given energy controls many observable phenomena. We shall see another problem where
these ideas come in handy, i.e. the periodic potential, which is important for understanding
the physics of electrons in solids.

11 Periodic potentials:
In a given solid the electrons experience a periodic potential due to atomic cores arranged
in a periodic fashion. Different solids have different crystal structure and with different
atoms. This leads to a variety of properties. To understand these properties we need to
first obtain the energies and wave-functions of the states that the electrons will occupy. It
turns out that these energies consist of continuous bands separated by gaps in between.
Finding this band structure of a given solid is an important problem in solid state physics
as this is the first step towards understanding various properties of a given solid.
We shall start with a general formalism on how to find the possible energy states that
electrons will assume in periodic potentials. The wave-function of these states takes the
form of a periodically modulated plane wave as dictated by Bloch’s theorem and thus
these states are also called Bloch states. Further we shall discuss the periodic boundary
conditions as any given solid has large number of atoms when we look at macroscpic scale
but it’s not mathematically infinite and thus not really periodic in mathematical sense.
This also leads to only certain Bloch states being allowed.

11.1 Bloch’s theorem


Consider a periodic potential V (x) in one dimension with periodicity a so that V (x + a) =
V (x). Thus the time independent Schrödinger equation is
\[ -\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x) = E\psi(x). \tag{81} \]
If we change x to x + a in this equation, we get
\[ -\frac{\hbar^2}{2m}\frac{d^2\psi(x+a)}{dx^2} + V(x+a)\psi(x+a) = E\psi(x+a), \]
i.e.
\[ -\frac{\hbar^2}{2m}\frac{d^2\psi(x+a)}{dx^2} + V(x)\psi(x+a) = E\psi(x+a). \]
The last equation implies that ψ(x + a) is also an eigen state of the Hamiltonian H =
−(ℏ²/2m) d²/dx² + V(x), given ψ(x) as an eigen state of H, and that too with the same eigen
energy E. We define an operator Ta, i.e. translation by a, so that Ta ψ(x) = ψ(x + a).
The LHS of the two equations that immediately followed Eq. 81 can be re-written symbolically
as Ta Hψ(x) & HTa ψ(x), which are equal since H is invariant under translation by a.
This means H commutes with Ta . This implies from previous chapter that we can find
common eigen states of H and Ta both. The eigen states of Ta have specific structure as
we see further.
We can prove that any arbitrary matrix element of Ta , i.e. ⟨ϕ1 |Ta |ϕ2 ⟩, satisfies
⟨ϕ1 |Ta |ϕ2 ⟩∗ = ⟨ϕ2 |Ta† |ϕ1 ⟩ = ⟨ϕ2 |T−a |ϕ1 ⟩. The proof consists of writing the matrix ele-
ment explicitly in integral form and making a change of variable in the integral from x
to x − a. This implies Ta† = T−a and thus Ta† Ta = 1, i.e. Ta is a unitary operator. One
should think of this with Ta† Ta operating on ψ(x) and giving nothing but ψ(x).
It turns out that any eigen value c of a unitary operator is unimodular, i.e. |c|² = 1.
The proof starts with the eigen value equation, i.e. Ta|ϕ⟩ = c|ϕ⟩. Now we can take the
norm (or scalar product with itself) of both the left and right hand sides to get
⟨ϕ|Ta†Ta|ϕ⟩ = ⟨ϕ|c*c|ϕ⟩. Using the unitary property of Ta we get ⟨ϕ|ϕ⟩ = |c|²⟨ϕ|ϕ⟩, which implies |c|² = 1
as we are analyzing a non-trivial eigen state |ϕ⟩. The unimodular nature of c implies
that it has a general form c = e^{iα} with α a real number. It is useful to find alternative
proofs of the statements used here as that leads to more insights.
All this means is that we can always find eigen states of H that are also eigen states
of Ta, i.e. they satisfy Ta ψ(x) = cψ(x) = e^{iα}ψ(x), or
\[ \psi_\alpha(x + a) = e^{i\alpha}\,\psi_\alpha(x). \tag{82} \]

I have used a subscript α with ψ as it refers to a wave-function having eigen value exp(iα)
wrt operator Ta . Eq.82 is essentially the statement of the Bloch’s theorem but it is usually
stated after imposing the periodic boundary condition which makes the values of α more
explicit, as discussed next. We should also keep in mind that both H and Ta may have
degeneracies, and at this stage it is not ruled out that the combination of H and Ta still leaves
some unresolved degeneracies. Use this as food for further thought.

11.2 Periodic boundary condition:


We usually look at macroscopic solids which consist of a large number of lattice sites;
however large this number is, it is not mathematically infinite, and such finite-size crystals
are not mathematically periodic. You must think over this and convince yourself that this is the
case. On the other hand, if we look at the physics in the bulk of the crystal and not at
the surface or too close to the surface we expect that in the bulk (far enough from the
surface) the electrons should behave like they are in an infinite crystal. Now keeping the
surface effects aside, can we find a way to make the finite size crystal mathematically
periodic? This will help us in terms of using the above Bloch’s theorem which is valid for
mathematically periodic potentials. We must not forget that the outcome of this theory
should not be used in looking at the surface effects and we have to find other ways to
handle the surface effects.
To make these finite crystals mathematically periodic one uses “periodic boundary
condition”. Restricting our discussion to 1D, this states that if a given crystal has N
lattice sites, we can extend it beyond this range, and infinitely, by using the prescription
V (x + N a) = V (x) and correspondingly

ψ(x + N a) = ψ(x). (83)

This can be thought of in 1D as wrapping the lattice into a circle so that (N + 1)th site
coincides with 1st site and everything, including potential and wave-function, repeats when
one completes a full circle. Now this makes the crystal of our interest mathematically
periodic.
Applying Eq. 82 N times, we get ψ(x + Na) = exp(iNα)ψ(x), and then
using Eq. 83 we get exp(iNα) = 1, which gives Nα = 2nπ or α = 2nπ/N. Here, n
ranges from 0 to N − 1; beyond this range exp(i2nπ/N) maps back to one of the values for
n within 0 to N − 1. For instance, if I take n = N − 1 + n1 (i.e. ≥ N),
I get exp[i(2π + 2(n1 − 1)π/N)], i.e. a repeat of n − N (= n1 − 1) as e^{2πi} is trivially one.
This amounts to saying that if exp(iα) is an eigen value of Ta then exp(iα + 2πi) is also
an eigen value, and for the same eigen state.

In the end what matters is that we keep n values spanning a range of N different
values. So we might as well use n ∈ [−N/2, N/2 − 1] or n ∈ [−N/2 + 1, N/2], and for large N
the boundary N/2 − 1 or −N/2 + 1 can be taken as ±N/2. In solid state physics, for a given lattice period
a, one rather uses k = α/a = 2πn/(Na) in place of α. Here k takes N different values separated
by Δk = 2π/(Na) = 2π/L with L as the crystal length. Thus k covers a range k ∈ [−π/a, π/a − 2π/(Na)], which for
large N can be taken as k ∈ [−π/a, π/a]. With k taking the place of α we can restate Eq. 82 as

ψk (x + a) = eika ψk (x). (84)

This is precisely how the Bloch theorem is stated in most textbooks. We can see that the
Bloch wave-function ψk(x) is not periodic except for the special value k = 0. It turns out
that the Bloch wave function actually represents a periodically modulated plane wave,
i.e.

ψk(x) = e^{ikx} uk(x), (85)

with uk(x) being periodic with the lattice period, i.e. uk(x + a) = uk(x). Think about
what a periodically modulated plane wave is and convince yourself how Eq.85 represents
one. Eq.85 and Eq.84 are equivalent and both are interchangeably used as statements of
Bloch's theorem. It is left as an exercise to prove that Eq.85 indeed satisfies Eq.84.
You can also prove the converse if you feel up to it [Hint: Is e^{−ikx}ψk(x) periodic?].
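To see the equivalence of Eq.84 and Eq.85 concretely, here is a small numerical sketch; the periodic function uk(x) chosen below is arbitrary (my own made-up example), and any lattice-periodic function would do:

```python
import numpy as np

a = 1.0          # lattice period
k = 0.7          # any Bloch wave-vector
x = np.linspace(0, 5 * a, 1001)

def u(x):
    # any function periodic with the lattice period a
    return 1.0 + 0.3 * np.cos(2 * np.pi * x / a) + 0.1j * np.sin(4 * np.pi * x / a)

psi = np.exp(1j * k * x) * u(x)                   # Eq.85: modulated plane wave
psi_shift = np.exp(1j * k * (x + a)) * u(x + a)   # psi_k(x + a)

# Eq.84: psi_k(x + a) = e^{ika} psi_k(x)
assert np.allclose(psi_shift, np.exp(1j * k * a) * psi)
print("Eq.85 implies Eq.84: check passed")
```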

11.3 Kronig-Penney model

Figure 32: Left: Potential in KP model as an array of finite depth wells with period a.
Right: The KP potential in the δ-function limit.

The band-structure calculations for real systems (even in 1D) are rather cumbersome
and one needs to make many approximations. More than one full-semester course would be
required to learn the state-of-the-art techniques for band structure calculations. However,
to get a flavor I discuss a simple 1D model, called the Kronig-Penney (KP) model, which is
also exactly solvable in the δ-function limit. In fact, I'll only solve the δ-function limit of
this model. The potential for the KP model is shown in Fig.32. The left one shows it as
a periodic array (period a) of finite potential wells separated by barriers of height V0 and

width b while the right one shows it in the δ-function limit. In the latter case V0 → ∞
and b → 0 such that bV0 = γ stays constant. We can write, in the latter case,

V(x) = γ ∑_n δ(x − na). (86)

We shall be using periodic boundary conditions with a large N and Bloch's theorem,
i.e. Eq.84. We do not know the eigen energy values (0 < E < ∞) and we also have to
find the corresponding wave-functions. Since the potential vanishes except at x = na,
the TISE away from these points is −(ℏ²/2m) d²ψk(x)/dx² = Eψk(x). Thus we can write
the solution,

for 0 < x < a, ψk(x) = A exp(iλx) + B exp(−iλx) (87)

Here A, B and λ² = 2mE/ℏ² will be dependent on k. Using Eq.84 we can write ψk(x) =
exp(−ika)ψk(x + a), which can be used to assert,

for −a < x < 0, ψk(x) = exp(−ika)[A e^{iλ(x+a)} + B e^{−iλ(x+a)}] (88)
Now this ψk(x) has to satisfy two boundary conditions at x = 0, namely,

1. Continuity, i.e.
ψk(x)|x=0+ = ψk(x)|x=0− (89)

2. Since there is a δ-function of strength γ sitting at x = 0, the first derivative of ψk(x),
i.e. ψk′(x), at this point will have a discontinuity as discussed earlier in the course.
This will be given by
ψk′|x=0+ − ψk′|x=0− = (2mγ/ℏ²)ψk(0) (90)

Our goal here is to find A, B and E as a function of k. We use Eq.87 & 88 in Eq.89 to
get

(A + B) = e^{−ika}(A e^{iλa} + B e^{−iλa})
or B[1 − e^{−i(k+λ)a}] = A[e^{−i(k−λ)a} − 1] (91)

and we use Eq.87 & 88 in Eq.90 to get

iλ(A − B) − e^{−ika} iλ(A e^{iλa} − B e^{−iλa}) = (2mγ/ℏ²)(A + B)
or B[−iλ + iλ e^{−i(k+λ)a} − 2mγ/ℏ²] = A[−iλ + iλ e^{−i(k−λ)a} + 2mγ/ℏ²] (92)
Now to eliminate A & B, we cross multiply Eq.91 & Eq.92 to get

iλ − iλ e^{−i(k+λ)a} + 2mγ/ℏ² − iλ e^{−i(k−λ)a} + iλ e^{−2ika} − (2mγ/ℏ²) e^{−i(k−λ)a}
= −iλ + iλ e^{−i(k−λ)a} + 2mγ/ℏ² + iλ e^{−i(k+λ)a} − iλ e^{−2ika} − (2mγ/ℏ²) e^{−i(k+λ)a}

or 2iλ − 2iλ e^{−ika}[e^{iλa} + e^{−iλa}] + 2iλ e^{−2ika} − (2mγ/ℏ²) e^{−ika}[e^{iλa} − e^{−iλa}] = 0

or 2iλ − 2iλ e^{−ika} · 2 cos(λa) + 2iλ e^{−2ika} − (2mγ/ℏ²) e^{−ika} · 2i sin(λa) = 0
Dividing by 2iλ e^{−ika} and introducing P = mγa/ℏ², we get

e^{ika} + e^{−ika} − 2 cos(λa) − (2P/λa) sin(λa) = 0
or 2 cos(ka) − 2 cos(λa) − (2P/λa) sin(λa) = 0
or cos(ka) = (P/λa) sin(λa) + cos(λa) (93)
This is the transcendental equation to find the valid λ for a given k value. Given this λ(k)
one can also deduce B in terms of A and k by using Eq.91 (or Eq.92). As usual, A will be
dictated by the normalization condition. In order to find λ for a given k one has to solve
Eq.93 numerically.
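As a sketch of such a numerical solution, the following Python snippet finds the lowest-band λ at ka = 0 by simple bisection; the bracket and the units (energy quoted in units of ℏ²/ma²) are my choices for illustration:

```python
import math

P = 5.0   # KP strength parameter, P = m*gamma*a/hbar^2
a = 1.0

def kp_rhs(la):
    """RHS of Eq.93 as a function of la = lambda*a."""
    return P * math.sin(la) / la + math.cos(la)

def solve_lambda_a(ka, lo, hi, tol=1e-12):
    """Bisection for kp_rhs(la) = cos(ka) on a bracket [lo, hi]."""
    f = lambda la: kp_rhs(la) - math.cos(ka)
    assert f(lo) * f(hi) < 0, "bracket must straddle the root"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# lowest-band solution at ka = 0 (bottom of band 1); the bracket comes from
# inspecting Fig.33: the curve crosses +1 somewhere between 0 and pi
la0 = solve_lambda_a(ka=0.0, lo=0.1, hi=math.pi)
E0 = la0**2 / 2          # energy in units of hbar^2/(m a^2)
print(f"lambda*a at band-1 bottom: {la0:.4f}, E = {E0:.4f}")

# consistency: the solution must satisfy Eq.93 with cos(ka) = 1
assert abs(kp_rhs(la0) - 1.0) < 1e-9
```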

11.3.1 Energy Bands in KP Model
We review a graphical method to solve the above transcendental equation to find the
eigen-energy or λ values. Fig.33 shows the plot of (P/λa) sin(λa) + cos(λa), i.e. the RHS
of Eq.93 for P = 5, as a function of λa in units of π. Please look at the plot and the
expression being plotted to convince yourself about some limiting points such as
λa = 0, π/2, π, 2π, etc. Also, can you find the lowest λa for which this expression vanishes?

Figure 33: Plot of the RHS of Eq.93 for P = 5 as a function of λa. The red and blue
horizontal lines represent +1 and -1, respectively, corresponding to ka = 0 & π.

We know that k will take N different discrete values in the interval (−π/a, π/a] separated
by ∆k = 2π/(N a) and thus cos(ka) will take values between −1 and +1. For instance,
for k = 0, cos(ka) = 1 which is shown as the red horizontal line in Fig.33. This line
intersects the KP-curve at many points. Each intersection point can be projected on the
horizontal axis as shown by red dots (for cos ka = +1) on horizontal axis in Fig.33. The
λ values corresponding to these dots give energy E = ℏ2 λ2 /2m. Thus for each given k
we see that there are many (discrete) energies possible. Similarly for k = π/a (or −π/a),
cos(ka) = −1 and the line corresponding to this is shown as blue horizontal line, which
again intersects KP-curve at many points that can be projected as blue dots.
For any given k the horizontal line corresponding to cos(ka) will lie within these two
red and blue lines. There are some ranges of λ for which the KP-curve will not intersect
horizontal cos(ka) line corresponding to any k. These values (between two consecutive
red or blue dots) of λ are forbidden as depicted in Fig.34 implying that certain range
(or bands) of energies are forbidden. The allowed bands of energies or λ (i.e. between
a red dot and the immediate blue dot or vice-versa) are separated by band-gaps. Also,
a horizontal line corresponding to cos(ka) for a given k intersects the KP-curve at many

points with one point in each allowed band. This implies that there are many (discrete)
energies possible for a given k with each energy lying in a different band.
We also see that k takes rather continuous values as ∆k = 2π/N a is very small for
large N . Thus we get continuous energy bands separated by clear gaps. Please also
note that a general intersection point in Fig.33 will correspond to two different k values
that differ in sign as cos ka is an even function of k. This, in fact, implies that there
is a degeneracy of two for each energy and the Bloch wave-vector k helps in lifting this
degeneracy. This aspect is further discussed later on. If we numerically calculate these
energy bands by finding λ values for each k we get E as a function of k for different bands
as shown in the left panel of Fig.35 for P = 5. The black lines in this plot are the bands
that are separated by energy gaps from the neighboring ones.
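The graphical construction above can also be automated. The following sketch scans λa and marks where the RHS of Eq.93 lies in [−1, 1] (so that some cos(ka) can match it), recovering the allowed bands and the gaps for P = 5; the scan range and step are arbitrary illustrative choices:

```python
import math

P = 5.0

def kp_rhs(la):
    """RHS of Eq.93 as a function of la = lambda*a."""
    return P * math.sin(la) / la + math.cos(la)

# scan lambda*a and mark where |RHS| <= 1, i.e. where some cos(ka) matches
las = [1e-6 + i * 1e-4 for i in range(int(4 * math.pi / 1e-4))]
allowed = [abs(kp_rhs(la)) <= 1.0 for la in las]

# collect the allowed bands as contiguous runs of allowed lambda*a
bands = []
start = None
for la, ok in zip(las, allowed):
    if ok and start is None:
        start = la
    elif not ok and start is not None:
        bands.append((start, la))
        start = None
if start is not None:
    bands.append((start, las[-1]))

for i, (lo, hi) in enumerate(bands, 1):
    print(f"band {i}: lambda*a in [{lo/math.pi:.3f}, {hi/math.pi:.3f}] * pi")

# each allowed band tops out at lambda*a = n*pi (the particle-in-a-box points)
for n, (lo, hi) in enumerate(bands, 1):
    assert abs(hi - n * math.pi) < 1e-2
```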

Figure 34: Plot of the KP equation with the grey regions showing forbidden bands of λa
responsible for the band gaps in energy.

It is insightful to analyze how this band structure will evolve when one varies P from
zero (i.e. no δ-functions) to infinity. For P → ∞, the potential will be equivalent to an
array of infinite wells while for P = 0 it becomes a free particle. Thus for P → 0 we should
get free particle like energies and for P → ∞ it should give particle-in-a-box energies.
The plot in the right panel of Fig.35 shows the evolution of the RHS of Eq.93 when P
increases from 5 to 40. From this we can see that the bands are expected to get narrow.
In fact the bands in P → ∞ limit collapse on the red lines of the left plot in Fig.35 which
are precisely the energies corresponding to the particle-in-a-box. On the other hand when
P → 0 we see that the energy-gaps will approach zero and we get an E(k) which is
equivalent to the free particle result, i.e. E = ℏ2 k 2 /2m, except that different portions
of the free particle E(k) have been shifted and brought into the k-region from −π/a to
π/a. In this model P is a measure of interaction between the neighboring potential wells

with large P representing small interaction. We see that for large interaction (small P ) we
get broad bands and for small interaction (large P ) we get narrow bands. This is a very
general point which gets captured in this simple and exactly solvable model.

Figure 35: Left: Energy band diagram E(k) for the KP model for P = 5. The red lines mark
the energies corresponding to the infinite well potential. Right: plot of the KP equation for
different P values depicting the narrowing of the allowed λ ranges with increasing P .

More rigorously, if we solve Eq.93 in the P → ∞ limit we get λa = nπ and thus
E = n²π²ℏ²/(2ma²), i.e. the particle-in-a-box energy. However, each energy state here
has a degeneracy of N as E is independent of k. In the P = 0 limit Eq.93 gives
λa = ka + 2nπ and thus E = (ℏ²/2m)(k + 2nπ/a)². Please convince yourself that the
latter corresponds to the free particle result.
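These two limits can be checked numerically against Eq.93. In the sketch below, the values P = 500 and P = 10⁻⁴ stand in for the P → ∞ and P → 0 limits, and the tolerances are ad hoc choices:

```python
import math

def kp_rhs(la, P):
    """RHS of Eq.93 as a function of la = lambda*a, for strength P."""
    return P * math.sin(la) / la + math.cos(la)

def solve(ka, P, lo, hi, tol=1e-12):
    """Bisection for kp_rhs(la, P) = cos(ka) on a bracket [lo, hi]."""
    f = lambda la: kp_rhs(la, P) - math.cos(ka)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# P -> infinity: band 1 collapses onto lambda*a = pi (particle in a box)
la_bottom = solve(ka=0.0, P=500.0, lo=0.1, hi=math.pi - 1e-9)
assert abs(la_bottom - math.pi) < 0.02
print(f"P=500: band-1 bottom at lambda*a = {la_bottom:.4f} (pi = {math.pi:.4f})")

# P -> 0: Eq.93 reduces to cos(ka) = cos(lambda*a), i.e. lambda*a = ka + 2n*pi,
# the free-particle result; check for a tiny P at ka = pi/2
ka = 0.5 * math.pi
la_small_P = solve(ka=ka, P=1e-4, lo=0.1, hi=math.pi - 1e-9)
assert abs(la_small_P - ka) < 1e-3
print(f"P~0: lambda*a = {la_small_P:.6f} vs ka = {ka:.6f}")
```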
Another point that I want to make is regarding degeneracy and Bloch's theorem. We
see that in Eq.93 the cos(ka) term is independent of the sign of k, and thus the λ as well
as the E values are independent of the sign of k. This implies a degeneracy. This
degeneracy can be traced to the symmetric nature of the KP potential, i.e. its Hamiltonian
commutes with Π. In detail, we see that the ψk and ψ−k wave functions are orthogonal as
they are eigen states of Ta with different eigen values, i.e. e^{+ika} & e^{−ika}. We also
see that c1ψk + c2ψ−k is an eigen state of H but not of Ta for non-zero c1,2. Please convince
yourself about this. Thus c1ψk + c2ψ−k does not satisfy Bloch's theorem for non-zero
c1,2. In fact, the symmetry of H permits us to choose eigen states of H that are
also eigen states of Π, i.e. if we choose c1 = ±c2. However, such wave-functions will not
obey Bloch's theorem. Finally, we should understand that Bloch's theorem, and the
choice of Bloch states as eigen states of a periodic potential, is more of a convention which
becomes a rule in case of no degeneracy. In the end, Bloch states can always be chosen
as eigen states of H for a periodic potential.

11.3.2 Wave-functions in KP Model:
One can also plot some characteristic wave-functions of the KP model to internalize the
modulated plane-wave character and other aspects. This requires knowing B as a function
of k and A, while A, which determines the overall scale of ψ, can be left out as a normalization
constant. In fact, a general state, its energy and its wave-function depend on two labels: k
and the band index. Thus we symbolically write En(k) for the k-dependent energy of different
bands and ψnk for the band-index (n) and k-dependent wave-function. Some of the
wave-functions for the KP model are plotted in Fig.36.

Figure 36: Plots of Re[ψ], Im[ψ] (red and blue) and |ψ| for the bands and ka as
marked. The last one, with a much longer x-range, is for E just above the bottom of the
first band with ka ≃ 0.012π.

We can list a few general observations about these wave functions as follows:

1. The odd band (1st, 3rd, 5th,..) minimum energy wave functions have k = 0 and
max energy wave functions have ka = π.

2. The even band (2nd, 4th, 6th,..) minimum energy wave functions have ka = π and
max energy wave functions have k = 0.

3. For k = 0 the wave functions are periodic with a as ψk=0 (x + a) = ψk=0 (x).

4. For ka = π the wave functions change sign in every consecutive period, i.e.
ψka=π(x + a) = −ψka=π(x).

5. With increasing energy the average number of nodes per unit length increases. For
lowest energy there are no nodes.

6. The highest energy states of the odd bands correspond to ka = π and those of the
even bands correspond to k = 0. In fact, if we look at the K-P plot in Fig.33, the
maximum energy point of every band corresponds to λa = nπ with n ̸= 0. These
highest energy states have the same energy as that of a particle in an infinite well.
These wave functions can also be mapped to infinite well wave-functions as the wave-
functions vanish at the δ-function locations, i.e. x = na, leading to no discontinuity
in the derivative.

7. Finally, the last plot of Fig.36, drawn over a large x-range, shows the modulated
plane wave nature of the wave function: a plane wave carrying a period-a modulation.
The plane-wave envelope has a large wavelength given by 2π/k >> a,
see Eq.85. The real and imaginary parts of the wave-function evolve spatially in
quadrature with |ψk(x + a)| = |ψk(x)|.
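Observation 7 (and the continuity built into Eq.91) can be verified with a short numerical sketch. Here B is obtained from Eq.91 with A set to 1 (normalization left out, as in the text), and the parameter values are arbitrary choices:

```python
import cmath, math

P, a = 5.0, 1.0
ka = 0.3 * math.pi
cos_ka = math.cos(ka)

# solve Eq.93 for the band-1 lambda*a by bisection
f = lambda la: P * math.sin(la) / la + math.cos(la) - cos_ka
lo, hi = 0.1, math.pi
while hi - lo > 1e-13:
    mid = 0.5 * (lo + hi)
    lo, hi = (lo, mid) if f(lo) * f(mid) <= 0 else (mid, hi)
la = 0.5 * (lo + hi)
lam, k = la / a, ka / a

# B in terms of A from Eq.91 (A set to 1, normalization left out)
A = 1.0
B = A * (cmath.exp(-1j * (k - lam) * a) - 1) / (1 - cmath.exp(-1j * (k + lam) * a))

def psi(x):
    """Bloch wave-function: cell solution (Eq.87) extended by Eq.84."""
    n = math.floor(x / a)                 # which unit cell
    x0 = x - n * a                        # position inside the cell
    cell = A * cmath.exp(1j * lam * x0) + B * cmath.exp(-1j * lam * x0)
    return cmath.exp(1j * k * n * a) * cell

# continuity at the cell boundary (guaranteed by Eq.91)
assert abs(psi(a - 1e-9) - psi(a + 1e-9)) < 1e-6
# modulated-plane-wave property: |psi(x + a)| = |psi(x)|
for x in (0.13, 0.4, 0.77):
    assert abs(abs(psi(x + a)) - abs(psi(x))) < 1e-9
print("continuity and |psi(x+a)| = |psi(x)| verified")
```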

11.3.3 Negative δ-function KP Model


One can also look at the KP model for a lattice of negative δ-functions, i.e. with the overall
potential as V(x) = −γ ∑_n δ(x − na). In this case, since the bound state energies will
be negative, we redefine λ = √(−2mE)/ℏ while P remains as P = mγa/ℏ² > 0. The
mathematical steps are identical and one gets a slight modification of Eq.93
as the sinusoidal functions of λa change to the hyperbolic sinusoidal functions. This
equation works out to be cos(ka) = −(P/λa) sinh(λa) + cosh(λa). In the end we get only
one band of energy. Physically this results from the broadening of the single negative
δ-function bound state. Again it is interesting to analyze the large and small P limits.
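That only a single band survives can be confirmed numerically from the hyperbolic version of the KP equation; a sketch, with the scan range and the P value as illustrative choices:

```python
import math

P = 5.0

def kp_rhs_neg(la):
    """RHS of the negative-delta KP equation: cosh - (P/la)*sinh."""
    return math.cosh(la) - P * math.sinh(la) / la

las = [1e-6 + i * 1e-4 for i in range(100000)]   # lambda*a up to ~10
allowed = [abs(kp_rhs_neg(la)) <= 1.0 for la in las]

# count contiguous allowed runs: there should be exactly one band
runs = 0
prev = False
for ok in allowed:
    if ok and not prev:
        runs += 1
    prev = ok
print("number of allowed (bound) bands:", runs)
assert runs == 1
```

Note that the allowed interval sits near λa ≈ P, which corresponds to the single-well bound state energy E = −mγ²/2ℏ² that gets broadened into the band.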

11.4 Electrons in solids


The above discussion of the KP model gives us a flavor of how electrons behave in a solid.
Such E(k) diagrams are referred to as band-structure diagrams; for real solids they result
from rather complex calculations incorporating the actual atomic orbitals. We ignore
electron-electron interaction here, which, by the way, is not a bad approximation in most
cases. Given the number of electrons, arising from the valence electrons of the atoms, we
fill these electrons into the bands respecting the Pauli exclusion principle, so that we put
two electrons (spin up and down) per k state in each band. We'll end up filling certain
bands fully and in several cases we'll have partially filled bands. In the former case the
solid behaves as an insulator (or semiconductor) as one needs a minimum energy, equal
to the band gap, to excite an electron. In the latter case, i.e. partially filled bands, we
get a metal.

