If you like the FAQ and/or found it useful, please link to it from
your home page to make it more widely known.
If you spot errors or have suggestions for improvements,
please write me (at Arnold.Neumaier@univie.ac.at).
If you have questions, please post them to the moderated newsgroup
sci.physics.research (http://www.lns.cornell.edu/spr)!
If you found this FAQ useful you are likely to benefit also from
reading our book
Arnold Neumaier and Dennis Westra,
Classical and Quantum Mechanics via Lie algebras,
http://www.mat.univie.ac.at/~neum/papers/physpapers.html#QML
http://de.arxiv.org/abs/0810.1019
Of course, the FAQ refers only to a tiny part of theoretical physics,
namely to what I happened to discuss on sci.physics.research.
The answers are only as good as my understanding of the matter.
This doesn't mean that they are poor but probably that they are
not perfect. Many topics are discussed quite in detail, but this is
not a book, so don't expect completeness or comprehensiveness in any
sense.
On topics where the physics community has not yet reached a consensus,
my point of view is of course only one of the possibilities, and not
always the mainstream view, although I tend to discuss that view, too.
In any case, I try to be accurate, consistent, and intelligible.
Happy Reading!
Arnold Neumaier
University of Vienna
http://www.mat.univie.ac.at/~neum/
I like to see people grow
-----------------
Table of Contents
-----------------
The 21 topics in the initial version, posted there on April 28, 2004,
have grown to 88 by January 1, 2005, to 116 by January 4, 2006,
to 128 by January 3, 2007, to 140 by January 3, 2008, to 147 by
January 30, 2009, and are likely to grow further.
(A * indicates addition of a new topic, or large modification of
an old one, since January 30, 2009. Minor changes or additions to
old topics are not indicated.)
The various topics can usually be read independently of each other;
they are arranged into groups of loosely related topics.
To read a particular entry, grep for its label, e.g., S2e.
The labels may change with time as answers to further questions
will be added and old answers regrouped. So, to quote part of the FAQ,
refer to the title of a section and not only to its label.
Abbreviations:
QM = quantum mechanics, QFT = quantum field theory,
QED = quantum electrodynamics, CCR = canonical commutation relations,
s.p.r. = sci.physics.research (newsgroup).
Strings like quant-ph/0303047 or arXiv:0810.1019 refer to electronic
documents in the e-Print archive at
http://xxx.lanl.gov and mirror sites.
p_0 and \p are the time and space part of a 4-vector p;
the Minkowski inner product is always taken to be p^2=p_0^2-\p^2.
Chapter 1 (20 sections)
S1a. What are bras and kets?
S1b. Projective geometry and quantum mechanics
S1c. What is the meaning of the entries of a density matrix?
S1d. Postulates for the formal core of quantum mechanics
S1e. Open quantum systems
S1f. Interaction with a heat bath
S1g. Quantum-classical mechanics
S1h. Can all quantum states be realized in nature?
S1i. Modes and wave functions of laser beams
S1j. Classical and quantum tunneling
S1k. Quantization in non-Cartesian coordinates
S1l. Second quantization
S1m. When is an object macroscopic?
S1n. The role of the ergodic hypothesis
S1o. Does quantum mechanics apply to single systems?
*S1p. Dissipative dynamics and Lagrangians
*S1q. How can QM be stochastic while the Schroedinger equation is not?
*S1r. Measurement theory for real numbers
*S1s. The classical limit of quantum mechanics
*S1t. The classical limit via coherent states
Chapter 2 (10 sections)
S2a. Lie groups and Lie algebras
S2b. The Galilei group as contraction of the Poincare group
S2c. Representations of the Poincare group, spin and gauge invariance
S2d. Forms of relativistic dynamics
S2e. Is there a multiparticle relativistic quantum mechanics?
S2f. What is a photon?
S2g. Particle positions and the position operator
S2h. Localization and position operators
*S2i. Position operators in relativistic quantum field theory
S2j. Coherent states of light as ensembles
Chapter 3 (6 sections)
S3a. What are 'bare' and 'dressed' particles?
S3b. How meaningful are single Feynman diagrams?
S3c. How real are 'virtual particles'?
S3d. What is the meaning of 'on-shell' and 'off-shell'?
S3e. Virtual particles and Coulomb interaction
S3f. Are virtual particles and decaying particles the same?
Chapter 4 (10 sections)
S4a. How do atoms and molecules look like?
S4b. Why are observable densities state-dependent?
S4c. Are electrons pointlike/structureless?
S4d. How much information is in a particle?
S4e. Entropy and missing information
S4f. How real is the wave function?
S4g. How real are Feynman's paths?
S4h. Can particles go backward in time?
S4i. What about particles faster than light (tachyons)?
S4j. Do free particles exist?
Chapter 5 (9 sections)
S5a. QM pictures and representations
S5b. Inequivalent representations of the CCR/CAR
S5c. Why does QFT look so different from QM?
S5d. Why is QFT based on a classical action?
S5e. Why does the action only contain first derivatives?
S5f. Why normal ordering?
S5g. Why locality and causal commutation relations?
S5h. Creation operators and rigged Hilbert space
S5i. Why Feynman diagrams?
Chapter 6 (8 sections)
S6a. Nonperturbative computations in quantum field theory
S6b. The formal functional integral approach to QFT
S6c. Functional integrals, Wightman functions, and rigorous QFT
S6d. Is there a rigorous interacting QFT in 4 dimensions?
S6e. Constructive field theory
S6f. The classical limit of relativistic QFT
S6g. What are interpolating fields?
S6h. Hilbert space and Hamiltonian in relativistic quantum field theory
*S6i. 2-dimensional quantum field theory
Chapter 7 (3 sections)
S7a. What is the mass gap?
S7b. Why can a bound state of massless quarks be heavy?
S7c. Bound states in relativistic quantum field theory
Chapter 8 (9 sections)
S8a. Why renormalization?
S8b. Renormalization without infinities I
S8c. Renormalization without infinities II
S8d. Renormalization and coarse graining
S8e. Renormalization scale and experimental energy scale
S8f. Dimensional regularization
S8g. Nonrelativistic quantum field theory
S8h. Nonrenormalizable theories as effective theories
S8i. What about infrared divergences?
Chapter 9 (6 sections)
S9a. Summing divergent series
S9b. Is QED consistent?
S9c. What about relativistic QFT at finite times?
S9d. Perturbation theory and instantaneous forces
S9e. QED and relativistic quantum chemistry
S9f. Are protons described by QED?
Chapter 10 (13 sections)
S10a. How are matrices and tensors related?
S10b. Is quantum mechanics compatible with general relativity?
S10c. Difficulties in quantizing gravity
S10d. Renormalization in quantum gravity
S10e. Hadamard states and their Hilbert spaces
S10f. Why do gravitons have spin 2?
S10g. What is the tetrad formalism?
S10h. Energy in general relativity
S10i. What happened to the aether?
S10j. What is time?
S10k. Time in quantum mechanics
S10l. Diffeomorphism invariant classical mechanics
S10m. The concept of ''Now''
Chapter 11 (7 sections)
S11a. A concise formulation of the measurement problem of QM
S11b. The double slit experiment
S11c. The Stern-Gerlach experiment
S11d. The minimal interpretation
S11e. The preferred basis problem
S11f. Master equation and pointer variables
S11g. Does decoherence solve the measurement problem?
Chapter 12 (6 sections)
S12a. Which interpretation of quantum mechanics is most consistent?
S12b. Which textbook of quantum mechanics is best for foundations?
S12c. What is the role of quantum logic?
S12d. Stochastic quantum mechanics
S12e. Is there a relativistic measurement theory?
S12f. Quantum mechanics and dice
Chapter 13 (10 sections)
S13a. Random numbers and other random objects
S13b. What is the meaning of probabilities?
S13c. What about the subjective interpretation of probabilities?
S13d. Are probabilities limits of relative frequencies?
S13e. How meaningful are probabilities of single events?
S13f. Objective probabilities
S13g. How probable are realizations of stochastic processes?
S13h. How do probabilities apply in practice?
S13i. Incomplete knowledge and statistics
S13j. Priors and entropy in probability theory
Chapter 14 (4 sections)
S14a. Theoretical challenges close to experimental data
S14b. Does the standard model predict chemistry?
S14c. Is the result of a measurement a real number?
S14d. Why use complex numbers in physics?
Chapter 15 (5 sections)
S15a. How precise can physical language be?
S15b. Why bother about rigor in physics?
S15c. Justifying the foundations of a theory
S15d. Foundations, theory and experiment
S15e. Theoretical physics as a formal model of reality
Chapter 16 (12 sections)
S16a. On progress in science
S16b. How different are physical sciences and social sciences
S16c. Can good theories be falsified?
S16d. What, then, distinguishes a good theory?
S16e. When is a theory preferred to another one?
S16f. What is a fact?
S16g. Physics and experience
S16h. Modeling reality
S16i. What is a system (e.g., an ideal gas)?
S16j. When is a theory confirmed?
S16k. What is real?
S16l. How many angels fit onto the tip of a needle?
Chapter 17 (8 sections)
S17a. How to get information from sci.physics.research
S17b. How to get your work published
S17c. How to respond to critical referee's reports
S17d. How to sell your revolutionary idea
S17e. Useful background, online lecture notes, etc.
S17f. Stories about physicists
S17g. Other physics FAQs
*S17h. Naming in science
Chapter 18 (5 sections)
S18a. What is the meaning of 'self-consistent'?
S18b. What is a vector?
S18c. Learning quantum mechanics at age 14
S18d. Research at age 16
S18e. Are there indefinite Hilbert spaces?
Chapter 19 (1 section)
S19a. God and physics
Chapter 20 (1 section)
S20a. Acknowledgments
----------------------------
S1a. What are bras and kets?
----------------------------
In the language of linear algebra, kets |psi> are just column vectors
psi (for systems with finitely many levels only; each component gives
the amplitude for the corresponding level), and the corresponding
bras <psi| are the complex conjugated transposed row vectors psi^*.
The inner product <phi|psi>, the bra(c)ket, is therefore
<phi|psi> = phi^*psi = sum_k phi_k^* psi_k.
For the basis bra <k|, the unit vector with a single entry 1 at
position k, we find as special case
<k|psi> = psi_k.
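For a system with finitely many levels, these formulas can be tried
out directly. The following is a minimal sketch in Python (standard
library only); the 3-level vectors psi and phi and the helper names
braket and basis_ket are made up for illustration.

```python
# Kets as plain lists of complex amplitudes for a hypothetical
# 3-level system; the numbers are illustrative only.

def braket(phi, psi):
    """Inner product <phi|psi> = sum_k conj(phi_k) * psi_k."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def basis_ket(k, n):
    """Unit vector with a single entry 1 at position k (0-based)."""
    return [1 + 0j if j == k else 0j for j in range(n)]

psi = [0.6 + 0j, 0.8j, 0j]
phi = [1j, 1 + 0j, 0j]

print(braket(phi, psi))               # <phi|psi>
print(braket(basis_ket(1, 3), psi))   # <1|psi>, the component psi[1]
print(braket(psi, psi))               # squared norm of psi
```

Note that the bra is obtained by conjugating the components, so that
braket(psi, phi) is the complex conjugate of braket(phi, psi).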
In infinite dimensions, the sum becomes an integral, and we get
<phi|psi> = integral dx phi(x)^* psi(x)
and for the basis bra <x|, which is a delta distribution centered at x,
we have
psi(x) = <x|psi>.
Actually, in infinite dimensions, one needs functional analysis
in place of linear algebra to get a concise definition; kets are smooth
functions from some nice function space, and bras are linear
functionals on this space, i.e., elements of its dual space. The dual
space is larger and also contains distributions.
(For those who want to be fully rigorous: kets belong to a
so-called nuclear space H_inf, for example the space of Schwartz
functions; its completion H under the norm induced by the inner product
gives the conventional Hilbert space, and together with the dual
H_inf^* = H_-inf, these define a Gelfand triple or rigged Hilbert
space, two names for the same concept).
Physicists are less picky, however, and allow kets also to be
less smooth functions and even distributions, so that every bra has
a corresponding ket. Thus they use the ket |x> although this is not a
function but a delta distribution centered at x.
This allows them to write not only psi(x) = <x|psi>, but also
psi(x)^* = <x|psi>^* = <psi|x>.
The price to be paid is that inner products are no longer well-defined
in general; for example, <x|x> is infinite. They say, |x> is not
normalizable and mean that it is not in the Hilbert space of
well-behaved pure states.
Caution: Physicists often use different bases which may cause confusing
notation. For example <p| is a momentum basis state, while <x| is a
position basis state. But while <x|y> = 0 if x and y are distinct
positions, and <p|q> = 0 if p and q are distinct momenta,
the inner product of a momentum bra <p| and a position ket |x>
(or vice versa) is never zero. (Exercise: Verify this by computing
explicit formulas for <p|x> and <x|p>!) Thus, unlike in mathematics,
the formulas are not invariant under substitution of letters for
the variables!
About the pitfalls when not using the required care, I recommend reading
F. Gieres,
Mathematical surprises and Dirac's formalism in quantum mechanics,
Rep. Prog. Phys. 63 (2000) 1893-1931.
quant-ph/9907069
and
G. Bonneau, J. Faraut, G. Valent,
Self-adjoint extensions of operators and the teaching of quantum
mechanics,
Amer. J. Phys. 69 (2001) 322-331.
quant-ph/0103153
----------------------------------------------
S1b. Projective geometry and quantum mechanics
----------------------------------------------
Projective geometry means that one works with rays instead of vectors
to designate points in a geometry.
Think of the 2-dimensional affine plane. The points are represented by
vectors in R^2. On the other hand, by moving an affine plane lying on
the floor a little upwards into the air (the same amount at every
point), one may think of each point as being represented by the ray
from an origin on the floor to the point on the plane.
(Actually, instead of the ray one should consider the whole line;
strictly speaking, a ray is only a half-line. But in quantum physics,
one customarily calls the 1-dimensional subspaces rays. Since the
coefficient field is complex, the rays are actually rotated complex
number planes.)
Similarly, lines are now 2-spaces through the origin. This gives
projective geometry (or homogeneous coordinates, which is the same in
more algebraic terms).
But now one also has some additional points, corresponding to rays
parallel to the affine plane. These points form the 'line at infinity'
= the 2-space through the origin parallel to the affine plane.
A slightly closer look reveals that the geometry has become more
complete: Now not only every two points have a unique connecting line
but also any two lines have a unique intersection - what were before
parallels are now lines intersecting 'at infinity'. Imagine two long,
straight rails of a railway track...
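In homogeneous coordinates this can be checked by machine. Here is a
minimal sketch in Python, assuming that the affine plane is placed at
height z=1 and that a line a x + b y + c = 0 is represented by the
triple (a,b,c), the normal of the corresponding 2-space through the
origin. The helper names and the sample lines are illustrative
choices. The ray common to two such 2-spaces is spanned by the cross
product of their normals.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def to_affine(point):
    """Affine coordinates of a ray (x, y, z); None for a point at infinity."""
    x, y, z = point
    if z == 0:
        return None
    return (x / z, y / z)

line1 = (1, -1, 0)    # y = x
line2 = (1, -1, 2)    # y = x + 2, parallel to line1
line3 = (1, 1, -2)    # x + y = 2

meet_generic = cross(line1, line3)    # ordinary intersection point
meet_parallel = cross(line1, line2)   # intersection of two parallels

print(meet_generic, to_affine(meet_generic))
print(meet_parallel, to_affine(meet_parallel))
```

For the parallel pair the last homogeneous coordinate vanishes: the
intersection is a ray parallel to the affine plane, i.e., a point on
the line at infinity, in the common direction of the two lines.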
This can be extended to higher dimensions. n-dimensional affine geometry
can be represented by rays through 0 in (n+1)-dimensional space, and can
be completed there to a projective geometry, in which the vector
subspaces are the geometrical objects. In Hilbert space one can no
longer count dimensions, but otherwise everything is similar.
Since, in quantum mechanics, state vectors are only defined up to a
phase (even when normalized), they correspond uniquely to rays
= 1-dimensional subspaces in Hilbert space. Hence quantum mechanics is
intrinsically projective.
------------------------------------------------------------
S1c. What is the meaning of the entries of a density matrix?
------------------------------------------------------------
Density matrices are a convenient way of describing states of quantum
systems in contact with an environment. (State vectors = wave functions
are appropriate only for isolated systems at zero absolute temperature,
though they can be used in an approximate way in thermally isolated
contexts. But contact with an environment means positive temperature.)
If the quantum system has only a finite number n of levels,
the density matrix is an n x n matrix; otherwise it is
a linear operator on Hilbert space (but nevertheless called a matrix).
The real use for density matrices is to compute expectations
<f> = trace (rho f)
for quantities f of interest. Indeed, rho is just a collection of
numbers enabling one to calculate these expectations.
The fact that the constant 1 must have expectation 1 leads to the
restriction that
sum_k rho_kk = trace rho = 1.
Apart from that, rho must be a Hermitian, positive semidefinite matrix,
to satisfy the requirements of statistics. (See quant-ph/0303047 for
details.) For small systems, all such density matrices can indeed be
approximately realized in practice.
Since diagonal entries of a positive semidefinite matrix are always
nonnegative, the p_k:=rho_kk are nonnegative numbers summing to 1 and
thus look like probabilities. What the components mean depends on the
basis used.
In particular, if the basis consists of eigenstates of a Hamiltonian,
and the eigenvalues E_k are all nondegenerate, a diagonal element
rho_kk can be interpreted as the probability that upon measuring the
energy of the system one will find the value E_k.
If f is a function of the Hamiltonian H, and the basis used consists of
eigenstates |k> of H, with H|k>=E_k|k> then the density matrix rho
has entries rho_jk = <j|rho|k>. If one now calculates the expectation
of a function f(H), the equation f(H)|k>=f(E_k)|k> implies that
<f(H)> = trace (rho f(H)) = sum_k <k|rho f(H)|k>
= sum_k <k|rho f(E_k)|k> = sum_k <k|rho|k> f(E_k)
= sum_k rho_kk f(E_k).
If we average the results f(E) of a number of measurements of the
energy, where the energy E_k is measured with probability p_k,
we get
<f(H)> = sum_k p_k f(E_k).
Thus, to match the expectations no matter which function we are
averaging, we need to take p_k=rho_kk. This gives the claimed
probability interpretation of the diagonal entries.
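The identity trace (rho f(H)) = sum_k rho_kk f(E_k) can be checked
numerically. The following is a minimal sketch in Python for a
hypothetical 2-level system written in the eigenbasis of H; the
energies, the density matrix, and the function f are made-up
illustrative values, and the helper names are not part of the FAQ.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[k][k] for k in range(len(a)))

# Hypothetical data in the eigenbasis of H.
E = [1.0, 2.5]                      # eigenvalues E_k
rho = [[0.7, 0.2 - 0.1j],
       [0.2 + 0.1j, 0.3]]           # Hermitian, trace 1, semidefinite

def f(e):
    return e ** 2                   # any function of the energy will do

# In the eigenbasis, f(H) is diagonal with entries f(E_k).
f_of_H = [[f(E[0]), 0.0],
          [0.0, f(E[1])]]

lhs = trace(matmul(rho, f_of_H))                   # trace(rho f(H))
rhs = sum(rho[k][k] * f(E[k]) for k in range(2))   # sum_k rho_kk f(E_k)
print(lhs, rhs)
```

The off-diagonal entries of rho drop out of the result, in agreement
with the derivation above.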
Off-diagonal elements have no simple interpretation.
Usually one does not look at off-diagonal elements at all, but they
are important in intermediate steps of calculations.
Close to absolute zero temperature, and assuming the absence of
degeneracy (but also in certain other, well-prepared, nearly
isolated systems), quantum states have the property that all columns
of the density matrix are nearly parallel to a wave function psi
that is conventionally normalized to have norm 1,
psi^*psi=1.
(In Dirac language, this says <psi|psi>=1; see the FAQ entry for bras
and kets.) This vector psi, which is clearly determined only up to a
complex number of absolute value 1, is called the wave vector
(or, in infinite dimensions, the wave function) of the state.
Idealizing this situation, one describes such quantum systems by states
in which all columns of the density matrix are exactly parallel to some
nonzero wave vector psi. (Such matrices are called rank 1 matrices;
the wave vector, also referred to as a wave function, is defined
only up to a phase factor.)
Then the k-th column is a multiple c_k psi of psi. The fact that rho
is Hermitian forces each row to be a multiple of psi^*. But this implies
that c_k is a multiple of psi^*_k, so that rho is a multiple of
psi psi^*. Since psi is normalized, the multiplication factor is just
the trace, and since the trace is 1 we find
rho = psi psi^* for any rank 1 density matrix.
If we now calculate the probability of measuring the energy E_k, we find
p_k = rho_kk = <k|rho|k> = <k|psi psi^*|k> = <k|psi> <psi|k>,
and since <psi|k> is just the complex conjugate of <k|psi>,
we end up with
p_k = |<k|psi>|^2.
This is Born's squared amplitude formula for calculating probabilities.
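As a small numerical illustration, the following Python sketch builds
rho = psi psi^* for a hypothetical normalized 3-level wave vector
(the amplitudes are made up) and compares the diagonal entries with
the squared amplitudes.

```python
import math

# Hypothetical wave vector with norm 1.
psi = [0.5 + 0j, 0.5j, math.sqrt(0.5) + 0j]

# rho = psi psi^*, i.e., rho_jk = psi_j * conj(psi_k).
rho = [[a * b.conjugate() for b in psi] for a in psi]

p_diag = [rho[k][k].real for k in range(3)]   # p_k = rho_kk
p_born = [abs(psi[k]) ** 2 for k in range(3)] # p_k = |<k|psi>|^2

# Column k of rho is the multiple conj(psi_k) * psi of psi.
column_gap = max(abs(rho[j][k] - psi[k].conjugate() * psi[j])
                 for j in range(3) for k in range(3))

print(p_diag, p_born, sum(p_diag), column_gap)
```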
Thus one sees that the traditional wave vector calculus is just a
special case of the density matrix calculus, appropriate (only) for
the study of tiny, well-prepared nearly isolated systems and for
systems close to zero absolute temperature. For the study of ordinary
matter under ordinary conditions, one needs to represent states
by density matrices.
Everything that is done with wave vectors can also be done with
density matrices, or equivalently with the associated expectation
mapping. Indeed, everything becomes simpler that way, much closer
to classical mechanics, and much less weird-looking.
See quant-ph/0303047 for an exposition of the foundations of quantum
mechanics (including the probability interpretation, uncertainty
relations, nonlocality, and Bell's theorem) in terms of expectations.
--------------------------------------------------------
S1d. Postulates for the formal core of quantum mechanics
--------------------------------------------------------
Quantum mechanics consists of a formal core that is
universally agreed upon (basically being a piece of mathematics
with a few meager pointers on how to match it with experimental
reality) and an interpretational halo that remains highly disputed
even after 80 years of modern quantum mechanics. The latter is the
subject of the foundations of quantum mechanics; it is addressed
elsewhere in this FAQ. Here I focus on the formal side.
As in any axiomatic setting (necessary for a formal discipline),
there are a number of different but equivalent sets of axioms
or postulates that can be used to define formal quantum mechanics.
Since they are equivalent, their choice is a matter of convenience.
My choice presented here is the formulation which gives most
direct access to statistical mechanics, which is the main tool for
real life applications of quantum mechanics. The relativistic case
is outside the scope of the present axioms. Thus the following
describes nonrelativistic quantum statistical mechanics in the
Schroedinger picture. (The traditional starting point is instead
the special case of this setting where all states are assumed to be
pure.)
There are six basic axioms:
A1. A generic system (e.g., a 'hydrogen molecule')
is defined by specifying a Hilbert space K whose elements
are called state vectors and a (densely defined, self-adjoint)
Hermitian linear operator H called the _Hamiltonian_ or the _energy_.
A2. A particular system (e.g., 'the ion in the ion trap on this
particular desk') is characterized by its _state_ rho(t)
at every time t in R (the set of real numbers). Here rho(t) is a
Hermitian, positive semidefinite (trace class) linear operator on K
satisfying at all times the condition
trace rho(t) = 1. (normalization)
A state is called _pure_ at time t if rho(t) maps K to a 1-dimensional
subspace, and _mixed_ otherwise.
A3. A system is called _closed_ in a time interval [t1,t2]
if it satisfies the evolution equation
d/dt rho(t) = i/hbar [rho(t),H] for t in [t1,t2],
and _open_ otherwise. (hbar is the reduced Planck constant, and is often set
to 1.) If nothing else is apparent from the context,
a system is assumed to be closed.
A4. Besides the energy H, certain other (densely defined, self-adjoint)
Hermitian operators (or vectors of such operators) are distinguished
as _observables_.
(E.g., the observables for an N-particle system conventionally include
for each particle a several 3-dimensional vectors:
the _position_ x^a, _momentum_ p^a, _orbital_angular_momentum_ L^a
and the _spin_vector_ (or Bloch vector) sigma^a of the particle with
label a. If u is a 3-vector of unit length then u dot p^a, u dot L^a
and u dot sigma^a define the momentum, orbital angular momentum,
and spin of particle a in direction u.)
A5. For any particular system, one associates to every vector X
of observables with commuting components a time-dependent monotone
linear functional <dot>_t defining the _expectation_
<f(X)>_t:=trace rho(t) f(X)
of bounded continuous functions f(X) at time t.
This is equivalent to a multivariate probability measure dmu_t(X)
(on a suitable sigma algebra over the spectrum spec(X) of X)
defined by
integral dmu_t(X) f(X) := trace rho(t) f(X) =<f(X)>_t.
A6. Quantum mechanical predictions amount to predicting properties
(typically expectations or conditional probabilities)
of the measures defined in axiom A5 given reasonable assumptions
about the states (e.g., ground state, equilibrium state, etc.)
Axiom A6 specifies that the formal content of the theory is covered
exactly by what can be deduced from axioms A1-A5 without
anything else added (except for restrictions defining the specific
nature of the state), and hence says that Axioms A1-A5 are complete.
The description of a particular closed system is therefore given by
the specification of a particular Hilbert space in A1, the
specification of the observable quantities in A4, and the
specification of conditions singling out a particular class of
states (in A6). Everything else is determined by the theory and
hence is (in principle) predicted by the theory.
The description of an open system involves, in addition, the
specification of the details of the dynamical law. (For the basics,
see the entry 'Open quantum systems' in this FAQ.)
Deriving the Born rule (*) from axioms A1-A5 makes it completely
natural, while the traditional approach starting with (*)
makes it an irreducible rule full of mystery and only justifiable
by its agreement with experiment.
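The evolution equation of axiom A3 can be checked numerically in a
simple case. The following is a minimal sketch in Python, assuming a
hypothetical 2-level system with hbar=1 whose Hamiltonian is diagonal
in the chosen basis, H = diag(E_1,E_2); the numbers are made up. In
this case rho_jk(t) = rho_jk(0) exp(-i (E_j-E_k) t/hbar) should solve
d/dt rho(t) = i/hbar [rho(t),H], which the code tests by a central
finite difference.

```python
import cmath

HBAR = 1.0
E = [0.0, 1.5]                        # hypothetical energy levels
rho0 = [[0.7, 0.2 - 0.1j],
        [0.2 + 0.1j, 0.3]]            # Hermitian, trace 1, semidefinite

def rho(t):
    """Candidate solution rho_jk(t) = rho_jk(0) exp(-i (E_j-E_k) t/hbar)."""
    return [[rho0[j][k] * cmath.exp(-1j * (E[j] - E[k]) * t / HBAR)
             for k in range(2)] for j in range(2)]

def rhs(r):
    """i/hbar [r,H] for H = diag(E): entries i/hbar * r_jk * (E_k - E_j)."""
    return [[1j / HBAR * r[j][k] * (E[k] - E[j]) for k in range(2)]
            for j in range(2)]

t, h = 0.4, 1e-5
plus, minus, right = rho(t + h), rho(t - h), rhs(rho(t))

# Largest mismatch between the numerical time derivative and i/hbar [rho,H].
residual = max(abs((plus[j][k] - minus[j][k]) / (2 * h) - right[j][k])
               for j in range(2) for k in range(2))

# The normalization trace rho(t) = 1 of axiom A2 is preserved.
trace_t = rho(t)[0][0] + rho(t)[1][1]
print(residual, trace_t)
```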
-------------------------
S1e. Open quantum systems
-------------------------
Open quantum systems are usually modelled in a stochastic way
to account for the unpredictability of the measurement process.
(Note that a measurement is any non-negligible interaction with the
environment, whether or not it is observed by something deserving
the name 'detector' or 'observer').
In the simplest setting in which states can be assumed to
be pure and measurements occur at definite, a priori known times
and have a negligible duration, an open quantum system is a discrete
stochastic process with values psi(t) in the Hilbert space of state
vectors, normalized to norm 1. Between two consecutive measurements,
the system is assumed to be closed.
Thus between two consecutive measurements at times t' and t''>t',
the normalized state psi(t) evolves according to the Schroedinger
equation
i hbar psidot = H psi,
so that
psi(t''-0)= P psi(t'+0), P = exp (i/hbar (t'-t'')H). (1)
(In the interaction picture, H=0 and psi remains constant between
measurements.)
A measurement at time t is assumed to happen in infinitesimal time
and replaces psi(t-0) independent of other measurements with
probability p_s by
psi(t+0)= P_s psi(t-0)/sqrt(p_s) if p_s>0, (2)
where the P_s are linear operators determined by the experimental
arrangement, satisfying the relation
sum_s P_s^*P_s = 1, (3)
and
p_s=|P_s psi(t-0)|^2 (4)
guarantees that psi(t+0) remains normalized. Clearly the p_s are
nonnegative and by (3), they sum up to 1 (since psi(t-0) is normalized).
(For measurements with more than countably many possible outcomes,
one must replace the probabilities by probability densities and the
sums by integrals.)
Thus this is a well-defined stochastic process.
A von-Neumann measurement of a self-adjoint linear operator A
corresponds to the special case where P_s is an orthogonal projector
to the eigenspace corresponding to the eigenvalue a_s of A
(respectively, to the set of eigenvalues corresponding to the s-th
interval in a partition of the continuous spectrum of A).
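A single measurement step of this process can be sketched in a few
lines of Python. The example assumes a hypothetical 2-level system
and takes for the P_s the two orthogonal projectors onto the basis
states; the state, the seed, and the helper names are illustrative
choices. The outcome s is drawn with probability p_s = |P_s psi|^2,
and the new state is P_s psi divided by its norm, so that it is again
normalized.

```python
import math
import random

def apply(op, psi):
    """Matrix-vector product op * psi."""
    return [sum(row[k] * psi[k] for k in range(len(psi))) for row in op]

def norm_sq(psi):
    return sum(abs(a) ** 2 for a in psi)

# Orthogonal projectors onto the two basis states of a hypothetical
# 2-level system; they satisfy sum_s P_s^* P_s = 1.
P = [[[1, 0], [0, 0]],
     [[0, 0], [0, 1]]]

def measure(psi, ops, rng):
    """One measurement step: draw s with probability p_s, update psi."""
    candidates = [apply(op, psi) for op in ops]
    probs = [norm_sq(c) for c in candidates]        # p_s = |P_s psi|^2
    s = rng.choices(range(len(ops)), weights=probs)[0]
    scale = math.sqrt(probs[s])                     # norm of P_s psi
    return s, [c / scale for c in candidates[s]], probs

rng = random.Random(0)    # fixed seed, only to make the run repeatable
psi = [0.6 + 0j, 0.8j]
s, psi_after, probs = measure(psi, P, rng)
print(s, psi_after, probs)
```

Repeating the call with the closed-system evolution (1) applied
between measurements gives a sample path of the stochastic process.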
---------------------------------
S1f. Interaction with a heat bath
---------------------------------
Quantum mechanics in the presence of a heat bath requires the use
of density matrices. Instead of the usual von-Neumann equation
rhodot = rho \lp H
(for \lp see the section 'Quantum-classical mechanics'),
the dynamics of the density matrix is given by a dissipative version
of it,
rhodot = rho \lp H + L(rho)
usually associated with the name of Lindblad. Here L(rho)
is a linear operator responsible for dissipation of energy to
the heat bath; it is not a simple commutator but can have
a rather complex form.
To get the Lindblad dynamics from a Hamiltonian description of
system plus bath, one uses the projection operator formalism.
The clearest treatment I know of is in
H. Grabert,
Projection Operator Techniques in Nonequilibrium
Statistical Mechanics,
Springer Tracts in Modern Physics, 1982.
The final equations for the Lindblad dynamics are (5.4.48/49)
in Grabert's book.
--------------------------------
S1g. Quantum-classical mechanics
--------------------------------
Quantum mechanics and classical mechanics are very close relatives.
There are analogous objects for everything of relevance in
classical and quantum statistical mechanics.
Observable f:
classical - real phase space function f(x,p)
quantum - Hermitian linear operator or sesquilinear form f
Lie product f \lp g:
read \lp as 'Lie', and visualize it as inverted, stylized L;
Macro for LaTeX:
\def\lp{\mbox{\Large$\,_\urcorner\,$}}
classical: f \lp g = {g,f} in terms of the Poisson bracket
quantum: f \lp g = i/hbar [f,g] in terms of the commutator
The Lie product is bilinear in the arguments and satisfies
f \lp g = - g \lp f
f \lp gh = (f \lp g)h + g(f \lp h) (Leibniz)
f \lp (g \lp h) = (f \lp g) \lp h + g \lp (f \lp h) (Jacobi)
Invariant measure:
classical - integral f := integral dxdp f(x,p)
quantum - integral f := trace f
Integrability: integral |f| finite
quantum integrable <==> f trace class
Partial integration formula:
integral f \lp g = 0.
Dynamics: df/dt = X_H f := H \lp f with Hermitian H
canonical transformations = mappings exp(tX_H) with Hermitian H
Liouville's theorem says that
integral f = integral exp(tX_H)f
The infinitesimal form of this is the partial integration formula.
State rho:
classical - real integrable phase space function rho(x,p)>=0
quantum - Hermitian positive semidefinite trace class operator rho
both normalized to integral rho = 1.
expectation of f in state rho:
<f> = integral rho f
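In the quantum case, the stated rules for the Lie product are easily
checked on small matrices. The following is a minimal sketch in
Python, assuming hbar=1 and three arbitrarily chosen 2x2 matrices
f, g, h (illustrative values only; the helper names are not part of
the FAQ).

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def lie(f, g):
    """Quantum Lie product f lp g = i/hbar [f,g], with hbar set to 1."""
    fg, gf = matmul(f, g), matmul(g, f)
    return [[1j * (x - y) for x, y in zip(r1, r2)] for r1, r2 in zip(fg, gf)]

def trace(a):
    return sum(a[k][k] for k in range(len(a)))

def dist(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

f = [[1, 2 - 1j], [2 + 1j, -1]]
g = [[0, 1j], [-1j, 2]]
h = [[3, 1], [1, 0]]

neg_gf = [[-x for x in row] for row in lie(g, f)]
antisymmetry = dist(lie(f, g), neg_gf)
leibniz = dist(lie(f, matmul(g, h)),
               add(matmul(lie(f, g), h), matmul(g, lie(f, h))))
jacobi = dist(lie(f, lie(g, h)),
              add(lie(lie(f, g), h), lie(g, lie(f, h))))
partial_integration = abs(trace(lie(f, g)))   # integral f lp g = 0

print(antisymmetry, leibniz, jacobi, partial_integration)
```

All four numbers should vanish up to rounding errors.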
--------------------------------------------------
S1h. Can all quantum states be realized in Nature?
--------------------------------------------------
No. Many mathematically conceivable states do not exist in Nature,
for example, that of water at an absolute temperature of zero.
Quantum mechanics does not demand that all states are realizable.
For a number of tiny systems with a few levels, all states are
realizable with reasonable precision. However, the larger the system
the fewer states are realized.
For very large systems such as human beings or galaxy clusters, the
number of states realized at a given time is even so small that it
can be approximately counted!
--------------------------------------------
S1i. Modes and wave functions of laser beams
--------------------------------------------
The physical state described by a typical laser beam is a state with
an indeterminate number of photons, since it is usually not an
eigenstate of the photon number operator. This essentially means that
in a beam, a definite number of photons cannot be meaningfully asserted;
instead, one has a meaningful photon density, referred to as the beam
intensity.
Thus the traditional N-particle picture does not apply.
Instead one has to work in a suitable Fock space.
The Maxwell-Fock space is obtained by 'second quantization' of the mode
space H_photon, consisting of all mode functions, i.e., solutions A(x)
of the free Maxwell equations, describing a classical background
electromagnetic field in vacuum. H_photon may be thought of as the
single photon Hilbert space, in analogy to the single electron Hilbert
space of solutions of the Dirac equation. (However, following up on
this analogy and calling A(x) a wave function leads to confusion later
on, and is best avoided.)
Actually, because of gauge invariance, the situation is slightly more
complicated, and best described in momentum space. The Maxwell
equations reduce in Lorentz gauge, partial dot A(x) = 0, to
partial^2 A(x)=0, whence the Fourier transform of A(x) has the form
delta(p^2) Ahat(p), and Ahat(p) must satisfy the transversality
condition
p dot Ahat(p) = 0.
By gauge invariance, only the coset of Ahat(p) obtained by adding
arbitrary multiples of p has a physical meaning, reflecting the
transversal nature of the free electromagnetic field.
This coset construction is needed to turn the space of modes
into a Hilbert space H_photon with invariant inner product
<A|B>= integral Ahat(p) dot Bhat(p) Dp,
where
Dp = d\p/p_0 = dp_1 dp_2 dp_3/p_0,
is the Lorentz invariant measure on the photon mass shell,
0 < p_0 = |\p| = sqrt(p_1^2+p_2^2+p_3^2)
(negative frequencies are discarded to get an irreducible
representation of the Poincare group).
Indeed, without the coset construction, the inner product is only
positive semidefinite, hence gives only a pre-Hilbert space.
Of course, a golf ball sitting on top of a flat hill will not move
down the hill; because of friction it remains in a metastable state.
Thus the above is an idealization. But most of physics is idealized,
and the language is also somewhat idealized (and, as actually used by
people, not even completely precise).
----------------------------------------------
S1k. Quantization in non-Cartesian coordinates
----------------------------------------------
Textbook quantization rules assume (often silently, without warning)
Cartesian coordinates. The rules derived there are based on
canonical commutation rules and are invalid for systems
described in other coordinate systems.
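A standard illustration (not taken from the discussion above, and
only a sketch) is the 2-dimensional harmonic oscillator in polar
coordinates, with hbar = m = omega = 1. Replacing p_r naively by
-i d/dr in H = p_r^2/2 + p_theta^2/(2 r^2) + r^2/2 drops the first
order term of the Laplacian. The code compares both operators on the
rotationally symmetric ground state exp(-r^2/2), using finite
differences; all function names are illustrative.

```python
import math

def ground_state(r):
    """Ground state exp(-r^2/2) of the 2D oscillator; no angular dependence."""
    return math.exp(-0.5 * r * r)

def d1(f, r, h=1e-4):
    """Central difference approximation of f'(r)."""
    return (f(r + h) - f(r - h)) / (2 * h)

def d2(f, r, h=1e-4):
    """Central difference approximation of f''(r)."""
    return (f(r + h) - 2 * f(r) + f(r - h)) / (h * h)

def local_energy_correct(r):
    """(H psi)/psi with the Laplacian correctly written in polar coordinates."""
    psi = ground_state(r)
    laplacian = d2(ground_state, r) + d1(ground_state, r) / r
    return (-0.5 * laplacian + 0.5 * r * r * psi) / psi

def local_energy_naive(r):
    """(H psi)/psi after naively substituting p_r -> -i d/dr."""
    psi = ground_state(r)
    return (-0.5 * d2(ground_state, r) + 0.5 * r * r * psi) / psi

for r in (0.5, 1.0, 2.0):
    print(r, local_energy_correct(r), local_energy_naive(r))
```

Up to discretization error, the correct operator returns the known
ground state energy 1 at every r, whereas the naive one returns 1/2.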
In particular, a Hamiltonian alone does not have a physical meaning
since it can be quite arbitrarily transformed by coordinate
transformations. The Hamiltonian needs to be combined with the
correct Poisson bracket to yield the correct dynamical equations.
Only if the classical Poisson bracket satisfies the canonical
commutation rules is the quantum mechanics obtained by imposing
canonical commutation rules on the commutators.
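A toy example may show why the Hamiltonian alone is not enough. It
assumes the harmonic oscillator H = p^2/2 + q^2/2 and the deliberately
noncanonical variables u = q, v = 2p, for which {u,v} = 2; these
choices and all names are illustrative only.

```python
def hamiltonian(u, v):
    """H = p^2/2 + q^2/2 written in the rescaled variables u = q, v = 2p."""
    return v * v / 8.0 + u * u / 2.0

def grad(h, u, v, eps=1e-6):
    """Numerical partial derivatives of h with respect to u and v."""
    du = (h(u + eps, v) - h(u - eps, v)) / (2 * eps)
    dv = (h(u, v + eps) - h(u, v - eps)) / (2 * eps)
    return du, dv

def velocity(u, v, bracket_uv):
    """u' = {u,H} and v' = {v,H} for a constant Poisson bracket {u,v}."""
    du, dv = grad(hamiltonian, u, v)
    return bracket_uv * dv, -bracket_uv * du

u, v = 0.3, 1.0                  # corresponds to q = 0.3, p = 0.5
naive = velocity(u, v, 1.0)      # pretends that {u,v} = 1
correct = velocity(u, v, 2.0)    # uses the true bracket {u,v} = 2
exact = (v / 2.0, -2.0 * u)      # from q' = p, p' = -q

print(naive, correct, exact)
```

Only the combination of the transformed Hamiltonian with the
transformed bracket reproduces the original dynamics; treating u and
v as canonical gives an oscillator with the wrong frequency.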
The standard quantization procedure assumes that the symplectic form
underlying the Hamiltonian description has the standard form
dp ^ dq (with ^ the wedge product). Under a general coordinate
transformation, the symplectic form
changes into something nonstandard, and naive quantization gives
wrong results.
To get correct results, one has to take account of the correct
symplectic structure, more precisely of the Poisson bracket defined
by it. This is most naturally done in a differential geometric
setting, in terms of symplectic manifolds and Poisson manifolds.
To proceed, one must quantize a symplectic (or a Poisson) manifold
together with a Hamiltonian defined on it.
This combination is invariant under coordinate transformations
and hence has a coordinate-independent geometric meaning.
How to quantize Hamiltonians on a symplectic (or a Poisson) manifold
is the subject of geometric quantization, about which there is a
significant literature.
------------------------
S1l. Second quantization
------------------------
Second quantization is a way of writing the quantum mechanics of
indistinguishable particles in such a way that it makes statistical
mechanics calculations easy and makes everything look like field theory.
One starts with a distinguished vacuum state |vac> and a family of
annihilation operators a(x) with their adjoints, the creation
operators a^*(x), satisfying the canonical commutation relations (CCR)
[a(x),a(y)]=[a^*(x),a^*(y)]=0,
[a(x),a^*(y)]=delta(x-y).
(This is for Bosons; for Fermions one has instead canonical
anticommutation relations, CAR, and everything below gets additional
minus signs in certain places.)
A pure (permutation symmetric) N-particle state with wave function
psi(x_1:N) is written in 2nd quantization as
psi = integral dx_1:N psi(x_1:N) a^*(x_1:N) |vac>,
hence the corresponding density matrix
rho = psi psi^*
takes the form
rho = integral dx_1:N dy_1:N rho(x_1:N,y_1:N),
where rho(x_1:N,y_1:N) is the rank one operator
psi(x_1:N)psi^*(y_1:N)a^*(x_1:N)|vac><vac|a(y_1:N).
Using this correspondence, one can do in second quantization whatever
one can do in first quantization (i.e., wave mechanics),
and match the results.
If f is a 1-particle operator given by an integral operator with
kernel f(x,y) (the general case follows by taking limits), so that
(f psi)(x_1:N)
= sum_a integral dx f(x_a,x) psi(x_{1:a-1},x,x_{a+1:N}),
the formula
<f> = integral dx dy <x|Rho|y> f(y,x)
defines the 1-particle density matrix Rho. The form of f in second
quantization is
f = integral dx dy f(x,y) a^*(x) a(y)
(exercise: check that it has indeed the desired action on an
N-particle state!), hence one has, after renaming the integration
variables,
<f> = integral dx dy f(y,x) <a^*(y)a(x)>,
and comparison with the definition of Rho gives the formula
<x|Rho|y> = <a^*(y)a(x)> = trace a(x) rho a^*(y),
which can therefore be viewed as the definition of the 1-particle
density matrix in second quantization.
Authors who fear integrals write instead similar formulas with
sums in place of integrals and discrete indices in place of the x,y.
Also, one can do the same in momentum space rather than position space,
which amounts to a change of basis but generally leads to
computationally more tractable formulations.
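The correspondence can be checked in the discrete version, with sums
in place of integrals. The sketch below assumes 3 sites, N = 2 Bosons,
and arbitrary sample values for psi and f; the representation of Fock
states as dictionaries of occupation numbers is an illustrative
choice. It compares <f> computed from f = sum f(x,y) a^*(x) a(y)
with the first quantized expectation.

```python
import math
from itertools import product

M = 3   # number of discrete positions (illustrative choice)

def create(state, x):
    """Apply a^*(x) to a state stored as {occupation tuple: amplitude}."""
    out = {}
    for occ, amp in state.items():
        new = occ[:x] + (occ[x] + 1,) + occ[x + 1:]
        out[new] = out.get(new, 0) + math.sqrt(occ[x] + 1) * amp
    return out

def annihilate(state, x):
    """Apply a(x); components without a particle at x drop out."""
    out = {}
    for occ, amp in state.items():
        if occ[x] > 0:
            new = occ[:x] + (occ[x] - 1,) + occ[x + 1:]
            out[new] = out.get(new, 0) + math.sqrt(occ[x]) * amp
    return out

def inner(s, t):
    """<s|t>, conjugate-linear in the first argument."""
    return sum(complex(amp).conjugate() * t.get(occ, 0) for occ, amp in s.items())

# Symmetric 2-particle wave function and a non-symmetric 1-particle kernel.
psi = [[1.0, 2j, 0.5], [2j, -1.0, 1 + 1j], [0.5, 1 + 1j, 3.0]]
f = [[0.2, 1j, 0.0], [0.5, -1.0, 2.0], [1 - 1j, 0.3, 0.7]]

# psi = sum psi(x_1,x_2) a^*(x_1) a^*(x_2) |vac>
vac = {(0,) * M: 1.0}
state = {}
for x1, x2 in product(range(M), repeat=2):
    for occ, amp in create(create(vac, x2), x1).items():
        state[occ] = state.get(occ, 0) + psi[x1][x2] * amp
norm = inner(state, state).real

# <x|Rho|y> = <a^*(y) a(x)>, normalized by <psi|psi>
Rho = [[inner(annihilate(state, y), annihilate(state, x)) / norm
        for y in range(M)] for x in range(M)]

# Second quantization: <f> = sum f(x,y) <a^*(x) a(y)>
f_second = sum(f[x][y] * Rho[y][x] for x in range(M) for y in range(M))

# First quantization: f acts on each particle coordinate in turn.
def f_psi(x1, x2):
    return sum(f[x1][x] * psi[x][x2] + f[x2][x] * psi[x1][x] for x in range(M))

pairs = list(product(range(M), repeat=2))
f_first = (sum(complex(psi[a][b]).conjugate() * f_psi(a, b) for a, b in pairs)
           / sum(abs(psi[a][b]) ** 2 for a, b in pairs))
trace_Rho = sum(Rho[x][x] for x in range(M))

print(f_second, f_first, trace_Rho)
```

The two expectations agree, and the trace of Rho equals the particle
number N = 2.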
-----------------------------------
S1m. When is an object macroscopic?
-----------------------------------
One says that thermodynamics and statistical mechanics apply to
macroscopic objects. But when is an object macroscopic?
Thermodynamics and statistical mechanics are approximate, asymptotic
descriptions valid for 'sufficiently large' objects.
The approximations made are better and better the larger the object.
One can place the barrier anywhere; if one puts it too low, the
approximate description will be poor, if one puts it too high it
won't apply to the system of interest.
Thus the loose language accommodates the freedom in modeling the
user has when choosing the description level and the accuracy level.
It is subjective only in the same sense as the choice of a
system of interest. What is interesting for one person or investigation
may be different from what is interesting for another person or
investigation; nevertheless, both may employ objective tools.
The mathematical meaning underlying this loose language is called the
thermodynamic limit. It makes the term 'macroscopic'
precise in a similar way as the mathematical notion of a limit N->inf
makes the term 'N sufficiently large' precise.
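How fast the approximation improves with size can be seen in a toy
model. The sketch assumes N independent two-level units, each excited
with probability p (an assumption made only for illustration), and
computes the relative fluctuation of the number of excited units from
the exact binomial distribution.

```python
import math

def relative_fluctuation(n, p=0.3):
    """Standard deviation over mean of the number of excited units."""
    mean = 0.0
    second = 0.0
    for k in range(n + 1):
        # log of the binomial weight, to avoid overflow for large n
        log_w = (math.lgamma(n + 1) - math.lgamma(k + 1)
                 - math.lgamma(n - k + 1)
                 + k * math.log(p) + (n - k) * math.log(1 - p))
        w = math.exp(log_w)
        mean += k * w
        second += k * k * w
    return math.sqrt(second - mean * mean) / mean

for n in (100, 10000):
    print(n, relative_fluctuation(n))
```

In this model the relative fluctuation decays like 1/sqrt(N), so a
factor 100 in size gains one decimal digit of accuracy. Where to put
the barrier then depends on the accuracy one wants.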
If one accepts the vague terminology to avoid talking always about
limits, one can give the following definition (which reflects the
subjectivity in the qualification about the modeling accuracy):
In statistical mechanics, all macroscopic observables are ensemble
averages. Thus, formally, a "macroscopic observable" is the expectation
of a space-time dependent field operator which remains constant
within the modeling accuracy under changes in space and time
smaller than the modeling accuracy.
---------------------------------------
S1n. The role of the ergodic hypothesis
---------------------------------------
----------------------------------------------------
S1o. Does quantum mechanics apply to single systems?
----------------------------------------------------
It is clear phenomenologically that statistical mechanics (and hence
quantum mechanics) applies to single systems like a particular cup of
tea, irrespective of what the discussions about the foundations of
physics say (see many other entries in this FAQ). Thus statistical
mechanics and quantum mechanics do not only apply - as is often
claimed - to large ensembles of independently and identically prepared
systems; when the system is large enough (i.e., macroscopic),
a _single_ system is enough.
(For smaller single systems, see the entry
''How do atoms and molecules look like?'' in the present FAQ.)
In classical statistical mechanics, the traditional bridge between
the ensemble view and thermodynamics (which clearly applies to single
systems) is the ergodic hypothesis. But there is not enough time
in the universe to explore more than an extremely tiny region of the
about 10^25-dimensional phase space of the cup of tea to explain the
success of the thermodynamical description by ergodicity.
In quantum mechanics, the situation is even worse - usually it is not
even attempted here to bridge the gap.
The best treatment I know of the foundational problems
involved in classical statistical mechanics is in the book
L. Sklar,
Physics and Chance,
Cambridge Univ. Press, Cambridge 1993,
but it does not present a solution. Other sources are not better in
this respect.
My own solution is the ''thermal interpretation'' of
physics, discussed to some extent in Chapter 7 of the book
Arnold Neumaier and Dennis Westra,
Classical and Quantum Mechanics via Lie algebras,
Cambridge University Press, to appear (2009?).
http://www.mat.univie.ac.at/~neum/papers/physpapers.html#QML
arXiv:0810.1019
and in my recent slides
A. Neumaier,
Classical and quantum field aspects of light,
http://www.mat.univie.ac.at/~neum/papers/physpapers.html#lightslides
and
A. Neumaier,
Optical models for quantum mechanics,
http://www.mat.univie.ac.at/~neum/papers/physpapers.html#optslides
and explored in more detail in my German
Ein Theoretische Physik FAQ
http://www.mat.univie.ac.at/~neum/physik-faq.txt
under the name ''consistent experiment interpretation''.
The key idea is that mathematical expectation has two different
interpretations in physics, one as average over a large number of
cases, and the other as a means of defining observables. That the
two interpretations have the same mathematical properties is the
reason they have been confused in the past. The thermal interpretation
separates them neatly and thus gets rid of most of the confusing
aspects of the foundations of physics.
-----------------------------------------
S1p. Dissipative dynamics and Lagrangians
-----------------------------------------
Any system of ordinary differential equations can be brought
into an artificial Lagrangian form, by first rewriting it in first
order form
F(q,q')=0,
doubling the degrees of freedom by introducing conjugate variables p,
and then considering the Lagrangian
L(p,q)= p^T F(q,q').
In particular, this provides a Lagrangian formulation of dissipative
systems, such as the damped harmonic oscillator
m q'' + c q' + k q = 0 (m,c,k >0)
Unfortunately, the Hamiltonian in such a formulation has
nothing to do with the physical energy
E = (m q'^2 + k q^2)/2
The same holds for various other representations for the damped
harmonic oscillator found in the literature.
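The following sketch illustrates this for the damped harmonic
oscillator. It assumes the explicit first order form
F(q,q') = q' - f(q) with q = (position, velocity), for which the
Lagrangian p^T F gives the Hamiltonian H = p^T f(q) and the extra
equations p' = -(df/dq)^T p. The parameter values, the initial
conditions and the Runge-Kutta integrator are illustrative choices.

```python
m, c, k = 1.0, 0.4, 2.0   # sample parameters with m, c, k > 0

def rhs(state):
    """Equations of motion derived from L = p^T (q' - f(q))."""
    q1, q2, p1, p2 = state
    return (q2,                         # q1' = q2
            -(c * q2 + k * q1) / m,     # q2' = -(c q2 + k q1)/m
            (k / m) * p2,               # p'  = -(df/dq)^T p
            -p1 + (c / m) * p2)

def rk4_step(state, dt):
    """One classical fourth order Runge-Kutta step."""
    def shifted(s, d, a):
        return tuple(x + a * y for x, y in zip(s, d))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2))
    k3 = rhs(shifted(state, k2, dt / 2))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * cc + d)
                 for x, a, b, cc, d in zip(state, k1, k2, k3, k4))

def artificial_hamiltonian(state):
    """H = p^T f(q), the Hamiltonian of the doubled system."""
    q1, q2, p1, p2 = state
    return p1 * q2 - p2 * (c * q2 + k * q1) / m

def physical_energy(state):
    """E = (m q'^2 + k q^2)/2."""
    q1, q2, _, _ = state
    return 0.5 * (m * q2 * q2 + k * q1 * q1)

start = (1.0, 0.0, 0.5, -0.3)
state = start
for _ in range(5000):                  # integrate up to t = 5
    state = rk4_step(state, 0.001)

print(artificial_hamiltonian(start), artificial_hamiltonian(state))
print(physical_energy(start), physical_energy(state))
```

The artificial Hamiltonian stays constant along the motion (up to
integration error), while the physical energy decays.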
Lagrangians for the damped harmonic oscillator go back to
H. Bateman, Phys. Rev. 38, 815-819 (1931); the treatise
P.M. Morse and H. Feshbach,
Methods of Theoretical Physics
McGraw-Hill, Boston 1953
discusses the procedure in Chapter 3 in terms of 'mirror images'
= additional dynamical variables needed to absorb the missing energy,
and remarks on p 313:
''The introduction of the mirror image ... is probably too artificial
a procedure to expect to obtain much of physical significance from
it.''
And indeed, the book doesn't make use of it anywhere.
There are cases where one needs to model the memory to capture the
essence of the reduced dynamics. But in many cases, a simpler,
memory-free description is possible and adequate. One can remove the
memory by employing a Markov approximation, and gets again a
differential equation, which defines the Lindblad (or, classically,
the Fokker-Planck) dynamics. Again, this is no longer described by a
Hamiltonian or Lagrangian framework.
In the extended formulation with explicit environment or with memory,
already a simple damped harmonic oscillator becomes a huge and
unwieldy dynamical system which is no longer equivalent to the damped
harmonic oscillator, but includes unwanted environment terms or memory
terms. In cases where one really needs to model the memory, the system
therefore is no longer a damped harmonic oscillator. The latter is
described by a simple linear constant coefficient second order
differential equation for a single function, and has no memory.
Its analysis is very simple, and compared to that any more detailed
description is unwieldy.
----------------------------------------
S1r. Measurement theory for real numbers
----------------------------------------
The standard textbook measurement theory says that the possible
measurement results in measuring an observable given by a Hermitian
operator A are its possible eigenvalues, with a probability density
depending on the state of the system. This is part of the content of
Born's rule, and counts as one of the cornerstones of the
interpretation of quantum mechanics.
But Born's rule gives only a very idealized account of measurement
theory, and gives no sufficient explanation for what is going on in
many nontrivial measurements.
The spectrum of the Hamiltonian of the electron of a hydrogen atom
has a discrete part, catering for its bound states. According to the
idealized textbook measurement theory, a measurement of the energy
of a bound state should produce an infinitely accurate value agreeing
with one of the values in the (QED-corrected) Balmer (etc.) series.
But this is ridiculous. Repeated preparation and measurement of the
position of the ``same'' spectral lines (which provide these energy
measurements, relative to an appropriate zero of the energy) yields
different results, from which the energies themselves can be obtained
only to a certain accuracy.
Thus Born's rule does not account for the interpretation of a
measurement of the energy of an electron. For similar reasons,
measurements of particle masses or resonance energies do not reveal
the exact values (which they should according to Born's rule) but only
approximations whose quality depends a lot on the way the measurement
is done (an aspect that does not figure at all in Born's rule).
Measurements such as that of a particle lifetime or the integral cross
section of a particular reaction do not even have a natural associated
operator of which the measurement result would be an eigenvalue.
The idealized textbook measurement theory based on Born's rule is
appropriate only for the measurement of spin and related variables
that result in recording decisions of finite information content.
Thus the measurement process as described by von Neumann (and copied
from there to numerous textbooks) is an unrealistic idealization
compared with many (and probably most) real measurements.
The latter are usually much better described by suitable POVMs
(positive operator valued measures) rather than by Born's rule,
which corresponds to PVMs (projection-valued measures), a special case
of POVMs in which the positive operators are in fact projections.
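A textbook-style toy example (not taken from the book cited below)
shows the difference. It assumes an unsharp measurement of the z
component of a spin 1/2 with effects E_+- = (1 +- eta sigma_z)/2,
where the sharpness parameter eta is a hypothetical detector
property; for eta = 1 the effects are projections and one is back at
Born's rule.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def unsharp_effects(eta):
    """Effects E_+ and E_- of an unsharp spin-z measurement, 0 <= eta <= 1."""
    e_plus = [[(1 + eta) / 2, 0.0], [0.0, (1 - eta) / 2]]
    e_minus = [[(1 - eta) / 2, 0.0], [0.0, (1 + eta) / 2]]
    return e_plus, e_minus

def is_projection(e, tol=1e-12):
    """Check whether e*e = e."""
    sq = matmul(e, e)
    return all(abs(sq[i][j] - e[i][j]) < tol
               for i in range(2) for j in range(2))

rho = [[1.0, 0.0], [0.0, 0.0]]   # spin prepared in the 'up' state

def probabilities(eta):
    """Probabilities trace(rho E) of the two outcomes."""
    e_plus, e_minus = unsharp_effects(eta)
    return trace(matmul(rho, e_plus)), trace(matmul(rho, e_minus))

for eta in (1.0, 0.8):
    print(eta, is_projection(unsharp_effects(eta)[0]), probabilities(eta))
```

In both cases the effects are positive and sum to the identity, so
the probabilities sum to 1; but only for eta = 1 is the outcome of
the measurement on the 'up' state certain.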
See Sections 7.3-7.5 of the book
A. Neumaier and D. Westra,
Classical and Quantum Mechanics via Lie algebras,
arXiv:0810.1019
for a realistic account of measurement theory not dependent on
Born's rule. The latter is derived there as a special case, together
with giving the condition in which it is applicable.
---------------------------------------------
S1s. The classical limit of quantum mechanics
---------------------------------------------
Classical mechanics is often seen as the formal limit hbar-->0 of
quantum mechanics. Strictly speaking, this cannot be true since hbar
is a constant of nature, which is often even set to one to have
convenient units. The classical limit really is the limit of large
quantum numbers M (typically of mass, number of particles, or size of
angular momentum), when attention is limited to quantities whose
uncertainties are small compared to their expectations.
In these situations, the effect is similar to taking the limit
hbar --> 0. In these cases the relative uncertainties scale with
sqrt(hbar/M), which becomes small if either hbar is made formally
tiny or if M is large.
Indeed, a quantum system is essentially classical if its relevant
quantities have uncertainties that are small compared to their
expectations.
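As a concrete instance, one may take a spin with quantum number j in
the state |j,j>. The sketch below (the choice of example and all
names are illustrative) computes the uncertainty of J_x relative to
the size hbar*j of the angular momentum from the ladder operator.

```python
import math

def relative_uncertainty(j, hbar=1.0):
    """Delta J_x divided by <J_z> = hbar*j in the state |j, m=j>."""
    # J_- |j,j> = hbar sqrt(j(j+1) - j(j-1)) |j,j-1> and J_+ |j,j> = 0,
    # so J_x |j,j> = (J_+ + J_-)/2 |j,j> has norm lowering/2.
    lowering = hbar * math.sqrt(j * (j + 1) - j * (j - 1))
    mean_jx = 0.0                       # <j,j|J_x|j,j> vanishes
    mean_jx2 = (lowering / 2.0) ** 2    # <J_x^2> = |J_x |j,j>|^2
    delta_jx = math.sqrt(mean_jx2 - mean_jx ** 2)
    return delta_jx / (hbar * j)

# Large quantum number at fixed hbar ...
for j in (0.5, 50, 5000):
    print(j, relative_uncertainty(j))

# ... or formally small hbar at fixed classical size L = hbar*j.
L = 2.0
for hbar in (1.0, 0.01):
    print(hbar, relative_uncertainty(L / hbar, hbar), math.sqrt(hbar / (2 * L)))
```

The result is 1/sqrt(2j) = sqrt(hbar/(2L)), so it does not matter
whether j is made large or hbar is made formally tiny.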
The relation between classical and quantum mechanics is most easily
seen if -- as in statistical mechanics -- quantum mechanics is
presented in terms of mixed states, which correspond to density
matrices.
(Almost all quantum mechanics applied to real systems not in
the ground state needs density matrices, since pure states are very
difficult to create and propagate unless a system is in the ground
state. Pure states describe only an idealized version of quantum
reality, which in statistical mechanics appears as the approximation
in the cold limit T-->0.)
Density matrices are intrinsically quantum mechanical.
Nevertheless they exhibit very close analogies to classical densities.
Therefore everyone interested in the relations between classical and
quantum mechanics is well-advised to look at both theories in the
statistical mechanics version, where the analogies are obvious, and
the transition from quantum to classical takes the form of a simple
approximation.
QM in the statistical mechanics version is almost as intuitive as
classical statistical mechanics. The only somewhat nonintuitive part
is in both cases how to interpret probability. (This is already a
severe problem in classical statistical mechanics, as the book by
Lawrence Sklar, Physics and Chance, explains in detail.)
--------------------------------------------
S1t. The classical limit via coherent states
--------------------------------------------
One method for producing classical mechanics from a quantum theory is
by looking at coherent states of the quantum theory. The standard
(Glauber) coherent states have a localized probability distribution in
classical phase space, whose center follows the classical equations
of motion when the Hamiltonian is quadratic in positions and momenta.
(For nonquadratic Hamiltonians, this only holds approximately over
short times. For example, for the 2-body problem with a 1/r^2
force law, Glauber coherent states are not preserved by the dynamics.
In this particular case, there are, however, alternative SO(2,4)-based
coherent states that are preserved by the dynamics, smeared over
Kepler-like orbits. The reason is that the Kepler 2-body problem --
and its quantum version, the hydrogen atom -- are superintegrable
systems with the large dynamical symmetry group SO(2,4).)
In general, roughly, coherent states form a nice orbit of unit vectors
of a Hilbert space H under a dynamical symmetry group G with a
triangular decomposition, such that the linear combinations of
coherent states are dense in H, and the inner product phi^*psi of
coherent states phi and psi can be calculated explicitly in terms of
the highest weight representation theory of G. The diagonal of the
N-th tensor power of H (coding systems with N-fold quantum numbers)
has coherent states phi_N (labelled by the same classical phase space
as the original coherent states, and corresponding to the N-fold highest
weight) with inner product
phi_N^*psi_N=(phi^*psi)^N
and for N --> inf, one gets a good classical limit. For the Heisenberg
group, phi^*psi is a 1/hbar-th power, and the N-th power corresponds
to replacing hbar by hbar/N. Thus one gets the standard classical limit.
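For the Glauber coherent states this can be checked directly. The
sketch assumes the labelling of coherent states by phase space points
z = (q + i p)/sqrt(2), so that the usual oscillator label is
alpha = z/sqrt(hbar); the sample points are arbitrary.

```python
import cmath

def overlap(z, w, hbar=1.0):
    """Inner product <z|w> of normalized Glauber coherent states."""
    exponent = z.conjugate() * w - abs(z) ** 2 / 2 - abs(w) ** 2 / 2
    return cmath.exp(exponent / hbar)

z, w = 0.3 + 0.4j, 0.5 - 0.1j
for n in (1, 10, 100):
    power = overlap(z, w) ** n              # N-th power at hbar = 1
    rescaled = overlap(z, w, hbar=1.0 / n)  # hbar replaced by hbar/N
    print(n, abs(power - rescaled), abs(rescaled))
```

The N-th power of the overlap agrees with the overlap at hbar/N, and
its absolute value exp(-N|z-w|^2/2) tends to zero for z different
from w: in the limit, distinct phase space points label orthogonal
states.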
Basic literature on relations between coherent states and the classical
limit, based on irreducible unitary representations of Lie groups
includes the book
A. M. Perelomov,
Generalized Coherent States and Their Applications,
Springer-Verlag, Berlin, 1986.
and the paper
L. Yaffe,
Large N limits as classical mechanics,
Rev. Mod. Phys. 54, 407--435 (1982)
Both references assume that the Lie group is finite-dimensional and
semisimple. This excludes the Heisenberg group, in terms of which the
standard (Glauber) coherent states are usually defined. However, the
Heisenberg group has a triangular decomposition, and this suffices to
apply Perelomov's theory in spirit. The online book
Arnold Neumaier, Dennis Westra,
Classical and Quantum Mechanics via Lie algebras,
http://lanl.arxiv.org/abs/0810.1019
contains a general discussion of the relations between classical
mechanics and quantum mechanics, and discusses in Chapter 16 the
concept of a triangular decomposition of Lie algebras and a summary of
the associated representation theory (though in its present version
not the general relation to coherent states).
For other relevant approaches to a rigorous classical limit, see the
online sources
http://www.projecteuclid.org/Dienst/Repository/1.0/Disseminate/euclid.cmp/1103859040/body/pdf
http://www.univie.ac.at/nuhag-php/bibtex/open_files/si80_SIMON!!!.pdf
http://arxiv.org/abs/quant-ph/9504016
http://arxiv.org/pdf/math-ph/9807027
--------------------------------
S2a. Lie groups and Lie algebras
--------------------------------
Lie groups can be illustrated by continuous rigid motion of a ball
with painted patterns on it in 3-dimensional space. The Lie group ISO(3)
consists of all rigid transformations.
A rigid transformation is essentially the act of picking the ball and
placing it somewhere else, ignoring the detailed motion in between and
the location where one started.
Special transformations are for example a translation in northern
direction by 1 meter, or a rotation by one quarter around the vertical
axis at some particular point (think of a ball with a string attached).
'Rigid' means that the distances between marked points on the ball
remain the same; the mathematician talks about 'preserving distances',
and the distances are therefore labeled 'invariants'.
One can repeat the same transformation several times, or carry out two
different transformations in succession, and get another one. This is
called the product of these transformations. For example, the product
of a translation
by 1 meter and another one by 2 meters in the same direction gives one
of 1+2=3 meters in the same direction. In this case, the distances add,
but if one combines rotations about different axes the result is no
longer intuitive. To make this more tractable for calculations,
one needs to take some kind of logarithms of transformations - these
behave again additively and make up the corresponding Lie algebra
iso(3) [same letters but in lower case]. The elements of the Lie algebra
can be visualized as very small, or 'infinitesimal', motions.
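For the rotation part this can be made explicit with 3x3 matrices.
The sketch below restricts to rotations (translations are left out),
and the helper names are illustrative. It checks that quarter turns
about different axes do not commute, that the infinitesimal rotations
satisfy [LX,LY] = LZ, and that the exponential series undoes the
'logarithm'.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

# Infinitesimal rotations about the x, y and z axes.
LX = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
LY = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
LZ = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def exp_series(a, terms=30):
    """Matrix exponential by its power series."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

quarter = math.pi / 2
xz = matmul(rot_x(quarter), rot_z(quarter))   # first about z, then about x
zx = matmul(rot_z(quarter), rot_x(quarter))   # first about x, then about z
from_algebra = exp_series([[quarter * x for x in row] for row in LZ])

print(xz != zx, commutator(LX, LY) == LZ)
```

The additive structure lives in the Lie algebra: multiples of LZ add
up, and their exponentials are the rotations about the z axis.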
General Lie groups and Lie algebras extend these notions to more
general manifolds. A manifold is just a higher-dimensional version
of space, and transformations are generalized motions preserving
invariants that are important in the manifold. The transformations
preserving these invariants are also called 'symmetries', and the
Lie group consisting of all symmetries is called a 'symmetry group'.
The elements of the corresponding Lie algebra are 'infinitesimal
symmetries'.
For example, physical laws are invariant under rotations and
translations, and hence under all rigid motions. But not only these:
If one includes time explicitly, the resulting 4-dimensional space
has more invariant motions or ''symmetries''.
The Lie group of all these symmetry transformations is called the
Poincar'e group, and plays a basic role in the theory of relativity.
The transformations are now about space-time frames in uniform motion.
Apart from translations and rotations there are symmetries called
'boosts' that accelerate a frame in a certain direction, and
combinations obtained by taking products. All infinitesimal symmetries
together make up a Lie algebra, called the Poincar'e algebra.
Much more on Lie groups and Lie algebras from the perspective of
classical and quantum physics can be found in:
Arnold Neumaier and Dennis Westra,
Classical and Quantum Mechanics via Lie algebras,
Cambridge University Press, to appear (2009?).
http://www.mat.univie.ac.at/~neum/papers/physpapers.html#QML
arXiv:0810.1019
-----------------------------------------------------------
S2b. The Galilei group as contraction of the Poincare group
-----------------------------------------------------------
The group of symmetries of special relativity is the Poincare group.
However, before Einstein invented the theory of relativity,
physics was believed to follow Newton's laws, and these have a
different group of symmetries - the Galilei group, and its
infinitesimal symmetries form the Galilei algebra.
Now Newton's physics is just a special case of the theory of relativity
in which all motions are very slow compared to the speed of light.
Physicists speak of the 'nonrelativistic limit'.
Thus one would expect that the Galilei group is a kind of
nonrelativistic limit of the Poincar'e group.
This notion has been made precise by Inonu. He looked at the
Poincar'e algebra and 'contracted' it in an ingenious way
to the Galilei algebra. The construction could then be lifted to
the corresponding groups. Not only that, it turned out to be a
general machinery applicable to all Lie algebras and Lie groups,
and therefore has found many applications far beyond that for which
it was originally developed.
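One piece of the contraction can be seen in a few lines. The sketch
below looks only at the commutator of two boosts, written as 4x4
matrices acting on (t,x,y,z) with the speed of light c kept explicit;
it is an illustration of the limit c --> inf, not the full
construction, and the function names are hypothetical.

```python
def boost_generator(axis, c):
    """Infinitesimal Lorentz boost along axis 1, 2 or 3 acting on (t,x,y,z)."""
    g = [[0.0] * 4 for _ in range(4)]
    g[axis][0] = 1.0             # delta x_axis = t
    g[0][axis] = 1.0 / c ** 2    # delta t = x_axis/c^2; absent for Galilei
    return g

def commutator(a, b):
    n = len(a)
    def mul(p, q):
        return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    ab, ba = mul(a, b), mul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def boost_commutator_size(c):
    """Largest entry of the commutator of the boosts along x and y."""
    comm = commutator(boost_generator(1, c), boost_generator(2, c))
    return max(abs(x) for row in comm for x in row)

for c in (1.0, 10.0, 1000.0):
    print(c, boost_commutator_size(c))
```

In the Poincare algebra the commutator of two boosts is a rotation
generator scaled by 1/c^2; it vanishes for c --> inf, in agreement
with the fact that Galilei boosts commute.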
---------------------------------------------------------------------
S2c. Representations of the Poincare group, spin and gauge invariance
---------------------------------------------------------------------
Whatever deserves the name ''particle'' must move like a single,
indivisible object. The Poincare group must act on the description of
this single object; so the state space of the object carries a
unitary representation of the Poincare group. This splits into a direct
sum or direct integral of irreducible reps. But splitting means
divisibility; so in the indivisible case, we have an irreducible
representation. Thus particles are described by irreducible unitary
reps of the Poincare group. Additional parameters characterizing the
irreducible representation of an internal symmetry group = gauge
On the other hand, not all irreducible unitary reps of the Poincare
group qualify. Associated with the rep must be a consistent and causal
free field theory. As explained in Volume 1 of Weinberg's book on
quantum field theory, this restricts the rep further to those with
positive mass, or massless reps with quantized helicity.
-----------------------------------
S2d. Forms of relativistic dynamics
-----------------------------------
Relativistic multiparticle mechanics is an intricate subject,
and there are no-go theorems that imply that the most plausible
possibilities cannot be realized. However, these no-go theorems
depend on assumptions that, when questioned, allow meaningful
solutions. The no-go theorems thus show that one needs to be careful
not to introduce plausible but inappropriate intuition into the
formal framework.
-------------------------------------------------------------
S2e. Is there a multiparticle relativistic quantum mechanics?
-------------------------------------------------------------
In his QFT book, Weinberg says no, arguing that there is no way to
implement the cluster separation property. But in fact there is:
There is a big survey by Keister and Polyzou on the subject
B.D. Keister and W.N. Polyzou,
Relativistic Hamiltonian Dynamics in Nuclear and Particle Physics,
in: Advances in Nuclear Physics, Volume 20,
(J. W. Negele and E.W. Vogt, eds.)
Plenum Press 1991.
www.physics.uiowa.edu/~wpolyzou/papers/rev.pdf
that covered everything known at that time. This survey was quoted
at least 116 times, see
http://www.slac.stanford.edu/spires/find/hep?c=ANUPB,20,225
Looking these up will bring you close to the state of the art
on this.
They survey the construction of effective few-particle models.
There are no singular interactions, hence there is no need for
renormalization.
The models are _not_ field theories, only Poincare-invariant few-body
dynamics with cluster decomposition and phenomenological terms
which can be matched to approximate form factors from experiment or
some field theory. (Actually many-body dynamics also works, but the
many particle case is extremely messy.)
They are useful phenomenological models, but somewhat limited;
for example, it is not clear how to incorporate external fields.
The papers by Klink at
http://www.physics.uiowa.edu/~wklink/
and work by Polyzou at
http://www.physics.uiowa.edu/~wpolyzou/
contain lots of multiparticle relativistic quantum mechanics,
applied to real particles. See also the Ph.D. thesis by Krassnigg at
http://physik.uni-graz.at/~ank/dissertation-f.html
(Other work in this direction includes Dirac's many-time quantum
theory, with a separate time coordinate for each particle; see, e.g.,
Marian Guenther, Phys Rev 94, 1347-1357 (1954)
and references there. Related multi-time work was done under the
name of 'proper time quantum mechanics' or 'manifestly covariant
quantum mechanics', see, e.g.,
L.P. Horwitz and C. Piron, Helv. Phys. Acta 46 (1973) 316,
but it does not reproduce standard physics, and apparently never
reached a stage useful to phenomenology.)
Note that in the working single-time approaches, covariance is always
achieved through a representation of the Poincare group on a
Hilbert space corresponding to a fixed time (or another 3D manifold in
space-time), rather than through multiple times.
Thus the whole theory has a single time only, whose dynamics is
generated by the Hamiltonian, the generator H=P_0 of the Poincare group.
(This is completely analogous to the nonrelativistic case,
where multiparticle systems also have a single time only.)
The natural manifestly covariant picture is that of a vector bundle
on Minkowski space-time, with a standard Fock space attached to each
point. An observer (i.e., formally, an orthonormal frame attached at
some space-time point) moves in space-time via the Poincare group,
and this action extends to the bundle by means of the representation
defining the Fock space.
----------------------
S2f. What is a photon?
----------------------
According to quantum electrodynamics, the most accurately verified
theory in physics, a photon is a single-particle excitation of the
free quantum electromagnetic field. More formally, it is a state of
the free electromagnetic field which is an eigenstate of the photon
number operator with eigenvalue 1.
The pure states of the free quantum electromagnetic field
are elements of a Fock space constructed from 1-photon states.
A general n-photon state vector is an arbitrary linear combination
of tensor products of n 1-photon state vectors; and a general pure
state of the free quantum electromagnetic field is a sum of n-photon
state vectors, one for each n. If only the 0-photon term contributes,
we have the dark state, usually called the vacuum; if only the
1-photon term contributes, we have a single photon.
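The difference between these states and a typical beam shows up in
the photon number statistics. The sketch below makes the simplifying
assumption of a single mode, so that a state is just a list of
amplitudes c_n for the n-photon terms; the coherent state used for
the beam and the cutoff are illustrative choices.

```python
import math

def number_statistics(amplitudes):
    """Mean and variance of the photon number for amplitudes c_n."""
    weights = [abs(c) ** 2 for c in amplitudes]
    total = sum(weights)
    mean = sum(n * w for n, w in enumerate(weights)) / total
    var = sum((n - mean) ** 2 * w for n, w in enumerate(weights)) / total
    return mean, var

def coherent_amplitudes(alpha, cutoff=60):
    """c_n = exp(-|alpha|^2/2) alpha^n / sqrt(n!), truncated at the cutoff."""
    return [math.exp(-abs(alpha) ** 2 / 2) * alpha ** n
            / math.sqrt(math.factorial(n)) for n in range(cutoff)]

vacuum = [1.0]                  # only the 0-photon term contributes
single_photon = [0.0, 1.0]      # only the 1-photon term contributes
beam = coherent_amplitudes(2.0)

for name, state in (("vacuum", vacuum), ("single photon", single_photon),
                    ("coherent beam", beam)):
    print(name, number_statistics(state))
```

The vacuum and the single photon have a sharp photon number
(variance zero), while the coherent state has mean and variance both
equal to |alpha|^2 and hence is not an eigenstate of the photon
number operator.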
A single photon has the same degrees of freedom as a classical vacuum
radiation field. Its shape is characterized by an arbitrary nonzero
real 4-potential A(x) satisfying the free Maxwell equations, which in
the Lorentz gauge take the form
nabla dot nabla A(x) = 0,
nabla dot A(x) = 0,
expressing the zero mass and the transversality of photons. Thus for
every such A there is a corresponding pure photon state |A>.
Here A(x) is _not_ a field operator but a photon amplitude;
photons whose amplitudes differ by an x-independent phase factor are
the same. For a photon in the normalized state |A>, the observable
electromagnetic field expectations are given by the usual formulas
relating the 4-potential and the fields,
<\E(x)> = <A|\E(x)|A>
= - partial \A(x)/partial x_0 - c nabla_\x A_0(x),
and
<\B(x)> = <A|\B(x)|A> = nabla_\x x \A(x)
[hmmm. check if this really is the case...]
Here \x (fat x) and x_0 are the space part and the time part of a
relativistic 4-vector, \E(x), \B(x) are the electromagnetic
field operators (related to the operator 4-potential by analogous
formulas), and c is the speed of light. Amplitudes A(x) producing
the same \E(x) and \B(x) are equivalent and related by a gauge
transformation, and describe the same photon.
-------------------------------------------------
S2g. Particle positions and the position operator
-------------------------------------------------
The standard probability interpretation for quantum particles
is based on the Schr"odinger wave function psi(x), a square integrable
single- or multicomponent function of position x in R^3.
Indeed, with ^* denoting the conjugate transpose,
rho(x) := psi(x)^*psi(x)
is generally interpreted as the probability density to find (upon
measurement) the particle at position x. Consequently,
Pr(Z) := integral_Z dx |psi(x)|^2
is interpreted as the probability of the particle being in the open
subset Z of position space. Particles in highly localized states
are then given by wave packets which have no appreciable size
|psi(x)| outside some tiny region Z.
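A numerical sketch of Pr(Z), assuming for simplicity one space
dimension instead of R^3 and a Gaussian wave packet of width sigma
(both are illustrative choices), reads as follows.

```python
import math

def psi(x, sigma=1.0):
    """Normalized Gaussian wave packet in one dimension."""
    prefactor = (2 * math.pi * sigma ** 2) ** -0.25
    return prefactor * math.exp(-x * x / (4 * sigma ** 2))

def probability(a, b, steps=20000):
    """Pr(Z) for Z = (a,b): integral of |psi(x)|^2 by the midpoint rule."""
    h = (b - a) / steps
    return h * sum(abs(psi(a + (i + 0.5) * h)) ** 2 for i in range(steps))

print(probability(-1.0, 1.0))     # within one sigma of the center
print(probability(-10.0, 10.0))   # essentially the whole packet
```

Here |psi(x)|^2 is a normal density with standard deviation sigma, so
the first value can be compared with erf(1/sqrt(2)), and the second
is close to 1, reflecting that the packet has no appreciable size
outside a region of a few sigma.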
If the position representation in the Schr"odinger picture exists,
there is also a vector-valued position operator x, whose components
act on psi(x) by multiplication with x_j (j=1,2,3). In particular,
the components of x commute, satisfy canonical commutation relations
with the conjugate momentum
p = -i hbar partial_x,
and transform under rotations like a 3-vector, so that the commutation
relations with the angular momentum J take the form
[J_j,x_k] = i eps_{jkl} x_l.
Moreover, in terms of the (unnormalizable) eigenstates |x,m> of the
position operator corresponding to the spectral value x (and a label m
to distinguish multiple eigenstates) we can recover the position
representation from an arbitrary representation by defining psi(x)
to be the vector with components
psi_m(x) := <x,m|psi>.
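The canonical commutation relation between x and p = -i hbar partial_x
mentioned above can be checked on polynomial test functions. The
following is only a sketch in one dimension; the coefficient-list
representation and the names mul_x, apply_p and commutator_xp are
assumptions of this illustration, and units with hbar = 1 are used.

```python
HBAR = 1.0  # units with hbar = 1 (assumption made for the illustration)

def mul_x(c):
    """Multiply a polynomial (c[k] is the coefficient of x^k) by x."""
    return [0j] + list(c)

def apply_p(c):
    """Apply p = -i hbar d/dx to a polynomial given by its coefficients."""
    return [-1j * HBAR * k * c[k] for k in range(1, len(c))] or [0j]

def commutator_xp(c):
    """Return the coefficients of (x p - p x) applied to the polynomial c."""
    xp = mul_x(apply_p(c))
    px = apply_p(mul_x(c))
    n = max(len(xp), len(px))
    xp += [0j] * (n - len(xp))
    px += [0j] * (n - len(px))
    return [a - b for a, b in zip(xp, px)]

# Test function 1 + 2x + 3x^2; the result should be i*hbar times it.
result = commutator_xp([1, 2, 3])
```

On every polynomial the commutator acts as multiplication by i hbar,
which is the one-dimensional form of the relation used in the text.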
Therefore, if we have a quantum system defined in an arbitrary
Hilbert space in which a momentum operator is defined, the necessary
and sufficient condition for the existence of a spatial probability
interpretation of the system is the existence of a position operator
with commuting components which satisfy standard commutation
relations with the components of the momentum operator and the
angular momentum operator.
Thus we have reduced the existence of a probability interpretation
for particles in a bounded region of space to the question of the
existence of a position operator with the right properties.
Theorem.
An irreducible representation of the full Poincare group with
mass m>=0 and finite spin s has a position operator transforming
like a 3-vector and satisfying the canonical commutation relations
if and only if either m>0 or m=0 and s<=1/2 (but s=0 if only
the connected Poincare group is considered).
This theorem was announced without giving details in
T.D. Newton and E.P. Wigner,
Localized states for elementary systems,
Rev. Mod. Phys. 21 (1949), 400-406.
A mathematically rigorous proof was given in
A. S. Wightman,
On the Localizability of Quantum Mechanical Systems,
Rev. Mod. Phys. 34 (1962), 845-872.
See also
T.F. Jordan,
Simple proof of no position operator for quanta with zero mass
and nonzero helicity,
J. Math. Phys. 19 (1978), 1382-1385.
who also considers the massless representations of continuous spin,
and
D Rosewarne and S Sarkar,
Rigorous theory of photon localizability,
Quantum Opt. 4 (1992), 405-413.
For spin 1, the case relevant for photons, we have d=3, and the
subspace of interest is the space H obtained by completion of the
space of all vector-valued C^infty functions A(p) of a nonzero
3-momentum p with compact support satisfying the transversality
condition p dot A(p)=0,
with inner product defined by
<A|A'> := integral dp/|p| A(p)^* A'(p).
It is not difficult to see that one can identify the wave functions
A(p) with the Fourier transform of the vector potential in the
radiation gauge where its 0-component vanishes. This relates the
present discussion to that given in the FAQ entry ''What is a photon?''.
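The transversality condition p dot A(p)=0 can be enforced on an
arbitrary vector-valued function by projecting out its component along
p. A minimal sketch, at a single momentum, follows; the function names
transverse_part and dot are assumptions of this illustration and do
not appear in the references above.

```python
def dot(u, v):
    """Euclidean inner product of two real 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def transverse_part(p, a):
    """Project the 3-vector a onto the plane orthogonal to nonzero p.

    Implements a - p (p.a)/|p|^2, so the result satisfies p . A = 0.
    """
    p2 = dot(p, p)
    if p2 == 0.0:
        raise ValueError("the momentum p must be nonzero")
    s = dot(p, a) / p2
    return [ai - s * pi for pi, ai in zip(p, a)]

p = [1.0, 2.0, 2.0]
A = transverse_part(p, [3.0, 0.0, 1.0])  # now p . A vanishes up to rounding
```

The projection removes one of the three components at each p, which
is why only two independent (transverse) polarizations remain. The
excluded point p=0 is where the projector is undefined.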
----------------------------------------
S2h. Localization and position operators
----------------------------------------
Position operators are part of the toolkit of relativistic quantum
mechanics.
In a relativistic setting, one always has a representation of the
Poincare algebra. From the generators of the Poincare algebra
(namely the 4-momentum p, the angular momentum \J, and the
boost generators \K) one can make up (in massive representations)
a nonlinear expression for a 3-dimensional \x (the position operator)
that together with the space part \p of the 4-momentum has canonical
commutation rules and hence gives a Heisenberg algebra.
(The backslash is a convenient ascii notation to indicate bold face
letters, corresponding to 3-vectors.)
The position operator so constructed is unique, once the time coordinate
is fixed, and is usually called the Newton-Wigner position operator,
although it appears already in earlier work of Pryce. Relevant
applications are related to the names Foldy and Wouthuysen
(for their transform of the Dirac equation, widely used in relativistic
quantum chemistry) and Bakamjian and Thomas (for their relativistic
multi-particle theories); both groups rediscovered the Newton-Wigner
results independently, not being aware of their work.
That the time coordinate has to be fixed means that the position
operator is observer-dependent. Each observer splits space-time
into its personal time (in direction of its total 4-momentum) and
personal 3-space (orthogonal to it), and the position operator
relates to this 3-space. By a Lorentz transformation, one can
transform the 4-momentum to the vector (E_obs 0 0 0), which makes time
the 0-component. Most papers on the subject work in the latter setting.
For massless representations of spin >1/2, the construction breaks down.
This is related to the fact that massless particles with spin >1/2
don't have modes of all helicities allowed by the spin
(e.g., photons have spin 1 but no longitudinal modes),
which makes them always spread out, and hence not completely
localizable. For details, see the FAQ entry
''Particle positions and the position operator''.
------------------------------------------------------------
S2i. Position operators in relativistic quantum field theory
------------------------------------------------------------
In relativistic quantum field theory in its usually given form,
position is promoted to the same status as time, and hence becomes a
parameter in the quantum field, while in quantum mechanics it is an
operator vector.
This poses the question of whether there is a position operator in
relativistic quantum field theory. Many people think that there is none.
But even though there is a parameter called x and referred to as
4-dimensional position, there is also a vector defining a
3-dimensional position operator, provided the relativistic system
under consideration is not massless.
Indeed, any relativistic theory possesses the Poincare group as a
symmetry group, whose infinitesimal generators satisfy the standard
commutation rules of the Poincare algebra. But given these, the
standard construction by Newton and Wigner gives (in each Lorentz
frame) a 3-dimensional position operator with commuting components,
and the associated conjugate momentum operators. (See Section S2g
''Particle positions and the position operator'' of this FAQ.)
These play exactly the same role as the position and momentum
operators in nonrelativistic quantum mechanics.
------------------------------------------
S2j. Coherent states of light as ensembles
------------------------------------------
Let us look in some detail at the setting of a weak laser switched on
at time t_0 and switched off again at time t_1. The time T:=t_1-t_0
that the laser is switched on is a variable that we can choose at will.
Conventionally one models the light produced by a laser by coherent
states. If one tests the photon contents at the end of the beam by a
photodetector, one measures a series of clicks indicating (according to
tradition) the presence of single photons. Each click is conventionally
regarded as the measurement of a single photon; hence one measures an
ensemble of photons. Without this interpretation, much of the talk
about photons in quantum optics would not make sense.
---------------------------------------------
S3a. What are 'bare' and 'dressed' particles?
---------------------------------------------
A bare electron is the formal entity discussed in textbooks
when they do perturbative quantum electrodynamics. The intuitive
picture generally given is that a bare electron is surrounded
by a cloud of virtual photons and virtual electron-positron pairs
to make up a physical, 'dressed' electron. Only the latter is real
and observable. The former is a formal caricature of the latter,
with paradoxical properties (infinite mass, etc.).
On a more substantial level, the observable electrons are produced
from the bare electrons by a process called renormalization,
which modifies the propagators by self-energy terms
and the currents by form factors. As the name says, the latter define
the 'form' of a particle. (In the above picture, it would correspond
to the shape of the virtual cloud, though it is better to avoid
giving the virtual particles too much meaning.)
The dressed object is the renormalized, physical object,
described perturbatively as the bare object 'clothed' by the
cloud of virtual particles. The dressed interaction is the 'screened'
physical interaction between these dressed objects.
To draw an analogy in nonrelativistic quantum mechanics
think of nuclei as bare atoms, electrons as virtual particles,
atoms as dressed nuclei and the residual interaction between atoms,
computed in the Born-Oppenheimer approximation, as the dressed
interaction. Thus, for Argon atoms, the dressed interaction is
something close to a Lennard-Jones potential, while the bare
interaction is Coulomb repulsion. This is the situation physicists
had in mind when they invented the notions of bare and dressed
particles.
Of course, it is only an analogy, and should not be taken very
seriously. It just explains the intuition about the terminology used.
(For the serious version of renormalization, see Chapter 8.)
The electrons in QM are real, physical electrons that can be isolated.
The reason is that they are good eigenstates of the Hamiltonian.
On the other hand, virtual particles don't have this nice attribute
since the relativistic Hamiltonian H from field theory contains
creation and annihilation operators which mess things up.
The bare particles correspond to 1-particle states in the Hilbert
space (though that is not quite true since there is no good Hilbert
space picture in conventional interacting QFT). Multiplying them
with H introduces terms with other particle numbers, hence a bare
particle can never be an eigenstate of H, and thus never be
observable in the way a nonrelativistic particle is.
The eigenstates of the relativistic Hamiltonian are, instead,
complicated multibody states consisting
of a superposition of states with any number of particles and
antiparticles, just subject to the restriction that the total quantum
numbers come out right. These are the dressed particles.
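The statement that a bare state cannot be an eigenstate once H couples
it to sectors with other particle numbers can be illustrated by a
two-level caricature. This is a toy model only, not a field theory:
the 2x2 Hamiltonian, the function name dressed_state, and the
parameters bare_energy, other_energy and coupling are assumptions made
for the illustration.

```python
import math

def dressed_state(bare_energy, other_energy, coupling):
    """Lowest eigenpair of the 2x2 toy Hamiltonian [[e1, g], [g, e2]].

    Basis: |bare 1-particle state>, |state with another particle number>.
    Returns (energy, (c1, c2)) with a normalized eigenvector.
    """
    if coupling == 0.0:
        # No mixing: the basis states themselves are the eigenstates.
        if bare_energy <= other_energy:
            return bare_energy, (1.0, 0.0)
        return other_energy, (0.0, 1.0)
    mean = 0.5 * (bare_energy + other_energy)
    half_gap = 0.5 * (bare_energy - other_energy)
    energy = mean - math.hypot(half_gap, coupling)
    # Unnormalized eigenvector (g, energy - e1), then normalize.
    v = (coupling, energy - bare_energy)
    norm = math.hypot(v[0], v[1])
    return energy, (v[0] / norm, v[1] / norm)

energy, (c_bare, c_other) = dressed_state(1.0, 3.0, 0.5)
```

For any nonzero coupling the eigenvector has a nonzero component
c_other, so the eigenstate is a superposition and its energy is
shifted away from the bare value. In the field theory the analogous
superposition involves infinitely many sectors.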
For the computational side of dressing, see, e.g., nucl-th/0102037,
or http://www.geocities.com/meopemuk/
------------------------------------------------
S3b. How meaningful are single Feynman diagrams?
------------------------------------------------
The standard model is a theory defined in terms of a Lagrangian.
To get computable output, Feynman graph techniques are used.
But individual Feynman graphs are meaningless (often infinite);
only the sum of all terms of a given order can be given - after
a process called renormalization - a well-defined (finite) meaning.
This is well-known; so no-one treats the Feynman graphs as real.
What is taken as real is the final outcome of the calculations,
which can be compared with measurements.
--------------------------------------
S3c. How real are 'virtual particles'?
--------------------------------------
Virtual particles are used in perturbation theory with
Feynman diagrams. (See the FAQ entry ''Why Feynman diagrams''
for an explanation of their meaning. They do _not_ describe
processes in space and time, but certain multiple integrals...)
Feynman diagrams change their nature depending on the way
one does perturbation theory and what is resummed.
In their treatise on QED, Landau and Lifshitz discuss virtual particles
in Section 79. They start at the outset with the remark that things
depend on which kind of perturbation theory is used, and contrast
'virtual' explicitly with 'real'. Virtual particles are called that
in contrast to 'real particles' which are observable and hence real.
Unlike the latter, virtual particles occurring in computations _must_
have disappeared from the formulas by the time the calculations lead
to something that can be compared with experiment.
Hence their 'reality', if there is any, is like the reality of
characters in a dream. For example, just as we can fly in a dream,
virtual particles can be faster than light (since they may have
imaginary mass)...
The following is a more detailed discussion of the question how
meaningful it is to ascribe some sort of reality to virtual particles.
-------------------------------------------------------
S3d. What is the meaning of 'on-shell' and 'off-shell'?
-------------------------------------------------------
This applies only to relativistic particles.
A particle of mass m is on-shell if its momentum p satisfies
p^2 (= p_0^2-p_1^2-p_2^2-p_3^2) = m^2,
and off-shell otherwise. The 'mass shell' is the manifold of
momenta p with p^2=m^2.
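The mass shell condition is easy to test for a given 4-momentum.
The sketch below uses units with c = 1 and the signature of the
formula above; the function names, the tolerance, and the sample
momenta are assumptions of this illustration.

```python
def minkowski_square(p):
    """p^2 = p_0^2 - p_1^2 - p_2^2 - p_3^2 (signature +---, c = 1)."""
    p0, p1, p2, p3 = p
    return p0 * p0 - p1 * p1 - p2 * p2 - p3 * p3

def is_on_shell(p, m, tol=1e-9):
    """True if the 4-momentum p lies on the mass shell p^2 = m^2."""
    return abs(minkowski_square(p) - m * m) <= tol * max(1.0, m * m)

electron_mass = 0.511                   # MeV, rounded, for illustration only
p_rest = (electron_mass, 0.0, 0.0, 0.0)  # particle at rest: on-shell
p_light = (5.0, 3.0, 0.0, 4.0)           # p^2 = 0: on the massless shell
p_spacelike = (0.0, 0.3, 0.0, 0.0)       # p^2 < 0: off every real mass shell
```

A spacelike momentum such as p_spacelike has p^2 < 0 and therefore
cannot satisfy p^2 = m^2 for any real mass m.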
Observable (i.e., physical) particles are asymptotic states
(scattering states) described (modulo unresolved mathematical
difficulties) by free fields based on the dispersion relation p^2=m^2,
and hence are necessarily on-shell. Off-shell particles only
arise in intermediate perturbative calculations; they are necessarily
'virtual'.
The situation is muddled by the fact that one has to distinguish
(formal) bare mass and (physical) dressed mass; the above is valid
only for the dressed mass. Moreover, the mass shell loses its meaning
in external fields, where, instead, a so-called 'gap equation'
appears.
----------------------------------------------
S3e. Virtual particles and Coulomb interaction
----------------------------------------------
Virtual objects have strange properties. For example,
the Coulomb interaction between two electrons is mediated by
virtual photons faster than the speed of light, with imaginary masses.
(This is often made palatable by invoking a time-energy uncertainty
relation, which would allow particles to go off-shell.
But there is no time operator in QFT, so the analogy to Heisenberg's
uncertainty relation for position and momentum is highly dubious.)
Strictly speaking,
the Coulomb interaction is simply the Fourier transform of the
photon propagator 1/q^2, followed by a nonrelativistic approximation.
It has nothing at all to do with virtual particle exchanges ---
except if one does perturbation theory. But then there is no surprise
that it must influence already the tree level. By a hand waving
argument (equate the Born approximations) this gives the
nonrelativistic correspondence.
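The Fourier transform relation between 1/q^2 and the Coulomb potential
can be checked numerically in the opposite direction. The sketch
below transforms the screened potential exp(-mu r)/(4 pi r) and
compares with 1/(q^2+mu^2), which tends to 1/q^2 as mu -> 0. The
screening parameter mu is an assumption introduced here only to make
the integral converge; coupling constants are omitted, and the
function name yukawa_ft is hypothetical.

```python
import math

def yukawa_ft(q, mu, n=200000, r_max_factor=50.0):
    """3D Fourier transform of V(r) = exp(-mu r)/(4 pi r) at |q| = q > 0.

    After the angular integration the transform reduces to
    (1/q) * int_0^inf sin(q r) exp(-mu r) dr, evaluated here by the
    midpoint rule on [0, r_max_factor/mu].
    """
    r_max = r_max_factor / mu
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += math.sin(q * r) * math.exp(-mu * r)
    return h * total / q

numeric = yukawa_ft(1.0, 0.5)
exact = 1.0 / (1.0**2 + 0.5**2)   # 1/(q^2 + mu^2)
```

Nothing in this computation refers to exchanged particles; it is a
statement about a propagator and its Fourier transform only.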
But to get the Coulomb interaction as part of the Schroedinger equation,
one needs to sum all ladder diagrams with 0,1,2,3,...,n,... exchanged
photons arranged in form of a ladder. Then one needs to approximate
the resulting Bethe-Salpeter equation. These are nonperturbative
techniques. (The computations are still done at few loops only,
which means that questions of convergence never enter.)
Virtual photons mediating the Coulomb repulsion between electrons
have spacelike momenta and hence would proceed faster than light
if there were any reality to them. But there cannot be; one would need
infinitely many of them, and infinitely many virtual electron-positron
pairs (and then superpositions of any numbers of these) to match exactly
a real, dressed object or interaction.
-----------------------------------------------------------
S3f. Are virtual particles and decaying particles the same?
-----------------------------------------------------------
Decaying particles and resonances are used synonymously in the
literature; they are complementary views of the same unstable state.
A very sharp resonance has a long lifetime relative to a scattering
event, hence behaves like a particle in scattering. It is regarded
as a real object if it lives long enough that its trace in a wire
chamber is detectable, or if its decay products are detectable at
places significantly different from the place where it was created.
On the other hand, a very broad resonance has a very short lifetime
and cannot be differentiated well from the scattering event producing
it; so the idealization defining the scattering event is no longer
valid, and one would not regard the resonance as a particle.
Of course, there is an intermediate grey regime where different people
apply different judgment. This can be seen, e.g., in discussions
concerning the tables of the Particle Data Group.
The only difference between a short-living particle and a stable
particle is the fact that the stable particle has a real rest mass,
while the mass m of the resonance has a small imaginary part.
Note that states with complex masses can be handled well in a rigged
Hilbert space (= Gelfand triple) formulation of quantum mechanics.
Resonances appear as so-called Siegert (or Gamow) states.
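How an imaginary part of the mass encodes a finite lifetime can be
seen from the rest-frame time dependence exp(-i m t). The sketch below
assumes the common convention m = M - i Gamma/2 with width Gamma and
units hbar = c = 1; this convention, the function names, and the
sample numbers are assumptions of the illustration.

```python
import cmath
import math

HBAR_GEV_S = 6.582119569e-25  # reduced Planck constant in GeV s

def survival_probability(t, mass, width):
    """|exp(-i m t)|^2 for the complex mass m = mass - i*width/2."""
    m = complex(mass, -0.5 * width)
    return abs(cmath.exp(-1j * m * t)) ** 2

def lifetime_seconds(width_gev):
    """Mean lifetime tau = hbar / Gamma for a rest-frame width in GeV."""
    return HBAR_GEV_S / width_gev

prob = survival_probability(2.0, 10.0, 0.5)  # equals exp(-width * t)
tau = lifetime_seconds(2.5)                  # a width of a few GeV
```

The real part of m only contributes a phase, while the imaginary part
gives the exponential decay law exp(-Gamma t). A width of a few GeV
corresponds to a lifetime of order 10^-25 s, far too short to leave a
track, while a sharp resonance (small Gamma) lives correspondingly
longer.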
A good reference on resonances (not well covered in textbooks) is
V.I. Kukulin et al.,
Theory of Resonances,
Kluwer, Dordrecht 1989.
For rigged Hilbert spaces (treated in Appendix A of Kukulin), see also
quant-ph/9805063 and for its functional analysis ramifications,
K. Maurin,
General Eigenfunction Expansions and Unitary Representations of
Topological Groups,
PWN Polish Sci. Publ., Warsaw 1968.
But a very short-living particle is not the same as a virtual
particle. Often it is a complicated, nearly bound state of other
particles. On the other hand, virtual particles are essentially always
elementary. (There are exceptions when deriving Bethe-Salpeter equations
and the like for the approximate calculations of bound states and
resonances, where one creates an effective theory in which the latter
are treated as elementary.)
Even an unstable elementary particle can be distinguished from
a virtual particle. In perturbation theory, unstable elementary
particles are modelled exactly like stable particles,
namely as external lines in a Feynman diagram.
Virtual particles in Feynman diagrams are exactly those parts
of the diagram which are not given by external lines.
In particular, what is real and what is virtual is not affected
by a diagram rotation - this only affects what is input
and what is output.
The difference can also be seen in the mathematical representation.
In an effective theory where the resonance (e.g., the neutron or a
meson) is regarded as an elementary object, the resonance appears
in in/out states as a real particle, with complex on shell momentum
satisfying p^2=m^2, but in internal Feynman diagrams as a virtual
particle with real mass, almost always off-shell, i.e., violating
this equation.
There are also some unstable elementary particles like the weak
gauge bosons. Usually, one observes a 4-fermion interaction and the
gauge bosons are virtual. But at high energy = very short scales,
one can in principle observe the gauge bosons and make them real.
This means that they now appear as external lines in the corresponding
perturbative calculations, which displays their nonvirtual nature.
In any case, from a mathematical point of view, one must choose the
framework. Either one works in a Hilbert space, then masses are real
and there are no unstable particles (since these 'are' poles on the
so-called 'unphysical' sheet); in this case, there are no asymptotic
gauge bosons and all are therefore virtual.
Or one works in a rigged Hilbert space and deforms the inner product;
this makes part of the 'unphysical' sheet visible; then the gauge
bosons have complex masses and there exist unstable particles
corresponding to in/out gauge bosons which are real.
The modeling framework therefore decides which language is appropriate.
------------------------------------------
S4a. How do atoms and molecules look like?
------------------------------------------
Today, images of single atoms and molecules can be routinely produced.
M. Herz, F.J. Giessibl and J. Mannhart
Probing the shape of atoms in real space
Phys. Rev. B 68, 045301 (2003)
http://prola.aps.org/pdf/PRB/v68/i4/e045301
write in the introduction:
''quantum mechanics specifies the probability of finding an electron
at position x relative to the nucleus. This probability is
determined by |psi(x)|^2, where psi(x) is the wave function of the
electron given by Schroedinger's equation. The product of -e and
|psi(x)|^2 is usually interpreted as charge density, because the
electrons in an atom move so fast that the forces they exert on
other charges are essentially equal to the forces caused by a
static charge distribution -e|psi(x)|^2.''
One of the authors, Jochen Mannhart, is one of the 10 winners of the
Leibniz prize 2008,
http://www.dfg.de/aktuelles_presse/preise/leibniz_preis/2008/
among others for the achievement that, for the first time, he made
pictures of atoms with subatomic resolution possible.
The Leibniz prize is the highest German academic prize, endowed with
a research grant of up to 2.5 Million Euro for each winner,
awarded each year to a few excellent younger scientists from all
sciences.
The orbitals one can look at in physics and chemistry books
are the pictures of the squared absolute values of basis functions
used for representing single electron wave functions.
The actual shape of the wave function of each electron is some linear
combination of such basis functions. These are calculated (in the
simplest realistic approximation) by Hartree-Fock calculations.
The atom shape is the shape of all electrons together, forming
in the Hartree-Fock approximation a Slater determinant formed from the
single-particle wave functions, and in general a linear combination
of such Slater determinants. These live in a multidimensional space
with 3n dimensions for an atom with n electrons.
The shape one can measure is actually a 3-dimensional charge density
rho(x) (x in R^3) formed by integrating the square of the absolute
value of the 3n-dimensional wave function psi over 3n-3 dimensions.
More precisely, it is defined (nonrelativistically) such that
(apart from a constant factor and the charge contribution of the
nucleus)
integral dx rho(x) f(x) = psi^* O_1(f) psi (1)
for all nice 3-dimensional functions f(x) of the space coordinate
vector x, where
O_1(f) = integral dx f(x) a^*(x) a(x)
is the 1-particle operator corresponding to f. Here a^* and a denote
creation and annihilation operators. Since rho(x) decays quickly as x
differs more and more from the atom center, the atom looks like a
charge cloud with slightly fuzzy boundary.
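The reduction of a many-electron wave function to a density on
ordinary space can be sketched in a toy setting. The example assumes
two spinless fermions in a one-dimensional box [0,1], so that the
wave function lives on a 2-dimensional rather than a 6-dimensional
space; the orbitals and all function names are assumptions of this
illustration.

```python
import math

def phi(k, x):
    """k-th normalized orbital of a particle in the box [0, 1] (toy basis)."""
    return math.sqrt(2.0) * math.sin(k * math.pi * x)

def psi(x1, x2):
    """Two-particle Slater determinant of orbitals 1 and 2 (spin ignored)."""
    return (phi(1, x1) * phi(2, x2) - phi(2, x1) * phi(1, x2)) / math.sqrt(2.0)

def density(x, n=400):
    """rho(x) = 2 * int |psi(x, x2)|^2 dx2, by the midpoint rule.

    The factor 2 counts the particles; all coordinates but one are
    integrated out.
    """
    h = 1.0 / n
    return 2.0 * h * sum(psi(x, (j + 0.5) * h) ** 2 for j in range(n))

rho_at_point = density(0.3)
```

For a single Slater determinant of orthonormal orbitals the result is
the sum of the squared orbitals, here phi(1,x)^2 + phi(2,x)^2, and the
density integrates to the number of particles.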
For isolated atoms in the absence of external fields,
rho is typically spherically symmetric, giving symmetric shapes.
(In case of particles of nonzero spin, this assumes
that we are in a thermal setting where the spin directions average out.
In this case, we have instead of (1) the formula
integral dx rho(x) f(x) = tr O_1(f) rhohat,
where rhohat is the density matrix of the mixed state.)
For molecules, rho is in fact also a function of the coordinates of
all nuclei involved, and there is no longer any reason to have more
symmetry than the symmetry of the configuration of nuclei,
which is very little and often none.
The shape of molecules is therefore mainly determined by the geometry
of the positions of the nuclei. In equilibrium, these arrange
themselves such that the potential energy, i.e., the smallest
eigenvalue of the Hamiltonian operator for the electrons is minimal
among all other positions (or at least a local minimum from which a
deeper lying state is very difficult to reach). The charge density
of molecules can be identified by means of X-ray crystallography or
nuclear magnetic resonance (NMR) spectroscopy; however, for complex
molecules, doing this reliably from the available indirect information
is a highly nontrivial art.
A few years ago,
I wrote a survey of molecular modeling of proteins, the largest
molecules in nature (apart from crystals, which are essentially
molecules of macroscopic size):
A. Neumaier,
Molecular modeling of proteins and mathematical prediction of
protein structure,
SIAM Review 39 (1997), 407-460.
http://www.mat.univie.ac.at/~neum/papers/physpapers.html#protein
--------------------------------------------------
S4b. Why are observable densities state-dependent?
--------------------------------------------------
In the preceding, the mass and charge densities of an n-particle system
(or of a single particle) depend on its quantum state. This is
sometimes regarded as a reason for denying the 'reality' of the
mass and charge density. However, such a reasoning is misguided.
Indeed, the phenomenon is already present in classical mechanics.
That the mass and charge densities depend on the state is no more
surprising than that the trajectory of a classical particle depends
on its classical state (its position and momentum), or that the
density of a cloud in the sky depends on its classical state
(the position and momentum of all its particles, or, in the customary
fluid mechanics approximation, its mass density field and its velocity
field).
Of course it has to, to match a particular real life situation.
What seems strange at first sight is that the above applies already to
a single, indivisible particle. But this is really strange only if one
assumes that the particle is pointlike - which we know is the case only
for unphysical, bare particles, but not for the physical, renormalized
ones. (See the entry ''Are electrons pointlike/structureless?''
elsewhere in this FAQ.) Once one realizes that physical particles are
extended (although they are indivisible), there is enough room to
accommodate the internal structure described by densities.
Thus the only quantum paradox that remains is that particles with
nontrivial internal structure (and shape) can nevertheless be
indivisible, a fact coming from the representation theory of the
fundamental symmetry group of our universe: Indivisibility of an
object just means that this object is described by an irreducible
representation which cannot be decomposed further without violating
a fundamental symmetry.
-------------------------------------------
S4c. Are electrons pointlike/structureless?
-------------------------------------------
Both electrons and neutrinos are considered to be pointlike
as bare particles, because of the way they appear in the standard model.
But physical, relativistic particles are not pointlike.
A pointlike electron would be described exactly by the 1-particle
Dirac equation, which has a degenerate spectrum. But the real electron
is described by a modified Dirac equation, resulting in an anomalous
magnetic moment and a nonzero Lamb shift resolving the degeneracy of
the spectrum. Both are measurable to high accuracy.
The relations between form factors for spin 1/2 particles and
terms in a modified Dirac equation describing the covariant dynamics
of a particle deviating from a point particle are given in
L. L. Foldy
The Electromagnetic Properties of Dirac Particles
Phys. Rev. 87 (1952), 688 - 693.
An intuitive argument for the lack of pointlikeness is the fact that
their localization
to a region significantly smaller than the Compton wavelength
would need energies larger than that needed to create
particle-antiparticle pairs, which changes the nature of the system.
(See also this FAQ about localization, and Foldy's papers quoted there.)
On a more formal, quantitative level, the physical, dressed particles
have nontrivial form factors, due to the renormalization necessary to
give finite results in QFT. The form factor measures the deviation
from the behavior of an ideal point particle, i.e., a particle obeying
exactly the Dirac equation. The form factor can be measured
indirectly, through the anomalous magnetic moment and the Lamb shift.
(A point particle has no anomalous magnetic moment and no Lamb shift
since it satisfies the Dirac equation exactly.)
Nontrivial form factors give rise to a positive charge radius.
In his book
S. Weinberg,
The quantum theory of fields, Vol. I,
Cambridge University Press, 1995,
Weinberg defines and explicitly computes in (11.3.33) a formula for the
'charge radius' of a physical electron. But his formula is not
fully satisfying since it is not fully renormalized (infrared
divergence: the expression contains a fictitious photon mass,
and diverges if this goes to zero).
For electron form factors in light atoms, see
hep-ph/0002158 = Physics Reports 342, 63-126 (2001):
Equation (28) uses a binding energy dependent cutoff,
which makes the electron charge radius depend on its surrounding.
Of course, other particles also have form factors and associated
charge radii. For proton and neutron form factors, see hep-ph/0204239
and hep-ph/030305. Neutrons have a negative mean squared charge radius.
This looks strange but is not since the measure for the mean is
not positive; but it means that a classical interpretation of the
charge radius of neutrons is dubious. In the introduction of
S. Kopecky et al
Phys. Rev. C 56, 2229-2237 (1997)
one can read:
''The charge radius of the neutron <r_n^2> or the mean squared charge
radius is described by the volume integral over the neutron
integral rho(r)r^2dr, where r is the distance to the center of
the neutron and rho(r) is the charge density.
Positive as well as negative values of rho(r) will occur coming
from the distributions of valence quarks and the negative pi-meson
cloud outside.
Since rho(r) is negative for larger r values, caused by the meson
cloud, the r^2 dependence of the integral will lead to a negative
value of <r_n^2>.''
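That a neutral charge distribution can have a negative mean squared
radius is easily reproduced in a toy model: a positive core minus a
wider negative cloud, both Gaussian with unit total charge. The
Gaussian shapes, the widths (in arbitrary units, not fitted to any
neutron data), and the function names are assumptions of this sketch.

```python
import math

def gaussian_density(r, a):
    """Spherically symmetric Gaussian of unit total charge and width a."""
    return math.exp(-r * r / (2.0 * a * a)) / ((2.0 * math.pi) ** 1.5 * a**3)

def rho(r, core=0.3, cloud=0.8):
    """Toy neutral density: positive core minus a wider negative cloud."""
    return gaussian_density(r, core) - gaussian_density(r, cloud)

def radial_moment(power, r_max=12.0, n=60000):
    """int d^3r rho(r) r^power = 4 pi int_0^r_max rho(r) r^(power+2) dr."""
    h = r_max / n
    return 4.0 * math.pi * h * sum(
        rho((i + 0.5) * h) * ((i + 0.5) * h) ** (power + 2) for i in range(n)
    )

total_charge = radial_moment(0)        # vanishes for the neutral toy model
mean_square_radius = radial_moment(2)  # 3*(core^2 - cloud^2), negative
```

Since rho is not a positive measure, the weight r^2 emphasizes the
negative outer cloud and the second moment comes out negative,
although nothing has a negative size.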
The paper
L.L. Foldy,
Neutron-electron interaction,
Rev. Mod. Phys. 30, 471-481 (1958).
discusses the extendedness of the electron in a phenomenological way.
On the numerical side, I only found values for the charge radius
of the neutrinos, computed from the standard model to 1 loop order.
The values are about 4-6 10^-14 cm for the three neutrino species.
See (7.12) in Phys. Rev. D 62, 113012 (2000).
The entry
http://adsabs.harvard.edu/cgi-bin/nph-bib_query?1992PhDT.......130L
gives, in an abstract of a 1982 thesis of Anzhi Lai,
an electron charge radius of ~ 10^{-16} cm.
(But I haven't seen the thesis.)
The "form" of an elementary particle (considered as a free particle
at rest) is described by its form factor,
which is a well-defined physical function
(though at present computable only in perturbation theory)
describing how the (spin 0, 1/2, or 1) particle's response to an
external classical electromagnetic field deviates from the
Klein-Gordon, Dirac, or Maxwell equations, respectively.
The form factor contains the complete state-independent information
about a free particle, since it determines the (single-particle)
Hamilton operator of the free particle and everything else can be
computed from it.
In Foldy's paper, the form factors are encoded in the infinite sum
in (16). The sum is usually considered in the momentum domain;
then one simply gets two k-dependent form factors, where k represents
the 4-momentum transferred in the interaction. These form factors
can be calculated in a good approximation perturbatively from QFT,
see for example Peskin and Schroeder's book.
An extensive discussion of form factors of Dirac particles
and their relation to the radial density function is in
D. R. Yennie, M. M. Levy and D. G. Ravenhall,
Electromagnetic Structure of Nucleons,
Rev. Mod. Phys. 29, 144-157 (1957).
and
R. G. Sachs
High-Energy Behavior of Nucleon Electromagnetic Form Factors
Phys. Rev. 126, 2256-2260 (1962)
Yennie et al. write:
''Information about the internal structure of the individual nucleons
is contained in the results of a variety of experiments performed in
recent years. [...] The Lamb shift and the hyperfine splitting also
give such information, [...] The charge-current density of the nucleon
(proton or neutron) includes all of the effects of the internal
structure. [...] The nucleon charge-current density must have the form
<formula involving two form factors F_1 and F_2>
The functions F_1 and F_2 are relativistic generalizations of the form
factors characteristic of finite extension occurring in other
experiments, [...]''
However, the form factor contains nothing at all about
interaction- or state-dependent information since the
interaction-dependent information is coded in an external potential
or a multiparticle formulation, and the state-dependent information
is coded in the wave function or density matrix, which (at any given
time) is independent of the Hamiltonian.
Also, the information contained in the form factor is only about
the free particle in the rest system, defined by a pure state
in which momentum and orbital angular momentum vanish identically.
In an external potential, or in a state where momentum (or orbital
angular momentum) doesn't vanish, the charge density (and the
resulting charge radius) can differ arbitrarily much from the
charge density (and charge radius) at rest.
For example, for a hydrogen electron in the ground state,
the charge density is significant in a region of diameter about
10^-8 cm (a small multiple of the Bohr radius, 0.53 * 10^-8 cm), while
the charge radius at rest is probably (in view of the above partial
results) << 10^-12 cm.
In all cases, the charge distribution is defined as the
expectation of the charge density operator of the corresponding
quantum field. For molecules, this charge distribution is the
computational target of much of quantum chemistry, and defines the
shape of a molecule. The shape of a particle determined by the form
factor therefore corresponds to the equilibrium shape a molecule
takes in its rest frame in the absence of forces, i.e., in its
ground state, while the state-dependent shape corresponds to the
much less predictable shape of a molecule interacting with its
environment.
-------------------------------------------
S4d. How much information is in a particle?
-------------------------------------------
Knowing a particular electron intimately is infinitely precious.
A pure state of an electron is defined by its wave function
(up to a phase). Thus, in the traditional interpretation, knowing all
about an electron requires knowing all about this wave function -
an infinite amount of information.
The information humans are interested in is however always finite,
since they can hardly remember even 20 decimal digits seen only once.
And the amount of information humans are capable of retrieving
by experiment is still limited, since each experiment has only a finite
accuracy.
Thus they simplify things to the point that all they want to know about
an electron is its mass, charge and its state to a small number
of decimal places.
This is only a few bits. But if you want to tell someone else exactly
where the electron is that you are referring to, you have an
infinitely more difficult task. Of course, any human 'else' will not
be patient enough to hear the whole (infinite) story but will be
satisfied with a crude position and momentum
estimate consistent with the uncertainty relation. But this is not the
best possible statement about the electron, which would be telling
its complete wave function. You can do it only if you force the
electron into a prison where it has to behave in a dull (and hence
completely describable) way, being
restricted in its freedom to at most a few bits of change.
This is indeed done when studying qubits for quantum information
processing.
For an N-state system, one needs N^2-1 independent pieces of
information to reconstruct (by quantum tomography) the density matrix
of a finite mixed quantum system, and a fortiori the wave function of
a finite pure quantum system. Most natural systems, unlike those
systems carefully prepared by modern technology, have infinitely many
states, and therefore need an infinite amount of information for their
reconstruction to full accuracy.
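For N=2 (a qubit) the count N^2-1 = 3 can be made concrete. The
following is a minimal sketch, assuming idealized, exactly known
expectation values of the three Pauli matrices (real tomography only
estimates them from finitely many measurements); the function names
and the sample density matrix are illustrative choices, not part of
any particular tomography scheme.

```python
import numpy as np

# Pauli matrices: together with the identity they span all 2x2 Hermitian matrices.
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def expectations(rho):
    """The N^2 - 1 = 3 real numbers <sigma_j> = Tr(rho sigma_j)."""
    return [float(np.real(np.trace(rho @ s))) for s in SIGMA]

def reconstruct(r):
    """Rebuild rho = (1 + sum_j r_j sigma_j) / 2 from the three expectations."""
    rho = np.eye(2, dtype=complex)
    for r_j, s in zip(r, SIGMA):
        rho = rho + r_j * s
    return rho / 2

# An arbitrary mixed qubit state (Hermitian, trace 1, positive definite).
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
r = expectations(rho)
rho_rec = reconstruct(r)
```

Three real numbers determine the state completely; for an N-state
system the same construction needs a basis of N^2-1 traceless
Hermitian matrices.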
------------------------------------
S4e. Entropy and missing information
------------------------------------
[This continues the preceding entry.]
How is this notion of information related to information in terms of
entropy?
Informally, entropy is often equated with information, but this is not
correct - entropy is _missing_ information!
More precisely, in the statistical interpretation, the state belongs
not to a single particle but to an ensemble of particles.
Entropy measures the amount of information missing for a complete
probabilistic description of a system.
Entropy is the mean number of binary questions that must be asked in
an optimal decision strategy to determine the state of a particular
realization given the state of the ensemble to which it belongs.
See Appendix A of my paper
A. Neumaier,
On the foundations of thermodynamics,
arXiv:0705.3790
http://lanl.arxiv.org/abs/0705.3790
The formula for the entropy S found in every
statistical mechanics textbook is, for a system in a mixed state
described by the density matrix rho,
S = -<kbar log rho> where <f> = Tr rho f
and kbar is Boltzmann's constant. (I use the bar to be free to use k
as an index.) In any representation where rho is diagonal,
rho = sum_k p_k |k><k|,
this gives
S = -kbar sum_k p_k log p_k;
also, since <1>=1 and rho is positive semidefinite,
sum_k p_k = 1 , all p_k >= 0.
Thus p_k can be consistently interpreted as the probability of the
system to occupy state |k>. This probability interpretation
depends on the orthonormal basis used to represent rho; which basis
to use is a famous and not really solved problem in the foundations of
quantum mechanics.
For a pure state psi, rho has rank 1, and the sum extends only over
the single index k with |k> = psi. Thus in this case, p_k = 1 and
S = -kbar 1 log 1 = 0, as it should be for a state of maximal
information. The amount of missing information is zero.
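The two extreme cases can be checked numerically. This is a minimal
sketch using the textbook convention S = -kbar Tr rho log rho and
units with kbar = 1; the function name and the two sample states are
illustrative assumptions.

```python
import numpy as np

def entropy(rho, kbar=1.0):
    """S = -kbar Tr(rho log rho), computed from the eigenvalues p_k of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]              # convention: 0 log 0 = 0
    return float(-kbar * np.sum(p * np.log(p)))

# Pure state: rho = |psi><psi| has rank 1.
psi = np.array([1.0, 1.0j]) / np.sqrt(2)
rho_pure = np.outer(psi, psi.conj())

# Maximally mixed qubit: both p_k equal 1/2.
rho_mixed = np.eye(2) / 2

S_pure = entropy(rho_pure)
S_mixed = entropy(rho_mixed)
```

The pure state gives S = 0 (no missing information), while the
maximally mixed qubit gives S = kbar log 2, i.e., one missing bit.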
For more along these lines, and in particular for a way to avoid
the probabilistic issues indicated above, see Sections 6 and 12
and Appendix A of my paper
A. Neumaier,
On the foundations of thermodynamics,
arXiv:0705.3790
http://lanl.arxiv.org/abs/0705.3790
But how does the infinite amount of information in a pure state (wave
function) square with the finiteness of entropy?
Specifying a mixed state _exactly_ provides already an infinite amount
of information, since the density matrix rho must be specified to
infinite precision.
Defining the eigenstates that are of interest in measurement
amounts to specifying a Hamiltonian operator H _exactly_, which again
provides already an infinite amount of information, since the
coefficients of H in an explicit description must be specified to
infinite precision.
Then only a finite amount of information is missing to determine
in which of the eigenstates a particular particle is.
Of course in practice one just _postulates_ rho and H based on a
finite number of measurements, and _pretends_ (i.e., proceeds as if)
they are known exactly, while knowing well that one knows them only
approximately.
In practice, a number of approximations are made. Frequently,
one postulates exact equilibrium, hence a grand canonical ensemble,
which of course is not exactly valid. Deviations from equilibrium are
handled by means of a hydrodynamical approximation, in which entropy
is no longer a number but a field - and specifying the entropy density
again requires an infinite amount of information. Of course, one
also represents this only to some limited accuracy, to keep things
tractable.
Thus finiteness of the entropy in a particular model is enforced by
making simplifying assumptions which are valid only if one doesn't
look too closely.
Indeed, as the Gibbs paradox (discussed, e.g., as Example 9.1 in
my above thermodynamics paper) shows, the amount of entropy depends
on the level of modeling.
-----------------------------------
S4f. How real is the wave function?
-----------------------------------
In thought experiments one often assigns a state to a single particle.
How defendable is this, and what is the meaning of the state?
In a statistical interpretation - see the section on measurements -,
this would make no sense, since there the state is a property
of the ensemble of particles generated by a given source. But then
it is difficult to visualize what happens in each single case.
Thus many people prefer the 'realistic' language of particles having
definite states. So let us discuss some of its implications.
Suppose that the particle is in the pure state represented by the wave
function psi. It is possible to give the wave function, or rather its
absolute value squared, a geometric interpretation:
m(x)=m|psi(x)|^2
is the mass density and
e(x)=e|psi(x)|^2
the charge density.
Thus while the wave function itself has no tangible interpretation,
certain fields computable from it have.
On the other hand, one can probe the state of particles in detail
if one has a large ensemble of identically prepared particles
(to make sure that they have the same state). These are usually created
by a carefully calibrated source, such as a laser. Then one can
subject them to different kinds of measurements from which one can
reconstruct a reasonable approximation of the state by quantum
tomography. In theory, one can make the approximation arbitrarily good.
Similarly a particle bound to a surface in a stationary state will
be measurable repeatedly if after the measurement the particle returns
to its state (which is natural if the bound system is in equilibrium).
Therefore one can measure equilibrium properties quite accurately.
In this sense one can say that the state of a single particle is
indeed real, and objective.
----------------------------------
S4g. How real are Feynman's paths?
----------------------------------
In Feynman's version of quantum mechanics, amplitudes are calculated
as a sum over all possible classical paths a particle (or a system)
can take in a classical phase space.
The paths in the Feynman picture of QM should not be regarded as real.
All possible paths are about as real as all possible books that can
be written, or - closer to physics - all possible items in a
statistical ensemble modeling a classical ideal gas. Of course only one
state is realized, not all conceivable ones; all others are just there
to compare to and compute probabilities.
In QM things are slightly more complicated, however, since the 'true'
path is smeared by the uncertainty principle. (Even in the many-worlds
interpretation, quantum objects have no sharp paths, while the paths
integrated over in a path integral must be perfectly accurate.)
The paths are just calculational devices that cease to exist once a
different approach to computation is taken. This is why I don't
ascribe any reality to them. The real objects remain present in
_any_ sensible description; the unreal ones don't.
---------------------------------------
S4h. Can particles go backward in time?
---------------------------------------
In the old relativistic QM (e.g., in Volume 1 of Bjorken and Drell)
antiparticles are viewed as particles traveling backward in time.
This is based on a consideration of the solutions of the Dirac equation
and the idea of a filled sea of negative-energy solutions in which
antiparticles appear as holes (though this picture only works for
fermions since it requires an exclusion principle). One can go some way
with this view, but more sophisticated stuff requires the QFT picture
(as in Volume 2 of Bjorken and Drell and most modern treatments).
In relativistic QFT, all particles (and antiparticles) travel forward
in time, corresponding to timelike or lightlike momenta.
(Only 'virtual' particles may have unrestricted momenta; but these are
unobservable artifacts of perturbation theory.)
The need for antiparticles is in QFT instead revealed by the fact that
they are necessary to construct operators with causal (anti)commutation
relations, in connection with the spin-statistics theorem. See, e.g.,
Volume 1 of Weinberg's quantum field theory book.
Thus talking about particles traveling backward in time, the Dirac sea,
and holes as positrons is outdated; today it does more harm than
good.
-------------------------------------------------------
S4i. What about particles faster than light (tachyons)?
-------------------------------------------------------
Tachyons are hypothetical particles with speed exceeding the speed of
light. Special relativity demands that such particles have imaginary
rest mass (negative m^2), and hence can never be brought to rest
(or below the speed of light); unlike ordinary particles, they speed
up as they lose energy.
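That a tachyon speeds up as it loses energy follows from the usual
relativistic energy formula with imaginary mass. A minimal sketch,
assuming m = i*mu with real mu > 0 (so m^2 = -mu^2) and units with
c = 1 by default; the function name and sample speeds are
illustrative.

```python
import math

def tachyon_energy(v, mu=1.0, c=1.0):
    """Energy of a particle with m^2 = -mu^2 moving at speed v > c.

    From E = m c^2 / sqrt(1 - v^2/c^2) with m = i*mu, which gives the
    real value E = mu c^2 / sqrt(v^2/c^2 - 1).
    """
    if v <= c:
        raise ValueError("a tachyon always has v > c")
    return mu * c**2 / math.sqrt((v / c) ** 2 - 1.0)

# Energies at increasing speeds (in units of c).
energies = [tachyon_energy(v) for v in (1.1, 2.0, 10.0)]
```

The energy decreases monotonically with increasing speed, diverges
as v approaches c from above, and tends to zero for v -> infinity.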
Charged tachyons would produce Cerenkov radiation in vacuum which has
never been observed. However, Cerenkov radiation is indeed observed
when fast particles enter a dense medium in which the speed of light
is smaller than the particle's speed. This is not a problem since
relativity only demands that no particle with real mass is faster
than the speed of light in vacuum.
(Unfortunately, this no longer allows one to discriminate between
massless particles having the vacuum speed of light, and tachyons.)
Neutrinos are uncharged and have a squared mass of zero or very close
to zero, and hence could possibly be tachyons.
Recently observed neutrino oscillations confirmed a small
squared mass difference between at least two species of neutrinos.
This does not yet settle the sign of m^2 for any species.
Direct measurements of m^2 have experimental errors still compatible
with m^2=0. For data see http://cupp.oulu.fi/neutrino/
The initial interest in tachyons stopped around 1980, when it was
clear that the QFT of tachyons would be very different from standard
QFT, and that experiment didn't demand their existence. The publications
of the Particle Data Group, which contain the biennially revised
consensus of the particle physics community, do not even include the
search for tachyons in their reviews of hypothetical particles:
http://pdg.lbl.gov/2004/reviews/contents_sports.html#hyppartetc
In fact, the theory of symmetry breaking demands that tachyons do
_not_ exist: When a relativistic field theory is deformed in a way
that the square of the mass (pole of the S-matrix) of some physical
particle would cross zero, the old physical vacuum becomes unstable and
induces a phase transition to a new physical vacuum in which all
particles have real nonnegative mass. This would happen already at
tiny negative m^2,
and is believed to be the cause of inflation in the early universe.
(Of course, the exact mechanism is not known since it would require a
nonperturbative definition of QFT. But classical and semiclassical
computations strongly suggest the correctness of this picture.)
Expanding a theory (such as the standard model) around an unstable state
(e.g., the Higgs with a local maximum at vanishing vacuum expectation)
formally produces a bare tachyon. This does not contradict the above
assertion, but only indicates the instability of the bare vacuum.
Asymptotic power series expansions around maxima
(especially those with tiny or vanishing convergence radius)
make meaningless assertions about the behavior of a function near one
of its minima. Since physical particles arise from field excitations
near the global minimum of the effective energy, perturbations around
the maximum are unphysical.
An expansion around an unstable state gives no significant information,
unless one has a system that actually _is_ close to such an unstable state
(as perhaps the very early universe). But in that case there are no
relevant excitations (tachyons), since the whole process of motion
(inflation) towards a more stable state proceeds so rapidly that
excitations do not form and everything can be analyzed semiclassically.
The physical Higgs field is far away from the unstable maximum, and its
particle excitations have a positive real mass, hence are not tachyons.
-----------------------------
S4j. Do free particles exist?
-----------------------------
Free particles are a convenient mathematical abstraction.
In Nature, there are - strictly speaking - no free particles,
only interacting ones. This holds both for photons and for other
more tangible particles like electrons. However, in sufficiently
localized (and nearly empty) regions of space, particles can be
approximately free. Again, this holds for both photons and other
particles.
It is very convenient to approximate such states by free states.
For example, this allows one to explain much of quantum mechanics
in terms of particle scattering. The S-matrix interpretation
depends crucially on the fact that the ingoing and outgoing
asymptotic states of photons, electrons, quarks, etc. are free.
Thus, in this sense, free photons exist just as much (or just as
little) as free electrons.
------------------------------------
S5a. QM pictures and representations
------------------------------------
QM exists in different pictures, of which the Schroedinger picture,
the Heisenberg picture, the interaction picture, and Feynman's
path integral representation are frequently invoked. There is also
the algebraic approach using unitary representations of canonical
commutation rules (CCR).
The Schroedinger picture, the Heisenberg picture, and the interaction
picture are equivalent because there are unitary transformations
between them. They all provide different representations of the
same canonical commutation rules
i[p_j,q_k]= hbar delta_jk
between components p_j of momentum p and q_k of position q.
The Stone-von Neumann theorem guarantees that the canonical
commutation relations (or their unitary version, the Weyl relations)
have a unique unitary representation apart from unitary
transformations, and hence suffice to specify the QM of finitely many
degrees of freedom uniquely, no matter which picture is used.
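The unitary equivalence of the Schroedinger and the Heisenberg
picture can be checked in a toy model. This is a minimal sketch for
a two-level system with hbar = 1; the Hamiltonian, observable, state
and time are arbitrary illustrative choices, and a finite-dimensional
matrix model of course cannot represent the CCR themselves.

```python
import numpy as np

def propagator(H, t):
    """U(t) = exp(-i t H) for Hermitian H (hbar = 1), via diagonalization."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * t * w)) @ V.conj().T

H = np.array([[1.0, 0.5], [0.5, -1.0]])    # arbitrary Hermitian Hamiltonian
A = np.array([[0.0, 1.0], [1.0, 0.0]])     # arbitrary observable
psi0 = np.array([1.0, 0.0], dtype=complex)
t = 0.7
U = propagator(H, t)

# Schroedinger picture: the state moves, the observable is fixed.
psi_t = U @ psi0
schroedinger = np.vdot(psi_t, A @ psi_t).real

# Heisenberg picture: the observable moves, the state is fixed.
A_t = U.conj().T @ A @ U
heisenberg = np.vdot(psi0, A_t @ psi0).real
```

Both pictures give the same expectation value, since they differ
only by the unitary transformation U(t).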
The Stone-von Neumann theorem fails for systems of infinitely many
degrees of freedom (see the FAQ entry on 'Inequivalent
representations of CCR/CAR'), which in a sense 'causes' the
difficulties in quantum field theory.
Nevertheless, QFT still has a Schroedinger picture
and a Heisenberg picture, and these are still equivalent:
The Heisenberg picture can be immediately constructed from the Wightman
fields. Then the canonical procedure - fixing the Heisenberg operators
at time t=0 and instead defining dynamical states
psi(t) := exp(-itH)psi
- produces the Schroedinger picture from it.
The Feynman path integral is related to the other pictures via the
Feynman-Kac formula, which makes the often only formally stated
equivalence precise, after analytically continuing the time to purely
imaginary times. The Osterwalder-Schrader theory
[see, e.g., math-ph/0001010 or the book by Glimm and Jaffe]
shows how to go back in case of relativistic quantum field theory.
The Feynman path integral only gives time-ordered expectation values;
this suffices to compute S-matrix elements, but is inadequate for
dynamical investigations needed for nonequilibrium quantum mechanics.
The latter can be treated with the so-called closed time path (CTP)
integral within the Schwinger-Keldysh formalism.
------------------------------------------------
S5b. Inequivalent representations of the CCR/CAR
------------------------------------------------
Ordinary quantum mechanics of N particles can be written in terms of
creation and annihilation operators for the 3N modes of an associated
reference harmonic oscillator. The field case, on the other hand,
is characterized by the fact that there are infinitely many modes.
If the creation and annihilation operators are those in the action
or Hamiltonian defining the QFT, the different modes are traditionally
referred to as 'bare particles', though this is not recommended for
reasons discussed elsewhere in this FAQ. If the creation and
annihilation operators are properly renormalized so that they
create and annihilate physical particles from the physical vacuum,
the modes are referred to as 'dressed particles'; only these have
physical relevance.
A state in which k modes are excited is called a k-particle state.
In many states of interest, however, (the most prominent ones being
the coherent states) infinitely many modes are excited (although the
notion of infinitely many particles is strained in this case). Thus one
needs to cater in the formalism for states with arbitrarily many or
even infinitely many modes. This has subtle consequences, which
account for the big difference between quantum field theory and
ordinary quantum mechanics.
The reason for this is that the natural representation space for
creation and annihilation operators is the vector space consisting
of all formal linear combinations
sum psi(n1,n2,n3,...) |n1,n2,n3,...>
with _arbitrary_ complex coefficients psi(n1,n2,n3,...), on which
a(k) and a^*(l) act as
a(k)|n1,...,n_k,...> = sqrt(n_k)|n1,...,n_k - 1,...>,
a^*(l)|n1,...,n_l,...> = sqrt(1+n_l)|n1,...,1+n_l,...>.
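The action of these operators on basis vectors is easy to mimic. The
following minimal sketch represents a basis vector by a tuple of
occupation numbers for finitely many modes (an assumption made only
to keep the example finite; modes are indexed from 0) and checks the
commutation rule on one basis vector. The function names are
illustrative.

```python
import math

def annihilate(k, state):
    """a(k)|n1,...,n_k,...> = sqrt(n_k)|n1,...,n_k - 1,...>.

    Returns (coefficient, new_state), or (0.0, None) if mode k is empty.
    """
    n = state[k]
    if n == 0:
        return 0.0, None
    new = state[:k] + (n - 1,) + state[k + 1:]
    return math.sqrt(n), new

def create(l, state):
    """a^*(l)|n1,...,n_l,...> = sqrt(1 + n_l)|n1,...,1 + n_l,...>."""
    n = state[l]
    new = state[:l] + (n + 1,) + state[l + 1:]
    return math.sqrt(1 + n), new

state = (2, 0, 3)

# a(0) a^*(0) applied to the basis vector
c1, s1 = create(0, state)
c2, s2 = annihilate(0, s1)

# a^*(0) a(0) applied to the same basis vector
d1, t1 = annihilate(0, state)
d2, t2 = create(0, t1)

commutator_coefficient = c1 * c2 - d1 * d2
```

Both products return the original basis vector, with coefficients
n+1 and n, so that [a(0), a^*(0)] acts as the identity.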
This vector space V has no natural Hilbert space structure.
To provide a definite inner product, one must select a suitable
subspace where this inner product can be defined.
This allows many choices; the choice usually discussed in QFT treatises
is Fock space, where only basis vectors |n1,....,n_k,0,0,...>
with finitely many particles are allowed, and these basis vectors are
declared orthonormal. As a result, Fock space contains only
the linear combinations
sum psi(n1,n2,n3,...,n_k) |n1,n2,n3,...,n_k>
where k is variable and
sum |psi(n1,n2,n3,...,n_k)|^2 is finite.
Unfortunately, if this choice is made for the representation of the
bare creation and annihilation operators, it excludes the states
relevant for the physical, interacting situation. This is the
essential message of Haag's no interaction theorem.
Indeed, the physical states lie in a different, inequivalent unitary
representation, characterized by a different subspace of V. This
subspace is generated by applying to the physical (= renormalized)
vacuum state the dressed (= renormalized) creation operators
an arbitrary number of times, then taking all finite linear
combinations, and finally taking the closure with respect to the
inner product in which all a^*(n_1)...a^*(n_k)|vac> are orthonormal.
In general, this Hilbert space has only the null vector (_not_ the
vacuum) in common with the Fock space, even for the simplest
(i.e., quadratic) Hamiltonians and actions. This case is well understood,
giving rise to the theory of quasiparticles and in particular of
superconductivity. For example (counting modes by signed nonzero
integers for simplicity - they become momenta in the infinite volume
limit), if the bare a(k) and b(k) satisfy the CCR then so do the dressed
annihilation operators
alp(k) = A(k) a(k) - B(-k) b^*(-k),
bet(k) = A(k) b(k) - B(-k) a^*(-k),
and their formal adjoints
alp^*(k) = A(-k) a^*(k) - B(k) b(-k),
bet^*(k) = A(-k) b^*(k) - B(k) a(-k),
provided that A(k), B(k) are real numbers, even in k (as the form of
the adjoints presupposes), satisfying
A(k)^2 - B(k)^2 = 1,
or, equivalently, that
A(k) = cosh(theta(k)), B(k) = sinh(theta(k)).
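That the dressed operators again satisfy the CCR can be verified by
bookkeeping alone. A minimal sketch for a single mode pair (k, -k),
assuming theta(-k) = theta(k) and an arbitrary illustrative value of
theta: each operator is stored as its coefficient vector with respect
to a(k), a^*(k), b(-k), b^*(-k), and commutators of such linear
combinations are plain numbers.

```python
import math

# Basis of linear operators for one mode pair (k, -k):
# index 0: a(k), 1: a^*(k), 2: b(-k), 3: b^*(-k).
# Nonzero commutators: [a, a^*] = 1 and [b, b^*] = 1.
COMM = [[0, 1, 0, 0],
        [-1, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, -1, 0]]

def commutator(x, y):
    """c-number commutator of two linear combinations of the basis operators."""
    return sum(x[i] * COMM[i][j] * y[j] for i in range(4) for j in range(4))

theta = 0.8                       # illustrative value, same for k and -k
A, B = math.cosh(theta), math.sinh(theta)

alp      = [A, 0, 0, -B]          # alp(k)    = A a(k)    - B b^*(-k)
alp_star = [0, A, -B, 0]          # alp^*(k)  = A a^*(k)  - B b(-k)
bet      = [0, -B, A, 0]          # bet(-k)   = A b(-k)   - B a^*(k)
bet_star = [-B, 0, 0, A]          # bet^*(-k) = A b^*(-k) - B a(k)
```

One finds [alp, alp^*] = [bet, bet^*] = A^2 - B^2 = 1, while alp
commutes with bet and bet^*, as required.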
If there were only finitely many modes, we could define
in Fock space the unitary operator
G = exp [- sum_k theta(k) (a(k)b(-k) - b*(-k)a*(k))],
and verify that
alp(k) = G a(k) G^{-1},
bet(k) = G b(k) G^{-1},
showing that we get an equivalent representation of the CCR.
We could deduce that
|vac> := G|>,
where |> is the bare vacuum, is the dressed vacuum on which
alp and bet act naturally. The dressed states would simply be
the images of the bare states under the Bogoliubov operator G.
Unfortunately, if there are infinitely many modes, G can no
longer be consistently defined as an operator in Fock space,
and the infinite-dimensional version of this scenario breaks
down. Ignoring this, one would find all sorts of infinities.
Mathematically, however, one simply changed the unitary
representation - G does not exist although the dressed
representation exists.
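The breakdown can be made quantitative. For a single mode pair, G is
a two-mode squeezing operator, and the overlap of the dressed with
the bare vacuum is 1/cosh(theta) (a standard result, quoted here
without derivation). A minimal sketch, assuming for illustration the
same theta for all mode pairs; the function name is hypothetical.

```python
import math

def vacuum_overlap(num_modes, theta):
    """|<bare vacuum|G|bare vacuum>| for num_modes mode pairs.

    Each pair contributes a factor 1/cosh(theta), the overlap of a
    two-mode squeezed vacuum with the unsqueezed one.
    """
    return math.cosh(theta) ** (-num_modes)

# The overlap shrinks geometrically with the number of mode pairs.
overlaps = [vacuum_overlap(n, 0.1) for n in (1, 10, 100, 1000)]
```

With infinitely many modes the overlap vanishes unless theta(k)
decays fast enough, so the would-be dressed vacuum G|> is not a
vector of Fock space.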
Physicists say that the above computations hold 'formally'.
If a mathematician tries to give this a precise meaning, it says
that the computations hold in finite mode approximations but do not
survive the limit, although they are usually formulated in the
meaningless limit form.
The canonical anticommutation rules (CAR) also have the form (1),
except that the commutator is replaced by an anticommutator.
All statements above are valid with appropriate modifications;
the most important one being that occupation numbers are now
restricted to 0 and 1, and the definition of a^*(l) has 1-n_l in
place of 1+n_l.
For more details see the book
H. Umezawa, H. Matsumoto, and M. Tachiki,
Thermo Field Dynamics and Condensed States,
North Holland 1982.
--------------------------------------------
S5c. Why does QFT look so different from QM?
--------------------------------------------
This is only because of technical reasons and the power of tradition.
In ordinary quantum mechanics, pure states are described by
wave functions (more precisely by rays) in a Hilbert space,
there is a Hamiltonian H and an associated Schroedinger equation
i hbar psidot = H psi, the time evolution is described by a unitary
operator, the bound states are normalized eigenstates of the
Hamiltonian, etc.
This is also done in traditional quantum field theory, though it
is not directly apparent. But one can see it when studying
constructive field theory. It gives everything in case of 2D quantum
fields. There is a well-defined Hilbert space, a well-defined
Hamiltonian constructed without any use of perturbation theory,
a well-defined unitary dynamics, well-defined bound states that
are eigenstates of the Hamiltonian, and everything is invariant under
the 2D Poincare group ISO(1,1). See the book
J. Glimm and A. Jaffe,
Quantum Physics: A Functional Integral Point of View,
Springer, Berlin 1987.
The only thing wanting is an explicit formula for H in the traditional
nonrelativistic form H=H_0+V. Instead, H is constructed in a more
abstract way, as analytic continuation of an operator in Euclidean
field theory.
That the 4D case is more difficult has to do with obstacles in getting
tight enough bounds for the analytic estimates needed. These are
mathematical difficulties, but not inconsistencies - no one proved that
there are contradictions, and the practice of QFT suggests that there
are indeed none (at least for asymptotically free theories).
On the perturbative level, there is no difficulty at all - see, e.g.
the book
M. Salmhofer,
Renormalization: An Introduction,
Texts and Monographs in Physics,
Springer, Berlin 1999.
which constructs the Euclidean theory for Phi^4 theory in 4 dimensions
perturbatively, i.e., in the formal power series topology, with full
mathematical rigor. If this construction worked nonperturbatively
(i.e., give functions instead of formal power series),
analytic continuation using Osterwalder-Schrader theory would do
the rest. The latter is described, e.g., in Chapter 6 of the above
book by Glimm and Jaffe.
--------------------------------------------
S5d. Why is QFT based on a classical action?
--------------------------------------------
The path integral approach to QFT begins with classical fields
that are varied to produce quantum amplitudes as a 'sum over all
possible paths'. But, with the exception of the electromagnetic field,
the classical fields one meets there are not fields occurring
in classical physics. Nevertheless they are rightfully labelled
'classical'.
Classical physics is the physics of processes slowly varying in space
and time; of course, elementary particles do not belong there.
But classical mechanics can also be considered as an abstract
mathematical framework for dynamics in a general phase space
(described by a Poisson manifold), which has much wider applicability.
The classical fields that figure in the path
integral belong in this sense to classical mechanics.
In QFT, one needs a classical action to be able to implement
unitarity of the S-matrix and the cluster decomposition.
The first is essential for a correct probabilistic interpretation of
QFT, since it amounts to preservation of probability, and the second is
necessary to account for the fact that all our experiments are done
locally, and what is far away does not contribute significantly
except through effectively classical far fields. (What happens with
the stars should be irrelevant to experiments on the earth, except for
the experiments of astronomers. This is the basis of all physics.)
In terms of microphysics, cluster decomposition means that one cannot
scatter particles (clusters of elementary particles) off very distant
particles (clusters).
The arguments why this requires a classical action expressed in terms
of creation and annihilation operators are explained in detail in
Weinberg's quantum field theory book, Volume I, Chapters 3-7.
We need cluster decomposition because it is observed. We need
local fields and microcausality, mainly because it implies
(modulo fine print involving contact terms) at least perturbatively
cluster decomposition, and there is no other known way in QFT to
ensure the latter. But there are covariant N-particle models with
cluster decomposition, discussed, e.g., in
B.D. Keister and W.N. Polyzou,
Relativistic Hamiltonian Dynamics in Nuclear and Particle Physics,
in: Advances in Nuclear Physics, Volume 20,
(J. W. Negele and E.W. Vogt, eds.)
Plenum Press 1991.
www.physics.uiowa.edu/~wpolyzou/papers/rev.pdf
(The constructions are quite messy; they have, however, the
advantage that they do not need renormalization, and are useful
phenomenological models.)
The lack of references to cluster decomposition in standard textbooks
of QFT is explained by the fact that local QFT automatically satisfies
cluster decomposition. Most people take QFT as their starting
point, without asking why. Weinberg's treatise is about the only book
that asks this question and answers it in some depth.
But when you look at the literature on phenomenological covariant
multiparticle models, cluster decomposition plays an essential role
in that it is the main hurdle to overcome to get realistic models for
systems made of more than two unconfined particles. For details see
the survey by Keister and Polyzou mentioned above,
and the references there.
Cluster decomposition for field theory is also discussed from a
rigorous point of view in the book by Glimm and Jaffe, where
connections are made to multiparticle scattering.
Indeed, books on (nonrelativistic) scattering theory are the ones
where the cluster decomposition is discussed in detail, since it is
needed to describe the result of the most general multiparticle
scattering experiments, and an understanding of it is essential for
proving the asymptotic completeness of scattering states.
Nonrelativistic theory also shows that the 'correct'
cluster decomposition is always one for bound states,
as can be seen from a more detailed nonrelativistic analysis.
(This is not apparent from Weinberg's argument,
since perturbation theory breaks down in the presence of
bound states. This explains why QCD has no cluster
decomposition for isolated quarks.)
--------------------------------------------------------
S5e. Why does the action only contain first derivatives?
--------------------------------------------------------
On the classical level, higher derivatives cause no formal problems,
one can form the variational equations as always. There might be
problems with causality (= symmetric hyperbolicity), however.
These problems become worse (and apparently intractable) in the
quantum case.
In a k derivative theory with k>1, one can always introduce new fields
for the k-1 first derivatives, and add terms to the action that give
as variation their defining equations. Thus one can reduce any theory
to an equivalent one with only first derivatives in the action.
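As a minimal worked illustration (a toy example, not taken from any
of the references), consider a single degree of freedom q(t) with a
Lagrangian containing the second derivative. Introducing a new
variable v for the first derivative and a multiplier lambda (both
names are assumptions of this sketch) gives a first-order Lagrangian
L' with the same equation of motion:

```latex
\begin{align*}
L(q,\ddot q) &= \tfrac12\,\ddot q^{\,2}
  &&\Rightarrow\quad q^{(4)} = 0, \\
L'(q,v,\lambda) &= \tfrac12\,\dot v^{2} + \lambda\,(\dot q - v), \\
\delta\lambda:&\quad \dot q - v = 0, \\
\delta v:&\quad \ddot v + \lambda = 0, \\
\delta q:&\quad \dot\lambda = 0
  &&\Rightarrow\quad \dddot v = q^{(4)} = 0 .
\end{align*}
```

The variation with respect to lambda reproduces the defining
equation v = qdot, and eliminating lambda and v recovers the original
fourth-order equation.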
The problems appear when trying to go from the Lagrangian picture to
the Hamiltonian one - then one gets difficulties similar to those for
gauge theories.
-------------------------
S5f. Why normal ordering?
-------------------------
Field theory often deals with polynomial expressions in annihilation
operators a(p) and their adjoint creation operators a^*(p).
While a(p) is a linear operator on a dense subspace H of the
corresponding Fock space, its adjoint isn't. But both are densely
defined sesquilinear forms on Fock space.
A sesquilinear form is a linear mapping f from a space H (the domain;
a dense subspace of the Hilbert space, in the present case of Fock
space) to its dual space H^* (which properly contains H), while
an operator maps H into H. Thus the latter can be iterated
while the former usually cannot.
<phi|f|psi> is always defined when phi,psi in H (since f|psi> is in H^*,
the inner product is defined). Thus Hermitian sesquilinear forms are
satisfactory candidates for 'observables'. However, matrix elements
<phi|fg|psi> of products fg make sense only for
operators f,g, since fg|psi> is not defined if g|psi> is outside H.
In particular, a(p)a^*(p) is a meaningless construct, while
:a(p)a^*(p): = +-a^*(p)a(p)
(+ for Bosons, - for Fermions) makes sense as a Hermitian sesquilinear
form. But f(p)=a^*(p)a(p) is no longer an operator in any sense
(though good 1-particle operators can be made by integration with
suitable test functions). That's why f(p)f(q) is meaningless while
the permuted form
:f(p)f(q): = a^*(p)a^*(q)a(q)a(p)
is well defined (again as sesquilinear form only); here the sign is +
for both Bosons and Fermions, since an even number of transpositions
is needed.
More generally, any product O of creation and annihilation operators
which has all its creation terms to the left of all its annihilation
terms (these are called normally ordered products) defines a
sesquilinear form. The reason is that such an O can be written as
O=A^*B where A and B are products of annihilation operators only,
hence <phi|O|psi> = <phi|A^*B|psi> can be interpreted as the inner
product of the two vectors A|phi> and B|psi> obtained from phi and psi
by applying annihilation operators only, which produces vectors in H
for which the inner product is always defined.
Normal ordering just permutes arbitrary products to put them into the
normally ordered and hence well-defined form (and adds a minus sign
if an odd number of transpositions of Fermion operators is needed
to order the product). This is extended by linearity to polynomials
and infinite series in power products. Note that normal ordering is
defined for formal expressions (i.e. strings of letters),
not for operators or forms; only _after_ normal ordering an
expression O does one get a sesquilinear form :O:.
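Since normal ordering acts on strings of letters, it can be sketched
as a small string manipulation. The following is only an illustration
under my own conventions (the function name normal_order and the
encoding of a^*(p) as ('+','p') and a(p) as ('-','p') are assumptions,
not standard notation): it moves creation symbols to the left by
neighbour swaps and flips the sign once per swap in the Fermion case.

```python
def normal_order(ops, fermionic=False):
    """Normal-order a formal product of ladder symbols.

    ops: list of (kind, label) pairs; kind '+' stands for a^*(label),
         kind '-' for a(label).  These are formal symbols, not operators.
    Returns (sign, ordered) with all creation symbols to the left.
    """
    ops = list(ops)
    sign = 1
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(ops) - 1):
            # An annihilation symbol directly left of a creation symbol
            # is out of order: transpose the two neighbours.
            if ops[i][0] == '-' and ops[i + 1][0] == '+':
                ops[i], ops[i + 1] = ops[i + 1], ops[i]
                if fermionic:
                    sign = -sign  # one Fermion transposition, one sign flip
                swapped = True
    return sign, ops


# :a(p)a^*(p): for Bosons and for Fermions
print(normal_order([('-', 'p'), ('+', 'p')]))
print(normal_order([('-', 'p'), ('+', 'p')], fermionic=True))
```

The relative order of the creation symbols among themselves, and of
the annihilation symbols among themselves, is left unchanged by this
sketch.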
In Fock spaces over finite-dimensional Hilbert spaces, the situation is
different; there a(p) and a^*(p) are indeed operators on Fock space
(and the index p ranges over finitely many items only). Thus all
products make sense, and the normally ordered version of a product
differs from the original product by terms involving fewer operators.
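The simplest finite-dimensional case can be checked by hand or by a
few lines of code. The sketch below assumes a single Fermion mode,
represented by 2x2 matrices in the basis |0>, |1> (this choice of
example and the helper matmul are mine). It confirms that a a^* and
its normally ordered version -a^* a differ by the identity, a term
with fewer operators.

```python
def matmul(A, B):
    """Product of two small matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


a = [[0, 1], [0, 0]]        # annihilator: maps |1> to |0> and |0> to 0
a_star = [[0, 0], [1, 0]]   # its adjoint (the transpose, entries are real)

product = matmul(a, a_star)                                 # a a^*
normal = [[-x for x in row] for row in matmul(a_star, a)]   # :a a^*: = -a^* a
difference = [[product[i][j] - normal[i][j] for j in range(2)]
              for i in range(2)]
print(difference)   # the 2x2 identity matrix
```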
Normal ordering is usually motivated by starting with a
finite-dimensional discretization where integrals become finite sums;
then one can do all the formal manipulations rigorously. Upon passing
to the continuum limit, most expressions become infinite and hence
meaningless, but the normally ordered expressions happen to have a
well-defined limit and hence are meaningful. So these are the relevant
'operators' or rather sesquilinear forms. Presenting things as above
avoids any infinities.
---------------------------------------------------
S5g. Why locality and causal commutation relations?
---------------------------------------------------
In measurement terms, locality is the idea that a measurement here
and a simultaneous measurement there can be performed independently,
and in particular don't limit each other in precision. This is encoded
in the requirement that 'local' quantities described by fields
Phi_a(\x,t) here (at \x) and fields Phi_b(\y,t) there (at \y)
commute if the positions \x and \y are distinct.
------------------------------------------------
S5h. Creation operators and rigged Hilbert space
------------------------------------------------
Physicists regard Fock space as the Hilbert space containing the
basis states
|x_1:N> = |x_1,...,x_N>
and their linear combinations. However, there is no Hilbert space
containing these states. The state |x_1:N> = |x_1,...,x_N>
is not in the Hilbert Fock space, for the same reason for which
|x> is not in the 1-particle Hilbert space. It is only a
distribution.
The Hilbert Fock space is made instead of all wave functions
psi = sum_N integral dx_1:N psi_N(x_1:N) |x_1,...,x_N>
with finite
<psi|psi> = sum_N |psi_N|^2/N!
Physicists also define annihilation operators a(x) and
their adjoints, creation operators a^*(x). However, these are
not operators, but operator-valued distributions. For example,
a^*(x) maps the vacuum state |vac> (with psi_0=1, other psi_N=0)
into a^*(x)|vac> = |x>, which is not in the Hilbert Fock space.
More generally, for every nonzero Hilbert Fock space vector psi,
the vector
psi' = a^*(x) psi
lies outside the Hilbert Fock space.
Thus the domain of a^*(x) is just {0}.
However, the states |x_1:N> = |x_1,...,x_N> lie in the top
layer H^* of the right Gelfand triple = rigged Hilbert space.
This is the name for a triple H in Hbar in H^* of vector spaces,
where Hbar is a Hilbert space, H a dense 'nuclear' subspace
(containing very smooth states with very good behavior at infinity)
and H^* its dual space (containing among others very singular states
and states with very poor behavior at infinity). Observables (in the
weak sense) are bilinear forms, or, which is the same, linear mappings
from H to H^*. The adjoint of such a linear mapping is again an
observable in the weak sense. Annihilation operators a(x) (and their
adjoints a^*(x)) are observables in this weak sense, although they are
not Hermitian (and a fortiori not self-adjoint).
--------------------------
S5i. Why Feynman diagrams?
--------------------------
Feynman diagrams resemble processes with particles moving in space and
time, and are often figuratively treated as such. But in fact they
do _not_ describe such processes, but certain multiple integrals.
(To emphasize this, the particles involved in Feynman diagrams are
called 'virtual particles'. Still, many people mistakenly think
that virtual particles are somehow also real. See the entries about
virtual particles elsewhere in this FAQ.)
Although it is nowhere said explicitly, Feynman diagrams are just
a mnemonic for nicely picturing the composition of higher order tensors.
Create for each tensor of a theory a different vertex type, draw a
vertex of this type for each occurrence of this tensor in a product
expression in Einstein summation convention, and draw a line between
two such vertices whenever they share an index to be summed over.
The form of the lines defines the value of the coefficient function
in such a product, and the sum over Feynman diagrams simply means that
one considers a linear combination of these products, integrated over
the arguments. Thus this defines a generic representation of an
expansion of a function of the tensors of the theory.
Thus Feynman diagrams can be used whenever one expands a function of one
or more tensors into a linear combination of products of components of
these tensors.
Indeed, for this reason, they are also used in classical statistical
mechanics and in the analysis of stochastic differential equations
by functional integration techniques.
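The recipe 'one vertex per tensor, one line per summed index' can be
made concrete in a few lines. The sketch below is only an illustration
(the function name contraction_graph and the encoding of a product as
pairs of tensor name and index string are my assumptions). An index
occurring twice becomes an internal line, an index occurring once an
external leg.

```python
from collections import defaultdict


def contraction_graph(factors):
    """Diagram of a product written in Einstein summation convention.

    factors: list of (tensor_name, index_string), e.g. ("V", "ijk").
    Returns (vertices, internal_lines, external_legs).
    """
    where = defaultdict(list)   # index letter -> vertices carrying it
    vertices = []
    for n, (name, indices) in enumerate(factors):
        vertices.append((n, name))
        for letter in indices:
            where[letter].append(n)

    internal, external = [], []
    for letter, owners in sorted(where.items()):
        if len(owners) == 2:
            # summed index: a line joining the two vertices that share it
            internal.append((owners[0], owners[1], letter))
        elif len(owners) == 1:
            # free index: an external leg
            external.append((owners[0], letter))
        else:
            raise ValueError("index %r occurs more than twice" % letter)
    return vertices, internal, external


# V_ijk G_kl V_lmn: two 3-leg vertices joined through a 2-leg vertex
vertices, internal, external = contraction_graph(
    [("V", "ijk"), ("G", "kl"), ("V", "lmn")])
print(internal)   # [(0, 1, 'k'), (1, 2, 'l')]
print(external)   # [(0, 'i'), (0, 'j'), (2, 'm'), (2, 'n')]
```

Nothing in this construction refers to quantum theory, which is the
point made above.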
---------------------------------------------------------
S6a. Nonperturbative computations in quantum field theory
---------------------------------------------------------
There is a well-defined theory for computing contributions to the
S-matrix in quantum electrodynamics (and other renormalizable field
theories) by perturbation theory.
There is also much more which uses handwaving arguments and appeals
to analogy to compute approximations to nonperturbative effects.
Examples are:
- relating the Coulomb interaction and corrections to scattering
amplitudes and then using the nonrelativistic Schroedinger
equation,
- computing Lamb shift contributions (now usually done in what is
called the NRQED expansion),
- Bethe-Salpeter and Schwinger-Dyson equations obtained by resumming
infinitely many diagrams.
The use of 'nonperturbative' and 'expansion' together sounds
paradoxical, but is common terminology in QFT. The term 'perturbative'
refers to results obtained directly from renormalized Feynman graph
evaluations. From such calculations, one can obtain certain information
(tree level interactions, form factors, self energies) that can be
used together with standard QM techniques to study nonperturbative
effects - generally assuming without clear demonstrations that this
transition to quantum mechanics is allowed.
Of course, although usually called 'nonperturbative', these techniques
also use approximations and expansions. The most conspicuous
high accuracy applications (e.g. the Lamb shift) are highly
nonperturbative. But on a rigorous level, so far only the perturbative
results (coefficients of the expansion in coupling constants) have any
validity.
Although the perturbation series in QED are believed to be asymptotic
only, one can get highly accurate approximations for quantities like the
Lamb shift. However, the Lamb shift is a nonperturbative
effect of QED. One uses an expansion in the fine structure
constant, in the ratio electron mass/proton mass, and in 1/c
(well, different methods differ somewhat). Starting e.g., with
Phys. Rev. Lett. 91, 113005 (2003)
one should be able to track the literature.
Perturbative results are also often improved by partial summation of
infinite classes of related diagrams. This is a standard approach to
go some way towards a nonperturbative description. Of course, the
series diverges (in case of a bound state it _must_ diverge, already in
the simplest, nonrelativistic examples!), but the summation is done
on a formal level (as everything in QFT) and only the result
reinterpreted in a numerical way. In this way one can get
in the ladder approximation Schroedinger's equation, and in other
approximations Bethe-Salpeter equations, etc.
See Volume 1 of Weinberg's quantum field theory book.
---------------------------------------------------
S6b. The formal functional integral approach to QFT
---------------------------------------------------
On a purely formal level (i.e., with power series in place of actual
numbers), 4D QFT is very much alive and useful. It is now almost
always based upon functional integrals.
The path integral is discussed e.g., in Weinberg I, Chapter 9, or
Peskin/Schroeder, also Chapter 9. As one can see there, the
path integral formalism involves no operators at all, only classical
(commuting or anticommuting) fields.
The quantities obtained in the expansion of the path integral in
powers of hbar are time-ordered vacuum expectation values.
Since the original ordering in a time-ordered vacuum expectation value
is immaterial (apart from a sign for fermions), the same must be the
case for the path integral itself, which explains why the fields
in the path integral are classical (i.e., commute or anticommute
at all arguments).
---------------------------------------------------------
S6d. Is there a rigorous interacting QFT in 4 dimensions?
---------------------------------------------------------
The Wightman axioms and the Osterwalder-Schrader axioms
[see, e.g., math-ph/0001010 or the book by Glimm and Jaffe]
are currently the basis on which rigorous quantum field theory
(at least for massive particles) is discussed.
In spite of many attempts (and though numerous uncontrolled
approximations are routinely computed), no one has so far succeeded
in rigorously constructing a single QFT in 4D which
has nontrivial scattering. Not even QED is a mathematical object,
although it is the theory that was able to reproduce experiments
(anomalous magnetic moment of the electron; see the entry
'Is QED consistent' in this FAQ) with an accuracy of 1 in 10^12.
But till today no one knows how to formulate the
theory in such a way that the relevant objects whose approximations
are calculated and compared with experiment are logically well-defined.
See, e.g., the S.P.R. threads
http://groups.google.com/groups?q=Unsolved+problems+in+QED
http://groups.google.com/groups?q=What+is+well-defined+in+QED
This probably explains the high prize of 1,000,000 US dollars,
promised for a solution to one of the Clay Millennium problems,
that asks to find a valid
construction for d=4 quantum Yang-Mills theories that is strong
enough to prove correlation inequalities corresponding to the
existence of a mass gap. The problem is to explain rigorously
why the mass spectrum for compact Yang Mills QFT begins at a positive
mass, while the classical version has a continuous spectrum
beginning at 0.
The mass gap is a property of the theory, not of a wave function.
Intuitively, it means that, in the rest frame of the total system,
the ground state (=vacuum) is an isolated eigenstate of the
Hamiltonian H, i.e., that the spectrum of H is a subset of
{0} union [E_1,inf). The largest E_1 with this property defines
the mass gap m_1=E_1/c^2.
This would make proper sense for a nonrelativistic theory.
For a relativistic theory one has to read between the lines and
interpret everything in terms of suitable analogies,
for lack of a consistent mathematical theory.
The Millennium problem essentially asks for a rigorous mathematical
setting in which the above can be made precise and proved.
The real problem is the rigorous construction of a Hilbert space with
a unitary representation of the Poincare group, such that a
perturbation argument recovers the traditional renormalized order by
order approximation of quantum field theory.
The state of the art at the time the problem was crowned by
a prize is given in
www.claymath.org/Millennium_Prize_Problems/Yang-Mills_Theory/_objects/Official_Problem_Description.pdf
and the references quoted there. See also
http://www.claymath.org/millennium/Yang-Mills_Theory/ym2.pdf
I don't think significant progress has been published since then.
(The paper hep-th/0511173 which claims to have solved the problem
only consists of a bunch of heuristic arguments. That the author calls
it a proof doesn't turn it into a mathematical proof.)
Yang-Mills theories are (perhaps erroneously) believed
to be the simplest (hopefully) tractable case,
being asymptotically complete while not having the
extra difficulties associated with matter fields.
(There are only gluons, no quarks or leptons.)
Of course, one would like to show rigorously that QED is consistent.
But QED has certain problems (the Landau pole, see below) that are
absent in so-called asymptotically free theories, of which
Yang-Mills is the simplest.
------------------------------
S6e. Constructive field theory
------------------------------
Rigorously defined Lorentz-covariant quantum field theories are known
to exist in 2 and 3 dimensions; the standard reference (for d=2)
is the book by
J. Glimm and A. Jaffe,
Quantum physics. A functional integral point of view
New York, 1981
A recent review of the achievements of constructive
quantum field theory in dimensions < 4 is
V. Rivasseau
Constructive Field Theory and Applications:
Perspectives and Open Problems,
J. Math. Phys. 41 (2000), 3764-3775.
http://lanl.arxiv.org/pdf/math-ph/0006017
The case d=4 is a famous unsolved problem; the special case of 4D
quantum Yang-Mills gauge theory with a compact simple, nonabelian
gauge group is one of the Clay Millennium problems with a 1 million
Dollar prize attached to its solution.
-----------------------------------
S6g. What are interpolating fields?
-----------------------------------
Traditional QFT has rules for computing reasonable approximations
to the S-matrix of a field theory. The S-matrix describes the behavior
of a state of the system under a transition from time t=-inf to time
t=+inf. But in a complete dynamical theory, one would like to be able
to know what happens in between at finite times. In nonrelativistic QM,
this information is given by the Schroedinger equation. In QFT it is
given by the interpolating field, called interpolating since it
interpolates between the infinite limiting times.
More precisely, the dynamical information about the interpolating
field is represented mathematically in the Wightman functions,
which give the (renormalized) vacuum expectations of field products
at arbitrary combination of space-time points.
Unfortunately, no one knows how to compute the latter in relativistic
4D quantum field theories. However, Wightman functions have been
constructed rigorously in lower dimension (more precisely
in certain superrenormalizable theories in 2 and 3 dimensions).
-----------------------------------------------------------------------
S6h. Hilbert space and Hamiltonian in relativistic quantum field theory
-----------------------------------------------------------------------
Most of current quantum field theory (i.e., everything with exception
of 2D and 3D constructive field theory - which doesn't even cover QED)
does not have a well-defined Hilbert space at all, in which a
time operator would be defined.
Well-defined are only the asymptotic Hilbert spaces of in and out
states for scattering experiments. These are Fock spaces of
free particles, and hence defined on a mass shell.
There is a basic result called Haag's theorem which states that
these asymptotic Fock spaces cannot carry a nontrivial local dynamics,
as would be required for a field theory.
The full dynamics can be defined only indirectly, via CTP (closed
time path) integration, and subject to all interpretation problems
of the renormalization procedures.
---------------------------------------
S6i. 2-dimensional quantum field theory
---------------------------------------
Much of the state of the art in 2-dimensional relativistic quantum
field theories is covered in two books,
Elcio Abdalla, M. Christina Abdalla, Klaus D. Rothe
Non-Perturbative Methods in 2 Dimensional Quantum Field Theory
World Scientific, 1991, revised 2nd. ed. 2001.
and
J. Glimm and A. Jaffe,
Quantum Physics: A Functional Integral Point of View,
Springer, Berlin 1987.
The first book treats exactly solvable theories, the second book
treats general polynomial interactions. The methods are completely
different in the two cases, and the two books are essentially disjoint.
Unfortunately, both books are somewhat difficult to read.
Abdalla et al. treat those (very special) 2-dimensional quantum field
theories having closed analytic expressions for all S-matrix elements.
These solvable models are to 2-dimensional quantum field theory what
the hydrogen atom is to quantum mechanics. The book gives lots of details
about many solvable models, but I found it too specialized to give me
a feeling of general 2-dimensional quantum field theory.
Glimm and Jaffe assume a lot of measure theory and functional analysis.
This is summarized in Appendix A of their Part I, but working first
through Volume 3 of Thirring's Course in Mathematical Physics (which
only deals with nonrelativistic QM but in a reasonably rigorous way)
would be a good preparation for tackling Glimm and Jaffe.
They construct - rigorously - for 2-dimensional relativistic
Lagrangian scalar field theories with polynomial interaction a Hilbert
space, a well-defined Hamiltonian, a well-defined unitary dynamics,
with well-defined bound states that are eigenstates of the Hamiltonian,
and everything is invariant under the 2D Poincare group ISO(1,1).
Chapter 3 defines a rigorous version of the path integral for ordinary
quantum mechanics, or rather for the Euclidean version of it, with the
i in the Schroedinger equation dropped. This amounts to analytic
continuation to imaginary time, where everything is easy and
respectable. In place of a hyperbolic differential equation one gets
a parabolic one (the heat equation), which makes things tractable
since the heat kernel is positive and hence the measures needed to
make the path integral rigorous are positive Wiener measures, with a
good rigorous theory.
Quantum field theory starts in Chapter 6. It is presented in a
Euclidean and a Minkowski version, the former being an analytic
continuation of the latter. Both versions are defined axiomatically,
by the Osterwalder-Schrader axioms and the Wightman axioms,
respectively. Again, the Euclidean version is the tractable one,
in which one can generalize the path integral and perform the
estimates needed for proving the existence of all the tools.
The Osterwalder-Schrader theory then guarantees that, given the
satisfaction of the Euclidean axioms, analytic continuation to
the Minkowski case is indeed possible. This is outlined in Section 6.1;
the remainder of the chapter discusses the (easy) special case of
free fields.
Chapters 7-12 and 19 then define the machinery needed to show how
to satisfy the axioms in the case of 2-dimensional relativistic
Lagrangian scalar field theories with polynomial interaction.
Chapter 7 discusses the Gaussian measures that define the Euclidean
path integral of free fields, Chapter 8 presents a rigorous theory of
perturbation theory for Euclidean path integrals, and the remaining
chapters mentioned provide the estimates needed to make sure that
everything works.
--------------------------
S7a. What is the mass gap?
--------------------------
In a relativistic theory, whenever there is a state with definite
4-momentum p, there is also one with definite momentum p' = Lambda p
obtained by applying a Lorentz transform Lambda. The orbit of
4-momenta obtained in this way forms a hyperboloid in the future
cone (because of causality), characterized by a mass m>=0:
p^2=m^2, p_0>0.
This includes as a limiting case massless states with m=0,
where the orbit consists of the future light cone with 0 excluded.
Therefore the possible values of p are characterized by the possible
values of m, which defines the mass spectrum of the theory. The mass
spectrum is the relativistic analogue of the energy spectrum of the
Hamiltonian in a nonrelativistic theory, shifted such that the ground
state has E=0.
The only state with zero momentum is the ground state, usually called
the vacuum. If the values of p^2 for the realizable nonzero p are
bounded below by a positive number, the theory is said to have a mass
gap. The largest value of m>0 for which m^2 is such a lower bound
defines the precise value of the mass gap. Usually there is a state
for which p^2=m^2; this is then interpreted as the state of a
single 'dressed' particle.
In general, the mass spectrum consists of a discrete and a continuous
part. The discrete part of the spectrum corresponds to bound states,
the continuous part to scattering states.
The continuous spectrum starts when there is the possibility of
scattering, which means that the energy is large enough that two
asymptotically independent systems can exist. Given a state of mass
m, one expects to have states with two almost independent systems of
mass m and an arbitrary relative momentum, giving a continuous
spectrum of scattering states with all possible squared momenta
exceeding (2m)^2, as a simple calculation reveals:
If p is the sum of two timelike vectors p1,p2 of mass m then
p^2 = (sqrt(\p1^2+m^2)+sqrt(\p2^2+m^2))^2 - (\p1+\p2)^2
= 2m^2 + 2 sqrt((\p1^2+m^2)(\p2^2+m^2)) -2\p1 dot \p2
By making \p2=-\p1 one gets arbitrarily large values of p^2, hence
part of the continuous spectrum. The minimum of p^2 must occur by
Cauchy/Schwarz for \p2=\p1, and is then (2m)^2, independent of the
spatial momentum.
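The two extreme cases are easy to check numerically. The following
sketch (units with c=1; the function name and the sample momenta are
my own choices) evaluates p^2 for two on-shell momenta of equal mass
directly from the definition.

```python
import math


def invariant_mass_squared(p1, p2, m):
    """p^2 for the sum of two on-shell 4-momenta of equal mass m (c = 1).

    p1, p2: spatial momenta given as 3-tuples.
    """
    e1 = math.sqrt(sum(c * c for c in p1) + m * m)
    e2 = math.sqrt(sum(c * c for c in p2) + m * m)
    total = [x + y for x, y in zip(p1, p2)]
    return (e1 + e2) ** 2 - sum(c * c for c in total)


m = 1.0
p1 = (0.3, -1.2, 2.0)
opposite = tuple(-c for c in p1)

print(invariant_mass_squared(p1, p1, m))        # equal momenta: (2m)^2 up to rounding
print(invariant_mass_squared(p1, opposite, m))  # opposite momenta: 4(m^2+\p1^2)
```

Scaling up \p1 in the second call makes p^2 as large as one likes,
while the first call always returns (2m)^2 up to rounding errors.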
Thus the continuous spectrum extends from mass 2m to infinity,
where m is the mass gap.
There may be bound states with mass m_b<2m, forming the discrete
spectrum. These are not scattering states, hence not obtained by
simply adding momenta. For bound states of k particles with masses
m_1,...,m_k, one needs to subtract from (m_1+...+m_k)c^2 the binding
energy of the bound particles. There might be bound states
with mass m_b>2m embedded in the continuous spectrum, but these are
possible only if there are selection rules that forbid the decay into
particles with smaller mass.
In particular, the state of minimal mass m, if it exists, is always
a bound state (including the case of a single particle).
-------------------------------------------------------
S7b. Why can a bound state of massless quarks be heavy?
-------------------------------------------------------
A system has a well-defined mass if it is in an eigenstate of p^2,
where p is the total momentum operator (whatever this is;
relativistically, bound states are very poorly understood).
So to understand, view it from a nonrelativistic perspective.
Because of E=mc^2, the mass shows up as energy, i.e., as eigenvalue
of the Hamiltonian.
Now a bound state at rest defines the rest energy, and by giving
it uniform motion one can increase the energy by an arbitrary amount
of kinetic energy. The rest energy (and hence the rest mass), on the
other hand, is determined by the discrete spectrum of the Hamiltonian
in reduced coordinates, i.e., with center of mass motion separated out.
For forces that decay with distance, a bound state necessarily has
a mass that is less than the sum of the masses of the constituents.
For particles involving quarks, this does not apply since the strong
force increases with distance. Hence the rest mass of a bound state of
quarks could be anything.
------------------------------------------------------
S7c. Bound states in relativistic quantum field theory
------------------------------------------------------
Bound states are supposed to be poles of the S-matrix, and
Bethe-Salpeter equations for the bound state dynamics can be
obtained approximately from resumming infinite families of
Feynman diagrams. See Chapter 14 of Weinberg's QFT I. But...
Perturbative QED (even in Scharf's rigorous treatment)
has nothing at all to say about how to model bound states.
Bound states don't exist perturbatively: The poles in the S-matrix
can arise only by summing infinitely many Feynman diagrams.
(Sum the geometric series 1+x+x^2+... to see how poles arise by
summation.)
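A few lines of code make the remark about the geometric series
explicit. This is only a toy illustration (x plays the role of a
generic expansion variable; the function names are mine): every
finite partial sum is a polynomial and stays finite at x=1, while
the resummed expression 1/(1-x) has a pole there.

```python
def partial_sum(x, order):
    """1 + x + ... + x**order: the analogue of finitely many diagrams."""
    return sum(x ** k for k in range(order + 1))


def resummed(x):
    """Closed form 1/(1-x) of the full series, with a pole at x = 1."""
    return 1.0 / (1.0 - x)


# Well inside the region of convergence both agree.
print(partial_sum(0.5, 60), resummed(0.5))

# Close to x = 1 a fixed finite order knows nothing about the pole:
# the partial sum is bounded by order+1, the resummed value is huge.
print(partial_sum(0.999, 10), resummed(0.999))
```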
I haven't seen a single rigorous treatment of such an issue in
quantum field theory.
Weinberg states in his QFT book (Vol. I) repeatedly that bound state
problems (and this includes the Lamb shift) are still very poorly
understood (though the Lamb shift is one of the most accurately
predicted physical quantities). On p.564 he says,
'These problems are those involving bound states [...]
such problems necessarily involve a breakdown of ordinary
perturbation theory. [...] The pole therefore can only arise
from a divergence of the sum of all diagrams [...]'
On p.560, he writes,
'It must be said that the theory of relativistic effects
and radiative corrections in bound states is not yet in an
entirely satisfactory shape.'
This remark suggests that he seems to think that, in contrast,
for scattering problems, the theory is in an entirely satisfactory
state, as given in the rest of his book. Thus 'satisfactory'
does not mean 'mathematically rigorous', but only
'well understood from a physical, approximate point of view'.
There are, of course, methods for approximating bound state problems,
based on Bethe-Salpeter equations, Schwinger-Dyson equations, and
some other approaches. See, e.g., the review
H. Grotch and D.A. Owen,
Foundations of Physics 32 (2002), 1419-1457.
or hep-ph/0308280.
But all of this is done in completely uncontrolled approximations,
and to get numerically consistent results is currently more an
art than a science.
This leaves plenty of scope for interesting (but hard)
new work on bound states on both the physical and mathematical side.
-------------------------
S8a. Why renormalization?
-------------------------
Quantum field theory is what particle physicists define it to be, and
this includes many working interacting QFTs. But it is not a theory
in the mathematical sense. This is due to the freedom they take
when discussing the renormalization needed to remove formal
infinities from their theories.
-----------------------------------------
S8b. Renormalization without infinities I
-----------------------------------------
Renormalization in QFT is often associated with the need to handle
infinities. This makes everything look like nonsense from a
mathematical point of view. But this is just the sloppiness of
physicists; it is not difficult to get a satisfying view of
renormalization without encountering any weird infinities.
The basic principles can be explained without knowing anything about
quantum mechanics, since renormalization is a much more general
phenomenon associated with idealizations in a theory and the
corresponding limits. As such it is also needed in various classical
situations (classical point electrons, turbulence, etc.)
hep-th/0212049 is a nice paper discussing most of renormalization
without ever mentioning fields (which come in quite late).
In all cases, we want to describe a situation which is a limit of more
complex and often less symmetric situations. This limit is the only
problematic thing, and sometimes generates infinities if done in an
improper way. Just as when trying to compute
s_N = sum_{k=0:N} (-1)^k/(k+1)^s = u_N - v_N
by summing the even and odd contributions u_N and v_N separately.
The limit N to inf is well-defined for s>0, but can be obtained only
for s>1 by going to the limit in u_N and v_N separately.
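This can be watched numerically. The sketch below takes s=1/2 (my
choice of exponent; any 0<s<=1 shows the same effect) and prints the
even part u_N, the odd part v_N, and their difference s_N for growing
N. The two parts grow without bound, roughly like sqrt(N), while the
difference settles down.

```python
def split_sums(N, s):
    """Return (u_N, v_N, s_N) for s_N = sum_{k=0:N} (-1)^k/(k+1)^s."""
    u = sum(1.0 / (k + 1) ** s for k in range(0, N + 1, 2))  # even k
    v = sum(1.0 / (k + 1) ** s for k in range(1, N + 1, 2))  # odd k
    return u, v, u - v


for N in (100, 10000, 1000000):
    u, v, total = split_sums(N, 0.5)
    print(N, u, v, total)
```

For s=1/2 the limit of s_N is about 0.605, although u_N and v_N have
no limit at all.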
One needs to proceed as in the techniques to evaluate limits which
give naively inf-inf, by using some transformation that cancels the
infinities analytically. Example:
lim sqrt(n^2+n)-sqrt(n^2+1)
= lim ((n^2+n)-(n^2+1))/(sqrt(n^2+n)+sqrt(n^2+1))
= lim (n-1)/(sqrt(n^2+n)+sqrt(n^2+1)) = 1/2.
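The same lesson shows up in floating point arithmetic, as a loose
analogy: the naive difference of two huge numbers loses all
information, while the analytically transformed expression is
harmless. A small sketch (function names are mine; ordinary double
precision is assumed):

```python
import math


def naive(n):
    """Difference of two large square roots, evaluated as written."""
    return math.sqrt(n * n + n) - math.sqrt(n * n + 1)


def rationalized(n):
    """Same quantity after cancelling the large parts analytically."""
    return (n - 1) / (math.sqrt(n * n + n) + math.sqrt(n * n + 1))


print(naive(1e3), rationalized(1e3))    # both close to 0.4994
print(naive(1e17), rationalized(1e17))  # naive gives 0.0, rationalized about 0.5
```

At n=1e17 both radicands round to the same double, so the naive
formula returns exactly 0, whereas the transformed one is close to
the correct limit 1/2.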
In quantum physics, the data (the Hamiltonian in QM, the action in QFT)
depends on some parameter vector v of dimension d, say, without direct
physical meaning. For example, v may consist of bare mass,
bare charge, and bare coupling constant.
Without the renormalization conditions we get a family of solutions
parameterized by v from which we can compute measurable quantities
combined into a vector q=q_N(v) of some dimension e>d,
where N is the parameter in which we want to take the limit.
(N might be an energy cutoff at energies beyond observability, and q the
observed particle spectrum.)
Anything we can reliably measure must clearly be essentially independent
of N, once N is large enough. Therefore the equation q=q_N(v) defines a
(generically) d-dimensional manifold in R^e whose limit as a set is also
a well-defined d-dimensional manifold. This is the manifold of interest,
since picking a particular finite value for N is usually subjective.
In a theory with finite renormalization, this limit manifold can still
be parameterized by v, since the limit
q(v)= lim_{N to inf} q_N(v) (*)
exists. Although v is unobservable it can be calculated from the
measurements by solving the equation q=q(v) in the least squares sense.
Rather than doing that (which would be numerically best in case the
measurements are inexact or q(v) is not exactly known) one proceeds
in theoretical work as if a d-dimensional vector mu of key physical
data and a corresponding subset of d equations were known exactly,
and can be solved exactly for v=v(mu).
Then one gets a renormalized parameterization
q=q_ren(mu), with q_ren(mu)= q(v(mu)), (**)
expressing everything in terms of the physical parameters mu.
When the limit (*) does not exist, the situation is more complicated.
Since there is no limiting q, one has to work at finite N. Proceeding
as before, one solves d of the equations in q=q_N(v) for v, getting
v=v_N(mu), but since the limit (*) does not exist, there will also be no
limit
v(mu) = lim_{N to inf} v_N(mu)
which would enable the use of (**). Instead, v_N(mu) diverges.
Loosely speaking, we get infinite bare masses and bare coupling
constants. But this limit will never be used, hence there are no
problems. It is just the loose way of speaking that creates the
impression of weirdness. The 'infinities' are caused by the nature
of the interactions. If they are too singular for a standard treatment
then the limits needed for a finite renormalization simply do not
exist anymore.
But this does not mean that the theory becomes meaningless but only
that one has to be careful in performing the limit only where it is
allowed to do so. This requires a small change in our procedure.
At finite N, we can still define a renormalized
parameterization
q = q_{N,ren}(mu), with q_{N,ren}(mu)= q_N(v_N(mu)).
For a renormalizable theory, the limit
q_ren(mu) = lim_{N to inf} q_{N,ren}(mu)
exists although neither q_N nor v_N converge.
Once this limit replaces the naive bare recipe (*)-(**) which is
ill-defined, everything behaves properly as it should.
The situation may be slightly more complex than indicated above.
Instead of working with directly measurable quantities one often
works with formally more tractable quantities q that are finitely
related to the key measurable quantities mu (such as observed mass
spectra). However, their definition depends on an additional scale
parameter E that fixes the renormalization conditions. (This parameter
should not be mixed up with the cutoff energy, which after
renormalization is always infinite!)
Thus we actually have q=q_N(v,E), solve some of these equations for
v=v_N(mu,E), and get as a result
q = q_{N,ren}(mu,E), with q_{N,ren}(mu,E)= q_N(v_N(mu,E),E),
hence
q_ren(mu,E) = lim_{N to inf} q_{N,ren}(mu,E).
But since the scale E can be chosen arbitrarily, the final renormalized
result of physical predictions P(q,E) must be
independent of E. Thus,
d/dE P(q_ren(mu,E),E) = 0,
which is a form of the renormalization group equations.
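Written out with the chain rule, this condition takes the following
form (a sketch; the partial-derivative notation is added here and
treats q as a vector with components q_i):

```latex
\frac{d}{dE} P\bigl(q_{\rm ren}(\mu,E),E\bigr)
  = \frac{\partial P}{\partial E}
  + \sum_i \frac{\partial P}{\partial q_i}\,
           \frac{\partial q_{{\rm ren},i}}{\partial E}(\mu,E)
  = 0 .
```

Thus the explicit dependence of P on E must be compensated exactly by
the E-dependence of the renormalized quantities q_ren(mu,E).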
----------------------------------------
S8d. Renormalization and coarse graining
----------------------------------------
In QFT, there are two different scales, one on the bare level and one
on the renormalized level, and the meaning of the renormalization
group is slightly different from that in statistical mechanics.
--------------------------------------------------------
S8e. Renormalization scale and experimental energy scale
--------------------------------------------------------
The picture drawn in the preceding sections is somewhat incomplete with
regard to the practice of computing, due to the fact that we cannot
compute this renormalized theory at any E, since it is exceedingly
complicated.
Thus we need to consider approximations. These approximations are
no longer independent of E, since the approximation errors depend
on it. It turns out that the approximation errors are small only
when the energy scale of the experiment for which a prediction is
made is close to the renormalization scale E, since (see, e.g.,
Weinberg's QFT book, Vol. 2, Chapter 18.1) the perturbative
expansion contains arbitrary powers of log(E_experiment/E) which
therefore must be kept small.
Thus one needs to evaluate the theory near the scale of interest.
However, perturbation theory is valid only near a fixed point E^* of
the renormalization group equations. Therefore, one determines
approximate formulas for the quantities q_ren(mu,E) with E close to
the appropriate fixed point E^*, and then uses (also approximate)
renormalization group equations to transform the result to the
scale of interest.
Thus there are two different scales involved, the energy scale
E_exp where the experiments are done, and the renormalization scale
E_ren (previously denoted by E).
On the experimental side, coupling constants (such as the charge)
are determined with reference to some effective, coarse grained theory
(such as the nonrelativistic Schroedinger equation). This effective
theory depends on E_exp (for QED, the charge is traditionally defined
in the low energy limit E_exp to 0). This effective theory behaves
like any other coarse-grained theory, giving rise to running coupling
constants such as e=e_exp(E_exp). But these depend on the details of
the coarse-graining scheme, and the computed results depend on the
coarse-graining, too, and hence on E_exp.
-------------------------------
S8f. Dimensional regularization
-------------------------------
The neatest way to perform regularization, and the only one which
works well in complicated cases such as nonabelian gauge theories
is dimensional regularization. Unfortunately, it is presented
in most textbooks in a way that looks quite mysterious, involving
unphysical fractional dimensions. This is however just sloppiness
on the side of physical tradition, and a more rigorous approach
removes everything strange.
-----------------------------------------
S8g. Nonrelativistic quantum field theory
-----------------------------------------
The right way to understand relativistic QFT is to regard it as
a limit of nonlocal nonrelativistic quantum field theory.
The latter is much better behaved.
Interacting QFT in 3+1 dimensions exists, however, as a rigorous
mathematical theory in the nonrelativistic case, since there only
finite renormalizations are needed and no infinities occur.
In this context, Feynman-Dyson perturbation theory can be given a
rigorous meaning. Note that nonrelativistic QFT is nonlocal
because of the Coulomb potential interaction.
Interacting QFT based on Feynman-Dyson perturbation theory
in 3+1 dimensions exists as a rigorous mathematical theory
in the relativistic case, as a limit of smeared, nonrelativistic
theories. This is done for Phi^4 theory in all details in
Salmhofer's book. For technical reasons, one gets the results
however only in a very weak topology corresponding to power series
in the coupling constant, rather than as true functions of the
coupling constants. Thus perturbative relativistic QFT is rigorously
established in 4D while nonperturbative relativistic QFT in 4D
is still elusive.
However, the infinities that plague 4D relativistic QFT are already
present in 3D, and there rigorous constructions have been given.
Exactly the same kind of renormalization tricks are used in 3D.
Thus our present lack of understanding cannot be blamed on
renormalization, but has to do with the difficulty of getting
the hard analytical estimates needed to justify the constructions.
-----------------------------------------------------
S8h. Nonrenormalizable theories as effective theories
-----------------------------------------------------
The difference between renormalizable and unrenormalizable theories is
that the former are specified by a (small) finite number of parameters
while the latter are specified by an infinite number of parameters.
In a renormalizable quantum field theory, only few counterterms
must be added to the action in order to get a consistent
finite perturbative expansion at all orders. This means that a few
parameters suffice to get a consistent theory which will be correct
at the energies of interest (which should be essentially independent
of what happens at the inaccessible large energies).
In a nonrenormalizable quantum field theory, infinitely many
counterterms must be added to the action in order to get a consistent
finite perturbative expansion at all orders. This means that with a few
parameters one can only get an effective low order theory, which may,
however, still be good enough at the energies of interest.
But for better approximation, one needs to determine more and
more parameters...
-------------------------------------
S8i. What about infrared divergences?
-------------------------------------
Renormalization theory deals with the regularization of ultraviolet
divergences, occurring at very high but unobservable energies.
In contrast, infrared divergences arise if there are problems at
very low energies. They are not cured by renormalization and need
completely different techniques.
-----------------------
S9b. Is QED consistent?
-----------------------
Quantum electrodynamics (QED) gives the most accurate predictions
quantum physics currently has to offer.
The anomalous magnetic dipole moment matches the experimental data
to 12 significant digits:
M. Passera,
Precise mass-dependent QED contributions to leptonic g-2 at order
alpha^2 and alpha^3,
Phys. Rev. D 75, 013002 (2007).
http://arxiv.org/abs/hep-ph/0606174
B. Odom, D. Hanneke, B. D'Urso, and G. Gabrielse,
New Measurement of the Electron Magnetic Moment Using a
One-Electron Quantum Cyclotron,
Phys. Rev. Lett. 97, 030801 (2006)
http://hussle.harvard.edu/~gabrielse/gabrielse/papers/2006/NewElectronMagneticMoment.pdf
The Lamb shift, whose prediction made QED and renormalization
respectable, is much more difficult to measure with high precision,
hence offers no such phenomenal test of accuracy:
S.G. Karshenboim,
Precision physics of simple atoms: QED tests, nuclear structure
and fundamental constants,
Phys. Rep. 422 (2005), 1-63
http://arxiv.org/abs/hep-ph/0509010
Nevertheless, many physicists think that QED cannot be a consistent
theory. There is a phenomenon called the Landau pole:
http://en.wikipedia.org/wiki/Landau_pole
It indicates that at extremely large energies (far beyond the range of
physical validity of QED, even far beyond the Planck energy) something
might go wrong with QED. (QED loses its validity already at energies
of about 10^11 eV, where the weak interaction becomes essential.
The Planck energy at about 10^28 eV is the limit where some current
theories try to make predictions. But the Landau pole, if it exists,
has an energy far larger than the latter.)
This is probably why Yang-Mills and not quantum electrodynamics was
chosen as the model theory for the millennium prize.
Since the existence of the Landau pole is confirmed only in low order
perturbation theory and in lattice calculations,
hep-lat/9801004 and hep-th/9712244
the question whether the alleged Landau pole implies limits to the
consistency of QED has currently no rigorous mathematical substance.
The observations about the Landau pole in perturbation theory can be
recast in mathematically rigorous terms using so-called renormalons,
obstructions to Borel summability; see
V Rivasseau
From Perturbative to Constructive Renormalization
Princeton 1991
But the resulting analysis is inconclusive as regards the existence
of the theory.
QED is renormalizable at all loops, which means that the power series
expansion of the S-matrix is mathematically well-defined at ordinary
energies. The _only_ thing that is missing is to give its limit a
mathematically well-defined meaning.
Note that the S-matrix S commutes with the Hamiltonian;
hence if P is the orthogonal projector to the space H_limit of
states involving only energies < E_limit(alpha)
then PSP is unitary on H_limit, and my conjecture is that PSP has
some (yet unknown but) rigorous nonperturbative construction.
The Landau pole (if it exists) just gives an upper bound to the allowed
energies. E_limit(alpha) is a function of alpha, which according to
perturbation theory has to satisfy
E_limit(0) < (Landau-pole in lowest order)
(and possibly decreases with increasing alpha); apart from that,
the known approximate results do not restrict the likely mathematical
validity of pure quantum electrodynamics.
A cautious evaluation of the situation is given in Weinberg's QFT book,
Vol. 2, pp.136-138 - all options are left open. On the other hand,
D. Espriu and R. Tarrach,
The case for triviality,
Phys. Lett. B383 (1996) 482,
argue that, because of the Landau pole, quantum electrodynamics is
only an effective field theory.
To summarize:
QED is renormalizable at all loops, which means that the power series
expansion of the S-matrix is mathematically well-defined at ordinary
energies. The _only_ thing that is missing is to give its limit a
mathematically well-defined meaning derived from a formulation of
QED that makes sense also at finite times and not only as a transition
from t=-infinity to t=+infinity.
-------------------------------------------------
S9c. What about relativistic QFT at finite times?
-------------------------------------------------
Although many time-dependent observable consequences of QED
can be deduced in a nonrigorous way in the Schwinger-Keldysh
= closed time path (CTP) formalism, there is at present no rigorous
relativistic quantum field theory at finite times in 4 dimensions.
In lower dimensions, for all theories where Wightman
functions can be constructed rigorously, there is an associated
Hilbert space on which corresponding (smeared) Wightman fields
and generators of the Poincare group are densely defined.
This implies that there is a well-defined Hamiltonian H=cp_0 that
provides via the Schroedinger equation the dynamics of wave functions
in time.
In particular, if the Wightman functions are constructed via the
Osterwalder-Schrader reconstruction theorem, both the Hilbert space
and the Hamiltonian are available in terms of the probability measure
on the function space of integrable functions of the corresponding
Euclidean fields. For details, see, e.g., Section 6.1 of
J. Glimm and A Jaffe,
Quantum Physics: A Functional Integral Point of View,
Springer, Berlin 1987.
In particular, (6.1.6), (6.1.11) and Theorem 6.1.3 are relevant.
Unfortunately, no Wightman functions have been constructed so far
for interacting 4D quantum field theories; see the FAQ entry on
'Is there a rigorous interacting QFT in 4 dimensions?'.
However, the functional integration measure of Euclidean QED is known
to exist perturbatively at all orders (Tomonaga, Schwinger and Feynman
got the Nobel prize for this), though a nonperturbative construction
is still missing. By analytic continuation as in the
Osterwalder-Schrader reconstruction theorem, one should be able to
obtain a perturbatively valid Hamiltonian for QED (cf. Theorem 6.1.3
in Glimm and Jaffe).
-------------------------------------------------
S9d. Perturbation theory and instantaneous forces
-------------------------------------------------
In classical relativity theory, causality demands that all forces
are retarded. In relativistic quantum theory, this principle is
somewhat obscured, due to the approximations needed to get a
dynamical picture. The general practice is to expand in powers
of v/c, where v is a velocity and c is the speed of light.
When doing this, the resulting formulas look instantaneous at
each order of perturbation theory, which might invite unfounded
conclusions.
However, the same already happens at the classical level, where
the situation is easy to understand. The retarded terms must
reappear when summing terms to all orders.
This is most easily seen by noting that a retarded differential
equation (for simplicity 1D, but the 4D case is similar)
dx(t)/dt = f(x(t-tau)),
when expanded in powers of the small parameter tau, becomes a
higher order ordinary differential equation at fixed order.
To see this, differentiate the original equation k times and
introduce new functions
x_0=x, x_1=dx/dt, ..., x_k=d^kx/dt^k
to get a system of retarded differential equations.
Then expand the equation for dx_k/dt up to order n-k.
Then substitute terms on the right hand side.
The approximate equation is manifestly instantaneous, but it
describes the perturbative behavior of the retarded equation.
Thus perturbation theory in v/c cannot be used to decide about the
instantaneous or retarded nature of quantum dynamics.
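A minimal numerical sketch of this, assuming the simplest linear case
f(x) = -x (a choice made here for illustration only): the retarded
equation dx/dt = -x(t-tau) has exponential solutions x = exp(lambda t)
with lambda = -exp(-lambda tau), while truncating
x(t-tau) = x - tau x' + tau^2/2 x'' - ... at order 0, 1, 2 gives
instantaneous ordinary differential equations whose decay rates
approach the retarded one. The function names are hypothetical.

```python
import math

def rate_retarded(tau, iterations=50):
    """Solve lam = -exp(-lam*tau) by Newton's method, starting at lam = -1."""
    lam = -1.0
    for _ in range(iterations):
        f = lam + math.exp(-lam * tau)
        df = 1.0 - tau * math.exp(-lam * tau)
        lam -= f / df
    return lam

def rate_instantaneous(tau, order):
    """Decay rate of the ODE obtained by truncating x(t-tau) at this order."""
    if order == 0:   # x' = -x
        return -1.0
    if order == 1:   # x' = -(x - tau x')
        return -1.0 / (1.0 - tau)
    if order == 2:   # x' = -(x - tau x' + tau^2/2 x''), root connected to -1
        b = 1.0 - tau
        return -2.0 / (b + math.sqrt(b * b - 2.0 * tau * tau))
    raise ValueError("only orders 0, 1, 2 are implemented in this sketch")

tau = 0.1
exact = rate_retarded(tau)
errors = [abs(rate_instantaneous(tau, k) - exact) for k in range(3)]
print(exact, errors)
```

For tau = 0.1 the errors are approximately 0.12, 0.007 and 0.0003:
each instantaneous equation reproduces the retarded decay rate better
than the previous one, although none of them contains a delay.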
-------------------------------------------
S9e. QED and relativistic quantum chemistry
-------------------------------------------
Relativistic quantum chemistry is needed to predict properties
of heavy atoms. This is usually done by invoking the Dirac-Fock
Hamiltonian, which is an approximation of the QED Hamiltonian
for which the multiparticle bound state problem is tractable.
Here are a few samples of what can be done:
The first is explicitly time-dependent;
the second is about bound state calculations;
the third shows how to add further QED corrections;
the fourth shows how the Dirac-Fock Hamiltonian arises as an
approximation of QED; the last gives a discussion of some
mathematical problems involved.
Fink+Johnson
Electron correlations and spin-orbit interaction in two-photon
ionization of closed-shell atoms: A relativistic time-dependent
Dirac-Fock approach
Phys. Rev. A 42, 3801-3818 (1990)
Bieron et al.
Large-scale multiconfigurational Dirac-Fock calculations of the
hyperfine-structure constants and determination of the nuclear
quadrupole moment of 49Ti
Phys. Rev. A 59, 4295-4299 (1999)
Indelicato+Desclaux
Multiconfiguration Dirac-Fock calculations of transition energies
with QED corrections in three-electron ions
Phys. Rev. A 42, 5139-5149 (1990)
P Chaix and D Iracane
From quantum electrodynamics to mean-field theory.
I. The Bogoliubov-Dirac-Fock formalism
J. Phys. B: At. Mol. Opt. Phys. 22 (1989) 3791-3814
M Defranceschi and C Le Bris
Computing a molecule in its environment: A mathematical viewpoint
Int J Quantum Chemistry 71 (1999) 227-250
----------------------------------
S9f. Are protons described by QED?
----------------------------------
The traditional field equations of quantum electrodynamics (QED),
which can be found in any textbook on quantum field theory, describe
only electrons, positrons, and photons, but not protons, although
the latter have electromagnetic interactions.
The reason is that, unlike free electrons and positrons, free protons
do not obey the Dirac equation since they have form factors which are
(unlike for electrons and positrons) determined not only by interactions
with photons, but primarily by the inner structure of the proton.
Thus even bare protons cannot be understood as point particles, which
makes standard QED equations inapplicable.
To understand the proton's form factors from first principles needs
quantum chromodynamics (QCD) - and even then they are imperfectly
understood.
In the traditional QED treatment of molecules and their interaction
with light, protons and other nuclei are typically treated as classical
sources of electromagnetic fields when determining the structure of
the electron. (The resulting effective potential between the
nuclear positions is quantized afterwards if a full classical treatment
is not adequate). This gives excellent agreement with experiment,
in particular for the hydrogen atom.
Of course, one can treat QED together with a proton field as an
effective (and nonrenormalizable) theory, in which in addition to the
Dirac equation for the bare electrons there is a Dirac-like equation,
modified by the form factors, for the bare protons. To describe atoms
correctly, one needs also fields for neutrons and mesons, and
appropriate interaction terms between them, leading to quantum
hadrodynamics (plus QED). This accounts for all practically
relevant properties of atoms (including nuclear fission and fusion).
-------------------------------------------
S10a. How are matrices and tensors related?
-------------------------------------------
Mathematicians and physicists differ in the notation used for
vectors, tensors, matrices, and multilinear forms. Here is
a dictionary.
T^q = tensor product of q copies of the vector space T;
in particular, T^0=S is the algebra of scalar fields and T^1=T.
T^p_q = space of all linear mappings from T^q to T^p;
elements are (p,q)-tensors with p upper and q lower indices.
T^q_0 = T^q
T^0_p =: T_p = (T^p)^* is the so-called dual space of T^p;
in particular, T_1 = T^* is the dual space of T;
its elements are the linear forms = covectors.
One can associate with every A in T^p_q canonically a multilinear
mapping B: T^q tensor T_p --> S with
B(s,t) = t(As) for s in T^q, t in T_p,
and conversely; indeed, since the image As of s under A is in T^p,
its image t(As) is a well-defined scalar. Using the B's in place of
the A's gives an alternative way of defining tensors, although one
less convenient for visualization.
Given a basis on T and a dual cobasis on T^*, one can use coordinates.
Then physicists write
- elements of T as vectors = column vectors with an upper index,
- elements of T^* as linear forms = 1-forms = covectors = row vectors
with a lower index,
- elements of T^q as multivectors with q upper indices,
- elements of T_p as multicovectors with p lower indices,
- elements of T^q_p as mixed multi(co)vectors with p lower and q upper
indices.
(There is also a dual version of this, where vectors are considered
as rows and covectors as columns. The remainder then changes
accordingly.)
In particular,
(0,0)-tensor = scalar,
(1,0)-tensor = vector (vector in T=T^1) = column vector,
(0,1)-tensor = covector (vector in the dual space T^*=T_1)
= row vector,
(1,1)-tensor = matrix (linear mapping from T to T).
Clearly, the columns of the matrix A_i^k are column vectors = vectors,
the rows are row vectors = covectors, and the indexing is consistent.
The requirement that basis and cobasis are dual is equivalent to the
statement that for every vector u and covector w (i.e., linear mapping
from vectors to scalars),
w(u) = w_i u^i;
here the Einstein convention is used that formulas involving
pairs of equally labelled indices, one of them a lower index
and the other an upper index must be interpreted as a sum over these
indices.
The relation between the physicists' form and the linear algebra form
of writing things can be inferred from (**) - we simply have
Phys. notation: g_ik
Math. notation: G = (g_ik)
Phys. notation: g^ik
Math. notation: G^{-1} = (g^ik)
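In coordinates, the two notations amount to the same arithmetic. The
following sketch assumes, purely for illustration, that g is the
Minkowski metric diag(1,-1,-1,-1), so that G^{-1} = G; the helper
names are hypothetical. It lowers an index via u_i = g_ik u^k (in
matrix form, G u), raises it again with G^{-1}, and evaluates
w(u) = w_i u^i with the Einstein convention written as an explicit sum.

```python
# Minkowski metric as a matrix G = (g_ik); an illustrative assumption.
G = [[1, 0, 0, 0],
     [0, -1, 0, 0],
     [0, 0, -1, 0],
     [0, 0, 0, -1]]
G_INV = G  # diagonal entries are +1 or -1, so G is its own inverse

def matvec(M, x):
    """Matrix-vector product, i.e. the contraction M_ik x^k."""
    return [sum(M[i][k] * x[k] for k in range(len(x))) for i in range(len(M))]

def lower(u):
    """u_i = g_ik u^k (vector to covector)."""
    return matvec(G, u)

def raise_index(w):
    """w^i = g^ik w_k (covector to vector)."""
    return matvec(G_INV, w)

def pair(w, u):
    """w(u) = w_i u^i, the Einstein sum over the repeated index i."""
    return sum(w_i * u_i for w_i, u_i in zip(w, u))

u = [2, 1, 0, 3]
u_low = lower(u)
print(u_low, raise_index(u_low), pair(u_low, u))
```

Lowering and then raising an index returns the original components,
which is the coordinate statement that (g^ik) is the inverse of (g_ik).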
--------------------------------------------------------------
S10b. Is quantum mechanics compatible with general relativity?
--------------------------------------------------------------
The difficulty of reconciling quantum mechanics and general relativity
counts as one of the big problems of fundamental physics.
There appears to be a problem because canonical quantum gravity
based on quantizing the Hilbert action is nonrenormalizable.
(See the section on 'Renormalization in quantum gravity' in this
FAQ about how nevertheless to renormalize a nonrenormalizable field
theory.)
----------------------------------------
S10c. Difficulties in quantizing gravity
----------------------------------------
(i) (mathematical) No consistent interacting relativistic quantum
field theory is known in 4 dimensions.
(ii) (theoretical) The accepted ways to avoid divergences in
expressions for scattering amplitudes that work in simpler theories
all fail because of the lack of renormalizability. See, e.g.,
the references in Section 2.2 of
http://relativity.livingreviews.org/Articles/lrr-2002-5/
(iii) (theoretical) The theories for which a (perturbatively)
finite scattering theory is available have not been related
quantitatively to the established theories.
A convincing classical limit (to general relativity),
nonrelativistic limit (to a multiparticle Schroedinger equation with
Newtonian interaction), and low energy limit (at currently accessible
energies no new particles apart from the graviton) would be needed.
(iv) (conceptual) The three limits pose severe constraints on possible
quantum gravity theories, and it requires much imagination to come up
with a conceptual basis in which these limits make sense and are
tractable. (But see the preceding entry.)
(v) (experimental) Quantum effects in gravity are so weak that no
experiments sensitive to quantum effects are in reach in the near
future, and the data from astronomy that may cast light on quantum
gravity are scarce. (Quantum gravity is not demanded by unexplained
data but only by the quest for consistency with particle physics.)
----------------------------------------
S10d. Renormalization in quantum gravity
----------------------------------------
Renormalization of QFTs is needed to make the coefficients in the
loop expansion (i.e., the expansion in powers of Planck's constant hbar)
of the S-matrix well-defined.
Canonical quantum gravity is the theory obtained by writing down the
Einstein-Hilbert action in a (3+1)-dimensional splitting (ADM formalism)
and either fixing coordinates and solving the constraints (reduced phase
space quantization) or quantizing using Dirac's approach to constrained
systems (Dirac quantization).
Covariant quantum gravity is the theory obtained as follows:
Write down the classical Hilbert action for general relativity,
look at the corresponding functional integral defined perturbatively
as for QED or QCD, and try to compute S-matrix elements using the
usual renormalization prescriptions for the integrals corresponding
to the various Feynman diagrams.
Quantum field theories are nowadays almost always defined in the
covariant way; the covariant approach has the advantage of being
manifestly invariant under the full symmetry group. (The canonical
approach to scalar QED fails in certain versions to preserve
Poincare symmetries, due to term ordering problems; see
gr-qc/9403065.) On the other hand, the canonical approach is
intrinsically nonperturbative, while the covariant approach needs
extra tricks (renormalization group enhancements) to get partial
nonperturbative results.
Covariant quantum gravity only works in the traditional way up to
1 loop (and together with matter not even then); at higher loops
(i.e., for corrections of higher order in the Planck constant hbar)
one needs more and more counterterms to make the resulting combination
of integrals finite. See
S. Deser,
Infinities in Quantum Gravities,
http://arxiv.org/pdf/gr-qc/9911073v1
(and references [2,4] there). This is called 'nonrenormalizability',
and is the main blemish of covariant quantum gravity.
(For other potential problems, see, e.g., gr-qc/0108040.)
Note that quantum gravity, though nonrenormalizable in the
established sense, is renormalizable in a weak sense,
where infinitely many counterterms are allowed; see
J. Gomis and S. Weinberg,
Are Nonrenormalizable Gauge Theories Renormalizable?
http://arxiv.org/pdf/hep-th/9510087.
Most researchers in quantum gravity want a renormalizable theory
in the strong sense (so that finitely many counterterms suffice);
then covariant quantum gravity is out, and people look
for fancy alternatives (loop quantum gravity, superstring
theory, etc.). However, these theories have their own difficulties.
Some online references are:
gr-qc/9803024: Strings, loops and others: a critical survey
of the present approaches to quantum gravity
gr-qc/9710008: Loop quantum gravity
http://relativity.livingreviews.org/Articles/lrr-1998-1/index.html
hep-th/9709062: Introduction to superstring theory
astro-ph/0304507: Update on string theory
hep-th/0311044: The nature and status of string theory
physics/0605105: a short review of superstring theories
gr-qc/0410049 shows how gravity derives from string theory;
a more complete derivation is in section 3.7 of Polchinski's book.
Phys. Rev. Lett. 60, 2105-2108 (1988) discusses the lack of Borel
summability of the S-matrix expansion for the bosonic string.
http://math.ucr.edu/home/baez/week195.html tells about the state
in 2003 concerning the claims of (super)string theory to be a
renormalizable quantum theory. Only the 2 loop case seems to be
settled; see arXiv:hep-th/0501197 and hep-th/0211111 (especially
Section 14 of the latter for the unsolved problems at 3 loops and
higher).
Others treat covariant quantum gravity just as they treat
nonrenormalizable effective field theories, and fare well with it.
See, for example,
C.P. Burgess,
Quantum Gravity in Everyday Life:
General Relativity as an Effective Field Theory
Living Reviews in Relativity 7 (2004), 5
http://www.livingreviews.org/lrr-2004-5
for 1-loop corrections, and
Donoghue, J.F., and Torma, T.,
Power counting of loop diagrams in general relativity,
Phys. Rev. D, 54, 4963-4972,
http://arxiv.org/abs/hep-th/9602121
for higher-loop behavior.
Section 4.1 of Burgess's review discusses recent computational studies showing that
covariant quantum gravity regarded as an effective field theory
predicts quantitative leading quantum corrections to the
Schwarzschild, Kerr-Newman, and Reissner-Nordstroem metrics.
Only a few new parameters arise at each loop order, in particular only
one (the coefficient of curvature^2) at one loop.
In particular, at one loop, Newton's constant of gravitation becomes
a running coupling constant with
G(r) = G - (167/(30 pi)) G^2/r^2 + ...
in terms of a renormalization length scale r.
Here is a quote from Section 4.1:
''Numerically, the quantum corrections are so miniscule as to be
unobservable within the solar system for the forseeable future.
Clearly the quantum-gravitational correction is numerically extremely
small when evaluated for garden-variety gravitational fields in the
solar system, and would remain so right down to the event horizon even
if the sun were a black hole. At face value it is only for separations
comparable to the Planck length that quantum gravity effects become
important. To the extent that these estimates carry over to quantum
effects right down to the event horizon on curved black hole
geometries (more about this below) this makes quantum corrections
irrelevant for physics outside of the event horizon, unless the
black hole mass is as small as the Planck mass''
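To get a feeling for the size of the one-loop term in G(r) above, one
can restore units by dimensional analysis, replacing G^2/r^2 by
G * (G hbar/c^3)/r^2 = G * l_P^2/r^2 with the Planck length l_P. This
restoration of hbar and c, the rounded SI constants, and the choice
r = 1 astronomical unit are assumptions made here for illustration;
the function name is hypothetical.

```python
import math

G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2 (rounded)
HBAR = 1.0546e-34      # J s (rounded)
C = 2.998e8            # m/s (rounded)

def relative_correction(r):
    """Size of (167/(30 pi)) * l_P^2 / r^2, the relative one-loop shift of G."""
    planck_length_sq = G_NEWTON * HBAR / C**3   # l_P^2 in m^2
    return 167.0 / (30.0 * math.pi) * planck_length_sq / r**2

AU = 1.496e11          # m, one astronomical unit (rounded)
delta = relative_correction(AU)
print(delta)
```

At solar-system distances the relative shift is of order 10^-92, in
line with the quoted statement that the corrections are unobservable.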
----------------------------------------------
S10e. Hadamard states and their Hilbert spaces
----------------------------------------------
In his book on quantum field theory in curved spacetime,
Wald delineates a class of 2-point functions called Hadamard states
that have locally the same kind of singular behavior as the flat
free 2-point functions. This class of states is also natural from
several other points of view, though I cannot give details off-hand
since this is slightly outside my field of knowledge.
Associated to each Hadamard state is a Gaussian state |0>
of the quantum field which is constructed from the 2-point function
via Wick's theorem. This state is often called a 'vacuum state',
though this is not quite appropriate, unless one allows the vacuum
to carry gravitational and electromagnetic fields. A more appropriate
name would be a 'coherent state' since it is the generalization of
coherent states in the Fock spaces considered in optics.
Each Gaussian state produces a Hilbert space of wave functions
consisting of linear combinations of the a*_k1 a*_k2 ...|0>,
weighted by sufficiently smooth functions of the k's to render
their norm finite.
All states in this Hilbert space are also physically reasonable,
but they do not have the same basic (vacuum-like)
status as the Hadamard states since they are no longer Gaussian,
and hence are harder to work with.
But you can evaluate <psi|phi(x)phi(y)|psi> in such a state by
expanding everything in terms of vacuum expectations of expressions
in a's and a^*'s and applying Wick's theorem. Their leading singular
behavior is probably the same as for the Gaussian state itself,
though I haven't tried to check this.
-----------------------------------
S10f. Why do gravitons have spin 2?
-----------------------------------
The reason is that gravitation is described by a metric
(symmetric 2-tensor field) modulo general covariance,
which gives locally, in the tangent Minkowski space of any point,
a spin 2 representation of the Poincare group.
Gravitational waves have to be (classically) long range,
which requires (after quantization) massless particles.
Thus gravitons (although never observed) should be massless
spin 2 particles.
-----------------------------------
S10g. What is the tetrad formalism?
-----------------------------------
A way of writing general relativity such that it can be
applied to a spinor (e.g. electron) field.
A tetrad is a set of four linearly independent
vector fields e_0, e_1, e_2, e_3.
Considering them orthonormal in the sense that
g(e_j,e_k)=eta_jk (*)
where eta is the Minkowski metric defines the
metric g uniquely; conversely, for any metric one can
choose (on any chart) such an orthonormal basis.
If the manifold is parallelizable then one can choose
the ONB even globally. In 4 dimensions, any manifold
on which spinors can be defined consistently is
parallelizable (by a result of Geroch), hence reality
is most likely described by such a manifold.
Using (*), one can rewrite any formula involving the
metric into one involving instead tetrads, and many
things simplify - using tetrads is closer to the Cartan
formalism of differential geometry than using the metric
directly. E.g.,
sqrt(-det g) = det(e).
One has to be slightly careful not to confuse curved
and flat indices, but this is learnt very quickly.
Then one needs much less index shifting.
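The determinant identity follows in one line. As a sketch, write the
metric in terms of the coframe components e^a_mu (the matrix inverse
of the components of the vector fields e_a; this index notation is an
assumption added here), so that (*) becomes the first relation below:

```latex
g_{\mu\nu} = \eta_{ab}\, e^a{}_\mu\, e^b{}_\nu
\;\Longrightarrow\;
\det g = \det\eta \,(\det e)^2 = -(\det e)^2
\;\Longrightarrow\;
\sqrt{-\det g} = |\det e| ,
```

which equals det(e) when the tetrad is oriented such that det(e) > 0.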
For gravitation coupled to a (classical) Dirac field,
the tetrad formalism is indispensable, since spinors
cannot be defined without a flat representation.
----------------------------------
S10h. Energy in general relativity
----------------------------------
Energy is not an absolute concept, but depends on the observer
(in the nonrelativistic case, by choice of a velocity,
in the relativistic case, by choice of time-like unit
vector that defines the direction of time and hence the
time coordinate).
In classical mechanics there is always a (up to rotations)
distinguished center of mass frame where the whole system
is at rest and the center of mass at zero.
The observer is usually (silently) considered to be at rest
with respect to that frame; then there is no ambiguity
left in the energy.
In special relativity things are already more problematic
since there is no natural center of mass. But one can fix
the time direction by taking it to be that of the total
4-momentum of the whole system. This again fixes a frame,
now up to Euclidean motions. On the other hand, this is not
what an observer (whose proper time differs slightly,
depending on the observer's 4-momentum) sees, and must be
corrected accordingly.
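The observer dependence of energy in special relativity can be
illustrated as follows. This is a sketch under stated assumptions:
units with c = 1, signature (+,-,-,-), and two free particles whose
masses and velocities are made up for the example. The energy seen
by an observer with unit time-like 4-velocity u is the Minkowski
product of u with the 4-momentum.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # assumed signature (+,-,-,-), c = 1

def minkowski(a, b):
    """Minkowski inner product of two 4-vectors."""
    return float(a @ ETA @ b)

def four_momentum(mass, velocity):
    """4-momentum of a free particle with given mass and 3-velocity."""
    v = np.asarray(velocity, dtype=float)
    gamma = 1.0 / np.sqrt(1.0 - v @ v)
    return mass * gamma * np.concatenate(([1.0], v))

def observer(velocity):
    """Unit time-like 4-velocity of an observer moving with the given 3-velocity."""
    return four_momentum(1.0, velocity)

# Hypothetical system of two free particles.
particles = [four_momentum(1.0, [0.6, 0.0, 0.0]),
             four_momentum(2.0, [0.0, -0.3, 0.0])]
P_total = np.sum(particles, axis=0)

# The same system has different energies for different observers.
E_lab = minkowski(observer([0.0, 0.0, 0.0]), P_total)
E_moving = minkowski(observer([0.5, 0.0, 0.0]), P_total)

# Time direction fixed by the total 4-momentum itself.
M = np.sqrt(minkowski(P_total, P_total))   # invariant mass
u_com = P_total / M
E_com = minkowski(u_com, P_total)          # equals the invariant mass

print(E_lab, E_moving, E_com)
```

In the frame defined by u_com the total spatial momentum vanishes,
and the energy takes its smallest possible value, the invariant mass.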
In general relativity the conserved total 4-momentum is
identically zero, so there is no longer a way to fix a
time direction. But assuming an asymptotically flat
space-time one can take its flat coordinate system
(determined up to a Poincare transformation) and
use it to chart the localized part, and gets a Minkowski
description, to which the preceding applies.
In general relativity, the concept of energy depends on the
choice of a spacelike hypersurface defining a region of space
and a time-like vector field along that hypersurface defining
the direction of time: Then the integral of [part of]
the (0,0)-component of the energy-momentum tensor over this
hypersurface defines the corresponding [part of the]
energy in this region.
This allows one to talk about the (observer-dependent) energy
of a subsystem, or of all matter in the universe, etc.
Observer-independent is the energy-momentum tensor density
as a whole, but not energy.
The weak-field limit defines a preferred coordinate system,
thus reducing the arbitrariness to the choice of the time
direction, and the nonrelativistic limit fixes this choice
to be the direction of the total momentum of the reference
object (e.g., the earth or sun or our galaxy). This makes
everything completely determined and gives us a good
energy for everyday life.
----------------------------------
S10i. What happened to the aether?
----------------------------------
The aether as supporting substance for electromagnetic waves
was a standard hypothesis in the 19th century but fell out of
favor with the successes of relativity theory.
When in vogue, the aether was the substance filling empty space
- i.e., the physics of the aether is the physics of empty space.
In a way, the classical background field (also termed the 'vacuum',
or, more neutrally, a 'coherent state' or - in quantum gravity -
a 'Hadamard state') around which the quantum field is expanded into
excitation modes (photons, gravitons, etc.) is the modern equivalent
of the aether. However, nobody uses the term since it is fraught with
misleading connotations, and not really needed.
In modern language, the aether is called the vacuum, and the properties
of the aether are the properties of the vacuum.
While the 19th century aether was thought to be at rest,
the 20th century aether (= the vacuum in a quantum field theory)
is a Poincare invariant state with zero quantum numbers.
(In a putative quantum gravity, it would even be a diffeomorphism
invariant state, should something like that exist. The Unruh effect
indicates, however, that there is probably no objective vacuum,
since emptiness is observer dependent.)
Indeed, Poincare invariance is the modern way of saying
'being at rest' - the momentum of a Poincare invariant state is zero
in every frame of reference, and the mass of a Poincare invariant
state must also be zero, which implies that the vacuum is empty
in terms of mass. (It is however allowed to be filled by a constant
nonzero Higgs field, as required in the standard model.)
Identifying the aether and the vacuum is consistent with the way
Einstein thought about the topic, as the following quotes (translated
here from the German) from Einstein's lecture at the University of
Leyden, 1920, show:
''Since such fields also occur in the vacuum - i.e., in the free
aether - the aether also appears as a carrier of electromagnetic
fields.''
''One may add that the whole change in the conception of the aether
which the special theory of relativity brought about consisted in
taking away from the aether its last mechanical quality, namely
its immobility.''
''One may assume the existence of an aether; only one must give up
ascribing a definite state of motion to it, i.e., one must by
abstraction take from it the last mechanical characteristic which
Lorentz had still left it.''
''The aether of the general theory of relativity is a medium which
is itself devoid of all mechanical and kinematical properties, but
helps to determine mechanical (and electromagnetic) events.''
''Thus one may well also say that the aether of the general theory
of relativity has emerged from the Lorentzian aether through
relativization.''
''... To deny the aether ultimately means to assume that empty
space has no physical properties whatsoever...''
For the complete speech in German and in English translation, see
http://www.alberteinstein.info/db/ViewCpae.do?DocumentID=34003
(the part with the above quotes is not freely available online).
-------------------
S10j. What is time?
-------------------
It is commonly asserted that in general relativity there is no
absolute simultaneity. On the other hand, it is asserted that
we see the Sun as it was 8 minutes ago and the Andromeda nebula
as it was 2.5 million years ago. These two assertions seem to
conflict - apparently we have no diffeomorphism invariant way
of assigning a relative time to a distant object.
Let us take a closer look at the issues involved.
The invariant way of defining the present is to say that
y is present to x if the two points are in a spacelike relation,
and to say y is earlier (or later) than x if y lies in or on
the past (or future) light cone of x.
Thus the present of x is well-defined as the complement of the
closed light cone of x.
Now suppose that you look at the sun. If one is really pedantic,
one would have to say that you see the sun in your eye, as a
2D object, and not out there in 3D. But we are accustomed to
interpret our sensations in 3D and hence put the sun far away
but into the here.
In general relativity, one goes a step further.
One thinks in terms of the 4D spacetime manifold and places the sun
there. Calculating the length of the geodesic along which the light
traveled gives a value of 0, so the sun is not in your present.
Considering the sign of the time component in an arbitrary proper
Lorentz frame, one finds that the sun is in your past, as is
everything you observe.
But the amount of invariant time passed, as measured by the metric,
is zero. This looks like a paradox. What happened to the claimed
8 minutes?
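The situation can be made concrete in flat spacetime. The following
sketch assumes a 1+1-dimensional Minkowski model with sun and observer
at relative rest, one astronomical unit apart; the function names are
made up for the example. It classifies the emission event relative to
the observation event and compares the invariant interval with the
coordinate time differences in two Lorentz frames.

```python
import math

C = 299_792_458.0        # speed of light in m/s
AU = 149_597_870_700.0   # astronomical unit in m

def interval_squared(dt, dx):
    """Invariant interval c^2 dt^2 - dx^2 between two events."""
    return (C * dt) ** 2 - dx ** 2

def relation(dt, dx, rel_tol=1e-9):
    """Causal relation of event y to event x, where (dt, dx) = y - x."""
    s2 = interval_squared(dt, dx)
    scale = (C * dt) ** 2 + dx ** 2
    if s2 < -rel_tol * scale:
        return "present"                    # spacelike separation
    return "past" if dt < 0 else "future"   # in or on the light cone

def boost(dt, dx, beta):
    """Lorentz boost with velocity beta*c along the x axis."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (dt - beta * dx / C), g * (dx - beta * C * dt)

# Emission at the sun, relative to reception by your eye,
# in the frame where sun and observer are at rest.
dt_rest, dx_rest = -AU / C, AU
# The same pair of events seen from a frame moving with half the speed of light.
dt_moving, dx_moving = boost(dt_rest, dx_rest, 0.5)

print(relation(dt_rest, dx_rest))            # the sun event lies in the past
print(interval_squared(dt_rest, dx_rest))    # invariant interval: zero up to rounding
print(-dt_rest / 60.0, -dt_moving / 60.0)    # coordinate times in minutes, frame dependent
```

The invariant interval vanishes in both frames, while the elapsed
coordinate time (about 8.3 minutes in the rest frame) depends on
the frame used.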
The answer is that the metric time is not the right way to measure
time. It is the only time available in a Poincare-invariant flat
universe, or in a diffeomorphism invariant curved universe.
An empty universe where only noninteracting observers
move has no notion of simultaneity.
But a matter-filled, homogeneous and isotropic universe
generally has one, defined by the rest frame of the galactic fluid
with which general relativity models cosmology.
Since the fluid breaks Lorentz symmetry (except in
very special cases, which are ruled out by experiment)
it creates a preferred foliation of spacetime.
This foliation gives a well-defined cosmic time, when
scaled to make the expansion of the universe uniform.
(Actually there are several natural scalings = monotone
transformations of the time parameter;
see Section 27.9 in Misner/Thorne/Wheeler, so cosmic time
without a reference to the scale used is ambiguous.)
This cosmic time figures in all models of cosmology.
The values commonly talked about when quoting times
for cosmological events, such as the date of the big bang
or the time a photon seen now left the Andromeda nebula,
refer to this cosmological time.
-------------------------------
S10k. Time in quantum mechanics
-------------------------------
In the traditional formulation of quantum mechanics, time is not an
observable. Nevertheless it can be observed...
However, this analysis works only when one assigns to single clocks
a well-defined state, hence assumes a version of the Copenhagen
interpretation.
From the point of view of the minimal statistical interpretation,
one needs in contrast a whole ensemble of identically prepared
clocks to measure time...
--------------------------------------------------
S10l. Diffeomorphism invariant classical mechanics
--------------------------------------------------
In mechanics, time is a point in a 1-dimensional manifold,
and diffeomorphisms are just smooth reparameterizations of the time.
For any Lagrangian of the form
L(q,qdot,t) := U(q(t)) qdot(t),
where q is an n-dimensional column vector and U an n-dimensional
row vector, the action
S = integral L(q,qdot,t) dt
is diffeomorphism invariant. As a consequence, the Noether energy
(the formal Hamiltonian constructed in the transition from a Lagrangian
to a Hamiltonian formulation) vanishes identically and has no physical
content. For one can bring an arbitrary Hamiltonian system
xdot=H_p(p,x) , pdot=-H_x(p,x),
where H is the physically relevant energy, into the above form by
putting
q^T = (x^T,p^T,s),
U(q) = (p^T,0^T,-H(p,x)).
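The invariance of the action under reparameterization can be checked
numerically. The following is a minimal sketch under assumptions not
taken from the text: one degree of freedom, a harmonic oscillator
Hamiltonian, and an arbitrary smooth curve q = (x,p,s) that need not
solve the equations of motion. The action integral of U(q) qdot is
approximated by a midpoint rule for two different parameterizations
of the same curve.

```python
import math

def hamiltonian(x, p):
    """Assumed example: harmonic oscillator with unit mass and frequency."""
    return 0.5 * (p * p + x * x)

def U(q):
    """Row vector U(q) = (p, 0, -H(p,x)) for q = (x, p, s)."""
    x, p, s = q
    return (p, 0.0, -hamiltonian(x, p))

def curve(lam):
    """Arbitrary smooth curve q(lam), lam in [0, 1] (hypothetical test path)."""
    return (math.cos(2.0 * lam), math.sin(3.0 * lam), lam + 0.5 * lam * lam)

def action(reparam, n=20000):
    """Midpoint-rule approximation of the integral of U(q) . dq
    along curve(reparam(tau)) for tau in [0, 1]."""
    total = 0.0
    q_prev = curve(reparam(0.0))
    for k in range(1, n + 1):
        q_next = curve(reparam(k / n))
        q_mid = tuple(0.5 * (a + b) for a, b in zip(q_prev, q_next))
        total += sum(u * (b - a) for u, a, b in zip(U(q_mid), q_prev, q_next))
        q_prev = q_next
    return total

S_identity = action(lambda tau: tau)
# A smooth, strictly increasing map of [0, 1] onto itself.
S_warped = action(lambda tau: math.expm1(tau) / math.expm1(1.0))

print(S_identity, S_warped)
```

Both values agree up to discretization error. The Noether energy
qdot dL/dqdot - L = U(q) qdot - U(q) qdot vanishes identically,
since L is homogeneous of degree 1 in qdot.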
For a careful discussion see Section 4.3 of
PJ Olver,
Applications of Lie groups to differential equations,
Springer, New York 1993.
Those who can read German can find more in the Section on
''Diffeomorphismeninvariante klassische Mechanik'' in my
German Theoretische-Physik-FAQ at
http://www.mat.univie.ac.at/~neum/physik-faq.txt
For diffeomorphism invariant reformulations of arbitrary field
theories, see
C.G. Torre,
Covariant phase space formulation of parameterized field theories,
J. Math. Phys. 33 (1992) 3802-3812
hep-th/9204055
----------------------------
S10m. The concept of ''Now''
----------------------------
Time is passing - what is ''now'' in our subjective experience
changes. But there is no concept of ''now'' in physics.
Classical nonrelativistic mechanics does not know the concept of now.
One declares some time to be ''now'' - but which time one declares to
be ''now'' is completely subjective (i.e., in different situations it
will be declared differently). Similarly, one declares some position
to be ''here'', but which position you declare to be ''here'' is
completely subjective, in the same sense.
Classical relativistic mechanics does not know the concept of now,
either, but things change a little: Here one declares some event
(= spacetime point) to be ''here and now'' - but which event one
declares to be ''here and now'' is completely subjective.
Nonrelativistic quantum mechanics treats time completely differently
from space (time is a parameter, space coordinates are operators),
and introduces stochastic elements into the dynamics.
But with respect to ''here'' and ''now'', the situation is identical
with that in the classical nonrelativistic case.
Relativistic quantum mechanics restores the treatment of space and
time on equal footing (space and time coordinates are parameters),
and introduces stochastic elements into the dynamics.
But with respect to ''here and now'', the situation is identical
with that in the classical relativistic case.
Once one has chosen ''here'' and ''now'', respectively ''here and now'',
it serves as origin of the tangent hyperplane, in which localized, flat
physics can be done, reflecting faithfully what happens in a
neighborhood of the spacetime point. This is the domain of relativistic
quantum field theory.
------------------------------------------------------------
S11a. A concise formulation of the measurement problem of QM
------------------------------------------------------------
Quantum mechanics asserts in the Born rule (in this form also called
Lueders' rule) that when a particle prepared in a pure state psi
passes an ideal measuring instrument characterized by a finite family
of mutually orthogonal projectors P_k (with P_k = P_k^*,
P_k P_l = delta_kl P_l and sum_k P_k = 1), it transforms the pure
state psi into the pure state psi_k = P_k psi/sqrt(p_k) with
probability p_k = psi^* P_k psi. (The factor 1/sqrt(p_k) makes
psi_k a unit vector, since the squared norm of P_k psi is p_k.)
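As a small illustration of this rule, consider the following sketch.
The 3-level system, the two projectors and the state psi are made up
for the example; the post-measurement states are normalized by
sqrt(p_k) so that they have unit norm.

```python
import numpy as np

# Hypothetical ideal instrument on a 3-level system:
# two mutually orthogonal projectors that sum to the identity.
P = [np.diag([1.0, 1.0, 0.0]).astype(complex),
     np.diag([0.0, 0.0, 1.0]).astype(complex)]

# Hypothetical normalized pure state.
psi = np.array([1.0, 1.0j, 1.0]) / np.sqrt(3.0)

# Probabilities p_k = psi^* P_k psi
probs = [float(np.real(np.vdot(psi, Pk @ psi))) for Pk in P]

# Post-measurement states psi_k = P_k psi / sqrt(p_k)
post = [Pk @ psi / np.sqrt(pk) for Pk, pk in zip(P, probs)]

# One simulated run of the instrument: outcome k occurs with probability p_k.
rng = np.random.default_rng(0)
k = rng.choice(len(P), p=probs)

print(probs)
print(k, post[k])
```

Repeating the last step many times reproduces the probabilities p_k
as relative frequencies, which is all that a purely statistical
interpretation requires.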
This is a consistent rule in a purely statistical interpretation
in which psi is an objective property of a source (describing the
statistical behavior of an ideal - stationary and pure - source of
particles) rather than an objective property of each individual
particle.
The measurement problem arises when (as is commonly informally assumed)
the wave function is regarded as an objective property of a particle.
Then the stochastic transformation demanded by the Born rule, called
the collapse of the wave function, conflicts with the deterministic,
unitary dynamics of the wave function demanded by quantum mechanics
of the joint system consisting of particle+instrument+environment.
The unitary dynamics predicts that the joint system is in a macroscopic
superposition, which is not observed.
Note that a measurement does not need a conscious observer.
A measurement is any permanent record of an event, whether or not
anyone has seen it. Thus the terabytes of collision data collected
by CERN are measurements, although most of them have never been
looked at by anybody. We human beings only look at crude summaries
of such high tech data, but the collapse (which gives rise to
individual particle tracks) is clearly independent of whether or when
we look at them.
--------------------------------
S11b. The double slit experiment
--------------------------------
The double slit experiment, where a broad beam of particles passes
a screen with two slits, is one of the most fundamental quantum
experiments.
Standard wave function arguments for purely unitary quantum mechanics
predict (at best) that the effect of the screen is to turn a particle
in a pure state psi into a superposition of at least three terms,
one each for being in one of the two beams (for sufficiently wide
slits) or spherical waves (if the slits are narrow enough)
passing the slit and a third (or more) for the particle being stuck
somewhere on the screen.
This conclusion is arrived at as a simple consequence of linearity of
the Schroedinger equation, together with natural assumptions of what
happens for particles prepared in coherent states.
But it is generally believed - and assumed in _all_ discussions of
interference - that a double slit screen projects a particle with
incoming wave function psi with the correct Born probability to a
particle in a superposition of the two beams that pass the slits.
The challenge is to derive this from a quantum model of the situation,
without invoking explicit collapse anywhere in the derivation.
Until this can be done convincingly, I don't consider the
measurement problem solved.
For a precise version of a (slightly different) challenge, see
http://www.mat.univie.ac.at/~neum/collapse.html
----------------------------------
S11c. The Stern-Gerlach experiment
----------------------------------
Another basic quantum experiment is the Stern-Gerlach experiment.
An input beam of silver atoms is passed through an inhomogeneous
magnetic field in a fixed direction, which produces a sideways
classical force on each silver atom proportional to the atom's
magnetic moment. The magnetic field is said to split the input beam
into two separate beams corresponding to atoms of spin up and down,
respectively, which shows in the experiment as silver spots where
the beams hit a screen. If the beam of silver atoms is replaced by
a beam of electrons with very low intensity and the screen is replaced
by a more sensitive detector, one observes single detection events,
each randomly at one of the two spots. Each such event is generally
interpreted as a spin measurement (up or down), which makes sense
only if the wave function actually collapses to |up> or |down>.
(Though this is very questionable since the electron stops existing
as an object separable from the screen.)
If a blocker is put in the way of one of the beams, the corresponding
spot on the screen disappears, but if the blocker is sensitive as well,
single observations are found to occur at the blocker as well.
According to strictly orthodox but purely unitary quantum mechanics,
the situation is the following:
If a single particle leaves the magnetic area, it is in an entangled
state consisting of a bilocal superposition of wave packets somewhere
along the two beams. When it encounters the blocker,
this single electron turns into a still bilocal superposition of wave
packets: One remains stuck where the blocked beam meets the block
and the other continues its motion along the unblocked beam.
A little later, this second wave packet meets the screen, and we end up
with a still bilocal superposition of wave packets, now both sitting
at the end points of the respective beam. Without the blocker,
essentially the same happens, except that the electron ends up
in a superposition of two spots on the screen.
More precisely, what happens is that if one starts with a pure
state |x,p> |left>, where |x,p> denotes an approximately coherent
state with position x and momentum p, and
|left>=1/sqrt(2)(|up>+|down>),
one gets approximately a superposition
1/sqrt(2)(|x^+(t),p^+(t)>|up> +|x^-(t),p^-(t)>|down>),
where the parameters in the approximately coherent states follow
classical paths in phase space determined by approximately classical
motion due to the magnetic field, the blocker and the screen.
After hitting blocker and screen, respectively, positions are constant
and momenta vanish, and the particle is in a superposition of two spots.
All this follows without difficulty from the superposition principle,
i.e., from the linearity of the Schroedinger equation.
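The role of linearity can be seen in a toy model. The following
sketch replaces the approximately coherent states |x,p> by a
two-state 'path' system (upper or lower beam); this simplification
and the particular unitary are assumptions made for illustration,
not a model of the actual magnet.

```python
import numpy as np

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])        # spin basis
upper, lower = np.array([1.0, 0.0]), np.array([0.0, 1.0])    # toy path basis

# Toy unitary on spin (x) path: the path is left alone for |up>
# and switched from upper to lower (and back) for |down>.
swap = np.array([[0.0, 1.0], [1.0, 0.0]])
U = np.kron(np.outer(up, up), np.eye(2)) + np.kron(np.outer(down, down), swap)

# Input: spin |left> = (|up> + |down>)/sqrt(2), path initially 'upper'.
left = (up + down) / np.sqrt(2.0)
psi_in = np.kron(left, upper)
psi_out = U @ psi_in

# Linearity yields the entangled superposition of both beams.
expected = (np.kron(up, upper) + np.kron(down, lower)) / np.sqrt(2.0)

# Reduced density matrix of the path: trace over the spin index.
rho = np.outer(psi_out, psi_out.conj()).reshape(2, 2, 2, 2)
rho_path = np.einsum('iaib->ab', rho)

print(np.allclose(psi_out, expected))
print(rho_path)
```

The reduced density matrix assigns weight 1/2 to each beam, but the
joint state remains a superposition; nothing in the unitary dynamics
selects one of the two spots.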
To match observations in an objective interpretation of the wave
function, one needs a mechanism for changing the unobserved
superposition of spots into the observed definite spot. In an
observer-independent interpretation this has to happen in the split
moment between the particle feeling the presence of blocker or screen
and hitting or passing it. This is the so-called collapse of the wave
function.
According to the old school (von Neumann, London and Bauer, Wigner),
in a purely unitary setting it requires a conscious look at what
really happened to change the superposition of spots into a definite
spot, which gives quantum mechanics an uncomfortable subjective,
human-centered touch.
--------------------------------
S11d. The minimal interpretation
--------------------------------
The minimal interpretation of quantum mechanics does not model
what really happens - it only claims probabilities.
When quantum mechanics is applied to small systems, one usually asks
only for statistical information. Here a collapse simply means a
change of the point of view resulting in taking conditional
expectations, and all difficulties disappear.
In that case, each particle simply moves in an undeclared and
undeclarable fashion along the experimental setting, the classical
instruments are always in a definite state, and instead of
superpositions one has probabilities of observation of exactly one
of the possible results in the superposition.
Now all objectivity (sources and preparation, detectors and
measurements) is in the classical setting only, which coexists
with the somewhat spooky quantum world, connected by quantum statistics.
The problem here is how to unify what happens classically
and quantum mechanically. This minimal view becomes inconsistent
once one wants to consider the classical system as a large quantum
system - all objectivity disappears since macroscopic superpositions
are possible.
(Generally, nonlinear modifications of the Schroedinger dynamics
are considered a possible way out, but this introduces other problems.)
The main limitation of the minimal interpretation is that it does not
apply to systems that are so large that they are unique.
Today no one disputes that the sun is governed by quantum mechanics.
But one cannot apply statistical reasoning to unique systems, such as
the sun as a whole.
If quantum mechanics is a universal theory of nature, it should also
apply to the sun as a whole. At least we know that it applies to the
extent that it governs the energy generating processes in the sun.
The actual numerical analysis of models of the sun just
treats the nuclear reactions within a classical reaction-diffusion
framework, which (in principle - I don't know whether anyone has
actually done it) should be derivable from quantum mechanics using
statistical mechanics arguments.
A purely statistical interpretation has also a problem with the
notion of probability. (See the discussion on probability elsewhere
in this FAQ.) Probability (and hence the quantum state that predicts it)
is often seen as a subjective view about the experimenter's assumed
knowledge, or the knowledge an experimenter could gain when 100%
attentive. There is the subjectivist difficulty to determine
whose knowledge counts and why unobserved (and hence unknown)
classical processes still make a difference;
but one could imagine an ideal classical observer of the status of
Laplace's demon, for whom these problems would be absent.
---------------------------------
S11e. The preferred basis problem
---------------------------------
Born's rule, stated in the form that |<phi|psi>|^2 is the probability
that a system prepared in state psi is, upon measurement, found in state
phi, is valid only if a complete set of commuting observables is
measured and phi belongs to the preferred basis determined by the
experimental setting (i.e., the family of projectors).
Given the present state of the universe (which fixes the experimental
setting), there is no choice in the preferred basis. Thus, in a
mathematical model of quantum mechanics in the large, it has to
be deduced from the assumptions about the initial state and the
dynamics.
The preferred basis is fully determined by Nature, and that's why we can
find it out. Given an unknown instrument, one finds its preferred
basis by experimenting with the new piece, letting it interact with systems
of known properties, and matching the collected data to trial models
until one fits. This is how things are indeed done in practice.
The process is called model calibration (or parameter estimation if
the model is fixed up to adjustable parameters).
At first, one never knows a new instrument precisely, and has to check
out its properties. After sufficient experience with enough instruments,
one knows reasonably well what to expect of the next, similar one.
Then only fine-tuning is needed, which saves time. And this knowledge
can be used to create new instruments which are likely to behave a
certain way; but one still has to check to which extent they actually
do, since no theoretical design is realized exactly in practice.
Not even in the classical, macroscopic domain!
Nature's choice is systematic, hence after having
seen that a number of screens have a preferred position basis,
we conclude that this is the case generally. As for a spectrometer,
if it is built with a prism to analyze light, it is reduced by theory
to the observation of light or current at certain positions of the
screen, which is done in the preferred position basis. Something
similar can be said about the Stern-Gerlach experiment.
So once one knows _some_ of Nature's preferences and the general laws,
one can deduce other preferences.
The challenge posed in the measurement problem is to deduce
from first principles that a screen made of quantum matter,
with two slits in it, actually has a preferred position basis and
projects the incoming system to the part determined by the slits.
-------------------------------------------
S11f. Master equation and pointer variables
-------------------------------------------
On an approximate level, the preferred basis problem is approached
via quantum master equations.
A quantum master equation is a dynamical equation for the density matrix
of a dissipative quantum system, which approximates a quantum system
weakly coupled to an environment at time scales long compared to the
typical interaction time but short enough to avoid recurrence effects.
More precisely, the dynamics is given by a completely positive
Markovian semigroup in a representation named after Lindblad,
who discovered its general form.
For a classical damped linear system xdot(t)=Ax(t) with a matrix A
whose spectrum is in the closed left complex half plane, the
contribution of x
in the invariant subspace corresponding to eigenvalues which are not
purely imaginary decays to zero, so that at large times t,
x(t) essentially approaches the invariant subspace corresponding to
purely imaginary eigenvalues.
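A small numerical example illustrates the classical statement. The
matrix A below is a made-up example with eigenvalues +i, -i and -1;
the solution is computed from the eigendecomposition, which assumes
that A is diagonalizable.

```python
import numpy as np

# Hypothetical damped system: eigenvalues +i, -i (undamped) and -1 (damped).
A = np.array([[0.0, 1.0, 1.0],
              [-1.0, 0.0, 0.0],
              [0.0, 0.0, -1.0]])

eigvals, V = np.linalg.eig(A)
V_inv = np.linalg.inv(V)

def propagate(x0, t):
    """Full solution x(t) = V exp(Lambda t) V^{-1} x0."""
    return np.real(V @ (np.exp(eigvals * t) * (V_inv @ x0)))

def oscillatory_part(x0, t):
    """Solution restricted to the purely imaginary eigenvalues."""
    keep = np.abs(eigvals.real) < 1e-12
    factors = np.where(keep, np.exp(eigvals * t), 0.0)
    return np.real(V @ (factors * (V_inv @ x0)))

x0 = np.array([1.0, 0.0, 2.0])
gap_early = np.linalg.norm(propagate(x0, 0.0) - oscillatory_part(x0, 0.0))
gap_late = np.linalg.norm(propagate(x0, 20.0) - oscillatory_part(x0, 20.0))

print(gap_early, gap_late)
```

The distance between the full solution and its undamped part decays
like exp(-t), so that at large times only the motion in the invariant
subspace of the purely imaginary eigenvalues survives.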
For a quantum master equation, a similar analysis holds and shows that
(under suitable conditions) the density matrix at times much larger
than the so-called decoherence time approaches a block diagonal form
in a suitable basis. Thus it (almost) commutes with a special set
of observables, which define the 'pointer variables' of the system.
These pointer variables therefore behave essentially classically.
If the pointer variables form a complete set of commuting variables,
the density matrix approaches a diagonal matrix, and the basis in
which this happens is called the 'preferred basis'.
For details, see, e.g., cond-mat/0011204 or gr-qc/9406054
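The approach to diagonal form can be seen in the simplest case. The
following sketch integrates a Lindblad equation for a single qubit
with pure dephasing; the model, the parameter values and the choice
of a Runge-Kutta integrator are assumptions made for illustration and
are not taken from the cited papers.

```python
import numpy as np

sz = np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)
omega, gamma = 1.0, 0.5      # assumed level splitting and dephasing rate
H = 0.5 * omega * sz

def lindblad_rhs(rho):
    """Right-hand side -i[H,rho] + gamma (sz rho sz - rho)."""
    return -1j * (H @ rho - rho @ H) + gamma * (sz @ rho @ sz - rho)

def evolve(rho, t_final, steps=2000):
    """Classical fourth-order Runge-Kutta integration of the master equation."""
    h = t_final / steps
    for _ in range(steps):
        k1 = lindblad_rhs(rho)
        k2 = lindblad_rhs(rho + 0.5 * h * k1)
        k3 = lindblad_rhs(rho + 0.5 * h * k2)
        k4 = lindblad_rhs(rho + h * k3)
        rho = rho + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return rho

# Pure initial state (|0> + |1>)/sqrt(2), written as a density matrix.
plus = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)
rho0 = np.outer(plus, plus.conj())

t_final = 10.0
rho_t = evolve(rho0, t_final)

coherence = abs(rho_t[0, 1])                    # decays like 0.5 exp(-2 gamma t)
purity = float(np.real(np.trace(rho_t @ rho_t)))  # tends to 1/2

print(coherence, 0.5 * np.exp(-2.0 * gamma * t_final))
print(purity)
```

In this model the eigenbasis of sz plays the role of the preferred
basis: the diagonal entries stay fixed while the off-diagonal entries
decay on the time scale 1/(2 gamma).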
-----------------------------------------------------
S11g. Does decoherence solve the measurement problem?
-----------------------------------------------------
Many physicists nowadays think that decoherence provides a fully
satisfying answer to the measurement problem. But this is an illusion.
Decoherence is the (experimentally verified) decay of
off-diagonal contributions in a density matrix (written in a
preferred basis), when information dissipates into unobservable
degrees of freedom in the environment of a system.
In particular, decoherence reduces a pure state to a _mixture_
of eigenstates. This is enough to induce classical features
in many large quantum systems, characterized by a lack of
interference terms.
Thus decoherence is very valuable in understanding the classical
features of a world that is fundamentally quantum.
------------------------------------------------------------------
S12b. Which textbook of quantum mechanics is best for foundations?
------------------------------------------------------------------
For large ensembles, there seems to be no disagreement about the
interpretation. The book
A. Peres,
Quantum theory - concepts and methods,
Kluwer, Dordrecht 1993
is probably the most useful (i.e., both clear and applicable)
account of foundational aspects on this level. It is not the easiest
book, though, and reading it demands more attention than, say,
Sakurai's book. The latter is much more readable but is sloppy
about foundations; see the discussion in
http://groups-beta.google.com/group/sci.physics.research/msg/77630f64b987274f?dmode=source
There are also nice online treatises on certain aspects.
For the basics as related to quantum information theory, see, e.g.,
M. Plenio, Quantum Mechanics
http://www.lsr.ph.ic.ac.uk/~plenio/lecture.pdf
M.B. Plenio and V. Vedral
Entanglement in Quantum Information Theory
quant-ph/9804075
M.B. Plenio and P.L. Knight
The Quantum Jump Approach to Dissipative Dynamics in Quantum Optics
quant-ph/9702007
Modern experiments appear to need, however, a quantum mechanics
of individual systems, and that's where controversy and confusion
prevail. I find none of the existing interpretations convincing,
and wrote up in Int. J. Mod. Phys. B 17 (2003), 2937-2980
= quant-ph/0303047 my own constructive (but incomplete) view
of the matter.
This paper is completely self-contained and works directly
with the statistical mechanics version of QM, with the
benefit that it avoids many of the traditional obscurities.
It discusses complementarity, ensembles, uncertainty relations,
probability, quantum logic, nonlocality, Bell inequalities,
sharpness of measurements, and rudiments of quantum dynamics.
The German ''Theoretische Physik FAQ'' at
http://www.mat.univie.ac.at/~neum/physik-faq.txt
contains a German language exposition of my consistent experiment
interpretation of quantum mechanics, which is a much extended version
of the above and gives a consistent setting for a quantum universe
which explains the nature of quantum chance. A paper on this
(in English) is in preparation.
For the history of the interpretation of QM, see the excellent book
Max Jammer
The philosophy of quantum mechanics.
The interpretations of quantum mechanics in historical perspective
Wiley, New York 1974
and the collection of original papers,
J.A. Wheeler and W. H. Zurek (eds.),
Quantum theory and measurement.
Princeton Univ. Press, Princeton 1983.
----------------------------------------
S12c. What is the role of quantum logic?
----------------------------------------
Quantum logic is a variant of logic often thought to be
appropriate for the foundations of quantum mechanics.
A good exposition is given in
K. Svozil,
Quantum Logic,
Springer, Singapore 1998.
The book is nice and useful for its material on hidden-variable
related arguments.
However, all that is commonly argued in textbooks about QM is argued
in terms of classical logic. Even a cursory look at the large
quantum mechanical literature reveals that quantum logic only has
a marginal spectator role in QM, while all proofs of all properties
of quantum systems have always been discussed using the familiar
classical logic. Even in Svozil's book, one can see that quantum
logic is argued in terms of classical logic, and that it has
essentially no role in the analysis of actual physical situations
(apart from those used for testing the foundations).
Beyond a certain point, quantum logic is sterile, which is the reason
it never figures in textbooks (except perhaps in passing).
All one ever needs to know about quantum logic (unless one wants to
specialize in it) is summarized in Sections 6 and 7 of my paper
Int. J. Mod. Phys. B 17 (2003), 2937-2980 = quant-ph/0303047.
----------------------------------
S12d. Stochastic quantum mechanics
----------------------------------
For certain Hamiltonians, the Schroedinger equation can be interpreted
as a classical diffusion process. This leads to the stochastic
quantum mechanics of Nelson. For an overview, see, e.g.,
http://www-stud.uni-essen.de/~sb0264/stochastic.html
While it gives an interesting aspect to quantum mechanics and its
classical limit, Nelson's description has a severe deficiency
in that it cannot handle the situation when the wave function vanishes
at some point. In terms of the decomposition psi = exp(R+iS) of the
wave function with real R and S, R has a singularity at all such
points, and S is entirely undefined there. This happens, e.g., for
excited states of hydrogen, hence is an integral part of standard
quantum mechanics.
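The singularity is easily exhibited for the 2s state of hydrogen,
whose wave function has a node at twice the Bohr radius. The sketch
below assumes the decomposition psi = exp(R+iS), so that R = log|psi|,
and uses units in which the Bohr radius is 1.

```python
import math

def psi_2s(r):
    """Hydrogen 2s wave function in units of the Bohr radius (a0 = 1)."""
    return (2.0 - r) * math.exp(-r / 2.0) / (4.0 * math.sqrt(2.0 * math.pi))

def R(r):
    """R = log|psi|, assuming psi = exp(R + i S) with real R and S."""
    return math.log(abs(psi_2s(r)))

# Approach the node at r = 2 from outside.
offsets = (1e-1, 1e-3, 1e-6, 1e-9)
values = [R(2.0 + eps) for eps in offsets]

for eps, value in zip(offsets, values):
    print(eps, value)

# The wave function changes sign across the node, so the phase S
# jumps by pi there and has no value at the node itself.
print(psi_2s(1.9) > 0.0, psi_2s(2.1) < 0.0)
```

R decreases without bound as the node is approached; at r = 2 itself
the logarithm is undefined (math.log raises an error for the
argument 0).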
Even if one argues that such states are idealized and cannot occur,
it seems not to be possible to show that a state that is everywhere
nonzero will preserve this property under time evolution.
Thus Nelson's representations may develop spurious singularities
which are not in the observable part of quantum mechanics.
Also, it is awkward to do scattering calculations in Nelson's
framework. Moreover, Nelson, as quoted on p. 16 of the above paper,
says correctly,
''Quantum mechanics can treat much more general Hamiltonians
for which there is no stochastic theory.''
Thus it is unlikely to be useful as a 'fundamental' description
of nature.
Instead, natural stochastic forms of quantum mechanics are those of
quantum diffusion processes and quantum jump processes, in which the
wave function itself is regarded as a classical random object.
For their use in an experimental context, see, e.g., quant-ph/9805027.
-------------------------------------------------
S12e. Is there a relativistic measurement theory?
-------------------------------------------------
Real measurements take time, and are not instantaneous.
To treat the collapse as instantaneous is an idealization,
valid for many applications of quantum mechanics.
If relativistic effects play a role, one needs to use
quantum field theory. However, the measurement process in
quantum field theory is very poorly researched.
Thus statements about the conflict of instantaneous collapse
and relativity theory are based on very shaky grounds.
For measurement in the relativistic case (but without
invoking field theory) see quant-ph/9906034 and other papers
by Peres and/or Terno available in the arxiv.
They indicate the absence of problems, as far as such a simplified
analysis can be trusted.
--------------------------------
S12f. Quantum mechanics and dice
--------------------------------
It is frequently held that quantum mechanics makes only statements
about probabilities and not about single events.
This is very strange for a theory that claims to be the foundation
for everything scientifically observable.
According to the probabilistic view, quantum mechanics is incapable
of making any statement about dice that have been thrown already.
Although we can observe with perfect accuracy the value of the throw,
all that traditional quantum mechanics can give is the probability
distribution of the possible values of the throw, if this value were
not yet known.
Quantum mechanics has similar difficulties coping with other
actual events, since it never ever predicts what must happen or what
must have happened, but only gives probabilities.
This is of little consequence for quantities like the value of a
throw of three dice, but is a severe defect when discussing the
trajectories of the planets of the Solar System (for which we cannot
make meaningful statistics), of airplanes, or of cars.
Clearly there must be something objective about these, although
traditional quantum mechanical interpretations - taken seriously -
are unable to account for definite individual events.
---------------------------------------------
S13a. Random numbers and other random objects
---------------------------------------------
In probability theory, a random number is just a random variable x,
i.e., a measurable function on the set Omega of possible experiments,
that assigns to each experiment omega in Omega the value x(omega)
of x in this experiment.
In the important, 'noninformative' case where the measure is invariant
under a group transitive on Omega, so that all experiments are
identical copies of one another, physicists refer to this set Omega
as a (classical) 'ensemble',
although they are usually too vague to express this in formal terms.
The terminology easily extends to the inhomogeneous case if one
allows in ensembles each realization with a different frequency.
Mathematicians prefer to leave the set Omega (which they call the
'sample space') unspecified and talk about 'realizations' in place of
'experiments'. Thus, for each experiment omega in Omega, x(omega) is
a realization of x, i.e., what physicists would call the value found
in this particular experiment.
By giving a specific definition of the sigma algebra of interest,
and specific recipes defining x(omega), one has a model world in which
realizations make perfect sense.
A difficulty is, of course, that we do not have such a model for the
real world, and hence must resort to empirical approximations when
treating real-life problems. (This places physicists at a slight
disadvantage; however, there is the compensating advantage that their
results apply to real life instead of only satisfying one's sense of
beauty and precision....)
The only thing not specified in probability theory (unless one specifies
a particular model as indicated above) is the mechanism that draws
the number, and hence there is no way to know which experiment omega
has been realized. Therefore, probability theory makes only statements
about _all_ realizations simultaneously.
Example. Given the axioms of probability theory, a random number
uniformly distributed between zero and one is defined as a random
variable x such that
<f(x)> = integral_0^1 f(s) ds
for all Lebesgue-integrable functions f on [0,1], and any x(omega) is a
realization of it, i.e., an actual number in [0,1]. (In particular,
random numbers are _not_ numbers!)
Mechanisms to draw numbers that may be used as approximations to a
sequence of independent realizations x(omega) are called random number
generators. They do not produce random numbers (since random numbers
are not numbers but measurable functions). Instead, they produce
sequences that look like typical
realizations of sequences of independent, uniformly distributed random
numbers (in the sense that they usually pass with high confidence level
certain statistical tests valid for such random sequences).
Therefore, the numbers they generate are used in practice as (often
completely adequate) substitutes for random numbers.
(On the other hand, there is no uniformly distributed random natural
number since the uniform measure on natural numbers,
mu(f) = sum_{k>=0} f(k) is not normalizable.)
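The relation between the defining property of a uniformly distributed
random number and the output of a random number generator can be
illustrated by a minimal Python sketch. It assumes the generator of
the Python standard library with a fixed seed, and the test function
f(s)=s^2; both are arbitrary choices made for illustration, and the
function names are hypothetical. The sample mean over many generated
numbers is compared with the exact expectation integral_0^1 f(s) ds.

```python
import random

def sample_mean(f, n, seed=0):
    """Average f over n pseudo-random numbers in [0,1)."""
    rng = random.Random(seed)   # deterministic generator, fixed seed
    return sum(f(rng.random()) for _ in range(n)) / n

def f(s):
    return s * s                # test function; exact expectation is 1/3

exact = 1.0 / 3.0               # integral_0^1 s^2 ds
estimate = sample_mean(f, 100_000)
error = abs(estimate - exact)
print(estimate, error)
```

The generated sequence is completely determined by the seed, so it is
a single realization and not a random number in the above sense; that
the sample mean comes out close to 1/3 shows only that the sequence
looks typical with respect to this particular test.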
-------------------------------------------
S13b. What is the meaning of probabilities?
-------------------------------------------
To say that
"The probability that someone in risk group A will die of cancer is 1/3"
does _not_ mean that
"10 out of 30 people in risk group A will die of cancer".
It only means that,
"on the average, 10 out of 30 randomly chosen people in risk group A
will die of cancer".
This can be checked (in the limit) by many repeated simulations,
or (directly) by a theoretical computation; both require that the
complete ensemble is available. Of course, in using probabilities for
predictive purposes, an insurance company tacitly assumes
(without any guarantee)
that the group of 30 people of interest is actually well approximated
by a random sample, so that one can expect 10 out of the 30 to die of
cancer. But this tacit assumption may well turn out to be wrong.
Statements about ensembles are in principle exactly checkable:
Operationally, to say that "The probability that someone in
risk group A will die of cancer is 1/3" means nothing more or less
than that exactly 1/3 of _all_ people in risk group A will die of
cancer.
(This assumes that risk group A is finite. For infinite ensembles,
to define the precise meaning of '1/3 of all', one needs to go into
technicalities leading to measure theory. Indeed, measures are the
mathematically rigorous versions of 'classical ensembles' in general.
For quantum ensembles, see quant-ph/0303047.)
Of course, we cannot check this before we have information about
how _all_ people in risk group A died, but once we have this
information, we can check and verify or falsify the statement.
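The difference between the exactly checkable statement about the
ensemble and the statement about a group of 30 can be made concrete
in a small Python sketch. The ensemble used here (3000 people, of whom
exactly 1000 die of cancer) and the fixed seed are hypothetical
choices for illustration only.

```python
import random
from fractions import Fraction

# Hypothetical finite ensemble: 3000 people, exactly 1000 of whom die of
# cancer (1 = dies of cancer, 0 = does not).
ensemble = [1] * 1000 + [0] * 2000

# Statement about the ensemble: exactly checkable once all data are known.
probability = Fraction(sum(ensemble), len(ensemble))

# Statement about samples: the count in a random group of 30 fluctuates.
rng = random.Random(0)
counts = [sum(rng.sample(ensemble, 30)) for _ in range(2000)]
average = sum(counts) / len(counts)

print(probability)                 # 1/3
print(min(counts), max(counts), average)
```

The probability computed from the full ensemble is exactly 1/3, while
the number of deaths in a randomly chosen group of 30 varies from
group to group and is 10 only on the average.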
In terms of precise mathematics: A classical ensemble is the set of
elementary events underlying the sigma algebra over which the measure
is defined. For example, in any finite sigma algebra containing random
variables representing a fair coin (realizations 0,1; 1=head;
with probability 50%), one has a finite ensemble of elementary events,
and exactly half of them come out heads. For an infinite sigma algebra,
the ensemble is infinite; but with the natural weighting, again exactly
half of them come out heads.
Usually, however, we only have incomplete knowledge about the ensemble.
For example, 'Tossing 10 fair coins' is just a sloppy way of saying
'Selecting a sample of size 10 from the total ensemble'.
The sigma algebra for modeling this must contain at least 10 independent
random variables representing fair coins. This is the case, e.g., in the
direct product of N>=10 sigma algebras isomorphic to 2^{0,1}. For N>10,
it is obvious that here the number of heads is 5 (=50%) only on
average over many random samples; and it is impossible to infer the
exact probability from a single sample.
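For the smallest case N=10, the ensemble is small enough to be
enumerated completely. The following Python sketch (an illustration
under this assumption, with hypothetical variable names) lists all
2^10 elementary events, each of which corresponds to one possible
sample of 10 tosses, and checks both statements: exactly half of the
elementary events give heads for any particular coin, while the number
of heads within one sample is 5 only on average.

```python
from itertools import product
from fractions import Fraction

N = 10
# Full ensemble: all 2^N elementary events of N independent fair coins.
ensemble = list(product((0, 1), repeat=N))

# Exactly half of the elementary events have heads (1) for the first coin.
first_coin_heads = Fraction(sum(e[0] for e in ensemble), len(ensemble))

# The number of heads among the N coins is 5 only on average over events.
mean_heads = Fraction(sum(sum(e) for e in ensemble), len(ensemble))
exactly_five = sum(1 for e in ensemble if sum(e) == 5)

print(first_coin_heads, mean_heads, exactly_five, len(ensemble))
```

Only 252 of the 1024 possible samples contain exactly 5 heads, so a
single sample of 10 tosses usually does not reproduce the probability
of 50%.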
This is why statisticians say that they _estimate_ probabilities
based on _incomplete_ knowledge, collected from a sample.
The resulting estimated probabilities are known to be inherently
inaccurate; but they can be checked approximately by independent data
(cross-validation) providing confidence levels indicating how much
the predictions can be trusted.
On the other hand, they _compute_ probabilities from _assumed_ complete
knowledge about the ensemble, namely the theoretical probability
distribution. Thus if complete information goes in, exact information
comes out, while computations based on incomplete information
naturally only give approximate results inheriting some uncertainty
from the input.
Computed probabilities are powerful, but only if the assumed stochastic
model is correct. Empirical estimates are usually inaccurate but useful.
The two approaches are not contradictory; indeed, they are combined in
practice without difficulties at all.
The only subjective aspect in the whole thing is the choice of a
stochastic model when making theoretical predictions; and even this
is made almost objective by the standard rules of statistical
inference and model building.
Indeed, the choice of ensemble is _always_ a subjective act that
determines what the probabilities mean. It encodes what the user is
prepared to assume about the given situation. Once the ensemble is
chosen - either a theoretical, exactly known ensemble, defined by
specifying a distribution, or as a real life ensemble of which only
a (perhaps growing) sample is available, all probabilities have an
objective meaning.
A chosen ensemble is knowledge precisely if it is close to the correct
ensemble, and we have a good idea of how close it is.
That's why we value highly scientists such as Gibbs who guessed
the right ensembles for statistical mechanics, which turned out to be
a highly accurate description of equilibrium situations.
Only good choices are knowledge.
And what is good is found out only through proper checking,
and not through the principle of insufficient reason.
In case of tossing a coin we know that the fairness assumption is
usually reasonable, being consistent with experience.
In case of taking an exam at a newly appointed professor about whom
no one knows anything, reasoning from the two possible outcomes
(pass or fail) and the principle of insufficient reason to assign
a probability of 50% failure is ridiculous, and dangerous for those
who are not prepared.
-----------------------------------------------------------------
S13c. What about the subjective interpretation of probabilities?
-----------------------------------------------------------------
People with a preference for subjective interpretations would say
''probabilities depend on someone's knowledge''
instead of
''probabilities are a property of the ensemble under consideration''.
They talk of ''arrival of new information'' or ''learning'' instead of
the objective and unassailable formulation ''restricting the ensemble to
a subset defined by the conditions'' when discussing conditional
probabilities (the classical analogue of the statistical collapse of
the wave packet in quantum mechanics).
But knowledge is an even more poorly defined concept than probability,
which at least has an undisputed axiomatic basis. Thus explaining
probability in terms of knowledge only makes the meaning of probability
more foggy by putting it deep into the psychological realm.
Moreover, the subjective interpretation based on the Bayesian paradigm
of conditional probability has no formal way of coping with
misinformation (the ensemble grows if one learns that some of the
information one believed to know turns out to be false!) while,
on the objective level, the latter is just another change of the
ensemble.
Thus the subjective interpretation of probability is an inadequate
foundation for the use of probabilities in physics.
-------------------------------------------------------
S13d. Are probabilities limits of relative frequencies?
-------------------------------------------------------
Sometimes, probabilities are regarded as limits of relative
frequencies as the number of trials becomes arbitrarily large.
But the weak law of large numbers only guarantees that most trial
histories will give a sequence of relative frequencies that converge
to the probability. It might just fail for the one actually tried...
Moreover, in practice we only have partial knowledge of such an infinite
sequence of trials (which cannot be performed). This knowledge about
the sample gives no knowledge at all about the limiting ensemble.
Just as the knowledge of the first n items of a sequence gives, in
theory, no knowledge at all about the limit of the sequence.
That we often estimate the limit using a small part of the sequence
is another matter, and is like estimating probabilities from samples.
But the estimate may be completely wrong.
Thus interpreting probability as relative frequency is a philosophically
difficult interpretation step. For a thorough discussion, see the very
informative books by
T.L. Fine,
Theory of probability; an examination of foundations.
Acad. Press, New York 1973.
and
L. Sklar,
Physics and Chance,
Cambridge Univ. Press, Cambridge 1993.
--------------------------------------------------------
S13e. How meaningful are probabilities of single events?
--------------------------------------------------------
(Note: In this FAQ, 'event' is always understood in the ordinary sense
of the word, as 'something specific happening'.
In axiomatic probability theory based on Kolmogorov's axioms,
there is a slightly different, formal meaning of an event as an
element of the underlying sigma algebra.
An axiomatic foundation of probability theory equivalent to that of
Kolmogorov, but not based on sigma algebras, can be found in the book
'probability via expectation' by Paul Whittle, and a quantum extension
in quant-ph/0303047.)
Probabilities of single events are not at all meaningful
- at least not in any scientific sense -, although we are
used to scientific-sounding phrases such as
''There is a 60% probability for rain tomorrow''.
Instead, probabilities are properties of ensembles of events.
In the case just cited, the ensemble is the set of all 'tomorrow's
(or rather an infinite idealization of it), and the probability is not
an exact probability, but an estimate computed on the basis of a sample
of former 'tomorrow's, together with statistical weather models.
-----------------------------
S13f. Objective probabilities
-----------------------------
Consider a physical die (for simplicity assumed perfectly symmetric)
with six elementary events 1,...,6.
If the die is not thrown, all events are equivalent, and the
probabilities are 1/6 for each event. These probabilities are
associated to the die (_not_ to a throw), and can be determined
uniquely from the knowledge of the geometry and composition of
the die. All of probability theory happens at this level,
since the 'happening' of an event is not formally defined.
If the die is thrown, a given event (say 3) either happens or
does not happen. If the event happens (does not happen), the
statement 'This throw is a 3' is true (false), hence has a
probability of 100% (0%), although before the throw, these
probabilities are not yet known. These probabilities are
associated to each particular throw (_not_ to the die).
Thus a die functions as a potential stationary source of throws,
and hence _defines_ an ensemble of (conceivable) throws.
An actual throw, though a realization of this ensemble,
is determined by the outcome, and cannot be assigned a
probability different from 0 or 1.
[See, e.g., the wikipedia entry
http://en.wikipedia.org/wiki/Probability_theory
''Omega is a non-empty set, sometimes called the "sample space",
each of whose members is thought of as a potential outcome of a
random experiment.''
'is thought of' signifies the interpretational level.
Probabilities are only about 'potential outcomes' (what I call
conceivable), not about actual ones.]
---------------------------------------------
S13h. How do probabilities apply in practice?
---------------------------------------------
If one has a sound probabilistic model of a multitude of independent
events e_i with same assigned probability p one would be surprised
if the frequency of events is not close to p within a small multiple of
sqrt(p(1-p)/N). Rather than just accepting a rare occurrence
(e.g., a brick going upwards due to fluctuations) as something within
one's probabilistic model, one would probably rather try to explain
it away by assuming a hidden, unobserved cause (someone throwing it).
The way probabilities are used in practice is always as informative
guides of what to expect, but not as statements with a 100% exact
meaning. I wrote a paper on surprise:
A. Neumaier,
Fuzzy modeling in terms of surprise,
Fuzzy Sets and Systems 135 (2003), 21-38.
http://www.mat.univie.ac.at/~neum/papers.html#fuzzy
that may help understand the fuzziness inherent in our concepts of
reality.
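The rule of thumb about deviations of a small multiple of
sqrt(p(1-p)/N) can be written down as a short Python sketch. The
function name and the sample data are hypothetical; the sketch only
expresses the observed deviation in units of sqrt(p(1-p)/N) and leaves
the judgment of what counts as surprising to the user.

```python
import math

def deviation_in_sigmas(successes, trials, p):
    """Deviation of the observed frequency from p, in units of
    sqrt(p(1-p)/N), for N independent events of probability p."""
    sigma = math.sqrt(p * (1.0 - p) / trials)
    return (successes / trials - p) / sigma

# Hypothetical data: 60 resp. 80 occurrences in 100 trials with p = 1/2.
mild = deviation_in_sigmas(60, 100, 0.5)     # about 2 sigma
strong = deviation_in_sigmas(80, 100, 0.5)   # about 6 sigma
print(mild, strong)
```

A deviation of about 2 units would usually be accepted as compatible
with the model, while one of about 6 units would make most people look
for a hidden cause.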
-----------------------------------------
S13i. Incomplete knowledge and statistics
-----------------------------------------
It is often erroneously assumed that incomplete knowledge can
always be described by statistics. But this is by no means the case.
If one knows about a number x only that it is in [0,1], one cannot
apply statistics since one knows nothing at all about the distribution
(except for its support). It is perfectly consistent with
the knowledge that in fact always x=0.75, except that one does not
know it, or that x oscillates regularly, or....
The ignorance is in this case simply deterministic lack of information.
In particular, it would be a mistake to assume that the distribution
is uniform (ignorance interpretation). Using the noninformative prior
of the Bayesian school, which makes this assumption, may be seriously
flawed.
More realistically, in engineering, an uncertainty of 5% in the
modulus of elasticity of steel bars may be the only information available
to an architect; but 3/4 of the bars used later in the building
may have a deviation of 0.1% and the remaining quarter one of 3.7%.
In general, all one can deduce from information that takes the form of
deterministic bounds on a vector x of variables and/or on expressions
in x are bounds on derived quantities y=f(x) one would like to compute
from it. This leads to global optimization problems, where f(x) is
minimized or maximized subject to the known constraints. See
http://www.mat.univie.ac.at/~neum/glopt/intro.html
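As a toy illustration of bounds on a derived quantity, consider the
hypothetical function y = x(1-x) with x known only to lie in [0,1].
The Python sketch below compares a rigorous but pessimistic outer
enclosure from naive interval arithmetic with an inner estimate from
sampling on a grid. It is not a global optimization method; it only
shows the two kinds of information such a method has to reconcile.

```python
def f(x):
    return x * (1.0 - x)        # hypothetical derived quantity y = f(x)

def interval_mul(a, b):
    """Product of intervals a = (lo, hi) and b = (lo, hi)."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

x = (0.0, 1.0)                  # all that is known: x lies in [0,1]

# Outer enclosure by naive interval arithmetic: rigorous but pessimistic.
one_minus_x = (1.0 - x[1], 1.0 - x[0])
outer = interval_mul(x, one_minus_x)

# Inner estimate by sampling: attained values, but no guarantee of bounds.
grid = [x[0] + (x[1] - x[0]) * k / 100 for k in range(101)]
values = [f(t) for t in grid]
inner = (min(values), max(values))

print(outer, inner)
```

The outer enclosure is [0,1], the sampled values fill [0,0.25]. The
true range of y lies between the two; here it equals [0,0.25], which
is what minimizing and maximizing f over [0,1] would deliver. No
probability distribution for x enters anywhere.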
The lack of knowledge that statistics can model is of a different kind.
It assumes that the _maximal_attainable_ knowledge about the system
- at the given level of description - is a probability distribution,
and that this probability distribution is indeed known.
The knowledge of the probability distribution can be replaced by a
qualitative knowledge of it (e.g. 'some Gaussian distribution'),
together with the knowledge of an incomplete sample from the ensemble
of interest; in this case, however, the best statistics can offer are
parameter estimation techniques that give credible probability
distributions compatible at some confidence level with the sample data.
There are also combinations of both kinds of incomplete information,
where one knows the maximal knowledge about a system should be
stochastic, but one lacks complete information on the distribution.
This is handled by the field of 'imprecise probability', although
there is not yet a generally accepted way for analyzing such
situations, and different schools with quite different basic
approaches compete. See, e.g., the links in
http://class.ee.iastate.edu/berleant/home/ServeInfo/Interval/intprob.html
The current treatment of bound states in QFT (see elsewhere in this FAQ)
is a very loose patchwork of techniques borrowed from perturbative
field theory and nonrelativistic quantum mechanics that should make
every theoretician shudder. There are some beginnings in algebraic
QFT of what bound states should be, but nothing convincing on the
quantitative level.
------------------------------------------------
S14b. Does the standard model predict chemistry?
------------------------------------------------
The standard model is widely believed to be in agreement with
all we know about matter and radiation on earth, within the range of
accessible energies, as long as gravitational effects can be neglected.
But this does not mean that it has a high predictivity, except
on the level of high energy elementary particle scattering.
The reason is that we can compute from it almost nothing at the scales
of interest in nuclear, atomic, or molecular physics.
Lattice gauge calculations show that the standard model implies the
existence of baryons such as proton and neutron with masses that
match the experimental masses with an accuracy of about 5%.
This is far too low to be of use in chemistry or even in nuclear
physics. The accuracy of the effective forces between them is even
poorer.
We have very little control over confinement, which is essential to
get useful forces at the energies relevant for nuclear physics.
Thus predictivity of the standard model for nuclear information
is almost nil.
And indeed, nuclear physicists do not use the standard model
(except for paying religious lip service to it), but work with
their own phenomenological models. They just borrow some of the
symmetries. These were of course known long before the standard
model was born, and built into the latter to match reality; so they
cannot count as predictions from the standard model.
If we had only the standard model and the numerical estimates
for the constants of effective actions computed from it,
this would give _very_ poor predictions of properties of protons,
neutrons, and their bound states.
One can show that the effective dynamics of protons and neutrons is
governed by effective field theories whose form can be derived
from the standard model (but also follows from assumed symmetry
principles built into the standard model) but whose coefficients
are derived by fitting calculations to _measured_ data about form
factors of proton and neutron, which have _not_ been calculated
from the standard model but must be put in by hand as additional
information.
From this, one can calculate the energy of the nuclei, using a combined
droplet/shell model. We understand the structure of nuclei, in agreement
with the standard model, but _not_ derived from it.
If we had only the standard model and the numerical estimates computed
from it, this would give _very_ poor predictions of nuclear properties.
There would be neither nuclear energy nor nuclear weapons based on
knowledge derived from the standard model only.
Even knowing the properties of proton and neutron from measurement
and the effective equations (but nothing else) does not allow one to get
highly accurate predictions for the properties of larger nuclei.
At atomic distances from the nucleus (for QED-dominated phenomena),
one can further approximate the theory by Dirac-Fock equations,
or, for light nuclei, by Schroedinger's equation
for electrons and nuclei together with relativistic corrections.
The details of the nuclei become irrelevant for atomic physics and
chemistry, except for their atomic weights. These cannot be derived
accurately enough from lower levels, and must again be supplemented by
additional experimental information.
If we had only the standard model and the numerical estimates computed
from it, this would give _very_ poor predictions of most chemical
properties of everything including the hydrogen spectrum.
Only starting on this level, _assuming_ the properties of the nuclei
and the electron, we are able to predict much of macroscopic physics:
We can solve the Dirac equation exactly for hydrogen, and
compute the radiation corrections from QED and other corrections from
the Standard Model. It agrees with the experimental measurement of
hydrogen spectra to extraordinary accuracy. We can understand why the
periodic table works, and predict the properties of even large
atoms (such as the color of gold) reasonably well using the Dirac-Fock
equations.
From this level on upwards, one has enough experimental data to
calculate chemical information for small molecules that is predictive
in the sense that it may give quantitative information that is
reasonably accurate and not put in by hand.
But already for proteins, one again needs to complement the theoretical
input by measurements to get predictions of reasonable accuracy.
Thus the standard model is a very inaccurate tool for chemistry.
It is useful only for elementary particle scattering experiments.
At each higher level, one needs additional information from
experiment to complement the predictions of the lower levels.
---------------------------------------------------
S14c. Is the result of a measurement a real number?
---------------------------------------------------
A single measurement (reading from a scale) always gives a rational
number, at least if the scale is in terms of rational units.
(If the scale gives an angle in degrees which is then converted into
arc length, the measurement gives rational multiples of pi instead).
However, this is by convention only, since a pointer position is
just a position in 3-space which must be translated into a number
by a subjective reading or by a digital reading device of limited
resolution. Thus the true position is not determined accurately
enough to associate it with a single number.
Infinitely many rationals (and uncountably many reals) are
compatible with any observable state of the voltmeter.
That's why the error bars are intrinsic to measurement results, even to
single readings. Deleting them and claiming exact measurement results is
just laziness, acceptable when the resolution of an instrument is known.
Therefore, according to the standards of NIST (National Institute
of Standards and Technology), a measurement gives an interval
consisting of a rational number together with an error bar; see
http://physics.nist.gov/cuu/Uncertainty/
Of course, the error bar is also somewhat uncertain, but one generally
accounts for this uncertainty by rounding it upwards, to make the
whole estimate conservative.
The NIST definition has the advantage that it also applies to indirect
measurements obtained from raw measurements by some computations.
Indeed, most high quality measurements are of this kind.
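How an indirect measurement inherits error bars from the raw readings
can be sketched in a few lines of Python. The sketch assumes
first-order error propagation with uncorrelated errors, which is one
common recipe and not the only one; the readings (a voltage and a
current, from which a resistance is computed) and the function name
are hypothetical.

```python
import math

def quotient_with_uncertainty(a, u_a, b, u_b):
    """Value and standard uncertainty of a/b by first-order propagation,
    assuming uncorrelated errors in a and b."""
    value = a / b
    relative = math.sqrt((u_a / a) ** 2 + (u_b / b) ** 2)
    return value, abs(value) * relative

# Hypothetical raw readings: voltage 12.0 +- 0.1 V, current 2.00 +- 0.02 A.
resistance, u_resistance = quotient_with_uncertainty(12.0, 0.1, 2.0, 0.02)
interval = (resistance - u_resistance, resistance + u_resistance)
print(resistance, u_resistance, interval)
```

The result is again a number together with an error bar, i.e., an
interval, although no instrument has read the resistance directly.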
Nevertheless there is no contradiction if one assumes that reality is
governed by equations in terms of exact real (or complex) numbers,
and only the measurement abilities are limited.
-----------------------------------------
S14d. Why use complex numbers in physics?
-----------------------------------------
Complex numbers are _the_ natural number system for all but
elementary physics; one needs them to make sense of many advanced
concepts. Avoiding complex numbers would make much
of what is done incomprehensible.
Already Fourier analysis is most natural with complex numbers,
though here it could be avoided by using trigonometric series
instead.
The time-independent Schroedinger equation defines the
Fourier components of real, measurable expectations. So it is
very natural that quantum mechanics is based on complex entities, too.
Dispersion relations in optics are natural only in a complex setting.
Spectra of nonhermitian operators, essential for dissipative systems
even in the classical case, are always complex.
Analytic continuation plays a significant role in some physical
theories. For example, lattice gauge theory works in a continuation
of quantum field theory to Euclidean space, and the results must be
continued back to Minkowski space to get physical meaning.
On the other hand, at first sight it seems that only real quantities
are measurable. However this only holds for the most direct measurements
where you read a number from a meter. Most measurements are of a
more indirect kind, and then this restriction no longer applies.
To measure a family of physical quantities x_l (l=1,...,n),
one measures some related real quantities r_1,...,r_m connected to
the x_l by a system of equations F(x,r)=0 (in the absence of
measurement errors). In fact, there will always be measurement errors,
hence one generally uses more equations than unknowns and solves
the least squares problem ||F(x,r)||^2=min (or a more complicated
related problem if a model of measurement errors is available)
to get an estimate of x.
This recipe is universally used for all sorts of measurements and
works whether the x_l are real or complex.
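The recipe can be illustrated for a single complex quantity x in a
minimal Python sketch. It assumes a hypothetical measurement model in
which each real reading is r_k = Re(x exp(i t_k)) for known angles
t_k, so that F(x,r)=0 is linear in the real and imaginary part of x,
and the least squares problem reduces to 2x2 normal equations. The
model, the data, and the function name are chosen for illustration
only.

```python
import cmath
import math

def estimate_complex(angles, readings):
    """Least squares estimate of complex x from real readings
    r_k = Re(x * exp(i*t_k)) = a*cos(t_k) - b*sin(t_k), x = a + i*b."""
    u = [math.cos(t) for t in angles]        # coefficient of a
    v = [-math.sin(t) for t in angles]       # coefficient of b
    suu = sum(p * p for p in u)
    svv = sum(q * q for q in v)
    suv = sum(p * q for p, q in zip(u, v))
    sur = sum(p * r for p, r in zip(u, readings))
    svr = sum(q * r for q, r in zip(v, readings))
    det = suu * svv - suv * suv              # normal equations, 2x2
    a = (sur * svv - suv * svr) / det
    b = (suu * svr - suv * sur) / det
    return complex(a, b)

# Hypothetical setup: 8 real readings for 2 real unknowns (Re x, Im x).
x_true = complex(1.5, -0.7)
angles = [2.0 * math.pi * k / 8 for k in range(8)]
exact = [(x_true * cmath.exp(1j * t)).real for t in angles]
# Small deterministic perturbations stand in for measurement errors.
noisy = [r + 0.01 * ((k % 3) - 1) for k, r in enumerate(exact)]

x_exact = estimate_complex(angles, exact)
x_noisy = estimate_complex(angles, noisy)
print(x_exact, x_noisy)
```

All readings are real numbers, but the quantity estimated from them is
complex. With error-free readings the estimate reproduces x up to
rounding; with perturbed readings it is off by an amount comparable to
the size of the perturbations.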
-------------------------------------------
S15a. How precise can physical language be?
-------------------------------------------
The relation between theory and reality necessarily uses ordinary
language and is therefore somewhat fuzzy. If one insists on 100%
unambiguous statements, one is on the level of pure mathematics or
mathematical physics (platonic reality), and cannot have any contact
with (physical) reality.
The best one can do is to have completely precise concepts on the
theoretical level and a description in ordinary, informal language
that relates theory to reality. In the formal theory, all concepts
can be precisely defined, and get names corresponding to their intended
use in reality. This ensures that one knows precisely what one talks
about - on the conceptual level.
In this informal language there must be room for linguistic
approximations without specifying their quality more than by
fuzzy words interpreted by the circumstances, since this is the way
we necessarily perceive reality.
When formulating the interface between theory and reality,
one must use the formulations people use who are using this interface.
They know how 'large' something must be to be taken as 'infinite'.
They estimate limits from finite sequences (most of numerical
analysis would be void if we couldn't...), usually quite successfully
- although this is meaningless mathematically.
A mathematical limit in theory does _not_ translate into a mathematical
limit in reality.
This is necessary since all our observations are finite, and most of
them are noisy. As there are approximate ways of determining the mass
of the Moon, but no exact methods, so there are approximate methods
for determining probabilities, but no exact ones. Exact real numbers
belong to theory, not to reality. (Even counting is not sure to result
in an integer. What about the number of people in a room when just
someone enters?)
Careful protocols for experimentation and measurement are useful to
achieve a certain amount of objectivity and repeatability, but even the
best protocols cannot reduce the level of fuzziness in the interface
between theory and reality to zero. I recommend
Experimentation and Measurement, by W.J. Youden,
reprinted 1997 by the National Institute of Standards and Technology
http://ts.nist.gov/ts/htdocs/230/233/calibrations/Publications/exp_meas.pdf
Although a very old paper (from 1961), it is still considered by NIST
to be up to date and exemplary in its lessons about measurements.
Among other things, it discusses on pp. 26ff in greatest detail
how to measure the thickness of a sheet of paper in an ensemble of
sheets typically called a thick book.
If one follows his argument closely, one finds that even classically,
observables such as the 'thickness of a sheet of paper' are
probabilistic only, notwithstanding that probably everything relevant
about paper can be understood by classical mechanics and
thermodynamics.
Thus there are no exact concepts in observed Nature.
But in a good theory of Nature, all concepts should be exact.
----------------------------------------
S15b. Why bother about rigor in physics?
----------------------------------------
Approximate methods are almost always more efficient than rigorous ones.
You can see this, for example, from the way integrals are calculated in
numerical analysis. No one uses the 'constructive proof' by
Riemann sums or, harder, by measure theory.
But for the logical coherence of a theory, the rigorous approach
is important.
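The remark about integrals can be checked in a small Python sketch,
given here as an illustration only. It compares a left Riemann sum
with the composite Simpson rule for the hypothetical test integral
integral_0^1 exp(s) ds = e-1, using the same 100 subintervals for
both.

```python
import math

def left_riemann(f, a, b, n):
    """Left Riemann sum with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + k * h) for k in range(n))

def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) equal subintervals."""
    h = (b - a) / n
    odd = sum(f(a + k * h) for k in range(1, n, 2))
    even = sum(f(a + k * h) for k in range(2, n, 2))
    return h / 3.0 * (f(a) + 4.0 * odd + 2.0 * even + f(b))

exact = math.e - 1.0                     # integral_0^1 exp(s) ds
err_riemann = abs(left_riemann(math.exp, 0.0, 1.0, 100) - exact)
err_simpson = abs(simpson(math.exp, 0.0, 1.0, 100) - exact)
print(err_riemann, err_simpson)
```

At essentially the same cost, the Riemann sum that underlies the
definition of the integral is off in the third decimal, while the
Simpson rule is accurate to many more digits.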
To prove that a long, complicated expression in a single variable is
monotone may be quite hard and exceed the capacity of a typical
mathematician or physicist, but to evaluate it at a few hundred points
and look at the plot generated is easy.
If you (the reader) are satisfied with the latter, never try to
understand mathematical physics - it will be a waste of your time.
But if you want to have physics in general look like classical
Hamiltonian mechanics - a beautiful piece of mathematically rich
and powerful theory, then you should not be satisfied with the way
current quantum field theory (say) is done, and keep looking for
a better, more solid, foundation.
About the pitfalls of using mathematics ''formally'' (i.e., without
bothering about convergence of the expressions, existence or
interchangeability of limits, etc.), I recommend reading
F. Gieres,
Mathematical surprises and Dirac's formalism in quantum mechanics,
Rep. Prog. Phys. 63 (2000) 1893-1931.
quant-ph/9907069
and
G. Bonneau, J. Faraut, G. Valent,
Self-adjoint extensions of operators and the teaching of quantum
mechanics,
Amer. J. Phys. 69 (2001) 322-331.
quant-ph/0103153
See also:
K Davey,
Is Mathematical Rigor Necessary in Physics?
British J. Phil. Science 54 (2003), 439-463.
http://philsci-archive.pitt.edu/archive/00000787/
On the other hand, on the way towards finding out what is true,
nonrigorous first steps are the rule, even for die-hard
mathematicians. The role of intuition and nonrigorous thinking in
mathematics is well depicted in the classics
J. Hadamard,
An essay on the psychology of invention in the mathematical field,
Princeton 1945.
and
G. Polya,
Mathematics and plausible reasoning,
2 Vols., 1954.
or
G. Polya,
Mathematical discovery,
John Wiley and Sons, New York, 1962.
More recently, the article
A. Jaffe and F. Quinn,
"Theoretical mathematics": Toward a cultural synthesis of
mathematics and theoretical physics,
Bull. Amer. Math. Soc. (N.S.) 29 (1993) 1-13.
math.HO/9307227
reports on the potential and dangers of nonrigorous approaches
to scientific truth. This paper was commented on in contributions
by a number of influential mathematicians and mathematical physicists in
M. Atiyah et al.,
Responses to ``Theoretical Mathematics: Toward a cultural
synthesis of mathematics and theoretical physics'',
by A. Jaffe and F. Quinn,
Bull. Amer. Math. Soc. 30 (1994) 178-207.
math/9404229
and the response of Jaffe and Quinn is given in
A. Jaffe and F. Quinn,
Response to comments on ``Theoretical mathematics'',
Bull. Amer. Math. Soc. 30 (1994) 208-211.
math/9404231
See also
D. Zeilberger,
Theorems for a Price: Tomorrow's Semi-Rigorous Mathematical Culture,
math.CO/9301202,
J. Borwein, P. Borwein, R. Girgensohn and S. Parnes
Experimental Mathematics: A Discussion
(1996?)
http://grace.wharton.upenn.edu/~sok/papers/age/expmath.pdf
--------------------------------------------
S15c. Justifying the foundations of a theory
--------------------------------------------
Quantum mechanics is a somewhat unintuitive theory, and has generated
a lot of foundational literature aimed at justification and
explanation of the conceptual basis.
Justification of the basic postulates of any theory is necessarily
circular. If it were not, the postulates would not be basic but derivable.
One must take all the basic postulates as a single foundation
on which everything else rests without circularity.
But the basic postulates themselves can only be motivated, not
derived.
Most people simply trust that tradition selected good foundations.
If you want to probe that trust you can go into studying the sea of
publications on the foundations of quantum mechanics. But unless
you are very dedicated and spend a lot of effort on it,
it is likely that you'll drown there before having found satisfaction...
----------------------------------------
S15d. Foundations, theory and experiment
----------------------------------------
Foundations of physics is the quest for getting the mathematical
concepts right to be able to do correct physics and think correctly
about it. Without correct concepts operational statements have no
meaning. The theory defines what a measurement is. Outside the
immediate realm of everyday experience, one needs already the
conceptual basis to even discuss what has operational meaning.
These statements apply both to good and bad theories. Even a bad theory
defines what a measurement is; it just defines it more poorly.
There is in fact a cross-fertilization between measurement and
foundations. If one gets better the other profits from it.
On the other hand, fuzzy foundations lead to poor judgment and
ambiguity in measurements, and poor measurements lead to low
discrimination among theoretical alternatives.
One can observe from history that progress in concepts leads to better
investigations of nature, and better experiments lead to higher
demands on the theory, forcing people to look for more stringent
concepts and simpler or more encompassing frameworks.
------------------------------------------------------
S15e. Theoretical physics as a formal model of reality
------------------------------------------------------
Can the meaning of all terms in a physical model be determined
precisely without an infinite regress? I want to show that the
answer is a clear `yes'.
Look at the question `What is a force?' To answer this, one needs
to consider the concepts of force, mass, acceleration, pressure,
stress, recoil, perhaps the gravitational field, etc., in total
a small number of physical items. If we want to define them in reality,
we don't get an infinite chain but a circular definition -- we can only
define one in terms of another, illustrating the concepts by pointing
to situations where we hope everything is obvious.
In practice (i.e., in teaching physics), this works alright since
each of us knows reality already
and only needs enough context to identify the usage of the concepts --
there is essentially only one fit that works, and once the
light goes on, we understand -- or at least the level of
understanding deepens. (Later, when doing high precision measurements,
we may notice that our understanding is not adequate,
and become more careful and sophisticated, and at some advanced
stage one can probably write a whole book to get definitions
that are really precise...)
But there is another way that is fruitful and neither circular nor
infinite. It is obtained by mimicking how modern logic investigates
its foundations. It assumes that we know at the 'external reality'
level what logic is; then it builds a formal model, a 'formal reality',
in which one can talk about everything one talks about in 'real' logic,
but in completely formal terms.
You don't need to know what truth, propositions, etc. are in reality,
but you declare the rules for manipulating
them -- since this is the heart of the matter.
This is done in exactly the same way as the Greeks declared rules for
manipulating geometric terms. In addition, they had definitions like
'a point is what has no parts'; but in modern geometry, this is
considered to be not a well-defined formal statement
(instead it has the circular character of relating the concept
to reality), and hence is simply dropped from the list of axioms.
So modern geometers define a projective plane by a few simple
statements:
''There are points, there are lines, there is a relation which
tells which points are on which lines, through any two distinct
points there is exactly one line, and any two distinct lines
have exactly one common point.''
That's all, and it is enough to do planar projective geometry with
full clarity and completeness. We do not need to know anything about
the objects to analyze a situation
(unless we want to check its impact on external reality).
Of course, it is good to have a few more restrictions and concepts to
go really deep, but this is supposed to be just an example.
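To illustrate that the two axioms can be checked without knowing what
the objects 'are', here is a minimal sketch that verifies them for the
smallest projective plane, the Fano plane with 7 points and 7 lines.
Representing points as integers and lines as sets is my assumption for
the example; the function names are hypothetical.

```python
from itertools import combinations

# The Fano plane: 7 points (0..6) and 7 lines, each line a set of 3 points.
# A point is "on" a line exactly when it is a member of that set.
POINTS = range(7)
LINES = [
    frozenset(s)
    for s in ([0, 1, 2], [0, 3, 4], [0, 5, 6], [1, 3, 5],
              [1, 4, 6], [2, 3, 6], [2, 4, 5])
]

def two_points_one_line(points, lines):
    """Through any two distinct points there is exactly one line."""
    return all(
        sum(1 for line in lines if p in line and q in line) == 1
        for p, q in combinations(points, 2)
    )

def two_lines_one_point(lines):
    """Any two distinct lines have exactly one common point."""
    return all(len(a & b) == 1 for a, b in combinations(lines, 2))

print(two_points_one_line(POINTS, LINES), two_lines_one_point(LINES))
```

Nothing in the check refers to what a point or a line is; only the
incidence relation is used.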
In the same way, one can discuss _everything_ about
the real logic in the formal model of logic, and reach clarity.
It is my proposal to do this for physics as well.
Actually it has been nearly achieved in classical physics,
and fully achieved in Hamiltonian mechanics.
You start with a phase space and a Hamiltonian which fall from
heaven. (They are motivated by circular arguments, but these arguments
are not part of the theory in the formal sense.)
Having this, you can build a whole world, with atoms,
dynamics, paths, forces, accelerations, stress, etc.
In fact, you can discuss any question about the classical world
in this mathematical frame, without ever needing any undefined term.
Formal reality is defined by what is expressible in terms of the
concepts already available, and 'true' reality with its circularity
never enters except as a guide to formulating new concepts and to
discuss their consequences.
This is what I think theoretical physics is about.
It builds a formal model of the world, with a 'formal reality',
in which every important concept from experimental physics has
a well-defined formal meaning, and in which every reasonable
question about the physical world can be posed and investigated.
What can be posed and analyzed in such a framework counts
as understood, and understanding of nature increases by bringing
more and more into such a formal model, until everything about
physical nature is representable.
My vision is that the same is possible and desirable for quantum
physics. For me, realizing this vision is
equivalent to having understood quantum physics.
So I want to have a mathematical quantum model of nature,
in which one can talk about all the things physicists talk about
when they talk about nature in the physical sense. In particular,
there will be concepts like particles, fields, detectors, measurement,
probability, memory, etc. but -- unlike in real nature --
they will have a precise and unambiguous formal definition,
of the same formal quality as force, acceleration, etc. are defined
in Hamiltonian mechanics.
Then we can ask about the "meaning" of each term,
and get a well-defined answer within the formalism,
without infinite regress.
----------------------------
S16a. On progress in science
----------------------------
The frontier in science is the frontier because there is no clear
understanding of what is beyond. All that is there is a set of
questions bothering those close to the frontier, and a set of
experiences of more or less failed attempts to push the frontier
forward.
Real improvements in difficult matters never come by starting from
scratch - they come from patiently building upon the best of
what already exists, being open-minded but critical about new
possibilities, and trying to integrate what looks most promising.
Those who had the questions and found real answers published them and
advanced the state of the art. The others can only share their
experience and their chart of the uncharted territory. As one can see
from the conflicting opinions, these charts are not reliable.
-------------------------------------------------------------
S16b. How different are physical sciences and social sciences
-------------------------------------------------------------
From the subject matter treated, a lot. From the modeling side far less.
There is no difference in principle. All science is based on observation
and experiment. All experimental data must be observed according to
well-defined protocols, to be objective (and hence science).
The main difference between physical sciences and social sciences is
that in the former one generally studies systems which are strongly
constrained by the experimental setting, so that they give much more
predictable results.
In both cases, however, the correct mathematical model is that of a
stochastic process, and physical sciences and social sciences only
differ in the size of the noise relative to the signal,
sometimes to the extent that one can ignore the noise and treat a
physical system as deterministic, while a social system can never
be controlled well enough to make the remaining fluctuations
negligible.
-------------------------------------
S16c. Can good theories be falsified?
-------------------------------------
The philosopher Karl Popper claimed that falsifiability is the
hallmark of scientific theories. But scientific practice speaks
against him.
A correct theory cannot be falsified, and in this sense is not
falsifiable, in spite of Popper. (Falsifiability can be asserted
only in a counterfactual sense, that there are _conceivable_ situations
that, according to the theory, are excluded. But for a correct theory,
these situations will never happen, hence are completely fictitious.)
What happens with good theories is, at worst, that their region of
validity or accuracy gets restricted as new data about more remote
instances come in.
In today's understanding, people are careful to indicate the
limits where a theory is claimed to be valid, and the accuracy
to which its answers are to be trusted.
For example, the Standard Model is claimed to be valid whenever
gravitation is negligible, accuracies conform to present possibilities,
and energies are well below a putative unification scale.
Failures outside this domain are not counted as falsifications.
While limits and accuracy claims are not necessarily part
of the theory proper, they are part of the theory as actually taught
and applied. Indeed, although people try to extrapolate, one can
never be sure whether a theory is correct outside the domain where
the data were collected.
But one can be reasonably sure within the domain where enough data
are available. Good scientific practice requires that a good theory
agrees with the data within the tolerances claimed.
Once this is the case, these theories can never be falsified.
Rather, if people find disagreement in experiments, the
theory falsifies the experimental arrangement or analysis.
All science students who ever did experiments in the lab know
very well that this is common practice.
----------------------------------------------
S16d. What, then, distinguishes a good theory?
----------------------------------------------
We can _know_ whether a theory has been correct in the past,
and we can _trust_ that it will remain so in the future.
There is no other kind of knowledge than that of the past.
Relying on that ''anything in the future is like in the past'' is an
act of faith. The question is not about faith or not, but about
faith in what is best supported by past experience.
Theories that conform with the past are easy to trust.
But they come in different degrees of stringency.
Theories which are not restrictive at all but accommodate everything
(such as astrology or psychoanalysis) are in vogue (as society shows)
but useless (and probably harmful). These are the ones that Popper calls
unfalsifiable.
Highly restrictive theories (what Popper calls scientific) are preferred
by those who want to control their destiny as far as possible.
Theories like Newton's, general relativity, or QED are extremely
restrictive and in agreement with past experience, hence both
trustworthy and very useful.
What makes a theory good is not its potential falsifiability, but that
it drastically reduces the number of possibilities which are present
without the theory, without eliminating something that can actually
happen.
If you have no theory and put two marbles into your empty pocket,
and then another two, you don't know how many marbles you can take out.
If you know arithmetic and the law of conservation of marbles you can
predict that exactly four can be taken out. This is testable, and will
always come out correct. So you have a correct theory. Of course, its
validity is not unlimited, since it assumes that your pocket does not
have a hole; so if some experiment does not conform to your theory
since you can only take out three, you suspect that the domain of
validity was violated; you check for the hole - and surely you'll
find it.
This is exactly analogous to the way Newton's theory works, within
its domain of validity. If it fails, we suspect speed close to c,
or highly accurate measurements, or tiny distances. And surely
we'll find it so.
------------------------------------------------
S16e. When is a theory preferred to another one?
------------------------------------------------
Frequently, Ockham's razor
''frustra fit per plura quod potest fieri per pauciora'',
that we should not use more degrees of freedom than are
necessary to model a phenomenon, is invoked to argue that the theory
with the fewest parameters is the best. But this is true only
when taken with many grains of salt.
Chemists prefer as a starting point of their deepest investigations
the theory based on Dirac-Fock theory or even cruder approximations,
treating the nuclei (for large problems even atoms) as elementary.
This gives them all the information they need, while they can deduce
nothing at all from the standard model which is supposed to be a much
more exact and general theory.
Thus what is preferred depends a lot on which use can be made of it.
Ockham's razor is appropriate only if two theories allow the same
deductions with a similar amount of work, or if the more parsimonious
theory is even superior in allowing one to derive the desired
properties.
Nothing in science is against a complicated model if it gives more ready
access to the quantities of interest than a formally simpler but
computationally more difficult or even intractable formulation.
Given only the standard model + classical relativity
(allegedly correctly describing all phenomena of the world at
accessible energies, distances, and accuracy), we'd know very little
about our world, and only very inaccurately. Not even the masses of
the nuclei can be predicted at present with any confidence, let alone
the properties of water or gold.
And given only string theory (a theory without any free parameter),
we'd know essentially nothing about our world.
(See http://rz70.rz.uni-karlsruhe.de/~ed01/Hyle/Hyle3/hoffman.htm
for further discussion of Ockham's razor.)
---------------------
S16f. What is a fact?
---------------------
In discussion on sci.physics.research, one often finds very good
information, but also often poor and misleading information.
How to distinguish the good from the poor?
Everything called knowledge is in fact a set of beliefs of the
person claiming it. And this set of beliefs is more or less close
to the objective truth, depending on the standards of that person.
Calling so-called knowledge a set of beliefs does not contradict the
objectivity of mathematical definitions. When I say that a Banach
space is a normed, complete vector space, I both state my belief
and happen to coincide with the social consensus of the guild of
mathematicians. And when I say that state reduction is a
physical process, I both state my belief and happen to coincide with
famous physicists like von Neumann and many others, and this is good
enough to make this statement honestly, since the community has not
reached an agreement on the matter.
----------------------------
S16g. Physics and experience
----------------------------
On superficial reasoning, time is only a concept that helps us
to order our experiences. Thus,
''experience exists; time does not''.
By exactly the same argument,
''experience exists; space does not''
''experience exists; mass does not''
''experience exists; charge does not''
''experience exists; gravitation does not''
etc.
Physics is exactly about the concepts that are substituted for
experience to make experience quantitatively predictable.
Therefore, in this deeper sense, time, space, mass, charge,
gravitation, etc. exist, and are more fundamental than experience.
----------------------
S16h. Modeling reality
----------------------
In describing reality from a physics point of view,
the person modeling a system of interest makes certain
choices. These consist in choosing a mathematical model
of the system, and setting up a correspondence between
informal objects related to the system and formal objects
in the mathematical model.
More specifically, an assertion about reality is modelled
as a mathematical assertion about mathematical objects in the
mathematical model that carry the same names as those
in the reality they are supposed to model.
--------------------------------------------
S16i. What is a system (e.g., an ideal gas)?
--------------------------------------------
Theories of physics do not say what a system (such as an
electron, a star, an ideal gas, a crystal) is in reality.
Nevertheless, it is possible to check the reality contents
of a physical theory. How does this come about?
Let us consider thermodynamics. Thermodynamics does not say which
system is an ideal gas, which is only a van-der-Waals gas,
which is a liquid, or a solid.
Indeed, such questions need not be answered by the theory.
Instead, they are answered by checking how a system behaves:
---------------------------------
S16j. When is a theory confirmed?
---------------------------------
Any deviation from a law can only be 'confirmed' by narrowing error
bars for the parameters modeling the deviation. As long as the error
bars contain zero, the law counts as confirmed.
With time, confirmation of the law may be at a higher level of
accuracy, or (as in the case of neutrino masses) confirmation of the
deviation (if the more accurate error bars no longer contain zero).
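The criterion can be stated as a minimal sketch. The function name
law_status, the assumption of symmetric error bars, and the numbers
used are all hypothetical and serve only as illustration.

```python
def law_status(deviation, uncertainty):
    """Classify a measured deviation parameter with symmetric error bars.

    deviation   -- best estimate of the parameter modeling the deviation
    uncertainty -- half-width of the error bar (same units, nonnegative)
    """
    lower, upper = deviation - uncertainty, deviation + uncertainty
    if lower <= 0.0 <= upper:
        return "law confirmed"
    return "deviation confirmed"

# Hypothetical numbers, for illustration only.
print(law_status(0.3, 0.5))   # error bar [-0.2, 0.8] contains zero
print(law_status(0.3, 0.1))   # error bar [0.2, 0.4] excludes zero
```

The same estimate of the deviation thus leads to opposite verdicts,
depending only on how narrow the error bars have become.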
If one disputes any of the established theories because of not enough
confirmation, one can as well dispute Lorentz symmetry, translation
invariance, zero photon mass, general relativity, etc., which are
basic to contemporary physics but all confirmed only to a certain
precision.
There are experiments testing the limits of all these assumptions,
but even when one of these experiments succeeds (as in the case of
neutrino masses), the previous theory remains valid to the accuracy
it was known to be valid before. In this sense, older theories don't
die even when they are superseded. A well-known case is Newton's
gravitational theory which is still taught and heavily used
although not completely correct.
-------------------
S16k. What is real?
-------------------
All physics is just a handy way of thinking about certain phenomena.
This - a handy way of thinking - is what it means that something
- the concept we find useful - exists.
We say that people exist, because they are a handy way to describe
certain blobs of matter like ourselves. We say that electrons exist,
because they are a handy way to describe ionization phenomena.
We say that photons exist because they are a handy way to describe
quantum optics phenomena.
Photons are objectively real because they are needed in the only
comprehensive coherent theory of microscopic interactions that we
know of.
On the other hand, 'photon' is merely a word that physicists use on
paper and in conversation. But in precisely the same sense that
entropy, energy, or the electromagnetic field are merely words that
physicists use on paper and in conversation.
Even our best concepts are 'merely' words.
If we give up concepts, only an undifferentiated happening in
space-time remains, and even talking about this becomes impossible.
---------------------------------------------------
S16l. How many angels fit onto the tip of a needle?
---------------------------------------------------
Anton Zeilinger writes in
http://www.ap.univie.ac.at/users/Anton.Zeilinger/philosop.html
''the question whether such a description exists or not was therefore
similarly irrelevant as, according to Pauli, the old question
how many angels fit onto the tip of a needle.''
This question has become a well-known metaphor for doing
irrelevant physics.
But how old is this question really?
Who was the person who discussed it seriously?
http://web.maths.unsw.edu.au/~jim/headsofpins.html
mentions explicitly Chillingworth's
''Religion of Protestants a Safe Way to Salvation''
(1638, reprinted 1972, 12th unnumbered page of the preface)
accusing unnamed scholars of debating
''Whether a Million of Angels may not fit upon a needles point?''
It seems that, as here, the question has always been used in a derisive
manner only. In the historical essay
E.D. Sylla,
Swester Katrei and Gregory of Rimini:
Angels, God and mathematics in the fourteenth century,
pp. 251-270 in:
Mathematics and the Divine: A Historical Study
(T. Koetsier and L. Bergmans, eds.)
Elsevier 2005,
http://www.elsevier.com/wps/find/bookdescription.cws_home/704302/description#description
Sylla conjectures that the question might have been coined by
Thomas Hobbes, who had learnt the scholastic tradition in Oxford
between 1603 and 1608. See also
http://en.wikipedia.org/wiki/How_many_angels_can_dance_on_the_head_of_a_pin%3F
------------------------------------------------------
S17a. How to get information from sci.physics.research
------------------------------------------------------
If you read sci.physics.research out of curiosity, you may find that
the discussions get too specific for you but make you curious to
learn more about the background. But it may be difficult to find out
where to get started.
The right way to find out is to ask on sci.physics.research
for what you need, in response to someone's contribution.
The writers usually know how they got the knowledge, and are happy to
give you hints or recommendations, and others will join in if they
think they have better advice. The more specific your question, the
more likely you'll get an answer, and the more useful it will
be for others, too. By asking good questions you are doing a
service to all.
My Lord Jesus Christ, for whom I live, asserted:
"Ask, and it will be given you; search, and you will find; knock,
and the door will be opened for you. For everyone who asks receives,
and everyone who searches finds, and for everyone who knocks,
the door will be opened." (Matth. 7:7-8)
It took me a while to realize that this was excellent advice.
------------------------------------
S17b. How to get your work published
------------------------------------
You did some work that you think is great (or at least reasonable),
but it was rejected by the journal you sent it to?
This is disappointing, but not the end of all hope...
Rejection letters usually give some reasons for rejection; if they
don't you may request (in a polite way!) getting reasons so that you
can learn from them. And then _do_ learn from them! Usually the reasons
for rejection are sound and mean at least that you didn't pose your
case well. It also takes some time to learn the standards that
publications should respect, and it is likely that you violated
some of the unspoken rules without realizing it.
If your idea is far from mainstream, you also need to convince people
that your approach is sound and merits spending the time to read
through the new proposal. This is difficult since you need to build
up trust; it requires that you have a high level of frustration
tolerance.
The less mainstream an idea, the stronger must be its contents and the
more carefully it must be argued to be publishable; use the feedback you
get to find out the standards expected and then go and meet them.
The difference between a crank and a serious researcher is that
the latter learns from criticisms and grows through each feedback,
while the former 'knows' (and acts on this assumption) that he or she
is right and that established physics is just rejecting him or her
for no good reasons.
If you enter a correspondence with anyone who takes the time to
read your work, stay polite even when the answers you get are not
what you hoped for. Once the tone of your mail gets defensive or
aggressive, you probably lost your case - your partner sees that
you try to replace facts by emotions and your credibility is gone.
--------------------------------------------------
S17c. How to respond to critical referee's reports
--------------------------------------------------
[This is taken verbatim from http://authors.aps.org/faq_review.html]
What Should I Do When a Referee Criticizes My Paper?
Read the referee report carefully and dispassionately. Approach the
report with an open mind. What may at first seem like a devastating
blow is perhaps a request for more information or for a more detailed
explanation. At other times the referee may indeed have found a fatal
flaw in the research or logic. Put yourself in the position of a
reader, which is exactly the position of the referee. Is the paper
well written? Is the presentation clear, unambiguous, and logical?
Respond to all referee comments, suggestions, and criticisms.
Explain which changes have been made and state your position on points
of disagreement. In our experience, appropriate response to some
referee comments may require more research or even reconsideration
of the research project.
-----------------------------------------
S17d. How to sell your revolutionary idea
-----------------------------------------
Unless you don't care about making a fool of yourself, don't tell
it to others before you worked out enough details to be convincing.
Your audience is very likely to be skeptical (since there are too many
revolutionary ideas around which don't stand the test); so you need
to make best use of this fact.
------------------------------
S17f. Stories about physicists
------------------------------
Memories about Theoretical Physicists (by R.F. Streater)
http://www.mth.kcl.ac.uk/~streater/links.html
Short Stories
http://www2.physics.umd.edu/~yskim/home/storie.html
Parables for Modern Academia (by D. and L. Haarsma)
http://www.calvin.edu/~lhaarsma/parables.html
http://www.asa3.org/archive/asa/200006/0147.html
------------------------
S17g. Other physics FAQs
------------------------
http://math.ucr.edu/home/baez/physics/
Usenet Physics FAQ
(extensive, has also links to further physics-related FAQs)
http://www.faqs.org/faqs/physics-faq/
Physics FAQ
(a list of links)
http://www.kar.net/~plasma/faq/
Plasma FAQ
http://www.iworld.de/~ej/faq.html
Quantum Physics FAQ
(current views of Erich Joos)
http://theory.gsi.de/~vanhees/faq/index.html
Physik und das Drumherum
(Physics FAQ in German)
-----------------------
S17h. Naming in science
-----------------------
How do scientific concepts, effects, or inventions get named after
their discoverers?
It is good practice to name important concepts, effects, or inventions
created by esteemed colleagues after them - good names are always hard
to find, and besides names clearly related to the content, names
naturally related to the history stick best. If a naming is successful
(in that others find it appropriate and useful) it will spread,
and soon everyone is using it. Then the name is established.
It is bad practice if authors call something by their own name
before it has been established by others. It suggests both vanity
and a lack of confidence that others do a good naming job.
And if the self chosen vanity name does not stick, it serves them
right for having made a fool of themselves.
On the other hand, naming is at times unfair. Not rarely in the past,
a concept (or theorem, etc.) got the name of one of its main proponents
rather than that of its creator.
http://en.wikipedia.org/wiki/Stigler%27s_law_of_eponymy
There are several reasons for this.
It takes time (and a certain amount of interest) to find the true
origin of a concept; but a good name is needed once it is used by more
than a few people. But once a name is established, it is nearly
impossible to change it.
A concept may also be rediscovered independently of its first inception.
If the time wasn't ripe for it the first time, it is likely that the
name of the rediscoverer sticks, and the voices of those who had known
the first source come too late.
See also:
List of misnamed theorems
http://en.wikipedia.org/wiki/List_of_misnamed_theorems
-----------------------------------------------
S18a. What is the meaning of 'self-consistent'?
-----------------------------------------------
A self-consistent solution (or method, or theory) refers to the fact
that one has two sets of equations relating two sets of unknown
quantities, and wants to solve the equations jointly for the unknowns.
If aspect A of a theory says y=x^2 and aspect B of the theory
(or of another theory) says x=y-2 then self-consistency means that
both equations are assumed to be valid, giving
x^2 = y = x+2,
which leads to the two solutions
x=2, y=4 and x=-1, y=1.
That's all. Of course, the self-consistent Hartree-Fock method,
say, has more variables and is harder to solve, but the principle
is the same.
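The toy example above can also be solved the way self-consistent
problems are often treated numerically: alternate between the two
equations until nothing changes any more. The following is only a
minimal sketch under that assumption; the function name
self_consistent, the branch parameter (which sign of the square root
to take), and the tolerance are choices made for this illustration
and are not part of any standard method.

```python
import math

def self_consistent(x0, branch=1.0, tol=1e-12, max_iter=200):
    """Alternate between y = x + 2 and x = branch*sqrt(y) until x settles.

    branch=+1.0 uses x = +sqrt(y); branch=-1.0 uses x = -sqrt(y).
    Both the name and the parameters are illustrative assumptions.
    """
    x = x0
    for _ in range(max_iter):
        y = x + 2.0                     # aspect B: x = y - 2
        x_new = branch * math.sqrt(y)   # aspect A: y = x^2, solved for x
        if abs(x_new - x) < tol:
            return x_new, x_new * x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

# Starting from x = 0, the positive branch approaches x=2, y=4,
# and the negative branch approaches x=-1, y=1.
print(self_consistent(0.0, branch=1.0))
print(self_consistent(0.0, branch=-1.0))
```

Each branch converges here because the update map is a contraction
near the respective solution; for harder problems convergence of such
an alternating scheme is not guaranteed.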
-----------------------
S18b. What is a vector?
-----------------------
A vector is (for the beginner) a list of numbers written below each
other, for example the x, y, and z coordinates of a point in a
3-dimensional coordinate system. Physicists write the three
coordinates as x_1, x_2, x_3 and combine them into a vector
simply called x.
         /     \
         | x_1 |
     x = | x_2 |    (The parentheses look a bit awkward in ascii.)
         | x_3 |
         \     /
The same works for a list of n numbers. This gives a vector x with n
coordinates x_1,...,x_n, which is thought of as a point in a
space with n dimensions.
Two vectors are added or subtracted just by adding or subtracting
their entries. A vector is multiplied by a number just by multiplying
each entry by the number. Then there is the inner product of two
vectors
x dot y = sum_i x_i*y_i
which is a number and not a vector.
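For those who like to experiment on a computer, the three operations
just described can be sketched in a few lines of Python. This is only
an illustration using plain lists; the function names add, scale, and
dot are chosen here for the example.

```python
def add(x, y):
    """Entrywise sum of two vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of entries")
    return [a + b for a, b in zip(x, y)]

def scale(c, x):
    """Multiply each entry of the vector x by the number c."""
    return [c * a for a in x]

def dot(x, y):
    """Inner product sum_i x_i*y_i; the result is a number, not a vector."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of entries")
    return sum(a * b for a, b in zip(x, y))

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
print(add(x, y))      # [5.0, 7.0, 9.0]
print(scale(2.0, x))  # [2.0, 4.0, 6.0]
print(dot(x, y))      # 32.0
```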
Once you have mastered vectors, you need to understand matrices.
These are rectangular arrays of numbers.
Later you need to enrich the meaning of a vector by learning
the concept of a vector space. Now all sorts of objects might
also deserve the name vector, most prominently functions,
matrices, tensors, operators. They behave in many respects
just like ordinary vectors.
------------------------------------------
S18c. Learning quantum mechanics at age 14
------------------------------------------
If you want to learn about quantum physics and really understand it,
you first need to learn how to do calculations with vectors and
matrices. Look in your local library for math books about
'linear algebra' or 'analytic geometry'. You may have to try
several before you find one suitable for your level.
Linear algebra (i.e., vectors and matrices) is more fundamental
to quantum mechanics than calculus, although the latter is needed
to understand how things change steadily with time.
But one can understand the time-independent part of quantum mechanics
already without calculus, namely everything involving entanglement,
Schroedinger's cat, quantum cryptography, and the like.
This only needs linear algebra, which may be easier.
(On the other hand, calculus is not really difficult either,
once one gets used to it.)
------------------------
S18d. Research at age 16
------------------------
At 16, you should spend your time learning rather than
doing research. Lacking ideas means knowing too little...
Once you know enough about what others did and where they
got stuck, you'll have more than enough ideas to work on.
I'd like to suggest that you read the Nobel lectures of the
physics Nobel laureates,
http://nobelprize.org/physics/laureates/
The material spans a whole century, and will occupy you for a long time!
It will put your mind to themes that have been important enough
to merit the prize; most of them will continue to be important in
the future.
In parallel, use the web to sort out all concepts used in the Nobel
lectures that you don't yet understand; at first it will be a lot,
and you have to search a bit to find out where the basics you need are
well explained. Some items might be explained in this theoretical
physics FAQ, or in the book mentioned at the top of this FAQ.
Doing both will put you on a learning track which will end in a
research career and bear plenty of fruit.
------------------------------------------
S18e. Are there indefinite Hilbert spaces?
------------------------------------------
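There are no indefinite Hilbert spaces. There are, however,
vector spaces with a distinguished indefinite inner product;
the well-behaved ones among them are called Krein spaces. The
structure provided by an indefinite inner product alone is much
weaker than that of a Hilbert space: it defines no norm, hence by
itself no natural topology and no notion of completeness; these must
be supplied separately, e.g., by a decomposition into a positive
definite and a negative definite part.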
Since there are physical situations where indefinite inner products
arise naturally, some people show their lack of knowledge of the
literature by referring to Krein spaces as indefinite Hilbert spaces.
But if a few people do so, it doesn't mean that the terminology is
justified.
For example, quant-ph/0211048 uses this poor terminology.
The ghosts referred to in this paper are nonphysical vectors in a
Krein space which contains a definite subspace of physical vectors
whose completion gives the physical Hilbert space. This is a natural
construction in gauge theories (Gupta-Bleuler formalism) where
the direct construction of a physical Hilbert space would
manifestly break Lorentz and/or gauge invariance, while the
nonphysical, bigger Krein space enjoys all desired invariance
properties.
The indefinite metric in relativity, also mentioned in that paper,
has nothing to do with indefinite Hilbert spaces, since the
underlying vector spaces (Minkowski space in special relativity,
the tangent spaces at space-time points in general relativity)
are 4-dimensional spaces with the ordinary Euclidean topology
(although the metric is non-Euclidean).
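To make the word 'indefinite' concrete, here is a small sketch of the
Minkowski inner product on 4-component vectors. The signature
(+,-,-,-) and the function name minkowski are assumptions made for
this illustration; the opposite sign convention is equally common.
The point is only that the 'square' of a vector can be positive,
negative, or zero for a nonzero vector, so it cannot serve as the
square of a norm.

```python
def minkowski(u, v):
    """Indefinite inner product with assumed signature (+,-,-,-)."""
    return u[0] * v[0] - u[1] * v[1] - u[2] * v[2] - u[3] * v[3]

timelike = (1.0, 0.0, 0.0, 0.0)
spacelike = (0.0, 1.0, 0.0, 0.0)
lightlike = (1.0, 1.0, 0.0, 0.0)   # nonzero, yet its 'square' vanishes

print(minkowski(timelike, timelike))    # 1.0
print(minkowski(spacelike, spacelike))  # -1.0
print(minkowski(lightlike, lightlike))  # 0.0
```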
---------------------
S19a. God and physics
---------------------
This is most likely to be controversial; but you might be
interested in how the author of this FAQ sees the issues.
The following links are to some relevant pages from my web site.
How Do We Know Whether God Acts In The World?
http://www.mat.univie.ac.at/~neum/sciandf/eng/godacts.html
''I found the assumption that `God acts in the world' a superior
way of organizing the events that I see or hear happen.''
Knowledge, Chance, and Creation
http://www.mat.univie.ac.at/~neum/sciandf/eng/chance.html
(On the difficulty of knowing, and the role of the second law of
thermodynamics as an instrument of creation)
How to study
http://www.mat.univie.ac.at/~neum/sciandf/eng/study.html
''When I questioned the bible about the attitude appropriate
to the study of science I found the following instructions.''
How to Create a Universe - Instructions for an Apprentice God.
http://www.mat.univie.ac.at/~neum/other/turing.txt
(A fantasy to be read at leisure time)
Science and Faith
(an extensive collection of links)
http://www.mat.univie.ac.at/~neum/sciandf.html
''Science is the truth only in matters that can be objectified;
in the spiritual world, where values, goals, authority and purpose
are located, science has nothing to say. It is a poor life that is
restricted to the scientific standard of truth, where you and I are
nothing but a collection of atoms without meaning and purpose.
Realizing the narrow-minded nature of science opens the gate to an
understanding of God that complements the scientific truth and gives
life, love and peace.''
and in German:
Gott - die grosse Unbekannte
http://www.mat.univie.ac.at/~neum/sciandf/ger/unbek.html
Mathematik, Physik und Ewigkeit (mit einem Augenzwinkern betrachtet)
http://www.mat.univie.ac.at/~neum/sciandf/ger/neumann.pdf
---------------------
S20a. Acknowledgments
---------------------
Thanks to the contributors to the newsgroup sci.physics.research
for their more or less challenging questions and comments, without
which this FAQ wouldn't exist.
Thanks also to Steve Carlip, Norbert Dragon,
Hendrik van Hees, Don Koks, Nick Maclaren,
Alejandro Rivero, Joe Rongen, and Gerard Westendorp
for useful comments that led to improvements in the FAQ.
Finally, thanks to God for his wonderful and interesting universe,
and for the gift of being able to understand his wonders.