4 Mathematical Preliminaries
4.1 Linear Vector Space
The concept of a linear vector space is the backbone of quantum mechanics. An important feature of quantum mechanics is the principle of superposition: if ψ1 and ψ2 are two states of a system, then there exists a state corresponding to ψ = c1ψ1 + c2ψ2, where c1 and c2 are complex numbers. It is the concept of a linear vector space that expresses this idea of superposition. The simple concept of a linear vector space is, however, not sufficient for quantum mechanics. The structure of quantum mechanics requires the imposition of additional properties on the linear vector space. More particularly, quantum mechanics requires the linear vector space to be a Hilbert space. The simplest way of describing a Hilbert space is that it is a linear vector space with an inner product.
First, let us review the basic ideas of a linear vector space. Let us start with a collection of geometric vectors, that is, quantities with direction and magnitude. The best examples are the position vector, the velocity vector, the force vector and so on. If V1 and V2 are two vectors representing velocities, then we can define a new vector c1V1 + c2V2 which also represents a velocity vector V,
i.e.,
V = c1V1 + c2V2
For a given pair of vectors A and B, we can assign a number (A, B). In vector algebra, the most useful way of assigning such a number is the usual dot product
A · B = AB cos θ
Note that this expression is only a definition. One has to recognize the fact that there are many ways of assigning a number for any
two vectors. However, within the vector algebra, the definition given in the Equation (4.3) is one of the most profitable definitions,
especially in the context of classical physics and engineering. This definition satisfies the following properties:
A · B = B · A
A · (c1B + c2C) = c1A · B + c2A · C
and
A · A = ∥A∥2 ≥ 0, with A · A = 0 only for the null vector
We have just summarized a few important properties of geometrical vectors in order to develop the basic concepts of linear vector
space.
The next step is to set aside the fact that a vector is a quantity with magnitude and direction and concentrate on certain abstract properties satisfied by the geometrical vectors. In particular, we have to focus on the ideas of the closure property, linear combinations, basis, dimension, scalar product and so on, and we will extend these ideas to other mathematical objects.
In what follows, we present a brief introduction to linear vector space.
A linear vector space L is a collection (or a set) of objects {ϕ1, ϕ2, ϕ3,…} called vectors, with an operation of addition, providing a rule to add two elements, and an operation of scalar multiplication, providing a rule to multiply a vector by a scalar (a real or complex number), which obey the following axioms:
1. For every ϕ1 and ϕ2, which are elements of L, we can define an addition operation + such that ϕ1 + ϕ2 is also an element of L, and addition is commutative:
ϕ1 + ϕ2 = ϕ = ϕ2 + ϕ1
2. Addition is associative: (ϕ1 + ϕ2) + ϕ3 = ϕ1 + (ϕ2 + ϕ3)
3. There exists a null vector 0 in L such that
ϕ + 0 = 0 + ϕ = ϕ
4. If c is a real or a complex number, then the multiplication of vectors by a scalar is distributive:
c(ϕ1 + ϕ2) = cϕ1 + cϕ2
5. Multiplication by a scalar is distributive over the addition of scalars:
(c1 + c2)ϕ = c1ϕ + c2ϕ
6. Multiplication by scalars is associative: c1(c2ϕ) = (c1c2)ϕ
7. 1 · ϕ = ϕ · 1 = ϕ
8. For every ϕ in L, there exists an inverse element −ϕ such that ϕ + (−ϕ) = 0
It is very easy to see that all these axioms are satisfied by the geometrical vectors V1, V2, V3,…. In fact, these axioms were formulated from the study of geometrical vectors. However, they are also satisfied by a number of other mathematical objects, as can be seen from the examples of such linear vector spaces given below.
Consider the set of 2 × 1 column vectors. If ϕ1 = (a1  a2)T and ϕ2 = (b1  b2)T, then ϕ1 + ϕ2 = (a1 + b1  a2 + b2)T, which is again a 2 × 1 column vector. So the closure property is satisfied.
The null vector of this linear vector space is the column vector with both entries zero. A column vector can also be multiplied by a real number like, say, 5, giving 5ϕ1 = (5a1  5a2)T.
In fact, the verification of all these axioms by this set of column vectors is trivial.
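The trivial verification mentioned above can be sketched numerically. This is an illustrative check with numpy, using arbitrarily chosen entries and scalars:

```python
import numpy as np

# Two 2x1 column vectors and two scalars, chosen arbitrarily for illustration.
phi1 = np.array([1.0, 2.0])
phi2 = np.array([3.0, -1.0])
c1, c2 = 2.0, -3.0

# Closure and commutativity of addition: phi1 + phi2 = phi2 + phi1
assert np.allclose(phi1 + phi2, phi2 + phi1)

# The null vector: phi + 0 = phi
zero = np.zeros(2)
assert np.allclose(phi1 + zero, phi1)

# Distributivity of scalar multiplication over vector and scalar addition
assert np.allclose(c1 * (phi1 + phi2), c1 * phi1 + c1 * phi2)
assert np.allclose((c1 + c2) * phi1, c1 * phi1 + c2 * phi1)
```

Each assertion corresponds to one of the axioms listed above, specialized to 2 × 1 column vectors.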
Next, consider the set of functions continuous on an interval,
L = {f1(x), f2(x),…}
It is obvious that the sum of two continuous functions f1(x) and f2(x) is again a continuous function:
f(x) = f1(x) + f2(x)
Similarly, the sum of two polynomials of degree 2 is again a polynomial of degree at most 2, so the set of polynomials of degree ≤ 2 is closed under addition.
Let ϕ1 = a1 + a2x + a3x2 and ϕ2 = b1 + b2x + b3x2. Then we have
ϕ1 + ϕ2 = (a1 + b1) + (a2 + b2)x + (a3 + b3)x2
The null element corresponds to a1 = a2 = a3 = 0. It is easy to verify that all the axioms of a linear vector space are satisfied by this set of polynomials.
Any object which satisfies all the eight axioms of a linear vector space is called a vector.
Linear Combination
If ϕ1, ϕ2,…ϕn are elements of the linear vector space, then we can form a new vector ϕ by the linear combination
ϕ = c1ϕ1 + c2ϕ2 + … + cnϕn
Linear Independence
The vectors ϕ1, ϕ2,…ϕn are said to be linearly independent if it is not possible to find scalars c1, c2,…cn, not all zero, such that the linear combination
c1ϕ1 + c2ϕ2 + … + cnϕn = 0
The simplest example is the set {i, j, k}. Let us write the linear combination
c1i + c2j + c3k = 0
This is possible only if c1 = c2 = c3 = 0, so i, j and k are linearly independent.
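Linear independence can be checked numerically by computing the rank of the matrix whose rows are the vectors. A minimal sketch for {i, j, k}:

```python
import numpy as np

# i, j, k as rows of a matrix. Rank 3 means the only solution of
# c1*i + c2*j + c3*k = 0 is c1 = c2 = c3 = 0, i.e. the set is independent.
vectors = np.array([[1.0, 0.0, 0.0],   # i
                    [0.0, 1.0, 0.0],   # j
                    [0.0, 0.0, 1.0]])  # k
rank = np.linalg.matrix_rank(vectors)
assert rank == 3  # three independent vectors
```

The same rank test works for any finite collection of column vectors, which makes it a convenient stand-in for the algebraic definition.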
Basis
The linearly independent vectors e1, e2, e3,…en form a basis of a linear vector space provided any vector ϕ of the linear vector space can be expressed as a linear combination of the vectors e1, e2,…en,
i.e.,
ϕ = c1e1 + c2e2 + c3e3 + … + cnen
For example, e1 = (1  0)T and e2 = (0  1)T are the basis vectors for the linear vector space of 2 × 1 column vectors. Any vector (a  b)T can be expressed as
(a  b)T = a e1 + b e2
Dimension
A linear vector space L is said to be of dimension n if the maximum number of linearly independent vectors in that space is n.
Continuing the example of the linear vector space of 2 × 1 column vectors, let us choose three vectors, say, v1 = (1  0)T, v2 = (0  1)T and v3 = (2  3)T, for which we can find a linear combination c1v1 + c2v2 + c3v3 = 0 with non-zero values for c1, c2 and c3.
i.e., one solution is c1 = c2 = c3 = 0, but there also exist non-trivial solutions, for instance (c1 = 6, c2 = 9, c3 = −3). So at most two of the three vectors are linearly independent.
Any two linearly independent 2 × 1 column vectors, such as (1  0)T and (0  1)T, form such a pair, and any vector of L can be expressed as a linear combination of the pair. Of any four 2 × 1 column vectors, only two are independent. So the linear vector space of 2 × 1 column vectors is a two-dimensional linear vector space.
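The dimension argument above can be verified numerically. The vectors v1 = (1, 0), v2 = (0, 1), v3 = (2, 3) are an illustrative choice consistent with the coefficients (6, 9, −3) quoted in the text:

```python
import numpy as np

# Three 2x1 vectors cannot all be independent in a two-dimensional space.
v1 = np.array([1.0, 0.0])
v2 = np.array([0.0, 1.0])
v3 = np.array([2.0, 3.0])

# The non-trivial combination 6*v1 + 9*v2 - 3*v3 vanishes.
combo = 6 * v1 + 9 * v2 - 3 * v3
assert np.allclose(combo, 0)

# The maximum number of independent vectors (the rank) is 2: the dimension.
rank = np.linalg.matrix_rank(np.column_stack([v1, v2, v3]))
assert rank == 2
```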
Complete Set
The set of n vectors {ϕ1, ϕ2,…ϕn} forms a complete set in a linear vector space L, provided (i) the vectors ϕ1, ϕ2,…ϕn are linearly independent, and (ii) n equals the dimension of L.
This implies that any arbitrary vector ψ of the linear vector space L can be expressed as a linear combination of ϕ1, ϕ2,…ϕn,
i.e.,
ψ = c1ϕ1 + c2ϕ2 + … + cnϕn
Now let us introduce additional structures on a linear vector space. Depending on the kind of additional structure imposed, a linear vector space can become a metric space, a Banach space, a Hilbert space and so on. From the point of view of quantum mechanics, we are interested in a particular kind of linear vector space called a Hilbert space. This requires the introduction of a new concept called the scalar product.
In the case of geometrical vectors, we have already seen that a scalar product assigns a number to a given pair of vectors A and B through the dot product A · B = AB cos θ. In the same way, the concept of a scalar product has to be extended to other linear vector spaces. We have to bear in mind that the term vector is no longer restricted to quantities with direction and magnitude only. Let us introduce the following notation to denote a scalar product: instead of writing ψ · ϕ for the scalar product between two vectors ψ and ϕ, we write it as (ψ, ϕ). In general, the scalar product (ψ, ϕ) is a complex number. Mathematically, it is said that the scalar product maps the pair of vectors ψ and ϕ into a complex number. That is, it assigns a complex number to a given pair of vectors ψ and ϕ. This is shown in Fig. 4.1.
Fig. 4.1 Scalar product as a mapping from a linear vector space to complex number
The scalar product between two elements ψ and ϕ has to satisfy the following properties:
(ψ, ϕ) = (ϕ, ψ)*
(ψ, c1ϕ1 + c2ϕ2) = c1(ψ, ϕ1) + c2(ψ, ϕ2)
(ψ, ψ) ≥ 0, with (ψ, ψ) = 0 if and only if ψ = 0
There are no special reasons to demand these particular conditions on a scalar product, except for the fact that in quantum
mechanics, only this kind of scalar product is useful.
So we have
(ϕ, cψ) = c(ϕ, ψ) and (cϕ, ψ) = c*(ϕ, ψ)
Norm of a Vector ψ
In analogy with geometrical vectors (where A · A = ∥A∥2), we define the norm of a vector ψ (we reiterate that ψ is no longer an object with magnitude and direction), denoted by ∥ψ∥, as
∥ψ∥ = (ψ, ψ)1/2
Orthogonality
Two vectors ϕ and ψ are said to be orthogonal if (ϕ, ψ) = 0.
Let ϕ(x) and ψ(x) be two functions which are continuous in the interval [a, b]. Then we can define the scalar product between ϕ(x) and ψ(x) as
(ϕ, ψ) = ∫_a^b ϕ*(x) ψ(x) dx
Note that this definition satisfies all the conditions given in (4.9).
Since the integrand ψ*(x)ψ(x) is always non-negative, if we demand (ψ, ψ) to be zero, the only possibility is ψ(x) = 0 for all x in the interval [a, b].
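The integral definition of the scalar product can be sketched numerically. This illustration uses sin x and sin 2x on [0, π] as an example pair (an assumption for the sketch, not taken from the text); they turn out to be orthogonal under this scalar product:

```python
import numpy as np

# (phi, psi) = integral_a^b phi*(x) psi(x) dx, approximated on a fine grid.
x = np.linspace(0.0, np.pi, 100001)
dx = x[1] - x[0]
phi = np.sin(x)        # phi(x) = sin x
psi = np.sin(2 * x)    # psi(x) = sin 2x

inner = np.sum(np.conj(phi) * psi) * dx     # (phi, psi)
norm_sq = np.sum(np.conj(phi) * phi) * dx   # (phi, phi) >= 0

assert abs(inner) < 1e-6                    # sin x and sin 2x are orthogonal on [0, pi]
assert abs(norm_sq - np.pi / 2) < 1e-4      # (phi, phi) = pi/2 > 0
```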
For 2 × 1 column vectors ϕ = (a  b)T and ψ = (c  d)T, we can define the scalar product (ϕ, ψ) as (ϕ, ψ) = ϕ†ψ = (a*  b*)(c  d)T = a*c + b*d.
It is very easy to verify that this definition satisfies all the three axioms given in (4.9).
Take the first condition:
(ϕ, ψ)* = (a*c + b*d)* = ac* + bd* = c*a + d*b = (ψ, ϕ)
Next, consider (ψ, ψ):
(ψ, ψ) = c*c + d*d = |c|2 + |d|2 ≥ 0
which is zero only when c = d = 0, i.e., only for the null vector.
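All three axioms can be checked numerically for this column-vector scalar product. The entries below are arbitrary illustrative values:

```python
import numpy as np

# (phi, psi) = phi† psi for 2x1 complex column vectors.
phi = np.array([1 + 2j, 3 - 1j])
psi = np.array([2 - 1j, 1j])

def inner(u, v):
    return np.vdot(u, v)  # np.vdot conjugates its first argument

# Axiom 1: (psi, phi) = (phi, psi)*
assert np.isclose(inner(psi, phi), np.conj(inner(phi, psi)))

# Axiom 2: linearity in the second argument
c1, c2 = 2 + 1j, -3j
assert np.isclose(inner(phi, c1 * psi + c2 * phi),
                  c1 * inner(phi, psi) + c2 * inner(phi, phi))

# Axiom 3: (psi, psi) is real and non-negative
assert inner(psi, psi).real > 0 and np.isclose(inner(psi, psi).imag, 0)
```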
A linear vector space with an inner product is called a Hilbert space. Note that this is not the way mathematicians define a Hilbert space. In mathematics, the Hilbert space is introduced in a more elaborate way. First, metric spaces and Cauchy sequences in metric spaces are considered. Then, the Hilbert space is obtained as the completion (another mathematical concept) of a metric space. This way of introducing Hilbert space is more meaningful in the context of infinite-dimensional linear vector spaces. Interested readers can refer to a simple exposition of these concepts by C. I. Tan.
In quantum mechanics, the stress is only on the inner product. Therefore, in quantum mechanics, both finite-dimensional and infinite-dimensional linear vector spaces with an inner product are considered as Hilbert spaces.
Within non-relativistic quantum mechanics, broadly, we concentrate on the following three Hilbert spaces:
Hilbert Space of Bound State Wave Functions or Square Integrable Wave Functions
The Hilbert space L2(R, dx) is the collection of square integrable wave functions, that is, wave functions which go to zero as x → ±∞. If ψ(x) is an element of L2(R, dx), then we have
∫_−∞^∞ ψ*(x) ψ(x) dx < ∞
The scalar product between two bound state wave functions ϕ(x) and ψ(x) is defined as
(ϕ, ψ) = ∫_−∞^∞ ϕ*(x) ψ(x) dx
This can be extended to other higher dimensional configuration spaces. In three-dimensional space, the bound state wave
function ψ(x, y, z) has to satisfy the boundary condition
ψ(x, y, z) → 0 as x → ±∞, y → ±∞, z → ±∞
In Cartesian coordinates, the scalar product is
(ϕ, ψ) = ∫∫∫ ϕ*(x, y, z) ψ(x, y, z) dx dy dz
Another Hilbert space used in quantum mechanics is the space of infinite sequences. The elements of this Hilbert space are ϕ = (c1, c2, c3,…cn,…) such that |c1|2 + |c2|2 + … + |cn|2 + … = 1.
4.3 OPERATORS
It is an obvious fact that any operator acting on a mathematical object like a wave function or a column vector produces a new function or column vector,
i.e.,
Aopψ = ϕ
For Aop to be a Hilbert space operator, both ψ and the resulting ϕ = Aopψ must belong to the Hilbert space,
i.e.,
∫_−∞^∞ |ψ(x)|2 dx < ∞
and
∫_−∞^∞ |Aopψ(x)|2 dx < ∞
This implies that within L2(R, dx) only a subset or subspace will obey these two conditions.
On the other hand, there are square integrable functions ψ(x) for which xψ(x) is not square integrable. Obviously xop is not defined for such functions. The wave functions on which xop can act form a subset of the entire Hilbert space. The subspace on which xop is defined is called the domain of xop. The domain of xop is smaller than the entire Hilbert space L2(R, dx).
So, for each operator Aop, we have to specify the domain D(A) of the operator also.
D(A) may be the entire space H itself or a proper subspace of H. Strictly speaking, a Hilbert space operator is a pair (Aop, D(A)).
If B is another operator, the corresponding Hilbert space operator is (Bop, D(B)). Two Hilbert space operators are the same if
Aopϕ = Bopϕ for all ϕ ∈ D(A) = D(B)
For every linear operator Aop, another operator called the adjoint of Aop, denoted by A†op, is defined by the relation
(ψ, Aopϕ) = (A†opψ, ϕ)
Note that A†op need not be the same as Aop itself. It is not even the complex conjugate of the operator Aop. One has to do a detailed calculation to find the adjoint of a given operator.
Example 4.4 Find the adjoint of Aop, where Aop acts on the elements of the Hilbert space of bound states.
Solution: Note that since ϕ and ψ are elements of the Hilbert space of bound states, the functions ϕ(x) and ψ(x) are such that
ϕ(x) → 0 as x →±∞
ψ(x) → 0 as x →±∞
But since ϕ and ψ are bound state wave functions, the first two terms are zero.
In the case of the Hilbert space of m × 1 column vectors, the linear vector space is finite-dimensional. In such cases, if A is the matrix operator, A† is the conventional Hermitian conjugate. This can be easily seen as follows:
Let ψ and ϕ be m × 1 column vectors, where the dagger symbol indicates the Hermitian conjugate of a matrix. Let us recall that the scalar product between two column vectors ψ and ϕ is ψ†ϕ. Then
(ψ, Aϕ) = ψ†Aϕ
(A†ψ, ϕ) = (A†ψ)†ϕ = ψ†(A†)†ϕ = ψ†Aϕ
∴ (ψ, Aϕ) = (A†ψ, ϕ).
Example 4.5 Prove that the momentum operator pop = −iħd/dx is a self-adjoint operator when it acts on the elements of Hilbert
space of bound state wave functions.
Solution:
Since ϕ(x) and ψ(x) are bound state wave functions, both ϕ(x) and ψ(x) go to zero as x → ±∞. Therefore, the first two terms are zero.
Therefore, pop = −iħd/dx is a self-adjoint operator.
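The self-adjointness of the momentum operator on decaying wave functions can be sketched with a finite-difference discretization. This is a numerical illustration, not the analytic proof above: on a grid where the wave functions vanish at the boundaries, the central-difference matrix D is antisymmetric, so p = −iħD is a Hermitian matrix:

```python
import numpy as np

# Central-difference matrix D (antisymmetric), with hbar = 1 for simplicity.
n, dx, hbar = 200, 0.1, 1.0
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * dx)
p = -1j * hbar * D

# p is Hermitian: p† = p
assert np.allclose(p, p.conj().T)

# Check (phi, p psi) = (p phi, psi) for two localized, bound-state-like functions.
x = dx * (np.arange(n) - n / 2)
phi = np.exp(-x**2)
psi = x * np.exp(-x**2)
lhs = np.vdot(phi, p @ psi)
rhs = np.vdot(p @ phi, psi)
assert np.isclose(lhs, rhs)
```

The boundary terms of the integration by parts correspond here to the fact that the Gaussians are negligible at the edges of the grid.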
Consider the eigenvalue equation
Aopϕa = aϕa
Here a is called the eigenvalue and ϕa is called the eigenvector or eigenfunction corresponding to the eigenvalue a. Note that the eigenvalues and the eigenfunctions of a given operator are not arbitrary. We have to solve the eigenvalue Equation (4.22) to get the eigenvalues and the corresponding eigenfunctions for a given operator.
The list of all possible eigenvalues is called the eigenvalue spectrum of the operator Aop. The eigenvalues may be discrete or continuous. If they are discrete, they can be enumerated as {a1, a2, a3,…}. If there exists more than one eigenfunction for the same eigenvalue, that eigenvalue is called degenerate, and the corresponding eigenfunctions are called degenerate eigenfunctions.
Let us make the following observation from the point of view of quantum mechanics. Though the domain of the operator is
restricted to a subspace of Hilbert space, its eigenfunctions need not be the elements of the Hilbert space. For instance, the
eigenfunctions of the momentum operator are not the elements of Hilbert space of square integrable functions.
The eigenvalues of a self-adjoint operator are real, and their eigenfunctions are orthogonal to each other. We shall prove this result by considering a restricted case, namely distinct eigenvalues and their eigenfunctions. However, the result is true irrespective of whether the eigenvalues are discrete or continuous, degenerate or non-degenerate.
Example 4.6 Prove that the eigenvalues of a self-adjoint operator are real and the eigenfunctions belonging to distinct eigenvalues
are orthogonal to each other.
Solution: Let a1, a2, a3,… be the set of eigenvalues of the operator Aop, and let the corresponding eigenfunctions be ϕ1, ϕ2,…ϕn, so that
Aopϕi = aiϕi and Aopϕj = ajϕj
Since Aop is self-adjoint, (ϕi, Aopϕj) = (Aopϕi, ϕj). The left hand side is aj(ϕi, ϕj), while the right hand side is (ϕj, Aopϕi)* = ai*(ϕi, ϕj). Therefore
(aj − ai*)(ϕi, ϕj) = 0
If the reader finds the above results difficult, they can be worked out in the Hilbert space L2(R, dx).
Case 1: i = j: The above equation becomes
(ai − ai*)(ϕi, ϕi) = 0
Since (ϕi, ϕi) ≠ 0,
ai − ai* = 0 or ai = ai*
i.e., the eigenvalues are real.
Case 2: ai ≠ aj
Since the eigenvalues are real, ai* = ai, and the Equation (4.23) becomes
(aj − ai)(ϕi, ϕj) = 0
Since aj ≠ ai, we must have (ϕi, ϕj) = 0, i.e., the eigenfunctions are orthogonal.
Orthonormal Eigenfunctions
In the previous example, note that (ϕi, ϕi) ≠ 0. This can be made equal to 1 by suitably re-defining ϕi. Therefore, without losing any generality, we can write
(ϕi, ϕi) = 1 and (ϕi, ϕj) = 0 if i ≠ j
A set of linearly independent functions {ϕi} is said to be an orthonormal set if it satisfies (4.24). Therefore, the set of eigenfunctions of a self-adjoint operator forms an orthonormal set.
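In the finite-dimensional case, both conclusions can be seen at once by diagonalizing a Hermitian matrix. A minimal sketch, using an arbitrarily chosen 2 × 2 Hermitian matrix as a stand-in for Aop:

```python
import numpy as np

# A Hermitian (self-adjoint) matrix as a finite-dimensional stand-in for A_op.
A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
assert np.allclose(A, A.conj().T)  # A is self-adjoint

# np.linalg.eigh is the eigensolver for Hermitian matrices.
eigvals, eigvecs = np.linalg.eigh(A)

# The eigenvalues are real.
assert np.allclose(eigvals.imag, 0)

# The eigenvectors form an orthonormal set: V† V = I.
assert np.allclose(eigvecs.conj().T @ eigvecs, np.eye(2))
```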
Complete Set
The eigenfunctions of a self-adjoint operator form a complete set. The meaning of this statement is that any arbitrary wave function
in the Hilbert space can be expressed as a linear combination of this set of orthonormal functions. Let ψ be any arbitrary wave function in the Hilbert space. Then we have
ψ = c1ϕ1 + c2ϕ2 + … = Σn cnϕn
This is very much similar to an arbitrary geometrical vector F = F1i + F2j + F3k. The component or coefficient can be obtained by taking the scalar product of F with a basis vector. For instance, F1 can be obtained by
F1 = i · F
Similarly, taking the scalar product of ψ with ϕm,
(ϕm, ψ) = Σn cn(ϕm, ϕn) = Σn cn δmn
In the last summation, only one term, namely the term with n = m, survives, i.e.,
Σn cn δmn = cm
∴ (ϕm, ψ) = cm
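The expansion and the formula cm = (ϕm, ψ) can be illustrated numerically, using the orthonormal eigenvectors of a Hermitian matrix as the basis (the particular matrix and ψ below are arbitrary illustrative choices):

```python
import numpy as np

# Orthonormal basis: eigenvectors of a Hermitian matrix.
A = np.array([[1.0, 2j],
              [-2j, 4.0]])
_, basis = np.linalg.eigh(A)       # columns are the orthonormal phi_n

psi = np.array([1 + 1j, 2 - 3j])   # an arbitrary vector to expand

# Coefficients c_m = (phi_m, psi) = phi_m† psi for each basis column.
c = basis.conj().T @ psi

# Reconstruct psi from its expansion: psi = sum_n c_n phi_n.
psi_rebuilt = basis @ c
assert np.allclose(psi_rebuilt, psi)
```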
The formal mathematical treatment of Hilbert spaces concentrates considerably on the validity of the expansion of ψ as a linear combination of the basis functions as given in (4.25). Though we have used the analogy of geometrical vectors to explain the meaning of the Equation (4.25), in the case of infinite-dimensional linear vector spaces, things are not so simple, and the validity of the Equation (4.25) requires elaborate scrutiny.
There are many definitions of Dirac delta functions. Here we choose the simplest one, though this may not be mathematically
satisfactory.
The Dirac delta function δ(x) is defined as
δ(x) = 0 for x ≠ 0
such that
∫_−∞^∞ δ(x) dx = 1
In the above integral, it is enough to have the upper and lower limits on either side of x = 0 so that x = 0 is included within these limits,
i.e.,
∫_−ɛ^ɛ δ(x) dx = 1
This delta function δ(x) is said to be centred at x = 0. We can have a delta function centred at any point x = a. In such a case, the Dirac delta function becomes
δ(x − a) = 0 for x ≠ a
such that
∫_−∞^∞ δ(x − a) dx = 1
We have a number of representations of the Dirac delta function. One of the simplest is to consider gɛ(x, a) in the limit ɛ → 0, where gɛ(x, a) is given by
gɛ(x, a) = 1/2ɛ for a − ɛ < x < a + ɛ, and 0 otherwise
In Fig. 4.3, gɛ(x, a) is sketched for different values of ɛ. It is a rectangular box of width 2ɛ and height 1/2ɛ, so that the area under the curve is always 1 for all values of ɛ. However, as ɛ goes to zero, the width becomes smaller and the height becomes larger. This is essentially the Dirac delta function, so we can define the Dirac delta function as
δ(x − a) = lim ɛ→0 gɛ(x, a)
Fig. 4.4
In the limit ɛ → 0, the function f(x) does not change much from f(a) in the integration range a − ɛ to a + ɛ. Therefore, f(x) can be replaced by f(a) throughout this range of integration,
i.e.,
∫_−∞^∞ f(x) δ(x − a) dx = f(a)
In fact, this result itself can be taken as the defining property of the Dirac delta function.
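The sifting property emerges numerically from the box representation gɛ(x, a). A minimal sketch, using f(x) = cos x and a = 1 as illustrative choices:

```python
import numpy as np

# Box representation: g_eps(x, a) has height 1/(2*eps) on (a - eps, a + eps),
# so its area is 1 for every eps. The integral of f(x)*g_eps(x, a) is then
# the average of f over the box, which tends to f(a) as eps -> 0.
def sift(f, a, eps, n=20001):
    x = np.linspace(a - eps, a + eps, n)
    dx = x[1] - x[0]
    return np.sum(f(x) / (2 * eps)) * dx

a = 1.0
approx = [sift(np.cos, a, eps) for eps in (0.5, 0.1, 0.001)]
# The values approach cos(1) as eps shrinks.
assert abs(approx[-1] - np.cos(a)) < 1e-4
```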
Example 4.8 By considering the Fourier transform and its inverse, show that ∫_−∞^∞ e^{ik(x − a)} dk = 2πδ(x − a). The Fourier transform and its inverse are given by
Solution:
The integral does not converge, so multiply each integrand by a convergence factor. Multiply the integrand in the first integral by e^{kɛ} (in this case, k varies from −∞ to 0, so e^{kɛ} = e^{−|k|ɛ}) and multiply the integrand in the second integral by e^{−kɛ}. In the end, take the ɛ → 0 limit.
1. δ(ax) = (1/|a|) δ(x)
2. δ(x2 − a2) = (1/2|a|) [δ(x + a) + δ(x − a)]
3. xδ′(x) = −δ(x)
4. δ(g(x)) = Σi δ(x − xi)/|g′(xi)|, where the xi are the simple zeros of g(x)
or
δ(ax) = (1/a) δ(x) for a > 0
δ(ax) = −(1/a) δ(x) for a < 0.
Since the integrand is zero in the ranges −∞ < x < −(a + ɛ), −a + ɛ < x < a − ɛ and a + ɛ < x < ∞, the integration over these ranges does not contribute to the integral.
But δ[−2a(x + a)] = (1/2|a|) δ(x + a).
Since the above result is true for any arbitrary function f(x) we have,
EXERCISES
4. What are the differences between axioms for a scalar product defined in a linear vector space of geometrical vectors and the linear vector space
in quantum mechanics?
6. Give the examples of Hilbert spaces that are used in non-relativistic quantum mechanics.
14. Express as a linear combination of basis vectors (i) e1 = and e2 = (ii) e1 = and e2 = (iii) e1 =
and e2 =
i.
ii.
16. Prove that xδ′(x) = −δ(x),where the prime denotes differentiation with respect to x.
REFERENCES
1. A. K. Ghatak, I. C. Goyal and S. J. Chua, 1995. Mathematical Physics. New Delhi: Macmillan India Limited.
2. K. F. Riley, M. D. Hobson and S. J. Bence, 1998. Mathematical Methods for Physics and Engineering. Cambridge University Press.
3. P. M. Mathews and K. Venkatesan, 1976. A Textbook of Quantum Mechanics, Tata McGraw-Hill Publishing Company Limited.
6. Francois Gieres, ‘Mathematical Surprises and Dirac’s Formalism in Quantum Mechanics’, Reports on Progress in Physics 63 (2000), 1893–1931.