Mathematics Review A

I. Matrices
A. Vectors
A vector is an object having n components,

x = (x_1, x_2, ..., x_n).

These components may represent, for example, the cartesian coordinates of a particle (in which case n = 3) or the cartesian coordinates of N particles (in which case n = 3N). Alternatively, the vector components may have nothing whatsoever to do with cartesian or other coordinate-system positions.
The numbers x_i are called the components of the vector x in the directions of some n elementary unit vectors:

$$ x = x_1 e_1 + x_2 e_2 + x_3 e_3 + \cdots + x_n e_n = x_1\begin{pmatrix}1\\0\\0\\\vdots\\0\end{pmatrix} + x_2\begin{pmatrix}0\\1\\0\\\vdots\\0\end{pmatrix} + x_3\begin{pmatrix}0\\0\\1\\\vdots\\0\end{pmatrix} + \cdots + x_n\begin{pmatrix}0\\0\\\vdots\\0\\1\end{pmatrix}. $$
The unit vectors e_i, whose exact definition, meaning, and interpretation depend on the particular application at hand, are called basis vectors and form the elements of a basis. They are particularly simple to work with because they are orthogonal; this means that their dot products vanish, e_i · e_j = 0, unless i = j. If i = j, then the scalar or dot product is unity (it is usually convenient, but not necessary, to use bases that are normalized so that e_i · e_i = 1). The shorthand way of representing this information is to write

$$ e_i \cdot e_j = \langle e_i | e_j \rangle = \delta_{ij}, $$

where δ_ij is called the Kronecker delta, defined by

δ_ij = 0 if i ≠ j, and δ_ij = 1 if i = j.
The above equation for x provides an example of expressing a vector as a linear combination of other vectors (in this case, the basis vectors). The vector x is expressed as a linear combination of the unit vectors e_i, and the numbers x_i are the coefficients in the linear combination. Another way of writing this, using summation notation, is

$$ x = \sum_{i=1}^{n} x_i\, e_i. $$

The idea of a linear combination is an important one that will be encountered again when we discuss how a matrix operator affects a linear combination of vectors.
B. Products of Matrices and Vectors
If M is an n × n matrix with elements M_ij (the first subscript specifies the row number and the second subscript specifies the column number), then the product M x = y is a vector whose components (when subscripts i, j, k, etc. appear, they can take any value 1, 2, ..., n unless otherwise specified) are defined as follows:

$$ y_k = \sum_{j=1}^{n} M_{kj}\, x_j. $$
The vector components y_k can be understood either as the components of a new vector y in the directions of the original basis e_i (i = 1, 2, ..., n) or as the components of the old vector x in the directions of new basis vectors.
There are always these two ways to view a matrix acting on a vector:
1. The operation can be thought of as transforming the vector into a different
vector. This view is called the active view (vector in a different place), and is the
interpretation we will use most often.
2. The operation can be thought of as expressing the same vector in terms of a
different coordinate system or basis. This view is called the passive view.
Some examples may help to clarify these perspectives:
For the matrix-vector product

$$ \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax \\ y \end{pmatrix}, $$

the active interpretation states that the vector is scaled in the x direction by an amount a. In the passive interpretation, the original vector is written in terms of the new basis vectors (a^{-1}, 0) and (0, 1):

$$ \begin{pmatrix} x \\ y \end{pmatrix} = ax\begin{pmatrix} a^{-1} \\ 0 \end{pmatrix} + y\begin{pmatrix} 0 \\ 1 \end{pmatrix}. $$
As another example, consider the following matrix multiplication:

$$ M x = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}. $$

In the active interpretation, the vector whose cartesian and polar representations are

$$ x = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} r\cos\phi \\ r\sin\phi \end{pmatrix} $$

is rotated by an angle θ to obtain

$$ M x = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} r\cos\phi \\ r\sin\phi \end{pmatrix} = \begin{pmatrix} r\cos\phi\cos\theta - r\sin\phi\sin\theta \\ r\cos\phi\sin\theta + r\sin\phi\cos\theta \end{pmatrix} = \begin{pmatrix} r\cos(\phi+\theta) \\ r\sin(\phi+\theta) \end{pmatrix}. $$
In the passive interpretation, the original vector x is expressed in terms of a new coordinate system with axes rotated by −θ, i.e., with new basis vectors

$$ \begin{pmatrix} \cos\theta \\ -\sin\theta \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \sin\theta \\ \cos\theta \end{pmatrix}: $$

$$ \begin{pmatrix} x \\ y \end{pmatrix} = (x\cos\theta - y\sin\theta)\begin{pmatrix} \cos\theta \\ -\sin\theta \end{pmatrix} + (x\sin\theta + y\cos\theta)\begin{pmatrix} \sin\theta \\ \cos\theta \end{pmatrix} = \begin{pmatrix} x(\cos^2\theta + \sin^2\theta) + y(\sin\theta\cos\theta - \sin\theta\cos\theta) \\ y(\cos^2\theta + \sin^2\theta) + x(\sin\theta\cos\theta - \sin\theta\cos\theta) \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}. $$
As a general rule, active transformations and passive transformations are inverses of each other: you can either do something to a vector or do the reverse to the coordinate system. The two pictures can be summarized by the following two equations:

(i.) M x = y states the active picture, and
(ii.) x = M^{-1} y states the passive picture.
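The active/passive distinction is easy to check numerically. The short sketch below (Python with numpy; the angle, the matrix name R, and the test vector are our own choices, not taken from the text) rotates a vector actively and then shows that applying R⁻¹ to the result recovers the original components, which is the passive statement x = M⁻¹y.

```python
import numpy as np

theta = 0.3                               # rotation angle in radians (arbitrary)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # the rotation matrix M of the example

x = np.array([2.0, 1.0])                  # an arbitrary vector

y = R @ x                                 # active picture: M x = y (vector rotated by +theta)
x_back = np.linalg.inv(R) @ y             # passive picture: x = M^{-1} y

print(y)                                  # the rotated components
print(np.allclose(x_back, x))             # True: the two pictures are inverses of each other
```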
C. Matrices as Linear Operators
Matrices are examples of linear operators, for which

M(a x + b y) = a M x + b M y,

which can easily be demonstrated by examining the components:

$$ [M(a x + b y)]_i = \sum_k M_{ik}\,(a x_k + b y_k) = a\sum_k M_{ik}\, x_k + b\sum_k M_{ik}\, y_k = a\,(M x)_i + b\,(M y)_i. $$
One can also see that this property holds for a linear combination of many vectors rather than
for the two considered above.
We can visualize how the action of a matrix on arbitrary vectors can be expressed if one knows its action on the elementary basis vectors. Given the expansion of x in the e_i,

$$ x = \sum_i x_i\, e_i, $$

one can write

$$ M x = \sum_i x_i\, M e_i. $$

Using the fact that all of the components of e_i are zero except one, (e_i)_i = 1, we see that

$$ (M e_i)_k = \sum_j M_{kj}\,(e_i)_j = M_{ki}. $$

This equation tells us that the i-th column of the matrix M contains the result of operating on the i-th unit vector e_i with the matrix. More specifically, the element M_ki in the k-th row and i-th column is the component of M e_i in the direction of the e_k unit vector. As a generalization, we can construct any matrix by first deciding how the matrix affects the elementary unit vectors and then placing the resulting vectors as the columns of the matrix.
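The statement that the i-th column of M is just M e_i can be illustrated with a few lines of numpy. This is only a sketch; the 3×3 matrix used here is an arbitrary example, not one from the text.

```python
import numpy as np

M = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 4.0],
              [5.0, 0.0, 6.0]])           # an arbitrary 3x3 matrix

n = M.shape[0]
columns = []
for i in range(n):
    e_i = np.zeros(n)
    e_i[i] = 1.0                          # elementary unit vector e_i
    columns.append(M @ e_i)               # M e_i is the i-th column of M

M_rebuilt = np.column_stack(columns)      # rebuild the matrix from its action on the basis
print(np.allclose(M_rebuilt, M))          # True
```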
II. Properties of General n x n (Square) Matrices
The following operations involving square matrices each of the same dimension are
useful to express in terms of the individual matrix elements:
1. Sum of matrices: A + B = C if A_ij + B_ij = C_ij.
2. Scalar multiplication: c M = N if c M_ij = N_ij.
3. Matrix multiplication: A B = C if Σ_k^n A_ik B_kj = C_ij.
4. Determinant of a matrix: the determinant is defined inductively;
for n = 1, |A| = det(A) = A_11;
for n > 1, |A| = det(A) = Σ_i^n A_ij det(a_ij) (−1)^(i+j), where j may be any of 1, 2, ..., n and a_ij is the minor matrix obtained by deleting the i-th row and j-th column.
5. There are certain special matrices that are important to know:
A. The zero matrix: 0_ij = 0 for i = 1, 2, ..., n and j = 1, 2, ..., n.
B. The identity matrix: I_ij = δ_ij. (Note that (I M)_ij = Σ_k δ_ik M_kj = M_ij, so I M = M I = M.)
6. Transpose of a matrix: (M^T)_ij = M_ji.
7. Complex conjugate of a matrix: (M*)_ij = M_ij*.
8. Adjoint of a matrix: (M†)_ij = M_ji* = (M^T)*_ij.
9. Inverse of a matrix: if N M = M N = I, then N = M^{-1}.
10. Trace (or character) of a matrix: Tr(M) = Σ_i M_ii (the sum of the diagonal elements).
III. Special Kinds of Square Matrices
If a matrix obeys certain conditions, we give it one of several special names. These
names include the following:
A. Diagonal matrix: D_ij = d_i δ_ij = d_j δ_ij.
B. Real matrix: M = M*, or M_ij = M_ij* (real elements).
C. Symmetric matrix: M = M^T, or M_ij = M_ji (symmetric about the main diagonal).
D. Hermitian matrix: M = M†, or M_ij = M_ji*.
E. Unitary matrix: M† = M^{-1}.
F. Real orthogonal matrix: M† = M^T = M^{-1}.
IV. Eigenvalues and Eigenvectors of a Square Matrix
An eigenvector of a matrix, M , is a vector such that
M v = λv
where λ is called the eigenvalue. An eigenvector thus has the property that when it is
multiplied by the matrix, the direction of the resultant vector is unchanged. The length,
however, is altered by a factor λ. Note that any multiple of an eigenvector is also an
eigenvector, which we demonstrate as follows:
M (av) = a M v = aλv = λ(av).
Hence, an eigenvector can be thought of as defining a direction in n-dimensional
space. The length (normalization) of the eigenvector is arbitrary; we are free to choose the
length to be anything we please. Usually we choose the length to be unity (because, in
quantum mechanics, our vectors usually represent some wavefunction that we wish to obey
a normalization condition).
The basic eigenvalue equation can be rewritten as

(M − λ I) v = 0,

or, in an element-by-element manner, as

(M_11 − λ)v_1 + M_12 v_2 + M_13 v_3 + ⋅ ⋅ ⋅ + M_1n v_n = 0
M_21 v_1 + (M_22 − λ)v_2 + M_23 v_3 + ⋅ ⋅ ⋅ + M_2n v_n = 0
⋅ ⋅ ⋅
M_n1 v_1 + M_n2 v_2 + M_n3 v_3 + ⋅ ⋅ ⋅ + (M_nn − λ)v_n = 0.
If you try to solve these n equations for all of the elements of the v vector (v_1, ..., v_n), you can eliminate one variable using one equation, a second variable using a second equation, etc., because the equations are linear. For example, you could solve for v_1 using the first equation and then substitute for v_1 in the second equation as you solve for v_2, etc. Then, when you come to the n-th equation, you would have n − 1 of the variables expressed in terms of the one remaining variable, v_n.

However, you find that you cannot use the remaining equation to solve for the value of v_n; the last equation is found to take the form

(C − λ) v_n = 0

once v_1, v_2, v_3, ..., v_{n−1} are expressed in terms of v_n. We should not really have expected to solve for v_n since, as we saw above, the length of the vector v is not determined by the set of eigenvalue equations. One finds that the only solution is v_n = 0, which then implies that all of the other v_k = 0 because they were expressed in terms of v_n, unless the eigenvalue λ is chosen to obey λ = C.
Upon analyzing what has gone into the C element, one finds that the v_k (k = 1, 2, 3, ..., n−1) were eliminated in favor of v_n by successively combining rows of the (M − λ I) matrix. Thus, (C − λ) can vanish if and only if the last row of (M − λ I) is a linear
combination of the other n-1 rows. A theorem dealing with determinants states that the
rows of a matrix are linearly dependent (i.e., one row is a linear combination of the rest) if
and only if the determinant of the matrix is zero. We can therefore make the eigenvalue
equation have a solution v by adjusting λ so the determinant of ( M -λ I ) vanishes.
A. Finding the Eigenvalues
In summary, to solve an eigenvalue equation, we first solve the determinantal equation

|M − λ I| = 0.

Using the definition of a determinant, one can see that expanding the determinant results in an n-th order polynomial in λ:

$$ a_n(M)\,\lambda^n + a_{n-1}(M)\,\lambda^{n-1} + \cdots + a_1(M)\,\lambda + a_0(M) = 0, $$

where the coefficients a_i(M) depend on the matrix elements of M. A theorem in algebra shows that such an equation always has n roots, some or all of which may be complex (e.g., a quadratic equation has 2 roots). Thus there are n different ways of adjusting the parameter λ so that the determinant vanishes. Each of these solutions is a candidate for use as λ in subsequently solving for the v vector's coefficients. That is, each λ value has its own vector v.
B. Finding the Eigenvectors
One can substitute each of these n λ values (λ_k, k = 1, 2, ..., n) back into the eigenvalue equation, one at a time, and solve for the n eigenvectors v(k). By using one of the λ_k values, the n-th equation is guaranteed to be satisfied, so one uses n − 1 of the equations to solve for n − 1 of the components in terms of the n-th one. The eigenvectors are then determined up to a multiplicative factor, which we fix by requiring normalization:

$$ \sum_i v_i^*(k)\, v_i(k) = \langle v(k)|v(k)\rangle = 1. $$

This expression defines the dot or inner product (Hermitian inner product) for vectors that can have complex-valued components. We use this definition so that the dot product of a complex-valued vector with itself is real.
In summary, we have shown how to solve the n equations

$$ \sum_j M_{ij}\, v_j(k) = \lambda_k\, v_i(k), \qquad k = 1, 2, ..., n, $$

for the eigenvalues λ_k of the matrix and the corresponding normalized eigenvectors v(k), k = 1, 2, ..., n. Now let us work an example chosen to illustrate the concepts we have learned as well as an additional complication, that of eigenvalue degeneracy.
C. Examples
Consider the following real symmetric matrix:
$$ M = \begin{pmatrix} 3 & 0 & 0 \\ 0 & \tfrac{5}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{5}{2} \end{pmatrix}. $$
The set of eigenvalue-eigenvector equations has non-trivial solutions (v(k) = 0 is the "trivial" solution) only if

|M − λ_k I| = 0.

In our example, this amounts to

$$ \begin{vmatrix} 3-\lambda_k & 0 & 0 \\ 0 & \tfrac{5}{2}-\lambda_k & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{5}{2}-\lambda_k \end{vmatrix} = 0, $$

or

$$ (3-\lambda_k)\begin{vmatrix} \tfrac{5}{2}-\lambda_k & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{5}{2}-\lambda_k \end{vmatrix} = 0, $$

or

$$ (3-\lambda_k)\left[\left(\tfrac{5}{2}-\lambda_k\right)^2 - \tfrac{1}{4}\right] = 0, $$

or

$$ (3-\lambda_k)\left[\lambda_k^2 - 5\lambda_k + 6\right] = 0 = (3-\lambda_k)(\lambda_k-3)(\lambda_k-2). $$
There are three real solutions to this cubic equation (why all the solutions are real in this case, for which the M matrix is real and symmetric, will be made clear later):

1) λ_1 = 3, 2) λ_2 = 3, and 3) λ_3 = 2.

Notice that the eigenvalue 3 appears twice; we say that the eigenvalue λ = 3 is doubly degenerate. λ = 2 is a non-degenerate eigenvalue.
The eigenvectors v(k) are found by plugging the above values for λ_k into the basic eigenvalue equation

M v(k) = λ_k v(k).

For the non-degenerate eigenvalue (λ_3 = 2), this gives:

$$ \begin{pmatrix} 3 & 0 & 0 \\ 0 & \tfrac{5}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{5}{2} \end{pmatrix}\begin{pmatrix} v_1(3) \\ v_2(3) \\ v_3(3) \end{pmatrix} = 2\begin{pmatrix} v_1(3) \\ v_2(3) \\ v_3(3) \end{pmatrix}. $$
The following algebraic steps can then be followed:

i. 3v_1(3) = 2v_1(3), which implies that v_1(3) = 0,
ii. (5/2)v_2(3) + (1/2)v_3(3) = 2v_2(3), and
iii. (1/2)v_2(3) + (5/2)v_3(3) = 2v_3(3).

The last two equations cannot be solved for both v_2(3) and v_3(3). To see the trouble, multiply equation iii by 5 and subtract it from equation ii to obtain

−12 v_3(3) = 2v_2(3) − 10v_3(3),

which implies that v_3(3) = −v_2(3). Now substitute this result into equation ii to obtain

(5/2)v_2(3) + (1/2)(−v_2(3)) = 2v_2(3),

or

2v_2(3) = 2v_2(3).

This is a trivial identity; it does not allow us to solve for v_2(3). Hence, for this non-degenerate root, one is able to solve for all of the v_j elements in terms of one element that remains undetermined.
As in all matrix eigenvalue problems, we are able to express (n-1) elements of the
eigenvector v(k) in terms of one remaining element. However, we can never solve for this
one last element. So, for convenience, we impose one more constraint (equation to be
obeyed) which allows us to solve for the remaining element of v(k). We require that the
eigenvectors be normalized:

$$ \langle v(k)|v(k)\rangle = \sum_a v_a^*(k)\, v_a(k) = 1. $$
In our example, this means that

v_1^2(3) + v_2^2(3) + v_3^2(3) = 1,

or

0^2 + v_2^2(3) + (−v_2(3))^2 = 1,

which implies that v_2(3) = ±1/√2. So v_3(3) = ∓1/√2, and finally the vector is given by

$$ v(3) = \pm\begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} \end{pmatrix}. $$
Note that even after requiring normalization, there is still an indeterminacy in the sign of v(3). The eigenvalue equation, as we recall, only specifies a direction in space. The sense, or sign, is not determined; we can choose either sign we prefer.
Finding the first eigenvector was not too difficult. The degenerate eigenvectors are more difficult to find. For λ_1 = λ_2 = 3,

$$ \begin{pmatrix} 3 & 0 & 0 \\ 0 & \tfrac{5}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{5}{2} \end{pmatrix}\begin{pmatrix} v_1(1) \\ v_2(1) \\ v_3(1) \end{pmatrix} = 3\begin{pmatrix} v_1(1) \\ v_2(1) \\ v_3(1) \end{pmatrix}. $$
Again, the algebraic equations can be written as follows:

i. 3v_1(1) = 3v_1(1); this tells us nothing!
ii. (5/2)v_2(1) + (1/2)v_3(1) = 3v_2(1), and
iii. (1/2)v_2(1) + (5/2)v_3(1) = 3v_3(1).

If we multiply equation iii by 5 and subtract it from equation ii, we obtain

−12v_3(1) = −15v_3(1) + 3v_2(1),

or

3v_3(1) = 3v_2(1),

which implies that v_3(1) = v_2(1). So far, all we have is v_3(1) = v_2(1); we don't know v_1(1), nor do we know either v_3(1) or v_2(1), and we have used all three equations.
Normalization provides one more equation, v_1^2(1) + v_2^2(1) + (v_2(1))^2 = 1, but we are still in a situation with more unknowns (2) than equations (1).
One might think that by restricting our eigenvectors to be orthogonal as well as
normalized, we might overcome this difficulty (however, such is not the case, as we now
show).
For our vectors, the constraint that the non-degenerate vector v(3) be orthogonal to the one we are trying to find, v(1), amounts to

⟨v(3)|v(1)⟩ = 0,

v_1(3)* v_1(1) + v_2(3)* v_2(1) + v_3(3)* v_3(1) = 0,

$$ 0\cdot v_1(1) \pm \left( \tfrac{1}{\sqrt{2}}\, v_2(1) - \tfrac{1}{\sqrt{2}}\, v_2(1) \right) = 0. $$

We see that v(3) and v(1) are already orthogonal regardless of how v_2(1) and v_3(1) turn out. This is shown below to be guaranteed because v(1) and v(3) have different eigenvalues (i.e., two eigenvectors belonging to different eigenvalues of any symmetric or hermitian matrix must be orthogonal). Hence, this first attempt at finding additional equations to use has failed.
What about the two degenerate eigenvectors v(1) and v(2)? Are they also orthonormal? So far, we know that these two eigenvectors have the structure

$$ v(1) = \begin{pmatrix} v_1(1) \\ v_2(1) \\ v_3(1) \end{pmatrix}, \quad\text{with } 1 = v_1^2(1) + 2v_2^2(1). $$

If we go through all of the above steps for v(2) with λ_2 = 3, we will find that this vector obeys exactly the same set of equations:

$$ v(2) = \begin{pmatrix} v_1(2) \\ v_2(2) \\ v_3(2) \end{pmatrix}, \quad\text{with } 1 = v_1^2(2) + 2v_2^2(2). $$
We showed above that ⟨v(1)|v(3)⟩ = 0, and it is easy to show that ⟨v(2)|v(3)⟩ = 0 because the elements of v(2), thus far, obey the same equations as those of v(1).

If we also wish to make the two degenerate eigenvectors orthogonal to one another,

⟨v(1)|v(2)⟩ = 0,

then we obtain additional relationships among our as-yet-undetermined vector amplitudes. In particular, we obtain

v_1(1)v_1(2) + v_2(1)v_2(2) + v_3(1)v_3(2) = 0,

or

v_1(1)v_1(2) + 2v_2(1)v_2(2) = 0.
Although progress has been made, we still have four unknowns, v_1(1), v_2(1), v_1(2), v_2(2), and only three equations:

0 = v_1(1) v_1(2) + 2v_2(1) v_2(2),
1 = v_1(1) v_1(1) + 2v_2(1) v_2(1), and
1 = v_1(2) v_1(2) + 2v_2(2) v_2(2).
It appears as though we are stuck again. We are; but for good reasons. We are
trying to find two vectors v(1) and v(2) that are orthonormal and are eigenvectors of M
having eigenvalue equal to 3. Suppose that we do find two such vectors. Because M is a
linear operator, any two vectors generated by taking linear combinations of these two
vectors would also be eigenvectors of M . There is a degree of freedom, that of
recombining v(1) and v(2), which cannot be determined by insisting that the two vectors be eigenvectors. Thus, in this degenerate-eigenvalue case our requirements do not give
a unique pair of eigenvectors. They just tell us the two-dimensional space in which the
acceptable eigenvectors lie (this difficulty does not arise in nondegenerate cases because
one-dimensional spaces have no flexibility other than a sign.).
So to find an acceptable pair of vectors, we are free to make an additional choice.
For example, we can choose one of the four unknown components of v(1) and v(2) equal
to zero. Let us make the choice v_1(1) = 0. Then the above equations can be solved for the other elements of v(1) to give v_2(1) = ±1/√2 = v_3(1). The orthogonality between v(1) and v(2) then gives 0 = 2(±1/√2) v_2(2), which implies that v_2(2) = v_3(2) = 0; the remaining equation involving v(2) then gives v_1(2) = ±1.

In summary, we have now found a specific solution once the choice v_1(1) = 0 is made:

$$ v(1) = \pm\begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{pmatrix}, \qquad v(2) = \pm\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad v(3) = \pm\begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} \end{pmatrix}. $$

Other choices for v_1(1) will yield different specific solutions.
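Readers who want to verify this example numerically can do so with numpy. The sketch below uses numpy.linalg.eigh (appropriate for real symmetric or Hermitian matrices) on the same 3×3 matrix; note that for the doubly degenerate eigenvalue 3 the routine returns one particular orthonormal pair spanning the degenerate subspace, which need not coincide with the pair chosen by hand above.

```python
import numpy as np

M = np.array([[3.0, 0.0, 0.0],
              [0.0, 2.5, 0.5],
              [0.0, 0.5, 2.5]])           # the example matrix (5/2 = 2.5, 1/2 = 0.5)

vals, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order; columns are eigenvectors
print(vals)                               # [2. 3. 3.]  -> lambda = 3 is doubly degenerate

for k in range(3):                        # each column satisfies M v = lambda v
    print(np.allclose(M @ vecs[:, k], vals[k] * vecs[:, k]))

print(np.allclose(vecs.T @ vecs, np.eye(3)))   # the returned eigenvectors are orthonormal
```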
V. Properties of Eigenvalues and Eigenvectors of Hermitian Matrices
The above example illustrates many of the properties of the matrices that we will
most commonly encounter in quantum mechanics. It is important to examine these
properties in more detail and to learn about other characteristics that Hermitian matrices
have.
A. Outer product
Given any vector v(i), we can form a square matrix, denoted |v(i)⟩⟨v(i)|, whose elements are defined as follows:
$$ |v(i)\rangle\langle v(i)| = \begin{pmatrix} v_1^*(i)v_1(i) & v_1^*(i)v_2(i) & \cdots & v_1^*(i)v_n(i) \\ v_2^*(i)v_1(i) & v_2^*(i)v_2(i) & \cdots & v_2^*(i)v_n(i) \\ \vdots & \vdots & \ddots & \vdots \\ v_n^*(i)v_1(i) & v_n^*(i)v_2(i) & \cdots & v_n^*(i)v_n(i) \end{pmatrix}. $$
We can use this matrix to project onto the component of a vector in the v(i) direction. For the example we have been considering, if we form the projector onto the v(1) vector, we obtain

$$ \begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} 0 & \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}; $$

for v(2), we get

$$ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}; $$

and for v(3) we find

$$ \begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} 0 & \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \tfrac{1}{2} & -\tfrac{1}{2} \\ 0 & -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}. $$
These three projection matrices play important roles in what follows.
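Numerically, each of these projectors is just an outer product. The following sketch (numpy; the variable names are ours) forms |v(1)⟩⟨v(1)| for the example and checks that applying it to an arbitrary vector returns v(1) times the component of that vector along v(1).

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)
v1 = np.array([0.0,  s,  s])              # eigenvector v(1) of the example
f  = np.array([7.0, -9.0, 12.0])          # an arbitrary vector

P1 = np.outer(v1, v1.conj())              # the projector |v(1)><v(1)|
print(P1)                                 # [[0, 0, 0], [0, 0.5, 0.5], [0, 0.5, 0.5]]

# projecting f gives v(1) multiplied by the component of f along v(1)
print(np.allclose(P1 @ f, v1 * np.dot(v1.conj(), f)))    # True
```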
B. Completeness Relation or Resolution of the Identity
The set of eigenvectors of any Hermitian matrix form a complete set over the space
they span in the sense that the sum of the projection matrices constructed from these
eigenvectors gives an exact representation of the identity matrix:

$$ \sum_i |v(i)\rangle\langle v(i)| = I. $$
For the specific matrix we have been using as an example, this relation reads as follows:

$$ \begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} 0 & \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} 0 & \tfrac{1}{\sqrt{2}} & -\tfrac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. $$
Physically, this means that when you project onto the components of a vector in these
three directions, you don't lose any of the vector. This happens because our vectors are
orthogonal and complete. The completeness relation means that any vector in this three-
dimensional space can be written in terms of v(1), v(2), and v(3) (i.e., we can use
v(1), v(2), v(3) as a new set of basis vectors instead of e_1, e_2, e_3).
Let us consider an example in which the following vector is expanded or written in terms of our three eigenvectors:

$$ f = \begin{pmatrix} 7 \\ -9 \\ 12 \end{pmatrix} = a_1 v(1) + a_2 v(2) + a_3 v(3). $$

The task at hand is to determine the expansion coefficients a_i. These coefficients are the projections of the given f vector onto each of the v(i) directions:

⟨v(1)|f⟩ = a_1,

since

⟨v(1)|v(2)⟩ = ⟨v(1)|v(3)⟩ = 0.

Using our three v(i) vectors and the above f vector, the following three expansion coefficients are obtained:

$$ a_1 = \begin{pmatrix} 0 & \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} 7 \\ -9 \\ 12 \end{pmatrix} = \frac{3}{\sqrt{2}}, \qquad \langle v(2)|f\rangle = 7, \qquad \langle v(3)|f\rangle = -\frac{21}{\sqrt{2}}. $$
Therefore, f can be written as:

$$ f = \begin{pmatrix} 7 \\ -9 \\ 12 \end{pmatrix} = \frac{3}{\sqrt{2}}\begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{pmatrix} + 7\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} - \frac{21}{\sqrt{2}}\begin{pmatrix} 0 \\ \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} \end{pmatrix}. $$
This works for any vector f, and we could write the process in general in terms of the resolution of the identity as

$$ f = I f = \sum_k |v(k)\rangle\langle v(k)|f\rangle = \sum_k |v(k)\rangle\, a_k. $$

This is how we will most commonly make use of the completeness relation as it pertains to the eigenvectors of Hermitian matrices.
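A minimal numerical check of this expansion (a numpy sketch; the eigenvectors and the vector f are those of the running example): the coefficients a_k = ⟨v(k)|f⟩ reproduce f exactly when summed back against the eigenvectors.

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)
V = np.column_stack([[0.0,  s,  s],       # v(1)
                     [1.0, 0.0, 0.0],     # v(2)
                     [0.0,  s, -s]])      # v(3)
f = np.array([7.0, -9.0, 12.0])

a = V.T.conj() @ f                        # a_k = <v(k)|f>
print(a)                                  # approximately [2.121, 7.0, -14.849] = [3/sqrt(2), 7, -21/sqrt(2)]
print(np.allclose(V @ a, f))              # True: f = sum_k a_k v(k)
```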
C. Spectral Resolution of M
It turns out that not only the identity matrix I but also the matrix M itself can be
expressed in terms of the eigenvalues and eigenvectors. In the so-called spectral
representation of M , we have
$$ M = \sum_k \lambda_k\, |v(k)\rangle\langle v(k)|. $$
In the example we have been using, the three terms in this sum read

$$ 3\begin{pmatrix} 0 & 0 & 0 \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} + 3\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + 2\begin{pmatrix} 0 & 0 & 0 \\ 0 & \tfrac{1}{2} & -\tfrac{1}{2} \\ 0 & -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} = \begin{pmatrix} 3 & 0 & 0 \\ 0 & \tfrac{5}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{5}{2} \end{pmatrix} = M. $$
This means that a matrix is totally determined if we know its eigenvalues and eigenvectors.
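The spectral resolution is easy to confirm numerically once the eigenvalues and eigenvectors are in hand. The sketch below (numpy) rebuilds the example matrix as the eigenvalue-weighted sum of projectors.

```python
import numpy as np

M = np.array([[3.0, 0.0, 0.0],
              [0.0, 2.5, 0.5],
              [0.0, 0.5, 2.5]])

vals, vecs = np.linalg.eigh(M)

# M = sum_k lambda_k |v(k)><v(k)|
M_rebuilt = sum(vals[k] * np.outer(vecs[:, k], vecs[:, k].conj()) for k in range(3))
print(np.allclose(M_rebuilt, M))          # True
```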
D. Eigenvalues of Hermitian Matrices are Real Numbers
A matrix can be expressed in terms of any complete set of vectors. For n × n matrices, a complete set is any n linearly independent vectors. For a set of vectors |k⟩, k = 1, 2, ..., n, the matrix elements of M are denoted M_jk or ⟨j|M|k⟩. If the matrix is Hermitian, then

⟨j|M|k⟩ = ⟨k|M|j⟩*.

If the vectors |k⟩ are eigenvectors of M, that is, if M|v(k)⟩ = M|k⟩ = λ_k|k⟩, then the eigenvalues are real. This can be shown as follows:

⟨k|M|k⟩ = λ_k ⟨k|k⟩ = ⟨k|M|k⟩* = λ_k* ⟨k|k⟩,

so λ_k = λ_k*. This is a very important result because it forms the basis of the use of Hermitian operators in quantum mechanics; such operators are used because experiments yield real results, so to connect with experimental reality, only Hermitian operators may appear.
E. Nondegenerate Eigenvectors of Hermitian Matrices are Orthogonal
If two eigenvalues are different, λ_k ≠ λ_j, then

⟨k|M|j⟩ = λ_j ⟨k|j⟩ = ⟨j|M|k⟩* = λ_k* ⟨j|k⟩* = λ_k ⟨k|j⟩,

which implies that (λ_k − λ_j)⟨k|j⟩ = 0. Since, by assumption, λ_k ≠ λ_j, it must be that ⟨k|j⟩ = 0. In other words, the eigenvectors are orthogonal. We saw this earlier in our example when we "discovered" that v(3) was automatically orthogonal to v(1) and to v(2).
If one has degenerate eigenvalues, λ_k = λ_j for some k and j, then the corresponding eigenvectors are not automatically orthogonal to one another (they are orthogonal to other eigenvectors), but the degenerate eigenvectors can always be chosen to be orthogonal. We also encountered this in our earlier example.

In all cases, then, one can find n orthonormal eigenvectors (remember we required ⟨k|k⟩ = 1 as an additional condition so that our amplitudes could be interpreted in terms of probabilities). Since any vector in an n-dimensional space can be expressed as a linear combination of n orthonormal vectors, the eigenvectors form a complete basis set. This is why the so-called resolution of the identity, in which the unit matrix can be expressed in terms of the n eigenvectors of M, holds for all Hermitian matrices.
F. Diagonalizing a Matrix using its Eigenvectors
The eigenvectors of M can be used to form a matrix that diagonalizes M. This matrix S is defined such that the k-th column of S contains the elements of v(k):

$$ S = \begin{pmatrix} v_1(1) & v_1(2) & \cdots & v_1(n) \\ v_2(1) & v_2(2) & \cdots & v_2(n) \\ v_3(1) & v_3(2) & \cdots & v_3(n) \\ \vdots & \vdots & & \vdots \\ v_n(1) & v_n(2) & \cdots & v_n(n) \end{pmatrix}. $$
Then, using the eigenvalue equation, one can write the action of M on S as

$$ \sum_j M_{ij}\, S_{jk} = \sum_j M_{ij}\, v_j(k) = \lambda_k\, v_i(k) = S_{ik}\,\lambda_k. $$

Now consider another matrix Λ, which is a diagonal matrix with diagonal elements λ_k, i.e.,

Λ_ik = δ_ik λ_k.

One can easily show using the δ_jk matrix that

$$ \sum_j S_{ij}\,\delta_{jk}\,\lambda_k = S_{ik}\,\lambda_k, $$

since the only non-zero term in the sum is the one in which j = k. Thus, comparing with the previous result for the action of M on S,

$$ \sum_j M_{ij}\, S_{jk} = \sum_j S_{ij}\,\delta_{jk}\,\lambda_k. $$
These are just the i,kth matrix elements of the matrix equation
M S = S Λ .
Let us assume that an inverse S^{-1} of S exists. Then multiply the above equation on the left by S^{-1} to obtain

S^{-1} M S = S^{-1} S Λ = I Λ = Λ.
This identity illustrates a so-called similarity transform of M using S . Since Λ is diagonal,
we say that the similarity transform S diagonalizes M . Notice this would still work if we
had used the eigenvectors v(k) in a different order. We would just get Λ with the diagonal
elements in a different order. Note also that the eigenvalues of M are the same as those of Λ, since the eigenvalues of Λ are just the diagonal elements λ_k with eigenvectors e_k (the elementary unit vectors):

$$ \Lambda\, e_k = \lambda_k\, e_k, \quad\text{or}\quad (\Lambda e_k)_i = \sum_j \Lambda_{ij}\,(e_k)_j = \sum_j \lambda_i\,\delta_{ij}\,\delta_{kj} = \lambda_k\,\delta_{ki} = \lambda_k\,(e_k)_i. $$
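Numerically, the similarity transform is just a pair of matrix multiplications. In the sketch below (numpy; again using the running example), S is built from the eigenvector columns, and S⁻¹MS is verified to be the diagonal matrix of eigenvalues; because M is real symmetric, S is orthogonal and S⁻¹ is simply Sᵀ. The last line anticipates the trace theorem proven in the next subsection.

```python
import numpy as np

M = np.array([[3.0, 0.0, 0.0],
              [0.0, 2.5, 0.5],
              [0.0, 0.5, 2.5]])

vals, S = np.linalg.eigh(M)               # the columns of S are the eigenvectors v(k)
Lam = S.T @ M @ S                         # S^{-1} = S^T because S is real orthogonal here

print(np.allclose(Lam, np.diag(vals)))        # True: S^{-1} M S = Lambda
print(np.isclose(np.trace(M), vals.sum()))    # True: Tr(M) equals the sum of the eigenvalues
```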
G. The Trace of a Matrix is the Sum of its Eigenvalues
Based on the above similarity transform, we can now show that the trace of a
matrix (i.e., the sum of its diagonal elements) is independent of the representation in which
the matrix is formed, and, in particular, the trace is equal to the sum of the eigenvalues of
the matrix. The proof of this theorem proceeds as follows:

$$ \sum_k \lambda_k = \mathrm{Tr}(\Lambda) = \sum_k (S^{-1} M S)_{kk} = \sum_{kij} S^{-1}_{ki}\, M_{ij}\, S_{jk} = \sum_{ij} M_{ij} \sum_k S_{jk}\, S^{-1}_{ki} = \sum_{ij} M_{ij}\,\delta_{ji} = \sum_i M_{ii} = \mathrm{Tr}(M). $$
H. The Determinant of a Matrix is the Product of its Eigenvalues
This theorem, det(M) = det(Λ), can be proven by using the theorem that det(AB) = det(A) det(B) (itself provable inductively within the expansion by minors of the determinant):

$$ \lambda_1\lambda_2\lambda_3\cdots\lambda_n = \det(\Lambda) = \det(S^{-1}MS) = \det(S^{-1})\det(M)\det(S) = \det(M)\det(S^{-1})\det(S) = \det(M)\det(S^{-1}S) = \det(M)\det(I) = \det(M). $$
I. Invariance of Matrix Identities to Similarity Transforms
We will see later that performing a similarity transform expresses a matrix in a
different basis (coordinate system). If, for any matrix A, we call S^{-1} A S = A', we can show that performing a similarity transform on a matrix equation leaves the form of the equation unchanged. For example, if

A + B C = P Q R,

performing a similarity transform on both sides, and remembering that S^{-1} S = I, one finds

S^{-1}(A + B C)S = S^{-1} P Q R S,

S^{-1} A S + S^{-1} B (S S^{-1}) C S = S^{-1} P (S S^{-1}) Q (S S^{-1}) R S,

or

A' + B' C' = P' Q' R'.
J. How to Construct S^{-1}
To form S^{-1}, recall the eigenvalue equation we began with:

M v(k) = λ_k v(k).

If we multiply on the left by S^{-1} and insert S S^{-1} = I, we obtain

S^{-1} M (S S^{-1}) v(k) = λ_k S^{-1} v(k),

or

Λ v(k)' = λ_k v(k)'.

According to the passive picture of matrix operations, v(k)' = S^{-1} v(k) is the old vector v expressed in terms of new basis functions, which are given by the columns of S. But the rows of S^{-1} and the columns of S have to satisfy (using the definition of an inverse)

$$ \sum_j S^{-1}_{ij}\, S_{jk} = \delta_{ik}. $$
However, the columns of S are just the eigenvectors v(k), so the rows of S^{-1} must also have something to do with v(k).
Now consider the matrix S†. Its elements are (S†)_kj = S_jk*, so (S†)_kj = v_j*(k), as a result of which we can write

$$ (S^\dagger S)_{ik} = \sum_j S^\dagger_{ij}\, S_{jk} = \sum_j v_j^*(i)\, v_j(k) = \langle v(i)|v(k)\rangle = \delta_{ik}. $$

We have therefore found S^{-1}, and it is S†. We have also proven the important theorem stating that any Hermitian matrix M can be diagonalized by a matrix S which is unitary (S† = S^{-1}) and whose columns are the eigenvectors of M.
VI. Finding Inverses, Square Roots, and Other Functions of a Matrix Using its
Eigenvectors and Eigenvalues
Since a matrix is defined by how it affects basis vectors upon which it acts, and
since a matrix only changes the lengths, not the directions of its eigenvectors, we can form
other matrices that have the same eigenvectors as the original matrix but have different
eigenvalues in a straightforward manner. For example, if we want to reverse the effect of
M , we create another matrix that reduces the lengths of the eigenvectors by the same
amount that M lengthened them. This produces the inverse matrix. We illustrate this by
using the same example matrix that we used earlier.
$$ M^{-1} = \sum_k \lambda_k^{-1}\, |v(k)\rangle\langle v(k)| \qquad \text{(this is defined only if all } \lambda_k \neq 0\text{)}, $$

$$ M^{-1} = \frac{1}{3}\begin{pmatrix} 0 & 0 & 0 \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} + \frac{1}{3}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} 0 & 0 & 0 \\ 0 & \tfrac{1}{2} & -\tfrac{1}{2} \\ 0 & -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} = \begin{pmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & \tfrac{10}{24} & -\tfrac{2}{24} \\ 0 & -\tfrac{2}{24} & \tfrac{10}{24} \end{pmatrix}. $$
To show that this matrix obeys M^{-1} M = I, we simply multiply the two matrices together:

$$ \begin{pmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & \tfrac{10}{24} & -\tfrac{2}{24} \\ 0 & -\tfrac{2}{24} & \tfrac{10}{24} \end{pmatrix}\begin{pmatrix} 3 & 0 & 0 \\ 0 & \tfrac{5}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{5}{2} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. $$
An extension of this result is that a whole family of matrices, each member related to the original M matrix, can be formed by combining the eigenvalues and eigenvectors as follows:

$$ f(M) = \sum_i |v(i)\rangle\langle v(i)|\, f(\lambda_i), $$

where f is any function for which f(λ_i) is defined. Examples include exp(M), sin(M), M^{1/2}, and (I − M)^{-1}. The matrices so constructed (e.g., exp(M) = Σ_i exp(λ_i)|v(i)⟩⟨v(i)|) are proper representations of the functions of M in the sense that they give results equal to those of the function of M when acting on any eigenvector of M; because the eigenvectors form complete sets, this implies that they give identical results when acting on any vector. This equivalence can be shown most easily by carrying out a power series expansion of the function of M (e.g., of exp(M)) and allowing each term in the series to act on an eigenvector.
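As a concrete illustration, here is a rough numpy sketch of f(M) = Σ_i f(λ_i)|v(i)⟩⟨v(i)| for the running example. The square root built this way squares back to M, and the same recipe with f(λ) = 1/λ reproduces the inverse computed above.

```python
import numpy as np

M = np.array([[3.0, 0.0, 0.0],
              [0.0, 2.5, 0.5],
              [0.0, 0.5, 2.5]])

vals, vecs = np.linalg.eigh(M)

def f_of_M(f):
    # build f(M) = sum_i f(lambda_i) |v(i)><v(i)|
    return sum(f(vals[k]) * np.outer(vecs[:, k], vecs[:, k]) for k in range(len(vals)))

sqrt_M = f_of_M(np.sqrt)
inv_M  = f_of_M(lambda lam: 1.0 / lam)

print(np.allclose(sqrt_M @ sqrt_M, M))        # True: (M^(1/2))^2 = M
print(np.allclose(inv_M @ M, np.eye(3)))      # True: M^(-1) M = I
```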
VII. Projectors Revisited
In hindsight, the relationships developed above for expressing functions of the original matrix in terms of its eigenvectors and eigenvalues are not unexpected, because each of the matrices is given in terms of the so-called projector matrices |v(i)⟩⟨v(i)|. As we saw, the matrix Σ_i λ_i |v(i)⟩⟨v(i)| behaves just like M, as demonstrated below:

$$ \sum_i \lambda_i\, |v(i)\rangle\langle v(i)|\,|v(j)\rangle = \sum_i \lambda_i\, |v(i)\rangle\,\delta_{ij} = \lambda_j\, |v(j)\rangle. $$

The matrix |v(i)⟩⟨v(i)| is called a projector onto the space of the eigenvector |v(i)⟩ because its action on any vector |f⟩ within the class of admissible vectors (2- or 4-dimensional vectors are in a different class than 3- or 1- or 7- or 96-dimensional vectors),

$$ \big(|v(i)\rangle\langle v(i)|\big)\,|f\rangle = |v(i)\rangle\,\big(\langle v(i)|f\rangle\big), $$

gives |v(i)⟩ multiplied by the coefficient of |f⟩ along |v(i)⟩.

This construction, in which a vector is used to form a matrix |v(i)⟩⟨v(i)|, is called an "outer product". The projection matrix thus formed can be shown to be idempotent, which means that the result of applying it twice (or more times) is identical to the result of applying it once: P P = P. This property is straightforward to demonstrate. Consider the projector P_i ≡ |v(i)⟩⟨v(i)|, which, when applied twice, yields

$$ |v(i)\rangle\langle v(i)|\,|v(i)\rangle\langle v(i)| = |v(i)\rangle\,\delta_{ii}\,\langle v(i)| = |v(i)\rangle\langle v(i)|. $$

Sets of projector matrices, each formed from a member of an orthonormal vector set, are mutually orthogonal (i.e., P_i P_j = 0 if i ≠ j), which can be shown as follows:

$$ |v(i)\rangle\langle v(i)|\,|v(j)\rangle\langle v(j)| = |v(i)\rangle\,\delta_{ij}\,\langle v(j)| = |v(i)\rangle\, 0\,\langle v(j)| = 0 \qquad (i \neq j). $$
VIII. Hermitian Matrices and The Turnover Rule
The eigenvalue equation

M|v(i)⟩ = λ_i |v(i)⟩,

which can be expressed in terms of its indices as

$$ \sum_j M_{kj}\, v_j(i) = \lambda_i\, v_k(i), $$

is equivalent to (just take the complex conjugate)

$$ \sum_j v_j^*(i)\, M_{kj}^* = \lambda_i\, v_k^*(i), $$

which, for a Hermitian matrix M, can be rewritten as

$$ \sum_j v_j^*(i)\, M_{jk} = \lambda_i\, v_k^*(i), $$

or, equivalently, as

⟨v(i)| M = λ_i ⟨v(i)|.
This means that the v(i), when viewed as column vectors, obey the eigenvalue identity M|v(i)⟩ = λ_i|v(i)⟩. These same vectors, when viewed as row vectors (and thus complex conjugated), also obey the eigenvalue relation, but in the "turnover" form ⟨v(i)|M = λ_i⟨v(i)|. For example, in the case we have been studying, the first vector obeys

$$ \begin{pmatrix} 0 & \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} 3 & 0 & 0 \\ 0 & \tfrac{5}{2} & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & \tfrac{5}{2} \end{pmatrix} = \begin{pmatrix} 0 & \tfrac{3}{\sqrt{2}} & \tfrac{3}{\sqrt{2}} \end{pmatrix} = 3\begin{pmatrix} 0 & \tfrac{1}{\sqrt{2}} & \tfrac{1}{\sqrt{2}} \end{pmatrix}. $$
As a general rule, a hermitian matrix operating on a column vector to the right is
equivalent to the matrix operating on the complex conjugate of the same vector written as a
row vector to the left. For a non-Hermitian matrix, it is the adjoint matrix operating on the
complex conjugate row vector that is equivalent to the original matrix acting on the column
vector on the right. These two statements are consistent since, for a Hermitian matrix, the adjoint is identical to the original matrix (Hermitian matrices are often called "self-adjoint" matrices for this reason): M = M†.
IX. Connection Between Orthonormal Vectors and Orthonormal Functions.
For vectors as we have been dealing with, the scalar or dot product is defined as we have seen:

$$ \langle v(i)|v(j)\rangle \equiv \sum_k v_k^*(i)\, v_k(j). $$

For functions of one or more variables (we denote the variables collectively as x), the generalization of the vector scalar product is

$$ \langle f(i)|f(j)\rangle \equiv \int f_i^*(x)\, f_j(x)\, dx. $$
The range of integration and the meaning of the variables x will be defined by the specific problem of interest; for example, in polar coordinates for one particle, x → r, θ, φ, and for N particles in a general coordinate system, x → r_1, r_2, ..., r_N.
If the functions f_i(x) are orthonormal,

⟨f(i)|f(j)⟩ = δ_ij,

and complete, then a resolution of the identity holds that is similar to that for vectors:

$$ \sum_i |f(i)\rangle\langle f(i)| = \sum_i f_i(x)\, f_i^*(x') = \delta(x - x'), $$

where δ(x − x') is called the Dirac delta function. The δ function is zero everywhere except at x = x', but has unit area under it. In other words,

$$ \int \delta(x - x')\, g(x')\, dx' = g(x) = \sum_i |f(i)\rangle\langle f(i)|g\rangle = \sum_i f_i(x)\int f_i^*(x')\, g(x')\, dx' = \sum_i f_i(x)\, a_i, $$

where a_i is the projection of g(x) along f_i(x). The first equality can be taken as a definition of δ(x − x').
X. Matrix Representations of Functions and Operators
As we saw above, if the set of functions {f_i(x)} is complete, any function of x can be expanded in terms of the {f_i(x)}:

$$ g(x) = \sum_i |f(i)\rangle\langle f(i)|g\rangle = \sum_i f_i(x)\int f_i^*(x')\, g(x')\, dx'. $$

The column vector of numbers a_i ≡ ∫ f_i*(x') g(x') dx' is called the representation of g(x) in the f_i(x) basis. Note that this vector may have an infinite number of components because there may be an infinite number of f_i(x).
Suppose that the function g(x) obeys some operator equation (e.g., an eigenvalue equation) such as

$$ \frac{dg(x)}{dx} = \alpha\, g(x). $$

Then, if we express g(x) in terms of the {f_i(x)}, the equation for g(x) can be replaced by a corresponding matrix problem for the vector of coefficients a_i:

$$ \sum_i \frac{df_i(x)}{dx}\int f_i^*(x')\, g(x')\, dx' = \alpha \sum_i f_i(x)\int f_i^*(x')\, g(x')\, dx'. $$
If we now multiply by f_j*(x) and integrate over x, we obtain

$$ \sum_i \left(\int f_j^*(x)\,\frac{d}{dx} f_i(x)\, dx\right) a_i = \alpha \sum_i \left(\int f_j^*(x)\, f_i(x)\, dx\right) a_i. $$

If the f_i(x) functions are orthonormal, this result simplifies to

$$ \sum_i \left(\frac{d}{dx}\right)_{ji} a_i = \alpha\, a_j, $$
where we have defined the matrix representation of d/dx in the {f_i} basis by

$$ \left(\frac{d}{dx}\right)_{ji} \equiv \int f_j^*(x)\,\frac{d}{dx} f_i(x)\, dx. $$

Remember that a_i is the representation of g(x) in the {f_i} basis. So the operator eigenvalue equation is equivalent to the matrix eigenvalue problem if the functions {f_i} form a complete set.
Let us consider an example, that of the derivative operator in the orthonormal basis of Harmonic Oscillator functions. The fact that the solutions of the quantum Harmonic Oscillator, ψ_n(x), are orthonormal and complete means that

$$ \int_{-\infty}^{+\infty} \psi_n^*(x)\,\psi_m(x)\, dx = \delta_{mn}. $$
The lowest members of this set are given as

$$ \psi_0(x) = \pi^{-1/4}\, e^{-x^2/2}, $$
$$ \psi_1(x) = \pi^{-1/4}\, 2^{1/2}\, x\, e^{-x^2/2}, $$
$$ \psi_2(x) = \pi^{-1/4}\, 8^{-1/2}\,(4x^2 - 2)\, e^{-x^2/2}, \;\ldots, $$
$$ \psi_n(x) = A_n\, H_n(x)\, e^{-x^2/2}. $$
The derivatives of these functions are

$$ \psi_0'(x) = \pi^{-1/4}\,(-x)\, e^{-x^2/2}, $$
$$ \psi_1'(x) = \pi^{-1/4}\, 2^{1/2}\left( e^{-x^2/2} - x^2\, e^{-x^2/2} \right) = 2^{-1/2}\left( \psi_0 - \sqrt{2}\,\psi_2 \right), \ \text{etc.} $$
In general, one finds that

$$ \frac{d\psi_n(x)}{dx} = 2^{-1/2}\left[ n^{1/2}\,\psi_{n-1} - (n+1)^{1/2}\,\psi_{n+1} \right]. $$
From this general result, it is clear that the matrix representation of the d/dx operator is given by

$$ D = 2^{-1/2}\begin{pmatrix} 0 & 1 & 0 & 0 & \cdots \\ -1 & 0 & \sqrt{2} & 0 & \cdots \\ 0 & -\sqrt{2} & 0 & \sqrt{3} & \cdots \\ 0 & 0 & -\sqrt{3} & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}. $$
The matrix D operates on the unit vectors e_0 = (1, 0, 0, ...), e_1 = (0, 1, 0, ...), etc. just like d/dx operates on ψ_n(x), because these unit vectors are the representations of the ψ_n(x) in the basis of the ψ_n(x), and D is the representation of d/dx in the basis of the ψ_n(x). Since any vector can be represented by a linear combination of the e_i vectors, the matrix D operates on any vector (a_0, a_1, a_2, ...) just like d/dx operates on any f(x). Note that the matrix D is not Hermitian; it is actually antisymmetric. However, if we multiply D by −iħ, we obtain a Hermitian matrix that represents the operator −iħ d/dx, the momentum operator.
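For a finite truncation of the oscillator basis, this D matrix can be assembled directly from the general derivative formula above. The sketch below (numpy; the basis size nmax is an arbitrary choice) builds D, confirms that it is antisymmetric, and confirms that −iD (in units with ħ = 1) is Hermitian.

```python
import numpy as np

nmax = 6                                      # keep the first nmax oscillator functions
D = np.zeros((nmax, nmax))
for n in range(nmax):
    # d(psi_n)/dx = 2**(-1/2) * ( sqrt(n) psi_{n-1} - sqrt(n+1) psi_{n+1} )
    if n - 1 >= 0:
        D[n - 1, n] = np.sqrt(n) / np.sqrt(2.0)
    if n + 1 < nmax:
        D[n + 1, n] = -np.sqrt(n + 1) / np.sqrt(2.0)

print(np.allclose(D.T, -D))                   # True: D is antisymmetric, hence not Hermitian

p = -1j * D                                   # the momentum operator -i d/dx (hbar = 1)
print(np.allclose(p, p.conj().T))             # True: -iD is Hermitian
```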
It is easy to see that we can form the matrix representation of any linear operator
for any complete basis in any space. To do so, we act on each basis function with the
operator and express the resulting function as a linear combination of the original basis
functions. The coefficients that arise when we express the operator acting on the functions in terms of the original functions form the matrix representation of the operator.
It is natural to ask what the eigenvalues and eigenfunctions of the matrix you form
through this process mean. If your operator is the Hamiltonian operator, then the matrix
eigenvectors and eigenvalues are the representations of the solutions to the Schrödinger
equation in this basis. Forming the representation of an operator reduces the solution of the
operator eigenvalue equation to a matrix eigenvalue equation.
XI. Complex Numbers, Fourier Series, Fourier Transforms, Basis Sets
One of the techniques to which chemists are frequently exposed is Fourier
transforms. They are used in NMR and IR spectroscopy, quantum mechanics, and
classical mechanics.
Just as we expand a function in terms of a complete set of basis functions, or a vector in terms of a complete set of basis vectors, the Fourier transform expresses a function f(ω) of a continuous variable ω in terms of a set of orthonormal functions that are not discretely labeled but are instead labeled by a continuous "index" t. These functions are (2π)^{-1/2} e^{-iωt}, and the "coefficients" f(t) in the expansion

$$ f(\omega) = (2\pi)^{-1/2}\int_{-\infty}^{+\infty} e^{-i\omega t}\, f(t)\, dt $$

constitute the Fourier transform of f(ω).

The orthonormality of the (2π)^{-1/2} e^{-iωt} functions will be demonstrated explicitly later. Before doing so, however, it is useful to review both complex numbers and basis sets.
A. Complex numbers
A complex number has a real part and an imaginary part and is usually denoted

z = x + iy.

For the complex number z, x is the real part, y is the imaginary part, and i = √−1. This is expressed as x = Re(z), y = Im(z). For every complex number z, there is a related one called its complex conjugate, z* = x − iy.
Complex numbers can be thought of as points in a plane where x and y are the
abscissa and ordinate, respectively. This point of view prompts us to introduce polar
coordinates r and θ to describe complex numbers, with

x = r cos θ,  y = r sin θ,

or

r = (x² + y²)^{1/2},  θ = tan^{-1}(y/x) + [π (if x < 0)].

Another name for r is the norm of z, which is denoted |z|. The angle θ is sometimes called the argument of z, arg(z), or the phase of z.
Complex numbers can be added, subtracted, multiplied and divided like real
numbers. For example, the multiplication of z by z* gives

zz* = (x + iy)(x − iy) = x² + ixy − ixy + y² = x² + y² = r².

Thus zz* = |z|² is a real number.
An identity due to Euler is important to know and is quite useful. It states that
e^{iθ} = cos θ + i sin θ.
It can be proven by showing that both sides of the identity obey the same differential
equation. Here we will only demonstrate its plausibility by Taylor series expanding both
sides:
$$ e^{x} = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots, $$
$$ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots, $$

and

$$ \cos x = 1 - \frac{x^2}{2} + \frac{x^4}{4!} + \cdots. $$
Therefore, the exponential exp(iθ) becomes

$$ e^{i\theta} = 1 + i\theta + i^2\frac{\theta^2}{2} + i^3\frac{\theta^3}{3!} + i^4\frac{\theta^4}{4!} + i^5\frac{\theta^5}{5!} + \cdots = 1 - \frac{\theta^2}{2} + \frac{\theta^4}{4!} + \cdots + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} + \cdots\right). $$

The odd powers of θ clearly combine to give the sine function and the even powers give the cosine function, so

e^{iθ} = cos θ + i sin θ

is established. From this identity, it follows that cos θ = ½(e^{iθ} + e^{-iθ}) and sin θ = (1/2i)(e^{iθ} − e^{-iθ}).
It is now possible to express the complex number z as

z = x + iy = r cos θ + i r sin θ = r(cos θ + i sin θ) = r e^{iθ}.

This form for complex numbers is extremely useful. For instance, it can be used to show easily that

zz* = r e^{iθ} (r e^{-iθ}) = r² e^{i(θ−θ)} = r².
B. Fourier Series
Now let us consider a function that is periodic in time with period T. Fourier's
theorem states that any periodic function can be expressed in a Fourier series as a linear
combination (infinite series) of Sines and Cosines whose frequencies are multiples of a
fundamental frequency Ω corresponding to the period:
$$ f(t) = \sum_{n=0}^{\infty} a_n \cos(n\Omega t) + \sum_{n=1}^{\infty} b_n \sin(n\Omega t), $$

where Ω = 2π/T. The Fourier expansion coefficients are given by projecting f(t) along each of the orthogonal sine and cosine functions:
$$ a_0 = \frac{1}{T}\int_0^T f(t)\, dt, $$
$$ a_n = \frac{2}{T}\int_0^T f(t)\cos(n\Omega t)\, dt, $$
$$ b_n = \frac{2}{T}\int_0^T f(t)\sin(n\Omega t)\, dt. $$
The term in the Fourier expansion associated with a_0 is a constant giving the average value (i.e., the zero-frequency or DC component) of the function. The terms with n = 1 contain the fundamental frequency, and higher terms contain the n-th harmonics, or overtones, of the fundamental frequency. The coefficients a_n and b_n give the amplitudes of each of these frequencies. Note that if f(t) is an even function (i.e., if f(t) = f(−t)), then b_n = 0 for n = 1, 2, 3, ..., so the series has only cosine terms. If f(t) is an odd function (i.e., f(t) = −f(−t)), then a_n = 0 for n = 0, 1, 2, 3, ... and the series has only sine terms.
The Fourier series expresses a continuous function as an infinite series of numbers ..., a_0, a_1, b_1, a_2, b_2, .... We say that the set of coefficients is a representation of the function in the Fourier basis. The expansion in the cos(nΩt) and sin(nΩt) basis is useful because the basis functions are orthogonal when integrated over the interval 0 to T. An alternative set of functions is sometimes used to carry out the Fourier expansion; namely, the (1/T)^{1/2} exp(inΩt) functions for n from −∞ to +∞. Their orthogonality can be proven as follows:

$$ \frac{1}{T}\int_0^T \psi_n^*\,\psi_m\, dt = \frac{1}{T}\int_0^T \exp(i(m-n)\Omega t)\, dt = 1 \quad \text{if } m = n, $$
$$ = \frac{1}{T}\,\big(i(m-n)\Omega\big)^{-1}\,\big(\exp(i(m-n)\Omega T) - 1\big) = 0 \quad \text{if } m \neq n. $$
Let us consider the Fourier representation of f(t) in terms of the complex exponentials introduced above. For an arbitrary periodic function f(t), we can write

$$ f(t) = \sum_{n=-\infty}^{+\infty} c_n\, e^{in\Omega t}, \qquad \text{where} \qquad c_n = \frac{1}{T}\int_0^T f(t)\, e^{-in\Omega t}\, dt. $$
This form of the Fourier series is entirely equivalent to the first form, and the a_n and b_n can be obtained from the c_n and vice versa. For example, the c_n amplitudes are obtained by projecting f(t) onto exp(inΩt):

$$ c_n = \frac{1}{T}\int_0^T f(t)\,\big(\cos(n\Omega t) - i\sin(n\Omega t)\big)\, dt = \frac{1}{T}\int_0^T f(t)\cos(n\Omega t)\, dt - \frac{i}{T}\int_0^T f(t)\sin(n\Omega t)\, dt, $$
but these two integrals are easily recognized as the expansion coefficients in the other basis, so

$$ c_n = \frac{1}{2}a_n - \frac{1}{2}i\, b_n = \frac{1}{2}(a_n - i b_n). $$
By using complex exponential functions instead of trigonometric functions, we have only one family of basis functions, e^{inΩt}, but we have twice as many of them (n runs over negative as well as positive integers). In this form, if f(t) is even, then the c_n are real, but if f(t) is odd, then the c_n are imaginary.
It is useful to consider some examples to learn this material. First, let f(t) be the odd function f(t) = sin 3t. Then one period has elapsed when 3T = 2π, so T = 2π/3 and Ω = 3. For this function, the complex-series Fourier expansion coefficients are given by

$$ c_0 = \frac{3}{2\pi}\int_0^{2\pi/3} \sin 3t\; dt = 0, $$

$$ c_1 = \frac{3}{2\pi}\int_0^{2\pi/3} \sin 3t\; e^{-i3t}\, dt = \frac{3}{2\pi}\int_0^{2\pi/3} \frac{e^{i3t} - e^{-i3t}}{2i}\, e^{-i3t}\, dt = \frac{1}{2i}\,\frac{3}{2\pi}\int_0^{2\pi/3} \big(1 - e^{-6it}\big)\, dt = \frac{3}{4\pi i}\left(\frac{2\pi}{3} - 0\right) = \frac{1}{2i} = -\frac{i}{2}. $$

Because sin 3t is real, it is straightforward to see from the definition of c_n that c_{−n} = c_n*, as a result of which c_{−1} = c_1* = +i/2. The orthogonality of the exp(inΩt) functions can be used to show that all of the higher c_n coefficients vanish:

c_n = 0 for n ≠ ±1.

Hence we see that this simple periodic function has just two terms in its Fourier series. In terms of the sine and cosine expansion, one finds for this same f(t) = sin 3t that a_n = 0 for all n, b_n = 0 for n ≠ 1, and b_1 = 1.
As another example, let f(t) be

f(t) = t,  −π < t < π,

and make f(t) periodic with period 2π (so Ω = 1). For this function,

$$ c_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} t\, dt = 0, $$

$$ c_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} t\, e^{-int}\, dt = \frac{1}{2\pi n^2}\Big[ e^{-int}(1 + int) \Big]_{-\pi}^{\pi} = \frac{1}{2\pi n^2}\big( e^{-in\pi}(1 + in\pi) - e^{in\pi}(1 - in\pi) \big) = \frac{1}{2\pi n^2}\big( (-1)^n(1 + in\pi) - (-1)^n(1 - in\pi) \big) = \frac{(-1)^n}{2\pi n^2}\,(2in\pi) = \frac{i}{n}(-1)^n, \qquad n \neq 0. $$

Note that since t is an odd function, c_n is imaginary, and as n → ∞, c_n approaches zero.
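A quick numerical check of these coefficients (a numpy sketch using a simple Riemann sum; the grid size is arbitrary) compares the integral definition of c_n with the closed form i(−1)ⁿ/n derived above.

```python
import numpy as np

N = 200000
t = np.linspace(-np.pi, np.pi, N, endpoint=False)    # one period of f(t) = t
dt = t[1] - t[0]
f = t

def c(n):
    # c_n = (1/T) * integral over one period of f(t) exp(-i n Omega t) dt, with T = 2*pi, Omega = 1
    return np.sum(f * np.exp(-1j * n * t)) * dt / (2.0 * np.pi)

for n in range(1, 5):
    exact = 1j * (-1) ** n / n
    print(n, c(n), exact)                    # the numerical and analytic coefficients agree closely
```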
C. Fourier Transforms
Let us now imagine that the period T becomes very long. This means that the fundamental frequency Ω = 2π/T becomes very small and the harmonic frequencies nΩ are very close together. In the limit of an infinite period one has a non-periodic function (i.e., an arbitrary function). The frequencies needed to represent such a function become continuous, and hence the Fourier sum becomes an integral.
To make this generalization more concrete, we first take the earlier Fourier expression for f(t) and multiply it by unity in the form ΩT/2π to obtain

$$ f(t) = \sum_{n=-\infty}^{+\infty} c_n\, e^{in\Omega t}\,\frac{\Omega T}{2\pi}. $$

Now we replace nΩ (frequencies that lie infinitesimally close together for sufficiently long T) by the continuous index ω:

$$ f(t) = \frac{1}{2\pi}\sum_{n=-\infty}^{+\infty} (c_n T)\, e^{i\omega t}\,\Omega. $$
In the limit of long T, this sum becomes an integral because Ω becomes infinitesimally small (Ω → dω). As T grows, the frequency spacing between n and n+1, which is 2π/T, as well as the frequency associated with a given n-value, becomes smaller and smaller. As a result, the product c_n T evolves into a continuous function of ω, which we denote c(ω). Before, c_n depended on the discrete index n and represented the contribution to the function f(t) of waves with frequency nΩ. The function c(ω) is the contribution per unit frequency to f(t) from waves with frequency in the range ω to ω + dω. In summary, the Fourier series expansion evolves into an integral that defines the Fourier transformation of f(t):
$$ f(t) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} c(\omega)\, e^{i\omega t}\, d\omega. $$

It is convenient to define a new function f(ω) = (2π)^{-1/2} c(ω), where f(ω) is called the Fourier transform of f(t). The Fourier transform f(ω) is a representation of f(t) in another basis, that of the orthonormal set of oscillating functions e^{iωt}(2π)^{-1/2}:

$$ f(t) = \left(\frac{1}{2\pi}\right)^{1/2}\int_{-\infty}^{+\infty} f(\omega)\, e^{i\omega t}\, d\omega. $$
Earlier, for Fourier series, we had the orthogonality relation among the Fourier functions,

$$ \frac{1}{T}\int_0^T \psi_n^*\,\psi_m\, dt = \delta_{nm}, $$

but for the continuous variable ω we have a different kind of orthogonality,

$$ \int_{-\infty}^{+\infty} \psi^*(\omega_1)\,\psi(\omega_2)\, dt = \delta(\omega_1 - \omega_2), $$

where ψ(ω_j) = (2π)^{-1/2} e^{iω_j t}.
The function δ(ω), called the Dirac delta function, is the continuous analog of δ_nm. It is zero unless ω = 0. At ω = 0, δ(ω) is infinite, but it is infinite in such a way that the area under the curve is precisely unity. Its most useful definition is that δ(ω) is the function for which, for an arbitrary f(ω), the following identity holds:

$$ \int_{-\infty}^{+\infty} f(\omega)\,\delta(\omega - \omega')\, d\omega = f(\omega'). $$
That is, integrating δ(ω-ω') times any function evaluated at ω just gives back the value of
the function at ω' .
The Dirac delta function can be expressed in a number of convenient ways, including the following two:

$$ \delta(\omega - \omega') = \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{i(\omega-\omega')t}\, dt = \lim_{a\to 0}\;\frac{1}{a\sqrt{\pi}}\; e^{-(\omega-\omega')^2/a^2}. $$
As an example of applying the Fourier transform method to a non-periodic function, consider the localized pulse

$$ f(t) = \begin{cases} 0, & |t| > \tfrac{T}{2}, \\[4pt] \tfrac{1}{T}, & -\tfrac{T}{2} \le t \le \tfrac{T}{2}. \end{cases} $$
For this function, the Fourier transform is

$$ f(\omega) = (2\pi)^{-1/2}\int_{-\infty}^{+\infty} f(t)\, e^{-i\omega t}\, dt = (2\pi)^{-1/2}\,\frac{1}{T}\int_{-T/2}^{T/2} e^{-i\omega t}\, dt = (2\pi)^{-1/2}\,\frac{1}{T}\int_{-T/2}^{T/2} (\cos\omega t - i\sin\omega t)\, dt $$
$$ = (2\pi)^{-1/2}\,\frac{1}{T}\int_{-T/2}^{T/2} \cos\omega t\; dt + 0 = (2\pi)^{-1/2}\,\frac{1}{\omega T}\Big[\sin\omega t\Big]_{-T/2}^{T/2} = \left(\frac{2}{\pi}\right)^{1/2}\frac{\sin(\omega T/2)}{\omega T}. $$
Note that f(ω) has its maximum value of (2π)^{-1/2} at ω = 0 and that f(ω) falls slowly in magnitude to zero as ω increases in magnitude, oscillating between positive and negative values. However, the primary maximum in f(ω) near zero frequency has a width that is inversely proportional to T. This inverse relationship between the width (T) in t-space and the width (2π/T) in ω-space is an example of the uncertainty principle. In the present case it means that if you try to localize a wave in time, you must use a wide range of frequency components.
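The width relationship is easy to see numerically. The sketch below (numpy; the pulse width T and the frequency grid are arbitrary choices) evaluates the Fourier-transform integral of the pulse on a grid and compares it with the analytic result (2/π)^{1/2} sin(ωT/2)/(ωT).

```python
import numpy as np

T = 2.0                                       # pulse width
t = np.linspace(-T / 2, T / 2, 200001)        # f(t) = 1/T on this interval, 0 outside it
dt = t[1] - t[0]
f_t = np.full_like(t, 1.0 / T)

for w in [-30.0, -10.0, 0.0, 10.0, 30.0]:
    numeric = np.sum(f_t * np.exp(-1j * w * t)) * dt / np.sqrt(2.0 * np.pi)
    if w == 0.0:
        analytic = 1.0 / np.sqrt(2.0 * np.pi)             # the maximum value (2*pi)**(-1/2)
    else:
        analytic = np.sqrt(2.0 / np.pi) * np.sin(w * T / 2) / (w * T)
    print(w, numeric.real, analytic)          # the two agree; the central peak has width ~ 2*pi/T
```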
D. Fourier Series and Transforms in Space
The formalism we have developed for functions of time can also be used for functions of space variables, or of any other variables for that matter. If f(x) is a periodic function of the coordinate x, with period (repeat distance) 2π/K, then it can be represented in terms of Fourier functions as follows:

$$ f(x) = \sum_{n=-\infty}^{+\infty} f_n\, e^{inKx}, $$

where the coefficients are

$$ f_n = \frac{K}{2\pi}\int_0^{2\pi/K} f(x)\, e^{-inKx}\, dx. $$
If f(x) is a non-periodic function of the coordinate x, we write it as

$$ f(x) = (2\pi)^{-1/2}\int_{-\infty}^{+\infty} f(k)\, e^{ikx}\, dk, $$

and the Fourier transform is given as

$$ f(k) = (2\pi)^{-1/2}\int_{-\infty}^{+\infty} f(x)\, e^{-ikx}\, dx. $$
If f is a function of several spatial coordinates and/or time, one can Fourier transform (or
express as Fourier series) simultaneously in as many variables as one wishes. You can
even Fourier transform in some variables, expand in Fourier series in others, and not
transform in another set of variables. It all depends on whether the functions are periodic
or not, and whether you can solve the problem more easily after you have transformed it.
E. Comments
So far we have seen that a periodic function can be expanded in a discrete basis set
of frequencies and a non-periodic function can be expanded in a continuous basis set of
frequencies. The expansion process can be viewed as expressing a function in a different
basis. These basis sets are the collections of solutions to a differential equation called the
wave equation. These sets of solutions are useful because they are complete sets .
Completeness means that any arbitrary function can be expressed exactly as a linear
combination of these functions. Mathematically, completeness can be expressed as
$$ 1 = \int |\psi(\omega)\rangle\langle\psi(\omega)|\, d\omega $$

in the Fourier transform case, and

$$ 1 = \sum_n |\psi_n\rangle\langle\psi_n| $$

in the Fourier series case.
The only limitation on the function expressed is that it has to be a function that has
the same boundary properties and depends on the same variables as the basis. You would
not want to use Fourier series to express a function that is not periodic, nor would you
want to express a three-dimensional vector using a two-dimensional or four-dimensional
basis.
Besides the intrinsic usefulness of Fourier series and Fourier transforms for
chemists (e.g., in FTIR spectroscopy), we have developed these ideas to illustrate a point
that is important in quantum chemistry. Much of quantum chemistry is involved with basis
sets and expansions. This has nothing in particular to do with quantum mechanics. Any
time one is dealing with linear differential equations like those that govern light (e.g.
spectroscopy) or matter (e.g. molecules), the solution can be written as linear combinations
of complete sets of solutions.
XII. Spherical Coordinates
A. Definitions
The relationships among cartesian and spherical polar coordinates are given as
follows:
$$ z = r\cos\theta, \qquad r = \sqrt{x^2 + y^2 + z^2}, $$
$$ x = r\sin\theta\cos\phi, \qquad \theta = \cos^{-1}\!\left(\frac{z}{\sqrt{x^2+y^2+z^2}}\right), $$
$$ y = r\sin\theta\sin\phi, \qquad \phi = \cos^{-1}\!\left(\frac{x}{\sqrt{x^2+y^2}}\right). $$

The ranges of the polar variables are 0 ≤ r < ∞, 0 ≤ θ ≤ π, 0 ≤ φ < 2π.
B. The Jacobian in Integrals
In performing integration over all space, it is necessary to convert the multiple
integral from cartesian to spherical coordinates:

$$ \int dx \int dy \int dz\; f(x,y,z) \;\longrightarrow\; \int dr \int d\theta \int d\phi\; f(r,\theta,\phi)\, J, $$

where J is the so-called Jacobian of the transformation. J is computed by forming the determinant of the three-by-three matrix of partial derivatives relating x, y, z to r, θ, φ:

$$ J = \left|\frac{\partial(x,y,z)}{\partial(r,\theta,\phi)}\right| = \begin{vmatrix} \sin\theta\cos\phi & \sin\theta\sin\phi & \cos\theta \\ r\cos\theta\cos\phi & r\cos\theta\sin\phi & -r\sin\theta \\ -r\sin\theta\sin\phi & r\sin\theta\cos\phi & 0 \end{vmatrix}. $$
The determinant J can be expanded, for example, by the method of diagonals, giving four nonzero terms:

$$ J = r^2\sin^3\theta\sin^2\phi + r^2\cos^2\theta\sin\theta\cos^2\phi + r^2\sin^3\theta\cos^2\phi + r^2\cos^2\theta\sin\theta\sin^2\phi $$
$$ = r^2\sin\theta\big(\sin^2\theta(\sin^2\phi + \cos^2\phi) + \cos^2\theta(\cos^2\phi + \sin^2\phi)\big) = r^2\sin\theta. $$

Hence, in converting integrals from x, y, z to r, θ, φ, one writes as shorthand dx dy dz = r² sinθ dr dθ dφ.
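A numerical spot-check of this result (a numpy sketch; the sample point is arbitrary): build the 3×3 matrix of partial derivatives of (x, y, z) with respect to (r, θ, φ) at one point and compare its determinant with r² sin θ.

```python
import numpy as np

r, theta, phi = 1.7, 0.8, 2.3                 # an arbitrary point

# rows are the derivatives of (x, y, z) with respect to r, theta, phi
J = np.array([
    [np.sin(theta) * np.cos(phi),       np.sin(theta) * np.sin(phi),      np.cos(theta)],
    [r * np.cos(theta) * np.cos(phi),   r * np.cos(theta) * np.sin(phi), -r * np.sin(theta)],
    [-r * np.sin(theta) * np.sin(phi),  r * np.sin(theta) * np.cos(phi),  0.0],
])

print(np.isclose(np.linalg.det(J), r**2 * np.sin(theta)))   # True
```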
C. Transforming Operators
In many applications, derivative operators need to be expressed in spherical
coordinates. In converting from cartesian to spherical coordinate derivatives, the chain rule
is employed as follows:

$$ \frac{\partial}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial}{\partial\theta} + \frac{\partial\phi}{\partial x}\frac{\partial}{\partial\phi} = \sin\theta\cos\phi\,\frac{\partial}{\partial r} + \frac{\cos\theta\cos\phi}{r}\,\frac{\partial}{\partial\theta} - \frac{\sin\phi}{r\sin\theta}\,\frac{\partial}{\partial\phi}. $$

Likewise,

$$ \frac{\partial}{\partial y} = \sin\theta\sin\phi\,\frac{\partial}{\partial r} + \frac{\cos\theta\sin\phi}{r}\,\frac{\partial}{\partial\theta} + \frac{\cos\phi}{r\sin\theta}\,\frac{\partial}{\partial\phi}, $$

and

$$ \frac{\partial}{\partial z} = \cos\theta\,\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial}{\partial\theta} + 0\cdot\frac{\partial}{\partial\phi}. $$
Now, to obtain an expression for ∂²/∂x² and the other second derivatives, one needs to take the following derivatives:

$$ \frac{\partial^2}{\partial x^2} = \sin\theta\cos\phi\,\frac{\partial}{\partial r}\left( \sin\theta\cos\phi\,\frac{\partial}{\partial r} + \frac{\cos\theta\cos\phi}{r}\,\frac{\partial}{\partial\theta} - \frac{\sin\phi}{r\sin\theta}\,\frac{\partial}{\partial\phi} \right) $$
$$ + \frac{\cos\theta\cos\phi}{r}\,\frac{\partial}{\partial\theta}\left( \sin\theta\cos\phi\,\frac{\partial}{\partial r} + \frac{\cos\theta\cos\phi}{r}\,\frac{\partial}{\partial\theta} - \frac{\sin\phi}{r\sin\theta}\,\frac{\partial}{\partial\phi} \right) $$
$$ - \frac{\sin\phi}{r\sin\theta}\,\frac{\partial}{\partial\phi}\left( \sin\theta\cos\phi\,\frac{\partial}{\partial r} + \frac{\cos\theta\cos\phi}{r}\,\frac{\partial}{\partial\theta} - \frac{\sin\phi}{r\sin\theta}\,\frac{\partial}{\partial\phi} \right) $$
$$ = \sin^2\theta\cos^2\phi\,\frac{\partial^2}{\partial r^2} + \frac{\cos^2\theta\cos^2\phi}{r^2}\,\frac{\partial^2}{\partial\theta^2} + \frac{\sin^2\phi}{r^2\sin^2\theta}\,\frac{\partial^2}{\partial\phi^2} $$
$$ + \frac{2\sin\theta\cos\theta\cos^2\phi}{r}\,\frac{\partial^2}{\partial r\,\partial\theta} - \frac{2\cos\phi\sin\phi}{r}\,\frac{\partial^2}{\partial r\,\partial\phi} - \frac{2\cos\theta\cos\phi\sin\phi}{r^2\sin\theta}\,\frac{\partial^2}{\partial\theta\,\partial\phi} $$
$$ + \left( \frac{\cos^2\theta\cos^2\phi}{r} + \frac{\sin^2\phi}{r} \right)\frac{\partial}{\partial r} + \left( -\frac{2\sin\theta\cos\theta\cos^2\phi}{r^2} + \frac{\sin^2\phi\cos\theta}{r^2\sin\theta} \right)\frac{\partial}{\partial\theta} $$
$$ + \left( \frac{\cos\phi\sin\phi}{r^2} + \frac{\cos^2\theta\cos\phi\sin\phi}{r^2\sin^2\theta} + \frac{\cos\phi\sin\phi}{r^2\sin^2\theta} \right)\frac{\partial}{\partial\phi}. $$
Analogous steps can be performed for ∂²/∂y² and ∂²/∂z². Adding up the three contributions, one obtains

$$ \nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. $$
As can be seen by reading Appendix G, the term involving angular derivatives in ∇² is identical to −L²/r², where L² is the square of the rotational angular momentum operator. Although in this appendix we choose to treat it as a collection of differential operators that gives rise to differential equations in θ and φ to be solved, there are more general tools for treating all such angular momentum operators. These tools are developed in detail in Appendix G and are used substantially in the next two chapters.
XIII. Separation of Variables
In solving differential equations such as the Schrödinger equation involving two or
more variables (e.g., equations that depend on three spatial coordinates x, y, and z or r, θ,
and φ or that depend on time t and several spatial variables denoted r), it is sometimes
possible to reduce the solution of this one multi-variable equation to the solution of several
equations each depending on fewer variables. A commonly used device for achieving this
goal is the separation of variables technique .
This technique is not always successful, but can be used for the type of cases
illustrated now. Consider a two-dimensional differential equation that is of second order
and of the eigenvalue type:
$$ A\,\frac{\partial^2\psi}{\partial x^2} + B\,\frac{\partial^2\psi}{\partial y^2} + C\,\frac{\partial^2\psi}{\partial x\,\partial y} + D\,\psi = E\,\psi. $$
The solution ψ must be a function of x and y because the differential equation refers to ψ's derivatives with respect to these variables.
The separations of variables device assumes that ψ(x,y) can be written as a product
of a function of x and a function of y:
ψ(x,y) = α(x) β(y).
Inserting this ansatz into the above differential equation and then dividing by α(x)β(y) produces:

$$ A\,\alpha^{-1}\frac{\partial^2\alpha}{\partial x^2} + B\,\beta^{-1}\frac{\partial^2\beta}{\partial y^2} + C\,\alpha^{-1}\beta^{-1}\frac{\partial\alpha}{\partial x}\frac{\partial\beta}{\partial y} + D = E. $$
The key observations to be made are:
A. If A is independent of y, then A α^{-1} ∂²α/∂x² must be independent of y.
B. If B is independent of x, then B β^{-1} ∂²β/∂y² must be independent of x.
C. If C vanishes and D does not depend on both x and y, then there are no "cross terms" in
the above equations (i.e., terms that contain both x and y). For the sake of argument, let us
take D to be dependent on x only for the remainder of this discussion; the case for which D
depends on y only is handled in like manner.
Under circumstances for which all three of the above conditions are true, the left-hand side of the above second-order equation in two variables can be written as the sum of

A α^{-1} ∂²α/∂x² + D,

which is independent of y, and

B β^{-1} ∂²β/∂y²,

which is independent of x. The full equation states that the sum of these two pieces must equal the eigenvalue E, which is independent of both x and y.

Because E and B β^{-1} ∂²β/∂y² are both independent of x, the quantity A α^{-1} ∂²α/∂x² + D (as a whole) cannot depend on x, since E − B β^{-1} ∂²β/∂y² must equal A α^{-1} ∂²α/∂x² + D for all values of x and y. That A α^{-1} ∂²α/∂x² + D is independent of x and (by assumption) independent of y allows us to write

A α^{-1} ∂²α/∂x² + D = ε,

a constant.
Likewise, because E and A α^{-1} ∂²α/∂x² + D are independent of y, the quantity B β^{-1} ∂²β/∂y² (as a whole) cannot depend on y, since E − A α^{-1} ∂²α/∂x² − D must equal B β^{-1} ∂²β/∂y² for all values of x and y. That B β^{-1} ∂²β/∂y² is independent of y and (by assumption) independent of x allows us to write

B β^{-1} ∂²β/∂y² = ε′,

another constant.
The net result is that we now have two second-order differential equations of the eigenvalue form:

A ∂²α/∂x² + D α = ε α,

and

B ∂²β/∂y² = ε′ β,

and the solution of the original equation has been successfully subjected to separation of variables. The two eigenvalues ε and ε′ of the separated x- and y-equations must obey

ε + ε′ = E,

which follows by using the two separate (x and y) eigenvalue equations in the full two-dimensional equation that contains E as its eigenvalue.
In summary, when separations of variables can be used, it:
A. Reduces one multidimensional differential equation to two or more lower-dimensional
differential equations.
B. Expresses the eigenvalue of the original equation as a sum of eigenvalues (whose values are determined via boundary conditions, as usual) of the lower-dimensional problems.
C. Expresses the solutions to the original equation as a product of solutions to the lower-
dimensional equations (i.e., ψ(x,y) = α(x) β(y), for the example considered above).
