
(Version 2.0)

Andrew E. Blechman

September 10, 2007

A Note of Explanation

It has been my experience that many incoming graduate students have not yet been

exposed to some of the more advanced mathematics needed to access electrodynamics or

quantum mechanics. Mathematical tools such as tensors or diﬀerential forms, intended

to make the calculations much simpler, become hurdles that the students must overcome,

taking away the time and understanding from the physics. The professors, on the other

hand, do not feel it necessary to go over these topics. They cannot be blamed, since they are

trying to teach physics and do not want to waste time with the details of the mathematical

trickery that students should have picked up at some point during their undergraduate years.

Nonetheless, students ﬁnd themselves at a loss, and the professors often don’t even know

why the students are struggling until it is too late.

Hence this paper. I have attempted to summarize most of the more advanced mathemat-

ical trickery seen in electrodynamics and quantum mechanics in simple and friendly terms

with examples. I do not know how well I have succeeded, but hopefully students will be able

to get some relief from this summary.

Please realize that while I have done my best to make this as complete as possible, it is

not meant as a textbook. I have left out many details, and in some cases I have purposely

glossed over vital subtleties in the definitions and theorems. I do not provide many proofs,

and the proofs that are here are sketchy at best. This is because I intended this paper only

as a reference and a source. I hope that, in addition to giving students a place to look for a

quick reference, it might spark some interest in a few to take a math course or read a book

to get more details. I have provided a list of references that I found useful when writing this

review at the end.

Finally, I would like to thank Brock Tweedie for a careful reading of the paper, giving

very helpful advice on how to improve the language as well as catching a few mistakes.

Good luck, and have fun!

September 19, 2006

In this version (1.5), I have made several corrections and improvements. I would like to

write a chapter on analysis and conformal mapping, but unfortunately I haven’t got the time

right now, so it will have to wait. I want to thank Linda Carpenter and Nadir Jeevanjee for

their suggestions for improvement from the original version. I hope to incorporate more of

Nadir’s suggestions about tensors, but it will have to wait for Version 2...

September 10, 2007

In version 2.0, I have made several vital corrections, tried to incorporate some of the

suggestions on tensors alluded to above, and have included and improved the sections on

analysis that I always wanted to include. Thanks to Michael Luke for the plot in Chapter 1, which comes from his QFT lecture notes.


List of Symbols and Abbreviations

Here is a list of symbols used in this review, and often by professors on the blackboard.

Symbol        Meaning
∀             For all
∃             There exists
∃!            There exists unique
a ∈ A         a is a member (or element) of the set A
A ⊂ B         A is a subset of B
A = B         A ⊂ B and B ⊂ A
A ∪ B         The set of members of the sets A or B
A ∩ B         The set of members of the sets A and B
∅             The empty set; the set with no elements
A ⊔ B         The "disjoint union"; same as A ∪ B where A ∩ B = ∅
WLOG          Without Loss of Generality
p ⇒ q         implies; if p is true, then q is true
p ⇔ q; iff    p is true if and only if q is true
N             Set of natural numbers (positive integers)
Z             Set of integers
Q             Set of rational numbers
R             Set of real numbers
C             Set of complex numbers


Contents

1 Tensors and Such
  1.1 Some Definitions
  1.2 Tensors
    1.2.1 Rank 0: Scalars
    1.2.2 Rank 1: Vectors
    1.2.3 The General Case
  1.3 Upstairs, Downstairs: Contravariant vs Covariant
  1.4 Mixed Tensors
  1.5 Einstein Notation
    1.5.1 Another Shortcut
  1.6 Some Special Tensors
    1.6.1 Kronecker Delta
    1.6.2 Levi-Civita Symbol
    1.6.3 Using the Levi-Civita Symbol
  1.7 Permutations
  1.8 Constructing and Manipulating Tensors
    1.8.1 Constructing Tensors
    1.8.2 Kronecker Products and Sums

2 Transformations and Symmetries
  2.1 Symmetries and Groups
    2.1.1 SU(2) - Spin
    2.1.2 SO(3,1) - Lorentz (Poincaré) Group
  2.2 Transformations
  2.3 Lagrangian Field Theory
  2.4 Noether's Theorem

3 Geometry I: Flat Space
  3.1 THE Tensor: g_{µν}
    3.1.1 Curvilinear Coordinates
  3.2 Differential Volumes and the Laplacian
    3.2.1 Jacobians
    3.2.2 Vielbeins - a Prelude to Curvature
    3.2.3 Laplacians
  3.3 Euclidean vs Minkowskian Spaces

4 Geometry II: Curved Space
  4.1 Connections and the Covariant Derivative
  4.2 Parallel Transport and Geodesics
  4.3 Curvature - The Riemann Tensor
    4.3.1 Special Case: d = 2
    4.3.2 Higher Dimensions

5 Differential Forms
  5.1 The Mathematician's Definition of Vector
  5.2 Form Operations
    5.2.1 Wedge Product
    5.2.2 Tilde
    5.2.3 Hodge Star
    5.2.4 Evaluating k-Forms
    5.2.5 Generalized Cross Product
  5.3 Exterior Calculus
    5.3.1 Exterior Derivative
    5.3.2 Formulas from Vector Calculus
    5.3.3 Orthonormal Coordinates
  5.4 Integration
    5.4.1 Evaluating k-form Fields
    5.4.2 Integrating k-form Fields
    5.4.3 Stokes' Theorem
  5.5 Forms and Electrodynamics

6 Complex Analysis
  6.1 Analytic Functions
  6.2 Singularities and Residues
  6.3 Complex Calculus
  6.4 Conformal Mapping

Chapter 1

Tensors and Such

1.1 Some Deﬁnitions

Before diving into the details of tensors, let us review how coordinates work. In N dimensions, we can write N unit vectors, defining a basis. If they have unit magnitude and point in orthogonal directions, it is called an orthonormal basis, but this does not have to be the case. It is common mathematical notation to call these vectors e_i, where i is an index and goes from 1 to N. It is important to remember that these objects are vectors even though they also have an index. In three dimensions, for example, they may look like:

    e_1 = (1, 0, 0)^T    e_2 = (0, 1, 0)^T    e_3 = (0, 0, 1)^T   (1.1.1)

Now you can write a general N-vector as the sum of the unit vectors:

    r = x^1 e_1 + x^2 e_2 + ... + x^N e_N = Σ_{i=1}^N x^i e_i   (1.1.2)

where we have written x^i as the coordinates of the vector r. Note that the index for the coordinate is a superscript, whereas the index for the unit vector is a subscript. This small fact will turn out to be very important! Be careful not to confuse the indices for exponents.

Before going any further, let us introduce one of the most important notions of math-

ematics: the inner product. You are probably very familiar with the concept of an inner

product from regular linear algebra. Geometrically, we can deﬁne the inner product in the

following way:

    ⟨r_1, r_2⟩ = ⟨Σ_{i=1}^N x_1^i e_i, Σ_{j=1}^N x_2^j e_j⟩ = Σ_{i=1}^N Σ_{j=1}^N x_1^i x_2^j ⟨e_i, e_j⟩ = Σ_{i=1}^N Σ_{j=1}^N x_1^i x_2^j g_{ij}   (1.1.3)

where we have deﬁned the new quantity:

    g_{ij} ≡ ⟨e_i, e_j⟩   (1.1.4)

Mathematicians sometimes call this quantity the ﬁrst fundamental form. Physicists often

simply call it the metric.

Now look at the last equality of Equation 1.1.3. Notice that there is no explicit mention of the basis vectors {e_i}. This implies that it is usually enough to express a vector in terms of its components. This is exactly analogous to what you are used to doing when you say, for example, that a vector is (1,2) as opposed to x̂ + 2ŷ. Therefore, from now on, unless it is important to keep track of the basis, we will omit the basis vectors and denote the vector only by its components. So, for example, r becomes x^i. Note that "x^i" is not rigorously a vector, but only the vector components. Also note that the choice of basis is important when we want to construct the metric g_{ij}. This will be an important point later.
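
To make this concrete, here is a minimal numerical sketch (Python with NumPy; the basis vectors and components below are arbitrary choices of mine, not anything from the text) that builds the metric of Equation (1.1.4) from a non-orthonormal basis and checks the last equality of Equation (1.1.3):

    import numpy as np

    # An arbitrary non-orthonormal basis in 3 dimensions.
    e = [np.array([1.0, 0.0, 0.0]),
         np.array([1.0, 1.0, 0.0]),
         np.array([0.0, 0.0, 2.0])]

    # The metric g_ij = <e_i, e_j>, Equation (1.1.4).
    g = np.array([[np.dot(ei, ej) for ej in e] for ei in e])

    # Two vectors specified only by their components x^i in this basis.
    x1 = np.array([1.0, 2.0, 0.5])
    x2 = np.array([0.0, 1.0, 1.0])

    # Inner product from components and the metric alone
    # (the last equality of Equation (1.1.3))...
    ip_from_metric = x1 @ g @ x2

    # ...agrees with the inner product of the fully assembled vectors.
    r1 = sum(c * ei for c, ei in zip(x1, e))
    r2 = sum(c * ei for c, ei in zip(x2, e))
    assert np.isclose(ip_from_metric, np.dot(r1, r2))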

1.2 Tensors

The proper question to ask when trying to do tensor analysis involves the concept of a

transformation. One must ask the question: How do the coordinates change (“transform”)

under a given type of transformation? Before going any further, we must understand the

general answer to this problem.

Let's consider a coordinate system (x^i), and perform a transformation on these coordinates. Then we can write the new coordinates (x^{i′}) as a function of the old ones:

    x^{i′} = x^{i′}(x^1, x^2, ..., x^N)   (1.2.5)

Now let us consider only inﬁnitesimal transformations, i.e.: very small translations and

rotations. Then we can Taylor expand Equation 1.2.5 and write the change in the coordinates

as:

    δx^{i′} = Σ_{j=1}^N (∂x^{i′}/∂x^j) δx^j   (1.2.6)

where we have dropped higher order terms. Notice that the derivative is actually a matrix

(it has two indices):

    A^{i′}_{.j} ≡ ∂x^{i′}/∂x^j   (1.2.7)

Now Equation 1.2.6 can be written as a matrix multiplication:

    δx^{i′} = Σ_{j=1}^N A^{i′}_{.j} δx^j   (1.2.8)

Before going further, let us consider the new matrix A^{i′}_{.j}. First of all, take note that one index is primed, while one is not. This is a standard but very important notation, where the primed index refers to the new coordinate index, while the unprimed index refers to the old coordinate index. Also notice that there are two indices, but one is a subscript, while the other is a superscript. Again, this is not a coincidence, and will be important in what follows. I have also included a period to the left of the lower index: this is to remind you

that the indices should be read as “ij” and not as “ji”. This can prove to be very important,

as in general, the matrices we consider are not symmetric, and it is important to know the

order of indices. Finally, even though we have only calculated this to lowest order, it turns

out that Equation 1.2.8 has a generalization to large transformations. We will discuss these

details in Chapter 2.

Let us consider a simple example. Consider a 2-vector, and let the transformation be

a regular rotation by an angle θ. We know how to do such transformations. The new

coordinates are:

    x^{1′} = x^1 cos θ + x^2 sin θ ∼ x^1 + θx^2 + O(θ²)
    x^{2′} = x^2 cos θ − x^1 sin θ ∼ x^2 − θx^1 + O(θ²)

Now it is a simple matter of reading off the A^{i′}_{.j}:

    A^{1′}_{.1} = 0     A^{1′}_{.2} = θ
    A^{2′}_{.1} = −θ    A^{2′}_{.2} = 0
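
As a quick sanity check on this example, the following sketch (Python with NumPy; the angle and test point are arbitrary placeholders of mine) compares the exact rotated coordinates with the infinitesimal form read off above:

    import numpy as np

    theta = 1e-4                       # a small rotation angle
    x = np.array([3.0, 4.0])           # an arbitrary test point

    # Exact rotation of the coordinates, as in the example above.
    R = np.array([[ np.cos(theta), np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])

    # The infinitesimal matrix A^{i'}_{.j} read off above
    # (the deviation of R from the identity, to first order in theta).
    A = np.array([[0.0,    theta],
                  [-theta, 0.0]])

    # The exact change in coordinates matches A x up to O(theta^2).
    delta_exact = R @ x - x
    delta_linear = A @ x
    assert np.allclose(delta_exact, delta_linear, atol=10 * theta**2)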

Armed with this knowledge, we can now deﬁne a tensor:

A tensor is a collection of objects which, combined the right way, transform the same

way as the coordinates under inﬁnitesimal (proper) symmetry transformations.

What are “inﬁnitesimal proper symmetry transformations”? This is an issue that we

will tackle in Chapter 2. For now it will be enough to say that in N-dimensions, they are

translations and rotations in each direction; without the proper, they also include reﬂections

(x → −x). This deﬁnition basically states that a tensor is an object that for each index,

transforms according to Equation 1.2.8, with the coordinate replaced by the tensor.

Tensors carry indices. The number of indices is known as the tensor rank. Notice, however, that not everything that carries an index is a tensor!!

Some people with more mathematical background might be upset with this deﬁnition.

However, you should be warned that the mathematician’s deﬁnition of a tensor and the

physicist’s deﬁnition of a tensor are not exactly the same. I will say more on this in Chapter

5. For the more interested readers, I encourage you to think about this; but for the majority

of people using this review, it’s enough to know and use the above deﬁnition.

Now that we have a suitable deﬁnition, let us consider the basic examples.

1.2.1 Rank 0: Scalars

The simplest example of a tensor is a rank-0 tensor, or scalar. Since each index is supposed

to transform like Equation 1.2.8, and there is no index, it must be that a scalar does not

transform at all under a coordinate transformation. Such quantities are said to be invari-

ants.


Notice that an element of a vector, such as x^1, is not a scalar, as it does transform nontrivially under Equation 1.2.8.

We will see that the most common example of a scalar is the length of a vector:

    s² = Σ_{i=1}^N (x^i)² ≡ Σ_{i=1}^N x_i x^i   (1.2.9)

Exercise: Prove s is an invariant quantity under translations and rotations. See Section

1.3 for details.

1.2.2 Rank 1: Vectors

The next type of tensor is rank-1, or vector. This is a tensor with only one index. The

most obvious example of a vector is the coordinates themselves, which by deﬁnition satisfy

Equation 1.2.8. There are other examples as well. Some common examples of vectors that appear in physics are the momentum vector p⃗, the electric field E⃗, and the force F⃗. There are many others.

1.2.3 The General Case

It is a relatively straightforward project to extend the notion of a rank-1 tensor to any rank:

just keep going! For example, a rank-2 tensor is an object with two indices (T^{ij}) which transforms the following way:

    T^{i′j′} = Σ_{k,l=1}^N A^{i′}_{.k} A^{j′}_{.l} T^{kl}   (1.2.10)

An example of a rank-2 tensor is the moment of inertia from classical rigid-body mechanics.

A tensor of rank-3 would have three A’s in the transformation law, etc. Thus we can deﬁne

tensors of any rank by insisting that they obey the proper transformation law in each index.

1.3 Upstairs, Downstairs: Contravariant vs Covariant

We have put a great deal of emphasis on whether indices belong as superscripts or as sub-

scripts. In Euclidean space, where we are all used to working, the diﬀerence is less important

(although see below), but in general, the diﬀerence is huge. Now we will make this quanti-

tative. For simplicity, I’ll only consider rank-1 tensors (1 index), but the generalization is

very straightforward.

Recall that we deﬁned a vector as an object which transformed according to Equation

1.2.8. There the index was a superscript. We will call vectors of that form contravariant

vectors. Now we want to deﬁne a vector with an index as a subscript, and we will do it

using Equation 1.2.9.

In any reasonable world, the length of a vector should have nothing to do with the

coordinate system you are using (the magnitude of a vector does not depend on the location


of the origin, for example). So we will deﬁne a vector with a lower index as an object which

leaves the length invariant:

    Σ_{i=1}^N x_i x^i = s² = Σ_{i′=1}^N (Σ_{k=1}^N A^{i′}_{.k} x^k)(Σ_{l=1}^N x_l B^l_{.i′}) = Σ_{k,l=1}^N [Σ_{i′=1}^N B^l_{.i′} A^{i′}_{.k}] x^k x_l   (1.3.11)

where B^l_{.i′} is the matrix that defines how this new type of vector is supposed to transform. Notice that if the left and right side of this expression are supposed to be equal, we have a constraint on the quantity in brackets:

    Σ_{i′=1}^N B^l_{.i′} A^{i′}_{.k} = δ^l_k   (1.3.12)

where δ^l_k is the Kronecker Delta: +1 if k = l, 0 otherwise. In matrix notation, this reads:

    BA = 1 ⇒ B = A^{−1}   (1.3.13)

So we know what B must be. Reading oﬀ the inverse is quite easy, by comparing with

the deﬁnition of A in Equation 1.2.7:

    B^j_{.i′} = ∂x^j/∂x^{i′} ≡ A^j_{.i′}   (1.3.14)

The last step just came from noting that B is identical to A with the indices switched. So we have defined a new quantity, which we will call a covariant vector. It is just like a contravariant vector except that it transforms in exactly the opposite way:

    x_{i′} = Σ_{j=1}^N x_j A^j_{.i′}   (1.3.15)

There is one more very important point to notice about the contravariant and covariant

vectors. Recall from Equation 1.1.3 that the length of a vector can be written as:

    s² = Σ_{i,j=1}^N g_{ij} x^i x^j

But we also wrote another equation for s² in Equation 1.3.11 when we started defining the covariant vector. By comparing these two equations, we can conclude a very useful relationship between covariant and contravariant vectors:


[Figure 1.1: A graph showing the difference between covariant and contravariant coordinates.]

    x_i = Σ_{j=1}^N g_{ij} x^j   (1.3.16)

    x^j = Σ_{k=1}^N g^{jk} x_k   (1.3.17)

    Σ_{j=1}^N g_{ij} g^{jk} = δ^k_i   (1.3.18)

So we have found that the metric takes contravariant vectors to covariant vectors and

vice versa, and in doing so we have found that the inverse of the metric with lower indices

is the metric with upper indices. This is a very important identity, and will be used many

times.

When working in an orthonormal flat-space coordinate system (so g_{ij} = δ_{ij}), the difference between covariant and contravariant is negligible. We will see in Chapters 3 and 4 some examples of where the difference begins to appear. But even now I can show you a simple nontrivial example. Consider a cartesian-like coordinate system, but now let the coordinate axes cross at an angle of 60 degrees (see Figure 1.1). Now the metric is no longer proportional to the Kronecker delta (compare this to Equation (1.1.4)), and so there will be a difference between covariant and contravariant coordinates. You can see that explicitly in the figure above, where the same point has x^1 < x^2, but x_1 > x_2. So we find that covariant and contravariant are the same only in orthonormal cartesian coordinates! In the more general case, even in flat space, the difference is important.

As another example, consider polar coordinates (r, θ). Even though polar coordinates describe the flat plane, they are not orthonormal coordinates. Specifically, if x^i = (r, θ), then it is true that x_i = (r, r²θ), so here is another example of how covariant and contravariant makes a difference.
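
A short sketch of how this works in practice (Python with NumPy; the point is an arbitrary choice of mine): lowering the index with the polar metric g_ij = diag(1, r²) is exactly where the (r, r²θ) above comes from.

    import numpy as np

    r, theta = 2.0, 0.3
    x_up = np.array([r, theta])       # contravariant components x^i = (r, theta)

    # The flat-plane metric in polar coordinates (ds^2 = dr^2 + r^2 dtheta^2).
    g = np.diag([1.0, r**2])

    # Lowering the index, x_i = g_ij x^j, gives (r, r^2 theta) as claimed.
    x_down = g @ x_up
    assert np.allclose(x_down, [r, r**2 * theta])

    # The inverse metric raises the index back (compare Equation (1.3.18)).
    g_inv = np.linalg.inv(g)          # diag(1, 1/r^2)
    assert np.allclose(g_inv @ x_down, x_up)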


1.4 Mixed Tensors

We have talked about covariant and contravariant tensors. However, by now it should be

clear that we need not stop there. We can construct a tensor that is covariant in one index

and contravariant in another. Such a tensor would be expected to transform as follows:

    T^{i′}_{.j′} = Σ_{k,l=1}^N A^{i′}_{.k} A^l_{.j′} T^k_{.l}   (1.4.19)

Such objects are known as mixed tensors. By now you should be able to construct a tensor

of any character with as many indices as you wish. Generally, a tensor with n upper indices

and l lower indices is called an (n, l) tensor.

1.5 Einstein Notation

The last thing to notice from all the equations so far is that every time we have an index

that is repeated as a superscript and a subscript, we are summing over that index. This is

not a coincidence. In fact, it is generally true that anytime two indices repeat in a monomial,

they are summed over. So let us introduce a new convention:

Anytime an index appears twice (once upstairs, once downstairs) it is to be summed over,

unless speciﬁcally stated otherwise.

This is known as Einstein notation, and will prove very useful in keeping the expressions

simple. It will also provide a quick check: if you get an index repeating, but both subscripts

or superscripts, chances are you made a mistake somewhere.
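
NumPy's einsum mirrors this convention almost literally, which makes it a convenient playground for index gymnastics. A brief illustrative sketch (the arrays are random placeholders of mine):

    import numpy as np

    A = np.random.rand(3, 3)
    x = np.random.rand(3)

    # A repeated index is summed over, exactly as in Einstein notation:
    y = np.einsum('ij,j->i', A, x)    # y^i = A^i_j x^j  (j is summed)
    s = np.einsum('i,i->', x, x)      # s = x_i x^i      (a scalar, no free index)
    T = np.einsum('i,j->ij', x, x)    # outer product: no repeated index, no sum

    assert np.allclose(y, A @ x)
    assert np.isclose(s, np.dot(x, x))
    assert np.allclose(T, np.outer(x, x))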

1.5.1 Another Shortcut

There is another very useful shortcut when writing down derivatives:

    f(x)_{,i} = ∂_i f(x) ≡ ∂f/∂x^i   (1.5.20)

Notice that the derivative with respect to contravariant x is covariant; the proof of this follows from the chain rule and Equation (1.2.6). This is very important, and is made much more explicit in this notation. I will also often use the shorthand:

    ∂² = ∂_i ∂^i = ∇²   (1.5.21)

This is just the Laplacian in arbitrary dimensions¹. It is not hard to show that it is a scalar operator. In other words, for any scalar φ(x), ∂²φ(x) is also a scalar.

¹In Minkowski space, where most of this is relevant for physics, there is an extra minus sign, so ∂² = ∂²/∂t² − ∇², which is just the D'Alembertian (wave equation) operator. I will cover this more in Chapter 3; in this chapter, we will stick to the usual Euclidean space.


1.6 Some Special Tensors

1.6.1 Kronecker Delta

We have already seen the Kronecker Delta in action, but it might be useful to quickly sum up its properties here. This object is a tensor which always carries two indices. It is equal to +1 whenever the two indices are equal, and it is 0 whenever they are not. Notice that in matrix form, the Kronecker Delta is nothing more than the unit matrix in N dimensions:

    [δ^i_j] = diag(1, 1, . . . , 1)   (the N × N unit matrix)

Notice that the Kronecker delta is naturally a (1, 1) tensor.

1.6.2 Levi-Civita Symbol

The Levi-Civita symbol (ε) is much more confusing than the Kronecker Delta, and is the source of much confusion among students who have never seen it before. Before we dive into its properties, let me just mention that no matter how complicated the Levi-Civita symbol is, life would be close to unbearable if it wasn't there! In fact, it wasn't until Levi-Civita published his work on tensor analysis that Albert Einstein was able to complete his work on General Relativity - it's that useful!

Recall what it means for a tensor to have a symmetry (for now, let's only consider a rank-2 tensor, but the arguments generalize). A tensor T_{ij} is said to be symmetric if T_{ij} = T_{ji} and antisymmetric if T_{ij} = −T_{ji}. Notice that an antisymmetric tensor cannot have diagonal elements, because we have the equation T_{ii} = −T_{ii} ⇒ T_{ii} = 0 (no sum). In general, tensors will not have either of these properties; but in many cases, you might want to construct a tensor that has a symmetry of one form or another. This is where the power of the Levi-Civita symbol comes into play. Let us start with a definition:

The Levi-Civita symbol in N dimensions is a tensor with N indices such that it equals +1

if the indices are an even permutation and −1 if the indices are an odd permutation,

and zero otherwise.

We will talk about permutations in more detail in a little while. For now we will simply say ε_{12⋯N} = +1 if all the numbers are in ascending order. The Levi-Civita symbol is a completely antisymmetric tensor, i.e.: whenever you switch two adjacent indices you gain a minus sign. This means that whenever you have two indices repeating, it must be zero (see the above talk on antisymmetric tensors).

That was a lot of words; let’s do some concrete examples. We will consider the three

most important cases (N=2,3,4). Notice that for N=1, the Levi-Civita symbol is rather silly!


N=2: ε_{ij}

For the case of two dimensions, the Levi-Civita symbol has two indices, and we can write out very easily what it looks like:

    ε_{12} = +1    ε_{21} = −1

N=3: ε_{ijk}

In three dimensions, it becomes a little harder to write out explicitly what the Levi-Civita symbol looks like, but it is no more conceptually difficult than before. The key equation to remember is:

    ε_{123} = +1   (1.6.22)

What if I write down ε_{213}? Notice that I recover ε_{123} by switching the first two indices, but owing to the antisymmetric nature, I must include a minus sign, so: ε_{213} = −1. There are six possible combinations of indices. Can you decide which ones are +1 (even) and which are −1 (odd)? Here's the answer:

    ε_{123} = ε_{231} = ε_{312} = +1
    ε_{321} = ε_{213} = ε_{132} = −1

N=4: ε_{ijkl}

I am not going to do N=4 explicitly, but hopefully by now you feel comfortable enough to figure it out on your own. Just remember, you always start with an ascending number sequence, so:

    ε_{1234} = +1

and just take it from there! For example: ε_{2143} = −ε_{2134} = +ε_{1234} = +1.

The Levi-Civita symbol comes back again and again whenever you have tensors, so it is

a very important thing to understand. Remember: if ever you get confused, just write down

the symbol in ascending order, set it equal to +1, and start ﬂipping (adjacent) indices until

you ﬁnd what you need. That’s all there is to it!

Finally, I’d like to point out a small inconsistency here. Above I mentioned that you

want to sum over indices when one is covariant and one is contravariant. But often we relax

that rule with the Levi-Civita symbol and always write it with lower indices no matter what.

There are some exceptions to this rule when dealing with curved space (General Relativity),

but I will not discuss that here. From now on, I will always put the Levi-Civita indices

downstairs - if an index is repeated, it is still summed over.


1.6.3 Using the Levi-Civita Symbol

Useful Relations

When performing manipulations using the Levi-Civita symbol, you often come across prod-

ucts and contractions of indices. There are two useful formulas that are good to remember:

    ε_{ijk} ε_{ijl} = 2δ_{kl}   (1.6.23)

    ε_{ijk} ε_{ilm} = δ^{jk}_{lm} ≡ δ_{lj} δ_{mk} − δ_{lk} δ_{mj}   (1.6.24)

Exercise: Prove these formulas.

Cross Products

Later, in Chapter 5, we will discuss how to take a generalized cross-product in n-dimensions.

For now, recall that a cross product only makes sense in three dimensions. There is a

beautiful formula for the cross product using the Levi-Civita symbol:

    (A⃗ × B⃗)_i = ε_{ijk} A_j B_k   (1.6.25)

Using this equation, and Equations (1.6.23) and (1.6.24), you can derive all the vector

identities on the cover of Jackson, such as the double cross product. Try a few to get the

hang of it. That’s a sure-ﬁre way to be sure you understand index manipulations.
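
One way to get comfortable with these manipulations is to check them by machine. Here is a sketch (Python with NumPy; the vectors are arbitrary placeholders of mine) that builds ε_{ijk} from permutation signs and verifies Equation (1.6.25) and the contraction identity (1.6.23):

    import numpy as np
    from itertools import permutations

    # Build the three-index Levi-Civita symbol: +1 on even permutations,
    # -1 on odd ones, 0 whenever an index repeats (those entries stay zero).
    eps = np.zeros((3, 3, 3))
    for p in permutations(range(3)):
        # The determinant of a permutation matrix is the signature sgn(p).
        eps[p] = np.linalg.det(np.eye(3)[list(p)])

    A = np.array([1.0, 2.0, 3.0])
    B = np.array([-1.0, 0.5, 2.0])

    # (A x B)_i = eps_ijk A_j B_k, Equation (1.6.25).
    cross = np.einsum('ijk,j,k->i', eps, A, B)
    assert np.allclose(cross, np.cross(A, B))

    # Contraction identity (1.6.23): eps_ijk eps_ijl = 2 delta_kl.
    assert np.allclose(np.einsum('ijk,ijl->kl', eps, eps), 2 * np.eye(3))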

As an example, let's prove a differential identity:

    [∇ × (∇ × A⃗)]_i = ε_{ijk} ∂_j (ε_{klm} ∂_l A_m)
                    = ε_{kij} ε_{klm} ∂_j ∂_l A_m
                    = [δ_{il} δ_{jm} − δ_{im} δ_{jl}] ∂_j ∂_l A_m
                    = ∂_i ∂_m A_m − ∂² A_i
                    = [∇(∇ · A⃗) − ∇² A⃗]_i

Exercise: Prove that you can interchange dot and cross products: A⃗ · (B⃗ × C⃗) = (A⃗ × B⃗) · C⃗.

1.7 Permutations

A permutation is just a shuﬄing of objects. Rigorously, consider a set of objects (called

S) and consider an automorphism σ : S → S (one-to-one and onto function). Then σ is a

permutation. It takes an element of S and sends it uniquely to another element (possibly

the same element) of S.

One can see very easily that the set of all permutations of a (finite) set S is a group with multiplication defined as function composition. We will talk more about groups in the next chapter. If the set S has n elements, this group of permutations is denoted S_n, and is often called the Symmetric group. It has n! elements. The proof of this is a straightforward application of group and number theory, and I will not discuss it here.

The theory of symmetric groups is truly deep and of fundamental importance in mathematics, but it also has much application in other branches of science as well. For that reason, let me give a very brief overview of the theory. For more information, I recommend your favorite abstract algebra textbook.


Specific permutations are usually written as a row of numbers. It is certainly true (and is rigorously provable for the anal retentive!) that any finite set of order n (i.e.: with n elements) can be represented by the set of natural numbers between 1 and n. Then WLOG² we can always represent our set S in this way. Then a permutation might look like this:

    σ = (1234)

This permutation sends 1 → 2, 2 → 3, 3 → 4, 4 → 1. This is in general how it always works. If our set has four elements in it, then this permutation touches all of them. If it has more than four elements, it leaves any others alone (for example, 5 → 5).

Working in S₅ for the moment, we can write down the product of two permutations by composition. Recall that composition of functions always goes rightmost function first. This is very important here, since the multiplication of permutations is almost never commutative! Let's see an example:

    (1234)(1532) = (154)

Where did this come from? Start with the first number in the rightmost permutation (1). Then ask, what does 1 go to? Answer: 5. Now plug 5 into the second permutation from the right, where 5 goes to 5. Net result: 1 → 5. Now repeat the same procedure on 5 to get 5 → 4, and so on until you have completed the cycle (quick check: does 2 → 2, 3 → 3?). Multiplying three permutations is just as simple, but requires three steps instead of two.

A permutation of one number is trivial, so we don’t worry about it. A transposition

is a permutation of two numbers. For example: (12) sends 1 → 2,2 → 1. It is a truly

amazing theorem in group theory that any permutation (no matter what it is) can be written

as a product of transpositions. Furthermore, this decomposition is unique up to trivial

permutations. Can you write down the decomposition of the above product³?

Armed with that knowledge, we make a deﬁnition:

Definition: If a permutation can be written as a product of an even number of transpositions, it is called an even permutation. If it can be written as an odd number of transpositions, it is called an odd permutation. We define the signature of a permutation in the following way:

    sgn(σ) = +1 if σ is even, −1 if σ is odd   (1.7.26)

It is pretty obvious that we have the following relationships:

(even)(even) = (even)

(odd)(odd) = (even)

(odd)(even) = (even)(odd) = (odd)

²Without Loss Of Generality
³Answer: (15)(45). Notice that this is not the same as (45)(15).


What this tells us is that the set of even permutations is closed, and hence forms a subgroup, called the Alternating Subgroup (denoted A_n) of order n!/2. Again, I omit details of the proof that this is indeed a subgroup, although it's pretty easy. Notice that the odd elements do not have this property. The alternating subgroup is a very powerful tool in mathematics.

For an application of permutations, a good place to look is index manipulation (hence why I put it in this chapter!). Sometimes, you might want to sum over all combinations of indices. For example, the general formula for the determinant of an n × n matrix can be written as:

    det A = Σ_{σ∈S_n} sgn(σ) a_{1σ(1)} a_{2σ(2)} ⋯ a_{nσ(n)}   (1.7.27)
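
As an illustration, Equation (1.7.27) can be transcribed almost verbatim into code. This is a sketch of mine (Python; perm_sign counts inversions, which is one standard way to obtain the signature of Equation (1.7.26)):

    import numpy as np
    from itertools import permutations

    def perm_sign(p):
        """sgn(p): +1 for an even permutation, -1 for an odd one,
        computed from the parity of the number of inversions."""
        inversions = sum(p[i] > p[j]
                         for i in range(len(p)) for j in range(i + 1, len(p)))
        return -1 if inversions % 2 else +1

    def det_by_permutations(a):
        """det A = sum over sigma in S_n of sgn(sigma) a_{1 sigma(1)} ... a_{n sigma(n)}."""
        n = a.shape[0]
        return sum(perm_sign(p) * np.prod([a[i, p[i]] for i in range(n)])
                   for p in permutations(range(n)))

    A = np.random.rand(4, 4)
    assert np.isclose(det_by_permutations(A), np.linalg.det(A))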

1.8 Constructing and Manipulating Tensors

To ﬁnish this chapter, I will review a few more manipulations that provide ways to construct

tensors from other tensors.

1.8.1 Constructing Tensors

Consider two tensors of rank k (K) and l (L) respectively. We want to construct a new tensor

(T) from these two tensors. There are three immediate ways to do it:

1. Inner Product: We can contract one of the indices from K and one from L to form a tensor T of rank k + l − 2. For example, if K is rank 2 and L is rank 3, then we can contract, say, the first index of both of them to form T with rank 3:

    T_{µνρ} = K^α_{.µ} L_{ανρ}

We could construct other tensors as well if we wished by contracting other indices. We could also use the metric tensor or the Levi-Civita tensor if we wished to form symmetric or antisymmetric combinations of tensors. For example, if A, B are vectors in 4 dimensions:

    T^{µν} = ε^{µνρσ} A_ρ B_σ ⇒ T^{12} = A_3 B_4 − A_4 B_3, etc.

2. Outer Product: We can always construct a tensor by combining two tensors without contracting any indices. For example, if A and B are vectors, we can construct: T^{ij} = A^i B^j. Because we are using a Cartesian coordinate system, we call tensors of this form Cartesian Tensors. When combining two vectors in this way, we say that we have constructed a Dyadic.

3. Symmetrization: Consider the outer product of two vectors A_i B_j. Then we can force the tensor to be symmetric (or antisymmetric) in its indices by construction:

    A_{(i} B_{j)} ≡ (1/2)(A_i B_j + A_j B_i)   (1.8.28)

    A_{[i} B_{j]} ≡ (1/2)(A_i B_j − A_j B_i)   (1.8.29)

Circular brackets about indices mean symmetric combinations, and square brackets about indices mean antisymmetric combinations. You can do this to as many indices as you want in an outer product (for n indices, replace 1/2 → 1/n!).

Dyadics provide a good example of how combining tensors can be more involved than it

ﬁrst seems. Consider the dyadic we have just constructed. Then we can write it down in a

fancy way by adding and subtracting terms:

    T_{ij} = (1/N)(T^k_k)δ_{ij} + (1/2)(T_{ij} − T_{ji}) + [(1/2)(T_{ij} + T_{ji}) − (1/N)(T^k_k)δ_{ij}]   (1.8.30)

N is the dimension of A and B, and the sum over k in the first and last terms is implied by Einstein notation. Notice that I have not changed the tensor at all, because any term I added I also subtracted. However, I have decomposed the tensor into three different pieces. The first piece is just a scalar (times the identity matrix), and therefore is invariant under rotations. The next term is an antisymmetric tensor, and (in 3D) can be written as a (pseudo)vector - it's a cross product (see Equation (1.6.25)). Finally, the third piece is a symmetric, traceless tensor.

Let's discuss the matter concretely by letting N = 3 and counting degrees of freedom. If A and B are 3-vectors, then T_{ij} has 3 × 3 = 9 components. Of the three pieces, the first is a scalar (1), the second is a 3-vector (3) and the third is a rank-2 symmetric tensor (6) that is traceless (−1); so we have decomposed the tensor according to the following prescription:

    3 ⊗ 3 = 1 ⊕ 3 ⊕ 5

We say that the dyadic form T_{ij} is a reducible Cartesian tensor, and it reduces into 3 irreducible spherical tensors, of rank 0, 1 and 2 respectively.
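
A quick numerical check of this decomposition (a Python sketch; the vectors are random placeholders of mine):

    import numpy as np

    A = np.random.rand(3)
    B = np.random.rand(3)
    T = np.outer(A, B)                # the dyadic T_ij = A_i B_j

    N = 3
    trace_part = (np.trace(T) / N) * np.eye(N)    # scalar piece: 1 component
    antisym    = 0.5 * (T - T.T)                  # (pseudo)vector piece: 3
    sym_tless  = 0.5 * (T + T.T) - trace_part     # symmetric traceless piece: 5

    # The three pieces reassemble T exactly, as in Equation (1.8.30)...
    assert np.allclose(trace_part + antisym + sym_tless, T)
    # ...and the last piece really is traceless: 1 + 3 + 5 = 9 components in all.
    assert np.isclose(np.trace(sym_tless), 0.0)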

1.8.2 Kronecker Products and Sums

With that introduction, we can talk about Kronecker products and sums. A Kronecker product is simply an outer product of two tensors of lesser rank. A dyadic is a perfect example of a Kronecker product. Writing it out in a matrix, we have (for A, B rank-1 tensors in 3 dimensions):

    T = A ⊗ B = ( A₁B₁  A₁B₂  A₁B₃ )
                ( A₂B₁  A₂B₂  A₂B₃ )
                ( A₃B₁  A₃B₂  A₃B₃ )   (1.8.31)

A Kronecker sum is the concatenated sum of matrices. You can write the Kronecker sum in matrix form using a block-diagonal matrix:


    T = C ⊕ D ⊕ E = ( C  0  0 )
                    ( 0  D  0 )
                    ( 0  0  E )   (1.8.32)

where C is an m × m matrix, D is an n × n matrix, E is an l × l matrix, and "0" just fills in any leftover entries with zeros. Notice that the final matrix is an (m+n+l) × (m+n+l) matrix.
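
In code these two constructions are one-liners; a brief sketch (assuming NumPy and SciPy are available; the matrices are random placeholders of mine):

    import numpy as np
    from scipy.linalg import block_diag

    A = np.random.rand(3)
    B = np.random.rand(3)

    # Kronecker (outer) product of two rank-1 tensors: the 3 x 3 dyadic of (1.8.31).
    T = np.kron(A.reshape(3, 1), B.reshape(1, 3))
    assert np.allclose(T, np.outer(A, B))

    # Kronecker sum as a block-diagonal matrix, as in Equation (1.8.32).
    C = np.random.rand(1, 1)
    D = np.random.rand(3, 3)
    E = np.random.rand(5, 5)
    S = block_diag(C, D, E)
    assert S.shape == (9, 9)          # a (1+3+5) x (1+3+5) matrix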

At this point, we can rewrite our equation for the dyadic:

    A ⊗ B ∼ C ⊕ D ⊕ E   (1.8.33)

where A ⊗ B is a 3 × 3 matrix, C is 1 × 1, D is 3 × 3 and E is 5 × 5. Notice that the matrix dimensions are not the same on both sides of the equation - these are not the same matrix! However, they each have the same number of independent quantities, and both represent the same tensor T. For this reason, the study of finding different dimensional matrices for T goes under the heading of representation theory. It is a very active part of modern algebraic research, and is very useful in many areas of science.

Why would one ever want to do this? The answer is that when working with dyadics in

practice, very often one is only interested in the part of the tensor that transforms the same

way. Notice that when you rotate T, you never mix terms between C, D and E. This kind of

relationship is by no means as obvious when written in its Cartesian form. For that reason,

it is often very useful to work with the separate spherical tensors rather than with the entire

Cartesian tensor. The best example is in Quantum Mechanics, when you are talking about

angular momentum. See Sakurai for more details.


Chapter 2

Transformations and Symmetries

2.1 Symmetries and Groups

Here we will discuss some properties of groups in general that you need to know. Then

we will progress to discuss a very special and important kind of group, called a Lie Group.

Finally we will look at some examples.

• Definition: A group G is a set with an operator called multiplication that obeys four axioms: Closure (g₁, g₂ ∈ G ⇒ g₁g₂ ∈ G), Associativity ((g₁g₂)g₃ = g₁(g₂g₃)), an identity (generically called e, such that ge = eg = g), and inverses (g has inverse g⁻¹ such that gg⁻¹ = g⁻¹g = e). If the group is commutative (g₁g₂ = g₂g₁), it is said to be an Abelian group; otherwise it is Non-Abelian.

• Discrete groups represent operations that either do or do not take place. Continuous

groups represent operations that diﬀer inﬁnitesimally from each other.

• Elements in continuous groups can be labeled by a set of continuous parameters. If

the set of parameters is ﬁnite, the group is said to be a ﬁnite group. If the ranges of

the parameters are closed and bounded, the group is said to be a compact group.

• Definition: A subgroup is a subset of a larger group that is also a group. A subgroup N ⊂ G is a normal subgroup (also called invariant subgroup) if ∀s_i ∈ N, g⁻¹s_i g ∈ N ∀g ∈ G. In other words, elements of N stay in N after being conjugated by elements of G. We write N ◁ G. There are several definitions for invariant subgroups, but they are all equivalent to this one.

• Consider two groups that commute with each other (G, H). Then you can construct the direct product group in the following way: G × H ≡ {(g_j h_k) | (g_j h_k)(g_l h_m) = (g_j g_l h_k h_m)}. Be careful not to confuse a direct product group with a Kronecker product of tensors.

• Fact: G and H are invariant subgroups of the direct product: G, H ◁ G × H.

• Deﬁnition: If a group has no invariant subgroups, it is called a simple group.


• Definition: Suppose there is a square n × n matrix M that corresponds to each element of a group G, where g₁g₂ = g₃ ⇒ M(g₁)M(g₂) = M(g₃) and M(g⁻¹) = [M(g)]⁻¹. Then M is called a representation of the group G, and n is called the dimension of the representation. If this identification is one-to-one, the representation is called "faithful". An example of an unfaithful representation is one where all elements of a group G go to 1.

• Deﬁnition: A representation is reducible if each matrix can be expressed in block-

diagonal form. Using the Kronecker sum notation, we can rephrase this deﬁnition:

∃A, B such that M = A ⊕ B. A and B are also valid representations of the group.

Furthermore, they do not talk to each other (think angular momentum in quantum

mechanics). If a representation is not reducible, it is said to be irreducible.

• Consider an n-tuple ξ_a, not necessarily a vector in the sense of Chapter 1. Then the group element g in an n-dimensional representation M acts on it in the following way:

    ξ′_a = M(g)_a^{.b} ξ_b

(recall Einstein notation). Since the matrix M must be nonsingular (otherwise there would be no g⁻¹ and it wouldn't be a group), ∃ ξ̄^a such that ξ̄′^a = ξ̄^b [M(g)⁻¹]_b^{.a}. ξ̄ is called the adjoint of ξ.

• Deﬁnition: The smallest (lowest dimension) nontrivial irreducible faithful representa-

tion is called the fundamental representation. We will see some examples of this

shortly. Here, the “trivial” representation is the one-dimensional representation.

• Deﬁnition: A Lie group is a continuous group that is also a manifold whose mul-

tiplication and inverse operations are diﬀerentiable. It is named after Sophus Lie,

the mathematician who ﬁrst discovered their properties. This deﬁnition is quite a

mouthful- don’t worry too much about it if you’re not a theorist! Things will become

a lot simpler once we get to some examples. But before that, here are some important

results you should know:

• Theorem: All compact Lie groups have unitary representations. In other words: M(ε) = e^{iε_j F_j}, j = 1, . . . , N, with the F_j matrices¹. The F_j are called generators. From this we can see that M† = M⁻¹, as long as F† = F, so we require that the generators be Hermitian. Warning: the index on F is not a matrix index - F_j = (F_j)_{ab}.

• Many (though not all) Lie groups have the additional property that det M(ε) = +1 ∀ε. From the above result, this implies Tr(F_i) = 0; this can be proved with the help of the very useful fact

    det A = e^{Tr log A}

This is called the special condition.

¹Recall the definition for the exponential of a matrix: e^A = Σ_{n=0}^∞ A^n/n!


• When ε is small, we can expand the exponential to get:

    M(ε) = I + iε_j F_j + O(ε²)

This is exactly the same as an infinitesimal transformation, and so we say that F_j generates infinitesimal transformations.

• Theorem: The local structure of any Lie group is completely specified by a set of commutation relations among the generators:

    [F_i, F_j] = i f_{ijk} F_k   (2.1.1)

(the sum over k is understood). These commutators specify an algebra spanned by the generators², and the f_{ijk} are called the structure constants of the Lie group. They are the same for any representation of the group.

• Theorem: f_{ijl} f_{lkm} + f_{kil} f_{ljm} + f_{jkl} f_{lim} = 0. The proof follows from Jacobi's identity for commutators.

• In general, the structure constants are antisymmetric in their indices.

• Notice that we can rearrange the above version of the Jacobi identity and use the antisymmetry to rewrite it as:

    (f_i)_{jl} (f_k)_{lm} − (f_k)_{jl} (f_i)_{lm} = −f_{ikl} (f_l)_{jm}   (2.1.2)

where I have just separated the indices in a convenient way, but nothing more. Then if I define a matrix with the elements (F_i)_{jk} ≡ −i f_{ijk}, I have reproduced Equation (2.1.1). Therefore, this is a valid representation! It is so important, it is given a special name - it is called the regular adjoint representation.

• Definition: The rank of a Lie group is the number of simultaneously diagonalizable generators; that is, the number of generators that commute among themselves. This set of generators forms a subalgebra called the Cartan Subalgebra. The rank of the group is the dimension of this subalgebra.

• Definition: C^{(n)} ≡ f_{i₁j₁j₂} f_{i₂j₂j₃} ⋯ f_{i_n j_n j₁} F_{i₁} F_{i₂} ⋯ F_{i_n} are called the Casimir operators (notice that all indices are repeated and therefore summed via the Einstein convention).

• Casimir operators commute with every generator; hence they must be multiples of the

identity. This means that they represent conserved quantities. It is also true that the

number of Casimir operators is equal to the rank of the group.

That’s enough general stuﬀ. Let’s look at some very important examples:

²An algebra is a vector space with a vector product, where in this case the basis of this vector space is the set of generators (see Chapter 5). This algebra is called a Lie Algebra.


2.1.1 SU(2) - Spin

• Consider a complex spinor

    ξ = ( ξ₁ )
        ( ξ₂ )

and the set of all transformations which leave the norm of the spinor invariant; i.e.:

    s² = |ξ₁|² + |ξ₂|²

is invariant under all transformations. We will also require det M = 1. From the above, we know that the generators (call them σ_i) must be linearly independent, traceless, Hermitian 2 × 2 matrices. Let us also include a normalization condition:

    Tr(σ_i σ_j) = 2δ_{ij}

Then the matrices we need are the Pauli matrices:

    σ₁ = ( 0 1 )    σ₂ = ( 0 −i )    σ₃ = ( 1  0 )
         ( 1 0 )         ( i  0 )         ( 0 −1 )   (2.1.3)

Then we will choose the group generators to be F_i = σ_i/2. The group generated by these matrices is called the Special Unitary group of dimension 2, or SU(2). In the fundamental representation, it is the group of 2 × 2 unitary matrices of unit determinant.

• Recall from quantum mechanics that the structure constants are simply given by the Levi-Civita tensor:

    f_{ijk} = ε_{ijk}

(A numerical check of this algebra appears at the end of this subsection.)

• Only σ₃ is diagonal, and therefore the rank of SU(2) is 1. Then there is only one Casimir operator, and it is given by:

    C^{(2)} = F₁² + F₂² + F₃²

• If we remove the Special condition (unit determinant), then the condition that Tr σ_i = 0 no longer need hold. Then there is one more matrix that satisfies all other conditions and is still linearly independent from the others - the identity matrix:

    σ₀ = ( 1 0 )
         ( 0 1 )

Notice, however, that this is invariant:

    e^{iε_j F_j} e^{iε₀F₀} e^{−iε_j F_j} = e^{iε₀F₀}   (2.1.4)

Therefore, the group generated by e^{iε₀F₀} is an invariant subgroup of the whole group. We call it U(1). That it is a normal subgroup should not surprise you, since it is true that e^{iε₀F₀} = (e^{iε₀/2})σ₀, which is a number times the identity matrix.


• The whole group (without the special) is called U(2). We have found U(1) ◁ U(2), SU(2) ◁ U(2), and we can write:

    U(2) = SU(2) × U(1)   (2.1.5)

This turns out to be true even in higher-dimensional cases:

    U(N) = SU(N) × U(1)

• Summarizing: R = e^{i(θ/2)(n̂·σ⃗)} is the general form of an element of SU(2) in the fundamental representation. In this representation, C^{(2)} = 2, as you might expect if you followed the analogy from spin in quantum mechanics.

• You know from quantum mechanics that there are other representations of spin (with

diﬀerent dimensions) that you can use, depending on what you want to describe - this

change in representation is nothing more than calculating Clebsch-Gordan coeﬃcients!

• Example: what if you want to combine two spin-1/2 particles? Then you know that you get (3) spin-1 states and (1) spin-0 state. We can rewrite this in terms of representation theory: we'll call the fundamental representation 2, since it is two-dimensional. The spin-1 is three-dimensional and the spin-0 is one-dimensional, so we denote them by 3 and 1 respectively. Then we can write the equation:

    2 ⊗ 2 = 3 ⊕ 1   (2.1.6)

(Notice once again that the left-hand side is built from 2 × 2 matrices while the right-hand side is a 4 × 4 matrix, but both sides only have four independent components.) Equation (2.1.6) is just saying that combining two spin-1/2 particles gives a spin-1 and a spin-0 state, only using the language of groups. In this language, however, it becomes clear that the four-dimensional representation is reducible into a three-dimensional and a one-dimensional representation. So all matrix elements can be written as a Kronecker sum of these two representations. Believe it or not, you already learned this when you wrote down the angular momentum matrices (recall they are block-diagonal). This also shows that you do not need to consider the entire (infinitely huge) angular momentum matrix, but only the invariant subgroup that is relevant to the problem. In other words, the spin-1 and spin-0 do not talk to each other!
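
Since these statements are easy to check by machine, here is a small sketch (Python with NumPy/SciPy; the angle and axis are arbitrary choices of mine) verifying the algebra [F_i, F_j] = iε_{ijk}F_k and the unitary, unit-determinant properties of R = e^{i(θ/2)(n̂·σ⃗)}:

    import numpy as np
    from scipy.linalg import expm

    # Pauli matrices, Equation (2.1.3), and the generators F_i = sigma_i / 2.
    sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
             np.array([[0, -1j], [1j, 0]]),
             np.array([[1, 0], [0, -1]], dtype=complex)]
    F = [s / 2 for s in sigma]

    def comm(a, b):
        return a @ b - b @ a

    # The structure constants are f_ijk = eps_ijk: [F_1, F_2] = i F_3, cyclically.
    assert np.allclose(comm(F[0], F[1]), 1j * F[2])
    assert np.allclose(comm(F[1], F[2]), 1j * F[0])
    assert np.allclose(comm(F[2], F[0]), 1j * F[1])

    # A group element R = exp(i (theta/2) n.sigma) with an arbitrary angle/axis.
    theta = 0.7
    n = np.array([0.0, 0.6, 0.8])                    # a unit vector
    R = expm(1j * (theta / 2) * sum(ni * si for ni, si in zip(n, sigma)))
    assert np.allclose(R @ R.conj().T, np.eye(2))    # unitary
    assert np.isclose(np.linalg.det(R), 1.0)         # special (unit determinant)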

2.1.2 SO(3,1) - Lorentz (Poincaré) Group

Much of this information is in Jackson, Chapter 11, but good ol' JD doesn't always make things very clear, so let me try to summarize.

• The Poincaré Group is the set of all transformations that leave the metric of flat space-time invariant (see Chapter 4). The metric in Minkowski (flat) space is given by:

    ds² = c²dt² − dr²


• Intuitively we know that this metric is invariant under parity (space reversal), time reversal, spatial rotations, space and time translations, and Lorentz transformations. Then we know that these operations should be represented in the Poincaré group.

• Ignore translations for now. You can prove to yourself that the matrices that represent the remaining operators must be orthogonal (Oᵀ = O⁻¹); the group of n × n orthogonal matrices is usually denoted by O(n). In this case, n = 4; however, there is a minus sign in the metric that changes things somewhat. Since three components have one sign, and one component has another, we usually denote this group as O(3,1), and we call it the Lorentz group.

• The transformations we are interested in are the proper orthochronous transformations, i.e.: the ones that keep causality and parity preserved. This is accomplished by imposing the "special" condition (det O = +1). However, this is not enough: although this forbids parity and time-reversal, it does not forbid their product (prove this!). So we consider the Proper Orthochronous Lorentz Group, or SO(3,1)₊, where the "orthochronous" condition (subscript "+" sign) is satisfied by insisting that the (00) component of any transformation is always positive; that along with the special condition is enough to isolate the desired subgroup. As this is quite a mouthful, we usually just refer to SO(3,1) as the Lorentz group, even though it's really only a subgroup.

• We can now go back to Chapter 1 and rephrase the definition of a tensor in flat space-time as an object that transforms the same way as the coordinates under an element of the Lorentz group.

• The Lorentz group has six degrees of freedom (six parameters, six generators). For a proof, see Jackson. We can interpret these six parameters as three spatial rotations and three boosts, one about each direction. We will denote the rotation matrix generators as S_i, and the boost generators as K_i, where i = 1, 2, 3. These matrices are:

    S₁ = ( 0 0 0  0 )    S₂ = ( 0  0 0 0 )    S₃ = ( 0 0  0 0 )
         ( 0 0 0  0 )         ( 0  0 0 1 )         ( 0 0 −1 0 )
         ( 0 0 0 −1 )         ( 0  0 0 0 )         ( 0 1  0 0 )
         ( 0 0 1  0 )         ( 0 −1 0 0 )         ( 0 0  0 0 )

    K₁ = ( 0 1 0 0 )     K₂ = ( 0 0 1 0 )     K₃ = ( 0 0 0 1 )
         ( 1 0 0 0 )          ( 0 0 0 0 )          ( 0 0 0 0 )
         ( 0 0 0 0 )          ( 1 0 0 0 )          ( 0 0 0 0 )
         ( 0 0 0 0 )          ( 0 0 0 0 )          ( 1 0 0 0 )

• In general, we can write any element of the Lorentz group as e^{−[θ⃗·S⃗ + η⃗·K⃗]}, where θ⃗ are the three rotation angles, and η⃗ are the three boost parameters, called the rapidity. You should check to see that this exponential does in fact give you a rotation or Lorentz transformation when choosing the proper variables. See Jackson for the most general transformation.


• The boost velocity is related to the rapidity by the equation:

    β_i ≡ tanh η_i   (2.1.7)

• Exercise: what is the matrix corresponding to a boost in the x̂-direction? (A numerical check appears at the end of this subsection.) Answer:

    ( cosh η  sinh η  0  0 )     ( γ   γβ  0  0 )
    ( sinh η  cosh η  0  0 )  =  ( γβ  γ   0  0 )
    (      0       0  1  0 )     ( 0   0   1  0 )
    (      0       0  0  1 )     ( 0   0   0  1 )

where γ = (1 − β²)^{−1/2}. This is what you're used to seeing in Griffiths!

• Notice that you can always boost along a single direction by combining rotations with

the boosts. This is certainly legal by the nature of groups!

• The angles are bounded: 0 ≤ θ ≤ 2π, but the rapidities are not: 0 ≤ η < ∞. The

Lorentz group is not a compact group! And therefore it does not have faithful unitary

representations. This is a big problem in Quantum Mechanics and comes into play

when trying to quantize gravity.

• The Lorentz algebra is specified by the commutation relations:

    [S_i, S_j] = ε_{ijk} S_k
    [S_i, K_j] = ε_{ijk} K_k
    [K_i, K_j] = −ε_{ijk} S_k

Some points to notice about these commutation relations:

1. The rotation generators form a closed subalgebra (and hence generate a closed subgroup) - it is simply the set of all rotations, SO(3). It is not a normal subgroup, however, because of the second commutator.

2. A rotation followed by a boost is a boost, as I mentioned above.

3. Two boosts in different directions correspond to a rotation about the third direction.

4. The minus sign in the last commutator is what makes this group the noncompact SO(3,1), as opposed to the compact SO(4), which has a + sign.

• Notice that you could have constructed the generators for the Lorentz group in a very similar way to how you construct generators of the regular rotation group (angular momentum):

    J_{µν} = x ∧ p = i(x_µ ∂_ν − x_ν ∂_µ)   (2.1.8)


where i, j = 1, 2, 3, µ, ν = 0, 1, 2, 3 and you can think of the wedge product as a generalization of the cross product (see Chapter 5). In this notation, J_{0i} ≡ K_i and J_{ij} ≡ ε_{ijk} S_k. These six matrices are equivalent to the six above ones. Notice that, as before, the Greek indices are not matrix indices.
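
As promised in the boost exercise above, here is a numerical sketch (Python with NumPy/SciPy; the rapidity is an arbitrary choice of mine, and I exponentiate +ηK₁ — the overall sign convention in the exponent varies between references):

    import numpy as np
    from scipy.linalg import expm

    # Boost generator K_1 from the list above (x-direction).
    K1 = np.zeros((4, 4))
    K1[0, 1] = K1[1, 0] = 1.0

    eta = 0.5                           # an arbitrary rapidity
    beta = np.tanh(eta)                 # Equation (2.1.7)
    gamma = 1.0 / np.sqrt(1.0 - beta**2)

    # Exponentiating the generator gives the finite boost, which is exactly
    # the cosh/sinh matrix of the exercise (gamma = cosh eta, gamma*beta = sinh eta).
    boost = expm(eta * K1)
    expected = np.eye(4)
    expected[:2, :2] = [[gamma, gamma * beta], [gamma * beta, gamma]]
    assert np.allclose(boost, expected)

    # The boost preserves the Minkowski metric: Lambda^T g Lambda = g.
    g = np.diag([1.0, -1.0, -1.0, -1.0])
    assert np.allclose(boost.T @ g @ boost, g)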

2.2 Transformations

• We wish to understand how transformations aﬀect the equations of motion. The most

important kinds of transformations are translations and rotations. These are called

proper transformations. Other transformations might include parity, time reversal

and charge inversion. We also might be interested in internal symmetries such as gauge

transformations in E&M, for example.

• When discussing the theory of transformations, it is useful to consider how objects transform under a very small change; such changes are called infinitesimal transformations. Let's consider these first. In what follows, I will be suppressing indices (to make the notation simpler).

• We want to invoke a transformation on a coordinate: x → x + δx. Then we would

expect a change in any function of x: f(x) →f(x) +δf. What is the form of δf? If f

represents a tensor, then it should have the same form as δx by deﬁnition.

• Example: Consider an infinitesimal rotation about a point in the i direction. The transformation will look like:

    x → x + iεJ_i x + O(ε²) = [1 + iεJ_i + O(ε²)]x

where J_i is the generator of the rotation (angular momentum).

• Recall that performing two rotations in the same direction is just multiplying the two rotations together. Then performing N infinitesimal rotations of size ε = a/N, where a is a finite number, and ε is infinitesimal in the limit N → ∞, we have:

    lim_{N→∞} (1 + iaJ_i/N)^N = e^{iaJ_i} ⇒ x → e^{iaJ_i} x   (2.2.9)

Hence we have constructed a finite transformation from an infinite number of infinitesimal ones. Notice that if you expand the exponential and keep only linear terms (for a very small) you reproduce the infinitesimal rotation.
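
The limit in Equation (2.2.9) is easy to watch converge numerically; a sketch (Python with NumPy/SciPy; the generator here is the 2 × 2 Hermitian matrix σ₂, an arbitrary illustrative choice of mine, so that e^{iaJ} is a real rotation of the plane):

    import numpy as np
    from scipy.linalg import expm

    J = np.array([[0.0, -1.0j], [1.0j, 0.0]])   # Hermitian generator (sigma_2)

    a = 0.9                                      # a finite "rotation angle"
    finite = expm(1j * a * J)

    # Compose N infinitesimal transformations of size epsilon = a/N.
    N = 10000
    step = np.eye(2) + 1j * (a / N) * J          # 1 + i*epsilon*J, to first order
    approx = np.linalg.matrix_power(step, N)

    # The product of many infinitesimal transformations approaches the finite one.
    assert np.allclose(approx, finite, atol=1e-3)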

2.3 Lagrangian Field Theory

Here is a brief overview of the important things you should know about Lagrangian ﬁeld

theory.


• Recall in Lagrangian mechanics, you define a Lagrangian L(q(t), q̇(t)) and an action S = ∫ L dt. Now we wish to describe a Lagrangian density 𝓛(φ(x), ∂_µφ(x)), where L = ∫ 𝓛 d³x, and φ is a field which is a function of spacetime. The action can be written as S = ∫ 𝓛 d⁴x. The action completely specifies the theory.

• The equations of motion can now be found, just as before, by the condition δS = 0. This yields the new Euler-Lagrange equations:

∂ℒ/∂φ − ∂_µ (∂ℒ/∂φ_{,µ}) = 0     (2.3.10)

Notice that there is the sum over the index µ, following Einstein notation.

• Example: ℒ = ½ ∂_µφ ∂^µφ − ½ m²φ² is the Lagrangian for a scalar field (Klein-Gordon field in relativistic quantum mechanics). Can you find the equations of motion? Here's the answer:

(∂² + m²)φ = 0
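If you want to check this mechanically, here is a sketch with sympy that applies Equation (2.3.10) to the Klein-Gordon Lagrangian; the restriction to 1+1 dimensions with metric (+,−) and the symbol names are my own choices, made for brevity.

    import sympy as sp

    # Sketch: Euler-Lagrange equation for L = (1/2) d_mu phi d^mu phi
    #         - (1/2) m^2 phi^2 in 1+1 dimensions with metric (+,-).
    t, x, m = sp.symbols('t x m')
    phi = sp.Function('phi')(t, x)

    L = (sp.diff(phi, t)**2 - sp.diff(phi, x)**2) / 2 - m**2 * phi**2 / 2

    # dL/dphi - d_mu ( dL/d(phi_{,mu}) ) = 0
    eom = (sp.diff(L, phi)
           - sp.diff(sp.diff(L, sp.diff(phi, t)), t)
           - sp.diff(sp.diff(L, sp.diff(phi, x)), x))
    print(sp.simplify(eom))   # phi_xx - phi_tt - m**2 phi, i.e. (d^2 + m^2)phi = 0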

• Just as in classical mechanics, we can also go to the Hamiltonian formalism, by defining a momentum density:

π ≡ ∂ℒ/∂φ̇

where φ̇ ≡ ∂_0 φ, and perform the Legendre transformation:

ℋ(φ(x), π(x)) = π(x) φ̇(x) − ℒ  ⇒  H = ∫ d³x ℋ     (2.3.11)

• Notice that ℒ is covariant, but H is not, since time becomes a special parameter via the momentum density. This is why we often prefer to use the Lagrangian formalism whenever working with relativistic mechanics. Nevertheless, the Hamiltonian formalism is still generally useful since H is related to the energy in the field.

2.4 Noether’s Theorem

A theory is said to be symmetric with respect to a transformation whenever the action doesn't change under that transformation. This implies that the Lagrangian can only change up to a total divergence (via Gauss's Theorem). To see this, notice that if the Lagrangian changes by a total divergence, we have:

S → S′ = ∫ ℒ d⁴x + ∫ ∂_µ J^µ d⁴x = S + ∫ dΣ_µ J^µ

where the last surface integral is an integral over a surface at infinity. In order for quantities like energy to be finite, we often require that fields and their derivatives vanish on such a surface. This implies that the above surface integral vanishes (or at least doesn't contribute to the equations of motion^3). Thus we have

Noether's Theorem: For every symmetry in the theory, there corresponds a conserved current, and vice versa.

Proof: I will prove Noether's theorem for a scalar field, but it is true in general for any type of field. Let's consider the general transformation that acts on the field according to:

φ → φ + αδφ

where α is a constant infinitesimal parameter^4. Then if we insist this transformation leave the action invariant, the previous argument says that the Lagrangian can only change by a total divergence:

ℒ → ℒ + α ∂_µ J^µ

Let's calculate the change in ℒ explicitly:

αδℒ = (∂ℒ/∂φ)(αδφ) + (∂ℒ/∂φ_{,µ}) ∂_µ(αδφ)
    = α ∂_µ [(∂ℒ/∂φ_{,µ}) δφ] + α [∂ℒ/∂φ − (∂ℒ/∂φ_{,µ})_{,µ}] δφ
    = α ∂_µ J^µ

To get line 1 from line 2, use the product rule. The term in brackets is zero by Equation (2.3.10), leaving only the total divergence. Then we can combine terms to get:

j^µ ≡ (∂ℒ/∂φ_{,µ}) δφ − J^µ  ⇒  ∂_µ j^µ = 0     (2.4.12)

hence j^µ is a conserved current, which is precisely what we wanted to show. QED.

Notice that we could also write Noether's theorem by saying that for every symmetry of the theory, there is a "charge" which is conserved:

Q ≡ ∫ j⁰ d³x  ⇒  dQ/dt = 0

There are many applications of this theorem. You can prove that isotropy of space (rotational invariance) implies angular momentum conservation, homogeneity of space (translational invariance) implies linear momentum conservation, time translation invariance implies energy conservation, etc. You can also consider more exotic internal symmetries; for example, gauge invariance in electrodynamics implies electric charge conservation.

^3 Sometimes this integral does not vanish, and this leads to nontrivial "topological" effects. But this is advanced stuff, and well beyond the scope of this review. So for now let's just assume that the surface integral is irrelevant, as it often is in practice.

^4 This theorem also holds for nonconstant α – I leave the proof to you.
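As a concrete check, the sketch below verifies current conservation for one such internal symmetry: the phase rotation φ → e^{iα}φ of a complex Klein-Gordon field in 1+1 dimensions. The current j^µ = i(φ* ∂^µφ − φ ∂^µφ*) and its normalization are the standard textbook candidate, not something derived in the text above.

    import sympy as sp

    # Sketch: d_mu j^mu = 0 on an explicit solution built from two
    # on-shell plane waves (w^2 = k^2 + m^2), with metric (+,-).
    t, x = sp.symbols('t x', real=True)
    k, m = sp.symbols('k m', positive=True)
    w1, w2 = sp.sqrt(k**2 + m**2), sp.sqrt(4*k**2 + m**2)

    phi = sp.exp(sp.I*(k*x - w1*t)) + sp.exp(sp.I*(2*k*x - w2*t))
    phis = sp.conjugate(phi)

    j0 = sp.I * (phis * sp.diff(phi, t) - phi * sp.diff(phis, t))   # j^0
    j1 = -sp.I * (phis * sp.diff(phi, x) - phi * sp.diff(phis, x))  # j^1 (note d^1 = -d_x)

    print(sp.simplify(sp.expand(sp.diff(j0, t) + sp.diff(j1, x))))  # -> 0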


Chapter 3

Geometry I: Flat Space

The most important aspect of geometry is the notion of distance. What does it mean for two

points in a space to be “close” to each other? This notion is what makes geometry distinct

from topology and other areas of abstract mathematics. In this chapter, I want to deﬁne

some important quantities that appear again and again in geometric descriptions of physics,

and give some examples.

In this chapter we will always be working in ﬂat space (no curvature). In the next chapter,

we will introduce curvature and see how it aﬀects our results.

3.1 THE Tensor: g_µν

• Like transformations, we can define distance infinitesimally and work our way up from there. Consider two points in space that are separated by infinitesimal distances dx, dy and dz. Then the total displacement (or metric) is given by the triangle rule of metric spaces (a.k.a. the Pythagorean theorem):

ds² = dx² + dy² + dz² ≡ g_µν dx^µ dx^ν

Here, g_µν = δ_µν.

• Definition: g_µν is called the metric tensor or the first fundamental form. Knowing the form of g_µν for any set of coordinates lets you know how to measure distances. The metric tensor is given by Equation (1.1.4), but in this chapter we will often cheat, since sometimes we already know what ds² is. But when faced with trying to calculate the metric in general, usually you have to resort to Equation (1.1.4).

• WARNING: Unlike the usual inner product notation, we never write ds² = g_µν dx_µ dx_ν. The dx's always have upper indices^1.

• In Cartesian coordinates, the metric tensor is simply the identity matrix, but in general, it could be off-diagonal, or even be a function of the coordinates.

^1 This is because {dx^µ} forms a basis for the covectors (1-forms); see Chapter 5. Compare this to the basis vectors of Chapter 1, {e_i}, which always came with lower indices, so that ⟨dx^i, e_j⟩ = δ^i_j. Notice that the indices work out exactly as they should for the Kronecker delta – this is, once again, not a coincidence!

3.1.1 Curvilinear Coordinates

• In spherical coordinates, the directions are given by (dr, dθ, dφ), and the metric is given by:

ds² = dr² + r² dθ² + r² sin²θ dφ²

so the metric tensor is therefore:

[g_µν] = [ 1   0    0
           0   r²   0
           0   0    r² sin²θ ]

• Cylindrical coordinates (or polar coordinates) are given by the coordinates:

ρ = √(x² + y²)     (3.1.1)
φ = tan⁻¹(y/x)     (3.1.2)
z = z              (3.1.3)

The metric is given by:

ds² = dρ² + ρ² dφ² + dz²

and therefore:

[g_µν] = [ 1   0    0
           0   ρ²   0
           0   0    1 ]

• Parabolic coordinates are derived from the cylindrical coordinates in the following way:

z = ½(ξ − η)     (3.1.4)
ρ = √(ξη)        (3.1.5)
φ = φ            (3.1.6)

or using the spherical radial distance:

ξ = r + z,   η = r − z

Plugging these expressions into the metric for cylindrical coordinates, we derive the parabolic coordinate metric:

ds² = ¼[(1 + η/ξ) dξ² + (1 + ξ/η) dη²] + ξη dφ²

therefore:

[g_µν] = [ ¼(1 + η/ξ)   0             0
           0             ¼(1 + ξ/η)   0
           0             0             ξη ]

Parabolic coordinates describe families of concentric paraboloids symmetric about the z-axis. They are very useful in problems with the appropriate symmetry, such as parabolic mirrors.

• As you might have guessed, there are also hyperbolic coordinates and elliptical coor-

dinates, but I won’t get into those here. There are even more complicated coordinate

systems that are useful (toroidal coordinates, etc).

3.2 Diﬀerential Volumes and the Laplacian

3.2.1 Jacobians

• If you transform your coordinates, you will modify your differential volume element by a factor called the Jacobian:

y^α = y^α(x^β)  ⇒  dⁿx = J dⁿy

where in general, J = det(∂x^a/∂y^µ).

• To find the general Jacobian, let us consider a transformation from Cartesian coordinates (x^a) to general coordinates (y^µ) and ask how the metric transforms. Remembering that the metric is a (0, 2) tensor and g_ab = δ_ab, we have:

g_µν = δ_ab (∂x^a/∂y^µ)(∂x^b/∂y^ν)     (3.2.7)

Now take the determinant of both sides; defining g ≡ det(g_µν), we find that J = √g, so in general:

dⁿx = √g dⁿy
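Here is a quick sympy sketch of this machinery for the transformation from Cartesian to spherical coordinates; it builds g_µν from Equation (3.2.7) and recovers the familiar volume factor √g = r² sin θ. The symbol names are mine.

    import sympy as sp

    # Sketch: metric and Jacobian for spherical coordinates via Eq. (3.2.7).
    r, th, ph = sp.symbols('r theta phi', positive=True)
    X = sp.Matrix([r*sp.sin(th)*sp.cos(ph),     # x(r, theta, phi)
                   r*sp.sin(th)*sp.sin(ph),     # y
                   r*sp.cos(th)])               # z

    A = X.jacobian([r, th, ph])                 # dx^a / dy^mu
    g = sp.simplify(A.T * A)                    # g_munu, Eq. (3.2.7)
    print(g)                                    # diag(1, r**2, r**2*sin(theta)**2)
    print(sp.simplify(g.det()))                 # r**4*sin(theta)**2, so sqrt(g) = r**2 sin(theta)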

3.2.2 Vielbeins – a Prelude to Curvature

I will talk about curvature in the next chapter, but now would be a good time to introduce a very useful tool called the vielbein (e^µ_a):

• Consider a point p with coordinates x^µ on a manifold (generally a manifold with curvature, which we'll talk about next chapter). At every point p ∈ M, there is a "tangent space" – this is basically the flat plane tangent to the manifold at the point p. Give coordinates on the tangent plane ξ^a such that the point p is mapped to the origin. The key point to notice is that the Latin index is for flat space while the Greek index is for curved space.

• Now consider a point p on the manifold, and the tangent space at p. Then there is a (linear) mapping between the points near the origin of the tangent plane (with coordinates ξ^a) to a neighborhood of p on the manifold (with coordinates x^µ). This linear mapping is given by a matrix called the vielbein:

e^µ_a ≡ ∂x^µ/∂ξ^a

This allows us to write Equation (3.2.7) as:

g_µν = δ_ab e^a_µ e^b_ν

So the vielbein is, in some sense, the square root of the metric! In particular, the above analysis shows that the Jacobian is just e, the determinant of the vielbein. There are many cases in physics where this formalism is very useful^2. I won't mention any of it here, but at least you have seen it.

^2 For example, when spin is involved.

3.2.3 Laplacians

• We can find the Laplacian of a scalar field in a very clever way using the variational principle in electrostatics. For our functional, consider that we wish to minimize the energy of the system, and in electrostatics, that is (up to constants) the integral of E². So using our scalar potential, let's define the energy functional in Cartesian coordinates:

W[Φ] = ∫ dⁿx (∇Φ)²

where dⁿx is the n-dimensional differential volume element.

• We can find the extremum of this functional, δW[Φ] = 0, and see what happens. At first, notice that an integration by parts lets you rewrite the integrand as −Φ∇²Φ, and the variation of that is simply −δΦ∇²Φ. Now change coordinates using the above prescription and repeat the variation. We set the two expressions equal to each other, since the integral must be the same no matter what coordinate system we use. Then you get as a final expression for the Laplacian:

∇² = (1/√g) ∂_α [√g g^{αβ} ∂_β]     (3.2.8)

Exercise: Prove this formula. It's not hard, but it's good practice.

Notice that ∂_µ is covariant, so we require the contravariant g^{µν} – remember that this is the inverse of the metric, so be careful!

• Notice that this expression is only valid for scalar fields! For vector fields, it is generally more complicated. To see this, notice that at first, you might assume that ∇²A = (∇²A^i) for each component A^i. In Cartesian coordinates, this is fine. But in curvilinear coordinates, you must be extra careful. The true definition of the i-th vector component is A^i = A · e_i, and in curvilinear coordinates, e_i has spatial dependence, so you have to take its derivative as well. We will take the derivative of a unit vector and discuss its meaning in the next chapter.

• Exercise: Find the Laplacian in several different coordinate systems using Equation (3.2.8). The usual Laplacians are on the cover of Jackson. Can you also find the Laplacian in parabolic coordinates? Here's the answer:

∇² = 4/(ξ + η) [ξ ∂²/∂ξ² + ∂/∂ξ + η ∂²/∂η² + ∂/∂η] + (1/ξη) ∂²/∂φ²
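If you don't trust that answer, Equation (3.2.8) can be evaluated mechanically. The following sympy sketch does it for the parabolic metric above; the function name f and the coordinate symbols are mine.

    import sympy as sp

    # Sketch: the Laplacian from Eq. (3.2.8) for parabolic coordinates.
    xi, eta, phi = sp.symbols('xi eta phi', positive=True)
    f = sp.Function('f')(xi, eta, phi)

    g = sp.diag((1 + eta/xi)/4, (1 + xi/eta)/4, xi*eta)   # metric tensor
    ginv = g.inv()
    sqrtg = sp.sqrt(sp.factor(g.det()))                   # = (xi + eta)/4
    y = [xi, eta, phi]

    lap = sum(sp.diff(sqrtg * ginv[a, a] * sp.diff(f, y[a]), y[a])
              for a in range(3)) / sqrtg                  # metric is diagonal
    print(sp.expand(sp.simplify(lap)))                    # matches the formula above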

3.3 Euclidean vs Minkowskian Spaces

What is the difference between Euclidean space and Minkowski space? In all the examples we have considered so far, we have been in three dimensions, whereas Minkowski space has four. But this is no problem, since nothing above depended on the number of dimensions at all, so that generalizes very nicely (now g_µν is a 4×4 matrix). There is one more difference, however, that must be addressed.

Minkowski space is a "hyperbolic space": the loci of points equidistant from the origin form hyperbolas. This is obvious when you look at the metric in spacetime:

∆s² = c²∆t² − ∆x²

(where I am looking in only one spatial dimension for simplicity). For constant ∆s, this is the equation of a hyperbola.

The key to this strange concept is the minus sign. We immediately notice that trouble will arise if we naively pretend that there is nothing wrong, since that minus sign makes g (the determinant of the metric tensor) negative, and the square root of a negative number makes no sense (you can't have imaginary distance!). To fix this problem, we will forever in Minkowski space make the substitution g → −g, and all our woes will vanish! Another way to think of this (and perhaps the better and more general way) is to replace g → |g| in all of our equations. Then there will never be a problem.

Just for completeness, the metric tensor of Minkowski spacetime is:

η_µν = [ +1   0    0    0
          0   −1   0    0
          0   0    −1   0
          0   0    0    −1 ]

This is called the “mostly-minus” prescription. Sometimes people will use the “mostly

plus” prescription which is the negative of this. You must be very careful which convention is

being used - particle physicists often use the minus prescription I use here, while GR people

tend to use the mostly plus prescription. But sometimes they switch! Of course, physics

doesn’t change, but the formulas might pick up random minus signs, so be careful.


Chapter 4

Geometry II: Curved Space

Consider the planet Earth as a sphere. We know that in Euclidean geometry, if two lines are perpendicular to another line, then they are necessarily parallel to each other (they never intersect). Yet a counterexample of this is longitude lines: two lines of longitude are always perpendicular to the equator, and so they are "parallel"; yet they meet at two points: the north and south poles. Obviously, something is wrong with Euclidean geometry on a sphere.

In everything that we have been doing, we have been assuming that the space we live in

was ﬂat. Even when we were using curvilinear coordinates, we were still talking about ﬂat

space. The situation is fundamentally diﬀerent when space itself is curved. It is here where

diﬀerential geometry truly becomes interesting!

There is a very deep theorem in mathematics that states that any manifold of any dimen-

sion and any curvature can always be embedded in a ﬂat space of some larger dimension. For

example, a sphere (which is two dimensional) can be embedded into Euclidean 3-space. So

in principle, we can avoid the issue of curvature completely if we think in enough dimensions.

But this is not practical or necessary in general, and curvature plays a very important role

in mathematical physics.

Diﬀerential geometry and curved space have their most obvious role in general relativity

(GR), where gravity manifests as a warping of spacetime. However, it also has applications

in several other branches of mathematical physics, including (but not limited to) optics,

solid-state physics and quantum ﬁeld theory. So it might be useful to see some of the basics

in one place. Hence this chapter.

Diﬀerential geometry is truly a deep and profound subject, and I recommend anyone

interested to study it either in a class or on your own (I took three classes in it as an

undergraduate and I’m still trying to learn it!) But I do not plan to teach diﬀerential

geometry here; only give an overview and a source for common equations you might come

across in physics. For a deeper understanding, you have to go to a textbook. I give some

recommendations in the bibliography.

4.1 Connections and the Covariant Derivative

One of the biggest problems with geometry on curved spaces is the notion of the derivative of a vector field. When you take the derivative, you are actually taking the difference of two vectors that are separated by an infinitesimal distance:

dA = A(x + ε) − A(x)

But this doesn't even make sense – you can only add or subtract different vectors when they have the same base point. In flat space, this is not a problem – you simply move one of the vectors over so that their base points coincide, and then you can subtract them like usual. However, as we will see in a later section, you cannot generally move a vector around arbitrarily in curved space without changing it. Hence, taking the difference of two vectors with different base points is a tricky subject, and therefore the naive derivative will not work here.

Instead, we must define a new tensor called a connection:

Definition: A connection on a manifold M is a map that takes two tensors to a tensor. It is denoted ∇_X Y and necessarily has the following properties:

1. ∇_{f(x)X₁ + g(x)X₂} Y = f(x)∇_{X₁}Y + g(x)∇_{X₂}Y,   f, g ∈ C^∞(M).

2. ∇_X(aY₁ + bY₂) = a∇_X Y₁ + b∇_X Y₂,   a, b constants.

3. ∇_X(f(x)Y) = f(x)∇_X Y + (Xf(x))Y

The first two rules give the connection linear properties, just like we expect the derivative to do. Notice that the first item applies to general functions, not just constants. Resorting to mathematical terminology, we say that the connection ∇_X Y is tensorial in X and local in Y. The adjective "tensorial" means that it only depends on X at a specific point, as opposed to a local neighborhood around the point. This is why we can factor out the functional dependence. Notice that the ordinary directional derivative in vector calculus is exactly the same. Finally, the third rule is the analogue of Leibniz's rule^1 in vector calculus. This becomes easy to see if we let X = x^i ∂_i, where we have chosen e_i ≡ ∂_i – a standard trick in differential geometry.

Armed with this knowledge, we can ask: what is the connection of our coordinates? Let's choose a coordinate basis for our manifold (e_i). Then we have the following vital definition:

∇_{e_i} e_j ≡ Γ^k_{ij} e_k     (4.1.1)

(Note the Einstein notation on the index k). The n³ numbers Γ^k_{ij} introduced here^2 are called Christoffel symbols, or sometimes connection coefficients. They are not tensors!! But they are tensorial in the sense described above. They have the inhomogeneous transformation law:

Γ^{σ′}_{µ′ν′} = A^µ_{.µ′} A^ν_{.ν′} A^{σ′}_{.σ} Γ^σ_{µν} + A^{σ′}_{.σ} ∂²x^σ/∂x^{µ′}∂x^{ν′}     (4.1.2)

^1 Sometimes called the product rule.
^2 n = dim M

where the matrices A are the coordinate transformation matrices defined in Chapter 1.

It seems reasonable that this object might be like the generalization of the derivative. However, it turns out that we're not quite done – these three axioms define a connection, but they do not define it uniquely. If we want to define something like a derivative, we need to invoke more axioms. We usually choose the following two:

4. ∇_X g ≡ 0   ∀X

5. Γ^i_{jk} = Γ^i_{kj}

The first of these is often put into words as: "The connection is compatible with the metric." It basically means that the derivative of a product makes sense. The second property is that the connection is symmetric or torsionless. These two additional properties, along with the three already included, are necessary and sufficient to define a unique connection called the Riemann Connection, or the Levi-Civita Connection. Before we proceed, notice that this is by no means the only choice I could have made. There are many theories where, for example, we drop the symmetric condition to get various other types of geometry. This is an especially popular trick in trying to quantize gravity. But this is way beyond the scope of this paper, and I will simply stick to Riemannian geometry, where these are the axioms.

Now we would like to translate this rather abstract discussion into something concrete. Consider a vector field A(x) on a manifold M. Let's take its connection with respect to a basis vector:

∇_{e_i} A = ∇_{e_i}(A^j e_j) = (∂_i A^j)e_j + A^j ∇_{e_i} e_j
          = (∂_i A^j)e_j + A^j Γ^k_{ij} e_k
          = [∂_i A^k + A^j Γ^k_{ij}] e_k

where I have used Leibniz's rule and renamed a dummy index in the last step (j → k). If I am only interested in the coordinate form of the connection (which I usually am), I can drop the basis vectors and write^3:

D_i A^k = ∂_i A^k + A^j Γ^k_{ij}     (4.1.3)
D_i A_k = ∂_i A_k − A_j Γ^j_{ik}     (4.1.4)

Proving Equation (4.1.4) is very straightforward – simply consider D_i(A^j A_j) = ∂_i(A^j A_j) and use Equation (4.1.3). Equations (4.1.3–4.1.4) are often called the covariant derivatives, and replace the ordinary derivative of flat space. Notice that in the limit of flat space, there is no difference between the ordinary and covariant derivative, but in more general cases, there is a "correction" to the ordinary derivative that compensates for how you subtract the vectors with different basepoints. Physically speaking, the covariant derivative is the derivative of a vector field that you would see on the surface. Recall I said that you can always embed a manifold in a larger flat space, where the ordinary derivative works fine. But as long as we stay strictly on the surface, we must use the covariant derivative.

^3 When in a specific coordinate basis, we often simply write ∇_{∂_i} ⇔ D_i.

Remembering our easy notation of Chapter 1, it is sometimes easier to write the differential coordinate index with a semi-colon:

A^k_{;i} = A^k_{,i} + A^j Γ^k_{ij}     (4.1.5)

What if I wanted to take the covariant derivative of a tensor that is higher than rank-1? Let's consider the covariant derivative of the dyadic:

(A_i B_j)_{;k} = A_{i;k} B_j + A_i B_{j;k} = (A_i B_j)_{,k} − (Γ^m_{ik} A_m)B_j − (Γ^m_{jk} B_m)A_i

where I have skipped some straightforward algebra. This, along with Equation (4.1.4), should give us enough intuition to express the covariant derivative on an arbitrary rank tensor:

(T^{i₁i₂···iₙ})_{;j} = (T^{i₁i₂···iₙ})_{,j} + Σ_{α=1}^{n} Γ^{i_α}_{βj} T^{i₁···i_{α−1} β i_{α+1}···iₙ}     (4.1.6)

In other words – for a rank-n tensor, we get a Γ term for each contravariant index. Similarly there would be a minus Γ term for each covariant index. By the way, a quick corollary to this would be:

Φ_{;i} = Φ_{,i}

for a scalar quantity Φ. This should be so, since the Christoffel symbols only appeared when we had basis vectors to worry about.

Now that we have the covariant derivative under our belts, we can use it to find a useful expression for the connection coefficients. Take the covariant derivative of the metric tensor three times (each with a different index), and use the property of compatibility to get:

Γ^ρ_{νσ} = ½ g^{µρ} [g_{µν,σ} − g_{νσ,µ} + g_{σµ,ν}]     (4.1.7)
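Equation (4.1.7) is easy to put on a computer. Here is a minimal sympy sketch for plane polar coordinates (ds² = dr² + r² dθ²); the nested-list layout Gamma[ρ][ν][σ] is my own bookkeeping choice.

    import sympy as sp

    # Sketch: Christoffel symbols from Eq. (4.1.7) for polar coordinates.
    r, th = sp.symbols('r theta', positive=True)
    x = [r, th]
    g = sp.diag(1, r**2)
    ginv = g.inv()
    n = len(x)

    Gamma = [[[sp.simplify(sum(ginv[rho, mu] * (sp.diff(g[mu, nu], x[sig])
                                                - sp.diff(g[nu, sig], x[mu])
                                                + sp.diff(g[sig, mu], x[nu]))
                               for mu in range(n)) / 2)
               for sig in range(n)] for nu in range(n)] for rho in range(n)]

    print(Gamma[0][1][1])   # Gamma^r_{theta theta} = -r
    print(Gamma[1][0][1])   # Gamma^theta_{r theta} = 1/r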

As a final exercise, consider a parametrized curve on a manifold M, w(s) : I → M, where I is the unit interval [0,1]. Then we can take the parametrized covariant derivative by resorting to the chain rule:

Dw^j/ds = D_i w^j (dx^i/ds) = (∂w^j/∂x^i)(dx^i/ds) + (Γ^j_{ik} w^k)(dx^i/ds)

Combining terms:

Dw^j/ds = dw^j/ds + Γ^j_{ik} w^k dx^i/ds     (4.1.8)

This expression is very useful.

Exercise: Prove Equation (4.1.2). Hint: use the covariant derivative.

Exercise: Prove Equation (4.1.7) using the method suggested.

4.2 Parallel Transport and Geodesics

Let's return to our spherical planet example from the beginning of the chapter. Consider two parallel vectors tangent to the Earth at the equator and the prime meridian (zero longitude, zero latitude). Now move one of the vectors along the equator until it comes to the international date line (180 degrees around). Now proceed to move both vectors up the longitude line until they meet at the north pole. Notice that they are no longer parallel, but antiparallel. If you were to do the same experiment, but move the vector only 90 degrees around the globe, the two vectors would be perpendicular at the pole^4!

^4 This is actually a sketch of the proof of the famous theorem from geometry/topology that states that you can never have a continuous vector field defined everywhere on a sphere that doesn't vanish somewhere. But there is no need to get into the details of that here.

Again, we see that there is something ﬁshy about the notion of parallel vectors in curved

space. Geometers knew this, and they did their best to generalize the notion of parallel

vectors by inventing an algorithm called parallel transport. In this section, I will talk a

little about that, and then use our new-found intuition to deﬁne a geodesic.

We begin with a definition:

Definition: Consider a parametrized path α(s) : I → M and a vector field along the curve, w(s) = w(α(s)), all in a manifold M. Then we say that w(s) is a parallel vector field if its covariant derivative vanishes everywhere:

Dw/ds = 0   ∀s ∈ I

The uniqueness theorem of differential equations tells us that if a vector field has a given value at some initial point (w(0) = w₀), then there is a unique vector w that is parallel to w₀ that comes from parallel transporting w₀ some distance along the contour α(s).

Armed with this knowledge, we can finally move onto one of the most useful objects in geometry:

Definition: A geodesic on a manifold M is a parametrized curve x(s) : I → M whose tangent vectors form a parallel vector field. In other words, the geodesic satisfies the differential equation:

Dx′/ds ≡ 0

Using Equation (4.1.8), with x′ ≡ dx/ds as the vector field we're differentiating, we get a famous nonlinear differential equation:

d²x^µ/ds² + Γ^µ_{νσ} (dx^ν/ds)(dx^σ/ds) = 0     (4.2.9)

Going back to physics, note that a particle's trajectory x^µ(s) should ultimately be given by Newton's laws. Notice that if we're in flat space, these equations do indeed reduce to Newton's laws of motion! General relativity suggests that in the absence of external (non-gravitational) forces, a particle moves along a geodesic, so its equation of motion is given by Equation (4.2.9). Gravity manifests itself as the curvature of space, i.e.: the connection coefficients. So we can think of the Γ terms in Equation (4.2.9) as source terms for the gravitational field in Newton's equations of motion. If we wished to include other forces (such as an electric field, for instance), we would modify the right hand side of Equation (4.2.9).

Additionally, a geodesic is the path of shortest distance: this can be proved using variational calculus techniques. So this equation tells us that particles like to travel along the path that minimizes the distance travelled. In flat space (no forces), that's a straight line; but in curved space (when gravity is present, for example) it can be more complicated.
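To see Equation (4.2.9) in action, here is a sketch that integrates the geodesic equation on the unit sphere (ds² = dθ² + sin²θ dφ², whose nonzero Christoffel symbols are Γ^θ_{φφ} = −sin θ cos θ and Γ^φ_{θφ} = cot θ); the initial conditions are arbitrary choices of mine. Geodesics of the sphere are great circles, and a curve parametrized by arclength keeps constant speed:

    import numpy as np
    from scipy.integrate import solve_ivp

    # Sketch: geodesics on the unit sphere from Eq. (4.2.9).
    def rhs(s, y):
        th, ph, dth, dph = y
        return [dth, dph,
                np.sin(th) * np.cos(th) * dph**2,      # theta'' term
                -2.0 * dth * dph / np.tan(th)]         # phi''   term

    y0 = [np.pi / 2, 0.0, 0.5, 0.5]    # start on the equator, heading NE
    sol = solve_ivp(rhs, (0.0, 10.0), y0, rtol=1e-10, atol=1e-10)

    th, dth, dph = sol.y[0], sol.y[2], sol.y[3]
    speed = np.sqrt(dth**2 + np.sin(th)**2 * dph**2)
    print(speed.min(), speed.max())    # constant, up to integrator error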

4.3 Curvature - The Riemann Tensor

Now that we have worked out some details on parallel transport and geodesics, let’s see

what else the covariant derivative can give us. One question that analysts like to ask is “Are

mixed partials equal?” It turns out that understanding the answer to this question gives us

a rigorous way to quantify the curvature of the manifold.

Consider the commutator of two covariant derivatives acting on a vector^5:

A_{ν;ρ;σ} − A_{ν;σ;ρ} = A_µ R^µ_{.νρσ}

where

R^µ_{.νρσ} ≡ Γ^µ_{νσ,ρ} − Γ^µ_{νρ,σ} + Γ^α_{νσ} Γ^µ_{αρ} − Γ^α_{νρ} Γ^µ_{ασ}     (4.3.10)

R^µ_{.νρσ} is a tensor, called the Riemann-Christoffel Curvature Tensor. We know it's a tensor from the quotient rule of tensors: if a tensor equals another tensor contracted with another object, then that other object must be a tensor. There are a number of symmetries^6:

1. R_{βνρσ} = −R_{νβρσ}

2. R_{βνρσ} = −R_{βνσρ}

3. R_{βνρσ} = +R_{ρσβν}

4. R^µ_{.νρσ} + R^µ_{.ρσν} + R^µ_{.σνρ} = 0

5. R^µ_{.νρσ;τ} + R^µ_{.νστ;ρ} + R^µ_{.ντρ;σ} = 0

The first four symmetries are manifest from Equation (4.3.10); the last item is known as the Bianchi Identity. These symmetries have the effect of putting some powerful constraints on the number of independent quantities in the curvature tensor. Without any symmetries, there would be d⁴ numbers, where d is the dimension of our manifold. However, with all of these symmetries, there are:

d²(d² − 1)/12     (4.3.11)

^5 I'm not going to prove this: you do it!
^6 R^µ_{.νρσ} = g^{µβ} R_{βνρσ}

independent quantities. Notice that the rank of the Riemann-Christoffel tensor is always 4, regardless of the dimension of the manifold.

The Bianchi Identity is a different sort of relation in the sense that it involves derivatives. There is a similar equation in Maxwell's electrodynamics:

F_{µν,ρ} + F_{νρ,µ} + F_{ρµ,ν} = 0

where F_{µν} is the field strength tensor (see Jackson or Griffiths). These equations are very closely related to the topic of "Gauge Invariance" – we will talk about that in the next chapter.

We can try to construct lower rank tensors from this monster in Equation (4.3.10), but thanks to the symmetries, there are only a few nonzero contractions, and up to sign, they all yield the same thing. We define the rank-2 tensor:

R_µν ≡ R^τ_{.µντ}     (4.3.12)

It is called the Ricci tensor. It is a symmetric tensor. Any other nonzero contraction would give the same object up to a sign. You should prove this for yourself. Finally we can take the trace of Equation (4.3.12) to get the Ricci scalar:

R ≡ R^µ_µ     (4.3.13)

R is sometimes called the scalar curvature. It is a measure of the intrinsic curvature of the manifold.
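All of these contractions can be automated. Here is a sympy sketch for the 2-sphere of radius a (the helper names are mine); note that with the contraction convention of Equation (4.3.12) the sphere comes out with R = −2/a², the opposite contraction giving +2/a².

    import sympy as sp

    # Sketch: Riemann, Ricci and scalar curvature of a 2-sphere, radius a.
    th, ph, a = sp.symbols('theta phi a', positive=True)
    x = [th, ph]
    g = sp.diag(a**2, a**2 * sp.sin(th)**2)
    ginv = g.inv()
    n = 2

    Gam = [[[sum(ginv[m, l] * (sp.diff(g[l, i], x[j]) + sp.diff(g[l, j], x[i])
                 - sp.diff(g[i, j], x[l])) for l in range(n)) / 2
             for j in range(n)] for i in range(n)] for m in range(n)]

    def Riem(m, nu, rho, sig):   # R^m_{.nu rho sig}, Eq. (4.3.10)
        return (sp.diff(Gam[m][nu][sig], x[rho]) - sp.diff(Gam[m][nu][rho], x[sig])
                + sum(Gam[al][nu][sig] * Gam[m][al][rho]
                      - Gam[al][nu][rho] * Gam[m][al][sig] for al in range(n)))

    Ric = sp.Matrix(n, n, lambda i, j: sum(Riem(t, i, j, t) for t in range(n)))
    R = sum(ginv[i, j] * Ric[i, j] for i in range(n) for j in range(n))
    print(sp.simplify(R))        # -> -2/a**2 with this contraction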

Finally, I would like to mention an interesting curiosity. Take a good look at Equation

(4.3.10). Einstein used this tensor to describe gravity. Notice that it has derivatives of

the connection coeﬃcients, and is a tensor. It is, in a sense, a strange kind of curl of

the connection coeﬃcients. Now look at Equation (4.1.2) - this says that the connection

coeﬃcients transform under coordinate transformations with an inhomogeneous term. Does

this look like anything we know? What about the vector potential of electrodynamics? That

has an inhomogeneous term when you perform a gauge transformation (A → A + ∇Λ), and

yet when you take its curl to get B, the resulting ﬁeld is gauge invariant. Here, the “gauge

transformation” is actually a coordinate transformation (technically, a “diﬀeomorphism”),

and the Riemann tensor transforms “covariantly” (which is a fancy word for “properly”)

under these transformations, even though the connection coeﬃcients do not. You can see

the parallel: you might think of the connection coeﬃcients as “gravity vector potentials”,

and the curvature tensor as the “gravitational ﬁeld”. Then we have something like a gauge

theory of gravity! Unfortunately, things are more complicated than that. However, this is

getting us too far aﬁeld, so I’ll stop now.

4.3.1 Special Case: d = 2

Before leaving diﬀerential geometry behind, I wanted to talk a little about a very special

case of mathematics - two dimensional manifolds. In one dimension, the Riemann tensor


is not well-deﬁned, and we do not need any of this formalism to discuss geometry; in fact,

one dimensional diﬀerential geometry is nothing more than vector calculus! In three or more

dimensions, the Riemann tensor is complicated, and it is much more diﬃcult to say anything

quantitative and general about your manifold. But in the special case of two dimensions,

the Riemann tensor has only one component (see Equation (4.3.11)), and that is the Ricci

scalar curvature. Here, we can talk about curvature in a very quantitative manner.

To understand how this works, we need to state a few definitions. A two dimensional Riemannian manifold is often simply referred to as a "surface".

Definition: Consider a curve parametrized by α(s) living on a surface M. Then the curvature of the curve is given by:

k(s) ≡ |α″(s)|

If n is the normal vector to α, and N is the normal to the surface, then we define the angle:

cos θ_{α(s)} ≡ ⟨n(s), N(s)⟩

Then

k_n(s) ≡ k(s) cos θ_{α(s)}

is called the normal curvature along the path α(s).

Now consider all the curves in the surface, and all of the normal curvatures. Let us call the maximum normal curvature k₁, and the minimum normal curvature k₂.^7 These two quantities are properties of the surface, not of the curves. Finally, we can define two new objects:

H = ½(k₁ + k₂)     (4.3.14)
K = k₁k₂           (4.3.15)

H is called the mean curvature. It is very useful in many applications, especially in engineering. However, it is not as useful as a theoretical tool because it is not isometrically invariant, where an "isometrically invariant" quantity only depends on the surface itself, not on how the surface is embedded in 3-space. K is called the Gaussian curvature, and is of fundamental importance. Gauss proved that it is isometrically invariant, which means that it describes the surface in a fundamental way. One can prove the relation:

K ∝ R

and hence make the connection between two dimensional geometry and higher dimensional geometry.

^7 There is no consistency as to which one is which in the literature.

Armed with this new tool, we can state one of the most important theorems of diﬀerential

geometry, and one of the key reasons why diﬀerential geometry in two dimensions is so well

understood:

Theorem: For any compact, oriented Riemannian 2-manifold M:

∫_M K dΩ = 2πχ(M)     (4.3.16)

where the integral is a surface integral over the entire manifold and χ(M) is a topologically invariant quantity called the Euler characteristic. This theorem is called the Gauss-Bonnet Theorem, and has several versions for different cases of M, integrating over only part of M, etc; but this is the most general statement of the theorem.

The Euler characteristic χ(M) is of fundamental importance in topology and geometry. To calculate it for a surface, you divide your surface into a bunch of triangles; this process is called triangulation. You then count the total number of faces (F), edges (E) and vertices (V) and apply the following formula:

χ(M) = V − E + F     (4.3.17)
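This count is easy to script. A small sketch follows; the tetrahedron data is my own choice of triangulation of the sphere.

    # Sketch: chi = V - E + F for a triangulated surface.  A tetrahedron
    # triangulates the sphere, so we expect chi = 2.
    faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

    vertices = {v for f in faces for v in f}
    edges = {frozenset(pair) for f in faces
             for pair in [(f[0], f[1]), (f[1], f[2]), (f[0], f[2])]}

    chi = len(vertices) - len(edges) + len(faces)
    print(chi)   # 4 - 6 + 4 = 2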

χ(M) is well-defined in the sense that it is independent of how you triangulate your surface. This is not a trivial point. We call this quantity a "topological invariant" because if you have two surfaces that have the same value of χ, then they are homeomorphic; in other words, you can always deform one of the surfaces into another in a continuous manner (i.e.: without tearing or gluing). Thus the Euler characteristic provides a very powerful technique for classifying different types of surfaces.

As a very quick application of Equation (4.3.16), consider Gauss’ Law from electrody-

namics. If you interpret K as the “ﬁeld strength” we see that the surface integral must

be some topological invariant, i.e. “charge”! This kind of connection between ﬁeld theory

(classical and quantum) and diﬀerential geometry is very common, and very powerful!

4.3.2 Higher Dimensions

In more than two dimensions, things become considerably more complicated. In order to

deﬁne the analog of Gaussian curvature, one considers all the independent two dimensional

“cross sections” of the manifold and calculates each of their Gaussian curvatures. These

curvatures are known as sectional curvatures. It can be proved that all of the sectional

curvatures are contained in the Riemann tensor, which is why the Riemann-Levi-Civita

formalism is so useful in higher dimensions, even if it is overkill in two dimensions.

Unfortunately, there is no such thing as a Gauss-Bonnet theorem for manifolds of more

than two dimensions. First of all, the Gaussian curvature is no longer well-deﬁned. Instead,

you might use some information from the Riemann tensor, but it is not obvious how to do

this. Secondly, and more importantly, there is no known universal topological invariant in

more than two dimensions! In other words, manifolds with the same Euler characteristic


need no longer be topologically equivalent in higher dimensions, and we know of no single

quantity that does classify manifolds in this way. This is way beyond the scope of this

review, so I will stop now. Let me just say that this is one of the most active areas in

modern mathematics research. If you are at all interested, I strongly encourage you to check

it out!


Chapter 5

Diﬀerential Forms

Diﬀerential forms are, without a doubt, one of the most beautiful inventions in all of mathe-

matics! Their elegance and simplicity are without bound. Nevertheless, like anything good,

it takes a long time before they become natural and second nature to the student. Whether

you realize it or not, you have been using diﬀerential forms for a long time now (gradient).

Before diving into forms, I should mention two quick things. First of all, it is likely

that many of you will never need to know any of this in your life. If that is the case, then

you have no reason to read this chapter (unless you’re curious). I have included it anyway,

since writing these notes has helped me to understand some of the more subtle details of

the mathematics, and I ﬁnd forms to be very useful. Diﬀerential forms are extremely useful

when, for example, you wish to discuss electromagnetism, or any gauge ﬁeld for that matter,

in curved space (i.e.: with gravity). This is why I thought they would do well in this review.

The second point I want to make is that no matter how much you like or dislike forms,

remember that they are nothing more than notation. If mathematicians had never discovered

diﬀerential forms, we would not be at a serious loss; however, many calculations would

become long and tedious. The idea behind this beautiful notation is that terms cancel early.

If we had never heard of diﬀerential forms, one would have to carry out terms that would

ultimately vanish, making the calculations nasty, at best! But let us not forget that there is

nothing truly "fundamental" about forms – they are a convenience, nothing more.

That being said, let’s jump in...

5.1 The Mathematician’s Deﬁnition of Vector

One thing you have to be very careful of in mathematics is deﬁnitions. We have to construct

a deﬁnition that both allows for all the cases that we know should exist, and also forbids

any pathological examples. This is a big challenge in general mathematical research. We

have been talking a lot about “vectors”, where we have been using the physicist’s deﬁnition

involving transformation properties. But this is unfortunately not the only mathematical

deﬁnition of a vector. Mathematicians have another deﬁnition that stems from the subject

of linear algebra. For the moment, we are going to completely forget about the physical

deﬁnition (no more transformations) and look on vectors from the mathematician’s point of

view. To this end we must make a new deﬁnition; the key deﬁnition of linear algebra:


Definition: A Vector Space (sometimes called a Linear Space), denoted by 𝒱, is a set of objects, called vectors, and a field, called scalars, and two operations called vector addition and scalar multiplication that has the following properties:

1. (𝒱, +) is an abelian group.

2. 𝒱 is closed, associative, and distributive under scalar multiplication, and there is a scalar identity element (1v = v).

In our case, the field of scalars will always be R, but in general it could be something else, like C or some other field from abstract algebra.

Notice that this definition of vectors has absolutely nothing to do with the transformation properties that we are used to dealing with. Nevertheless, you can prove to yourself very easily that the set of n-dimensional real vectors (R^n) is a vector space. But so are much more abstract quantities. For example, the set of all n-differentiable, real-valued functions (C^n(R)) is also a vector space, where the "vectors" are the functions themselves. Another example of a vector space is the set of all eigenfunctions in quantum mechanics, where the eigenfunctions are the vectors. This is why it's called "abstract" algebra!

Sticking to abstract linear algebra for a minute, there is a very important fact about vector spaces. Recall that a linear transformation is a function that operates on 𝒱 and preserves addition and scalar multiplication:

T(αx + βy) = αT(x) + βT(y)

Then let us consider the set of all linear transformations that go from 𝒱 to R:

Λ¹(𝒱) = {φ | φ : 𝒱 → R; φ linear}     (5.1.1)

I'll explain the notation in a moment. This set is very intriguing – every element in this set has the property that it acts on any vector in the original vector space and returns a number. It is, in general, true that for any vector space, this special set exists. It is also true that this space also forms a vector space itself. It is so important in linear algebra, it is given a name. We call it the dual space of 𝒱, and denote it as 𝒱*. The vectors (functions) in the space are called covectors, or 1-forms (hence the 1 superscript).

Armed with this knowledge we can go back and redefine what we mean by "tensor": consider the set of all multilinear (that is, linear in each of its arguments) transformations that send k vectors and l covectors to a number:

T : 𝒱 × ··· × 𝒱 (k times) × 𝒱* × ··· × 𝒱* (l times) → R

The set of all such transformations is called 𝒯^l_k and the elements of this set are called tensors of rank (l, k). Notice immediately that 𝒯^0_1 = 𝒱*, 𝒯^1_0 = 𝒱. Also notice that this definition of tensor has nothing to do with transformations, as we defined them in Chapter 1. For example, according to this definition, the Christoffel symbols of differential geometry are (1, 2) tensors. So you must be sure you know which definition people are using.

That's enough abstraction for one day. Let us consider a concrete example. Consider the vector space R^n and the set of multilinear transformations that take k vectors to a number:

φ : R^n × R^n × ··· × R^n (k times) → R     (5.1.2)

with the additional property that they are "alternating", i.e.:

φ(..., x, y, ...) = −φ(..., y, x, ...)     (5.1.3)

When φ takes k vectors in R^n, then φ is called a k-form. The set of all k-forms is denoted Λ^k(R^n). Note that this set is also a vector space. The dual space is an example of this, hence the notation.

In linear algebra, we know that every vector space has a basis, and the dimension of the vector space is equal to the number of elements in a basis. What is the standard basis for Λ^k(R^n)? Let us start by considering the dual space to R^n (k = 1). We know the standard basis for R^n; let's denote it by {e₁, ..., eₙ}. Now let us define a 1-form dx^i with the property:

dx^i(e_j) ≡ ⟨dx^i, e_j⟩ = δ^i_j     (5.1.4)

Then the standard basis of Λ^k(R^n) is the set of all k-forms:

dx^i ∧ dx^j ∧ ··· ∧ dx^l;   i ≠ j ≠ ··· ≠ l;   i, j, ..., l ≤ n     (5.1.5)

where each "wedge product" has k "dx"s^1.

Before going any further, perhaps it would be wise to go over this and make sure you understand it. Can you write down the standard basis for Λ²(R³)? How about Λ²(R⁴)? Here are the answers:

Λ²(R³) : {dx¹∧dx², dx¹∧dx³, dx²∧dx³}  ⇒  D = 3

Λ²(R⁴) : {dx¹∧dx², dx¹∧dx³, dx¹∧dx⁴, dx²∧dx³, dx²∧dx⁴, dx³∧dx⁴}  ⇒  D = 6

By looking at these examples, you should be able to figure out what the dimension of Λ^k(R^n) is, in general (k < n):

dim Λ^k(R^n) = C(n, k) = n!/(k!(n − k)!)     (5.1.6)

Notice that if k > n the space is trivial – it only has the zero element.

^1 Wedge products are discussed below, but for now all you need to know of them is that they are antisymmetric products, so A ∧ B = −B ∧ A for A, B 1-forms.
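Since a basis k-form is just a choice of k strictly increasing indices out of n, the counting in Equation (5.1.6) can be checked in a few lines of Python (a sketch; the function name basis is mine):

    from itertools import combinations
    from math import comb

    # Sketch: the standard basis of Lambda^k(R^n) as increasing index tuples.
    def basis(n, k):
        return list(combinations(range(1, n + 1), k))

    print(basis(3, 2))                    # [(1,2), (1,3), (2,3)] -> D = 3
    print(len(basis(4, 2)), comb(4, 2))   # 6 6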

5.2 Form Operations

5.2.1 Wedge Product

In order to define the standard basis for Λ^k(R^n), we needed to introduce a new operation, called the wedge product (∧). Let's look more carefully at this new form of multiplication. As a "pseudo-definition", consider two 1-forms ω, φ ∈ Λ¹(R^n); then let us define an element in Λ²(R^n) by the following:

ω ∧ φ(v, w) = ω(v)φ(w) − ω(w)φ(v)     (5.2.7)

Notice that there are two immediate corollaries:

ω ∧ φ = −φ ∧ ω     (5.2.8)
ω ∧ ω = 0   for any 1-form ω     (5.2.9)

So the wedge product defines a noncommutative product of forms. Notice that this is different from the ordinary tensor product:

ω ⊗ φ(v, w) ≡ ω(v)φ(w)     (5.2.10)

Looking at this as an antisymmetric product on forms, you might be reminded of the

cross product. This is exactly right: the cross-product of two vectors is the same thing as a

wedge product of two 1-forms. We will prove this explicitly soon.

Now that we have talked about the speciﬁc case of 1-forms, we can generalize to the

wedge product of any forms:

Deﬁnition: Let ω ∈ Λ

k

(R

n

), φ ∈ Λ

l

(R

n

). Then we deﬁne ω ∧ φ ∈ Λ

k+l

(R

n

) as the sum of

all the antisymmetric combinations of wedge products. This product is

1. Distributive: (dx

1

+ dx

2

) ∧ dx

3

= dx

1

∧ dx

2

+ dx

1

∧ dx

3

2. Associative: (dx

1

∧ dx

2

) ∧ dx

3

= dx

1

∧ (dx

2

∧ dx

3

) = dx

1

∧ dx

2

∧ dx

3

3. Skew-commutative: ω ∧ φ = (−1)

kl

φ ∧ ω

Notice that for k + l > n the space is trivial, so φ ∧ ω = 0.

Before going any further, let's do some examples. Consider ω, φ ∈ Λ¹(R³), i.e.: 1-forms in 3-space. Writing it out explicitly, we have:

ω = a₁dx¹ + a₂dx² + a₃dx³
φ = b₁dx¹ + b₂dx² + b₃dx³

ω ∧ φ = (a₁dx¹ + a₂dx² + a₃dx³) ∧ (b₁dx¹ + b₂dx² + b₃dx³)
      = (a₁b₂ − a₂b₁)dx¹∧dx² + (a₁b₃ − a₃b₁)dx¹∧dx³ + (a₂b₃ − a₃b₂)dx²∧dx³

Notice how I've been careful to keep track of minus signs. Also notice that this looks very similar to a cross product, as stated earlier. What about taking the wedge product of two 1-forms in R⁴:

ξ = a₁dx¹ + a₂dx² + a₃dx³ + a₄dx⁴
χ = b₁dx¹ + b₂dx² + b₃dx³ + b₄dx⁴

ξ ∧ χ = (a₁b₂ − a₂b₁)dx¹∧dx² + (a₁b₃ − a₃b₁)dx¹∧dx³ + (a₁b₄ − a₄b₁)dx¹∧dx⁴
      + (a₂b₃ − a₃b₂)dx²∧dx³ + (a₂b₄ − a₄b₂)dx²∧dx⁴ + (a₃b₄ − a₄b₃)dx³∧dx⁴

Note that this does not look like a "cross product"; it is a six-dimensional object in R⁴. However, if you took another wedge product with another 1-form, you would get a 3-form in R⁴ (which is four-dimensional), and that would give you something like a four-dimensional cross product. In general, taking the wedge product of (n − 1) 1-forms in R^n will give you something analogous to a cross-product. We'll get back to this later.
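The components appearing in these expansions are just antisymmetrized products of the coefficients, which makes them easy to generate numerically. A quick sketch (the numbers are arbitrary test data):

    import numpy as np

    # Sketch: coefficients of the wedge of two 1-forms are a_i b_j - a_j b_i.
    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 5.0, 6.0])

    W = np.outer(a, b) - np.outer(b, a)   # W[i,j] = coeff of dx^(i+1) ^ dx^(j+1)
    print(W[0, 1], W[0, 2], W[1, 2])      # (a1 b2 - a2 b1), etc.
    print(np.cross(a, b))                 # the same data, reshuffled with signs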

As a final example, what if you took the wedge product of a 1-form and a 2-form in R³:

α = c₁dx¹ + c₂dx² + c₃dx³     (5.2.11)
β = L₁₂dx¹∧dx² + L₁₃dx¹∧dx³ + L₂₃dx²∧dx³     (5.2.12)
α ∧ β = (c₁L₂₃ − c₂L₁₃ + c₃L₁₂) dx¹∧dx²∧dx³     (5.2.13)

Notice that if we identify β = ω ∧ φ from above, then we have shown that the triple wedge product of 1-forms in R³ is just:

ω ∧ φ ∧ α = det[ a₁ a₂ a₃
                 b₁ b₂ b₃
                 c₁ c₂ c₃ ] dx¹∧dx²∧dx³     (5.2.14)

In fact, the wedge product of n 1-forms in R^n is always such a determinant. This is a very important fact that will come up again and again.

Next, we can ask how a general wedge product of forms acts on vectors. The answer is that the form farthest to the left acts on the vector, and then you must permute the form so that all the 1-forms act on the vector. As an example, consider a 2-form acting on a vector:

dx ∧ dy(v) = dx(v)dy − dy(v)dx = v_x dy − v_y dx

The minus sign comes from flipping the dx and the dy.

Before moving on, let us conclude this section with a definition and a theorem:

Definition: A simple k-form is a "monomial" k-form; that is, there is no addition.

Theorem: All k-forms can be written as a linear combination of simple k-forms.

Proof: The proof of this theorem is straightforward: a simple k-form is just a basis element, possibly multiplied by a number. Then since we are dealing with a vector space, any element of Λ^k(R^n) can be written as a linear combination of these basis elements. QED.

5.2.2 Tilde

The tilde operator is the operator that translates us from vectors to forms, and vice versa. In general, it is an operation: (rank-k tensors) → (k-forms). Let's see some examples:

v = v₁e₁ + v₂e₂ + v₃e₃ ∈ R³
ṽ = v¹dx¹ + v²dx² + v³dx³ ∈ Λ¹(R³)

Notice that the tilde changed subscript indices to superscript. This is important when we have to keep track of what is covariant and contravariant.

Let's do another less trivial example:

L = [ 0      L₁₂    L₁₃
      −L₁₂   0      L₂₃
      −L₁₃   −L₂₃   0  ] ∈ R³ (Rank-2)

L̃ = L₁₂dx¹∧dx² + L₁₃dx¹∧dx³ + L₂₃dx²∧dx³ ∈ Λ²(R³)
  = ½ L_{ij} dx^i ∧ dx^j

where in the last line I'm using Einstein notation for the indices. Notice how much simpler it is to write a 2-form as opposed to writing a rank-2 tensor. Also note that both L_{ij} and L_{ji} are taken care of, including the minus sign, by using wedge products as opposed to using tensor products – this is why we need the factor of ½ in the last line. We are beginning to see why form notation can be so useful!

Incidentally, notice that I am only considering antisymmetric tensors when I talk about forms. This is because the form notation has antisymmetry built into it. For symmetric tensors, the concept of a form is not very useful.

Finally, let me state a pretty obvious result:

Theorem: ˜(˜v) = v.

Proof: Left to the reader.

5.2.3 Hodge Star

The next tool we introduce for differential forms is the Hodge Star (∗). The Hodge Star converts forms to their so-called dual form:

∗ : Λ^k(R^n) → Λ^{n−k}(R^n)     (5.2.15)

As for many other things in linear algebra, it is sufficient to consider how the Hodge star acts on the basis elements. Let's look at some examples.

In R²:

∗dx¹ = dx²
∗dx² = −dx¹

In R³:

∗dx¹ = dx²∧dx³        ∗(dx¹∧dx²) = dx³
∗dx² = dx³∧dx¹        ∗(dx²∧dx³) = dx¹
∗dx³ = dx¹∧dx²        ∗(dx³∧dx¹) = dx²

In R⁴:

∗dx¹ = +dx²∧dx³∧dx⁴        ∗(dx¹∧dx²) = +dx³∧dx⁴        ∗(dx¹∧dx²∧dx³) = +dx⁴
∗dx² = −dx³∧dx⁴∧dx¹        ∗(dx¹∧dx³) = −dx²∧dx⁴        ∗(dx¹∧dx²∧dx⁴) = −dx³
∗dx³ = +dx⁴∧dx¹∧dx²        ∗(dx¹∧dx⁴) = +dx²∧dx³        ∗(dx¹∧dx³∧dx⁴) = +dx²
∗dx⁴ = −dx¹∧dx²∧dx³        ∗(dx²∧dx³) = +dx¹∧dx⁴        ∗(dx²∧dx³∧dx⁴) = −dx¹
                            ∗(dx²∧dx⁴) = −dx¹∧dx³
                            ∗(dx³∧dx⁴) = +dx¹∧dx²

We can see a pattern here if we remember how the Levi-Civita symbol works. For any form in Λ^k(R^n):

Φ = φ_{µ₁...µₖ} dx^{µ₁} ∧ ··· ∧ dx^{µₖ}  ⇒
∗Φ = (1/(n−k)!) ε^{ν₁...νₙ} φ_{ν₁...νₖ} dx^{ν_{k+1}} ∧ ··· ∧ dx^{νₙ}     (5.2.16)

or

∗φ_{µ₁...µ_{n−k}} = (1/(n−k)!) ε^{ν₁...νₖ}_{µ₁...µ_{n−k}} φ_{ν₁...νₖ}     (5.2.17)

From the above, we can now present a theorem:

Theorem: For any k-form ω ∈ Λ^k(R^n), ∗∗ω = (−1)^{k(n−k)} ω.

The proof follows from using Equation (5.2.16) and the properties of the Levi-Civita tensor. Notice that in odd dimensions (such as R³) the double Hodge star is always the identity, independent of k.

Note that in all cases, we get the general result:

dx^i ∧ ∗dx^i = dx¹ ∧ dx² ∧ ··· ∧ dxⁿ ≡ dⁿx ∈ Λⁿ(R^n)     (5.2.18)

Λⁿ(R^n) is one-dimensional; its single basis element is the wedge product of all the dx^j. This product is called the volume form of R^n. With this in mind, we can reinterpret Equation (5.2.14). Given three vectors in R³, take their corresponding forms with the tilde and then take the wedge product of the result. The answer you get is the determinant of the matrix formed by these three vectors, multiplied by the volume form of R³. What is this determinant? It is simply the Jacobian of the transformation from the standard basis to the basis of the three vectors you chose! In other words, differential forms automatically give us the vector calculus result:

dⁿx′ = J dⁿx     (5.2.19)

5.2.4 Evaluating k-Forms

We have done a lot of defining, but not too much calculating. You might be asking at this point: OK, so we have all of this machinery – now what?

The appearance of determinants should give you a clue. Whenever we need to evaluate a k-form, we construct a matrix according to the prescription the form gives us. Finally, we take the determinant of the matrix.

To construct the matrix, we simply choose whatever elements of our vectors get picked out by the simple form.

Examples: evaluate the following forms:

1. dx ∧ dy ((a, b, c), (d, e, f))

2. dx ∧ dz ((a, b, c), (d, e, f))

3. dx ∧ dy ∧ dz ((a, b, c), (d, e, f), (g, h, i))

Answers:

1. Construct the matrix by picking out the "x" and "y" components of the vectors, and take the determinant:

det[ a  d
     b  e ]

2. This is similar, but now pick out the "x" and "z" components:

det[ a  d
     c  f ]

3. For this case, it is simply what we had before:

det[ a  d  g
     b  e  h
     c  f  i ]

Hopefully you get the hang of it by now.
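The recipe is short enough to code directly; here is a sketch (the helper name evaluate and the test vectors are mine):

    import numpy as np

    # Sketch: evaluate a simple k-form on k vectors by picking out the
    # named components and taking a determinant.
    def evaluate(indices, *vectors):
        M = np.array([[v[i] for v in vectors] for i in indices])
        return np.linalg.det(M)

    v1, v2 = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
    print(evaluate([0, 1], v1, v2))   # dx ^ dy: det [[1,4],[2,5]] = -3
    print(evaluate([0, 2], v1, v2))   # dx ^ dz: det [[1,4],[3,6]] = -6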


5.2.5 Generalized Cross Product

Earlier, I mentioned that we can define a generalized cross product in R^n by wedging together (n − 1) 1-forms. Let's see this explicitly:

In R^n, take n − 1 vectors, and tilde them so they are now 1-forms: {ṽ₁, ..., ṽ_{n−1}}. Now wedge these forms together to give you an (n − 1)-form, and take the Hodge star of the product to give us a 1-form. Finally tilde the whole thing to give us an n-vector, and we have defined a general product of vectors:

v₁ × ··· × v_{n−1} ≡ [∗(ṽ₁ ∧ ··· ∧ ṽ_{n−1})]~ ∈ R^n     (5.2.20)

5.3 Exterior Calculus

A k-form field is a k-form whose coefficients depend on the coordinates. This is exactly analogous to a vector field in vector calculus. So, if we have a vector field in R³:

v(x) = v¹(x)e₁ + v²(x)e₂ + v³(x)e₃
ṽ(x) = v¹(x)dx¹ + v²(x)dx² + v³(x)dx³

5.3.1 Exterior Derivative

Now that we have a notion of "function", let's see what we can do in the way of calculus. We can define an operator

d : Λ^k(R^n) → Λ^{k+1}(R^n)

with the all-important property:

d² = dd ≡ 0     (5.3.21)

By construction we will let d act on forms in the following way:

d(a(x)dx) = da(x) ∧ dx     (5.3.22)

Let's look at some examples.

Let’s look at some examples.

f(x) ∈ Λ

0

(R

3

) →df(x) =

_

∂f

∂x

1

_

dx

1

+

_

∂f

∂x

2

_

dx

2

+

_

∂f

∂x

3

_

dx

3

∈ Λ

1

(R

3

)

This is just the gradient of a function! How about the curl?

ω = a₁dx¹ + a₂dx² + a₃dx³  →

dω = [(∂a₁/∂x¹)dx¹ + (∂a₁/∂x²)dx² + (∂a₁/∂x³)dx³] ∧ dx¹
   + [(∂a₂/∂x¹)dx¹ + (∂a₂/∂x²)dx² + (∂a₂/∂x³)dx³] ∧ dx²
   + [(∂a₃/∂x¹)dx¹ + (∂a₃/∂x²)dx² + (∂a₃/∂x³)dx³] ∧ dx³

   = (∂a₂/∂x¹ − ∂a₁/∂x²)dx¹∧dx² + (∂a₃/∂x¹ − ∂a₁/∂x³)dx¹∧dx³ + (∂a₃/∂x² − ∂a₂/∂x³)dx²∧dx³

Notice that terms like (∂a₁/∂x¹)dx¹∧dx¹ vanish immediately because of the wedge product. Again we see how the beauty of differential forms notation pays off!

5.3.2 Formulas from Vector Calculus

At this point we can immediately write down some expressions. Some of them I have shown. Can you prove the rest of them? In all cases, f is any (differentiable) function, and v is a vector field.

(df)~ = ∇f     (5.3.23)
(∗dṽ)~ = ∇ × v   (in R³)     (5.3.24)
∗d∗ṽ = ∇ · v     (5.3.25)
∗d∗df = ∇²f     (5.3.26)

Notice that even though dd = 0, d∗d ≠ 0 in general!
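In R³, Equations (5.3.23)–(5.3.24) combined with d² = 0 reproduce the familiar identity curl(grad f) = 0. A component-level sympy sketch confirms this for an arbitrary f (symbol names are mine):

    import sympy as sp

    # Sketch: d(df) = 0 is curl(grad f) = 0 in components.
    x1, x2, x3 = sp.symbols('x1 x2 x3')
    f = sp.Function('f')(x1, x2, x3)

    grad = [sp.diff(f, v) for v in (x1, x2, x3)]
    curl = [sp.diff(grad[2], x2) - sp.diff(grad[1], x3),
            sp.diff(grad[0], x3) - sp.diff(grad[2], x1),
            sp.diff(grad[1], x1) - sp.diff(grad[0], x2)]
    print(curl)   # [0, 0, 0] because mixed partials commute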

5.3.3 Orthonormal Coordinates

Differential forms allow us to do calculations without ever referring to a coordinate system. However, sooner or later we will want to get our answers back into a coordinate frame. This could be tricky.

The key point to remember is that each of these dx^i is an orthonormal coordinate^2. Therefore we must make sure to translate them into orthonormal coordinates in our coordinate frame; otherwise, the Hodge star will not work.

Let's consider a simple example by looking at the two-dimensional Laplacian. From above, we know that ∇² = ∗d∗d. Cartesian coordinates (x¹ = x, x² = y) are orthonormal, so we can easily plug into our formula:

^2 That means that ⟨dx^i, e_j⟩ = δ^i_j and ⟨e_i, e_j⟩ = δ_{ij}.

    df = (∂f/∂x) dx + (∂f/∂y) dy

    ∗df = (∂f/∂x) dy − (∂f/∂y) dx

    d∗df = (∂^2 f/∂x^2) dx ∧ dy − (∂^2 f/∂y^2) dy ∧ dx
         = (∂^2 f/∂x^2 + ∂^2 f/∂y^2) dx ∧ dy

    ∗d∗df = ∂^2 f/∂x^2 + ∂^2 f/∂y^2

All is well with the world. But what if I wanted to do my work in polar coordinates? Then my coordinate system is (x^1 = r, x^2 = θ), but the orthonormal polar coordinate basis vectors are e_1 = r̂, e_2 = θ̂/r, and the dual basis is (dr, r dθ) (see previous footnote). So:

    df = (∂f/∂r) dr + (∂f/∂θ) dθ                                        [× (r/r)]
       = (∂f/∂r) dr + (1/r)(∂f/∂θ) (r dθ)

    ∗df = (∂f/∂r) (r dθ) − (1/r)(∂f/∂θ) dr

    d∗df = [∂/∂r (r ∂f/∂r)] dr ∧ dθ − [∂/∂θ ((1/r)(∂f/∂θ))] dθ ∧ dr     [× (r/r)]
         = [(1/r) ∂/∂r (r ∂f/∂r) + (1/r^2)(∂^2 f/∂θ^2)] dr ∧ (r dθ)

    ∗d∗df = (1/r) ∂/∂r (r ∂f/∂r) + (1/r^2)(∂^2 f/∂θ^2) = ∇^2 f          Wow!

where I have introduced factors of (r/r) in the first and third lines in order to keep the coordinates orthonormal before using the Hodge star. So, as long as we are careful to use orthonormal coordinates when using the Hodge star, our formulas work for any coordinate system! This is a vital reason why differential forms are so useful when working with general relativity, where you want to prove things about the physics independent of your reference frame or coordinate basis.
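As a sanity check on the polar-coordinate formula we just derived, here is a short SymPy sketch comparing it against the known Cartesian answer for a test function (the choice f = x^2 y is hypothetical; any smooth function works):

    import sympy as sp

    r, th = sp.symbols('r theta', positive=True)
    x, y = r * sp.cos(th), r * sp.sin(th)
    f = x**2 * y                       # test function, written in polar form

    polar_lap = sp.diff(r * sp.diff(f, r), r) / r + sp.diff(f, th, 2) / r**2
    # The Cartesian Laplacian of x^2*y is 2y, so the difference should vanish:
    print(sp.simplify(polar_lap - 2 * y))    # 0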

5.4 Integration

We have discussed how to differentiate k-forms; how about integrating them? First I will discuss how to formally evaluate forms, and then I will present a formula for integrating them. With this definition in mind, we will be able to derive Stokes’ Theorem, possibly the most important theorem differential forms will provide us with.

5.4.1 Evaluating k-form Fields

First, a definition:

Definition: An Oriented k-Parallelogram, denoted by ±P^0_x{v_i}, (i = 1, . . . , k), is a k-parallelogram spanned by the k n-vectors {v_i} with basepoint x. P^0_x{v_i} is antisymmetric under v_i exchange.

Oriented k-parallelograms allow you to think geometrically about evaluating k-forms, but as far as paperwork goes, they are just a way to keep track of which vectors are which. We use them explicitly to evaluate k-form fields at a point. Let’s see how that works:

Let’s evaluate the 2-form field φ = cos(xz) dx ∧ dy ∈ Λ^2(R^3). We’ll evaluate it on the oriented 2-parallelogram spanned by the 3-vectors v_1 = (1, 0, 1) and v_2 = (2, 2, 3). We’ll evaluate it at two points: (1, 2, π) and (1/2, 2, π):

1.  φ( P^0_{(1,2,π)} {(1, 0, 1), (2, 2, 3)} ) = cos(1 · π) det [ 1  2 ]  = −2
                                                               [ 0  2 ]

2.  φ( P^0_{(1/2,2,π)} {(1, 0, 1), (2, 2, 3)} ) = cos((1/2)π) det [ 1  2 ]  = 0
                                                                  [ 0  2 ]
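These evaluations are easy to automate. A minimal NumPy sketch of the same two computations:

    import numpy as np

    def phi(point, v1, v2):
        """Evaluate cos(xz) dx^dy on the 2-parallelogram spanned by v1, v2."""
        x, y, z = point
        # dx^dy picks out the x and y components of the spanning vectors:
        return np.cos(x * z) * np.linalg.det(np.column_stack([v1[:2], v2[:2]]))

    v1 = np.array([1.0, 0.0, 1.0])
    v2 = np.array([2.0, 2.0, 3.0])
    print(phi((1.0, 2.0, np.pi), v1, v2))   # -2.0
    print(phi((0.5, 2.0, np.pi), v1, v2))   # 0.0 (up to rounding)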

5.4.2 Integrating k-form Fields

Now that we have a formal way to evaluate k-form fields, we can talk about integrating them. We define what the integral of a k-form field is:

Definition: Let φ ∈ Λ^k(R^n) be a k-form field, A ⊂ R^k be a pavable set, and γ : A → R^n be a (vector-valued) differentiable map. Then we define the integral of φ over γ(A) as:

    ∫_{γ(A)} φ ≡ ∫_A φ( P^0_{γ(u)} { ∂γ/∂x^1 |_u , ∂γ/∂x^2 |_u , . . . , ∂γ/∂x^k |_u } ) d^k u        (5.4.27)

where d^k u is the k-volume form.

Like in vector calculus, we can define the integral formally in terms of Riemann sums – yuck! Let me just say that this is an equivalent definition; the more curious students can go prove it.

Let’s do two examples to get the hang of it:

1. Consider φ ∈ Λ^1(R^2) and integrate over the map:

    γ(u) = (R cos u, R sin u)

over the region A = [0, α], (α > 0). Let φ = x dy − y dx:

    ∫_{γ(A)} φ = ∫_{[0,α]} (x dy − y dx)( P^0_{(R cos u, R sin u)} {(−R sin u, R cos u)} ) du
               = ∫_0^α du [(R cos u)(R cos u) − (R sin u)(−R sin u)]
               = ∫_0^α du R^2 [cos^2 u + sin^2 u]
               = αR^2

2. Consider φ = dx ∧ dy + y dx ∧ dz ∈ Λ^2(R^3), and the map:

    γ(s, t) = (s + t, s^2, t^2)

over the region C = { (s, t) | 0 ≤ s ≤ 1, 0 ≤ t ≤ 1 }:

    ∫_{γ(C)} φ = ∫_0^1 ∫_0^1 (dx ∧ dy + y dx ∧ dz)( P^0_{(s+t, s^2, t^2)} {(1, 2s, 0), (1, 0, 2t)} ) ds dt

               = ∫_0^1 ∫_0^1 [ det [ 1   1 ]  +  s^2 det [ 1   1  ] ] ds dt
                                   [ 2s  0 ]             [ 0   2t ]

               = ∫_0^1 ∫_0^1 (−2s + 2s^2 t) ds dt

               = ∫_0^1 ds (−2s + s^2) = −2/3
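To double-check the answer −2/3, we can hand the (s, t) integral to SymPy; the code below simply transcribes the steps above:

    import sympy as sp

    s, t = sp.symbols('s t')
    gamma = sp.Matrix([s + t, s**2, t**2])
    dgs, dgt = gamma.diff(s), gamma.diff(t)      # (1, 2s, 0) and (1, 0, 2t)

    def det2(i, j):
        return sp.Matrix([[dgs[i], dgt[i]], [dgs[j], dgt[j]]]).det()

    integrand = det2(0, 1) + gamma[1] * det2(0, 2)      # dx^dy + y dx^dz
    print(sp.integrate(integrand, (s, 0, 1), (t, 0, 1)))  # -2/3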

5.4.3 Stokes’ Theorem

Now we can move on to present one of, if not the, most important theorems that differential forms have to offer: Stokes’ Theorem. I will present it correctly; do not be overly concerned with all the hypotheses; it suffices that the area you are integrating over has to be appropriately “nice”.

Theorem: Let X be a compact piece-with-boundary of a (k+1)-dimensional oriented manifold M ⊂ R^n. Give the boundary of X (denoted by ∂X) the proper orientation, and consider a k-form field φ ∈ Λ^k(R^n) defined on a neighborhood of X. Then:

    ∫_{∂X} φ = ∫_X dφ        (5.4.28)

What is this theorem saying? My old calculus professor used to call it the “Jumping-d theorem”, since the “d” jumps from the manifold to the form. In words, this theorem says that the integral of a form over the boundary of a sufficiently nice manifold is the same thing as the integral of the derivative of the form over the whole manifold itself.

You have used this theorem many times before. Let’s rewrite it in more familiar notation, for the case of R^3:

    k   dφ             X                  ∫_X dφ = ∫_{∂X} φ                    Theorem Name
    0   ∇f · dx        Path from a to b   ∫_a^b ∇f · dx = f(b) − f(a)          FTOC
    1   (∇ × f) · dS   Surface (Ω)        ∫_Ω (∇ × f) · dS = ∮_{∂Ω} f · dx     Stokes Theorem
    2   (∇ · f) d^3x   Volume (V)         ∫_V (∇ · f) d^3x = ∮_{∂V} f · dS     Gauss Theorem

Here, I am using vector notation (even though technically I am supposed to be working with forms) and for the case of R^3, I’ve taken advantage of the following notations:

    dx = (dx, dy, dz)
    dS = (dy ∧ dz, dz ∧ dx, dx ∧ dy)
    d^3x = dx ∧ dy ∧ dz

As you can see, all of the theorems of vector calculus in three dimensions are reproduced as specific cases of this generalized Stokes Theorem. However, in the forms notation, we are not limited to three dimensions!

Before leaving differential forms behind, I should mention one important point that I have gone out of my way to avoid: the issue of orientation. For the sake of this introduction, let me just say that all the manifolds we consider must be “orientable with acceptable boundary”. What does this mean? It means that the manifold must have a sense of “up” (no Möbius strips). “Acceptable boundary” basically means that we can, in a consistent and smooth way, “straighten” the boundary (no fractals). These are technical issues that I have purposely left out, but they are important if you want to be consistent.

5.5 Forms and Electrodynamics

As a finale for differential forms, I thought it would be nice to summarize briefly how one uses forms in theories such as E&M. Recall that in covariant electrodynamics, we have an antisymmetric, rank-2 4-tensor known as the “Field Strength” tensor:

    F ≡ F_{μν} dx^μ ⊗ dx^ν        (5.5.29)

As we have seen, it is always possible to use 2-forms instead of (antisymmetric) tensors, and we can rewrite the above tensor as a differential form in Λ^2(M^4) (see footnote 3):

    F ≡ (1/2) F_{μν} dx^μ ∧ dx^ν ∈ Λ^2(M^4)        (5.5.30)

[Footnote 3: M^4 is Minkowski space-time. In general, I can go to any space I want; for now, I’m sticking to flat space (i.e.: ignoring gravity). But notice that this analysis is totally general, and gravity can be easily included at this stage.]

Notice that the antisymmetry of F_{μν} is immediate in form notation. Also, this is a 2-form in four dimensions, which means that its dual is also a 2-form. Now using Equation (5.5.30), I can write the Lorentz Force Law for a particle with charge e moving with a velocity vector u as:

    ṗ̃ = eF(u) ∈ Λ^1(M^4)        (5.5.31)

Let’s consider an example. Suppose we have a magnetic field in the x̂ direction, so F = B_x dy ∧ dz. Then the force on the particle is:

    ṗ̃ = eF(u) = eB_x ⟨dy ∧ dz, u⟩
              = eB_x [dy ⟨dz, u⟩ − dz ⟨dy, u⟩]
              = eB_x [u_z dy − u_y dz]   ⇒   ṗ = eB_x (u_z ŷ − u_y ẑ)

This is exactly what you would have gotten if you had used your vector form of Lorentz’s Law, with cross products.
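Here is a quick numerical check that this agrees with the cross-product form ṗ = e u × B (a sketch with e = 1 and hypothetical numbers for u and B_x):

    import numpy as np

    Bx = 2.0
    u = np.array([0.3, 0.5, 0.7])                 # (u_x, u_y, u_z)
    B = np.array([Bx, 0.0, 0.0])                  # field along x-hat

    cross_law = np.cross(u, B)                    # u x B, with e = 1
    form_law = Bx * np.array([0.0, u[2], -u[1]])  # B_x (u_z yhat - u_y zhat)
    print(np.allclose(cross_law, form_law))       # True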

Another vector that is important in E&M is the 4-current (J). We will again think of J as a 1-form in M^4; therefore, its dual is a 3-form.

Armed with all we need, we can write down Maxwell’s Equations:

    dF = 0           (5.5.32)
    d∗F = 4π ∗J      (5.5.33)

To interpret these equations, one can think of F (∗F) as an object that represents “tubes of force” flowing through space-time.

We can take advantage of Stokes Theorem (just like in regular vector calculus) to put these equations in integral form:

    ∫_Σ dF = ∮_{∂Σ} F = 0                     (5.5.34)
    ∫_Σ d∗F = ∮_{∂Σ} ∗F = 4π (charge)         (5.5.35)

The ﬁrst of these equations says that the total ﬂux of F through a closed region of space-

time is zero; the second equation says that the total ﬂux of ∗F through a closed region of

space-time is proportional to the amount of charge in that region. Notice that this description

never mentions coordinate systems: once again, differential forms have allowed us to describe

physics and geometry without ever referring to a coordinate system! Notice the similarity

of this equation with the Gauss-Bonnet Theorem – in gauge theories, you may think of the

ﬁeld strength as a curvature to some space!

The final step in studying electrodynamics is to notice that Equation (5.5.32) suggests something. Recall that if a differential form is the exterior derivative of some other form, then its exterior derivative is necessarily zero, via Equation (5.3.21). Is the converse true?

Namely, if you have a form whose exterior derivative is zero, can it be written as the exterior

derivative of some other form?

Your ﬁrst impulse might be to say “yes”: surely if the form is an exterior derivative, your

condition is satisﬁed. But it turns out that the question is a little more subtle than that.

For I can give you a form which is not an exterior derivative of another form, and yet still

has vanishing exterior derivative! This is a famous case of “necessity” versus “suﬃciency”

in mathematics. That a form is an exterior derivative is suﬃcient for its exterior derivative

to vanish, but not necessary.

The key point is that this property depends not on the differential forms themselves, but on the global properties of the space they live in! It turns out that M^4 does indeed have the required property; specifically, it is simply connected. Therefore it is safe to assume, à la Equation (5.5.32), that we can write:

    F = dA        (5.5.36)

for some 1-form A. So indeed, electrodynamics in flat space can be described by a 4-vector potential. But do be careful not to jump to any conclusions before you know the properties of the universe you are describing!

Finally, we can consider what happens if we let A → A + dλ, where λ is any (reasonably well-behaved) function. Then by just plugging into Equation (5.5.36), with the help of Equation (5.3.21), we see that F remains the same. Hence, the 4-vector potential is only defined up to a 4-gradient; this is exactly what gauge invariance tells us should happen!

This gives a very powerful geometric intuition for our field strength, and therefore for electricity and magnetism. You can do similar analyses for any gauge field, and by adjusting the manifold, you can alter your theory to include gravity, extra dimensions, strings, or whatever you want! This is one of the biggest reasons why differential forms are so useful to physicists.

Chapter 6

Complex Analysis

In this chapter, I would like to present some key results that are used in analytic work. There is of course much more material than what I present here, but hopefully this small review will be useful.

Complex analysis is concerned with the behavior of functions of complex variables. It turns out that these functions behave very differently from functions of real variables. The techniques of complex analysis, pioneered in the eighteenth and nineteenth centuries, have become very important in almost all branches of applied mathematics. I am assuming that you are familiar with the basics of complex numbers (z = x + iy ∈ C) and will immediately move on.

In my experience, applications of complex numbers fall into two general categories: Cauchy’s Integral Theorem and Conformal Mapping. This review will consider the calculus of complex variables. I will not mention too much about conformal mapping, except to define it and give an example or two.

Complex numbers are written as z. Then a function of a complex variable has the form:

    f(z) = f(x + iy) = u(x, y) + iv(x, y)

where u, v are both real-valued functions of the real variables x, y. You can also write a complex number in polar coordinates, z = re^{iθ}, where r ≡ |z| and θ is called the argument of z, sometimes denoted arg z.

6.1 Analytic Functions

• Definition: A complex function f(z) is analytic at a point z_0 if it has a Taylor expansion that converges in a neighborhood of that point:

    f(z) = Σ_{i=0}^∞ a_i (z − z_0)^i

If f(z) is analytic at all points inside some domain D ⊂ C then it is said to be analytic in D. If it is analytic for all finite complex numbers, it is said to be entire. If it is analytic for all complex numbers, including infinity, then f(z) = constant!

• Theorem: The real and imaginary parts of an analytic function f(z) = u(x, y) + iv(x, y) satisfy the Cauchy-Riemann (CR) Equations:

    u_x = v_y         (6.1.1)
    v_x = −u_y        (6.1.2)

where u_x means the partial derivative of u with respect to x, etc. From the CR equations, it can be seen that both u and v satisfy Laplace’s equation:

    u_xx + u_yy = v_xx + v_yy = 0

Functions that satisfy Laplace’s equation are called harmonic functions. If u(x, y) is a harmonic function, then up to an additive constant, there is a unique function v(x, y) that satisfies the CR equations with u(x, y). This function is called the harmonic conjugate of u(x, y). Notice that this relation is not symmetric in general; in other words, if v is the harmonic conjugate of u, then it is not generally true that u is the harmonic conjugate of v. When it is true, it is possible to show that u and v are constants.

• Let’s do an example:

    f(z) = z^2 = (x^2 − y^2) + i(2xy)

This function is analytic everywhere in the finite complex plane (it is entire): its Taylor series about z = 0 is trivial: a_2 = 1, all others zero. It also satisfies the CR equations. Note that the function v(x, y) = 2xy is the harmonic conjugate to u(x, y) = x^2 − y^2, but as we expect from the above note, the converse is not true (prove it!).

• Let’s see how one might construct the harmonic conjugate of a real-valued harmonic function. Consider the function u(x, y) = y^3 − 3x^2 y. This function is certainly harmonic (prove it!), so up to an additive constant, there should be a unique function v(x, y) that is the harmonic conjugate of u(x, y). To find it, note first of all that by the first CR equation, v_y = u_x = −6xy. Integrating this equation with respect to y gives v(x, y) = −3xy^2 + φ(x), where φ(x) is some arbitrary function of x alone. Plugging into the second CR equation gives φ′(x) = 3x^2. Integrating this gives φ(x) = x^3 + C with an arbitrary constant C, and therefore v(x, y) = x^3 − 3xy^2 + C is the harmonic conjugate to u(x, y). Therefore f(x, y) = u(x, y) + iv(x, y) must be an analytic function. Exercise: show that f(x, y) can be rewritten as f(z) = i(z^3 + C), which is certainly analytic.
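The construction above is completely algorithmic, so it is easy to script. A minimal SymPy sketch of the same steps (dropping the additive constant C):

    import sympy as sp

    x, y = sp.symbols('x y')
    u = y**3 - 3 * x**2 * y

    # First CR equation: v_y = u_x, so integrate u_x with respect to y.
    v0 = sp.integrate(sp.diff(u, x), y)          # -3*x*y**2, up to phi(x)
    # Second CR equation: v_x = -u_y fixes the leftover phi'(x).
    phi_prime = -sp.diff(u, y) - sp.diff(v0, x)  # = 3*x**2
    v = v0 + sp.integrate(phi_prime, x)
    print(sp.expand(v))                          # x**3 - 3*x*y**2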

• Consider two analytic functions f_1(z), f_2(z) defined on domains D_1, D_2 respectively, where D_1 ∩ D_2 ≠ ∅. Also assume that the two functions agree for all points in the intersection. Then f_2(z) is called the analytic continuation of f_1(z). When an analytic continuation exists, it is unique. The proof of this is not hard, but the key is in the analyticity. This means that there is only one way to “extend” a function analytically out of its original domain. Notice, however, that if f_3(z) is another analytic continuation of f_2(z) into another domain D_3, it need not be a continuation of f_1(z) into that domain!

• As an example, let’s look at our old example of f(z) = z^2. This function agrees with the function f(x) = x^2 in the domain of real numbers (y = 0). It is also analytic, therefore it is the analytic continuation of the real-valued function f(x) = x^2. Further, it is unique: there is no other function that is analytic and matches our original function!

• That was an obvious one. What about a more complicated example? Consider the factorial function defined on N: f(n) = n!  I will now define a function on the real numbers:

    Γ(x) = ∫_0^∞ ds s^{x−1} e^{−s}        (6.1.3)

Integrating by parts shows that when x is a natural number, Γ(x) = (x − 1)!  Furthermore, Γ(x) is in fact an analytic function of x, so long as x is not a negative integer or zero. Hence, Γ(x) is the (unique) analytic continuation of the factorial function to the real numbers (without the negative integers or zero). Indeed, we can go further: by replacing x → z (as long as z is not a negative integer or zero – why not?) this extends to all of C.
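A quick numerical illustration: Python’s math.gamma implements exactly this function, so we can spot-check Γ(n) = (n − 1)! as well as a non-integer value:

    import math

    for n in range(1, 7):
        print(n, math.gamma(n), math.factorial(n - 1))   # equal, pairwise
    # Off the integers the function still makes sense, e.g. Gamma(1/2) = sqrt(pi):
    print(math.gamma(0.5), math.sqrt(math.pi))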

6.2 Singularities and Residues

• Definition: A Laurent Expansion of a function f(z) about a point z_0 is an expansion in both positive and negative powers:

    f(z) = Σ_{n=−∞}^∞ c_n (z − z_0)^n = Σ_{n=1}^∞ b_{−n}/(z − z_0)^n + Σ_{n=0}^∞ a_n (z − z_0)^n        (6.2.4)

• As you can see, a Laurent expansion is a generalization of a Taylor expansion. Notice that the function is not defined at the point z_0 unless all of the b_{−n} = 0. If this is the case, f(z) is analytic at z_0. A Taylor expansion typically exists in an open disk about the point z_0. The generalization to the Laurent expansion converges inside an annulus R_1 < |z − z_0| < R_2.

• A singularity of a function is a point where the function is not defined. If z_0 is a singularity of f(z) and you can define a neighborhood (open set) about that point U(z_0) such that f(z) is analytic in the domain U(z_0) − {z_0} (so R_1 → 0 above), then z_0 is called an isolated singularity. There are three general types of isolated singularities:

1. Removable singularities are the most mild form of singularity, where all the b_{−n} = 0. An example of this is the function sin(x)/x. This function is not defined at the origin, but it does have a well-defined limit there. You can see this by performing a Taylor expansion of the sine function and dividing out by x, leaving a finite value plus terms that are O(x) and hence vanish in the limit. Removable singularities are trivial, and I will not talk anymore about them.

2. If a function has a Laurent expansion about z_0 but there exists an integer m > 0 such that all the b_{−n} = 0 for every n > m (but with b_{−m} ≠ 0), then the function is said to have a Pole of order m at z_0. In the special case of m = 1, the function is said to have a simple pole. An example of a function with a simple pole is cos(x)/x at the origin, where it behaves as 1/x. Poles are very useful and important in complex analysis, as we will see.

3. If a function has a Laurent expansion about z_0 but there is no integer m > 0 such that all the b_{−n} = 0 for every n > m (so the denominator terms do not stop), then the function is said to have an Essential Singularity at z_0. An example of this is the function f(z) = e^{1/z} at the origin. This singularity is a true nightmare, and there is very little to be done for it!

• If a function f(z) has a pole of order m at z_0 (not necessarily a simple pole), then the quantity b_{−1} is called the Residue of the pole, also denoted by Res_{z=z_0}[f(z)]. This quantity plays a vital role in the calculus of complex-valued functions. A function that is analytic in a domain, up to maybe a countable number of poles (so it still has a Laurent expansion), is said to be meromorphic in that domain.

• Residues are so important, I want to list various methods of finding them. I will state several methods here without proof:

1. If p(z_0) ≠ 0 and q(z) has a zero of order m at z_0 (see footnote 1), then p(z)/q(z) has a pole of order m at z_0.

2. If f(z) has a pole of order m at z_0, it can be written as f(z) = p(z)/(z − z_0)^m where p(z_0) ≠ 0. Then the residue of f(z) at z_0 is b_{−1} = p^{(m−1)}(z_0)/(m − 1)!

3. If q(z) has a simple zero at z_0 and p(z_0) ≠ 0, then the residue of p(z)/q(z) at z_0 is given by b_{−1} = p(z_0)/q′(z_0).

[Footnote 1: i.e.: q(z) = (z − z_0)^m r(z), where r(z_0) ≠ 0.]
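These rules are easy to test against a computer algebra system. A small SymPy sketch (the test functions are hypothetical choices):

    import sympy as sp

    z = sp.symbols('z')
    # Rule 3, simple pole: cos(z)/z at z = 0 has residue p(0)/q'(0) = 1.
    print(sp.residue(sp.cos(z) / z, z, 0))                  # 1
    # Rule 2, pole of order 3: f = exp(I z)/(z**2 + 1)**3 at z = I.
    f = sp.exp(sp.I * z) / (z**2 + 1)**3
    p = sp.exp(sp.I * z) / (z + sp.I)**3                    # f = p/(z - I)**3
    by_rule_2 = sp.diff(p, z, 2).subs(z, sp.I) / sp.factorial(2)
    print(sp.simplify(sp.residue(f, z, sp.I) - by_rule_2))  # 0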

• Before leaving this section, there is one more type of singularity that comes up. This is not an isolated singularity, and therefore is not defined through its Laurent expansion. Rather, this singularity comes from an ambiguity in multiply defined functions. For example, consider the complex logarithmic function:

    log z = log r + iθ

where I have used the famous result z = re^{iθ}. The imaginary part of this function is a problem, since θ is only defined up to an arbitrary factor of 2π. Therefore, this expression has an infinite number of solutions for a single value of z. In real analysis, we would throw such functions away, but not in complex analysis. The solution is to define a cut where we explicitly say what values of θ we allow. For example, we can say that we will only let −π < θ < +π. This is called the principal branch of the logarithmic function, and is sometimes denoted Log z.

• Branch cuts are lines in the complex plane that tell you where to cut off the definition of your function. For example, the branch cut for the logarithmic function is a line that connects the origin to infinity. For the principal branch, that line extends along the negative real axis. It could have gone in any direction, which would have corresponded to a different branch of the logarithm. In addition, it does not even have to be a straight line. Notice that the function is discontinuous as you cross the branch cut. In the case of the logarithm: log(−x + iε) − log(−x − iε) → 2πi as ε → 0, where x is a positive real number. Branch cuts are discontinuities in the function.

• The branch cut you make is not unique, but what is true is that all branch cuts connect

two points, and those two points are unique. In the case of the Logarithm, those points

were the origin and inﬁnity. Since those points are common to all branch cuts, you

cannot ever deﬁne the logarithmic function there. Therefore they are singularities.

They are called branch points.

• Aside from the logarithm, the most common function with branch point singularities is the function f(z) = z^a where 0 < Re(a) < 1. This function is multiply defined since z^a = e^{a log z} and log z is multiply defined: the possible values differ by factors of e^{2πina}, for integer n. It too has branch points at z = 0, ∞.

6.3 Complex Calculus

When doing calculus on a function of complex variables, a whole new world of possibilities opens up! At first, you can think of the complex plane as a special representation of R^2, and you might think that the complex calculus is the same as real vector calculus. But this is not true! There are a variety of theorems that make calculus (especially integral calculus) on the complex plane a rich and fascinating subject, both theoretically as well as practically.

Complex analysis is a deep subject, but since this is supposed to be a practical review for physics grad students, I will skip over all the proofs and just present the most commonly-used results. I strongly encourage you to read up on the theory behind this stuff; it’s really an amazing subject in and of itself.

• A general function of a complex variable can be written as f(z, z*). Such a function is analytic if and only if

    ∂f/∂z* ≡ 0        (6.3.5)

so that f = f(z). Such a function is sometimes called holomorphic, but this is just a synonym for analytic.

• Just like in vector calculus, you can define contour integrals in the complex plane. If C is a contour described by a path in the complex plane z(t), where t ∈ [a, b] is a real parameter, we can define the contour integral of an analytic function in terms of a one-dimensional Riemann integral:

    ∫_C f(z) dz ≡ ∫_a^b f(z(t)) z′(t) dt        (6.3.6)

• Cauchy-Goursat Theorem: Let C be a closed contour in the complex plane and f(z) an analytic function on the domain interior to and on C. Then:

    ∮_C f(z) dz ≡ 0        (6.3.7)

This is a generalization of the fundamental theorem of calculus. A powerful consequence of this theorem is that a contour integral is invariant under deformations of the contour, as long as the deformation does not pass through a singularity of f(z).

• Cauchy’s Integral Formula: Let f(z) be analytic everywhere within and on a simple, closed, positively oriented contour C (see footnote 2), and let z_0 be any point in the interior of C. Then:

    f(z_0) = (1/2πi) ∮_C f(z) dz / (z − z_0)        (6.3.8)

• Using the Cauchy integral formula and some complex analysis, we can actually derive a very powerful formula for the n-th derivative of an analytic function:

    f^{(n)}(z_0) = (n!/2πi) ∮_C f(z) dz / (z − z_0)^{n+1}        (6.3.9)

From this, we come to a rather amazing result: if a function is analytic (which is the same as saying df/dz exists), then all of its derivatives exist and are given by the above integral! This is very different from real analysis, where a function that has a first derivative need not have higher derivatives. For example, x^{3/2} has a first derivative at x = 0 but it does not have a second derivative there. This kind of thing can never happen in complex analysis. Exercise: why doesn’t z^{3/2} contradict this result?

• Notice immediately the marvelous corollary to the above formulas: for all n ∈ Z,

    ∮_C dz / (z − z_0)^n = 2πi δ_{n1}        (6.3.10)

as long as C encloses z_0. This follows for n > 1 since the derivative of a constant is zero, and it follows for n ≤ 0 since such terms are analytic inside and on the contour and hence the integral vanishes by the Cauchy-Goursat theorem. Finally, the result for n = 1 follows directly from the Cauchy integral formula, or by direct integration.

[Footnote 2: A positively oriented contour means you travel around the contour in a counter-clockwise direction; a negatively oriented contour travels around the contour in a clockwise direction.]

• Using this result, we come to a remarkable and vital theorem used throughout all areas of applied and abstract mathematics. Consider a general, meromorphic function on the complex plane. Such a function has a Laurent expansion. Now let’s do a closed contour integral, where we make sure the contour does not pass through any singularities of f(z). Some of the isolated singularities of f(z) are on the interior of C (call this set Z = {z_0, z_1, . . . , z_n}) and some are on the exterior, where Z can be a finite or infinite set. Then we have the amazing and most important result:

    ∮_C f(z) dz = 2πi Σ_{z_k ∈ Z} Res_{z=z_k}[f(z)]        (6.3.11)

This result is usually called the Residue Theorem. This is why it’s so important to know how to compute residues of meromorphic functions.

• To see how vitally useful the residue theorem is, consider the following real integral:

    I = ∫_{−∞}^{+∞} cos(x) dx / (x^2 + a^2)^3        (6.3.12)

Try doing this integral directly! Before you shoot yourself, let’s see how to do it in just a few steps. First of all, we analytically continue the integrand to the complex plane by replacing x by z (check that this is, indeed, the analytic continuation of the integrand). Now in order to make sense of the cosine of a complex variable, we use the definition cos z ≡ (e^{iz} + e^{−iz})/2. We split the integral into two terms, I = I_+ + I_−, where

    I_± = (1/2) ∫_{−R}^{+R} e^{±iz} dz / (z^2 + a^2)^3

where we will take the limit R → ∞ in the end. Let’s just consider I_+ for now. We can rewrite the denominator as (z + ia)^3 (z − ia)^3, so by rule 2 in the residues section, the integrand has a third-order pole at both of these points, and the residues at these points are given by

    (1/2!) d^2/dz^2 [ e^{iz} / (z ± ia)^3 ] |_{z=±ia} = ∓i (e^{∓a}/16a^5) (3 + a(a ± 3))

So far, we cannot apply the residue theorem since the contour is not closed. However, notice that when we go into the complex plane, the exponential in the numerator picks up a factor of e^{−Im(z)}. So in particular, if we add to I_+ a term that is the contour integral over the semicircle of radius R going through the upper half of the complex plane (Im(z) > 0), such a term will go like e^{−R} and hence vanish as R → ∞. If we include this term, we don’t change the value of I_+ but we do close the contour, and hence can apply the residue theorem. Since we only surround the pole at z = +ia, we only keep that residue, and we get:

    lim_{R→∞} I_+ = (πe^{−a}/16a^5) (3 + a(a + 3))

To compute I_−, the story is very similar, but there are a few differences to take into account. First of all, with the exponential being negative, this will change our formula for the residue slightly. Also, we will need to close the integral in the lower half of the complex plane rather than the upper half, in order to make sure the added contribution vanishes. Finally, since the new contour is oriented clockwise rather than counterclockwise, there is an extra negative sign. Putting this all together gives us that I_− = I_+, so the final integral is

    ∫_{−∞}^{+∞} cos(x) dx / (x^2 + a^2)^3 = (πe^{−a}/8a^5) (3 + a(a + 3))

Try doing that integral without the help of the residue theorem – I wouldn’t even know where to begin!!
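Happily, a computer can confirm the result. A minimal numerical sketch with SciPy, for the hypothetical choice a = 1:

    import numpy as np
    from scipy.integrate import quad

    a = 1.0
    val, _ = quad(lambda x: np.cos(x) / (x**2 + a**2)**3, -np.inf, np.inf)
    formula = np.pi * np.exp(-a) * (3 + a * (a + 3)) / (8 * a**5)
    print(val, formula)    # both approximately 1.011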

• The residue theorem gives us a way to integrate (or diﬀerentiate) complicated mero-

morphic functions which are analytic on the contour of integration. But what about

functions that have singularities on the contour? Such integrals by themselves are not

well-deﬁned, but we can still make sense of them by means of a regulator. The ﬁnal

answer we get will not be unique since it will depend on the regularization scheme used.

But once we specify this scheme, we can proceed to make well-deﬁned statements.

• It turns out that there is a particular regularization scheme that is a very natural choice, called the Principal Value. As an example, consider the integral

    ∫_{−∞}^∞ x dx

This integral is indeterminate until we choose precisely how to take the limit to infinity. The Principal Value of this integral is defined as the symmetric limit:

    𝒫 ∫_{−∞}^∞ x dx = lim_{R→∞} ∫_{−R}^R x dx = 0

Notice that this choice of regulator is by no means unique. For example, we could have defined the integral as

    lim_{R→∞} ∫_{−R−a/R}^{R+b/R} x dx = b − a

• There is nothing special about any of these choices. However, there is good reason to use the principal value. Let f(x) be an analytic real-valued function. Then we have the following vital result:

    lim_{ε→0} ∫_{−∞}^∞ f(x) dx / (x − x_0 ± iε) = 𝒫 ∫_{−∞}^∞ f(x) dx / (x − x_0) ∓ iπ f(x_0)        (6.3.13)

where the principal value on the right-hand side should be understood as

    𝒫 ∫_{−∞}^∞ f(x) dx / (x − x_0) ≡ lim_{R→∞} lim_{ρ→0} [ ∫_{−R}^{x_0−ρ} f(x) dx / (x − x_0) + ∫_{x_0+ρ}^R f(x) dx / (x − x_0) ]        (6.3.14)

This theorem goes by many names, but I like to call it the Principal Value Theorem, although you might find it under something else in your favorite textbook.

• I will not prove the Principal Value Theorem (it’s not very difficult to prove – try it yourself), but I will use it in a simple example. Let’s calculate the integral

    ∫_{−∞}^∞ cos(x)/(x + iε) dx

As ε → 0, this integral is ill-defined, due to its pole on the real axis. Nonetheless, the Principal Value Theorem gives us a way to solve the problem:

    ∫_{−∞}^∞ cos(x)/(x + iε) dx = 𝒫 ∫_{−∞}^∞ cos(x)/x dx − iπ cos(0)

If the universe is fair, the first integral on the right-hand side should vanish due to the symmetry of the integrand, but that will only happen if the divergences are under control. Notice that near the origin, where the integrand has a simple pole, the principal value tells us how to cancel the divergence there:

    lim_{R→∞} lim_{ρ→0} [ ∫_{−R}^{−ρ} dx/x + ∫_ρ^R dx/x ] = lim_{R→∞} lim_{ρ→0} [ −∫_ρ^R du/u + ∫_ρ^R dx/x ] = 0

where I set u = −x in the first integral. Thus the principal value has tamed the divergence. So this integral does exist, and it equals −iπ.
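Here is a numerical sketch of that symmetric cancellation; R and ρ are hypothetical cutoffs standing in for the limits:

    import numpy as np
    from scipy.integrate import quad

    R, rho = 50.0, 1e-6
    left, _ = quad(lambda x: np.cos(x) / x, -R, -rho)
    right, _ = quad(lambda x: np.cos(x) / x, rho, R)
    print(left + right)    # ~0: the integrand is odd, so the two pieces cancel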

• Notice in the above example, if we had changed the sign of ε (placing the pole above the real axis rather than below it) we would have found a different answer, +iπ, even though we take the limit ε → 0, and you might have thought that we should get the same answer in the end. This is not a coincidence! It is due to the fact that the integral (without the ε) is not well defined. So this begs the question: which sign should we use? This is where mathematics ends and physics begins. You will find that the choice of sign typically comes from boundary conditions, and is sensitive to the particular problem at hand.

• As a final example of how complex calculus is so very useful, let’s consider how to integrate a function with a branch-point singularity. This is a little different since a branch point is not an isolated singularity, so the residue theorem doesn’t help by itself. But we can still do pretty well.

When there is a branch-point singularity, you have to specify a branch cut that connects the singularities. So for example, let’s consider the following integral:

    ∫_0^∞ x^{−a} dx / (x + 1)

where 0 < a < 1. This integrand is multiply defined, so we must choose a branch cut to define it. This is a line connecting the two branch points, in this case the origin and infinity. This comes from the fact that we can write the numerator as e^{−a log x}. Also, notice the simple pole at x = −1, so we shouldn’t choose our branch cut to overlap with that pole. We will make our cut along the positive real axis. This integral is the same as

    ∫_{R_+} z^{−a} dz / (z + 1)

where the integral is along the positive real axis in the complex plane.

We will extend this contour: first, the contour goes from ρ + iε (ρ < 1) to R + iε (R > 1) just above the real axis (L_+). Then the contour goes counterclockwise around a circle of radius R until we get back to just below the real axis at R − iε (C_R). Then we travel back down the real axis until we get to ρ − iε (L_−). Finally, we go clockwise around a circle of radius ρ until we get back to the starting point at ρ + iε (C_ρ). The ε is there to avoid the branch cut – we can let ε → 0 as long as we keep in mind how to define the exponent (see below). With our choice of ρ and R this contour encircles the pole, so the residue theorem applies. So we have:

    [ ∫_{L_+} + ∫_{C_R} + ∫_{L_−} + ∫_{C_ρ} ] z^{−a} dz / (z + 1) = 2πi e^{−iaπ}

The two circle contour integrals vanish. To see that, we need the fact that |z_1 + z_2| ≥ ||z_1| − |z_2||. Then for the small circle:

    | ∫_{C_ρ} z^{−a} dz / (z + 1) | ≤ ρ^{−a} (2πρ) / ||z| − 1| = (2π/(1 − ρ)) ρ^{1−a}  →  0  as ρ → 0

and a similar analysis can be done with the large circle. The interesting part comes from the remaining two integrals. The integral across L_+ is above the branch cut, so log z = log r + i0 as ε → 0, while the integral along L_− is below the branch cut, so log z = log r + i2π as ε → 0. Plugging this in:

    ∫_{L_+} z^{−a} dz / (z + 1) = ∫_ρ^R r^{−a} dr / (r + 1)

    ∫_{L_−} z^{−a} dz / (z + 1) = ∫_R^ρ r^{−a} e^{−2iaπ} dr / (r + 1)

Putting all of this together and taking the limits ρ → 0 and R → ∞ gives

    ∫_0^∞ r^{−a} dr / (r + 1) = 2πi e^{−iaπ} / (1 − e^{−2iaπ}) = π / sin(aπ)
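As a check, here is a short numerical sketch comparing the integral against π/sin(aπ), for the hypothetical choice a = 1/3 (the integral is split at r = 1 to help the quadrature near the integrable singularity at the origin):

    import numpy as np
    from scipy.integrate import quad

    a = 1.0 / 3.0
    f = lambda x: x**(-a) / (x + 1.0)
    val = quad(f, 0.0, 1.0)[0] + quad(f, 1.0, np.inf)[0]
    print(val, np.pi / np.sin(a * np.pi))    # both approximately 3.6276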

6.4 Conformal Mapping

• Conformal mapping is one of those tools that, in my experience, takes a great deal of

time and practice to get comfortable with. It is one of those things in mathematics

which is almost as much art as science. Whole textbooks have been written on this

rich and amazing subject; perhaps it is overly ambitious for me to try and write this

section. But I’ve always loved a challenge.

I cannot begin to write a complete review of this subject – it would take way too long

and I am not an expert on the subject anyway. So in this section I only intend to

present some of the basic theorems and one or two examples. Maybe one day I’ll do

more than that, but not today.

• Conformal mapping is a technique that is useful in solving partial differential equations in two dimensions with certain boundary conditions. The applications of such problems range everywhere from fluid dynamics to electrodynamics to financial forecasting! The basic idea is to transform complicated boundary conditions into simpler ones, and thus make solving the problem much easier.

• Simply put, a conformal transformation is a transformation that preserves angles. Technically, consider a contour in the complex plane parametrized by a function z(t), and a map f(z) that is analytic and has a nonvanishing first derivative; such maps are called conformal maps. Let w(t) = f(z(t)) be the image of the contour under the map f. We will call the complex w-plane C_w and the complex z-plane C_z, so f : C_z → C_w.

• So what does this have to do with angles? The tangent vector in C_w is given by

    w′(t) = f′(z(t)) z′(t)

From this it follows that

    arg w′(t) = arg f′(z) + arg z′(t)        (6.4.15)

Now let’s imagine two curves described by z_1(t) and z_2(t) that intersect at a point z_1(t_0) = z_2(t_0) = z_0 ∈ C_z. Then from the above formula, we have the result:

    arg w′_1(t_0) − arg w′_2(t_0) = arg z′_1(t_0) − arg z′_2(t_0)        (6.4.16)

So indeed we find that conformal transformations do preserve angles between contours.

• Before going further, let me point out a very important fact without proof: let f(z) = u(x, y) + iv(x, y) be an analytic function, so that u(x, y) and v(x, y) are both harmonic, and v is the harmonic conjugate to u. Let T_u(x, y) = ∇u(x, y) be the tangent vector of u at the point (x, y), and similarly for T_v(x, y), and let (x_0, y_0) be a point such that u(x_0, y_0) = v(x_0, y_0). Then T_u(x_0, y_0) · T_v(x_0, y_0) = 0; that is, harmonic conjugates have orthogonal tangent vectors at points of intersection. I leave the proof of this to you (Hint: use the CR Equations). This will be very useful in what follows.

• Another important fact: the definition of a conformal transformation turns out to provide a necessary and sufficient condition for an inverse function to exist, at least locally; this is a result of the infamous “Inverse Function Theorem”. Thus, if we find a conformal transformation that simplifies our boundary conditions in C_w, we can always invert the transformation to get the solution on C_z, which is, of course, what we really want.

• The differential equation we are interested in is Laplace’s equation. Remember that the real and imaginary parts of an analytic function are harmonic functions; that is, solutions to Laplace’s equation in two dimensions. We are interested in two types of boundary conditions for a harmonic function h(x, y) on a boundary described by a curve C in the complex plane z = x + iy:

1. Dirichlet Boundary Conditions: h|_C = h_0 for any constant h_0.

2. Neumann Boundary Conditions: n · ∇h|_C = 0, where n is a unit vector normal to the boundary.

It turns out that these two types of boundary conditions are preserved under conformal transformations. Any other type of boundary condition will generally not be preserved. For example, dh/dn|_C = constant ≠ 0 is not preserved by conformal transformations. Therefore, whenever faced with trying to solve Laplace’s equation with either Dirichlet or Neumann boundary conditions in two dimensions, conformal mapping is an available method of finding the solution. As mentioned above, this problem comes up many times and in many places, so this is a very powerful technique.

• As an example, let’s consider the problem of trying to find the electrostatic scalar potential inside an electrical wire. The wire has a circular cross section, where half of the cylinder is kept at V = 0 and the other half is kept at V = 1. The cylindrical symmetry of the problem means the potential cannot depend on z, so this is a two-dimensional problem. If we place the cross section of the wire on a unit circle in the complex plane C_z, we can look for a harmonic function that agrees with the boundary conditions. This, by uniqueness of solutions, is the final answer. This is a Dirichlet boundary condition problem with a circular boundary. Such boundaries are complicated, but we can use conformal mapping to turn this problem into an easier one.

Consider the mapping:

    w = i (1 − z)/(1 + z)        (6.4.17)

This is an example of a Möbius Transformation, also known as a linear fractional transformation. It has the effect of mapping the interior of the unit circle in C_z to the entire upper-half plane of C_w, where the upper half of the circle is mapped to the positive real w axis while the lower half of the circle is mapped to the negative real w axis. I leave it to you to prove that this map is, indeed, a conformal map.

Now the next step is to find a map on C_w that is harmonic and satisfies the boundary conditions. Consider the function

    (1/π) log w = (1/π)(log|w| + i arg(w))        (6.4.18)

which is analytic and spans the upper half plane of C_w for 0 ≤ arg w ≤ π. Notice that the imaginary part of this function satisfies the boundary conditions we need. Since the imaginary part of an analytic function is harmonic, we have found our solution.

By writing w = u + iv, we have arg(w) = tan^{−1}(v/u). By expanding Equation (6.4.17) to compute u(x, y) and v(x, y), we find

    V(r) = (1/π) tan^{−1}[ (1 − x^2 − y^2) / 2y ] = (1/π) tan^{−1}[ (1 − r^2) / (2r sin θ) ]        (6.4.19)

where we take the branch of the arctangent function 0 ≤ tan^{−1} t ≤ π. It gets better: the harmonic conjugate of the potential is another harmonic function whose tangent vector is always orthogonal to the potential’s tangent vector. But the tangent vector to the electrostatic potential is the electric field. And the lines perpendicular to the field are just the field flux lines. From Equation (6.4.18) we know right away that these flux lines are then given by the level curves of the imaginary part of −(i/π) log w, or upon using Equation (6.4.17):

    c(r) = (1/2π) log[ ((1 − x)^2 + y^2) / ((1 + x)^2 + y^2) ] = (1/2π) log[ (1 + r^2 − 2r cos θ) / (1 + r^2 + 2r cos θ) ]
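A quick numerical spot-check of the solution (my own sketch): V is just (1/π) arg w under the Möbius map, and with the branch chosen above the upper semicircle comes out at V = 0 and the lower semicircle at V = 1:

    import numpy as np

    def V(x, y):
        z = x + 1j * y
        w = 1j * (1 - z) / (1 + z)
        return np.angle(w) / np.pi   # arg w lies in [0, pi] inside the disk

    print(V(0.0, 0.9))    # near the upper (V = 0) boundary: ~0.03
    print(V(0.0, -0.9))   # near the lower (V = 1) boundary: ~0.97
    print(V(0.0, 0.0))    # centre of the wire: 0.5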

• Perhaps this relatively straightforward example has been able to show just how much

art goes into solving problems this way. First, we had to ﬁnd the conformal transfor-

mation that simpliﬁes the problem. Believe it or not, this is not as bad as it sounds

– there are big, fat tables online and in textbooks cataloging hundreds of conformal

maps. So you can go through those and look for one that helps. The next thing we had

to do was to find the analytic function whose harmonic component gives us our solution.

In this case, that was relatively straightforward. There are a whole bunch of analytic

functions that you can try – polynomials and roots, trig functions, logs and exponents,


etc. When you’ve done problem after problem, you begin to develop an intuition for

ﬁnding the right function. But it is something of an art, and like all forms of art, it

takes practice to get good at it. But the rewards are worth it! Once you get good at

ﬁnding the conformal map and the analytic function, you have completely solved the

problem. You even get the ﬁeld ﬂux lines for free.


Suggested Reading

Here I include some books that I used when writing this review. It is by no means an exhaustive list, but it might be enough to get you started if you wish to follow up on any topic.

Brown, J. W. and Churchill, R. V., Complex Variables and Applications (6th Ed), McGraw-Hill, Inc. (New York), 1996. This is a great book on complex analysis and conformal mapping. I heartily recommend it.

Dirac, P. A. M., General Theory of Relativity, Princeton University Press (New Jersey), 1996. This pamphlet has many formulas, but no exercises or details. A great place to see everything without any excess baggage, but I wouldn’t use it as a primary source.

Do Carmo, M. P., Differential Geometry of Curves and Surfaces, Prentice Hall, Inc. (New Jersey), 1976. This is a great book on differential geometry. It only deals in one and two dimensions, and so it says nothing about Riemann formalism, and it does not use Einstein notation, but it is a great beginning book that gets the ideas across without too much heavy machinery.

Herstein, I. N., Abstract Algebra, Prentice Hall (New Jersey), 1990. A decent introduction to abstract algebra. Not the best, but it’s what I used.

Hubbard, J. H. and Hubbard, B. B., Vector Calculus, Linear Algebra and Differential Forms: A Unified Approach, Prentice Hall, Inc. (New Jersey), 1999. This book has a great chapter on differential forms.

Jackson, J. D., Classical Electrodynamics (3rd Ed), John Wiley and Sons (New York), 1999. The classical textbook on electrodynamics. Has many details on tensors and the Lorentz group.

Lee, J. M., Riemannian Manifolds: An Introduction to Curvature, Springer-Verlag (New York), 1997. This is an excellent book on differential geometry in all its details, although more advanced. It assumes the reader is familiar with the theory of differentiable manifolds.

Peskin, M. E. and Schroeder, D. V., An Introduction to Quantum Field Theory, Perseus Books (Massachusetts), 1995. The modern textbook for quantum field theory. Has a detailed discussion of Lagrangian field theory, Noether’s theorem, as well as a section on Lie groups.

Sakurai, J. J., Modern Quantum Mechanics, Addison-Wesley (Massachusetts), 1994. A textbook used for graduate quantum mechanics in many universities. It has a very good discussion of the theory of angular momentum, where tensors and Lie groups have a secure home in physics.

Wald, R. M., General Relativity, University of Chicago Press (Illinois), 1984. A great book – one of my favorites. It has many details and is accessible to graduates.

I hope that this review is of use to you, and that these references can help you to fill in the gaps, or at least start you on your way. If you have any suggestions for the improvement of this paper, please send me your comments (blechman@pha.jhu.edu).

73

A Note of Explanation

It has been my experience that many incoming graduate students have not yet been exposed to some of the more advanced mathematics needed to access electrodynamics or quantum mechanics. Mathematical tools such as tensors or diﬀerential forms, intended to make the calculations much simpler, become hurdles that the students must overcome, taking away the time and understanding from the physics. The professors, on the other hand, do not feel it necessary to go over these topics. They cannot be blamed, since they are trying to teach physics and do not want to waste time with the details of the mathematical trickery that students should have picked up at some point during their undergraduate years. Nonetheless, students ﬁnd themselves at a loss, and the professors often don’t even know why the students are struggling until it is too late. Hence this paper. I have attempted to summarize most of the more advanced mathematical trickery seen in electrodynamics and quantum mechanics in simple and friendly terms with examples. I do not know how well I have succeeded, but hopefully students will be able to get some relief from this summary. Please realize that while I have done my best to make this as complete as possible, it is not meant as a textbook. I have left out many details, and in some cases I have purposely glossed over vital subtlties in the deﬁnitions and theorems. I do not provide many proofs, and the proofs that are here are sketchy at best. This is because I intended this paper only as a reference and a source. I hope that, in addition to giving students a place to look for a quick reference, it might spark some interest in a few to take a math course or read a book to get more details. I have provided a list of references that I found useful when writing this review at the end. Finally, I would like to thank Brock Tweedie for a careful reading of the paper, giving very helpful advice on how to improve the language as well as catching a few mistakes. Good luck, and have fun!

September 19, 2006 In this version (1.5), I have made several corrections and improvements. I would like to write a chapter on analysis and conformal mapping, but unfortunately I haven’t got the time right now, so it will have to wait. I want to thank Linda Carpenter and Nadir Jeevanjee for their suggestions for improvement from the original version. I hope to encorporate more of Nadir’s suggestions about tensors, but it will have to wait for Version 2...

September 10, 2007 In version 2.0, I have made several vital corrections, tried to incorporate some of the suggestions on tensors alluded to above, and have included and improved the sections on analysis that I always wanted to include. Thanks to Michael Luke for the plot in Chapter 1, that comes from his QFT lecture notes. i

**List of Symbols and Abbreviations
**

Here is a list of symbols used in this review, and often by professors on the blackboard.

Symbol ∀ ∃ ∃! a∈A A⊂B A=B A∪B A∩B ∅ A B WLOG p⇒q p ⇔ q; iﬀ N Z Q R C

Meaning For all There exists There exists unique a is a member (or element) of the set A A is a subset of B A ⊂ B and B ⊂ A The set of members of the sets A or B The set of members of the sets A and B The empty set; the set with no elements The “disjoint union”; same as A ∪ B where A ∩ B = ∅ Without Loss of Generality implies; If p is true, then q is true. p is true if and only if q is true. Set of natural numbers (positive integers) Set of integers Set of rational numbers Set of real numbers Set of complex numbers

ii

. . . . . . . . . . . . iii . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Upstairs. .1 Constructing Tensors . . . . . . . .1 SU(2) . 1. . . . . . . . . . . . . .2. . . . . . . . . . . . . . . . . . . .2 Kroneker Products and Sums . . . . . . . . . . . . . 1. . . . . . . . . . 1. . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Tensors . . . . . . . . . . .4 Noether’s Theorem . . .2 Rank 1: Vectors . . . . . . . . . . . . . . . . . . . . . 3. . . . . . . 1. . . .8. . 1. . . . . 1. . . . . . . .2.Spin . . . . . . . . . 3. . . . . . . . . . . . . . . 1 1 2 3 4 4 4 6 7 7 7 7 8 9 10 12 12 13 15 15 18 19 22 22 23 26 26 27 28 28 29 29 30 . . . . . . . .6. 1. . . . .5.5 Einstein Notation . . . . .3 The General Case . . . . . . . . .2 Diﬀerential Volumes and the Laplacian . . . . . . . .3 Using the Levi-Civita Symbol . . . . . . . . 3. . . 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 Symmetries and Groups . . . . . .1. . . . . . . . . 1. . . . 1. . . .7 Permutations . . . . . . . . . . . . . . . . . . . . Group .2. . . . . . . . . . . . . . . . . . . . .8 Constructing and Manipulating Tensors .1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3.1 Rank 0: Scalars .Contents 1 Tensors and Such 1. . . . 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2. . . . . . . . . . . . . . . 1. . .1 Kroneker Delta . . . . . . . .2 SO(3. . 1. . . . . . . . . . . . . . . .2. . . . . . . . . . . . .1 THE Tensor: gµν . . . . . . . . . . . . 2. . . . . . . . . . . . . 1. . . . . . . . . . . . . .1. . . . . . . . . . . . .2. . . .3 Euclidean vs Minkowskian Spaces . . . . . . . .1 Some Deﬁnitions . . . . . . . . . . . . 3. . . . . . . . . . . . . . . .Lorentz (Poincar´) e 2. . .1 Jacobians . . . . . . .1 Another Shortcut . . . . . . . . . . . . .3 Lagrangian Field Theory . . 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6. . . . . .2 Levi-Civita Symbol . . . . . . . 3 Geometry I: Flat Space 3. . . . .6. . . . . . . . . . . 2. . . 1. .1. . . . . . .8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Veilbeins . . .a Prelude to Curvature . .1 Curvilinear Coordinates . . . . . . . . . . .6 Some Special Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Downstairs: Contravariant vs Covariant 1. . . . . . . . .3 Laplacians . . . . . . . . . . .4 Mixed Tensors . . 3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Transformations and Symmetries 2. . . . .

. . . . . . . . . . . . . . 5 Diﬀerential Forms 5. . . . . .3 Exterior Calculus . . . . . . . . . .4 Integration . . .2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 Analytic Functions . . . . . . . . .5 Forms and Electrodynamics . .2 Formulas from Vector Calculus . .4 Geometry II: Curved Space 4. . . . . . . .4 Conformal Mapping . . .1 Wedge Product . . . . . . . . .3. . . . . iv . . . . . . . .2. . . . . . . . . . . . . . . . . . . . . . . . . .2 Higher Dimensions . . . . . . . . . . . . . . . .3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5. . . . .1 Exterior Derivative . . .2. . . 5. 4. . . .1 Connections and the Covariant Derivative 4. . 5. .3 Stokes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . .2 Tilde . . . . . .4. . . . . . . . . . . . . . . . . 5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 Parallel Transport and Geodesics . . . . . . . . . .2 Form Operations . . . . . . . . . . . . . . . . . . . . . . . . . . 5. . . . . . . . . . . . . . . . . . 5. . . . . . . . . . . . . . . . . . . . . . . . . 6 Complex Analysis 6. . . .1 Special Case: d = 2 . . . . . . . . . . .2. . . . . . . 5. . . . . . . . . . . . . . . . . . . . . . . . . 32 32 36 37 38 40 42 42 45 45 47 47 49 50 50 50 51 51 52 53 53 55 56 58 58 60 62 68 . . .4. . . .3 Curvature. . . . . .3. . . . 4. . . . . . . . . . . . . . . . . .1 Evaluating k-form Fields . . . .1 The Mathematician’s Deﬁnition of Vector . . . . . . . .3.3. . . . . . . .The Riemann Tensor . . . . . . . . . . . . . . . 5. . . . . 5. . . . 5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4 Evaluating k-Forms . .4. . . . . . . . . . . 5. . . . . 4. . . . . . . . . . . . . . . .3 Orthonormal Coordinates . . . . . . . . . . 5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6. .2 Integrating k-form Fields .3 Hodge Star . . . . . . .2 Singularities and Residues 6. . . . . . . . . . . . . . . . . . . . 5. . . . . . . . . . . . . . . . 5. . . . . . . . . . . . . . . . .5 Generalized Cross Product . . . . . . . . . . . . . . . . . . . . .3 Complex Calculus . . . .2. . . 6. . . . . . . . . 5. . . . . . . . . . .

let us review how coordinates work. ej 1 (1. ej = i=1 j=1 xi xj gij 1 2 (1. It is common mathematical notation to call these vectors ei . Geometrically. Note that the index for the coordinate is a superscript. where i is an index and goes from 1 to N .1.. If they have unit magnitude and point in orthogonal directions. whereas the index for the unit vector is a subscript. r2 = i=1 xi ei .1. but this does not have to be the case.1) Now you can write a general N-vector as the sum of the unit vectors: N r = x e1 + x e2 + . deﬁning a basis. In N -dimensions. This small fact will turn out to be very important! Be careful not to confuse the indices for exponents. It is important to remember that these objects are vectors even though they also have an index.. they may look like: 1 e1 = 0 0 0 e2 = 1 0 0 e3 = 0 1 (1. You are probably very familiar with the concept of an inner product from regular linear algebra. for example.1 Some Deﬁnitions Before diving into the details of tensors. Before going any further.2) where we have written xi as the coordinates of the vector r.4) .1. we can write N unit vectors.Chapter 1 Tensors and Such 1. + x eN = i=1 1 2 N xi ei (1. we can deﬁne the inner product in the following way: N N N N N N r1 . 1 j=1 xj ej 2 = i=1 j=1 xi xj 1 2 ei .3) where we have deﬁned the new quantity: gij ≡ ei .1. it is called an orthonormal basis. let us introduce one of the most important notions of mathematics: the inner product. In three dimensions.

Mathematicians sometimes call the quantity g_{ij} the first fundamental form; physicists often simply call it the metric. Also note that the choice of basis is important when we want to construct the metric g_{ij}.

Now look at the last equality of Equation (1.1.3): there is no explicit mention of the basis vectors {e_i}, only of the vector components. This implies that it is usually enough to express a vector in terms of its components alone. So, from now on, unless it is important to keep track of the basis, we will omit the basis vectors and denote the vector only by its components: r becomes x^i. This is exactly analogous to what you are used to doing when you say, for example, that a vector is (1, 2) as opposed to x + 2y. This is a standard but very important notation. Note, however, that "x^i" is not rigorously a vector; this will be an important point later.

1.2 Tensors

The proper question to ask when trying to do tensor analysis involves the concept of a transformation. One must ask the question: how do the coordinates change ("transform") under a given type of transformation? Before going any further, we must understand the general answer to this problem.

Let's consider a coordinate system (x^i) and perform a transformation on these coordinates. Then we can write the new coordinates (x'^i) as functions of the old ones:

    x'^i = x'^i(x^1, x^2, ..., x^N)                                (1.2.5)

Now let us consider only infinitesimal transformations, e.g. very small translations and rotations. Then we can Taylor expand Equation (1.2.5) and write the change in the coordinates as:

    δx'^i = Σ_{j=1}^N (∂x'^i/∂x^j) δx^j                            (1.2.6)

where we have dropped higher-order terms. Notice that the derivative is actually a matrix (it has two indices). So let us define the new matrix A^i_{.j}:

    A^i_{.j} ≡ ∂x'^i/∂x^j                                          (1.2.7)

Now Equation (1.2.6) can be written as a matrix multiplication:

    δx'^i = Σ_{j=1}^N A^i_{.j} δx^j                                (1.2.8)

Before going further, take note that one index is primed while the other is not: the primed index refers to the new coordinate index, the unprimed index to the old one. Also notice that there are two indices, but one is a subscript while the other is a superscript; this is not a coincidence, and it will be important in what follows. Finally, I have included a period to the left of the lower index: this is to remind you

that the indices should be read as "ij" and not as "ji". This can prove to be very important, as in general the matrices we consider are not symmetric, and it is important to know the order of the indices.

Armed with this knowledge, we can now define a tensor:

A tensor is a collection of objects which, combined the right way, transform the same way as the coordinates under infinitesimal (proper) symmetry transformations.

Tensors carry indices, and the number of indices is known as the tensor rank. This definition basically states that a tensor is an object that, for each index, transforms according to Equation (1.2.8). Notice, however, that not everything that carries an index is a tensor!! Also, even though we have only calculated this to lowest order, it turns out that Equation (1.2.8) has a generalization to large transformations, with the coordinate replaced by the tensor.

What are "infinitesimal proper symmetry transformations"? This is an issue that we will tackle in Chapter 2. For now it will be enough to say that in N dimensions they are translations and rotations in each direction; without the "proper", they also include reflections (x → −x). We will discuss these details in Chapter 2. Some people with more mathematical background might be upset with this definition: you should be warned that the mathematician's definition of a tensor and the physicist's definition of a tensor are not exactly the same. I will say more on this in Chapter 5, and I encourage you to think about it. But for the majority of people using this review, it's enough to know and use the above definition.

Now that we have a suitable definition, let us consider the basic examples. Consider a 2-vector, and let the transformation be a regular rotation by an angle θ. We know how to do such transformations. The new coordinates are:

    x'^1 = x^1 cos θ + x^2 sin θ ≈ x^1 + θx^2 + O(θ²)
    x'^2 = x^2 cos θ − x^1 sin θ ≈ x^2 − θx^1 + O(θ²)

Now it is a simple matter of reading off the A^i_{.j}:

    A^1_{.1} = 0     A^1_{.2} = θ
    A^2_{.1} = −θ    A^2_{.2} = 0
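You can check this expansion numerically; here is a minimal numpy sketch (the small angle is an arbitrary choice):

    import numpy as np

    theta = 1e-4
    R = np.array([[ np.cos(theta), np.sin(theta)],    # the exact rotation above
                  [-np.sin(theta), np.cos(theta)]])
    A = np.array([[0.0,   theta],                     # the A^i_j just read off
                  [-theta, 0.0]])
    # R = 1 + A + O(theta^2): the leftover difference is second order in theta
    print(np.max(np.abs(R - (np.eye(2) + A))))        # ~ 5e-9, i.e. O(theta^2)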

1.2.1 Rank 0: Scalars

The simplest example of a tensor is a rank-0 tensor, or scalar. A scalar carries no index, so it must be that a scalar does not transform at all under a coordinate transformation. Such quantities are said to be invariants. Notice that an element of a vector, such as x^1, is not a scalar, as it does transform nontrivially under Equation (1.2.8). We will see that the most common example of a scalar is the length of a vector:

    s² = Σ_{i=1}^N (x^i)² ≡ Σ_{i=1}^N x_i x^i                      (1.2.9)

(see Section 1.3 for details).

Exercise: Prove that s is an invariant quantity under translations and rotations.

1.2.2 Rank 1: Vectors

The next type of tensor is rank-1, or a vector. This is a tensor with only one index, which by definition satisfies Equation (1.2.8). The most obvious example of a vector is the coordinates themselves. Some common examples of vectors that appear in physics are the momentum vector p, the electric field E, and the force F. There are many others.

1.2.3 The General Case

It is a relatively straightforward project to extend the notion of a rank-1 tensor to any rank: just keep going! For example, a rank-2 tensor is an object with two indices (T^{ij}) which transforms the following way:

    T'^{ij} = Σ_{k,l=1}^N A^i_{.k} A^j_{.l} T^{kl}                 (1.2.10)

An example of a rank-2 tensor is the moment of inertia from classical rigid-body mechanics; there are other examples as well. A tensor of rank 3 would have three A's in the transformation law, and so on. Thus we can define tensors of any rank by insisting that they obey the proper transformation law in each index.
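As a concrete check of Equation (1.2.10) and of the invariance of s², here is a short numpy sketch using a finite rotation as the transformation (the numbers are arbitrary):

    import numpy as np

    theta = 0.4
    A = np.array([[ np.cos(theta), np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])   # the rotation as the matrix A^i_j

    x = np.array([1.0, 2.0])
    x_new = A @ x
    assert np.isclose(x_new @ x_new, x @ x)           # s^2 is a scalar (invariant)

    T = np.outer([1.0, 3.0], [2.0, -1.0])             # some rank-2 tensor
    T_new = np.einsum('ik,jl,kl->ij', A, A, T)        # T'^{ij} = A^i_k A^j_l T^{kl}
    # a rank-2 tensor built from two vectors transforms like the vectors themselves:
    assert np.allclose(T_new, A @ T @ A.T)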

1.3 Upstairs, Downstairs: Contravariant vs Covariant

We have put a great deal of emphasis on whether indices belong as superscripts or as subscripts. Now we will make this quantitative, and we will do it using Equation (1.2.9). In Euclidean space, where we are all used to working, the difference is less important (although see below), but in general the difference is huge. For simplicity, I'll only consider rank-1 tensors (one index), but the generalization is very straightforward.

Recall that we defined a vector as an object which transformed according to Equation (1.2.8). There the index was a superscript; we will call vectors of that form contravariant vectors. Now we want to define a vector with its index as a subscript, which we will call a covariant vector. In any reasonable world, the length of a vector should have nothing to do with the coordinate system you are using (the magnitude of a vector does not depend on the location of the origin, for example). So we will define a covariant vector as an object which leaves the length invariant. It is just like a contravariant vector except that it transforms in exactly the opposite way:

    x'_i = Σ_{j=1}^N x_j B^j_{.i}                                  (1.3.11)

where B^j_{.i} is the matrix that defines how this new type of vector is supposed to transform. Demanding that the length be invariant:

    Σ_{i'=1}^N x'_{i'} x'^{i'} = s² = Σ_{i=1}^N x_i x^i = Σ_{i'=1}^N ( Σ_{k=1}^N A^{i'}_{.k} x^k )( Σ_{l=1}^N x_l B^l_{.i'} )

Notice that if the left and right sides of this expression are supposed to be equal, we have a constraint on the quantity in brackets:

    Σ_{i'=1}^N B^l_{.i'} A^{i'}_{.k} = δ^l_k                       (1.3.12)

where δ^l_k is the Kronecker delta: +1 if k = l, 0 otherwise. In matrix notation, this reads:

    BA = 1  ⇒  B = A^{−1}                                          (1.3.13)

So we know what B must be, and reading off the inverse is quite easy, by comparing with the definition of A in Equation (1.2.7):

    B^j_{.i} ≡ ∂x^j/∂x'^i                                          (1.3.14)

The last step just came from noting that B is identical to A with the indices switched.
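Here is the same statement in numpy; the transformation matrix and components are arbitrary illustrative values:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 3.0]])            # some invertible transformation A^i_j
    B = np.linalg.inv(A)                  # B = A^{-1}, Equation (1.3.13)

    x_up = np.array([1.0, 2.0])           # contravariant components x^i
    x_dn = np.array([0.5, -1.0])          # covariant components x_i

    x_up_new = A @ x_up                   # x'^i = A^i_k x^k
    x_dn_new = x_dn @ B                   # x'_i = x_l B^l_i
    # the contraction x_i x^i is unchanged, exactly as demanded:
    assert np.isclose(x_dn_new @ x_up_new, x_dn @ x_up)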

There is one more very important point to notice about contravariant and covariant vectors. Recall from Equation (1.1.3) that the length of a vector can be written as:

    s² = Σ_{i,j=1}^N g_{ij} x^i x^j                                (1.3.15)

But we also wrote another equation for s², Equation (1.2.9), which we used when we started defining the covariant vector. By comparing these two equations, we can conclude a very useful relationship between covariant and contravariant vectors:

    x_i = Σ_{j=1}^N g_{ij} x^j                                     (1.3.16)

    x^j = Σ_{k=1}^N g^{jk} x_k                                     (1.3.17)

    Σ_{j=1}^N g_{ij} g^{jk} = δ^k_i                                (1.3.18)

So we have found that the metric takes contravariant vectors to covariant vectors and vice versa, and in doing so we have found that the inverse of the metric with lower indices is the metric with upper indices. This is a very important identity, and it will be used many times.

When working in an orthonormal flat-space coordinate system (so g_ij = δ_ij), the difference between covariant and contravariant is negligible. But even now I can show you a simple nontrivial example. Consider a Cartesian-like coordinate system, but now let the coordinate axes cross at an angle of 60 degrees (see Figure 1.1). Now the metric is no longer proportional to the Kronecker delta (compare this to Equation (1.1.4)), and so there will be a difference between covariant and contravariant coordinates, even in flat space. You can see this explicitly in the figure, where the same point has x^1 < x^2 but x_1 > x_2.

[Figure 1.1: A graph showing the difference between covariant and contravariant coordinates.]

As another example, consider polar coordinates (r, θ). Even though polar coordinates describe the flat plane, they are not orthonormal coordinates, so here is another example of how covariant and contravariant makes a difference. Specifically, if x^i = (r, θ), then it is true that x_i = (r, r²θ).

So we find that covariant and contravariant are the same only in orthonormal Cartesian coordinates! In the more general case the difference is important. We will see some examples of where the difference begins to appear in Chapters 3 and 4.
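You can verify the polar-coordinate statement directly; a minimal numpy sketch (the values of r and θ are arbitrary):

    import numpy as np

    r, theta = 2.0, 0.3
    g = np.diag([1.0, r**2])             # the polar-coordinate metric (see Chapter 3)

    x_up = np.array([r, theta])          # x^i = (r, θ)
    x_dn = g @ x_up                      # x_i = g_ij x^j, Equation (1.3.16)
    print(x_dn)                          # [2.0, 1.2] = (r, r²θ)

    g_up = np.linalg.inv(g)              # g^{ij}: the inverse metric raises indices
    assert np.allclose(g_up @ x_dn, x_up)   # consistent with (1.3.17) and (1.3.18)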

1.4 Mixed Tensors

We have talked about covariant and contravariant tensors, which carry all of their indices downstairs or upstairs respectively. However, by now it should be clear that we need not stop there: we can construct a tensor that is covariant in one index and contravariant in another. Such a tensor would be expected to transform as follows:

    T'^i_{.j} = Σ_{k,l=1}^N A^i_{.k} A_j^{.l} T^k_{.l}             (1.4.19)

where I have just separated the indices in a convenient way (the second factor, with its indices switched, is the inverse matrix B of Equation (1.3.14)). Such objects are known as mixed tensors. By now you should be able to construct a tensor of any character with as many indices as you wish. Generally, a tensor with n upper indices and l lower indices is called an (n, l) tensor.

1.5 Einstein Notation

The last thing to notice from all the equations so far is that every time we have an index that is repeated as a superscript and a subscript, we are summing over that index. This is not a coincidence; in fact, it is generally true that anytime two indices repeat in a monomial, they are summed over. So let us introduce a new convention:

Anytime an index appears twice (once upstairs, once downstairs) it is to be summed over.

This is known as Einstein notation, and it will prove very useful in keeping expressions simple. It will also provide a quick check: if you get an index repeating in any other way (say, twice upstairs), chances are you made a mistake somewhere. From now on, unless specifically stated otherwise, we will stick in this chapter to the usual Euclidean space, where most of this is relevant for physics.

1.5.1 Another Shortcut

There is another very useful shortcut when writing down derivatives:

    ∂_i f(x) ≡ ∂f/∂x^i                                             (1.5.20)

Notice that the derivative with respect to contravariant x is covariant; this is not a coincidence, and the proof follows from the chain rule and Equation (1.3.14). I will also often use the shorthand f(x),_i = ∂_i f(x). Finally, we define:

    ∂² ≡ ∂_i ∂^i = ∇²                                              (1.5.21)

This is just the Laplacian in arbitrary dimensions¹. It is not hard to show that it is a scalar operator; in other words, for any scalar φ(x), ∂²φ(x) is also a scalar.

¹In Minkowski space there is an extra minus sign, so ∂² = ∂²/∂t² − ∇², which is just the D'Alembertian (wave equation) operator. I will cover this more in Chapter 3.
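Incidentally, Einstein summation is exactly what numpy's einsum function implements, which makes it handy for checking index gymnastics; a small sketch with arbitrary values:

    import numpy as np

    v = np.array([1.0, 2.0, 3.0])
    T = np.arange(9.0).reshape(3, 3)

    s  = np.einsum('i,i->', v, v)        # x_i x^i (flat space, so x_i = x^i)
    w  = np.einsum('ij,j->i', T, v)      # T^i_j x^j, a vector
    tr = np.einsum('ii->', T)            # T^i_i: the repeated index is summed (trace)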

1.6 Some Special Tensors

1.6.1 Kronecker Delta

We have already seen the Kronecker delta in action, but it might be useful to quickly sum up its properties here. This object is a tensor which always carries two indices. It is equal to +1 whenever the two indices are equal, and it is 0 whenever they are not. In matrix form, the Kronecker delta is nothing more than the unit matrix in N dimensions:

    [δ^i_j] = ( 1 0 ···
                0 1 ···
                ⋮ ⋮ ⋱ )

Notice that the Kronecker delta is naturally a (1, 1) tensor.

Recall what it means for a tensor to have a symmetry (for now let's only consider a rank-2 tensor, but the arguments generalize). A tensor T_ij is said to be symmetric if T_ij = T_ji and antisymmetric if T_ij = −T_ji. In general, tensors will have neither of these properties, but in many cases you might want to construct a tensor that has a symmetry of one form or another. Notice that an antisymmetric tensor cannot have diagonal elements, because we have the equation T_ii = −T_ii ⇒ T_ii = 0 (no sum).

1.6.2 Levi-Civita Symbol

The Levi-Civita symbol (ε) is much more confusing than the Kronecker delta, and it is the source of much confusion among students who have never seen it before. But life would be close to unbearable if it wasn't there! In fact, it wasn't until Levi-Civita published his work on tensor analysis that Albert Einstein was able to complete his work on General Relativity; it's that useful! Let us start with a definition:

The Levi-Civita symbol in N dimensions is a tensor with N indices such that it equals +1 if the indices are an even permutation, −1 if the indices are an odd permutation, and zero otherwise.

That was a lot of words! We will talk about permutations in more detail in a little while. For now we will simply say that ε_{12···N} = +1 if all the numbers are in ascending order, i.e. whenever you switch two adjacent indices you gain a minus sign. So the Levi-Civita symbol is a completely antisymmetric tensor, and no matter how complicated it gets, whenever you have two indices repeating it must be zero (see the above talk on antisymmetric tensors). Notice that for N = 1, the Levi-Civita symbol is rather silly! Before we dive into its properties, let's do some concrete examples. We will consider the three most important cases (N = 2, 3, 4).

N = 2: ε_ij

For the case of two dimensions, the Levi-Civita symbol has two indices, and we can write out very easily what it looks like:

    ε_12 = +1
    ε_21 = −1

N = 3: ε_ijk

In three dimensions, it becomes a little harder to write out explicitly what the Levi-Civita symbol looks like. The key equation to remember is:

    ε_123 = +1                                                     (1.6.22)

What if I write down ε_213? Notice that I recover 123 by switching the first two indices, so I must include a minus sign: ε_213 = −1. This is in general how it always works: you always start with an ascending number sequence and flip indices until you get what you need. There are six possible nonvanishing combinations of indices. Can you decide which ones are +1 (even) and which are −1 (odd)? Here's the answer:

    ε_123 = ε_231 = ε_312 = +1
    ε_321 = ε_213 = ε_132 = −1

N = 4: ε_ijkl

I am not going to do N = 4 explicitly, but it is no more conceptually difficult than before, and hopefully by now you feel comfortable enough to figure it out on your own. Just remember:

    ε_1234 = +1

and take it from there! For example: ε_2143 = −ε_2134 = +ε_1234 = +1. Remember: if ever you get confused, just write down the symbol in ascending order, set it equal to +1, and start flipping (adjacent) indices until you find what you need. That's all there is to it! The Levi-Civita symbol comes back again and again whenever you have tensors, so it is a very important thing to understand.

Finally, I'd like to point out a small inconsistency here. Above I mentioned that you only want to sum over indices when one is covariant and one is contravariant. But often we relax that rule with the Levi-Civita symbol and always write it with lower indices no matter what; if an index is repeated, it is still summed over. There are some exceptions to this rule when dealing with curved space (General Relativity), but I will not discuss that here.
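Rather than flipping indices by hand, you can also build the whole symbol programmatically from the permutation rule above; here is a small Python sketch (indices run 0,...,n−1 in code instead of 1,...,n):

    import numpy as np
    from itertools import permutations

    def parity(p):
        """+1 for an even permutation, -1 for an odd one (via the inversion count)."""
        return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

    def levi_civita(n):
        """epsilon with n indices: +1/-1 on permutations of (0,...,n-1), 0 otherwise."""
        eps = np.zeros((n,) * n, dtype=int)
        for p in permutations(range(n)):
            eps[p] = parity(p)
        return eps

    eps3 = levi_civita(3)
    print(eps3[0, 1, 2], eps3[1, 0, 2], eps3[0, 0, 1])   # 1 -1 0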

1.6.3 Using the Levi-Civita Symbol

Useful Relations

When performing manipulations using the Levi-Civita symbol, you often come across products and contractions of indices. There are two useful formulas that are good to remember:

    ε_ijk ε_ijl = 2δ_kl                                            (1.6.23)

    ε_ijk ε_ilm = δ^{jk}_{lm} ≡ δ_jl δ_km − δ_jm δ_kl              (1.6.24)

Exercise: Prove these formulas.

As an example, let's prove a differential identity:

    [∇×(∇×A)]_i = ε_ijk ∂_j (ε_klm ∂_l A_m)
                = ε_kij ε_klm ∂_j ∂_l A_m
                = [δ_il δ_jm − δ_im δ_jl] ∂_j ∂_l A_m
                = ∂_i ∂_m A^m − ∂²A_i
                = [∇(∇·A) − ∇²A]_i

Try a few of these to get the hang of it.

Cross Products

Later, in Chapter 5, we will discuss how to take a generalized cross product in n dimensions. For now, recall that a cross product only makes sense in three dimensions. There is a beautiful formula for the cross product using the Levi-Civita symbol:

    (A × B)_i = ε_ijk A^j B^k                                      (1.6.25)

Using this equation and Equations (1.6.23) and (1.6.24), you can derive all the vector identities on the cover of Jackson, such as the double cross product. That's a sure-fire way to be sure you understand index manipulations.

Exercise: Prove that you can interchange dot and cross products: A · B × C = A × B · C.
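Both formulas, as well as the cross-product identity (1.6.25), are easy to confirm numerically; a quick numpy check with arbitrary vectors:

    import numpy as np

    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

    d = np.eye(3)
    # epsilon_ijk epsilon_ijl = 2 delta_kl        (Equation (1.6.23))
    assert np.allclose(np.einsum('ijk,ijl->kl', eps, eps), 2 * d)
    # epsilon_ijk epsilon_ilm = d_jl d_km - d_jm d_kl   (Equation (1.6.24))
    lhs = np.einsum('ijk,ilm->jklm', eps, eps)
    rhs = np.einsum('jl,km->jklm', d, d) - np.einsum('jm,kl->jklm', d, d)
    assert np.allclose(lhs, rhs)

    A, B = np.array([1.0, 2.0, 3.0]), np.array([-1.0, 0.5, 2.0])
    # (A x B)_i = epsilon_ijk A^j B^k              (Equation (1.6.25))
    assert np.allclose(np.einsum('ijk,j,k->i', eps, A, B), np.cross(A, B))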

1.7 Permutations

A permutation is just a shuffling of objects. Rigorously, consider a set of objects (called S) and an automorphism σ : S → S (a one-to-one and onto function): it takes an element of S and sends it uniquely to another element (possibly the same element) of S. Then σ is a permutation. One can see very easily that the set of all permutations of a (finite) set S forms a group, with multiplication defined as function composition; we will talk more about groups in the next chapter. If the set S has n elements, this group of permutations is denoted S_n and is often called the symmetric group. It has n! elements. The theory of symmetric groups is truly deep and of fundamental importance in mathematics, and it has much application in other branches of science as well; for more information, I recommend your favorite abstract algebra textbook. Here let me give a very brief overview of the theory.

Specific permutations are usually written as a row of numbers. It is certainly true (and is rigorously provable for the anal retentive!) that any finite set of order n (i.e. with n elements) can be represented by the set of natural numbers between 1 and n. Then WLOG we can always represent our set S in this way, and a permutation might look like this:

    σ = (1234)

This permutation sends 1 → 2, 2 → 3, 3 → 4, 4 → 1. If our set has four elements in it, then this permutation touches all of them. If it has more than four elements, it leaves any others alone (for example, 5 → 5). A transposition is a permutation of two numbers: for example, (12) sends 1 → 2, 2 → 1. A permutation of one number is trivial, so we don't worry about it.

Armed with that knowledge, we can write down the product of two permutations by composition. Recall that composition of functions always goes rightmost function first. This is very important here, since the multiplication of permutations is almost never commutative! Let's see an example:

    (1234)(1532) = (154)

Where did this come from? Start with the first number in the rightmost permutation (1). Then ask: what does 1 go to? Answer: 5. Now plug 5 into the secondmost-right permutation, where 5 goes to 5. Net result: 1 → 5. Now repeat the same procedure on 5 to get 5 → 4, and so on until you have completed the cycle (quick check: does 2 → 2, 3 → 3?). Notice that this is not the same as (45)(15). Multiplying three permutations is just as simple, but requires three steps instead of two.

It is a truly amazing theorem in group theory that any permutation (no matter what it is) can be written as a product of transpositions. Furthermore, this decomposition is unique up to trivial permutations. The proof is a straightforward application of group and number theory, and I will not discuss it here. Can you write down the decomposition of the above product? (Answer: (15)(45).) Armed with that, we make a definition:

Definition: If a permutation can be written as a product of an even number of transpositions, it is called an even permutation. If it can be written as an odd number of transpositions, it is called an odd permutation.

We define the signature of a permutation in the following way:

    sgn(σ) = { +1  σ even
               −1  σ odd                                           (1.7.26)

It is pretty obvious that we have the following relationships:

    (even)(even) = (even)
    (odd)(odd) = (even)
    (odd)(even) = (even)(odd) = (odd)
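Here is a small Python sketch of these rules, with a permutation stored as the tuple of its images (0-based); the multiplicativity of the signature is checked by brute force over all of S_4:

    from itertools import permutations

    def sgn(p):
        """Signature: +1 if p decomposes into an even number of transpositions."""
        return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

    def compose(p, q):
        """(p q)(i) = p(q(i)): the rightmost permutation acts first."""
        return tuple(p[q[i]] for i in range(len(q)))

    # (even)(even) = even, (odd)(odd) = even, (odd)(even) = odd:
    assert all(sgn(compose(p, q)) == sgn(p) * sgn(q)
               for p in permutations(range(4)) for q in permutations(range(4)))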

What this tells us is that the set of even permutations is closed under multiplication, and hence forms a subgroup, called the Alternating Subgroup (denoted A_n), of order ½ n!. Notice that the odd elements do not have this property. I omit the details of the proof that this is indeed a subgroup, although it's pretty easy. The alternating subgroup is a very powerful tool in mathematics.

For an application of permutations, a good place to look is index manipulation (hence why I put it in this chapter!). For example, the general formula for the determinant of an n × n matrix can be written as:

    det A = Σ_{σ∈S_n} sgn(σ) a_{1σ(1)} a_{2σ(2)} · · · a_{nσ(n)}   (1.7.27)
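You can check Equation (1.7.27) against numpy's built-in determinant; a sketch (the sgn function is the one from above, and the matrix is random):

    import numpy as np
    from itertools import permutations

    def sgn(p):
        return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

    A = np.random.rand(4, 4)
    det = sum(sgn(p) * np.prod([A[i, p[i]] for i in range(4)])
              for p in permutations(range(4)))
    assert np.isclose(det, np.linalg.det(A))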

1.8 Constructing and Manipulating Tensors

To finish this chapter, I will review a few more manipulations that provide ways to construct tensors from other tensors.

1.8.1 Constructing Tensors

Consider two tensors, of rank k (K) and rank l (L) respectively. We want to construct a new tensor (T) from these two tensors. Because we are using a Cartesian coordinate system, we call tensors of this form Cartesian tensors. There are three immediate ways to do it:

1. Outer Product: We can always construct a tensor by combining two tensors without contracting any indices. For example, if A and B are vectors, we can construct T_ij = A_i B_j. When combining two vectors in this way, we say that we have constructed a dyadic.

2. Inner Product: We can contract one of the indices from K and one from L to form a tensor T of rank k + l − 2. For example, if K is rank 2 and L is rank 3, we can contract, say, the first index of both of them to form a T of rank 3: T_{µνρ} = K^α_{.µ} L_{ανρ}. We could construct other tensors as well by contracting other indices. Sometimes you might want to sum over all combinations of indices; we could also use the metric tensor or the Levi-Civita tensor if we wished to form symmetric or antisymmetric combinations of tensors. For example, if A, B are vectors in 4 dimensions:

    T_{µν} = ε_{µνρσ} A^ρ B^σ  ⇒  T_12 = A^3 B^4 − A^4 B^3, etc.

3. Symmetrization: Consider the outer product of two vectors, A_i B_j. Then we can force the tensor to be symmetric (or antisymmetric) in its indices by construction:

    A_(i B_j) ≡ ½ (A_i B_j + A_j B_i)                              (1.8.28)
    A_[i B_j] ≡ ½ (A_i B_j − A_j B_i)                              (1.8.29)

Circular brackets about indices mean symmetric combinations, and square brackets about indices mean antisymmetric combinations. You can do this to as many indices as you want in an outer product (for n indices, replace ½ → 1/n!).

Dyadics provide a good example of how combining tensors can be more involved than it first seems. Consider the dyadic we have just constructed. Then we can write it down in a fancy way by adding and subtracting terms:

    T_ij = ⅓ (T^k_k) δ_ij + ½ (T_ij − T_ji) + [ ½ (T_ij + T_ji) − ⅓ (T^k_k) δ_ij ]          (1.8.30)

where the sum over k in the first and last terms is implied by Einstein notation. Notice that I have not changed the tensor at all, because any term I added I also subtracted. But I have decomposed the tensor into three different pieces, of rank 0, 1 and 2 respectively. The first piece is just a scalar (times the identity matrix), and therefore is invariant under rotations. The next term is an antisymmetric tensor, and (in 3D) can be written as a (pseudo)vector: it's a cross product (see Equation (1.6.25)). Finally, the third piece is a symmetric, traceless tensor.

Let's discuss the matter concretely by letting N = 3 and counting degrees of freedom. If A and B are 3-vectors, then T_ij has 3 × 3 = 9 components. Of the three pieces, the first is a scalar (1), the second is a 3-vector (3), and the third is a rank-2 symmetric tensor (6) that is traceless (−1), i.e. it has 5 components. So we have decomposed the tensor according to the following prescription:

    3 ⊗ 3 = 1 ⊕ 3 ⊕ 5

We say that the dyadic form T_ij is a reducible Cartesian tensor, and that it reduces into 3 irreducible spherical tensors.
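The decomposition is easy to carry out explicitly; here is a numpy sketch with arbitrary 3-vectors, counting the 1 + 3 + 5 = 9 components:

    import numpy as np

    A, B = np.array([1.0, 2.0, 3.0]), np.array([-1.0, 0.5, 2.0])
    T = np.outer(A, B)                                    # the dyadic T_ij = A_i B_j

    scalar     = np.trace(T)/3 * np.eye(3)                # 1 component (invariant)
    antisym    = (T - T.T)/2                              # 3 components (a pseudovector)
    sym_trless = (T + T.T)/2 - np.trace(T)/3 * np.eye(3)  # 5 components (symmetric, traceless)

    assert np.allclose(T, scalar + antisym + sym_trless)  # nothing added, nothing lost
    assert np.isclose(np.trace(sym_trless), 0.0)
    # the antisymmetric piece is (half) the cross product, cf. Equation (1.6.25):
    assert np.isclose(antisym[0, 1], np.cross(A, B)[2] / 2)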

1.8.2 Kronecker Products and Sums

With that introduction, we can talk about Kronecker products and sums. A Kronecker product is simply an outer product of two tensors of lesser rank, and a dyadic is a perfect example of a Kronecker product. Writing it out as a matrix, we have (for A, B rank-1 tensors in 3 dimensions):

    T = A ⊗ B = ( A_1B_1  A_1B_2  A_1B_3
                  A_2B_1  A_2B_2  A_2B_3
                  A_3B_1  A_3B_2  A_3B_3 )                         (1.8.31)

A Kronecker sum is the concatenated sum of matrices. You can write the Kronecker sum in matrix form using a block-diagonal matrix:

    T = C ⊕ D ⊕ E = ( C 0 0
                      0 D 0
                      0 0 E )                                      (1.8.32)

where C is an m × m matrix, D is an n × n matrix, E is an l × l matrix, and "0" just fills in any left-over entries with zeros. Notice that the final matrix is an (m + n + l) × (m + n + l) matrix.

At this point, we can rewrite our equation for the dyadic:

    A ⊗ B ∼ C ⊕ D ⊕ E                                              (1.8.33)

where A ⊗ B is a 3 × 3 matrix, C is 1 × 1, D is 3 × 3 and E is 5 × 5. Notice that the matrix dimensions are not the same on both sides of the equation: these are not the same matrix! However, they each have the same number of independent quantities, and both represent the same tensor T. Notice also that when you rotate T, you never mix terms between C, D and E: they do not talk to each other (think angular momentum in quantum mechanics).

Why would one ever want to do this? The answer is that when working with dyadics in practice, very often one is only interested in the part of the tensor that transforms a particular way. For that reason, it is often very useful to work with the separate spherical tensors rather than with the entire Cartesian tensor. The best example is in Quantum Mechanics, when you are talking about angular momentum; this kind of relationship is by no means as obvious when written in its Cartesian form. See Sakurai for more details. The study of finding different dimensional matrices for T goes under the heading of representation theory, which is a very active part of modern algebraic research and is very useful in many areas of science.
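The Kronecker sum is built into scipy; a one-line illustration with placeholder blocks:

    import numpy as np
    from scipy.linalg import block_diag

    C = np.ones((1, 1))
    D = np.ones((3, 3))
    E = np.ones((5, 5))
    T = block_diag(C, D, E)     # C ⊕ D ⊕ E: a block-diagonal matrix
    print(T.shape)              # (9, 9), i.e. (1+3+5) x (1+3+5)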

Chapter 2

Transformations and Symmetries

2.1 Symmetries and Groups

Here we will discuss some properties of groups in general that you need to know. Then we will progress to discuss a very special and important kind of group, called a Lie Group. Finally we will look at some examples.

• Definition: A group G is a set with an operator called multiplication that obeys four axioms: closure (g_1, g_2 ∈ G ⇒ g_1g_2 ∈ G), associativity ((g_1g_2)g_3 = g_1(g_2g_3)), an identity (generically called e, such that ge = eg = g) and inverses (g has inverse g⁻¹ such that gg⁻¹ = g⁻¹g = e). If the group is commutative (g_1g_2 = g_2g_1), it is said to be an Abelian group; otherwise it is non-Abelian.

• Discrete groups represent operations that either do or do not take place. Continuous groups represent operations that differ infinitesimally from each other. Elements in continuous groups can be labeled by a set of continuous parameters. If the set of parameters is finite, the group is said to be a finite group. If the ranges of the parameters are closed and bounded, the group is said to be a compact group.

• Definition: A subgroup is a subset of a larger group that is also a group. A subgroup N ⊂ G is a normal subgroup (also called an invariant subgroup) if ∀ s_i ∈ N, g⁻¹s_i g ∈ N ∀ g ∈ G. In other words, elements of N stay in N after being multiplied by elements in G. We write N ◁ G. There are several definitions of invariant subgroups, but they are all equivalent to this one.

• Definition: If a group has no invariant subgroups, it is called a simple group.

• Consider two groups that commute with each other (G, H). Then you can construct the direct product group in the following way: G × H ≡ {(g_j h_k) | (g_j h_k)(g_l h_m) = (g_j g_l h_k h_m)}. Be careful not to confuse a direct product group with a Kronecker product of tensors.

• Fact: G, H ◁ G × H are invariant subgroups.

• Definition: Suppose there is a square n × n matrix M(g) that corresponds to each element g of a group G, where g_1g_2 = g_3 ⇒ M(g_1)M(g_2) = M(g_3) and M(g⁻¹) = [M(g)]⁻¹. Then M is called a representation of the group G, and n is called the dimension of the representation. If this identification is one-to-one, the representation is called "faithful". An example of an unfaithful representation is the one where all elements of a group G go to 1; this one-dimensional representation is the "trivial" representation.

• Consider an n-tuple ξ_a. Here, ξ is a vector in the sense of an n-component object, not necessarily a vector in the sense of Chapter 1. A group element g in an n-dimensional representation M acts on ξ in the following way: ξ'_a = M(g)_{ab} ξ_b (recall Einstein notation). Furthermore, ∀ ξ, ∃ ξ̄ such that ξ̄'_a = ξ̄_b [M(g)⁻¹]_{ba}; ξ̄ is called the adjoint of ξ.

• Definition: A representation is reducible if each matrix can be expressed in block-diagonal form. Using the Kronecker sum notation, we can rephrase this definition: ∃ A, B such that M = A ⊕ B. Then A and B are also valid representations of the group, and they do not talk to each other (think angular momentum in quantum mechanics). If a representation is not reducible, it is said to be irreducible. We will see some examples of this shortly.

• Definition: The smallest (lowest-dimension) nontrivial irreducible faithful representation is called the fundamental representation.

• Definition: A Lie group is a continuous group that is also a manifold whose multiplication and inverse operations are differentiable. It is named after Sophus Lie, the mathematician who first discovered their properties. This definition is quite a mouthful; don't worry too much about it if you're not a theorist! Things will become a lot simpler once we get to some examples. But before that, here are some important results you should know:

• Theorem: All compact Lie groups have unitary representations. In other words:

    M(ε) = e^{iε^j F_j},   j = 1, ..., N

where the ε^j are continuous parameters and the F_j are matrices¹ called generators. (Warning: the index on F is not a matrix index; each F_j = (F_j)_{ab} is a matrix.) From this we can see that M† = M⁻¹ as long as F† = F, so we require that the generators be Hermitian.

• Many (though not all) Lie groups have the additional property that det M(ε) = +1 ∀ ε. This is called the special condition. Since the matrix M must be nonsingular (otherwise there would be no g⁻¹ and it wouldn't be a group), this can be studied with the help of the very useful fact

    det A = e^{Tr log A}

From the above result, the special condition implies Tr(F_j) = 0.

¹Recall the definition of the exponential of a matrix: e^A = Σ_{n=0}^∞ A^n/n!

• When ε is small, we can expand the exponential to get:

    M(ε) = 1 + iε^j F_j + O(ε²)

This is exactly the same as an infinitesimal transformation, and so we say that the F_j generate infinitesimal transformations.

• Theorem: The local structure of any Lie group is completely specified by a set of commutation relations among the generators:

    [F_i, F_j] = i f_{ijk} F_k                                     (2.1.1)

(the sum over k is understood). The f_{ijk} are called the structure constants of the Lie group. They are the same for any representation of the group, and in general the structure constants are antisymmetric in their indices. These commutators specify an algebra spanned by the generators², called a Lie Algebra.

• Theorem: f_{ijl} f_{lkm} + f_{kil} f_{ljm} + f_{jkl} f_{lim} = 0. The proof follows from Jacobi's identity for commutators.

• Notice that we can rearrange the above version of the Jacobi identity and use the antisymmetry of the structure constants to rewrite it as:

    (f_i)_{jl} (f_k)_{lm} − (f_k)_{jl} (f_i)_{lm} = −f_{ikl} (f_l)_{jm}          (2.1.2)

where I have just separated the indices in a convenient way. Then, if I define a matrix for each generator with the elements (F_i)_{jk} ≡ −i f_{ijk}, I have reproduced Equation (2.1.1): this is a valid representation! It is so important that it is given a special name; it is called the adjoint (or regular) representation.

• Definition: The rank of a Lie group is the number of simultaneously diagonalizable generators, that is, the number of generators that commute among themselves. This set of generators forms a subalgebra called the Cartan Subalgebra. The rank of the group is the dimension of this subalgebra.

• Definition: C(n) ≡ f_{i_1 j_1 j_2} f_{i_2 j_2 j_3} · · · f_{i_n j_n j_1} F_{i_1} F_{i_2} · · · F_{i_n} are called the Casimir operators (notice that all indices are repeated and therefore summed via the Einstein convention). Casimir operators commute with every generator, and hence they must be multiples of the identity. This means that they represent conserved quantities. It is also true that the number of Casimir operators is equal to the rank of the group.

²An algebra is a vector space with a vector product; in this case, the basis of the vector space is the set of generators (see Chapter 5).
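For the structure constants f_ijk = ε_ijk (which we will meet in the next section), you can verify by brute force that the matrices (F_i)_jk = −i f_ijk do satisfy Equation (2.1.1); a short numpy sketch:

    import numpy as np

    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

    F = [-1j * eps[i] for i in range(3)]    # (F_i)_{jk} = -i f_{ijk}: the adjoint rep

    for i in range(3):
        for j in range(3):
            comm = F[i] @ F[j] - F[j] @ F[i]
            expected = 1j * sum(eps[i, j, k] * F[k] for k in range(3))
            assert np.allclose(comm, expected)      # [F_i, F_j] = i f_{ijk} F_k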

That's enough general stuff. Let's look at some very important examples.

2.1.1 SU(2) - Spin

• Consider a complex spinor

    ξ = ( ξ^1
          ξ^2 )

and the set of all transformations which leave the norm of the spinor invariant, i.e. which leave s² = |ξ^1|² + |ξ^2|² invariant. From the above, we know that we need unitary matrices, so the generators (call them σ_i) must be linearly independent, Hermitian 2 × 2 matrices. We will also require det M = 1, so the generators must be traceless. Let us also include a normalization condition: Tr(σ_i σ_j) = 2δ_ij. Then the matrices we need are the Pauli matrices:

    σ_1 = ( 0 1       σ_2 = ( 0 −i      σ_3 = ( 1  0
            1 0 )             i  0 )            0 −1 )             (2.1.3)

Then we will choose the group generators to be F_i = σ_i/2. The group generated by these matrices is called the Special Unitary group of dimension 2, or SU(2); in the fundamental representation, it is the group of 2 × 2 unitary matrices of unit determinant.

• Recall from quantum mechanics that the structure constants are simply given by the Levi-Civita tensor:

    f_ijk = ε_ijk

• Only σ_3 is diagonal, and therefore the rank of SU(2) is 1. Then there is only one Casimir operator, and it is given by:

    C(2) = F_1² + F_2² + F_3²

• If we remove the Special condition (unit determinant), then the condition that Tr σ_i = 0 no longer need hold. Then there is one more matrix that satisfies all the other conditions and is still linearly independent from the others: the identity matrix

    σ_0 = ( 1 0
            0 1 )

Notice, however, that this is invariant:

    e^{iε_j F_j} e^{iε_0 F_0} e^{−iε_j F_j} = e^{iε_0 F_0}         (2.1.4)

Therefore the group generated by e^{iε_0 F_0} is an invariant subgroup of the whole group. We call it U(1). That it is a normal subgroup should not surprise you, since it is true that e^{iε_0 F_0} = (e^{iε_0/2}) σ_0, which is just a number times the identity matrix.
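Here is a quick numerical confirmation of these properties, using scipy's matrix exponential (the group parameter is an arbitrary value):

    import numpy as np
    from scipy.linalg import expm

    s = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]]),
         np.array([[1, 0], [0, -1]], dtype=complex)]
    F = [m / 2 for m in s]

    assert np.allclose(F[0] @ F[1] - F[1] @ F[0], 1j * F[2])   # f_ijk = eps_ijk
    assert all(np.isclose(np.trace(m @ n), 2 * (m is n)) for m in s for n in s)

    U = expm(1j * 0.7 * F[2])                      # a group element e^{i eps F_3}
    assert np.allclose(U.conj().T @ U, np.eye(2))  # unitary
    assert np.isclose(np.linalg.det(U), 1.0)       # unit determinant (special)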

• The whole group (without the "special") is called U(2). We have found U(1), SU(2) ⊂ U(2), and we can write:

    U(2) = SU(2) × U(1)

This turns out to be true even in the higher-dimensional cases: U(N) = SU(N) × U(1).

• Summarizing:

    R = e^{i(θ/2)(n̂·σ)}                                            (2.1.5)

is the general form of an element of SU(2) in the fundamental representation, where θ is the rotation angle and n̂ is the rotation axis, as you might expect if you followed the analogy from spin in quantum mechanics.

• You know from quantum mechanics that there are other representations of spin (with different dimensions) that you can use, depending on what you want to describe; this change in representation is nothing more than calculating Clebsch-Gordan coefficients!

• Example: what if you want to combine two spin-½ particles? Then you know that you get (3) spin-1 states and (1) spin-0 state. We can rewrite this in terms of representation theory: we'll call the fundamental representation 2, since it is two-dimensional. The spin-1 representation is three dimensional (it has C(2) = 2) and the spin-0 is one dimensional, so we denote them by 3 and 1 respectively. Then we can write the equation:

    2 ⊗ 2 = 3 ⊕ 1                                                  (2.1.6)

(Notice once again that the left-hand side is built from 2 × 2 matrices while the right-hand side is a 4 × 4 matrix, but both sides only have four independent components.) Believe it or not, you already learned this when you wrote down the angular momentum matrices (recall that they are block-diagonal). In this language, it becomes clear that the four-dimensional representation is reducible into a three-dimensional and a one-dimensional representation, so all matrix elements can be written as a Kronecker sum of these two representations. Equation (2.1.6) is just saying that combining two spin-½ particles gives a spin-1 and a spin-0 state, only using the language of groups. In this decomposition, the spin-1 and spin-0 do not talk to each other! This also shows that you do not need to consider the entire (infinitely huge) angular momentum matrix, but only the invariant subgroup that is relevant to the problem.
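You can see the 3 ⊕ 1 split numerically by diagonalizing the Casimir of the combined (four-dimensional) representation; a numpy sketch:

    import numpy as np

    s = [np.array([[0, 1], [1, 0]], dtype=complex) / 2,
         np.array([[0, -1j], [1j, 0]]) / 2,
         np.array([[1, 0], [0, -1]], dtype=complex) / 2]

    # total spin acting on the 2 ⊗ 2 space: F_i ⊗ 1 + 1 ⊗ F_i
    J = [np.kron(f, np.eye(2)) + np.kron(np.eye(2), f) for f in s]
    C2 = sum(j @ j for j in J)                     # the Casimir C(2) = J·J

    print(np.round(np.linalg.eigvalsh(C2), 6))    # [0. 2. 2. 2.]
    # eigenvalues s(s+1): one spin-0 state (the 1) and three spin-1 states (the 3)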

2.1.2 SO(3,1) - Lorentz (Poincaré) Group

Much of this information is in Jackson, Chapter 11, but good ol' JD doesn't always make things very clear, so let me try to summarize.

• The Poincaré Group is the set of all transformations that leave the metric of flat spacetime invariant (see Chapter 4). The metric in Minkowski (flat) space is given by:

    ds² = c²dt² − dr²

• Intuitively, we know that this metric is invariant under parity (space reversal), time reversal, spatial rotations, space and time translations, and Lorentz transformations. So we know that these operations should be represented in the Poincaré group.

• Ignore translations for now. You can prove to yourself that the matrices that represent the remaining operations must be orthogonal (Oᵀ = O⁻¹). In general, the group of n × n orthogonal matrices is usually denoted by O(n); here n = 4. In this case, however, there is a minus sign in the metric that changes things somewhat: since three components have one sign and one component has another, we usually denote this group as O(3,1).

• The transformations we are interested in are the proper orthochronous transformations, i.e. the ones that keep causality and parity preserved. Forbidding parity is accomplished by imposing the "special" condition (det O = +1). However, this is not enough: although this forbids parity and time reversal individually, it does not forbid their product (prove this!). So we consider the Proper Orthochronous Lorentz Group, or SO(3,1)₊, where the "orthochronous" condition (the subscript "+" sign) is satisfied by insisting that the (00) component of any transformation is always positive; that, along with the special condition, is enough to isolate the desired subgroup. For a proof, see Jackson. As this is quite a mouthful, we usually just refer to SO(3,1) as the Lorentz group, even though the group we mean is really only a subgroup of it.

• We can now go back to Chapter 1 and rephrase the definition of a tensor in flat spacetime as an object that transforms the same way as the coordinates under an element of the Lorentz group.

• The Lorentz group has six degrees of freedom (six parameters, six generators). We can interpret these six parameters as three spatial rotations and three boosts, one about each direction. We will denote the rotation matrix generators as S_i and the boost generators as K_i, where i = 1, 2, 3. These matrices are:

    S_1 = ( 0 0 0  0        S_2 = ( 0  0 0 0        S_3 = ( 0 0  0 0
            0 0 0  0                0  0 0 1                0 0 −1 0
            0 0 0 −1                0  0 0 0                0 1  0 0
            0 0 1  0 )              0 −1 0 0 )              0 0  0 0 )

    K_1 = ( 0 1 0 0         K_2 = ( 0 0 1 0         K_3 = ( 0 0 0 1
            1 0 0 0                 0 0 0 0                 0 0 0 0
            0 0 0 0                 1 0 0 0                 0 0 0 0
            0 0 0 0 )               0 0 0 0 )               1 0 0 0 )

• In general, we can write any element of the Lorentz group as e^{−[θ·S+η·K]}, where θ are the three rotation angles and η are the three boost parameters, called the rapidities. You should check that this exponential does in fact give you a rotation or a Lorentz transformation when choosing the proper variables. See Jackson for the most general transformation.

• The boost velocity is related to the rapidity by the equation:

    β_i ≡ tanh η_i                                                 (2.1.7)

• Exercise: what is the matrix corresponding to a boost in the x̂-direction? Answer:

    ( cosh η  sinh η  0  0        ( γ   γβ  0  0
      sinh η  cosh η  0  0    =     γβ  γ   0  0
      0       0       1  0          0   0   1  0
      0       0       0  1 )        0   0   0  1 )

where γ = (1 − β²)^{−1/2}. This is what you're used to seeing in Griffiths!

• Notice that you can always boost along a single direction by combining rotations with the boosts. This is certainly legal by the nature of groups!

• The angles are bounded, 0 ≤ θ ≤ 2π, but the rapidities are not: 0 ≤ η < ∞. The Lorentz group is therefore not a compact group! And consequently it does not have faithful unitary representations. This is a big problem in Quantum Mechanics, and it comes into play when trying to quantize gravity.

• The Lorentz algebra is specified by the commutation relations:

    [S_i, S_j] = ε_ijk S_k
    [S_i, K_j] = ε_ijk K_k
    [K_i, K_j] = −ε_ijk S_k

Some points to notice about these commutation relations:

1. The rotation generators form a closed subalgebra (and hence generate a closed subgroup): it is simply the set of all rotations, SO(3). It is not a normal subgroup, however, because of the second commutator.

2. A rotation followed by a boost is a boost.

3. Two boosts in different directions correspond to a rotation about the third direction.

4. The minus sign in the last commutator is what makes this group the noncompact SO(3,1), as opposed to the compact SO(4), which has a + sign.
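With the generators above, you can exponentiate and check all of this with scipy (note that the overall sign of η in the exponent is convention-dependent; here I choose it so the exercise answer comes out directly):

    import numpy as np
    from scipy.linalg import expm

    K1 = np.zeros((4, 4)); K1[0, 1] = K1[1, 0] = 1.0
    eta = 0.5
    L = expm(eta * K1)                   # a boost along x with rapidity eta

    assert np.allclose(L[:2, :2], [[np.cosh(eta), np.sinh(eta)],
                                   [np.sinh(eta), np.cosh(eta)]])
    g = np.diag([1.0, -1.0, -1.0, -1.0])
    assert np.allclose(L.T @ g @ L, g)   # the Minkowski metric is preserved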

• Notice that you could have constructed the generators for the Lorentz group in a very similar way to how you construct the generators of the regular rotation group (angular momentum):

    J^{µν} = x ∧ p = i(x^µ ∂^ν − x^ν ∂^µ)                          (2.1.8)

where µ, ν = 0, 1, 2, 3, and you can think of the wedge product as a generalization of the cross product (see Chapter 5). In this notation, the greek indices are not matrix indices: J^{0i} ≡ K_i and J^{ij} ≡ ε_ijk S_k, where i, j = 1, 2, 3. So these six matrices are equivalent to the six above ones.

2.2 Transformations

• We wish to understand how transformations affect the equations of motion. The most important kinds of transformations are translations and rotations; these are called proper transformations, and we will consider them first. Other transformations might include parity, time reversal and charge inversion. We might also be interested in internal symmetries, such as gauge transformations in E&M. In what follows, I will be suppressing indices to make the notation simpler.

• When discussing the theory of transformations, it is useful to consider how objects transform under a very small change; such changes are called infinitesimal transformations. We want to invoke a transformation on a coordinate: x → x + δx. Then we would expect a change in any function of x: f(x) → f(x) + δf. What is the form of δf? If f represents a tensor, then δf should have the same form as δx, by definition.

• Example: Consider an infinitesimal rotation by an angle ε about a point in the i direction. The transformation will look like:

    x → x + iεJ_i x + O(ε²) = [1 + iεJ_i + O(ε²)]x

where J_i is the generator of the rotation (angular momentum).

• Recall that performing two rotations in the same direction is just multiplying the two rotations together. So let ε = a/N, where a is a finite number; ε is infinitesimal in the limit N → ∞. Then performing N infinitesimal rotations of size ε gives:

    lim_{N→∞} (1 + iaJ_i/N)^N = e^{iaJ_i}  ⇒  x → e^{iaJ_i}x       (2.2.9)

Hence we have constructed a finite transformation from an infinite number of infinitesimal ones. Notice that if you expand the exponential and keep only the linear terms (for a very small), you reproduce the infinitesimal rotation.
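The limit in Equation (2.2.9) is easy to watch converge numerically; in this sketch J is the rotation generator of the plane (the factor of i is absorbed into the real antisymmetric matrix):

    import numpy as np
    from scipy.linalg import expm

    J = np.array([[0.0, -1.0],
                  [1.0,  0.0]])          # generates rotations: expm(aJ) = R(a)
    a, N = 0.8, 10**6

    finite = np.linalg.matrix_power(np.eye(2) + (a / N) * J, N)   # (1 + aJ/N)^N
    print(np.max(np.abs(finite - expm(a * J))))   # tends to 0 as N grows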

2.3 Lagrangian Field Theory

Here is a brief overview of the important things you should know about Lagrangian field theory.

• Recall that in Lagrangian mechanics you define a Lagrangian L(q(t), q̇(t)) and an action S = ∫ L dt. Now we wish to describe a Lagrangian density L(φ(x), ∂_µφ(x)), where L = ∫ L d³x and φ is a field, which is a function of spacetime. The action can be written as S = ∫ L d⁴x, and the action completely specifies the theory.

• The equations of motion can now be found, just as before, by the condition δS = 0. This yields the new Euler-Lagrange equations:

    ∂L/∂φ − ∂_µ (∂L/∂φ,_µ) = 0                                     (2.3.10)

Notice that there is a sum over the index µ, following Einstein notation.

• Example: L = ½ ∂_µφ ∂^µφ − ½ m²φ² is the Lagrangian for a scalar field (the Klein-Gordon field in relativistic quantum mechanics). Can you find the equations of motion? Here's the answer:

    (∂² + m²)φ = 0

• Just as in classical mechanics, we can also go to the Hamiltonian formalism, by defining a momentum density:

    π ≡ ∂L/∂φ̇

where φ̇ ≡ ∂₀φ, and performing the Legendre transformation:

    H(φ(x), π(x)) = π(x)φ̇(x) − L  ⇒  H = ∫ d³x H                   (2.3.11)

• Notice that L is covariant, but H is not, since time becomes a special parameter via the momentum density. This is why we often prefer to use the Lagrangian formalism whenever working with relativistic mechanics. Nevertheless, the Hamiltonian formalism is still generally useful, since H is related to the energy in the field.
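As a check of the Klein-Gordon answer above, here is a small sympy sketch verifying that a plane wave with ω² = k² + m² solves the equation of motion (in 1+1 dimensions, with c = 1 and the mostly-minus convention):

    import sympy as sp

    t, x, k, m = sp.symbols('t x k m', real=True)
    omega = sp.sqrt(k**2 + m**2)
    phi = sp.exp(sp.I * (k*x - omega*t))      # a plane-wave field configuration

    # (d^2 + m^2) phi, with d^2 = d_t^2 - d_x^2
    kg = sp.diff(phi, t, 2) - sp.diff(phi, x, 2) + m**2 * phi
    print(sp.simplify(kg))                    # 0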

2.4 Noether's Theorem

A theory is said to be symmetric with respect to a transformation whenever the action doesn't change under that transformation. This implies that the Lagrangian can only change up to a total divergence (via Gauss's Theorem). To see this, notice that if the Lagrangian changes by a total divergence, we have:

    S → ∫ L d⁴x + ∫ ∂_µJ^µ d⁴x = S + ∮ dΣ_µ J^µ

where the last surface integral is an integral over a surface at infinity. In order for quantities like energy to be finite, we often require that fields and their derivatives vanish on such a surface. This implies that the above surface integral vanishes, or at least doesn't contribute to the equations of motion³. So for now let's just assume that the surface integral is irrelevant, as it often is in practice. Thus we have Noether's Theorem:

For every symmetry in the theory, there corresponds a conserved current, and vice versa.

Proof: I will prove Noether's theorem for a scalar field, but it is true in general for any type of field. Let's consider a general transformation that acts on the field according to:

    φ → φ + αδφ

where α is a constant infinitesimal parameter⁴. If we insist that this transformation leave the action invariant, the previous argument says that the Lagrangian can only change by a total divergence:

    L → L + α ∂_µJ^µ

Let's calculate the change in L explicitly:

    αδL = (∂L/∂φ)(αδφ) + (∂L/∂φ,_µ) ∂_µ(αδφ)
        = α ∂_µ[(∂L/∂φ,_µ) δφ] + α [∂L/∂φ − ∂_µ(∂L/∂φ,_µ)] δφ
        = α ∂_µ[(∂L/∂φ,_µ) δφ]

To get the second line from the first, use the product rule. The term in square brackets is zero by Equation (2.3.10), leaving only the total divergence, which must equal α∂_µJ^µ. Then we can combine terms to get:

    j^µ ≡ (∂L/∂φ,_µ) δφ − J^µ  ⇒  ∂_µ j^µ = 0                      (2.4.12)

hence j^µ is a conserved current, which is precisely what we wanted to show. QED.

Notice that we could also state Noether's theorem by saying that for every symmetry of the theory, there is a "charge" which is conserved:

    Q ≡ ∫ j⁰ d³x  ⇒  dQ/dt = 0

There are many applications of this theorem. You can prove that isotropy of space (rotational invariance) implies angular momentum conservation.

³Sometimes this integral does not vanish, and this leads to nontrivial "topological" effects. But this is advanced stuff, and well beyond the scope of this review.
⁴This theorem also holds for nonconstant α.

Similarly, homogeneity of space (translational invariance) implies linear momentum conservation, time translation invariance implies energy conservation, gauge invariance in electrodynamics implies electric charge conservation, and so on. You can also consider more exotic internal symmetries.

Chapter 3

Geometry I: Flat Space

The most important aspect of geometry is the notion of distance. What does it mean for two points in a space to be "close" to each other? This notion is what makes geometry distinct from topology and other areas of abstract mathematics. In this chapter, I want to define some important quantities that appear again and again in geometric descriptions of physics, and give some examples. In this chapter we will always be working in flat space (no curvature); in the next chapter, we will introduce curvature and see how it affects our results.

3.1 THE Tensor: g_µν

• Like transformations, distances are best handled infinitesimally: we can define distance infinitesimally and work our way up from there. Consider two points in space that are separated by infinitesimal distances dx, dy and dz. Then the total displacement is given by the triangle rule of metric spaces (a.k.a. the Pythagorean theorem):

    ds² = dx² + dy² + dz² ≡ g_µν dx^µ dx^ν

• Definition: g_µν is called the metric tensor or the first fundamental form. Knowing the form of g_µν for any set of coordinates lets you know how to measure distances.

• In Cartesian coordinates, the metric tensor is simply the identity matrix, g_µν = δ_µν, but in general it could be off-diagonal, or even a function of the coordinates. The metric tensor is given by Equation (1.1.4), and when faced with trying to calculate the metric in general, usually you have to resort to that equation. But in this chapter we will often cheat, since sometimes we already know what ds² is.

• WARNING: Unlike the usual inner product notation, we never write ds² = g^µν dx_µ dx_ν. This is because {dx^µ} forms a basis for the covectors (1-forms); see Chapter 5. The dx's always have upper indices¹. Compare this to the basis vectors {e_i} of Chapter 1, which always came with lower indices.

¹So that <dx^i, e_j> = δ^i_j. Notice that the indices work out exactly as they should for the Kronecker delta; this is, once again, not a coincidence!

3.1.1 Curvilinear Coordinates

• In spherical coordinates, the directions are given by (dr, dθ, dφ), and the metric is given by:

    ds² = dr² + r²dθ² + r²sin²θ dφ²

so the metric tensor is therefore:

    [g_µν] = ( 1  0   0
               0  r²  0
               0  0   r²sin²θ )

• Cylindrical coordinates (or polar coordinates) are given by the coordinates:

    ρ = √(x² + y²)                                                 (3.1.1)
    φ = tan⁻¹(y/x)                                                 (3.1.2)
    z = z                                                          (3.1.3)

The metric is given by:

    ds² = dρ² + ρ²dφ² + dz²

and therefore:

    [g_µν] = ( 1  0   0
               0  ρ²  0
               0  0   1 )

• Parabolic coordinates are derived from the cylindrical coordinates in the following way:

    z = ½(ξ − η)                                                   (3.1.4)
    ρ = √(ξη)                                                      (3.1.5)
    φ = φ                                                          (3.1.6)

or, using the spherical radial distance:

    ξ = r + z
    η = r − z

Plugging these expressions into the metric for cylindrical coordinates, we derive the parabolic coordinate metric:

28 . They are very useful in problems with the appropriate symmetry. 3.ds2 = therefore: 1 4 1+ η ξ dξ 2 + 1 + ξ η dη 2 + ξηdφ2 [gµν ] = 1 4 1+ 0 0 η ξ 1 4 0 1+ 0 ξ η 0 0 ξη Parabolic coordinates describe families of concentric paraboloids symmetric about the z-axis. there are also hyperbolic coordinates and elliptical coordinates.2. etc). but I won’t get into those here. such as parabolic mirrors.2.7) Now take the determinant of both sides. Remembering that the metric is a (0. There are even more complicated coordinate systems that are useful (toroidal coordinates. and deﬁning g ≡ det(gµν ) we ﬁnd that J = √ g. J = det ∂xa ∂y µ • To ﬁnd the general Jacobian.1 Diﬀerential Volumes and the Laplacian Jacobians • If you transform your coordinates. let us consider a transformation from Cartesian coordinates (xa ) to general coordinates (y µ ) and ask how the metric transforms. you will modify your diﬀerential volume element by a factor called the Jacobian: y α = y α (xβ ) ⇒ dn x = J dn y where in general.2 3. 2) tensor and gab = δab we have: gµν = δab ∂xa ∂xb ∂y µ ∂y ν (3. so in general: √ dn x = gdn y . • As you might have guessed.

3.2.2 Vielbeins - a Prelude to Curvature

I will talk about curvature in the next chapter, but now would be a good time to introduce a very useful tool called the vielbein (e^µ_a):

• Consider a point p with coordinates x^µ on a manifold (generally a manifold with curvature, which we'll talk about next chapter). At every point p ∈ M, there is a "tangent space"; this is basically the flat plane tangent to the manifold at the point p.

• Now consider the point p on the manifold and the tangent space at p. Give the tangent plane coordinates ξ^a such that the point p is mapped to the origin. Then there is a (linear) mapping between the points near the origin of the tangent plane (with coordinates ξ^a) and a neighborhood of p on the manifold (with coordinates x^µ). This linear mapping is given by a matrix called the vielbein:

    e^µ_a ≡ ∂x^µ/∂ξ^a

This allows us to write Equation (3.2.7) as:

    g_µν = δ_ab e^a_µ e^b_ν

So the vielbein is, in some sense, the square root of the metric! In particular, the above analysis shows that the Jacobian is just e, the determinant of the vielbein. The key point to notice is that the latin index is for flat space while the greek index is for curved space. There are many cases in physics where this formalism is very useful (for example, when spin is involved), but I won't mention any more of it here. At least you have seen it.

3.2.3 Laplacians

• We can find the Laplacian of a scalar field in a very clever way, using the variational principle from electrostatics. Consider that we wish to minimize the energy of the system; in electrostatics, that is (up to constants) the integral of E². So using our scalar potential, let's define the energy functional in Cartesian coordinates:

    W[Φ] = ∫ dⁿx (∇Φ)²

where dⁿx is the n-dimensional differential volume element.

• We can find the extremum of this functional, δW[Φ] = 0, and see what happens. For our functional, notice that an integration by parts lets you rewrite the integrand as −Φ∇²Φ, and the variation of that is simply −δΦ ∇²Φ. Now change coordinates using the above prescription (dⁿx → √g dⁿy) and repeat the variation. We set the two expressions equal to each other,

since the integral must be the same no matter what coordinate system we use. Notice that ∂_µ is covariant, so we require the contravariant g^µν; remember that this is the inverse of the metric. Then you get as a final expression for the Laplacian:

    ∇² = (1/√g) ∂_α [ √g g^{αβ} ∂_β ]                              (3.2.8)

Exercise: Prove this formula. It's not hard, but it's good practice.

• Exercise: Find the Laplacian in several different coordinate systems using Equation (3.2.8). The usual Laplacians are on the cover of Jackson. Can you also find the Laplacian in parabolic coordinates? Here's the answer:

    ∇² = (4/(ξ + η)) [ ξ ∂²/∂ξ² + ∂/∂ξ + η ∂²/∂η² + ∂/∂η ] + (1/ξη) ∂²/∂φ²

• Notice that this expression is only valid for scalar fields! For vector fields it is generally more complicated, and you must be extra careful. To see this, notice that at first you might assume that ∇²A = (∇²A_i) for each component A_i. In Cartesian coordinates this is fine, but in curvilinear coordinates the true definition of the ith vector component is A_i = A · e_i, and in curvilinear coordinates e_i has spatial dependence, so you have to take its derivative as well. We will take the derivative of a unit vector and discuss its meaning in the next chapter.
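Equation (3.2.8) is straightforward to implement in sympy, which makes the exercise easy to check; here it reproduces the parabolic result above:

    import sympy as sp

    def laplacian(f, g, coords):
        """(1/sqrt(g)) d_a [ sqrt(g) g^{ab} d_b f ], Equation (3.2.8)."""
        ginv, sq = g.inv(), sp.sqrt(g.det())
        n = len(coords)
        return sp.simplify(sum(sp.diff(sq * ginv[a, b] * sp.diff(f, coords[b]), coords[a])
                               for a in range(n) for b in range(n)) / sq)

    xi, eta, phi = sp.symbols('xi eta phi', positive=True)
    g = sp.diag((1 + eta/xi)/4, (1 + xi/eta)/4, xi*eta)
    f = sp.Function('f')(xi, eta, phi)
    print(laplacian(f, g, [xi, eta, phi]))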

3.3 Euclidean vs Minkowskian Spaces

What is the difference between Euclidean space and Minkowski space? In all the examples we have considered so far, we have been in three dimensions, whereas Minkowski space has four. But this is no problem, since nothing above depended on the number of dimensions at all, so everything generalizes very nicely (now g_µν is a 4 × 4 matrix).

There is one more difference, however, that must be addressed: the key to this strange concept is the minus sign. This is obvious when you look at the metric in spacetime:

    Δs² = c²Δt² − Δx²

(where I am looking at only one spatial dimension for simplicity). For constant Δs, this is the equation of a hyperbola. Minkowski space is a "hyperbolic space": the loci of points equidistant from the origin form hyperbolas. We immediately notice that trouble will arise if we naively pretend that there is nothing wrong, since that minus sign makes g (the determinant of the metric tensor) negative, and the square root of a negative number makes no sense (you can't have imaginary distance!). To fix this problem, we will forever in Minkowski space make the substitution g → −g, and all our woes will vanish! Another way to think of this (and perhaps the better and more general way) is to replace g → |g| in all of our equations. Then there will never be a problem.

Just for completeness, the metric tensor of Minkowski spacetime is:

    η_µν = ( +1  0   0   0
              0 −1   0   0
              0  0  −1   0
              0  0   0  −1 )

This is called the "mostly-minus" prescription. Sometimes people will use the "mostly-plus" prescription, which is the negative of this. Of course, physics doesn't change, but the formulas might pick up random minus signs. You must be very careful about which convention is being used: particle physicists often use the mostly-minus prescription I use here, while GR people tend to use the mostly-plus prescription. But sometimes they switch! So be careful.

Diﬀerential geometry and curved space have their most obvious role in general relativity (GR). including (but not limited to) optics. and so they are “parallel”. So in principle. We know that in Euclidean geometry. yet they meet at two points: the north and south poles.1 Connections and the Covariant Derivative One of the biggest problems with geometry on curved spaces is the notion of the derivative of a vector ﬁeld. I give some recomendations in the bibliography. we can avoid the issue of curvature completely if we think in enough dimensions. For a deeper understanding. we were still talking about ﬂat space. and I recommend anyone interested to study it either in a class or on your own (I took three classes in it as an undergraduate and I’m still trying to learn it!) But I do not plan to teach diﬀerential geometry here. then they are necessarely parallel to each other (they never intersect). if two lines are perpendicular to another line. you are actually taking the diﬀerence of two 32 . where gravity manifests as a warping of spacetime. However. The situation is fundamentally diﬀerent when space itself is curved. only give an overview and a source for common equations you might come across in physics. Yet a counterexample of this is longitude lines: two lines of longitude are always perpendicular to the equator. a sphere (which is two dimensional) can be embedded into Euclidean 3-space. you have to go to a textbook. it also has applications in several other branches of mathematical physics. Diﬀerential geometry is truly a deep and profound subject. When you take the derivative. Even when we were using curvilinear coordinates.Chapter 4 Geometry II: Curved Space Consider the planet Earth as a sphere. Obviously. Hence this chapter. and curvature plays a very important role in mathematical physics. So it might be useful to see some of the basics in one place. But this is not practical or necessary in general. something is wrong with Euclidean geometry on a sphere. solid-state physics and quantum ﬁeld theory. we have been assuming that the space we live in was ﬂat. 4. For example. In everything that we have been doing. It is here where diﬀerential geometry truly becomes interesting! There is a very deep theorem in mathematics that states that any manifold of any dimension and any curvature can always be embedded in a ﬂat space of some larger dimension.

Instead. This becomes easy to see if we let X = xi ∂i . Resorting to mathematical terminology. They have the inhomogeneous transformation law: Γσ ν = Aµ Aν Aσ Γσ + Aσ . not just constants.2) Sometimes called the product rule. The n3 numbers Γk introduced here2 are called ij Christoﬀel symbols. just like we expect the derivative to do. the third rule is the analogue of Leibniz’s rule1 in vector calculus. a. as we will see in a later section.vectors that are separated by an inﬁtiesimal distance: dA = A(x + ) − A(x) But this doesn’t even make sense . However. The adjective “tensorial” means that it only depends on X at a speciﬁc point.you can only add or subtract diﬀerent vectors when they have the same base point. Notice that the ordinary directional derivative in vector calculus is exactly the same.1. + bY2 ) = a ) = f (x) +b X (f (x)Y + (Xf (x))Y The ﬁrst two rules give the connection linear properties. where we have chosen ei ≡ ∂i – a standard trick in diﬀerential geometry. Then we have the following vital deﬁnition: ei ej ≡ Γk ek ij (4. g ∈ C ∞ (M ). we must deﬁne a new tensor called a connection: Deﬁnition: A connection on a manifold M is a map that takes two tensors to a tensor. Finally. b constants. X2 Y . taking the diﬀerence of two vectors with diﬀerent base points is a tricky subject. we can ask: what is the connection of our coordinates? Let’s choose a coordinate basis for our manifold (ei ). 3. Notice that the ﬁrst item applies to general functions.you simply move one of the vectors over so that their base points coincide. f (x)X1 +g(x)X2 Y X (aY1 = f (x) X Y1 XY X1 Y + g(x) X Y2 . f. and then you can subtract them like usual.σ µν . In ﬂat space. Hence. Armed with this knowedge. n = dimM 33 . as opposed to a local neighborhood around the point. This is why we can factor out the functional dependence. this is not a problem .σ µ . or sometimes connection coeﬃcients. 2. They are not tensors!! But they are tensorial in the sense described above. It is denoted X Y and necessarely has the following properties: 1.1. you cannot generally move a vector around arbitrarely in curved space without changing it. and therefore the naive derivative will not work here.ν . we say that the connection X Y is tensorial in X and local in Y .µ 1 2 ∂ 2 xσ ∂xµ xν (4.1) (Note the Einstein notation on the index k).

1.4) Proving Equation (4. Consider a vector ﬁeld A(x) on a manifold M . the covariant derivative is the derivative of a vector ﬁeld that you would see on the surface. but they do not deﬁne it uniquely. However.3) (4.1.1. It seems reasonable that this object might be like the generalization of the derivative. Let’s take its connection with respect to a basis vector: j ei A = ei (A ej ) = (∂i Aj )ej + Aj ei ej = (∂i Aj )ej + Aj Γk ek = ∂i Ak + Aj Γk ek ij ij where I have used Leibniz’s rule and renamed a dummy index in the last step (j → k). it turns out that we’re not quite done . Equations (4. There are many theories where.3). I can drop the basis vectors and write3 : Di Ak = ∂i Ak + Aj Γk ij Di Ak = ∂i Ak − Aj Γj ik (4. Recall I said that you can always embed a manifold in a larger ﬂat space. 3 When in a speciﬁc coordinate basis. Notice that in the limit of ﬂat space. The second property is that the connection is symmetric or torsionless.” It basically means that the derivative of a product makes sense. Before we proceed. If we want to deﬁne something like a derivative. This is an especially popular trick in trying to quantize gravity. we need to invoke more axioms.where the matrices A are the coordinate transformation matrices deﬁned in Chapter 1.1. notice that this is by no means the only choice I could have made. there is a “correction” to the ordinary derivative that compensates for how you subtract the vectors with diﬀerent basepoints. or the Levi-Civita Connection. Now we would like to translate this rather abstract discussion into something concrete.4) is very straightforward .1. and replace the ordinary derivative of ﬂat space. where these are the axioms.3-4. and I will simply stick to Riemannian geometry. we drop the symmetric condition to get various other types of geometry. These two additional properties. we must use the covariant derivative.4) are often called the covariant derivatives.1. along with the three already included. we often simply write ∂i ⇔ Di . Γi = Γi jk kj The ﬁrst of these is often put into words as: “The connection is compatible with the metric. We usually chose the following two: 4. but in more general cases. If I am only interested in the coordinate form of the connection (which I usually am). are necessary and suﬃcient to deﬁne a unique connection called the Riemann Connection. there is no diﬀerence between the ordinary and covariant derivative.these three axioms deﬁne a connection. But this is way beyond the scope of this paper. Xg ≡0 ∀X 5. Physically speaking. 34 . for example. But as long as we stay strictly on the surface.simply consider Di (Aj Aj ) = ∂i (Aj Aj ) and use Equation (4. where the ordinary derivative works ﬁne.

we can use it to ﬁnd a useful expression for the connection coeﬃcients.4).k = (Ai Bj ).1]. should give us enough intuition to express the covariant derivative on an arbitrary rank tensor: n (T i1 i2 ···in ). it is sometimes easier to write the diﬀerential coordinate index with a semi-colon: Ak = Ak + Aj Γk ij .8) 35 . Exercise: Prove Equation (4. Similarly there would be a minus Γ term for each covariant index. a quick corollary to this would be: Φ. Then we can take the parametrized covariant derivative by resorting to the chain rule: Dwj dxi ∂wj dxi dxi = Di w j = + (Γj wk ) ik ds ds ∂xi ds ds Combining terms: Dwj dxi dwj = + Γj w k ik ds ds ds This expression is very useful. Hint: use the covariant derivative.1. w(s) : I → M .Remembering our easy notation of Chapter 1.i (4.k = Ai.µ + gσµ. where I is the unit interval [0.7) As a ﬁnal exercise.1.6) In other words – for a rank-n tensor.1.1. (4.1. consider a parametrized curve on a manifold M . This.ν ] νσ 2 (4. This should be so. since the Christoﬀel symbols only appeared when we had basis vectors to worry about.2). we get a Γ term for each contravariant index. Exercise: Prove Equation (4.1.i for a scalar quantity Φ.j + α=1 Γiα T i1 ···iα−1 βiα+1 ···in βj (4.j = (T i1 i2 ···in ).k Bj + Ai Bj.7) using the method suggested. By the way. along with Equation (4. Now that we have the covariant derivative under our belts.1.k + (Γm Am )Bj + (Γm Bm )Ai ik jk where I have skipped some straightforward algebra. and use the property of compatability to get: 1 Γρ = g µρ [gµν. Take the covariant derivative of the metric tensor three times (each with a diﬀerent index).σ − gνσ.i .5) What if I wanted to take the covariant derivative of a tensor that is higher than rank-1? Let’s consider the covariant derivative of the diadic: (Ai Bj ).i = Φ.

8). these equations do indeed reduce to This is actually a sketch of the proof of the famous theorem from geometry/topology that states that you can never have a continuous vector ﬁeld deﬁned everywhere on a sphere that doesn’t vanish somewhere. Notice that if we’re in ﬂat space. and then use our new-found intuition to deﬁne a geodesic. We begin with a deﬁnition: Deﬁnition: Consider a parametrized path α(s) : I → M and a vector ﬁeld along the curve: w(s) = w(α(s)) all in a manifold M . Now move one of the vectors along the equator until it comes to the international date line (180 degrees around). note that a particle’s trajectory xµ (s) should ultimately be given by Newton’s laws.9) Going back to physics. In this section. and they did their best to generalize the notion of parallel vectors by inventing an algorithm called parallel transport. 4 36 . But there is no need to get into the details of that here. If you were to do the same experiment. with x ≡ dx as the vector ﬁeld we’re diﬀerentiating.2 Parallel Transport and Geodesics Let’s return to our spherical planet example from the beginning of the chapter. Armed with this knowedge. we see that there is something ﬁshy about the notion of parallel vectors in curved space. zero lattitude).2. Then we say that w(s) is a parallel vector ﬁeld if its covariant derivative vanishes everywhere: Dw =0 ds ∀s ∈ I The uniquness theorem of diﬀerential equations tells us that if a vector ﬁeld has a given value at some initial point (w(0) = w0 ). but move the vector only 90 degrees around the globe.1. Now proceed to move both vectors up the longitude line until they meet at the north pole. the two vectors would be perpendicular at the pole4 ! Again. the geodesic satisﬁes the diﬀerential equation: Dx ≡0 ds Using Equation (4. we can ﬁnally move onto one of the most useful objects in geometry: Deﬁnition: A geodesic on a manifold M is a parametrized curve x(s) : I → M whose tangent vectors form a parallel vector ﬁeld. Consider two parallel vectors tangent to the Earth at the equator and the prime meridian (zero longitude. In other words. I will talk a little about that. but antiparallel. Geometers knew this. then there is a unique vector w that is parallel to w0 that comes from parallel transporting w0 some distance along the contour α(s). Notice that they are no longer parallel.4. we get a ds famous nonlinear diﬀerential equation: d2 xµ dxν dxσ + Γµ =0 νσ ds ds ds (4.

4. There are a number of symmetries6 : 1. Rβνρσ = +Rρσβν µ µ µ 4.2. there would be d4 numbers. a particle moves along a geodesic.σ νσ αρ νρ ασ (4. there are: 5 6 I’m not going to prove this: you do it! µ R.3. If we wished to include other inertial forces (such as an electric ﬁeld. let’s see what else the covariant derivative can give us. Gravity manifests itself as the curvature of space.9). for example) then in can be more complicated.νρσ.ρ νρ.Newton’s laws of motion! General relativity suggests that in the absense of inertial forces.3.ρ + R.νστ .ρ.9).νρσ is a tensor.σνρ = 0 µ µ µ 5.10). So we can think of the Γ terms in Equation (4. but in curved space (when gravity is present. with all of these symmetries.2. i. In ﬂat space (no forces). R.σ − Aν. One question that analysts like to ask is “Are mixed partials equal?” It turns out that understanding the answer to this question gives us a rigorous way to quantify the curvature of the manifold.νρσ ≡ Γµ − Γµ + Γα Γµ − Γα Γµ νσ. so its equation of motion is given by Equation (4. These symmetries have the eﬀect of putting some powerful constraits on the number of independent quantities in the curvature tensor. R. Without any symmetries. where d is the dimension of our manifold. Rβνρσ = −Rνβρσ 2. the last item is known as the Bianchi Identity. called the Riemann-Christoﬀel Curvature Tensor. that’s a straight line. Rβνρσ = −Rβνσρ 3.νρσ where µ R.e.2.ντ ρ.10) µ R.τ + R. we would modify the right hand side of Equation (4.νρσ + R. a geodesic is the is the path of shortest distance: this can be proved using variational calculus techniques. then that other object must be a tensor. We know it’s a tensor from the quotient rule of tensors: if a tensor equals another tensor contracted with another object. Consider the commutator of two covariant derivatives acting on a vector5 : µ Aν.ρσν + R.The Riemann Tensor Now that we have worked out some details on parallel transport and geodesics.3 Curvature. However.νρσ = g µβ Rβνρσ 37 . So this equation tells us that particles like to travel along the path the minimizes the distance travelled.: the connection coeﬃcients.σ = 0 The ﬁrst four symmetries are manifest from Equation (4.9) as source terms for the gravitational ﬁeld in Newton’s equations of motion.σ.ρ = Aµ R. for instance). Additionally.

It is. the resulting ﬁeld is gauge invariant. there are only a few nonzero contractions. Does this look like anything we know? What about the vector potential of electrodynamics? That has an inhomogeneous term when you perform a gauge transformation (A → A + Λ).3.two dimensional manifolds. These equations are very closely related to the topic of “Gauge Invariance” – we will talk about that in the next chapter.13) R is sometimes called the scalar curvature. Finally. Any other nonzero contraction would give the same object up to a sign. There is a similar equation in Maxwell’s electrodynamics: Fµν. but thanks to the symmetries.12) It is called the Ricci tensor.1. and yet when you take its curl to get B. and up to sign. and is a tensor.12) to get the Ricci scalar: µ R ≡ Rµ (4.d2 (d2 − 1) (4. the “gauge transformation” is actually a coordinate transformation (technically. It is a measure of the intrisic curvature of the manifold. We can try to construct lower rank tensors from this monster in Equation (4.11) 12 independent quantities.ν = 0 where Fµν is the ﬁeld strength tensor (see Jackson or Griﬃths). so I’ll stop now. in a sense.10). The Bianchi Identity is a diﬀerent sort of relation in the sense that it involves derivatives. they all yield the same thing. a “diﬀeomorphism”).3. Notice that it has derivatives of the connection coeﬃcients. You should prove this for yourself. Einstein used this tensor to describe gravity. I wanted to talk a little about a very special case of mathematics .this says that the connection coeﬃcients transform under coordinate transformations with an inhomogeneous term.2) . You can see the parallel: you might think of the connection coeﬃcients as “gravity vector potentials”.10). this is getting us too far aﬁeld.3. Finally we can take the trace of Equation (4. I would like to mention an interesting curiosity. However. a strange kind of curl of the connection coeﬃcients. 4. and the curvature tensor as the “gravitational ﬁeld”. Now look at Equation (4. We deﬁne the rank-2 tensor: τ Rµν ≡ R. In one dimension. Here.3. Notice that the rank of the Riemann-Christoﬀel tensor is always 4.3.3. Then we have something like a gauge theory of gravity! Unfortunately. and the Riemann tensor transforms “covariantly” (which is a fancy word for “properly”) under these transformations.ρ + Fνρ. It is a symmetric tensor. things are more complicated than that. the Riemann tensor 38 . despite the dimension of the manifold.µντ (4.1 Special Case: d = 2 Before leaving diﬀerential geometry behind. Take a good look at Equation (4.µ + Fρµ.3. even though the connection coeﬃcients do not.

Gauss proved that it is isometrically invariant.11)). The the curvature of the curve is given by: k(s) ≡ |α (s)| If n is the normal vector to α. 7 There is no diﬀerence to which one is which in the literature. To understand how this works.14) (4. It is very useful in many applications. one dimensional diﬀerential geometry is nothing more than vector calculus! In three or more dimensions. Now consider all the curves in the surface.3. One can prove the relation: K∝R and hense make the connection between two dimensional geometry and higher dimensional geometry. 39 .7 These two quantities are properties of the surface. and N is the normal to the surface. Finally. Here. we can talk about curvature in a very quantitative matter.15) K = k1 k2 H is called the mean curvature. and the minimum normal curvature k2 .is not well-deﬁned. the Riemann tensor is complicated. and we do not need any of this formalism to discuss geometry. However. Deﬁnition: Consider a curve parametrized by α(s) living on a surface M . A two dimensional Riemannian manifold is often simply referred to as a “surface”. K is called the Gaussian curvature. not of the curves. and that is the Ricci scalar curvature. we need to state a few deﬁnitions.3.3. then we deﬁne the angle: cos θα(s) ≡ n(s). we can deﬁne two new objects: H= 1 (k1 + k2 ) 2 (4. But in the special case of two dimensions. where an “isometrically invariant” quantity only depends on the surface itself. the Riemann tensor has only one component (see Equation (4. in fact. and all of the normal curvatures. which means that it describes the surface in a fundamental way. not on how the surface is embedded in 3-space. and it is much more diﬃcult to say anything quantitative and general about your manifold. and is of fundamental importance. especially in engineering. it is not as useful for a theoretical tool because it is not isometrically invariant. Let us call the maximum normal curvature k1 . N(s) Then kn (s) ≡ k(s) cos θα(s) is called the normal curvature along the path α(s).

which is why the Riemann-Levi-Civita formalism is so useful in higher dimensions. and one of the key reasons why diﬀerential geometry in two dimensions is so well understood: Theorem: For any compact. but this is the most general statement of the theorem.2 Higher Dimensions In more than two dimensions. we can state one of the most important theorems of diﬀerential geometry. You then count the total number of faces (F ). In order to deﬁne the analog of Gaussian curvature.3. then they are homeomorphic. This theorem is called the GaussBonnet Theorem. there is no known universal topological invariant in more than two dimensions! In other words. Thus the Euler characteristic provides a very powerful techinque for classifying diﬀerent types of surfaces. this process is called triangulation.17) χ(M ) is well-deﬁned in the sense that it is independent of how you triangulate your surface. and very powerful! 4. “charge”! This kind of connection between ﬁeld theory (classical and quantum) and diﬀerential geometry is very common. manifolds with the same Euler characteristic 40 . you divide your surface into a bunch of triangles. We call this quantity a “topological invariant” because if you have two surfaces that have the same value of χ.16). First of all. edges (E) and vertices (V ) and apply the following formula: χ(M ) = V − E + F (4.3.Armed with this new tool. and has several versions for diﬀerent cases of M .3.: without tearing or gluing. the Gaussian curvature is no longer well-deﬁned.e. (i. This is not a trivial point. It can be proved that all of the sectional curvatures are contained in the Riemann tensor. you can always deform one of the surfaces into another in a continuous manner. As a very quick application of Equation (4. oriented Riemannian 2-manifold M: KdΩ = 2πχ(M ) M (4. things become considerably more complicated.3. Secondly. These curvatures are know as sectional curvatures. i. one considers all the independent two dimensional “cross sections” of the manifold and calculates each of their Gaussian curvatures. in other words. and more importantly. but it is not obvious how to do this. you might use some information from the Riemann tensor. The Euler characteristic χ(M ) is of fundamental importance in topology and geometry. consider Gauss’ Law from electrodynamics. even if it is overkill in two dimensions. there is no such thing as a Gauss-Bonnet theorem for manifolds of more than two dimensions.16) where the integral is a surface integral over the entire manifold and χ(M ) is a topologically invariant quantity called the Euler characteristic. To calculate it for a surface. If you interpret K as the “ﬁeld strength” we see that the surface integral must be some topological invariant.e. etc. integrating over only part of M . Instead. Unfortunately.

Let me just say that this is one of the most active areas in modern mathematics research. so I will stop now. This is way beyond the scope of this review. If you are at all interested.need no longer be topologically equivalent in higher dimensions. and we know of no single quantity that does classify manifolds in this way. I strongly encourage you to check it out! 41 .

If mathematicians had never discovered diﬀerential forms.Chapter 5 Diﬀerential Forms Diﬀerential forms are. or any gauge ﬁeld for that matter. in curved space (i. without a doubt. This is a big challenge in general mathematical research. If that is the case. The second point I want to make is that no matter how much you like or dislike forms. let’s jump in. The idea behind this beautiful notation is that terms cancel early. I have included it anyway.. We have to construct a deﬁnition that both allows for all the cases that we know should exist. remember that they are nothing more than notation. however. Mathematicians have another deﬁnition that stems from the subject of linear algebra. If we had never heard of diﬀerential forms. I should mention two quick things. Diﬀerential forms are extremely useful when. This is why I thought they would do well in this review. and also forbids any pathological examples. it takes a long time before they become natural and second nature to the student.1 The Mathematician’s Deﬁnition of Vector One thing you have to be very careful of in mathematics is deﬁnitions. Whether you realize it or not. it is likely that many of you will never need to know any of this in your life. we are going to completely forget about the physical deﬁnition (no more transformations) and look on vectors from the mathematician’s point of view. like anything good. Nevertheless. making the calculations nasty. many calculations would become long and tedious. To this end we must make a new deﬁnition. one would have to carry out terms that would ultimately vanish.e. 5. at best! But let us not forget that there is nothing truly “fundamental” about forms – they are a convienience.: with gravity). we would not be at a serious loss. For the moment. nothing more. you have been using diﬀerential forms for a long time now (gradient). We have been talking a lot about “vectors”. you wish to discuss electromagnetism.. First of all. But this is unfortunately not the only mathematical deﬁnition of a vector. Before diving into forms. and I ﬁnd forms to be very useful. since writing these notes has helped me to understand some of the more subtle details of the mathematics. where we have been using the physicist’s deﬁnition involving transformation properties. one of the most beautiful inventions in all of mathematics! Their elegance and simplicity are without bound. then you have no reason to read this chapter (unless you’re curious). the key deﬁnition of linear algebra: 42 . for example. That being said.

Another example of a vector space is the set of all eigenfunctions in quantum mechanics.1. linear in each of its arguments) transformations that send k vectors and l covectors to a number: T : V × · · · × V × V ∗ × · · · × V)∗ → R k times l times l The set of all such transformations is called Tk and the elements of this set are called tensors l 0 1 of rank . It is so important in linear algebra. and denote it as V ∗ . you can prove to yourself very easily that the set of n-dimensional real vectors (Rn ) is a vector space. real-valued functions (C n (R)) is also a vector space. The vectors (functions) in the space are called covectors. or 1-forms (hense the 1 superscript). in general. Notice that this deﬁnition of vectors has absolutely nothing to do with the transformation properties that we are used to dealing with. and there is a scalar identity element (1v = v). as we deﬁned them in Chapter 1. φ linear} (5. where the “vectors” are the functions themselves. It is also true that this space also forms a vector space itself. the ﬁeld of scalars will always be R. associative. Recall that a linear transformation is a function that operates on V and preserves addition and scalar multiplication: T (αx + βy) = αT (x) + βT (y) Then let us consider the set of all linear transformations that go from V to R: Λ1 (V) = {φ| φ : V → R. called vectors. 2. For example. Armed with this knowledge we can go back and redeﬁne what we mean by “tensor”: consider the set of all multilinear (that is. We call it the dual space of V. is a set of objects. true that for any vector space. the set of all n-diﬀerentiable. This set is very intriguing .every element in this set has the property that it acts on any vector in the original vector space and returns a number. and two operations called vector addition and scalar multiplication that has the following properties: 1. where the eigenfunctions are the vectors. there is a very important fact about vector spaces. this special set exists. But so are much more abstract quantities. T0 = V ∗ . V is closed.+) is an abelian group.1) I’ll explain the notation in a moment. This is why it’s called “abstract” algebra! Sticking to abstract linear algebra for a minute. but in general it could be something else. called scalars. Also notice that this deﬁnition k of tensor has nothing to do with transformations. For 43 . it is given a name. Notice immediately that T1 = V. denoted by V. and distributive under scalar multiplication. like C or some other ﬁeld from abstract algebra. In our case. It is.Deﬁnition: A Vector Space (sometimes called a Linear Space). and a ﬁeld. (V. Nevertheless.

1. Consider the vector space Rn and the set of multilinear transformations that take k vectors to a number: φ : Rn × Rn × · · · × Rn → R k times with the additional property that they are “alternating”. .3) (5. Now let us deﬁne a 1-form dxi with the property: i dxi (ej ) ≡ dxi . The set of all k-forms is denoted Λk (Rn ). Before going any further.. j. x. x.6) Notice that if k > n the space is trivial – it only has the zero element.1. Wedge products are discussed below. What is the standard basis for Λk (Rn )? Let us start by considering the dual space to Rn (k = 1). Note that this set is also a vector space. ej = δj (5. perhaps it would be wise to go over this and make sure you understand it. . en }. dx2 ∧ dx3 .e.4) Then the standard basis of Λk (Rn ) is the set of all k-forms: dxi ∧ dxj ∧ · · · dxl . but for now all you need to know of them is that they are antisymmetric products. acording to this deﬁnition. y. ... That’s enough abstraction for one day. We know the standard basis for Rn . the Christoﬀel symbols of diﬀerential geometry are (1.example. . dx1 ∧ dx3 . · · · . .1. hence the notation.) = −φ(. dx1 ∧ dx3 .1. Can you write down the standard basis for Λ2 (R3 )? How about Λ2 (R4 )? Here are the answers: Λ2 (R3 ) : {dx1 ∧ dx2 . you should be able to ﬁgure out what the dimension of Λ (Rn ) is. i = j = · · · = l. dx3 ∧ dx4 } ⇒ D = 6 By looking at these examples. In linear algebra. y. dx2 ∧ dx3 } ⇒ D = 3 Λ2 (R4 ) : {dx1 ∧ dx2 . dx2 ∧ dx4 . dx1 ∧ dx4 . 2) tensors. l ≤ n (5. in general (k < n): k dim Λk (Rn ) = n k = n! k!(n − k)! (5. i. B 1−forms. . . . Let us consider a concrete example. so A ∧ B = −B ∧ A for A.: φ(. . . then φ is called a k-form.2) When φ takes k vectors in Rn . . i. we know that every vector space has a basis. . . and the dimension of the vector space is equal to the number of elements in a basis.5) where each “wedge product” has k “dx”s1 .1. 1 44 . The dual space is an example of this. let’s denote it by {e1 .) (5. So you must be sure you know which deﬁnition people are using.

let’s do some examples.2. Notice that this is diﬀerent from the ordinary tensor product: ω ⊗ φ(v. called the wedge product (∧). As a “pseudo-deﬁnition”.1 Form Operations Wedge Product In order to deﬁne the standard basis for Λk (Rn ). Associative: (dx1 ∧ dx2 ) ∧ dx3 = dx1 ∧ (dx2 ∧ dx3 ) = dx1 ∧ dx2 ∧ dx3 3. φ ∈ Λ1 (R3 ). We will prove this explicitly soon. φ ∈ Λ1 (Rn ).: 1-forms in 3-space.10) Looking at this as an antisymmetric product on forms.2 5.2. then let us deﬁne an element in Λ2 (Rn ) by the following: ω ∧ φ(v. Distributive: (dx1 + dx2 ) ∧ dx3 = dx1 ∧ dx2 + dx1 ∧ dx3 2.8) (5. we can generalize to the wedge product of any forms: Deﬁnition: Let ω ∈ Λk (Rn ).2. Skew-commutative: ω ∧ φ = (−1)kl φ ∧ ω Notice that for k + l > n the space is trivial. i. so φ ∧ ω = 0. This product is 1. Writing it out explicitly. Consider ω.2. w) = ω(v)φ(w) − ω(w)φ(v) Notice that there are two immediate corollaries: ω ∧ φ = −φ ∧ ω ω∧ω = 0 ∀ω ∈ Λk (Rn ) (5.9) So the wedge product deﬁnes a noncommutative product of forms. we needed to introduce a new operation.7) (5.e.5. This is exactly right: the cross-product of two vectors is the same thing as a wedge product of two 1-forms. you might be reminded of the cross product. Let’s look more carefully at this new form of multiplication. Before going any further. w) ≡ ω(v)φ(w) (5. consider two 1-forms ω. Then we deﬁne ω ∧ φ ∈ Λk+l (Rn ) as the sum of all the antisymmetric combinations of wedge products.2. we have: ω = a1 dx1 + a2 dx2 + a3 dx3 φ = b1 dx1 + b2 dx2 + b3 dx3 ω ∧ φ = (a1 dx1 + a2 dx2 + a3 dx3 ) ∧ (b1 dx1 + b2 dx2 + b3 dx3 ) = (a1 b2 − a2 b1 )dx1 ∧ dx2 + (a1 b3 − a3 b1 )dx1 ∧ dx3 + (a2 b3 − a3 b2 )dx2 ∧ dx3 45 . Now that we have talked about the speciﬁc case of 1-forms. φ ∈ Λl (Rn ).

2. then we have shown that the triple wedge product of 1-forms in R3 is just: a1 a2 a3 ω ∧ φ ∧ α = det b1 b2 b3 dx1 ∧ dx2 ∧ dx3 c1 c2 c3 (5. consider a 2-form acting on a vector: dx ∧ dy(v) = dx(v)dy − dy(v)dx = vx dy − vy dx The minus sign comes from ﬂipping the dx and the dy. 46 . In general. Proof: The proof of this theorem is straightforward: a simple k-form is just a basis element.13) Notice that if we identify β = ω ∧ φ from above. any element of Λk (Rn ) can be written as a linear combination of these basis elements. let us conclude this section with a deﬁnition and a theorem: Deﬁnition: A simple k-form is a “monomial” k-form. As a ﬁnal example. the wedge product of n 1-forms in Rn is always such a determinant. What about taking the wedge product of two 1-forms in R4 : ξ = a1 dx1 + a2 dx2 + a3 dx3 + a4 dx4 χ = b1 dx1 + b2 dx2 + b3 dx3 + b4 dx4 ξ ∧ χ = (a1 b2 − a2 b1 )dx1 ∧ dx2 + (a1 b3 − a3 b1 )dx1 ∧ dx3 + (a1 b4 − a4 b1 )dx1 ∧ dx4 +(a2 b3 − a3 b2 )dx2 ∧ dx3 + (a2 b4 − a4 b2 )dx2 ∧ dx4 + (a3 b4 − a4 b3 )dx3 ∧ dx4 Note that this does not look like a “cross product”. if you took another wedge product with another 1-form.2. Then since we are dealing with a vector space. Also notice that this looks very similar to a cross product. possibly multiplied by a number. there is no addition.2. Before moving on. taking the wedge product of (n − 1) 1-forms in Rn will give you something analogous to a cross-product. We’ll get back to this later. QED. and then you must permute the form so that all the 1-forms act on the vector.Notice how I’ve been careful to keep track of minus signs.14) In fact. and that would give you something like a four-dimensional cross product. we can ask how a general wedge product of forms acts on vectors? The answer is that the form farthest to the left acts on the vector. that is.12) (5. what if you took the wedge product of a 1-form and a 2-form in R3 : α = c1 dx1 + c2 dx2 + c3 dx3 β = L12 dx1 ∧ dx2 + L13 dx1 ∧ dx3 + L23 dx2 ∧ dx3 α ∧ β = (c1 L23 − c2 L13 + c3 L12 )dx1 ∧ dx2 ∧ dx3 (5. Theorem: All k-forms can be written as a linear combination of simple k-forms. As an example. it is a six-dimensional object in R4 . as stated earlier. This is a very important fact that will come up again and again. you would get a 3-form in R4 (which is four-dimensional).11) (5. However.2. Next.

In R2 : ∗dx1 = dx2 ∗dx2 = −dx1 47 . let me state a pretty obvious result: ˜ Theorem: v = v. Let’s see some examples: v = v 1 e1 + v 2 e2 + v 3 e3 ∈ R3 v = v1 dx1 + v2 dx2 + v3 dx3 ∈ Λ1 (R3 ) ˜ Notice that the tilde changed subscript indices to superscript. and vice versa. 5.3 Hodge Star The next tool we introduce for diﬀerential forms is the Hodge Star (∗). by using wedge products as opposed to using 1 tensor products . Finally. Let’s do another less trivial example: 0 L12 L13 0 L23 ∈ R3 (Rank-2) L = −L12 −L13 −L23 0 ˜ L = L12 dx1 ∧ dx2 + L13 dx1 ∧ dx3 + L23 dx2 ∧ dx3 ∈ Λ2 (R3 ) 1 Lij dxi ∧ dxj = 2 where in the last line I’m using Einstein notation for the indices.2 Tilde The tilde operator is the operator that translates us from vectors to forms.15) As for many other things in linear algebra. This is because the form notation has antisymmetry built into it. the concept of a form is not very useful. it is suﬃcient to consider how the Hodge star acts on the basis elements. We are beginning to see why form notation can be so useful! Incidentally. Let’s look at some examples. For symmetric tensors. Notice how much simpler it is to write a 2-form as opposed to writing a rank-2 tensor. notice that I am only considering antisymmetric tensors when I talk about forms. This is important when we have to keep track of what is covariant and contravariant.2. ˜ Proof: Left to the reader. In general.this is why we need the factor of 2 in the last line.2. including the minus sign. The Hodge Star converts forms to their so-called dual form: ∗ : Λk (Rn ) → Λn−k (Rn ) (5.2. Also note that both Lij and Lji are taken care of.5. it is an operation: (rank-k tensors)→(k-forms).

Given three vectors in R3 . we can reinterpret Equation (5...17) From the above.µk dxµ1 ∧ · · · ∧ dxµk ⇒ ∗Φ = or ∗φµ1 ...14)..In R3 : ∗dx1 = dx2 ∧ dx3 ∗dx2 = dx3 ∧ dx1 ∗dx3 = dx1 ∧ dx2 In R4 : ∗dx1 ∗dx2 ∗dx3 ∗dx4 = = = = +dx2 ∧ dx3 ∧ dx4 −dx3 ∧ dx4 ∧ dx1 +dx4 ∧ dx1 ∧ dx2 −dx1 ∧ dx2 ∧ dx3 ∗(dx1 ∧ dx2 ) ∗(dx1 ∧ dx3 ) ∗(dx1 ∧ dx4 ) ∗(dx2 ∧ dx3 ) ∗(dx2 ∧ dx4 ) ∗(dx3 ∧ dx4 ) = = = = = = +dx3 ∧ dx4 −dx2 ∧ dx4 +dx2 ∧ dx3 +dx1 ∧ dx4 −dx1 ∧ dx3 +dx1 ∧ dx2 ∗(dx1 ∧ dx2 ∧ dx3 ) ∗(dx1 ∧ dx2 ∧ dx4 ) ∗(dx1 ∧ dx3 ∧ dx4 ) ∗(dx2 ∧ dx3 ∧ dx4 ) = = = = +dx4 −dx3 +dx2 −dx1 ∗(dx1 ∧ dx2 ) = dx3 ∗(dx2 ∧ dx3 ) = dx1 ∗(dx3 ∧ dx1 ) = dx2 We can see a pattern here if we remember how the Levi-Civita simbol works.. we can now present a theorem: Theorem: For any k-form ω ∈ Λk (Rn ). we get the general result: n dx ∧ ∗dx = j=1 n n i i dxj ≡ dn x ∈ Λn (Rn ) (5.. take their corresponding forms with the tilde and then take the wedge product of the result. it’s single basis element is the wedge product of all the dxj .18) Λ (R ) is one-dimensional.2.νk dxνk+1 ν1 .µn−k φ 1 (n − k)! ν1 ... ∗ ∗ ω = (−1)n−1 ω..µn−k = 1 (n − k)! ν1 . multiplied by the volume form of R3 .. For any form in Λk (Rn ): Φ = φµ1 .2.2..16) and the properties of the Levi-Civita tensor.νk µ1 .2..16) (5. The answer you get is the determinant of the matrix formed by these three vectors. What is this determinant? It is simply the Jacobian of the transformation from the standard basis to the 48 .2. Notice that the double-Hodge star does not depend on k at all! Note that in all cases.. With this in mind.νn φ ∧ · · · ∧ dxνn (5. The proof follows from using Equation (5.νk ν1 . This product is called the volume form of Rn .

3. it is simply what we had before: a d g det b e h c f i Hopefully you get the hang of it by now. we simply chose whatever elements of our vectors get picked out by the simple form. e . For this case. so we have all of this machinery. Construct the matrix by picking out the “x” and “y” components of the vectors. and take the determinant: a d det b e 2. Answers: 1. diﬀerential forms automatically give us the vector calculus result: dn x = J dn x (5. Examples: evalutate the following forms: 1.2. but now pick out the “x” and “z” components: det a d c f 3. To construct the matrix. Whenever we need to evaluate a k-form.2. we take the determinant of the matrix. e dx ∧ dy c f a d dx ∧ dz b .4 Evaluating k-Forms We have done a lot of deﬁning. This is similar. Finally. e c f a d g b . but not too much calculating. we construct a matrix according to the prescription the form gives us.basis of the three vectors you chose! In other words. a d b .19) 5. h dx ∧ dy ∧ dz c f i 2. You might be asking at this point: OK. 49 .now what? The appearance of determinants should give you a clue.

and tilde them so they are now 1-forms: {˜1 .1 Exterior Derivative Now that we have a notion of “function”. . let’s see what we can do in the way of calculus. So.2. if we have a vector ﬁeld in R3 : v(x) = v 1 (x)e1 + v 2 (x)e2 + v 3 (x)e3 v (x) = v1 (x)dx1 + v2 (x)dx2 + v3 (x)dx3 ˜ 5. Let’s see this explicitly: In Rn .22) (5. . . take n − 1 vectors.3 Exterior Calculus A k-form ﬁeld is a k-form whose coeﬃcients depend on the coordinates.5 Generalized Cross Product Eariler. Finally tilde the whole thing to give us an n-vector. I mentioned that we can deﬁne a generalized cross product in Rn by wedging together (n − 1) 1-forms.2.3. .5.21) This is just the gradient of a function! How about the curl? 50 . and take the Hodge star of the product to give us a 1-form. and we have deﬁned a general product of vectors: v1 × · · · × vn−1 ≡ [∗(˜1 ∧ · · · ∧ vn−1 )] ∈ Rn v ˜ (5. vn−1 }. Now v ˜ wedge these forms together to give you an (n − 1)-form. f (x) ∈ Λ0 (R3 ) → df (x) = ∂f ∂x1 dx1 + ∂f ∂x2 dx2 + ∂f ∂x3 dx3 ∈ Λ1 (R3 ) (5.20) 5.3.3. We can deﬁne an operator d : Λk (Rn ) → Λk+1 (Rn ) with the all-important property: d2 = dd ≡ 0 By construction we will let d act on forms in the following way: d(a(x)dx) = da(x) ∧ dx Let’s look at some examples. This is exactly analogous to a vector ﬁeld in vector calculus.

ej >= δj and < ei . 51 . Cartesian coorindates (x1 = x. Some of them I have shown.ω = a1 dx1 + a2 dx2 + a3 dx3 → ∂a1 ∂a1 ∂a1 dx1 + dx2 + dx3 ∧ dx1 dω = ∂x1 ∂x2 ∂x3 ∂a2 ∂a2 ∂a2 + dx1 + dx2 + dx3 ∧ dx2 1 2 3 ∂x ∂x ∂x ∂a3 ∂a3 ∂a3 dx1 + dx2 + dx3 ∧ dx3 + 1 2 3 ∂x ∂x ∂x ∂a2 ∂a3 ∂a1 ∂a1 = − 2 dx1 ∧ dx2 + − 3 dx1 ∧ dx3 + 1 1 ∂x ∂x ∂x ∂x ∂a3 ∂a2 − 3 2 ∂x ∂x dx2 ∧ dx3 ∂a Notice that terms like ∂x1 dx1 ∧ dx1 vanish immediately because of the wedge product.25) (5. The key point to remember is that each of these dxi is an orthonormal coordinate2 . sooner or later we will want to get our answers back into a coordinate frame. so we can easily plug into our formula: 2 i That means that < dxi . Let’s consider a simple example by looking at the two-dimensional Laplacian. ej >= δij . f is any (diﬀerentiable) function. 1 Again we see how the beauty of diﬀerential forms notation pays oﬀ! 5.24) (5. we know that 2 = ∗d ∗ d. and v is a vector ﬁeld. Can you prove the rest of them? In all cases.3. From above.3.26) Notice that even though dd = 0. df = ∗d˜ = v ∗d ∗ v = ˜ ∗d ∗ df = f ×v ·v 2 f (in R ) 3 (5.3. However.2 Formulas from Vector Calculus At this point we can immediately write down some expressions. Therefore we must make sure to translate them into orthonormal coordinates in our coordinate frame. x2 = y) are orthonormal.3. otherwise. d ∗ d = 0 in general! 5.3. the Hodge star will not work. This could be tricky.3.3 Orthonormal Coordinates Diﬀerential forms allow us to do calculations without ever refering to a coordinate system.23) (5.

our formulas work for any coordinate system! This is a vital reason why diﬀerential forms are so useful when working with general relativity.possibly the most important theorem diﬀerential forms will provide us with.4 Integration We have discussed how to diﬀerentiate k-forms. So. and then I will present a formula for integrating them. we will be able to derive Stokes’ Theorem. but the orthonormal polar coordinate basis ˆ vectors are e1 = ˆ. where you want to prove things about the physics independent of your reference frame or coordinate basis. 5. how about integrating them? First I will discuss how to formally evaluate forms. e2 = θ . and the dual basis is (dr. 52 . With this deﬁnition in mind. But what if I wanted to do my work in polar coordinates? Then my coordinate system is (x1 = r. x2 = θ). rdθ) (see previous footnote). So: r r df = ∗df = d ∗ df = = ∗d ∗ df = ∂f ∂f ∂f 1 ∂f r dr + dθ = dr + (rdθ) ∂r ∂θ r ∂r r ∂θ ∂f 1 ∂f (rdθ) − dr ∂r r ∂θ ∂ ∂f ∂ 1 ∂f r dr ∧ dθ − dθ ∧ dr ∂r ∂r ∂θ r ∂θ 1 ∂ ∂f 1 ∂ 2f r + 2 2 dr ∧ (rdθ) r ∂r ∂r r ∂θ ∂f 1 ∂ 2f 1 ∂ r + 2 2 = 2f Wow! r ∂r ∂r r ∂θ r r r where I have introduced ( r ) in the ﬁrst and third line in order to keep the coordinates orthonormal before using the Hodge star. as long as we are careful to use orthonormal coordinates when using the Hodge star.df = ∗df = d ∗ df = = ∗d ∗ df = ∂f ∂f dx + dy ∂x ∂y ∂f ∂f dy − dx ∂x ∂y ∂ 2f ∂ 2f dx ∧ dy − 2 dy ∧ dx ∂x2 ∂y 2 2 ∂ f ∂ f + 2 dx ∧ dy 2 ∂x ∂y 2 ∂ f ∂ 2f + 2 ∂x2 ∂y All is well with the world.

they are just a way to keep track of what vectors are what. . we can talk about integrating them. Let’s see how that works: Let’s evaluate the 2-form ﬁeld φ = cos(xz)dx ∧ dy ∈ Λ2 (R3 ).2 Integrating k-form Fields Now that we have a formal way to evaluate k-form ﬁelds. 2 = cos(1 · π) det C 1 3 C C A 1 2 0 2 = −2 2 1 1 0 . k).. We’ll evaluate it on the oriented 2-parallelogram spanned by the 3-vectors v1 = (1. Then we deﬁne the integral of φ over γ(A) as: φ≡ γ(A) A 0 φ Pγ(u) ∂γ ∂x1 . π) and ( 1 . 3).4. .27) where dk u is the k-volume form.4.. 2.5. . A ⊂ Rk be a pavable set.4.1 Evaluating k-form Fields First. denoted by ±Px {vi }. 2. π): 2 1. 53 . and γ : A → Rn be a (vector-valued) diﬀerentiable map. is a 0 k-parallelogram spanned by the k n-vectors {vi } with basepoint x. Px {vi } is antisymmetric under vi exchange. (i = 1. a deﬁnition: 0 Deﬁnition: An Oriented k-Parallelogram. 0 φ P0 1 B 2 B 2 B @ π 2 1 1 0 . 0 φ P0 B 1 B 2 B @ π 2. 2. We’ll evaluate it at two points: (1. u ∂γ ∂xk dk u u (5. . Oriented k-Parallelograms allow you to think geometrically about evaluating k-forms. 2 = cos( 1 · π) det 2 C 1 3 C C A 1 2 0 2 =0 5. We use them explicitly to evaluate k-form ﬁelds at a point.. 0. We deﬁne what the integral of a k-form ﬁeld is: Deﬁnition: Let φ ∈ Λk (Rn ) be a k-form ﬁeld. 1) and v2 = (2.. u ∂γ ∂x2 . but as far as paperwork goes.

Consider φ ∈ Λ1 (R2 ) and integrate over the map: γ(u) = R cos u R sin u over the region A = [0. the more curious students can go prove it. (α > 0).Like in vector calculus.α] α −R sin u R cos u du 0 (xdy − ydx) P0 @ R cos u R sin u 1 A = 0 α du[(R cos u)(R cos u) − (R sin u)(−R sin u)] duR2 [cos2 u + sin2 u] 0 = = αR2 2. Consider φ = dx ∧ dy + ydx ∧ dz ∈ Λ2 (R3 ). 0 ≤ t ≤ 1}: 1 1 0 1 1 1 2s . 0 dsdt C 0 2t C C A φ = γ(C) 0 0 (dx ∧ dy + ydx ∧ dz) P0 B s+t B s2 B @ t2 det 1 1 2s 0 + s2 det 1 1 0 = 0 1 1 1 1 0 2t dsdt = 0 1 0 (−2s + 2s2 t)dsdt ds(−2s + s2 ) = − 0 = 2 3 54 . Let φ = xdy − ydx: φ = γ(A) [0. Let’s do two examples to get the hang of it: 1. we can deﬁne the integral formally in terms of Riemann sums yuck! Let me just say that this is an equivalent deﬁnition. α]. and the map: s t s+t = s2 t2 γ over the region C = { s t |0 ≤ s ≤ 1.

These are technical issues that I have purposely left out. However. I am using vector notation (even though technically I am supposed to be working with forms) and for the case of R3 .28) What is this theorem saying? My old calculus professor used to call it the “Jumping-d theorem”. You have used this theorem many times before. it suﬃces that the area you are integrating over has to be appropriately “nice”. “Acceptable boundary” basically means that we can. “straighten” the boundary (no fractals). do not be overly concerned with all the hypotheses. dy. all of the theorems of vector calculus in three dimensions are reproduced as speciﬁc cases of this generalized Stokes Theorem. for the case of R3 : k 0 1 2 dφ f · dx ( × f ) · dS ( · f )d3 x X Path from a to b Surface (Ω) Volume (V) b a X dφ = ∂X φ Theorem Name FTOC Stokes Theorem Gauss Theorem f · dx = f (b) − f (a) ( × f ) · dS = ∂Ω f · dx Ω ( · f )d3x = ∂V f · dS V Here. Give the boundary of X (denoted by ∂X) the proper orientation. 55 .5. piece-with-boundary of a (k+1)-dimensional oriented manifold M⊂ Rn . but they are important if you want to be consistent.Stokes’ Theorem. in a consistent and smooth way. and consider a k-form ﬁeld φ ∈ Λk (Rn ) deﬁned on a neighborhood of X. dz) dS = (dy ∧ dz. For the sake of this introduction. I should mention one important point that I have gone out of my way to avoid: the issue of orientation. In words. Let’s rewrite it in more familiar notation. dz ∧ dx.4. I’ve taken advantage of the following notations: dx = (dx. Then: φ= ∂X X dφ (5. let me just say that all the manifolds we consider must be “orientable with acceptable boundary”. since the “d” jumps from the manifold to the form. Theorem: Let X be a compact.4. dx ∧ dy) d3 x = dx ∧ dy ∧ dz As you can see. we are not limited to three dimensions! Before leaving diﬀerential forms behind. this theorem says that the integral of a form over the boundary of a suﬃciently nice manifold is the same thing as the integral of the derivative of the form over the whole mainfold itself.3 Stokes’ Theorem Now we can move on to present one of if not the most important theorems that diﬀerential forms has to oﬀer. What does this mean? It means that the manifold must have a sense of “up” (no M¨bius o strips). I will present it correctly. in the forms notation.

Suppose we have a magnetic ﬁeld in the x direction. u ] ˙ ˆ ˆ = eBx [uz dy − uy dz] ⇒ p = eBx (uz y − uy z) This is exactly what you would have gotten if you had used your vector-form of Lorentz’s Law.30) F ≡ Fµν dxµ ∧ dxν ∈ Λ2 (M4 ) 2 Notice that the antisymmetry of Fµν is immediate in form notation.5 Forms and Electrodynamics As a ﬁnale for diﬀerential forms. with cross products.e.32) (5. I can go to any space I want. I can write the Lorentz Force Law for a particle with charge e moving with a velocity vector u as: ˙ p = eF (u) ∈ Λ1 (M4 ) ˜ (5. for now.5. this is a 2-form in four dimensions. I thought it would be nice to summarize brieﬂy how one uses forms in theories such as E&M. 3 56 . and gravity can be easily included at this stage. Now using Equation (5. which means that its dual is also a 2-form. In general.5. we have an antisymmetric. Armed with all we need.5. Recall that in covariant electrodynamics. Another vector that is important in E&M is the 4-current (J).29) As we have seen. I’m sticking to ﬂat space (i.: ignoring gravity). and we can rewrite the above tensor as a diﬀerential form in Λ2 (M4 )3 : 1 (5.5. so F = Bx dy ∧ dz. Then the force on the particle is: ˙ p = eF (u) = eBx dy ∧ dz. it is always possible to use 2-forms instead of (antisymmetric) tensors. u ˜ = eBx [dy dz.30). we can write down Maxwell’s Equations: dF = 0 d ∗ F = 4π ∗ J (5.31) ˆ Let’s consider an example. We can take advantage of Stokes Theorem (just like in regular vector calculus) to put these equations in integral form: M4 is Minkowski space-time. Also.5.5. therefore. one can think of F (∗F ) as an object that represents “tubes of force” ﬂowing through space-time.33) To interpret these equations. We will again think of J as a 1-form in M4 .5. rank-2 4-tensor known as the “Field Strength” tensor: F ≡ Fµν dxµ ⊗ dxν (5. u − dz dy. it’s dual is a 3-form. But notice that this analysis is totally general.

then it’s exterior derivative was necessarely zero.3. the 4-vector potential is only deﬁned up to a 4-gradient.3. you may think of the ﬁeld strength as a curvature to some space! The ﬁnal step in studying electrodynamics is to notice that Equation (5. can it be written as the exterior derivative of some other form? Your ﬁrst impulse might be to say “yes”: surely if the form is an exterior derivative. But do be careful not to jump to any conclusions before you know the properties of the universe you are describing! Finally. That a form is an exterior derivative is suﬃcient for its exterior derivative to vanish. we can consider what happens if we let A → A + dλ. You can do similar analyses for any gauge ﬁeld.dF = Σ ∂Σ F =0 ∗F = 4π(charge) ∂Σ (5. via Equation (5.34) (5. The key point is that this property depends not on the diﬀerential forms themselves. and therefore for electricity and magnetism. it is simply connected. we see that F remains the same.5. where λ is any (reasonably well-behaved) function. your condition is satisﬁed. but not necessary.5.5.36). this is exactly what gauge invariance tells us should happen! This gives a very powerful geometric intuition for our ﬁeld-strength. you can alter your theory to include gravity. Therefore it is safe to assume a la Equation (5. and by adjusting the manifold.35) d∗F = Σ The ﬁrst of these equations says that the total ﬂux of F through a closed region of spacetime is zero.21).5. the second equation says that the total ﬂux of ∗F through a closed region of space-time is proportional to the amount of charge in that region.32) suggests something. Then by just plugging into Equation (5. So indeed.21).5.5. Notice that this description never mentions coordinate systems: once again. with the help of Equation (5. Hense. Recall that if a diﬀerential form was the exterior derivative of some other form.32) that we can write: F = dA (5. speciﬁcally. 57 . But it turns out that the question is a little more subtle than that. strings. For I can give you a form which is not an exterior derivative of another form. and yet still has vanishing exterior derivative! This is a famous case of “necessity” versus “suﬃciency” in mathematics. if you have a form whose exterior derivative is zero. or whatever you want! This is one of the biggest reasons why diﬀerential forms are so useful to physicists. but on the global properties of the space they live in! It turns out that M4 does indeed have the property of necessity. Is the converse true? Namely. electrodynamics in ﬂat space can be described by a 4-vector potential. extra dimensions. diﬀerential forms has allowed us to describe physics and geometry without ever referring to a coordinate system! Notice the similarity of this equation with the Gauss-Bonnet Theorem – in gauge theories.36) for some 1-form A.

There is of course much more material that what I present here. y) + iv(x. where r ≡ |z| and θ is called the argment of z. except to deﬁne it and give an example or two. I would like to present some key results that are used in analytic work. Complex numbers are written as z. Then a function of a complex variable has the form: f (z) = f (x + iy) = u(x. including inﬁnity. Complex analysis is concerned with the behavior of functions of complex variables. In my experience. 6. I will not mention too much about conformal mapping. If it is analytic for all ﬁnite complex numbers.1 Analytic Functions • Deﬁnition: A complex function f (z) is analytic at a point z0 if it has a Taylor expansion that converges in a neighborhood of that point: ∞ f (z) = i=0 ai (z − z0 )i If f (z) is analytic at all points inside some domain D ⊂ C then it is said to be analytic in D. v are both real valued functions of the real variables x. I am assuming that you are familiar with the basics of complex numbers (z = x + iy ∈ C) and will immediately move on. have become very important in almost all branches of applied mathematics. it is said to be entire. The techniques of complex analysis. You can also write a complex number in polar coordinates z = reiθ . If it is analytic for all complex numbers. but hopefully this small review will be useful.Chapter 6 Complex Analysis In this chapter. applications of complex numbers fall into two general catagories: Cauchy’s Integral Theorem and Conformal Mapping. This review will consider the calculus of complex variables. y. pioneered in the eighteenth and nineteenth centuries. then f (z) =constant! 58 . It turns out that these functions have a much diﬀerent behavior than functions of real variables. sometimes denoted as arg z. y) where u.

it is possible to show that u and v are constants. y) that satisﬁes the CR equations with u(x. • Let’s see how one might construct the harmonic conjugate of a real-valued harmonic function. y) = u(x. This function is certainly harmonic (prove it!). y) is a harmonic function. y). When it is true. To ﬁnd it. y). Consider the function u(x. y) = −3xy 2 + φ(x) where φ(x) is some arbitrary function of x alone. Integrating this equation with respect to y gives v(x.1) (6.1. y)+iv(x. y) that is the harmonic conjugate of u(x. D2 respectively. y). there should be a unique function v(x.1. This function is called the harmonic conjugate of u(x. it need not be a continuation of f1 (z) into that domain! 59 . y) = x3 − 3xy 2 + C is the harmonic conjugate to u(x. Integrating this gives φ(x) = x3 +C with an arbitrary constant C. but as we expect from the above note. From the CR equations. if v is the harmonic conjugate of u. Then f2 (z) is called the analytic continuation of f1 (z). it can be seen that both u and v satisfy Laplace’s equation: uxx + uyy = vxx + vyy = 0 Functions that satisfy Laplace’s equation are called harmonic functions. but the key is in the analytic. f2 (z) deﬁned on domains D1 . Exercise: show that f (x. This means that there is only one way to “extend” a function analytically out of its original domain. and therefore v(x. y) satisfy the Cauchy-Riemann (CR) Equations: ux = vy vx = −uy (6. so up to an additive constant. Note that the function v(x. y) = y 3 −3x2 y. which is certainly analytic. y) can be rewritten as f (z) = i(z 3 + C). there is a unique function v(x.2) where ux means the partial derivative of u with respect to x. y) = x2 − y 2 . y) = 2xy is the harmonic conjugate to u(x. note ﬁrst of all that by the ﬁrst CR equation. then it is not generally true that u is the harmonic conjugate of v. Notice. that if f3 (z) is another analytic continuation of f2 (z) into another domain D3 . The proof of this is not hard. in other words. all others zero. When an analytic continuation exists.• Theorem: The real and imaginary parts of an analytic function f (z) = u(x. Plugging into the second CR equation gives φ (x) = 3x2 . Notice that this relation is not reﬂexive in general. y) must be an analytic function. where D1 ∩ D2 = ∅. then it is unique. Also assume that the two functions agree for all points in the intersection. y). • Let’s do an example: f (z) = z 2 = (x2 − y 2 ) + i(2xy) This function is analytic everywhere in the ﬁnite complex plane (it is entire): its Taylor series about z = 0 is trivial: a2 = 1. y)+iv(x. • Consider two analytic functions f1 (z). vy = ux = −6xy. however. Therefore f (x. the converse is not true (prove it!). then up to an additive constant. If u(x. etc. It also satisﬁes the CR equations.

Hense. it is unique: there is no other function that is analytic and matches with our original function! • That was an obvious one. Further. Γ(x) is in fact an analytic function of x.3) Integrating by parts shows that when x is a natural number. 6. • A singularity of a function is a point where the function is not deﬁned.2 Singularities and Residues • Deﬁnition: A Laurent Expansion of a function f (z) about a point z0 is an expansion in both positive and negative powers: ∞ ∞ f (z) = n=−∞ cn (z − z0 ) = n=1 n b−n an (z − z0 )n + (z − z0 )n n=0 ∞ (6. What about a more complicated example? Consider the factorial function deﬁned on N: f (n) = n! I will now deﬁne a function on the real numbers: ∞ Γ(x) = 0 dssx−1 e−s (6. An example of this is the function sin x . therefore it is the analytic continuation of the real-valued function f (x) = x2 . The generalization to the Laurent expansion converges inside an annulus R1 < |z − z0 | < R2 .• As an example.4) • As you can see.1. It is also analytic. f (z) is analytic at z0 . a Laurent expansion is a generalization of a Taylor expansion. but it does have a well-deﬁned limit there. Indeed. If z0 is a singularity of f (z) and you can deﬁne a neighborhood (open set) about that point U (z0 ) such that f (z) is analytic in the domain U (z0 ) − {z0 } (so R1 → 0 above). Γ(x) is the (unique) analytic continuation of the factorial function to the real numbers (without the negative integers or zero). Removable singularities are trivial. Γ(x) = (x − 1)! Furthermore. we can go further: by replacing x → z (as long as z is not a negative integer or zero – why not?) this extends to all of C. and I will not talk anymore about them. This function agrees with the function f (x) = x2 in the domain of real numbers (y = 0). 60 . where all the b−n = 0. so long as x is not a negative integer or zero. There are three general types of isolated singularities: 1. This function is not deﬁned x at the origin. leaving a ﬁnite value plus terms that are O(x) and hense vanish in the limit. let’s look at our old example of f (z) = z 2 . A Taylor expansion typically exists in an open disk about the point z0 . You can see this by performing a Taylor expansion of the sine function and dividing out by x. Notice that the function is not deﬁned at the point z0 unless all of the bn = 0.2. Removable singularities are the most mild form of singularity. then z0 is called an isolated singularity. If this is the case.

6.2 Singularities and Residues

• Definition: A Laurent Expansion of a function f(z) about a point z₀ is an expansion in both positive and negative powers:

f(z) = Σ_{n=−∞}^{∞} c_n (z − z₀)^n = Σ_{n=0}^{∞} a_n (z − z₀)^n + Σ_{n=1}^{∞} b_{−n} / (z − z₀)^n    (6.2.4)

As you can see, a Laurent expansion is a generalization of a Taylor expansion. A Taylor expansion typically exists in an open disk about the point z₀; the Laurent expansion generalizes this, converging inside an annulus R₁ < |z − z₀| < R₂. Notice that the function is not defined at the point z₀ unless all of the b₋ₙ = 0; if this is the case, f(z) is analytic at z₀.

• A singularity of a function is a point where the function is not defined. If z₀ is a singularity of f(z) and you can define a neighborhood (open set) U(z₀) about that point such that f(z) is analytic in the domain U(z₀) − {z₀} (so R₁ → 0 above), then z₀ is called an isolated singularity. There are three general types of isolated singularities:

1. Removable singularities are the most mild form of singularity. An example of this is the function sin(x)/x. This function is not defined at the origin, but it does have a well-defined limit there; you can see this by performing a Taylor expansion of the sine function and dividing out by x, leaving a finite value plus terms that are O(x) and hence vanish in the limit. Removable singularities are trivial, and I will not talk any more about them.

2. If a function has a Laurent expansion about z₀ but there exists an integer m > 0 such that all the b₋ₙ = 0 for every n > m (but with b₋ₘ ≠ 0), then the function is said to have a Pole of order m at z₀. In the special case of m = 1, the function is said to have a simple pole. An example of a function with a simple pole at the origin is cos(x)/x, where it behaves as 1/x. In real analysis we would throw such functions away, but not in complex analysis: poles are very useful and important, as we will see. A function that is analytic in a domain, up to maybe a countable number of poles (so it still has a Laurent expansion), is said to be meromorphic in that domain.

3. If a function has a Laurent expansion about z₀ but there is no integer m > 0 such that all the b₋ₙ = 0 for every n > m (so the denominator terms do not stop), then the function is said to have an Essential Singularity at z₀. An example of this is the function f(z) = e^(1/z) at the origin. This singularity is a true nightmare, and there is very little to be done for it!

• If a function f(z) has a pole at z₀, then the quantity b₋₁ is called the Residue of the pole, also denoted by Res_{z=z₀} f(z). This quantity plays a vital role in the calculus of complex-valued functions, as we will see.

• Residues are so important, I want to list various methods of finding them. I will state several methods here without proof:

1. If q(z) has a simple zero at z₀ and p(z₀) ≠ 0, then the residue of p(z)/q(z) at z₀ is given by

b₋₁ = p(z₀) / q′(z₀)

2. If f(z) has a pole of order m at z₀ (not necessarily a simple pole), it can be written as f(z) = p(z)/(z − z₀)^m, where p(z₀) ≠ 0. Then the residue of f(z) at z₀ is given by

b₋₁ = p^(m−1)(z₀) / (m − 1)!

3. If p(z₀) ≠ 0 and q(z) has a zero of order m at z₀ (i.e. q(z) = (z − z₀)^m r(z), where r(z₀) ≠ 0), then p(z)/q(z) has a pole of order m at z₀, and method 2 applies with p(z)/r(z) in the numerator.

• Before leaving this section, there is one more type of singularity that comes up. This is not an isolated singularity, and therefore it is not defined through its Laurent expansion. Rather, this singularity comes from an ambiguity in multiply defined functions. For example, consider the complex Logarithmic function:

log z = log r + iθ

where I have used the famous result z = re^(iθ). The imaginary part of this function is a problem: since θ is only defined up to an arbitrary factor of 2π, this expression has an infinite number of solutions for a single value of z. The solution is to define a cut where we explicitly say what values of θ we allow. For example, we can say that we will only let −π < θ < +π. This is called the principal branch of the logarithmic function, and is sometimes denoted Log z.
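• Computer algebra knows these rules. Here is a small sketch of my own (assuming sympy) checking two residues, plus a cmath look at the principal branch just defined:

from sympy import symbols, residue, cos, exp, I
import cmath

z = symbols('z')

# Method 1: cos(z)/z has a simple pole at 0 with residue cos(0)/1 = 1.
print(residue(cos(z)/z, z, 0))                  # -> 1

# Method 2 on a third-order pole (this one reappears in the big example below):
print(residue(exp(I*z)/(z**2 + 1)**3, z, I))    # -> -7*I*exp(-1)/16

# cmath.log uses the principal branch, -pi < theta <= pi:
print(cmath.log(-1 + 1e-12j))                   # ~ +i*pi, just above the cut
print(cmath.log(-1 - 1e-12j))                   # ~ -i*pi, just below the cut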

• Branch cuts are lines in the complex plane that tell you where to cut off the definition of your function. For the principal branch of the Logarithm, that line extends along the negative real axis. Branch cuts are discontinuities in the function: notice that the function is discontinuous as you cross the branch cut. In the case of the Logarithm,

log(−x + iε) − log(−x − iε) = 2πi

where x is a positive real number. Therefore branch cuts are singularities.

• The branch cut you make is not unique. The cut did not have to lie along the negative real axis: it could have gone in any direction, which would have corresponded to a different branch of the logarithm, and it does not even have to be a straight line. What is true is that all branch cuts connect two points, and those two points are unique. In the case of the Logarithm, those points were the origin and infinity. Since those points are common to all branch cuts, you cannot ever define the logarithmic function there. They are called branch points.

• Aside from the logarithm, the most common function with branch-point singularities is the function f(z) = z^a, where 0 < Re(a) < 1. This function is multiply defined, since f(z) = f(ze^(i2πn/a)) for integer n. It too has branch points at z = 0, ∞.

6.3 Complex Calculus

When doing calculus on a function of complex variables, a whole new world of possibilities opens up! At first, you can think of the complex plane as a special representation of R², and you might think that the complex calculus is the same as real vector calculus. But this is not true! There are a variety of theorems that make calculus (especially integral calculus) on the complex plane a rich and fascinating subject, both theoretically as well as practically. Complex analysis is a deep subject, and it's really an amazing subject in and of itself; I strongly encourage you to read up on the theory behind this stuff. But since this is supposed to be a practical review for physics grad students, I will skip over all the proofs and just present the most commonly-used results.

• A general function of a complex variable can be written as f(z, z*). Such a function is analytic if and only if

df/dz* ≡ 0    (6.3.5)

so that f = f(z). Such a function is sometimes called holomorphic, but this is just a synonym for analytic.

• Just like in vector calculus, you can define contour integrals in the complex plane. For example, if C is a contour described by a path z(t) in the complex plane, where t ∈ [a, b] is a real parameter, we can define the contour integral of an analytic function in terms of a one-dimensional Riemann integral:

∫_C f(z) dz ≡ ∫_a^b f(z(t)) z′(t) dt    (6.3.6)

• Cauchy-Goursat Theorem: Let C be a closed contour in the complex plane and f(z) an analytic function on the domain interior to and on C. Then:

∮_C f(z) dz ≡ 0    (6.3.7)

This is a generalization of the fundamental theorem of calculus. A powerful consequence of this theorem is that a contour integral is invariant to deformations of the contour, as long as the deformation does not pass through a singularity of f(z).

• Cauchy's Integral Formula: Let f(z) be analytic everywhere within and on a simple, closed, positively oriented contour² C, and let z₀ be any point on the interior to C. Then:

f(z₀) = (1/2πi) ∮_C f(z) dz / (z − z₀)    (6.3.8)

• Using the Cauchy integral formula and some complex analysis, we can actually derive a very powerful formula for the n-th derivative of an analytic function:

f^(n)(z₀) = (n!/2πi) ∮_C f(z) dz / (z − z₀)^(n+1)    (6.3.9)

From this, we come to a rather amazing result: if a function is analytic (which is the same as saying df/dz exists), then all of its derivatives exist and are given by the above integral! This is very different from real analysis, where a function that has a first derivative need not have higher derivatives. For example, x^(3/2) has a first derivative at x = 0 but it does not have a second derivative there. This kind of thing can never happen in complex analysis. Exercise: why doesn't z^(3/2) contradict this result?

• Notice immediately the marvelous corollary to the above formulas: for all n ∈ Z,

∮_C dz / (z − z₀)^n = 2πi δ_{n1}    (6.3.10)

as long as C encloses z₀. This follows for n > 1 since the derivative of a constant is zero (apply Equation (6.3.9) with f ≡ 1); the result for n = 1 follows directly from the Cauchy integral formula, or by direct integration; and it follows for n ≤ 0 since such terms are analytic inside and on the contour, and hence the integral vanishes by the Cauchy-Goursat theorem.

² A positively oriented contour means you travel around the contour in a counter-clockwise direction; a negatively oriented contour travels around the contour in a clockwise direction.
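• Equations (6.3.6), (6.3.9) and (6.3.10) are easy to check numerically. A minimal Python sketch of my own (the helper names are mine):

import cmath, math

def contour_integral(f, z, zprime, a, b, n=4000):
    """Approximate the Riemann integral in Eq. (6.3.6) by a midpoint sum."""
    h = (b - a) / n
    return sum(f(z(a + (k + 0.5)*h)) * zprime(a + (k + 0.5)*h) for k in range(n)) * h

circle = lambda t: cmath.exp(1j*t)       # unit circle, t in [0, 2*pi]
dcircle = lambda t: 1j*cmath.exp(1j*t)

# Eq. (6.3.10): the integral of 1/z around the origin is 2*pi*i.
print(contour_integral(lambda w: 1/w, circle, dcircle, 0, 2*math.pi))

# Eq. (6.3.9) with f = exp, z0 = 0, n = 3: every derivative of e^z at 0 is 1.
val = contour_integral(lambda w: cmath.exp(w)/w**4, circle, dcircle, 0, 2*math.pi)
print(math.factorial(3)/(2j*math.pi) * val)    # ~ 1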

• Using this result, we come to a remarkable and vital theorem used throughout all areas of applied and abstract mathematics. Consider a general, meromorphic function on the complex plane. Such a function has a Laurent expansion. Now let's do a closed contour integral, where we make sure the contour does not pass through any singularities of f(z). Some of the isolated singularities of f(z) are on the interior of C (call this set Z = {z₀, z₁, ..., zₙ}, where Z can be a finite or infinite set) and some are on the exterior. Then we have the amazing and most important result:

∮_C f(z) dz = 2πi Σ_{z_k ∈ Z} Res_{z=z_k} f(z)    (6.3.11)

This result is usually called the Residue Theorem. This is why it's so important to know how to compute residues of meromorphic functions.

• To see how vitally useful the residue theorem is, consider the following real integral:

I = ∫_{−∞}^{+∞} cos x dx / (x² + a²)³    (6.3.12)

Try doing this integral directly! Before you shoot yourself, let's see how to do it in just a few steps. First of all, we analytically continue the integrand to the complex plane by replacing x by z (check that this is, indeed, the analytic continuation of the integrand). Now in order to make sense of the cosine of a complex variable, we use the definition cos z ≡ (e^(iz) + e^(−iz))/2. We split the integral into two terms I = I₊ + I₋, where

I_± = (1/2) ∫_{−R}^{+R} e^(±iz) dz / (z² + a²)³

and where we will take the limit R → ∞ in the end. Let's just consider I₊ for now. As it stands, we cannot apply the residue theorem, since the contour is not closed. However, notice that when we go into the complex plane, the exponential in the numerator picks up a factor of e^(−Im(z)). So in particular, if we add to I₊ a term that is the contour integral over the semicircle of radius R going through the upper-half of the complex plane (Im(z) > 0), such a term will go like e^(−R) and hence vanish as R → ∞. If we include this term, we don't change the value of I₊, but we do close the contour, and hence can apply the residue theorem. We can rewrite the denominator as (z + ia)³(z − ia)³: the integrand has a third-order pole at both of these points, and by rule 2 in the residues section the residues at these points are given by

(1/2!) d²/dz² [e^(iz) / (z ± ia)³] |_{z=±ia} = ∓i e^(∓a) (3 + a(a ± 3)) / (16a⁵)

Since we only surround the pole at z = +ia, we only keep that residue, and we get:

lim_{R→∞} I₊ = π e^(−a) (3 + a(a + 3)) / (16a⁵)

To compute I₋, the story is very similar, but there are a few differences to take into account. First of all, with the exponential being negative, in order to make sure the added contribution vanishes we will need to close the integral in the lower half of the complex plane rather than the upper half; this will change our formula for the residue slightly, since we now surround the pole at z = −ia. Also, since the new contour is oriented clockwise rather than counterclockwise, there is an extra negative sign. Putting this all together gives us that I₋ = I₊, so the final integral is

∫_{−∞}^{+∞} cos x dx / (x² + a²)³ = π e^(−a) (3 + a(a + 3)) / (8a⁵)

Try doing that integral without the help of the residue theorem – I wouldn't even know where to begin!!

• The residue theorem gives us a way to integrate (or differentiate) complicated meromorphic functions which are analytic on the contour of integration. But what about functions that have singularities on the contour? Such integrals by themselves are not well-defined, but we can still make sense of them by means of a regulator. For example, consider the integral

∫_{−∞}^{∞} x dx

This integral is indeterminate until we choose precisely how to take the limit to infinity. The final answer we get will not be unique, since it will depend on the regularization scheme used. But once we specify this scheme, we can proceed to make well-defined statements.

• It turns out that there is a particular regularization scheme that is a very natural choice, called the Principal Value. The Principal Value of this integral is defined as the symmetric limit:

P ∫_{−∞}^{∞} x dx = lim_{R→∞} ∫_{−R}^{+R} x dx = 0

Notice that this choice of regulator is by no means unique. For example, we could have defined the integral as

lim_{R→∞} ∫_{−R−a/R}^{+R+b/R} x dx = b − a

There is nothing special about any of these choices. However, there is good reason to use the principal value, as we will see. Let f(x) be an analytic real-valued function. Then we have the following vital result:

lim_{ε→0} ∫_{−∞}^{∞} f(x) dx / (x − x₀ ± iε) = P ∫_{−∞}^{∞} f(x) dx / (x − x₀) ∓ iπ f(x₀)    (6.3.13)

where the principal value on the right-hand side should be understood as

P ∫_{−∞}^{∞} f(x) dx / (x − x₀) ≡ lim_{R→∞} lim_{ρ→0} [ ∫_{−R}^{x₀−ρ} f(x) dx / (x − x₀) + ∫_{x₀+ρ}^{R} f(x) dx / (x − x₀) ]    (6.3.14)

This theorem goes by many names, although you might find it under something else in your favorite textbook; I like to call it the Principal Value Theorem.

• I will not prove the Principal Value Theorem (it's not very difficult to prove – try it yourself), but I will use it in a simple example. Let's calculate the integral

∫_{−∞}^{∞} cos x dx / (x + iε)

As ε → 0, this integral is ill-defined, due to its pole on the real axis. Nonetheless, the Principal Value Theorem gives us a way to solve the problem:

∫_{−∞}^{∞} cos x dx / (x + iε) = P ∫_{−∞}^{∞} cos x dx / x − iπ cos(0)

If the universe is fair, the first integral on the right-hand side should vanish due to the symmetry of the integrand, but that will only happen if the divergences are under control. Notice that near the origin, where the integrand has a simple pole, the principal value tells us how to cancel the divergence there:

lim_{R→∞} lim_{ρ→0} [ ∫_{−R}^{−ρ} dx/x + ∫_{ρ}^{R} dx/x ] = lim_{R→∞} lim_{ρ→0} [ −∫_{ρ}^{R} du/u + ∫_{ρ}^{R} dx/x ] = 0

where I set u = −x in the first integral. Thus the principal value has tamed the divergence. So this integral does exist, and it equals −iπ.

• Notice in the above example that if we had changed the sign of ε (placing the pole above the real axis rather than below it), we would have found a different answer, +iπ. Since we take the limit ε → 0 in the end, you might have thought that we should get the same answer. This is not a coincidence! It is due to the fact that the integral (without the ε) is not well defined. So this begs the question: which sign should we use? This is where mathematics ends and physics begins. You will find that the choice of sign typically comes from boundary conditions, and is sensitive to the particular problem at hand.
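• Both of the last two results are easy to test numerically. A sketch of my own, assuming scipy is available (quad's weight='cauchy' option computes exactly the principal value P ∫ f(x)/(x − x₀) dx):

import math
from scipy.integrate import quad

# The residue-theorem example (6.3.12), with a = 2:
a = 2.0
closed_form = math.pi*math.exp(-a)*(3 + a*(a + 3))/(8*a**5)
numeric, _ = quad(lambda x: math.cos(x)/(x**2 + a**2)**3, -math.inf, math.inf)
print(closed_form, numeric)     # agree to quadrature accuracy

# The principal value P int cos(x)/x dx over a symmetric range:
pv, _ = quad(math.cos, -50.0, 50.0, weight='cauchy', wvar=0.0)
print(pv)                       # ~ 0, so the full answer is -i*pi*cos(0) = -i*pi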

• As a final example of how complex calculus is so very useful, let's consider how to integrate a function with a branch-point singularity:

∫₀^∞ x^(−a) dx / (x + 1)

where 0 < a < 1. This integrand is multiply defined – we can write the numerator as e^(−a log x) – so we must choose a branch cut to define it. This is a little different from before, since a branch point is not an isolated singularity, so the residue theorem doesn't help by itself. But we can still do pretty well. When there is a branch-point singularity, you have to specify a branch cut that connects the singularities: this is a line connecting the two branch points, in this case the origin and infinity. Also, notice the simple pole at x = −1, so we shouldn't choose our branch cut to overlap with that pole. We will make our cut along the positive real axis. Our integral is then the same as

∫_{R₊} z^(−a) dz / (z + 1)

where the integral is along the positive real axis in the complex plane. We will extend this contour. First, the contour goes from ρ + iε (ρ < 1) to R + iε (R > 1) along the real axis (L₊); the ε is there to avoid the branch cut – we can let ε → 0 as long as we keep in mind how to define the exponent (see below). Then the contour goes counterclockwise around a circle of radius R until we get back to just below the real axis at R − iε (C_R). Then we travel back down the real axis until we get to ρ − iε (L₋). Finally, we go clockwise around a circle of radius ρ until we get back to the starting point at ρ + iε (C_ρ). With our choice of ρ and R, this contour encircles the pole at z = −1 = e^(iπ), so the residue theorem applies. So we have:

[ ∫_{L₊} + ∫_{C_R} + ∫_{L₋} + ∫_{C_ρ} ] z^(−a) dz / (z + 1) = 2πi e^(−iaπ)

The two circle contour integrals vanish. To see that, we need the fact that |z₁ + z₂| ≥ ||z₁| − |z₂||. Then for the small circle:

| ∫_{C_ρ} z^(−a) dz / (z + 1) | ≤ ρ^(−a) (2πρ) / (1 − ρ) = (2π / (1 − ρ)) ρ^(1−a) → 0 as ρ → 0

and a similar analysis can be done with the large circle. The interesting part comes from the remaining two integrals. The integral across L₊ is above the branch cut, so there log z = log r + i0 as ε → 0, while the integral along L₋ is below the branch cut, so there log z = log r + i2π as ε → 0. Plugging this in:

∫_{L₊} z^(−a) dz / (z + 1) = ∫_ρ^R r^(−a) dr / (r + 1)

∫_{L₋} z^(−a) dz / (z + 1) = ∫_R^ρ r^(−a) e^(−2iaπ) dr / (r + 1)

Putting all of this together and taking the limits ρ → 0 and R → ∞ gives

∫₀^∞ r^(−a) dr / (r + 1) = 2πi e^(−iaπ) / (1 − e^(−2iaπ)) = π / sin(aπ)

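• Once again it is easy to convince yourself numerically – a sketch of my own, assuming scipy (I split the range at 1 to help the quadrature with the integrable singularity at the origin):

import math
from scipy.integrate import quad

a = 0.3
part1, _ = quad(lambda x: x**(-a)/(x + 1), 0, 1)
part2, _ = quad(lambda x: x**(-a)/(x + 1), 1, math.inf)
print(part1 + part2, math.pi/math.sin(a*math.pi))   # the two agree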
6.4 Conformal Mapping

• Conformal mapping is one of those tools that, in my experience, takes a great deal of time and practice to get comfortable with. It is one of those things in mathematics which is almost as much art as science. Whole textbooks have been written on this rich and amazing subject; perhaps it is overly ambitious for me to try and write this section, but I've always loved a challenge. I cannot begin to write a complete review of this subject – it would take way too long, and I am not an expert on the subject anyway – so in this section I only intend to present some of the basic theorems and one or two examples. Maybe one day I'll do more than that, but not today.

• Conformal mapping is a technique that is useful in solving partial differential equations in two dimensions with certain boundary conditions. The applications of such problems range everywhere from fluid dynamics to electrodynamics to financial forecasting! The basic idea is to transform complicated boundary conditions into simpler ones, and thus make solving the problem much easier.

• Simply put, a conformal transformation is a transformation that preserves angles. Technically, consider a contour in the complex plane parametrized by a function z(t), and a map f(z) that is analytic and has a nonvanishing first derivative; such maps are called conformal maps. Let w(t) = f(z(t)) be the image of the contour under the map f. We will call the complex w-plane C_w and the complex z-plane C_z, so f : C_z → C_w.

• So what does this have to do with angles? The tangent vector in C_w is given by

w′(t) = f′(z(t)) z′(t)

From this it follows that

arg w′(t) = arg f′(z) + arg z′(t)    (6.4.15)
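• Equation (6.4.15) says that a conformal map rotates every tangent vector at a given point by one and the same angle, arg f′(z₀). A small numeric sketch of my own, using f(z) = z²:

import cmath

f = lambda z: z**2
fprime = lambda z: 2*z

z0 = 1 + 1j
for zp in (1, 1j, cmath.exp(0.3j)):    # tangent vectors of three curves through z0
    wp = fprime(z0)*zp                 # Eq. (6.4.15): w' = f'(z0) * z'
    print(cmath.phase(wp) - cmath.phase(zp))   # always arg f'(z0) = pi/4 here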

Now let’s imagine two curves described by z1 (t) and z2 (t) that intersect at a point z1 (t0 ) = z2 (t0 ) = z0 ∈ Cz . Then from the above formula, we have the result:

arg w₁′(t₀) − arg w₂′(t₀) = arg z₁′(t₀) − arg z₂′(t₀)    (6.4.16)

So indeed we find that conformal transformations do preserve angles between contours.

• Before going further, let me point out a very important fact without proof: let f(z) = u(x, y) + iv(x, y) be an analytic function, so that u(x, y) and v(x, y) are both harmonic, and v is the harmonic conjugate to u. Let T_u(x, y) = ∇u(x, y) be the tangent vector of u at the point (x, y), and similarly for T_v(x, y), and let (x₀, y₀) be a point such that u(x₀, y₀) = v(x₀, y₀). Then T_u(x₀, y₀) · T_v(x₀, y₀) = 0; that is, harmonic conjugates have orthogonal tangent vectors at points of intersection. I leave the proof of this to you (Hint: use the CR Equations; a quick computer check appears after the list below). This will be very useful in what follows.

• Another important fact: the definition of a conformal transformation turns out to provide a necessary and sufficient condition for an inverse function to exist, at least locally; this is a result of the infamous "Inverse Function Theorem". Thus, if we find a conformal transformation that simplifies our boundary conditions in C_w, we can always invert the transformation to get the solution on C_z, which is, of course, what we really want.

• The differential equation we are interested in is Laplace's equation. Remember that the real and imaginary parts of an analytic function are harmonic functions, that is, solutions to Laplace's equation in two dimensions. We are interested in two types of boundary conditions for a harmonic function h(x, y) on a boundary described by a curve C in the complex plane z = x + iy:

1. Dirichlet Boundary Conditions: h|_C = h₀ for any constant h₀.

2. Neumann Boundary Conditions: n̂ · ∇h|_C = 0, where n̂ is a unit vector normal to the boundary.
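• Here is the promised check that harmonic conjugates meet at right angles – a sympy sketch of my own, using u = x² − y² and v = 2xy from the earlier example:

from sympy import symbols, diff, simplify

x, y = symbols('x y', real=True)
u = x**2 - y**2
v = 2*x*y

# grad(u) . grad(v) = u_x*v_x + u_y*v_y vanishes identically, by the CR equations:
print(simplify(diff(u, x)*diff(v, x) + diff(u, y)*diff(v, y)))   # -> 0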

It turns out that these two types of boundary conditions are preserved under conformal transformations; any other type of boundary condition will generally not be preserved. For example, dh/dn|_C = constant ≠ 0 is not preserved by conformal transformations. Therefore, whenever faced with trying to solve Laplace's equation with either Dirichlet or Neumann boundary conditions in two dimensions, conformal mapping is an available method of finding the solution. As mentioned above, this problem comes up many times and in many places, so this is a very powerful technique.

• As an example, let's consider the problem of trying to find the electrostatic scalar potential inside an electrical wire. The wire has a circular cross section, where half of the cylinder is kept at V = 0 and the other half is kept at V = 1. The cylindrical symmetry of the problem means the potential cannot depend on z, so this is a two-dimensional problem. If we place the cross section of the wire on a unit circle in the complex plane C_z, we can look for a harmonic function that agrees with the boundary conditions; this, by uniqueness of solutions, is the final answer. This is a Dirichlet boundary condition problem with a circular boundary. Such boundaries are complicated, but we can use conformal mapping to turn this problem into an easier one.

Consider the mapping:

w = i (1 − z) / (1 + z)    (6.4.17)

This is an example of a Möbius Transformation, also known as a linear fractional transformation. It has the effect of mapping the interior of the unit circle in C_z to the entire upper-half plane of C_w, where the upper half of the circle is mapped to the positive real w axis while the lower half of the circle is mapped to the negative real w axis. I leave it to you to prove that this map is, indeed, a conformal map.
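• A quick numeric look at where this map sends a few points – a sketch of my own (outputs are real, or upper-half-plane values, up to rounding):

import cmath

w = lambda z: 1j*(1 - z)/(1 + z)

for theta in (0.5, 1.5, 2.5):         # upper half of the unit circle
    print(w(cmath.exp(1j*theta)))     # -> positive real axis (in fact tan(theta/2))
for theta in (-0.5, -1.5, -2.5):      # lower half of the unit circle
    print(w(cmath.exp(1j*theta)))     # -> negative real axis
print(w(0))                           # the interior point 0 -> i, in the upper half plane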

Now the next step is to find a map on C_w that is harmonic and satisfies the boundary conditions. Consider the function

(1/π) log w = (1/π)(log |w| + i arg(w))    (6.4.18)

which is analytic and spans the upper half plane of C_w for 0 ≤ arg w ≤ π. Notice that the imaginary part of this function satisfies the boundary conditions we need, and since the imaginary component of an analytic function is harmonic, we have found our solution. By writing w = u + iv, we have arg(w) = tan⁻¹(v/u). By expanding Equation (6.4.17) to compute u(x, y) and v(x, y), we find

V(r, θ) = (1/π) tan⁻¹[(1 − x² − y²)/(2y)] = (1/π) tan⁻¹[(1 − r²)/(2r sin θ)]    (6.4.19)

where we take the branch of the arctangent function 0 ≤ tan⁻¹ t ≤ π. It gets better: the harmonic conjugate of the potential is another harmonic function whose tangent vector is always orthogonal to the potential's tangent vector. But the tangent vector to the electrostatic potential is the electric field, and the lines perpendicular to the field are just the field flux lines. From Equation (6.4.18) we know right away that these flux lines are given by the level curves of the imaginary part of −(i/π) log w, or, upon using Equation (6.4.17):

E(r, θ) = (1/2π) log[((1 − x)² + y²)/((1 + x)² + y²)] = (1/2π) log[(1 + r² − 2r cos θ)/(1 + r² + 2r cos θ)]
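• As a check, here is a short sketch of my own (assuming sympy) that the potential is harmonic, together with a numeric look at the boundary values via V = arg(w)/π, which handles the arctangent branch automatically:

from sympy import symbols, atan, diff, simplify, pi
import cmath, math

x, y = symbols('x y', real=True)
V = atan((1 - x**2 - y**2)/(2*y))/pi
print(simplify(diff(V, x, 2) + diff(V, y, 2)))   # -> 0: V is harmonic

def V_num(z):
    return cmath.phase(1j*(1 - z)/(1 + z))/math.pi

print(V_num(0.99j))    # just inside the upper semicircle: ~ 0
print(V_num(-0.99j))   # just inside the lower semicircle: ~ 1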

• Perhaps this relatively straightforward example has been able to show just how much art goes into solving problems this way. First, we had to find the conformal transformation that simplifies the problem. Believe it or not, this is not as bad as it sounds – there are big, fat tables online and in textbooks cataloging hundreds of conformal maps, so you can go through those and look for one that helps. The next thing we had to do was find the analytic function whose harmonic component gives us our solution. In this case that was relatively straightforward: there are a whole bunch of analytic functions that you can try – polynomials and roots, trig functions, logs and exponents, etc. When you've done problem after problem, you begin to develop an intuition for finding the right function. But it is something of an art, and like all forms of art, it takes practice to get good at it. The rewards are worth it, though: once you get good at finding the conformal map and the analytic function, you have completely solved the problem. You even get the field flux lines for free!

. but I wouldn’t use it as a primary source.. Do Carmo. H. R. M. Princeton University Press (New Jersey). and it does not use Einstein notation. Herstein. Diﬀerential Geometry of Curves and Surfaces. Dirac. M. 1999. J.. I. Perseus Books (Massachusets). A great place to see everything without any excess baggage. Riemannian Manifolds: An Introduction to Curvature. and so it says nothing about Riemann formalism. Inc (New Jersey). It assumes the reader is familiar with the theory of diﬀerentiable manifolds. Springer-Verlag (New York). 1997. (New York). Third Ed. V... J. P. General Theory of Relativity. Inc. E.. Has many details on tensors and the Lorentz group. Not the best. and Churchill. Abstract Algebra. It only deals in one and two dimensions. A.. B. but it’s what I used. Hubbard. Complex Variables and Applications (6th Ed). J. N. This pamphlet has many formulas. and Hubbard. 1976. This is a great book on complex analysis and conformal mapping. J. John Wiley and Sons (New York). W. but it is a great beginning book that gets the ideas across without too much heavy machinery.Suggested Reading Here I include some books that I used when writing this review. P. Prentice Hall. Vector Calculus. The classical textbook on electrodynamics. Brown. D. D. 1996. Prentice Hall. M. but no exercises or details. Peskin. It is by no means an exhaustive list. This is an excellent book on diﬀerential geometry in all its details. This book has a great chapter on diﬀerential forms. Lee. Linear Algebra and Diﬀerential Forms: A Uniﬁed Approach. 1999. 1996. McGraw Hill. although more advanced. and Schroeder.. Has a detailed 72 . A decent introduction to abstract algebra. 1990. 1995. Inc (New Jersey). B. Classical Electrodynamics. An Introduction to Quantum Field Theory. I heartily recommend it. This is a great book on diﬀerential geometry. but it might be enough to get you started if you wish to follow up on any topic. V. Prentice Hall (New Jersey). Jackson. M. The modern texbook for quantum ﬁeld theory.

General Relativity. Sakurai. and that these references can help you to ﬁll in the gaps.. 1984. as well as a section on Lie groups. or at least start you on your way. It has a very good discussion of the theory of angular momentum. Ward. 73 . R. A textbook used for graduate quantum mechanics in many universities. Addison-Wesley (Massachusetts). M.jhu. J. University of Chicago Press (Illinois).discussion of Lagrangian ﬁeld theory. where tensors and Lie groups have a secure home in physics. J.edu). Noether’s theorem. It has many details and is accessible to graduates. 1994. A great book – one of my favorites.. please send me your comments (blechman@pha. I hope that this review is of use to you. Modern Quantum Mechanics. If you have any suggestions for the improvement of this paper.
