Ce PDF

X I N TA O
C L A S S I C A L E L E C T R O DY N A M I C S
Contents
1 Introduction to Vector and Tensor Analysis 7
2 Foundations of Theory of Relativity 19
3 Relativistic Dynamics 33
4 Charges in a Given Electromagnetic Field 40
5 The Electromagnetic Field Equations 55
6 General Electrostatics 68
7 Boundary Value Problems in Electrostatics 78
8 General Magnetostatics 89
9 Time-varying Electromagnetic Fields 97
10 Fields of Moving Charges 121
11 Radiation of Electromagnetic Waves 132

讲义说明
2018 版说明 : 一个主要变化是重新加入了从分析力学的角度推导
Maxwell 方程的章节，因为这一章节非常有助于从场论的角度理解电
动力学，同时也强调了场作为另一种描述物理现象的方式与粒子的同
等重要地位。与 2013 年的讲义不同的是，这次的场的分析力学部分
为选修章节，不会计入最终成绩，也不会参加正常的考试部分。
2015 版说明：这本讲义主要是我自己讲课时参考所用，其内
容主要了参考了 Landau 和 Lifshitz 的《The Classic Theory of
Fields》， David J. Griffiths 的《Introduction to Electrodynamics
(4th Edition)》。其余部分还参考了 Jackson 的《Classical Electrody-
namics》。在这三本书中，Landau 和 Jackson 的书并不太适合一般
本科生阅读。这个讲义里对 Landau 的书作了大幅简化，而 Jackson
的书则只是参考了一小部分数学证明和一些个别有意思的章节。正常
情况下，我推荐的阅读书是 Griffiths 的电动力学导论。
这个讲义第一版的使用是在 2014 年春季学期，当时采用的单位制
是 Gauss 单位制，因为我觉得 Gauss 单位制更适合理论电动力学。
但在实行过程中，不少同学向我建议换回国际单位制，以求更好地和
之前的课程衔接。因此从 2015 年春季学期开始，我对这个讲义作了
比较大幅度的改写，一是增加更多的基础数学知识，二是去除场的分
析力学部分，三是将 Gauss 单位制换为 SI 单位制。在这里要特别感
谢金泽宇，王贤瞳，和吴昊楠三位同学的协助，他们在改单位制的过
程中付出了大量劳动，并且修正了 2014 年讲义中的很多错误。2014
年春季学期班上还有好几位其他同学在 14 年教学过程中也陆续地指
出讲义中的错误，但遗憾当时并没有记录名单，在此一并感谢。
1
Introduction to Vector and Tensor Analysis
It is critically important for you to learn basic vector and tensor

analysis to understand the electromagnetic field theory. In this
chapter, we will explain some basic concepts and practically useful
techniques about vector and tensor analysis.
1.1 Vector Analysis
1.1.1 The definition

For most of you, a vector is an object with both magnitude and
direction; e.g., displacement, velocity, etc. This definition is useful
and pretty much all you need in 3D space. To understand the
vector in higher dimensional or non-Euclidean space; however, this
definition is not that intuitative. We introduce here a more general
and formal definition about the vector in 3D Euclidean space.
Any set of three components that transforms in the same

manner as the displacement vector when one changes coor-
dinates is called a “vector”.
You can see that, the displacement vector is the “model vector”
for all vectors. This definition of the vector is very important to
understanding 4-vectors in special relativity where laws of phsics
is defined in a four dimensional pseudo-Euclidean space called
Minkowski space. You can see that some sets of numbers look
like a vector, but they do not transform like the model vector (the
displacement vector); therefore, they are not vectors.
1.1.2 Einstein Summation Convention

Before we discuss more about vector analysis, we first introduce the
Einstein summation convention, a powerful and simple notation.
Einstein noticed that in some expressions involving summation
over components of vectors, e.g.,
3
A·B = ∑ Ai Bi , (1.1)
i =1
8 classical electrodynamics
you can take away the summation sign (∑) without changing the
meaning of the expression. Therefore, you can write
A · B = Ai Bi . (1.2)
Here the repeated index i is called the “dummy index” (哑指标),

and a dummy index is implicitly summed over. To generalize this
convection to more general cases, we need to follow essentially the
following three rules:
1. Dummy indices are implicitly summed over.
2. Each index can appear at most twice in any term.
3. Each term must contain identical non-repeated indices,

also called free indices (自由指标).
To understand the following rules, we give some examples here.

You have already seen rule #1. So for example, if you see Ai ei , it
means vector A, because
A i e i = A1 e1 + A2 e2 + A3 e3 . (1.3)
Rule #2 prohibits expressions like Ai Bi Ci , because no vector op-

erations can result in this kind of expression after removing the
summation sign, and it can easily produce wrong results.
As an example, we can also use Einstein summation rules to
obtain A · B directly using definitions of A and B. Suppose that we
have two vectors A and B in Cartesian space,
A · B = Ai ei · Bj e j = Ai Bj ei · e j = Ai Bj δij = Ai Bi . (1.4)
Here δij = 1 if i = j and 0 otherwise. Note that we used “i” as the

dummy index for A while j as the dummy index for B. You cannot
write it as Ai ei · Bi ei , because of the rule #2: the dummy index can
appear only twice in a given term. Writing it this way gives you the
correct results Ai Bi , but that is purely coincidental: in more general
cases, ei · e j 6= 01 . For example, let us assume for now that we use 1
This kind of coordinate system is
a coordinate system where ei · e j = gij , then the correct results for non-orthonognal (非正交)
A · B = Ai Bj gij , 2 not Ai Bi as you would have by violating rule #2. 2

This notation is also problamatic if
Rule #3 is useful when you think about a particular component you are familiar with vector analysis
in curvilinear coordinate system; it
of the a vector expression. For example, the ith component of A + B is used only for illustration of the
is Ai + Bi . Here i is the free index in both terms (Ai and Bi ). The summation convection. For those of
you who are interested, one correct
expression Ai + Bj does not mean anything. way to write it is Ai B j gij
Rules #2 and #3 can be used to identify wrong expressions. For
example, Ai Bi Ci cannot be correct because of rule 2: the index i
appears three times in this term. Expression Ai Bj Cj + E p is also
wrong, because of rule 3: The non-repeated index in Ai Bj Cj is “i”,
while in E p is “p”, and they are not identical. On the other hand,
Ai Bj Cj + Ei is correct. This expression means the i-th component of
the vector A( B · C ) + E. You have to make yourself familar with the
introduction to vector and tensor analysis 9
Einstein summation convection, as it will be used throughout this

class.
The two basic vector operations are the dot and the cross prod-
ucts. For the dot product of two vectors, we already see that it is
A · B = Ai Bi . For the cross product of two vectors, A × B,
A × B = ( A2 B3 − A3 B2 )e1 + ( A3 B1 − A1 B3 )e2 + ( A1 B2 − A2 B1 )e3 .
How do you write this using Einstein notation? Let’s introduce the
Levi-Civita symbol eijk given by

 +1 if i,j,k form an even permutation of 1, 2, 3.


eijk = −1 if i,j,k form an odd permutation of 1, 2, 3. (1.5)


0 otherwise

Hence, eijk = e jki , eijk = −e jik , and eiik = 0. A very useful identity of
eijk is
eijk eimn = δjm δkn − δjn δkm , (1.6)
where δjm = 0 if j 6= m and 1 if j = m. With the Levi-Civita symbol,

you can easily see that
A × B = eijk A j Bk ei . (1.7)
1.1.3 The Del Operator

One complicating factor in vector analysis is the del operator, ∇.
The gradient of a scalar is ∇ T. In Cartesian coordinates,
∂T ∂T ∂T
∇ T = e1 + e2 + e3 , (1.8)
∂x1 ∂x2 ∂x3
where x1 ≡ x, x2 ≡ y, x3 ≡ z. With Einstein summation convention,
∂T
∇ T = ei . (1.9)
∂xi
Therefore, in Cartesian coordinates,
∂
∇ = ei . (1.10)
∂xi
We see that ∇ is a vector operator; it has properties of a vector and

an operator at the same time.
Using Equation (1.10), a few expressions related to ∇ can be
quickly given. For example, the divergence of a vector,
∂ ∂Ai
∇· A = A = . (1.11)
∂xi i ∂xi
The curl of a vector A is
∂ ∂Ak
∇× A = eijk ei A = eijk e. (1.12)
∂x j k ∂x j i
1.2 Vector Algebra
In this section, I will teach you how to memorize/derive commonly

used vector algebra without referring to a handbook. We will use a
lot of vector analysis and identities in this class.
A·B×C = B·C×A = C·A×B

= A×B·C
A × ( B × C ) = (C × B) × A = ( A · C ) B − ( A · B)C
∇( f g) = f ∇ g + g∇ f
∇· ( f A) = f ∇· A + A · ∇ f
∇×( f A) = f ∇× A + ∇ f × A
∇· ( A × B) = B · ∇× A − A · ∇× B
∇( A · B) = A × (∇× B) + B × (∇× A)
+ ( A · ∇) B + ( B · ∇) A
∇×(∇× A) = ∇(∇· A) − ∇2 A
∇· ∇× A = 0
∇×∇ f = 0
B
Ak = A · ≡ A · B̂
B
A⊥ = − B̂ × ( B̂ × A)
In the last two identities, the direction is w.r.t. B̂. And also some
useful identities involving r, the radial vector:
∇· r = 3
∇× r = 0
∇r = I
r
∇r = = er
r
∇×[ f (r )r ] = 0.
We now start from the most basic vector operations and show you
how to memorize them.
First, for A · ( B × C ), you can interchange the dot and cross.
A · ( B × C ) = A × B · C.
Or you can cyclically permute the order of vectors, like
A · ( B × C ) = B · C × A.
Of course, since B × C = −C × B,
A · ( B × C ) = − A · C × B.
Second, for A × ( B × C ), you can use the middle-outer rule 3 . 3

Fundamentals of Plasma Physics by Paul
M. Bellan
A × ( |{z}
B × |{z}
C ) = |{z}
B ( · C}
|A{z ) − |{z}
C ( · A}
|B{z ).
middle outer middle other two dotted outer other two dotted
B × |{z}
(|{z} C ) × A = |{z}
C ( · A}
|B{z ) − |{z}
B ( · C}
|A{z ).
outer middle middle other two dotted outer other two dotted
I’ve found this middle-outer rule quite convenient.

Third, for identities involving ∇, you should treat it as being
both a vector and a differential operator. Applying the calculus rule
( ab)0 = a0 b + ab0 , we know
∇( f g) = g∇ f + f ∇ g
I find it not difficult to see
∇· ( f A) = f ∇· A + A · ∇ f
∇×( f A) = f ∇× A + ∇ f × A
The real difficulty most students find in doing vector anlysis is
when an expression has complicated identities involving ∇. Here,
we introduce two mathematical tricks.
1.2.1 Method 1
This one involves 3 steps; it uses tricks to seperate the operator
property and the vector property of the ∇ operator. I’ll use ∇×( A ×
B) as an example.
Step 1: Treat ∇ as an operator. Here it applies to both A and B.
So we add a subscript to ∇ so indicate which vector it operators on.
∇×( A × B) = ∇ A × ( A × B) + ∇ B × ( A × B).
Step 2: Treat ∇ A and ∇ B as different vectors, applying vector
rules, and making necessary adjustments to put ∇ A right before
A and ∇ B right before B. Adjustments should also satisfy vector
rules.
∇ A × ( A × B) = (∇ A · B) A − B(∇ A · A) middle-outer rule

= ( B · ∇ A ) A − B(∇ A · A) adjustment
Similarly, we have
∇ B × ( A × B) = (∇ B · B) A − (∇ B · A) B
= (∇ B · B) A − ( A · ∇ B ) B
Step 3: Dropping subscripts of ∇, we have
∇×( A × B) = ( B · ∇) A − B(∇ · A) + (∇ · B) A − ( A · ∇) B
Now let’s use this trick to derive ∇· ( A × B).
1. ∇· ( A × B) = ∇ A · ( A × B) + ∇ B · ( A × B).
2. ∇ A · ( A × B) = ∇ A × A · B = B · ∇ A × A
∇ B · ( A × B) = −∇ B × B · A = − A · ∇ B × B
3. ∇· ( A × B) = B · ∇ × A − A · ∇ × B
Note: you might ignore adding subscripts after you get familiar with
the method.
1.2.2 Method 2
The second trick for vector/tensor analysis uses the property that a
vector/tensor identity is valid in all coordinate systems. Using this
property, we can perform vector-tensor analysis in the following
way:
1. write out the components in Cartesian coordinates
2. do necessary analysis
3. from components, restore the original variables
Usually the last step is the most difficult one, it requires some
experience and practice. Also this method is especially powerful
when used with the Einstein summation convention.
Example 1: Prove ∇· r = 3 and ∇× r = 0.
∂ ∂x j ∂x
∇· r = ei · xj ej = δij = i = 3,
∂xi ∂xi ∂xi
∂
∇× r = ei eijk x = ei eijk δjk = 0.
∂x j k
Example 2: Prove
∇· ( A × B) = B · (∇× A) − A · ∇× B.
Proof:
∂ ∂
∇· ( A × B) = ei · (e jkl Ak Bl )e j = (e A B )
∂xi ∂xi ikl k l

∂Ak ∂B ∂Ak ∂B
= eikl B + Ak l = eikl B + eikl Ak l
∂xi l ∂xi ∂xi l ∂xi
∂A ∂B
= elik k Bl − Ak ekil l
∂xi ∂xi
= (∇× A) · B − A · (∇× B)
= B · (∇× A) − A · (∇× B)
Example 3: Prove
∇×( A × B) = ( B · ∇) A − B(∇ · A) + (∇ · B) A − ( A · ∇) B.
Proof:
∂ ∂
∇×( A × B) = eijk ( A × B)k ei = eijk e A Bm ei
∂x j ∂x j klm l
!
∂Al ∂Bm
= eijk eklm Bm + A ei
∂x j ∂x j l
!
∂Al ∂Bm
= ekij eklm Bm + A ei
∂x j ∂x j l
!
∂Al ∂Bm
= (δil δjm − δim δjl ) Bm + A ei
∂x j ∂x j l
Take a break and let’s continue...

!
∂Al ∂Bm
∇×( A × B) = (δil δjm − δim δjl ) Bm + A ei
∂x j ∂x j l
!
∂Al ∂Bm
= δil δjm Bm + A ei
∂x j ∂x j l
!
∂Al ∂Bm
− δim δjl Bm + A ei
∂x j ∂x j l
!
∂A ∂Bj ∂A j ∂B
= B j i + Ai − Bi − Aj i ei
∂x j ∂x j ∂x j ∂x j
!
∂A ∂Bj ∂A j ∂B
= Bj i + Ai − B − Aj i ei
∂x j ∂x j ∂x j i ∂x j
= ( B · ∇) A + (∇· B) A − (∇· A) B − ( A · ∇) B
Practice and master both methods. They are extremely important
in the study of classical electrodynamics.
1.3 Curvilinear coordinates
I do not talk about general vector analysis in curvilinear coordi-

nates in this class; it’s quite complicated. If you are interested, see
books in my reference list. I will only very briefly review spherical
and cylindrical coordinates, since you should have learned this
before.
In spherical coordinates (er , eθ , eφ ),
Gradient of a scalar
∂f 1 ∂f 1 ∂f
∇f = er + e + eφ ,
∂r r ∂θ θ r sin θ ∂φ
Divergence
1 ∂ 2 1 ∂ 1 ∂Aφ
∇· A = r Ar + (sin θ Aθ ) + ,
r2 ∂r r sin θ ∂θ r sin θ ∂φ
Curl of a vector
1 ∂ 1 ∂Aθ
(∇× A)r = sin θ Aφ −
r sin θ ∂θ r sin θ ∂φ
1 ∂Ar 1 ∂
(∇× A)θ = − rAφ ,
r sin θ ∂φ r ∂r
1 ∂ 1 ∂Ar
(∇× A)φ = (rAθ ) − .
r ∂r r ∂θ
Laplacian
∂2 f

2 1 ∂ 2∂f 1 ∂ ∂f 1
∇ f = 2 r + 2 sin θ + .
r ∂r ∂r r sin θ ∂θ ∂θ r2 sin2 θ ∂φ2
In cylindrical coordinates (er , eφ , ez ),

Gradient of a scalar
∂f 1 ∂f ∂f
∇f = er + eφ + ez ,
∂r r ∂φ ∂z
Divergence
1 ∂ 1 ∂Aφ ∂Az
∇· A = (rAr ) + + ,
r ∂r r ∂φ ∂z
Curl of a vector
1 ∂Az ∂Aφ
(∇× A)r = −
r ∂φ ∂z
∂Ar ∂Az
(∇× A)φ = − ,
∂z ∂r
1 ∂ 1 ∂Ar
(∇× A)z = rAφ − .
r ∂r r ∂φ
Laplacian
1 ∂2 f ∂2 f

1 ∂ ∂f
∇2 f = r + 2 2
+ 2.
r ∂r ∂r r ∂φ ∂z
1.4 A very brief introduction to tensor
A tensor may be regarded as the product of two vectors; it has two

sets of directions. 4 The simplest tensor is the dyad (并矢), which is 4
More strictly speaking, we are
just two vectors put together, without a symbol in between. For ex- dealing with second-order tensors in
this section.
ample, AB is a dyad. It has two sets of directions: one from vector
A, and one from B. In terms of Einstein summation convection,
AB = Ai ei Bj e j = Ai Bj ei e j , (1.13)
the component in ei e j direction is Ai Bj . Lots of students like to use

matrix to represent a tensor. That’s OK for the second order tensor,
but for more general N-order tensor, it is something not easy to
do. So I would like to ask you not to equal a tensor to a matrix, but
instead you should use notation similar to Equation (1.13) in tensor
operation.
The basic tensor operation can be introduced via the dot-product
between a dyad and a vector 5 . The dot product between a vector 5
W.D.D’haeseleer, W.N.G.Hltchon,
and a dyad AB is given by J.D.Callen, and J.L.Shohet, Flux coordi-
nates and magnetic field structure.
X · AB ≡ ( X · A) B,
or
AB · Y ≡ A( B · Y ).
Using these rules, we see that,
X · AB = Xi ei · A j Bk e j ek = Xi A j Bk (ei · e j )ek = Xi Ai Bk ek .
and
AB · Y = Ai Bj ei e j · Yk ek = Ai Bj Yk ei (e j · ek ) = Ai Bk Yk ei .
You see that in general AB 6= BA, since AB · C is a vector in the

direction of A, while BA · C is a vector in the direction of B. In this
class, the most common form of tensors is dyad, so you’d better get
familiar with this one.
A general second-order tensor is
F = Fij ei e j .
A general second-order tensor cannot be written as a dyad, but it

can be written as a summation of dyads; i.e.,
F = ab + cd + e f + · · ·
In 3-D space, you need at least three dyads to represent a general

second-order tensor. A vector can be considered to be a first-order
tensor, and a scalar is a zeroth-order tensor. It’s easy to see that in
the component of a tensor, there are two free indices; e.g., Fij . In
a vector, there is one; e.g., Ai . In a scalar, there is zero; e.g., C. So
the number of free indices in the component of an object equals the
rank of the tensor.
If you define a second-order tensor using transformation

properties, then a tensor transforms under coordinate trans-
formations like the products of components of two vectors.
As you can see from the definition of a dyad.
The dot-product between a second-order tensor and a vector is a

vector. For example,
F · A = Fij ei e j · Ak ek = Fij Ak ei δjk = Fij A j ei .
So a dot product reduces the order or rank of a tensor by one. The

dot-product between a tensor and two vectors gives a scalar, like
A · F · B = Ai ei · Fjk e j ek · Bl el = Ai Fjk Bl δij δkl = Ai Fik Bk .
This dot product can also be written as
A · F · B ≡ BA : F ≡ F : BA.
A special second-order tensor is the unit tensor I, which has the

property that
I · A = A · I = A,
I : AB = A · B,
∇ · ( ϕI) = ∇ ϕ,
I : ∇ B = ∇· B.
Here are two examples involving tensors.

Example 1: Prove that
∇· ( f g × r ) = [∇· ( f g )] × r + g × f .
Solution:
∂ ∂
∇· ( f g × r ) = ei · f ( g × r )l ek el = [ f ( g × r )l ] el
∂xi k ∂xi i
∂
= elkm ( f g rm )el
∂xi i k
∂ ∂
= elkm ( f i gk )rm el + elkm f i gk (r m ) e l
∂xi ∂xi
= elkm [∇· ( f g )]k rm el + elkm gk ( f · ∇r )m el
= [∇· ( f g )] × r + g × ( f · I)
= [∇· ( f g )] × r + g × f
Example 2: Prove that
∇· ( f gh) = (∇ · f ) gh + ( f · ∇ g )h + g ( f · ∇h).
Solution:
1. ∇· ( f gh) = ∇ f · ( f gh) + ∇ g · ( f gh) + ∇h · ( f gh)
2. ∇ f · ( f gh) = (∇ f · f ) gh
∇ g · ( f gh) = (∇ g · f ) gh = ( f · ∇ g ) gh = ( f · ∇ g g )h
∇h · ( f gh) = (∇h · f ) gh = ( f · ∇h ) gh = g ( f · ∇h h)
3. ∇· ( f gh) = (∇ · f ) gh + ( f · ∇ g )h + g ( f · ∇h)
1.5 The Dirac Delta Function
1.5.1 The one-dimensional Dirac Delta Function

The 1D Dirac delta function, δ( x ), is defined as

0, if x 6= 0,
δ( x ) =
∞, if x = 0,
and
Z ∞
δ( x )dx = 1.
−∞
Strictly speaking, δ( x ) is not a function, since its value is not finite

at x = 0. It is a distribution. Note that we also have
Z b
δ( x )dx = 1,
a
as long as 0 ∈ [ a, b].
Suppose f ( x ) is a ordinary continuous function, then it immedi-
ately follows from the definition of δ( x ) that
f ( x ) δ ( x ) = f (0) δ ( x ).
Correspondingly,
Z ∞ Z ∞
f ( x )δ( x )dx = f (0) δ( x )dx = f (0).
−∞ −∞
A slightly more general form of 1D Dirac delta function is of

course δ( x − a),

0, if x 6= a,
δ( x − a) =
∞, if x = a,
and
Z ∞
δ( x − a)dx = 1.
−∞
And it is straightforward to show that

Z ∞
f ( x )δ( x − a)dx = f ( a).
−∞
We almost always think that expressions involving δ( x ) should

be used under an integral sign. In particular, two expressions
involving delta functions, D1 ( x ) and D2 ( x ), are considered equal if
Z ∞ Z ∞
f ( x ) D1 ( x )dx = f ( x ) D2 ( x )dx,
−∞ −∞
for all ordinary functions f ( x ). For example,
1
δ(kx ) = δ ( x ),
|k|
from which you can see that δ( x ) is even; i.e., δ(− x ) = δ( x ). An-
other frequently used expression is
δ ( x − xi )
δ( g( x )) = ∑ | g0 ( xi )|
,
i
where g( x ) is a continuously differentiable function and xi are all

roots of g( x ) = 0 and of course g0 ( x ) 6= 0.
1.5.2 The three-dimensional Dirac delta function

The generation of δ( x ) to 3D is very easy.
δ (r ) = δ ( x ) δ ( y ) δ ( z ),
where r = xe x + yey + zez is the displacement vector. You see that


0, if r 6= 0,
δ (r ) =
∞, if r = 0,
and
Z
δ(r )dV = 1.
all space
Of course, you can have

Z ∞
f (r )δ(r − a)dr = f ( a).
−∞
Questions: Now, try to express the charge density ρ( x) of an

ideal point charge q located at x0 using the Dirac delta function.
The charge density satisfies,
Z
ρdV = q.
1.6 Some useful references about vector analysis
Note: See the course homepage for links to these documents.
1. Curvilinear coordinates and tensors, Chapters 1,2,and 3, Flux and

Coordinates and Magnetic Field Structure, by D’haeseleer, HItchon,
Callen, and Shohet.
2. 向量张量运算符号法胡友秋，电动力学讲义 (课程主页上可以下

载).
3. 张量分析，黄克智，薛明德，陆明万
2
Foundations of Theory of Relativity
2.1 The principle of relativity
In Newtonian mechanics, the Galileo’s principle of relativity (PR)

is that the laws of mechanics are identical in all interval systems
of reference under Galileo transformations. Suppose there are two
inertial reference frames K and K 0 ; K 0 moves with V relative to K.
Then
r = r 0 + V t0 , (2.1)
0
t=t. (2.2)
Note that time is absolute in classical mechanics.

It can be shown, however, that Maxwell equations do not satisfy
Galileo’s principle of relativity under Galileo transformation 1 . 1
Try to prove this for Maxwell Equa-
There are three possible ways to solve this problem. tions in vacuum.
1. The Maxwell equations were wrong. The proper theory of EM

was invariant under Galileo transformations.
2. Galileo PR only applied to classical mechanics. EM is not mechan-

ics.
3. There exists a general principle of relativity for both classical

mechanics and electromagnetism.
Einstein did some really hard thinking and chose 3. He then
propose the following two postulates based on experiments done by
other people and lots of his own thinking.
The principle of relativity. All the laws of nature are identical

in all inertial systems of reference.
The constancy of the speed of light. The speed of light (c) is
independent of the motion of its source; its numerical value
is c = 2.998 × 1010 cm/s.
It can be immediately shown that time being absolute is not

consistent with Einstein’s PR. From Galileo’s PR, the velocity trans-
forms like
v = v0 + V . (2.3)
This equation directly follows from that ∆t being invariant. How-

ever, this leads to that v can be larger than c, not consistent with
Einstein PR.
2.2 Intervals in spacetime
2.2.1 Definition of interval

For convenience of presentation, we’ll first introduce a few con-
cepts.
Event: An event is described by the place it occurred and the

time when it occurred.
We also introduce a fictitious four-dimensional space; marked

by 3 space coordinates and 1 time coordinates. For an idealized
particle, an event is defined by three coordinates and the time when
the event occurs.
We consider two inertial reference frames K and K 0 , with parallel
axes. Suppose K 0 moves in V relative to K. Now define two events
in K system.
• Event 1: sending out a light signal from ( x1 , y1 , z1 ) at t1 .
• Event 2: the arrival of the signal at ( x2 , y2 , z2 ) at t2 .
The signal traveled c∆t, or (∆x2 + ∆y2 + ∆z2 )1/2 , with definition
∆ f = f 2 − f 1 . So we have
c2 ∆t2 − (∆x2 + ∆y2 + ∆z2 ) = 0 (2.4)
In K 0 , noting that time is not absolute, the coordinates of the

same events are
• Event 1, ( x10 , y10 , z10 ) and t10
• Event 2, ( x20 , y20 , z20 ) and t20
Because of the constancy of light speed c, we have in K 0 system
c2 ∆t02 − (∆x 02 + ∆y02 + ∆z02 ) = 0. (2.5)
If x1 y1 z1 t1 and x2 y2 z2 t2 are the coordinates of any two

events, we define ∆s by
∆s = [c2 ∆t2 − ∆x2 − ∆y2 − ∆z2 ]1/2 , (2.6)
and call it the interval between these two events.
2.2.2 The invariance of interval

From the previous analysis, we reach an important conclusion that
if ∆s = 0 in K, then ∆s0 = 0 in any K 0 .
foundations of theory of relativity 21
To find the relationship between ∆s and ∆s0 for ∆s 6= 0, we

consider two events infinitely close to each other. In this case, the
interval ds is
ds = [c2 dt2 − dx2 − dy2 − dz2 ]1/2 . (2.7)
The form of ds allows us to regard it as the distance between two

world points in the fictitious four-dimensional space. This space is
called Minkowski space (axes: x, y, z, and ct), it’s pseudo-Euclidean.
If Euclidean, the distance would be
∆l = [c2 ∆t2 + ∆x2 + ∆y2 + ∆z2 ]1/2 . (2.8)
Now the question: What’s the relationship between ds in K and

ds0 in K 0 in general (valid for ds 6= 0)? We have two constraints:
1. If ds = 0, then ds0 = 0.
2. ds and ds0 are infinitesimal of the same order.
From these two conditions, we have
ds = ads0 (2.9)
We now derive the factor a in ds = ads0 . For an inertial reference

frame, space and time are homogeneous.
• If a = a(t), Equation (2.9) violates that time being homogeneous.
• If a = a( X ), Eq (2.9) depends on the location of the origin of K 0

in K; it is a violation of space being homogeneous.
• If a = a(V ), that means Eq (2.9) depends on the direction of K 0

moving in K; a violation of space being homogeneous.
Therefore we conclude a = a(|V |) or a = a(V ), and ds = a(V )ds0 .

To find the value of a, now suppose we have three inertial ref-
erence frames K, K1 , K2 , V 1 and V 2 are the velocities of K1 and K2
relative to K, respectively. Then we have
ds = a(V1 )ds1 , (2.10)

ds = a(V2 )ds2 , (2.11)
ds1 = a(V12 )ds2 . (2.12)
From these three equations, we have
a(V2 )
= a(V12 ). (2.13)
a(V1 )
Note that V12 depends on the relative angle between V1 and V2 , but
the left hand side of the above equation does not depends on this
angle. Therefore, we can conclude that a(V12 ) must be a constant.
And it is easy to see that this constant is 1. Therefore,
ds = ds0 OR ds2 = ds02 . (2.14)

The interval between two events is invariant under transforma-

tion from one inertial frame to another. This is the mathematical
formulation of the invariance of c.
From the invariance of ds, we can immediately reach the follow-
ing conclusion: If a particle moves with |v| < c in K, then |v0 | < c
in all other K 0 , because ds2 = c2 dt2 − dx2 = (c2 − |v|2 )dt2 is an
invariant.
2.2.3 Space-like and time-like intervals

With the invariance of ds, time is no longer absolute. Statements
like “two events occur simultaneously” do not necessarily hold if
we transform to another reference frame. Let’s now discuss this
type of problem.
Our first question is, if two events occur at two different times
in K (∆t 6= 0), can we find a reference frame K 0 in which ∆t0 = 0?
Suppose we can find a K 0 so that ∆t0 = 0. From the invariance of
∆s2 , we have
∆s2 = c2 ∆t2 − ∆x2 = −∆x02 < 0. (2.15)
Hence if ∆s2 < 0 or if ∆s is imaginary, it is possible to find a

reference frame where ∆t0 = 0. Imaginary intervals are said to be
space-like. For space-like intervals, the concepts of “simultaneous”,
“earlier”, and “later” are relative. Note that c2 ∆t2 < ∆x2 , meaning
that the two events are so separated that no signal can propagate
from one point to the other point within ∆t.
Following the previous question, it is natural to ask another one:
if two events occur at two different times in K (∆t 6= 0), what’s the
condition for ∆t0 6= 0 in all K 0 ? From the invariance of ∆s2 , we have
∆s2 = c2 ∆t2 − ∆x2 = c2 ∆t02 − ∆x02 . (2.16)
The minimum value for c2 ∆t02 to take is when ∆x02 = 0, and
∆s2 = c2 ∆t2 − ∆x2 = c2 ∆t02 > 0. (2.17)
Hence if ∆s2 > 0 or if ∆s is real, it is not possible to find a K 0 so that

∆t0 = 0. Note that c2 ∆t2 > ∆x2 . Real intervals are said to be time-like.
Note that for time-like intervals, it is possible to connect two events
using a signal with propagation speed less than c, since ∆x/∆t < c.
For time-like intervals, the concepts of “future” and “past” are
absolute. To see this, let’s assume the interval between event 1 and
event 2 is time-like, then
∆s2 = c2 ∆t2 − ∆x2 = c2 ∆t02 − ∆x02 > 0. (2.18)
Solving for ∆t0 from this equation, we have
∆t0 = ±|∆t|(1 − (∆x2 − ∆x02 )/c2 ∆t2 )1/2 . (2.19)
Which sign should we take for ∆t0 ? To see this, we know that
∆t0 → ∆t as V → 0, where V is the velocity of K 0 relative to K.
Hence
∆t0 = ∆t(1 − (∆x2 − ∆x02 )/c2 ∆t2 )1/2 . (2.20)
Therefore for time-like intervals, we must have

1. If ∆t > 0, then ∆t0 > 0 in all K 0 .
2. If ∆t < 0, then ∆t0 < 0 in all K 0 .

That is, the concepts of “future” or “past” are absolute for time-like
intervals.
The concept of time-like or space-like intervals are important
if two events are causally related. For event 1 to be the reason
of event 2, event 1 must occur “before” event 2 in all reference
frames; i.e., event 1 is in the “absolute past” of event 2. Therefore
the interval must be time-like.
On the other hand, if event 1 is the reason of event 2, a signal
has to propagate from event 1 to event 2. The propagation speed
of signal is then ∆x2 /∆t2 , which is less than c2 , since the interval is
time-like. This is the same statement as that c is the maximum speed
of propagation of interaction.
Finally we point out that because ds is an absolute concept, the
time-like or space-like property of an interval is also absolute.
2.3 Proper time
Since time is not absolute, it is convenient to introduce the concept

of “proper time”. Proper time is the time read by a clock moving with
the object; it is normally denoted by τ.
Let’s now derive the relationship between dt observed in labora-
tory frame K and the proper time dτ of the object. In the laboratory
frame K, the object moves with a constant velocity v. Let the iner-
tial reference frame moving with the object be called K 0 . From the
invariance of ds2 and dx0 = 0 (because the object is still in K 0 ), we
have
ds2 = c2 dτ 2 = c2 dt2 − dx2 . (2.21)
From this equation,
dτ = ds/c = dt (1 − dx2 /c2 dt2 )1/2 . (2.22)
But dx/dt = v, therefore we have
dτ = dt (1 − v2 /c2 )1/2 ≡ dt/γ, (2.23)
where γ = 1/(1 − β2 )1/2 with β = v/c; β ≤ 1 and γ ≥ 1. Note that

dτ = dt/γ; the fact that γ ≥ 1 leads to
dτ ≤ dt. (2.24)
Or by integrating this equation, we have

Z t2
τ2 − τ1 = dt/γ < t2 − t1 (2.25)
t1
Conclusion: Moving clocks go more slowly than those at

rest.
There are lots of experiments proving this conclusion. Here I list

a few:
1. The lifetime of mu-mesons (muons)2 . 2

Ref: Feynman’s Lecture on Physics,
Average lifetime of muons is 2.2 × 106 or about 2 µs. If they travel Vol 1, 15-4.
at v ≈ c, the Galilean distance ≈ 600 m. Muons can travel from

the top of atmosphere to the surface of Earth.
2. A similar example: The lifetime of pions 3 . 3

Ref: Jackson, Classical Electrodynam-
ics, 2nd Edition, pp 520.
On the other hand, the concept is new and confusing. See the
following two paradoxes:
1. If K 0 moves relative to K in V , then dt0 = dt/γ. From K 0 , how-

ever, K moves relative to K 0 in −V . then dt = dt0 /γ. How to
understand this? 4 . 4
Ref: 郭硕鸿，电动力学，第三版,
第204页
2. The twin paradox 5 : Identical twins; one of them makes a jour- 5
Ref: Feynman’s Lectures on Physics,
ney into space in a high-speed spaceship and returns home. Vol 1, 16-2
Which twin should have aged more?
2.4 The Lorentz transformation
With Galileo PR, r = r 0 + V t, t = t0 , and we know this is not correct

in general. Now time is not absolute, how do we make coordinate
transformations? Again we consider two IRF’s K and K 0 ; K 0 moves
relative to K in V . For an event with coordinate (t, r ), what’s the
corresponding coordinate (t0 , r 0 ) in K 0 ?
First, for simplicity, we assume that the two coordinate system
have the same origin; i.e., t = 0, r = 0 in K corresponds to the point
t0 = 0, r 0 = 0 in K 0 . Second, the basic requirement we have is that
the transformation should not change any ∆s, the distance between
any two points in Minkowski space. The ∆s in K is
∆s2 = c2 ∆t2 − ∆r2 . (2.26)
To make sure that changing coordinate system does not change ∆s2 ,
we make use of the fact that
cosh2 θ − sinh2 θ = 1, (2.27)
for any θ. Therefore, we can assume
c∆t = ∆s cosh θ,
∆r = ∆s sinh θ.
in K and in K 0 ,
c∆t0 = ∆s cosh θ 0 ,
∆r 0 = ∆s sinh θ 0 .
By parametering c∆t and ∆r using cosh and sinh, we can always

make sure that ∆s is an invariant in any coordinate system.
Now we try to find out what is the relation of coordiantes be-
tween K and K 0 . Let’s assume θ = θ 0 + φ. Noting c∆t0 = ∆s cosh θ 0
and ∆r 0 = ∆s sinh θ 0 , we have
c∆t = ∆s cosh(θ 0 + φ) = ∆s cosh θ 0 cosh φ + ∆s sinh θ 0 sinh φ

= c∆t0 cosh φ + ∆x 0 sinh φ,
∆r = ∆s sinh(θ 0 + φ) = ∆s sinh θ 0 cosh φ + ∆s cosh θ 0 sinh φ

= ∆r 0 cosh φ + c∆t0 sinh φ.
We rewrite the equations as
∆r = ∆r 0 cosh φ + c∆t0 sinh φ, (2.28)

c∆t = ∆r 0 sinh φ + c∆t0 cosh φ. (2.29)
These are the needed equations that relate (ct0 , r 0 ) and (ct, r ).
To obtain coordinate transformation, let’s consider the interval
between any event and the origin of the coordinate system. From
Equations (2.28) and (2.29), we have
r = r 0 cosh φ + ct0 sinh φ, (2.30)

0 0
ct = r sinh φ + ct cosh φ. (2.31)
Starting from the origin of the K 0 system, we have r 0 = 0. The

rotation equations are then
r = ct0 sinh φ, (2.32)

0
ct = ct cosh φ. (2.33)
We thus have V/c = tanh φ with V = r/t, the speed of the origin
of the K 0 system observed in K. For simplicity of discussion, let
assume that V is in the direction of e x ; i.e., K 0 moves relative to K
with V = Ve x , then r = x, y = y0 = 0, z = z0 = 0, Vy = Vz = 0. Now
from V/c = tanh φ = (eφ − e−φ )/(eφ + e−φ ), we have
r
φ V+c
e = . (2.34)
V−c
Therefore
eφ − e−φ V/c
sinh φ = = √ ≡ βγ (2.35)
2 1 − V 2 /c2
eφ + e−φ 1
cosh φ = = √ ≡γ (2.36)
2 1 − V 2 /c2
Substituting the expressions of sinh φ and cosh φ into rotation
equations, we have,
x 0 + Vt0
x= √ ,
1 − V 2 /c2
t0 + (V/c2 ) x 0
t= √ .
1 − V 2 /c2
y = y0 ,
z = z0 .
These equations are the needed transformations; they are called

Lorentz transformations. 6 6
From the above derivation, you can
see that the Lorentz transformation
The inverse transformation can be obtained by noting that K
corresponds to the rotation in the 4D
moves relative to K 0 in −V . Therefore we have Minkowski space involving the t-axis.
x − Vt For example, the above transformation
x0 = √ , (2.37) for V = Ve x corresponds to the
1 − V 2 /c2 rotation in the 4D space within the xt
t − (V/c2 ) x plane.
t0 = √ . (2.38)
1 − V 2 /c2
This inverse Lorentz transformation can also be directly obtained by
solving for (t0 , r 0 ) from previous equations.
If V/c 1 or equivalently c → ∞, (1 − V 2 /c2 )1/2 → 1. The
Lorentz transformation is reduced to the Galileo transformation,
x = x 0 + Vt0 , (2.39)
0
t=t. (2.40)
This means that classical mechanics work well because c V,

where V means the characteristic speed of our daily life.
From the Lorentz transformation, we derive the proper length,
which is the length of an object in an inertial reference frame in
which it is at rest. We denote proper length by l0 . Suppose we have
a rod parallel to the x 0 -axis; it’s at rest in K 0 system. In K 0 system,
the length of the rod is l0 = | x10 − x20 |. In K system, to measure the
length of the rod, we have x1 at t1 and x2 at t2 , and t1 = t2 . From
the inverse Lorentz transformation, we have,
x − Vt1
x10 = √ 1 , (2.41)
1 − V 2 /c2
x2 − Vt2
x20 = √ . (2.42)
1 − V 2 /c2
Noting t1 = t2 , we have
x1 − x2
x10 − x20 = √ . (2.43)
1 − V 2 /c2
√ √
Or l0 = l/ 1 − V 2 /c2 or l = l0 1 − V 2 /c2 . Since l < l0 , this is
called Lorentz contraction.
Similarly, we can introduce proper volume, which is the volume
of an object in an IRF in which it is at rest. Since transverse dimen-
sions do not change, we can immediately have
p
V = V0 1 − V 2 /c2 , (2.44)
where V0 is the proper volume of the object.

Using the Lorentz transformation, we can re-derive the equation
for proper time. Suppose a clock to be at rest in K 0 , two events
occurred at r 0 at t10 and t20 . Then ∆τ = t20 − t10 and in K,
t10 + (V/c2 ) x 0
t1 = √ , (2.45)
1 − V 2 /c2
t20 + (V/c2 ) x 0
t2 = √ . (2.46)
1 − V 2 /c2
√
Therefore, ∆t = ∆τ/ 1 − V 2 /c2 , the same as we obtained before.
2.5 Transformation of velocities
To completely determine the state of a particle, we need both its

space coordinates and velocity. From the Lorentz transformation,
we can easily derive the needed formulas for the transformation of
velocities.
Suppose a particle has a velocity v in IRF K and v0 in IRF K 0 ;
0
K moves with V = Ve x relative to K. The differentiation of the
Lorentz transformation gives
p
dx = (dx 0 + Vdt0 )/ 1 − V 2 /c2 , (2.47)
dy = dy0 , (2.48)
0
dz = dz , (2.49)
p
dt = (dt0 + V/c2 dx 0 )/ 1 − V 2 /c2 . (2.50)
The velocity transformation can be obtained easily as v = dr/dt,

v0 = dr 0 /dt0 , etc.
The resulting velocity transformations are
dx 0 + Vdt0 v0x + V
vx = = (2.51)
dt0 + (V/c2 )dx 0 1 + (V/c2 )v0x
0
r
vy V2
vy = 2 0
1− 2 (2.52)
1 + (V/c )v x c
r
0
vz V2
vz = 1 − (2.53)
1 + (V/c2 )v0x c2
In case of c → ∞, the above equations recovers the familiar Galileo

velocity transformation.
2.6 Vectors and Tensors in Spacetime
Minkowski once said: “Space of itself, and time of itself will sink into
mere shadows, and only a kind of union between them shall survive.”
Indeed, from previous introductions, in the framework of the
special theory of relativity, time and space are put on equal footing:
both time and space are not absolute and are transformed together
from one reference system to another reference system.
2.6.1 Four-vectors
It’s convenient to discuss time and space together in terms of
vectors in a pseudo-Euclidean Minkowski space. For example,
the coordinates of an event (ct, x, y, z) can be considered as the
components of a radius four-vector,
x0 = ct, x1 = x, x2 = y, x3 = z. (2.54)
The radius four-vector satisfies the Lorentz transformation when

changing reference system from K 0 to K. Note that with this repre-
sentation, the Lorentz transformation can be written as (easier to
remember)
0 0
x0 = γ( x0 + βx1 ), (2.55)
10 00
x1 = γ( x + βx ), (2.56)
√
where γ = 1/ 1 − V 2 /c2 , and β = V/c.
Like the case in 3D, we model four-vectors after the radius four-
vector.
Definition: any set of four quantities A0 , A1 , A2 , A3 , which

transform like the radius four-vector x α under the change
from K 0 to K is called a four-vector.
This definition essentially means that when changing from K 0 to

K, all four-vectors are transformed using Lorentz transformation.
Therefore for K 0 with axes parallel to K and moves relative to K in
V = Ve x ,
A00 + (V/c) A01

A0 = p , (2.57)
1 − (V/c)2
A01 + (V/c) A00
A1 = p , (2.58)
1 − (V/c)2
and A2 = A02 and A3 = A03 . The component A0 is called the time

component, and A1 , A2 , A3 the space components. We sometimes write
the four-vector as
A α = ( A0 , A ), (2.59)
where A is a three-dimensional vector. In this course, we use Greek

letters (α, β, · · · ) to represent 0, 1, 2, 3, and use Latin letters (i, j, · · · )
to represent 1, 2, 3. Also we use prefix four- to specifically denote
variables in the Minkowski space. Ordinary three-dimensional
variables will just be called scalars, vectors, or tensors.
The square magnitude of the radius four-vector x α is defined as
( x 0 )2 − ( x 1 )2 − ( x 2 )2 − ( x 3 )2 = s2 , (2.60)
which is the square of s, the spacetime interval (the magnitude

of the radius four-vector in Minkowski space). Similarly, the square
magnitude of an arbitrary four-vector Aα is
( A0 )2 − ( A1 )2 − ( A2 )2 − ( A3 )2 . (2.61)
To more conveniently represent the dot-product between four-

vectors, we introduce two kinds of components of four-vectors:
1. The contravariant (逆变) components, denoted by Aα .
2. The covariant (协变) components, denoted by Aα .
so that Aα and Aα are related by
A0 = A0 , A1 = − A1 , A2 = − A2 , A3 = − A3 . (2.62)
By using contravariant and covariant components, the square

magnitude of a four-vector Aα can now be compactly written as
3
∑ A α A α = A0 A0 + A1 A1 + A2 A2 + A3 A3 ≡ A α A α . (2.63)
α =0
The dot-product between two four-vectors Aα and Bα is Aα Bα or

Aα Bα . In these expressions, we have used the Einstein summation
rule. Another example, the square magnitude of the radius four-
vector, the interval, can be written as s2 = xα x α . Note the locations
of indices in these expressions.
There are simple ways to memorize the location of sub/super
scripts:
1. 中文：上逆下协
2. English: There is a “r” in both “contra-” (contra-variant) and

“super-” (super-script).
To convert between the contravariant and the covariant com-

ponents, one needs to use the metric coefficients gαβ . From the
definitions of contra- and co-variant components above, you should
be able to see that for the four-vectors in Minkowski space,
 
1 0 0 0
0 −1 0 0 
( gαβ ) = ( gαβ ) =  . (2.64)
 
0 0 −1 0 
0 0 0 −1
Therefore we can convert between Aα and Aα by
Aα = gαβ A β or Aα = gαβ A β . (2.65)
The metric tensor g plays an very important role in the general

theory of relativity (not covered by this course).
In your textbook, the authors represented the Minkowski space
in a different way. Let x0 = ct, x1 = ix, x2 = iy, and x3 = iz, then
s2 = ( x0 )2 + ( x1 )2 + ( x2 )2 + ( x3 )2 . (2.66)
There are two reasons why I wouldn’t use this in my class:
1. Hard to understand why one would need the contra- and co-
variant components of a vector.
2. Hard to generalize this to the General Theory of Relativity (the

metric g, curved or flat spacetime).
The concept of contra- and co-variant components of a vector is

important in using curvilinear coordinates. Curvilinear coordinates
are very convenient in certain cases (like fusion and space plasmas,
general relativity, etc). For more information about curvilinear
coordinates, see my reference list near the end of the previous
Chapter.
Here are a few rules regarding contra- and co-variant compo-
nents of four-vectors.
1. The dot product between two vectors is Aα Bα or Aα Bα .
2. The squared length of Aα is Aα Aα or Aα Aα .
3. You can convert between Aα and Aα by Aα = gαβ A β and Aα =

gαβ A β .
Regarding Einstein notation of four-vectors and four-tensors,

here are two simple rules that you should keep in mind.
1. For a dummy index, one index must be superscript and one

must be subscript; e.g., Aα Bα . Terms like Aα Bα do not make
sense, as you can see from the definitions of contra and co-
variant components of a four-vector given above.
2. Equations do not change the position of a free index, like Aα =

gαβ A β . Note the location of the free index α.
2.6.2 Four-scalars
Definition: Like 3D case, we define four-scalars as invariants

under the transformation of coordinates, here the Lorentz
transformation.
Because they are invariant under Lorentz transformations, the

four-scalars play important roles in the study of special relativity
and electrodynamics. You should learn how to construct four-
scalars from four-vectors and four-tensors. For example, like the
3D case, the dot product between two four-vectors is a four-scalar.
In the Minkowski space, the dot product between two four-vectors
Aα and Bα is Aα Bα or Aα Bα , and it is a four-scalar. Note that there
is no free index in this term. Example: the interval s2 = xα x α is a
four-scalar. We can “construct” more invariants after we learn more
four-vectors.
2.7 Four-tensors
Definition: A second-order four-tensor is a set of 16 quantities

F αβ , which under coordinate transformations, transform like
the products of components of two four-vectors.
A four-tensor can be written in different forms: F αβ , Fαβ , F·αβ ,

·β
and Fα . Different positions of indices represent different kind
of components of a four-tensor. In case of a four-vector, there
are two different kinds of components (co- and contra-variant
components); in the case of a four-tensor, there are four different
kinds of components. Again one use the metric tensor g to convert
between different kinds of representations of a four-tensor. One
just needs to remember the rules of Einstein notation, and you can
immediately figure out that,
Fµν = gµα F αβ g βν (2.67)

·β
Fµ = gµα F αβ (2.68)
······ (2.69)
Noting that in the special relativity,


1

 if α = β = 0
gαβ = −1 if α = β 6= 0 (2.70)


0 otherwise

From Fµν = gµα F αβ g βν , we have the following simple rules:

1. raising or lowering index (1, 2, 3) changes the sign,
2. raising or lowering index (0) does not change the sign.
Example: F00 = F00 , F01 = − F01 , F11 = F11 , F0·0 = F00 , F0·1 = F01 ,
F0·1 = − F01 , F1·0 = − F10 , · · · · · ·
From four-vectors Aα and Bα , we know Aα Aα and Aα Bα are four-
scalars; i.e., they are invariants under Lorentz transformations. The
trick to construct 4-scalars from 4-vectors/tensors is that all indices
are dummy indices; there is no free index in the final expression.
Similarly, we can also construct four-scalars from 4-tensors. For
example, for F αβ , we know that
Fαβ F βα = inv. (2.71)
Also it’s possible to construct four-scalars from four-vectors and

four-tensors, like Aα Bβ F βα is a four-scalar, since there is no free
index in the final expression. This operation is called contraction.
2.7.1 Basic four-vector/tensor differential calculus.

Let’s now introduce four-vector calculus. The four-gradient of a
scalar φ is the four-vector

∂φ 1 ∂φ
= , ∇φ (2.72)
∂x α c ∂t
From the location of the index, we know they are the covariant
components of a four-vector. To make this more obvious, some
people write the four-gradient of φ as,
∂φ
≡ ∂α φ.
∂x α
Similarly, you might have the contra-variant components of a four-
vector,
∂φ
≡ ∂α φ.
∂xα
The 4-divergence of a 4-vector Aα = ( A0 , A) is a four-scalar,
∂Aα 1 ∂A0
= ∂α Aα = ∂α Aα = + ∇· A. (2.73)
∂x α c ∂t
It’s a scalar under coordinate transformation.
2.8 Four-velocity and four-acceleration
Using the knowledge of four vectors, we can construct the four-

velocity by uα = dx α /dτ, where dτ = dt/γ,
cdt
u0 = γ = γc, (2.74)
dt
dx
u1 = γ = γv x , (2.75)
dt
u2 = γvy , (2.76)
3
u = γvz . (2.77)
Therefore, the four-velocity is
uα = (γc, γv x , γvy , γvz ) = (γc, γv). (2.78)
The contraction of the four-velocity is, noting that dτ = ds/c,
dx α dxα ds2
uα uα = = 2 2 = c2 , (2.79)
dτdτ ds /c
or it can be calculated directly by
u α u α = γ2 ( c2 − v2 ) = c2 , (2.80)
which is a scalar and an invariant.

One can construct the four-acceleration similarly by wα =
duα /dτ, Without calculating the components of the four-acceleration,
one can show that
uα wα = 0. (2.81)
This shows that the four-acceleration is always “perpendicular” to

the four-velocity.
3
Relativistic Dynamics
3.1 A brief review of Lagrangian mechanics
We’ll use some Lagrangian mechanics in this class, so let me first

briefly give it a review, using materials from Landau’s book “Me-
chanics”.
In classical mechanics, the action is defined by the integral
Z t2
S= L(q, q̇, t)dt. (3.1)
t1
Suppose q(t1 ) = q(1) and q(t2 ) = q(2) , there could be infinite

number of paths connecting t1 and t2 . It is required that the actual
path minimizes S; i.e.,
δS = 0. (3.2)
This is called the principle of least action, with L the Lagrangian.

To obtain equations of motion from the principle of least action,
note that if q = q(t) is the actual path, then for a given variation of
q,
q(t) + δq(t), (3.3)
the change in S is
Z t2 Z t2
δS = L(q + δq, q̇ + δq̇, t)dt − L(q, q̇, t)dt = 0. (3.4)
t1 t1
The constraints we have are
δq(t1 ) = 0, δq(t2 ) = 0, (3.5)
since q(t1 ) = q(1) and q(t2 ) = q(2) are given.

Keeping only the leading (first-order) terms in δS, we have
Z t2
∂L ∂L
δq + δq̇ dt = 0. (3.6)
t1 ∂q ∂q̇
Noting that δq̇ is, because of the variation in q,
d d d
δq̇ = (q + δq) − (q) = δq. (3.7)
dt dt dt
Therefore,
t2 Z t
∂L 2 ∂L d ∂L
δS = δq + − δqdt = 0. (3.8)
∂q̇ t1 t1 ∂q dt ∂q̇
Because δq(t1 ) = 0 and δq(t2 ) = 0, we have

Z t2
∂L d ∂L
δS = − δqdt = 0, (3.9)
t1 ∂q dt ∂q̇
Since δS must vanish for all δq, we have
∂L d ∂L
− = 0, (3.10)
∂q dt ∂q̇
these equations are called Lagrange’s equations. If we know L of a
system, then we can write down its equations of motion. Note that
the Lagrangian is not uniquely determined for a given system. For
if we have two Lagrangians L0 (q, q̇, t) and L(q, q̇, t), and
d
L0 = L + f (q, t), (3.11)
dt
then
Z t2 Z t2 Z t2
df
S0 = L0 dt = Ldt + dt = S + f (q(2) , t) − f (q(1) , t).
t1 t1 t1 dt
(3.12)
Therefore δS0 and δS would lead to the same equations of motion.

Now we apply the concept to solving a free body problem in
classical mechanics. To consider mechanical problems, it’s neces-
sary to choose a reference frame. In principle, we can choose any
reference system we like. However if we choose one where time
or space is inhomogeneous, then the description of nature laws
would become unnecessarily difficult. The simplest one is thus the
so-called inertial reference frame so that space is homogeneous and
isotropic,and time is homogeneous. In such a frame, a free body
would move with a constant velocity or is at rest forever.
Let’s find the the Lagrangian for a free particle in an inertial
reference system. Note that time is homogeneous so L does not
depend on t. Space is homogeneous so L does not depend on
r. Therefore, L = L(v). Space is isotropic, therefore L must not
depend on the direction of v. Conclusion: L = L(|v|) or L = L(v2 ).
Applying the Lagrange’s equation (3.10), noting that ∂L/∂v must be
a function of v only, the equation of motion is
v̇ = 0. (3.13)
This equation states that a free body in an inertial reference frame

moves with a constant velocity. Of course, this velocity can equal 0,
which means the free body is at rest all the time. This is the laws of
inertia.
To find the form of L(v2 ), we need to use Galileo’s principle of
relativity. Consider two IRF’s K and K 0 , K moves in e relative to K 0 .
v0 = v + e. (3.14)
relativistic dynamics 35
Because of Galileo’s PR,
L0 = L(v02 ) = L(v2 + 2v · e + e2 ), (3.15)
the leading terms of the L0 are

∂L
L ( v 02 ) = L ( v2 ) + 2v · e (3.16)
∂v2
We know from Galileo’s PR that L0 and L should lead to the same
equations of motion, therefore,
d
L0 = L + f (r, t), (3.17)
dt
where f (q, t) is some function of r and t, Hence
∂L d ∂f dr
2
2v · e = f (r, t) = + · ∇ f (r, t). (3.18)
∂v dt ∂t dt
Because v = dr/dt, we must have
∂L
2e = ∇ f (r, t). (3.19)
∂v2
The LHS is a function of v only, while the RHS is a function of t
and r only; therefore
∂L
= α, (3.20)
∂v2
where α is some constant. From
∂L
= α, (3.21)
∂v2
we have for a free particle
L = αv2 . (3.22)
Normally we write α = m/2, where m is called mass, and

1 2
L= mv . (3.23)
2
From previous derivations, we have learned that either dS

is an invariant under Galileo transformation, or dS and dS0
differ by a d f (r, t). We will use this property of action in rel-
ativistic dynamics. Of course, there the action is an invariant
under the Lorentz transformation (a four-scalar)
The Lagrangian for a particle moving in a given external field

can be obtained by adding to L = mv2 /2 a quantity describing the
interaction between the field and the particle,
1 2
L= mv − U (r, t), (3.24)
2
so that the equations of motion become
dv
m = −∇U. (3.25)
dt
Note that now the total action S becomes

Z t2
1 2
S = Sp + S f = mv − U (r, t) dt. (3.26)
t1 2
Using Lagrangian mechanics, it is very convenient to discuss

conservation properties. Here we discuss the conserved quantities
from the isotropy and homogeneity of space and time.
If time is homogeneous, then for a closed system L = L(qi , q˙i ).
The total time derivative of L is

dL ∂L dq˙i ∂L d ∂L dq˙ ∂L
= q̇i + = q̇i + i (3.27)
dt ∂qi dt ∂q̇i dt ∂q̇i dt ∂q̇i

d ∂L
= q̇i (3.28)
dt ∂q̇i

d ∂L
⇒ q̇i −L =0 (3.29)
dt ∂q̇i
We define total energy E of the system by
∂L
E ≡ q̇i − L. (3.30)
∂q̇i
Therefore, if time is homogeneous, E is a constant.
Equation (3.30) defines the energy of the system. If, however,

we consider a free particle, then it is the definition of energy
for the particle. We will use this to define particle energy in
relativistic dynamics.
If space is homogeneous; i.e., if we replace r i by r i + e, then
∂L ∂L
δL = ∑ ∂ri · δri = e · ∑ ∂ri = 0. (3.31)
i i
Therefore
∂L
∑ ∂ri = 0. (3.32)
i
From Lagrange’s equation,
d ∂L d ∂L
∑ dt ∂vi =
dt ∑ ∂vi = 0. (3.33)
i i
If space is homogeneous, for a closed system,
∂L
P≡ ∑ ∂vi (3.34)
i
remains constant. This is called the momentum of the system
Similarly, we will use this to define the momentum of a free

particle in relativistic dynamics.
3.2 Relativistic action for a free particle
I’ll now apply what I’ve introduced to obtain the relativistic action
for a free particle,
Z b
S= dS. (3.35)
a
The relativistic action must satisfy
1. dS is invariant under Lorentz transformation, or differed by a

d f ( x α ) when transformed from K 0 to K.
2. It should recover non-relativistic dynamics if c → ∞ (v/c → 0).
From the previous chapter, we know that for a free particle, the
event interval ds is an invariant. Try
Z b
S= αds, (3.36)
a
where α is a 4-scalar. Let’s see whether S can recover non-relativistic

dynamics if c → ∞. If it can, then it suits our need and can be used
as the relativistic action for a free particle.
To recover the non-relativistic dynamics, we first write S in
R
the normal 3D form; i.e., S = Ldt form. Because ds = cdτ =
√
c 1 − v2 /c2 dt, we write
Z b r
v2
Z b
S= αds = αc 1 − 2 dt. (3.37)
a a c
From this form of S, we have the Lagrangian
r
v2
L = αc 1 − 2 . (3.38)
c
In case of c → ∞, we have
αv2
L = αc − . (3.39)
2c
Therefore, if α = −mc, then S satisfies the two constraints.
The relativistic action for a free particle is then
Z b
S = −mc ds. (3.40)
a
And the corresponding Lagrangian is

r
2 v2
L = −mc 1− , (3.41)
c2
√
or L = −mc2 /γ, with γ = 1/ 1 − v2 /c2 .
3.3 Energy and momentum
The momentum of a particle is given by

∂L
p= . (3.42)
∂v
√
Using L = −mc2 1 − v2 /c2 , we have
− 12
v2

1 2 1
p = −mc 1− 2 − 2v,
2 c c2
or
mv
p= √ = γmv.
1 − v2 /c2
If c → ∞, p = mv, the non-relativistic (NR) momentum.

The energy of a particle is defined from L by
∂L
E = v· − L = p · v − L. (3.43)
∂v
Using p = γmv, we have
mc2
E= √ = γmc2 (3.44)
1 − v2 /c2
If v → 0, E → mc2 , the rest energy of the particle. In case of c → ∞,

E = mc2 + 12 mv2 .
This definition of energy leads to that mass is not a conserved
quantity anymore in relativistic mechanics. If we have a body, with
mass m, consisting of n particles. In classical mechanics, we have
m = ∑i mi . In relativistic mechanics, however, the energy of a body
at least contains
1. the rest energies of its constituent particles ∑i mi c2
2. the kinetic energy of particles
3. the interaction energy
4. · · · · · ·
therefore, mc2 6= ∑i mi c2 and m 6= ∑i mi . Only the law of conservation

of energy holds.
The energy and momentum are closely related. Note that the
4-velocity we introduced is uα = (γc, γv). Multiplying uα by m, a
4-scalar, we have the four-momentum vector
E

α α
p = mu = (γmc, γmv) = ,p . (3.45)
c
Since pα is a four-vector, we can apply the knowledge we know
about four-vector. First, we can immediately obtain their transfor-
mation equation, since any four-vector transforms like the radius
four-vector x α ; i.e.,
E 0 + V p0x
E= √ , (3.46)
1 − V 2 /c2
p0x + (V/c2 )E 0
px = √ , (3.47)
1 − V 2 /c2
py = p0y , (3.48)
pz = p0z . (3.49)
Here ( p x , py , pz ) = p. Also the dot product of a four-vector with

itself is a Lorentz invariant. Since four-momentum pα = muα , we
immediately have
p α p α = m2 c2 . (3.50)
Substituting pα = (E /c, − p) and pα = (E /c, p), we have
E2
− p2 = m2 c2 , (3.51)
c2
or
E
E 2 = p2 c2 + m2 c4 , and p = v. (3.52)
c2
this is the energy-momentum relation. If v → c, both p and E
become infinite unless m → 0. Actually, for a photon (m = 0), we
have
E
p= . (3.53)
c
This equation is also useful for the case E E0 , then p ≈ E /c.
3.4 The Covariant Equation of Motion
We can derive the equations of motion in covariant form by viewing

S as a function in the Minkowski space 1 , 1
Question: What’s the benefit of doing
Z b Z bp this?
S = −mc ds = −mc dxα dx α . (3.54)

a a
Then if we vary the actual path x α by δx α ,

Z b Z bp
δS = −mc δ ds = −mc δ dxα dx α = 0, (3.55)
a a
from the principle of least action.

√
Let’s first calculate δ dxα dx α .
p 1 1
δ dxα dx α = δ(dxα dx α ) = (δdxα dx α + dxα δdx α ) (3.56)
2ds 2ds
Note that we can switch the upper- and lower- indices,
δdxα dx α = δdx α dxα . (3.57)

√
Therefore, δ dxα dx α = dxα δdx α /ds. Noting that ds = cdτ,
dxα /ds = uα /c, and
p
δ dxα dx α = uα δdx α /c = uα dδx α /c. (3.58)
√
With δ dxα dx α = uα dδx α /c, we have
Z b Z b
duα
δS = −m uα dδx α = −muα δx α |ba + m δx α ds. (3.59)
a a ds
If we want to find the actual path, then we fix x α ( a) and x α (b),
duα
δS = 0 ⇒ =0 (3.60)
ds
or the four-velocity uα is constant.
4
Charges in a Given Electromagnetic Field
4.1 Four-potentials of a field
As in classical mechanics, the total action for a particle in a given

field is
S = Sp + Sp f (4.1)
R
Here S p = −mc ds is the action for the particle, and S p f is the
action characterizing the interaction. Experiments suggest that
the interaction is determined by the charge q, and the four-vector
potential Aα characterizing the field.
To make the integral S p f a Lorentz invariant, we can construct
Z b
S p f = −q Aα dx α , (4.2)
a
where Aα = (φ/c, A) or Aα = (φ/c, − A). Therefore the total action

is
Z b
S= (−mcds − qAα dx α ) . (4.3)
a
In 3D coordinates,
Z b
Sp f = (q A · dr − qφdt) (4.4)
a
Z b
= (q A · v − qφ) dt, (4.5)
a
and
mc2
Z t2
S= − + q A · v − qφ dt. (4.6)
t1 γ
The total Lagrangian is 1 1
Find the Hamiltonian and the
canonoical momentum ¶ using this
mc2 Lagrangian.
L= − + q( A · v − φ) . (4.7)
γ | {z }
| {z } interaction with the field
free particle
4.2 Equations of motion of a charge in a field
We derive here the equations of motion of a charge. For simplicity,

we assume that the charge is small, so its effects on the field can be
charges in a given electromagnetic field 41
neglected. The field potential do not depend on the position and

velocity of the charge. The Lagrange’s equation is

d ∂L ∂L
− =0 (4.8)
dt ∂v ∂r
From the total Lagrangian,
∂L
= ∇ L = q∇( A · v) − q∇φ. (4.9)
∂r
Noting that
∇( A · v) = ( A · ∇)v +(v · ∇) A + v × (∇× A) + A × (∇× v) .

| {z } | {z }
=0 =0
(4.10)
On the other hand,
∂L
= p + qA. (4.11)
∂v
Substituting Equations (4.9)-(4.11) into the Lagrange’s equation (4.8)

gives
d
( p + qA) = q(v · ∇) A + qv × (∇× A) − q∇φ (4.12)
dt
However,

d ∂A
q A=q + (v · ∇) A . (4.13)
dt ∂t
Correspondingly,
dp ∂A
= −q − q∇φ +qv × (∇× A). (4.14)
dt | ∂t {z }
no dependence on v
The right hand side of the equation can be divided into two parts;
one depends on velocity v and the other not. If we define the
electric field intensity to be
∂A
E=− − ∇φ. (4.15)
∂t
Similarly, we define the magnetic field intensity to be
B = ∇× A. (4.16)
Then the equations of motion of a charge can be simply written as
dp
= qE + qv × B. (4.17)
dt
This is the Lorentz force felt by a particle in a given field.

4.3 Gauge invariance
Note that we can describe the field in two ways. We can use Aα as
in the action S p f . Or as in the equations of motion, we can use only
E and B,
dp
= qE + qv × B. (4.18)
dt
However, a given E and B do not allow the unique determination of
A and φ. For example, it’s easy to show that the transformation
A0 = A + ∇ f (4.19)
∂f
φ0 = φ − (4.20)
∂t
do not change E and B. Therefore all equations must be invariant
under the transformation of the potentials; this invariance is called
gauge invariance (规范不变性). This is the first (and maybe the last)
gauge invariance you have ever seen and it is a very important
property of fields in more advanced field theories describing other
types of interactions.
The gauge invariance can also be clearly seen from the action,
Z b
S p f = −q Aα dx α , (4.21)
a
If we replace
∂f
Aα → Aα − , (4.22)
∂x α
where f is a function of t and r, then
Z b Z b
∂f
S p f = −q Aα dx α + q dx α (4.23)
a a ∂x α
Z b Z b
= −q Aα dx α + q df. (4.24)
a a
The total differential d f has no effect on the equations of motion.

The gauge invariance allows one to choose one2 auxiliary con- 2
This can be seen from that a scalar
dition for A and φ. Two popular gauges are Coulomb gauge and function is involved in Equation (4.22).
Lorenz gauge. One is
∇· A = 0, the Coulomb gauge (4.25)
This gauge is popular in some problems. The Lorenz gauge is very

popular in theoretical analysis. In the Lorenz gauge, we require
1 ∂φ
∇· A + = 0. the Lorenz gauge (4.26)
c2 ∂t
or in 4D form
∂Aα
= 0. (4.27)
∂x α
Note that if A and φ satisfy the Lorenz gauge in one inertial refer-
ence frame, then they satisfy the Lorenz gauge in all other inertial
reference frames.
4.4 Static electromagnetic fields
If potentials do not have time dependence, then
B = ∇× A, (4.28)
E = −∇φ. (4.29)
Therefore, φ determines E, while A determines B for static electro-

magnetic fields. In case of static fields, we can add to φ an arbitrary
constant. Usually, we choose φ(∞) = 0. Also it is obvious that E
and B fields are completely decoupled in case of static fields.
For a closed system in a static EM field, we have
∂L
= 0. (4.30)
∂t
From Mechanics, we know that the energy is conserved,
∂L
E = v· − L = γmc2 + qφ = const. (4.31)
∂v | {z } |{z}
kinetic potential
If the field is constant and uniform, then simple forms of A and

φ can be obtained. From E = −∇φ,
φ = − E · r. (4.32)
From B = ∇× A, we can choose A to be

1
A= B × r. (4.33)
2
The choices of φ and A can be easily proved by
−∇φ = ∇( E · r ) = E · ∇r = E · I = E, (4.34)
1 1
∇× A = ∇×( B × r ) = [ B∇· r − ( B · ∇)r ] = B. (4.35)
2 2
Note that we have used the fact that E and B are uniform in the
proof.
4.5 Motion in uniform static EM fields
First, let’s consider the motion in a uniform static E field. The

equations of motion are
dp
= qE. (4.36)
dt
Assuming E = Ee x , p x (t = 0) = 0, we have
dp x
= qE ⇒ p x = qEt (4.37)
dt
dpy
= 0 ⇒ py = const (4.38)
dt
dpz
= 0 ⇒ pz = const (4.39)
dt
Assuming py = p0 , and pz = 0, then p x = qEt, py = p0 .

The kinetic energy 3 of the particle is 3
Here “kinetic energy” means γmc2 ,
as opposed to the potential energy qφ.
See Equation (4.31)
q q
Ek = m2 c4 + p2 c2 = E02 + (qEct)2 , (4.40)
where
E0 = m2 c4 + p2y c2 + p2z c2 = m2 c4 + p20 c2 , (4.41)
is the energy at t = 0.
To find the trajectory of the particle, we need to solve
dr p pc2
=v= = . (4.42)
dt γm Ek
The components of the equation are
dx qEc2 t
= q , (4.43)
dt E02 + (qEct)2
dy p0 c2
= q . (4.44)
dt E02 + (qEct)2
Note that as t → ∞, v x → c and vy → 0. Integrate dx/dt and dy/dt,

assuming r (t = 0) = 0,
1
q
x= E02 + (qEct)2 , (4.45)
qE

p0 c −1 qEct
y= sinh , (4.46)
qE E0
From y = y(t), we obtain t = t(y). From x = x (t), we have
E0 qEy
x= cosh . (4.47)
qE p0 c
Now let’s consider the motion of a particle in a uniform static B

field, assuming B = Bez . The equations of motion are
dp
= qv × B. (4.48)
dt
Using p = γmv that γ is constant in a B field,
dv B
= qv × = v × Ω, (4.49)
dt γm
where Ω = qB/γm = Ωez . The components of dv/dt are
v̇ x = Ωvy , v̇y = −Ωv x , v̇z = 0, (4.50)
from which vz = vz0 .

To solve for v, let vb = v x + ivy , then from
v̇ x = Ωvy , (4.51)
iv̇y = −Ωiv x , (4.52)
we have vḃ = −iΩb v, from which vb = vb0 e−iΩt . Here vb0 = v⊥0 e−iα is
a complex constant with α a constant. From vb = v⊥0 e−i(Ωt+α) and
vb = v x + ivy , we have
v x = v⊥0 cos(Ωt + α), (4.53)

vy = −v⊥0 sin(Ωt + α), (4.54)
q
It’s clear that v⊥ = v2x + v2y = const = v⊥0 . We see from v x and vy
that Ω is the angular frequency.
The trajectory can be obtain by integrating r = dv/dt, which
gives
x = x0 + ρ sin(Ωt + α), (4.55)

y = y0 + ρ cos(Ωt + α), (4.56)
z = z0 + vz0 t. (4.57)
Here ρ = v⊥ /Ω is the gyro-radius (or |ρ|). In case of v/c → 0,

γ → 1, Ω → qB/m.
To make life more complicated, let’s now consider the motion of
a charge in a static uniform E and B field. We consider only non-
relativistic case here (v c). Assume B = Bez , E = Ey ey + Ez ez . For
a non-relativistic particle, p = mv.
mv̇ = qE + qv × B. (4.58)
The components of the equation are
mv̇ x = qvy B, (4.59)

mv̇y = qEy − qv x B, (4.60)
mv̇z = qEz , (4.61)
from which we immediately have vz = qEz t/m + vz0 .

To solve for v, we again let vb = v x + ivy . Then
v
db qEy
+ iΩb
v=i . (4.62)
dt m
The general solution of the equation is
qEy
vb = vb0 e−iΩt + . (4.63)
mΩ
Or letting vb0 = v⊥0 e−iα ,
qEy Ey
v x = v⊥0 cos(Ωt + α) + = v⊥0 cos(Ωt + α) + , (4.64)
mΩ B
vy = −v⊥0 sin(Ωt + α). (4.65)
The x-velocity consists of two parts,
v x = v⊥0 cos(Ωt + α) + Ey /B = vex + v x (4.66)

| {z } | {z }
oscillatory non-oscillatory
For cases like this, it’s frequently useful to average over the fast
oscillation to obtain the average motion by
Z t+ T Z t+ T
1 1
vx = v x (t0 )dt0 = v⊥0 cos(Ωt0 + α)dt0 + Ey /B,
T t T t
(4.67)
or v x = Ey /B. Note that vy = 0. The motion in x-y plane is the very

famous E × B drift. In vector form, the drift velocity is E × B/B2 .
1.2
0.9
0.6 vx
v
vy
0.3
0.0
0 1 2 3 4 5
time
Fig: v x and vy from Equations (4.64) and (4.65).
To find the trajectory, we integrate v x , vy and vz and
x = ρ sin(Ωt + α) + v E t + x0 (4.68)
y = ρ[cos(Ωt + α) − cos α] + y0 (4.69)
qEz 2
z= t + vz0 t + z0 (4.70)
2m
Here v E = Ey /B is the E × B drift velocity.
1.0
v ⟂ =0.5,vE =0.1,y0 =0.5,x0 =0.0,T =1
0.8
0.6
y
0.4
0.2
0.0
0.00 0.15 0.30 0.45 0.60
x
Fig: Trajectories in the x-y plane for different v E and v⊥ = ρΩ.
1.0
v ⟂ =0.1,vE =0.5,y0 =0.5,x0 =0.0,T =1
0.8
0.6
y
0.4
0.2
0.0
0.0 0.5 1.0 1.5 2.0 2.5
x
1.0
v ⟂ =0.5,vE =0.5,y0 =0.5,x0 =0.0,T =1
0.8
0.6
y
0.4
0.2
0.0
0.0 0.5 1.0 1.5 2.0 2.5
x
4.6 The electromagnetic field tensor
To obtain the Lorentz equation of motion in covariant form, we

consider the action in four-dimensional form. Recall the action of a
particle in a given field is
Z b
S= (−mcds − qAα dx α ) (4.71)
a
The principle of least action leads to

Z b
δS = δ (−mcds − qAα dx α ) = 0. (4.72)
a
Recall that δds = dxα dδx α /ds = uα dδx α /c,

Z b
δS = − (muα dδx α + qAα dδx α + qδAα dx α ) = 0. (4.73)
a
Integrating by parts, δS becomes

Z b
δS = (mδx α duα + qδx α dAα − qδAα dx α )
a
− [muα + qAα ] δx α |ba . (4.74)
Because δx α ( a) = δx α (b) = 0,
Z b
δS = (mδx α duα + qδx α dAα − qδAα dx α ). (4.75)
a
Noting dAα = (∂Aα /∂x β )dx β and δAα = (∂Aα /∂x β )δx β ,
Z b
∂Aα β α ∂Aα β α
δS = mduα δx + q β dx δx − q β δx dx
α
(4.76)
a ∂x ∂x
Z b
∂A α ∂A β
= mduα δx + q β dx δx − q α δx dx
α β α α β
(4.77)
a ∂x ∂x
Z b
duα ∂A β ∂Aα
= m −q − β u β dτδx α . (4.78)
a dτ ∂x α ∂x
If we define a four-tensor F αβ by
∂A β ∂Aα
Fαβ = − β, (4.79)
∂x α ∂x
then
Z b
duα
δS = m − qFαβ u β dτδx α . (4.80)
a dτ
The tensor Fαβ is called the electromagnetic field tensor; the tensor is
anti-symmetric,
Fαβ = − Fβα . (4.81)
To obtain the equations of motion, we apply the principle of least

action. From δS = 0,
duα
m = qFαβ u β . (4.82)
dτ
We can also move indices up or down,
duα
m = qF αβ u β . (4.83)
dτ
Equations (4.82) or (4.83) are the covariant form of the equations of
motion of a charge in a given field.
Now let’s find out the components of the electromagnetic field
tensor Fαβ ; they can be obtained by noting that Aα = (φ/c, − A),
∂A0 ∂A 1 ∂φ 1 ∂A x
e.g., F10 = − 01 = + = − Ex /c. (4.84)
∂x1 ∂x c ∂x c ∂t
In SI units,
 
0 Ex /c Ey /c Ez /c
− E /c 0 − Bz By 
Fαβ = x . (4.85)
 
 − Ey /c Bz 0 − Bx 
− Ez /c − By Bx 0
From Fαβ , we can immediately obtain

 
0 − Ex /c − Ey /c − Ez /c
 E /c
 x 0 − Bz By 
F αβ = , (4.86)

 Ey /c Bz 0 − Bx 
Ez /c − By Bx 0
using the rule
1. Moving the time index (0) ⇒ No change in sign.
2. Moving space indices (1, 2, 3) ⇒ change sign.
Using F αβ , we obtain components of the covariant equations of

motion, Equation (4.83), rewritten here
q αβ
duα /dτ = F uβ . (4.87)
m
For example, the time component (α = 0) is
du0

E
m = qF0β u β = q − · (−γv). (4.88)
dτ c
Noting that u0 = γc and dτ = dt/γ, we have
dγc q
γm = γ E · v, (4.89)
dt c
dE k
or = qE · v, with E = γmc2 (4.90)
dt
The space components of duα /dt can be obtained similarly; they
are just the Lorentz equations of motion (Equation (4.17)).
4.7 Lorentz transformation of the EM fields
4.7.1 Rules of transformation

The Lorentz transform of potentials φ and A can be obtained easily,
because Aα = (φ/c, A) is a four-vector. Any 4-vector transforms
like x α ; therefore
φ0 + VA0x
φ= √ , (4.91)
1 − V 2 /c2
A0x + (V/c2 )φ0
Ax = √ , (4.92)
1 − V 2 /c2
Ay = A0y , (4.93)
Az = A0z . (4.94)
Of course, this is for K 0 move with V = Ve x in K.

The transformation of E and B can be obtained from the transfor-
mation of components of F αβ . For the E field,
Ey0 + VBz0 Ez0 − VBy0

Ex = Ex0 , Ey = √ Ez = √ . (4.95)
1 − V 2 /c2 1 − V 2 /c2
For the B field,
By0 − (V/c2 ) Ez0 Bz0 + (V/c2 ) Ey0

Bx = Bx0 , By = √ Bz = √ . (4.96)
1 − V 2 /c2 1 − V 2 /c2
The non-relativistic case V/c 1 is very important,
Ex = Ex0 , Ey = Ey0 + VBz0 , Ez = Ez0 − VBy0 . (4.97)
Bx = Bx0 , By = By0 − (V/c2 ) Ez0 , Bz = Bz0 + (V/c2 ) Ey0 . (4.98)
Or in vector form,
1 0
E = E0 + B0 × V , B = B0 − E × V. (4.99)
c2
or E = E0 + cB0 × β, B = B0 − ( E0 /c) × β, (4.100)
with β = V /c.
Note that if B0 = 0 in K 0 system, then in K system,
B = β × ( E/c). (4.101)
If E0 = 0 in K 0 system, then in the K system,
E = cB × β. (4.102)
In both cases, E ⊥ B in K. This point can be obtained more clearly

by considering the invariants of E and B.
4.7.2 Lorentz invariants of electromagnetic fields

From the field tensor F αβ , we can obtain two important and useful
invariants by constructing four-scalars. First, we can easily see that
Fαβ F βα = inv. Because Fαβ is anti-symmetric, normally we write it
as
Fαβ F αβ = inv, (4.103)
which leads to
c2 B2 − E2 = inv. (4.104)
To construct the second invariant, we first introduce eαβµν the

completely anti-symmetric unit tensor of fourth-order. The tensor eαβµν
has the following properties,
1. eαβµν changes sign if interchanging any pair of indices
2. eαβµν = 0 if any two indices are the same
3. normally we choose e0123 = 1.
You can see that eαβµν is the 4D version of Levi-Civita symbol eijk .
An extra note: eαβµν is actually a pseudo-tensor. Read Pg 17-18 of
Landau’s book about this. Basically a pseudo-tensor transforms
similarly as real tensors in rotations of coordinates. They differ
only in those transformation that cannot be reduced to rotation
of coordinates. Since Lorentz transformation is a rotation of four-

dimensional space, we can use pseudo-tensors to construct Lorentz
invariants.
Using eαβµν , we can construct the following invariant,
Fαβ e βαµν Fνµ = inv. (4.105)
Because eαβµν and Fαβ are anti-symmetric, we write
eαβµν Fαβ Fµν = inv, (4.106)
which leads to
E · B = inv. (4.107)
In short summary, the two important invariants of the field are
c2 B2 − E2 = inv
E · B = inv
Using two invariants, we draw the following conclusions:
1. If E > cB in K, then E > cB in all other K 0 , and vice versa.
2. If E ⊥ B in K, then we can find a K 0 so that
• E = 0 if c2 B2 − E2 > 0.
• B = 0 if c2 B2 − E2 < 0.
3. If E ⊥ B in K, then E0 ⊥ B0 in all other K 0 if E0 6= 0 and B0 6= 0.
4.7.3 Fields of a uniformly moving charge

As an example of the above transformation rules, we use them to
obtain the electromagnetic fields of a uniformly moving charge.
Since the velocity of the charge is constant, we analyze the problem
in two steps:
1. Calculate the field in the reference system K 0 where the charge is

at rest.
2. Make a coordinate transformation to the observation reference

system K.
The problem in step 1 is purely electrostatic, and the problem in

step 2 is just a coordinate transformation. So this is a very easy
problem in principle.
In K 0 system, let a static point charge q be located at the origin
x = y0 = z0 = 0. The potentials of the charge q in K 0 is
0
q
φ0 = and A0 = 0. (4.108)
4πe0 r 0
This equation finishes step 1. Note that we use potentials here 4 to 4
For example, you can obtain E0 and
be consistent with the case in the next section, where the source for B0 first, and then coordinate transform
to obtain E and B.
generating the field is general.
We know how to transformation φ0 and A0 to φ and A in K.
φ0 (r 0 )/c
φ/c = γ(φ0 /c + βA0x ) = √ , (4.109)
1 − v2 /c2
(v/c2 )φ0 (r 0 )
A x = γ( A0x + βφ0 /c) = √ , (4.110)
1 − v2 /c2
Ay = A0y = 0, (4.111)
Az = A0z = 0, (4.112)
since Aα = (φ/c, A) is a four-vector. This looks pretty easy, except

that currently φ = φ(r 0 ) and A = A(r 0 ). We need one more
step to express φ and A in terms of r instead of r 0 , because the
∇ operator in K system is w.r.t. r. To do this, we know that r 0 =
p
x 02 + y02 + z02 and
x − vt
x0 = √ , (4.113)
1 − v2 /c2
y0 = y, (4.114)
0
z = z. (4.115)
Therefore
s
( x − vt)2
r0 = + y2 + z2 . (4.116)
1 − v2 /c2
Substituting r 0 = r 0 ( x, y, z) in Equation (4.116) into φ and A in

Equations (4.109)-(4.112) gives
q
φ= √ (4.117)
4πe0 r0 1 − v2 /c2
q
= p (4.118)
4πe0 ( x − vt)2 + (1 − v2 /c2 )(y2 + z2 )
q
≡ , (4.119)
4πe0 r ∗
v
A x = 2 φ, (4.120)
c
Ay = Az = 0. (4.121)
From φ and A, we can directly obtain E and B.

Note that r ∗ = ( x − vt)2 + (1 − v2 /c2 )(y2 + z2 ),
p
∂A v ∂φ
E = −∇φ − = −∇φ − 2 e x . (4.122)
∂t c ∂t
Noting φ = q/4πe0 r ∗ , we have
∂φ q ∂r ∗ q
=− =− ( x − vt) (4.123)
∂x 4πe0 r ∗ 2 ∂x 4πe0 r ∗ 3
∂r ∗ v2

∂φ q q
=− =− 1 − y (4.124)
∂y 4πe0 r ∗ 2 ∂y 4πe0 r ∗ 3 c2
∂r ∗ v2

∂φ q q
=− =− 1 − z (4.125)
∂z 4πe0 r ∗ 2 ∂z 4πe0 r ∗ 3 c2
∂φ q ∂r ∗ q
=− =− ( x − vt)(−v) (4.126)
∂t 4πe0 r ∗ 2 ∂t 4πe0 r ∗ 3
Combining Equations (4.123)-(4.126) and substituting them to

Equation (4.122) leads to
v ∂φ
E = −∇φ − ex (4.127)
c2 ∂t
q q
2
v
= ∗ 3
( x − vt)e x + ∗ 3
1 − 2 yey
4πe0 r 4πe0 r c
v2 v2

q q
+ 1 − 2 zez − 2 ( x − vt)e x (4.128)
4πe0 r ∗ 3 c c 4πe0 r ∗ 3
v2

q
= 3
1 − 2
( x − vt ) e x + ye y + ze z (4.129)
4πe0 r ∗ c

v 2
q
= 1− 2 R, (4.130)
c 4πe0 r ∗ 3
where R ≡ ( x − vt)e x + yey + zez is radius vector from the charge
position at t to the field point ( x, y, z) in K, because the charge
trajectory is given by x = vte x .
To find the magnetic field B, we note that B0 = 0 in K 0 , hence in
K system
v
B= × E. (4.131)
c2
Let’s now discuss the electric field E, since B is simple if we
know E. Introducing angle θ, which is the angle between R and the
direction of motion (e x in our case), then
Rk = x − vt, (4.132)
2 2 2 2
R⊥ = y + z = R sin θ, (4.133)
v2 v2

r ∗ 2 = R2k + 1 − 2 R2⊥ = R2 1 − 2 sin2 θ . (4.134)
c c
Using Equations (4.132)-(4.134), we rewrite E in Equation (4.130)
using θ and R as
v2

qR
E = 1− 2 3/2 (4.135)
c 4πe0 R3 1 − (v2 /c2 ) sin2 θ

qR 1 − β2
= . (4.136)
4πe0 R (1 − β2 sin2 θ )3/2
3
For a fixed distance R, the value of the field E maximize at θ = π/2

(perpendicular direction),
q q
E⊥ = Eθ =π/2 = 2
(1 − β2 )−1/2 = γ, (4.137)
4πe0 R 4πe0 R2
and minimizes at parallel directions (θ = 0 or π),
q 1 q
Ek = Eθ =0 = Eθ =π = (1 − β2 ) = 2 . (4.138)
4πe0 R2 γ 4πe0 R2
So as velocity increases, Ek decreases and E⊥ increases. Or the field
is contracted in the direction of motion. If v → c, then Ek → 0 and
E⊥ → ∞, because
qR 1 − β2
E= . (4.139)
4πe0 R (1 − β2 sin2 θ )3/2
3
As β → 1, the denominator 1 − β2 sin2 θ → 0 within ∆θ. The E field

at a given distance from the source is large only within a narrow
range of angles.
5
The Electromagnetic Field Equations
5.1 Electrodynamics before Maxwell
From your class on electromagnetism, you should have learned the

following laws
∇· E = ρ/e0 , Gauss’s law.

∇· B = 0, no name
∂B
∇× E = − , Faraday’s law
∂t
∇× B = µ0 j. Ampere’s law
There are inconsistencies in these equations that cannot not be

resolved by these equations themselves. Taking the divergence of
Ampere’s law, you will get
∇· (∇× B) = µ0 ∇· j.
While
∇· (∇× B) = 0,
is always correct,
∇· j = 0,
only holds if the current is steady. This implies that Ampere’s law
only holds for static case.
5.2 The charge conservation law
To see how Maxwell fixed these laws, we need to first describe the
conservation of charge. We first prove the equation of continuity in
a more traditional way, then we express it using the current density
4-vector.
The conservation of charge means
Z I
∂
ρdV = − ρv · dS, (5.1)
∂t
i.e., the time rate change of the total charge in volume V equals
minus the total amount of charge leaving the system. Using j = ρv,
Z Z I Z
∂ ∂ρ
ρdV = dV = − j · dS = − ∇· jdV. (5.2)
∂t ∂t V
Since this integration must hold for any volume V, the above
equation leads to the equation of continuity,
∂ρ
+ ∇· j = 0. (5.3)
∂t
This equation of continuity can also be directly proved using the
definition of ρ. For simplicity, we note that the charge density ρ for
a single particle1 is defined as, From this equation, we see 1
How do you derive the equation of
continuity for a system of particles?
Hint: Consider what is the appropriate
ρ(t, r ) = qδ[r − r 0 (t)]. (5.4)
definition of v.
Then

∂ρ ∂δ dr 0 ∂δ
=q · =q − · v = −∇· (ρv),
∂t ∂r 0 dt ∂r
since ∇· v = 0. Therefore, with j ≡ ρv,
∂ρ
+ ∇· j = 0.
∂t
Now we introduce the current four-vector. While the charge q
is a Lorentz invariant, ρ is not, since dq = ρdV and dV is not an
invariant. Multiplying dq = ρdV by dx α leads to
dx α
dqdx α = ρdVdx α = dVdt ρ . (5.5)
| {z } | {z } dt
four-vector four-scalar
Therefore ρdx α /dt must be a four-vector, we define it to be the

current density four-vector; i.e.,
dx α
jα = ρ , (5.6)
dt
or jα = (ρc, j) with j = ρv the normal current density vector you’re
familiar with.
Since jα is a four-vector, its divergence is a four-scalar; i.e.,
∂jα
= inv. (5.7)
∂x α
Writing this out with jα = (ρc, j) and using the equation of continu-
ity, we have
∂jα ∂ρ
= + ∇· j = 0. (5.8)
∂x α ∂t
That is the charge conservation law or the equation of continuity

can be expressed as that the the-divergence of the current density
4-vector is 0 (the current density 4-vector is divergence-free).
the electromagnetic field equations 57
5.3 How Maxwell fixed Ampere’s law
Ampere’s law gives
∇· (∇× B) = µ0 ∇· j,
Because the left hand side (LHS) is always right, one might try
to fix this by adding a term to the right hand side (RHS). Well,
we’ve just learned the equation of continuity, therefore, we can fix
Ampere’s law using 2 2
Note that in actual history, Maxwell
added the extra term for a reason
∂ρ based on ether, a model which is now
∇· (∇× B) = µ0 ∇· j + . considered to be incorrect.
∂t
Considering Gauss’s law that ∇· E = ρ/e0 , the above equation is

written as

∂ ∂E
∇· (∇× B) = µ0 ∇· j + µ0 e0 ∇· E = ∇· µ0 j + µ0 e0 .
∂t ∂t
Therefore, the Ampere’s law can be fixed to be
∂E
∇× B = µ0 j + µ0 e0 .
∂t
The new extra term is called by Maxwell the displacement current,
∂E
j d = e0 .
∂t
This new term means that
A changing electric field induces a magnetic field.
Now the new electrodynamics has a certain kind of symmetry in it.

Of course, theoretical convenience and consistency are only
suggestive. The confirmation of Maxwell’s law came later with
Hertz’s experiments on electromagnetic waves after Maxwell’s
death.
5.4 Maxwell’s Equations
The whole set of Maxwell’s equations are
∇· E = ρ/e0 , Gauss’s law.

∇· B = 0, no name
∂B
∇× E = − , Faraday’s law
∂t
∂E
∇× B = µ0 j + µ0 e0 . Ampere’s law
∂t
These are the fundamental equations of the EM field theory. The

Maxwell equations and the Lorentz force law,
dp
= q ( E + v × B ),
dt
are the entire content of the theoretical classical electrodynamics.
*5.5 Deriving Maxwell equations using the principle of least

action
5.5.1 The homogeneous Maxwell equations

From definitions of E and B using potentials φ and A,
B = ∇× A, (5.9)
∂A
E = −∇φ − , (5.10)
∂t
we have
∇· B = 0, (5.11)
∂B
∇× E = − . (5.12)
∂t
These are the two homogeneous Maxwell equations.
5.5.2 The action of the electromagnetic field

To obtain the other two equations of fields, we introduce the action
of the electromagnetic field. Previously we considered the action for
charges in a given field
S= Sp + Sp f . (5.13)
|{z} |{z}
free particle interaction
The action for the whole system is, by adding S f to the total action
S,
S= Sp + Sp f + Sf . (5.14)
|{z} |{z} |{z}
free particle interaction field only
To derive the field action S f , we note the following three con-

straints,
1. The field satisfies the principle of supposition. Equations are

linear in E and B; therefore, S f must be quadratic in E and B; i.e.,
∼ E2 or B2 .
2. S f must be a 4-scalar.
Considering these constraints, other people have figured out S f ∝

R
Fαβ F αβ dΩ. In SI units,
1
Z
Sf = − Fαβ F αβ dΩ, where dΩ = cdtdV (5.15)
4µ0
In 3D form, since Fαβ F αβ = 2( B2 − E2 /c2 ),

c
Z
Sf = ( E2 /c2 − B2 )dVdt, (5.16)
2µ0
or the Lagrangian for the field is

c
Z
Lf = ( E2 /c2 − B2 )dV. (5.17)
2µ0
Putting all pieces together, the total action is
1
Z Z Z
S = − ∑ mc ds − ∑ qAα dx − α
Fαβ F αβ dΩ, (5.18)
4µ0 c
where the summation is over all particles. Note that now charges
are not assumed to be small, so fields include external fields and
the field produced by the particles themselves.
Using the charge density 4-vector jα , the interaction action S p f
becomes
Z Z
Sp f = − ∑ qAα dx = −
α
ρAα dVdx α
dx α
Z
=− ρAα dV dt
dt
1
Z
=− jα Aα dΩ.
c
Substituting this form of S p f into the Equation (5.18)
1 1
Z Z Z
S = − ∑ mc ds − jα Aα dΩ − Fαβ F αβ dΩ.
c 4µ0 c
This is the total action expressed using the current density 4-vector.
5.5.3 The inhomogeneous Maxwell equations

In this subsection, we use the principle of least action to determine
equations obeyed by fields. Deriving field equations using the
principle of least action is similar to deriving particle equations.
Recall that to obtain particle trajectory, we vary x α → x α + δx α , and
then we obtain equations satisfied by x α . Now we want to obtain
equations obeyed by field Aα , we do it in the following three steps.
1. Particles move along their actual path
2. Vary Aα → Aα + δAα .
3. The field equations can be obtained from δS = 0.
First, noting that particles all move along their actual path (δS p =
0), the variation of action due to δAα is
1 1
Z Z
δS = δS p − δ Aα jα dΩ − δ Fαβ F αβ dΩ. (5.19)
|{z} c 4µ0 c
=0
Since particle move along their actual trajectories, δjα = 0,

1 1
Z Z
δS = − δAα jα dΩ − δ Fαβ F αβ dΩ. (5.20)
c 4µ0 c
The integrand of δS p f is in the form of (· · · )δAα , so we do not need
to do anything about it. We just need to write the integrand of δS f
in a similar form.
Now let’s calculate

1
Z
δS f = − δ( Fαβ F αβ )dΩ. (5.21)
4µ0 c
The variation of the action of field is

1
Z
δS f = − δ( Fαβ F αβ )dΩ (5.22)
4µ0 c
1
Z
=− (δFαβ F αβ + Fαβ δF αβ )dΩ (5.23)
4µ0 c
1
Z
=− F αβ δFαβ dΩ. (5.24)
2µ0 c
Substituting
∂A β ∂Aα ∂δA β ∂δAα

Fαβ = α
− β ⇒ δFαβ = − (5.25)
∂x ∂x ∂x α ∂x β
into Equation (5.24) leads to
1
Z
δS f = − F αβ δFαβ dΩ (5.26)
2µ0 c
Z
1 ∂δA β αβ ∂δAα
= − F αβ − F dΩ (5.27)
2µ0 c ∂x α ∂x β
Z
1 ∂δA β ∂δA β
= − F αβ − F βα
dΩ (5.28)
2µ0 c ∂x α ∂x α
Z
1 ∂δA β ∂δA β
= − F αβ
+F αβ
dΩ (5.29)
2µ0 c ∂x α ∂x α
1
Z
∂δA β
= − F αβ dΩ, (5.30)
µ0 c ∂x α
where we have used F αβ = − F βα . Perform the integration in

Equation (5.30) by parts,
1
Z
∂δA β
δS f = − F αβ
dΩ (5.31)
µ0 c ∂x α
Z
1 ∂F αβ
Z
∂ αβ
=− F δA β dΩ − δA β dΩ . (5.32)
µ0 c ∂x α ∂x α
The first term can be proved to be 0 using Gauss’s theorem in

four-dimensional space,
Z Z
∂ αβ
F δA β dΩ = F αβ δA β dSα = 0, (5.33)
∂x α
because
δA β = 0 for t = t0 and t = t1 . (5.34)

F αβ = 0 for r = ∞. (5.35)
Therefore Equation (5.32) becomes
1 ∂F αβ
Z
δS f = δA β dΩ. (5.36)
µ0 c ∂x α
The variation of the total action is δS = δS p f + δS f ,

Z
1 1 ∂F αβ
δS = − jβ + δA β dΩ (5.37)
c µ0 c ∂x α
Z
1 β 1 ∂F βα
=− j + δA β dΩ. (5.38)
c µ0 c ∂x α
The principle of least action δS = 0 gives
∂F βα
= − µ0 j β . (5.39)
∂x α
The components of these equations are
1 ∂Ex 1 ∂Ey 1 ∂Ez

− − − = −µ0 ρc (5.40)
c ∂x c ∂y c ∂z
1 ∂Ex ∂Bz ∂By
2
− + = − µ0 jx (5.41)
c ∂t ∂y ∂z
1 ∂Ey ∂Bz ∂Bx
+ − = −µ0 jy (5.42)
c2 ∂t ∂x ∂z
1 ∂Ez ∂By ∂Bx
− + = −µ0 jz (5.43)
c2 ∂t ∂x ∂y
In vector form, these equations give, using c2 = 1/µ0 e0 ,
∇· E = ρ/e0 , (5.44)
∂E
∇× B = µ0 e0 + µ0 j. (5.45)
∂t
the inhomogeneous Maxwell equations.
5.6 Integral forms of Maxwell equations
We now obtain useful laws by integrating the differential Maxwell

equations. First,
Z I
∇· B dV = B · dS. Gauss’s theorem (5.46)
V S
R
The integral S B · dS is the flux through surface S,
I
∇· B = 0 ⇒ B · dS = 0, (5.47)
S
means
the flux of B through any closed surface is 0.
We now integrate the ∇× E equation. First,

Z I
∇× E · dS = E · dl. Stokes’ theorem (5.48)
S l
From the differential equation,
∂B
∇× E = − , (5.49)
∂t
we have the conclusion

Z I Z
∂ ∂Φ
∇× E · dS = E · dl = − B · dS = − . (5.50)
S l ∂t S ∂t
In the equation
I
∂Φ
E · dl = − , (5.51)
l ∂t
the two terms mean
1. Φ: the magnetic flux through surface S.

H
2. l E · dl: the electromotive force in the contour.
The electromotive force thus equals minus the time rate

change of Φ.
Third, the integral form of Gauss’s law is
1
Z Z
∇· EdV =ρdV, (5.52)
e0
1
I Z
⇒ E · dS = ρdV = Q/e0 . (5.53)
e0
The flux of E through a closed surface equals 1/e0 times the

total charge inside the volume.
The integral form of Ampere’s law is

Z Z Z
∂
∇× B · dS = µ0 e0 E · dS + µ0 j · d§ (5.54)
∂t
I Z
∂E
⇒ B · dl = µ0 j + e0 · dS (5.55)
∂t
The circulation of B around any contour equals µ0 times

the sum of the displacement current and the true current
through the surface.
5.7 Energy density and energy flux of the EM field
Now we discuss some conservation properties of electromagnetic

field regarding their energy and momentum. Recall that field is
similar to a continuous media, therefore, when talking about field
energy and momentum, we almost always talk about energy density
and momentum density in the context of classical electrodynamics.
Field carries energy; and the total energy is conserved.

To figure out the energy density of the field, we see that the work
done by EM field on a charge q in interval dt is
dε k = F · dl = q( E + v × B) · vdt = qE · vdt (5.56)
If we consider a continuous distribution of charge,
dε k
Z Z
= ρE · vdV = j · EdV. (5.57)
dt
Through conservation of energy, dε k must equal minus the energy
change of the field in time dt.
To see the energy variation of the field, we express j · E using field
quantities only. The assumption here is that the only other way of
changing energy is through the field.

1 ∂E
j·E = ∇× B − µ0 e0 ·E
µ0 ∂t
1 ∂E
= ∇× B · E − e0 ·E
µ0 ∂t
1 1 e ∂E2
= (∇× B · E − ∇× E · B) + ∇× E · B − 0
µ0 µ0 2 ∂t
1 1 ∂B2 e ∂E2
=− ∇· ( E × B)− − 0
µ0 2µ0 ∂t 2 ∂t

1 1 ∂ 1 2
= − ∇· ( E × B) − e0 E 2 + B .
µ0 2 ∂t µ0
Reorganizing the equation as

∂ 1 1 2 1
e0 E 2 + B = − j · E − ∇· E×B . (5.58)
∂t 2 2µ0 µ0
Denoting the Poynting flux vector by
1
P= E × B, (5.59)
µ0
the equation can be written as

∂ 1 1 2
e0 E 2 + B + j · E = −∇· P. (5.60)
∂t 2 2µ0
R
To see what the equation (5.60) means, we apply dV to it.

∂ 1 1 2
Z Z Z
e0 E 2 + B dV + j · EdV = − ∇· PdV.
∂t 2 2µ0
or using Gauss’ theorem

∂ 1 1 2
Z Z Z
e0 E 2 + B dV + j · EdV = − P · dS (5.61)
∂t 2 2µ0
If we extend the integration to all space, the field is 0 at ∞, Equa-

tion (5.61) becomes

∂ 1 1 2
Z Z
e0 E 2 + B dV + j · EdV = 0. (5.62)
∂t 2 2µ0
Noting that
dε k
Z
= j · EdV, (5.63)
dt
Equation (5.61) then becomes
 

1 1 2
Z
∂  
e0 E 2 + B dV +  = 0, (5.64)
 
εk
2 2µ0

∂t  |{z} 
| {z } particle kinetic energy
field energy
from which we conclude that

1 1 2
w= e0 E 2 + B (5.65)
2 µ0
is the field energy density. Yes, the field has energy.

Expressed using w, Equation (5.61) is
Z I
∂
wdV + ε k = − P · dS. (5.66)
∂t
The left hand side (LHS) is the total energy variation inside a given
finite volume, thus P, the Poynting flux, is the field energy flux
density – the amount of energy passing through unit area per unit
time.
5.8 Momentum density and the Maxwell stress tensor
We derive the momentum of the field in a similar way as that of the

field energy density. That is,
Field carries momentum; the total momentum is conserved.
The time rate change of momentum of a particle q,
dpm
= F = q ( E + v × B ). (5.67)
dt
The subscript “m” denotes “mechanical” quantity. If we consider a
continuous distribution of charges, then
dpm
Z
= ρ( E + v × B)dV
dt Z
= (ρE + j × B) dV. (5.68)
We now express ρ and j using E and B, and the purpose is of

course to express the final equation in the form
d Z
pm + p f + ∇ · GdV = 0, (5.69)
dt
where p f is the field momentum, and G is the field momentum flux,
which is a tensor (since momentum is a vector).
We now express the right hand side of Equation (5.69) using field
quantities only. From Maxwell equations,

1 ∂E
ρ = e0 ∇· E and j = ∇× B − µ0 e0 (5.70)
µ0 ∂t
The first term ρE = e0 E(∇· E). The second term is
1 ∂E
j×B = (∇× B) × B − e0 × B. (5.71)
µ0 ∂t
Using
∂E ∂ ∂B
B× = − ( E × B) + E × , (5.72)
∂t ∂t ∂t
and adding B(∇· B), which is 0, to equation (5.68) gives,
∂
ρE + j × B = −e0 ( E × B)
∂t
1 1
+ e0 E(∇· E) + B(∇· B) − e0 E × (∇× E) − B × (∇× B) .
µ0 µ0
(5.73)
Substituting Equation (5.73) into equation (5.68) gives

Z
dpm d
+ e0 E × BdV =
dt dt
Z
1 1
e0 E(∇· E) + B(∇· B) − e0 E × (∇× E) − B × (∇× B) dV
µ0 µ0
(5.74)
Define
g = e0 E × B, (5.75)
which will be shown later to be the field momentum density, and

Z
pf = gdV, (5.76)
which is the total field momentum in the chosen volume, then

Equation (5.74) is
dpm dp f
+ =
dt dt
1 1
Z
e0 E(∇· E) + B(∇· B) − e0 E × (∇× E) − B × (∇× B) dV,
µ0 µ0
(5.77)
Let’s now figure out what’s in the right hand side. Recall that in
Homework 2, you have proved

1
E × (∇× E) = (∇· E) E − ∇· EE − E2 I , (5.78)
2
so

1
E(∇· E) − E × (∇× E) = ∇· EE − E2 I (5.79)
2
similarly,

1
B(∇· B) − B × (∇× B) = ∇· BB − B2 I (5.80)
2
Define the Maxwell stress tensor T to be

1 1 1 2
T = e0 EE + BB − e0 E 2 + B I , (5.81)
µ0 2 µ0
Equation (5.77) becomes
d Z I I
pm + p f = ∇· TdV = dS · T = T · dS, (5.82)
dt
because T is symmetric. Or written in the form of Equation (5.69),
d Z
pm + p f + ∇· G dV = 0, (5.83)
dt
where G = −T is the field momentum flux tensor.
To actually understand the meaning of p f and T (or G), let
V → ∞. Then T = 0 since field is 0 at infinity. Equation (5.82)
becomes
d
pm + p f = 0. (5.84)
dt
From the conservation of momentum, p f thus represents the mo-
mentum of the field.
To see the meaning of T or G, let V be finite, then
d I I
pm + p f = T · dS = − G · dS. (5.85)
dt
Therefore T represents the field momentum flux flowing into the
volume through surface. Or G, the field momentum flux tensor,
represents the field momentum flux flowing out of the volume
through surface.
In a Cartesian coordinate system, let the normal direction of dS
to be n = (n1 , n2 , n3 ),
d I
pm + p f = Tij n j dS (5.86)
dt i
Therefore, Tij n j is the ith component of the momentum per unit

area per unit time into the volume. On the other hand,
d
pm + p f = force on the field and particles, (5.87)
dt
Tij n j is the ith component of the force per unit area through S. If
the field is static, then p f is constant,
dpm
I
= T · dS (5.88)
dt
d( p m )i
I
or = Tij n j dS (5.89)
dt
On the other hand, if pm is constant, then
dp f Z
∂g
Z
= dV = ∇· TdV (5.90)
dt ∂t
or
∂g
= ∇· T = −∇· G, (5.91)
∂t
or
∂g
+ ∇· G = 0, (5.92)
∂t
further proving that the field momentum flux is G.
6
General Electrostatics
6.1 General considerations
The Maxwell equations are,

∂B
∇× E = − , (6.1)
∂t
ρ
∇· E = , (6.2)
e0
∂E
∇× B = µ0 e0 + µ0 j, (6.3)
∂t
∇· B = 0. (6.4)
In this chapter, we consider static EM fields; i.e.,

∂
= 0. (6.5)
∂t
The corresponding static EM field equations are
∇× E = 0 (6.6)
ρ
∇· E = , (6.7)
e0
∇× B = µ0 j, (6.8)
∇· B = 0. (6.9)
Notes that
∂A
E = −∇φ − = −∇φ,
∂t
B = ∇× A.
That is, E is completely determined using φ, and E and B are de-

coupled. So we separate static electromagnetic fields into two parts:
one about the electrostatic field and the other about magnetostatic
field. In this chapter, we discuss electrostatics.
6.2 Electrostatic fields
In case of a electrostatic field, E = −∇φ. Using φ can significantly

simplify analysis, since it’s a scalar. One exception is if the problem
possesses some symmetry, then you can apply Gauss’s theorem to
obtain E.
general electrostatics 69
The Coulomb’s law expressed using φ is

ρ ρ
∇· E = ⇒ ∇2 φ = − . (6.10)
e0 e0
If ρ = 0 (e.g., vacuum), the equation becomes
∇2 φ = 0. (The Laplace equation.) (6.11)
E and φ apparently satisfy the principle of superposition.

We first calculate the electrostatic field of a point charge. The ES
field of a distribution of charges can be obtained from the principle
of superposition. This is an example of applying Gauss’s law to a
problem with symmetry. For a point charge, because of symmetry,
E = E(r )er , where r is the distance to the charge, and er = r/r.
From the Coulomb’s law,
q
Z I Z Z
ρ
∇· EdV = E · dS = ∇ · EdV = dV = , (6.12)
e0 e0
where we have used ρ = qδ(r ). Choose the volume to be a sphere
centering at the charge,
q
I
E · dS = 4πr2 E = . (6.13)
e0
Therefore, the ES field of a point charge is
1 q
E= er .
4πe0 r2
The potential of the field is clearly
1 q
φ= ,
4πe0 r
since
1 q 1 q
−∇φ = 2
∇r = er = E. (6.14)
4πe0 r 4πe0 r2
Using this expression of φ, we can obtain an useful identify for
δ(r ). Putting φ = q/4πe0 r into ∇2 φ = −ρ/e0 , for a point charge at
r = 0, ρ = qδ(r ), therefore,

q qδ(r )
∇2 =− (6.15)
4πe0 r e0
or

2 1
∇ = −4πδ(r ) (6.16)
r
6.3 Electrostatic field multipole moments
One electrostatic problem we frequently encountered is to calculate

the potential or the E field of a system of charges. Normally the
calculation is tedious and is best done using a computer. However,
in lots of cases, we are only interested in the field far far away
from the charges. For example, we might be interested in the

field produced by a group of molecules at a distance much larger
compared to the spatial scale of system molecules. In this case,
we can use Taylor expansion to obtain leading fields of the charge
system. This technique and the corresponding fields are introduced
in this section.
Let’s introduce a coordinate system where the origin is within
the system of charges, the radius vector of charge a is x0a . The ES
potential at an arbitrary point x is
1 qa
φ( x) =
4πe0 ∑ | x − x0 | . (6.17)
a a
We now consider field at large distances | x| | x0a |; i.e., at distances

large compared to the dimensions of the system. The method is
to expand φ using the small parameter | x0a |/r, with r ≡ | x|, and
find out the dominant terms. We will also drop script “a” with
understanding that the summation is over all charges.
field point
’
x-x x
R=
qa
xa’
O
Illustration of the coordinates used in this chapter.
Using | x0 | | x|, we can Taylor expand a function f ( x − x0 ) by
1 ∂2
f ( x − x0 ) = f ( x) − x0 · ∇ f ( x) + xi0 x 0j f ( x) + · · · (6.18)
2 ∂xi ∂x j
In our case, f ( x − x0 ) = φ( x − x0 ), and
1 q
φ=
4πe0 ∑ | x − x0 | = φ (0) + φ (1) + φ (2) + · · · , (6.19)
a
where φ(i) denotes the ith order term. If you compare, e.g., φ(1) and
φ (0) ,
φ (1) x0 · ∇φ | x0 |φ/| x| | x0 |
∼ ∼ ∼ 1. (6.20)
φ (0) φ φ | x|
If we denote e ≡ | x 0 |/| x |, then φ(1) /φ(0) ∼ O(e). Similarly, you can

show that φ(2) /φ(0) ∼ O(e2 ).
Applying Equation (6.18) to Equation (6.19), you can immedi-

ately arrive at the conclusion that
1 q
φ (0) = φ ( x ) =
4πe0 ∑ r, with r ≡ | x|, (6.21)
a
1 1
φ (1) = −
4πe0 ∑ qx0a · ∇ r , (6.22)
a
∂2

1 1
φ (2) =
8πe0 ∑ qxi0 x0j ∂xi ∂x j r
. (6.23)
a
We’ll now treat these terms one by one.

The 0th order term φ(0) , normally called the monopole term, is
1 ∑i qi
φ (0) = φ ( x ) = . (6.24)
4πe0 r
This is equivalent to ignoring the internal distribution of charges

and treating them as a single point charge with Q = ∑ a q. Appar-
ently φ(0) dominates unless Q = 0, which is common, e.g., when
calculating fields produced by water molecules. The 0th term is
also consistent with our intuition: if we are far far away from a
distribution of charges, we can approximate the charges by a point
charge.
If Q = 0 (φ(0) = 0), we then need higher order terms φ(1) , or
even φ(2) if φ(1) = 0, to better approximate φ. Let’s first discuss φ(1) .
Introducing the electrostatic dipole moment p,
p= ∑ qx0 , (6.25)
a
we can write φ(1) , the electrostatic dipole potential, as
1 1 p 1 p·x
φ (1) = −
4πe0 ∑ qx0 · ∇ r =−
4πe0
·∇ =
r 4πe0 r3
. (6.26)
a
This is the potential due to the dipole moment of the charges.

The field E = −∇φ. Note here ∇ = ∂/∂x; i.e., with respect to
the field coordinate, therefore the dipole moment p is treated as a
constant when we perform ∇. Performing the gradient operation,
p·x 1 1 1
E = −∇ =− ∇( p · x) − ( p · x)∇ 3 (6.27)
4πe0 r3 4πe0 r3 4πe0 r
1 p · x −3
=− p · ∇x − ∇r (6.28)
4πe0 r3 4πe0 r4
p 3p · x x
=− 3
+ (6.29)
4πe0 r 4πe0 r4 r
or finally with n = x/r,
3( n · p ) n − p
E (1) = . (6.30)
4πe0 r3
Note that E(1) ∼ r −3 . You can also directly Taylor expand E to

obtain E(0) , E(1) , · · · .
The components of the electric field E can be obtained easily

from Equation (6.30). Assuming p = pez , then, with θ = hn, ez i,
e⊥ ⊥ ez .
3( n · p ) n − p 3 cos2 θ − 1
Ez = 3
· ez = p , (6.31)
4πe0 r 4πe0 r3
3( n · p ) n − p 3 cos θ sin θ
E⊥ = · e⊥ = p . (6.32)
4πe0 r3 4πe0 r3
Or if using spherical coordinates,
3(n · p)(n · n) − p · n 2 cos θ cos θ

Er = E · n = 3
=p 3
=p , (6.33)
4πe0 r 4πe0 r 2πe0 r3
3(n · p)(n · eθ ) − p · eθ sin θ
Eθ = E · eθ = 3
=p . (6.34)
4πe0 r 4πe0 r3
Two useful facts about the dipole moment are worth mentioning.
First, if the total charge ∑ a q = 0, then p does not depends on
the choice of the origin of coordinates. Consider two systems,
x00 = x0 + a, then
p0 = ∑ qx00 = ∑ qx0 + ∑ qa = ∑ qx0 = p. (6.35)
On the other hand, if Q ≡ ∑ a q 6= 0, then p0 = p + Qa. Second, a

simple and frequently used dipole moment is for two point charges
|q| at x0+ and −|q| at x0− .
p = |q| x0+ − |q| x0− = |q|l, with l = x0+ − x0− . (6.36)
Let’s now continue to the second order term φ(2) ; this term will
dominate if both Q and p equal 0. Rewrite Equation (6.23) here,
1 ∂2 1
φ (2) =
8πe0 ∑ qxi0 x0j ∂xi ∂x j r . (6.37)
a
Define temporarily the quadrupole moment of the system
D= ∑ 3qx0 x0 , or Dij = ∑ 3qxi0 x0j , (6.38)

a a
then
Dij ∂2 1
φ (2) = . (6.39)
24πe0 ∂xi ∂x j r
is called the quadrupole potential. There are 6 components of D,
D11 , D22 , D33 , D12 = D21 , D13 = D31 , D23 = D32 . (6.40)
But only 5 of them are independent. Let’s prove this fact. Since x is
located outside the system of charges,
1
∇2 = 0 (Recall it’s −4πδ(r ) in general.) (6.41)
r
We can rewrite the above equation as
1 ∂2 1 ∂2 1
∇2 = = δij = 0. (6.42)
r ∂xi ∂xi r ∂xi ∂x j r
Using Equation (6.42), the potential φ(2) can also be written as
1 ∂2 1
φ (2) = Dij − r 02 δij . (6.43)
24πe0 ∂xi ∂x j r
where r 0 = | x0 |. Thus we formally define the quadrupole moment as
Dij = ∑ q(3xi0 x0j − r02 δij ), (6.44)

a
The trace of the Dij = D11 + D22 + D33 = 0, indicating that there are
only 5 independent terms in Dij . From now on, we’ll use this D as
the quadrupole moment.
6.4 Electrostatic field energy
In case of a electrostatic field, the field energy is

Z
e0
U= E2 dV, (6.45)
2
the integral is taken over all space. Using E = −∇φ changes
equation (6.45) to
Z Z Z
e0 e0 e0
U=− E · ∇φdV = ∇· ( Eφ)dV + φ∇· EdV. (6.46)
2 2 2
Using Gauss’s theorem,
Z I
e0 e0
∇· ( Eφ)dV = φE · dS = 0, (6.47)
2 2
because φE ∼ R−3 and S ∼ R2 , and the integration surface is a
sphere with radius R, therefore this integration varies as R−1 . As
R → ∞, the integration equals 0. The electrostatic field energy is
therefore
1
Z Z
e0
U= φ∇· EdV = ρφdV, (6.48)
2 2
where we have applied ∇· E = ρ/e0 . For a system of point charges,
ρ = ∑ q i δ (r − r a ),
1 1
Z
2∑
U= ρφdV = q a φa , (6.49)
2 a
where φa ≡ φ(r a ) is the potential of all charges at r a .

Consider now a single point charge, U = qφ/2. However, φ =
q/r, and in this case, r = 0 leads to U = ∞! If the energy U is
infinity, then m = U/c2 must be infinity. This absurd result suggest
that the validity of EM theory must be restricted to some limits.
The problem results from that the charge is point-like. If we con-
sider an electron with radius R0 , then U = e2 /8πe0 R0 . Requiring
that the self-potential energy is on the same order as mc2 , we can
define
e2
R0 ∼ , (6.50)
4πe0 mc2
which is the classical electron radius. The classical electromagnetic
theory is problematic for r < R0 .
Let’s now return to a system of charges,
1 q
φa =
4πe0 ∑ rabb , (6.51)
b
where r ab = | x0a − x0b |, the distance between the two charges. This
φ contains two parts, the infinity self-energy, The energy of interac-
tion of the charges. From what we just discussed in the previous
paragraph, we will, from now on, only consider part 2, the energy
of interaction.
Since we only consider the energy of interaction,
1
2∑
U∗ = q a φa∗ , (6.52)
a
with
1 qb
φa∗ =
4πe0 ∑ r
, (6.53)
b6= a ab
the potential produced by all other charges at charge a. The electro-

static field energy of interaction is
1 qb 1 q a qb
U∗ = ∑
2 a
qa ∑
4πe0 r ab
=
8πe0 ∑ r ab
. (6.54)
b6= a a6=b
For example, U ∗ for two charges is

 
1  1q q 1 q2 q1  1 q1 q2
1 2
U∗ = + = . (6.55)
 
4πe0  2 r12 2 r 4πe0 r12

| {z21}

| {z }
i = 1, j = 2 i = 2, j = 1
6.5 A system of charges in an external field
We calculate the potential energy of a system of charges in an

external field. The potential of the external field is given by φ(r ),
and
U= ∑ qa φ(x0a ) = ∑ qφ(x0 ). dropping the "a" subscript (6.56)

a a
Again the origin of the coordinate is located somewhere in the

system of charges. The radius vector of charge a is x0a . Assume φ
varies slowly over the region of charges, then we can expand φ( x0 )
by
1 ∂2 φ
φ( x0 ) = φ(0) + x0 · ∇φ + xi0 x 0j +··· . (6.57)
2 ∂xi ∂x j
To see why we required that φ varies slowly over the regions of

charges, we compare the first and second terms on the RHS.
x0 · ∇φ | x0 |φ/L | x0 |
∼ ∼ , (6.58)
φ (0) φ L
where L is the characteristic scale of φ. Therefore, only when
| x0 |/L 1 can we say that the second terms is an order smaller
than the first term.
Like the multipole expansion used in deriving fields of a system
of charges, we have
U = U (0) + U (1) + U (2) + · · · , (6.59)
and
U (0) = ∑ qφ(0), (6.60)

a
U (1) = ∑ qx0 · (∇φ)0 = p · ∇φ, (6.61)
a
1 ∂2 φ
U (2) = ∑ q 2 xi0 x0j ∂xi ∂x j . (6.62)
a
Let’s again introduce these terms one by one.

Since U (0) = ∑ a qφ(0), this is just like the monopole term we
introduced in previous sections. This is the same as ignoring the
internal distributions of charges treat them as a single point charge
with Q = ∑ a q.
The first order term, or the dipole moment term, U (1) = p · ∇φ,
expressed using E, is U (1) = − p · E(0). One important conclusion is
the force felt by the charges, which is
f = −∇(U (0) + U (1) + · · · ) ≈ ∑ qE(0) + ∇( p · E) (6.63)

a
= QE(0) + ( p · ∇) E. (6.64)
If Q = 0, then
f = ( p · ∇) E. (6.65)
In deriving the last equation, we have used 1 1

Note that ∇· p = 0 and ∇× p = 0
since p is defined in particle coordi-
∇( p · E) = ( p · ∇) E + p × (∇× E) = ( p · ∇) E, (6.66) nates x0 , while ∇ ≡ ∂/∂x.
since for electrostatic fields, ∇× E = 0.

The second order term U (2) is
1 0 0 ∂2 φ
U (2) = ∑ 2 xi x j ∂xi ∂x j .
q (6.67)
a
Noting that for an external field 2 2

See Gauss’s law, ∇2 φ = −ρ/e.
External field here means that the
∂2 φ ∂2 φ source charge density ρ = 0 at the
∇2 φ = = δij = 0. (6.68) location of interest (x0 here).
∂xi ∂xi ∂xi ∂x j
Therefore, we rewrite U (2) as
1 ∂2 φ Dij ∂2 φ
U (2) = ∑
6 a
q 3xi0 x 0j − r 02 δij
∂xi ∂x j
=
6 ∂xi ∂x j
(6.69)
1 1
= D : ∇∇φ = − D : ∇ E. (6.70)
6 6
6.6 Polarization in Dielectrics
We have two kinds of objects: conductors and insulators (or di-

electrics). In conductors, there are charges, most likely electrons,
that are free to move through the material. On the other hand, di-
electrics do not have free charges. Instead, all electrons are attached
to some atoms or molecules–they can not move freely; the only
thing they can do is to move a bit within the atom or molecules.
Their cumulative effects, however, are important; these effects can
produce new fields inside the dielectrics. In this section, we’ll what
we learned in previous sections to study the electric fields in matter.
6.6.1 Induced dipoles

Suppose we have an neutral atom in an external field E. The atom
as a whole does not feel a force, since it’s neutral 3 . However, the 3
the monopole term is 0
positive core, the nucleus, and the negative electrons feel the E
force. If the E field is very strong, the atom can be pulled apart, or
ionized. However, if the E field is not that strong, an equilibrium
is soon established. The electrons and the core feel the E force
in opposite directions, making the atom polarized, creating a tiny
dipole p, which points in the same direction as E. It turns out that
p is approximately proportional to the E field, or
p = αE (6.71)
This constant α is called atomic polarizability. Its value depends on

the atom in question.
Here, let’s adopt a simple model of atoms, consisting of a posi-
tively charged core q and a negatively charged electron cloud with
radius a and total charge −q surrounding the core. Without an
external E field, the nucleus is located at the center of the cloud. If
there is an electric field, however, the nucleus will move slightly in
the direction of E; let’s say it’s now d from the center of the cloud.
Let’s approximately assume that the electron cloud is still a sphere.
The magnitude of the electric field of the electron cloud at the
nucleus is
qd
Ee = . (6.72)
4πe0 a3
At equilibrium, then Ee = E, or
qd
= E, or qd = 4πe0 a3 E. (6.73)
4πe0 a3
Note that qd is just the dipole moment of the newly formed dipole,
therefore
p = 4πe0 a3 E, and α = 4πe0 a3 (6.74)
for simple atoms.

For complicated molecules, things in general get more compli-
cated. Frequently they polarize more readily in some direction than
in others. For example, for CO2 molecules, p = α⊥ E⊥ + αk Ek . In
general, the relationship can be expressed using a tensor A, the
polarizability tensor, or
p = A · E. (6.75)
Note that no matter what the cause of the dipole is, when there is
an external field E, the dipole will feel a torque L = p × E, which
will try to align the dipole along E.
7
Boundary Value Problems in Electrostatics
7.1 General boundary conditions
From the integral form of Gauss’s law, on the boundary layer

between region I and region II,
I Z
D · dS = ρdV = Q = σA, (7.1)
where σ is the surface charge density, and A is the surface area.

Construct a Gaussian pillbox with thickness ε, and the disk area is
A, then Equation (7.4.3) gives
I II
( D⊥ − D⊥ ) A = σA, (7.2)
where we’ve let ε → 0; therefore,
I II
eI E⊥ − eII E⊥ = σ or (eI EI − eII EII ) · n = σ, (7.3)
with n the unit vector pointing from region II to region I.
The tangential components of E are continuous, as can be ob-
tained by
Z I
∇× E · dS = E · dl = 0. (7.4)
If we construct a thin rectangular loop around the boundary, with

the width of the two sides ε and let ε → 0, then from Equation (7.4)
EkI − EkII = 0. (7.5)
You can combine Equation (7.5) and Equation (7.3) to get the bound-
ary condition as
eI EI − eII EII = σn. (7.6)
You can see that the potential is continuous across any boundary,
since
Z b
ϕI − ϕII = − E · dl, (7.7)
a
as a → b, RHS equals 0; therefore ϕI = ϕII . However the gradient of

the potential is not zero, since
∂ϕ
n · ∇ϕ = − = E · n, (7.8)
∂n
or
∂ϕI ∂ϕII
eI − eII = −σ. (7.9)
∂n ∂n
boundary value problems in electrostatics 79
7.2 Boundary value problems and the first uniqueness theorem
Normally in the region we are interested in, there are no charges.

Hence in this section, we will mainly consider the electrostatic
problems of the type
∇2 ϕ = 0, (7.10)
with certain boundary conditions. Now the problem is, which kind
of boundary conditions do we need to determine completely the ϕ
inside the region?
Let’s start from a 1D problem, and the equation is
d2 ϕ
= 0, x ∈ [ a, b]. (7.11)
dx2
This equation can be easily solved to give ϕ = Ax + B. To deter-
mine ϕ, we need A and B. The following boundary condition can
uniquely determine ϕ:
• if we know ϕ( a) and ϕ(b);
• if we know ∂ϕ( a)/∂x and ϕ(b);
• if we know ∂ϕ(b)/∂x and ϕ( a).
On the other hand, if we know ∂ϕ(b)/∂x and ∂ϕ( a)/∂x, we cannot

determine A and B. Or if we just know ϕ( a), we cannot determine
both A and B.
In 3D cases, things get more complicated. We’ll now prove the
following theorem, which is called the first uniqueness theorem:
“The solution to ∇2 ϕ = 0 is uniquely determined if ϕ is specified
on the boundary S.”
There are different kind of uniqueness theorems in electrostatics.
The proof of these unique theorems shares some basic format. Let
me show you the format by proving the first uniqueness theorem.
Proof : Suppose there are two solutions to ∇2 ϕ = 0 called ϕ1 and
ϕ2 ; ∇2 ϕ1 = 0 and ∇2 ϕ2 = 0. Now to prove the first uniqueness
theorem, we just need to prove that ϕ1 = ϕ2 if we specify ϕ on the
boundary S.
Define ϕ3 ≡ ϕ1 − ϕ2 , then ϕ3 also satisfies the Laplace equation
∇2 ϕ3 = 0. Since both ϕ1 = ϕ and ϕ2 = ϕ on the boundary S,
ϕ3 = 0 on boundary S. In short, the equation for ϕ3 is
∇2 ϕ3 = 0, and ϕ3 = 0 on S. (7.12)
You have proved as an exercise that if a function satisfies ∇2 f = 0,

then f cannot have maximum or minimum inside the domain;
extreme values of f can only occur on boundary. Since ϕ3 = 0 on
the boundary and it cannot have maximum or minimum inside S,
then ϕ3 = 0 everywhere. Therefore,
ϕ1 = ϕ2 . (7.13)
Proof of the first uniqueness theorem is now completed.

The first uniqueness theorem has an interesting consequence. It’s
easy to show that if ϕ satisfies
ρ
∇2 ϕ = − , 1 (7.14)
ei
charge density ρ is specified throughout the domain, and ϕ is 1

the footnote i represents the dielectric
given on the boundary S, then the solution of ϕ to Equation (7.14) constant of the ith isotropic area
is uniquely determined. The proof is almost identical to the proof

of the first uniqueness theorem given above.
The uniqueness theorem allows you to solve boundary value
problems in electrostatics by guessing. For a problem satisfying
the conditions of these uniqueness theorems, if you can guess a
solution ϕ that satisfies the Poisson/Laplace equation, and at the
same time satisfies the boundary condition, then the solution you
guessed is the solution to the equation. We’ll see the power of the
uniqueness theorem in the method of images latter in the course.
7.2.1 Conductors and the second uniqueness theorems

Sometimes we encounter electrostatic problems where we do not
know the potential at the boundary, but we do know the charges on
various conductors. For example, we put Q a on conductor a, and
Qb on conductor b. As soon as we put charges on the conductors,
they will redistribute themselves in a way we do not control. Let’s
say there are some specified charge density ρ in the region between
conductors. The second uniqueness theorem then says:
“In a region V surrounded by conductors and containing a
specified charge density ρ, the electric field is uniquely determined
if the total charge on each conductor is given. The region as a
whole can be bounded by another conductor or else unbounded.”
Proof : The proof is similar to the one above, but slightly more
complicated. Suppose there are two solutions E1 and E2 satisfying
Gauss’s law in space between conductors,
ρ ρ
∇· E1 = and ∇· E2 = , (7.15)
e e
and they obey Gauss’s law in integral form for a Gaussian surface
enclosing each conductor:
Qi Qi
I I
E1 · ds = and E2 · ds = . (7.16)
e e
ith conductor ith conductor
Likewise, for the outer boundary,
Qtot Qtot
I I
E1 · ds = and E2 · ds = . (7.17)
e e
outer boundary outer boundary
As before, we define E3 = E1 − E2 . To prove the second uniqueness

theorem, all we need to do is to prove E3 = 0.
Apparently E3 satisfies ∇· E3 = 0 in the region between conduc-

tors and
I
E3 · ds = 0 (7.18)
over each boundary surface.

We know that each conductor is an equipotential, and therefore
ϕ1 and ϕ2 and ϕ3 are all constants. Here is the trick we need to use
∇· ( ϕ3 E3 ) = ϕ3 ∇· E3 + E3 · ∇ ϕ3 = − E32 , (7.19)
where we have used ∇· E3 = 0 and ∇ ϕ3 = − E3 . Integrating over the

whole domain, we have
Z I I
∇· ( ϕ3 E3 )dV = ϕ3 E3 · ds = ϕ3 E3 · ds = 0, (7.20)
S S
here we have used that ϕ3 is a constant over each surface. From

Equation (7.19), however, Equation (7.20) can also be written as
Z Z
∇· ( ϕ3 E3 )dV = − E32 dV ≤ 0. (7.21)
Combining Equations (7.21) and (7.20), we see that

Z
E32 dV = 0, (7.22)
or E3 = 0. Therefore E1 = E2 .
7.3 The method of images
7.3.1 The classic image problem

The method of images doesn’t apply to very general situations, but
it is a neat method when it works. The classic image problem is
this:
A point charge q is hold at a distance d above an infinite grounded
conducting plane. What’s the potential above the plane? For ease
of description, let’s establish a coordinate system, and the charge is
located at (0, 0, d) on the x-y plane.
Mathematically, we want to solve the Poisson equation in region
z > 0 with q at (0, 0, d), subjecting to the the following boundary
condition,
• ϕ = 0 when z = 0 (grounded conducting plane).
• ϕ = 0 when z → ∞.
Therefore the first uniqueness theorem applies: if we can guess a

solution satisfying the boundary condition, then it is exactly the
solution we want.
Here is the trick to solve this problem: instead of solving the
original problem directly, we solve a different problem. The new
problem has two charges q and −q at (0, 0, d) and (0, 0, −d), re-
spectively. Then the total potential at an arbitrary point ( x, y, z)
is
!
1 q q
e=
ϕ p −p . (7.23)
4πe0 x 2 + y2 + ( z − d )2 x 2 + y2 + ( z + d )2
You can easily check that
e = 0 when z = 0 (grounded conducting plane).

• ϕ
e = 0 when z → ∞.
• ϕ
From the first uniqueness theorem, ϕ = ϕ e is the solution we want

to the original problem.
Note that this method is not limited to a single point charge; any
stationary charge distribution near a grounded conducting plane
can be treated by introducing its mirror image; hence the name
method of images. You see the crucial role the uniqueness theorem
play in the method of images.
7.3.2 Induced surface charge

The surface charge induced can be calculated easily. The field inside
the conductor is 0; therefore,
∂ϕ ∂ϕ
σ = − e0 = − e0 . (7.24)
∂n ∂z
At z = 0 and using Equation (7.23),
qd 1
σ=− . (7.25)
2π ( x2 + y2 + d2 )3/2
The total induced charge can be obtained by integrating over x and
y:
Z Z
Q= σdxdy = σrdrdφ = −q. (7.26)
Left as an exercise, you can also calculated the force felt by q due to
the induced charge.
7.3.3 Other image problems

Suppose a point charge q is situated at a distance a from the center
of a grounded conducting sphere of radius R. We can use the
method of images to find the potential outside the sphere.
The solution using method of images is to solve a different
problem: a point charge q0 = −( R/a)q is placed a distance
b = R2 /a (7.27)
to the right of the center of the sphere. Now these two point
charges will produce a potential
1 q q0
ϕ (r ) = ( + 0 ), (7.28)
4πe0 Rq Rq
where Rq is the distance to q and R0q is the distance to q0 . You can

see that ϕ(r ) = 0 at all points on the sphere. Therefore ϕ(r ) is the
solution to the original problem.
Again, the method of images is very powerful and simple when
it works. But most of the time, finding the “image” is a mission
impossible.
7.4 Separation of variables
Since you’re very familiar with the method of separation of vari-

ables. I’ll just show a few examples of using the method to solve
electrostatic problems.
7.4.1 Cartesian coordinates

First, let’s solve an electrostatic problem in 3D using Cartesian
coordinates.
Example: An infinitely long rectangular metal pipe with sides a
and b is grounded, but one end, x = 0, is maintained at a specified
potential ϕ0 (y, z). Find the potential inside the pipe.
Solution: The equation we need to solve is
∂2 ϕ ∂2 ϕ ∂2 ϕ
+ 2 + 2 = 0, (7.29)
∂x2 ∂y ∂z
with the boundary conditions
ϕ = 0 at y = 0, y = a, z = 0, z = b; (7.30)
ϕ = 0 at x → ∞; (7.31)
ϕ = ϕ0 (y, z) at x = 0. (7.32)
Using separation of variables, we let ϕ( x, y, z) = X ( x )Y (y) Z (z).

Putting this into Equation (7.29), we have
1 d2 X 1 d2 Y 1 d2 Z
+ + = 0. (7.33)
X dx2 Y dy2 z dz2
Therefore,
1 d2 X
= C1 , (7.34)
X dx2
1 d2 Y
= C2 , (7.35)
Y dy2
1 d2 Z
= C3 , (7.36)
Z dz2
with C1 + C2 + C3 = 0.
Considering boundary conditions given by Equations (7.30)-
(7.32), we let C2 = k2 , C3 = l 2 , and then C1 = −(k2 + l 2 ). Putting
these into Equations (7.34) - (7.36), we have
√ √
k2 + l 2 x k2 + l 2 x
X = Ae + Be− , (7.37)
Y = C sin ky + D cos ky, (7.38)
Z = E sin lz + F cos lz. (7.39)
From Equation (7.31), it’s easy to see that A = 0. At y = 0, Y = 0,

then we must have D = 0. At y = a, Y = 0, we then have k = nπ/a.
Similarly F = 0 and l = mπ/b. The final solution of ϕ is
√
ϕ= ∑ Cn,m e−π (n/a)2 +(m/b)2 x
sin(nπy/a) sin(mπz/b). (7.40)
n,m
The remaining constant C can be determined from the boundary

condition ϕ( x = 0) = ϕ0 (y, z). From which we have
4
Z nπy mπz
Cn,m = ϕ0 (y, z) sin sin dydz. (7.41)
ab a b
In case ϕ0 (y, z) is a constant ϕ0 , then

(
0, if n or m is even,
Cn,m = (7.42)
16ϕ0 /π 2 mn, if n and m are odd.
7.4.2 Spherical Coordinates
Sometimes we have problems with round objects. In these cases,

using spherical coordinates is a natural choice.
In spherical coordinates (r, θ, φ), the Laplace equation is
∂2 ϕ

1 ∂ 2 ∂ϕ 1 ∂ ∂ϕ 1
r + 2 sin θ + = 0. (7.43)
r2 ∂r ∂r r sin θ ∂θ ∂θ r2 sin2 θ ∂φ2
In this class, we’ll only deal with problems with azimuthal symme-
try; i.e., ϕ is independent of φ. Therefore Equation (7.43) becomes

∂ 2 ∂ϕ 1 ∂ ∂ϕ
r + sin θ = 0. (7.44)
∂r ∂r sin θ ∂θ ∂θ
Using the method of separation of variables, we let ϕ = R(r )Θ(θ ),

then Equation (7.44) becomes

1 d dR 1 d dΘ
r2 + sin θ = 0. (7.45)
R dr dr Θ sin θ dθ dθ
The first and the second term then equals a constant of opposite
signs. For convenience, in spherical coordinates, we write

d 2 dR
r = l (l + 1) R, (7.46)
dr dr

d dΘ
sin θ = −l (l + 1) sin θ Θ. (7.47)
dθ dθ
You should have learned from your mathematical methods for

physics class that these two Equations give
B
R(r ) = Ar l + , (7.48)
r l +1
Θ(θ ) = Pl (cos θ ). (7.49)
Here Pl (cos θ ) are Legendre polynomials in the variable cos θ. The

first few Legendre polynomials are
P0 ( x ) = 1, (7.50)
P1 ( x ) = x, (7.51)
P2 ( x ) = (3x2 − 1)/2, (7.52)
3
P3 ( x ) = (5x − 3x )/2, (7.53)
4 2
P4 ( x ) = (35x − 30x + 3)/8. (7.54)
Notice that Pl ( x ) is the lth-order polynomial in x; it contains only

even powers if l is even, and odd powers if l is odd. The Legendre
polynomials are orthogonal functions; i.e.,
Z 1 Z π
Pl ( x ) Pl 0 ( x )dx = Pl (cos θ ) Pl 0 (cos θ ) sin θdθ
−1 0
if l 0 6= l,


 0,

= (7.55)
2
if l 0 = l.


 ,
2l + 1
We’ll of course need this property of the Legendre polynomials in
using the method of separation of variables.
The final general solution of the Laplace equation (7.44) is then
∞
B
ϕ(r, θ ) = ∑ Al r l + l +l 1 Pl (cos θ ). (7.56)
l =0 r
Example 1: The potential ϕ0 (θ ) is specified on the surface of a

hollow sphere of radius R. Find the potential inside the sphere.
Solution: Because we need a finite ϕ at r = 0, Bl = 0; therefore,
∞
ϕ(r, θ ) = ∑ Al rl Pl (cos θ ). (7.57)
l =0
At r = R,
∞
ϕ( R, θ ) = ∑ Al Rl Pl (cos θ ) = ϕ0 (θ ). (7.58)
l =0
Using Equation (7.55), we can find

2l + 1
Z π
Al = ϕ0 (θ ) Pl (cos θ ) sin θdθ. (7.59)
2Rl 0
Example 2: An uncharged metal sphere of radius R is placed in

an otherwise uniform electric field E = E0 ez . Find the potential in
the region outside the sphere.
Let’s determine the boundary conditions. First, the conducting
sphere is an equipotential, we can set its potential to be a constant
or simply 0. Second, since there is an external uniform E field, at
large z, ϕ = − E0 z + C. The problem is totally symmetric at z = 0 (ϕ
does not change in the z = 0 plane), therefore C = 0. In summary,
the boundary condition for this problem is
ϕ = 0, if r = R, (7.60)
ϕ = − E0 r cos θ, if r R. (7.61)
Condition (7.60) yields

Bl
Al Rl + = 0, (7.62)
R l +1
or
Bl = − Al R2l +1 . (7.63)
Equation (7.56) then becomes

∞
!
R2l +1
ϕ(r, θ ) = ∑ Al r l − l +1
r
Pl (cos θ ). (7.64)
l =0
For r R, the second term in parentheses is negligible compared

to the first term; therefore boundary condition (7.61) gives
∞
ϕ(r, θ ) = ∑ Al rl Pl (cos θ ) = −E0 r cos θ. (7.65)
l =0
If you recall that P1 ( x ) = x or P1 (cos θ ) = cos θ, you can immedi-

ately reach the conclusion that
A1 = − E0 , (7.66)
and all other Al ’s are 0. Therefore, the final solution of ϕ outside

the sphere is
R3

ϕ(r, θ ) = − E0 r − 2 cos θ. (7.67)
r
Apparently, the first term is the external field, while the second
term is due to the induced charge. The induced charge density can
be calculated directly using

∂ϕ
σ (θ ) = − e0 = 3e0 E0 cos θ. (7.68)
∂r R
Clearly σ is positive in the “northern” hemisphere, and “negative”
in the “southern” hemisphere.
Example 3: A spherified charge density σ0 (θ ) is glued over the
surface of a spherical shell of radius R. Find the potential field
inside and outside the sphere.
Solution: We’ll solve the equation in two regions, inside and
outside, and match them at the boundary r = R to satisfy the
boundary condition.
For r ≤ R,
ϕ= ∑ Al rl Pl (cos θ ). (7.69)
l
For r ≥ R,
B
ϕ= ∑ rl+l 1 Pl (cos θ ). (7.70)
l
The potential is continuous at r = R, therefore,

Bl
Al Rl = , (7.71)
R l +1
or
Bl = Al R2l +1 . (7.72)
Then using Equation (7.9),

∂ϕout ∂ϕin
e0 − e0 = −σ0 (θ ). (7.73)
∂r r= R ∂r r= R
or
e0 ∑(2l + 1) Al Rl −1 Pl (cos θ ) = σ0 (θ ). (7.74)

l
With Equation (7.55),

1
Z π
Al = σ0 (θ ) Pl (cos θ ) sin θdθ. (7.75)
2e0 Rl −1 0
This finishes our solution of Example 3.
7.4.3 Boundary value problems with linear dielectrics

For linear dielectrics, the boundary condition Equation gives
(eI EI − eII EII ) · n = σ f , (7.76)
where n is the unit vector pointing from region II to region I, or in

terms of potential
∂ϕI ∂ϕII
eI − eII = −σ f . (7.77)
∂n ∂n
The potential itself is still continuous,
ϕII = ϕI . (7.78)
A useful note sometimes we use is that for a homogeneous

isotropic linear dielectric, the bound charge density ρb is propor-
tional to the free charge density ρ f ,
e − e0 e − e0

ρb = −∇· P = −∇· D =− ρ f .2 (7.79)
e e
2
Notice that P = D − e0 E = (1 −
Example: A sphere of homogeneous linear dielectric material e0 /e) D
is placed in an otherwise uniform field E0 . Find the electric field

inside the sphere.
Solution: Since the material is homogeneous linear dielectric, then
ρb = ρ f = 0 inside the sphere. Therefore the equation to solve is
∇2 ϕ = 0 (7.80)
both inside and outside the sphere. The boundary conditions are
ϕin = ϕout , ϕ is continuous (7.81)

∂ϕin ∂ϕout
e = e0 , σ f = 0 on the surface, (7.82)
∂r ∂r
ϕout = − E0 r cos θ, for r R. (7.83)
Inside the sphere, the solution is

∞
ϕ= ∑ Al rl Pl (cos θ ); (7.84)
l =0
outside the sphere,

∞
B
ϕ = − E0 r cos θ + ∑ rl+l 1 Pl (cos θ ). (7.85)
l =0
Boundary condition Equation (7.81) requires that

∞ ∞
B
∑ Al Rl Pl (cos θ ) = −E0 R cos θ + ∑ Rl+l 1 Pl (cos θ ), (7.86)
l =0 l =0
which gives
B1
A1 R = − E0 R + , (7.87)
R2
Bl
Al Rl = , for l 6= 1. (7.88)
R l +1
Condition (7.82) yields
∞ ∞
(l + 1) Bl
e ∑ l Al Rl −1 Pl (cos θ ) = −e0 E cos θ − e0 ∑ Pl (cos θ ),
l =0 l =0 R l +2
(7.89)
so
e0 (l + 1) Bl
el Al Rl −1 = − , for l 6= 1, (7.90)
R l +2
2e0 B1
eA1 = −e0 E0 − . (7.91)
R3
Equations (7.88) and (7.90) give
Al = Bl = 0, for l 6= 1, (7.92)
while Equation (7.91) and (7.87) can be solved to give
3e0
A1 = − E0 , (7.93)
e + 2e0
e − e0 3
B1 = R E0 . (7.94)
e + 2e0
Therefore
3e0 3e0
ϕin = − E0 r cos θ = − E0 z, (7.95)
e + 2e0 e + 2e0
and the field inside the sphere is
3e0
E= E0 . (7.96)
e + 2e0
8
General Magnetostatics
8.1 Static magnetic field
Let’s first derive the general or strictly valid equations (like Coulomb’s
law) for static magnetic fields. The Maxwell equations describing
static magnetic fields are
∇· B = 0, (8.1)
∇× B = µ0 j. (8.2)
We have introduced the vector potential A, which can be used to

present B by
B = ∇× A. (8.3)
Substituting Equation (8.3) into Equation (8.2) gives the equation A

satisfies
∇×(∇× A) = µ0 j (8.4)
Note that ∇×(∇× A) = ∇(∇· A) − ∇2 A, and choose Coulomb

gauge
∇· A = 0, (8.5)
then the equation for A is
∇2 A = −µ0 j. (8.6)
This is completely analogous to the Poisson equation

ρ
∇2 φ = − . (8.7)
e0
Therefore we can directly write the solution of Equation (8.6) for
A using the knowledge we have learned from the electrostatic
problem. Recall that for a point charge,
1 q
φ= , (8.8)
4πe0 | x − x0 |
and the general solution of the Poisson equation (8.7) is
1
Z
ρ
φ= dV 0 , with dV 0 = d3 x 0 (8.9)
4πe0 | x − x0 |
Here x is for the field point and x0 for the charge. Therefore the
solution of A from Equation (8.6) is, with ρ/e0 → µ0 j,
j
Z
µ0
A= dV 0 . (8.10)
4π | x − x0 |
The static magnetic field can be obtained from A using B = ∇× A,

or
j( x0 )
Z
µ0
B = ∇×A = ∇× dV 0 ; (8.11)
4π | x − x0 |
i.e., we can bring the ∇ operator inside the integral, since the
∇ operator is the differentiation w.r.t. x, not x0 . Performing the
differentiation, we have

j
Z
µ0
B= ∇× dV 0 (8.12)
4π R
1
Z
µ
= 0 ∇ × j dV 0 . (8.13)
4π R
Or finally, the static magnetic field produced by current j is
j×R
Z
µ0
B= dV 0 . (8.14)
4π R3
This is the law of Biot and Savart. In case of a localized loop cur-
rent, we replace jdV 0 → Idl, where I is the current. Then
Idl
I
µ0
A= , (8.15)
4π R
Idl × R
I
µ0
B= . (8.16)
4π R3
8.2 Magnetic field multiple moments
In this section, we consider the field at large distances x produced

by a localized current. The technique used is similar to the multiple
expansion in the electrostatic case. Put the origin of the coordinate
inside the current system, then r r 0 ,
f ( x − x0 ) = f ( x) − x0 · ∇ f ( x) + · · · (8.17)
Therefore, with r ≡ | x|,
1 1 1
≈ − x0 · ∇ + · · · , i = 1, 2, 3. (8.18)
| x − x0 | r r
Everything works similarly to the electrostatic case. However,

because the magnetic potential A is a vector, the general multiple
expansion is more complicated. We’ll only consider the dipole
moment term.
The complete form of A at x is
j( x0 )
Z
µ0
A= dV 0 (8.19)
4π | x − x0 |
general magnetostatics 91
The 0th and 1st order term in the multiple expansion of A is

Z
µ0
A (0) = jdV 0 (8.20)
4πr
1
Z
µ
A (1) =− 0 j( x0 · ∇ )dV 0 (8.21)
4π Z r
µ0 x 0 0
= · x jdV . (8.22)
4π r3
We ignore higher order terms, since they are rarely used and also
they are complicated.
First we consider A(0) , the monopole term. Intuitively, A(0) = 0,
since there exists no magnetic monopoles. Let’s prove this directly.
We’ll need to use
Z
( f j · ∇0 g + gj · ∇0 f )dV 0 = 0. (8.23)
where ∇0 = ∂/∂x0 . For simplicity, we call this identity the f −

g identity. The proof of this identity will be left as an exercise.
Addition: j is local steady current density, and it has j · dS = 0 on
the boundary.
Let f = 1 and g = x0 · ei = xi0 , so that ∇0 g = ei . Using the f -g
identity (8.23),
Z
(0)
Ai = j · ei dV 0 = 0, i = 1, 2, 3. (8.24)
All together, we have
A(0) = 0. (8.25)
Again this is related to the fact that there is no magnetic monopole.

To calculate the 1st order term, we need to calculate
Z
x· x0 j dV 0 . (8.26)
Letting f = xi0 and g = xl0 , so that

Z
( xi0 j · el + xl0 j · ei ) dV 0 = 0, (8.27)
or
Z Z Z
( xi0 jl + xl0 ji ) dV 0 = 0. ⇒ xl0 ji dV 0 = − xi0 jl dV 0 (8.28)
The A(1) related term can be rewritten as

Z Z
x· x0 ji dV 0 = xl xl0 ji dV 0 (8.29)
Z 0
xl ji − xi0 jl

= xl dV 0 (8.30)
2
1
Z
= x ekli ( x0 × j)k dV 0 (8.31)
2 l
1
Z
= − eilk xl ( x0 × j)k dV 0 (8.32)
2

1
Z
= − x × ( x0 × j)dV 0 . (8.33)
2 i
Define the magnetic moment to be
1
Z
m= x0 × jdV 0 , (8.34)
2
then
µ0 m × x
A (1) = . (8.35)
4π r3
Since A(0) always vanishes, normally the dipole moment term A(1)
dominates. For a small current loop, jdV 0 → Idl 0 ,
1
I
m= x0 × Idl 0 . (8.36)
2
Define S ≡ 12 x0 × dl 0 . For a plane loop, m = IS, with S the area

H
of the loop. The direction of S is determined from the current using

the right-hand rule.
The magnetic field B can be calculated directly from A.
m×x

µ0
B = ∇× A = ∇× (8.37)
4π r3
or
µ0 3( n · m ) n − m x
B= , with n ≡ . (8.38)
4π r3 r
This is analogous to the electrostatic dipole field with p replaced by

m.
8.3 A localized current in an external magnetic field
We consider a localized current distribution in an external B. The

purpose is to calculate the force and the torque on the current
system. The force felt by the current system is
Z
F= j × B( x0 )dV 0 . (8.39)
To find the force, we first estimate the magnetic field, B = Bi ei ,
Bi ( x0 ) = Bi (0) + x0 · ∇ Bi (0) + · · · , (8.40)
The origin of the coordinates is within the current system.

Substituting the equation of B( x0 ) into F, then
Z
[( j × B(0))i + j × ( x0 · ∇) B i + · · · ]dV 0

Fi = (8.41)
(0) (1)
= Fi + Fi +··· . (8.42)
We first calculate F (0) ,

Z
F (0) = jdV 0 × B(0) = 0, (8.43)
from conclusions obtained in the last section. The force of the

dipole term is
Z
F (1)
= j × ( x0 · ∇) B(0)dV 0 . (8.44)
Its ith component is

Z Z
(1)
Fi = eiln jl x0 · ∇ Bn (0)dV 0 = eiln jl xk0 ∇k Bn (0)dV 0 . (8.45)
Recall that when deriving the multiple expansion of A, we have

used
µ0 x k µ0 ( m × x ) l
Z
(1) (1)
Al = xk0 jl dV 0 ⇒ Al = (8.46)
4π r3 4π r3
We can conclude that
Z
xk0 jl xk dV 0 = (m × x)l (8.47)
Replacing xk → ∇k Bn , we have
Z
xk0 jl ∇k Bn (0)dV 0 = (m × ∇)l Bn (0), (8.48)
therefore
(1)
Fi = eiln (m × ∇)l Bn (0), (8.49)
or
F (1) = (m × ∇) × B (8.50)
Using the middle-outer rule, ∇ = ∇ B , and the fact that the ∇

operator is the differentiation w.r.t. x, not x0 , we can easily derive,
(m × ∇) × B = ∇(m · B) − m(∇· B),
Because ∇· B = 0, we have:
F (1) = ∇(m · B). (8.51)
This equation could be put in another form. Using the identity,
∇( A · B) = A × (∇ × B) + B × (∇ × A) + ( A · ∇) B + ( B · ∇) A
and that ∇ ≡ ∂/∂x, we have
∇(m · B) = m × (∇ × B) + (m · ∇) B; (8.52)
Outside the region of current which generates the external field,

∇× B = 0; therefore
F (1) = (m · ∇) B. (8.53)
This is the force felt by a magnetic dipole in an external B. It’s

analogous to the force felt by the electrostatic dipole moment in
an external field with p replaced by m. The potential energy of a

localized current in external field B is
U = −m · B, (8.54)
which can be easily see from that the force is
F = −∇U = ∇(m · B). (8.55)
Compare this with the energy of electrostatic dipole moment,

which is U = p · E. It can be seen from Equation (8.54) that a
magnetic dipole will always try to orient itself parallel to B to
achieve the lowest potential energy U.
The torque felt by the localized current is
Z Z
K= x0 × f dV 0 = x0 × ( j × B)dV 0 . (8.56)
Here the lowest order term, which contributes, is

Z
K ≈ K (0) = x0 × [ j × B(0)]dV 0 (8.57)
Z
= [( x0 · B) j − B( x0 · j)]dV 0 . (8.58)
The first term on the right hand side is

Z Z
( x0 · B) jdV 0 = [ B · x0 ) jdV 0 = m × B(0). (8.59)
The second term on the right hand side is 0 from the f -g identity
(Equation (8.23)) with f = g = r 0 . Therefore the lowest order torque
is
K = m × B (0). (8.60)
With K and m, we introduce an interesting phenomena. The

angular momentum L is related to the torque by
dL
= K = m × B0 , (8.61)
dt
where B0 = B(0), and L = ∑ x0 × p = ∑ x0 × mv. Here the
summation is over all charged particles. Now the magnetic dipole
moment is
1 1 1
Z Z
2∑
m= x0 × jdV 0 = x0 × ρvdV 0 = x0 × qv. (8.62)
2 2
We see that, if q/m is the same for all particles,
1 q q q
2∑ ∑ x0 × mv = 2m L.
m= ( x0 × mv) = (8.63)
m 2m
So the angular momentum equation becomes
dL q
= L × B0 = L × Ω L (8.64)
dt 2m
with Ω L = (q/2m) B0 the Larmor precession frequency.

8.4 Magnetization
8.4.1 Diamagnets, Paramagnets, Ferromagnets

All magnetic fields are produced by currents, or electric charges
in motion microscopically. Electrons orbits around nuclei and spin
about their axes; we can view these tiny current loops as magnetic
dipoles. Normally, these magnetic dipoles are randomly oriented;
their magnetic fields cancel each other. When a magnetic field
is applied, however, a net alignment of dipoles occurs, and the
medium becomes magnetically polarized, or magnetized.
Different from electric polarization, where P k E, there are three
different kinds of magnetization: a magnetization parallel to B
(paramagnets), magnetization opposite to B (diamagnets), and for a
few substances, magnetization depending on the whole magnetic
“history” (ferromagnets). Ferromagnets are very complicated; I will
not discuss it in this class. You can read the corresponding chapters
in the textbook if you are interested. In the following sessions,
let me explain the microscopic mechanism for paramagnets and
diamagnets.
8.4.2 Paramagnetism
Electrons spin, and each electron constitutes a magnetic dipole.
We have learned before that when a B is applied, each magnetic
dipole feels a torque; this torque tends to align the dipole parallel
to the B to minimize the potential energy. This torque accounts
for paramagnetism, since the resulting magnetic field from the
magnetic dipole is in the same direction as B. You might expect
paramagnetism to be universal; however, quantum mechanics tends
to lock electrons in pairs with opposing spins. The torque on the
each pair of electrons is close to 0. As a result, paramagnetism most
often occurs in atoms with an odd number of electrons, where the
extra unpaired electron is subject to the magnetic torque.
8.4.3 Diamagnetism
Electrons not only spin; they also orbit around nucleus. Assume the
radius of the orbit is R; the resulting current is
e ev
I=− =− . (8.65)
T 2πR
The current is not steady, but unless we work on a time scale com-
parable to T, the approximation is not bad. Let z direction be the
direction of angular velocity; then the resulting magnetic dipole is
evR
m = IS = − ez (8.66)
2
When we apply an external magnetic field B, the orbit is subject
to a torque m × B. But it’s a lot harder to tilt the entire orbit than
it is the spin, so the orbital contribution to paramagnetism is small.
However, the electron speeds up or slows down, depending on the

direction of B. To see this, before applying B, we have
1 e2 v2
2
= me . (8.67)
4πe0 R R
When we apply an external field B, this is an external force −e(v/c) ×

B, and the electron’s new velocity is v. Assuming B = Bez , the new
force equation is
1 e2 v2
2
+ evB = me . (8.68)
4πe0 R R
Letting ∆v = v − v and keeping only first order terms in ∆v,

Equations (8.68) and (8.67) gives
v
evB = me 2∆v, (8.69)
R
or
eRB
∆v = . (8.70)
2me
When B = Bez is applied, the electron speeds up. Equation (8.66)
indicates that there is a change in m,
eR e2 R2
∆m = − ∆vez = − B. (8.71)
2 4me
Note that ∆m is opposite to the direction B. Therefore, when an
external B is applied, each orbiting electron picks up a little ex-
tra dipole moment, and the increments are all anti-parallel to B.
This is the mechanism responsible for diamagnetism. It’s universal,
but typically much much weaker than paramagnetism. Therefore,
diamagnetism is mainly observed in atoms with even number of
electrons, where paramagnetism is usually absent.
9
Time-varying Electromagnetic Fields
In this chapter, we will focus on time-varying electromagnetic fields

(electromagnetic waves) in vacuum and linear media.
9.1 The Wave Equation in Vacuum
Let’s start from probably the simplest case: the electromagnetic

waves in vacuum. The Maxwell equations in vacuum are
∂B
∇× E = − , (9.1)
∂t
∇· E = 0, (9.2)
1 ∂E
∇× B = 2 , (9.3)
c ∂t
∇· B = 0. (9.4)
If ∂/∂t = 0, then since ρ = 0, j = 0, E = 0 and B = 0 from

Coulomb’s law and Biot-Savart’s law. To obtain nonzero E and B
solutions from above equations, ∂/∂t 6= 0.
Applying ∂/∂t to the ∇× E equation,
∂E ∂ ∂2 B
∇× = ∇× E = − 2 . (9.5)
∂t ∂t ∂t
Using Ampere’s law, we rewrite the equation as
∂E
∇× = ∇× c2 ∇× B . (9.6)
∂t
or
1 ∂2 B
∇×(∇× B) = − . (9.7)
c2 ∂t2
Using the vector identity,
∇×(∇× B) = ∇(∇· B) − ∇2 B = −∇2 B, (9.8)
so the last equation on the previous slide becomes

1 ∂2 B
−∇2 B = − , (9.9)
c2 ∂t2
or
∂2 B
− c2 ∇2 B = 0. (9.10)
∂t2
Similarly, one can derive the equation for E,
∂2 E
− c2 ∇2 E = 0. (9.11)
∂t2
Using Maxwell equations in covariant form, one can directly derive

the wave equation for φ and A,
∂2 φ
− c2 ∇2 φ = 0,
∂t2
∂2 A
− c2 ∇2 A = 0.
∂t2
These are sometimes called the homogeneous d’Alembert equations,

and
∂2
≡ − c2 ∇2 , (9.12)
∂t2
is the d’Alembert operator.
9.2 The Electromagnetic Plane Waves
To understand better the d’Alembert equations for field variables,

we now study a prototype equation of E and B,
∂2 f 2
2∂ f
− c = 0, (9.13)
∂t2 ∂x2
this is equivalent to assume E = E( x ) or B = B( x ). If E or B
depends only on one spatial coordinate, the wave is called a plane
wave.
Let’s now solve the plane wave equation in two ways, allowing
us to understand the equation from different points of view.
The first method involves writing the plane wave equation as

∂ ∂ ∂ ∂
−c +c f = 0. (9.14)
∂t ∂x ∂t ∂x
We can define two new variables ξ and η given by
x
ξ = t− , (9.15)
c
x
η = t+ . (9.16)
c
These equations gives t = (η + ξ )/2 and x = (η − ξ )/2. The
definitions of ξ and η leads to

∂ 1 ∂ ∂
= −c , (9.17)
∂ξ 2 ∂t ∂x

∂ 1 ∂ ∂
= +c . (9.18)
∂η 2 ∂t ∂x
Therefore the equation of f becomes
∂2 f
= 0 ⇒ f = f 1 ( ξ ) + f 2 ( η ). (9.19)
∂ξ∂η
time-varying electromagnetic fields 99
In t and x,
f = f 1 ( x − ct) + f 2 ( x + ct). (9.20)
It’s not difficult to see that f ( x − ct) represents a plane wave mov-
ing in the positive x-direction, and f ( x + ct) represents a plane
wave moving in the negative x-direction. The solution f ( x − ct) is
illustrated in the following figure.
f = f(x-ct)
t=0 t=t
f (x0)
x0 = x x0 = x - ct
x = x0 x = x0 + ct
Now consider the electromagnetic plane wave propagating in the

positive x-direction. We know now that E, B, φ, A all are functions
of ξ = x − ct. Let’s see which properties of EM waves we can derive.
∂A ∂φ ∂A
E = −∇φ − = − ex − , (9.21)
∂t ∂x ∂t
so
∂φ ∂A x
Ex = − − . (9.22)
∂x ∂t
However, ξ = x − ct, φ = φ(ξ ), A = A(ξ ), therefore
∂φ ∂φ ∂φ ∂φ
= and = (−c). (9.23)
∂x ∂ξ ∂t ∂ξ
or
∂φ 1 ∂φ
=− , (9.24)
∂x c ∂t
Similarly, one can find
∂A x 1 ∂A x
=− , (9.25)
∂x c ∂t
Therefore, Ex can be written as
∂φ ∂A x 1 ∂φ ∂A x
Ex = − − = +c . (9.26)
∂x ∂t c ∂t ∂x
Choosing Lorenz gauge,
1 ∂φ
+ ∇· A = 0, (9.27)
c2 ∂t
we immediately arrive at the conclusion that Ex = 0. Since Ex = 0,
∂Ay ∂Az
Ey = − , and Ez = − (9.28)
∂t ∂t
Now

∂ 1 ∂ 1 ∂A
B = ∇× A = ex × A = − ex × A = ex × − (9.29)
∂x c ∂t c ∂t

1 1 ∂φ 1
= e x × ( E + ∇φ) = e x × E + e x = e x × E. (9.30)
c c ∂x c
This result becomes very general if we let e x = n̂, the direction of

propagation, then
B = n̂ × E/c. (9.31)
So for electromagnetic plane waves, we have the following proper-

ties:
1. B = E/c in SI units.
2. B ⊥ n̂, E ⊥ n̂, and B ⊥ E.
Both E and B are perpendicular to n, the wave is said to be trans-

verse. Electromagnetic waves in vacuum are transverse waves.
The energy Poynting flux of the plane wave is
1
S= E×B (9.32)
µ0
1
= E × (n̂ × E/c) (9.33)
µ0
1
= [n̂( E · E) − E( E · n̂)] (9.34)
µ0 c
1 2
= E n̂. (9.35)
µ0 c
So the energy flow direction is also n̂. The energy density of the EM
plane wave is
e0 2 1 2
w= E + B = e0 E 2 , (9.36)
2 2µ0
therefore
S = cwn̂. (9.37)
The velocity
S
v= = cn̂, (9.38)
w
is the group velocity of the EM plane waves.
9.3 The Monochromatic Plane Waves
The wave equation can be solved by other means, e.g., separation of

variables or Fourier analysis. 1 1
Normally Fourier transform is used,
but since you probably haven’t learned
it, I’ll use separation of variables here.
Letting f (t, x ) = g(t)h( x ), then
∂2 f d2 g
= h ( x ), (9.39)
∂t2 dt2
∂2 f d2 h
2
= g(t) 2 . (9.40)
∂x dx
The wave equation becomes
d2 g 2 d2 h
h ( x ) − c g ( t ) = 0. (9.41)
dt2 dx2
Re-organizing terms of the equation leads to
1 d2 g 2
21 d h
= c . (9.42)
g dt2 h dx2
From the method of separation of variables, we let
1 d2 gn
= −ωn2 , (9.43)
gn dt2
1 d2 hn
= −k2n with k n = ωn /c. (9.44)
hn dx2
The solutions of the equations are
gn = gn0 e−iωn t , (9.45)

ik n x
hn = hn0 e . (9.46)
so
f n = gn hn = f n0 ei(kn x−ωn t) , (9.47)

f = ∑ f n = ∑ f n0 ei(kn x−ωn t) . (9.48)
n n
In wave literature,
f = f 0 ei(k·r −ωt) , (9.49)
is called a monochromatic (single frequency) wave. Here ω is the

angular frequency 2 , k is the wave vector, and ϕ = k · r − ωt is called 2
In theoretical physics, this is simply
called frequency.
the wave phase. You can see that ∂ϕ/∂t = −ω and ∇ ϕ = k. For
electromagnetic waves in vacuum, the phase velocity v p ≡ ω/k = c
equals the wave group velocity (v g ) 3 . The solution f n is therefore a 3
Note that this is a special case and in
monochromatic plane wave with frequency ωn and wave number general is not true.
k n . From Equation (9.48), general plane waves can be considered to

be consisted of a series of monochromatic waves.
Using complex variables to represent real physical variables
related to waves is very convenient and has a lot of advantages. For
example, for electromagnetic plane waves in vacuum,
f = f 0 ei(k·r −ωt) , (9.50)
we can immediately have
∂f
= −iω f ; ∇ f = ik f . (9.51)
∂t
If we use real functions, like cos and sin, e.g., f = f 0 cos(kx − ωt),
then
∂f ∂f
= ω f 0 sin(kx − ωt); = −k f 0 sin(kx − ωt). (9.52)
∂t ∂x
This is too tedious and unnecessarily complicated.
If we use a complex function f to represent a physical quantity,
then the real physical variable is just Re( f ). But under what con-
dition can we do this? The equation that f satisfies must be a linear
equation. Suppose L is a linear operator, then
L{ f } = L{Re( f ) + i Im( f )} = L{Re( f )} + i L{Im( f )} = 0, (9.53)
meaning L{Re( f )} = 0 and L{Im( f )} = 0. Maxwell equations are

linear equations, so we can use complex numbers to represent E
and B.
E.g., for a monochromatic wave variable f (scalar) and A (vec-
tor),
∂A ∂f
∇× A = ik × A, = −iωA, ∇ f = ik f and = −iω f .
∂t ∂t
(9.54)
Therefore, for an electromagnetic monochromatic plane wave,
∂A
E = −∇φ − = −ikφ + iωA, (9.55)
∂t
B = ik × A = k × E/ω. (9.56)
From this we see that E ⊥ B, and B ⊥ k. The physical “E” and “B”
are just Re(E) and Re(B).
What’s the energy density of the field if we use complex E? From
now, let’s denote complex E by Ê so that
E = ReÊ. (9.57)
The energy density of the electric field is w = e0 E · E/2. But is the

equation
ne o
0
w = Re Ê · Ê , (9.58)
2
correct? We need to look at this carefully. Suppose Ê = E0 ei(k· x−ωt) ,
and for simplicity, E0 is real.4 4
In general, it is a complex constant.
Ê · Ê = E20 e2iϕ with ϕ ≡ k · x − ωt, (9.59)
then
ne o e0
0
Re Ê · Ê = E02 cos 2ϕ. (9.60)
2 2
However, the energy density of the field is, using real variables
E = E0 cos ϕ,
e0 ne
0
o
w = E20 cos2 ϕ 6= Re Ê · Ê . (9.61)
2 2
The reason for this is obvious: w is not a linear function of E. How
to properly represent the energy density using complex variables
will be left as an exercise.
9.4 Wave Polarization
Let’s now introduce wave polarization. Consider a plane monochro-

matic EM wave with k = ke x , then
E = E0 ei(kx−ωt) , (9.62)
Since k = ke x , E = Ey + Ez , or
E = ( E0y + E0z )ei(kx−ωt) = ( Ê0y ey + Ê0z ez )ei(kx−ωt) . (9.63)
Note here Ê0y and Ê0z are in general complex constants.

Let’s start from the simplest case, the linear polarization. If Ê0y
and Ê0z have the same phase α; i.e.,
Ê0y = E0y eiα and Ê0z = E0z eiα . (9.64)
Then
E = ( E0y ey + E0z ez )ei(kx−ωt+α) , (9.65)
You can see here that the electric field vector oscillates along a line
with a fixed direction n̂; the angle between n̂ and ey is

E0z
hn̂, ey i = tan−1 , (9.66)
E0y
This wave is called linearly polarized or plane polarized.

In general Ê0y and Ê0z have different phases, and the wave is
said to be elliptically polarized.
Let’s start from the simplest case of circular polarization, where
E0y = E0z = E0 , i.e.,
Ê0y = E0 eiα and Ê0z = E0 ei(α±π/2) , (9.67)
then
E = E0 (ey ± iez )ei(kx−ωt+α) . (9.68)
The actual electric field is Re(E), i.e.,
Ey = E0 cos(kx − ωt + α), (9.69)

Ez = ∓ E0 sin(kx − ωt + α). (9.70)
At a fixed point z = z0 , the electric field sweeps around in a circle

at a frequency ω with constant magnitude E0 . This wave is called
circularly polarized. Note that Ê0z / Ê0y = i or −i, the directions
of rotation (DOR) are different. When the observer is facing at the
oncoming wave,
1. Ê0z / Ê0y = i, DOR is counter-clockwise.
2. Ê0z / Ê0y = −i, DOR is clockwise.
E = E0 (ey + iez )ei(kx−ωt+α)

Circular polarization may be referred to as right-handed or left-

handed, but there is a lot of confusion about this definition. There
are at least three different ways I’m aware of to define these terms.
Way I: From the point of view of the source
Pointing one’s left or right thumb away from the source, in the
same direction as k, matching the curling of figures to the temporal
rotation of the field.
1. Ê0z / Ê0y = i, right-handed polarization.
2. Ê0z / Ê0y = −i, left-handed polarization.
Typical communities using Way I are like engineering, quantum

physics, and radio astronomy.
Way II: From the point of view of the receiver
Pointing one’s left or right thumb toward the source, against
the direction of k, matching the curling of figures to the temporal
1. Ê0z / Ê0y = i, left-handed polarization.
2. Ê0z / Ê0y = −i, right-handed polarization.
Typical community using Way II is optics.

Way III: From the direction of the local magnetic field
Pointing one’s left or right thumb along local B0 , regardless of
the direction of k, matching the curling of figures to the temporal
1. Ê0z / Ê0y = i, left or right polarization.
2. Ê0z / Ê0y = −i, left or right polarization.
Typical community using Way III is plasma physics. The reason for
this is that R-mode waves rotate in the same sense as an electron,
L-mode waves rotate in the same sense as an ion.
Circular polarization is a special case of the elliptical polarization.
Now let’s take Landau’s approach and study elliptical polarization.
Recall that
E = E0 ei(kx−ωt) , (9.71)
Suppose E0 = beiα , where b is a complex vector,
b = b1 + ib2 , (9.72)
where b1 and b2 are real vectors. Here we choose b so that b · b =

b21 + b22 + 2ib1 · b2 is real5 , then b1 · b2 must be 0, or b1 ⊥ b2 . For 5
It’s customary to use b2 ≡ b · b.
simplicity of discussion, we choose b1 = b1 e1 with b1 > 0.
The other direction e2 is defined so that e x × e1 = e2 , then
b2 = ±b2 e2 , b2 > 0. (9.73)

Let’s now calculate the e1 and e2 components of E.
E = bei(kx−ωt+α) = (b1 e1 ± ib2 e2 )ei(kx−ωt+α) . (9.74)

i(kx −ωt+α)
Ê1 = b1 e e1 , (9.75)
Ê2 = ±b2 ei(kx−ωt+α+π/2) e2 . (9.76)
Therefore, in e1 and e2 directions,
E1 = b1 cos(kx − ωt + α), (9.77)

E2 = ∓b2 sin(kx − ωt + α). (9.78)
You can then easily verify that E1 and E2 satisfy
E12 E22
+ = 1. (9.79)
b12 b22
At any fixed point in space, the E vector rotates in a plane per-

pendicular to the direction of propagation. The endpoint of the
E vector describes the ellipse given by the equation; the wave is
elliptically polarized. If b1 = b2 , then the wave becomes circularly
polarized. If b1 = 0 or b2 = 0 (but not both!), then E1 = 0 or E2 = 0,
respectively. The wave is linearly polarized.
9.5 Propagation of Electromagnetic Waves in Linear Media
From now on, we consider electromagnetic waves in matter. For

linear media without free current and free charges, we know that
the Maxwell equations are
∇· E = 0, (9.80)
∂B
∇× E = − , (9.81)
∂t
∇· B = 0, (9.82)
∂E
∇× B = µe . (9.83)
∂t
The boundary conditions are the same as before,
e1 E1⊥ = e2 E2⊥ , (9.84)

k k
E1 = E2 , (9.85)
B1⊥ = B2⊥ , (9.86)
k k
B1 B2
= . (9.87)
µ1 µ2
We have assumed here that µ and e are independent of wave fre-

quency. Now following what we have down before, the components
of E and B satisfy the wave equation
∂2 u
− v2 ∇2 u = 0, (9.88)
∂t2
√
where v = 1/ µe is the phase velocity of light in this substance.
The plane-wave solution of Equation (9.88) has been obtained
before, in general
u( x, t) = f ( x − vt) + g( x + vt). (9.89)
Or in terms of monochromatic plane wave solution,
u = u0 ei(k· x−ωt) , (9.90)
with
ω √
k= = ω µe, (9.91)
v
the wave number. Note that we can define the index of refraction by
6 6
Again, please note that n is the
refraction index while the vector n̂
c √
r
µe denotes unit vector in the propagation
n= = = er µ r . (9.92) direction
v µ 0 e0
√
For most substances, µr ≈ 1, and therefore n ≈ er . Equation (9.91)
shows that the phase velocity of EM waves in linear media is scaled
by a factor 1/n.
With a monochromatic plane wave solution like Equation (9.90),
we can see from the Maxwell equations that
k · E = 0 and k · B = 0, (9.93)
therefore, the wave is transverse. Furthermore, Faraday’s law gives
ik × E = −(−iω ) B (9.94)
or
E
n× = B, (9.95)
c
where n = nk b = kc/w. Equation (9.95) shows that B = nE/c for a

transverse wave in media.
In above discussions, we assumed that k and ω are real. In more
general cases, to represent wave growth or damping, k or ω can be
complex. Let’s say for now that k = kr + iki is complex and ω is real.
Then the plane wave solution is
u = u0 ei(kr · x−ωt) e−ki · x . (9.96)
This means that the wave damps (k i > 0) or grows (k i < 0) spatially,
we’ll use this in studying wave propagation in conducting medium.
On the other hand, if k is real and ω = ωr + iωi , then
u = u 0 e i ( k · x − ωr t ) e ωi t . (9.97)
means that the wave grows (ωi > 0) or damps (ωi < 0) temporally.
9.6 Reflection and refraction of electromagnetic waves
9.6.1 General information

In this section, we will study reflection and refraction (折射) of
light, mainly using Equations (9.84)-(9.87). We will explain, using
electrodynamics, the following two basic properties you are familiar
with:
• The incident, reflected, and refracted wave vectors form a plane.
• Angle of reflection equals angle of incidence.
• Snell’s law; i.e.,

sin θi n0
= , (9.98)
sin θr n
where θi and θr are the angles of incidence and refraction, respec-
tively, while n0 and n are the corresponding indices of refraction.
We will also study the following things that you may not know
• Intensities of reflected and refracted radiation.
• Phase changes and polarization.
The coordinate system is like this. For z > 0, we have a linear

media with µ0 and e0 ; for z < 0, µ and e. The indices of refraction
are then n0 = µ0 e0 /µ0 e0 for z > 0 and n = µe/µ0 e0 for z < 0.
p p
A plane wave with wave vector k and frequency ω is incident from

z < 0, medium µ and e. The refracted and reflected wave have wave
vectors k0 and k00 , respectively. Define θi the incident angle, θr0 the
angle of reflection, θr the angle of refraction.
There are three waves.
• INCIDENT
E = E0 ei(k· x−ωt) , (9.99)

E √ k
B = n× = µe × E. (9.100)
c k
• REFRACTED
0
E0 = E00 ei(k · x−ωt) , (9.101)
E0 k0
B0 = n0 × = µ0 e0 0 × E0 .
p
(9.102)
c k
• REFLECTED
00
E00 = E000 ei(k · x−ωt)
, (9.103)
E00 √ k00
B00 = n00 ×= µe × E00 . (9.104)
c k
Note that here we have used n00 = n = µe/µ0 e0 and k00 =

p
√ 0
k = ω µe/µ0 e0 /c = ω µe. Of course, k = ω µ0 e0 /µ0 e0 /c =
p p
ω µ0 e0 .
p
The field in regions below z = 0 is E + E00 and B + B00 , and

above z > 0, E0 and B0 . Boundary conditions (9.84)-(9.87) will be
something like, at z = 0,
00 0
(· · · )ei(k· x−ωt) + (· · · )ei(k · x−ωt)
= (· · · )ei(k · x−ωt) . (9.105)
This condition must be satisfied at all points, at all times, when

z = 0. Therefore we must have that all the phase factors are the
same, or
k · x = k00 · x = k0 · x, at z = 0, (9.106)
or
k x x + k y y = k00x x + k00y y = k0x x + k0y y. (9.107)
This equation will be satisfied for all points (x and y) only if
k x = k00x = k0x , (9.108)
and
k y = k00y = k0y . (9.109)
Let’s orient our axes so that k is in the xz plane; i.e., k y = 0. Then

according to equation (9.109), k00y = k0y = 0; i.e., k00 and k0 are both in
the xz plane. So we have the first conclusion:
The incident, reflected, and refracted wave vectors form a

plane (called the plane of incidence), which also includes the
normal to the surface (here ez ).
Second, Equation (9.108) gives
k sin θi = k00 sin θr0 = k0 sin θr . (9.110)
Because k = k00 , we immediate have that θi = θr0 , or
The angle of incidence equals the angle of reflection.
Regarding the angle of refraction θr , we have
sin θi k0 n0
= = . (9.111)
sin θr k n
This is just the law of refraction – Snell’s law.

To obtain intensity and phase relations, we need to use boundary
conditions. Also for simplicity, we assume that the incident wave
is linearly polarized. A more general elliptically polarized wave
can be decomposed into two linearly polarized waves. We consider
two cases, one is the polarization vector E is within the plane of
incidence (xz plane). The second one with the polarization vector E
normal to the plane of incidence.
We first consider E normal to plane of incidence. Let all E in y
direction. The electric field of the incident wave is E0 = E0 ey , etc.
All B fields are then defined by B = n × E/c. Applying boundary

conditions given by Equations (9.85) and (9.87), we have
E0 + E000 − E00 = 0, (9.112)

s
e0 0
r
e 00
( E0 − E0 ) cos θi − E cos θr = 0. (9.113)
µ µ0 0
The other two boundary conditions yield nothing new. The relative
amplitudes of the reflected and refracted waves are
E00 2n cos θi
= q , (9.114)
E0 0 0 2 2 2
n cos θi + (µ/µ ) n − n sin θi
q
00 n cos − ( 0 ) n02 − n2 sin2 θ
E0 θ i µ/µ i
= q , (9.115)
E0 2
n cos θi + (µ/µ0 ) n02 − n2 sin θi
where we have used Snell’s law. This is for E perpendicular to the

plane of incidence.
Now if the E field is parallel to the plane of incidence. Let’s now
make all magnetic field components in −ey direction. Boundary
conditions (9.85) and (9.87) give
cos θi ( E0 − E000 ) − cos θr E00 = 0, (9.116)

s
e0 0
r
e
( E0 + E000 ) − E = 0. (9.117)
µ µ0 0
These can be combined to give
E00 2nn0 cos θi

= q , (9.118)
E0
(µ/µ0 )n02 cos θi + n n02 − n2 sin2 θi
q
E000 (µ/µ0 )n02 cos θi − n n02 − n2 sin2 θi
= q . (9.119)
E0
(µ/µ0 )n02 cos θi + n n02 − n2 sin2 θi
For normal incidence θi = 0, both cases give the same results
E00 2
= p 0 0 , (9.120)
E0 µe /µ e + 1
E000 µe0 /µ0 e − 1
p
= p 0 0 . (9.121)
E0 µe /µ e + 1
If further we assume µ0 = µ, then
E00 2n
= 0 , (9.122)
E0 n +n
E000 n0 − n
= 0 . (9.123)
E0 n +n
9.6.2 Polarization by reflection

There is an interesting thing about reflection. If E is in the plane of
incidence, then from equation (9.119), you can see that there is a
special angle of incidence, θ B , at which the reflection total vanishes;

i.e., E000 = 0. Now let’s assume µ0 = µ for simplicity, this angle is
given by
q
n02 cos θ B − n n02 − n2 sin2 θ B = 0, (9.124)
which can be solved to give
n0

−1
θ B = tan . (9.125)
n
This angle is called Brewster’s angle. Waves with E perpendicular to

the plane of incidence do not show such behavior. Now consider a
light with mixed polarization is incident at the Brewster angle, then
the reflected light is completely polarized with E perpendicular
to the plane of incidence. At other angles, the reflected light is
partially polarized. This is the physics behind the circular polarizer
widely used in photography or sunglasses to remove reflections or
glares from water surface.
9.6.3 Total internal reflection

You are probably very familiar with total internal reflection from
other cases. If n > n0 , then there is angle of incidence θ0 , at which
θr = π/2. This follows directly from Snell’s law; i.e.,
0
π n
n sin θ0 = n0 sin = n0 ⇒ θ0 = sin−1 . (9.126)
2 n
At θ0 , the refracted wave has a k0 parallel to the interface; there can

be no energy flow across the surface. Hence when θi = θ0 , there
must be total reflection.
But what about θi > θ0 ? To solve this, we note that for θi > θ0 ,
n n
sin θr = sin θi > 0 sin θ0 = 1. (9.127)
n0 n
Therefore θr must be a complex angle, and its cosine is purely
imaginary, given by
s
sin θi 2
q
2
cos θr = 1 − sin θr = i − 1. (9.128)
sin θ0
Putting cos θr into k0z = k0 · ez = k0 cos θr , we immediately get

s
sin θi 2

k0 = k0 sin θr e x + ik0 − 1ez = k0x e x + ik0zi ez . (9.129)
sin θ0
That is, k0z is now purely imaginary, and it equals ik0zi . Therefore, for
refracted wave,
0 0 0
eik · x = e−kzi z eik x x . (9.130)
This equation shows that, for θi > θ0 , the refracted wave will be
quickly damped/attenuated, typically within a few wavelengths.
You can also verify that there is no energy flow through the surface.
The time-averaged Poynting vector along ez is
hSi · ez ∝ Re ez · ( E00 × H0 0∗ ) ,

(9.131)
where H 00 = n0 × E00 /µ0 c. Average equation (9.131) gives

h i
hSi · ez ∝ Re (ez · n0 )| E00 |2 , (9.132)
But ez · n0 ∝ ez · k0 is purely imaginary, therefore hSi · ez = 0.
9.7 The frequency dependence of permittivity
In the previous section, we assumed that e and µ were independent

√
of frequency, therefore normally n ≈ er is also independent of
frequency. In reality, however, all materials show some kind of
dispersion: the dependence of refractive index on frequency. The
medium is correspondingly called dispersive. In this section, we
develop a simple model to describe the dependence of e, therefore
n, on frequency ω.
The equation of motion of a bounded electron (dielectric) is
given by
e e
ẍ + γ ẋ + ω02 x = − E( x, t) = − E0 e−iωt . (9.133)
m m
Here ω0 is the natural frequency of the electron, and γ (ω0 , ω ).
We know this equation has a solution
e 1
x=− 2
E. (9.134)
m ω0 − ω 2 − iωγ
The homogeneous solution is discarded due to attenuation. The

corresponding dipole moment is
e2 1
p = −ex = 2
E. (9.135)
m ω0 − ω 2 − iωγ
Now suppose that there are N molecules per unit volume and Z
electrons per molecule. Instead of a single natural frequency for all
electrons, there are f j electrons per molecule with natural frequency
ω j and damping coefficient γ j , then polarization vector P is
Ne2 fj
P=
m ∑ ω2 − ω2 − iωγ E = χe e0 E. (9.136)
j j j
Where χe is electric susceptibility. Therefore the relative permittivity

e is
Ne2 fj
er = 1 + χ e = 1 +
me0 ∑ ω2 − ω2 − iωγ . (9.137)
j j j
√
Now that er is complex, correspondingly n = er and k = nω/c
are both complex. If k = kr + ik i , then the wave is damped/attenuated
if k i > 0. In our case, because of energy conservation, we expect

k i > 0 because of a finite γ.
Normally, since the intensity of the wave is proportional to
Re( E)2 , it’s customary to define
α
k = β+i , (9.138)
2
so that the intensity of the wave falls off as e−αx . For gases, the sec-
√
ond term on the RHS of equation (9.137) is small. From 1 + x ≈
1 + x/2, approximately we have
√ Ne2 fj
n≈ er ≈ 1 +
2me0 ∑ ω2 − ω2 − iωγ (9.139)
j j j
Now we separate n into a real part and an imaginary part, n =

nr + ini , or
Ne2 f j (ω 2j − ω 2 )
nr = 1 +
2me0 ∑ (ω2 − ω2 )2 + (ωγ )2 , (9.140)
j j j
Ne2 ω f j γj
ni =
2me0 ∑ (ω2 − ω2 )2 + (ωγ )2 (9.141)
j j j
Here nr is the normal index of refraction, and ni characterize the

damping/growth of the wave. Putting k = nω/c and using Equa-
tion (9.141), we have
Ne2 ω 2 f j γj
α=
me0 c ∑ (ω2 − ω2 )2 + (ωγ )2 (9.142)
j j j
Therefore the index of refraction is given by Equation (9.141) and

the damping coefficient is given by (9.142).
You can see by plotting Equation (9.141) that normally the index
of refraction increases with frequency. However, in the immediate
neighborhood of a resonance, the index of refraction drops sharply
with increasing frequency; this behavior is called anomalous disper-
sion7 . At the same time from (9.142), you see that α becomes very 7
The name “anomalous dispersion”
large near resonances, a large amount of energy is dissipated by the is kind of misleading, because it is a
very common phenomenon in many
damping mechanism. substances. This nomination is due to
If we stay away from resonances, the index of refraction can be historical reasons. People firstly found
“normal dispersion”, the phenomenon
approximately written as that dispersion rate increases as wave
frequency increases. Later people also
Ne2 fj found dispersion rate can also decrease
nr = 1 +
2me0 ∑ ω2 − ω2 . (9.143) as wave frequency increases, so it is
j j named “anomalous dispersion”.
For most substances, natural frequencies ω j are scattered all over

the spectrum. For transparent materials, the nearest significant
resonances typically lie in the ultraviolet, so that ω < ω j . In this
case,
!
1 1 ω2
≈ 2 1+ 2 , (9.144)
ω 2j − ω 2 ωj ωj
therefore,
! !
Ne2 fj Ne2 fj
nr = 1 +
2me0 ∑ ω2 +ω 2
2me0 ∑ ω4 . (9.145)
j j j j
In terms of the wavelength in vacuum (λ = 2πc/ω ):

B
n = 1 + A (1 + ). (9.146)
λ2
This is known as Cauchy’s formula; the constant A is called the
coefficient of refraction, and B is called the coefficient of dispersion.
Cauchy’s equation applies reasonably well to most gases, in the
optical region.
Before we end this section, let’s quickly discuss three limiting
cases for e.
Case 1): The low frequency limit ω → 0. If all ω j 6= 0 (like
insulators), then you’ll recover molecular polarizability if ω = 0.
Now we consider a different case. Suppose that some electrons are
free (conductors), not bound to any molecules, then the natural
frequencies of these electrons are 0. Use j = 0 to denote these
electrons. If we separate the contribution of free electrons,
Ne2 fj Ne2 f0
er ( ω ) = 1 +
me0 ∑ ω2 − ω2 − iωγ + 2
me0 ω0 − ω 2 − iωγ0
j 6 =0 j j
(9.147)
Ne2 f 0
≈ ee + i , (9.148)
me0 ω (γ0 − iω )
where ee are due to all bounded electrons (j 6= 0). Putting this er
into Faraday’s law, we have, for conductors,
∂D
∇× H = = −iωer (ω )e0 E. (9.149)
∂t
On the other hand, for conductors, we can view ee as the normal
dielectric constant due to bounded electrons. For free electrons, the
current density can be expressed using Ohm’s law j f = σE, where
σ is the electric conductivity. Putting this into Ampere’s law, we get
∂D σ
∇× H = j f + = −iω ee e0 + i E. (9.150)
∂t ω
Combining these two equations, we have
Ne2 f 0
σ= . (9.151)
m(γ0 − iω )
This is the model of Drude for the electric conductivity, with f 0 N
being the number of free electrons per unit volume.
Case 2):The other limit is the high frequency limit. If ω ω j for
all j, then
Ne2 fj NZe2 1 ω 2p
er = 1 +
me0 ∑ ω2 − ω2 − iωγ ≈ 1−
me0 ω 2
≡ 1 −
ω2
.
j j j
(9.152)
Here
NZe2
ω 2p = , (9.153)
me0
is the very famous plasma frequency of the medium. The correspond-
ing dispersion relation is, using n2 = er ,
ω 2 = ω 2p + c2 k2 . (9.154)
This is valid only for ω ω p in our cases. In actual plasmas, if
q
ω < ω p , the wave number k = ω 2 − ω 2p /c is imaginary, and the
refracted wave is attenuated. You can then use this to measure the
plasma density. For ω < ω p , the wave is reflected.
9.8 Electromagnetic Waves in Conductors
As we have discussed in the previous section, the dielectric constant

e is normally complex. The imaginary part of e can be neglected
for insulators for many purposes. However, for conductors, the
imaginary part of e becomes important. It is customary to separate
the complex dielectric constant of the conducting medium formally
into two parts: a real dielectric constant e and a real conductivity σ,
as we did in the last section, case 1).
Now with these in mind, we write the Maxwell equations for
conductors, assuming µ and e.
ρf
∇· E = , (9.155)
e
∂B
∇× E = − , (9.156)
∂t
∇· B = 0, (9.157)
∂E
∇× B = µe + µσE. (9.158)
∂t
Here we have used j f = σE.
Now consider the continuity equation and Gauss’ law (Equation
(9.155)), we have
∂ρ f σ
= −∇· j f = −σ (∇· E) = − ρ f . (9.159)
∂t e
This equation can be solved to give
ρ f = ρ f 0 e−t/τ , (9.160)
where τ = e/σ. This equation clearly shows that any free charge
will disappear within a time scale of τ. For a “perfect” conductor,
σ = ∞, and τ = 0. In reality, we can compare τ with the relevant
time scale to measure how good a conductor is. The relevant time
scale for a oscillatory system with frequency ω is 1/ω. For “good”
conductors, τ 1/ω. For “poor” conductors, τ 1/ω. These two
equations can also be written like this:
 σ

 1, f or”good”conductors
eω
(9.161)
 σ 1, f or”poor”conductors

eω
Now let’s consider good conductors, where τ is typically very

small, on the order of 10−14 or smaller. Then we are not interested
in the transient behavior of ρ explained above. We wait for any ac-
cumulated free charge to disappear. Then ρ f = 0, and the Maxwell
equations are
∇· E = 0, (9.162)
∂B
∇× E = − , (9.163)
∂t
∇· B = 0, (9.164)
∂E
∇× B = µe + µσE. (9.165)
∂t
These equations are just like the Maxwell equations we discussed in
the previous section, except the extra term in Ampere’ law.
As before, we can obtain wave equations for E and B. Both E
and B satisfy
∂2 E ∂E
∇2 E = µe + µσ , (9.166)
∂t2 ∂t
∂ 2B ∂B
∇2 B = µe 2 + µσ . (9.167)
∂t ∂t
Assuming E = E0 ei(k· x−ωt) and similarly for B, we have a com-
plex k, which is given by
k2 = µeω 2 + iµσω. (9.168)
Let k = β + iα/2, then we get

r "r #1/2
µe σ 2
β= ω 1+ +1 , (9.169)
2 eω
r "r #1/2
α µe σ 2
= ω 1+ −1 . (9.170)
2 2 eω
For a poor conductor, σ/eω 1, then

r
√ σ µ
k≈ µeω + i . (9.171)
2 e
In this case, Re(k) Im(k). For a good conductor, however,
σ/eω 1, β ≈ α/2, and
p
k = (1 + i) µσω/2 (9.172)
The imaginary part of k, as explained before, means the attenuation

of the wave. The distance it takes to reduce the amplitude by a
factor of 1/e ≈ 1/3 is called the skin depth δ, given by
s
2 2
δ= ≈ . (9.173)
α µσω
This equation is valid only for good conductors. For a conductor

like copper, δ ≈ 0.85 cm for 60 Hz frequency wave, and 0.71 ×
10−3 cm for a 100 MHz wave.
The whole calculation above for k can be obtained simply by

p
using a complex reflective index n = µb e = e+
e/µ0 e0 with b
iσ/ω(see formula (1.158)), and k = nω/c.
We can obtain the relation between B and E using Faraday’s’ law.
For a good conductor,
E ck E 1p
B = n× = × = µσω/2(1 + i)ek × E, (9.174)
c ω c ω
which is
r
µσ iπ/4
B= e ek × E. (9.175)
ω
Therefore, B lags E in phase by π/4 for a good conductor. Note
that the energy density in a linear media is
1 1 1
w= ( E · D + B · H ) = (eE2 + B2 ). (9.176)
2 2 µ
Hence in a good conductor, the ratio of magnetic energy density to

electric energy density is
wB B2 /µ σ
= = 1. (9.177)
wE eE2 eω
Therefore most energy is in the magnetic field.
9.9 Reflection at a conducting surface
We only deal with normal incidence from vacuum to a good con-

ductor here for simplicity. The relevant boundary conditions are
k k
E1 = E2 , (9.178)
k k
H1 − H1 = K f × n. (9.179)
Now for a good conductor with finite σ, j f = σE. This indicates

you cannot get a δ-function like free surface current (σ → 0), unless
you have E infinite 8 . The boundary conditions for good conductors 8
In case of a perfect conductor, cur-
are rents are confined to the surface and
there is true surface current.
k k
E1 = E2 , (9.180)
k k
H1 = H1. (9.181)
Using notations from the previous chapter and sessions, we have
E0 + E000 = E00 , (9.182)

00 0
H−H = H . (9.183)
Assuming µr ≈ µr0 ≈ 1 and using

r
0 σ
H = (1 + i) E00 , (9.184)
2µ0 ω
H = E/µ0 c, (9.185)
H 00 = E00 /µ0 c. (9.186)
we write Equation (9.183) as

r
µ0 σ
E0 − E000 = c (1 + i) E00 . (9.187)
2ω
Solving Equation (9.187) and (9.182) together, we get
E000
p
1 + i − 2ω/µ0 σ/c
=− p . (9.188)
E0 1 + i + 2ω/µ0 σ/c
The corresponding reflection coefficient for normal incidence is
00 2 p 2 s r
E0 1 − 2ω/µ 0 σ/c + 1 2 2ω 2ωe0
R = = 2 ≈ 1− = 1−2 .
E0 p
1 + 2ω/µ σ/c + 1 c µ 0 σ σ
0
(9.189)
For perfect conductors, σ → ∞, and R → 1. This is why excellent

conductors make good mirrors9 . 9
Check the back of your mirrors at
home, see what that is (most likely
some kind of silver).
9.10 Wave guides and resonant cavities
So far, we have only considered plane waves in infinite medium.

Now we consider electromagnetic waves confined to hollow metal-
lic pipe. If the pipe has end surfaces, it is called a cavity; otherwise,
a wave guide. We assume that the boundaries are made of perfect
conductors. We’ll only deal with pipes those have rectangular cross
sections, and we assume that the cross sectional size and shape are
constant. One of the main purposes of this section is to demonstrate
that boundaries can cause the change of wave properties. One main
change is that confined waves, in general, are not transverse.
We assume that the pipe is filled with a uniform non-dissipative
medium with dielectric constant e and permeability µ. Then the
Maxwell equations inside the medium are, assuming e−iωt depen-
dence,
∂B
∇× E = − = iωB, (9.190)
∂t
∇· E = 0, (9.191)
∇· B = 0, (9.192)
∂E
∇× B = µe = −iµeωE. (9.193)
∂t
These equations are the same as the equations in the previous
chapter, and the equations for E and B are
E
∇2 + µeω 2 = 0, (9.194)
B
subject to boundary conditions
Ek = 0, (9.195)
B⊥ = 0. (9.196)
Note that we do not assume the eik· x form for the spatial depen-
dence, as it depends on the type of boundary conditions.
9.10.1 Wave guides

Suppose the direction along pipe (in case of a wave guide) is z.
In this case, waves can propagate along z direction; therefore, we
assume
E( x, y, z, t) = E( x, y)ei(kz−ωt) , (9.197)
B( x, y, z, t) = B( x, y)ei(kz−ωt) . (9.198)
The field equation (9.194) reduces to two dimensions,

2
∂ ∂2
2 2
E
+ 2 + µeω − k = 0. (9.199)
∂x2 ∂y B
You can use separation of variables to solve directly this equation.
But before that let’s write out the components of Maxwell equations
– more specifically, the Faraday’s law and the Ampere’s law. Then
we’ll demonstrate that the Ex and Ey are completely determined if
we specify Ez and Bz .
The Faraday’s law and Ampere’s law give
∂Ey ∂Ex
− = iωBz , (9.200)
∂x ∂y
∂Ez
− ikEy = iωBx , (9.201)
∂y
∂Ez
ikEx − = iωBy , (9.202)
∂x
∂By ∂Bx
− = −iµeωEz , (9.203)
∂x ∂y
∂Bz
− ikBy = −iµeωEx , (9.204)
∂y
∂Bz
ikBx − = −iµeωEy . (9.205)
∂x
Equations (9.201), (9.202), (9.204) and (9.205) can be combined to
give

i ∂Ez ∂Bz
Ex = k +ω , (9.206)
µeω 2 − k2 ∂x ∂y

i ∂Ez ∂Bz
Ey = 2 2
k −ω , (9.207)
µeω − k ∂y ∂x

i ∂Bz ∂Ez
Bx = 2 2
k − µeω , (9.208)
µeω − k ∂x ∂y

i ∂Bz ∂Ez
By = 2 2
k + µeω . (9.209)
µeω − k ∂y ∂x
It’s clear from these equations that once we specify Ez and Bz ,
the x,y components of E and B are completely determined. The
equations for Ez and Bz are simply obtained from Equation (9.199),
2
∂ ∂2
2 2
E
z
+ 2 + µeω − k = 0. (9.210)
∂x2 ∂y Bz
Depending on Ez and Bz , there are three types of waves in a
wave guide. If Ez = 0, we call these TE (“transverse electric”) waves;
if Bz = 0, we call them TM (“transverse magnetic”) waves. If both

Ez = 0 and Bz = 0, we call them TEM waves.
Note that TEM waves cannot occur in a hollow wave guide. If
Ez = 0 and Bz = 0, you cannot use Equations (9.206)-(9.209); they
are indeterminate. So we go back to Maxwell equations. From
Gauss’s law,
∂Ex ∂Ey
+ = 0. (9.211)
∂x ∂y
If Bz = 0, Faraday’s law gives
∂Ey ∂Ex
− = 0. (9.212)
∂x ∂y
The vector E = Ex e x + Ey ey has zero divergence and zero curl. There-
fore it can be written as a scalar potential that satisfies Laplace’s
equation. But the boundary condition on E requires Ek = 0, or the
boundary surface should be an equipotential. Because Laplace’s
equation doesn’t allow local extreme values, extreme values only
occur on boundaries. Therefore, the scalar potential must be a con-
stant inside the domain, and the electric field E = 0. Note that this
argument only applies to a completely hollow pipe. If we have a separate
conductor down the middle (e.g., a coaxial transmission line), the poten-
tial at its surface need not be the same as on the outer wall, and hence a
nontrivial solution of potential is possible, allowing E 6= 0.
9.10.2 TE waves in a rectangular wave guide

Suppose we have a rectangular wave guide with height a and
width b, and we want to know the propagation of TE waves. From
discussions in previous section, we only need to solve
2
∂ ∂2
2 2

+ + µeω − k Bz = 0. (9.213)
∂x2 ∂y2
Because B⊥ = 0, therefore from Equations (9.208) and (9.209), we
have boundary conditions
∂Bz
= 0, at x = 0, a, (9.214)
∂x
∂Bz
= 0, at y = 0, b. (9.215)
∂y
The equation (9.213) subject to boundary conditions (9.214) forms a
typical eigenvalue problem.
Using separation of variables, Bz ( x, y) = X ( x )Y (y), then
1 d2 X 1 d2 Y
2
+ + µeω 2 − k2 = 0. (9.216)
X dx Y dy2
Therefore, we have
1 d2 X
= −k2x , (9.217)
X dx2
1 d2 Y
= −k2y , (9.218)
Y dy2
k2x + k2y + k2 = µeω 2 . (9.219)
The general solutions of X and Y are
X ( x ) = A sin(k x x ) + B cos(k x x ), (9.220)

Y (y) = C sin(k y y) + D cos(k y y). (9.221)
Consider boundary conditions
X ( x ) = B cos(k x x ), with k x = mπ/a, (9.222)

Y (y) = D cos(k y y), with k y = nπ/b. (9.223)
Here m = 0, 1, 2, 3, · · · , and n = 0, 1, 2, 3, · · · . Therefore,
Bz = B0 cos(mπx/a) cos(nπy/b). (9.224)
This solution is called the TEmn mode. Conventionally m ≥ n. Note

that m and n cannot be both 0. The wave number of TEmn mode is
q
k = µeω 2 − π 2 [(m/a)2 + (n/b)2 ]. (9.225)
If
1
q
ω< √ π (m/a)2 + (n/b)2 ≡ ωmn , (9.226)
µe
the wave number becomes imaginary, and the wave cannot propa-
gate far. Therefore ωmn is called the cutoff frequency for TEmn mode.
The lowest cutoff frequency for a given wave guide is clearly
π
ω10 = √ . (9.227)
µea
Frequencies below this will not propagate at all in the given wave
guide.
The wave number k can be written using the cutoff frequency as
√
q
k = µe ω 2 − ωmn 2 . (9.228)
The wave phase velocity is
ω 1
v ph = = √ p . (9.229)
k µe 1 − (ωmn /ω )2
The group velocity is
dω 1 1
q
vg = = = √ 1 − (ωmn /ω )2 . (9.230)
dk dk/dω µe
If µ = e = 1, then v ph > c and v g < c.

10
Fields of Moving Charges
Starting from this chapter, we discuss how moving charges generate

fields.
10.1 Potentials of general sources
To consider fields generated by general sources, we start from the

four-dimensional Maxwell equations and find the potentials,
∂F αβ
= µ0 j α . (10.1)
∂x β
In terms of Aα = (φ/c, A), these equations are
1 ∂2 A
∇2 A − = −µ0 j, (10.2)
c2 ∂t2
1 ∂2 φ ρ
∇2 φ − 2 2 = − . (10.3)
c ∂t e0
Operator
1 ∂2
≡ ∇2 − (10.4)
c2 ∂t2
is the d’Alembert operator.
If the fields are static, then
∇2 A = −µ0 j, (10.5)
ρ
∇2 φ = − . (10.6)
e0
These are static field equations. On the other hand, if ρ = 0 and

j = 0, then
1 ∂2 A
∇2 A − = 0, (10.7)
c2 ∂t2
1 ∂2 φ
∇2 φ − 2 2 = 0. (10.8)
c ∂t
These are homogeneous wave equations, or wave equations in
vacuum. We have dealt with both cases in previous chapters.
The general solution of an inhomogeneous equation is the sum
of a special solution (or called a particular solution) and a general
solution of the corresponding homogeneous equation. Let’s take

the φ equation for example. The solution is given by
φ = φ1 + φ0 , (10.9)
where φ1 is a special solution, and φ0 is the general solution of the

homogeneous equation (10.8).
To find the special/particular solution, we use the principle of
superposition. We start from considering the potential produced
by a point charge dq(t) at x0 . The charge density of this charge is
ρ = dq(t)δ( R), with R = x − x0 . The corresponding φ-equation (10.3)
can be written as
1 ∂2 φ dq(t)
∇2 φ − 2 2
=− δ ( R ). (10.10)
c ∂t e0
Everywhere with R 6= 0, Equation (10.10) is
1 ∂2 φ
∇2 φ − = 0. (10.11)
c2 ∂t2
The method here is to first solve Equation (10.11) for φ at R 6= 0,
then we match the this solution to φ near R = 0.
Clearly for a point charge dq(t), φ has central symmetry, i.e.,
φ = φ( R). Using spherical coordinates and considering φ = φ( R),
we write Equation (10.11) as
1 ∂2 φ

1 ∂ 2 ∂φ
R − = 0. (10.12)
R2 ∂R ∂R c2 ∂t2
Letting φ( R, t) = χ( R, t)/R, Equation (10.12) becomes
∂2 χ 1 ∂2 χ
− = 0. (10.13)
∂R2 c2 ∂t2
This is the equation of plane waves, whose general solution is
χ( R, t) = f (t − R/c) + g(t + R/c). (10.14)
Since we only want a particular solution, we can choose either f or

g for χ. Note that f represents waves propagating in + R direction,
and g represents waves propagating in − R direction. Most of the
time, we’re interested in fields emitted/radiated, so we choose
χ = f (t − R/c), and
χ(t − R/c)
φ= . (10.15)
R
The form of χ is still not known; we determine χ by match φ near
R = 0.
Near the origin, there are various ways to determine χ. We’ll
introduce Landau’s method. As R → 0, φ → ∞, therefore spatial
variation dominates the temporal variation. Neglecting the time
derivative, the φ equation near R = 0 is
dq(t)
∇2 φ = − δ ( R ), near R = 0. (10.16)
e0
fields of moving charges 123
Note that this is the electrostatic potential equation for a point

charge dq. From Coulomb’s law, we immediately have
dq(t)
φ= near R = 0. (10.17)
4πe0 R
Matching φ in Equation (10.15) to this φ near R = 0 yields
dq(t − R/c)
φ= for all R. (10.18)
4πe0 R
For an arbitrary distribution of charges, the φ-equation solution is
just

1 R
Z
0
φ( x, t) = ρ x ,t− dV 0 + φ0 . (10.19)
4πe0 R c
Here x is the field point vector, and dV 0 = dx 0 dy0 dz0 . Similarly, the
solution of A is

R
Z
µ0
A( x, t) = j x0 , t − dV 0 + A0 . (10.20)
4πR c
Let’s now discuss the physical implications of these solutions.
The particular solutions of φ and A we have found,

1 R
Z
φ1 ( x, t) = ρ x0 , t − dV 0 , (10.21)
4πe0 R c

R
Z
µ0
A1 ( x, t) = j x0 , t − dV 0 , (10.22)
4πR c
are called “retarded potentials” because of the factor t − R/c. From
now on, I will only discuss φ for simplicity.
Define t0 ≡ t − R/c, φ(t) is determined by ρ(t0 ); here t0 is called
the “retarded time”. In words, φ at current time t is determined by
ρ at a previous time t0 1 . This is because it takes electromagnetic 1
An example of the retarded potentials
signals a finite amount of time to propagate from x0 to x; this time might help you better understand the
concept: the sun light you see “now”
is R/c. We’ve learned this from the special theory of relativity: the is produced by moving charges at
maximum propagation speed of a signal is c. Now consider two the sun at about 8 minutes ago; since
it takes light roughly 8 minutes to
points in the source x10 and x20 , whose corresponding distances to x propagate from the sun to Earth.
are R1 and R2 . The potential φ( x, t) is determined by ρ( x10 , t − R1 /c)
and ρ( x20 , t − R2 /c): different points at different “previous” times.
The general solutions of homogeneous equations φ0 and A0 are
to be determined by boundary/initial conditions. An important use
of φ0 and A0 is in this problem: First, an external field is incident
on the system from outside. Then the system reacts and emits new
radiation. In this case, φ0 and A0 are used to match the external
field. And the retarded potentials represent the new radiation.
From now on, we’ll consider only retarded potentials.
In static case, ρ and j are independent of time, and the retarded
potentials become
1
Z
ρ x0 dV 0 ,

φ( x, t) = (10.23)
4πe0 R
Z
µ0
j x0 dV 0 .

A( x, t) = (10.24)
4πR
These are just the Coulomb’s law and the Biot-Savart law.
Compare these equations with the retarded potentials, all we did
was putting in factor t − R/c on ρ and j in place of t. These “tricks”,
however, do not apply in general to E and B fields.
To derive the general form of E and B from retarded potentials,
recall that
∂A
E = −∇φ − , (10.25)
∂t
B = ∇× A. (10.26)
The spatial derivative is with respect to x, the field point. Retarded

potentials have the form,

1 R
Z
0
φ( x, t) = ρ x ,t− dV 0 , (10.27)
4πe0 R c

R
Z
µ0
A( x, t) = j x0 , t − dV 0 . (10.28)
4πR c
Note that φ and A depend on x in two places:
1. explicitly through R = | x − x0 |.
2. implicitly through t0 = t − R/c.
Left as an exercise, you can prove that

ρ(r0 , t0 ) ∂ρ(r0 , t0 )
Z
1 1
E(r, t) = + (r − r0 )
4πe0 | r − r 0 |3 | r − r 0 |2 c ∂t
∂j(r0 , t0 ) 3

1
− d r (10.29)
| r − r 0 | c2 ∂t
j(r0 , t0 ) ∂j(r0 , t0 )
Z
µ0 1
B(r, t) = + × (r − r0 )d3 r (10.30)
4π | r − r 0 |3 | r − r 0 |2 c ∂t
These are called Jefimenko’s equations. But as it’s typically easier to
calculate fields from retarded potentials than using above equations,
they are of limited utility.
10.2 The Lienard-Wiechert potentials
We consider the potential of an arbitrarily moving point charge

q. Suppose the trajectory is given by x0 (t) and velocity v(t). From
retarded potentials, φ( x, t) and A( x, t) are determined by x0 (t0 ) and
v ( t 0 ).
One way to obtain potentials of an arbitrarily moving charge is
to use results from the previous section and consider a point charge
as the source. We, however, choose to use Lorentz transformation
to find A and φ just as we did for a uniformly moving charge. The
main difference is that K 0 now moves with v(t0 ) and this velocity
itself is not constant. Therefore, K 0 is an instantaneous inertial
reference frame.
Let’s discuss a little more about the retarded time t0 . For a point
charge, the retarded time t0 is determined by
t0 = t − R/c or | x − x0 (t0 )| = c(t0 − t). (10.31)

We can find out t0 by solving this equation, but how do you know
there is only one root? Let’s now prove that.
Suppose there are two roots t10 and t20 , then
R1 = c(t − t10 ) and R2 = c(t − t20 ), (10.32)
from which R1 − R2 = c(t20 − t10 ). Consider x, x0 (t10 ), x0 (t20 ) as the

three sides of a triangle, clearly
R1 − R2 = | x − x0 (t10 )| − | x − x0 (t20 )| ≤ | x0 (t20 ) − x0 (t10 )|, (10.33)
hence | x0 (t20 ) − x0 (t10 )| ≥ c(t20 − t10 ), or
| x0 (t20 ) − x0 (t10 )|
≥ c. (10.34)
t20 − t10
This is impossible from Einstein’s principle of relativity: no charged

particles can travel at or faster than c. Therefore for a given field
point x and t, there is only one t0 .
The potentials at x and t are determined by x0 (t0 ) and v(t0 ). We
consider an instantaneous inertial reference frame K e moving with
0
v(t ) relative to K, the laboratory frame.
First, we calculate potentials in K,
e where the particle is at rest.
Therefore φ e and A are
e
q
e=
φ , and A
e = 0. (10.35)
4πe0 R
e
e = c(et − te0 ) is the distance between x and x0 in K.

Here R e
Now we can transform from K e to K to find φ and A. Since K
e
0
moves with v(t ) relative to K,
φ
e 1 q
φ= p = p , (10.36)
1 − v(t0 )2 /c2 1 − v(t0 )2 /c2 4πe0 R
e
(v(t0 )/c2 )φe v(t0 )
A= p = 2 φ. (10.37)
1 − v(t0 )2 /c2 c
We now have the same problem we once faced in Section 4.7.3: φ

and A are expressed using Re instead of R, the spatial coordinates in
K. Hence we need to express Re in terms of R.
Using the Lorentz transform to express et in terms of t and x
gives
v(t0 )

c
e = c(et − te0 ) = p
R t − t0 −
· ( x − x 0
) , (10.38)
1 − v(t0 )2 /c2 c2
1 h v i
= √ c(t − t0 ) − · ( x − x0 ) , (10.39)
1 − v2 /c2 c
= γ ( R − β · R ). (10.40)
Important: Note here all variables on the right hand side (RHS)
(v, β, R, γ) are evaluated at t0 ; i.e., γ = 1/ 1 − v(t0 )2 /c2 , β =
p
v(t0 )/c, R = x − x0 (t0 ), and R = c(t − t0 ).

Substituting Equation (10.40) into Equations (10.36) and (10.37)

results in the retarded potentials for a point charge
q 1 q
φ( x, t) = γ = , (10.41)
4πe0 ( R − β · R) γ 4πe0 ( R − β · R)
qβ
A( x, t) = . (10.42)
4πe0 ( R − β · R)c
Again the RHS variables are all evaluated at time t0 ; i.e.,
β = v/c = v(t0 )/c, and R = x − x0 (t0 ). (10.43)
The retarded potentials for a moving point charge, Equations (10.41)

and (10.42), are known as the Liénard-Wiechert potentials.
From the potentials, the E and B fields are given by
∂A
E = −∇φ − , (10.44)
∂t
B = ∇× A. (10.45)
where ∇ = ∂/∂x. These differentiations seem to be simple, but

they are not because of explicit or implicit dependences on t(t0 ) and
x( x0 ). Obtaining E and B from φ and A is a tedious process; we’ll
first analyze how φ and A depend on t and x.
Potentials φ and A are functions of R, R and v, and
R = x − x0 (t0 ) (10.46)
0
v = v(t ) (10.47)
Here the retarded time t0 is determined by
| x − x0 (t0 )| = c(t − t0 ) = R, (10.48)
therefore t0 = t0 ( x, t). From above analysis, φ and A
1. depend on t through t0 (Equation (10.48)).
2. depend on x explicitly through R and implicitly through t0

(Equation (10.46)).
Therefore we may write φ = φ( x, t0 ) and A = A( x, t0 ). To find E and

B from φ and A, we must evaluate ∂/∂t and ∇ ≡ ∂/∂x.
Let’s first calculate ∂/∂t, since the dependence on t is only
through t0 . Therefore
∂t0 ∂

∂
= . (10.49)
∂t x ∂t ∂t0
Since variables like v, R are all functions evaluated at t0 , we do

not need to do anything about ∂/∂t0 . However, we still need to
calculate ∂t0 /∂t. Using R = c(t − t0 ),
∂t0 ∂R ∂t0

∂R
= c 1− = 0 . (10.50)
∂t ∂t ∂t ∂t
The purpose is to solve for ∂t0 /∂t from this equation and we still
need ∂R/∂t0 . Note R = | x − x0 |, or R2 = R · R = x2 + x0 (t0 )2 − 2x ·
x 0 ( t 0 ),
∂R2 ∂x0 ∂x0

0
= 2x0 · 0 − 2x · 0 = 2[ x0 (t0 ) − x] · v(t0 ) (10.51)
∂t ∂t ∂t
= −2R · v. (10.52)
Therefore
∂R R·v
0
=− . (10.53)
∂t R
Substituting Equation (10.53) into Equation (10.50) gives
∂t0 R · v ∂t0

c 1− =− , (10.54)
∂t R ∂t
from which we find
∂t0 c 1 R
= = ≡ , (10.55)
∂t c − R · v/R 1 − R · v/Rc S( x, t0 )
where
S( x, t0 ) = R − β · R (10.56)
Finally, substituting Equation (10.55) into Equation (10.49) gives

∂ 1 ∂ R ∂
= 0
= . (10.57)
∂t 1 − R · v/Rc ∂t S( x, t ) ∂t0
0
Potentials depend on x explicitly through R = x − x0 and

implicitly through t0 , hence
∂f
∇ f ( x, t0 ) = ∇ f ( x, t0 )t0 + 0 ∇t0 .

(10.58)
∂t
To calculate ∇, we need ∇t0 , which is the partial derivative of t0
with respect to x while fixing t. We again start from R = c(t − t0 ),
1
∇ R = −c∇t0 ⇒ ∇t0 = − ∇ R. (10.59)
c
But note that R = R( x, t0 ),
∂R
∇ R( x, t0 ) = ∇ R( x, t0 )t0 + 0 ∇t0 .

(10.60)
∂t
Therefore

1 ∂R 0
∇t0 = − ∇ R |t0 + ∇ t . (10.61)
c ∂t0
Fortunately, we’ve just shown that
∂R R·v
=− . (10.62)
∂t0 R
Now we only need to calculate ∇ R|t0 . Again using R2 = R · R and
R = x − x 0 ( t 0 ),
2R ∇ R|t0 = 2 R · ∇ R|t0 = 2R · ∇ x = 2R · I = 2R, (10.63)

or
∇ R|t0 = R/R. (10.64)
Substituting Equation (10.64) for ∇ R|t0 and Equation (10.62) for

∂R/∂t0 into Equation (10.61) gives
R·v 0

1 R
∇t0 = − − ∇t , (10.65)
c R R
from which we can find
1R 1 R
∇t0 = − =− (10.66)
c R 1 − R · v/Rc cS
With these preparations, we’re now ready to calculate fields.

Using S, φ and A are
q v
φ= and A = 2 φ. (10.67)
4πe0 S c
Therefore
∂A q 1 ∂
E = −∇φ − = ∇S − 2 (vφ) (10.68)
∂t 4πe0 S2 c ∂t

q 1 ∂v ∂φ
= ∇ S − φ + v (10.69)
4πe0 S2 c2 ∂t ∂t
∂t0

q 1 q ∂S
= ∇ S − a φ + v − , (10.70)
4πe0 S2 c2 ∂t 4πe0 S2 ∂t
where a = dv/dt0 is the acceleration of the particle at t0 .

The structure of E tells us we need ∇S and ∂S/∂t. First, let’s
calculate ∇S.
∇S = ∇( R − β · R) = ∇ R − ∇(v · R)/c, (10.71)

0 R R
∇ R = −c∇t = −c − = , (10.72)
cS S
∇(v · R) = ∇v · R + ∇ R · v, (10.73)
dv R
∇v = ∇t0 0 = − a, (10.74)
dt cS
R
∇ R = ∇[ x − x0 (t0 )] = I − ∇t0 v = I + v, (10.75)
cS
R R
∇(v · R) = − ( a · R ) + v + v2 . (10.76)
cS cS
In summary,
R R R
∇S = + 2 ( a · R ) − β − β2 (10.77)
S c S S
a·R

R 2
= 1−β + 2 −β (10.78)
S c
Now let’s calculate ∂S/∂t,

∂S ∂R 1 ∂v · R
= − (10.79)
∂t ∂t c ∂t
∂R ∂t0

1 ∂v ∂R
= 0 − ·R+ ·v (10.80)
∂t ∂t c ∂t ∂t
R · v R 1 R ∂v(t0 ) ∂x0 (t0 )

=− − · R − · v (10.81)
R S c S ∂t0 ∂t
R·v 1 R R ∂x0 (t0 )

=− − a·R− · v (10.82)
S c S S ∂t0
R·v 1 R

R
=− − a·R− v·v . (10.83)
S c S S
Or in summary,
∂S R·v R a · R v2 R
=− − + . (10.84)
∂t S S c c S
Using ∇S and ∂S/∂t, we can easily find
a·R

q q R 2
∇φ = − ∇ S = − 1 − β + − β ,
4πe0 S2 4πe0 S2 S c2
R·v R a · R v2 R

∂A 1 qR qv
= 2 a − − − + .
∂t c 4πe0 S2 4πe0 S2 S S c c S
Combining these two,
a·R

q R 2
E= 1−β + 2 −β
4πe0 S2 S c
R·v R a · R v2 R

v R
+ 2 − − + − 2a (10.85)
c S S c c S c
We divide terms inside {} in Equation (10.85) into two parts. Part

1 are those not related to acceleration a,
R · v v2 R

R 2
v

1 = 1−β −β+ 2 − + (10.86)
S c S c S
1h i
= R 1 − β2 − βS − β( R · β) + β2 Rβ (10.87)
S
1 h i
= R 1 − β2 − β( R − β · R) − β( R · β) + β2 Rβ (10.88)
S
1 h i
= R 1 − β2 − βR(1 − β2 ) (10.89)
S
1
= (1 − β2 )( R − Rβ). (10.90)
S
Part
2 are those terms related to acceleration a,
Ra·R v R a·R R

2 = − 2 − 2a (10.91)
S c2 c S c c
1
= [ R( a · R) − ( a · R) Rβ − RSa] (10.92)
Sc2
1 n h
2
i o
= ( a · R )( R − Rβ ) − R − ( β · R ) R a (10.93)
Sc2
1
= 2
( a · R)( R − Rβ) − [ R · ( R − Rβ)] a (10.94)
Sc
1
= R × ( R − Rβ) × β̇ . (10.95)
Sc
Combining parts 1 and 2 leads to

q 2 1
E= ( 1 − β )( R − Rβ ) + R × ( R − Rβ ) × β̇ (10.96)
4πe0 S3 c

q R − Rβ q R × ( R − Rβ) × β̇
= + . (10.97)
4πe0 γ2 S3 4πe0 c S3
The first part (called part )
1 on the RHS does not depend on a,
and it is called the “velocity field”. The part (called part )
2 depen-
dent on a is called the “radiation field.”.
Note that S ∼ R, so 1 ∼ R−2 and 2 ∼ R−1 . Therefore the total
energy flow through a sphere with radius R per unit time is
P
1
∼ E 2 R 2 ∼ R −2 , (10.98)
hence as R → ∞, P 1
→ 0. The velocity field exists even if a = 0, it
equals the Coulomb field if v = 0 (sometimes called the generalized
Coulomb field), or the Lorentz transformation of the Coulomb field
for a partial moving uniformly with the velocity v. The velocity
field moves with the charge and does not transfer energy to infinity.
For the second part, note that 2 ∼ R−1 , so at large distances
E ≈ ,
2 (10.99)
i.e., the second part of E dominates at large distances. Also consider
the energy flow through a sphere with radius R,
P
2
∼ E2 R2 ∼ const, (10.100)
i.e., it can radiate energy to large distances. This part is responsible
for the electromagnetic radiation. We will focus on radiation in the
next chapter.
The magnetic field B is B = ∇× A,
B = ∇× A = ∇×(φβ/c) = φ∇× β/c − β × ∇φ/c
= φ∇t0 × β̇/c − β × ∇φ/c
β̇ · R

q R q R 2
= − × β̇ − β × − 1 − β + − β
4πe0 cS cS 4πe0 cS2 S c
q q R β̇ · R q
= β × R (1 − β2 ) + β× − R × β̇
4πe0 cS3 4πe0 cS2 S c 4πe0 c2 S2

q Rβ q q
= eR × − − Rβ ( β̇ · R ) − β̇RS
4πe0 cS3 γ2 4πe0 c2 S3 4πe0 c2 S3

q Rβ q
= eR × − − R β ( β̇ · R ) − β̇ ( β · R ) + β̇R
4πe0 cS3 γ2 4πe0 c2 S3

q Rβ q
= eR × − + − Rβ ( β̇ · R ) − β̇ ( R − Rβ ) · R
The terms in {} look similar to E, except a few terms are missing.
However, noticing that the missed terms are all proportional to R,
we can add in those terms to make them E; i.e.,
β̇ · R

q Rβ q
B = eR × − − Rβ
4πe0 cS3 γ2 4πe0 cS3 c

q
− β̇( R2 − Rβ · R)
4πe0 c2 S3
Since e R × R = 0, we can insert two terms: R/γ2 and R( β̇ · R)

q 1 q
= eR × ( R − Rβ ) + R ( β̇ · R ) − Rβ β̇ · R − β̇ ( R − Rβ ) · R

q 1 q
= eR × ( R − Rβ ) + R × ( R − Rβ ) × β̇ .
Or in summary,
B = e R × E/c, where e R ≡ R/R. (10.101)

11
Radiation of Electromagnetic Waves
The word “radiation” here means the transportation of energy in

the form of electromagnetic waves out to infinity. Therefore in this
chapter we are mainly focused on calculating the power and energy
radiated by moving charges.
11.1 Radiation of a Point Charge
We start from the simple case about the radiation of a point charge
and calculating the corresponding power radiated. We will calcu-
late this using the radiation electric and magnetic fields. From the
previous chapter, the E and B fields are

q R − Rβ q R × ( R − Rβ) × β̇
E= + , (11.1)
4πe0 γ2 S3 4πe0 c S3
B = e R × E/c. (11.2)
We have already learned that the first term on the RHS of Equation
(11.1) is the “velocity” field, and it varies as R−2 , therefore it does
not radiate. The second term varies as R−1 , and therefore can
transport energy to infinity. Therefore, we now only consider the
radiation part of the field, using S = R − β · R,

q R × ( R − Rβ) × β̇
E= (11.3)
4πe0 c ( R − β · R )3

q e R × (e R − β) × β̇
= , (11.4)
4πe0 c (1 − β · e R )3 R
B = e R × E/c. (11.5)
The Poynting flux is, noting that E · e R = 0 1 , 1
Please note: S = R − β · R and S is
the Poynting flux.
1 1 2
S(r, t) = E×B = E eR , (11.6)
µ0 µ0 c
2 2
e R × (e R − β) × β̇

1 q
= eR (11.7)
µ0 c 4πe0 c (1 − β · e R )6 R2
The Poynting vector means the amount of energy passing through
unit area in the direction S in unit time. Therefore, if we want to
calculate the energy radiated from t1 (t10 ) to t2 (t20 ) is
Z t2
∆E = S · e R dt. (11.8)
t1
radiation of electromagnetic waves 133
Note here that terms on the RHS of Equation (11.6) are explicit
functions of t0 instead of t, therefore it is convenient to make a
transformation from t to t0 , and write ∆E as
Z t2
∂t 0
∆E = S · eR dt , (11.9)
t1 ∂t0
here the meaning of the integrand
∂t 0
dE(t0 ) = S · e R dt , (11.10)
∂t0
is the amount of energy radiated by the particle in dt0 through unit
area in direction of e R . The total energy radiated by the particle in
dt0 is then,
∂t 2
∆E = S · e R R dΩ∆t0 , (11.11)
∂t0
here dΩ = sin θdθdφ is the solid angle. The radiation power is
dE ∂t
dP ≡ 0
= S · e R R2 0 dΩ, (11.12)
dt ∂t
and the power radiated by the particle per solid angle dΩ is
dP ∂t
= S · e R R2 0 . (11.13)
dΩ ∂t
Now using
∂t
= 1 − e R · β, (11.14)
∂t0
dP
= S · e R R2 (1 − e R · β ). (11.15)
dΩ
Substituting Equation (11.7) into Equation (11.15) leads to
2
dP q2 µ0 c e R × (e R − β) × β̇
= . (11.16)
dΩ 16π 2 (1 − β · e R )5
This is the general equation we’ll use to discuss various special

cases.
11.1.1 The radiation of a slow particle

First, let’s see the radiation from a non-relativistic particle, β 1,
then 1 − β · e R ≈ 1, and Equation (11.16) becomes
dP q2 µ0 c 2
= 2
e R × (e R × β̇) (11.17)
dΩ 16π
q2 µ0 c 2
= (e R · β̇)e R − β̇ (11.18)
16π 2
q2 µ0 c 2
= β̇ (11.19)
16π 2 ⊥
q2 µ0 2
= v̇ sin2 Θ, (11.20)
16π 2 c
here Θ ≡ hv̇, e R i = h a, e R i. If we further note that the dipole

moment 2 for a single point charge is p = ∑ qr 0 = qr 0 , then ṗ = qv 2
Note that in this chapter, the scalar P
is power, p is momentum and p is the
and p̈ = qv̇ = qa. Therefore Equation (11.20) can be written using p
dipole moment.
as
dP µ | p̈|2
= 0 2 sin2 Θ. (11.21)
dΩ 16π c
The total power radiated by the slow charge is
µ0 | p̈|2 µ0 | p̈|2 µ q2 a2
Z Z 2π Z π
P= dP = dφ dΘ 2
sin3 Θ = = 0 .
0 0 16π c 6πc 6πc
(11.22)
This is the famous Larmor formula. Note from the formula that the
power radiated by a point charge is proportional to the square of its
acceleration.
11.1.2 The radiation of a fast particle
For a fast particle where β ∼ 1, we cannot ignore β or β in Equation

(11.16) and things in general are complicated. Here we first discuss
two special cases with βk β̇ and β ⊥ β̇, and leave the general case as
your exercise.
Let’s first calculate the radiation power for βk β̇. If βk β̇, then
Equation (11.16) becomes
2
dP q2 µ0 c e R × e R × β̇
= . (11.23)
dΩ 16π 2 (1 − β · e R )5
Define θ = h β, e R i. For βk β̇, h β̇, e R i = θ.
dP q2 µ0 c β̇2 sin2 θ
= , (11.24)
dΩ 16π 2 (1 − β cos θ )5
The angular distribution of radiation is interesting. If θ = 0, then

dP/dΩ = 0. However, for ultra-relativistic particles, β → 1, then
1 − β cos θ ∼ 1 − cos θ → 0, as θ → 0. (11.25)
Therefore dP/dΩ becomes very large near θ ∼ 0 but not at θ = 0.

Note that the angular distribution of the power radiated is inde-
pendent of the sign of a (whether the particle is accelerating or
de-accelerating). When a high speed electron is de-accelerating, for
example, when it hits a target, it radiates 3 . This kind of radiation is 3
One example is the X-ray radiation
called bremsstrahlung, braking radiation, 轫致辐射，刹车辐射或制动 caused by relativistic electrons from
space hitting atmosphere. People
辐射. actually can use the radiated X-ray
to deduce the energy spectrum of
the precipitated electrons. Google
“NASA BARREL mission” for more
information.
The total power radiated for βk β̇ is
q2 µ0 c β̇2 sin2 θ
Z
Pk = dΩ (11.26)
16π 2 (1 − β cos θ )5
q2 µ0 c β̇2 sin2 θ
Z 2π Z π
= dφ dθ (11.27)
16π 2 0 0 (1 − β cos θ )5
q2 µ0 c β̇2 4
= 2
2π (11.28)
16π 3(1 − β2 )3
q2 µ0 c β̇2
= (11.29)
6π (1 − β2 )3
q2 µ0 6 2 q2 µ0 6 2
= γ v̇ = γ a . (11.30)
6πc 6πc
Sometimes it’s convenient to express the radiation power in terms
of the force felt by the particle. For vkv̇, we have from the time rate
change of momentum p,
dp d
F= = (γmv) = γ̇mv + γmv̇. (11.31)
dt dt
Note that β2 = (γ2 − 1)/γ2 , and for vkv̇,
γ̇mv = γ3 β2 mv̇ = (γ3 − γ)mv̇. (11.32)
Substituting Equation (11.32) into Equation (11.31) yields

dp
F= = γ3 mv̇. (11.33)
dt
Therefore in terms of force F, the total radiation power for βk β̇ is
2
q2 µ0

dp
Pk = . (11.34)
6πcm2 dt
Therefore the radiation power is independent of γ for a given force.

The second special case we are going to consider is β ⊥ β̇; this
case is slightly more complicated than the previous case. We first
establish a coordinate system where v = vez , and v̇ = v̇e x , then
β̇ · e R = β̇ sin θ cos φ, (11.35)

β · e R = β cos θ. (11.36)
Here ( R, θ, φ) forms a spherical coordinate system. Using these two

equations,
e R × [(e R − β) × β̇]
= (e R · β̇)(e R − β) − [(e R − β) · e R ] β̇ (11.37)
= β̇ sin θ cos φ(e R − β) − (1 − β cos θ ) β̇. (11.38)
Therefore,
|e R × [(e R − β) × β̇]|2
= ( β̇ sin θ cos φ)2 (e R − β) · (e R − β) + (1 − β cos θ )2 β̇ · β̇
− 2 β̇ sin θ cos φ(1 − β cos θ )(e R − β) · β̇ (11.39)
= β̇2 (1 − β cos θ )2 + (sin θ cos φ)2 ( β2 − 1) .

(11.40)
Using Equation (11.40) in Equation (11.16) gives
dP q2 µ0 c β̇2 [(1 − β cos θ )2 + (sin θ cos φ)2 ( β2 − 1)]

= . (11.41)
dΩ 16π 2 (1 − β cos θ )5
Integrating (11.41) gives the total radiation power
q2 µ0 c β̇2 4 q2 µ0 2 4
P⊥ = 2π = v̇ γ . (11.42)
16π 2 3(1 − β2 )2 6πc
Since the force dp/dt for v ⊥ v̇ is, noting that γ is constant,
2
dp dp
= γmv̇ ⇒ = γ2 m2 v̇2 , (11.43)
dt dt
the radiation power in terms of the force felt by the particle is
2
q2 µ0 γ2

dp
P⊥ = . (11.44)
6πc m2 dt
Therefore for a given force, the radiation increases as γ2 or energy

squared.
The above results for Pk and P⊥ are important for the design
of accelerators. There are at least two types of accelerators: linear
accelerators (v k v̇) and cyclic particle accelerators (v ⊥ v̇). You
can see that for a given force, P⊥ = Pk γ2 . Therefore as energy
goes up, the radiation power loss is much much worse in cyclic
particle accelerators. So in principle, it is more effective using linear
accelerators to accelerate particles to really really high energy. On
the other hand, the most important application of “v ⊥ v̇” case
is circular motion, and the radiation is called synchrotron radiation.
Instead of accelerating particles, people get radiation (or called
“light”) from this kind of accelerators.
Now we are ready to calculate the power radiated by a charge in
arbitrary motion. Noting that E and B fields satisfy linear superpo-
sition, therefore E = E( β̇k ) + E( β̇⊥ ), and k and ⊥ are with respect to
v. The electric field is
E ∝ e R × [(e R − β) × β̇] (11.45)

= e R × [(e R − β) × ( β̇k + β̇⊥ )] (11.46)
= e R × [(e R − β) × β̇k + (e R − β) × β̇⊥ )]. (11.47)
Hence if we write E = Ek + E⊥ , then
E2 = Ek2 + E⊥
2
+ 2Ek · E⊥ . (11.48)
Therefore Equation (11.16) becomes

dP dP dP dP
= + + . (11.49)
dΩ dΩ k dΩ ⊥ dΩ cross
From Equation (11.47),

dP
∝ 2{e R × [(e R − β) × β̇k ]} · {e R × [(e R − β) × β̇⊥ ]}.
dΩ cross
(11.50)
The total power is given by

Z
" #
dP dP dP
P= + + dΩ (11.51)
dΩ k dΩ ⊥ dΩ cross
Z
dP
= Pk + P⊥ + dΩ. (11.52)
dΩ cross
Left as an exercise, you can prove that

Z
dP
dΩ = 0. (11.53)
dΩ cross
Therefore using Equations (11.30) and (11.42),
q2 µ0 6 2 q2 µ0 4 2 q2 µ0 6 2
P = Pk + P⊥ = γ ak + γ a⊥ = (γ ak + γ4 a2⊥ ).
6πc 6πc 6πc
(11.54)
This is total power radiated by a relativistic particle in arbitrary

motion. It is called Lienard formula. Lienard’s generalization of
the Larmor formula given by Equation (11.22). you can see that
Equation (11.54) reduces to (11.22) if γ → 1.
11.2 Radiation reaction
In this section, we consider the effects of radiation on the motion

of a single particle. This is normally called radiation reaction, or
radiation damping.
Since radiation carries energy away to infinity, an accelerated
charged particle loses energy in this process. For a given force, a
charged particle accelerates less than a neutral particle. Equiva-
lently speaking, the radiation exerts a force, called F rad , back on the
charge. The purpose of this section is to calculate this force, and
for simplicity, we consider only non-relativistic particles. You’ll see
that, like “self-energy”, this “self-force” (the force a charged particle
exerts on itself) is not well understood, at least in the framework of
classical electromagnetic theory.
From the conservation of energy, you might want to write that
the energy lost by radiation equals the amount of work done by
F rad , or
µ0 q2 a2
F rad · v = − P = − , (11.55)
6πc
where the radiation power is given by the Larmor formula. But this
is actually not correct. When we calculated the radiation power,
we excluded the energy from the velocity field, because we were
only interested in the energy that could be transported to infinity.
Well, the velocity field do carry energy, they just don’t transport it
to infinity. As the particle accelerates or de-accelerates, the energy
exchanges between the particle and the velocity field, and at the
same time, is radiated to infinity. Equation (11.55) only accounts for
the radiation field, but not the velocity field. If we want to know the
force exerted by the fields on the charge, we need to know the total
power lost by the charge at any given moment 4 . 4
The term “radiation reaction” or
“radiation damping” are kind of
Therefore the total energy lost by the particle should equal the
misleading; we are actually talking
total energy radiated and the total energy pumped into the velocity about the force exerted by the total
field. However, if we consider only intervals where the initial state field (radiation field + velocity field)
on the particle, not just the radiation
and the final state of the system are identical, e.g., a period system, field.
the energy stored in the velocity field is the same. The only net
loss of energy is in the form of radiation. Thus Equation (11.55)
is incorrect at any given moment, but is valid if we perform an
average over the interval
µ0 q2
Z t2 Z t2
F rad · vdt = − a2 dt. (11.56)
t1 6πc t1
We want to emphasize again that the state of the system is identical

at t1 and t2 . Integrating the RHS of the equation by parts, we have
dv t2
Z t2 Z t2 Z t2 2
dv dv d v
a2 dt = · dt = v · − · vdt. (11.57)
t1 t1 dt dt dt t1 t1 dt2

Because the state of system is identical at t1 and t2 ,

Z t2 Z t2 2 Z t2
d v
a2 dt = − · vdt = − ȧ · vdt. (11.58)
t1 t1 dt2 t1
Putting this equation into Equation (11.57) gives
µ0 q2
Z t2
F rad − ȧ · vdt = 0. (11.59)
t1 6πc
Apparently Equation (11.59) will be satisfied if
µ0 q2
F rad = ȧ. (11.60)
6πc
This is called the Abraham-Lorentz formula for the radiation reac-
tion force. However, you should note that you can add any term
perpendicular to v and Equation (11.59) will still be satisfied. Equa-
tion (11.60) gives the time-average force of the field exerted on
the charged particle. It is by far the simplest form of the radiation
force can take. We’ll use this formula to characterize the radiation
reaction force exerted on a charged particle.
However, we need to point out that the Abraham-Lorentz for-
mula has disturbing implications. To see this, let’s first write down
the equation of motion of the charged particle if we consider the
radiation reaction:
µ0 q2
ma = ȧ + F0 , (11.61)
6πc
here F0 denotes any other forces, and for simplicity, we shall con-
sider a 1D problem. Equation (11.61) can also be written as
a = τ ȧ + F0 /m, (11.62)
where
µ0 q2 2rq
τ= = , (11.63)
6πmc 3c
with rq = q2 /4πe0 mc2 the classical charged particle radius. For

an electron, τ ∼ 6 × 10−24 s. Note that within radiation reaction
consider, the acceleration a is a continuous function of time, just
like velocity and coordinates. To see that, we integrate Equation
(11.62) from t − e to t + e,
Z t+e Z t+e Z t+e
adt = τ ȧdt + F0 /mdt, (11.64)
t−e t−e t−e
this is the same as

Z t+e
v(t + e) − v(t − e) = τ [ a(t + e) − a(t − e)] + F0 /mdt. (11.65)
t−e
Note that v is a continuous function of t, then if we let e → 0,
0 = τ [ a(t + e) − a(t − e)] + 0, as e → 0. (11.66)
Therefore, a(t + e) − a(t − e) → 0 as e → 0, or a is a continuous

function of t.
Now we consider a particle moving from a region with some
kind of field so that the particle feels a constant force F0 to a region
with no field at t = 0. Let’s denote the region with field region I,
and the region without field region II. Since the force is constant,
the general solution of Equation (11.62) in Region I is
aI = aI0 et/τ + F0 /m, (11.67)
and in Region II is
t/τ
aII = aII
0e . (11.68)
Equating aII to aI at t = 0, we have
aI0 + F0 /m = aII
0. (11.69)
Since we choose t = 0 quite arbitrarily, there is no reason to believe

aII II
0 = 0. As long as a0 6 = 0, the solution of a in region II tells us that
the particle accelerates/de-accelerates indefinitely even if there is
no external force! This problem also arises if the radiation damping
force dominates, since then the equation of motion is reduced to
a = τ ȧ, (11.70)
which again gives a = a0 et/τ . From what we have discussed above,

we shall assume that the radiation damping force is much much
smaller than any external force from now on.
11.3 The field of a system of charges at large distances
We choose the origin of the coordinates O in the interior of the

system of charges, like before in the chapters about electrostatics
and magnetostatics. Charge position r 0 and field position r, and
R = r − r0 . (11.71)
We normally consider radiation in the form of waves. Correspond-

ingly there are typically three spatial scales, |r |, |r 0 |, and λ, which
is the typical radiation wave length. To simplify analysis, we make
some assumptions. First, we assume that |r 0 | (|r |, λ); i.e., the
system is small in size compared to the typical wave length and the
distance of the field point to the origin. Then by comparing λ and
|r |, we can define three zones of radiation.
1. The near zone with |r | λ.
2. The intermediate zone: |r | ∼ λ.
3. The far/radiation zone: |r | λ.
In this class, we only consider electromagnetic field in the radi-

ation zone. Further more, since we are considering a system of
charges now, we need to use the general expression of potentials,
the retarded potentials, to calculate E and B.
Now here is one extra note about the condition |r 0 | λ. Assume
the the characteristic wave period corresponding to λ is T, or
λ ∼ Tc, where c is the speed of light in vacuum. Now consider
the motion of a particle in the system within T, the distance this
particle travels should be about |r 0 |, the scale size of the system.
Therefore the particle should move with a characteristic speed
v ∼ |r 0 |/T. The condition |r 0 | λ then implies
|r 0 | λ ∼ Tc (11.72)
or
|r 0 |
∼ v c; (11.73)
T
i.e., the condition |r 0 | λ implies that the particles are non-

relativistic.
Now let’s start our analysis of the radiation field of a system
of charges. We can use Fourier transform to analyze the field in
far zone, and therefore we consider only the component with
frequency ω; i.e., assume the source ρ and j are ρ(r 0 )e−iωt and
j = j(r 0 )e−iωt , then the retarded potentials give
1 ρ0 (r 0 )e−iω (t− R/c)

Z
φ(r, t) = dV 0 , (11.74)
4πe0 R
j0 (r 0 )e−iω (t− R/c)
Z
µ0
A(r, t) = dV 0 . (11.75)
4π R
Using k = ω/c for electromagnetic waves in vacuum, we rewrite
above equations as
1 ρ0 (r 0 )ei(kR−ωt)
Z
φ(r, t) = dV 0 , (11.76)
4πe0 R
j0 (r 0 )ei(kR−ωt)
Z
µ0
A(r, t) = dV 0 . (11.77)
4π R
Now that since the time dependence in potentials is in the form

of e−iωt , then
∂
= −iω, (11.78)
∂t
therefore outside the source region,
1 ∂E ω
∇× B = 2
= −i 2 E = −ikE/c, (11.79)
c ∂t c
or
c
E = i ∇× B, and B = ∇× A. (11.80)
k
Therefore for radiation of a system of charges, we calculate B
first from the vector potential A, and then use Equation (11.80) to
calculate E, the electric field5 . Note that since the time dependence 5
Compare this with the radiation field
can be easily separated from the r dependence, we write of a point charge, where we calculate E
first, and then use B = e R × E to obtain
B.
A(r, t) = A(r )e−iωt , (11.81)
where
j0 (r 0 )eikR
Z
µ0
A (r ) = dV 0 . (11.82)
4π R
From now on we’ll only analyze A(r ), since the time dependence is
fixed and easy to calculate.
Now let’s go back to Equation (11.77) and simplify the expres-
sion of A(r ) using the fact that we are considering fields in the far
zone; i.e., |r | λ |r 0 |. Now we need to be careful here: there are
two small parameters
e ≡ |r 0 |/|r | 1, (11.83)
and
η ≡ |r 0 |/λ ∼ v/c 1. (11.84)
If we compare η and e, we find that
η/e = |r |/λ 1. (11.85)
In our Taylor expansion-type analysis below, we’ll throw away high

order terms in e, but not η.
First, let’s Taylor expand R,
q
R = (r − r 0 )2 = (r 2 + r 02 − 2r · r 0 )1/2 ≈ r − r 0 · n + · · · , (11.86)
where r = |r |, n ≡ r/r. Substituting Equation (11.86) into Equation

(11.82), we find
µ0 eikr
Z Z
µ0 0 0
A (r ) ≈ j0 (r 0 )eikr−ikr ·n dV 0 = j0 (r 0 )e−ikr ·n dV 0 .
4πr 4πr
(11.87)
There is a reason why we ignored r 0 · n in the denominator while

keeping the term ikr 0 · n in the phase. First,
r0 · n

1 1 1 1 1
1 + r 0 /r = (1 + e) ,

= ≈ 1+ ∼
R r − r0 · n r r r r
(11.88)
On the other hand,

0
e−ikr ·n ≈ 1 − ikr 0 · n ∼ 1 − ir 0 /λ = 1 − iη (11.89)
Since η/e 1 and we are keeping lowest order terms of e, but not
0
η, we cannot ignore e−ikr ·n in general.
You can also try to understand this term directly using the
retarded potentials, in which we have ρ(r 0 , t0 ) or j(r 0 , t0 ). Using
Equation (11.86),
ρ(r 0 , t0 ) ≈ ρ(r 0 , t − r/c + r 0 · n/c) (11.90)

r0

0 ∂ρ ·n
≈ ρ(r , t − r/c) + 0 . (11.91)
∂t t−r/c c
Therefore, whether or not you can ignore the second term depends
on how much ρ changes during the time interval r 0 · n/c. This is
totally different from the factor 1/R.
The remaining analysis of this chapter makes use of the small
parameter η = |r 0 |/λ, and it’s a similar process similar to the
multipole expansion of the electrostatic and magnetostatic fields.
11.4 The dipole radiation
To lower order in η = |r 0 |/λ, Equation (11.87) becomes
µ0 eikr
Z
A (r ) ≈ j0 (r 0 )dV 0 . (11.92)
4πr
Because j0 (r 0 ) = ρ0 (r 0 )v, the integration becomes

Z Z
j0 (r 0 )dV 0 = ρ0 (r 0 )vdV 0 = ∑ qa va = ṗ. (11.93)
a
Here we have used
ρ (r 0 ) = ∑ qa δ(r 0 − r 0a ), (11.94)
and the definition of the dipole moment,
p= ∑ qa r 0a ⇒ ṗ = ∑ qa ṙ 0a = ∑ qa va . (11.95)
Therefore, Equation (11.92) becomes
µ0 eikr
A (r ) = ṗ. (11.96)
4πr
This is the vector potential of the dipole radiation in far zone.
From Equation (11.96), we can easily obtain the magnetic field B

by
µ0 eikr ik

1
B = ∇× A = − 2 ∇r × ṗ (11.97)
4π r r
ikµ0 e ikr
1

= 1− n × ṗ. (11.98)
4πr ikr
Using that 1/kr ∼ λ/r ∼ e/η 1,
ikµ0 eikr iωµ0 eikr

B= n × ṗ = n × ṗ. (11.99)
4πr 4πcr
Since the time dependence is in the form of e−iωt , iω ṗ = − p̈, and
µ0 eikr
B= p̈ × n. (11.100)
4πcr
From the above analysis, we see that when performing ∇ operation
and when we only need to keep the lowest order terms in λ/r, we
just need to replace ∇ by ikn.
The electric field can be obtained from Equation (11.80),
c c µ0 eikr
E = i ∇× B = i ikn × B = cB × n = ( p̈ × n) × n. (11.101)
k k 4πr
If we define θ ≡ h p, ni, noting that p̈ = −ω 2 p, therefore θ is
also the angle between p and n. The E (Equation (11.101)) and B
(Equation (11.100)) fields are
µ0 eikr
B (r ) = p̈ sin θeφ , (11.102)
4πcr
µ0 eikr
E (r ) = p̈ sin θeθ . (11.103)
4πr
Here (r, θ, φ) form the normal spherical coordinates. These are the
electric and magnetic fields from dipole radiation.
To calculate the power radiated by the dipole radiation, we start
from the calculation of Poynting flux.
1 E + E∗ B + B∗

1
S= Re( E(r, t)) × Re( B(r, t)) = ×
µ0 µ0 2 2
(11.104)
1
= ( E × B + E × B∗ + E∗ × B + E∗ × B∗ ) . (11.105)
4µ0
From equations (11.102) and (11.103),
E × B = E(r ) × B(r )e−2iωt , (11.106)

∗
E × B = E (r ) × B (r ). (11.107)
The Poynting flux therefore contains rapidly varying terms pro-

portional to e−2iωt . In this case, it’s customary to calculate the time
averaged Poynting flux
Z t+ T
1
hSi = S(τ )dτ. (11.108)
T t
Note that
Z t+ T
1
heimωt i = eimωτ dτ = 0 for m 6= 0. (11.109)
T t
Therefore the time-averaged Poynting flux is
1 1
hSi = ( E × B∗ + E∗ × B) = Re( E × B∗ ). (11.110)
4µ0 2µ0
For dipole radiation, E = cB × n, then
E × B∗ = (cB × n) × B∗ = cn( B · B∗ ) = c| B|2 n, (11.111)
since B ⊥ n from Equation (11.100). Substituting Equation (11.111)

into Equation (11.110) gives
c
hSi = | B|2 n. (11.112)
2µ0
The angular distribution of the radiation power is, using Equa-

tion (11.100),

dP µ0
= hSi · nr2 = | p̈ × n|2 , (11.113)
dΩ 32π 2 c
or
µ0 ω 4

dP µ0 2 2
= | p̈ | sin θ = | p|2 sin2 θ, (11.114)
dΩ 32π 2 c 32π 2 c
where we have used | p̈| = ω 2 | p|. The total power radiated is then
Z Z
dP dP
h Pi = dΩ = sin θdθdϕ (11.115)
dΩ dΩ
Z 2π Z π
µ0 2
= | p̈ | dϕ sin3 θdθ (11.116)
32π 2 c 0 0
µ | p̈|2 µ0 ω 4 | p0 |2
= 0 = , (11.117)
12πc 12πc
where we have used p = p0 e−iωt .

See the dependence on ω 4 ? This means that the radiated power
increases tremendously if we increase frequency. Also this is what’s
responsible for the “blue” sky, also know as Rayleigh scattering.
As electromagnetic waves from the sun, the sunlight, pass through
the atmosphere, the atoms in the atmosphere feel the electromag-
netic field and get stimulated. If you treat these oscillating atoms
as simple harmonics, they oscillate at the “forced” frequency. The
tiny dipoles formed by these atoms will then re-radiate electromag-
netic waves, more intense in higher frequencies (blue) than in the
lower frequencies (red). This is what accounts for the blue sky. At
sunset/sunrise, those blue has been removed by the exact same
scattering, and you see what’s left, beautiful red light. We’ll cover
more about the scattering of electromagnetic waves later.
As an application of the dipole radiation, let’s consider the
radiation of a linear antenna whose length d λ. Choosing the
origin of a coordinate system to be the center of the antenna. Let

the current I in the antenna be
2|z| −iωt
I = Iez = I (z)e−iωt ez = I0 (1 − )e ez . (11.118)
d
From the continuity equation,
∇· j − iωρ = 0 (11.119)
or ρ = ∇· j/iω, so the linear charge density6 is 6

Normally people use λ to denote
linear charge density; however, we’ve
∇· j∆A ∇· ( Iez ) 1 dI 2I used λ for wavelength.
ρ0 = ρ∆A = = = = i 0 sgnz. (11.120)
iω iω iω dz ωd
The electric dipole moment is
Z d/2
p= ∑ qa r 0a = ∑ q(z)zez = −d/2
ρ0 zdzez . (11.121)
a
Substituting Equation (11.120) into (11.121) gives

Z d/2
2I0 I0 d
p=2 i zdzez = i ez . (11.122)
0 ωd 2ω
Substituting Equation (11.122) into Equations (11.114) and (11.117)
gives
µ0 ω 4 I0 d 2 µ0 cI02

dP 2
= sin θ = (kd)2 sin2 θ, (11.123)
dΩ 32π 2 c 2ω 128π 2
and
µ0 c 2
h Pi = I (kd)2 . (11.124)
48π 0
Another thing is the total radiation power is proportional to kd or
d/λ, so to increase radiation power for a given λ, you can increase
d. However, we derived the dipole radiation under the assumption
that d/λ 1, or kd 1, therefore you cannot increase d without
limits but still use Equation (11.124) to calculate the power.
11.5 Magnetic dipole and electric quadrupole radiation
Normally the electric dipole radiation (sometimes just called

“dipole radiation”) dominates. However, in case the electric dipole
radiation is 0 or if we want to go to O(η ), we need to expand
0
e−ikr ·n and keep terms of O(η ).
The vector potential, when Taylor expanded, is
µ0 eikr
Z
0
A (r ) = j(r 0 )e−ikr ·n dV 0 (11.125)
4πr
µ0 eikr
Z
≈ j(r 0 )(1 − ikr 0 · n)dV 0 . (11.126)
4πr
We have consider the lowest order term, the electric dipole radia-
tion. The first order term in r 0 /λ is
ikµ0 eikr ikµ0 eikr
Z Z
A (r ) = − jr 0 · ndV 0 = − n · r 0 jdV 0 . (11.127)
4πr 4πr
Any tensor can be written as the summation of a symmetric part

and an asymmetric part. So we do this for r 0 j.
1 0 1
r0 j = (r j + jr 0 ) + (r 0 j − jr 0 ). (11.128)
2 2
After dotting Equation (11.128) with n/c from left we have
1 1
n · r0 j = [(n · r 0 ) j + (n · j)r 0 ] + [(n · r 0 ) j − (n · j)r 0 ] (11.129)
2 2
1 1 0
= [(n · r 0 ) j + (n · j)r 0 ] + (r × j) × n. (11.130)
2 2
Substitute Equation (11.128) into Equation (11.127). Let’s first
consider the anti-symmetric part. The integration for the anti-
symmetric part is
1 1
Z Z
(r 0 × j) × ndV 0 = −n × (r 0 × j)dV 0 = −n × m, (11.131)
2 2
where
1
Z
m= (r 0 × j)dV 0 (11.132)
2
is just the magnetic dipole moment we have learned before.
The integration of the symmetric part becomes
1
Z
(n · r 0 ) j + (n · j)r 0 dV 0 ,

(11.133)
2
Note that j = ∑ q a δ(r 0 − r 0a )v a , the integration becomes
1
Z
[(n · r 0 ) j + (n · j)r 0 ]dV 0 (11.134)
2
1
2∑
= q a [(n · r 0a )v a + (n · v a )r 0a ]. (11.135)
Using v a = dr 0a /dt, we have
(n · r 0a )v a + (n · v a )r 0a (11.136)
dr 0 dr 0a
= (n · r 0a ) a + (n · )r 0a (11.137)
dt dt
d 0 0
= n· (r r ). (11.138)
dt a a
Substituting Equation (11.138) into Equation (11.135) leads to
1 1 d
Z
dt ∑
(n · r 0 ) j + (n · j)r 0 dV 0 = n · qr 0 r 0 .

(11.139)
2 2
Because B = ∇× A = ikn × A, we can add to A an arbitrary vector
in the direction of n, so we add to Equation (11.138)
1 d
− n·
6 dt ∑ qr 02 I (11.140)
and the resulting summation is

n d n d
·
2 dt ∑ qr 0 r 0 − ·
6 dt ∑ qr 02 I (11.141)
n d
= ·
6 dt ∑ q 3r 0 r 0 − r 02 I (11.142)
1 1
= n · Ḋ ≡ Ḋ, (11.143)
6 6
where the tensor D is the electric quadrupole we have learned

before, and the vector D = n · D is a vector we defined by D = n · D.
Combining the symmetric and the anti-symmetric part we have
the total vector potential
ikµ0 eikr

1
A=− Ḋ − n × m . (11.144)
4πr 6
Therefore the magnetic dipole radiation and the electric quadrupole

radiation are of same order. The radiation magnetic field can be
obtained from A as
−ikµ0 eikr 1

B = ∇× A = ikn × Ḋ − n × m (11.145)
4πr 6
k2 µ0 eikr n

1
= × Ḋ − n × m . (11.146)
4πr 6
We can separate the magnetic dipole part and the electric quadrupole
part. The magnetic dipole part is
k2 µ0 eikr n
Bm = × (−n × m)
4πr
k2 µ0 eikr
= (n × m) × n
4πr
µ eikr
= (m̈ × n) × n 0 2 . (11.147)
4πc r
The electric quadrupole part is
k2 µ0 eikr k3 cµ0 eikr ... µ eikr

B D = n × Ḋ = −i n × D = D × n 0 2 . (11.148)
24πr 24πr 24πc r
The electric field can be obtained from E = cB × n, and corre-
spondingly
µ0 eikr
Em = cBm × n = (n × m̈) , (11.149)
4πcr
... µ0 eikr
E D = cB D × n = ( D × n) × n . (11.150)
24πcr
Using Equations (11.147) and (11.148), we can obtain the angular
distribution of power
* +
c µ0 (m̈ × n) × n 2 µ |m̈|2

dP c
= 2 2
| Bm | r = 2
= 0 2 3 sin2 θ.
dΩ m µ0 µ0 4πc 32π c
(11.151)
* ... 2 +
c µ0 D × n ...

dP c µ0
= | B D |2 r 2 = 2
= 2 3
| D × n |2 .
dΩ µ0 24πc 1152π c

D µ0
(11.152)
The total power of magnetic dipole radiation can be easily ob-

tained
µ0 |m̈|2 µ0 |m̈|2
Z
dP
ZZ
h Pim = dΩ = 2 3
sin3 θdθdϕ = . (11.153)
dΩ 16π c 12πc3
This is in exact the same form as the electric dipole radiation power
shown in Equation (11.117). Left as an exercise, you need to figure
out why the electric dipole radiation is one order lower than the
magnetic dipole radiation, even though they have the same form.
Because D = D · n, the angular distribution of power radiated
by the electric quadrupole is in general very complicated. Here
we’ll use a simple example to demonstrate its calculation and
the corresponding total radiation power. Consider three charges
(q, q, −2q) located at r 0 (q) = lez , r 0 (q) = −lez , and r 0 (−2q) = 0ez ,
then
D = 6ql 2 ez ez , (11.154)
so D = D · n = 6ql 2 ez cos θ, then
| D × n| = |6ql 2 cos θez × n| = 6ql 2 cos θ sin θ, (11.155)
and
...
| D × n|2 = 36q2 l 4 ω 6 cos2 θ sin2 θ. (11.156)
The total power radiated is then
µ0 q2 l 4 ω 6
Z 2π Z π
µ0
h Pi D = dϕ 36q2 l 4 ω 6 cos2 θ sin3 θdθ = .
1152π 2 c3 0 0 60πc3
11.6 Scattering by free charges
Now we have considered the radiation by charges. Let’s now con-

sider a closely related physical process: scattering of electromag-
netic waves by charges. When electromagnetic waves incident on
charges, charges feel the Lorentz force and get accelerated. The
acceleration leads to radiation of waves in all directions. This is the
physical mechanism responsible for scattering of light. And in the
most general cases, it is called the scattering of the original wave.
It is customary to characterize the scattering by the ratio of
energy emitted by the scattering system in a given direction per
unit time 7 , denoted by hdPi, to the energy flux density of the 7
This is just the power emitted by the
scattering system in a given direction.
incident radiation, denoted by hSi. This ratio has dimension of
area, and is called the effective scattering cross-section, or simply
cross-section8 . Here h· · · i denotes time average. Therefore 8
You can see “cross-section” in lots of
places, e.g., collision.
hdPi
dσ = . (11.157)
hSi
You can integrate dσ to get σ, which is called the total scattering
cross-section.
In this section, we consider the scattering by a free charge at rest.
We only consider non-relativistic case here; i.e., the charge velocity
v c. Then the ratio of the magnetic force to the electric force is
|v × B| vB v
∼ ∼ 1. (11.158)
| E| E c
Therefore we neglect the magnetic force and only consider the elec-
tric force. Let the incident wave be a linearly polarized monochro-
matic wave
E = E0 cos(k · r − ωt + α), (11.159)
Then the equation of motion of the charge is

q q
r̈ = E = E0 cos(k · r − ωt + α). (11.160)
m m
Now again because we are considering non-relativistic particles,
the amplitude of the motion l ∼ vT cT = λ, where T is the
wave period and λ is the wavelength. Therefore we can ignore the
variation of r in the wave phase factor k · r and let k · r = k · r 0 . For
simplicity, we can set r 0 = 0. Equation (11.160) becomes
q
r̈ = E0 cos(−ωt + α). (11.161)
m
In complex notation,
q b −iωt b 0 = E0 eiα .
r̈ = E0 e , with E (11.162)
m
Because the particle moves with a non-zero acceleration, it
radiates and the lowest order radiation is given by the dipole
radiation. The dipole moment p = qr and
q2 b −iωt
p̈ = qr̈ = E0 e . (11.163)
m
So the averaged radiated field can be calculated using Equation
(11.114), which is
µ0 q4 2

dP µ0
= | p̈|2 sin2 θ = E sin2 θ, (11.164)
dΩ 2
32π c 32π 2 c m2 0
or
µ0 q4 2
hdPi = E sin2 θdΩ. (11.165)
32π 2 c m2 0
The energy flux of the incident wave is
1 1 2
S= | E × B| = E , (11.166)
µ0 µ0 c
so
1 E02 1
hSi = = E2 . (11.167)
µ0 c 2 2µ0 c 0
Substituting Equation (11.167) and Equation (11.165) into Equation

(11.157) gives the scattering cross-section
2
µ20 q4 q2

dσ = sin2 θdΩ = sin2 θdΩ. (11.168)
16π 2 m2 4πe0 mc2
We see that dσ is independent of the frequency of the incident

wave9 . 9
Compare this with Rayleigh scatter-
ing, see below.
R
The total scattering cross-section is simply σ = dσ, or
2 Z 2
q2 q2
Z 2π
π
3 8π
σ= sin θdθ dϕ = .
4πe0 mc2 0 0 3 4πe0 mc2
(11.169)
If we use the classical electron radius re = q2 /4πe0 mc2 , then
8π 2
σ= r . (11.170)
3 e
The scattering by a free non-relativistic electron is called Thomson
scattering and the cross-section (11.169) or (11.170) is called the
Thomson formula.
Equation (11.169) is also valid even if the incident wave is un-
polarized. In this case, we need to average over the incident wave
to find the total cross-section. Let’s establish a coordinate system,
where E is in x-z plane, and φ = h E, ez i, Note that
cos θ = n · e, with e ≡ E/E, (11.171)
or
cos θ = [(n · ey )ey + (n · ez )ez ] · e = sin α cos φ. (11.172)
Therefore
sin2 θ = 1 − cos2 θ = 1 − sin2 α cos2 φ. (11.173)
We need to average over φ since the incident wave is unpolarized,
sin2 α
Z 2π
1
hsin2 θ iφ = (1 − sin2 α cos2 φ)dφ = 1 − , (11.174)
2π 0 2
and
2
q2

dσ 1
= (1 + cos2 α), (11.175)
dΩ 4πe0 mc2 2
and the total cross section is
2 Z 2
q2 q2

1 2 8π
σ= ( 1 + cos α ) dΩ = . (11.176)
2 4πe0 mc2 3 4πe0 mc2
11.7 Scattering by bounded electrons
Let’s now calculate the scattering of an incident electromagnetic

wave by a bounded electron, e.g., an electron in an atom. We model
the electron motion using a harmonic oscillator with natural fre-
quency ω0 . The corresponding equations of motion of the electron
is
ẍ + ω02 x = ( F damping + F drive )/m, (11.177)
Here F damping is any damping force, and we consider radiation

damping only, therefore
...
F damping = mτ ȧ = mτ x = −mτω 2 ẋ = −mγ ẋ, (11.178)
where γ = τω 2 . On the other hand, F drive is the force from the

incident wave, and it’s
F drive = qEe−iωt . (11.179)
Putting F damping and F drive into Equation (11.177) results in the

needed equation
q −iωt
ẍ + ω02 x + γ ẋ = Ee . (11.180)
m
Note here we assume that γ (ω0 , ω ).
From what we have learned about the forced oscillation of a
harmonic oscillator, x in Equation (11.180) has a solution of the
form x0 e−iωt , and the solution of x is given by
(q/m) E0 e−iωt (q/m) E0 e−i(ωt−δ)

x= = , (11.181)
ω02 − ω 2 − iγω
q
(ω02 − ω 2 )2 + γ2 ω 2
where −δ is the phase angle of ω02 − ω 2 − iγω, and

γω
tan δ = . (11.182)
ω02
− ω2
From Equation (11.181), the second order time derivative of the
dipole moment p is
(q2 /m)ω 2 E0 e−i(ωt−δ)

p̈ = q ẍ = − q . (11.183)
(ω02 − ω 2 )2 + γ2 ω 2
The remaining calculation of the scattering cross-section is

almost identical to that in the previous section. The angular distri-
bution of the radiated power from the dipole radiation is

dP µ0
= | p̈|2 sin2 θ (11.184)
dΩ 32π 2 c
2 2
µ0 q ω4
= E2 sin2 θ. (11.185)
32π 2 c m (ω02 − ω 2 )2 + γ2 ω 2 0
R dP
The total radiated power is simply h dΩ idΩ, or
µ0 q4 ω 4 E02
h Pi = . (11.186)
12πm2 c (ω02 − ω 2 )2 + γ2 ω 2
The total cross section is simply
h Pi h Pi
σ= = (11.187)
hSi (1/2µ0 c) E02
µ20 q4 ω4
= 2 2
, (11.188)
6πm (ω0 − ω 2 )2 + γ2 ω 2
where we have used Equation (11.167) for hSi. Using the classical
electron radius re = q2 /4πe0 mc2 , the cross section is
8π 2 ω4
σ= re 2 . (11.189)
3 ( ω0 − ω 2 ) 2 + γ 2 ω 2
Let’s now discuss several limiting cases of Equation (11.189). If

ω ω0 , then
8π 2 ω 4
σ≈ r . (11.190)
3 e ω04
This is the Rayleigh scattering we’ve mentioned before. If ω ω0 ,
8π 2
σ≈ r , (11.191)
3 e
which is Thomson scattering cross section, the electron move (ω0 )

so slowly that it is essentially a free electron to the incident wave. If,
however, ω ∼ ω0 , then
2
8π 2 ω
σ= r , (11.192)
3 e γ
which is much much larger than the Thomson scattering cross

section because resonance occurs.

Ce PDF

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Ce PDF

Uploaded by

Copyright:

Available Formats

X I N TA O

1 Introduction to Vector and Tensor Analysis 7

2 Foundations of Theory of Relativity 19

4 Charges in a Given Electromagnetic Field 40

5 The Electromagnetic Field Equations 55

7 Boundary Value Problems in Electrostatics 78

9 Time-varying Electromagnetic Fields 97

10 Fields of Moving Charges 121

11 Radiation of Electromagnetic Waves 132

It is critically important for you to learn basic vector and tensor

1.1 Vector Analysis

1.1.1 The definition

Any set of three components that transforms in the same

1.1.2 Einstein Summation Convention

Here the repeated index i is called the “dummy index” (哑指标),

1. Dummy indices are implicitly summed over.

2. Each index can appear at most twice in any term.

3. Each term must contain identical non-repeated indices,

To understand the following rules, we give some examples here.

Rule #2 prohibits expressions like Ai Bi Ci , because no vector op-

Here δij = 1 if i = j and 0 otherwise. Note that we used “i” as the

A · B = Ai Bj gij , 2 not Ai Bi as you would have by violating rule #2. 2

Einstein summation convection, as it will be used throughout this

A × B = ( A2 B3 − A3 B2 )e1 + ( A3 B1 − A1 B3 )e2 + ( A1 B2 − A2 B1 )e3 .

eijk eimn = δjm δkn − δjn δkm , (1.6)

where δjm = 0 if j 6= m and 1 if j = m. With the Levi-Civita symbol,

1.1.3 The Del Operator

We see that ∇ is a vector operator; it has properties of a vector and

The curl of a vector A is

1.2 Vector Algebra

In this section, I will teach you how to memorize/derive commonly

A·B×C = B·C×A = C·A×B

Or you can cyclically permute the order of vectors, like

Second, for A × ( B × C ), you can use the middle-outer rule 3 . 3

I’ve found this middle-outer rule quite convenient.

∇ A × ( A × B) = (∇ A · B) A − B(∇ A · A) middle-outer rule

1. write out the components in Cartesian coordinates

3. from components, restore the original variables

Take a break and let’s continue...

1.3 Curvilinear coordinates

I do not talk about general vector analysis in curvilinear coordi-

In cylindrical coordinates (er , eφ , ez ),

1.4 A very brief introduction to tensor

A tensor may be regarded as the product of two vectors; it has two

the component in ei e j direction is Ai Bj . Lots of students like to use

Using these rules, we see that,

You see that in general AB 6= BA, since AB · C is a vector in the

A general second-order tensor cannot be written as a dyad, but it

In 3-D space, you need at least three dyads to represent a general

If you define a second-order tensor using transformation

The dot-product between a second-order tensor and a vector is a

F · A = Fij ei e j · Ak ek = Fij Ak ei δjk = Fij A j ei .

So a dot product reduces the order or rank of a tensor by one. The

A · F · B = Ai ei · Fjk e j ek · Bl el = Ai Fjk Bl δij δkl = Ai Fik Bk .

This dot product can also be written as

A special second-order tensor is the unit tensor I, which has the

Here are two examples involving tensors.

Example 2: Prove that

1. ∇· ( f gh) = ∇ f · ( f gh) + ∇ g · ( f gh) + ∇h · ( f gh)

1.5 The Dirac Delta Function

1.5.1 The one-dimensional Dirac Delta Function

Strictly speaking, δ( x ) is not a function, since its value is not finite

A slightly more general form of 1D Dirac delta function is of

2. 向量张量运算符号法胡友秋，电动力学讲义 (课程主页上可以下