
Electromagnetics

IEEE PRESS Series on Electromagnetic Waves


The IEEE PRESS Series on Electromagnetic Waves consists of new titles as well as
reprints and revisions of recognized classics that maintain long-term archival
significance in electromagnetic waves and applications.

Donald G. Dudley
Editor
University of Arizona

Advisory Board
Robert E. Collin
Case Western University

Akira Ishimaru
University of Washington

Associate Editors
Electromagnetic Theory, Scattering, and Diffraction
Ehud Heyman
Tel-Aviv University
Differential Equation Methods
Andreas C. Cangellaris
University of Arizona
Integral Equation Methods
Donald R. Wilton
University of Houston
Antennas, Propagation, and Microwaves
David R. Jackson
University of Houston

Series Books Published


Collin, R. E., Field Theory of Guided Waves, 2d. rev. ed., 1991
Tai, C. T., Generalized Vector and Dyadic Analysis:
Applied Mathematics in Field Theory, 1991
Elliott, R. S., Electromagnetics: History, Theory, and Applications, 1993
Harrington, R. F., Field Computation by Moment Methods, 1993

Future Series Titles


Tai, C. T., Dyadic Green's Function in Electromagnetic Theory
Dudley, D. G., Mathematical Foundations of Electromagnetic Theory
Electromagnetics
History, Theory, and Applications

Robert S. Elliott
Hughes Chair Professor of Electromagnetics
University of California, Los Angeles

IEEE PRESS

IEEE PRESS Series on Electromagnetic Waves


Donald G. Dudley, Series Editor

IEEE Antennas and Propagation Society, Sponsor

The Institute of Electrical and Electronics Engineers, Inc., New York


IEEE PRESS
445 Hoes Lane, PO Box 1331
Piscataway, NJ 08855-1331
1992 Editorial Board
William Perkins, Editor in Chief

K. K. Agarwal G. F. Hoffnagle A. C. Schell


R. S. Blicq J. D. Irwin L. Shaw
R. C. Dorf A. Michel M. Simaan
D. M. Etter E. K. Miller Y. Sunahara
J. J. Farrell III J. M. F. Moura D. J. Wells
K. Hess J. G. Nagle
Dudley R. Kay, Executive Editor
Carrie Briggs, Administrative Assistant
Karen G. Miller, Production Editor
IEEE Antennas and Propagation Society, Sponsor
AP-S Liaison to IEEE PRESS
Robert J. Mailloux
Rome Laboratory, ERI
Hanscom AFB
This book may be purchased at a discount from the publisher
when ordered in bulk quantities. For more information contact:
IEEE PRESS Marketing
Attn: Special Sales
PO Box 1331
445 Hoes Lane
Piscataway, NJ 08855-1331
Fax: (732) 981-9334
© 1993 by the Institute of Electrical and Electronics Engineers, Inc.
3 Park Avenue, 17th Floor, New York, NY 10016-5997
This is the IEEE edition of a book published by McGraw-Hill Book Company
under the title Electromagnetics.
All rights reserved. No part of this book may be reproduced in any form,
nor may it be stored in a retrieval system or transmitted in any form,
without written permission from the publisher.

10 9 8 7 6 5 4 3 2 1
ISBN 0-7803-5384-6
IEEE Order Number: PP3723
The Library of Congress has catalogued the hard cover
edition of this title as follows:

Elliott, Robert Stratman (date)


Electromagnetics: history, theory, and applications / by Robert
S. Elliott.
p. cm.
IEEE Antennas and Propagation Society, sponsor.
Includes bibliographical references and index.
ISBN 0-7803-1024-1
1. Electromagnetic theory. I. IEEE Antennas and Propagation
Society. II. Title.
QC670.E42 1993
537-dc20 93-7061
CIP
TO THE CHILDREN
Table of Contents
Original Preface, xiii
Preface to the IEEE Edition, xvii
Glossary of Symbols, xix

1 THE PHENOMENON OF LIGHT

1.1 Historical Survey-The Nature of Light 1
1.2 Historical Survey-The Velocity of Light 14
1.3 Sound Waves and Light Waves 29

2 THE SPECIAL THEORY OF RELATIVITY

2.1 Historical Survey 37
2.2 The Principle of Relativity and Its Classical Implications 41
2.3 Applications of the Classical Velocity Transformation Law 46
2.4 Fizeau's Experiment with Moving Water 49
2.5 The Michelson-Morley Experiment 51
2.6 Ether Drag 58
2.7 The Lorentz-FitzGerald Contraction Hypothesis 59
2.8 Emission Theories 61
2.9 The Interdependence of Space and Time 61
2.10 The Lorentz Transformation 70
2.11 Length and Time Under the Lorentz Transformation 73
2.12 Proper Time and Proper Distance 78
2.13 Velocity 81
2.14 Relativistic Interpretation of the Fizeau Experiment 82
2.15 The Cedarholm-Townes Maser Experiment 83
2.16 The Variation of Mass 85
2.17 Momentum and Energy 89
2.18 The Transformation Law for Mass 91
2.19 The Transformation Law for Force 92

3 ELECTROSTATICS IN FREE SPACE

3.1 Historical Survey 98
3.2 Mathematical Formulation of the Inverse Square Law 117
3.3 The Electric Field 121
3.4 Electrostatic Potential 127
3.5 Gauss' Law 134
3.6 Electric Flux 138
3.7 A Conductor-Vacuum Interface 140
3.8 The Method of Images 142
3.9 Poisson's Equation 148
3.10 Laplace's Equation 151
3.11 Solutions to Laplace's Equation in Rectangular Coordinates 152
3.12 Solutions to Laplace's Equation in Cylindrical Coordinates 155
3.13 Solutions to Laplace's Equation in Spherical Coordinates 165
3.14 Green's Functions 172
3.15 Solutions to Laplace's Equation in Two Dimensions with the Use of Conformal Mapping 175
3.16 The Schwarz Transformation 182
3.17 Capacitance 186
3.18 Multicapacitor Systems 189
3.19 Electrostatic Stored Energy 193
3.20 The Maxwell-McAlister Experiment 199
3.21 The Plimpton-Lawton Experiment 204

4 MAGNETOSTATICS IN FREE SPACE

4.1 Historical Survey 216
4.2 The Transformation of Electric Force 224
4.3 The Fields Due to a Closed Circulating Charge System 230
4.4 The Biot-Savart Law 232
4.5 The Magnetic Field Intensity 236
4.6 The Force Between Currents 237
4.7 The Time-Independent Magnetic Vector Potential Function 239
4.8 Ampere's Circuital Law 244
4.9 Boundary-Value Problems in Magnetostatics 249
4.10 Composite Fields 252

5 ELECTROMAGNETICS IN FREE SPACE

5.1 Historical Survey 256
5.2 The Transformation Equations for Electric and Magnetic Fields 264
5.3 The Transformation Equations for the Source Densities 267
5.4 Maxwell's Equations 268
5.5 Integral Solutions of Maxwell's Equations in Terms of the Sources 272
5.6 Conditions at Infinity 275
5.7 The Potential Functions 279
5.8 Magnetic Stored Energy 283
5.9 Poynting's Theorem 285
5.10 Solutions to the Wave Equation in Rectangular Coordinates-Unguided Waves 291
5.11 Rectilinear Guided Waves 297
5.12 Solutions to the Wave Equation in Cylindrical Coordinates 303
5.13 Solutions to the Wave Equation in Spherical Coordinates 306
5.14 Inductance 309
5.15 Transformation of the Integral Solutions to Forms Suitable for Waveguide Problems 315
5.16 A Minkowskian Formulation of the Field Equations 319

6 DIELECTRIC MATERIALS

6.1 Historical Survey 327
6.2 The Electric Moment of a Neutral System of Charges 330
6.3 The Static Macroscopic Electric Field Due to a Volume of Polarized Dielectric Material 331
6.4 A Generalization of ε₀ 339
6.5 The Local Field 342
6.6 Electronic Polarization 344
6.7 Ionic Polarization 346
6.8 Orientational Polarization 351
6.9 Dielectric Susceptibility, Permittivity, and Relative Dielectric Constant 354
6.10 The Static Dielectric Constant of Gases 356
6.11 The Static Dielectric Constant of Solids and Liquids 361
6.12 The Clausius-Mosotti Equation 366
6.13 Primary Static Charges in an Infinite, Homogeneous, Isotropic Medium 369
6.14 Ferroelectric Crystals 370
6.15 Piezoelectrics 376
6.16 Time-Harmonic Fields and Complex Permittivity 379
6.17 Time-Harmonic Electronic Polarizability 380
6.18 Complex Ionic Polarizability; Time-Harmonic Permittivity of Non-Polar Materials 382
6.19 Dipolar Relaxation 383
6.20 Dielectric Losses 390
6.21 Maxwell's Equations for Dielectric Materials 393

7 MAGNETIC MATERIALS

7.1 Historical Survey 397
7.2 The Static Macroscopic Magnetic Field Due to a Volume of Polarized Magnetic Material 404
7.3 A Generalization of μ₀ 408
7.4 The Local Field 410
7.5 Magnetic Susceptibility 411
7.6 Measurement of Susceptibility 414
7.7 Diamagnetism 416
7.8 Permanent Magnetic Moments 420
7.9 Paramagnetism 427
7.10 Properties of Ferromagnetic Materials 433
7.11 The Weiss Theory of Ferromagnetism 436
7.12 The Weiss Field Constant and the Exchange Integral 440
7.13 Ferromagnetic Domains 442
7.14 Antiferromagnetism 445
7.15 Ferrimagnetism 451
7.16 Time-Varying Phenomena 454
7.17 Maxwell's Equations for Magnetic Materials 464

8 CONDUCTIVE MATERIALS

8.1 Historical Survey 467
8.2 Classification of Conductive Properties Under the Band Theory 473
8.3 Free-Electron Theory of Metals-Ohm's Law 478
8.4 Ohm's Law-Alternate Derivation 482
8.5 The Mean Time Between Electron-Lattice Interactions 485
8.6 Mean Free Path 486
8.7 Joule's Law 489
8.8 The Debye Theory of Specific Heat 490
8.9 The Temperature Dependence of the Resistivity of Metals 494
8.10 Thermal Conductivity of Metals and the Wiedemann-Franz Law 496
8.11 Conductivity of Semiconductors 500
8.12 Maxwell's Equations for Conductive Media 509
APPENDICES

A Fringe Shift versus Rotation of the Michelson-Morley Apparatus 513
B Classical Doppler Shift from a Moving Source in the Presence of a Moving Ether 515
C Some Properties of Bessel Functions 519
D The Associated Legendre Equation 524
E Composition of General Sources 530
F Generalization of the Field Transformation Equations 532
G Reduction of the Vector Green's Formula for E 534
H The Wave Equations for A and Φ 537
I Vector Wave Solutions in Spherical Coordinates 539
J Green's Functions for Rectangular Waveguide 540
K The Average Electrostatic Field Intensity Inside a Sphere Containing an Arbitrary Dipole Distribution 544
L The Dynamic Macroscopic Scalar Potential Function Due to a Volume of Polarized Dielectric Material 547
M The Damping Constant of a Freely Oscillating Dipole 550
N The Average Magnetostatic Field Intensity Inside a Sphere Containing an Arbitrary Distribution of Current Loops 552

MATHEMATICAL SUPPLEMENT

I Taylor's Series
S.1 Historical Survey 557
S.2 Mean Value Theorems 558
S.3 Taylor's Series for One Variable 560
S.4 Taylor's Series for Several Variables 561

II Vectors
V.1 Historical Survey 564
V.2 Scalars and Vectors 568
V.3 The Addition Law for Vectors 569
V.4 The Multiplication of Vectors by Scalars 572
V.5 Resolution into Components 572
V.6 Multiplication of Vectors-The Dot Product 576
V.7 The Equation of a Plane 579
V.8 Multiplication of Vectors-The Cross Product 580
V.9 The Derivative of a Vector 584
V.10 Tangent Lines and Tangent Planes 587
V.11 Generalized Coordinates 589
V.12 Elementary Geometry in Generalized Coordinates 594
V.13 Addition, Subtraction, and Multiplication in Generalized Orthogonal Coordinates 598
V.14 Gradient 598
V.15 Divergence 603
V.16 The Laplacian Operator 606
V.17 The Divergence Theorem 607
V.18 Curl 607
V.19 Stokes' Theorem 612
V.20 Vector Identities 612
V.21 Green's Integral Theorems 613
V.22 Solenoidal and Irrotational Vector Fields 614
V.23 Complex Vectors 615
Summary of Important Vector Relations 616

AUTHOR INDEX, 621


SUBJECT INDEX, 625
Original Preface
THIS TEXTBOOK has evolved as a result of a change in curriculum which took place at
the author's institution six years ago. At that time a required junior offering in engineer-
ing was converted from being a course in electrical machinery to being an introductory
exposition of field theory. Prior to that change, the students first encountered electro-
magnetic field theory in a two-semester elective sequence open to first-year graduate
students and advanced seniors. The first semester of the senior-graduate sequence had
been devoted principally to the development of the theory in a conventional manner;
each of the major areas was introduced by an experimental postulate-electrostatics
by Coulomb's law, magnetostatics by the Biot-Savart law, and electromagnetics by
Faraday's emf law. The second semester of the sequence was concerned with appli-
cations, notably to transmission lines, waveguides, cavities, and antennas.
Updating the junior course pre-empted much of the material in the first semester
of the senior-graduate sequence and provided the opportunity to choose among several
alternatives. Elimination of the first semester of the senior-graduate sequence seemed
unwise, because electromagnetic theory is a subject of sufficient subtlety to justify a
second exposure. Following along the same path taken in the junior course and merely
digging deeper seemed an inefficient use of the students' time in face of the realization
that all of their time has become premium. Institution of a problem-solving course was
also considered, and admittedly many educators favor this as a way to solidify the
students' understanding of Maxwell's equations. However, it was felt that one junior
course in electromagnetics was not sufficient background to provide a suitable interface
with a meaningful course in boundary value problems.
Fortunately, a third alternative was available, and has been tried with gratifying
success. This alternative retains the scope of the senior-graduate sequence but begins
with a study of special relativity. With this as a basis, it is possible to develop all of
electromagnetic theory from a single experimental postulate founded on Coulomb's
law. An enriched understanding of magnetism results, and the Biot-Savart law is a
consequence rather than a postulate. The Lorentz force law is seen to be a transforma-
tion of Coulomb's law occasioned by the relativistic interpretation of force. Upon
accepting the Lorentz force law as fundamental, one is able to derive Faraday's emf
law and Maxwell's equations as additional consequences. This procedure provides the
further satisfaction of demonstrating that the fields contained in the Lorentz force
law and in Maxwell's equations are one and the same, a conclusion not possible in the
conventional development of the subject.
A major advantage of this approach is the inclusion of special relativity, an intellectual
discipline of growing importance to engineers as well as to scientists. It seems
almost a redundancy to argue that a subject which gets to the heart of the concepts of
space and time, concepts on which all physical measurements are based, deserves to
be included in the basic core of a science curriculum. Presumably the reason why a
serious, detailed treatment of special relativity has not been widely found at the lower
levels of curricula is that it is customarily based on Maxwell's equations and therefore
treated as an advanced graduate topic. But this need not be so and indeed special
relativity, which affects all branches of physics, may properly be considered more
fundamental than electromagnetics. The postulates on which special relativity is based
are not electrical in nature, and the mathematics needed to develop the theory is not
complex. Thus there is no compelling pedagogical reason for deferring the study of
special relativity until after a mature grasp of Maxwell's equations has been secured.
Accordingly, the first semester of the senior-graduate sequence has become a course
which begins with special relativity and ends with the theory of electromagnetic waves
in general media; the second semester is still devoted to applications. Each course
meets four times a week for fifteen weeks and this textbook is designed to fill the
needs of the first semester. Both courses are populated by first-year graduate students
and advanced seniors in about the ratio two to one. The first semester has been taught
essentially in the manner just described for the last five years.
The textbook will be found to consist of eight chapters, a group of appendices, and a
mathematical supplement. The latter contains sections on Taylor's series and vector
analysis and it is presumed that the student is at least somewhat familiar with this
material. It is provided for reference or remedial work as needed. The appendices contain
derivations which, for brevity, have been avoided in the main text.
The first chapter is concerned with the phenomenon of light, with emphasis on its
wave properties and velocity. Contrasts are made between light and other wave
phenomena such as sound, and the behavior of each in moving media is described in
order clearly to point up the dilemma which led to the special theory. The second
chapter deals with the special theory of relativity itself, and includes discussions of the
crucial experiments and the classical attempts to explain them. The Lorentz equations
are established and followed by derivations of the transformation laws for velocity,
mass, and force. Some discussion of relativistic mechanics is provided to enhance
comprehension of the new concepts.
Chapter 3 contains a conventional treatment of electrostatics in free space and in-
cludes discussions of electric force, potential, flux, Gauss' law, and the equations of
Poisson and Laplace. Considerable space is devoted to the solution of boundary-value
problems and capacitance is introduced as well as the concept of electrostatic stored
energy.
The fourth chapter employs a Lorentz coordinate transformation to convert the
static system of charges treated in Chapter 3 to a rigidly translating system of charges
having the features of a steady current. Use of the force transformation equations then
yields the Lorentz force law and permits definition of a magnetic field. The Biot-Savart
law is derived, as is Ampere's circuital law. The mathematical discussion is facilitated by
introduction of the vector potential function.
Chapter 5 employs a second Lorentz transformation to convert the steady electric
and magnetic fields discussed in the third and fourth chapters into time-varying fields
as seen by a moving observer. These fields are defined to conform to the Lorentz force
law and it is then shown that they satisfy Maxwell's equations. From this point the
chapter proceeds conventionally-general solutions of Maxwell's equations are established
lished using the vector Green's theorem and conditions at infinity are explored. The
potential functions are introduced and Poynting's theorem is proved. Inductance is
defined and the chapter concludes with consideration of solutions to the homogeneous
wave equation in rectangular, cylindrical, and spherical geometries.
The last three chapters are devoted to generalizations of the constitutive parameters
ε (permittivity), μ (permeability), and σ (conductivity). The electrical behavior of
materials is explained in terms of equivalent electric and magnetic moments and electron-lattice
interactions, and the dependence of the constitutive parameters on frequency
and temperature is included. Maxwell's equations are generalized so as to be
applicable in material media as well as free space.
This brief outline of the contents points up the fact that the approach adopted is to
develop a complete electromagnetic theory for free space before undertaking any
consideration of material media. There are several advantages to this procedure:
First, the unity of development in Chapters 3, 4, and 5, with the force between charges
as the underlying link, is not diluted as it would be otherwise; second, by deferring
the discussions of material media, more general inspection of the constitutive parameters
is possible, including time-varying effects.
The ability to proceed from Coulomb's law as the single electrical postulate and,
with the aid of special relativity, to deduce a complete electromagnetic theory was first
demonstrated by Professor Leigh Page¹ of Yale in 1912. He later incorporated this
approach into the textbook Electrodynamics co-authored with Adams. The present
development differs from the earlier treatment by Page and Adams in that more extensive
consideration of electrostatics and magnetostatics is undertaken. Major
differences in the form of the mathematical derivations also will be found in the two
approaches.
The chapters on electrical properties of materials are intended to illuminate the
meaning of the symbols ε, μ, and σ which occur in Maxwell's equations. In preparing
these chapters, the author has been helped particularly by Dekker's Solid State Physics
and Kittel's Introduction to Solid State Physics.
The reader will observe that each chapter begins with an historical survey related
to the ideas contained in that chapter. This has been done in the belief that not all
technical textbooks should be devoid of historical material, and further that many
readers find scientific history interesting and are thus additionally motivated to grasp
the technical aspects of the subject. Without the historical background, the reader of
a technical exposition often is left with a bland reaction to his first encounter with a
new physical concept. Yet, more often than not, there is behind this concept a rich
heritage of thought, as outstanding human minds have struggled to identify the con-
cept and clarify it. Awareness of this heritage instills added respect for each new
principle and reveals an important lesson which all scientific history teaches-that
complete understanding is rarely attained and that the struggle for clarity is still
going on.
The literature contains many excellent treatises of scientific history, and the author
is indebted to these sources for background material. Particular mention should be
given to Whittaker's A History of the Theories of Aether and Electricity which has also
been used as a guide in establishing chronology. An effort has been made to achieve a
contextual relationship between the historical material and the technical expositions
which follow. Such blending is not normally possible in works devoted exclusively to
the historical aspects of a technical subject. The reader will also note that extensive
use has been made of direct quotations from the writings of scientific discoverers. It is
hoped that this adds to the sense of reality in the reconstruction of the event and gives
some insight to the character of the discoverer.

¹ L. Page, "A Derivation of the Fundamental Relations of Electrodynamics from Those of Electrostatics," Am. J. Sci., 34, 57-68, 1912.

In all chapters but the first and third, the historical survey is brief and limited to
the discovery of principal ideas. The first chapter is extensively historical and traces
the evolution of thought as to whether light is corpuscular or wavelike and whether
its velocity is finite or infinite. Light is perhaps the least tangible of physical entities
and its behavior in a vacuum roots the special theory of relativity. It was felt that a
lengthy discourse on the phenomenon of light would enhance understanding of the
later discussion of how light behavior gave rise to relativity theory. In the third chapter,
more than usual space is devoted to the establishment of the inverse square law, the
justification being that this law forms the sole experimental basis for the subsequent
development of electromagnetic theory.
All historical sections in all chapters are marked with an asterisk and can be omitted
without loss of continuity by the reader solely interested in technical exposition. It
has been found that these sections make suitable outside reading assignments and need
not occupy excessive classroom time.
The pedagogical problem of repetition in the teaching of electromagnetics at several
levels also exists in physics curricula, and the present approach may find some favor
in physics as well as engineering. All of the formulas developed are relativistically
exact and the emphasis on force as the fundamental link makes this approach a
natural avenue to the study of relativistic particle dynamics.
The possible interest of another group of readers also has been considered in the
preparation of this text. Practicing engineers and scientists who have been in industry
a number of years, and who wish to update or renew their knowledge of electrical
theory, may find a fresh approach to the subject more rewarding than a reacquaintance
with a conventional treatment. For such self-study, it is desirable to provide a pro-
fusion of illustrative examples of varied difficulty. An effort has been made to do this
through the introduction of a broad spectrum of practical illustrations, including
applications to antennas, transmission lines, waveguides, unbounded propagation, and
scattering. The problems at the ends of the chapters are also numerous and graduated
and it is hoped thereby that this text is sufficiently self-contained to satisfy the need
of the reader for whom formal instruction is not available.
Rationalized MKS units have been used consistently throughout the text, and
tables of constants, units, and conversion factors have been provided on the inside
of the back cover. A glossary of symbols will be found in the frontispiece. Time-harmonic
quantities have been represented through use of the factor e^{jωt}. Those readers
who are more comfortable with the notation e^{-iωt} need only replace j by -i in any of
the resulting expressions of interest.
The author will be grateful to anyone who brings errors to his attention.

R.S.E.
Los Angeles
May, 1966
Preface to the IEEE Edition
THIS TEXTBOOK first appeared in 1966, and although it has been out of print for some years, the
continuing demand has caused it to become something of a collector's item. The primary
appeals seem to be (1) the historical material that introduces each chapter and (2) the
development, via special relativity, of a complete electromagnetic theory based on Coulomb's
Law as the sole experimental postulate. However, it is hoped that the extensive elaboration of
electrostatics, magnetostatics, and electrodynamics in chapters three, four, and five will
continue to find favor. The final three chapters on dielectric, magnetic, and conductive
materials are devoted mainly to basic concepts, thereby avoiding, at least partially, the onus of
being dated.
The path of presentation in this text has been to develop electrostatics fully, then make a
relativistic transformation of Coulomb's Law to obtain the Lorentz Force Law, in the process
introducing the concept of a time-independent magnetic field, from which a complete
magnetostatic theory emerges. A second relativistic transformation leads to time-varying
sources, Maxwell's Equations, and a full electrodynamic theory. However, an alternate
approach that appeals to some readers uses a single relativistic transformation of Coulomb's
Law to obtain Maxwell's Equations, thereby leap-frogging over magnetostatics, a subject to
which one can return conventionally by reducing Maxwell's Equations to the case of
time-independent sources and fields. This alternate approach is developed in the article
"Relativity and Electricity," IEEE Spectrum, pp. 140-152, March 1966.
The author wishes to express his gratitude to the IEEE PRESS for re-issuing Electromag-
netics and trusts that their faith in this project will not go unrewarded.

Los Angeles R. S. Elliott


Glossary of Symbols
ONLY the principal uses are listed. An effort has been made to attach a single meaning
to a symbol unless it is used in widely separated and disconnected developments.

a acceleration
a, b radii
A area
A amplitude of oscillation
A magnetic vector potential function
a(θ, φ) field pattern
A four-dimensional vector potential function
b time-averaged magnetic field
B magnetic field
ℬ a factor of the magnetic field
𝔅 Brillouin function
c velocity of light
C contour, complex constant
C capacitance
𝒞, ℭ contours in complex plane
coefficients of capacitance
d diameter, distance
D length, diameter
D electric flux density
e proton charge, Napierian base
e time-averaged electric field
E energy
E electric field
ℰ a factor of the electric field
f, F force
f, g scalar functions
F,G vector functions
~ field tensor
g gravitational acceleration
g_J spectroscopic splitting factor
G universal gravitational constant, Green's function
h Planck's constant
ħ = h/2π reduced Planck's constant
H magnetic intensity
ℋ a factor of the magnetic intensity
1 areal current density
I moment of inertia
I current
I four-dimensional areal current density
i lineal current density
J Ff
J,L,S momentum quantum numbers
c9 total angular momentum
3 exchange integral
J_n, Y_n, H_n, I_n, K_n Bessel functions
k Boltzmann constant
k vector propagation constant
Ie, f propagation constants or wave numbers
K constant
t length
l, L distances, lengths
L self-inductance
.e orbital angular momentum
~ Langevin function
m magnetic moment
m electronic mass, index
m, M masses
M magnetic moment density
M mutual inductance
n refractive index, index
n normal direction
n, N volume particle density
NA Avogadro's number
~ number of moles
0 observer
p momentum, dipole moment
P pressure, index
p_ij coefficient of potential
p point, power
p dipole moment density
CP Poynting vector
Legendre functions
q, Q charge
r radial distance, radius of curvature
r, φ, z cylindrical coordinates
r, θ, φ spherical coordinates
r position vector
~ distance from source point to field point
R ideal gas constant
R resistance
ℛ amplitude of complex number
S surface
S spin angular momentum
t time
T absolute temperature, kinetic energy
T torque
u relative speed
u, v substitution variables
u potential energy
v, V velocity
V volume
V voltage
b substitution variable
WF Fermi energy
w = u + jv = ℛe^{jψ} complex variable
x, y, Z Cartesian coordinate variables, field point coordinates
x, Y, Z Cartesian coordinate axes
z = x + jy = Re^{jφ} complex variable
Z atomic number
polarizability
α, β angles, parameters
γ ratio of specific heats, internal field constant
δ, ε small increments
δ Dirac function
Kronecker delta
phase difference
permittivity
proper distance
impedance of free space
angles
angles
κ reciprocal of contraction factor
lineal charge density
wavelength, mean free path
permeability, permanent dipole moment
ν frequency
ξ displacement
ξ, η, ζ source point coordinates
Π product
ρ volume density of charge or mass
σ surface charge density, conductivity
summation
τ period of oscillation, lifetime, proper time
acoustic power density
scalar functions
potential functions
dielectric and magnetic susceptibilities
electric flux
ω angular frequency, angular velocity
solid angle
JI directional position of source point
unit vector
del operator
three-dimensional Laplacian operator
four-dimensional Laplacian operator
Electromagnetics
CHAPTER 1
The Phenomenon of Light
THE ORIGIN of the special theory of relativity lies in a dilemma concerned with the
nature and velocity of light. Appreciation of this dilemma adds purpose and meaning
to relativity, and it is for this reason that the present chapter is concerned with light
and its properties. The first two sections trace the evolution of thought with respect
to whether light is corpuscular or wavelike, and whether its velocity is finite or infinite;
present-day views of these properties culminate both developments. Light and sound
(the latter being representative of wave phenomena requiring a tangible medium)
are compared in the third section and their essential similarities and differences are
highlighted; the resulting contrast prepares the way for the introduction, in Chapter 2,
of the aforementioned dilemma.
More than usual space is given in this chapter to the historical aspects of the subject.
An explanation of the decision to do this may be found in the Preface. The reader wish-
ing to concentrate his efforts on the technical development may prefer to limit his
attention to the Bradley aberration experiment in Section 1.2 and the comparison of
light and sound in Section 1.3.

1.1 * HISTORICAL SURVEY-THE NATURE OF LIGHT


Speculation about the nature of light can be traced back to antiquity. The Sicilian
Empedocles (c.490-c.435 B.C.) was credited with the view¹ that light consists of small
particles emitted from a visible body. These particles were presumed to enter the eyes
and were then returned to the visible body (a conservation law!) with the resulting
streams of particles being responsible for the sensations of shape and color. Unfortunately,
only fragments of the writings of this extraordinary man have survived, and
the direct evidence of his view is merely suggestive, being contained in the lyrical
passage²
As when a man, about to sally forth,
Prepares a light and kindles him a blaze
Of flaming fire against the wintry night,
* Throughout this book the content of sections marked with an asterisk is primarily historical. The
reading of these sections can be omitted without materially affecting the technical exposition.
1 Plato, Meno. (See, e.g., the W. R. M. Lamb translation, Vol. 165 of the Loeb Classical Library, p.
285, Harvard University Press, 1962.)
2 W. E. Leonard, The Fragments of Empedocles, pp. 42-43, The Open Court Publishing Company,
Chicago, 1908.

In horny lantern shielding from all winds:
Though it protect from breath of blowing winds,
Its beam darts outward, as more fine and thin,
And with untiring rays lights up the sky:
Just so the Fire primeval once lay hid
In the round pupil of the eye, enclosed
In films and gauzy veils, which through and through
Were pierced with pores divinely fashioned,
And thus kept off the watery deeps around,
Whilst Fire burst outward, as more fine and thin.

Empedocles was a close observer of nature, the apparent originator of the long-
standing and influential notion that all things are composed of the four elements: air,
fire, water, and earth. He was a poet of stature whose wide-ranging opinions exerted a
strong influence on later Greek scholars. Aristotle (384-322 B.C.) quotes him frequently,
often contentiously, and in De Sensu says³

Empedocles at times seems to hold that vision is to be explained as above-stated, by light
issuing forth from the eye; e.g., in the following passage: [The 13 lines given above are then
quoted.] Sometimes he accounts for vision thus, but at other times he explains it by emana-
tions from the visible objects.

Aristotle states his own opinion about the nature of light in De Anima:⁴

Now there clearly is something which is transparent, and by "transparent" I mean what
is visible, and yet not visible in itself, but rather owing its visibility to the color of something
else; of this character are air, water, and many solid bodies. Neither air nor water is transparent
because it is air or water; they are transparent because each of them has contained
in it a certain substance which is the same in both and is also found in the eternal body
which constitutes the uppermost shell of the physical cosmos. Of this substance light is the
activity-the activity of what is transparent so far forth as it has in it the determinate
power of becoming transparent; where this power is present, there is also the potentiality
of the contrary, viz. darkness. Light is as it were the proper color of what is transparent,
and exists whenever the potentially transparent is excited to actuality by the influence of
fire or something resembling "the uppermost body"; for fire too contains something which is
one and the same with the substance in question.
We have now explained what the transparent is and what light is; light is neither fire
nor any kind whatsoever of body nor an efflux from any kind of body (if it were, it would
again itself be a kind of body)-it is the presence of fire or something resembling fire in
what is transparent. It is certainly not a body, for two bodies cannot be present in the same
place. The opposite of light is darkness; darkness is the absence from what is transparent
of the corresponding positive state above characterized; clearly therefore, light is just the
presence of that.

Aristotle's influence was greater with later cultures than with his own, and thus one
finds most ancient Greek scholars preferring to accept a simpler view similar to that of

3 Aristotle, De Sensu, 437b, 23, English translation under editorship of W. D. Ross, Oxford at the
Clarendon Press, 1931.
4 Aristotle, De Anima, 418b, 4, English translation under editorship of W. D. Ross, Oxford at the
Clarendon Press, 1931.
Empedocles; for example, both Euclid and Ptolemy held the opinion that light consists
of rays which originate in the eye, illuminate the object seen, and then return to the eye.
In contrast to the richness of Greek speculation about light, Roman scholars do not
appear to have been interested in this problem. Indeed, all of Roman science was
essentially derivative in character and distinctly low order, contributing little that was
original, and nothing worthy of note in the present survey. Arabic science, on the other
hand, while also being derivative, was of a rather high order, being based on the finest
products of Greek scientific achievement. The successors of Mohammed evinced a
great interest in the ideas of the western people whom they conquered, and far from
being the destroyers of Western literature, they were its chief preservers. The Arabs
came into contact with the Greeks in Egypt as well as western Asia, and became their
virtual successors in carrying forward the torch of learning. Although inclined to be
conservative and traditional, thus accepting most Greek ideas as authoritative, the
Arabian scholars did make several independent discoveries of significance. An impor-
tant example is the Arabic numbering system in use today, which evolved during this
period.
In the specific field of light, many accomplishments can be credited to Ibn al-Haitham
(c.965-c.1039), known to the Western world by the Latin name Alhazen. He was the
true physicist of medieval Islam, just as Archimedes had been in the Grecian period,
for he combined with rare skill both the experimental investigation of natural phenomena
and the analysis of results by mathematics.⁵ Alhazen was one of the ablest
students of optics of all times and published a seven-volume treatise on this subject
which had great celebrity throughout the medieval period and strongly influenced
Western thought, notably that of Roger Bacon and Kepler.⁶ This treatise discussed
concave and convex mirrors in both cylindrical and spherical geometries, anticipated
Fermat's law of least time, and considered refraction and the magnifying power of
lenses. It contained a remarkably lucid description of the optical system of the eye,
which study led Alhazen to the belief that light consists of rays which originate in the
object seen, and not in the eye, a view contrary to that of Euclid and Ptolemy.
Ibn Sina, or Avicenna (980-1037), the most famous of the Islamic scientists, whose
immense medical encyclopedia, the Quanun, made him the greatest name in medi-
cine for four centuries, was also a perceptive student of various physical questions
-motion, contact, force, vacuum, infinity, light, and heat. He shared Alhazen's
view that light originated in the luminous source and felt that it must consist of some
type of particles.⁷
Roger Bacon (1214-1294), a learned scholar who stressed the value of reading works
in their original languages, was well-versed in the teaching of Aristotle, St. Augustine,
and the Muslim scientists Alhazen and Avicenna. During a sojourn in Paris, he so
impressed the future Clement IV that the latter, upon elevation to the Papacy in 1265,
requested Bacon to transmit copies of all his writings without delay. Up to that time,
Bacon had written but little; however, in the span of one year, he composed the Opus
Majus, the Opus Minor, and the Opus Tertium, a stupendous undertaking, the fruits
of which exerted a great influence on Western thought for centuries. In his masterpiece,
5 H. J. J. Winter, Eastern Science, John Murray Publishers, Ltd., London, 1952.
6 G. Sarton, Introduction to the History of Science, Vol. 1, p. 721, Williams and Wilkins Company,
Baltimore, Md., 1927.
7 Ibid., Vol. 1, p. 710.
the Opus Majus, Bacon appears to endow Alhazen and Avicenna with an ambivalent
position by saying⁸

If, moreover, Alhazen and Avicenna, in the third book on the Soul . . . are cited as
opposed to this view, I reply that they are not opposed to the generation of the species of
vision, nor to the part it plays in producing sight; but they are opposed to those who have
maintained that some material substance as a visible or similar species is extended from the
sight to the object, in order that vision may perceive the object itself, and that it may
seize upon the species of the object seen and carry it back to the sight.

Bacon's own view coincided with the opinion of many of the ancients, that light
consists of emanations which originate in the eye, and he defends this view in the
passage⁹

The reason for this position is that everything in nature completes its action through its
own force and species alone, as, for example, the sun and the other celestial bodies through
their forces sent to the things of the world cause the generation and corruption of things;
and in a similar manner inferior things, as, for example, fire by its own force dries and con-
sumes and does many things. Therefore vision must perform the act of seeing by its own
force. But the act of seeing is the perception of a visible object at a distance, and therefore
vision perceives what is visible by its own force multiplied to the object . . . it is clear to
him who gives it due consideration that vision must take place by means of its species
emitted to the visible object.

As for the species of light itself, Bacon says, in an explanation which has the interesting
tinge of wave motion, that¹⁰

. . . the species is not a body, nor is it changed as regards itself as a whole from one place
to another, but that which is produced in the first part of the air is not separated from that
part, since form cannot be separated from the matter in which it is, unless it be soul, but
the species forms a likeness to itself in the second position of the air, and so on. Therefore
it is not a motion as regards place, but is a propagation multiplied through the different
parts of the medium; nor is it a body which is there generated, but a corporeal form, without,
however, dimensions per se, but it is produced subject to the dimensions of the air . . . .

The passage of three centuries marks the interval between the death of Roger Bacon
and the birth of Rene Descartes (1596-1650), whose intellect and creative genius
were to stir scientific imagination, and whose prolific pen was to prove even more
influential than Bacon's. Descartes lived at a time in which the world was ripe for a
new conception of the nature of things. Major changes in attitude about man's sur-
roundings were being culminated; Galileo and Kepler were advocating the overthrow
of the geocentric hypothesis of Ptolemy, the Magellan expedition had circumnavigated
the globe, the invention of the telescope was leading to expanded knowledge of the
skies, and Aristotelian scholasticism was under attack at all its weakest points. Montaigne's
skepticism had paved the way for a break with tradition, and Descartes set
for himself the task of erecting a new structure to replace the old. In the words of
8 R. Bacon, Opus Majus, Part 5, 7th Distinction, Chap. 3, the R. B. Burke translation, University of
Pennsylvania Press, Philadelphia, 1928.


9 Ibid., 7th Distinction, Chap. 4; Dth Distinction, Chap. 1.

10 Ibid., 9th Distinction, Chap. 4.


Whittaker,¹¹ "His aim was nothing less than to create from the beginning a theory
of the universe, worked out as far as possible in every detail."
To understand Descartes' position on the particular subject of the nature of light,
one must first appreciate the major features of his grand design of the universe and the
attitudes which shaped this design. His philosophy was essentially dualistic; he believed
the physical world to be mechanistic and divorced from the mind, the only connection
between the two being through God's intervention. In science, he supported the induc-
tive method of Francis Bacon, but with emphasis on rationalization and logic, rather
than upon experiences. Mathematics was Descartes' greatest interest and he is widely
called the father of analytic geometry. Under Kepler's influence, he became convinced
that the precision and universality of mathematics set it apart from all other fields of
study. This admiration of the clarity of mathematical expression serves to explain
why, as the first rule in the Discourse on Method, Descartes vowed
never to accept anything as true if I had not evident knowledge of its being so; that is, to
accept only what presented itself to my mind so clearly and distinctly that I had no occasion
to doubt it.

This attitude led Descartes to the decision that, since effects produced by means of
contacts and collisions were the simplest and most comprehensible phenomena in the
physical world, he would accept no other causes. Such a decision implies that bodies can
act on each other only when they are contiguous, and thus Descartes ruled out action
at a distance. To account for such phenomena as the lunar influence on tides, Descartes
assumed that space is not a void but is a plenum,† being populated by transparent
particles capable of transmitting force. He actually went further than this, postulating
that all matter was in one of three distinct forms, the luminous matter of the sun, the
transparent matter of interplanetary space, and the opaque matter of the earth, giving
as his reason¹²

For, seeing that the sun and the fixed stars emit light, that the heavens transmit it, and
that the earth, the planets, and the comets reflect it, it appears to me that there is ground
for using these three qualities of luminosity, transparency, and opacity to distinguish the
three elements of the visible world.

Descartes assumed that the luminous matter of the sun consisted of particles which
were in continuous motion. Since there was no empty space for the particles to move
into, he argued that they took the places vacated by other particles which were also in
motion, and thus developed the notion of closed chains of moving particles. The
motions of these closed chains constituted vortices, an important concept in his explanation
of the universe. Thus, according to Descartes' theory,¹³ the sun consists of an
enormous vortex composed of the first or subtlest kind of matter. The luminous par-
ticles of this vortex, due to centrifugal action, constantly strain away from their centers
of rotation and thus press against the transparent particles of the ether. The ether
† Thus did the concept of an ether enter science for the first time. The word is of Greek extraction and
originally meant blue sky.
11 E. Whittaker, A History of the Theories of Aether and Electricity, Vol. 1, p. 4, Thomas Nelson and
Sons, Ltd., London, 1951.
12 R. Descartes, Principes de la Philosophie, 4th ed., Part 3, Sec. 52, Chez Theodore Girard, Paris, 1681.
13 Ibid., Sec. 55-64.
Descartes imagined to consist of a closely packed assemblage of globules, of a size
intermediate between that of the luminous matter of the sun and the opaque matter
of the earth. The pressure of the vortex against these ether particles causes them to
tend to move, thus exerting a pressure on their neighbors, which in turn tend to move,
and in this manner the force exerted by the vortex is passed along through the ether
particles, from layer to layer. In Descartes' view, the transmission of this pressure constitutes
light, a thought he summarizes in the passage¹⁴
. . . the force of light . . . does not consist in the duration of some motion but only in the
fact that these small globules (of the ether) are pressed and tend to move toward some new
location, although they do not actually move.
Descartes also provided the first theoretical derivation of the law of refraction, dis-
covered experimentally somewhat earlier (1621) by Willebrord Snell. This derivation is
important because it contains a consequence which later loomed as a decisive factor in
settling the controversy as to the true nature of light. In the Descartes derivation, a
light ray is assumed to be incident on a plane interface between two media at an angle $i$
with respect to the normal, traveling at a velocity $v_i$ in the first medium, and departing
from the interface at a velocity $v_r$ in the second medium, in a direction making an angle
$r$ with respect to the normal. Descartes then assumed that the component of velocity
parallel to the interface was unaffected, obtaining

$$v_i \sin i = v_r \sin r$$

from which Snell's law

$$\frac{\sin i}{\sin r} = \frac{v_r}{v_i} = n$$

follows immediately. However, if the second medium is denser, so that $i > r$, it follows
that $v_r > v_i$. Thus Descartes' derivation leads to the conclusion that light must travel
faster in a denser medium, a conclusion which was later shown to be in contradiction
with experiment.
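The consequence of Descartes' assumption can be checked numerically. The short sketch below is only an illustration: the function name and the two angles are assumed for the example and do not come from the text. It evaluates the velocity ratio that follows from requiring the velocity component parallel to the interface to be the same on both sides.

```python
import math

def descartes_speed_ratio(incidence_deg: float, refraction_deg: float) -> float:
    """Return v_r / v_i implied by v_i sin(i) = v_r sin(r).

    Hypothetical helper written for this illustration only.
    """
    i = math.radians(incidence_deg)
    r = math.radians(refraction_deg)
    return math.sin(i) / math.sin(r)

# Assumed illustrative angles: a ray bent toward the normal (i > r),
# as happens on entering a denser medium.
incidence_deg = 45.0
refraction_deg = 32.0

ratio = descartes_speed_ratio(incidence_deg, refraction_deg)
print(f"v_r / v_i = {ratio:.3f}")
print("faster in the second medium" if ratio > 1 else "slower in the second medium")
```

For any pair of angles with i greater than r the ratio exceeds unity, which is the conclusion stated above: under Descartes' assumption the light must speed up on entering the denser medium.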
Descartes' opinions were vigorously attacked by Robert Hooke (1635-1703), whose
views mark a significant turning point in conjectures about the nature of light. Noted
for Hooke's law, he was an able mechanician who devised many improvements in
clocks and astronomical instruments, and was the first to formulate a theory of plane-
tary movements as a mechanical problem. He was responsible for the development of
microscopy as a science in England, and his interest in this subject led him to many
experiments concerned with light itself. Hooke became convinced that light was an
undulatory phenomenon, and his reasons are lucidly expressed in the passage¹⁵
And first for Light, it seems very manifest, that there is no luminous Body but has the
parts of it in motion more or less . . . . It would be somewhat too long . . . to examine,
and positively to prove, what particular kind of motion it is that must be the efficient of
Light . . . . I found it ought to be exceeding quick . . . that in all extreamly hot shining
bodies, there is a very quick motion that causes Light, as well as a more robust that causes
Heat, may be argued from the celerity wherewith the bodies are dissolv'd.
14 Ibid., Sec. 63.
15 R. Hooke, Micrographia, or Some Physiological Descriptions of Minute Bodies Made by Magnifying
Glasses, 1st ed., pp. 54-56, published by the Royal Society of London, reproduced by Dover Publications,
Inc., New York, 1961.
Next, it must be a Vibrative motion. And for this the newly mention'd Diamond affords
us a good argument: since if the motion of the parts did not return, the Diamond must
after many rubbings decay and be wasted . . . .
And Thirdly, That it is a very short vibrating motion, I think the instances drawn from
the shining of Diamonds will also make probable. For a Diamond being the hardest body
we yet know in the World, and consequently the least apt to yield or bend, must conse-
quently also have its vibrations exceeding short.

Having proposed an explanation for the sources of light, Hooke then suggested

That the motion is propagated every way through an Homogeneous medium by direct or
straight lines extended every way like Rays from the center of a sphere . . . in an Homo-
geneous medium this motion is propagated every way with equal velocity, whence necessarily
every pulse or vibration of the luminous body will generate a Sphere, which will continually
increase, and grow bigger, just after the same manner (though indefinitely swifter) as the
waves or rings on the surface of the water do swell into bigger and bigger circles about a
point of it, where, by the sinking of a stone the motion was begun, whence it necessarily
follows, that all the parts of these Spheres undulated through an Homogeneous medium
cut the Rays at right angles.

Thus Hooke paralleled Descartes in postulating a medium as the vehicle of light.
However, he replaced Descartes' notion that light was a statical pressure in the medium
with the notion that it is a rapid undulatory motion of small amplitude. Hooke then
went on to replace the Descartes analysis of refraction with one of his own, based on the
tilting of a wavefront at the interface of two media, but he failed to notice that it would
be necessary to assume the velocity to be slower in the denser medium in order to be
consistent with Snell's law.
The issue of whether light was wavelike or particlelike was firmly joined with the
emergence on the scientific scene of Isaac Newton (1642-1727). Renowned for his discoveries
in mechanics, Newton also made many significant contributions in the field of
light. His most notable discovery was that white light is made up of the spectral colors,
which led him to propound a theory of prismatic colors directly opposed to an earlier
theory put forward by Hooke. This precipitated a bitter controversy in which Hooke
displayed considerable vexation and accused Newton of favoring the doctrine that light
is a material substance. Newton gave his answer in a communication to the Royal
Society in 1675 in which he said¹⁶

Were I to assume an hypothesis, it should be this, if propounded more generally, so as
not to determine what light is, farther than that it is something or other capable of exciting
vibrations in the aether: for thus it will become so general and comprehensive of other
hypotheses, as to leave little room for new ones to be invented. And therefore, because I
have observed the heads of some great virtuosos to run much upon hypotheses, as if my discourses
wanted an hypothesis to explain them by, and found, that some, when I could not
make them take my meaning, when I spake of the nature of light and colours abstractedly,
have readily apprehended it, when I illustrated my discourse by an hypothesis; for this
reason I have here thought fit to send you a description of the circumstances of this hypothesis
as much tending to the illustration of the papers I herewith send you. And though I shall
not assume either this or any other hypothesis, not thinking it necessary to concern myself,
16 I. Newton, Papers and Letters on Natural Philosophy, edited by I. Bernard Cohen, p. 179, Harvard
University Press, 1958.


whether the properties of light, discovered by me, be explained by this, or Mr. Hooke's,
or any other hypothesis capable of explaining them; yet while I am describing this, I shall
sometimes, to avoid circumlocution, and to represent it more conveniently, speak of it,
as if I assumed it, and propounded it to be believed. This I thought fit to express, that no
man may confound this with my other discourses, or measure the certainty of one by the
other, or think me obliged to answer objections against this script: for I desire to decline
being involved in such troublesome and insignificant disputes.

Newton's lifelong distaste for controversy is clearly evident here, but equally evident
is his refreshing lack of dogmatism about rigid hypotheses. He thoroughly disliked
highly imaginative suppositions, such as Descartes had invoked for his grand scheme of
the universe, and was much more interested in the formulation of the laws which govern
natural phenomena. Despite this, he found it impossible to give coherence to the
observed facts about light without resorting to some speculation about its nature.
Thus in this same communication, after an exhaustive and detailed discussion of the
possible composition of an ether, Newton goes on to suppose that
Light is neither aether, nor its vibrating motion, but something of a different kind propagated
from lucid bodies. They, that will, may suppose it an aggregate of various peripatetic
qualities. Others may suppose it multitudes of unimaginable small and swift corpuscles of
various sizes, springing from shining bodies at great distances one after another; but yet
without any sensible interval of time, and continually urged forward by a principle of
motion, which in the beginning accelerates them, till the resistance of the aethereal medium
equal the force of that principle, much after the manner that bodies let fall in water are
accelerated till the resistance of the water equals the force of gravity.

In Newton's lifetime, all the facts known about light could not be harmonized with
either the corpuscular or wave theories then being proposed. However, he leaned
toward a corpuscular hypothesis, and near the end of his life summed up his objections
to the wave theory in a query at the conclusion of a revised edition of his Opticks¹⁷
Are not all Hypotheses erroneous, in which Light is supposed to consist in Pression or
motion, propagated through a fluid Medium? . . . If Light consisted only in Pression propagated
without actual Motion, it would not be able to agitate and heat the Bodies which
refract and reflect it . . . . And if it consisted in Pression or Motion, propagated either in
an instant or in time, it would bend into the Shadow. For Pression or Motion cannot be
propagated in a Fluid in right Lines, beyond an obstacle which stops part of the Motion,
but will bend and spread every way into the quiescent Medium which lies beyond the
Obstacle . . . . The Waves on the Surface of stagnating Water, passing by the sides of a
broad Obstacle which stops part of them, bend afterwards . . . . But Light is never known
to follow crooked Passages nor to bend into the Shadow.

Newton goes on, in this query, to add the further objection that the wave theory (as it
then existed) could not account for the recently discovered phenomenon of the polarization
of light.
The discoverer of this phenomenon of polarization was Christiaan Huygens (1629-1695),
a contemporary of both Hooke and Newton, who sided with Hooke in favoring a
wave theory of light. Inventor of the pendulum clock, perceptive and influential critic
of Descartes' cosmological theories, Huygens is known principally for his work in optics.

17 I. Newton, Opticks, 4th ed., pp. 362-370, William Innys, Publisher, London, 1730. (Reprinted by
Whittlesey House, McGraw-Hill Book Company, New York, 1931.)


He greatly extended and improved the wave theory first enunciated by Hooke and
subscribed wholeheartedly to Hooke's hypothesis that light consists of some form of
motion. Witness the passage.¹⁸
It is inconceivable to doubt that light consists in the motion of some sort of matter. For
whether one considers its production, one sees that here upon the Earth it is chiefly engend-
ered by fire and flame which contain without doubt bodies that are in rapid motion, since
they dissolve and melt many other bodies, even the most solid; or whether one considers its
effects, one sees that when light is collected, as by concave mirrors, it has the property of
burning as a fire does, that is to say it disunites the particles of bodies. This is assuredly
the mark of motion, at least in the true Philosophy, in which one conceives the causes of all
natural effects in terms of mechanical motions. This, in my opinion, we must necessarily do,
or else renounce all hopes of ever comprehending anything in Physics.
And as, according to this Philosophy, one holds as certain that the sensation of sight is
excited only by the impression of some movement of a kind of matter which acts on the
nerves at the back of our eyes, there is here yet one reason more for believing that light
consists in a movement of the matter which exists between us and the luminous body.

Huygens next addresses himself to the question as to whether the motion is that of a
medium, as assumed by Hooke, or whether it is a stream of particles, as favored by
Newton. He says
Further, when one considers the extreme speed with which light spreads on every side,
and how, when it comes from different regions, even from those directly opposite, the rays
traverse one another without hindrance, one may well understand that when we see a
luminous object, it cannot be by any transport of matter coming to us from this object,
in the way in which a shot or an arrow traverses the air; for assuredly that would too greatly
impugn these two properties of light, especially the second of them.

Huygens shared with Newton the inclination to picture an ethereal medium in which
light propagated. Whereas Newton favored the idea that this medium was set into
vibration by the passage of light corpuscles through it, Huygens preferred to imagine
a process analogous to sound, in which the vibrating particles of the luminous source
would excite the contiguous portion of the medium into vibration, which would in turn
transfer this excitation on to the next portion, etc. This mechanical model of light
propagation led him to his most important contribution, ever since known as Huygens'
principle, and explained in the passage¹⁹
There is the further consideration in the emanation of these waves, that each particle of
matter in which a wave spreads, ought not to communicate its motion only to the next
particle which is in the straight line drawn from the luminous point, but that it also imparts
some of it necessarily to all the others which touch it and which oppose themselves to its
movement. So it arises that around each particle there is made a wave of which that particle
is the centre.

Using this principle, Huygens was able to show how all the points in one wavefront
could be treated as secondary sources which created the next wavefront, and thus pro-
vided satisfactory explanations for propagation and reflection. By assuming that the
velocity of light was slower in a denser medium he was also able to explain refraction.
18 C. Huygens, Traité de la Lumière, pp. 3-4, first published in Leyden in 1690; English translation by
S. P. Thompson, London, 1912; reprinted by University of Chicago Press.
19 Ibid., p. 19.
This proved to be a pivotal point a century and a half later in deciding between a
corpuscular or wave theory, since it has already been observed that the corpuscular
theory requires a faster velocity in a denser medium in order to be consistent with
the law of refraction.
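The opposing predictions of the two pictures can be set side by side in a few lines. The sketch below is an illustration under stated assumptions: the function names and the index value 1.33 are chosen for the example and are not taken from the text. The corpuscular relation is the one obtained in Descartes' derivation, while the wave relation assumes the standard wavefront construction, in which sin i / v_i = sin r / v_r.

```python
import math

def refraction_angle_deg(incidence_deg: float, n: float) -> float:
    """Angle of refraction from Snell's law, sin(i) / sin(r) = n."""
    return math.degrees(math.asin(math.sin(math.radians(incidence_deg)) / n))

def corpuscular_speed_ratio(incidence_deg: float, refraction_deg: float) -> float:
    """v_r / v_i when the velocity component along the interface is conserved."""
    return math.sin(math.radians(incidence_deg)) / math.sin(math.radians(refraction_deg))

def wave_speed_ratio(incidence_deg: float, refraction_deg: float) -> float:
    """v_r / v_i when a wavefront tilts at the interface (assumed construction)."""
    return math.sin(math.radians(refraction_deg)) / math.sin(math.radians(incidence_deg))

n = 1.33              # assumed illustrative index for the denser medium
incidence_deg = 45.0  # assumed illustrative angle of incidence
refraction_deg = refraction_angle_deg(incidence_deg, n)

print(f"corpuscular theory: v_r / v_i = {corpuscular_speed_ratio(incidence_deg, refraction_deg):.3f}")
print(f"wave theory:        v_r / v_i = {wave_speed_ratio(incidence_deg, refraction_deg):.3f}")
```

The two ratios are reciprocals of one another, so a measurement of the speed of light in a dense medium is enough to distinguish between the theories.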
Huygens was unsuccessful in explaining interference effects, such as the colored rings
of thin films and sharp shadows past obstacles, partly because it was not then appreci-
ated how short the wavelengths of visible light are. He also confessed his inability to
explain his own discovery of polarization, but this is easily understood when one re-
members that in 1700 it was not recognized that light consisted of transverse vibrations.
Similarly, Newton had difficulty in explaining the colors of thin films under the corpuscular
theory and the noninterference of beams of light whose paths crossed. Although
neither theory was adequate, the esteem in which Newton was held by his
contemporaries and followers was so great that the wave theory was rejected and
allowed to remain unnourished for over a century. If the fact that Newton found the
corpuscular hypothesis more acceptable retarded the growth of the theory of light, as
some have claimed, the fault lay with those who blindly espoused all his views. It has
already been noted that Newton himself did not hold rigidly to any one hypothesis but
rather gave tentative acceptance to that theory which appeared to him to fit most of
the facts.
Although most scientists of the eighteenth century accepted the corpuscular hypothesis,
the wave theory was not totally without advocates. Franklin (1706-1790)
favored it, and Euler (1707-1783) took the same position, being persuaded by the
notion that particle emission from a luminous source would cause a diminution in its
mass, an effect not observed, whereas the emission of waves did not involve such a
consequence. However, the wave theory did not make any serious headway until a
new champion arose when Thomas Young (1773-1829) turned his attention to the
subject. A man of diverse and considerable talent, Young was a practicing physician
on the staff of St. George's Hospital. He was also a physicist, whose lectures at the
Royal Institution of London introduced the modern physical concept of energy. He
was a prodigy at two, an accomplished linguist while still in his boyhood, a musician,
and an archeologist who participated in the deciphering of the Rosetta stone. He made
contributions to the theory of tides, explained capillarity, and established the coefficient
of elasticity known as Young's modulus.
Drawing upon an earlier explanation by Newton in connection with tides, Young
introduced the concept of interference by saying:20
Suppose a number of equal waves of water to move upon the surface of a stagnant lake,
with a certain constant velocity, and to enter a narrow channel leading out of the lake.
Suppose then another similar cause to have excited another equal series of waves, which
arrive at the same channel, with the same velocity, and at the same time with the first.
Neither series of waves will destroy the other, but their effects will be combined: if they
enter the channel in such a manner that the elevations of one series coincide with those of
the other, they must together produce a series of greater joint elevations; but if the eleva-
tions of one series are so situated as to correspond to the depressions of the other, they
must exactly fill up those depressions, and the surface of the water must remain smooth:
at least I can discover no alternative, either from theory or from experiment.
20 T. Young, Miscellaneous Works, edited by George Peacock, Vol. 1, pp. 202-203, John Murray
Publishers, Ltd., London, 1855.

Now I maintain that similar effects take place whenever two portions of light are thus
mixed; and this I call the general law of the interference of light.

Young demonstrated this concept in an experiment performed before the Royal
Society of London in 1803. Using a distant source of a single color, he permitted light
to pass through two tiny holes placed close together in one screen, and to fall on a second
screen. The second screen showed a pattern of fine bands, alternately light and dark.
Young explained this pattern by recourse to a law he had enunciated21 in 1802:

Wherever two portions of the same light arrive at the eye by different routes, either
exactly or very nearly in the same direction, the light becomes most intense when the dif-
ference of the routes is any multiple of a certain length, and least intense in the inter-
mediate state of the interfering portions; and this length is different for light of different
colours.
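
In modern notation, Young's law may be restated as follows: two equal beams whose routes differ by a length Δ combine to a relative intensity cos²(πΔ/λ), where the "certain length" λ is the wavelength. The short sketch below evaluates this expression for a few path differences. It assumes ideal, monochromatic beams of equal amplitude; the function name and the 600 nm wavelength are illustrative choices and do not come from Young's papers.

```python
import math

def relative_intensity(path_difference, wavelength):
    """Intensity of two superposed equal beams, normalized so the maximum is 1.

    Assumes ideal monochromatic beams of equal amplitude (an idealization).
    """
    phase = 2.0 * math.pi * path_difference / wavelength
    return math.cos(phase / 2.0) ** 2

wavelength = 600e-9  # metres; an assumed, illustrative value

for multiple in (0.0, 0.5, 1.0, 1.5, 2.0):
    value = relative_intensity(multiple * wavelength, wavelength)
    print(f"path difference = {multiple:.1f} wavelengths -> relative intensity {value:.3f}")
```

Whole multiples of the wavelength give the bright bands, and the intermediate half multiples give the dark ones, which is the alternation Young observed on his second screen.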

He also used this law to give the first satisfactory explanation of the colors of light
reflected from thin plates, arguing that the incident light causes two beams to reach
the eye: the first of these beams has been reflected from the first surface of the thin
plate, and the other from the second. These two beams produce the colors in the
reflected light due to their interference. Indeed, Young used the measured thickness
of thin plates to determine for the first time the characteristic lengths, or wavelengths,
of the various colors of visible light, publishing22 a table of values which is remarkably
accurate by today's standards.
Despite a bitter attack on Young by the followers of Newton, support for the wave
theory accumulated rapidly. Fresnel (1788-1827) satisfactorily explained diffraction
past a sharp edge in terms of mutual interference of the secondary "Huygens" waves
generated by those portions of the original wavefront not obstructed by the diffracting
obstacle. Sharp shadows beyond obstacles big in terms of wavelengths thus became
understood, a point about the wave theory which had always bothered Newton. Fres-
nel also demonstrated light interference by employing two mirrors, and in a brilliant
experiment confirmed an hypothesis by Young that light consisted of transverse vibra-
tions by showing that two cross-polarized beams of light do not interfere with each
other. This permitted an explanation under the wave theory of the phenomenon of
light polarization in crystals, which had earlier been a stumbling block for Huygens.
Kirchhoff (1824-1887), starting from the wave equation, developed a diffraction for-
mula in which Huygens' secondary sources were revealed, thus putting that principle
on a much firmer foundation.23
Finally, the coup de grace was delivered to the corpuscular theory in 1850 when
Foucault24 (1819-1868) and Fizeau25 (1819-1896) measured the velocity of light in

21 T. Young, "An Account of Some Cases of the Production of Colours," Phil Trans Roy Soc (London),
92, 387-397; July 1802.
22 T. Young, "On the Theory of Light and Colours," Phil Trans Roy Soc (London), 92, 12-48; Novem-
ber 1801.
23 Kirchhoff summarized his work in the textbook Vorlesungen über mathematische Optik, Zweite Vorles-
ung, Sec. 2, Berlin, 1891.
24 M. L. Foucault, "General Method for Measuring the Speed of Light in Air and Transparent Media.
Relative Speeds of Light in Air and Water," Compt Rend, 30, 551-560; May 1850.
25 H. Fizeau and L. Breguet, "Note on an Experiment Relative to the Comparative Velocities of
Light in Air and in Water," Compt Rend, 30, 562-563; May 1850.

air and water, finding that it was slower in the latter. This result was consistent with
the wave theory, whereas the reverse had been predicted by the corpuscular hypothesis.
With this experiment, all sensible objection to the wave theory of light had disappeared.
At about this time Maxwell (1831-1879) began formulating his theory of electro-
magnetism, culminating in the celebrated equations which bear his name. Wavelike
solutions to these equations indicated that electromagnetic fields would propagate
through a vacuum at the same speed as light. This led Maxwell to the important
conjecture that light is an electromagnetic phenomenon and further strengthened the
belief that light is basically wavelike in nature.
In 1887, Heinrich Hertz (1857-1894) provided the first successful demonstration of
the generation and propagation of electromagnetic waves, using separate spark gap
coils to transmit and receive. This achievement was hailed immediately by his con-
temporaries as the crowning victory of physics, the first experimental verification of
the validity of Maxwell's theory. Ironically, a side effect of this experiment was destined
to contribute to a great revolution in scientific thought. Hertz noticed that the sparks
produced in the gap of his receiving coil were influenced by the light falling on this gap
from the sparks in the transmitting coil. Further investigation led Hertz to conclude
that it was the ultraviolet portion of the light which was responsible for the effect,
and that the effect was greatest if the light were incident on the negative point of the
gap. Hertz reported these observations but carried the investigation no further. How-
ever, his discovery intrigued many others, and significant contributions were made by
Hallwachs, who showed that the photoelectric effect, as it came to be called, consisted
of the emission of negative charges, and by Lenard, who measured the charge to mass
ratio of the emitted charges and concluded that they were electrons.
A variety of materials was found to be photosensitive, but the characteristics of
the emission were surprising. The number of electrons emitted per unit time was pro-
portional to the intensity of the incident light, which seemed reasonable. However,
the maximum kinetic energy of the emitted electrons was dependent on the frequency
of the light used, but independent of its intensity. A classical argument, assuming a
collision-like process, would anticipate that the greater the intensity of the incident
wave, the greater would be the energy of the electrons which were torn loose from the
surface.
Albert Einstein (1879-1955) offered an explanation of the photoelectric effect in 1905,
the same year he received his doctorate from Zurich and published his first paper on
relativity. Drawing on an hypothesis made several years earlier by Planck, who had been
concerned with the spectral distribution of black-body radiation, Einstein assumed 26

. . . that the incident light is composed of quanta of energy $(R/N_A)\beta\nu$ . . . . The quanta
of energy penetrate the surface of the material and their respective energies are at least
in part changed into the kinetic energy of electrons. The simplest process conceivable is
that a quantum of light gives up all its energy to a single electron . . . . Upon reaching the
surface, an electron originally inside the body will have lost a part of its kinetic energy.
Furthermore, one may assume that each electron in leaving the body does an amount of work
$W$, which is characteristic of the material. Those electrons which are ejected normal to and
from the immediate surface will have the greatest velocities. The kinetic energy of these

26 A. Einstein, "An Heuristic Viewpoint Concerned with the Generation and Transformation of Light,"
Ann Phys, 322, 132-148; 1905.

electrons is

$$\frac{R}{N_A}\beta\nu - W$$

Einstein thus hypothesized that the incident light was composed of quanta, or photons,
whose energy was proportional to the frequency $\nu$ of the light. His proportionality
factor consisted of a parameter $\beta$ multiplied by the Boltzmann constant, $k = R/N_A$,
with $R$ the ideal gas constant and $N_A$ Avogadro's number. Einstein then argued that,
if the photoelectric material were raised to a potential V above a surrounding grounded
electrode, then even the most energetic emitted electrons would not reach the grounded
electrode if V were of such magnitude that

$$Ve = \frac{R}{N_A}\beta\nu - W$$

in which e is the electronic charge. He then went on to say

If the formula derived is correct, it would follow that V, if plotted in cartesian coordi-
nates as a function of the frequency of the exciting photons, would yield a straight line
whose slope is independent of the material under investigation . . . . If each quantum of
light were to give its energy to the electrons independently of all the others then the velocity
distribution . . . will be independent of the intensity of the exciting radiation; on the other
hand the numbers of electrons leaving the body under equal conditions will be directly pro-
portional to the intensity of the incident radiation.

Einstein's formula and explanation are notable for their simplicity and fit all the
observed facts. At the time he proposed this explanation he had at his disposal only
qualitative data, but his equation received final and thorough experimental verification
through the precise work of Millikan in 1916. 27 Working with a circuit shown in simpli-
fied form in Figure 1.1, Millikan varied the reverse bias until it reached a value V
such that the ammeter read no current. Since this voltage was just enough to prevent
the most energetic electrons from reaching the second electrode, one could argue that
Ve was the maximum kinetic energy any of the electrons had upon being emitted
from the photosensitive electrode. When Millikan varied $\nu$, the frequency of the inci-
dent light, and recorded V for each frequency, he obtained a curve such as shown
in Figure 1.2. This experimental result was consistent with Einstein's equation
$Ve = (R/N_A)\beta\nu - W$, and the experimental significance of the intercept $\nu_0$ is that
light at a lower frequency cannot cause photoelectric emission from the metal con-
cerned. The quantity $\nu_0$ was found to be characteristic of the photosensitive material
forming the electrode, but the slope of the curve was the same for all electrodes. The
slope, which is Einstein's proportionality constant $(R/N_A)\beta$, proved to be identical
with the constant $h$ which Planck employed to explain black-body radiation. Thus
Einstein's quantum of light, or photon, was found to have an energy $E = h\nu$.
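
The reasoning behind Millikan's determination can be sketched numerically. The example below generates idealized stopping potentials from Ve = hν - W, fits a straight line to V against ν, and recovers h from the slope (which is h/e when V rather than Ve is plotted) and the threshold frequency ν₀ from the intercept. The work function of 2.3 eV, the list of frequencies, and the function names are assumptions chosen for illustration; they are not Millikan's data.

```python
H = 6.62607015e-34   # Planck constant, J*s (exact SI value)
E = 1.602176634e-19  # elementary charge, C (exact SI value)

def stopping_potential(freq_hz, work_function_j):
    """Einstein's relation V e = h nu - W, solved for V in volts."""
    return (H * freq_hz - work_function_j) / E

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

work_function = 2.3 * E  # joules; a hypothetical electrode material
freqs = [6.0e14, 7.0e14, 8.0e14, 9.0e14, 1.0e15]  # Hz; assumed sample points
volts = [stopping_potential(f, work_function) for f in freqs]

slope, intercept = fit_line(freqs, volts)
h_estimate = slope * E        # slope of V versus nu is h/e
nu_0 = -intercept / slope     # frequency at which V falls to zero

print(f"h recovered from slope: {h_estimate:.4e} J*s")
print(f"threshold frequency:    {nu_0:.3e} Hz")
```

Changing `work_function` shifts the intercept ν₀ but leaves the slope untouched, which mirrors the observation that the slope was the same for all electrodes.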
However, the concept that light consists of discrete energy bundles, or photons,
smacks strongly of the earlier corpuscular theories. Is light wavelike or corpuscular?
The best current answer appears to be that it has a dual personality, exhibiting one
set of characteristics or the other, depending on how it is interacting with its environ-
ment. If the process being considered is at the microscopic level, the quantized nature
27 R. A. Millikan, "A Direct Photoelectric Determination of Planck's h," Phys Rev, 7, 355-388; 1916.

FIGURE 1.1 Photoelectric diode.

of light will most likely have to be considered; if it is a macroscopic process, the wave
nature of light should account successfully for the interaction.
It would seem that just about everybody was right all along.

FIGURE 1.2 Maximum electron energy vs. light frequency.

1.2* HISTORICAL SURVEY-THE VELOCITY OF LIGHT

Whereas a determination of the nature of light is not totally decisive, such ambivalence
does not exist when the discussion turns to the conception of the velocity of light.
Whether light is thought of as a stream of photons or a propagating wave, the transfer
* The reader solely interested in the technical presentation may wish to omit this section except for
the discussion of Bradley's experiment.

of energy occurs at a speed which, today, can be measured with extraordinary precision.
Yet this speed is so great that it is not surprising to find earlier debates as to whether
the velocity of light is finite or infinite.
The direct evidence is lost to us, but Empedocles apparently felt that the velocity is
finite, for Aristotle disputes with him in the passage28
Empedocles (and with him all others who used the same forms of expression) was wrong
in speaking of light as 'traveling' or being at a given moment between the earth and its
envelope, its movement being unobservable by us; that view is contrary both to the clear
evidence of argument and to the observed facts; if the distance traversed were short, the
movement might have been unobservable, but where the distance is from extreme East to
extreme West, the draught upon our powers of belief is too great.

Heron of Alexandria, whose life span has variously been placed in the period from the
second century B.C. to the third century A.D., and who is noted for his invention of
many contrivances operated by water, steam, or compressed air, believed with Euclid
and Ptolemy that light rays originated in the eye. This belief led him to an interesting
argument as proof that the velocity of light is infinite :29
That the sight rays emanating from our eyes move with infinite velocity can also be seen
from the following. Namely if, after having closed our eyes, we look again upward to the
heavens, these rays reach the heavens without any time interval having elapsed (i.e., im-
mediately). For in the same instant in which we open our eyes, we see the stars, even though
we may say that the distance is practically infinite. Also, if this distance were even greater,
the same occurrence would be repeated in any case, and thus it results that the rays emanating
from our eyes propagate with infinite velocity. They therefore suffer in their propagation
no interruption in their motion, nor do they make a detour, nor follow a broken-line path,
but rather move along the shortest line, namely the straight one.

Alhazen believed otherwise, and in his treatise on optics stated :30


And we shall see that color will not be perceived in that which is color by the sight, nor
light in that which is light, except in time . . . the arrival of the sensation (of light) to the
hollow of the optic nerve is like the arrival of light from holes . . . the passing of light
from a hole to an object opposite the hole will not be possible except in time, even though
this fact is concealed from the mind.
The passing of light from a hole to an object opposite the hole cannot escape being in one
of the two following ways, namely, that either light will come to that part of the air which is
near the hole, before it can arrive to another following point, and thereafter it will come
to another point, and so to another, until it arrives at the object opposite the hole, or light
will arrive at the entire intermediate atmosphere between the hole and the object opposite
the hole, and to the very object, all at the same time. If the air received light in a succes-
sive fashion, the light would not arrive at the object opposite the hole, except through
movement. But movement does not exist except in time; thus, if the whole atmosphere
receives light at the same time, even the arrival of light to the atmosphere does not exist,
since it was not in the atmosphere before . . . .

28 Aristotle, De Anima, 418b, 20, English translation under editorship of W. D. Ross, Oxford at the
Clarendon Press, 1931.
29 Heronis Alexandrini, Catoptrica, Vol. 2, pp. 320-323, translated into German by L. Nix and W.
Schmidt, von B. G. Teubner, Leipzig, 1900. (Private English translation.)
30 Alhazen, Opticae Thesaurus, edited by Risner, Vol. 2, Chap. 2, Article 21, Basel, 1572. (Private
translation.)

If the hole through which the light enters becomes blocked, and then the blockage is re-
moved, the instant during which the blockage is removed is different from the instant
during which the light reaches the contiguous atmosphere . . . . Therefore this is done by
a movement; but a movement does not exist except in time . . . . However, this time
element is strongly concealed from the mind due to the rapidity of the perception of the
sensation of light by the air.

Avicenna agreed with Alhazen, basing his opinion on the belief that light consisted of
the motion of finite particles which therefore could not have an infinite velocity. Roger
Bacon also sided with Alhazen, although he did not like the reasons advanced above
and preferred the argument Alhazen put forth in his seventh volume that "from the
same terminus the perpendicular ray reaches more quickly the terminus of the space
than the ray that is not perpendicular." However, Bacon Vias very gentle in his dis-
agreement with Aristotle, drawing a fine distinction between perceptible and imper-
ceptible intervals of time. His principal reason for believing in a finite velocity is con-
tained in the passage!'

. . . an instant has the same relation to time as a point to a line. Therefore, interchanging
terms, an instant has the same relation to a point as time has to a line; but the passage
through a point is in an instant. Therefore the passage through the whole line is in time.
Therefore species [of light] passing through linear space, however small, will pass through
in time . . . . If, therefore, the multiplication of light is instantaneous, and not in time,
there will be an instant without time; because time does not exist without motion. But it
is impossible that there should be an instant without time, just as there cannot be a point
without a line. It remains, then, that light is multiplied in time, and likewise all species of
a visible thing and of vision. But nevertheless the multiplication does not occupy a sensible
time and one perceptible by vision, but an imperceptible one, since anyone has experience
that he himself does not perceive the time in which light travels from east to west.

Francis Bacon (1561-1626), an English philosopher credited with the formulation
and introduction of the inductive method of modern science, struggled with the ques-
tion of the velocity of light in the absence of experimental information, as is evident in
this excerpt:32

Even in sight, whereof the action is most rapid, it appears that there are required certain
moments of time for its accomplishment . . . . (It is not surprising that we do not see the
actual passage of light, for there are things which by reason of the velocity of their motion
cannot be seen-as when a ball is discharged from a musket . . . . ) This fact, with others
like it, has at times suggested to me a strange doubt, viz. whether the face of a clear and star-
light sky be seen at the instant at which it really exists, and not a little later; and whether
or not, as regards our sight of heavenly bodies, [there is] a real time and an apparent time,
just like the real place and apparent place which is taken account of by astronomers in the
correction for parallaxes . . . [whether or not] the images or rays of heavenly bodies . . .
take a perceptible time in travelling to us. But this suspicion as to any considerable interval
between the real time and the apparent afterwards vanished entirely . . . what had most
weight of all with me was, that if any perceptible interval of time were interposed between
31 R. Bacon, Opus Majus, Part 5: 9th Distinction, Chap. 3, the R. B. Burke translation, University
of Pennsylvania Press, Philadelphia, 1928.
32 Francis Bacon, Philosophical Works, edited by J. M. Robertson from the edition of Ellis and Sped-
ding, p. 363, London, 1905. (As quoted in I. B. Cohen, Roemer, p. 11, The Burndy Library, Inc.,
New York, 1944.)

the reality and the sight, it would follow that the images would oftentimes be intercepted
and confused by clouds rising in the meanwhile, and similar disturbances of the medium.

A contrast to all this metaphysical speculation is found in the attitude of Galileo
Galilei (1564-1642). Widely regarded as the father of modern physics, Galileo was a
champion of the experimental method. At the age of twenty-six, while professor of
mathematics at Pisa, he began a systematic investigation of the mechanical doctrines
of Aristotle. Having convinced himself by experiment of the error in many of Aristotle's
assertions, Galileo invoked the enmity of the Church by loudly proclaiming his dis-
sensions. These included the question of whether or not a heavy body falls faster than a
light one, and later the profound question of whether the Ptolemaic or Copernican view
of the universe was the proper one.
Galileo was the first to observe that a simple pendulum has a natural period. He
properly deduced the formulas of uniformly accelerated motion, and his contributions
to mechanics were an important precursor to the generalizations made by Newton a
century later. He constructed the first astronomical telescope and with it discovered the
satellites of Jupiter, the crescent phases of Venus, sunspots and the rotation of the sun,
and the libration of the moon. Galileo became interested in the question of light velocity
and, believing it to be finite, undertook to establish this experimentally. His approach
was logical but doomed to failure because of the great velocity involved. In the famous
Dialogues, published in Leyden in 1638, Galileo proposed that33

Each of two persons take a light contained in a lantern, or other receptacle, such that by
the interposition of the hand, the one can shut off or admit the light to the vision of the
other. Next let them stand opposite each other at a distance of a few cubits and practice
until they acquire such skill in uncovering and occulting their lights that the instant one
sees the light of his companion he will uncover his own. After a few trials the response will
be so prompt that without sensible error the uncovering of one light is immediately fol-
lowed by the uncovering of the other, so that as soon as one exposes his light he will instantly
see that of the other. Having acquired skill at this short distance let the two experimenters,
equipped as before, take up positions separated by a distance of two or three miles and let
them perform the same experiment at night, noting carefully whether the exposures and
occultations occur in the same manner as at short distances; if they do, we may safely
conclude that the propagation of light is instantaneous; but if time is required at a dis-
tance of three miles which, considering the going of one light and the coming of the other,
really amounts to six, then the delay ought to be easily observable. If the experiment is to
be made at still greater distances, say eight or ten miles, telescopes may be employed, each
observer adjusting one for himself at the place where he is to make the experiment at night;
then although the lights are not large and are therefore invisible to the naked eye at so great
a distance, they can readily be covered and uncovered since by aid of the telescopes, once
adjusted and fixed, they will become easily visible.

Later he comments,

In fact I have tried the experiment only at a short distance, less than a mile, from which
I have not been able to ascertain with certainty whether the appearance of the opposite
light was instantaneous or not; but if not instantaneous it is extraordinarily rapid-I should
call it momentary; . . . .

33 Galileo Galilei, Dialogues Concerning Two New Sciences, p. 43, reprinted by Dover Publications, Inc.,
New York.

Galileo's experiment was repeated by scientists of the Florentine Academy but with
inconsistent results. The human reaction times were much too great, the separation of
the lanterns was only a few miles, and the timepieces of that era were extremely crude.
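
A rough calculation shows how far beyond reach the lantern experiment was. The sketch below computes the round-trip light delay for several lantern separations and compares it with a human reaction time. The modern speed of light in miles per second is used, and the reaction time of 0.2 second is an assumed round figure for illustration only.

```python
C_MILES_PER_SEC = 186_282.0  # modern value of the speed of light, approximately
REACTION_TIME_S = 0.2        # assumed human reaction time, for illustration only

def round_trip_delay(separation_miles):
    """Time for light to go from one lantern to the other and back, in seconds."""
    return 2.0 * separation_miles / C_MILES_PER_SEC

for separation in (1.0, 3.0, 10.0):
    delay = round_trip_delay(separation)
    ratio = REACTION_TIME_S / delay
    print(f"{separation:4.0f} miles apart: delay {delay * 1e6:6.1f} microseconds, "
          f"reaction time is about {ratio:,.0f} times longer")
```

Even at Galileo's proposed three miles, the six-mile round trip takes only a few tens of microseconds, several thousand times shorter than the assumed reaction time of the observers.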
In the continuing absence of decisive experimental results, the speculation continued.
Kepler (1571-1630) held an Aristotelian view,34 maintaining that light can be propa-
gated an infinite distance in zero time. He based this view on the argument that light
is not matter and thus cannot offer resistance to the force which moves it. In Aris-
totelian mechanics, this requires that light attain an infinite velocity.
Descartes, as has already been noted, believed that light consisted of a transmission
of pressure through the tightly packed globules of the ether. However, in his conception,
light was not a motion because the globules only tended to move, being restrained in
position by their neighbors. Thus each globule was capable of transmitting force
instantaneously, which led Descartes to conclude35
Thus, we shall have no trouble in realizing why such an effect, which I attribute to light,
extends in a spherical fashion all around the sun . . . and why such light propagates in-
stantaneously to all distances.

It is interesting to observe that Descartes could believe both that the velocity of light
was infinite and that the velocity of light was not the same in different media, an
assumption he made in deriving Snell's law (see Section 1.2).
In a correspondence with the Dutch physicist Beekman (1570-1637), Descartes was
hard pressed to defend his metaphysical arguments in favor of an infinite light velocity,
and hit upon an argument which is scientifically sound, and which seemed to him to be
a complete proof that his position was the only correct one. Descartes proposed con-
sideration of a lunar eclipse, caused by the earth being interposed between the sun and
the moon. He then supposed that it requires an hour for light to travel from the earth
to the moon, which would mean that the moon did not grow dark until an hour after
the instant of collinearity of the three bodies. People on earth would not be aware of
this darkening for an additional hour, or until the earth and moon had moved in their
orbits an additional two hours beyond the position of collinearity. But, argued
Descartes, this is clearly contrary to experience, for the eclipsed moon is always
observed at a point in the ecliptic opposite to the sun. Thus the light must travel
instantaneously.
Huygens challenged this proof at its only weak point, saying36

But it must be noted that the speed of light in this argument has been assumed such that
it takes a time of one hour to make the passage from here to the Moon. If one supposes
that for this . . . it requires only ten seconds of time . . . then it will not be easy to per-
ceive anything of it in observations of the Eclipse; nor, consequently, will it be permissible
to deduce from it that the movement of light is instantaneous.
It is true that we are here supposing a strange velocity that would be a hundred thousand
times greater than that of Sound . . . . But this supposition ought not to seem to be an
impossibility; since it is not a question of the transport of a body with so great a speed,
but of a successive movement which is passed on from some bodies to others. I have then

34 J. Kepler, Ad Vitellionem paralipomena quibus astronomiae pars optica traditur, Frankfurt, 1604.
35 R. Descartes, Principes de la Philosophie, 4th ed., Part 3, Sec. 64, Chez Theodore Girard, Paris, 1681.
36 C. Huygens, Traité de la Lumière, pp. 6-7, first published in Leyden in 1690; English translation by
S. P. Thompson, London, 1912; reprinted by University of Chicago Press.


made no difficulty, in meditating on these things, in supposing that the emanation of light
is accomplished with time . . . .

Hooke also appreciated the weakness in Descartes' argument, and in speaking of the
propagation of light through a transparent body or medium, he asserted37 that the
light

may be communicated or propagated through it to the greatest imaginable distance in
the least imaginable time; though I see no reason to affirm, that it must be in an instant:
For I know not any one Experiment or observation that does prove it. And, whereas it may
be objected, That we see the Sun risen at the very instant when it is above the sensible
Horizon, and that we see a Star hidden by the body of the Moon at the same instant, when
the Star, the Moon, and our Eye are all in the same line; and the like Observations, or
rather suppositions, may be urg'd. I have this to answer, That I can as easily deny as they
affirm; for I would fain know by what means any one can be assured any more of the
Affirmative, than I of the Negative. If indeed the propagation were very slow, 'tis possible
something might be discovered by Eclypses of the Moon; but though we should grant the
progress of the light from the Earth to the Moon, and from the Moon back to the Earth
again to be full two Minutes in performing, I know not any possible means to discover
it . . . .

The distinction for having performed the first decisive determination of the velocity
of light goes to Ole Roemer (1644-1710). Born in Denmark, and educated under the
Bartholins at the University of Copenhagen, he then went to Paris as a young astron-
omer for the Académie Royale des Sciences, which at that time was undertaking a
project to prepare more accurate maps. A technique had been proposed whereby the
longitude of any place could be determined relative to the longitude of Paris by simul-
taneous observation of an astronomical phenomenon from the two positions. What was
needed was a celestial occurrence of reasonable frequency, and a tentative selection was
made of the eclipses of the satellites of Jupiter, a phenomenon which had been dis-
covered earlier in the same century by Galileo.
In choosing Roemer to work on this project, the Académie picked a man who was to
prove to be one of the greatest practical astronomers of all time. He built the first good
transit instrument and the earliest transit circle, greatly improved on the construction
of micrometers, and showed that the epicycloid is the best shape for gear teeth, in-
corporating this discovery into the design of all his astronomical instruments: in his
later years he supervised the erection of an excellent observatory near Copenhagen.
While in Paris at the beginning of his career, and upon launching into a study of the
eclipses of Jupiter's moons, Roemer was struck by a surprising observation. Since one
would expect that the period of a moon would remain constant, knowing the time at
which one eclipse occurred, it was then a simple matter to predict a sequence of later
times at which a given moon would be eclipsed by Jupiter. But when Roemer did this,
he predicted a time sequence which did not agree with later eclipse measurements. He
attributed this disparity to the changed distance between Earth and Jupiter, which,
if the velocity of light were finite, would explain the irregularity in eclipse occurrences.
Accordingly, in September 1676, Roemer announced to members of the Paris
Académie that the next eclipse of the innermost satellite of Jupiter, expected on
37 R. Hooke, Micrographia, 1st ed., p. 56, published by the Royal Society of London, reproduced by
Dover Publications, Inc., New York, 1961.

November 9, would occur exactly ten minutes later than the time computed on the
basis of previous eclipses. When observation had confirmed this startling prediction,
Roemer again addressed the Académie, saying 38
The necessity of this new equation of the retardation of light, is established by all the
observations that have been made by the Académie Royale and by the Observatory during
the last eight years, and it has been confirmed anew by the emersion of the first satellite,
observed at Paris last November 9th at 5h 35m 45s at night, 10 minutes later than had been
expected . . . .

From his knowledge of the relative positions of the earth and Jupiter, Roemer deduced
that this retardation was such that light should take 22 minutes to cross the diameter
of the earth's orbit, which translates into a velocity of light of approximately 140,000
mi/sec. Roemer's value was thus about 25 percent low,† but his accomplishment was
nevertheless impressive. For the first time in history man had been able to measure
a velocity which was so great that many had thought it to be infinite.
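
The arithmetic behind the 140,000 mi/sec figure can be sketched as follows. This is only an illustrative check, not Roemer's own computation: the mean earth-sun distance of 93 million miles and the rounded modern speed of 186,000 mi/sec are assumed values that do not appear in the text above, and the function name is hypothetical.

```python
# Illustrative check of the figure quoted for Roemer's result.
# ASSUMPTIONS (not from the text): mean earth-sun distance and the
# rounded modern value of the speed of light, both in miles.
AU_MILES = 93.0e6
MODERN_SPEED = 186_000.0  # mi/sec, rounded


def speed_from_orbit_crossing(minutes, au_miles=AU_MILES):
    """Speed (mi/sec) if light needs `minutes` to cross the orbit's diameter."""
    diameter = 2.0 * au_miles
    return diameter / (minutes * 60.0)


roemer_speed = speed_from_orbit_crossing(22.0)
shortfall = 1.0 - roemer_speed / MODERN_SPEED
print(f"{roemer_speed:,.0f} mi/sec, about {100 * shortfall:.0f} percent low")
```

With these assumed inputs the result lands near 141,000 mi/sec, roughly a quarter below the modern value, in line with the figures quoted above.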
Roemer's assertion was accepted promptly by Huygens and Newton, and many of
his colleagues were quick to rectify the error in his calculations. Thus Newton, in the
first edition of his Opticks (1704), introduces the proposition that 39

Light is propagated from luminous Bodies in time, and spends about seven or eight min-
utes of an Hour in passing from the Sun to the Earth.

adding that this effect was first observed by Roemer. However, no such acceptance was
found among the Cartesians, and such was the influence of Descartes' ideas that the
Continent remained unconvinced until the brilliant confirming experiments of Bradley
a half century later.
Bradley (1693-1762) was born in Gloucestershire and educated at Oxford. His
interest in astronomy was aroused early by an uncle whose home contained an excellent
amateur observatory, and he became an acute observer through having engaged in a
regular series of observations extending from boyhood. He was elected a member of the
Royal Society in 1718 and three years later was appointed Savilian Professor of
Astronomy at Oxford. He succeeded Halley as Astronomer Royal in 1742 and devoted
the remainder of his life to the Greenwich observatory.
In addition to the discovery of stellar aberration, to be discussed below, Bradley's
minute observations led him to the detection of the nutation of the earth's axis. In an
action so characteristic of his painstaking nature, Bradley refrained from announcing
the discovery of nutation until February 1748, after he had assured himself of its
certainty by careful measurements extending over an entire revolution (18.6 years).
Bradley's discovery and interpretation of the phenomenon of stellar aberration came
† His principal source of error was an oversight. Roemer had used eclipse data from the years 1671-
1673 to predict the retardation time, because he had at his disposal many observations from that
period, and also because Jupiter at that time had been making an aphelion passage and thus was at a
nearly constant distance from the sun. However, in 1676 Jupiter was no longer in such a position, and
Roemer failed to account for its changed distance from the sun between eclipses, thus obtaining an
incorrect value for the change in the distance between Earth and Jupiter.
38 O. Roemer, "Demonstration Concerning the Movement of Light," J des Scavans, 233-236;
December 7, 1676. (Reprinted in Phil Trans Roy Soc (London), 12, 893-894; June 25, 1677.)
39 I. Newton, Opticks, 4th ed., Book 2, Part 3, Proposition 11, William Innys, Publisher, London, 1730.
(Reprinted by Whittlesey House, McGraw-Hill Book Company, New York, 1931.)



as the result of an effort to detect stellar parallax, which he began in 1725. The absence
of any measurable parallax had long been a stumbling block for adherents of the
Copernican system. Tycho (1546-1601) had recognized earlier that, when viewed from
opposite sides of the earth's orbit, stars should show a displacement in direction, but his
careful observations convinced him that no such displacement so great as one minute of
arc existed. Later observers also had sought this effect in vain, and stellar parallax had
become one of the outstanding problems in astronomy.
Working with improved instruments, Bradley attacked this problem by systematically
recording the position of γ Draconis, a bright star in the constellation Draco, at
various times during the year. As shown in Figure 1.3a, what he was seeking was a
difference in the angles $\alpha$ and $\beta$, which certainly should be evident if $r_1$ and $r_2$ were not
too much greater than the diameter of the earth's orbit. It is obvious from the figure
that this parallax effect should be greatest for stars near the ecliptic pole,† and thus
γ Draconis was an ideal choice. The plane containing the ecliptic axis and γ Draconis
cuts the earth's orbit in points the earth occupies in June and December. Thus Bradley
expected to find γ Draconis making its smallest angle to the ecliptic plane in December
and its greatest angle in June. To his surprise, he found that γ Draconis lies closest to
the ecliptic in March and is most elevated in September, the difference in these angles
being about 40 sec of arc.
Bradley checked his findings by observing other stars over a three-year period, always
with similar results. Finally satisfied that the effect was real, he reported 40 his observa-
tions in 1728. After carefully eliminating other possible explanations for the effect, he
said

At last I conjectured, that all the Phenomena hitherto mentioned, proceeded from the pro-
gressive Motion of Light and the Earth's annual Motion in its Orbit. For I perceived, that,
if Light was propagated in Time, the apparent Place of a fixt Object would not be the same
when the Eye is at Rest, as when it is moving in any other Direction, than that of the Line
passing through the Eye and Object; and that, when the Eye is moving in different Direc-
tions, the apparent Place of the Object would be different.

Bradley then proceeded to explain the apparent shift in position of the stars under this
hypothesis. His reasoning can be understood with reference to Figure 1.3b, in which
Cartesian axes have been chosen fixed in the sun, with the Z axis pointing toward the
ecliptic pole and γ Draconis in the XZ plane, close to the Z axis. In March the orbital
velocity of the earth is toward γ Draconis, whereas in September it is away from γ
Draconis. Neglecting the diurnal rotational motion of the earth (which is only about
1 percent of the orbital motion), Bradley reasoned in effect that in March the velocity
components of the light entering his telescope from γ Draconis were $(c_x + v, 0, c_z)$,
whereas in September they were $(c_x - v, 0, c_z)$, with $v$ the orbital speed and $c_x$, $c_z$ the
velocity components of the light relative to the sun. Thus in March he needed to point
his telescope at an angle $\alpha$ above the ecliptic plane given by $\tan\alpha = c_z/(c_x + v)$, and in
September he needed to point his telescope at a slightly higher angle $\beta$ above the

† The earth's orbit lies in the plane of the ecliptic, and the ecliptic pole is the axis perpendicular to this
plane and piercing it at the center of the earth's orbit.
40 J. Bradley, "An Account of a New Discovered Motion of the Fix'd Stars," Phil Trans Roy Soc
(London), 35, 637-660; December 1728.


FIGURE 1.3 Stellar aberration. (a) The earth's June and December positions about the sun,
showing the ecliptic plane, the ecliptic pole, and the directions to γ Draconis. (b) The earth's
March and September positions, with the direction to γ Draconis.

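ecliptic given by $\tan\beta = c_z/(c_x - v)$. Since $\beta - \alpha$ is small,

$$\beta - \alpha \approx \tan(\beta - \alpha) = \frac{\tan\beta - \tan\alpha}{1 + \tan\beta\,\tan\alpha} = \frac{2vc_z}{c^2 - v^2}$$

from which, because $v \ll c$, it follows that

$$\beta - \alpha \approx 2\,\frac{v}{c}\,\frac{c_z}{c} \qquad (1.1)$$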

Upon inserting measured values for $\alpha$, $\beta$, and $v$ into Equation (1.1), Bradley was able to
deduce a value for the velocity of light $c$, since he knew the direction cosine $c_z/c$. In
his own words,

. . . the Velocity of Light [is] to the Velocity of the Eye (which in this Case may be supposed
the same as the Velocity of the Earth's annual Motion in its Orbit) as 10,210 to One, from
whence it would follow, that Light moves, or is propagated as far as from the Sun to the
Earth in 8'12".
It is well known, that Mr. Romer, who first attempted to account for an apparent Inequal-
ity in the Times of the Eclipses of Jupiter's Satellites, by the Hypothesis of the progressive
Motion of Light, supposed that it spent about 11 Minutes of Time in its Passage from the
Sun to us: but it hath since been concluded by others from the like Eclipses, that it is propa-
gated as far in about 7 Minutes. The Velocity of Light therefore deduced from the foregoing
Hypothesis, is as it were a Mean betwixt what had at different times been determined
from the Eclipses of Jupiter's Satellites.

Bradley's value for the time of passage of light from the sun to the earth translates
into a light velocity of 189,000 mi/sec, a value in close agreement with modern
measurements.
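
Bradley's numbers can be cross-checked against Equation (1.1) with a short calculation. The sketch below is illustrative only: it assumes a direction cosine $c_z/c$ of essentially unity (the star lies close to the ecliptic pole) and a mean earth-sun distance of 93 million miles, neither of which is a value stated in the text, and the function names are hypothetical.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def aberration_shift_arcsec(c_over_v, direction_cosine=1.0):
    """Equation (1.1): beta - alpha ~ 2 (v/c)(c_z/c), in seconds of arc.

    direction_cosine = c_z/c is ASSUMED to be 1.0 for a star near the
    ecliptic pole.
    """
    return 2.0 * direction_cosine / c_over_v * ARCSEC_PER_RAD


def speed_from_transit_time(seconds, au_miles=93.0e6):
    """Speed (mi/sec) from the sun-to-earth transit time.

    au_miles is an ASSUMED mean earth-sun distance, not a value from the text.
    """
    return au_miles / seconds


shift = aberration_shift_arcsec(10_210.0)          # Bradley's ratio c/v
c_bradley = speed_from_transit_time(8 * 60 + 12)   # Bradley's 8'12"
print(f"shift = {shift:.1f} arcsec, c = {c_bradley:,.0f} mi/sec")
```

Under these assumptions the ratio 10,210 reproduces a seasonal shift of about 40 sec of arc, and the transit time of 8'12" reproduces a velocity near 189,000 mi/sec.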
Bradley termed this effect which shifts the apparent position of a star aberration.
When his findings became widely known, all sensible objection to the view that the
velocity of light is great, but finite, ceased to exist.
The first attempt to measure the velocity of light using a purely terrestrial method
was made by Fizeau in 1849. He employed a large toothed wheel as a light chopper
and selective receiver, sending light pulses to a remote mirror at a known distance.
Upon their return, the pulses would be unable to get past a tooth which had moved
over to replace a space, if the rotational speed of the wheel were a critical value; this
fact was used to deduce the time taken for a pulse to travel from the wheel to the
distant mirror and back, from which the velocity of light followed immediately.
A lifelong resident of Paris, Fizeau (1819-1896) devoted his long and productive
career to scientific research. With Foucault, he conducted an extensive series of experi-
ments on interference of both light rays and heat rays. He explained the Doppler effect,
made valuable discoveries related to the polarization of light, and applied the principle
of light interference to the measurement of the dilatation of crystals. He is best remembered
for determinations of the velocity of light in air and in moving water. The latter
determination played a significant role in the development of the special theory of
relativity and will be discussed in Chapter 2. Fizeau's determination of the velocity
of light in air was accomplished earlier, in 1849, with an apparatus which is suggested
in simplified form by Figure 1.4.
In this experiment, light from a source S was focused at f by means of the lens L1
and the half-silvered mirror P. The principal focus of the lens L2 was made to coincide
with f so that a parallel beam of light emerged from the apparatus and traveled to a
distant station consisting of the lens L3 and the spherical mirror M. This beam was
focused by L3 on M, whose center of curvature was chosen to lie in L3. Thus the reflected
beam emerged from L3 in a parallel pencil and was brought to a focus at f, from whence
it diverged to fall upon the half-silvered mirror P and be partially transmitted to the
eyepiece V.

FIGURE 1.4 Fizeau's apparatus.

When a toothed wheel W was inserted in the light path at f, an image of the source S
could be seen at V unless f were blocked by the presence of a tooth. Fizeau used a
wheel with 720 teeth separated by spaces congruent to the teeth, and connected the
wheel to a clockwork driven by weights, thus using the wheel to pulse the light. With
the wheel rotating very slowly, the image of S would appear and disappear successively
as the spaces and teeth passed before f. However, if the speed were increased to the point
that several teeth per second passed f, the persistence of vision would render a permanent
image at half the intensity which had been seen with the wheel at rest and two
teeth straddling f.
When the speed of the toothed wheel was increased further, because of the finite
velocity of light, a sensible part of the light transmitted through a space toward M
would, upon returning, fall upon the adjacent tooth and be intercepted, thus decreasing
the intensity of the image. If the rotational speed became great enough so that, when
the light returned, the tooth had just moved into the position previously occupied
by the space, then all the returning light was intercepted and the image at V was
totally extinguished.
What occurred, therefore, was that at first a bright image was observed, which
faded away as the rotational speed increased to a value just sufficient to replace a space
by a tooth in the time T it took light to travel from f to M and back. When the rotational
speed was increased further, the image returned, increasing in brightness until a maxi-
mum was reached corresponding to one space replacing another in time T. Having
thus reached a maximum, the image would fade away again, and so on in succession
for higher and higher speeds.
From his knowledge of the wheel geometry and a measurement of the rotational

speed during image eclipse, Fizeau was able to deduce T and thus the velocity of light,
since he knew the distance from f to M. In reporting this experiment, 41 he said

. . . the result turned out very well, and one was able to observe, depending on whether
the speed of rotation was more or less, a bright point of light or a total eclipse. Under the
conditions in which the experiment was performed, the first eclipse occurred for 12.6 rota-
tions per second. For double that speed, a new bright point; for triple, a second eclipse . . .
and so forth.
The first station was placed in the belvedere of a house situated at Suresnes, the second
on the top of Montmartre, at a distance of approximately 8633 meters . . . .
These first attempts furnished a value for the velocity of light which differs but little from
that which has been obtained by astronomers. The mean deduced from twenty-eight observations
made so far give for its value 70,948 leagues† . . . .
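
The figures Fizeau reports are enough to reproduce his result. The sketch below is a reconstruction under a stated assumption, not Fizeau's own reduction of his data: it assumes that at the k-th eclipse the wheel advances (2k - 1) half tooth-periods during the round trip, which follows from the teeth and spaces being congruent. The function name is hypothetical.

```python
def fizeau_speed(teeth, rev_per_sec, one_way_distance_m, eclipse_order=1):
    """Velocity of light (m/sec) from a toothed-wheel eclipse.

    One tooth plus one space passes in 1/(teeth * rev_per_sec) seconds.
    ASSUMPTION: at the k-th eclipse a tooth has replaced a space, i.e. the
    wheel has advanced (2k - 1) half-periods during the round trip.
    """
    tooth_period = 1.0 / (teeth * rev_per_sec)
    round_trip_time = (2 * eclipse_order - 1) * tooth_period / 2.0
    return 2.0 * one_way_distance_m / round_trip_time


# 720 teeth, first eclipse at 12.6 rotations per second, 8633 m between stations
c_fizeau = fizeau_speed(720, 12.6, 8633.0)
print(f"c = {c_fizeau:.3e} m/sec")
```

The same function applied to the second eclipse, at triple the speed (37.8 rotations per second, `eclipse_order=2`), gives the same velocity, which is consistent with the succession of eclipses and bright points described in the quotation.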

Fizeau's technique was limited in its accuracy because it was difficult to judge
just when the image had reached maximum or minimum intensity. Foucault devised
a modification of the apparatus which overcame this limitation by replacing the toothed
wheel with a rotating mirror. This mirror caused a measurable displacement of the
image, thus providing a determination of the velocity of light. In 1850 Foucault used
this apparatus to measure the relative velocities of light in air and water, and in 1862
he used an improved version to make an absolute determination of the velocity of light
in air.
Foucault (1819-1868) was also a Parisian, the son of a publisher. He originally
studied for a medical career but then abandoned it for physical science. With Fizeau
he carried on a series of investigations on the intensity of the light of the sun, as well
as the above-mentioned interference experiments. He established that the velocity of
light is inversely proportional to the refractive index of the medium, thus contributing
to the overthrow of the corpuscular theory. In 1851 he demonstrated the diurnal
motion of the earth via what has come to be known as the Foucault pendulum, and in
1852 he invented the gyroscope; for these two achievements he received the Copley
medal in 1855.
The 1862 determination of the velocity of light was achieved with the apparatus
shown in Figure 1.5. Foucault let solar light, transmitted from a rectangular aperture S,
pass through a half-silvered mirror P and fall upon the achromatic lens L. The light
then proceeded to a rotatable plane mirror R, which was initially fixed at the proper
angular position to bring the rays to a focus at the point M. A concave mirror fixed at
M, with a radius of curvature equal to RM, then reflected the light along a return
path such that half of the light came to a focus at A, to be viewed by a micrometer
eyepiece. A fine grating was stretched over the slit at S, so that the image at A was
crossed by dark lines, above which a cross-hair of the eyepiece could be positioned
accurately.
When the mirror R was rotated, it acted as a light chopper, in that only when R

† The league is an itinerary measure of distance which varies from country to country but is usually
estimated at about 3 mi. Fizeau used it in a precise sense such that his result was equivalent to a light
velocity of 3.13 × 10^8 m/sec or 194,000 mi/sec.
41 A. H. Fizeau, "On an Experiment Relative to the Speed of Propagation of Light," Compt Rend, 29,
90-92; July 1849.


FIGURE 1.5 Foucault's apparatus.

was in the proper angular position to deliver light to M would an image be seen at the
eyepiece. However, during the time T light takes to travel from R to M and back, the
mirror would rotate an additional angular amount $\alpha = \omega T$, in which $\omega$ was the angular
velocity of the mirror. This caused the reflected beam to be deflected an angle $2\alpha$, thus
shifting the image from A to A'. By measuring the displacement AA' and the rotational
speed $\omega$, since he knew the relative positions of the components of his apparatus,
Foucault was able to determine T and thus the velocity of light.
Foucault placed the mirrors R and M an equivalent distance of 20 m apart through
the use of multiple reflections, and turned the mirror R at speeds up to 1,000 revolutions
per second, obtaining image displacements in the order of 1 mm. Of his results
he said 42

Definitively, the velocity of light has been found to be noticeably diminished. Earlier data
had indicated that the velocity was 308 millions of meters per second, and this new experi-
ment with the turning mirror gives a value, in round numbers, of 298 millions.
One is able, it seems to me, to count on the exactness of this number, in the sense that the
corrections it would have to suffer should not change its value more than 500,000 meters.

Despite the confidence expressed by Foucault in this determination, his apparatus
also suffered from a serious limitation. The distance RM could not be increased significantly
without diminishing the intensity of the image at A', since the intensity
of the light reflected from M was attenuated as $(RM)^2$ before returning to R. But with
RM at 20 m and extremely high speeds for the rotating mirror, the displacement AA'
was still small enough to be subject to considerable error.
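
The smallness of the displacement can be illustrated with the rotating-mirror relation, in which the mirror turns through an angle equal to its angular velocity times the round-trip time and the beam is deflected by twice that angle. The sketch below is an order-of-magnitude illustration only. It uses Foucault's rounded result of 298 million m/sec for the velocity, and the effective lever arm of 0.6 m that converts the beam deflection into a displacement at the eyepiece is a purely hypothetical value chosen for illustration; the actual geometry of the apparatus is not given in the text.

```python
import math


def rotating_mirror_deflection(path_m, rev_per_sec, c=2.98e8):
    """Return (T, alpha, beam_angle) for a rotating-mirror measurement.

    T          round-trip time from R to M and back (sec)
    alpha      angle the mirror turns during T (rad), alpha = omega * T
    beam_angle deflection of the reflected beam, 2 * alpha (rad)
    """
    T = 2.0 * path_m / c
    omega = 2.0 * math.pi * rev_per_sec
    alpha = omega * T
    return T, alpha, 2.0 * alpha


T, alpha, beam_angle = rotating_mirror_deflection(20.0, 1000.0)

LEVER_ARM_M = 0.6  # HYPOTHETICAL effective distance from R to the eyepiece
displacement_mm = beam_angle * LEVER_ARM_M * 1e3
print(f"T = {T:.2e} sec, 2*alpha = {beam_angle:.2e} rad, "
      f"displacement ~ {displacement_mm:.1f} mm")
```

With a 20 m path the round trip lasts little more than a tenth of a microsecond, so even at 1,000 revolutions per second the beam is deflected by only a couple of milliradians, which is why the displacement stays near the millimeter scale.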
Michelson eliminated this drawback by placing the lens L between R and M so that
S lay at its principal focus, thus providing a parallel beam to travel to M. The mirror M
could then be made plane and placed at a much larger distance from R, thus enhancing
the displacement AA'; indeed, Michelson was able to achieve such great image dis-
placements that he eliminated the half-silvered mirror P. His simplified version of the

42 J. B. L. Foucault, "Experimental Determination of the Velocity of Light," Compt Rend, 55,
501-503; September 1862.


FIGURE 1.6 Michelson's apparatus.

apparatus is shown in Figure 1.6. About this apparatus and his measurements, Michel-
son said 43

In the following experiments the distance between the mirrors was nearly 2000 feet
and the speed of the mirror was about 257 revolutions per second. The deflection exceeded
133 millimeters, being about 200 times as great as that obtained by Foucault. If it were
necessary it could be still further increased. This deflection was measured within three
or four hundredths of a millimeter in each observation; and it is safe to say that the result,
so far as it is affected by this measurement, is correct to within one ten-thousandth part.
The revolving mirror was actuated by a current of air . . . . To regulate and measure
the speed of rotation a tuning fork, bearing on one prong a steel mirror, was employed. This
was kept in vibration by a current of electricity. The fork was so placed that the light from
the revolving mirror was reflected to a piece of plane glass in front of the eye-piece, and
thence reflected to the eye. When fork and mirror are both at rest, an image of the revolving
mirror is perceived. When the fork vibrates, this image is drawn out into a band of light.
When the mirror commences to revolve, this band breaks up into a number of moving
images of the mirror; and when, finally the mirror makes as many turns as the fork makes
vibrations, or any multiple . . . of this number, the images become stationary . . . .
The electric fork made about 128 vibrations per second. No dependence was placed upon
this rate, however, but at each set of observations it was compared with a standard Ut3
fork, the temperature being noted at the time.

Being thus assured of great accuracy in both of the critical measurements (image
displacement and mirror velocity), Michelson listed 200 data points, each of which was
the mean of 10 separate observations, and concluded that the velocity of light in air was
299,740 km/sec, being thus 299,820 km/sec in vacuo. In 1882 he repeated the experiment
and announced a new value for the velocity of light in vacuo, 299,853 km/sec. This was
to remain the accepted figure for forty-five years, and when it was replaced by a more
precise figure, Michelson was once again involved in the determination.
Albert A. Michelson (1852-1931) was born in Poland but emigrated to America
with his parents at the age of two. They settled in the West following the gold rush
and he was raised in a mining town. A rare presidential appointment as midshipman
at the Naval Academy insured his college education and stimulated his interest in
science. Upon graduation he became an instructor at Annapolis and embarked on his

43 A. A. Michelson, "Experimental Determination of the Velocity of Light," Am J Sci, 18, 390-393;
November 1879.

first determination of the velocity of light, described above. There followed a period of
study in Europe during which he invented the interferometer and with it performed
the first ether drift experiment. Upon returning to the United States, he teamed with
Professor Morley to improve the interferometer and repeat this celebrated experiment
which has so influenced the subject of relativity. They also collaborated in a precise
repetition of Fizeau's moving-water experiment and in the establishment of the wave-
length of sodium light as a standard of length.
Michelson's ingenuity at optical instrumentation also led to the development of an
echelon spectroscope, to a determination of the rigidity of the earth, and to measure-
ments of the distances and diameters of giant stars. In recognition of his many con-
tributions to physics, he was awarded the Nobel prize in 1907, the first American
scientist so honored.
In 1923 Michelson was asked to go to Pasadena to make another determination of
the speed of light, and this he accomplished with the apparatus shown in Figure 1.7.

FIGURE 1.7 Michelson's improved apparatus. Labeled elements: arc light source, slit, lens,
rotating octagonal prism on Mt. Wilson, mirror on Mt. Wilson, fixed mirror on Mt. San Antonio,
prism, lens, observer. [From Michelson and the Speed of Light by Bernard Jaffe. (Science Study
Series). Copyright 1960 by Educational Services Incorporated. Reprinted by permission of
Doubleday & Company, Inc.]

The principle of operation was still the same, although many refinements of the original
apparatus are evident. An eight-sided rotating prism of nickel-steel, with its mirror
surfaces polished true to one part in a million, was used in place of the single rotating
mirror. Once again, an air blast was used to actuate the mirror system, and a tuning-fork
stroboscope to measure its rotational speed. The two stations were considerably
farther apart, being placed on Mt. Wilson and Mt. San Antonio. The United States
Coast and Geodetic Survey established the distance between these stations within a
fraction of an inch in 22 miles. The intensity of the image was enhanced by using large
parabolic mirrors at both stations. Many observations yielded a mean value for the
velocity of light of 299,798 km/sec.
But Michelson was not yet through. He wanted to measure the velocity of light in as
near perfect a vacuum as possible, free from the obstruction of haze or smoke. A mile-long
tube of corrugated steel was constructed and evacuated down to a pressure of
1/2 mm, with a version of the apparatus of Figure 1.7 enclosed. Unfortunately, Michel-
son did not live to see the end of this experiment, succumbing two years before its
completion. His colleagues made almost 3,000 independent observations, reporting 44 a
mean figure for the velocity of light in vacuum to be 299,774 km/sec.
The value 299,792.5 km/sec in vacuo has been adopted as the velocity of light by the
International Union of Geodesy and Geophysics and by the International Scientific
Radio Union. This fundamental constant is within the limits of error of Michelson's
final figure.

1.3 SOUND WAVES AND LIGHT WAVES

The previous two sections have indicated that light as a wave phenomenon has characteristics
common to those of all other types of waves. These include a wavelength, a
frequency, and their product the wave velocity, as well as a variety of interference
effects. However, light has one characteristic which makes it unique: it can propagate
in the absence of a tangible medium. This feature will prove to be of fundamental
significance.
It is instructive to contrast the properties of light with those of other wave phe-
nomena. A comparison of the behavior of sound waves and light waves in air is a good
illustrative example, because the air can be permitted to become increasingly rarefied,
approaching in the limit the absence of a tangible medium.
The Acoustic Wave Equation. Sound waves in air consist of longitudinal molecular
vibrations, resulting in alternate compression and rarefaction of the air. If one con-
siders the case in which sound is propagating in the positive X direction, the molecules
which (on the average) lie in a plane x = constant will (on the average) oscillate in the
X direction. As seen in Figure 1.8, their instantaneous average position will be $x + \xi(x,t)$,
in which $\xi(x,t)$ is the time-varying displacement around the average position $x$. Similarly,
the average position of molecules at an adjacent cross section will be $x + dx + \xi(x + dx, t)$.
For unit transverse area, the instantaneous volume between these two planes of molecules is

$$[x + dx + \xi(x + dx, t)] - [x + \xi(x,t)] = \left(1 + \frac{\partial \xi}{\partial x}\right) dx \qquad (1.2)$$

and thus the fractional change in volume is $\partial\xi/\partial x$. Since the average number of molecules
in this volume is a constant, it follows that the density is fluctuating. If the
instantaneous density is designated by $\rho_0 + \rho_1(x,t)$, then

$$[\rho_0 + \rho_1(x,t)]\left(1 + \frac{\partial \xi}{\partial x}\right) dx = \text{constant} = \rho_0\,dx \qquad (1.3)$$

When it is assumed that the density fluctuation $\rho_1(x,t)$ is small compared to the average
value $\rho_0$ and that the fractional change in volume $\partial\xi/\partial x$ is small compared to unity,
Equation (1.3) yields the first-order result

$$\frac{\rho_1(x,t)}{\rho_0} = -\frac{\partial \xi}{\partial x} \qquad (1.4)$$

44 A. A. Michelson, F. G. Pease, and F. Pearson, "Measurement of the Velocity of Light in a Partial
Vacuum," Astrophys J, 82, 26-61; July 1935.


FIGURE 1.8 Average behavior of layers of air molecules in presence of sound waves. A layer of
molecules with average position $x$ and the adjacent layer with average position $x + dx$ are shown
in their instantaneous displaced positions $x + \xi(x,t)$ and $x + dx + \xi(x + dx, t)$.

The fluctuations in density of the air as the sound waves pass through are so rapid
that the air does not transfer heat. The compressions and rarefactions are thus adia-
batic, and the process conforms to the gas law equation

$$pV^{\gamma} = \text{constant} \qquad (1.5)$$

in which $p$ is the pressure, $V$ the volume, and $\gamma$ is the ratio of specific heats at constant
pressure and constant volume.

Since it has been observed that the volume occupied by a fixed number of molecules
is fluctuating, it follows from (1.5) that the total pressure is varying also. Thus one
may write

$$p = p_0 + p_1(x,t) \qquad (1.6)$$

in which $p_1(x,t)$ is the small fluctuation around the relatively large constant average
pressure $p_0$.
Taking the total differential of (1.5) and then dividing by (1.5) itself, one obtains

$$\frac{dp}{p} = -\gamma\,\frac{dV}{V}$$

which yields the first-order result

$$\frac{p_1(x,t)}{p_0} = -\gamma\,\frac{\partial \xi}{\partial x} \qquad (1.7)$$

because it has been noted, in connection with Equation (1.2), that $\partial\xi/\partial x$ is the fractional
change in volume.
Newton's force law can be applied to the segment of air between the two adjacent
cross sections. The net force per unit transverse area acting on the molecules is
$-[p_1(x + dx, t) - p_1(x,t)]$. Since to first order the mass is $\rho_0\,dx$, one may write

$$\rho_0\,\frac{\partial^2 \xi}{\partial t^2} = -\frac{\partial p_1}{\partial x} \qquad (1.8)$$

Combination of (1.8) with the spatial derivative of (1.7) yields the wave equation

$$\frac{\partial^2 \xi}{\partial x^2} = \frac{1}{c_s^2}\,\frac{\partial^2 \xi}{\partial t^2} \qquad (1.9)$$

in which

$$c_s = \left(\frac{\gamma p_0}{\rho_0}\right)^{1/2} \qquad (1.10)$$
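
The combination of (1.7) and (1.8) can be written out step by step. The following is a sketch of the intermediate algebra, using only those two equations and the constancy of the average pressure $p_0$ and the average density $\rho_0$:

```latex
\begin{align*}
p_1 &= -\gamma p_0\,\frac{\partial \xi}{\partial x}
    && \text{from (1.7)} \\
\frac{\partial p_1}{\partial x} &= -\gamma p_0\,\frac{\partial^2 \xi}{\partial x^2}
    && \text{spatial derivative, with } p_0 \text{ constant} \\
\rho_0\,\frac{\partial^2 \xi}{\partial t^2} &= \gamma p_0\,\frac{\partial^2 \xi}{\partial x^2}
    && \text{substituting into (1.8)} \\
\frac{\partial^2 \xi}{\partial x^2} &= \frac{\rho_0}{\gamma p_0}\,\frac{\partial^2 \xi}{\partial t^2}
    = \frac{1}{c_s^2}\,\frac{\partial^2 \xi}{\partial t^2},
    \qquad c_s^2 = \frac{\gamma p_0}{\rho_0}
\end{align*}
```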

The reader will have little difficulty convincing himself that the general solution of
(1.9) is
~(X,t) = j(x - cst) + g(x + cst) (1.11)
in which f and g are arbitrary functions. At a time t i the spatial distribution of j is
f(x - cs t 1) , as illustrated in Figure 1.9. At a later time t 2 i t is
f(x - cst 2 ) = f( {:r - cs(t 2 - t 1) } - cst})

~c, -....c,
r--~-----~~-____:lI~- x t----.-..;..-----~~-____:lI-X

FIGURE 1.9 Traveling sound waves.


32 The Phenomenon of Light CHAPTER 1

and is therefore the same spatial distribution as earlier, but shifted along the $X$ axis a
distance $c_s(t_2 - t_1)$. For this reason $f(x - c_s t)$ represents a wave of arbitrary but constant spatial shape, traveling in the $+X$ direction at speed $c_s$. Similarly, $g(x + c_s t)$
represents an arbitrary wave traveling in the $-X$ direction at speed $c_s$. The speed of
these waves is seen, from Equation (1.10), to depend on the conditions of the medium,
namely, the pressure and density of the air. If the air is sufficiently well approximated
by the ideal gas law†
$$pV = nRT$$
in which $n$ is the number of moles, then
$$c_s = \left(\gamma\,\frac{nRT}{\rho_0 V}\right)^{1/2} \sim T^{1/2} \tag{1.12}$$
since $n/\rho_0 V$ is a constant. Therefore this first-order theory yields the result that the
propagation velocity of sound waves in air depends only on the temperature of the air.
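As a numerical illustration of (1.10) and (1.12), the short Python sketch below evaluates the sound speed in two ways. The numerical values (sea-level pressure and density, the molar mass of air, and the ratio of specific heats) are assumed standard figures, not taken from the text, and the function names are hypothetical. The molar mass M enters because the mass of the gas is nM, so that p0/rho0 = RT/M.

```python
import math

def sound_speed_from_pressure(gamma, p0, rho0):
    """Equation (1.10): c_s = sqrt(gamma * p0 / rho0)."""
    return math.sqrt(gamma * p0 / rho0)

def sound_speed_from_temperature(gamma, molar_mass, temperature,
                                 gas_constant=8.314462618):
    """Ideal-gas form behind (1.12): c_s = sqrt(gamma * R * T / M)."""
    return math.sqrt(gamma * gas_constant * temperature / molar_mass)

# Assumed illustrative values for air (not from the text).
GAMMA_AIR = 1.4        # ratio of specific heats
P0 = 101325.0          # average pressure, Pa
RHO0 = 1.225           # average density, kg/m^3
M_AIR = 0.028964       # molar mass, kg/mol
T_REF = 288.15         # temperature, K

c_sea_level = sound_speed_from_pressure(GAMMA_AIR, P0, RHO0)
c_reference = sound_speed_from_temperature(GAMMA_AIR, M_AIR, T_REF)
c_four_times_hotter = sound_speed_from_temperature(GAMMA_AIR, M_AIR, 4 * T_REF)

print(f"c_s from pressure and density: {c_sea_level:.1f} m/s")
print(f"c_s from temperature:          {c_reference:.1f} m/s")
print(f"ratio when T is quadrupled:    {c_four_times_hotter / c_reference:.3f}")
```

Both routes give roughly the same speed, and quadrupling the absolute temperature doubles it, as the square-root dependence in (1.12) requires.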
Propagation Independent of Source. A significant feature of Equation (1.10) is
its suggestion that $c_s$ is independent of the motion of the source of the sound waves and
is governed solely by the properties of the medium. This suggestion is confirmed by
experiment and is reasonable when one considers that only the air molecules in the
proximity of the source make contact with it, all others depending for their excitation
on somewhat-ordered collisions with their neighbors.
The fact that sound waves have a velocity controlled only by the medium and inde-
pendent of the motion of the source can be used to explain the Doppler effect. This
effect is familiar through the common example of an approaching locomotive. As shown
in Figure 1.10, at an instant when the diaphragm of the locomotive's horn is in its most
forward position, the air adjacent to the diaphragm suffers a compression, and this
compression travels forward at a velocity $c_s$. If $\tau$ is the period of oscillation of the
diaphragm, then $\tau$ seconds later the next compression of air is about to be launched
from the horn. At this moment, the earlier compression is a distance $\lambda = (c_s - v)\tau$ in
front of the horn, with $v$ the speed of the locomotive. $\lambda$ is the separation between points
in the wave train representing positions of successive maximum compression and is
thus the wavelength. The frequency of the sound wave is therefore
$$\nu = \frac{c_s}{\lambda} = \frac{c_s}{c_s - v}\,\nu_0 \tag{1.13}$$
in which $\nu_0 = 1/\tau$ is the frequency the sound wave would have if the locomotive were
at rest ($\nu_0$ is also the frequency of oscillation of the diaphragm). Equation (1.13) has
been amply confirmed by experiment.
Thus the motion of the source of a sound wave affects both its frequency and wavelength but in such a way that their product remains constant at the value $c_s$ given by
(1.10).
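The Doppler relation (1.13) can be checked with a few lines of Python. This is a minimal sketch for an approaching source only; the function name and the sample numbers (a 440 Hz horn approaching at 34 m/s in air with a sound speed of 340 m/s) are assumptions chosen for illustration.

```python
def doppler_approaching(nu0, v, c_s):
    """Wavelength and frequency ahead of a source approaching at speed v.

    Follows the argument leading to (1.13): during one period tau = 1/nu0
    the compression travels c_s*tau while the horn advances v*tau.
    """
    if not 0 <= v < c_s:
        raise ValueError("this sketch assumes 0 <= v < c_s")
    tau = 1.0 / nu0                  # period of the diaphragm
    wavelength = (c_s - v) * tau     # spacing of successive compressions
    nu = c_s / wavelength            # frequency of the sound wave, (1.13)
    return wavelength, nu

wavelength, nu = doppler_approaching(nu0=440.0, v=34.0, c_s=340.0)
print(f"wavelength = {wavelength:.4f} m, frequency = {nu:.2f} Hz")
print(f"product    = {wavelength * nu:.1f} m/s")
```

The product of wavelength and frequency returns the assumed value of the sound speed whatever the source speed, which is the point made in the paragraph above.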
Acoustic Power. The rate at which energy is being transmitted by the sound wave,
per unit transverse area of the wavefront, is called the intensity, and will be denoted
by T. Consider a column of air of unit cross section, extending to infinity from the layer
of molecules whose average position is $x$. The net force on this column is $p_1(x,t)$ and
† This approximation becomes better as the air is rarefied.

[Figure 1.10: a horn moving at velocity $v$, shown with its diaphragm in the forward position and again one period $\tau$ later; the sound disturbance moves at velocity $c_s$, so the horn advances $v\tau$ and the earlier compression is $(c_s - v)\tau$ ahead of it.]
FIGURE 1.10 The Doppler effect in sound waves.

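during a time interval $dt$ the column is compressed an amount $(\partial\xi/\partial t)\,dt$ so that the
work done on the column during this interval is $p_1(\partial\xi/\partial t)\,dt$. With the aid of (1.7), the
rate of energy flow into the column can thus be written
$$T = p_1\,\frac{\partial \xi}{\partial t} = -\gamma p_0\,\frac{\partial \xi}{\partial x}\frac{\partial \xi}{\partial t} \tag{1.14}$$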

For a simple harmonic wave traveling in the positive $X$ direction one can write
$$\xi = A \cos \frac{2\pi}{\lambda}(x - c_s t) \tag{1.15}$$
which is a special case of (1.11). In this equation, $A$ is a constant (the amplitude of
molecule oscillation) and $\lambda$ is the wavelength of the sound disturbance. Since $c_s = \lambda\nu$,
introducing the wave number $k = 2\pi/\lambda$ and the angular frequency $\omega = 2\pi\nu$ enables
one to rewrite Equation (1.15) in the form
$$\xi = A \cos(\omega t - kx) \tag{1.16}$$

Substitution of (1.16) into (1.14) gives
$$T = \rho_0 c_s \omega^2 A^2 \sin^2(\omega t - kx) \tag{1.17}$$
At any cross section the time average flow is therefore
$$\langle T \rangle = \tfrac{1}{2}\,\rho_0 c_s \omega^2 A^2 \tag{1.18}$$

Equation (1.18) reveals that, if the air is increasingly rarefied, the intensity of a
sound wave diminishes. This occurs because the density $\rho_0$ decreases, whereas, if the
temperature remains constant, $c_s$ is unaffected (cf. Equation (1.12)); the amplitude of
molecule oscillation A is limited by the finite amplitude of oscillation of the source. In
the limit, with no molecules to transfer the oscillations to their neighbors, no acoustic
power can be transmitted, and the sound wave ceases to exist.
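The dependence of intensity on density can be illustrated numerically. The sketch below averages the instantaneous flow (1.17) over one period, which should give half the peak value because the mean of the squared sine is one half, and then repeats the calculation at half the density. The function names and all numerical values are assumptions made for illustration.

```python
import math

def instantaneous_intensity(rho0, c_s, omega, amplitude, x, t):
    """Equation (1.17): rho0 * c_s * omega^2 * A^2 * sin^2(omega*t - k*x)."""
    k = omega / c_s                      # wave number, k = 2*pi/lambda
    return rho0 * c_s * omega**2 * amplitude**2 * math.sin(omega * t - k * x) ** 2

def time_averaged_intensity(rho0, c_s, omega, amplitude, x=0.0, samples=1000):
    """Average of (1.17) over one period, sampled at equally spaced midpoints."""
    period = 2.0 * math.pi / omega
    total = 0.0
    for i in range(samples):
        t = (i + 0.5) * period / samples
        total += instantaneous_intensity(rho0, c_s, omega, amplitude, x, t)
    return total / samples

# Assumed illustrative values: a 440 Hz tone in sea-level air, 1 micron amplitude.
RHO0, C_S, OMEGA, A = 1.225, 340.0, 2.0 * math.pi * 440.0, 1.0e-6

average = time_averaged_intensity(RHO0, C_S, OMEGA, A)
half_peak = 0.5 * RHO0 * C_S * OMEGA**2 * A**2
rarefied = time_averaged_intensity(0.5 * RHO0, C_S, OMEGA, A)

print(f"numerical time average:   {average:.6e} W/m^2")
print(f"half the peak of (1.17):  {half_peak:.6e} W/m^2")
print(f"average at half density:  {rarefied:.6e} W/m^2")
```

With the temperature, and hence the sound speed, held fixed, halving the density halves the transmitted power for the same source amplitude.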
This discussion can be summarized by saying that sound waves cannot exist without
the presence of a tangible medium, but that they are characterized by a wave velocity
which depends on the properties of the medium but not on the motion of the source.
These remarks are equally true of water waves, elastic waves in solids, etc.
Comparison. Does light share these characteristics? With respect to the require-
ment of a tangible medium, the answer is no. Light can propagate in gaseous, liquid,
and solid media, but it does not require the presence of these media to exist. Indeed, it
can propagate in the almost complete vacuum which separates the stars from each
other, and many times has been shown to traverse man-made vacua with an intensity
no less than it had when air was present. For example, Michelson's last experiments on
the determination of the speed of light were performed in a huge evacuated tunnel. In
this respect light† as a wave phenomenon is unique in not requiring a tangible medium
for its existence.
Does light share the second characteristic, that is, does it possess a wave velocity
which is independent of the motion of the source? An indication that it does was pro-
vided when Maxwell discovered that wavelike solutions to his equations described
electromagnetic fields which would propagate through space at the velocity of light,
leading him to assert that light is an electromagnetic phenomenon. But the equation he
used to obtain these wavelike solutions was similar to (1.9), the wave equation for
sound. Thus just as in the case of acoustic disturbances, Maxwell's analysis suggested
that the velocity of light should be completely independent of its source.
There is also strong experimental evidence to support this view. W. de Sitter⁴⁵ has
analyzed with great care the dynamics of eclipsing binary stars. Were the velocity of
light dependent on the motion of the source, it is apparent that the time for light to
reach the earth from the approaching star of a binary would be different than the time
for light to reach the earth from the receding star. de Sitter deduced that this would
introduce apparent eccentricities in their orbits as they circled each other, but such
eccentricities have never been observed. Some binary stars are at such a distance from
the earth and have sufficiently high orbital velocities that this effect could scarcely
escape observation. Because of this evidence the postulate will be accepted that light,
in common with all other wave phenomena, has a velocity which does not depend on the
motion of the source. (Many successful Doppler radar systems have been built under
this assumption.)
The Ether. It has been noted earlier, in Section 1.1, that light was not really ac-
cepted as being wavelike in nature until the middle of the nineteenth century. By that
time many other wave phenomena were well understood. Since these other wave
phenomena all required a medium for transmission, it was natural to believe that light
† The term "light" is used here in the broad sense to include the nonvisible portions of the
electromagnetic spectrum.
45 W. de Sitter, "An Astronomical Argument for the Constancy of the Velocity of Light," Z Phys, 14,
429; May 15, 1913.



did also, even after it was appreciated that light could propagate in a vacuum. Thus an
intangible medium was hypothesized to provide the support for light waves. The ether,
as this medium was called, being intangible, was endowed with extraordinary properties not shared by any other known medium. These included the ability to pass through
all substances without frictional resistance and the property of being massless and thus
unaffected by gravitation. Despite the mystical aspects of this hypothesis, most nine-
teenth-century scientists firmly believed in the existence of the ether and many serious
scientific experiments were undertaken to prove the validity of the ether concept. The
quest for the ether served to sharpen a dilemma concerned with the velocity of light, a
subject which will be explored in Chapter 2.

REFERENCES

1. Cohen, I. B., Roemer and the First Determination of the Velocity of Light, The Burndy Library,
Inc., New York, 1944.
2. Drude, P., The Theory of Optics, translation by Mann and Millikan, Longmans, Green and
Company, London, 1917.
3. Jaffe, B., Michelson and the Speed of Light, Anchor Books, Doubleday and Company, Inc.,
New York, 1960.
4. Morse, P. M., Vibration and Sound, 2nd ed., Chap. 6, McGraw-Hill Book Company, New
York, 1948.
5. Preston, T., The Theory of Light, 5th ed., Macmillan and Company, Ltd., London, 1928.
6. Reymond, A., History of the Sciences in Greco-Roman Antiquity, Methuen and Company,
Ltd., London, 1927.
7. Richtmyer, F. K., E. H. Kennard, and T. Lauritsen, Introduction to Modern Physics, 5th ed.,
Chaps. 1 and 2, McGraw-Hill Book Company, New York, 1955.
8. Whittaker, E., A History of the Theories of the Aether and Electricity, Thomas Nelson and
Sons, Ltd., London, 1951.
9. Williams, H. S., A History of Science, Vol. 1 and 2, Harper and Brothers, New York, 1904.
CHAPTER 2
The Special Theory of Relativity
RELATIVITY THEORY is usually divided into two categories, the special or "restricted"
theory, and the general theory. The special theory is concerned with phenomena as they
appear to different observers who have a constant velocity relative to each other. The
general theory removes this restriction and considers phenomena as they appear to
different observers who are in arbitrary relative motion. As one would expect, the
general theory is considerably more difficult. Only the special theory will be needed as a
foundation for the electromagnetics to be developed in the remaining chapters of this
text.
The concepts underlying the special theory of relativity are sometimes puzzling on
first consideration because they lead to predictions about space, time, and matter which
are contrary to common experience. However, once these concepts are grasped, and
it is recognized that common experience need not be rejected (because it consists of
phenomena in which relativistic effects are too small to be detected), the path is opened
to an understanding of important new relationships. Fortunately, the mathematical
tools required to comprehend the special theory do not extend beyond algebra and
some elementary calculus, so that in approaching this subject much of one's attention
can be concentrated on the concepts themselves.
It is difficult to appreciate fully the need for the special theory of relativity and its
accomplishments without first recognizing the impasse in physics which it solved.
For this reason an essentially dual chronological presentation of subject matter will be
followed in this chapter. In the first (or classical) chronology, the principle of relativity
is introduced and then its consequences in terms of classical mechanics are considered.
In order to be consistent with this principle, Newton's Law of Inertia is shown to
require the Galilean transformation as the connection between different inertial coordinate systems. This development requires the assumptions that distance intervals
and time intervals are invariants. When the additional assumption is made that mass
is an invariant, Newton's general force law also is seen to transform properly via the
Galilean equations. A by-product of this proof is the familiar classical law of velocity
transformation. Application of this velocity law to the case of sound waves yields a
result in agreement with observation; however, when this law is applied to the velocity of light, such agreement with observation is lacking, thus posing a fundamental
dilemma. This disagreement between classical prediction and observation is discussed
in terms of the Fizeau experiment involving light propagation in moving water, and the
Michelson-Morley ether drift experiment, the null result of which raises questions about
the existence of a light medium.

After various classical explanations of this dilemma are considered and rejected as
unsatisfactory, the second chronology† begins with a reexamination of the fundamental definitions of space and time. Einstein's two postulates of special relativity
are then used as the basis for a resolution of the impasse and a derivation of the Lorentz
transformation. This transformation is seen to be consistent with the principle of
relativity in the case of light velocity and provides a convincing explanation of the
Fizeau experiment. The view of Einstein that the concept of a luminiferous ether is
superfluous automatically explains the null result of ether drift experiments such as
those of Michelson and Morley and the more recent test by Cedarholm and Townes
using masers.
Application of the Lorentz equations to the transformation of the laws of mechanics
is found to have several significant consequences. These include the dependence of
length on motion, time dilatation, variation of mass, and the equivalence of mass and
energy. These effects combine to yield transformation laws for mass and force. The
latter is used in Chapter 4 to derive all the results of magnetostatics via a relativistic
transformation of Coulomb's electric force law.
A second transformation is used in Chapter 5 to derive Maxwell's equations for the
case of a source system consisting of steady currents and charges, as seen by one
observer, with respect to whom a second observer is in constant translational motion.
The second observer detects time-varying electromagnetic fields due to sources of
a restricted class; upon superimposing a set of such fields, one can establish Max-
well's equations for the general case of accelerated sources. In this manner all the
basic relations of electromagnetics are derived by using the special theory of relativity
to enlarge upon the single experimental postulate of Coulomb's law, without the need
to invoke the general theory.

2.1 * HISTORICAL SURVEY

Bradley's discovery of stellar aberration in 1728 has already been recounted in the
previous chapter. At that time the corpuscular theory of light held sway, and thus
Bradley's explanation of the effect was based on the mechanistic law of addition of
velocities. A century later, when the wave theory of light had been revived successfully
by Young and his followers, the need existed to reexamine all optical phenomena,
including aberration, on a wave basis. The concept of a luminiferous ether through
which light propagates became a natural part of the wave theory, in analogy with all
other known wavelike disturbances, each of which requires a medium for transmission.
Thus it was Young himself who employed the ether concept to explain Bradley's
discovery of aberration in terms of a wave picture. In addressing the Royal Society in
1803 he remarked that¹

† The two chronologies overlap because Einstein's explanation of the dilemma, though offered in 1905,
did not gain universal acceptance immediately; many classical attempts at an explanation were still
to be forthcoming for several decades.
* This section may be omitted without loss in continuity of the technical presentation.
1 T. Young, Miscellaneous Works, edited by George Peacock, Vol. 1, p. 188, John Murray Publishers,
London, 1855.
Upon considering the phenomena of the aberration of the stars, I am disposed to believe
that the luminiferous ether pervades the substance of all material bodies with little or no
resistance, as freely perhaps as the wind passes through a grove of trees.

In this conception the earth glides through the ether, and the light from a distant
star is unaffected by the earth's motion. Thus the light waves, during the interval of
time they traverse the tube of a telescope, suffer a displacement equal to the displace-
ment of the earth through the ether in the same time interval. This displacement can be
compensated for by making a small angular correction in the position of the telescope,
thus accounting for aberration.
The notion of an ether which pervaded the entire universe, being everywhere at
rest in some particular frame of reference, gained favor for the additional reason that it
lent support to the idea of an absolute frame of reference, with respect to which the
absolute position and velocity of all bodies could be specified.
Young enlarged on this ether concept with the suggestion that²

For explaining the phenomena of partial and total reflection, refraction, and inflection,
nothing more is necessary than to suppose all refracting media to retain, by their attraction,
a greater or less quantity of the luminous ether, so as to make its density greater than that
which it possesses in a vacuum, without increasing its elasticity.

Fresnel put this idea into a precise form by postulating that the density of ether in any
body is proportional to the square of its refractive index. The excess of ether density
over that in vacuo was assumed to be dragged along with the body, the remainder
staying at rest as part of a uniform background ether. With this model Fresnel was
able to derive an expression for the velocity of light $v$ in a moving body, namely,
$$v = v_0 + \left(1 - \frac{1}{n^2}\right)u$$
in which $n$ is the refractive index of the body, $u$ is the velocity of the body with respect to the ether and $v_0$ is the velocity
light would have in the body if the body were stationary in the ether; all velocities are
measured with respect to the frame of reference in which the ether is at rest.
With this formula, Fresnel was successful in explaining refraction effects under the
wave theory, for bodies in motion as well as at rest with respect to the ether. His
theory was consistent with Arago's result that the apparent refraction in a moving
prism is equal to the absolute refraction in a stationary prism, and it further predicted
that if observations were made with a water-filled telescope, the aberration would be
unaffected by the presence of the water. This prediction was verified by Airy in 1871.
Fizeau, in a significant experiment, passed light through tubes of moving water, and
used an interference technique to substantiate the above Fresnel formula, this being
done in 1851.
Maxwell, who possessed a physical imagination akin to that of Faraday, firmly
believed in the existence of an ether. In the classic paper³ which introduced his theory
of the electromagnetic field, one finds the passage
2 Ibid., p. 80.
3 J. C. Maxwell, "A Dynamical Theory of the Electromagnetic Field," Phil Trans Roy Soc (London),
155, 450; 1865. (See also J. C. Maxwell Scientific Papers, Vol. 1, pp. 526-597, Dover Publications,
Inc., New York.)

It appears therefore that certain phenomena in electricity and magnetism lead to the
same conclusion as those of optics, namely, that there is an aethereal medium pervading all
bodies, and modified only in degree by their presence; that the parts of this medium are
capable of being set in motion by electric currents and magnets; that this motion is communicated from one part of the medium to another by forces arising from the connexions
of those parts; that under the action of these forces there is a certain yielding depending
on the elasticity of these connexions; and that therefore energy in two different forms may
exist in the medium, the one form being the actual energy of motion of its parts, and the
other being the potential energy stored up in the connexions, in virtue of their elasticity.

Several years earlier Maxwell had devised a mechanical conception of the electromagnetic field and had been led by analogy to the conclusion that electromagnetic
waves are propagated at the velocity of light. He therefore felt that light was an electro-
magnetic disturbance and made the assertion

We can scarcely avoid the inference that light consists in the transverse undulations of
the same medium which is the cause of electric and magnetic phenomena.

Thus an answer was provided to speculation as to whether or not several ethers existed
for the separate support of light, heat and electricity.
Interest in the detection of this luminiferous ether grew, but it was several decades
before an experiment of sensitivity sufficient to be definitive was performed. In 1881
Michelson invented an interferometer capable of measuring second-order effects in the
assumed velocity of the earth relative to the ether. His technique for determining
ether drift was analogous to the detection of a river current through comparison of
the round trip times of rowers who follow courses parallel to and perpendicular to the
flow. This first experiment gave a null result but the sensitivity was marginal, so the
apparatus was improved and the experiment repeated by Michelson and Morley in
1887. A null result was again obtained; it was as though the earth were at rest in the
ether.
This experiment caught the attention of the Dutch physicist H. A. Lorentz (1853-
1928), who became convinced that the null result was a real effect and sought a reason
to explain it. In 1892 he hypothesized that a material body suffers a contraction in its
longitudinal dimension, due to its motion through the ether, just sufficient to prevent
the ether's detection with the Michelson interferometer. This same explanation had
been put forth verbally by G. F. FitzGerald (1851-1901) several years earlier and is
often referred to as the Lorentz-FitzGerald contraction hypothesis.
Lorentz attempted to develop a complete electron theory which would explain this
contraction in terms of a readjustment of electrical forces between molecules, due to
absolute motion through the ether. In a succession of papers he ultimately formulated
a theory in which Maxwell's equations would transform from one set of variables to
another without a change in form.⁴ The two sets of variables were related to each other
by what have come to be known as the Lorentz transformation equations; the representation is that of the connection between two coordinate systems in different states
of constant motion through the ether. To obtain this transformation Lorentz assumed
that spherical electrons were flattened into ellipsoids due to their motion through the
4 H. A. Lorentz, "Electromagnetic Phenomena in a System Moving With Any Velocity Less Than
That of Light," Proc Amst Acad, 6, 809; 1904. (Reprinted in English in The Principle of Relativity,
pp. 11-34, Dover Publications, New York.)

ether and introduced what he called "local time" in one frame of reference which
depended on both time and distance in the other frame. The physical meaning of this
local time was not elaborated.
In 1932 R. J. Kennedy devised an ingenious modification of the Michelson-Morley
experiment which showed the Lorentz-FitzGerald contraction hypothesis to be un-
tenable. Meanwhile, the intervening years had seen a series of repetitions of the original
Michelson experiment by a number of investigators. Though the sensitivity and accu-
racy increased, no change from the null result was noted and a variety of explanations
based on an ether theory proved unsatisfactory.
Concomitantly, a new approach to the problem had been evolving. In 1900 while
addressing the International Congress of Physics at Paris, Poincare reviewed the impli-
cations raised by the null result of the Michelson experiment and asked, "Our ether-
does it really exist? I do not believe that more precise observations ever could reveal
anything more than relative displacements." Poincare became convinced that it was
impossible to determine the earth's absolute motion (that is, its velocity through the
ether), and embraced this belief in the enunciation of a Principle of Relativity. Speaking in St. Louis in 1904, he said⁵
According to the Principle of Relativity, the laws of physical phenomena must be the
same for a "fixed" observer as for an observer who has a uniform motion of translation
relative to him: so that we have not, and cannot possibly have, any means of discerning
whether we are, or are not, carried along in such a motion.
In 1905 Einstein made a complete break with the ether concept, discarding it as
superfluous. He adopted the principle of relativity as a postulate and added as another
that light is always propagated in empty space with a definite velocity c which is inde-
pendent of the state of motion of the emitting body. Upon careful reexamination of the
concepts of the measurements of space and time, he concluded that neither was an
invariant. In satisfying his second postulate, Einstein was led to the same transfor-
mation equations derived earlier by Lorentz. However, the derivation was on an en-
tirely different basis, and one which has stood the test of time.
The noninvariance of spatial and temporal intervals has caused a major reinterpre-
tation of the concepts of mechanics. In 1906 Max Planck determined the modifications
which would be needed in the Newtonian equations of motion to place them in accord
with the new relativity theory, and then developed expressions for the kinetic energy
and momentum of a material particle. It was recognized that the concept of mass as
an invariant must also be abandoned. This variability of mass was clearly illustrated
by G. N. Lewis and R. C. Tolman in 1909, when they considered the collision of two
similar balls as viewed from different coordinate systems, and found that either momentum was not conserved or mass depended on speed. The relativistic expression for
kinetic energy led Einstein, Lewis, and others to suggest that energy and mass were
related by the now-celebrated equation $E = mc^2$. Transformation laws based on the
Lorentz equations were worked out for velocity, mass, and force, from which emerged
the result that the velocity of light is the upper limit for motion of matter and energy.
Experimental evidence in support of Einstein's special theory of relativity is positive
and abundant. With the advent of atomic clocks, greatly increased precision in the
measurement of time intervals has made possible a variety of terrestrial experiments
5 An English translation of this address by G. B. Halstead can be found in The Monist, January 1905.

which verify all the major predictions of the theory, including the dependence on speed
of distance, time, and mass. A variety of nuclear processes has confirmed the relation
$E = mc^2$. Much of this evidence will be presented in the sections to follow, together
with the principal developments of the theory which have been enumerated above.

2.2 THE PRINCIPLE OF RELATIVITY AND ITS CLASSICAL IMPLICATIONS

The principle of relativity in science is an old idea whose origins are difficult to trace.
Simply stated, it expresses the belief that all the laws of nature should operate in the
same manner everywhere in the universe.⁶ This idea was given specific articulation by
Poincare at a meeting of the International Congress of Physics at Paris in 1900 and
was raised to the status of a formal postulate by Einstein in 1905. Despite the apparent
simplicity and self-evident logic of this principle, it has deep-seated consequences.
Consider first the implications of the relativity principle with respect to Newton's
laws of motion. The First Law, or Law of Inertia, states: A body at rest or in uniform
motion will remain at rest or in uniform motion unless some external force is applied to it.
Implicit in this law is the notion of an observer who can determine that the motion
of the body is unaccelerated. But not all observers will make such an observation, for if
two observers are in accelerated motion with respect to each other, they cannot both
perceive the body to have an unaccelerated motion. Thus the Law of Inertia as stated
above is not applicable for all observers in all coordinate systems. Those systems in
which it is applicable are said to be inertial systems.

Let XYZ be a Cartesian coordinate system in which the Law of Inertia is valid.
By this one means that an observer $O$ who is stationary in XYZ will determine that
any body which is removed from interaction with all other bodies will be at rest or
traveling in a straight line at constant speed. In the Newtonian conception of space,
$O$ can imagine the X, Y, and Z axes extending as straight lines in three perpendicular
directions to the limits of the universe and can imagine a one-to-one correspondence
between the points in physical space and the triplets of numbers $(x,y,z)$. As seen by $O$,
the instantaneous position of a particle can be described by its three coordinate variables $x(t)$, $y(t)$, $z(t)$. If this particle is force-free, $O$ can write
$$\ddot{x} = 0 \qquad \ddot{y} = 0 \qquad \ddot{z} = 0 \tag{2.1}$$
in which each dot signifies a time derivative. Integration once with respect to time gives
$$\dot{x} = v_x \qquad \dot{y} = v_y \qquad \dot{z} = v_z \tag{2.2}$$

with $\mathbf{v} = \mathbf{1}_x v_x + \mathbf{1}_y v_y + \mathbf{1}_z v_z$ the constant straight-line velocity which $O$ observes the
particle to have, in conformance with the First Law.†
But clearly the frame of reference XYZ is not unique in the sense of being the
only inertial system, and one can readily imagine another observer $O'$, at rest in a
different coordinate system X'Y'Z', for whom the same body also seems unaccelerated.
Since $O$ and $O'$ are observing the same phenomenon, they should be able to deduce
each other's measurements from a knowledge of the relative position and motion of
† In this text, unit vectors will be designated by the symbols $\mathbf{1}_x$, $\mathbf{1}_r$, $\mathbf{1}_\phi$, etc. See the Mathematical
Supplement.
6 Anaxagoras (c. 500-430 B.C.) apparently held this belief. See D. E. Gershenson and D. A. Greenberg,
Anaxagoras and the Birth of the Scientific Method, Blaisdell Publishing Company, New York, 1964.

their two frames of reference. Stated differently, if observer $O$ knows the triplet $(x,y,z)$
which establishes the instantaneous position of the particle as determined by himself,
he should be able to deduce the corresponding triplet $(x',y',z')$ and thus know the
instantaneous position of the particle as seen by $O'$.
This connection is accomplished through the coordinate transformation equations†

$$x' = g_1(x,y,z,t) \qquad y' = g_2(x,y,z,t) \qquad z' = g_3(x,y,z,t) \tag{2.3}$$

Several restrictions can be invoked to determine a specific form of this transformation.


First, if it is assumed that $O$ and $O'$ agree in their measurement of distance, then the
functions $g_1$, $g_2$, and $g_3$ must be linear and commensurate in the spatial variables.
Second, if $O$ and $O'$ are both to observe the particle to have a straight line trajectory,
only a translational motion of X'Y'Z' with respect to XYZ is permitted; rotational
motion is excluded. Third, if it is assumed that $O$ and $O'$ agree in their measurement
of time intervals, then the functions $g_1$, $g_2$, and $g_3$ must also be linear in the temporal
variable, for otherwise one would not obtain $\ddot{x}' = \ddot{y}' = \ddot{z}' = 0$ when $\ddot{x} = \ddot{y} = \ddot{z} = 0$.
With these restrictions, the most general suitable solution of (2.3) is the Galilean
transformation discussed in the Mathematical Supplement (Example V.16); namely,

$$\begin{aligned}
x' &= (x - x_0 - u_x t)\cos xx' + (y - y_0 - u_y t)\cos yx' + (z - z_0 - u_z t)\cos zx' \\
y' &= (x - x_0 - u_x t)\cos xy' + (y - y_0 - u_y t)\cos yy' + (z - z_0 - u_z t)\cos zy' \\
z' &= (x - x_0 - u_x t)\cos xz' + (y - y_0 - u_y t)\cos yz' + (z - z_0 - u_z t)\cos zz'
\end{aligned} \tag{2.4}$$

In (2.4), $\mathbf{u} = \mathbf{1}_x u_x + \mathbf{1}_y u_y + \mathbf{1}_z u_z$ is the constant velocity of $O'$ with respect to $O$;
$\cos xx'$, $\cos yx'$, etc., are the cosines of the constant angles between the X and X' axes,
the Y and X' axes, etc.; $(x_0,y_0,z_0)$ is the position of the origin of the X'Y'Z' system as
seen by $O$ at $t = 0$.
Equations (2.4) are known as the most general Galilean transformation, Their physi-
cal interpretation is that the primed system is in translative motion relative to the
unprirned system at a speed U = (u; + u~ + u;)}~. This motion is in an arbitrary direc-
tion with respect to the XYZ axes. Furthermore, the X'Y'Z' axes are tilted arbitrarily
relative to the XYZ axes and the primed origin is in an arbitrary position relative to
the unprimed origin at t = O.
I t is a simple matter to show that X cwtou's First Law applies in one of these sys-
terns if it applies in the other. If x(t), Yet), and z(t) are the time-varying coordinates
of a particle as seen by 0, then differentiation of (2.4) gives
dx'
dt
(x - u x ) cos xx' + (iJ - u y ) cos yx' + (z - u z ) cos zx'

dy'
(x - u x ) cos xy' + (iJ - u y ) cos yy' + (z - u z ) cos zy' (2.5)
dt
dz'
dt = (x - u x ) cos xz' + (iJ - Uy) cos yz' + (z - u z ) cos zz'

Since it has been assumed that observers 0 and ()' agree in their measurement of ti me
intervals, so that it is proper to write dt' = dt, it follows that the left sides of Equa-
tions (2.5) are the instantaneous velocity components of the particle as seen by 0'.
† See the Mathematical Supplement, Sec. V.II.
SECTION 2 The Principle of Relativity and Its Classical Implications 43

Under this assumption, Equations (2.5) are known as the velocity transformation
equations, and one additional differentiation gives the acceleration transformation,
namely,

$$\begin{aligned}
\ddot{x}' &= \ddot{x}\cos xx' + \ddot{y}\cos yx' + \ddot{z}\cos zx' \\
\ddot{y}' &= \ddot{x}\cos xy' + \ddot{y}\cos yy' + \ddot{z}\cos zy' \\
\ddot{z}' &= \ddot{x}\cos xz' + \ddot{y}\cos yz' + \ddot{z}\cos zz'
\end{aligned} \tag{2.6}$$

Thus if O is observing an unaccelerated particle, so that $\ddot{x} = \ddot{y} = \ddot{z} = 0$, then Equations
(2.6) give $\ddot{x}' = \ddot{y}' = \ddot{z}' = 0$, indicating that the particle also appears unaccelerated
to observer O'.
Cartesian coordinate systems linked by a Galilean transformation have several other
important properties. One of these is the invariance of distance. Suppose that $(x_2,y_2,z_2)$
are the coordinates of one particle and $(x_1,y_1,z_1)$ are the coordinates of another particle
at a common time t, as noted by observer O, who is stationary in XYZ. O then says
that the instantaneous distance separating the two particles is

$$d = [(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2]^{1/2} \tag{2.7}$$

Similarly, observer O', who is stationary in X'Y'Z', finds that the instantaneous positions
are $(x_2',y_2',z_2')$ and $(x_1',y_1',z_1')$ and concludes that the distance of separation is

$$d' = [(x_2' - x_1')^2 + (y_2' - y_1')^2 + (z_2' - z_1')^2]^{1/2} \tag{2.8}$$

If the two coordinate systems are connected by a Galilean transformation, Equations
(2.4) can be used to deduce that

$$\begin{aligned}
x_2' - x_1' &= (x_2 - x_1)\cos xx' + (y_2 - y_1)\cos yx' + (z_2 - z_1)\cos zx' \\
y_2' - y_1' &= (x_2 - x_1)\cos xy' + (y_2 - y_1)\cos yy' + (z_2 - z_1)\cos zy' \\
z_2' - z_1' &= (x_2 - x_1)\cos xz' + (y_2 - y_1)\cos yz' + (z_2 - z_1)\cos zz'
\end{aligned} \tag{2.9}$$

If one substitutes (2.9) in (2.8), recognizing that terms of the type $\cos^2 xx' + \cos^2 xy' + \cos^2 xz'$
are unity, whereas terms of the type $\cos xx' \cos yx' + \cos xy' \cos yy' + \cos xz' \cos yz'$
are zero,† it becomes apparent that

$$d' = d \tag{2.10}$$

Therefore the Galilean transformation leaves distance an invariant.
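The invariance of distance can also be checked numerically. The following Python sketch is an illustration, not part of the original development: it applies a transformation of the form (2.4) to two points at a common instant and compares their separation in the two frames. The function names, the use of a rotation about the Z axis to generate a valid set of direction cosines, and every numerical value are assumptions chosen for the example.

```python
import math

def direction_cosines_about_z(angle):
    """Direction cosines for primed axes rotated about Z by `angle` radians.

    Row k lists the cosines of the angles between the k-th primed axis
    and the X, Y, Z axes (an assumed, deliberately simple choice of tilt).
    """
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def galilean_transform(point, t, cosines, origin, velocity):
    """Form (x - x0 - ux*t, ...) and project it on the primed axes, as in (2.4)."""
    shifted = [p - p0 - v * t for p, p0, v in zip(point, origin, velocity)]
    return [sum(row[j] * shifted[j] for j in range(3)) for row in cosines]

def separation(a, b):
    """Euclidean distance between two points, as in (2.7) and (2.8)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Hypothetical values chosen only for illustration.
cosines = direction_cosines_about_z(math.radians(30.0))
origin = (1.0, -2.0, 0.5)      # (x0, y0, z0)
velocity = (3.0, 4.0, -1.0)    # (ux, uy, uz)
t = 2.5                        # common instant for both particles

p1 = (0.0, 1.0, 2.0)
p2 = (4.0, -3.0, 7.0)
p1_primed = galilean_transform(p1, t, cosines, origin, velocity)
p2_primed = galilean_transform(p2, t, cosines, origin, velocity)

d = separation(p1, p2)
d_primed = separation(p1_primed, p2_primed)
print("d  =", d)
print("d' =", d_primed)
```

The two printed distances should agree to within floating-point rounding, whatever origin offset, velocity, or common time is chosen, because the offset and the velocity term cancel in the coordinate differences.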


† The term $\cos^2 xx' + \cos^2 xy' + \cos^2 xz'$ is seen to be a unit vector parallel to the X axis,
resolved into components along the primed axes, and dotted with itself. The term $\cos xx' \cos yx' + \cos xy' \cos yy' + \cos xz' \cos yz'$
is seen to be the dot product of two perpendicular unit vectors, one
parallel to the X axis, the other parallel to the Y axis, both resolved into their primed components.
‡ If the form of a law is unchanged by a certain coordinate transformation, that is, if the law has the
same functional form in terms of either set of coordinates, the law is said to be covariant with respect
to the transformation considered.

This invariance of distance permits a simple proof of the most important property
of a Galilean transformation: the fact that Newton's general force law is invariant
(actually covariant‡) under such a transformation. It has been shown above that if
the First Law (concerning unaccelerated bodies) is valid in the unprimed system, a
Galilean transformation renders it valid in the primed system as well. But for accelerated
bodies, if f = ma is a valid relation in XYZ, and it is transformed using (2.4),
does one obtain f' = m'a'? To see that under suitable assumptions this does occur,
consider the result of Example V.22 in the Mathematical Supplement. In that example,
a mass m experiences gravitational forces due to an assemblage of other masses
$m_1, \ldots, m_N$. The total force on m is found to be expressible as the negative of the
gradient of the scalar potential function

$$\Phi(x,y,z,x_1,y_1,z_1,\ldots,t) = -\sum_{i=1}^{N} G\,\frac{m m_i}{r_i} \tag{2.11}$$

in which G is the universal gravitational constant and

$$r_i = [(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2]^{1/2}$$

is the instantaneous distance between m and $m_i$. Through use of (2.11), Newton's force
law for the case of mass particles can be written in the form

$$m\mathbf{a} = \nabla \sum_{i=1}^{N} G\,\frac{m m_i}{r_i} = -\nabla\Phi \tag{2.12}$$

Because distance is an invariant, and because of the form of (2.11), if it is assumed that
mass is also an invariant, then

$$\Phi(x',y',z',x_1',y_1',z_1',\ldots,t) = \Phi(x,y,z,x_1,y_1,z_1,\ldots,t) \tag{2.13}$$
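In other words, for a given set of relative positions of the masses, observers O and O'
will agree on the value of the potential function. Formation of partial derivatives
of (2.13) gives

$$\begin{aligned}
\frac{\partial\Phi}{\partial x'} &= \frac{\partial\Phi}{\partial x}\cos xx' + \frac{\partial\Phi}{\partial y}\cos yx' + \frac{\partial\Phi}{\partial z}\cos zx' \\
\frac{\partial\Phi}{\partial y'} &= \frac{\partial\Phi}{\partial x}\cos xy' + \frac{\partial\Phi}{\partial y}\cos yy' + \frac{\partial\Phi}{\partial z}\cos zy' \\
\frac{\partial\Phi}{\partial z'} &= \frac{\partial\Phi}{\partial x}\cos xz' + \frac{\partial\Phi}{\partial y}\cos yz' + \frac{\partial\Phi}{\partial z}\cos zz'
\end{aligned} \tag{2.14}$$

in which it has been recognized, through differentiation of the inverse of Equations
(2.4), that $\partial x/\partial x' = \cos xx'$, $\partial y/\partial x' = \cos yx'$, etc.
Substitution of the three components of (2.12) into (2.14) gives

$$\begin{aligned}
-\frac{\partial\Phi}{\partial x'} &= m\ddot{x}\cos xx' + m\ddot{y}\cos yx' + m\ddot{z}\cos zx' \\
-\frac{\partial\Phi}{\partial y'} &= m\ddot{x}\cos xy' + m\ddot{y}\cos yy' + m\ddot{z}\cos zy' \\
-\frac{\partial\Phi}{\partial z'} &= m\ddot{x}\cos xz' + m\ddot{y}\cos yz' + m\ddot{z}\cos zz'
\end{aligned} \tag{2.15}$$

Upon comparing (2.15) with (2.6), one can conclude that

$$-\frac{\partial\Phi}{\partial x'} = m\ddot{x}' \qquad -\frac{\partial\Phi}{\partial y'} = m\ddot{y}' \qquad -\frac{\partial\Phi}{\partial z'} = m\ddot{z}'$$

and thus that

$$m\mathbf{a}' = \nabla' \sum_{i=1}^{N} G\,\frac{m m_i}{r_i} = -\nabla'\Phi \tag{2.16}$$
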

which is Newton's general force law in the same form as (2.12). Thus, under the
assumptions that time and mass are invariants (plus the consequence of the Galilean
transformation that distance is an invariant), the general Galilean transformation
leaves all of Newton's laws of mechanics for free mass particles unaltered in form. It is
for this reason that considerable importance attaches to the transformation (2.4).
Other branches of mechanics, including hydrodynamics, elasticity, and the mechanics
of rigid bodies, can be treated as extensions of the mechanics of free mass particles,
through the introduction of suitable interaction energies in the form of potential functions
whose gradients give forces. It is thus clear, without entering into a detailed
treatment of these branches of mechanics, that the laws which govern them also transform
properly via the Galilean Equations (2.4), under the same assumptions which
were made in the preceding development. Therefore the two inertial systems XYZ and
X'Y'Z' appear to be equivalent for the description of all the phenomena of mechanics.
This belief is often referred to as the Galilean principle of relativity.
One special case of the general Galilean transformation proves particularly useful.
Assume the situation of Figure 2.1 in which the primed and unprimed axes are respectively
parallel, the origins having coincided at t = 0, and in which the X and X' axes
are sliding along each other at a relative speed u.

FIGURE 2.1 Cartesian coordinate systems in constant translative relative motion.

It is seen readily that for this case (2.4) reduces to

$$x' = x - ut \qquad y' = y \qquad z' = z \tag{2.17}$$

Similarly, the velocity transformation equations (2.5) reduce to

$$\dot{x}' = \dot{x} - u \qquad \dot{y}' = \dot{y} \qquad \dot{z}' = \dot{z} \tag{2.18}$$

a result which depends on the assumption that time is an invariant. Equations (2.18)
also could have been deduced directly by a time differentiation of (2.17).

The usefulness of the transformation (2.17) extends beyond its simplicity. Imagine a
third coordinate system $X^*Y^*Z^*$ connected to the system XYZ by a static rotation and
also imagine a fourth system $X'_*Y'_*Z'_*$ connected to the system X'Y'Z' by a static rotation
plus a static translation.† Then $X'_*Y'_*Z'_*$ is moving relative to $X^*Y^*Z^*$ at a speed u.
This motion is in an arbitrary direction with respect to the $X^*Y^*Z^*$ axes and is also in
an arbitrary direction with respect to the $X'_*Y'_*Z'_*$ axes. Furthermore, the two origins
are in an arbitrary relative position at t = 0.
But this is the description of two Cartesian frames connected by the general Galilean
Equations (2.4). Therefore one can obtain the most general Galilean transformation,
connecting $X^*Y^*Z^*$ and $X'_*Y'_*Z'_*$, via two static transformations of Equations (2.17).
Since two observers, each at rest in separate Cartesian systems of coordinates (but
systems connected by a static Galilean transformation), are in complete agreement
about measurements of motion, it follows that the observations of O and $O^*$ are
equivalent, and that the observations of O' and $O'_*$ are equivalent. Any deductions
based on (2.4) are also obtainable from (2.17). For this reason no loss in generality is
suffered if, for brevity and clarity, all the remaining discussion is presented in terms
of the simple Galilean Equations (2.17).
To summarize the ideas of this section, one can say that any two inertial frames, connected
by a Galilean transformation of the type (2.17) (possibly through the intermediary
of two static transformations) are equally suitable as references in which to
express the general laws of mechanics. This conclusion requires the assumptions that
distance, time, and mass are invariants.
In the nineteenth century, mechanics was such a highly developed branch of science,
and there was such a satisfactory agreement between Newton's laws and experiment,
that mechanics enjoyed a greater confidence and trust than any other area of physical
knowledge. Since the principle of relativity seemed so logical and natural, and since
the Galilean equations transformed all the laws of mechanics in conformance with the
principle of relativity, the greatest confidence also reposed in the belief that the Galilean
transformation was correct. A test of its correctness arose with the question whether
or not all the other (nonmechanical) laws of physics also transform properly via the
Galilean equations, as the principle of relativity in its broadest sense requires. The
next several sections are concerned with this question.

2.3 APPLICATIONS OF THE CLASSICAL VELOCITY TRANSFORMATION LAW

A simple mechanical example will serve to illustrate the reasonableness of the velocity
transformation equations (2.18). Consider the case of an observer O standing beside a
highway as a sedan goes by traveling at 50 mph relative to the ground. At the same
time observer O' is in a second car which is traveling at 70 mph relative to the ground
and in the process of passing the sedan. If the XYZ system is attached to the ground
with the X axis parallel to the highway, the situation is suggested by Figure 2.2.
Observer O will say that the sedan has a velocity given by $\dot{x} = 50$ mph.
If the X'Y'Z' system is attached to the car in which O' is riding, then u = 70 mph is

† By a static rotation plus a static translation, one means that $X'_*Y'_*Z'_*$ and X'Y'Z' have no relative
motion, but their axes are tilted with respect to each other and their origins do not coincide.

the relative speed of the two coordinate systems, and (2.18) gives $\dot{x}' = 50 - 70 = -20$
mph as the speed of the sedan relative to O'. This is a result completely consistent with
common sense, and is typical of many similar applications of (2.18) which can be
encountered in everyday experience.
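The highway example can be reproduced in a few lines. This is a minimal Python sketch of the velocity transformation (2.18) for frames whose X axes slide along each other; the function name is hypothetical, and the speeds are the ones quoted in the text.

```python
def velocity_in_primed_frame(vx, vy, vz, u):
    """Apply (2.18): only the component along the relative motion changes."""
    return (vx - u, vy, vz)

# Sedan at 50 mph relative to the ground; observer O' rides at u = 70 mph.
sedan_seen_by_o_prime = velocity_in_primed_frame(50.0, 0.0, 0.0, 70.0)
print(sedan_seen_by_o_prime)  # (-20.0, 0.0, 0.0)
```

The negative sign simply records that, as seen from the faster car, the sedan recedes backward along the X' axis.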
FIGURE 2.2 Relative speed.

Next consider an observer O, stationary with respect to the average motion of the
air which surrounds him. An acoustic source is generating sound waves which pass O
at a speed $c_s$, governed solely by the properties of the air. Imagine also an observer O'
traveling at a speed u relative to O, in the direction of the wave motion. Equations
(2.18) predict that the sound waves will pass O' at a speed $c_s - u$, and experimental
observations are consistent with this prediction.
The motion of the acoustic source will be different as observed by O and O' but this
has no effect on the wave velocity (cf. Section 1.3). The reason why O and O' observe a
different value for the speed of the sound waves is that the medium is at rest with respect
to O but is in motion at a speed u relative to O'.
Finally, consider an observer O, stationary in XYZ, past whom a light wave is
propagating at speed c. If a second observer O' is traveling at a speed u relative to O
in the direction of the wave motion, then Equations (2.18) predict that the light waves
will pass O' at a speed c' = c - u. This result should be valid even without the presence
of a tangible medium, since it is known that light can propagate through a vacuum.
But in this extremity of the absence of a tangible medium, two possibilities need to be
considered:

1. There is a detectable intangible medium, call it ether, which supports the light
waves, and in which light propagates at a speed c governed by the properties of
the ether.
2. There is not a detectable ether, and a vacuum is a region to which no physical
properties can be ascribed.

Under the first possibility, the existence of a detectable ether, the situation is completely
analogous to the case of sound waves. For convenience in this discussion
observer O can be assumed to be at rest in the ether, so that the light waves do pass
him at a speed c. These light waves then pass O' at a speed c' = c - u because the
medium is in motion at speed u relative to O'. The motion of the light source does not
affect these values of c and c' because only the ether governs the velocity of propagation
(cf. Section 1.3).
Under the second possibility, the nonexistence of a detectable ether, it is illogical to
write

$$c' = c - u \neq c \tag{2.19}$$

This point can be appreciated by recognizing that O is in a vacuum, observing a light
source with some particular motion, and that this source is emitting light waves whose
velocity relative to himself O can measure. But O' is also in a vacuum, observing the
same light source with some particular motion, and this source is emitting light waves
whose velocity relative to himself O' can measure. The only difference in the situation
for observers O and O' is the motion of the source. But if the velocity of light is independent
of the motion of the source (and the experimental evidence indicates that
light does share this characteristic with sound), then O and O' should measure the same
velocity for light, and conclude that

$$c' = c \tag{2.20}$$

which is in violation of the classical law of velocity transformation.


Thus the two possibilities lead to different predictions, and one should be able to
design experiments which will test the validity of each possibility. Several such
experiments have been performed, two of which will be described in the sections to
follow. However, before a discussion of these experiments is presented, it is significant
to point out the implications of a choice between the two possibilities (1) and (2) listed
above. If the principle of relativity is applicable to all the laws of physics, and if the
Galilean transformation equations (2.17) are consistent with this principle, then since
the velocity transformation law (2.18) is a direct consequence of (2.17), the presence of
a detectable ether is required; without an ether, the relation c' = c - u is meaningless.
Alternatively, if there is not a detectable ether, then either the Galilean transformation
equations are incorrect or the principle of relativity holds for mechanics but not for
light. A decision between possibilities (1) and (2) is of fundamental importance.
When the need to make this decision was first appreciated, there was every confidence
that an ether would be detected, that the Galilean transformation was correct, and that
the principle of relativity embraced all branches of physics. The actual detection of the
ether was eagerly awaited, and there were philosophical overtones to the scientific
interest evinced in this imminent discovery. Without an ether, no single inertial frame
of reference could in any way be preferred over any other. However, if the ether could
be detected, presumably it would not consist of different portions in relative motion,
but would be everywhere at rest in one Galilean coordinate system. It would then seem
logical to take this preferred frame as the absolute reference. The instantaneous position
of every body in the universe with respect to this preferred frame could then be desig-
nated as its absolute instantaneous position.
An absolute reference frame for the entire universe had long been an appealing idea.
(Newton, for example, had believed in absolute motion, defining it as translation of a
body from one absolute place to another absolute place.) Detection of the ether was
therefore not only expected to settle an outstanding question about light, but also to
establish a means for defining absolute position and motion.

2.4 FIZEAU'S EXPERIMENT WITH MOVING WATER

In 1859, in a classic paper,7 Fizeau described an experiment he had performed to determine
the influence upon the velocity of light of the motion of the tangible medium
through which it passes. The result of this experiment has a strong bearing on the
question of the existence of a detectable ether, and was later credited by Einstein as
being of primary importance in his formulation of the special theory.

FIGURE 2.3 Fizeau's moving water apparatus.

Fizeau divided a beam of light, which issued from a slit S placed at the principal focus
of a lens, into two parallel beams, which he then passed through two parallel tubes
(Figure 2.3). At the end of these tubes, the two beams impinged upon a second lens
and were reunited at its focus, where Fizeau had placed a plane mirror. Upon reflection
the rays crossed and were each returned through the other tube, to be reunited
once again by the first lens and brought to a focus at the point F, through the inter-
position of the half-silvered mirror 1).
With both tubes filled with water, and the water at rest, transverse interference
fringes could be observed at F with a bright central fringe corresponding to equal paths.
If then the water were put in motion with equal speeds, but in opposite directions in the
two tubes, and if the velocity of light were affected by this motion, one would predict
that the central fringe would be displaced. This would be so because one beam of light
would be traveling with the water flow, both out and back, whereas the other beam
would be traveling against it. A simple measurement of the shift in position of the central
fringe would yield the difference in times along the two paths and thus the dependence
of light velocity on motion of the water.
When Fizeau performed this experiment, he did note a fringe shift which depended
on the rate of flow of the water, and his data fitted the formula

$$v = v_0 + u\left(1 - \frac{1}{n^2}\right) \tag{2.21}$$

in which v is the velocity of light in water when the water is moving at a speed u relative
7 A. H. Fizeau, "On Hypotheses Relative to a Luminous Ether," Ann de Chimie et de Phys, Ser. III,
57, 385-404; May 1859.



to the laboratory, $v_0$ is the velocity of light in stationary water, and n is the index of
refraction. Both v and $v_0$ were measured relative to the laboratory.
This result is at variance with a simple classical prediction. An observer O' at rest
relative to the water should measure a velocity of light in the water of value $v_0$. An observer
O at rest in the laboratory, seeing O' go by at speed u, can invoke Equations (2.18)
to predict that $v = v_0 + u$. Since the index of refraction of water is approximately 1.3,
the difference between (2.21) and the classical prediction is too great to be ascribed to
experimental errors. Strong reinforcement of Fizeau's findings has been provided by
the precise work of later investigators8,9 who repeated his experiments, also obtaining
agreement with the formula (2.21).
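To see how far apart the two predictions are, the following Python sketch compares the classical result $v = v_0 + u$ with a formula of the Fresnel type, in which only the fraction $1 - 1/n^2$ of the water speed is added. The index n = 1.3 is the approximate value quoted in the text and $v_0 = c/n$ is the definition used in the derivation that follows; the flow speed of 7 m/s and the function names are assumptions made purely for illustration.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s

def classical_prediction(v0, u):
    """Classical velocity addition: the full water speed is added."""
    return v0 + u

def partial_drag_prediction(v0, u, n):
    """Fresnel-type formula: only the fraction (1 - 1/n^2) of u is added."""
    return v0 + u * (1.0 - 1.0 / n ** 2)

n_water = 1.3            # approximate index of refraction quoted in the text
v0 = C_VACUUM / n_water  # speed of light in stationary water
u = 7.0                  # hypothetical flow speed in m/s (an assumption)

shift_classical = classical_prediction(v0, u) - v0
shift_partial = partial_drag_prediction(v0, u, n_water) - v0
drag_fraction = shift_partial / shift_classical

print("classical shift in v (m/s):", shift_classical)
print("partial-drag shift in v (m/s):", shift_partial)
print("fraction of u actually added:", drag_fraction)
```

For n = 1.3 the fraction is roughly 0.4, so the observed change in velocity is well under half of what the classical law predicts, a discrepancy far larger than a small measurement error.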
This formula had actually been derived earlier by Fresnel on theoretical grounds
through a complication of the ether concept. At the time he was interested in explaining
an observation by Arago that the apparent refraction of light in a moving prism was
equal to the absolute refraction in a fixed prism. However, the argument of Fresnel's
derivation is equally applicable to the Fizeau experiment, and in that context proceeds
as follows:
Assume that the ethereal density in any body is proportional to the square of its index
of refraction. Then if c is the velocity of light in the ether in the absence of any tangible
matter, and if $v_0$ is the velocity of light in the given material body when it is at rest, so
that $n = c/v_0$ is the refractive index, it follows that

$$\frac{\rho'}{\rho} = \frac{c^2}{v_0^2} = n^2$$

in which $\rho$ is the density of the ether in free space and $\rho'$ is its density in the material
body.
Fresnel made the additional assumption that when the material body was in motion
at speed u, part of the ether was carried along with it, namely, that part which constitutes
the excess of its density over the density of ether in free space. The rest of the ether
within the space occupied by the body was assumed to remain stationary. In this manner
the density of the ether carried along by the body could be computed as

$$\rho' - \rho = (n^2 - 1)\rho$$

while a density $\rho$ remained at rest. The motion of the center of gravity of the ether
within the body was therefore

$$\frac{n^2 - 1}{n^2}\,u$$

Since this is the average motion, relative to the observer, of the ether associated with
the body, he should add this term to $v_0$, the velocity of light in the body when it is at
rest, in order to obtain v, the velocity of light in the body when it is in motion at
speed u. This addition yields the formula (2.21).
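The averaging step in Fresnel's argument can be written out explicitly. The lines below are a worked restatement of the reasoning above, not an addition to it: the moving portion of the ether (density $\rho' - \rho$, speed u) and the stationary portion (density $\rho$, speed zero) are weighted by their densities. The symbol $\bar{u}$ for the resulting mean speed is introduced here only for convenience.

```latex
\bar{u} = \frac{(\rho' - \rho)\,u + \rho \cdot 0}{\rho'}
        = \frac{(n^2 - 1)\,\rho}{n^2 \rho}\,u
        = \frac{n^2 - 1}{n^2}\,u ,
\qquad
v = v_0 + \bar{u} = v_0 + \left(1 - \frac{1}{n^2}\right) u .
```

The second equality uses $\rho' = n^2 \rho$, which is equivalent to the relation $\rho' - \rho = (n^2 - 1)\rho$ stated in the text.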
Fresnel's derivation is seen to require further hypotheses about the behavior of the
ether. No longer is the ether simply an intangible medium which is everywhere at rest
in some absolute reference frame with all the material bodies of the universe gliding
8 A. A. Michelson and E. W. Morley, Am J Sci, 31, 377; 1886.
9 P. Zeeman, Proc Amst Acad, 17, 445; 1914. Also 18, 398; 1915.

through it without interaction. The ether becomes more dense inside a material body
and part of it is dragged along by the body's motion. Furthermore, Fresnel's derivation
would be valid only for an observer at rest in the absolute reference frame, since he
assumed that part of the ethereal density was stationary with respect to the observer.
An ether hypothesis adequate to explain Fizeau's result is thus seen to be rather
complicated.
It is interesting to note that if formula (2.21) is valid for all material bodies, and if
a succession of material bodies is considered whose indices of refraction are successively
closer to unity, in the limit as $n \to 1$, (2.21) reduces to c = c'.

2.5 THE MICHELSON-MORLEY EXPERIMENT

A definitive experiment designed expressly to detect the presence of an ether was first
performed by Michelson in 1881 and repeated with improved accuracy by Michelson
and Morley in 1887. The essence of the approach is precisely analogous to Example V.3
in the Mathematical Supplement, in which two rowers determine a river's flow by
noting the difference in their elapsed round-trip times, when one man rows across the
river and the other parallel to the bank. The reader is urged to familiarize himself with
that problem and to convince himself of the soundness of the logic underlying the
analysis.
The apparatus employed in the ether experiment was an interferometer invented by
Michelson and shown in schematic form in Figure 2.4. Light from a source IJ is split into
two parts by a half-silvered mirror P. One part travels over path $l_1$, is reflected by
mirror $M_1$, and upon returning to P is partially reflected toward the viewing telescope F.
The light which thus reaches F has gone through the plate P three times.
The other part of the original light beam travels over path $l_2$, through the equalizing
plate, is reflected by $M_2$, and upon returning to P is partially transmitted toward the
viewing telescope F. The light which thus reaches F has gone through the plate P once
and the equalizing plate twice. Since these two plates are identical except for silvering,
the paths in glass along the two routes are the same. When the light source is monochromatic,
the relative phase of the two light components reaching F depends on the
difference in round-trip times for light to travel along the two paths $P$-$M_1$-$P$ and
$P$-$M_2$-$P$. This relative phase manifests itself by interference effects in the field of the
viewing telescope.
Imagine that the half-silvered mirror P is set precisely at 45 deg, and that the tilts
of $M_1$ and $M_2$ are adjusted so that the two light components which reach F via $M_1$ and
$M_2$ travel parallel paths as they approach F. In this case the light intensity is essentially
uniform over the central region of a transverse plane AA', and the level of intensity
depends on the relative phase of the two light components. However, if the tilt of P is
now shifted slightly away from 45 deg, the two light components reaching F via $M_1$ and
$M_2$ no longer travel parallel paths as they approach F. Thus in the transverse plane AA'
there will be alternate regions which are light and dark due to the constructive and
destructive interference of the two light components. The positions of these light and
dark regions, or interference fringes as they are called, depend on the relative phase of
the two light components. If this relative phase changes, the interference fringes will
shift transversely, and there will be a shift of one fringe for every 360 deg change in
relative phase of the two light components. Upon focusing the viewing telescope and
adjusting the tilt of P so that this transverse field of fringes is distinct, the operator
of the interferometer has an extremely sensitive indication of a change in relative phase
of the two light components arriving from $M_1$ and $M_2$, through an observed shift in the
fringe pattern in his field of view.

FIGURE 2.4 The Michelson interferometer.

With this experimental technique in mind, assume that the earth, and with it the
apparatus of Figure 2.4, are moving at a speed u relative to the ether in a direction that
would take $M_2$ into P. According to the ether hypothesis, the speed of light is c in any
direction in the ether. If $t_2'$ is the time for light to travel from P to $M_2$, then $ct_2'$ is the
distance this light traveled through the ether. But this distance must also equal $l_2 - ut_2'$
due to the motion of the apparatus through the ether. Thus

$$t_2' = \frac{l_2}{c + u}$$

Similarly, if $t_2''$ is the time for light to travel the return path from $M_2$ to P, then

$$t_2'' = \frac{l_2}{c - u}$$

The total time for light to travel the path P to $M_2$ to P is therefore

$$t_2 = t_2' + t_2'' = \frac{2l_2/c}{1 - u^2/c^2} \tag{2.22}$$

To compute the time for light to travel along the path to $M_1$ and back, one must
account for the fact that, while the light travels from P to $M_1$, the whole apparatus
moves a distance $\delta$ in the $M_2$-P direction, as shown in Figure 2.5.

FIGURE 2.5 Ray path from P to $M_1$ to P.

The actual distance traveled by the light through the ether is therefore $(l_1^2 + \delta^2)^{1/2}$. If $t_1'$ is the time it takes
for light to get from P to $M_1$, then

$$ct_1' = (l_1^2 + \delta^2)^{1/2} \qquad \delta = ut_1'$$

Upon eliminating $\delta$, one obtains

$$t_1' = \frac{l_1/c}{(1 - u^2/c^2)^{1/2}}$$

Since the time light takes along the return path from $M_1$ to P is the same, the total time
for light to travel the path P to $M_1$ to P is

$$t_1 = \frac{2l_1/c}{(1 - u^2/c^2)^{1/2}} \tag{2.23}$$
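The two round-trip times can be compared numerically. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the arm length of 11 m is borrowed from the later description of the 1887 apparatus, and u = 30 km/sec is the orbital speed of the earth used later in the text. It builds the time along the direction of motion from the two one-way legs, checks that sum against the closed form, and compares it with the time along the transverse arm.

```python
import math

C = 299_792_458.0  # speed of light in the ether frame, m/s

def round_trip_parallel(length, u, c=C):
    """Out-and-back time along the direction of motion: l/(c+u) + l/(c-u)."""
    return length / (c + u) + length / (c - u)

def round_trip_transverse(length, u, c=C):
    """Out-and-back time across the direction of motion, as in (2.23)."""
    return (2.0 * length / c) / math.sqrt(1.0 - (u / c) ** 2)

arm = 11.0     # metres (assumed, equal for both arms)
u = 30.0e3     # metres per second

t_parallel = round_trip_parallel(arm, u)
t_transverse = round_trip_transverse(arm, u)
t_parallel_closed_form = (2.0 * arm / C) / (1.0 - (u / C) ** 2)
t_at_rest = 2.0 * arm / C

print("parallel arm   :", t_parallel)
print("transverse arm :", t_transverse)
print("difference     :", t_parallel - t_transverse)
```

Both times exceed the rest value 2l/c, and the arm lying along the motion always takes longer; the difference is of second order in u/c, which is why an interferometric method is needed to detect it.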

The difference in phase (assuming monochromatic light) of the two light components
arriving at F is therefore

$$\delta = 2\pi\nu(t_1 - t_2) = \frac{4\pi\nu/c}{(1 - u^2/c^2)^{1/2}}\left[l_1 - \frac{l_2}{(1 - u^2/c^2)^{1/2}}\right] \tag{2.24}$$

in which $\nu$ is the frequency of the light being used. As the apparatus is rotated through
90 deg, this difference in phase should steadily change, until at the end of the 90-deg
rotation, the roles of $l_1$ and $l_2$ are interchanged. At this position, the difference in phase
is

$$\delta' = \frac{4\pi\nu/c}{(1 - u^2/c^2)^{1/2}}\left[\frac{l_1}{(1 - u^2/c^2)^{1/2}} - l_2\right] \tag{2.25}$$

If an observer continually notes the interference fringes as the apparatus is rotated
through 90 deg, he should see a total shift of n fringes, where n is given by

$$n = \frac{\delta' - \delta}{2\pi} = \frac{2\nu/c}{(1 - u^2/c^2)^{1/2}}\,(l_1 + l_2)\left[\frac{1}{(1 - u^2/c^2)^{1/2}} - 1\right] \tag{2.26}$$

If u/c is small, a series expansion (cf. Mathematical Supplement, Part I) gives

$$n = \frac{l_1 + l_2}{\lambda}\,\frac{u^2}{c^2} \tag{2.27}$$

whereas, if u/c is not small, n is larger than the value given by (2.27). Thus (2.27)
is the most pessimistic prediction for fringe shift and is seen to be a second-order
expression in u/c.
If an observer were willing to perform this experiment every day for six months, he
would expect to encounter a value for u at least as great as the orbital velocity of the
earth around the sun, namely 30 km/sec, this minimum value occurring if the sun were
at rest in the ether. Upon inserting u = 30 km/sec in (2.27) one finds that $l_1 + l_2$ needs
to be approximately 50 m in order to assure the observation of one fringe shift. It has
proved possible to build an interferometer of this type capable of detecting as little as
1/1000th of a fringe shift, putting a reasonable requirement on the size of the apparatus.
Thus the sensitivity needed is well within the capabilities of construction.
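The estimate of about 50 m for one fringe can be checked directly. The following Python sketch evaluates the second-order expression (2.27) and, for comparison, a fringe count computed from the round-trip times without any series expansion. The wavelength of 5 × 10⁻⁷ m is an assumed value for visible light (the text does not state one at this point), and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def fringe_shift_second_order(l1, l2, wavelength, u, c=C):
    """Second-order estimate (2.27): n = (l1 + l2)/lambda * (u/c)^2."""
    return (l1 + l2) / wavelength * (u / c) ** 2

def time_difference(l_transverse, l_parallel, u, c=C):
    """Transverse round-trip time minus parallel round-trip time."""
    g = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return (2.0 * l_transverse / c) * g - (2.0 * l_parallel / c) * g * g

def fringe_shift_from_times(l1, l2, wavelength, u, c=C):
    """Fringe count for a 90-deg rotation, with no expansion in u/c."""
    frequency = c / wavelength
    before = time_difference(l1, l2, u, c)   # arm 1 transverse, arm 2 parallel
    after = -time_difference(l2, l1, u, c)   # roles of the arms interchanged
    return frequency * (after - before)

wavelength = 5.0e-7   # metres; assumed value for visible light
u = 30.0e3            # metres per second, orbital speed quoted in the text
l1 = l2 = 25.0        # metres, so that l1 + l2 = 50 m

n_estimate = fringe_shift_second_order(l1, l2, wavelength, u)
n_from_times = fringe_shift_from_times(l1, l2, wavelength, u)
print("second-order estimate:", n_estimate)
print("from round-trip times:", n_from_times)
```

With these inputs both values come out very close to one fringe, consistent with the 50 m figure, and the two computations differ only by a relative amount of order (u/c)², plus floating-point rounding in the subtraction of nearly equal times.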
Additional factors affecting the accuracy are apparent. The relative positions of dif-
ferent parts of the apparatus must remain constant within a small fraction of a wave-
length during operation. The stability of the light source is important, and the fre-
quency bandwidth of the "monochromatic" light must be as small as possible. However,
if great care is taken in the assembly of the apparatus, with due consideration given to
these possible sources of error, one should expect to be able to measure the ether drift
regardless of how slowly the sun might be moving through the ether.
The first satisfactory trials by Michelson in 1881 indicated a null result, but the sensitivity
was marginal. In Michelson's words10

In the first experiment one of the principal difficulties encountered was that of revolving
the apparatus without producing distortion; and another was its extreme sensitiveness to
vibration. This was so great that it was impossible to see the interference fringes except at
brief intervals when working in the city, even at two o'clock in the morning. Finally . . .
the quantity to be observed; namely, a displacement of something less than a twentieth of
the distance between the interference fringes may have been too small to be detected when
masked by experimental errors.

10 A. A. Michelson and E. W. Morley, "On the Relative Motion of the Earth and the Luminiferous
Ether," Am J Sci, ser. III, 34, 333-345; November 1887.

Accordingly, the apparatus underwent a major redesign before the 1887 trials. The
interferometer was mounted on a massive stone 1.5 m square and 0.3 m thick. The stone
rested on an annular wooden float whose outside diameter was 1.5 m, with an inside
diameter of 0.7 m and a thickness of 0.25 m. The wooden float rested on liquid mercury
(which Morley had collected and purified), contained in a cast iron trough 1.0 cm
thick, and of such dimensions as to leave a clearance of approximately one centimeter
around the float. A central pin was used to keep the float concentric with the trough.
The annular iron trough rested on a bed of concrete on a low brick pier built in the form
of a hollow octagon. An excavation was made down to bedrock to set the supporting
column for the apparatus.

FIGURE 2.6 Perspective view of the Michelson-Morley apparatus. Legend: a, brick pier;
b, cast iron trough; c, wooden float; d, stone slab; e, guiding pin; f, Argand lamp;
g, half-silvered mirror; h, equalizing plate. Also labeled: banks of four mirrors and
viewing telescope.

A bank of four mirrors was placed at each corner of the stone and multiple reflections
were utilized to increase the effective lengths of the two legs of the interferometer to
about 11 m. An Argand lamp was used as the light source and a wooden cover was
placed over the interferometer to prevent air currents and rapid changes in tempera-
ture. A perspective view of the complete equipment is shown in Figure 2.6.

It is demonstrated in Appendix A that if the two legs of the interferometer are equal
(as was the case in the Michelson-Morley experiment), and if the ether drift velocity
u is small compared to c, then the number of fringes shifted, n, is a function of the
rotational angle $\theta$ of the apparatus, and is given by
$$n = \frac{l}{\lambda}\,\frac{u^2}{c^2}\,\cos 2\theta \tag{2.28}$$
Using l = 11 m and the wavelength of yellow light, and choosing the minimum value
u = 30 km/sec., one obtains
$$n = 0.2 \cos 2\theta \tag{2.29}$$
as the minimum predicted fringe shift versus rotation angle of the apparatus. Equation
(2.29) assumes, in effect, that at some point in its orbit about the sun, the earth must
have a motion through the ether at least as great as 30 km/sec. If, while the earth is
in this orbital position, an experiment is performed in which the apparatus is rotated
through 360 deg, the fringes in the field of the viewing telescope should undergo a
cyclical displacement whose peak-to-peak excursion is at least as great as four-tenths the
distance between adjacent fringes.
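The step from (2.28) to (2.29) is simple arithmetic and can be sketched as follows. The wavelength of 590 nm is an assumed value for "yellow light" (the text gives no number), and the function name is illustrative.

```python
import math

C = 2.998e8  # speed of light in vacuum, m/s


def predicted_fringe_shift(theta_deg, arm_length=11.0, u=30e3, wavelength=5.9e-7):
    """Fringe shift n from (2.28): n = (l / wavelength) * (u / c)**2 * cos(2 * theta)."""
    amplitude = (arm_length / wavelength) * (u / C) ** 2
    return amplitude * math.cos(2.0 * math.radians(theta_deg))


# Tabulate the second-harmonic variation over half a turn of the apparatus.
for theta in (0, 45, 90, 135, 180):
    print(f"theta = {theta:3d} deg   n = {predicted_fringe_shift(theta):+.3f}")

# Total excursion between the extreme positions (theta = 0 and theta = 90 deg).
peak_to_peak = predicted_fringe_shift(0) - predicted_fringe_shift(90)
print(f"peak-to-peak excursion = {peak_to_peak:.2f} fringe")
```

The amplitude comes out close to 0.2 fringe and the peak-to-peak excursion close to 0.4 fringe, the two figures quoted in the text.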
Michelson and Morley conducted trials during the period July 8-12, 1887, and
plotted their data against one-eighth of the minimum predictable fringe shift given by (2.29).
Their curves for daytime and nighttime observations are reproduced in Figure 2.7.

FIGURE 2.7 Michelson-Morley data for July 1887. (Curves of the daytime and nighttime
observations, each with scale marks at ±0.05 fringe, compared with a curve labeled
"One-eighth of minimum predicted fringe shift.")

They estimated that the second harmonic of their experimental data was no greater
than 0.005 fringes and thus the maximum detected fringe shift was less than 1/40th of the
minimum predicted fringe shift. Of course, the possibility existed that in July, 1887, the
earth was nearly at rest in the ether, and thus Michelson and Morley concluded
It is just possible that the resultant velocity (of the earth relative to the ether) at the
time of the observations was small though the chances are much against it. The experiment
will therefore be repeated at intervals of three months, and thus all uncertainty will be
avoided.

However, after completing the July 1887 trials, Michelson and Morley did not
return to this problem. The completion of the ether drift experiment for all epochs was
finally accomplished by Dayton C. Miller, first in Cleveland and then at Mount Wilson,
during the years 1921 through 1926. The Cleveland data gave a null result comparable
in level to what had been obtained by Michelson and Morley, but considerable discussion
was caused by the Mount Wilson data, because they seemed to indicate a small ether
drift, the observed fringe displacements being reduced only to about
one-thirteenth of the value predicted by the ether theory for a 30 km/sec. velocity of
the earth in its orbit.
Miller's harmonic analysis of the data not only yielded a slight amplitude but also
a phase; however, the latter was incapable of being fitted into any logical relationship
corresponding to an oscillation of the north point during the course of a sidereal day.
This anomaly cast some doubt on the interpretation of the results and led to a critical
review of the data using statistical methods. The conclusion was reached that the small
observed second harmonic in the Mount Wilson experiment was not due to ether drift
but rather could be accounted for by temperature effects.¹¹

TABLE 2.1

TRIALS OF THE MICHELSON-MORLEY EXPERIMENT

Observer                   Year     Place                    l,     2(l/λ)(u/c)²,  A,       Ratio
                                                             cm     fringe         fringe

Michelson (a)              1881     Potsdam                  120    0.04           0.01     2
Michelson and Morley (b)   1887     Cleveland                1,100  0.40           0.005    40
Morley and Miller (c)      1902-04  Cleveland                3,220  1.13           0.0073   80
Miller (d)                 1921     Mt. Wilson               3,200  1.12           0.04     15
Miller (e)                 1923-24  Cleveland                3,200  1.12           0.015    40
Miller (sunlight) (f)      1924     Cleveland                3,200  1.12           0.007    80
Tomaschek (starlight) (g)  1924     Heidelberg               860    0.3            0.01     15
Miller (h)                 1925-26  Mt. Wilson               3,200  1.12           0.044    13
Kennedy (i)                1926     Pasadena and Mt. Wilson  200    0.07           0.001    35
Illingworth (j)            1927     Pasadena                 200    0.07           0.0002   175
Piccard and Stahel (k)     1927     Mt. Rigi                 280    0.13           0.003    20
Michelson et al. (l)       1929     Mt. Wilson               2,590  0.9            0.005    90
Joos (m)                   1930     Jena                     2,100  0.75           0.001    375

(a) A. A. Michelson, Am. J. Sci. 22, 120 (1881); Phil. Mag. 13, 236 (1882).
(b) A. A. Michelson and E. W. Morley, Am. J. Sci. 34, 333 (1887); Phil. Mag. 24, 449 (1887).
(c) E. W. Morley and D. C. Miller, Phil. Mag. 9, 680 (1905); Proc. Am. Acad. Arts Sci. 41, 321 (1905).
(d) D. C. Miller, Data sheets of Observations December 9 to 11, 1921 (unpublished).
(e) D. C. Miller, Observations, August 23 to September 4, 1923; June 27 to July 26, 1924 (unpublished).
(f) D. C. Miller, "Observations with Sunlight on July 8 to 9, 1924," Proc. Natl. Acad. Sci. 11, 311 (1925).
(g) R. Tomaschek, Ann. d. Physik 73, 105 (1924).
(h) D. C. Miller, Revs. Modern Phys. 5, 203 (1933).
(i) R. J. Kennedy, Proc. Natl. Acad. Sci. 12, 621 (1926); Astrophys. J. 68, 367 (1928).
(j) K. K. Illingworth, Phys. Rev. 30, 692 (1927).
(k) A. Piccard and E. Stahel, Compt. rend. 183, 420 (1926); 184, 152, 451 (1927); 185, 1198 (1927);
    J. phys. radium 8, 56 (1927).
(l) Michelson, Pease, and Pearson, Nature 123, 88 (1929); J. Opt. Soc. Am. 18, 181 (1929).
(m) G. Joos, Ann. Physik 7, 385 (1930); Naturwiss. 38, 784 (1931).

Many other investigators have repeated the Michelson-Morley experiment, and a
summary of the various trials is given in Table 2.1 with the appropriate
journal references listed underneath.¹² In response to various objections to the original
experiment, several of the parameters were varied; sunlight and starlight were substi-
tuted for the terrestrial source, mountain-top installations were used to minimize a
possible "ether drag" over the surface of the earth, and one experiment was even
performed in a balloon.
In all these trials, the two arms of the interferometer were equal, the length being as
listed in Column 4 of Table 2.1. Column 5 gives the minimum predicted shift at some
time of year, based on the earth's orbital speed of 30 km/sec. and is twice the amplitude
of the corresponding second harmonic. Column 6 lists the amplitude A of the second
harmonic of fringe shift actually found by each observer. The last column gives the
ratio of the minimum predicted second harmonic to that actually observed. For many
of the trials this ratio is large enough that clearly a null result for the Michelson-Morley
experiment can be accepted with confidence.
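The relationship between the columns of Table 2.1 described above can be checked numerically. The sketch below recomputes the last column as half of the Column 5 shift (the predicted second-harmonic amplitude) divided by the observed amplitude A of Column 6. The values are transcribed from the table; the variable and function names are illustrative.

```python
# (observer, year, predicted shift 2(l/lambda)(u/c)^2, observed amplitude A, listed ratio)
TRIALS = [
    ("Michelson", "1881", 0.04, 0.01, 2),
    ("Michelson and Morley", "1887", 0.40, 0.005, 40),
    ("Morley and Miller", "1902-04", 1.13, 0.0073, 80),
    ("Miller", "1921", 1.12, 0.04, 15),
    ("Miller", "1923-24", 1.12, 0.015, 40),
    ("Miller (sunlight)", "1924", 1.12, 0.007, 80),
    ("Tomaschek (starlight)", "1924", 0.3, 0.01, 15),
    ("Miller", "1925-26", 1.12, 0.044, 13),
    ("Kennedy", "1926", 0.07, 0.001, 35),
    ("Illingworth", "1927", 0.07, 0.0002, 175),
    ("Piccard and Stahel", "1927", 0.13, 0.003, 20),
    ("Michelson et al.", "1929", 0.9, 0.005, 90),
    ("Joos", "1930", 0.75, 0.001, 375),
]


def amplitude_ratio(predicted_shift, observed_amplitude):
    """Predicted second-harmonic amplitude (half of Column 5) over observed amplitude A."""
    return 0.5 * predicted_shift / observed_amplitude


relative_gaps = []
for name, year, predicted, observed, listed in TRIALS:
    ratio = amplitude_ratio(predicted, observed)
    relative_gaps.append(abs(ratio - listed) / listed)
    print(f"{name:<24}{year:<9}computed {ratio:7.1f}   listed {listed}")

max_relative_gap = max(relative_gaps)
```

The recomputed ratios agree with the listed ones to within the rounding evidently applied in the table.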

2.6 ETHER DRAG

The negative result of the Michelson-Morley experiment was totally unexpected and
very perplexing. If one presumes that there is a luminiferous ether in which the velocity
of light is c in all directions, the results of this experiment suggest that the light from a
distant star sweeps past an observer on earth at this velocity c regardless of where the
earth happens to be in its orbit, and thus regardless of the earth's velocity relative to
the ether. But this contradicts all common-sense knowledge of the law of addition of
velocities, embodied in Equations (2.18).
To phrase this problem more specifically, suppose that when the earth is at a certain
point in its orbit, Cartesian axes XYZ are constructed so that the earth is instantane-
ously at rest in XYZ and so that the X axis lies along the earth's orbit. Then for the
incoming starlight,
$$\dot{x} = c_x \qquad \dot{y} = c_y \qquad \dot{z} = c_z$$
and
$$c = (c_x^2 + c_y^2 + c_z^2)^{1/2}$$
is the speed of the starlight in XYZ. Six months later, another Cartesian frame can be
constructed whose X' axis slides along the original X axis at speed 2v, with v the earth's
orbital velocity. The earth will be instantaneously at rest in X'Y'Z', and according to
(2.18)
$$\dot{x}' = c_x - 2v \qquad \dot{y}' = c_y \qquad \dot{z}' = c_z$$
so that
$$c' = [(c_x - 2v)^2 + c_y^2 + c_z^2]^{1/2} \neq c$$
Thus the velocity of the starlight should be different at the two orbital positions. But
the results of the Michelson-Morley experiment do not reveal this difference. It is as
though the ether were caught up by the earth's atmosphere and dragged along with it,
thus accounting for the apparent constancy of the velocity of light in all directions
within the earth's atmosphere for all positions in the earth's orbit.

11 R. S. Shankland, S. W. McCuskey, F. C. Leone, and G. Kuerti, "New Analysis of the
Interferometer Observations of Dayton C. Miller," Rev Mod Phys, 27, 167-178; April 1955.
12 This table is reproduced with the kind permission of Messrs. Shankland, McCuskey, Leone, and
Kuerti and is taken from their paper, Ibid.; 168.

However, this concept of "ether drag" suffers a fatal objection when Bradley's dis-
covery of aberration is recalled. If the ether were dragged along by the earth's atmos-
phere, then the setting of a telescope would not have to be altered to compensate for
the earth's orbital velocity and there would be no aberration in the position of any star.
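The size of the discrepancy that Galilean velocity addition predicts for the starlight in this section can be put in numbers. The sketch below applies the transformation to a frame sliding along X at speed 2v, for starlight arriving along the orbit and across it. The function name and the two chosen directions are illustrative assumptions.

```python
import math

C = 2.998e8  # speed of light in the ether frame, m/s


def galilean_light_speed(c_vec, frame_speed_x):
    """Speed of light in a frame sliding along X at frame_speed_x, by Galilean addition."""
    cx, cy, cz = c_vec
    return math.sqrt((cx - frame_speed_x) ** 2 + cy ** 2 + cz ** 2)


v = 30e3  # orbital speed of the earth, m/s

along_orbit = (C, 0.0, 0.0)    # starlight travelling along the X axis
across_orbit = (0.0, C, 0.0)   # starlight travelling perpendicular to the orbit

c_prime_along = galilean_light_speed(along_orbit, 2 * v)
c_prime_across = galilean_light_speed(across_orbit, 2 * v)

print(f"along the orbit:  c' - c = {c_prime_along - C:+.1f} m/s")
print(f"across the orbit: c' - c = {c_prime_across - C:+.1f} m/s")
```

For light along the orbit the predicted change is the full 2v, a fractional change of about two parts in ten thousand; it is a difference of this order that the interferometer trials failed to reveal.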

2.7 THE LORENTZ-FITZGERALD CONTRACTION HYPOTHESIS

Another attempt to preserve the ether hypothesis but remain consistent with the
negative result of the Michelson-Morley experiment was made by Lorentz.¹³ He
postulated that, as a result of its speed u through the ether, a material body is
contracted by the factor $(1 - u^2/c^2)^{1/2}$ in the direction of its motion. This means, in the
Michelson-Morley experiment, that if $l_1$ and $l_2$ are the lengths of the arms of the
interferometer when it is at rest in the ether, then under the conditions depicted by
Figure 2.4, $l_2$ is replaced by $l_2 (1 - u^2/c^2)^{1/2}$ while $l_1$ is unchanged. After the
apparatus has been rotated 90 deg, $l_2$ is unchanged and $l_1$ is replaced by
$l_1 (1 - u^2/c^2)^{1/2}$. Making these substitutions in (2.24) and (2.25) gives
$$n = \frac{\Delta' - \Delta}{2\pi} = 0 \tag{2.30}$$
which would account neatly for why no fringe shift was observed by Michelson and
Morley.
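The cancellation expressed by (2.30) can be illustrated numerically. Equations (2.24) and (2.25) are not repeated here, so the sketch below instead uses the standard ether-theory round-trip times for an arm parallel and an arm transverse to the motion; this is an assumed stand-in for the book's own expressions. The arm length of 11 m, the 590 nm wavelength, and all names are illustrative.

```python
import math

C = 2.998e8  # speed of light in the ether, m/s


def round_trip_times(l_parallel, l_transverse, u):
    """Ether-frame round-trip times for arms parallel and transverse to the drift u."""
    t_parallel = l_parallel / (C - u) + l_parallel / (C + u)
    t_transverse = 2.0 * l_transverse / math.sqrt(C ** 2 - u ** 2)
    return t_parallel, t_transverse


def rotation_fringe_shift(rest_length, u, wavelength, contracted):
    """Fringe shift on a 90 deg rotation of an equal-arm interferometer."""
    # With the contraction hypothesis, only the arm parallel to u is shortened.
    factor = math.sqrt(1.0 - (u / C) ** 2) if contracted else 1.0
    t_par, t_trans = round_trip_times(rest_length * factor, rest_length, u)
    # Rotation interchanges the roles of the two arms, reversing the time difference.
    return 2.0 * C * (t_par - t_trans) / wavelength


n_ether = rotation_fringe_shift(11.0, 30e3, 5.9e-7, contracted=False)
n_lorentz = rotation_fringe_shift(11.0, 30e3, 5.9e-7, contracted=True)

print(f"no contraction:   n = {n_ether:.3f}")
print(f"with contraction: n = {n_lorentz:.1e}")
```

Without contraction the shift is close to the four-tenths of a fringe quoted earlier; with the contraction factor applied to the parallel arm the two round-trip times coincide and the shift vanishes to within rounding error.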
In a footnote Lorentz acknowledges that this possibility had occurred independently
to FitzGerald, who apparently had limited his discussion of the idea to lectures to his
students and had not published his speculations. The length contraction hypothesis is
customarily identified with the names of both these men.
This proposal was sternly criticized by Poincaré, who objected to an ad hoc hypothesis,
without experimental basis, designed to explain why one cannot detect the presence
of something else which has been hypothesized, namely the ether. Nevertheless, the proposal
was taken seriously by others, and Kennedy devised a modification of the Michelson-
Morley experiment capable of testing the Lorentz-FitzGerald contraction hypothesis.¹⁴
Assuming that (2.30) is correct, if $l_1 = l_2$ (as was intended by Michelson and Morley
in the original experiment), $\Delta$ would remain constant at zero even if u, the speed of the
earth through the ether, were to change. Kennedy, with the assistance of Thorndike,
constructed an interferometer in which $l_1 - l_2$ was as great as the coherence of the
source would permit, attaining a value $l_1 - l_2 = 318$ mm. Then, instead of rotating the
apparatus, he held it fixed to see if there were any variation in $\Delta$ as the earth's speed
through the ether changed.

13 H. A. Lorentz, "Michelson's Interference Experiment," Versuch einer Theorie der elektrischen und
optischen Erscheinungen in bewegten Körpern, Sections 89-92, Leyden, 1895. (An English translation
appears in The Principle of Relativity, Dover Publications, Inc., New York, 1958.)
14 R. J. Kennedy and E. M. Thorndike, "Experimental Establishment of the Relativity of Time,"
Phys Rev, 42, 400-418; November 1932.

If it is assumed that the sun is gliding through the ether at a velocity $\mathbf{v}_s$, that the
center of the earth is moving along its orbit at a velocity $\mathbf{v}_e$ relative to the sun, and that
a point on the surface of the earth has an instantaneous velocity $\mathbf{v}_r$ relative to the
earth's center, then
$$u_1^2 = |\mathbf{v}_s + \mathbf{v}_e + \mathbf{v}_r|^2$$
is the square of the instantaneous speed of this terrestrial point through the ether.
Twelve hours later it has changed to
$$u_2^2 = |\mathbf{v}_s + \mathbf{v}_e - \mathbf{v}_r|^2$$
whereas six months later it becomes
$$u_3^2 = |\mathbf{v}_s - \mathbf{v}_e + \mathbf{v}_r|^2$$
The fringe shift noted by not rotating the apparatus, but taking readings 12 hours
apart should be, according to (2.30),
$$n_{21} = \frac{\Delta_2 - \Delta_1}{2\pi} = \frac{2(l_1 - l_2)}{\lambda}\left[(1 - u_2^2/c^2)^{-1/2} - (1 - u_1^2/c^2)^{-1/2}\right]$$
Similarly, the fringe shift noted by not rotating the apparatus but taking readings six
months apart should be
$$n_{31} = \frac{\Delta_3 - \Delta_1}{2\pi} = \frac{2(l_1 - l_2)}{\lambda}\left[(1 - u_3^2/c^2)^{-1/2} - (1 - u_1^2/c^2)^{-1/2}\right]$$

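If $u_1^2$, $u_2^2$, and $u_3^2$ are all small relative to $c^2$, these expressions reduce to
$$n_{21} = \frac{l_1 - l_2}{\lambda}\,\frac{u_2^2 - u_1^2}{c^2} \tag{2.31}$$
$$n_{31} = \frac{l_1 - l_2}{\lambda}\,\frac{u_3^2 - u_1^2}{c^2} \tag{2.32}$$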
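The quality of the small-speed approximation can be examined with a short numerical sketch. It compares the bracketed expressions for the 12-hour and six-month fringe shifts with their first-order expansions, using $(1 - x)^{-1/2} \approx 1 + x/2$. Everything numerical here is an assumption made for illustration: the three velocity vectors are hypothetical (the sun's speed through the ether is unknown), the squared speeds are taken as the vector sums with the rotational velocity reversed after twelve hours and the orbital velocity reversed after six months, and the wavelength of 546 nm is assumed. Only the arm difference of 318 mm comes from the text.

```python
C = 2.998e8  # speed of light, m/s


def add(*vectors):
    """Component-wise sum of 3-vectors given as tuples."""
    return tuple(sum(components) for components in zip(*vectors))


def scale(k, vector):
    return tuple(k * x for x in vector)


def speed_squared(vector):
    return sum(x * x for x in vector)


def fringe_shift_exact(arm_difference, wavelength, u_sq_later, u_sq_earlier):
    """2 (l1 - l2) / wavelength times the difference of the (1 - u^2/c^2)^(-1/2) factors."""
    def gamma(u_sq):
        return (1.0 - u_sq / C ** 2) ** -0.5
    return 2.0 * arm_difference / wavelength * (gamma(u_sq_later) - gamma(u_sq_earlier))


def fringe_shift_first_order(arm_difference, wavelength, u_sq_later, u_sq_earlier):
    """First-order form: (l1 - l2) / wavelength * (u_later^2 - u_earlier^2) / c^2."""
    return arm_difference / wavelength * (u_sq_later - u_sq_earlier) / C ** 2


# Hypothetical velocities in m/s, chosen only to exercise the formulas.
v_s = (150e3, 100e3, 0.0)   # sun through the ether (assumed)
v_e = (0.0, 30e3, 0.0)      # earth's centre relative to the sun
v_r = (0.3e3, 0.3e3, 0.0)   # surface point relative to the earth's centre

u1_sq = speed_squared(add(v_s, v_e, v_r))
u2_sq = speed_squared(add(v_s, v_e, scale(-1.0, v_r)))   # twelve hours later
u3_sq = speed_squared(add(v_s, scale(-1.0, v_e), v_r))   # six months later

arm_difference = 0.318   # m, from the text
wavelength = 5.46e-7     # m, assumed

n21_exact = fringe_shift_exact(arm_difference, wavelength, u2_sq, u1_sq)
n21_approx = fringe_shift_first_order(arm_difference, wavelength, u2_sq, u1_sq)
n31_exact = fringe_shift_exact(arm_difference, wavelength, u3_sq, u1_sq)
n31_approx = fringe_shift_first_order(arm_difference, wavelength, u3_sq, u1_sq)

print(f"n21: exact {n21_exact:.6e}   first order {n21_approx:.6e}")
print(f"n31: exact {n31_exact:.6e}   first order {n31_approx:.6e}")
```

For speeds of a few hundred kilometers per second the two forms differ only by terms of relative order $u^2/c^2$, so the first-order expressions are entirely adequate for interpreting the experiment.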

By using a precise photographic technique, Kennedy and Thorndike were able to
detect a fringe shift as small as 1/1000th of the spacing between adjacent fringes. This
was almost two orders of magnitude more sensitive than the original Michelson and
Morley technique, and compensated for the lowered sensitivity in the length factor.
Thus expressions (2.31) and (2.32) for the case of the Kennedy-Thorndike experiment
give an overall sensitivity comparable to that arising from (2.27) in the case of the
Michelson-Morley experiment.
The result of the analysis of 300 exposures of the fringe pattern photographed at the
viewing telescope of the Kennedy-Thorndike apparatus once again gave a null result
within the limit of experimental error. As a result of this experiment, it is reasonable
to conclude that the Lorentz-FitzGerald contraction hypothesis put forth as a means
of explaining the null result of the Michelson-Morley experiment, while still preserving
the ether concept, is invalid.

2.8 EMISSION THEORIES


Several other explanations of the Michelson-Morley null result were attempted,
involving the assumption that the velocity of light was the same in all directions in a
coordinate system in which the source was at rest. These emission theories, as they were
called, differed from each other in that they predicted different results when the light
was reflected from a moving mirror. After reflection the three alternatives were that the
velocity of the light (1) remain c relative to the source, (2) become c relative to the
mirror, or (3) become c relative to the mirror image of the source.
The first alternative predicts complications in the interference pattern in the
Michelson-Morley experiment when extraterrestrial sources are used, but the results
of Miller using sunlight did not reveal any such effects. The second and third alternatives
lead to coherence difficulties with reflected light, and all three emission theories
are inconsistent with the findings of de Sitter, previously mentioned, that the velocity
of light is independent of the motion of the source. Thus these attempted explanations
had to be rejected along with the Lorentz-FitzGerald contraction hypothesis and the
assumption of ether drag.
Classical physics had reached an impasse. The laws of mechanics seemed to obey a
relativity principle via the Galilean transformation. The velocity of light also seemed
to obey a relativity principle in that it appeared to be the same in all coordinate sys-
terns in vacuo. Still this was incompatible with the velocity addition law (2.18) arising
from the Galilean transformation. The ether hypothesis, which had at first seemed so
promising, had not been established after several decades of brilliant experimental
research. Clearly a new approach to the problem was needed. This was provided
by Einstein, who concentrated his attention on a reexamination of the basic prin-
ciples involved in velocity determinations, namely, the measurement of space intervals
and time intervals. Realization that neither was an invariant led to a resolution of the
impasse and to a satisfactory modification of the Galilean transformation equations.

2.9 THE INTERDEPENDENCE OF SPACE AND TIME


In the introduction to his first paper on this subject Einstein said¹⁵
. . . The same laws of electrodynamics and optics will be valid for all frames of
reference for which the equations of mechanics hold good. We will raise this conjecture (the
purport of which will hereafter be called the 'Principle of Relativity') to the status of a
postulate, and also introduce another postulate, which is only apparently irreconcilable
with the former, namely, that light is always propagated in empty space with a definite
velocity c which is independent of the state of motion of the emitting body. These two
postulates suffice for the attainment of a simple and consistent theory . . . . The
introduction of a 'luminiferous ether' will prove to be superfluous inasmuch as the view here to
be developed will not require an 'absolutely stationary space' provided with special
properties . . . .

Einstein thus accepted the principle of relativity in its broadest sense, postulating
that all the laws of physics take the same form in every inertial system of coordinates.
He further adopted the view that light waves, like sound waves, have a propagation
velocity which is unaffected by the motion of the source.† In discarding as superfluous
the concept of an ether, Einstein thus also accepted the notion that a light wave
traveling through empty space will pass two different observers at the same speed c even
if these observers are in motion relative to each other.

15 A. Einstein, "On the Electrodynamics of Moving Bodies," Ann Phys, 17, 891-921; 1905. (An
English translation can be found in The Principle of Relativity, Dover Publications, Inc.,
New York, 1958.)

Acceptance of the second postulate together with elimination of the ether concept
automatically explains the null result of the Michelson-Morley experiment. It also
leads to a modification of the Galilean transformation equations and therefore to a
modification of the velocity transformation law, and thus ultimately to a satisfactory
explanation of the Fizeau experiment. However, before proceeding to these develop-
ments it is desirable to consider the implications of the second postulate with respect
to the nature of space and time. The conclusions to be drawn will appear surprising
on first inspection because they are contrary to common experience, and it is this facet
of special relativity which often causes the greatest initial difficulty. Common experience
develops the ingrained belief that time and space are totally different and unconnected;
once this belief is successfully challenged, the remainder of the special theory of rela-
tivity follows logically and without great difficulty.
It takes only a simple example to challenge this belief. The one to be presented here
consists of a sequence of experiments designed to establish the lengths of rulers under
various conditions of motion. Some aspects of these experiments are not completely
practical, but could perhaps be made so by a modest amount of elaboration. However,
the experiments are completely logical, which is all that is essential.
That which follows will be developed in what might seem to be overly great detail.
However it is concerned with the crux of the dilemma involving light velocity and the
velocity transformation law, and a thorough understanding at this stage will greatly
facilitate all subsequent developments.
Two Rulers at Rest. Imagine two long slender rulers, R and R', perfectly straight
and rigid and laid out side by side on the ground, at rest in an inertial coordinate
system. Three observers, who will be designated as O, O', and O", are in the process of
determining if these rulers are precisely the same length. They do this by lining up the
rulers so that they are parallel and flush at one end, and then seeing if they are flush
at the other end also. Having satisfied themselves that such is the case, the three
observers then establish midpoints on each of the rulers, by the use of standard tech-
niques such as the construction of the perpendicular bisector or the employment of
an auxiliary third ruler of half-length. They have no difficulty doing all this because
the two rulers are at rest side by side.
The Length of a Moving Ruler. Next imagine that one of these rulers, say R, is
parallel to the ground and just above it, but is now in motion with respect to the ground
at a constant velocity. Let this velocity be parallel to the ground, but at an arbitrary
angle with respect to the long dimension of the ruler R. This situation is depicted in
Figure 2.8. Is the length of the moving ruler R the same as when it was at rest on the
ground?
To decide this question, one must first establish an operational definition of length
which is applicable to situations involving motion. A suitable definition is embodied in
the following technique of measurement: Let an observer stationary with respect to
the ground determine a fixed point $P_A$ on the ground directly under one end A of the
moving ruler at a specific time t. Similarly, let him determine a fixed point $P_B$ on the
ground directly under the other end B of the moving ruler at the same time t. These
two points fixed on the ground become a permanent record, and at their leisure, ground
observers can reposition the other ruler R' so that one of its ends coincides with $P_A$; if
its second end coincides with $P_B$, the moving ruler R may be said to have a length
which is unchanged from the value it had when the two rulers were at rest side by side.

† It is interesting to note that Einstein postulated this eight years before de Sitter's
confirming observations of the light arriving from binary stars.

FIGURE 2.8 The moving ruler R. (Ruler R', with ends A' and B', is stationary on the ground;
ruler R, with ends A and B, moves just above the ground at constant velocity v; points $P_A$
and $P_B$ are on the ground underneath the two ends of R at a common time.)
The crucial feature in this technique of dynamic length measurement is the require-
ment that the two fixed points $P_A$ and $P_B$ on the ground be determined at precisely the
same time. To ensure this the three observers O, O', and O" can equip opposite ends of
each ruler with small, insulated charged probes of unlike electrical charge. Thus if one
refers again to Figure 2.8, it can be imagined that the ruler ends labeled A and A' con-
tain positively electrified probes and the ruler ends labeled B and B' contain negatively
electrified probes. A detail of one of these probes is suggested in Figure 2.9a.
Ruler R' can then be placed on the ground, with its charged probes pointing up, in
approximately the position above which ruler R is expected to pass, as indicated in
Figure 2.9b. If the two rulers are still of equal length, and R' has been properly positioned
on the ground,† as R passes by with its probes pointing down, there is an instant at
which the negatively charged B probe is directly above the positively charged A' probe,
as suggested by Figure 2.9c. This coincidence causes an intense local field, resulting in
an arc of short duration, which is the source of a light pulse. At this same instant, the
positively charged A probe is directly above the negatively charged B' probe, and this
coincidence is the source of another light pulse. If O' is stationed at the midpoint of the
stationary ruler R', these two light pulses will reach him simultaneously. Conversely,
from the single observation that two light pulses reach him at the same time, O' will
deduce that A and B' did momentarily coincide, that A' and B did also momentarily
coincide, that these coincidences occurred at the same time, and thus that the two
rulers are still of equal length, even though R is now moving, whereas R' is stationary.

FIGURE 2.9 Details of the ruler experiment. (a) Probe construction. (b) Moving ruler R
approaching coincidence with stationary ruler R'. (c) Coincidence of two probes.
If the two rulers are no longer of equal length, the probes at A' and B' may be displaced
equal amounts away from (or toward) the midpoint of R'. The ruler R' can then be
placed in a variety of positions on the ground in the hope of determining a position
which will cause, for some pass of the ruler R, two light pulses to reach the midpoint
of R' simultaneously. Ultimately, a placement of R' and a separation of its two probes
will be found such as to cause the simultaneous arrival of two light pulses at the mid-
point of R'. The distance separating the two probes on R' is then the length of the
moving ruler R.
So far, a technique for determining the length of the moving ruler R has been devel-
oped, but the question still has not been answered as to whether R has a dynamic
length which is the same as its stationary length. To settle this question, the three
observers O, O', and O" devise two symmetrical experiments.
Relative Motion Perpendicular to Length. In the first of these symmetrical
experiments, O stations himself at the midpoint of the ruler R and takes off on a space
journey. Similarly, O' stations himself at the midpoint of the ruler R' and takes off on
a space journey. The flight paths and positions of the two rulers are mirror images
with respect to a vertical plane through the position occupied by O", who will be
assumed to remain at rest in the original inertial coordinate system X"Y"Z". These
flight paths will be such that O and O' arrange to encounter each other in outer space
in a region remote from all other bodies. They coast past each other on immediately
adjacent and parallel paths at a constant relative speed u, during a period of time when
neither ruler is accelerating and each ruler is perpendicular to the path.
Figure 2.10 indicates a sequence of positions of the two rulers as "seen" by O". By
symmetry, O" must observe that the two rulers are still of equal length and thus that
A and B' will coincide, as will A' and B, these coincidences occurring simultaneously.
When A and B' are precisely opposite each other, the positively charged probe at A and
the negatively charged probe at B' interact to generate a light pulse. Similarly, an arc as
the ends A' and B pass each other gives rise to another light pulse. The generation of
these light pulses is evidence that the rulers are still of equal length. Yet O" cannot
conclude from this that either ruler is now the same length that it was when at rest relative
to him because now both rulers are in motion relative to him.
However, either O or O' is in a position to judge this question, since each is stationary
with respect to one of the rulers. For example, during the period of encounter, observer
O senses no acceleration and can consider himself and the ruler R to be at rest in an
inertial coordinate system XYZ. Under Einstein's first postulate, all of the laws of
nature should be equally applicable in XYZ and the coordinate system X"Y"Z" which
O formerly shared with O". In particular, O has no reason to believe that the length of R
is now any different from what it was when he and R were both at rest in X"Y"Z".
Therefore since O detects the two pulses caused by the passage of R', he concludes that
not only is R' still the same length as R, but also that it is still the same length as it had
been when at rest relative to himself. Similar remarks can be made about the observa-
tions of O'.

† It may take many trials of the experiment to determine this position, with R always making the
same pass.

FIGURE 2.10 Moving ruler experiment: transverse motion. (Positions (a) and (b) of rulers R
and R', with observers O and O' at their midpoints, as seen by O".)
This result may be obtained in another way. Assume that when the rulers pass each
other, no light pulses occur, indicating that the rulers are now of dissimilar length. Then
let O move each of his probes the same small amount closer to the midpoint of R; let O'
extend each of his probes the same small amount further from the midpoint of R'; and
let the experiment be re-run. If this procedure is repeated until positions of the probes
are found which cause light pulses, then both O and O' can say, for example, that R is
the longer of the two rulers. But this result is impossible. The experiment is completely
symmetrical. If O thinks R' is longer, O' must think R is longer. The only symmetrical
answer possible is that no adjustment of the probe positions was needed, and that they
both think both rulers are the same length, unchanged from the value when each was
at rest relative to O".
Therefore the conclusion is reached that when a body is in motion relative to an
observer, the measurement of length transverse to that motion is unaffected by the
motion, being the same as when the body is at rest relative to the observer.
Relative Motion Parallel to Length. Now let the space journeys of O and O' on
the rulers R and R', respectively, be repeated in all original details except that during
the period of encounter the two rulers are oriented parallel to the paths. In addition,
O and O' each now takes along a clock, the two clocks having been determined to be
identical when at rest relative to O".
Figure 2.11 indicates a sequence of positions of the two rulers as "seen" by O". When
the ends A and A' pass each other, nothing happens. But as A and B' are precisely oppo-
site each other, the positively charged probe at A and the negatively charged probe at
B' interact to generate a light pulse. Similarly an arc as the ends A' and B pass each
other gives rise to another light pulse.
As O" views this sequence, the two rulers are moving in opposite directions at equal
speeds, and by symmetry the AB' coincidence occurs at the same time as the A'B
coincidence. The two pulses of light originate simultaneously and spread uniformly
thereafter as spherical wavefronts traveling at the velocity c. The centers of these
spherical wavefronts are the fixed points $P_1$ and $P_2$, indicated in position (c) of Fig-
ure 2.11. Because the velocity of light is finite, O" "sees" the two rulers separating as
the wavefronts grow and thus finds that the AB' pulse passes O before it passes O',
whereas the A'B pulse passes O' before it passes O.
Although O'' can conclude that the two rulers are still the same length as each other,
he cannot conclude that either of them is still the length it was when at rest relative to
him because once again they are both in motion relative to him.
However, either O or O' is in a position to judge this question, since each is stationary
relative to one of the rulers. For example, during the period of encounter, observer O
senses no acceleration and can consider himself and the ruler R to be at rest in an
inertial coordinate system XYZ. Using Einstein's second postulate, O knows that the
velocity of light is independent of the motion of the source and equal to a constant c in
all directions in XYZ. Thus since he knows himself to be at the midpoint, that one pulse
of light originated at one fixed end of his ruler and that the other pulse of light
originated at the other fixed end of his ruler, if these two pulses reach him simultaneously,
he will conclude that the arcs occurred simultaneously. This would mean to O that the
coincidences of AB' and A'B were simultaneous and thus that the ruler R' was still the
same length as his own.
Does O receive the two light pulses at the same time? Although this would be
contrary to the observation made by O'', let it nevertheless be assumed that he does. This
would mean that O concludes that the two rulers are still the same length and therefore
that O' was directly opposite O at the instant of origination of the two light pulses. Due
to the finite velocity of light, at the later instant when O receives these two pulses, O' is
already beyond O and the AB' pulse has not yet reached O' whereas the A'B pulse has
already passed him.
But this result is patently impossible. The experiment is completely symmetrical
with respect to O and O', and observer O cannot receive the pulses simultaneously unless
O' does also. Thus the two light pulses do not arrive at O at the same time and O no
longer thinks the two rulers are the same length.

[Figure: positions (a), (b), and (c) of the two rulers, with ends A, B and A', B', as seen by O''; position (c) shows the spherical wavefront of the A'B pulse.]
FIGURE 2.11 Moving ruler experiment: longitudinal motion.

In what order do the two light pulses reach O? If one takes the observations of O''
as a hint, one can assume that the AB' pulse reaches O an interval of time $\delta t$ ahead of
the A'B pulse, as measured by the clock O brought along with him. This implies that
the A'B pulse reaches O' sooner than the AB' pulse, say by an interval of time $\delta t'$, as
measured by the clock of O'. When this assumption is adjusted so that $\delta t = \delta t'$, the
result is completely symmetrical, as required.
An important conclusion has been reached. Observer O now feels that the ruler R' is
shorter than the ruler R by an amount $u\,\delta t$. Since he is still at rest relative to R, he has
no reason to believe that the length of his own ruler is any different from what it had
been originally when at rest in the coordinate system X''Y''Z''. Thus he concludes that
the length of R' depends on its motion relative to him, when that motion is in a direction
parallel to its length.
In like manner observer O' now feels that the ruler R is shorter than the ruler R' by
an equal amount $u\,\delta t'$, and therefore that the length of R depends on its motion relative
to him, when that motion is in a direction parallel to its length.
It is clear that the two observers O and O' are no longer in agreement about
measurements of distance, and that this is occasioned by their relative motion. Furthermore, the
two observers are not in agreement about the measurement of time intervals, for O
thinks that the AB' coincidence occurred first, whereas O' thinks that the A'B
coincidence occurred first. Neither observer has any reason to believe that his own clock is
behaving in a different manner from when they were together at rest in X''Y''Z''. Thus
each observer concludes that the other's clock has been affected by its motion relative
to him.
The temptation exists to raise the protest that the true picture of what is happening
is the sequence shown in Figure 2.11, as "seen" by O''. If O and O' would only take their
motions into account, they could readily explain the time differentials in the arrival of
the two pulses and deduce that the two rulers are really still the same length. But this
point of view puts O'' and his coordinate system in a privileged status. Why shouldn't
O and O' each have the right to consider himself at rest in a coordinate system in which
all the laws of physics are valid in the same form they have in the coordinate system
of O''? If this first postulate by Einstein is accepted as reasonable, then O' can properly
consider himself and his ruler to be at rest in a reference frame X'Y'Z' and to be
measuring the length of R as it drifts by; he legitimately concludes that the measurement
of the length of R reveals a smaller value due to its relative motion.
These remarks can be made with equal validity when discussing O and his right to
consider himself at rest in a reference frame XYZ. Both observers conclude that the
other ruler is shorter, and both are correct, this surprising result being a consequence of
the operational definition of length enunciated earlier.
It has been noted previously that O'' concludes that the two rulers are still the same
length as each other, and he is also correct; his conclusion is due to the fact that both
rulers have the same speed relative to him. O'' cannot say that either ruler is the same
length it was when back on the ground at rest in front of him; to decide this, he would
have to perform an experiment of the type just concluded by O and O'. Were he to
perform such an experiment, he would find that the length of each ruler was now less,
and by the same amount, thus accounting for the fact that the two lengths are still
equal.
Thus the conclusion is reached that when a body is in motion relative to an observer,
the measurement of length parallel to that motion is affected by the motion, being
shorter than when the body is at rest relative to the observer.
General Remarks. The previous set of hypothetical experiments, or
Gedankenexperimente, reveals that space and time, upon being considered on an operational basis
involving measurements, are not invariant concepts. If a material body has a constant
velocity relative to an observer O, and he measures its longitudinal and transverse
dimensions, only his transverse results will agree with those obtained by an observer O'
who is stationary with respect to the body.
What makes this conclusion seem suspect is that in everyday experience one does
not perceive objects apparently shrinking as they take on a relative motion. Airplanes
are not noticeably shorter as they race down a runway, nor does a train seem to extend
its length as it draws to a stop at a station. The reason for this apparent invariance of
length can be traced to the fact that the velocity of light is so great compared to the
velocities of all material objects in one's common experience. In the ruler experiment
just discussed, if $c \gg u$, the time intervals $\delta t$ and $\delta t'$ are so small as to escape detection
when ruler sizes consistent with one's "common sense" are assumed. Thus the change
in length $u\,\delta t$ is normally so small as to be unobservable; however, in considering
astronomical distances or great velocities this is no longer true and the effect is detectable
and significant.
Likewise, when one motors past a tower clock, the movement of its hands does not
appear altered by the relative motion. Here again this is due to the great disparity
between the velocity of light and the normal velocities of motor vehicles. Thus in the
ruler experiment just discussed, the size of $\delta t$ and $\delta t'$ is an index of the difference between
the readings of the clocks of O and O' as they record two events, the coincidences of
ends of the rulers. But for normal velocities u, and normal ruler lengths, $\delta t$ and $\delta t'$ are
negligible and the two clocks appear to agree. It is only when great velocities or
distances come into play that the differences in the readings of these clocks become
important.
For this reason one can appreciate that the discovery that measurements of time and
distance depend on relative motion in no way upsets the large body of common
experience built up during one's lifetime. Nevertheless, it is of the utmost scientific
importance to recognize that these effects exist. This can only be done by considering
situations beyond common experience in which such effects are significant. The purpose of
much of the remainder of this chapter will be to explore such situations.
The reader may wonder why light signals were chosen in these ruler experiments
rather than some other means of communication. The reason for this is that only light
signals can propagate in a vacuum, and if there were any other medium surrounding the
two rulers it would have a relative motion with respect to each ruler. In general these
relative motions would not be the same, thus destroying the complete symmetry of the
experiment, a crucial point in the argument.
From consideration of the ruler experiment involving longitudinal motion, it is
evident that the change in length of a moving ruler, given by $u\,\delta t$, is dependent on clock
readings. The interval of time $\delta t$ between the AB' and A'B coincidences is in turn
dependent on the length of the rulers, so that measurements of time intervals and
space increments are interdependent.
Although two observers in relative motion will not, in general, agree about the
distance between two points nor about the time interval between two events, this does not
mean that each cannot predict the values of the other's measurements from a
knowledge of his own. Such predictions are accomplished via the coordinate transformation
equations which link the two frames of reference in which the two observers are
stationary. It is now apparent that the Galilean Equations (2.4), which assume a time
invariance and predict a length invariance, are only approximate, and will need to be
modified in order to find the proper way to link the observations of O and O' so as to
be consistent with situations such as the two foregoing ruler experiments.

2.10 THE LORENTZ TRANSFORMATION

A satisfactory modification of the Galilean transformation can be accomplished by
returning to the ruler experiment involving longitudinal motion. Upon referring again
to the situation of Figure 2.11, one can let the origin of an XYZ coordinate system be
affixed to the tip of the probe at B and can let the origin of an X'Y'Z' coordinate
system be placed at the tip of the probe at A', as shown in Figure 2.12. These two
coordinate systems then have a relative speed u; the axes can be aligned so that X' and X
slide along each other, with the Y' and Y axes and the Z' and Z axes respectively
parallel, thus duplicating the situation of Figure 2.1. Let it further be assumed that
observer O, who is stationary in XYZ, selects his time origin so that t = 0
corresponds to the A'B coincidence; in like manner O', who is stationary in X'Y'Z', will
be assumed to have chosen his time origin so that t' = 0 corresponds to the A'B
coincidence, thus causing the two origins to coincide when t = t' = 0.

FIGURE 2.12 Coordinate systems fixed on each ruler.

It will be imagined that conceptually O determines a unique triplet of numbers
(x,y,z) for every point in XYZ space by laying out identical scales (e.g., in meters)
along his three axes, and that similarly O' determines a unique triplet of numbers
(x',y',z') for every point in X'Y'Z' space by laying out identical scales along his
three axes. It is further assumed that O and O' lay out these scales using the same
standard of length (e.g., a meter stick). By this it is meant that if O measures lengths
in terms of a ruler R marked in meters and at rest in XYZ, and if O' measures lengths
in terms of a ruler R' marked in meters and at rest in X'Y'Z', then if these two rulers
were brought to rest side by side, markings one meter apart on R would coincide with
markings one meter apart on R'.
Additionally, it will be necessary for each observer, O and O', to measure time
unambiguously at every point in his coordinate system. To this end it will be
conceived that O has an inexhaustible supply of identical clocks, such that he has been
able to station one clock permanently at each point in XYZ. To ascertain that all
of these clocks are set properly and running at the same rate, O can then select one of
them as the reference and perform the following experiment: O places himself at the
reference clock and stations an auxiliary observer $O_1$ at the clock to be synchronized.
O sends out a pulse of light at time $t_a$ on the reference clock, directing it toward $O_1$, who
reflects it back by means of a mirror. The returned pulse of light reaches O at time $t_b$.
The clock where $O_1$ is stationed was set properly if it read $(t_a + t_b)/2$ at the instant the
light pulse reached the mirror. It is running at the proper rate if it proves to be set
properly every time O and $O_1$ choose to perform this experiment.
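The synchronization test just described can be sketched in a few lines of Python. This is only an illustration: the function names, the tolerance, and the sample numbers are assumptions introduced here, and the distance to the mirror is supplied solely to generate plausible readings for the trial.

```python
C = 299_792_458.0  # speed of light in m/s


def expected_remote_reading(t_a, t_b):
    """Reading the remote clock should show when the pulse reaches the mirror."""
    return (t_a + t_b) / 2.0


def is_set_properly(t_a, t_b, remote_reading, tol=1e-12):
    """True if the remote clock agrees with the midpoint rule within tol seconds."""
    return abs(remote_reading - expected_remote_reading(t_a, t_b)) <= tol


# Hypothetical trial: mirror 300 m from the reference clock.
distance = 300.0
t_a = 5.0                          # emission time on the reference clock
t_b = t_a + 2.0 * distance / C     # return time on the reference clock
remote = t_a + distance / C        # reading of a correctly set remote clock
print(is_set_properly(t_a, t_b, remote))
```

Repeating the trial at later times and obtaining the same verdict corresponds to the check on the clock's rate.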
In this manner every clock in XYZ can be synchronized to the reference clock, and
thus to every other clock in XYZ. It will be assumed that this has been done, and this
will be the conception of time in the frame of reference XYZ.
Likewise, it can be conceived that O' has an inexhaustible supply of identical clocks
which he has arrayed at fixed points in X'Y'Z' and which he has synchronized by the
same procedure. It will be further assumed that if these two sets of clocks were brought
to rest relative to each other, they would be found to be identical and running at the
same rate.
With these concepts of spatial position and time, let an event be defined for
observer O as something which happens at a point P(x,y,z) at time t, or more briefly
at the "point" P(x,y,z,t). The same event will occur for observer O' at the "point"
P'(x',y',z',t').
Returning now to a consideration of the pulse of light caused by the coincidence of
A' and B, imagine that O has stationed an auxiliary observer $O_1$ at the fixed point
(x,y,z) and that $O_1$ records the event that this light pulse passes him as having occurred
at time t. Then it follows that O can characterize this event by the equation

$$x^2 + y^2 + z^2 = (ct)^2 \tag{2.33}$$

Imagine further that O' has stationed an auxiliary observer $O_1'$ at the fixed point
(x',y',z') and that $O_1$ and $O_1'$ just happen to coincide at the instant the light pulse
passes. $O_1'$ records the event as having occurred at a time t', and O' can write

$$(x')^2 + (y')^2 + (z')^2 = (ct')^2 \tag{2.34}$$

The transformation equations which link the observations in XYZ to those in X'Y'Z'
must be such that O can derive (2.34) from (2.33) and such that O' can derive (2.33)
from (2.34), since they are describing the same event.
The discussions of the previous section have already provided much information
about this transformation. For example, observers O and O' agree about distances in
the transverse directions and can write

$$y' = y \qquad z' = z \tag{2.35}$$

Further, time intervals and spatial increments were found to be interdependent when
considering measurements in the longitudinal direction. Every motion that
is uniform and rectilinear in XYZ must also appear uniform and rectilinear in X'Y'Z',
so that the transformation from (x,t) to (x',t') takes straight lines into straight lines
and is therefore linear. It follows that

$$x' = \alpha_1 x + \alpha_2 t \tag{2.36}$$

$$t' = \alpha_3 x + \alpha_4 t \tag{2.37}$$


The absence of constant terms in these two equations is due to the fact that
(x' = 0, t' = 0) corresponds to (x = 0, t = 0). The problem now remains to
evaluate the coefficients $\alpha_i$.
First of all, note that if a point P'(x',y',z') is fixed with respect to observer O', this
point appears to be moving in the positive X direction at speed u when observed by O.
For such a point, taking differentials of (2.36) gives

$$dx' = 0 = \alpha_1\,dx + \alpha_2\,dt$$

$$\alpha_2 = -\alpha_1 \frac{dx}{dt}$$

But in this case dx/dt = u so that $\alpha_2 = -u\alpha_1$ and (2.36) can be rewritten

$$x' = \alpha_1 (x - ut) \tag{2.38}$$

The remaining three constants, $\alpha_1$, $\alpha_3$, and $\alpha_4$, can be determined by requiring that
(2.33) and (2.34) transform into each other. Thus if Equations (2.35), (2.37), and (2.38)
are substituted into (2.34), one obtains

$$\alpha_1^2 x^2 - 2\alpha_1^2 uxt + \alpha_1^2 u^2 t^2 + y^2 + z^2 = \alpha_3^2 c^2 x^2 + 2\alpha_3\alpha_4 c^2 xt + \alpha_4^2 c^2 t^2$$

Since this must agree with (2.33) for all values of x, y, z, and t, it follows that

$$\alpha_1^2 - \alpha_3^2 c^2 = 1$$

$$2\alpha_1^2 u + 2\alpha_3\alpha_4 c^2 = 0$$

$$\alpha_4^2 c^2 - \alpha_1^2 u^2 = c^2$$

Solution of these three equations gives

$$\alpha_1^2 = \alpha_4^2 = (1 - u^2/c^2)^{-1} \qquad \alpha_3 = -\frac{\alpha_1 u}{c^2}$$

which yields the result

$$\begin{aligned} x' &= \kappa(x - ut) \\ y' &= y \\ z' &= z \\ t' &= \kappa(t - ux/c^2) \end{aligned} \tag{2.39}$$

with

$$\kappa = (1 - u^2/c^2)^{-1/2}$$
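As a numerical sanity check on the result (2.39), the following Python sketch evaluates the three coefficient conditions and confirms that an event on the light sphere (2.33) also satisfies (2.34). The function names are illustrative assumptions, units with c = 1 are used by default, and the positive roots are taken for the first and fourth coefficients.

```python
import math


def kappa(u, c=1.0):
    """Factor kappa = (1 - u^2/c^2)^(-1/2); requires |u| < c."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def lorentz(x, y, z, t, u, c=1.0):
    """Equations (2.39): coordinates of an event in X'Y'Z' from those in XYZ."""
    k = kappa(u, c)
    return k * (x - u * t), y, z, k * (t - u * x / c**2)


def coefficient_residuals(u, c=1.0):
    """Left side minus right side of the three coefficient conditions."""
    a1 = a4 = kappa(u, c)
    a3 = -a1 * u / c**2
    return (
        a1**2 - a3**2 * c**2 - 1.0,
        2.0 * a1**2 * u + 2.0 * a3 * a4 * c**2,
        a4**2 * c**2 - a1**2 * u**2 - c**2,
    )


def light_sphere_residual(x, y, z, u, c=1.0):
    """Choose t so the event satisfies (2.33), then evaluate (2.34) in X'Y'Z'."""
    t = math.sqrt(x * x + y * y + z * z) / c
    xp, yp, zp, tp = lorentz(x, y, z, t, u, c)
    return xp**2 + yp**2 + zp**2 - (c * tp) ** 2


print(coefficient_residuals(0.6))
print(light_sphere_residual(1.0, 2.0, 3.0, 0.6))
```

All printed residuals should vanish to within floating-point rounding.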
These important equations were derived by Einstein in his 1905 paper using an
argument which has been reproduced in its essentials. They are commonly called the
Lorentz transformation equations, so named by Poincaré in honor of H. Lorentz,
who had derived them earlier (1903) under a different set of hypotheses.†
If u and the range of the variable x are both small compared to the velocity of light c,
Equations (2.39) reduce to

$$\begin{aligned} x' &\approx x - ut \\ y' &= y \\ z' &= z \\ t' &\approx t \end{aligned}$$

which is essentially the Galilean transformation (2.17). Thus for velocities and
distances encountered in common experience the Lorentz transformation can be
approximated with negligible error by the Galilean transformation, a conclusion which is
consistent with the discussion of the previous section.

† These equations actually had been used even earlier by Voigt (1887) in connection with vibrating
motion. Lorentz in his development assumed the existence of an ether, the physical contraction of
bodies due to their motion through the ether, and required that Maxwell's equations transform
properly.
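To see how small the departure from the Galilean form is at everyday speeds, one can compare the two transformations numerically. The sketch below uses Python's decimal module because the corrections fall near the limit of ordinary double precision; the function names and the sample values (a vehicle at 30 m/s, an event 1 km away after 10 s) are assumptions chosen for illustration.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50          # enough digits to resolve parts in 10^15
C = Decimal(299792458)          # speed of light in m/s


def lorentz_xt(x, t, u):
    """x' and t' from Equations (2.39)."""
    x, t, u = Decimal(x), Decimal(t), Decimal(u)
    k = 1 / (1 - (u / C) ** 2).sqrt()
    return k * (x - u * t), k * (t - u * x / C**2)


def galilean_xt(x, t, u):
    """x' and t' from the low-speed approximation x' = x - ut, t' = t."""
    x, t, u = Decimal(x), Decimal(t), Decimal(u)
    return x - u * t, t


xl, tl = lorentz_xt(1000, 10, 30)
xg, tg = galilean_xt(1000, 10, 30)
print("difference in x' (m):", xl - xg)
print("difference in t' (s):", tl - tg)
```

For these inputs both differences are many orders of magnitude below anything measurable with ordinary rulers and clocks.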
Equations (2.39) can be inverted to give the Lorentz transformation proceeding the
other way, namely,

$$\begin{aligned} x &= \kappa(x' + ut') \\ y &= y' \\ z &= z' \\ t &= \kappa(t' + ux'/c^2) \end{aligned} \tag{2.40}$$

Note that the only difference between (2.40) and (2.39) is the sign of u. But this is to
be expected; if X'Y'Z' is advancing along the X axis at velocity u, then XYZ is receding
along the X' axis at velocity -u.
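The statement that the inverse differs only in the sign of u can be checked with a short round trip. In this Python sketch the function names are illustrative assumptions and c = 1 by default.

```python
import math


def kappa(u, c=1.0):
    """Factor kappa = (1 - u^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def to_primed(x, t, u, c=1.0):
    """Equations (2.39) for the x and t coordinates."""
    k = kappa(u, c)
    return k * (x - u * t), k * (t - u * x / c**2)


def to_unprimed(xp, tp, u, c=1.0):
    """Equations (2.40): the same form with u replaced by -u."""
    return to_primed(xp, tp, -u, c)


x, t, u = 2.0, 5.0, 0.8
xp, tp = to_primed(x, t, u)
print(to_unprimed(xp, tp, u))   # recovers (x, t) up to rounding
```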
A more general form of the Lorentz equations could be obtained by introducing
a third Cartesian coordinate system X*Y*Z* at rest with respect to XYZ but with
its axes tilted in an arbitrary way with respect to those of XYZ. This has the effect
of letting X'Y'Z' move through X*Y*Z* in an arbitrary direction. Even greater
generality could then be obtained by selecting a fourth frame X'*Y'*Z'* arbitrarily
tilted and displaced (statically) with respect to X'Y'Z'. The result would be that the
equations connecting X*Y*Z* and X'*Y'*Z'* form the most general Lorentz
transformation, corresponding to the most general Galilean transformation (2.4). However, no loss
in generality will occur from confining one's attention to the simpler Lorentz
transformation (2.39), since the transformations from XYZ to X*Y*Z* and from X'Y'Z' to
X'*Y'*Z'* are static, and therefore Galilean. This discussion parallels the remarks of
Section 2.2.
The Lorentz transformation equations can be looked upon as the means whereby one
links the two quartets of numbers P(x,y,z,t) and P'(x',y',z',t') which identify the
same event. This process has wide applicability since many physical phenomena can be
expressed in terms of events. For example, the progression of a mass particle along a
path can be thought of as a continuous sequence of events. P(x,y,z,t) traces this
progression as seen by O, with the spatial variables continuous functions of the temporal
variable. The progression of this same particle as seen by O' can be deduced through use
of the Lorentz equations.
It should be noted that the transformations (2.39) and (2.40) are nonphysical for
$u \geq c$.

2.11 LENGTH AND TIME UNDER THE LORENTZ TRANSFORMATION

It is now possible to give a quantitative interpretation of the second ruler experiment
of Section 2.9 in terms of the Lorentz transformation. Let the two ends of the ruler R'
be at $x_1'$ and $x_2'$. These spatial coordinates are independent of t' and observer O' can say
that the length of R' is

$$l_{R'}^0 = x_2' - x_1'$$

If observer O wishes to measure the length of R', since it is in motion with respect to
him, he should measure its end coordinates $x_2$ and $x_1$ at a common time t. Using the
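first of Equations (2.39), one can then write

$$x_1' = \kappa(x_1 - ut) \qquad x_2' = \kappa(x_2 - ut) \qquad x_2' - x_1' = \kappa(x_2 - x_1)$$

from which

$$l_{R'} = x_2 - x_1 = l_{R'}^0 (1 - u^2/c^2)^{1/2} \tag{2.41}$$

in which $l_{R'}$ is the length of the ruler R' as determined by O, and $l_{R'}^0 = x_2' - x_1'$ is its
length as determined by O'. One could similarly
investigate the length of the ruler R using the first of Equations (2.40) and conclude
that R appears contracted to O' by the same factor.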
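The role of the common time t in this measurement can be made concrete with a small Python sketch. It solves the first of Equations (2.39) for the XYZ position of each ruler end at a given t and subtracts; the function names are illustrative assumptions and c = 1 by default.

```python
import math


def kappa(u, c=1.0):
    """Factor kappa = (1 - u^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def end_position(x_primed, t, u, c=1.0):
    """x at time t of a point fixed at x_primed in X'Y'Z', from x' = kappa (x - u t)."""
    return x_primed / kappa(u, c) + u * t


def measured_length(x1_primed, x2_primed, t, u, c=1.0):
    """Length found by O when both ends are located at the same time t."""
    return end_position(x2_primed, t, u, c) - end_position(x1_primed, t, u, c)


# A ruler of rest length 1 moving at u = 0.6 c, measured at two different times.
print(measured_length(0.0, 1.0, t=0.0, u=0.6))
print(measured_length(0.0, 1.0, t=3.7, u=0.6))
```

The result does not depend on which common time is chosen, only on the requirement that both ends be read at the same t.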
Equation (2.41) is seen to be exactly the Lorentz-FitzGerald contraction formula.
However, it is to be remembered that the Lorentz contraction hypothesis included an
ether-filled space which did not contract, there being rather a physical contraction of
material bodies as they moved through the ether. Experiment proved this hypothesis
to be untenable. The interpretation to be placed on (2.41) is that the distance between
two points in one coordinate system appears to be contracted to an observer in relative
motion parallel to the line connecting these two points, whether a material body is present
or not. This is not an apparent contraction of material bodies alone, but of all of space;
as mentioned earlier, it is an effect caused by the operational definition of the
measurement of length. If $u \ll c$, this contraction is insignificant unless the length itself is very
great. Two widely different examples serve to point up this effect.
EXAMPLE 2.1
The vehicular tunnel under Mont Blanc connecting France and Italy is 11.2 km long.
How much shorter does this tunnel appear to a motorist driving through it at 100 kph?
Equation (2.41) is applicable to this situation, and the first two terms of a power series
expansion give

$$l_{R'} = l_{R'}^0 \left(1 - \frac{1}{2}\frac{u^2}{c^2} + \cdots\right)$$

$$\Delta l = l_{R'}^0 - l_{R'} \approx \frac{l_{R'}^0}{2}\frac{u^2}{c^2} = \frac{11.2}{2}\left(\frac{10^5/3600}{3 \times 10^8}\right)^2 = 4.8 \times 10^{-14} \text{ km} = 0.000000048 \text{ mm}$$
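The arithmetic of this example is easy to reproduce, with one caution: evaluating the difference between the rest length and the contracted length directly in floating point would lose most of its significant digits, so the sketch below uses either the series term or an algebraically equivalent form that avoids the cancellation. The function names are illustrative assumptions, and c is taken as the rounded value used in the example.

```python
import math

C = 3.0e8  # m/s, rounded value used in the example


def shortening_series(l0, u, c=C):
    """Leading term of the expansion: (l0 / 2) (u/c)^2."""
    return 0.5 * l0 * (u / c) ** 2


def shortening_exact(l0, u, c=C):
    """l0 [1 - sqrt(1 - b2)] rewritten as l0 b2 / (1 + sqrt(1 - b2)), b2 = (u/c)^2."""
    b2 = (u / c) ** 2
    return l0 * b2 / (1.0 + math.sqrt(1.0 - b2))


tunnel = 11.2e3            # tunnel length in m
speed = 100.0e3 / 3600.0   # 100 km/h in m/s
delta = shortening_series(tunnel, speed)
print(f"{delta:.2e} m = {delta * 1e3:.2e} mm")
```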
EXAMPLE 2.2
Sirius, the brightest star in the heavens, is estimated to be 8.5 light years from Earth. If a
group of space travelers were to journey from Earth to Sirius, having achieved a velocity
of 0.90c relative to the solar system before cutting out their rocket motors, how far away
from Earth would Sirius seem to them?
To these observers the distance would appear contracted, being given by

$$d' = 8.5[1 - (0.90)^2]^{1/2} = 3.7 \text{ light years}$$

which is far from being an insignificant contraction.
Since this segment of length is going past them at 0.90c, the space travelers compute
that the journey will take them a period of time

$$T' = \frac{3.7}{0.90} = 4.1 \text{ years}$$

An observer back on Earth will estimate that the journey will consume an amount of time

$$T = \frac{8.5}{0.90} = 9.44 \text{ years}$$
This disparity is due to the fact that the two sets of observers also disagree about time
intervals because of their relative motion. The disparity is large because of the high velocity
and the great distance involved.
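The figures in this example can be reproduced with a few lines of Python. The variable and function names are illustrative assumptions; distances are in light years and speeds in units of c, so that travel times come out in years.

```python
import math


def contracted_distance(d0, beta):
    """Distance seen by observers moving at speed beta (in units of c)."""
    return d0 * math.sqrt(1.0 - beta**2)


D_EARTH_FRAME = 8.5   # light years, as estimated from the solar system
BETA = 0.90           # traveler speed in units of c

d_travelers = contracted_distance(D_EARTH_FRAME, BETA)
t_travelers = d_travelers / BETA      # years, by the travelers' reckoning
t_earth = D_EARTH_FRAME / BETA        # years, by the Earth observer's reckoning

print(f"contracted distance: {d_travelers:.2f} light years")
print(f"travelers' time:     {t_travelers:.2f} years")
print(f"Earth observer time: {t_earth:.2f} years")
```

The ratio of the two travel times is the same factor that relates the two distances.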

The preceding example indicated a situation in which two observers in relative
motion would disagree about the time interval between two events. This phenomenon
can be treated more generally by considering a particular clock in X'Y'Z' which
remains at the fixed coordinates (x',y',z') and is thus being passed by a sequence of
XYZ clocks. One can define a first event when the hands of this single X'Y'Z' clock
indicate the time $t_1'$ and a second event when its hands indicate the time $t_2'$.
In XYZ, the first event will occur at the spatial position

$$x_1 = \kappa(x' + ut_1') \qquad y = y' \qquad z = z'$$

these equations resulting from the application of (2.40). The XYZ clock at this position
registers the time of the first event as

$$t_1 = \kappa\left(t_1' + \frac{u}{c^2}x'\right)$$

Similarly, in XYZ, the second event will occur at the spatial position

$$x_2 = \kappa(x' + ut_2') \qquad y = y' \qquad z = z'$$

and the XYZ clock at this position registers the time of the second event as

$$t_2 = \kappa\left(t_2' + \frac{u}{c^2}x'\right)$$

From this it follows that

$$t_2 - t_1 = \frac{t_2' - t_1'}{(1 - u^2/c^2)^{1/2}} > t_2' - t_1' \tag{2.42}$$

Consider this result first from the viewpoint of O', who is stationary beside the single
X'Y'Z' clock. He watches a succession of XYZ clocks go by and can take only a single
reading of each of them. However, he notices that they seem to be set progressively
further and further ahead, thus accounting for the inequality in (2.42).
On the other hand, observer O can take a sequence of readings of the X'Y'Z' clock
as it passes a succession of XYZ clocks. Since he knows these clocks are all
synchronized, he concludes that the rate of the X'Y'Z' clock is slowed by its relative motion.
These results are symmetrical and the same conclusions could be reached if a single
XYZ clock were considered to be passing a succession of X'Y'Z' clocks. Thus it can be
concluded that when the readings of a succession of moving clocks are compared with
those of a single stationary clock, successive moving clocks appear to be set further and
further ahead; when the readings of a single moving clock are compared with those of
a family of stationary clocks, the moving clock appears to be running slow. This effect
is known as time dilatation and is given quantitatively by (2.42).
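The steps leading to (2.42) can be traced numerically. The following Python sketch transforms two readings of a single clock fixed in X'Y'Z' into XYZ using (2.40) and compares the intervals; the function names and sample values are illustrative assumptions, with c = 1 by default.

```python
import math


def kappa(u, c=1.0):
    """Factor kappa = (1 - u^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def to_unprimed(xp, tp, u, c=1.0):
    """Equations (2.40) for the x and t coordinates."""
    k = kappa(u, c)
    return k * (xp + u * tp), k * (tp + u * xp / c**2)


def dilated_interval(xp, t1p, t2p, u, c=1.0):
    """Interval between two readings of one X'Y'Z' clock, as timed by XYZ clocks."""
    _, t1 = to_unprimed(xp, t1p, u, c)
    _, t2 = to_unprimed(xp, t2p, u, c)
    return t2 - t1


# Clock fixed at x' = 5, read at t' = 1 and t' = 3, moving at u = 0.6 c.
print(dilated_interval(5.0, 1.0, 3.0, 0.6))

# The two events occur at different XYZ positions, separated by u (t2 - t1).
x1, t1 = to_unprimed(5.0, 1.0, 0.6)
x2, t2 = to_unprimed(5.0, 3.0, 0.6)
print(x2 - x1, 0.6 * (t2 - t1))
```

The second print illustrates why two different XYZ clocks are needed to time the two events.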
EXAMPLE 2.3
Direct experimental evidence of the time dilatation effect exists. For example, the
lifetimes of $\pi$ mesons have been studied both for the case of mesons at rest in the laboratory,
and for the case when they are in motion relative to the laboratory. $\pi$ mesons are unstable
and they decay into a $\mu$ meson and a neutrino, obeying the exponential law

$$N = N_0 e^{-t/T} \tag{2.43}$$

when at rest in the laboratory. In Equation (2.43), $N_0$ is the number of $\pi$ mesons existing
at time t = 0 and N is the number surviving at a later time t; e is the base for natural
logarithms and T is the characteristic lifetime of the decay process. Several experimenters [16, 17]
have established the average value $T = 2.56 \times 10^{-8}$ sec for $\pi^+$ mesons at rest.
The decay in a beam of $\pi^+$ mesons traveling at 0.755c relative to the laboratory has
also been studied [18]. By passing the beam through a series of counters, and noting the relative
numbers of counts in successive counters as a function of the separation distance between
counters, it was established that the separation distance needed to be 8.43 m in order to
have the fourth counter register the passage of only 1/e as many $\pi^+$ mesons as did the
third counter.
In the laboratory frame of reference, it takes the meson beam

$$t = \frac{8.43}{0.755c} = 3.72 \times 10^{-8} \text{ sec}$$

to travel this distance. If this value for t is inserted in (2.43), it yields the prediction that
the fourth counter should be down from the third by 1/(1.57e), a value which is 36 percent
lower than the experimental results.
The difficulty lies in the fact that (2.43) is valid only in a frame of reference in which the
mesons are at rest. For an observer traveling along with the meson beam, the time interval
between passage of the third and fourth counters is only

$$t' = 3.72 \times 10^{-8}[1 - (0.755)^2]^{1/2} = 2.44 \times 10^{-8} \text{ sec}$$

If this value is inserted for t in (2.43), excellent agreement between prediction and
experiment results.
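The numbers quoted in this example can be checked with a short Python sketch. The constant and function names are illustrative assumptions, and c is taken as the rounded value of 3 × 10^8 m/sec.

```python
import math

C = 3.0e8          # m/s, rounded
LIFETIME = 2.56e-8 # s, characteristic lifetime of pi+ mesons at rest
BETA = 0.755       # beam speed in units of c
SEPARATION = 8.43  # m, spacing between the third and fourth counters


def surviving_fraction(t, lifetime=LIFETIME):
    """Equation (2.43) written as N / N0."""
    return math.exp(-t / lifetime)


t_lab = SEPARATION / (BETA * C)              # transit time in the laboratory
t_rest = t_lab * math.sqrt(1.0 - BETA**2)    # transit time in the meson frame

naive = surviving_fraction(t_lab)        # (2.43) misapplied with laboratory time
corrected = surviving_fraction(t_rest)   # (2.43) with the mesons' own time

print(f"t  = {t_lab:.3e} s, predicted fraction {naive:.3f}")
print(f"t' = {t_rest:.3e} s, predicted fraction {corrected:.3f}")
print(f"observed fraction 1/e = {1.0 / math.e:.3f}")
```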

EXAMPLE 2.4
The Mössbauer effect [19] has also provided a graphic illustration of time dilatation. A source
consisting of the radioactive isotope cobalt 57, which has a convenient half-life of 280 days,
was plated on to the surface of a 0.8-cm diameter iron cylinder as shown in the figure [20].
This cylinder was rigidly mounted between two aluminum plates which also held a
cylindrical shell of lucite. The latter was 13.28 cm in diam, 0.31 cm thick, and concentric with
the iron cylinder. An iron foil enriched in Fe$^{57}$ was glued to the inside surface of the lucite
shell. This assembly was mounted on a shaft and rotated at angular velocities as great as
3,000 rad/sec. A xenon-filled proportional counter was placed near the assembly, just
beyond an intervening lead shield, as shown in the diagram.
As the cobalt 57 nuclei decay, they change into excited nuclei of iron 57. These iron
nuclei emit gamma rays at a frequency $\nu_0 = 3 \times 10^{18}$, and these rays can be directed into

16 M. Jakobson, A. Schulz, and J. Steinberger, "Detection of Positive $\pi$ Mesons by $\pi^+$ Decay," Phys Rev, 81, 894-895; March 1, 1951.
17 C. E. Wiegand, "Measurement of the Positive $\pi$ Meson Lifetime," Phys Rev, 83, 1085-1090; September 15, 1951.
18 R. P. Durbin, H. H. Loar, and W. W. Havens, Jr., "The Lifetimes of the $\pi^+$ and $\pi^-$ Mesons," Phys Rev, 88, 179-183; October 15, 1952.
19 R. L. Mössbauer, "Fluorescent Nuclear Resonance of Gamma Radiation in Iridium 191," Z. Phys., 151, 124-143; 1958. (For an excellent explanation of the Mössbauer effect, see the article by Sergio de Benedetti, Sci Amer, 202, 72-80; April 1960.)
20 H. J. Hay, J. P. Schiffer, T. E. Cranshaw, and P. A. Egelstaff, "Measurement of the Red Shift in an Accelerated System Using the Mössbauer Effect in Fe$^{57}$," Phys Rev L, 4, 165; February 15, 1960.
[Figure: the rotating source-absorber assembly. After Hay, Schiffer, Cranshaw, and Egelstaff, Phys Rev L, 4, 165; 1960.]

a beam aimed at the counter. However, the iron foil glued to the lucite shell, being enriched
with Fe$^{57}$, can absorb these gamma rays and then reradiate them isotropically. This
absorption effect is greatest when the source-absorber assembly is at rest, for then the quantum
energy levels have the same separation $h\nu_0$ in source and absorber.
However, if the assembly is rotated, the source and absorber travel at different speeds
relative to the laboratory and thus the counting of time occurs at different rates in two
coordinate systems, in one of which the source is at rest, and in the other of which the
absorber is at rest. An oscillation of period $\tau$ in the source frame will appear to take a greater
time $\tau'$ as measured by a clock in the absorber frame, the connection being

$$\tau = \tau'\left(1 - \frac{u^2}{c^2}\right)^{1/2} \approx \tau'\left(1 - \frac{1}{2}\frac{u^2}{c^2}\right)$$
Since $\tau = 1/\nu_0$ it follows that

$$\nu' \approx \nu_0\left(1 - \frac{1}{2}\frac{u^2}{c^2}\right)$$

in which $\nu'$ is the frequency of the photons from the cobalt source, as determined in a frame
at rest with respect to the Fe$^{57}$ absorber.
Since u is the relative speed of source and absorber, it follows that

U w(R 2 - R 1) w(6.64 - A.4)


-~-----

c c 3 X 1A 1 0

in which w is the angular velocity of the assembly. Thus the change in frequency of the

<l)
-+-J 104
~
r...
eo 103
~
.~

~
:::1 102
A
A

> 101
Q)

'.A
~
Q) 100
~

a 100 200 300 400 500


Angular velocity (rps)

[After Hay, Schiffer, Cranshaw, and


Egelstaff, Phys Rev L, 4, 165; 1960.]
78 The Special Theory of Relativity CHAPTER 2

gamma rays is approximately

1 u2
Llv = Vo - v' = - -2 Vo = A.A65 w 2
2c
But the absorption spectrum of Fe$^{57}$ is so sharp that the width of the resonance, or the line
width, is only one part in $10^{12}$. In other words, if the incoming photons differ in frequency
from $\nu_0$ by as little as one part in $10^{12}$, the absorption falls off markedly, and a lowered
absorption is evident at even smaller changes in frequency.
This lowered absorption plus isotropic scattering by the Fe$^{57}$ foil manifests itself by an
increased reading in the counter, since more of the original directed beam of gamma rays
gets through to the counter if less is absorbed and scattered. A plot of counter reading
versus angular velocity of the assembly is shown in the graph, and a theoretical curve
based on the absorption spectrum is included for comparison. The agreement between
theory and experiment is seen to be excellent. It is to be noted that this effect would not
be predicted by a theory which assumed time to be an invariant.
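
As a rough numerical illustration of the size of this effect, the following sketch evaluates the fractional shift $u^2/2c^2$ for a few rotor speeds. It assumes that the garbled figures in the text read $R_2 - R_1 = 6.64 - 0.4$ cm and $c = 3 \times 10^{10}$ cm/sec; the function name `fractional_shift` is hypothetical and introduced only for this example.

```python
import math

C_CM_PER_S = 3.0e10      # speed of light in cm/sec, as used in the text
R_DIFF_CM = 6.64 - 0.4   # R2 - R1 in cm (assumed reading of the source figures)

def fractional_shift(rps):
    """Return (nu0 - nu') / nu0 = u^2 / (2 c^2) for a rotor speed in rev/sec."""
    omega = 2.0 * math.pi * rps      # angular velocity in rad/sec
    u = omega * R_DIFF_CM            # relative speed, u = omega (R2 - R1)
    return 0.5 * (u / C_CM_PER_S) ** 2

for rps in (100, 300, 500):
    print(rps, fractional_shift(rps))
```

Under these assumptions the shift at 500 rps is of order $10^{-13}$, which is a sizable fraction of the quoted line width of one part in $10^{12}$.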

2.12 PROPER TIME AND PROPER DISTANCE

One of the cardinal Newtonian beliefs is the invariance of distance, and it has already
been seen in Section 2.2 that a Galilean transformation preserves this invariance.
Another ingrained belief is the invariance of time, and this assumption was necessary
to preserve the form of Newton's force law under a Galilean transformation. However,
it has just been noted that under a Lorentz transformation neither time nor distance is
an invariant.
Nevertheless, time intervals and space intervals may be combined to form a quantity
which is invariant with respect to a Lorentz transformation. Let $\tau_{12}^2$ and $(\tau_{12}')^2$ be
defined by the relations

$$\tau_{12}^2 = (t_2 - t_1)^2 - \frac{1}{c^2}\left[(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2\right] \tag{2.44}$$

$$(\tau_{12}')^2 = (t_2' - t_1')^2 - \frac{1}{c^2}\left[(x_2' - x_1')^2 + (y_2' - y_1')^2 + (z_2' - z_1')^2\right] \tag{2.45}$$

By using either of the transformations (2.39) or (2.40) one can show that

$$\tau_{12}^2 = (\tau_{12}')^2 \tag{2.46}$$

and thus this quantity is an invariant.
To appreciate the physical significance of $\tau_{12}$ (the positive square root of $\tau_{12}^2$), it can
be recognized that if there exists a Lorentzian frame of reference XYZ in which two
events take place at the same spatial point, then $\tau_{12}$ is the time interval between these
two events as recorded by a single clock at rest at this spatial point in XYZ. For this
reason $\tau_{12}$ is called the proper time interval. In another frame of reference X'Y'Z',
$t_2' - t_1'$ is measured by two different clocks because the events are not at the same
spatial point, and thus $t_2' - t_1'$ is sometimes called the nonproper time interval. The
interdependence of space and time is clearly illustrated by (2.44) and (2.45).
It is not always possible to find a Lorentzian frame of reference in which two events
take place at the same spatial point. For imagine that in XYZ they take place at
$(x_1,y_1,z_1)$ and at $(x_2,y_2,z_2)$ at times $t_1$ and $t_2$ respectively, and that upon inserting
these values in (2.44) one finds that $\tau_{12}^2$ is negative. Then $\tau_{12}$ is imaginary for all
Lorentzian frames since it is an invariant. The trouble is that the two points $(x_1,y_1,z_1)$
and $(x_2,y_2,z_2)$ are widely enough separated in XYZ that even an X'Y'Z' frame going
at a relative speed $u \to c$ cannot cover the distance between $(x_1,y_1,z_1)$ and $(x_2,y_2,z_2)$
in so small a time interval as $t_2 - t_1$. Since Lorentzian frames of reference are physically
restricted to relative velocities $u < c$ (for otherwise x' and t' would be imaginary) it
follows that only when $\tau_{12}$ is real is it possible to find a Lorentzian frame in which the
two events occur at the same spatial point. Whenever the value of $\tau_{12}$ is real, the
interval between the two events will be called timelike.
To accommodate situations in which $\tau_{12}$ is imaginary, a new quantity $\epsilon_{12}$ can be
defined by the relation

$$\epsilon_{12}^2 = -c^2\tau_{12}^2 = (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 - c^2(t_2 - t_1)^2 \tag{2.47}$$

from which it follows immediately that $\epsilon_{12}^2$ is an invariant, for

$$\epsilon_{12}^2 = (\epsilon_{12}')^2 \tag{2.48}$$

by virtue of (2.46). $\epsilon_{12}$ (the positive square root of $\epsilon_{12}^2$) is called the proper space interval,
because in a Lorentzian frame in which two events take place at the same time, $\epsilon_{12}$ is
the Cartesian distance between the two events. In another reference frame X'Y'Z',
$t_2' - t_1' \neq 0$, and the distance

$$\left[(x_2' - x_1')^2 + (y_2' - y_1')^2 + (z_2' - z_1')^2\right]^{1/2}$$

is sometimes called the nonproper space interval.
Whenever the value of $\epsilon_{12}$ is real, the interval between the two events will be called
spacelike. Except when $\tau_{12} = \epsilon_{12} = 0$, it is always possible to carry out a Lorentz
transformation to a new frame of reference in which either the two events occur at the same
spatial point ($\tau_{12}$ real) or at the same time ($\epsilon_{12}$ real) but not both. $\tau_{12} = \epsilon_{12} = 0$ is the
boundary between these possibilities, and corresponds to the situation in which the two
events can just be connected by a light ray which leaves the site of one event as it
occurs and arrives at the site of the other event as it occurs.
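
The classification just described can be sketched numerically. The example below assumes that (2.44) has the form implied by (2.47), namely $\tau_{12}^2 = (t_2 - t_1)^2 - [(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2]/c^2$, and uses the standard Lorentz transformation for a frame sliding along X at speed u. The function names and the label "lightlike" for the boundary case are illustrative choices, not the text's.

```python
import math

C = 3.0e8  # speed of light in m/sec (rounded)

def tau_squared(e1, e2, c=C):
    """tau_12^2 for two events, each given as a tuple (x, y, z, t)."""
    dx, dy, dz, dt = (b - a for a, b in zip(e1, e2))
    return dt**2 - (dx**2 + dy**2 + dz**2) / c**2

def classify(e1, e2, c=C):
    """Label the interval by the sign of tau_12^2."""
    t2 = tau_squared(e1, e2, c)
    if t2 > 0:
        return "timelike"
    if t2 < 0:
        return "spacelike"
    return "lightlike"  # boundary case tau_12 = 0 (label assumed for this sketch)

def lorentz(event, u, c=C):
    """Transform an event to a frame sliding along X at speed u, with |u| < c."""
    x, y, z, t = event
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return (k * (x - u * t), y, z, k * (t - u * x / c**2))

origin = (0.0, 0.0, 0.0, 0.0)
near = (1.0e8, 0.0, 0.0, 1.0)   # close enough to be reached at a speed below c
far = (6.0e8, 0.0, 0.0, 1.0)    # too far away to be reached in one second
print(classify(origin, near), classify(origin, far))
print(tau_squared(origin, near), tau_squared(lorentz(origin, 1.8e8), lorentz(near, 1.8e8)))
```

The last line prints $\tau_{12}^2$ in both frames; the two values agree to within rounding, as (2.46) requires.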
These ideas can be given a simple pictorial representation if attention is confined to
events which happen along the X axis. Let an observer O be at $x_1$ at time $t_1$, and let
light signals traveling along the X axis in each direction pass through $x_1$ at $t_1$. The
tracks of these light signals in the XT plane are shown in Figure 2.13 as the lines AB
and CD. The equations for these lines arise from the condition $\tau_{12}^2 = \epsilon_{12}^2 = 0$, and are

$$x - x_1 = \pm c\,(t - t_1) \tag{2.49}$$

These two lines are thus the boundaries between spacelike intervals and timelike
intervals. For events which occur in the areas marked Future and Past, the interval between
such an event and the event $(x_1,t_1)$ is timelike. For events which occur in the areas
marked Present, the interval between such an event and the event $(x_1,t_1)$ is spacelike.
An event $P_3$ anywhere in the Future region is such that the observer O at $x_1$ still has
the opportunity to influence it causally, since he can send a signal over the distance
$|x_3 - x_1|$ at a velocity less than c and have it arrive there in less time than $t_3 - t_1$. An
event $P_4$ anywhere in the region labeled Past happened long enough ago that the
observer O could have learned about it via a signal traveling the distance $|x_4 - x_1|$ at
a velocity less than c and requiring a time interval less than $t_1 - t_4$.

FIGURE 2.13 The divisions of space-time.

However an event $P_5$ anywhere in the region marked Present could be occurring
without observer O being aware of it, for a signal could not be sent in either direction
over the distance $|x_5 - x_1|$, traveling at a velocity no greater than c, and cover this
distance in so small a time interval as $|t_5 - t_1|$.
Newtonian mechanics can be viewed as a theory in which the velocity of light is
infinite, for then the Lorentz transformation is seen to reduce to the Galilean
transformation. This would have the effect on Figure 2.13 of making the lines AB and CD
horizontal and coincident. The Present would then be reduced to a single line of events
occurring at all positions x, but at the single time Now ($t_1$). There would be no
spacelike intervals, just timelike intervals. One consequence of special relativity is thus
the enlargement of the domain of the Present at the expense of the Past and the Future.
EXAMPLE 2.5
Imagine that the X axis is selected to be pointing at the star Betelgeuse, and that this
star is at the position $x_6$. At the present time $t_1$ observer O, stationed at $x_1$, sees Betelgeuse
as it was at the earlier time $t_6$; that is, he sees the event $P_6$. Imagine that at a later time $t_5$
Betelgeuse undergoes a supernova explosion, this being the event $P_5$. At time $t_1$ observer O
is not yet aware of this occurrence. However, as time goes on the crossed lines AB and CD
move vertically upward in the diagram of Figure 2.13. When they have shifted an amount
$t_5 - t_6$, the line CD will cross the event $P_5$ and observer O will become aware of the supernova.

Another enlightening geometrical construction results when one conceptually plots
events in the four-dimensional space $(x,y,z,t)$. For example, the projection on the
XT plane of the history of a moving point might be the sequence of events shown as
the line PQ in Figure 2.14.
In this same plane one can show the axes X' and T'; the equations for these axes can
be obtained by setting t' and x' equal to zero in (2.40). There is no reason why X and T
should be shown orthogonal; if they are so shown, X' and T' most decidedly are not
orthogonal. The line PQ is known as the world line of the moving point and it has the
property of being the same for all Lorentzian frames, the latter differing in the
directions of their space and time axes on Figure 2.14. The locus of all the time axes is the
Past-Future area of Figure 2.13, whereas the locus of all space axes is the Present area.
A world line can follow one of the four axes in Figure 2.14, in which case length
contraction and the slowing of clocks can be deduced geometrically.

FIGURE 2.14 World lines.

2.13 VELOCITY
The general motion of a point, in which the spatial variables are continuous functions
of the temporal variable, can be traced in terms of differentials. From (2.39) and (2.40)
these are

$$dx' = \kappa(dx - u\,dt) \qquad dx = \kappa(dx' + u\,dt')$$
$$dy' = dy \qquad dy = dy'$$
$$dz' = dz \qquad dz = dz'$$
$$dt' = \kappa\left(dt - \frac{u}{c^2}\,dx\right) \qquad dt = \kappa\left(dt' + \frac{u}{c^2}\,dx'\right)$$

Ratios of these differentials may be formed to yield velocity components. For example

$$v_x' = \frac{dx'}{dt'} = \frac{dx - u\,dt}{dt - (u/c^2)\,dx} = \frac{dx/dt - u}{1 - (u/c^2)\,dx/dt} = \frac{v_x - u}{1 - uv_x/c^2}$$

Proceeding in this manner, one can derive the Lorentz velocity transformation
equations, namely,

$$v_x' = \frac{v_x - u}{1 - uv_x/c^2} \qquad v_x = \frac{v_x' + u}{1 + uv_x'/c^2}$$
$$v_y' = \frac{v_y}{\kappa(1 - uv_x/c^2)} \qquad v_y = \frac{v_y'}{\kappa(1 + uv_x'/c^2)} \tag{2.50}$$
$$v_z' = \frac{v_z}{\kappa(1 - uv_x/c^2)} \qquad v_z = \frac{v_z'}{\kappa(1 + uv_x'/c^2)}$$


It may be noticed that the transformation one way differs from the transformation the
other way only in the sign of u. If u and $v_x$ are small compared to c, Equations (2.50)
are approximated quite well by the Galilean velocity law (2.18).
EXAMPLE 2.6
Let there be two particles moving along the X axis. As seen from the XYZ frame of
reference, let one particle have a velocity $v_x = v$ and let the other particle have a velocity
$v_x = -v$. What is their relative velocity?
To answer this question, let X'Y'Z' ride along with one particle by setting $u = v$. Then
from (2.50), the velocity of the other particle in X'Y'Z' is

$$v_x' = \frac{v_x - u}{1 - uv_x/c^2} = \frac{-v - v}{1 + v^2/c^2} = -\frac{2v}{1 + v^2/c^2}$$

For v small this yields the classic result $v_x' = -2v$. However, as $v \to c$, $v_x' \to -c$. Thus even
though in XYZ the two particles might be going in opposite directions with speeds each
of which approaches c relative to XYZ, their recessional velocities relative to each other
are still less than c.
For $v \geq c$ the entire analysis is improper because one cannot then put $u = v$ in (2.50),
since the Lorentz transformation is nonphysical for $u \geq c$.
It can be concluded from the foregoing that if a particle is traveling at a velocity less
than that of light in one Lorentzian frame, it travels at a velocity less than that of light in
all Lorentzian frames.
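
The behavior described in this example can be checked with a few lines of code. The sketch below applies the first of Equations (2.50) with velocities expressed in units of c; the function names are hypothetical and chosen only for illustration.

```python
def transform_vx(vx, u, c=1.0):
    """First of Equations (2.50): x velocity as seen from the frame moving at u."""
    return (vx - u) / (1.0 - u * vx / c**2)

def relative_velocity(v, c=1.0):
    """Particles at +v and -v in XYZ; ride along with the first one (u = v)."""
    return transform_vx(-v, v, c)

for v in (0.001, 0.5, 0.9, 0.999):
    print(v, relative_velocity(v))
```

For small v the printed value is close to the classical $-2v$, while for v near c it approaches $-c$ without exceeding it in magnitude.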

2.14 RELATIVISTIC INTERPRETATION OF THE FIZEAU EXPERIMENT


It will be recalled from Section 2.4 that Fizeau found the velocity of light in water to
be dependent on the motion of the water, this dependency being expressed by Equa-
tion (2.21). An explanation of this result based on an ether hypothesis had been made
earlier by Fresnel, who assumed part of the ether to be dragged along by the water.
A simpler explanation of Fizeau's data is possible in terms of the Lorentz velocity
transformation. Let XYZ and X'Y'Z' be two frames of reference such that X and X'
are aligned with the flow and X' is sliding along X at speed u. Then X'Y'Z' can be
chosen to be at rest relative to the water, resulting in XYZ being at rest relative to
the laboratory. In X'Y'Z' the velocity of the light waves as they pass through the
water is

$$v_x' = v_0 = \frac{c}{n} \tag{2.51}$$

in which n is the index of refraction.
If the appropriate equation from (2.50) is used, this velocity, as viewed from XYZ, is

$$v_x = \frac{c/n + u}{1 + u/cn} \tag{2.52}$$

Expansion of the denominator of (2.52) in a power series (cf. Mathematical Supplement)
gives

$$v_x = \left(\frac{c}{n} + u\right)\left(1 - \frac{u}{cn} + \frac{u^2}{c^2n^2} - \cdots\right)$$

$$v_x = \frac{c}{n} + u - \frac{u}{n^2} - \frac{u^2}{cn} + \frac{u^2}{cn^3} + \frac{u^3}{c^2n^2} - \cdots \tag{2.53}$$

Retaining only terms containing c in powers above $c^{-1}$ gives

$$v_x = v_0 + u\left(1 - \frac{1}{n^2}\right)$$

which is in agreement with (2.21). Neither Fizeau nor his followers had sufficient
experimental sensitivity to detect the effect of the higher order terms in (2.53). Thus the
Fizeau experiment is completely consistent with the Lorentz velocity transformation.
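
To see how closely the first-order drag formula tracks the exact expression (2.52), one can compare the two numerically. In the sketch below the index of refraction and the flow speed are assumed values chosen only for illustration; they are not taken from Fizeau's data.

```python
def exact_vx(u, n, c=3.0e8):
    """Equation (2.52): light velocity in moving water, seen from the laboratory."""
    return (c / n + u) / (1.0 + u / (c * n))

def fresnel_vx(u, n, c=3.0e8):
    """First-order result: v0 + u (1 - 1/n^2), with v0 = c/n."""
    return c / n + u * (1.0 - 1.0 / n**2)

n_water = 1.33   # assumed index of refraction of water
u_flow = 7.0     # assumed flow speed in m/sec

drag_exact = exact_vx(u_flow, n_water) - 3.0e8 / n_water
drag_first_order = u_flow * (1.0 - 1.0 / n_water**2)
print(drag_exact, drag_first_order)
```

The neglected terms are of order $u^2/cn$, so for any laboratory flow speed the two drag values agree far more closely than an optical measurement of that era could resolve.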

2.15 THE CEDARHOLM-TOWNES MASER EXPERIMENT

With the advent of very precise clocks based on the maser principle,$^{21}$ it has recently
become possible to perform an even more sensitive test of the presence of an ether than
that afforded by the Michelson interferometer. This has been accomplished by pointing
the beams of ammonia molecules comprising two masers in opposite directions and
measuring the difference in their oscillating frequencies.
The operation of one of these masers is suggested in Figure 2.15. Ammonia gas is
emitted through an opening in a source S and sprays out into a region containing a
cylinder of electrostatically charged rods. The ammonia molecules normally exist as a
gas in a balance between two energy states, there being a greater population in the
lower state. However, the charged rods repel the ammonia molecules in the higher
state, whereas they attract those in the lower state. As the ammonia gas drifts through
the cylinder of charged rods, the two states start to separate. Those molecules in the
lower state (represented by black dots) diverge whereas those in the upper state
(represented by grey dots) converge. The latter then enter a cavity where, due to the
unbalanced population, some of them spontaneously revert to the lower energy state, emitting
photons of characteristic frequency in the process. As these photons bounce around
in the cavity, a field builds up; if the dimensions of the cavity are properly chosen
to resonate this effect, self-sustaining oscillations can occur, and an electromagnetic
signal of great purity at a stable, precise frequency can be extracted from the cavity by
means of a probe.

FIGURE 2.15 The ammonia beam maser. [After Gordon, Sci Amer, 199, 42; 1958.]

21 J. P. Gordon, H. J. Zeiger, and C. H. Townes, "The Maser-New Type of Microwave Amplifier,
Frequency Standard, and Spectrometer," Phys Rev, 99, 1264-1274; August 15, 1955.

If two such ammonia beam masers are placed back to back, and the signals coupled
out of their respective cavities are compared, a detectable beat will occur if the signal
frequencies are not the same. That an ether theory would predict the presence of a
beat can be seen from the following argument:
Assume that the two masers are back to back and at rest in a coordinate system
X'Y'Z'. Their ammonia beams are presumed to have velocities $\mathbf{v}$ and $-\mathbf{v}$ with respect
to X'Y'Z' and this entire system is presumed to be traveling with respect to the ether
at a velocity $\mathbf{u}$. Under these conditions Møller has shown$^{22}$ that photons emitted from
the first maser beam in the direction characterized by the unit vector $\mathbf{e}'$ have a
frequency $\nu_+'$ in the laboratory frame of reference given by

$$\nu_+' = \nu_0\left[1 + \frac{\mathbf{v}\cdot\mathbf{e}'}{c} + \frac{(\mathbf{v}\cdot\mathbf{e}')^2}{c^2} + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right] \tag{2.54}$$

in which $\nu_0$ is the photon frequency as determined by an observer at rest relative to
the ammonia molecules. A derivation of Equation (2.54) can be found in Appendix B.
In the cavity of the maser oscillator the ammonia molecules emit photons in all
directions, and as a result the signal coupled out of the cavity will have a mean fre-
quency given by
(2.55)

in which $d\Omega$ is an element of solid angle and $f(\mathbf{e}')$ is a weighting function dependent on
the geometrical arrangement of the cavity. Upon introducing (2.54) into (2.55), one
gets for the mean frequency

$$\bar{\nu}_+' = \nu_0\left[1 + g(v) + \frac{\mathbf{u}\cdot\mathbf{v}}{c^2}\right] \tag{2.56}$$

in which $g(v)$ is the mean value of $\dfrac{\mathbf{v}\cdot\mathbf{e}'}{c} + \dfrac{(\mathbf{v}\cdot\mathbf{e}')^2}{c^2}$ and is thus a function only of the
magnitude of $\mathbf{v}$.
If this argument is repeated for the second maser beam, for which $\mathbf{v}$ is replaced by $-\mathbf{v}$,
the mean value $\bar{\nu}_-'$ can similarly be found. The difference in these two mean frequencies
is therefore

$$\bar{\nu}_+' - \bar{\nu}_-' = \frac{2\nu_0}{c^2}\,(\mathbf{u}\cdot\mathbf{v}) \tag{2.57}$$

It has proved possible to achieve a precision of one part in $10^{12}$ in this frequency
comparison. Thus since $v = 0.6$ km/sec for each ammonia beam, with $\nu_0 = 23{,}870$ Mc/sec,
an ether drift u as small as 1/1000th of the orbital velocity of the earth should be
detectable with this apparatus. This is 50 times more sensitive than the apparatus of
Joos which incorporated a Michelson interferometer and was used in 1930.
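
The size of the beat predicted by (2.57) can be estimated directly. The sketch below assumes the values read from the text ($\nu_0 = 23{,}870$ Mc/sec, beam speed 0.6 km/sec) together with an orbital speed of 30 km/sec and a rounded value of c; the function name and the `cos_angle` parameter are hypothetical conveniences.

```python
NU0 = 23870e6     # ammonia line frequency in cycles/sec (23,870 Mc/sec)
V_BEAM = 0.6e3    # molecular beam speed in m/sec (0.6 km/sec)
C = 3.0e8         # speed of light in m/sec (rounded)

def beat_frequency(u_drift, cos_angle=1.0):
    """Equation (2.57): difference of the two mean frequencies, in cycles/sec.

    cos_angle is the cosine of the angle between the drift u and the beam v.
    """
    return 2.0 * NU0 * u_drift * V_BEAM * cos_angle / C**2

orbital = 30.0e3  # orbital speed of the earth in m/sec
# Rotating the apparatus through 180 deg reverses the sign of u . v.
swing = beat_frequency(orbital, 1.0) - beat_frequency(orbital, -1.0)
print(beat_frequency(orbital), swing)
```

With these inputs the beat is a little under 10 cps for aligned drift, so the change on rotation through 180 deg is roughly 20 cps, and a drift one thousandth as large would produce a change of about 1/50 cps.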

22 C. Møller, "On the Possibility of Terrestrial Tests of the General Theory of Relativity," Nuovo
Cimento, 6, Suppl, 381-398; 1957.

Back to back maser oscillators have been constructed by Cedarholm and Townes$^{23}$
and used in the manner just described. The outputs from the cavities of the two masers
were compared in frequency as the entire apparatus was rotated through 180 deg, thus
ensuring in (2.57) a maximum value of $\mathbf{u}\cdot\mathbf{v}$ for some position of the apparatus. In the
words of the investigators
The experiment . . . was carefully done for the first time on September 20, 1958. No
proper effect (in the frequency difference) so large as 1/50 cps was found. Hence, since the
orbital velocity of the Earth of 30 km/s, would have given an effect of 20 cps, the ether
drift could not have been larger than 1/1000 of this value, or 30 m/s. It is, of course, possible
for the motion of the earth to be just cancelled by the motion of the solar system through
the ether at some particular time of the year. The experiment has now been repeated at the
Watson Laboratory during 24-hr. runs at approximately three-month intervals
throughout the year. In none of these runs was any effect so large as 1/50 cps found.
This null result makes the case against an ether theory even more compelling.
Einstein's formulation, which treats the ether as superfluous, predicts a null result in
the Cedarholm-Townes experiment.

2.16 THE VARIATION OF MASS

Since it has been established that the Lorentz transformation affords a satisfactory
explanation of phenomena involving the velocity of light, it now becomes necessary to
reexamine the laws of mechanics. If the principle of relativity is to hold for all physical
laws, and if the Lorentz transformation is the proper link between inertial coordinate
systems, then the laws of mechanics, if properly expressed, should transform satis-
factorily via the Lorentz equations. In a sense this reexamination has already been
started in that the concepts involving the measurement of distance intervals and time
intervals form an integral part of all mechanical laws. A critical study of the
operational definitions of these measurements for moving systems has revealed
that both distance intervals and time intervals are dependent on relative
motion. This reexamination will now be continued by the introduction of another
hypothetical experiment$^{24}$ whose symmetry raises a question about the invariance of
mass.
Imagine that two exactly similar elastic balls suffer a collision which in the X'Y'Z'
frame appears as shown in Figure 2.16a. They are seen to approach each other along
parallel lines, collide, and then recede from each other along parallel lines. Their
approach speeds are equal and by symmetry so too are their recessional speeds. (A
perfectly elastic collision is assumed with no loss of energy, thus causing the recessional
speed to equal the speed of approach.) This experiment can be assumed to take place
either in a region free from gravitational attraction, or on a level frictionless table
over which the balls are sliding.†
Now imagine this same collision as viewed from an XYZ frame which is moving in
the direction of the -X' axis at a speed $u = v_x'$. To an observer O stationary in XYZ,
ball A is moving parallel to the Y axis, and ball B makes a more grazing incidence to
the X axis.

FIGURE 2.16 The collision of two balls.

† A rolling motion would complicate the discussion unnecessarily.
23 J. P. Cedarholm and C. H. Townes, "A New Experimental Test of Special Relativity," Nature,
184, 1350-1351; October 31, 1959.
24 This hypothetical experiment and the ensuing analysis were first offered by G. N. Lewis and R. C.
Tolman in the paper "The Principle of Relativity and Non-Newtonian Mechanics," Phil Mag, 18,
510-523; 1909.

As seen in X'Y'Z', each ball has its y' component of velocity reversed by the
collision but its x' component of velocity is unchanged. As seen in XYZ, ball B has its y
component of velocity reversed by the collision but its x component is unaffected. In
XYZ ball A does not have an x component of velocity either before or after the
collision; however, it does have a y component which suffers a reversal.
Classical mechanics would yield the result for this experiment that $v_y = v_y'$ for ball B
and that in the XYZ frame the velocity of ball A is $\pm v_y$. In terms of a Lorentz
transformation, one would be ill-advised to assume this without checking. Therefore, let
$\pm w_y$ represent the velocity of ball A in XYZ before and after the collision. Using (2.50)
one finds that for ball B

$$v_y' = \frac{v_y}{\kappa(1 - uv_x/c^2)}$$

whereas for ball A

$$v_y' = \frac{w_y}{\kappa}$$

The ratio gives

$$\frac{v_y}{w_y} = 1 - \frac{uv_x}{c^2} \tag{2.58}$$

and thus $v_y < w_y$. Viewed from XYZ, ball A has a greater y component of velocity
than does ball B. (For ordinary velocities the difference is exceedingly small.)

Equation (2.58) requires the abandonment of one or the other of two principles of
classical mechanics. If mass is an invariant, then the principle of conservation of linear
momentum is violated in the y direction in XYZ. If the momentum principle is valid,
then mass cannot be an invariant. The latter assumption has proved to be the one which
is consistent with experiment and will be the basis for what follows.
Let $m_A' = m_B'$ be the two masses in the X'Y'Z' frame (they are equal by symmetry)
and let $m_A \neq m_B$ be the two masses in the XYZ frame. Then

$$m_A w_y = m_B v_y$$

so that

$$m_B = \frac{m_A}{1 - uv_x/c^2}$$

This result can be rephrased entirely in terms of XYZ quantities by using (2.50) to
substitute for $v_x'$. This gives

$$v_x' = u = \frac{v_x - u}{1 - uv_x/c^2}$$

from which

$$u - \frac{u^2v_x}{c^2} = v_x - u$$

$$\frac{uv_x}{c^2} - \frac{u^2v_x^2}{c^4} = \frac{v_x^2}{c^2} - \frac{uv_x}{c^2}$$

$$1 - \frac{2uv_x}{c^2} + \frac{u^2v_x^2}{c^4} = \left(1 - \frac{uv_x}{c^2}\right)^2 = 1 - \frac{v_x^2}{c^2}$$

and thus

$$m_B = m_A\left(1 - \frac{v_x^2}{c^2}\right)^{-1/2} \tag{2.59}$$

This relation is seen not to depend on $v_y$ and should hold even when $v_y = 0$. But then
$w_y = 0$ also, and as seen from X'Y'Z' the two balls approach each other along the X'
axis and barely touch as they pass. As seen from XYZ, ball A is at rest and ball B
passes by, barely touching A as it travels parallel to the X axis. With $m_0$ the mass
of ball A when it is at rest, Equation (2.59) can be rewritten

$$m_B = \frac{m_0}{(1 - v_x^2/c^2)^{1/2}} \tag{2.60}$$

One can now argue that it no longer matters whether ball A is present or not.
Further, the rest mass of ball B should also be $m_0$, since in X'Y'Z' one started with a
symmetrical experiment using identical balls. With only ball B left, in constant rectilinear
motion, the subscripts can be dropped on $m_B$ and $v_x$, giving

$$m = \frac{m_0}{(1 - v^2/c^2)^{1/2}} \tag{2.61}$$

In Equation (2.61), $m_0$ is the rest mass of ball B in the Lorentz frame XYZ, and m is
its dynamic mass when going at a speed v relative to XYZ.
It is inferred from this result that the mass of any material body depends on its
relative motion, increasing with speed according to the relation (2.61).
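
The algebra leading to (2.59) can be spot-checked numerically. The sketch below solves the condition $u = (v_x - u)/(1 - uv_x/c^2)$ for the frame speed u (taking the root with $u < c$), then evaluates the mass ratio $1/(1 - uv_x/c^2)$ that follows from y-momentum conservation and compares it with $(1 - v_x^2/c^2)^{-1/2}$. Velocities are in units of c and the function names are hypothetical.

```python
import math

def frame_speed(vx, c=1.0):
    """Root of u = (vx - u) / (1 - u vx / c^2) with u < c, for 0 < vx < c."""
    return c**2 * (1.0 - math.sqrt(1.0 - (vx / c) ** 2)) / vx

def mass_ratio(vx, c=1.0):
    """m_B / m_A = 1 / (1 - u vx / c^2), from conservation of y momentum."""
    u = frame_speed(vx, c)
    return 1.0 / (1.0 - u * vx / c**2)

for vx in (0.1, 0.6, 0.99):
    print(vx, mass_ratio(vx), 1.0 / math.sqrt(1.0 - vx**2))
```

The two printed columns agree to rounding error, which is the content of (2.59).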

EXAMPLE 2.7
A clear confirmation of the variability of mass has been given by Zahn and Spees.$^{26}$
Employing a radioactive source S to generate high-speed electrons, they selected a small velocity
range of these electrons through the use of a velocity filter; with a Geiger counter as
detector, they were able to determine the dynamic mass.
As indicated by the figure of the apparatus, C is a parallel plate condenser with extremely
small spacing between the plates (d = 0.4663 mm) and a length of 12 cm. Any electron
which reaches the Geiger counter must pass between these plates. Helmholtz coils (not
shown) create a uniform, constant magnetic field $B = 120.85 \times 10^{-4}$ webers/m$^2$ in a
direction perpendicular to the plane of the paper. The electrons which leave S have
trajectories in the plane of the paper which are circles. Only those electrons whose center of
curvature is at P will enter the condenser traveling parallel to the plates. The radius of
curvature r can be found from the geometry, which yields the formula

$$r^2 = (r - a)^2 + b^2$$

$$r = \frac{a^2 + b^2}{2a}$$

in which a is the distance that the source S is below the mouth of the condenser, and b is
the distance that S is to the right of the mouth. In one run of the apparatus, these
distances had the values $a = 0.02992$ m and $(a^2 + b^2)^{1/2} = 0.0977$ m, yielding $r = 0.1595$ m.

[Figure: the apparatus, showing the source S, the condenser C, and the center of curvature P.
After Zahn and Spees, Phys Rev, 53, 365; 1938.]
The force law can be invoked to determine the velocity of the electrons traveling this
particular trajectory. One obtains

$$\frac{mv^2}{r} = evB$$

and using (2.61) for the dynamic mass, this becomes

$$\frac{v}{(1 - v^2/c^2)^{1/2}} = \frac{e}{m_0}\,rB$$

in which e is the electronic charge ($1.6 \times 10^{-19}$ coul) and $m_0$ is the rest mass ($9.1 \times 10^{-31}$ kg).
Solving for v, one obtains

$$v = 0.749c = 2.25 \times 10^8 \text{ m/sec}$$

26 C. T. Zahn and A. H. Spees, "The Specific Charge of Disintegration Electrons from Radium E,"
Phys Rev, 53, 365-373; March 1, 1938.

Under the action of the magnetic field, these electrons would continue in their circular
path, thus striking the bottom plate of the condenser, if it were not for the electrostatic
voltage between the plates. When this voltage V is properly adjusted, a balancing
electrostatic force results and the electrons are able to travel between the plates, emerging
from the other end to be detected by the Geiger counter. It is clear that regardless of the
value of V, no other electrons, traveling in any other orbit as they leave S, can pass through
the condenser, and those electrons traveling at the speed $2.25 \times 10^8$ m/sec will get through
only if V has a value such that

$$\frac{eV}{d} = evB$$

$$V = dvB$$

$$V = (0.4663 \times 10^{-3})(2.25 \times 10^{8})(120.85 \times 10^{-4})$$

$$V = 1{,}270 \text{ volts}$$

The experimental data, showing counts per minute versus condenser voltage, are given
in the graph. The agreement between theory and experiment is seen to be very good.

[Graph: counts per minute versus condenser voltage (volts). After Zahn and Spees,
Phys Rev, 53, 365; 1938.]

Had the rest mass rather than the dynamic mass been used above in the force equation,
the velocity of the electrons getting through the filter would have been computed to be
$3.4 \times 10^8$ m/sec (greater than c), and the predicted condenser voltage to permit passage would
have been 50 percent higher.
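
The arithmetic of this example can be retraced in a few lines. The sketch below uses the constants quoted in the text and a rounded value $c = 3 \times 10^8$ m/sec (an assumption of this sketch); the variable names are illustrative. The relation $v/(1 - v^2/c^2)^{1/2} = k$ is inverted as $v = k/(1 + k^2/c^2)^{1/2}$.

```python
import math

E_CHARGE = 1.6e-19   # electronic charge in coul
M0 = 9.1e-31         # electron rest mass in kg
C = 3.0e8            # speed of light in m/sec (rounded)
B = 120.85e-4        # magnetic field in webers/m^2
D = 0.4663e-3        # plate spacing in m
A = 0.02992          # distance a in m
HYP = 0.0977         # (a^2 + b^2)^(1/2) in m

r = HYP**2 / (2.0 * A)               # radius of curvature, r = (a^2 + b^2) / 2a
k = E_CHARGE * r * B / M0            # equals v / (1 - v^2/c^2)^(1/2)
v = k / math.sqrt(1.0 + (k / C) ** 2)
V = D * v * B                        # condenser voltage that lets these electrons pass

print(r, v / C, V)
print(k, D * k * B)                  # what a constant rest mass would have predicted
```

The relativistic values reproduce the text's figures to within rounding, while the constant-mass speed k exceeds c and the corresponding voltage is about half again as large.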

2.17 MOMENTUM AND ENERGY


Now let attention be turned to the more general case of a body whose velocity is not
necessarily a constant with time, in either magnitude or direction. Let it be assumed
that Equation (2.61) is applicable to this general case and define momentum by the
relation

$$\mathbf{p} = m\mathbf{v} = \frac{m_0\mathbf{v}}{(1 - v^2/c^2)^{1/2}} \tag{2.62}$$


It is apparent that for modest velocities this reduces to the classical definition of
momentum.
Further let Newton's force law be defined by the relation

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.63}$$

Since $\mathbf{v}$ is now permitted to be time-dependent, it follows that the dynamic mass m is
a function of time and thus that

$$\frac{d\mathbf{p}}{dt} = m\frac{d\mathbf{v}}{dt} + \mathbf{v}\frac{dm}{dt} \tag{2.64}$$

Expressions (2.63) and (2.64) also are seen to reduce to their classical forms for normal
velocities.
The kinetic energy T of a moving body still can be defined as the work supplied to
bring it from rest to its state of motion. Thus

(2.65)

From (2.62)

so that

However since d(v · v) = 2v · dv, this can be rewritten

Therefore (2.65) becomes

$$T = \frac{m_0c^2}{2}\int_0^{v^2/c^2}\frac{dw}{(1 - w)^{3/2}} = m_0c^2\left[\left(1 - \frac{v^2}{c^2}\right)^{-1/2} - 1\right] \tag{2.66}$$
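
For readers who wish to retrace the intermediate algebra, one possible route from the definition of work to (2.66) is sketched below. It assumes that the kinetic energy is written as $T = \int \mathbf{F}\cdot d\mathbf{s}$ and introduces the substitution $w = v^2/c^2$; the steps are a reconstruction consistent with (2.62) through (2.64) and are not necessarily the exact intermediate equations printed in the original.

```latex
\begin{align*}
T &= \int \mathbf{F}\cdot d\mathbf{s}
   = \int \frac{d\mathbf{p}}{dt}\cdot\mathbf{v}\,dt
   = \int \mathbf{v}\cdot d\mathbf{p} \\
d\mathbf{p} &= \frac{m_0\,d\mathbf{v}}{(1 - v^2/c^2)^{1/2}}
   + \frac{m_0\,\mathbf{v}\,(\mathbf{v}\cdot d\mathbf{v})}{c^2\,(1 - v^2/c^2)^{3/2}} \\
\mathbf{v}\cdot d\mathbf{p} &= \frac{m_0\,(\mathbf{v}\cdot d\mathbf{v})}{(1 - v^2/c^2)^{1/2}}
   \left[1 + \frac{v^2/c^2}{1 - v^2/c^2}\right]
   = \frac{m_0\,(\mathbf{v}\cdot d\mathbf{v})}{(1 - v^2/c^2)^{3/2}} \\
 &= \frac{m_0}{2}\,\frac{d(v^2)}{(1 - v^2/c^2)^{3/2}}
   = \frac{m_0c^2}{2}\,\frac{dw}{(1 - w)^{3/2}}, \qquad w = \frac{v^2}{c^2} \\
T &= \frac{m_0c^2}{2}\int_0^{v^2/c^2}\frac{dw}{(1 - w)^{3/2}}
   = m_0c^2\Big[(1 - w)^{-1/2}\Big]_0^{v^2/c^2}
   = m_0c^2\left[\left(1 - \frac{v^2}{c^2}\right)^{-1/2} - 1\right]
\end{align*}
```

The third line uses $d(\mathbf{v}\cdot\mathbf{v}) = 2\mathbf{v}\cdot d\mathbf{v}$, and the last line uses $\int (1 - w)^{-3/2}\,dw = 2(1 - w)^{-1/2}$.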

Equation (2.66) can be expanded in a power series (cf. Mathematical Supplement) to
give

$$T = \frac{1}{2}m_0v^2 + \frac{3}{8}m_0\frac{v^4}{c^2} + \cdots \tag{2.67}$$

If $v \ll c$, (2.67) is approximated quite well by the conventional expression for kinetic
energy.
Equation (2.66) can be written in the interesting form

$$T = \left[\frac{m_0}{(1 - v^2/c^2)^{1/2}} - m_0\right]c^2 = (m - m_0)c^2 \tag{2.68}$$

This suggests that the kinetic energy can be interpreted as the square of the velocity
of light times the change in mass. If the increase in energy is thought of as the cause
of the increase in mass, it becomes an attractive hypothesis to imagine that even the
rest mass $m_0$ is due to an internal amount of energy $m_0c^2$. If $m_0c^2$ is called the rest
energy of the body, then the total energy E, being the sum of the rest energy and
the kinetic energy, is given by

$$E = m_0c^2 + T = mc^2 \tag{2.69}$$

This celebrated equation is one of the most important results of the special theory and
has been amply substantiated by a wide variety of atomic and nuclear experiments. It
lies at the heart of the explanation of fission and fusion bombs and has led to a
satisfactory explanation of stellar energy processes. Verification of (2.69) provides
convincing support of the soundness of the generalized definitions of momentum and the force
law given earlier, on which the derivation of (2.69) was based.
EXAMPLE 2.8
The dynamic balance within a stable star can be explained by arguing that the great
mass causes a high gravitational pressure at the core. This intense pressure serves to
elevate the temperature of the core to millions of degrees and thus permit fusion processes to
occur. The most likely of these processes is the conversion of hydrogen to helium. Four
hydrogen atoms, each consisting of a proton and an electron, can be transformed into a
single helium atom consisting of two protons, two neutrons, and two electrons, as suggested
by the diagram. A charge balance is achieved because two positively charged protons plus
two negatively charged electrons are replaced by two uncharged neutrons. However a
mass balance is not achieved. Since the atomic weights of hydrogen and helium are 1.008
and 4.003, respectively, it follows that 4.032 units of mass are replaced by only 4.003
units. The loss in mass, multiplied by $c^2$, represents the energy radiated during the
transformation. As this energy streams outward from the core it causes a radiation pressure
which balances the gravitational pressure, causing the star to maintain a stable size. This
stability ensures that the rate of the fusion process will remain essentially constant over a
long period of time (billions of years). This in turn makes the solar power available to a
planetary system a constant, a desirable factor in evolutionary processes.
When radiation pressure is computed on the basis $\Delta E = c^2\,\Delta m$, theoretical calculations
yield values for stellar diameters and surface temperatures which are in satisfactory
agreement with observations. Spectrographic studies of our own sun indicate hydrogen is its most
abundant element, with helium next. The relative abundance of these two elements suggests
that the process has been going on for about five billion years, a figure which is in good
agreement with geological data. It also suggests that the process can continue in stable fashion
for another five billion years.
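
The mass defect quoted in this example can be converted into an energy with $\Delta E = c^2\,\Delta m$. The sketch below assumes a value of about $1.6605 \times 10^{-27}$ kg for one atomic mass unit and a rounded value of c; neither constant is given in the text, and the function name is hypothetical.

```python
AMU_KG = 1.6605e-27   # kg per atomic mass unit (assumed value, not from the text)
C = 3.0e8             # speed of light in m/sec (rounded)

def energy_released(mass_in_u, mass_out_u):
    """Energy in joules corresponding to a loss of mass, Delta E = c^2 Delta m."""
    delta_m = (mass_in_u - mass_out_u) * AMU_KG
    return delta_m * C**2

mass_in = 4 * 1.008   # four hydrogen atoms
mass_out = 4.003      # one helium atom
e_joule = energy_released(mass_in, mass_out)
fraction = (mass_in - mass_out) / mass_in
print(e_joule, fraction)
```

Under these assumptions each helium atom formed releases a few times $10^{-12}$ joule, and about 0.7 percent of the original mass is converted to radiated energy.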

2.18 THE TRANSFORMATION LAW FOR MASS


Equation (2.61) is not, of course, the transformation law for mass because it relates
the rest mass in one coordinate system to the dynamic mass in the same coordinate

system. However it can be used to relate the dynamic mass in two different coordinate
systems as follows:
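Let a body of rest mass $m_0$ have a velocity $\mathbf{v}(x,y,z,t)$ in XYZ and a velocity $\mathbf{v}'(x',y',z',t')$
in X'Y'Z'. Then

$$m = \frac{m_0}{(1 - v^2/c^2)^{1/2}}$$

and

$$m' = \frac{m_0}{[1 - (v')^2/c^2]^{1/2}}$$

are the expressions for the dynamic mass in the two coordinate systems. Thus

$$m' = m\left[\frac{1 - v^2/c^2}{1 - (v')^2/c^2}\right]^{1/2}$$

From the velocity transformation Equations (2.50),

$$(v')^2 = \left(1 - \frac{uv_x}{c^2}\right)^{-2}\left[(v^2 - v_x^2)\left(1 - \frac{u^2}{c^2}\right) + (v_x - u)^2\right]$$

so that

$$m' = \kappa\left(1 - \frac{uv_x}{c^2}\right)m \tag{2.70}$$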

Equation (2.70) is the transformation law for mass. In using it one should remember
that in general both m and m' are functions of time.
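
Equation (2.70) can be verified numerically by computing the dynamic mass in X'Y'Z' two ways: directly from the transformed velocity, and from (2.70). The sketch below works in units with c = 1; the function names and the sample velocity are illustrative assumptions.

```python
import math

def kappa(u, c=1.0):
    """kappa = 1 / (1 - u^2/c^2)^(1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def transform_velocity(v, u, c=1.0):
    """Equations (2.50): velocity components in the frame moving at u along X."""
    vx, vy, vz = v
    denom = 1.0 - u * vx / c**2
    k = kappa(u, c)
    return ((vx - u) / denom, vy / (k * denom), vz / (k * denom))

def dynamic_mass(m0, v, c=1.0):
    """Equation (2.61) applied to a velocity vector."""
    speed_sq = sum(comp**2 for comp in v)
    return m0 / math.sqrt(1.0 - speed_sq / c**2)

def transformed_mass(m, vx, u, c=1.0):
    """Equation (2.70): m' = kappa (1 - u vx / c^2) m."""
    return kappa(u, c) * (1.0 - u * vx / c**2) * m

v = (0.3, 0.4, 0.2)   # sample velocity in XYZ, in units of c
u = 0.5               # relative speed of X'Y'Z'
m = dynamic_mass(1.0, v)
print(dynamic_mass(1.0, transform_velocity(v, u)), transformed_mass(m, v[0], u))
```

The two printed values agree to rounding error for any admissible choice of v and u.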

2.19 THE TRANSFORMATION LAW FOR FORCE

On the presumption that the Lorentz equations properly transform all the laws of
physics (as required by the relativity principle) one can write

$$\mathbf{F} = \frac{d}{dt}(m\mathbf{v}) \qquad (2.71)$$

$$\mathbf{F}' = \frac{d}{dt'}(m'\mathbf{v}') \qquad (2.72)$$

and inquire what the force transformation law must be in order to derive either of these
equations from the other via the Lorentz equations.
With the help of (2.50) and (2.70), Equation (2.72) can be expanded to give
$$F'_x = \frac{dt}{dt'}\,\frac{d}{dt}\left[\kappa(v_x - u)m\right]$$

$$F'_y = \frac{dt}{dt'}\,\frac{d}{dt}\left[m v_y\right] \qquad (2.73)$$

$$F'_z = \frac{dt}{dt'}\,\frac{d}{dt}\left[m v_z\right]$$

Formation of the differentials of the last of (2.39) yields


$$\frac{dt}{dt'} = \frac{1}{\kappa(1 - u v_x/c^2)}$$

so that Equations (2.73) become

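$$F'_x = \left(1 - \frac{u v_x}{c^2}\right)^{-1}\frac{d}{dt}(m v_x - m u) = \frac{F_x - u\,(dm/dt)}{1 - u v_x/c^2}$$

$$F'_y = \frac{1}{\kappa(1 - u v_x/c^2)}\,\frac{d}{dt}(m v_y) = \frac{F_y}{\kappa(1 - u v_x/c^2)} \qquad (2.74)$$

$$F'_z = \frac{1}{\kappa(1 - u v_x/c^2)}\,\frac{d}{dt}(m v_z) = \frac{F_z}{\kappa(1 - u v_x/c^2)}$$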
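From (2.61)

$$\frac{dm}{dt} = \frac{m_0/c^2}{(1 - v^2/c^2)^{3/2}}\left(\mathbf{v}\cdot\frac{d\mathbf{v}}{dt}\right) = \frac{1}{1 - v^2/c^2}\left(\frac{\mathbf{v}}{c^2}\right)\cdot\left(m\,\frac{d\mathbf{v}}{dt}\right)$$

With the help of (2.64) this can be rewritten

$$(c^2 - v^2)\,\frac{dm}{dt} = \mathbf{v}\cdot\left(\mathbf{F} - \mathbf{v}\,\frac{dm}{dt}\right)$$

so that

$$\frac{dm}{dt} = \frac{\mathbf{v}\cdot\mathbf{F}}{c^2} \qquad (2.75)$$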
Finally

$$F'_x = F_x - \frac{u v_y/c^2}{1 - u v_x/c^2}\,F_y - \frac{u v_z/c^2}{1 - u v_x/c^2}\,F_z$$

$$F'_y = \frac{F_y}{\kappa(1 - u v_x/c^2)} \qquad (2.76)$$

$$F'_z = \frac{F_z}{\kappa(1 - u v_x/c^2)}$$
Equations (2.76) are known as the force transformation law. It is evident that if u and v
are small compared with c, $\mathbf{F}' \approx \mathbf{F}$, indicating that in such cases the classical expression, which equates
these forces, is a valid approximation.
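
The force transformation law can be exercised numerically. The sketch below implements (2.76) and makes two checks: transforming to X'Y'Z' and then back with relative speed -u recovers the original force, and for everyday speeds the transformed force is practically equal to the original. It assumes the standard form of the velocity transformation (2.50) and $\kappa = (1 - u^2/c^2)^{-1/2}$; all names and sample values are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def kappa(u):
    """Lorentz factor for the relative frame speed u (assumed definition of kappa)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def transform_velocity(v, u):
    """Velocity components in X'Y'Z', assuming the standard form of Equations (2.50)."""
    vx, vy, vz = v
    d = 1.0 - u * vx / C**2
    k = kappa(u)
    return ((vx - u) / d, vy / (k * d), vz / (k * d))


def transform_force(F, v, u):
    """Force transformation law (2.76) for a body with instantaneous velocity v."""
    Fx, Fy, Fz = F
    vx, vy, vz = v
    d = 1.0 - u * vx / C**2
    k = kappa(u)
    Fx_prime = Fx - (u * vy / C**2) / d * Fy - (u * vz / C**2) / d * Fz
    return (Fx_prime, Fy / (k * d), Fz / (k * d))


# Hypothetical sample values, chosen only for illustration.
F = (1.0, 2.0, -3.0)
v = (0.4 * C, 0.5 * C, 0.1 * C)
u = 0.7 * C

F_prime = transform_force(F, v, u)
v_prime = transform_velocity(v, u)

# Inverse transformation: the unprimed frame moves at -u relative to X'Y'Z'.
F_back = transform_force(F_prime, v_prime, -u)

# Low-speed case: u and v far below c, so F' should be nearly equal to F.
F_slow = transform_force(F, (30.0, 40.0, 0.0), 300.0)

print(F_prime)
print(F_back)
print(F_slow)
```

The round trip is a consistency check on the algebra rather than an independent derivation; it only confirms that the law, as written, forms its own inverse when u is replaced by -u.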
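It is significant that Equations (2.76) are linear in the force components. Recalling
that $\mathbf{F}$ or $\mathbf{F}'$ is the total force acting on the body of rest mass $m_0$, if $\mathbf{F}$ is composed of
partial forces such that

$$\mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2 + \cdots + \mathbf{F}_N$$

then each of these partial forces has a counterpart such that

$$\mathbf{F}' = \mathbf{F}'_1 + \mathbf{F}'_2 + \cdots + \mathbf{F}'_N$$

Equations (2.76) can then be written in the expanded form

$$(F'_{1x} + F'_{2x} + \cdots + F'_{Nx}) = (F_{1x} + F_{2x} + \cdots + F_{Nx}) - \frac{u v_y/c^2}{1 - u v_x/c^2}\,(F_{1y} + F_{2y} + \cdots + F_{Ny}) - \frac{u v_z/c^2}{1 - u v_x/c^2}\,(F_{1z} + F_{2z} + \cdots + F_{Nz})$$

$$(F'_{1y} + F'_{2y} + \cdots + F'_{Ny}) = \frac{F_{1y} + F_{2y} + \cdots + F_{Ny}}{\kappa(1 - u v_x/c^2)}$$

$$(F'_{1z} + F'_{2z} + \cdots + F'_{Nz}) = \frac{F_{1z} + F_{2z} + \cdots + F_{Nz}}{\kappa(1 - u v_x/c^2)}$$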
Since the partial forces are in general independent, it follows that

$$F'_{nx} = F_{nx} - \frac{u v_y/c^2}{1 - u v_x/c^2}\,F_{ny} - \frac{u v_z/c^2}{1 - u v_x/c^2}\,F_{nz}$$

$$F'_{ny} = \frac{F_{ny}}{\kappa(1 - u v_x/c^2)} \qquad (2.77)$$

$$F'_{nz} = \frac{F_{nz}}{\kappa(1 - u v_x/c^2)}$$

in which $\mathbf{F}_n$ and $\mathbf{F}'_n$ represent the nth partial force as determined in XYZ or X'Y'Z',
with $1 \le n \le N$. Thus the partial forces transform according to the same law as the
total forces. However, it should be recognized that, whereas (2.77) contains partial
forces, it does not contain partial velocities. The terms $v_x$, $v_y$, and $v_z$ occurring in (2.77)
refer to the total instantaneous motion of the mass m, resulting from the action of all
the forces.
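
The linearity argument can be illustrated with a short numerical sketch. Two hypothetical partial forces are transformed separately with (2.77), using the total velocity of the body in both cases, and their sum is compared with the transform of the total force by (2.76). The names and values are assumptions made for illustration, with $\kappa = (1 - u^2/c^2)^{-1/2}$.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def transform_force(F, v, u):
    """Force transformation (2.76)/(2.77); v is the total velocity of the body."""
    Fx, Fy, Fz = F
    vx, vy, vz = v
    d = 1.0 - u * vx / C**2
    k = 1.0 / math.sqrt(1.0 - (u / C) ** 2)  # assumed definition of kappa
    Fx_prime = Fx - (u * vy / C**2) / d * Fy - (u * vz / C**2) / d * Fz
    return (Fx_prime, Fy / (k * d), Fz / (k * d))


# Hypothetical partial forces acting on one body with total velocity v.
F1 = (1.0, -2.0, 0.5)
F2 = (0.3, 4.0, -1.5)
v = (0.2 * C, 0.6 * C, -0.3 * C)
u = 0.5 * C

F_total = tuple(a + b for a, b in zip(F1, F2))

# Transform the total force, then the partial forces one at a time.
total_transformed = transform_force(F_total, v, u)
sum_of_partials = tuple(
    a + b for a, b in zip(transform_force(F1, v, u), transform_force(F2, v, u))
)

print(total_transformed)
print(sum_of_partials)
```

Note that the same velocity v is passed in every call, which mirrors the remark that (2.77) contains partial forces but not partial velocities.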
This important transformation law will be central to the development of the field
of a moving charge, a topic to be considered in Chapter 4. The results there obtained
will serve as additional evidence for the validity of this reconstitution of the laws of
mechanics in keeping with the Lorentz transformation.

REFERENCES
1. Bergmann, P. G., Introduction to the Theory of Relativity, Prentice-Hall, Inc., New York,
1947.
2. Dingle, H., The Special Theory of Relativity, 3rd ed., a Methuen Monograph, John Wiley
and Sons, Inc., New York, 1950.
3. Einstein, A., H. A. Lorentz, H. Minkowski, and H. Weyl, The Principle of Relativity, a
collection of original papers, Dover Publications, Inc., New York.
4. Leighton, R. B., Principles of Modern Physics, McGraw-Hill Book Company, New York,
1959.
5. Møller, C., The Theory of Relativity, Oxford at the Clarendon Press, London, 1952.
6. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-
Wesley, Inc., Boston, 1955.
7. Richtmyer, F. K., E. H. Kennard, and T. Lauritsen, Introduction to Modern Physics,
5th ed., McGraw-Hill Book Company, New York, 1955.
8. Sherwin, C. W., Basic Concepts of Physics, Holt, Rinehart and Winston, New York, 1961.
9. Whittaker, E., A History of the Theories of Aether and Electricity, Vols. 1 and 2, Thos.
Nelson and Sons, Ltd., London, 1953.
10. Whittaker, E., From Euclid to Eddington, Dover Publications, Inc., New York, 1958.

PROBLEMS
2.1 Assume that two plane waves of light are propagating almost parallel to the Y axis, such
that they are given by

$$\psi_1 = K\cos(2\pi\nu t - k_x x - k_y y)$$

$$\psi_2 = K\cos(2\pi\nu t + k_x x - k_y y + \alpha)$$

in which K is the constant amplitude of both waves and $\alpha$ is their relative phase at the
origin. Show that in any transverse plane y = constant these waves interfere so as to
give alternate regions of light and dark. What is the spacing of these interference fringes?
How does the position of these fringes depend on $\alpha$? (Note that this effect is used in the
Michelson interferometer.)
2.2 In the Michelson interferometer how does fringe shift depend on the rotational position
of the apparatus if the two arms are not equal? (Cf. Appendix A.)
2.3 In Section 2.10 of the text, the Lorentz transformation equations were derived using the
pulse of light which occurred at the coincidence of ends of the two rulers. Show that the
Lorentz equations can also be derived by requiring that O and O' obtain symmetrical
results, thus giving an analytic parallel to the literal arguments of Section 2.9.
2.4 Show that two Lorentz transformations carried out one after the other are equivalent to
one Lorentz transformation for which the relative velocity is

$$u = \frac{u_1 + u_2}{1 + (u_1 u_2/c^2)}$$

with $u_1$ and $u_2$ the relative velocities of the two transformations. Thus show that it is
impossible to combine a sequence of Lorentz transformations into one yielding a relative
velocity greater than c.
2.5 In Section 2.9 of the text, a literal argument was used to show that observers O and O'
each concluded that the other ruler had shrunk when relative motion occurred. Use the
Lorentz transformation equations to demonstrate that the events A opposite B' and A'
opposite B occur in reverse time sequence for the two observers.
2.6 The result (2.41) was obtained when observer O found the positions of the two ends of
the ruler R' at a common time t. Show that the same result is obtained if O determines
how long it takes for R' to pass a fixed point in XYZ and then multiplies this time interval
by the speed u.
2.7 Show that the time dilatation effect may also be obtained by determining the distance
in XYZ between two events which occur at the same point in X'Y'Z' and dividing this
distance by u to get the time interval in XYZ.
2.8 A jet passenger airplane 150 ft long is cruising at a ground speed of 600 mph. By how much
does the plane appear shortened to a ground observer? How long would the pilot need to
fly at this speed before his clock appeared to a ground observer to have lost 1 sec?
2.9 Use the time dilatation formula to check the results of Example 2.2.
2.10 A space vehicle whose rest length is 100 m is traveling away from the earth at a constant
velocity v = 0.8c. A pulse of light is sent from the earth toward the spacecraft. As the light
pulse passes the rear of the vehicle it triggers a clock. It then continues to the front of the
vehicle where it is reflected by a mirror and returns to the clock. What time interval does
the clock record between the two passages of the light pulse? What time interval would
earth clocks record between the same two events?
2.11 An electronic clock is shown in the figure, and consists of a flashtube F and a photoelectric
cell P shielded from each other by a baffle, plus a mirror M rigidly mounted a fixed dis-
tance D above the assembly. A circuit in the box B is arranged so that when P receives
a light pulse from F via M, it causes the flashtube to emit another pulse of light with
negligible delay. This clock thus "ticks" once every 2D/c sec when at rest. Now suppose
that this clock is moving at a constant velocity v relative to the laboratory frame and
determine its period. Does your answer depend on the direction of v?
2.12 A cosmic ray μ meson enters the earth's atmosphere vertically at a speed v = 0.98c. In
its own rest system the μ meson decays into an electron and 2 neutrinos with a mean
lifetime of $2.2 \times 10^{-6}$ sec. What is its mean life expectancy as determined by an earth
observer? How far will a shower of these μ mesons penetrate the earth's atmosphere before
half of them have decayed?
2.13 Suppose that the frequency of a ray of light is $\nu$, as determined by O who is stationary
in XYZ, and that this light ray is traveling at an angle $\theta$ with respect to the X axis. Show
that an observer O', stationary in X'Y'Z', will find that the frequency of the light ray is

$$\nu' = \frac{\nu\,[1 - (u/c)\cos\theta]}{(1 - u^2/c^2)^{1/2}}$$

This result is known as the relativistic Doppler formula. Note that the numerator is the
classical expression.
2.14 A distant galaxy is receding from Earth with a radial velocity component of 1,000 km/sec.
By how many angstrom units will the Hβ line (4,861 Å) be shifted? Is the shift toward the
red or toward the violet end of the spectrum?
2.15 A straight line fixed in the XZ plane makes an angle $\theta$ with the X axis. What angle does
this line appear to make with the X' axis to an observer O' stationary in X'Y'Z'?
2.16 A small particle of mass m is moving at a constant speed v in a straight-line path in the
XZ plane. This path makes an angle $\theta$ with the X axis. Find the velocity components of
this particle in the X'Y'Z' frame. What angle does the particle's path make with the X'
axis? Is this answer consistent with the result of the previous problem?
2.17 A particle of rest mass $M_0$, moving through XYZ at the constant velocity V, collides
inelastically with a second particle of rest mass mo. If the second particle were initially at
rest in XYZ, find the speed of the composite particle.
2.18 Explain the aberration of light from a distant star in terms of the Lorentz transformation.
2.19 To what speed must a particle of rest mass $m_0$ be accelerated in order to quadruple its
mass? What is its kinetic energy at this speed? How does this answer compare with
$\tfrac{1}{2} m_0 v^2$?
2.20 Find a formula connecting the momentum p and the kinetic energy T of a particle of rest
mass mo.
2.21 A Compton collision occurs when a photon strikes an electron and is thereby scattered.
Find the change in frequency of the photon as a function of the angle $\theta$ through which it is
deflected. What is the change in energy of the electron?
2.22 An excited atom, at rest in XYZ, drops to a quantum state whose energy level is lower
by $\Delta E$. A photon is emitted and the atom recoils. Therefore the frequency of the photon
will not be precisely $\nu = \Delta E/h$, but rather will be

$$\nu = \frac{\Delta E}{h}\left(1 - \frac{1}{2}\,\frac{\Delta E}{M_0 c^2}\right)$$

in which $M_0$ is the rest mass of the atom and h is Planck's constant. Show this result.
2.23 Consider the collision of a particle of initial energy E and rest energy Eo with a like
particle which is at rest. Show that the maximum energy available in the zero momentum
frame is $(2EE_0 + 2E_0^2)^{1/2}$.
2.24 Find the Lorentz transformation law for acceleration and express your answer in terms of
acceleration components which are perpendicular to and parallel to the velocity.
2.25 In an XYZ frame a particle is moving in the XY plane and has instantaneous velocity
components $v_x = v_y = \tfrac{1}{2}c$. At this same instant the two force components are equal. In
what Lorentzian frame will the force appear to be entirely Y directed at this instant?
What will be the magnitude of this force?
2.26 Show that the force defined by (2.71) is parallel to the acceleration only if the acceleration
is either parallel to or perpendicular to the velocity.
2.27 An electron and a positron can combine at rest, annihilating each other with the result
that two γ rays are emitted. Assuming that energy and momentum are conserved, calcu-
late the wavelength of the γ rays.
2.28 Consider a rocket ship in which mass can be converted to energy which provides a thrust.
Find the terminal velocity of this rocket ship relative to a frame in which it was initially
at rest, as a function of the percent of original mass converted.
CHAPTER 3
Electrostatics in Free Space
THIS CHAPTER and the two which follow will be concerned with the establishment of
an electrical theory in the absence of dielectric and permeable materials. Conductors
will be considered in a limited way, but only as supporting structures for the distribu-
tion or transport of charge; attention will be focused on the fields set up by these
charges and not on any interaction with their conducting environment. The conduc-
tors themselves will be treated as an electrically neutral background, consisting locally
of equal amounts of positive and negative charge in a vacuum. In this way a sim-
plified theory of electromagnetic fields caused by charges in free space can be devel-
oped. Subsequent chapters will then be concerned with the extension of this theory to
situations which include the effects of materials.
The present chapter begins with a formulation of the electric field due to a static
assemblage of charges and then proceeds to the introduction of the electrostatic poten-
tial. Electric flux density is defined, following which Gauss' law and its applications are
discussed, including the use of flux maps. The relationship between field and charge at
a conductor-vacuum interface is established and the method of images is then devel-
oped and applied to several cases. Poisson's and Laplace's equations are derived and
a variety of boundary-value problems considered. The concept of capacitance is defined
and generalized to a system of conductors, and the chapter closes with a discussion of
the energy stored in an electrostatic field.
At this point a dipole theory of the behavior of dielectric materials could have been
introduced, but it would perforce be limited to static stresses. For this reason it was
felt desirable to defer the discussion on dielectrics until the general time-varying case
could be considered. Similarly, the next chapter, which deals with magnetic fields due
to time-independent currents, could logically contain sections on d.c. conductivity and
static effects in magnetic materials: these topics too have been postponed so that time-
varying effects could be included for completeness.

3.1 * HISTORICAL SURVEY

* This section may be omitted without loss in continuity of the technical presentation.

Electrostatic theory is based on the single experimental postulate that electric charges
exert forces on each other which vary directly as the product of the strengths of the
charges and inversely as the square of their distance of separation. Thus if q and q' are
chosen to represent the strengths of two point charges, and r is the distance between
them, the force which one charge exerts on the other may be expressed in the form

$$\mathbf{f} \propto \frac{qq'}{r^2}\,\mathbf{1}_r \qquad (3.1)$$

in which $\mathbf{1}_r$ is a unit vector along their connecting line. The two electric charges can be
alike or opposite, causing the force to be repulsive or attractive; this feature is accom-
modated mathematically by permitting the symbols q and q' to have an intrinsic alge-
braic sign which is positive or negative.
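
The role of the algebraic signs in (3.1) can be shown with a minimal sketch. Since the text states only a proportionality, the constant k below is an assumption set to 1, and the convention that the unit vector points from q toward q' (so the result is the force on q') is likewise assumed for illustration.

```python
import math


def inverse_square_force(q, q_prime, r_vec, k=1.0):
    """Force on q' due to q per (3.1), up to the assumed proportionality constant k.

    r_vec is the vector from q to q' (assumed convention for the unit vector 1_r).
    """
    r = math.sqrt(sum(comp * comp for comp in r_vec))
    if r == 0.0:
        raise ValueError("the two point charges must be separated")
    unit = tuple(comp / r for comp in r_vec)
    scale = k * q * q_prime / r**2
    return tuple(scale * comp for comp in unit)


# Like charges: force on q' points away from q (repulsion).
like = inverse_square_force(1.0, 2.0, (0.0, 0.0, 2.0))      # (0.0, 0.0, 0.5)

# Unlike charges: force on q' points back toward q (attraction).
unlike = inverse_square_force(1.0, -2.0, (0.0, 0.0, 2.0))   # (0.0, 0.0, -0.5)

# Doubling the separation reduces the magnitude by a factor of four.
doubled = inverse_square_force(1.0, 2.0, (0.0, 0.0, 4.0))   # (0.0, 0.0, 0.125)

print(like, unlike, doubled)
```

A positive product qq' gives a force along the unit vector and a negative product reverses it, which is the sense in which the sign convention covers both repulsion and attraction.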
This inverse square law, as it is usually called, has a curious history of discovery and
rediscovery. As is true with respect to most major scientific principles, its establishment
cannot be wholly credited to the efforts of one man. Perhaps the first significant contri-
bution to the realization of this law was made by Benjamin Franklin (1706-1790).
Writing to Dr. John Lining of Charlestown, South Carolina, on March 18, 1755,
Franklin described an experiment he had performed in the following words:¹

. . . I electrified a silver pint cann, on an electric stand, and then lowered into it a cork-ball,
of about an inch diameter, hanging by a silk string, till the cork touched the bottom of the
cann. The cork was not attracted to the inside of the cann as it would have been to the out-
side, and though it touched the bottom, yet, when drawn out, it was not found to be elec-
trified by that touch, as it would have been touching the outside. The fact is singular. You
require the reason; I do not know it. Perhaps you may discover it, and then you will be so
good as to communicate it to me. I find a frank acknowledgment of one's ignorance is not
only the easiest way to get rid of a difficulty, but the likeliest way to obtain information,
and therefore I practice it: I think it an honest policy. Those who affect to be thought to
know every thing, and so undertake to explain every thing, often remain long ignorant of
many things that others could and would instruct them in, if they appeared less conceited.

Later, upon editing a collection of his letters for publication, Franklin added the
footnote

. . . Mr. F. has since thought, that, possibly the mutual repulsion of the inner opposite
sides of the electrised cann, may prevent the accumulating an electric atmosphere upon
them and occasion it to stand chiefly on the outside. But recommends it to the farther
examination of the curious.

Very little progress was made with this idea until Franklin described the above-
mentioned experiment to his good friend Joseph Priestley and asked Priestley to repeat
the investigation and verify his results. Priestley (1733-1804), better known as the dis-
coverer of oxygen, undertook experiments beginning in December 1766. He suspended
two pith balls from threads which were entirely inside an electrically charged cup. Like
Franklin, Priestley found² that the balls

. . . remained just where they were placed, without being in the least affected by the
electricity; but that, if a finger, or any conducting substance communicating with the earth,
touched them, or was even presented towards them, near the mouth of the cup, they
immediately separated, being attracted to the sides; as they also were in raising them up,
the moment that the threads appeared above the mouth of the cup.
¹ I. Bernard Cohen, ed., Benjamin Franklin's Experiments, pp. 331-338, Harvard University Press, 1941.
² J. Priestley, The History and Present State of Electricity with Original Experiments, p. 732, printed for
J. Dodsley, London, 1767.

Based on the results of this experiment, Priestley then made the observation
May we not infer from this experiment, that the attraction of electricity is subject to
the same laws with that of gravitation, and is therefore according to the square of the
distances; since it is easily demonstrated that were the earth in the form of a shell, a body
in the inside of it would not be attracted to one side more than another.

Despite the fact that Priestley was prompt to publish these experimental findings
and his inference of the inverse square relation, the scientific community of his day
failed to appreciate the significance. Indeed, Priestley himself apparently did not
regard this accomplishment as a sufficiently rigorous proof and did not champion his
deductions.
Two years later in 1769, Dr. John Robison (1739-1805), of Edinburgh, undertook
the task of determining the law of force between electric charges by direct experiment.
Little attention has been given to the historical priority of his discovery, since Robison
made scant attempt at the time to publicize his findings. This is unfortunate, because
he was an accomplished investigator of wide interests, whose discoveries could have
benefited the progress of science. His lectures and scientific researches were published
posthumously in Edinburgh in 1822 and are clearly and engagingly presented in an
extensive four-volume treatise entitled Mechanical Philosophy. Commencing on page 73
of the fourth volume of this treatise, Robison describes in detail an electrometer which
he constructed for the purpose of determining the force law between electrified particles.
Figure 3.1 is a reproduction of Robison's sketch of the electrometer, a device which
balances gravitational and electrical forces, thus giving a mechanical equivalent
of electrical attraction or repulsion. Robison's lengthy description of the apparatus
and its method of operation can be paraphrased by noting that A and B are metallic
balls which, in the course of the experiment, will be electrified. B is attached to an
insulating stalk and counterpoised by the ball D, with the stalk freely hinged at C.
A is insulated by the glass arm FEL to which is attached the hinge C. With the two
balls A and B uncharged, the apparatus is adjusted so that, when BD hangs vertically,
A and B barely touch. The shaft FI is then rotated until
. . . the line LA is horizontal, and so is CB; and the movable ball B is resting on A and is
carried by it. Now electrify the balls, and gently turn the handle I backwards . . . noticing
carefully the two balls. It will happen that, in some particular position of the index, they
will be observed to separate. Bring them together again, and again cause them to separate,
till the exact position at separation is ascertained. This will shew their repulsive force in
contact, or at the distance of their centres, equal to the sum of their radii. Having deter-
mined this point, turn the instrument still more toward the vertical position. The balls will
now separate more and more . . . this electrometer . . . will give absolute measures:
for . . . by laying some grains weight on the cork-ball D, till it becomes horizontal and
perfectly balanced, and computing for the proportional lengths of BC and DC, we know
exactly the number of grains with which the balls must repel each other (when the stalk is
in a horizontal position) in order merely to separate. Then a very simple computation will
tell us the grains of repulsion when they separate in any oblique position of the stalk; and
another computation, by the resolution of forces, will shew us the repulsion exerted between
them when AL is oblique, and BC makes any given angle with it.

FIGURE 3.1 Robison's apparatus.

After revealing his talent for careful instrumentation by instructing the reader in the
proper construction and care of the critical components of the electrometer, Robison
moved on to a discussion of the results he had obtained. Noting that he had made
many hundreds of measurements with different instruments, he concluded that
the mutual repulsion of two spheres, electrified positively or negatively, was very nearly in
the inverse proportion of the squares of the distance of their centres, or rather in a propor-
tion somewhat greater, approaching to $1/r^{2.06}$ . . .

By rotating the apparatus so that B was under A, Robison was able to make mea-
surements of the attractive force between unlike charges. The results were similar and
he concluded that the force law was probably the inverse square of distance for both
attraction and repulsion. He failed to recognize the importance of this result, perhaps
because of the subordinate position in which he tended to place experimental work
relative to mathematics.
Another definitive demonstration of the inverse square law was achieved by Henry
Cavendish (1731-1810) in 1773. His experiment had the same basic form as the
approach used earlier by Franklin and Priestley, although it is not clear that Cavendish
was aware of their efforts. He went far beyond their accomplishments, however, and
obtained a quantitative result for the law of force, including an estimate of the preci-
sion of his data.
The laboratory technique displayed by Cavendish in all his researches would earn
the admiration of any modern experimenter. In his earlier work with electricity, he had
developed the concept of "degree of electrification" (now called potential), and had
then convinced himself that when two charged conductors are connected by a wire
they redistribute charge in order to attain the same potential. He incorporated this
result into many experiments designed to compare the charge on two bodies which had
been brought to a common potential.
In one of these experiments, Cavendish showed that the charges on similar bodies at
the same potential are in the ratio of their linear dimensions. Using this knowledge, he
expressed the charge on any body in terms of the diameter of a sphere which, when at
the same potential, would have an equal charge. This, in modern language, is the con-
cept of capacitance, and when Cavendish spoke of the charge of a body as "globular
inches" or simply "inches of electricity" he meant that the capacitance of the body
in question was equal to that of a sphere whose diameter in inches was the value quoted.
Cavendish took as his standard a conducting spherical shell whose diameter was 12.1 in.
and he then ascertained, by a well-arranged series of measurements, the relative capaci-
tances of a great number of bodies of many shapes.
His electric force experiment had the intention³
. . . to find out whether, when a hollow globe is electrified, a smaller globe inclosed within
it and communicating with the outer one by some conducting substance is rendered at all
over or undercharged; and thereby to discover the law of the electric attraction and repulsion.

To this end, Cavendish constructed an apparatus consisting of a 12.1 in. diam inner
globe, mounted on a glass rod, and surrounded by two hemispheres of diameter 13.3 in.
He then
. . . made a communication between them by a piece of wire run through one of the hemi-
spheres and touching the inner globe, a piece of silk string being fastened to the end of the
wire, by which I could draw it out at pleasure.

Cavendish next charged the outer globe, withdrew the connecting wire, removed the
two hemispheres, and tested for charge on the inner globe by touching to it an electro-
meter consisting of two pith balls suspended by fine linen threads. However, he was not
satisfied with the first form of his apparatus, and went to an improved design, about
which he says

For the more convenient performing this operation, I made use of the following apparatus.
It is more complicated, indeed, than was necessary, but as the experiment was of great
importance to my purpose, I was willing to try it in the most accurate manner.
ABCDEF and AbcDef [Figure 3.2] are two frames of wood of the same size and shape,
supported by hinges at A and D in such manner that each frame is moveable on the horizontal
line AD as an axis. H is one of the hemispheres, fastened to the frame ABCD by the four
sticks of glass, Mm, Nn, Pp, and Rr, covered with sealing-wax. h is the other hemisphere
fastened in the same manner to the frame AbcD. G is the inner globe, suspended by the hori-
zontal stick of glass Ss, the frame of wood by which Ss and the hinges at A and D are sup-
ported being not represented in the figure to avoid confusion.
Tt is a stick of glass with a slip of tinfoil bound round it at x, the place where it is intended
to touch the globe, and the pith balls are suspended from the tinfoil.

³ J. Clerk Maxwell, ed., The Scientific Papers of the Honourable Henry Cavendish, revised by J. Larmor,
vol. 1, p. 118, et seq., Cambridge University Press, 1921.

FIGURE 3.2 The Cavendish apparatus. (a) Cavendish's original sketch. (b) Maxwell's drawing.

Cavendish describes how the inner globe and hemispheres were coated with tinfoil to
make them good conductors, and how the frame was adjusted so that the hemispheres
would fit accurately together and concentrically around the inner globe. He then goes
on to explain that

It was also so contrived, by means of different strings, that the same motion of the hand
which drew away the wire by which the hemispheres were electrified, immediately after
that was done, drew out the wire which made the communication between the hemispheres
and the inner globe, and immediately after that was drawn out, separated the hemispheres
from each other and approached the stick of glass Tt to the inner globe. It was also contrived
so that the electricity of the hemispheres and of the wire by which they were electrified was
discharged as soon as they were separated from each other, as otherwise their repulsion
might have made the pith balls to separate, though the inner globe was not at all overcharged.

Upon electrifying the outer shell and following the procedure just described, Caven-
dish brought his pith-ball electrometer into contact with the inner globe, and observed

The result was, that though the experiment was repeated several times, I could never
perceive the pith balls to separate or shew any signs of electricity.

These experiments were performed on December 18-24, 1772 and April 4, 1773. On
the later date Cavendish improved on the detectability of his electrometer by first pre-
charging the pith balls positively or negatively. In this situation, a small like charge on
the inner globe would have slightly altered the separation of the pith balls, whereas
Cavendish observed in both cases that, upon contact with the inner globe, the pith
balls collapsed toward each other, assuming a position in which they were barely
separated. This indicated that the greater capacitance of the inner globe was draining
most of the precharge off the pith balls, and thus that the charge which had been on
the inner globe was much less than the charge with which he had pre-electrified the
electrometer.
Cavendish next turned his attention to the question of the accuracy of his measure-
ments. At issue was the minimum charge on the inner globe which his pith-ball electro-
meter could detect. To make an estimate of this minimum detectable charge, Caven-
dish totally discharged the condenser which had been used to charge the outer sphere
in the electric force experiment. He then recharged this condenser with 1/60th of its
original charge, being sure of this value through his use of a set of calibrated condensers.
Upon connecting the recharged condenser to the inner globe (with the hemispheres
removed), he was certain that the charge transferred to the inner globe was less than
1/60th of the charge which had been transferred to the outer sphere in the original elec-
tric force experiment. Now, upon bringing his electrometer in contact with the inner
globe, Cavendish found a sensible effect on the separation of the pith balls. Thus he
was led to the conclusion

It appears, therefore, that if a globe 12.1 inches in diameter is inclosed within a hollow globe
13.3 inches in diameter, and communicates with it by some conducting substance, and the
whole is positively electrified, the quantity of redundant fluid lodged in the inner globe is
certainly less than 1/60th of that lodged in the outer globe, and that there is no reason to
think from any circumstance of the experiment that the inner globe is at all overcharged.

Cavendish then proceeded to argue that the law of electric attraction and repulsion
must be inversely as the square of the distance. But he was not yet satisfied. He wanted

. . . to form some estimate how much the law of the electric attraction and repulsion may
differ from that of the inverse duplicate ratio of the distances without its having been
perceived in this experiment . . .

Reasoning as had Newton in the case of gravitational attraction, Cavendish assumed
that the electric charge would spread uniformly over a sphere, and that each element
of charge would exert forces on all other elements according to the same law, with the
principle of superposition applying. He then assumed that the force law had an inverse
distance dependency of the 2 ± 1/50th power and computed the amount of charge which
would have to reside on the inner globe so that the net force on a charge located at the
midpoint of the wire connecting the two globes was zero. This amount of charge turned
out to be 1/57th of the charge on the outer sphere. Since 1/57th is larger than the detectable
1/60th amount, Cavendish concluded that

. . . the electric attraction and repulsion must be inversely as some power of the distance
between that of the 2 + 1/50th and that of the 2 - 1/50th, and there is no reason to think
that it differs at all from the inverse duplicate ratio.

Cavendish also investigated the manner in which the electric force law is dependent
on the amount of charge. Once again, he ingeniously contrived a quasi-null experiment,
the apparatus for which is depicted in Figure 3.3. In Cavendish's own words:⁴

CD is a wooden rod 43 inches long, covered with tinfoil and supported horizontally by
non-conductors. At the end C is suspended, as in the figure, the electrometer described⁵ in
Article 249, and at the other end D is suspended a similar electrometer, only the straws
reached to the bottom of the cork balls A and B, but not beyond them, and were left open so
as to put in pieces of wire, and thereby increase their weight and the force with which they
endeavoured to close.

The two Leyden jars E and F were approximately equal in capacitance and each
exceeded the capacitance of the bar and electrometers together by a factor of about

4 Ibid., pp 189-193.

5 This electrometer consisted of two wheaten straws, suspended by pin bearings from a brass block,
and terminated by gilted cork balls. Ibid., 131.

one hundred. The outer coatings of both jars were grounded and the inner coating of E
was permanently attached to the bar CD. With the wire weights in the straws A and B,
the system consisting of the jar E, the bar CD, and the two electrometers was charged
until A and B were separated by a measurable amount. The jar F was then connected
to the system, essentially halving the charge on the electrometers. The electrometer
at C was then observed to have a separation almost equal to the separation which

FIGURE 3.3 Cavendish method for determining relation between force and charge intensity.

A and B had previously experienced with double the charge. Since Cavendish had
determined that the two electrometers were almost identical and since he had chosen
the wire weights so that they would quadruple the force tending to close AB (the
actual factor was 3.9), he was able to conclude that the electric force was directly
proportional to the amount of each charge.
The results of these highly original and definitive experiments were unknown to the
scientific community for almost a century for, like Robison, Cavendish chose not to
publicize his findings. By the time a general awareness had developed that each of
these men had established the inverse square law, the credit and fame had been be-
stowed properly on someone else.
That someone else was Charles Augustin de Coulomb (1736-1806) who, in 1785, also
demonstrated the law of electric force, using a technique totally different from those
employed by any of his predecessors. Coulomb's procedure involved the use of a torsion
balance which he had invented. With it, he measured the repulsive force between two
like charges, balancing this force by the torsion in a wire from which a bar containing
one of the charges was suspended. His celebrated First Memoir on Electricity and Magnetism
contains a preliminary section in which the torsion balance is clearly described
as follows:⁶
On a glass cylinder ABCD [Figure 3.4, sub-Figure 1] ... we place a glass plate . . .
which completely covers the cylinder; this plate is pierced by two holes . . . one at the
center f, upon which is erected a glass tube; this tube is bonded over the hole f: at the upper
end h of this tube, a torsion micrometer is placed which we see in detail in sub-Figure 2.
6 C. A. de Coulomb, "Premiere Memoire sur l'Electricite et Magnetisme," Histoire de l'Academie
Royale des Sciences, 569; 1785. For an English translation of excerpts, see W. F. Magie, A Source Book
in Physics, McGraw-Hill Book Company, New York, 1935.

FIGURE 3.4 Coulomb's apparatus.

The upper part, No.1, carries the milled head b, the index io, and the chuck q; this piece fits
into the hole G of piece No.2; piece No.2 consists of a circle ab divided along its girth into
360 degrees, and a copper tube Φ which fits into the tube H, No.3, which in turn is sealed to
the interior of the upper end of the glass tube fh of sub-Figure 1. The chuck q is shaped much
like the end of a solid pencil holder, and is closed by means of the ring q. In this chuck is
clamped the end of a very fine silver wire: the other end of the silver wire [sub-Figure 3] is
held at P in a clamp made of a cylinder Po of copper or iron . . . whose upper end P is split
so as to form a clamp which is closed by means of the sliding piece Φ. This small cylinder is
enlarged and pierced at C in order to permit the needle ag to pass through; it is necessary
that the weight of the small cylinder be sufficient to stretch the silver wire without breaking
it. The needle, ag, is seen [sub-Figure 1] to be suspended horizontally, and about half-way
up in the large cylinder which encloses it, and is formed either of a silk thread or straw soaked
in Spanish wax and finished off from q to a for eighteen lines† of its length by a cylindrical
rod of shellac; at the extremity a of this needle is found a small pith ball two or three lines
in diameter; at g there is a little vertical piece of paper soaked in turpentine, which serves
as a counterbalance to the ball a and retards oscillations.
We have said that the cover AC was pierced by a second hole at m; into this second hole
is introduced a slender rod mΦt, whose lower portion Φt is made of shellac; at t is another pith
ball; around the perimeter of the glass cylinder ABCD, at the height of the needle, is
described a circle zQ divided into 360 degrees; for greater simplicity I use a strip of paper
divided into 360 degrees which is pasted around the cylinder at the height of the needle.
Coulomb then goes on to explain how the instrument is prepared for the experiment
by securing the pith ball t in place and adjusting the micrometer head so that the two
pith balls are just touching. His description of the actual experiment follows:
We electrify a small conductor [Figure 3.4, sub-Figure 4] which is simply a large-headed
pin insulated by plunging its point into the end of a rod of Spanish wax; this pin is introduced
through the hole m and permitted to touch the ball t, which is in contact with the ball a;
upon withdrawing the pin, the two balls are left electrified with the same kind of electricity
and they repel each other to a distance which is measured by looking beyond the suspension
wire and the center of the ball a to the corresponding division of the circle zOQ; then by
rotating the index of the micrometer in the direction pno, we twist the suspension wire lP
and exert a force proportional to the angle of torsion, which tends to bring the ball a closer
to the ball t. In this way one can observe the distance through which different angles of
torsion bring the ball a toward the ball t; comparison of the forces of torsion with the
corresponding distances of the two balls determines the law of repulsion. I shall here only
present some trials which are easy to repeat and which will at once make evident this law
of repulsion.

Coulomb then indicated an initial separation of the two balls of 36 deg, causing an
initial torsional twist of 36 deg in the suspending wire. He next turned the micrometer
head through 126 deg, causing the balls to reduce their separation to 18 deg. Finally,
by turning the micrometer head through 567 deg, he observed that the separation had
been reduced to 8½ deg. Since the force of torsion is proportional to the angle of twist,
these data can be tabulated as shown in Table 3.1.
† Before adoption of the metric system in France, one line equalled 1/12 in.

TABLE 3.1
COULOMB'S EXPERIMENTAL DATA FOR THE LAW OF REPULSION

Angular separation of the    Angular measure of the
two pith balls, deg          force of torsion, deg

36                           36
18                           144
8½                           575½

The values of wire twist in the second column are composed of the rotation of the
micrometer head and the angular displacement of ball a. The distance of separation of
the two pith balls is proportional to the sine of one-half their angle of separation, but
since all these angles are small, the distance of separation is essentially proportional, in
this experiment, to the angle of separation. One notes from the first column of the table
that the angles of separation are almost in the ratio 4:2:1. The second column of the
table lists quantities proportional to the restoring force, and these figures are essentially
in the ratio 1:4:16. In analyzing the data, Coulomb was led to the conclusion

It results then from these three trials that the repulsive action which the two balls exert
on each other when they are electrified similarly is in the inverse ratio of the square of the
distance.

It is interesting to note the great change which has taken place in the method of
reporting in scientific journals since Coulomb's time. Whereas Coulomb went into
great detail in describing his apparatus, the experimental procedure, and possible
sources of error, when it came to reporting data he listed only three points, one of
which deviated from the inverse square law by 6 percent. His statement in the First
Memoir just prior to introduction of the data clearly suggests that other trials had
been undertaken and one can only surmise that Coulomb felt a small sample of his
data would be sufficiently convincing.
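
The three trials of Table 3.1 can be checked with a few lines of arithmetic. The sketch below is only an illustration, not part of Coulomb's analysis: it assumes the small-angle approximation described above (distance proportional to angular separation, force proportional to total torsion) and scales an exact inverse square law from the first trial. The names `trials`, `predicted_separation`, and `results` are hypothetical.

```python
import math

# Table 3.1: (angular separation of the pith balls, total torsion of the wire), in degrees.
trials = [(36.0, 36.0), (18.0, 144.0), (8.5, 575.5)]

# Reference values taken from the first trial.
sep0, torsion0 = trials[0]


def predicted_separation(torsion):
    """Separation expected from an exact inverse square law, scaled from trial 1.

    Assumes force is proportional to torsion and distance to angular separation,
    so that torsion * separation**2 stays constant.
    """
    return sep0 * math.sqrt(torsion0 / torsion)


results = []
for sep, torsion in trials:
    pred = predicted_separation(torsion)
    deviation = (pred - sep) / pred  # fractional shortfall of the observed separation
    results.append((sep, torsion, pred, deviation))
    print(f"torsion {torsion:6.1f} deg: observed {sep:5.1f} deg, "
          f"inverse-square prediction {pred:5.2f} deg, "
          f"deviation {100 * deviation:4.1f}%")
```

Under these assumptions the second trial agrees with the inverse square law exactly, while the third falls short of the predicted separation by a little under 6 percent, which is the deviation mentioned above.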
Upon turning his attention to an investigation of the law of electric force for the
attraction between oppositely charged bodies, Coulomb encountered a new difficulty:⁷

I wished to use the same method to determine the attractive force between two pith balls
charged with opposite natures of electricity, but by using this same balance to measure the
attractive force, I found an experimental difficulty which did not occur during the measurement
of repulsive force. The experimental difficulty arises when the two balls are drawn near
to each other. The attractive force . . . frequently increases at a greater rate than the
torsional force, which increases only directly as the angle of twist; as a consequence, if several
readings are desired, the balls must be prevented from touching each other by means of an
insulating stop in the path of the needle. Since the balance is often required to measure forces
of less than one thousandth of a grain, the collision of the needle with the insulating stop
influences the results and causes part of the electric charge to be lost.

Coulomb displayed his ingenuity in circumventing this difficulty by devising an
experiment in which he related the period of a pendulum to the spatial dependence of
the force of electric attraction. He suspended a horizontal needlelike insulator from a
thin silk thread (Figure 3.5) and attached to it a tinsel disc at the end l and a counterbalance
at g. Nearby was placed a globe G, and in Coulomb's words

7 C. A. de Coulomb, "Seconde Memoire sur l'Electricite et Magnetisme," Histoire de l'Academie
Royale des Sciences, 579; 1785.

. . . we adjust the globe G in such a way that its horizontal diameter Gr is opposite the
center of the tinsel disc l, which is some inches away from it. We give an electric spark to
the globe from a Leyden jar [condenser]; we then ground the disc l with a conductor, and
the action of the electrified globe on the electric fluid of the unelectrified tinsel disc gives to
the disc a charge of the opposite type from that of the globe; so that when the ground is
removed, the globe and disc act on each other by attraction.

FIGURE 3.5 Coulomb's apparatus for unlike charges.

Designating by d the distance from the needle's center c to the center of the globe G,
Coulomb varied d and, after setting the needle into oscillation, recorded the time it
took for the needle to perform a specified number of oscillations. He listed three trials,
the recorded data for which is reproduced in Table 3.2.

TABLE 3.2
COULOMB'S EXPERIMENTAL DATA FOR THE LAW OF ATTRACTION

d, in.    No. of oscillations    Elapsed time, sec

9         15                     20
18        15                     41
24        15                     60

FIGURE 3.6 Composition of forces in Coulomb experiment.

To analyze the data, it is necessary to determine the relationship between oscillation
time and disposition of the parts of the apparatus. Figure 3.6 shows the needle in an
arbitrary angular position and indicates the attractive electric force $F_e$ resolved into
tangential and radial components. If r is the distance from the tinsel disc l to the
center of the needle c, then

$$ I \frac{d^2\theta}{dt^2} = -F_e \, r \cos\varphi $$

in which I is the moment of inertia of the needle about the axis containing its suspending
thread. Under the assumption that the law of attraction is the same as the
law of repulsion, $F_e = K(d')^{-2}$, in which K is a constant and d' is the distance from l
to the center of the globe G. Then

$$ \frac{d^2\theta}{dt^2} = -\frac{K r \cos\varphi}{I \, (d')^2} $$

Since d' was considerably greater than r in Coulomb's experiment, d' can be replaced
by d in the above equation. Further, since φ = 90° - θ - α, if 90° - θ is very much
greater than α (small oscillations) then φ ≈ 90° - θ, and cos φ ≈ sin θ ≈ θ. Making
these substitutions gives

$$ \frac{d^2\theta}{dt^2} + \frac{K r}{I d^2} \, \theta = 0 $$

which is the equation for simple harmonic motion. The period T is therefore

$$ T = 2\pi d \sqrt{\frac{I}{K r}} $$

Thus for small oscillations, if the law of attraction is the inverse square, the period
should be proportional to the separation distance d. This analysis would predict periods
in the ratio 20: 40: 54 whereas Table 3.2 indicates that the measured periods were in
the ratio 20: 41 : 60.
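
The comparison with Table 3.2 can be reproduced directly. The following sketch is illustrative only: it assumes the elapsed time for 15 oscillations is strictly proportional to d and scales every prediction from the first trial. The variable names are hypothetical. Plain proportionality gives roughly 53.3 sec for the third trial, which the text quotes as 54.

```python
# Table 3.2: distance d (inches) and elapsed time (seconds) for 15 oscillations.
distances = [9.0, 18.0, 24.0]
measured = [20.0, 41.0, 60.0]

# If the attraction follows the inverse square law, the period (and hence the
# elapsed time for a fixed number of oscillations) is proportional to d.
# Scale the predictions from the first trial.
predicted = [measured[0] * d / distances[0] for d in distances]

# Fractional excess of each measured time over its prediction (uncorrected data).
excess = [(m - p) / p for m, p in zip(measured, predicted)]

for d, m, p, e in zip(distances, measured, predicted, excess):
    print(f"d = {d:4.1f} in.: measured {m:4.1f} s, predicted {p:5.2f} s, "
          f"excess {100 * e:4.1f}%")
```

These figures are for the raw data, before any allowance for leakage of charge during the experiment.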
Coulomb made some measurements of the rate of dissipation of electric charge and
then corrected his data (the entire experiment took 4 minutes and he found that 1/40th
of the charge was dissipated per minute). After correction, the lack of agreement was
negligible for the second trial and only 5 percent for the third trial, which led him to
conclude:
We have thus come, by a method absolutely different from the first, to a similar result;
we may therefore conclude that the mutual attraction of the electric fluid which is called
positive on the electric fluid which is customarily called negative is in the inverse ratio of the
square of the distances; just as we have found in the first memoir, that the mutual action of the electric
fluid of the same type is in the inverse ratio of the square of the distance.

Coulomb also investigated the manner in which the amount of electric charge affected
the electric force. To do this, he replaced the stationary pith ball t (Figure 3.4) by a
small iron circle and proceeded in the following manner:⁸
He electrified these two bodies [the pith ball a and the small iron circle] simultaneously by
means of the head of a pin, and the repulsive force separated the needle from the iron circle;
when it was brought back and placed at a distance of 30 degrees the index pointed to 110
degrees; the repulsive force therefore was [proportional to] 140 degrees. He then touched the
little iron circle with another of the same substance and same diameter; the needle im-
mediately approached the circle, and to bring it back to the distance of 30 degrees, it was
found necessary to untwist the wire till the index stood at 40 degrees; therefore the repulsive
force was reduced to 40 + 30 or 70 degrees, the half of 140 degrees, the measure of its
former intensity.

Arguing that when the charged iron circle was touched by a similar uncharged circle,
its charge was reduced to half, Coulomb then concluded that the electric force is linearly
proportional to the charge on each body.
These discoveries by Coulomb formed the first quantitative basis for a mathematical
statement of the law of electric force. Although his method lacked the degree of accu-
racy of the approach used by Cavendish, it was direct, it was quantitative, and it was
easy to comprehend. The scientific world readily accepted Coulomb's results, the first
of a substantive nature to be published and widely distributed.
This acceptance was greatly furthered by the theoretical contributions of Simeon
Denis Poisson (1781-1840) who, in two brilliant memoirs⁹ presented in the years 1812
and 1813, lifted electrostatic theory to a mature state of development. He accomplished
this by accepting Coulomb's inverse square law as a fundamental postulate and making
rich use of the analogy to gravitational theory, a subject already highly advanced at
that time.
In an article in the Memoires de Berlin in 1777, Lagrange had shown that if a function
ψ(x,y,z) be formed by adding together the masses of all the particles of an attracting
system, each divided by its distance from (x,y,z), then the derivatives of this function
were equal to the components of the attractive force at (x,y,z). Laplace later demonstrated¹⁰
that this function ψ satisfies the equation

$$ \frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2} = 0 $$

at all points not occupied by masses.
8 J. Farrar, Elements of Electricity, Magnetism, and Electromagnetism, Hilliard and Metcalf, Boston,
1826. (Notes selected from Biot's Precis Elementaire de Physique, compiled for the use of students of
the University at Cambridge, New England.)
9 S. D. Poisson, "On the Distribution of Electricity at the Surface of Conducting Bodies." First
Memoir read to the French Academy on May 19 and August 3, 1812. Printed in Mem. de l'Institut,
part 1, 1-92; 1811. Second Memoir read on September 6, 1813. Printed in Mem. de l'Institut, part 2,
164-274; 1811.
10 P. S. Laplace, "Theory of Attractions of Spheroids," Mem. de l'Academie Royale, 113-196; 1782
(published in 1785).

In laying the groundwork for a similar formulation involving electric charge, Poisson
opened his First Memoir by remarking

The theory of electricity which is most generally admitted is that which attributes all the
phenomena to two different fluids, distributed within all bodies of nature. It is supposed
that the molecules of one fluid repel each other and that they attract the molecules of the
other; these forces of attraction and repulsion obey the inverse square law of distance; at
the same distance the attractive power is equal to the repulsive power; from which it follows
that when all the parts of a body contain equal amounts of the two fluids, the latter do not
exercise any influence on the fluids contained in neighboring bodies, and as a consequence
no signs of electricity are manifest. This equal and uniform distribution of the two fluids is
called the natural state; when this state is disturbed for any reason, the body becomes elec-
trified, and the various phenomena of electricity begin to take place.
All the bodies of nature do not behave the same way with respect to the electric fluid:
some, such as the metals, do not appear to exert any influence on it, but permit it to move
about freely in their interior: for this reason they are called conductors. Others, on the contrary,
very dry air for example, oppose the passage of the electric fluid in their interior; in
this way they serve to prevent dissipation throughout space of the fluid accumulated in
conducting bodies. The phenomena associated with electrified conductors, whether these
conductors be considered singly, or whether they be considered in conjunction and exert-
ing a mutual influence, are the objectives of this Memoir, in which I propose to apply
the calculus to this important branch of physics. But before entering into these matters,
I wish to state in some detail the principles which serve as the basis for my analysis, and
to make known the most remarkable results to which they have led me.

Poisson's central principle, of course, was the assumption of Coulomb's inverse square
law, on the basis of which he introduced a function† Φ(x,y,z), composed of the sum of
the charges of an electrical system, each divided by its distance from (x,y,z). He then
argued, as had Lagrange in the case of gravitational attraction, that the derivatives

$$ \frac{\partial \Phi}{\partial x} \qquad \frac{\partial \Phi}{\partial y} \qquad \frac{\partial \Phi}{\partial z} $$

would yield the components of electric force‡ at (x,y,z).
Turning his attention to conducting bodies, Poisson assumed that an excess of one
electric fluid had been placed on a conductor, and reasoned:

By virtue of the repulsive force between these [excess] particles, and because the metal does
not oppose their movement, one can imagine that the added fluid is transported to the
surface of the body, where it will remain because of the air environment. Coulomb has
proved in effect, by direct experiments, that no atoms of electricity reside in the interior of
an electrified conductor except the natural electricity of the body: all the added fluid dis-
tributes itself over the surface . . . it exerts neither attraction nor repulsion at any interior
point of the body; for if this condition were not satisfied, the action of the surface layer of
electricity on interior points would decompose a new quantity of the natural electricity of
the body, and its electric state would be changed.

† Fifteen years later, in generalizing Poisson's work on electric and magnetic phenomena, George
Green (1793-1841) gave to this function the name potential, and this appellation has been universally
adopted ever since.
‡ Poisson's original notation has been altered to be consistent with the remainder of this chapter.

As a consequence of this argument, Poisson adopted the principle that he could find
the manner in which the excess charge distributed itself over the outer surface of a
conductor, by imposing the condition that this distribution must lead to no net electric
force at any interior point of the conductor. In terms of the potential function Φ,
this meant that if P were a point in the interior of an electrified conductor, then

. . . the value of Φ is independent of the coordinates of the point P; because then the
partial derivatives of this function being null, the force at the interior point P will be also.

Thus was the concept formulated that a conducting body in electrostatic equilibrium
is an equipotential.
Poisson next turned his attention to conditions at the surface of an electrified con-
ductor and argued, following a suggestion by Laplace, that the electric force at a point
immediately outside the conductor is proportional to the local concentration of surface
charge density. He did this by dividing the force into a part f due to the element of
charged surface immediately adjacent to the point, and a part F due to the rest of the
surface. At a neighboring point just inside the conductor, F will be unchanged but f will
have to be reversed to give a null force. Therefore the resultant force at the exterior
point must be 2f. But if the exterior point is extremely close to the surface, the imme-
diately adjacent surface element looks like an infinite plane, uniformly charged, for
which case Poisson showed the force f to be proportional to the charge per unit area of
the surface, thus completing the theorem.
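
The steps of this argument can be written compactly. As an assumption made only for illustration, let σ denote the charge per unit area, and take the units implied by defining the potential as charge divided by distance, in which an infinite uniformly charged plane exerts a force 2πσ on a unit charge on either side:

$$
\begin{aligned}
\text{just inside:}\quad & F - f = 0 \;\Rightarrow\; F = f, \\
\text{just outside:}\quad & F + f = 2f = 2\,(2\pi\sigma) = 4\pi\sigma .
\end{aligned}
$$

In other systems of units only the constant of proportionality changes.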
Using the principle that a charged conductor must be an equipotential, Poisson
deduced the surface distribution for several simple shapes, including an ellipsoid, and
then enlarged his analysis to the study of two charged spheres placed at any distance
from each other. This was a classic and difficult problem to which he devoted over
three-quarters of the space occupied by these two lengthy memoirs. The solution
involves single or double gamma functions, depending on whether or not the two
spheres are in contact. Poisson laboriously computed the values of his integrals for a
variety of conditions and exhibited very satisfactory agreement with the earlier
experimental results of Coulomb.
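The year 1813 recorded another significant contribution by Poisson when, in a brief
note,¹¹ he extended Laplace's equation to include points occupied by matter, obtaining

$$ \frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2} = -4\pi\rho $$

in which ρ is the volume density of mass. The same connection exists, of course,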
between electric potential and charge density. Poisson's proof of the validity of this
important differential equation, which bears his name, has a simple elegance which will
fully reward a decision to consult the original paper. An alternative derivation will be
offered in section 3.9.
The admiration invoked by recounting these achievements of Poisson perhaps has
been expressed best by Whittaker:¹²
11 S. D. Poisson, "Remarks on an Equation Which Occurs in the Theory of Attractions of Spheroids,"
Bull. de la Soc. Philomathique, 3, 388-392; 1813.
12 E. Whittaker, A History of the Theories of Aether and Electricity, vol. 1, p. 62, Thomas Nelson and
Sons, Ltd., 1951.

The rapidity with which Poisson passed from the barest elements of the subject to
such recondite problems as those just mentioned may well excite admiration. His success is,
no doubt, partly explained by the high state of development to which analysis had been
advanced by the great mathematicians of the eighteenth century; but even after allowance
has been made for what is due to his predecessors, Poisson's investigation must be accounted
a splendid memorial of his genius.

Poisson's differential equation, linking spatial derivatives of the electrostatic potential
to charge distribution, found its integral counterpart through a discovery by Karl
Friedrich Gauss (1777-1855). In 1813 Gauss established¹³ the famous divergence
theorem

$$ \oint_S \mathbf{D} \cdot d\mathbf{S} = \int_V \nabla \cdot \mathbf{D} \, dV $$

connecting a volume integral throughout V to a surface integral over S, with S being
the closed surface bounding the volume V, and D being any vector function possessing
continuous first derivatives in a region containing V. If D is a radial field which varies
inversely with the square of the distance from some point O, that is, if $\mathbf{D} = \mathbf{1}_r / r^2$, then the surface
integral of Gauss' divergence theorem yields the simple result

$$ \oint_S \mathbf{D} \cdot d\mathbf{S} = 4\pi $$

if O is inside V; otherwise the result is zero. This special result is known as Gauss'
integral. When D is properly related to Coulomb's inverse square law, $\oint_S \mathbf{D} \cdot d\mathbf{S}$ equals
the net charge enclosed by S. This result, coupled with the divergence theorem, yields
an integral form of Poisson's equation. These deductions will be elaborated in the sections
to follow.
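
Gauss' integral is easy to verify numerically for a simple case. The sketch below is an illustration under stated assumptions, not a general proof: S is taken to be a sphere of radius R centered at the origin, and the point O is placed on the z axis at distance a from the center, so that the integrand depends only on the polar angle. The function name `gauss_integral` and its parameters are hypothetical.

```python
import math


def gauss_integral(R, a, n=20000):
    """Surface integral of D = 1_r / r^2 over a sphere of radius R.

    r is measured from the point O, which lies on the z axis at distance a
    from the center of the sphere. By axial symmetry the azimuthal integration
    contributes a factor 2*pi; the polar integration uses the midpoint rule.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        c = math.cos(theta)
        # Squared distance from O to the surface point at polar angle theta.
        dist_sq = R * R + a * a - 2.0 * a * R * c
        # Component of D along the outward normal of the sphere at that point.
        d_dot_n = (R - a * c) / dist_sq ** 1.5
        # Area element (per unit azimuth) is R^2 sin(theta) d(theta).
        total += d_dot_n * R * R * math.sin(theta) * h
    return 2.0 * math.pi * total


inside = gauss_integral(R=1.0, a=0.5)   # O inside the sphere
outside = gauss_integral(R=1.0, a=2.0)  # O outside the sphere
print(inside / math.pi, outside)
```

With O inside the sphere the result should come out close to 4π regardless of where O sits, and with O outside it should come out close to zero.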
Another great advance in electrostatic theory, though it was not so recognized at
the time, was made by Michael Faraday (1791-1867). His keen sense of physical visu-
alization led him to picture all force functions in terms of flux lines. This technique
first suggested itself to Faraday because of the common custom of illustrating mag-
netic power by strewing iron filings on a sheet of paper and noticing the curves along
which they arranged themselves when a magnet was placed underneath the paper.
From this Faraday evolved the idea of lines of magnetic force, whose direction at every
point coincided with the direction of the magnetic intensity.
It was a simple extension to apply this concept of flux lines to gravitational effects
and to electric intensity. About the latter, Faraday said¹⁴

The lines of force of the static condition of electricity are present in all cases of induction.
They terminate at the surfaces of the conductors under induction, or at the particles of non-
conductors, which, being electrified, are in that condition.

This conception permitted Faraday to replace action at a distance with a local inter-
action of charge and a field of force, a viewpoint which had great appeal for Maxwell

13 K. F. Gauss, "Theoria Attractionis Corporum Sphaeroidicorum Ellipticorum Homogeneorum,"
reprinted in his Werke, vol. 5, pp. 1-22, published by the Royal Society of Science, Gottingen, 1870.
14 M. Faraday, Experimental Researches in Electricity, vol. 3, art. 3249, published by Bernard Quaritch,
London, 1855.

(1831-1879). In the Preface to the first edition of his celebrated Treatise on Electricity
and Magnetism, Maxwell wrote
. . . before I began the study of electricity I resolved to read no mathematics on the subject
till I had first read through Faraday's Experimental Researches in Electricity. I was aware
that there was supposed to be a difference between Faraday's way of conceiving phenomena
and that of the mathematicians, so that neither he nor they were satisfied with each other's
language. I had also the conviction that this discrepancy did not arise from either party
being wrong.

Maxwell found, as he proceeded with the study of Experimental Researches, that it was
possible to couch Faraday's ideas in mathematical terms and thus compare them with
the formulations preferred by mathematicians. As part of the contrast, he noted
For instance, Faraday, in his mind's eye, saw lines of force traversing all space where the
mathematicians saw centres of force attracting at a distance: Faraday saw a medium where
they saw nothing but distance: Faraday sought the seat of the phenomena in real actions
going on in the medium, they were satisfied that they had found it in a power of action at a
distance impressed on the electric fluids.

Maxwell's skillful mathematical exposition of Faraday's ideas led him to conclude that
the results of the two methods coincided, but that Faraday's viewpoint was much
richer. Thus he adopted it and furthered it with many ideas of his own. It was Maxwell
who introduced the concept of the electric flux density function D (he called it the
displacement), a concept which becomes especially meaningful in the study of dielec-
trics. Using Green's Theorem, he obtained an expression for the energy stored in an
electrostatic system in the form

$$ W = \frac{\epsilon_0}{2} \int_V E^2(x,y,z) \, dx \, dy \, dz $$

which highlights the interpretation of the electric field E as the seat of the phenomena.
Maxwell also solved a variety of boundary-value problems, obtaining both the poten-
tial and electric field, and displaying these for the first time as precise field maps, to
illustrate Faraday's idea of lines of force. The plates appended to both volumes of his
Treatise include some of the most beautiful flux maps which have ever been prepared.
The field approach of Faraday and Maxwell, with strong emphasis on the local inter-
action of a charge and a field, will be found to have permeated the remainder of this
text.
Maxwell also contributed to the establishment of the inverse square law. His interest
in this problem had been aroused by his reading of the unpublished manuscripts of
Cavendish. These manuscripts had been brought to the attention of Lord Kelvin after
Cavendish's death. Recognizing their importance and desiring that they be published,
Kelvin urged the Duke of Devonshire, to whom the manuscripts belonged, to entrust
them to Maxwell. This he did in 1874.
The Cavendish experiment which particularly caught Maxwell's admiration was the
one concerned with the determination of the law of electric force, and he resolved to
repeat it. Accordingly, with Sir Donald McAlister, he devised an apparatus which
improved in several particulars on Cavendish's original design. The principal innovations
were the use as detector of a more sensitive quadrant electrometer and the adoption
of a technique which did not require the dismantling of the outer spherical shell.
Maxwell provided a thorough analysis of the accuracy of this method which, coupled
with McAlister's data, led him to conclude that the force law was bounded by $r^{-(2+\delta)}$
in which $|\delta| < 1/21{,}600$.† The McAlister experiment and Maxwell's analysis will be
considered in Section 3.20.
By far the most sensitive investigation of the electric force law which has ever been
undertaken was accomplished by S. J. Plimpton and W. E. Lawton at the Worcester
Polytechnic Institute in 1936. Together they skillfully applied all the advantages of
modern technology and electronic instrumentation to a repetition of the Cavendish
experiment and were able to show that the distance dependency in the electric force
law deviated from the inverse square by less than two parts in one billion. This remark-
able achievement stands as the most compelling reason for basing an electrical theory
on the inverse square law. The Plimpton-Lawton experiment and an analysis of the
accuracy of their results will be considered in Section 3.21.

3.2 MATHEMATICAL FORMULATION OF THE INVERSE SQUARE LAW


The preceding section of this chapter has shown how Coulomb's experiments led to the
formulation of the law of force for electric charges; namely

$$F \propto \frac{qq'}{r^2} \tag{3.1}$$

The increasing accuracy of the experiments performed by Cavendish, by Maxwell and
McAlister, and by Plimpton and Lawton has raised the confidence level in this law
almost to the point of certainty. Yet it seems appropriate to point out certain limitations
in all these experiments and to circumscribe the limits of validity of this force law.
First, none of these experiments was undertaken with the charges extremely close
to each other, nor excessively far apart. Thus, the question can be raised as to the
limits of r within which the law is valid. As yet, there is little direct evidence at very
large distances. However, if one accepts the premise that an entire electromagnetic
theory can be based on Coulomb's inverse square law, then the indirect evidence sup-
ports the belief that this law operates at astronomical distances. Concerning short
distances of separation, Rutherford's experiments, in which he bombarded atomic
nuclei with α particles, have shown that the law holds at distances as small as $10^{-14}$ m.
It may be valid at closer distances, but nuclear forces then come into play and partially
mask the effect.
Second, note that the law as stated presumes point charges and it is clear that this
is an approximation which can be good only when the extent of each charge is small
compared to r. For example, Coulomb's experiments involved, not point charges, but
rather charge distributions on balls of finite size. Induction effects caused these distributions
to be nonuniform. (In the case of repulsion, the remote sides of the two pith balls
attained a heavier charge density.) Thus it became somewhat uncertain what to use
for the true spacing r.
† Maxwell was apparently being conservative in using as the bound one part in 21,600, for in the
Introduction to The Scientific Papers of the Honourable Henry Cavendish he states "We can now use
Thomson's Quadrant electrometer, and thereby detect a deviation from the law of the inverse square
not exceeding one in 72,000."
118 Electrostatics in Free Space CHAPTER 3

In the case of the Cavendish method, the nonconcentration of charge at a single
point is even more evident, and another assumption entered heavily into the experiments.
It was assumed that the law of superposition of forces holds for electric charges
so that, if q' were replaced by a distribution of charges, one could write for the force on q

$$\mathbf{F} \propto q \sum_{n=1}^{N} \frac{q_n}{ŗ_n^2}\, \mathbf{1}_{ŗ_n} = q \sum_{n=1}^{N} \frac{q_n \mathbf{ŗ}_n}{ŗ_n^3} \tag{3.2}$$

in which $\mathbf{ŗ}_n$ is drawn from the charge $q_n$ to the charge q, and $\mathbf{1}_{ŗ_n} = \mathbf{ŗ}_n / ŗ_n$ is a unit vector
in the same direction as $\mathbf{ŗ}_n$.†
Quite obviously, none of the Cavendish-type experiments proved the validity of the
assumption of superposition; nevertheless, superposition is important for many applica-
tions of the theory. The validity of this assumption is accepted by virtue of the fact that
results predicated on it are consistent with experiment, but it should be recognized that
the principle of superposition for the forces among electric charges was not directly
demonstrated.
Third, the force law for electric charges, (3.1), includes the implication that the line
of action of the force coincides with the straight line connecting the two charges.
Coulomb's experiments did not reveal any transverse component of force, but they
were hardly sensitive enough to be definitive on this point. The Cavendish approach
requires the assumption that this is so, in that symmetry is used to cancel out certain
components of force in computing the net force on a charge between the two spheres.
Here again, the assumption that the force acts along the line joining the charges gains
its strength not from the original experiments, but rather from the accuracy of predic-
tions based on making this assumption.
Fourth, the law (3.1) states that the force varies directly as the algebraic product of
the two quantities of charge. Coulomb was able to show that like charges repel and
unlike charges attract according to the same function of distance, and also showed that
halving one charge reduced the force by a factor of two. Cavendish demonstrated that
doubling each charge quadrupled the force. But neither showed the general validity
of using the product qq', and this is accepted by inference.
Fifth, the law (3.1) states nothing explicitly about the medium in which the charges
q and q' are immersed. The approach to be adopted here will be that only a vacuum is
particle-free, and that any other medium can be viewed at the atomic level as an
assemblage of particles, some of which may be electrified. In the cases of such media,
the generalized form (3.2) can then be used to find the force on the charge q, with some
of the charges qn belonging to the particles which constitute the medium.
Finally, the law (3.1) also states nothing explicitly about whether or not the distance
between q and q' is time-dependent. The experiments were all essentially static. However,
to develop a useful theory, one must be able to let charges move. It will be seen
in retrospect that a satisfactory theory can be based on Equation (3.2) if one assumes
that it is valid for the case that $q_1 \ldots q_N$ are all fixed in position, but that motion of q
is permitted.
† The distance symbol ŗ is actually a German lower-case x, but it has the semantic advantage of
looking like an r. It will be used throughout this text to designate the distance from a source point (the
position of $q_n$) to a field point (the position of q). The symbol r will be reserved for distances measured
from the origin. Other authors have achieved this distinction by using r' for the distance between
source point and field point. Unfortunately that notation is not convenient here, since many of the
developments in Chapters 4 and 5 will involve two coordinate systems XYZ and X'Y'Z'. r and r'
(or ŗ and ŗ') will then mean the distance between the same two points as measured in the two different
coordinate systems. The reader may find it convenient to call the symbol ŗ by the name r-sub-c or
r-cedilla.
There is ample experimental evidence to support this assumption. Many modern
electronic instruments and particle accelerators employ static distributions of charges
to create steady electric fields and thus accelerate charged particles which move
through these fields. When a theoretical trajectory for these accelerated particles
is computed based on (3.2), excellent agreement with the experimental trajectory is
obtained. This has been found to be true even when the trajectory speed is high enough
to require inclusion of relativistic effects.
It has already been seen in Chapter 2 that mass is a function of velocity and the ques-
tion can be raised at this point whether charge is not also a function of velocity. It will
be seen in Chapter 4 that it is convenient to define charge to be an invariant. However,
a general answer to this question can be deferred. For the present it will be assumed
merely that the inverse square law is valid whether or not q moves, with the symbol
q which occurs in the formulas always referring to the value of charge when q is at rest.
With all the foregoing limitations and implicit assumptions in mind, the experiments
described earlier in this chapter will be taken as justification for acceptance of the
inverse square law as a fundamental postulate. Since this will be the only purely elec-
trical postulate needed to develop a complete theory of electromagnetics, space will now
be taken to recapitulate and construct a concise mathematical formulation of this law:
As suggested by Figure 3.7, let there be a static assemblage of N charged particles,
containing respectively charges $q_1, q_2, \ldots, q_N$, arbitrarily arranged in otherwise
empty space. The quantities $q_n$ are real numbers which can be either positive or negative.
The positions of these charged particles can be described in a coordinate system
XYZ so that the nth particle is identified by the coordinates $(x_n, y_n, z_n)$ or by the position
vector $\mathbf{r}_n = \mathbf{1}_x x_n + \mathbf{1}_y y_n + \mathbf{1}_z z_n$. These coordinates are not functions of time.

FIGURE 3.7 Notation for Coulomb's law.

Additionally, let a particle containing a charge q be instantaneously at the point
(x,y,z) described by the position vector $\mathbf{r} = \mathbf{1}_x x + \mathbf{1}_y y + \mathbf{1}_z z$. This charge will be
permitted to move, so that the coordinates x, y, and z can be general functions of time. The
total electrical force exerted on q is then

$$\mathbf{F} = \sum_{n=1}^{N} \mathbf{F}_n \tag{3.3}$$

in which

$$\mathbf{F}_n = \frac{1}{4\pi\epsilon_0}\, \frac{q q_n \mathbf{ŗ}_n}{ŗ_n^3} = \frac{q q_n}{4\pi\epsilon_0}\, \frac{\mathbf{r} - \mathbf{r}_n}{|\mathbf{r} - \mathbf{r}_n|^3} \tag{3.4}$$

The proportionality constant $(4\pi\epsilon_0)^{-1}$ serves to convert (3.1) to an equality. Inclusion
of the factor 4π is known as rationalization and is done so that a factor of 4π will not
appear in the more often used Maxwell's equations to be derived subsequently from
(3.3). The factor $\epsilon_0$ can be looked upon as a units-adjusting parameter. It is called the
permittivity of free space, and in the MKS system of units to be used in this text has
the measured value of $8.854 \times 10^{-12}$ farads/m. This choice for $\epsilon_0$ permits charge to be
measured in coulombs when distance is measured in meters and force in newtons.
The dimensions of $\epsilon_0$ will become clear later in this chapter when the concept of
capacitance is introduced. The newton is a unit of force equal to $10^5$ dynes. Thus 1 lb
(force) equals 4.4482 newtons, so that 1 newton can be remembered conveniently as
being slightly less than a quarter of a pound. One coulomb, the unit of charge, is defined
as the quantity of electricity passing a cross section when a current of 1 amp flows for 1 second.
(The primary definition of an ampere will be discussed in Chapter 4 after Ampere's law
has been derived.) One coulomb is also the quantity of electricity required to deposit
0.001118 g of silver from an aqueous solution of silver nitrate.
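Combining (3.3) and (3.4), one can write for the total electrical force on q,

$$\mathbf{F} = \frac{q}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\mathbf{ŗ}_n}{ŗ_n^3} \tag{3.5}$$

in which

$$\mathbf{ŗ}_n = \mathbf{r} - \mathbf{r}_n \tag{3.6}$$

is the vector drawn from $q_n$ to q.†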
Equation (3.5) is a mathematical statement of the inverse square law, and will here-
after be referred to as Coulomb's law. This equation will prove to be the cornerstone
of all the theory which is to follow, and thus can equally well be called the fundamental
postulate of electricity.
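
The superposition sum in Coulomb's law can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `coulomb_force` and the sample charges are illustrative assumptions, and the value of the permittivity is the one quoted above.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space in farads/m, as quoted in the text


def coulomb_force(q, r, sources):
    """Return the force (newtons) on charge q at position r.

    sources is a list of (q_n, (x_n, y_n, z_n)) pairs for the fixed charges.
    Each term is q*q_n*(r - r_n) / (4*pi*eps0*|r - r_n|**3), summed over n.
    """
    k = 1.0 / (4.0 * math.pi * EPS0)
    force = [0.0, 0.0, 0.0]
    for q_n, r_n in sources:
        sep = [r[i] - r_n[i] for i in range(3)]  # vector drawn from q_n to q
        dist = math.sqrt(sum(c * c for c in sep))
        scale = k * q * q_n / dist**3
        for i in range(3):
            force[i] += scale * sep[i]
    return tuple(force)


# Hypothetical test case: two like 1-microcoulomb charges placed symmetrically
# about the x axis. Their y components cancel, so the net force is along +x.
sources = [(1e-6, (0.0, 1.0, 0.0)), (1e-6, (0.0, -1.0, 0.0))]
net_force = coulomb_force(1e-6, (1.0, 0.0, 0.0), sources)
single = coulomb_force(1e-6, (1.0, 0.0, 0.0), [(1e-6, (0.0, 0.0, 0.0))])
```

The symmetric pair illustrates how transverse components cancel under superposition, the same symmetry argument used in the Cavendish-type experiments.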
It should be noted that Equation (3.5) is being adopted as a postulate only for
discrete charges which are in otherwise empty space. Although on the face of it this
appears impractical, it will be seen later in this chapter that many electrostatic systems
of charges exist in the presence of electrically neutral and unpolarized material
bodies which can be treated as though they were empty space. In Chapter 6 material
bodies which cannot be so treated will be introduced and the theory will be enlarged to
take them into account.
† The force F may be implicitly a function of time if x, y, and z are functions of time.

3.3 THE ELECTRIC FIELD

If a static assemblage of charges $q_n$ exists at the points $(x_n, y_n, z_n)$ and a small test
charge Δq is placed at the point (x,y,z), Equation (3.5) gives for the force on Δq

$$\Delta\mathbf{F} = \frac{\Delta q}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\mathbf{ŗ}_n}{ŗ_n^3} \tag{3.7}$$

in which $\mathbf{ŗ}_n$ is drawn from $q_n$ to Δq.


When it is assumed that the charge Δq is small enough so that its presence or absence
does not affect the spatial distribution of the other charges, the vector

$$\mathbf{E} = \frac{\Delta\mathbf{F}}{\Delta q} = \frac{1}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\mathbf{ŗ}_n}{ŗ_n^3} \tag{3.8}$$

is defined as the force per unit charge at (x,y,z). E can be expressed in the units of
newtons per coulomb and is variously called the electric force, the electric intensity, or
the electric field strength.
By implication, if a charge q of arbitrary size is placed at (x,y,z), it experiences a
force

$$\mathbf{F} = q\mathbf{E} \tag{3.9}$$

However, one must be careful in using (3.9) to ascertain that the presence of q has not
disturbed the positions of the other charges. For example, if the assemblage of charges
$q_n$ is distributed over the surface of a conductor and the charge q is placed in the vicinity,
the charges $q_n$, being free to move, will redistribute themselves to new positions of
equilibrium.
Equation (3.8) indicates that the electric force depends on the charges $q_n$ and their
positions relative to the point (x,y,z) but that it does not depend on Δq. An intensity
E(x,y,z) can thus be associated with the point (x,y,z) whether Δq is there or not. If the
vector function E is interpreted in this manner, it can be taken as a fundamental sub-
ject of investigation. This is the field viewpoint of Maxwell and Faraday, which differs
from the action-at-a-distance theories of their predecessors. In this view, the source
charges qn set up an electric field at the point P(x,y,z); the field in turn will exert a
force on any charge which might be introduced at P.
With this interpretation, E as defined by (3.8) is an electrostatic field, since the source
points $(x_n, y_n, z_n)$ are static and the field point (x,y,z) has coordinates which are not
connected to the possible motion of any particle. This functional dependence is usually
indicated by writing

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \sum_{n=1}^{N} q_n \frac{\mathbf{ŗ}_n}{ŗ_n^3} \tag{3.10}$$

The dependence of E on the sources and their positions is not usually explicitly indicated
in the left side of (3.10), but is nevertheless understood implicitly.
In many problems it will be appropriate to consider the total charge ρ dV in a volume
element dV in lieu of the discrete charge $q_n$. In such cases (3.10) can be written

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_V \rho(\xi,\eta,\zeta)\, \frac{\mathbf{ŗ}}{ŗ^3}\, d\xi\, d\eta\, d\zeta \tag{3.11}$$

in which ρ(ξ,η,ζ) is the volume charge density function, expressed in coul/m$^3$, and

$$\mathbf{ŗ} = \mathbf{1}_x (x - \xi) + \mathbf{1}_y (y - \eta) + \mathbf{1}_z (z - \zeta) \tag{3.12}$$

is the positional vector drawn from the volume element centered at (ξ,η,ζ) to the field
point (x,y,z). The volume region V is sufficient to encompass all the sources ρ dV.
Similarly, there will be occasions when it is useful to consider the total charge σ dS
on a surface element dS, or the total charge χ dℓ on a line element dℓ, in place of the
discrete charge $q_n$. For example, in the case of surface distributions, (3.10) becomes

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_S \sigma(\xi,\eta,\zeta)\, \frac{\mathbf{ŗ}}{ŗ^3}\, dS \tag{3.13}$$

in which σ is the surface charge density function, given in coul/m$^2$, and $\mathbf{ŗ}$ is drawn
from the surface element dS, centered at (ξ,η,ζ), to the field point (x,y,z).

EXAMPLE 3.1
To gain some appreciation for the effects caused by 1 coul of electric charge, recall that the
charge on a single electron is $1.6 \times 10^{-19}$ coul. Therefore, it takes $6.25 \times 10^{18}$ electrons to
comprise 1 coul of charge. If these electrons were to be arranged in a cubical lattice 3 Å on
centers, the resulting cube would be approximately 500 microns on a side. An exterior
electron of this assemblage would be an average distance of perhaps 250 microns from the
remainder of the charge and would thus experience a repulsive force in the order of

$$F = eE \approx \frac{1.6 \times 10^{-19}}{4\pi (8.854 \times 10^{-12}) (250 \times 10^{-6})^2} \approx 0.02 \text{ newton}$$

Although this does not seem to be a great force, when it is remembered that the mass of
an electron is only $9.1 \times 10^{-31}$ kg, the initial acceleration of a free electron experiencing this
force would be approximately $10^{27}$ g. In the absence of compensating forces, this assemblage
of electrons is highly unstable.
Suppose that on a macroscopic scale this entire coulomb of charge is essentially concentrated
at a point. Then the electric field due to it is radial and given by

$$\mathbf{E}(r) = \mathbf{1}_r \frac{1}{4\pi\epsilon_0 r^2} \approx \mathbf{1}_r \frac{10^{10}}{r^2}$$

This field is so great that if another charge of 1 coul were placed 3 m distant, it would experience
a force in excess of 100 kilotons. Further, in the presence of normal air, this field
intensity would cause breakdown out to a distance r = 100 m. In every respect 1 coul
represents an enormous amount of charge.
Alternatively, if one asks what amount of charge exerts normal forces at normal distances,
a feeling for this can be gained by considering two equal charges 1 m apart which
exert a force of 4 newtons (approximately one pound) on each other. Solving Coulomb's
force equation gives q = 21 micro-coul.
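
The order-of-magnitude figures in this example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the variable names and the value 9.81 m/s² for g are assumptions not stated in the text.

```python
import math

EPS0 = 8.854e-12       # farads/m, value quoted in the text
E_CHARGE = 1.6e-19     # electron charge magnitude in coul, as quoted
M_ELECTRON = 9.1e-31   # electron mass in kg, as quoted
G_ACCEL = 9.81         # m/s^2, assumed value for one "g"

k = 1.0 / (4.0 * math.pi * EPS0)

# Number of electrons in 1 coul and the side of a cubical lattice 3 angstroms on centers
n_electrons = 1.0 / E_CHARGE
cube_side = n_electrons ** (1.0 / 3.0) * 3e-10            # meters

# Force on an exterior electron 250 microns from 1 coul, and its acceleration in g
force_on_electron = k * E_CHARGE * 1.0 / (250e-6) ** 2    # newtons
accel_in_g = force_on_electron / M_ELECTRON / G_ACCEL

# Force between two 1-coul point charges 3 m apart
force_at_3m = k * 1.0 * 1.0 / 3.0 ** 2                    # newtons

# Equal charges that repel with 4 newtons at 1 m separation
q_for_4_newtons = math.sqrt(4.0 / k)                      # coulombs
```

Under these assumptions the cube side comes out near 550 microns and the charge near 21 micro-coul, in line with the rounded values quoted above.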

EXAMPLE 3.2
Imagine that all of space is populated with electric charge, but in such a way that the
charge density varies in only one direction. Let X be this direction, so that the situation can
be pictured in terms of plane layers of charge, stacked one next to the other, all transverse
to the X axis. The volume charge density ρ(ξ) varies from one layer to the next, but is constant
throughout any layer. Let it be desired to find the electric field distribution due to
this charge system.

[Figure: the layer of charge between the planes ξ and ξ + dξ, four symmetrically placed volume elements, and the field point P(x,y,z).]


The layer of charge contained between the planes ξ and ξ + dξ is pictured, together with
the field point (x,y,z) at which the electric field is to be determined. By symmetry, the
charges in the four volume elements shown exert a net effect at (x,y,z) which is X directed.
On the basis of the contributions from all volume elements in this double-paired fashion,
Equation (3.11) gives

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta \int_{-\infty}^{\infty} \frac{\rho(\xi)\, \mathbf{1}_x (x - \xi)\, d\zeta}{[(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2]^{3/2}}$$

$$\mathbf{E}(x,y,z) = \frac{\mathbf{1}_x}{\pi\epsilon_0} \int_{-\infty}^{\infty} \rho(\xi')\, \xi'\, d\xi' \int_0^{\infty} d\eta' \int_0^{\infty} \frac{d\zeta'}{[(\xi')^2 + (\eta')^2 + (\zeta')^2]^{3/2}}$$

in which ξ' = x − ξ, η' = y − η, and ζ' = z − ζ. Integration gives

$$\mathbf{E}(x,y,z) = \frac{\mathbf{1}_x}{\pi\epsilon_0} \int_{-\infty}^{\infty} \rho(\xi')\, \xi'\, d\xi' \int_0^{\infty} \frac{d\eta'}{(\xi')^2 + (\eta')^2}$$

$$\mathbf{E}(x,y,z) = \pm \frac{\mathbf{1}_x}{2\epsilon_0} \int_{-\infty}^{\infty} \rho(\xi')\, d\xi'$$

the plus or minus sign being taken according to whether ξ' is greater or less than
zero. Thus the electric field is independent of y and z and is given by

$$\mathbf{E}(x) = \frac{\mathbf{1}_x}{2\epsilon_0} \left[ \int_{-\infty}^{x} \rho(\xi)\, d\xi - \int_{x}^{\infty} \rho(\xi)\, d\xi \right]$$

This result has a simple physical interpretation. If a column of unit cross-sectional area
extending from ξ = −∞ to ξ = +∞ is considered, the first integral is all the charge in
that part of the column to the left of ξ = x, whereas the second integral is all the charge in
that part of the column to the right of ξ = x. Thus the electric field at any cross section is
uniform over that cross section, normal to it, and equal to $1/2\epsilon_0$ times the difference in the
total charge per unit area found to the left and right of the cross section.
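
The "charge to the left minus charge to the right" rule is easy to evaluate numerically. The following is a minimal sketch under stated assumptions: the density is taken to vanish outside a finite interval [lo, hi], the integrals are done by a midpoint rule, and the names `slab_field`, `rho0`, and `uniform_slab` are hypothetical.

```python
EPS0 = 8.854e-12  # farads/m, value quoted in the text


def slab_field(rho, x, lo, hi, steps=2000):
    """X-directed field at x for a density rho(xi) that is zero outside [lo, hi].

    Evaluates (1 / 2 eps0) * (charge per unit area to the left of x
    minus charge per unit area to the right of x).
    """
    def integrate(a, b):
        if b <= a:
            return 0.0
        h = (b - a) / steps
        return sum(rho(a + (i + 0.5) * h) for i in range(steps)) * h

    x_clamped = min(max(x, lo), hi)   # all charge lies inside [lo, hi]
    left = integrate(lo, x_clamped)
    right = integrate(x_clamped, hi)
    return (left - right) / (2.0 * EPS0)


# Hypothetical test case: a uniform slab of density rho0 between xi = -1 and xi = +1
rho0 = 1e-9  # coul/m^3


def uniform_slab(xi):
    return rho0 if abs(xi) <= 1.0 else 0.0


field_center = slab_field(uniform_slab, 0.0, -1.0, 1.0)    # zero by symmetry
field_inside = slab_field(uniform_slab, 0.5, -1.0, 1.0)    # rho0 * 0.5 / eps0
field_outside = slab_field(uniform_slab, 2.0, -1.0, 1.0)   # rho0 * 1.0 / eps0
```

For the uniform slab the field grows linearly inside the charge and is constant outside it, as the closed-form result predicts.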

EXAMPLE 3.3
Consider a spherical conducting† shell of outer radius a which contains a net charge Q.
What is the electric field distribution for this system?
Because of repulsion, the charge Q will distribute itself uniformly over the outer surface
of the shell with a density $\sigma = Q/4\pi a^2$. By symmetry, the field E at any distance r from the
center of the shell will be independent of the direction of that distance. With the use of
spherical coordinates, (3.13) can be written for this example in the form

$$\mathbf{E}(r) = \frac{1}{4\pi\epsilon_0} \int_0^{2\pi} \int_0^{\pi} \frac{Q}{4\pi a^2}\, \frac{\mathbf{ŗ}}{ŗ^3}\, a^2 \sin\theta\, d\theta\, d\phi$$

in which, for convenience, the electric field is being evaluated at the point (r,0,0).
Referring to the figure, one sees that the charge contained within the band of area
$2\pi a^2 \sin\theta\, d\theta$ exerts a net effect at (r,0,0) which is entirely radial, and thus the above integral
can be written

$$\mathbf{E}(r) = \mathbf{1}_r \frac{Q}{8\pi\epsilon_0} \int_0^{\pi} \frac{\cos\alpha\, \sin\theta\, d\theta}{ŗ^2}$$
† For the purposes of electrostatics, materials may be divided into two categories: conductors of
electricity and insulators (dielectrics). A conductor may be viewed as an aggregation of charged par-
ticles occupying a region which, on the atomic level, is mostly vacuous; conductors are thus brought
within the purview of the present analysis. A large number of these charged particles (usually elec-
trons) are free to wander throughout the confines of the conductor. These mobile charges respond
readily to an electric field and will continue to move as long as they experience a field. Thus whenever
the mobile carriers are arranged in a spatial distribution whose statistical time-average value is zero,
there is no net static electric field anywhere within the conductor. Only situations in which this is the
case will be considered in this chapter, the treatment of dynamic situations being reserved to Chap-
ter 8. The discussion of dielectric materials will be deferred until Chapter 6.
[Figure: spherical shell of radius a and the field point P(r,0,0).]

Using the law of cosines, one can convert this to the form

$$\mathbf{E}(r) = \mathbf{1}_r \frac{Q}{16\pi\epsilon_0 a r^2} \int_{|r-a|}^{r+a} \frac{ŗ^2 + r^2 - a^2}{ŗ^2}\, dŗ$$

which gives

$$\mathbf{E}(r) = \begin{cases} \mathbf{1}_r \dfrac{Q}{4\pi\epsilon_0 r^2} & r > a \\ 0 & r < a \end{cases}$$

Thus charge which is uniformly distributed over a spherical shell creates an electric field
at all external points as though the charge were concentrated at the center of the shell; at
all internal points it exerts no electric field whatsoever.
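
The shell result can be confirmed by evaluating the band integral numerically. This is a sketch under stated assumptions: α is taken as the angle at the field point between the radial direction and the line from the band, so that cos α = (r − a cos θ)/ŗ, and the function name `shell_field` and the sample values are hypothetical.

```python
import math

EPS0 = 8.854e-12  # farads/m, value quoted in the text


def shell_field(r, a, charge, steps=20000):
    """Radial field at distance r from the center of a uniformly charged shell.

    Midpoint-rule evaluation of (Q / 8 pi eps0) * integral over theta of
    cos(alpha) * sin(theta) / dist**2, valid for r different from a.
    """
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        # law of cosines: distance from a point on the band to the field point
        dist = math.sqrt(r * r + a * a - 2.0 * a * r * math.cos(theta))
        cos_alpha = (r - a * math.cos(theta)) / dist   # assumed definition of alpha
        total += cos_alpha * math.sin(theta) / dist**2 * h
    return charge / (8.0 * math.pi * EPS0) * total


a, charge = 1.0, 1e-9
outside = shell_field(2.0, a, charge)   # should match a point charge at the center
inside = shell_field(0.5, a, charge)    # should vanish
point_charge_value = charge / (4.0 * math.pi * EPS0 * 2.0**2)
```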

EXAMPLE 3.4
An academic problem which later can be extended to a practical situation concerns two
infinitely long parallel line charges. As shown in the figure, the upper line
contains a uniform charge density of χ coul/m and the lower line contains the opposite uniform
charge density of −χ coul/m. The electric field at any point in space can be deduced
by first noting that by symmetry the answer will be independent of z. Thus the value of E
will be sought at the point P(x,y,0).
In analogy with what has been said previously for volume and surface distributions, this
lineal distribution of charge gives rise to an electric field expressible as

$$\mathbf{E}(x,y,0) = \frac{1}{4\pi\epsilon_0} \int_{-\infty}^{\infty} \chi\, \frac{\mathbf{ŗ}_1}{ŗ_1^3}\, d\zeta - \frac{1}{4\pi\epsilon_0} \int_{-\infty}^{\infty} \chi\, \frac{\mathbf{ŗ}_2}{ŗ_2^3}\, d\zeta$$

in which $\mathbf{ŗ}_1$ is drawn from a charge element in the upper line to P and $\mathbf{ŗ}_2$ is drawn from a
charge element in the lower line to P. Thus

$$\mathbf{E}(x,y,0) = \frac{\chi}{4\pi\epsilon_0} \int_{-\infty}^{\infty} \frac{\mathbf{1}_x x + \mathbf{1}_y (y - d/2) - \mathbf{1}_z \zeta}{[x^2 + (y - d/2)^2 + \zeta^2]^{3/2}}\, d\zeta - \frac{\chi}{4\pi\epsilon_0} \int_{-\infty}^{\infty} \frac{\mathbf{1}_x x + \mathbf{1}_y (y + d/2) - \mathbf{1}_z \zeta}{[x^2 + (y + d/2)^2 + \zeta^2]^{3/2}}\, d\zeta$$

$$\mathbf{E}(x,y) = \frac{\chi}{2\pi\epsilon_0} \left[ \frac{\mathbf{1}_x x + \mathbf{1}_y (y - d/2)}{x^2 + (y - d/2)^2} - \frac{\mathbf{1}_x x + \mathbf{1}_y (y + d/2)}{x^2 + (y + d/2)^2} \right]$$
[Figure: two parallel line charges of density χ and −χ, spaced d apart, with the field point P(x,y,0).]
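
The closed-form field of the two line charges can be compared with a brute-force sum over long but finite lines. The sketch below is illustrative only: it assumes the lines lie parallel to Z at y = +d/2 (density +χ) and y = −d/2 (density −χ), and the function names, the truncation length, and the sample values are hypothetical.

```python
import math

EPS0 = 8.854e-12  # farads/m, value quoted in the text


def line_pair_field(x, y, chi, d):
    """Closed-form (E_x, E_y) at (x, y) for line charges +chi at y=+d/2, -chi at y=-d/2."""
    c = chi / (2.0 * math.pi * EPS0)
    y_up, y_low = y - d / 2.0, y + d / 2.0
    rho_up, rho_low = x * x + y_up * y_up, x * x + y_low * y_low
    return (c * (x / rho_up - x / rho_low), c * (y_up / rho_up - y_low / rho_low))


def line_pair_field_numeric(x, y, chi, d, half_length=200.0, steps=40000):
    """Midpoint-rule sum of the Coulomb integrals over -half_length < zeta < half_length."""
    c = chi / (4.0 * math.pi * EPS0)
    h = 2.0 * half_length / steps
    e_x = e_y = 0.0
    for i in range(steps):
        zeta = -half_length + (i + 0.5) * h
        for sign in (1.0, -1.0):             # +1: upper line, -1: lower line
            dy = y - sign * d / 2.0
            dist_cubed = (x * x + dy * dy + zeta * zeta) ** 1.5
            e_x += sign * c * x / dist_cubed * h
            e_y += sign * c * dy / dist_cubed * h
    return (e_x, e_y)


exact = line_pair_field(1.0, 0.5, 1e-9, 1.0)
approx = line_pair_field_numeric(1.0, 0.5, 1e-9, 1.0)
```

The z components cancel by symmetry and are omitted; the small residual difference between the two results comes from truncating the lines at a finite length.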

The Dirac delta function can be used to advantage in the formulation of many field
problems. Written δ(x − a), the one-dimensional delta function is defined as having the
properties:

$$\delta(x - a) = 0 \quad \text{for all } x \neq a$$

$$\int \delta(x - a)\, dx = 1 \quad \text{if } x = a \text{ is included in the region of integration; otherwise the integral is zero}$$

The delta function can be pictured conveniently as the limit of a Gaussian curve, or
some other similarly peaked distribution, when in the limit the curve narrows and
heightens indefinitely, but in such a way that the area under the curve remains unity.
This area is considered to be dimensionless.
From these definitions it follows that, if f(x) is any arbitrary function

$$\int f(x)\, \delta(x - a)\, dx = f(a) \tag{3.14}$$

$$\int f(x)\, \delta'(x - a)\, dx = -f'(a) \tag{3.15}$$

in which the prime denotes differentiation. The first result comes readily from the mean
value theorem whereas the second can be derived from the differential of a product. In
both formulas a is assumed to lie within the integration interval.
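
The Gaussian picture of the delta function can be tried numerically. The following is a minimal sketch, assuming a Gaussian of small width w as a stand-in for δ(x − a) and its derivative as a stand-in for δ'(x − a); the function names and the choice f(x) = cos x are illustrative assumptions.

```python
import math


def gaussian(u, w):
    """Unit-area Gaussian of width w, a narrow stand-in for the delta function."""
    return math.exp(-0.5 * (u / w) ** 2) / (w * math.sqrt(2.0 * math.pi))


def sift(f, a, w, steps=4000):
    """Midpoint-rule estimate of the integral of f(x) * delta(x - a)."""
    lo, hi = a - 10.0 * w, a + 10.0 * w
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += f(x) * gaussian(x - a, w) * h
    return total


def sift_derivative(f, a, w, steps=4000):
    """Midpoint-rule estimate of the integral of f(x) * delta'(x - a)."""
    lo, hi = a - 10.0 * w, a + 10.0 * w
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        gaussian_slope = -(x - a) / w**2 * gaussian(x - a, w)  # d/dx of the Gaussian
        total += f(x) * gaussian_slope * h
    return total


value = sift(math.cos, 0.3, 1e-3)                  # tends to cos(0.3)
slope_term = sift_derivative(math.cos, 0.3, 1e-3)  # tends to -(-sin(0.3)) = sin(0.3)
```

As the width w is reduced, the two estimates approach f(a) and −f'(a) respectively, in line with (3.14) and (3.15).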
Delta functions in multidimensional space can be fabricated by forming products of
one-dimensional delta functions. In three dimensions

$$\delta(\mathbf{r} - \mathbf{r}_0) = \delta(x - a)\, \delta(y - b)\, \delta(z - c)$$

in which $\mathbf{r}$ and $\mathbf{r}_0$ are the position vectors drawn respectively to the points (x,y,z) and
(a,b,c). As a consequence of the earlier definitions, $\delta(\mathbf{r} - \mathbf{r}_0)$ vanishes except at $\mathbf{r} = \mathbf{r}_0$
and, if dV = dx dy dz,

$$\int_V \delta(\mathbf{r} - \mathbf{r}_0)\, dV = \begin{cases} 1 & \text{if } V \text{ contains } (a,b,c) \\ 0 & \text{if } V \text{ does not contain } (a,b,c) \end{cases}$$
Through the use of the delta function, an assemblage of discrete charges can be represented
by a volume charge density function in the form

$$\rho(\xi,\eta,\zeta) = \sum_{n=1}^{N} q_n\, \delta(\mathbf{r}' - \mathbf{r}_n) \tag{3.16}$$

in which $\mathbf{r}'$ is the position vector drawn to the point (ξ,η,ζ). This representation can be
verified by inserting (3.16) in (3.11), which gives

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0} \int_V \sum_{n=1}^{N} q_n\, \delta(\mathbf{r}' - \mathbf{r}_n)\, \frac{\mathbf{r} - \mathbf{r}'}{|\mathbf{r} - \mathbf{r}'|^3}\, d\xi\, d\eta\, d\zeta$$

Use of the result (3.14) reduces this to (3.10).


Similarly, a surface charge distribution can be represented in terms of a volume
charge density through the use of a one-dimensional delta function along the direction
of the normal to the surface; a lineal charge distribution can be represented by a volume
charge density through the use of a two-dimensional delta function in a transverse surface.
By means of these representations a general discussion in terms of ρ-type distributions
has much wider applicability.

3.4 ELECTROSTATIC POTENTIAL

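Use of the del operator $\nabla = \mathbf{1}_x \frac{\partial}{\partial x} + \mathbf{1}_y \frac{\partial}{\partial y} + \mathbf{1}_z \frac{\partial}{\partial z}$ to form the gradient of inverse distance
(cf. Mathematical Supplement) gives

$$\nabla\left(\frac{1}{ŗ}\right) = -\frac{\mathbf{1}_x (x - \xi) + \mathbf{1}_y (y - \eta) + \mathbf{1}_z (z - \zeta)}{[(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2]^{3/2}}$$

so that

$$\nabla\left(\frac{1}{ŗ}\right) = -\frac{\mathbf{ŗ}}{ŗ^3}$$

from which it follows that (3.11) can be written

$$\mathbf{E}(x,y,z) = -\frac{1}{4\pi\epsilon_0} \int_V \rho(\xi,\eta,\zeta)\, \nabla\left(\frac{1}{ŗ}\right) d\xi\, d\eta\, d\zeta$$

Since neither ρ(ξ,η,ζ) nor the limits of integration are functions of the field point (x,y,z),
the order of integration and differentiation can be interchanged giving

$$\mathbf{E}(x,y,z) = -\nabla \int_V \frac{\rho(\xi,\eta,\zeta)\, d\xi\, d\eta\, d\zeta}{4\pi\epsilon_0 ŗ} \tag{3.17}$$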

Thus the electric field is expressible as the negative of the gradient of the scalar function

$$\Phi(x,y,z) = \int_V \frac{\rho(\xi,\eta,\zeta)\, d\xi\, d\eta\, d\zeta}{4\pi\epsilon_0 ŗ} \tag{3.18}$$

Φ is called the scalar electrostatic potential function and it has several important properties
which will now be developed.
Since

$$\mathbf{E} = -\nabla\Phi \tag{3.19}$$

use of the vector identity (V.112) gives

$$\nabla \times \mathbf{E} = 0 \tag{3.20}$$

and thus the static electric field is irrotational. Application of Stokes' theorem then
yields

$$\oint_C \mathbf{E} \cdot d\boldsymbol{\ell} = \int_S \nabla \times \mathbf{E} \cdot d\mathbf{S} = 0 \tag{3.21}$$

in which C is any closed contour, forming the sole boundary of an open surface S.
Consider a contour C which is arbitrarily divided into two segments $C_1$ and $C_2$ by the
points $P_1$ and $P_2$, as shown in Figure 3.8. As a consequence of (3.21)

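$$\int_{P_1\,(C_1)}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} = \int_{P_1\,(C_2)}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} \tag{3.22}$$

FIGURE 3.8 A segmented contour.

In words, the line integral of the longitudinal component of E from $P_1$ to $P_2$ is independent
of the path and therefore the static electric field is said to be conservative. But
from (3.19)

$$\int_{P_1}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} = -\int_{P_1}^{P_2} \nabla\Phi \cdot d\boldsymbol{\ell} = -\int_{P_1}^{P_2} d\Phi$$

or

$$\Phi(P_2) - \Phi(P_1) = -\int_{P_1}^{P_2} \mathbf{E} \cdot d\boldsymbol{\ell} \tag{3.23}$$

The difference in the value of the scalar electrostatic potential function at any two points
is thus the negative of the line integral of longitudinal E along any path connecting the
two points.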
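
Path independence can be illustrated by integrating E · dℓ numerically along two different routes between the same end points. This is a sketch under stated assumptions: the sources are two hypothetical point charges, the field and potential are the discrete-charge forms, and the paths `straight` and `bent` are arbitrary choices that avoid the charges.

```python
import math

EPS0 = 8.854e-12  # farads/m, value quoted in the text
K = 1.0 / (4.0 * math.pi * EPS0)


def potential(p, sources):
    """Potential at p due to point charges given as (q_n, position) pairs."""
    return sum(K * q_n / math.dist(p, r_n) for q_n, r_n in sources)


def field(p, sources):
    """Electric field at p from the same point charges."""
    e = [0.0, 0.0, 0.0]
    for q_n, r_n in sources:
        d = math.dist(p, r_n)
        for i in range(3):
            e[i] += K * q_n * (p[i] - r_n[i]) / d**3
    return e


def line_integral(path, sources, steps=20000):
    """Midpoint-rule estimate of the integral of E . dl along path(t), 0 <= t <= 1."""
    total = 0.0
    for i in range(steps):
        t0, t1 = i / steps, (i + 1) / steps
        start, end = path(t0), path(t1)
        e_mid = field(path(0.5 * (t0 + t1)), sources)
        total += sum(e_mid[j] * (end[j] - start[j]) for j in range(3))
    return total


def straight(t):
    """Straight line from P1 = (1, 0, 0) to P2 = (0, 2, 0)."""
    return (1.0 - t, 2.0 * t, 0.0)


def bent(t):
    """Same end points, but routed through the corner (1, 2, 0)."""
    if t < 0.5:
        return (1.0, 4.0 * t, 0.0)
    return (2.0 - 2.0 * t, 2.0, 0.0)


sources = [(1e-9, (0.0, 0.0, 0.0)), (-2e-9, (0.0, 0.0, 1.0))]
p1, p2 = (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)
potential_drop = potential(p1, sources) - potential(p2, sources)
via_straight = line_integral(straight, sources)
via_bent = line_integral(bent, sources)
```

Both line integrals should agree with the potential drop from the first point to the second, to within the accuracy of the midpoint rule.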
If the total charge in the system is finite and occupies a finite volume, it follows from
(3.18) that the value of Φ at infinity is zero. If in (3.23) the point $P_2$ is allowed to
approach infinity, one obtains

$$\Phi(P_1) = \int_{P_1}^{\infty} \mathbf{E} \cdot d\boldsymbol{\ell} \tag{3.24}$$

This result has a clear physical interpretation. If there is a distribution of charges
ρ(ξ,η,ζ) dV which create a static electric field E, and if a charge q is placed at $P_1$, it will
experience a force qE (providing its presence does not alter the charge distribution).
If q is then displaced an amount dℓ along an arbitrary path, the charge system does an
amount of work on q equal to qE · dℓ. If this process is continued and q is permitted to
recede to infinity, the total amount of energy extracted from the charge system is

$$W = \int_{P_1}^{\infty} q\mathbf{E} \cdot d\boldsymbol{\ell} = q\Phi(P_1) \tag{3.25}$$

Therefore W is the amount of energy potentially available when q is at $P_1$, energy which
can be extracted from the charge system by removing q to a point remote from all of the
charges in the system. For this reason $\Phi(P_1) = W/q$ can be interpreted as the potential
energy available per coulomb at the point $P_1$ due to the charge distribution ρ dV. Φ is
expressed in units of joules per coulomb, or volts. For this reason E is customarily given
in volts per meter, a unit which can be understood by virtue of (3.23).
Of course, the value of $\Phi(P_1)$ can be negative just as well as positive. If it is positive,
the net action of the charge system on a positive charge q as it moves away is repulsive.
If on the other hand $\Phi(P_1)$ is negative, the net action of the charge system on a positive
charge q as it moves away is attractive. In this latter case it takes an external force and
an external supply of energy to pull the positive charge q away. Instead of the charge
system having provided energy to remove q, it has required the addition of energy at the
expense of external sources in order to effect the removal. These arguments are inverted
if q is a negative charge, but then the signs of W and Φ are opposite, as seen from (3.25),
so the interpretation with respect to the supply or removal of energy is the same.
$\Phi(P_1)$ as given by (3.24) is called the absolute potential, whereas $\Phi(P_2) - \Phi(P_1)$ as
expressed in (3.23) is customarily given the name of potential difference.
The result (3.21) is consistent with the law of conservation of energy. Since the integrand
can be thought of as the work done by the charge system on a positive unit charge
as it undergoes a displacement dℓ, the closed line integral is the work done on a positive
unit charge as it moves from an initial point around any path and ultimately back to the
initial point. Upon its return the original situation is reproduced. If there had been a net
value for the work done, this cycle could be repeated endlessly and would constitute a
perpetual-motion machine.
Because of the independence of path, the result (3.23) can be written

$$\Phi(P_2) - \Phi(P_1) = \int_{P_2}^{\infty} \mathbf{E} \cdot d\boldsymbol{\ell} - \int_{P_1}^{\infty} \mathbf{E} \cdot d\boldsymbol{\ell}$$

which is consonant with (3.24) and indicates that the potential difference is the difference
in absolute potentials at $P_2$ and $P_1$, as well as being a measure of the energy which
can be extracted from the system if a positive unit charge is moved from $P_1$ to $P_2$. One
may interpret (3.23) by saying that, if E on the average is oriented from $P_1$ to $P_2$,
energy is extracted from the charge system as a positive unit charge moves from $P_1$ to
$P_2$. This means that the energy potentially available at $P_2$ is less than at $P_1$, thus
explaining the minus sign on the right side of (3.23).
The scalar function Φ(x,y,z), defined by (3.18), can be set equal to a constant, thereby
prescribing an equipotential surface. The discussion of gradient in the Mathematical
Supplement establishes the facts that ∇Φ is perpendicular to the equipotential surface
and has a magnitude and direction synonymous with the maximum spatial rate of
increase of Φ. Because of (3.19), E(x,y,z) is therefore normal to the equipotential surface
which contains (x,y,z), and has a magnitude and direction which give the maximum
spatial rate of decrease of Φ. The negative feature of this interpretation is in accord with
the discussion of the negative sign in (3.23). More will be said in Section 3.6 about the
orthogonality between E and the equipotential surfaces, in connection with flux maps.
To summarize the results of this section, a scalar electrostatic potential function
Φ(x,y,z) has been introduced, with (3.18) the defining relation. Φ(x,y,z) has the physical
significance of being the potential energy available if a positive unit charge is placed at
(x,y,z), this energy being extractable through removal of the unit charge to infinity. Φ is
related to the electric field by Equations (3.19) and (3.24) and is related to the sources by
(3.18). Φ is an exact function because the system is conservative, the line integral in
(3.24) being independent of the path.
Since surface distributions of charge a dS, lineal distributions x de, and discrete
charge distributions qn can be represented by volume distributions p dV through use
of the Dirac delta function (cf. Section 3.3), all the results just obtained apply to these
types of distributions as well, Appropriate forms for the potential function include

$$\Phi(x,y,z) = \int_S \frac{\sigma(\xi,\eta,\zeta)\,dS}{4\pi\epsilon_0\,\mathscr{r}} \tag{3.26}$$

$$\Phi(x,y,z) = \frac{1}{4\pi\epsilon_0}\sum_{n=1}^{N}\frac{q_n}{\mathscr{r}_n} \tag{3.27}$$

The second of these results can be obtained by inserting (3.16) into (3.18) and the first
can be found by a similar procedure. Alternatively, Equations (3.8) and (3.13) can be
rephrased in terms of the gradient of inverse distance and the procedure which led
to (3.18) can be repeated for these two cases.
Because (3.18) is a scalar integral, for many volume distributions it is much easier
to evaluate than the vector integral (3.11). When such is the case, it probably will
prove to be simpler to find E by first finding Φ and then forming −∇Φ, rather than
attempting to find E directly.
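
The remark above can be illustrated numerically. The following is a minimal sketch, not the text's own procedure: it assumes a set of discrete charges, evaluates the scalar sum (3.27), and forms −∇Φ by central differences. The function names, the step size h, and the sample charges are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def potential(charges, x, y, z):
    """Scalar sum (3.27); charges is a list of tuples (q, xq, yq, zq)."""
    total = 0.0
    for q, xq, yq, zq in charges:
        total += q / math.dist((x, y, z), (xq, yq, zq))
    return total / (4.0 * math.pi * EPS0)


def field_from_potential(charges, x, y, z, h=1e-4):
    """E = -grad(Phi), approximated by central differences of step h (an assumed value)."""
    point = [x, y, z]
    field = []
    for axis in range(3):
        fwd = list(point)
        bwd = list(point)
        fwd[axis] += h
        bwd[axis] -= h
        slope = (potential(charges, *fwd) - potential(charges, *bwd)) / (2.0 * h)
        field.append(-slope)
    return tuple(field)


def field_direct(charges, x, y, z):
    """Vector Coulomb sum, used only as a cross-check."""
    ex = ey = ez = 0.0
    for q, xq, yq, zq in charges:
        dx, dy, dz = x - xq, y - yq, z - zq
        dist3 = math.dist((x, y, z), (xq, yq, zq)) ** 3
        ex += q * dx / dist3
        ey += q * dy / dist3
        ez += q * dz / dist3
    scale = 1.0 / (4.0 * math.pi * EPS0)
    return ex * scale, ey * scale, ez * scale
```

Only one scalar sum has to be coded for the potential; the three field components then follow from the same routine by differencing.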

EXAMPLE 3.5
An important special case of an electric charge system is the doublet, or dipole, consisting
of two charges q and −q separated by a small distance d. It is desired to find the potential
and electric field of this charge system a distance 𝓇 from its center, with 𝓇 ≫ d. This result
derives some of its importance from a model of dielectric materials, whose behavior can be
explained in terms of atomic and molecular dipoles (see Chapter 6).
On the basis of the figure and Equation (3.27), the potential at a remote point P(𝓇,θ,φ)
is given by

$$\Phi = \frac{q}{4\pi\epsilon_0}\left(\frac{1}{\mathscr{r}_1} - \frac{1}{\mathscr{r}_2}\right) = \frac{q}{4\pi\epsilon_0}\,\frac{\mathscr{r}_2 - \mathscr{r}_1}{\mathscr{r}_1\mathscr{r}_2}$$
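But, by the law of cosines applied to the figure,

$$\mathscr{r}_1^2 = \mathscr{r}^2 + \left(\frac{d}{2}\right)^2 - \mathscr{r}d\cos\theta \qquad \mathscr{r}_2^2 = \mathscr{r}^2 + \left(\frac{d}{2}\right)^2 + \mathscr{r}d\cos\theta$$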

and, since 𝓇 ≫ d, these expressions may be expanded by the binomial theorem (cf. Mathematical
Supplement) into rapidly converging series. Retaining only dominant terms gives

$$\mathscr{r}_2 - \mathscr{r}_1 \approx d\cos\theta$$

and thus

$$\Phi(\mathscr{r},\theta,\phi) \approx \frac{qd\cos\theta}{4\pi\epsilon_0\,\mathscr{r}^2}$$

The product qd is called the dipole moment, a phrase borrowed from mechanics. It is useful
to introduce a vector p whose magnitude is qd and whose direction is from the charge −q to
the charge +q. If 𝓇 is taken to mean the directed distance from the center of the dipole to
the remote point P, the above result can be written

$$\Phi = \frac{\mathbf{p}\cdot\boldsymbol{\mathscr{r}}}{4\pi\epsilon_0\,\mathscr{r}^3} \tag{3.28}$$

This form of expression of the potential of a static electric dipole will prove convenient in
later discussions.

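The electric field due to the dipole can be found by employing the gradient operator for
spherical coordinates centered at the dipole, namely,

$$\mathbf{E}(\mathscr{r},\theta,\phi) = -\nabla\left(\frac{qd\cos\theta}{4\pi\epsilon_0\,\mathscr{r}^2}\right) = \frac{qd}{4\pi\epsilon_0\,\mathscr{r}^3}\left(\mathbf{1}_{\mathscr{r}}\,2\cos\theta + \mathbf{1}_{\theta}\sin\theta\right) \tag{3.29}$$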

It can be observed that the electric field and potential of a doublet diminish with distance
as the inverse cube and square, respectively, whereas Equations (3.8) and (3.27) indicate
that for a single charge the electric field and potential diminish with distance only to the
inverse second and first powers. The explanation lies in the fact that the doublet consists of
two equal and opposite charges close enough together so that they partly neutralize each
other's effect.
Field plots proportional to Equations (3.28) and (3.29) can be found with Example 3.11.
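
As a rough check on the far-field approximation of Example 3.5, the sketch below compares the exact two-charge potential with the dominant-term result. It assumes +q on the polar axis at +d/2 and −q at −d/2, with θ measured from that axis; the function names and sample values are illustrative only.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def dipole_potential_exact(q, d, r, theta):
    """Exact potential of +q at +d/2 and -q at -d/2 on the polar axis (assumed geometry)."""
    r1 = math.sqrt(r * r + d * d / 4.0 - r * d * math.cos(theta))  # distance to +q
    r2 = math.sqrt(r * r + d * d / 4.0 + r * d * math.cos(theta))  # distance to -q
    return q / (4.0 * math.pi * EPS0) * (1.0 / r1 - 1.0 / r2)


def dipole_potential_far(q, d, r, theta):
    """Dominant-term result q d cos(theta) / (4 pi eps0 r^2), valid for r >> d."""
    return q * d * math.cos(theta) / (4.0 * math.pi * EPS0 * r * r)


if __name__ == "__main__":
    for ratio in (5.0, 50.0, 500.0):
        exact = dipole_potential_exact(1e-9, 1e-3, ratio * 1e-3, 0.7)
        far = dipole_potential_far(1e-9, 1e-3, ratio * 1e-3, 0.7)
        print(ratio, abs(far - exact) / abs(exact))
```

The relative discrepancy shrinks roughly as (d/𝓇)², which is the order of the first neglected term in the binomial expansion.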

EXAMPLE 3.6
Consider again a spherical conducting shell of outer radius a which contains a net charge Q.
The electric potential distribution for this system can be found with the aid of (3.26) in
the form

$$\Phi(r,0,0) = \frac{1}{4\pi\epsilon_0}\int_S \frac{Q}{4\pi a^2}\,\frac{a^2\sin\theta\,d\theta\,d\phi}{\mathscr{r}}$$

in which the geometry of the figure of Example 3.3 applies. Since

$$\mathscr{r}^2 = a^2 + r^2 - 2ar\cos\theta \qquad \mathscr{r}\,d\mathscr{r} = ar\sin\theta\,d\theta$$

the integral can be rewritten

$$\Phi(r) = \frac{Q}{8\pi\epsilon_0 ar}\int_{|r-a|}^{r+a} d\mathscr{r}$$

which yields

$$\Phi(r) = \begin{cases} \dfrac{Q}{4\pi\epsilon_0 r} & r \ge a \\[2ex] \dfrac{Q}{4\pi\epsilon_0 a} & r \le a \end{cases}$$

Therefore, charge which is uniformly distributed over a spherical surface creates an electric
potential at all external points as though the charge were concentrated at the center; at all
internal points it causes a constant potential.
From these potential expressions the electric field can be found through use of the gradient
operator for spherical coordinates. The result is

$$\mathbf{E}(r) = -\nabla\Phi = \begin{cases} \mathbf{1}_r\,\dfrac{Q}{4\pi\epsilon_0 r^2} & r > a \\[2ex] 0 & r < a \end{cases}$$
which agrees with Example 3.3.
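
The closed forms of Example 3.6 can be checked by evaluating the surface integral numerically. This is a sketch under stated assumptions: the φ integration is done analytically (a factor 2π), the θ integration uses a midpoint rule with an assumed number of steps n, and the field point is kept off the shell (r ≠ a) so the integrand stays finite.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def shell_potential_numeric(Q, a, r, n=2000):
    """Midpoint-rule evaluation of (3.26) for a uniformly charged sphere of radius a."""
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        dist = math.sqrt(a * a + r * r - 2.0 * a * r * math.cos(theta))
        total += math.sin(theta) / dist * dtheta
    # (1/4 pi eps0) * (Q / 4 pi a^2) * a^2 * 2 pi = Q / (8 pi eps0)
    return Q / (8.0 * math.pi * EPS0) * total


def shell_potential_closed(Q, a, r):
    """Closed-form result: Q/(4 pi eps0 r) outside, Q/(4 pi eps0 a) inside."""
    return Q / (4.0 * math.pi * EPS0 * max(r, a))
```

For example, `shell_potential_numeric(1e-9, 0.1, 0.05)` and `shell_potential_numeric(1e-9, 0.1, 0.02)` should agree closely, reflecting the constant interior potential.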

EXAMPLE 3.7
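As an extension of Example 3.4, the potential due to two parallel line charges of opposite
sign can be deduced. Referring again to that figure, one sees that the potential is z independent
and given by

$$\Phi(x,y) = \frac{1}{4\pi\epsilon_0}\int_{-\infty}^{\infty}\frac{\chi\,d\zeta}{\mathscr{r}_1} - \frac{1}{4\pi\epsilon_0}\int_{-\infty}^{\infty}\frac{\chi\,d\zeta}{\mathscr{r}_2} = \frac{\chi}{4\pi\epsilon_0}\ln\frac{x^2 + \left(y + \frac{d}{2}\right)^2}{x^2 + \left(y - \frac{d}{2}\right)^2} \tag{3.30}$$

Equipotential surfaces for this distribution occur when

$$\frac{x^2 + \left(y + \frac{d}{2}\right)^2}{x^2 + \left(y - \frac{d}{2}\right)^2} = k^2$$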


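in which k is a constant, for then

$$\Phi(x,y) = \frac{\chi}{2\pi\epsilon_0}\ln k$$

Rearrangement of terms gives

$$x^2 + \left(y + \frac{d}{2}\,\frac{1+k^2}{1-k^2}\right)^2 = \frac{d^2k^2}{(1-k^2)^2} \tag{3.31}$$

which is the equation of a right circular cylinder parallel to the Z axis. The equipotential
surfaces are a family of nonconcentric nesting cylinders with centers at
(0, −d(1 + k²)/2(1 − k²)) and radii dk/|1 − k²|. A few contours of these equipotentials are
sketched in the figure.
Formation of the gradient of (3.30) gives

$$\mathbf{E} = -\nabla\Phi = \frac{\chi}{2\pi\epsilon_0}\left[\frac{\mathbf{1}_x x + \mathbf{1}_y\left(y - \frac{d}{2}\right)}{x^2 + \left(y - \frac{d}{2}\right)^2} - \frac{\mathbf{1}_x x + \mathbf{1}_y\left(y + \frac{d}{2}\right)}{x^2 + \left(y + \frac{d}{2}\right)^2}\right]$$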
which agrees with the result of Example 3.4.
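
The cylinder geometry of Example 3.7 can be verified point by point. The sketch below assumes the line charge +χ lies at y = +d/2 and −χ at y = −d/2, so that the squared-distance ratio in the logarithm of (3.30) equals k² on an equipotential; the helper names are hypothetical.

```python
import math


def distance_ratio_squared(x, y, d):
    """Ratio [x^2 + (y + d/2)^2] / [x^2 + (y - d/2)^2] from the logarithm in (3.30)."""
    return (x * x + (y + d / 2.0) ** 2) / (x * x + (y - d / 2.0) ** 2)


def equipotential_circle(d, k):
    """Center ordinate and radius of the cylinder cross section read off from (3.31)."""
    y_center = -0.5 * d * (1.0 + k * k) / (1.0 - k * k)
    radius = d * k / abs(1.0 - k * k)
    return y_center, radius


def points_on_circle(d, k, count=12):
    """Sample points on the cross section for a given constant k (k != 1)."""
    y_center, radius = equipotential_circle(d, k)
    step = 2.0 * math.pi / count
    return [(radius * math.cos(i * step), y_center + radius * math.sin(i * step))
            for i in range(count)]
```

With d = 2 and k = 3 the circle is centered at y = 1.25 with radius 0.75, and every sampled point returns a ratio of 9, i.e. k².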

3.5 GAUSS' LAW

Let a new static vector field be introduced by the relation

$$\mathbf{D}_0(x,y,z) = \epsilon_0\mathbf{E} = \frac{1}{4\pi}\int_V \rho(\xi,\eta,\zeta)\,\frac{\boldsymbol{\mathscr{r}}}{\mathscr{r}^3}\,d\xi\,d\eta\,d\zeta \tag{3.32}$$

in which, through use of the delta function, ρ can represent volume, surface, lineal, or
discrete charge distributions. V is a volume large enough to encompass all the charges
of the system. D₀ is called the electric flux density function and the zero subscript is a
reminder that the present discussion is limited to charges in a vacuum. The units of
D₀ are coul/m², as can be seen by inspection of (3.32).
Consider evaluation of the surface integral

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \oint_{S_G}\left[\frac{1}{4\pi}\int_V \rho\,\frac{\boldsymbol{\mathscr{r}}}{\mathscr{r}^3}\,dV\right]\cdot d\mathbf{S}$$

in which S_G is a closed surface bounding a volume V_G. V_G and V can bear any general
relation to each other: either may totally include the other, they may have a subvolume
in common, or they may be nonintersecting. Since the extents of these two
volumes are independent, the order of integration can be reversed, giving

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \frac{1}{4\pi}\int_V \rho(\xi,\eta,\zeta)\left[\oint_{S_G}\frac{\boldsymbol{\mathscr{r}}\cdot d\mathbf{S}}{\mathscr{r}^3}\right]d\xi\,d\eta\,d\zeta \tag{3.33}$$

In (3.33), ρ(ξ,η,ζ) dξ dη dζ is a source element of charge in the volume V, and 𝓇 is
drawn from the source element to the surface element dS in S_G. This situation is
depicted in Figure 3.9.
FIGURE 3.9 Geometry for establishment of Gauss' law.

Consider the evaluation of the surface integral on the right side of (3.33) for the
case of a point (ξ,η,ζ) interior to S_G. Let dS_A be a surface element with a central point
P_A as shown. If lines are drawn from every point on the perimeter of dS_A to (ξ,η,ζ),
the cone thus formed includes a solid angle dΩ_A. The projection of dS_A onto a sphere
through P_A with (ξ,η,ζ) as center therefore has an area equal to 𝓇² dΩ_A. Thus

$$d\mathbf{S}_A = \mathbf{1}_n\,\frac{\mathscr{r}^2\,d\Omega_A}{\cos\theta_A}$$

in which 1_n is the outward-drawn unit vector normal to dS_A and θ_A is the acute
angle of intersection of the sphere and dS_A. By consideration, in this fashion, of all the
elements of solid angle centered at (ξ,η,ζ), every surface element in S_G is included
and one can write

$$\oint_{S_G}\frac{\boldsymbol{\mathscr{r}}\cdot d\mathbf{S}}{\mathscr{r}^3} = \int_0^{4\pi}\frac{\boldsymbol{\mathscr{r}}\cdot\mathbf{1}_n}{\mathscr{r}\cos\theta}\,d\Omega$$

But 𝓇 · 1_n/𝓇 = −cos θ or +cos θ depending on whether or not 𝓇 and 1_n are oppositely
directed. For the solid angle dΩ_A there is one intersection with the surface S_G and the
contribution to the above integral is therefore +dΩ_A. For the solid angle dΩ_B there are
three intersections. The contributions at P_B and P_B″ are each +dΩ_B whereas the
contribution at P_B′ is −dΩ_B, for a net contribution of +dΩ_B. It is apparent that since
(ξ,η,ζ) is inside S_G, for each element of solid angle dΩ there must be an odd number of
intersections and hence a net contribution of +dΩ. Thus

$$\oint_{S_G}\frac{\boldsymbol{\mathscr{r}}\cdot d\mathbf{S}}{\mathscr{r}^3} = \int_0^{4\pi} d\Omega = 4\pi \tag{3.34}$$

When an exterior point (ξ′,η′,ζ′) is considered, each element of solid angle dΩ makes
an even number of intersections with S_G, half of which give a contribution +dΩ and
half of which give a contribution −dΩ, for a net contribution which is null. Thus the
result (3.34) applies for any point (ξ,η,ζ) interior to S_G but must be replaced by zero
for any point (ξ′,η′,ζ′) exterior to S_G. Therefore (3.33) becomes

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \int_{V_G}\rho(\xi,\eta,\zeta)\,dV \tag{3.35}$$

In words, the integral of the normal component of D₀ over a closed surface S_G is equal
to the net charge in the volume V_G enclosed by S_G. This is Gauss' law.
Several useful corollaries to Gauss' law follow readily. First, if the closed surface S_G
is constructed so that every point of S_G is occupied by conducting material, the total
charge within S_G is zero. This is a consequence of the fact that in electrostatic
equilibrium, E ≡ 0 within a conductor† and thus D₀ ≡ 0 also. Therefore ∮ D₀ · dS = 0 over S_G.
Second, in electrostatic equilibrium, there is no net charge at any interior point of a
conductor. This follows because such a point can be surrounded by a surface S_G which
lies wholly within the conductor. Thus the excess charge of a conductor resides on its
outer surface in electrostatic equilibrium.‡
Third, if any number of arbitrarily charged bodies be placed inside a hollow closed
conductor, the charge on its inside surface will be equal and opposite to the total charge
on the enclosed bodies. This can be shown by constructing S_G to consist only of interior
points of the hollow conductor, at all of which points D₀ = 0.

† To say that E ≡ 0 requires some elaboration. In a metallic body, for example, at the atomic level
one can imagine an array of positive ions which can vibrate about their lattice sites, plus a cloud of
electrons which are free to wander throughout the body. The vibrations of the ions and the wanderings
of the free electrons are both random thermal effects, and local fluctuating electric fields exist at any
atom site, varying with the motions of the nearby ions and electrons. However, the time-average value
of this electric field is zero unless there is a drift of the electron cloud (current) through the metallic
body. In electrostatic equilibrium no such macroscopic drifts occur. The statement that E(x,y,z) ≡ 0
implies this, and E(x,y,z) can be interpreted properly as the time-independent component of the electric
field within the conductor. Since the conductor is being viewed as an assemblage of charges in a
vacuum, the defining relation D₀ = ε₀E is applicable and therefore, within the conductor, D₀ ≡ 0 also.
‡ The net result is that a charged conductor can be viewed internally as an electrically neutral body,
but one possessing a charge distribution over its exterior surface. The interior of the body is equivalent
to a vacuum once equilibrium is established, its role having been properly to distribute the surface
charge. This proper distribution is such that if the conducting body were removed, leaving the surface
charges intact in a vacuum, E would still be zero at all points formerly occupied by conductor.

EXAMPLE 3.8
Consider a static electric system consisting of a single charge of strength q located at the
origin. For this case (3.32) gives

$$\mathbf{D}_0 = \frac{q}{4\pi}\,\frac{\mathbf{r}}{r^3}$$

If a gaussian surface S_G is constructed, consisting of a sphere of radius r centered on the
charge, then over this sphere D₀ is everywhere normal with a magnitude q/4πr². Thus

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \frac{q}{4\pi r^2}\,(4\pi r^2) = q$$

since 4πr² is the surface area of the sphere. This result is seen to agree with Gauss' law.
If a charge Q resides on a spherical conducting shell of outer radius a, as in Example 3.3,
by the second corollary to Gauss' law, this charge is found on the outer surface. By symmetry
it has a uniform surface density σ = Q/4πa². If a gaussian surface S_G is drawn, which
consists of a concentric sphere of radius b, then if b > a,

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = Q$$

from which, by symmetry

$$D_0 = \frac{Q}{4\pi r^2}$$

However, if b < a, then

$$\oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = 0$$

and thus, again invoking symmetry

$$\mathbf{D}_0 \equiv 0$$

Upon division of these results by ε₀ to obtain E, agreement is found with Example 3.3.
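
Gauss' law does not depend on the gaussian surface being a sphere. As a hedged numerical illustration (not part of the original development), the sketch below integrates the normal component of D₀ for a point charge over the six faces of a cube by a midpoint rule. The cube, the grid size n, and the function name are assumptions made for the example.

```python
import math


def flux_through_cube(q, source, half=1.0, n=60):
    """Midpoint-rule integral of D0 . dS over the cube |x|, |y|, |z| <= half.

    q is the point charge, source its (x, y, z) position, n the grid points per edge.
    """
    h = 2.0 * half / n
    total = 0.0
    for axis in range(3):              # face normal along x, y, or z
        for sign in (-1.0, 1.0):       # the two opposite faces
            for i in range(n):
                for j in range(n):
                    point = [0.0, 0.0, 0.0]
                    point[axis] = sign * half
                    point[(axis + 1) % 3] = -half + (i + 0.5) * h
                    point[(axis + 2) % 3] = -half + (j + 0.5) * h
                    sep = [point[m] - source[m] for m in range(3)]
                    dist = math.sqrt(sum(c * c for c in sep))
                    # outward normal component of D0 = q r / (4 pi r^3)
                    d_normal = q * sign * sep[axis] / (4.0 * math.pi * dist ** 3)
                    total += d_normal * h * h
    return total
```

A charge placed anywhere inside the cube should give a flux close to q, and a charge outside should give a flux close to zero, mirroring the interior and exterior cases of (3.34).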

EXAMPLE 3.9
Another academic problem, from which several practical results can be derived in due
course, involves two concentric conducting right circular cylinders of infinite length. A short
length of this geometry is shown in the figure. If it is assumed that a lineal charge density of
+χ coul/m exists on the inner conductor, the second corollary to Gauss' law indicates that
this charge resides on the surface r = a; by symmetry it is uniformly distributed over this
surface. The third corollary leads to the conclusion that the surface r = b contains a charge
of −χ coul/m; by symmetry this is also uniformly distributed.
The field that exists in the vacuous region between these coaxial cylinders can be deduced
with the aid of Gauss' law. Let a gaussian surface S_G be erected, composed of a concentric
cylinder of radius r, with a < r < b, and two end caps at the positions z = ±L. By symmetry,
D₀ is entirely radial so the integrals over the end caps vanish, and

$$2L\chi = \oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \int_{-L}^{L} D_0(r)\,2\pi r\,dz = 4\pi rL\,D_0(r)$$

$$D_0(r) = \frac{\chi}{2\pi r}$$

From this it follows that

$$\mathbf{E}(r) = \mathbf{1}_r\,\frac{\chi}{2\pi\epsilon_0 r}$$

$$\Phi(r) = \Phi(a) - \frac{\chi}{2\pi\epsilon_0}\ln\frac{r}{a}$$

If the outer cylinder is grounded and the inner cylinder maintained at a potential V_b, this
last result becomes

$$\Phi(r) = V_b\,\frac{\ln(r/b)}{\ln(a/b)}$$

As one would expect, the equipotential surfaces are concentric cylinders.
These results can be put to practical use in the treatment of tubular condensers and
coaxial transmission lines.
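
The relations of Example 3.9 can be tied together in a few lines. In this sketch the lineal charge density is obtained from V_b = Φ(a) − Φ(b) = (χ/2πε₀) ln(b/a), which follows from the potential expression above; the function names and numerical values are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def coax_potential(r, a, b, v_b):
    """Potential between the cylinders: V_b ln(r/b) / ln(a/b), outer conductor grounded."""
    return v_b * math.log(r / b) / math.log(a / b)


def coax_line_charge(a, b, v_b):
    """Lineal charge density chi implied by V_b = (chi / 2 pi eps0) ln(b/a)."""
    return 2.0 * math.pi * EPS0 * v_b / math.log(b / a)


def coax_field(r, a, b, v_b):
    """Radial field chi / (2 pi eps0 r) for a < r < b."""
    return coax_line_charge(a, b, v_b) / (2.0 * math.pi * EPS0 * r)
```

Differencing `coax_potential` numerically in r should reproduce `coax_field` with the sign reversed, consistent with E = −∇Φ.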

3.6 ELECTRIC FLUX

A graphic method for displaying any vector field is described in Example V.21 of the
Mathematical Supplement. This technique can be applied to the field D₀ with the
advantage that it frequently provides conceptual help in the understanding of problems.
The spatial vector function D₀(x,y,z) has a value at the point P(x,y,z) given by
Equation (3.32). Imagine that lines are constructed at P parallel to D₀ and marked
with arrows pointing in the direction of D₀. If the number of lines per square meter
which pass through a small area erected at P transverse to D₀ is chosen to be numerically
equal to the magnitude of D₀ at P, and if this is done for all points P, a field map of
D₀ results.
The lines which represent D₀ are known as electric flux lines, and since their density
gives the value of D₀, one can see why D₀ is known as the electric flux density function.
For any closed surface S_G

$$\psi = \oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} \tag{3.36}$$

is the net number of electric flux lines emerging from S_G. Hence an alternative statement
of Gauss' law is that the net number of electric flux lines emerging from a closed surface
S_G is numerically equal to the total charge enclosed by S_G. Since S_G may be chosen
small enough so as to enclose only one discrete charge, it follows that the number of
electric flux lines originating on a positive charge q₁ is numerically equal to q₁, and that
the number of electric flux lines terminating on a negative charge q₂ is numerically
equal to q₂. If S_G encloses no charge at all, ψ = 0, which means that as many flux lines
enter S_G as leave it. Thus electric flux lines are continuous except at points where there
is charge. All these deductions may be summed up by the two statements:
1. All lines of electric flux originate on positive charge and terminate on negative
charge.†
2. The net efflux ψ at a point P is numerically and algebraically equal to the electric
charge at P.
† Some charge may have to be considered to be placed at infinity in order to "complete" the charge
system.

EXAMPLE 3.10
Consider again the case of a single charge q placed at the origin. If q is positive, then ψ = q
lines of electric flux emerge from the origin; by symmetry, they are uniformly distributed
in three dimensions, as suggested by the figure. At any radius r, the density of these lines is

$$D_0 = \frac{\psi}{4\pi r^2} \qquad\text{so that}\qquad \mathbf{D}_0 = \frac{q}{4\pi}\,\frac{\mathbf{r}}{r^3}$$

which agrees with Example 3.8.
In the absence of all other charge, these flux lines would extend radially to infinity, there
to be terminated by a total charge −q, uniformly distributed over an infinite sphere. If the
charge q at the origin is negative, the directions of the arrows on the flux lines are reversed.

EXAMPLE 3.11
Next, reconsider the doublet of Example 3.5, consisting of charges q and −q a distance d
apart. All q of the lines of electric flux leaving the positive charge terminate on the negative
charge. Since for this system

$$\mathbf{D}_0 = \frac{q}{4\pi}\left(\frac{\boldsymbol{\mathscr{r}}_1}{\mathscr{r}_1^3} - \frac{\boldsymbol{\mathscr{r}}_2}{\mathscr{r}_2^3}\right)$$

it follows that for points very close to the charge +q,

$$\mathbf{D}_0 \approx \frac{q}{4\pi}\,\frac{\boldsymbol{\mathscr{r}}_1}{\mathscr{r}_1^3}$$

so that the flux lines start out radially and uniformly from +q and then bend around toward
the charge −q, where they enter radially and uniformly. This is enough information to
allow a rapid and informative sketching of the field, with the result indicated by the heavy
solid lines in the figure.
Through use of the field expression developed in Example 3.5, if 𝓇 ≫ d,

$$\mathbf{D}_0 = \frac{qd}{4\pi\mathscr{r}^3}\left(\mathbf{1}_{\mathscr{r}}\,2\cos\theta + \mathbf{1}_{\theta}\sin\theta\right)$$

which is seen to be consistent with the flux plot.

[Figure: flux lines (heavy solid) and equipotentials of the doublet.]

Equipotential surfaces can also be added to a flux map and they are everywhere
perpendicular to the electric flux lines. In the preceding simple example of a single
charge, the equipotential surfaces are concentric spheres. In the case of the doublet,
they are figures of rotation about the line connecting the charges; the profiles of several
equipotentials are shown in the figure.

3.7 A CONDUCTOR-VACUUM INTERFACE


The relation between flux lines and the charge which resides on conductors is of special
interest. Consider the case of the electrified conductor of exterior surface S shown in
Figure 3.10. It already has been established, through the second corollary to Gauss'
law, that all the excess charge resides on the exterior surface S. Further, it has been seen
that in electrostatic equilibrium, E ≡ 0 throughout the conductor, for if this were not
so, charges would be flowing, in denial of the assumption of equilibrium. Thus between
any two points in the conductor ∫E · dℓ ≡ 0, because these points can be connected
by a path which lies wholly in the conductor. It follows that the conductor is an
equipotential.
If the conducting body is viewed as an assemblage of charges (some mobile) in a
vacuum, it follows that D₀ = ε₀E ≡ 0 within the conductor, and therefore all the
electric flux lines associated with the excess charge are external to the exterior surface S.
Further, these flux lines must be normal to S. If they were not, this would imply a
tangential component of D₀, and thus of E, in S; this would give rise to surface charge
flow, violating the premise of static equilibrium. This conclusion that the flux lines
must be normal to S is also consistent with the fact that S is an equipotential surface.

FIGURE 3.10 An electrified conductor, panels (a), (b), and (c).

If the total excess charge in a surface element dS of S is σ dS, Gauss' law requires
that dψ = σ dS be the total number of flux lines just outside of dS associated with the
charge in dS. But dψ = D₀ dS, in which D₀ is the electric flux density immediately
outside of dS. Therefore, at any point on the exterior surface of an electrified conductor

$$\sigma = D_0 \tag{3.37}$$

In general σ and D₀ are functions of position on the conductor surface. A possible
distribution is indicated by the flux lines in Figure 3.10a.
All of the foregoing conclusions apply whether the conductor whose exterior surface
is S is solid or hollow. Let it be assumed that it is a hollow body, with an interior surface
S′, as shown in Figure 3.10b. A gaussian surface S_G can be constructed which lies
as close to S′ as one pleases, with all points of S_G lying in conductor. Gauss' law then
yields the result that S′, considered as a whole, must be electrically neutral. This does
not prove that S′ must everywhere be locally neutral. One could imagine a charge
distribution over S′, somewhat as suggested by Figure 3.10c, with the flux lines running
through the hollow interior from the positive cluster of charge to the negative cluster
of charge.
But such a charge distribution on S′ is not in static equilibrium. To appreciate this,
one need only form the integral ∫E · dℓ along two paths from P₁ to P₂, one path being
entirely in the conductor, the other through the hollow interior along a flux line. Since
E ≡ 0 along the first path, it must also be identically zero along the second because
potential difference is independent of path. Therefore, the interior surface S′ must be
everywhere locally neutral and the hollow interior region is field-free. This is a feature
which can be used to advantage when it is desired to shield equipment from external
electric fields.
EXAMPLE 3.12
Previous examples have established that the electric flux density external to a conducting
spherical shell of outer radius a, possessing an excess charge Q, is D₀ = (Q/4π)(r/r³).
Right at the surface

$$\sigma = D_0 = \frac{Q}{4\pi a^2}$$

which is consistent with the fact that the surface area is 4πa².

3.8 THE METHOD OF IMAGES


Certain problems in electrostatics may be simplified by the application of an image
technique which can be established through the use of Gauss' law. To this end, consider
an arbitrary complete system of discrete fixed charges.† Figure 3.11a is intended
to suggest this general situation, with flux lines drawn from positive charges to negative
charges, and equipotentials shown lighter and transverse to the D₀ field.
Let Φ₀ be a closed equipotential surface with 1_n an outward-drawn unit vector
normal to Φ₀ at the point P. The surface Φ₀ divides the system of charges into an
internal part and an external part such that

$$\sum_{n=1}^{N} q_n = Q_{\text{int}} + Q_{\text{ext}} = 0$$

In words, the algebraic sum of the external charges equals minus the algebraic sum
of the internal charges.
If a surface charge density σ = −D₀ · 1_n coul/m² is placed at each point P on the
surface Φ₀, at the same time that all charges exterior to Φ₀ are removed, the result
will be as shown in Figure 3.11b. The D₀ field inside Φ₀ will be unaltered, whereas the field
outside will be completely erased. The surface Φ₀ will still be an equipotential.
Suppose next that an extremely thin, electrically neutral conductor, shaped in the
form of the surface Φ₀, is slipped into the position of Φ₀ in such a way as not to disturb
any of the interior charge. Suppose further that the charges which make up the distribution
σ become attached to the conducting surface. Since these charges were in
transverse equilibrium before the insertion of the conducting surface (E_tan ≡ 0 over an
equipotential surface), they will not move even though they are now on a conductor and
free to do so. The important conclusion is reached that the field inside Φ₀ is the same
in the presence of the conductor, charged with the distribution σ, as it was originally
when the external charges were present.
For any gaussian surface S_G erected outside the conducting shell, ∮ D₀ · dS = 0 over S_G by
virtue of the fact that now D₀ ≡ 0 outside Φ₀. This means that the total charge inside
S_G is zero, or that

$$Q_{\text{int}} + \int_{\Phi_0}\sigma\,dS = 0$$

so that

$$\int_{\Phi_0}\sigma\,dS = Q_{\text{ext}}$$

† By a complete system one means N charges of values q_n, placed at arbitrary positions, but such that
Σq_n = 0. If charges are imagined to exist at infinity, every system can be considered a complete
system.

FIGURE 3.11 The method of images.

A vivid way to picture what has been done is to imagine that all flux lines which
have extremities external to Φ₀ have unit charges of appropriate sign attached to those
extremities, these charges making up Q_ext. These flux lines are permitted to contract,
pulling the unit charges with them until all of Q_ext has collapsed onto a conductor placed
at Φ₀, thus forming the distribution σ. This erases the external field but leaves the
internal field intact.
Alternatively, if a surface charge density σ′ = D₀ · 1_n = −σ is placed at all points
on the surface Φ₀ at the same time that all interior charges are removed, the result
will be as shown in Figure 3.11c. The field outside Φ₀ will be unaltered whereas the
interior field will be completely erased. If the shaped conducting shell is put in place
and the charges comprising σ′ are allowed to become attached to it, no change in their
distribution will occur. Thus the field outside Φ₀ is the same in the presence of the
conductor containing the surface charge distribution σ′ as it was when the original
discrete internal charges were present. One can conclude that

$$\int_{\Phi_0}\sigma'\,dS = Q_{\text{int}}$$

and say that if Q_int is allowed to collapse along its flux lines onto a conductor placed
at Φ₀, thus forming the distribution σ′, then the internal field will be erased whereas
the external field will remain intact.
Once the situation of either Figure 3.11b or Figure 3.11c is achieved, it is no longer
necessary that the conductor be an extremely thin shell. It can assume any thickness
which encroaches only on the field-free region and can even be a solid conductor which
completely fills this region.
The above procedure turned around is the method of images. If one wishes to find the
field due to fixed charges and/or charged conducting bodies, a simple solution is available
if the conductors can be replaced by properly positioned equivalent charges. For
simple conductor shapes the proper equivalence is often easily recognizable.
EXAMPLE 3.13
Previous examples have been concerned with the field and potential due to an electric dipole
(doublet) and a flux map for this system can be found with Example 3.11. The equipotential
contours which have been added to that flux map show that the plane which forms the
perpendicular bisector of the line connecting the two charges is an equipotential surface of
value Φ = 0. Therefore, the system consisting of a discrete charge q a distance d/2 above
a grounded conducting plane is equivalent (in a half-space) to a doublet. The image charge
−q, a distance d/2 behind the plane, can replace the plane and all the surface charge it
contains, for the purpose of computing the field anywhere above the plane. With reference to
the figure, the D₀ field at any point P(r,φ,z) above the plane is therefore

$$\mathbf{D}_0 = \frac{q}{4\pi}\left\{\frac{\mathbf{1}_r r + \mathbf{1}_z\left(z - \frac{d}{2}\right)}{\left[r^2 + \left(z - \frac{d}{2}\right)^2\right]^{3/2}} - \frac{\mathbf{1}_r r + \mathbf{1}_z\left(z + \frac{d}{2}\right)}{\left[r^2 + \left(z + \frac{d}{2}\right)^2\right]^{3/2}}\right\}$$

in which cylindrical coordinates are being employed, with the origin selected as that point
in the conductor closest to q.
At the plane z = 0

$$\sigma(r) = D_0 = -\frac{qd}{4\pi}\left[r^2 + \left(\frac{d}{2}\right)^2\right]^{-3/2}$$

a distribution which is shown as the dotted area in the figure. The image technique is thus
seen to be additionally useful in yielding the charge distribution in the conductor. As a
check,

$$\int_0^{\infty}\sigma(r)\,2\pi r\,dr = -\frac{qd}{2}\int_0^{\infty}\frac{r\,dr}{\left[r^2 + \left(\frac{d}{2}\right)^2\right]^{3/2}} = -q$$
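
The check on the induced charge in Example 3.13 can also be carried out numerically. The sketch below integrates σ(r) 2πr dr out to a finite radius by a midpoint rule; the truncation radius and step count are assumptions, and the truncated integral has the closed form −q[1 − (d/2)/√(r_max² + (d/2)²)], which tends to −q as r_max grows.

```python
import math


def surface_density(q, d, r):
    """Induced density on the grounded plane: -(q d / 4 pi) [r^2 + (d/2)^2]^(-3/2)."""
    return -q * d / (4.0 * math.pi) * (r * r + (d / 2.0) ** 2) ** -1.5


def induced_charge_within(q, d, r_max, n=100000):
    """Midpoint-rule integral of sigma(r) * 2 pi r dr from 0 to r_max."""
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += surface_density(q, d, r) * 2.0 * math.pi * r * dr
    return total


def induced_charge_exact(q, d, r_max):
    """Closed form of the same truncated integral."""
    return -q * (1.0 - (d / 2.0) / math.hypot(r_max, d / 2.0))
```

With r_max equal to 500 d, all but about one part in a thousand of the image charge −q is already accounted for on the plane.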

EXAMPLE 3.14
The case of two parallel line charges of opposite polarity has been treated in Example 3.7 in
which the equipotential surfaces were found to be nesting right circular cylinders. Combining
the image technique with this result facilitates the solution of a problem of considerable
practical importance.
Consider now the case of two straight parallel circular conductors, each of diameter 2a,
with a center-to-center spacing D, as shown in the figure. Let the upper and lower conductors
contain lineal charge densities of +χ and −χ coul/m, respectively. Since it is already known
that the equipotential surfaces for parallel line charges are cylindrical, it follows that the
system shown in the figure is equivalent to two line charges at an appropriate spacing d.
This spacing d must be selected so as to give equipotentials for the two surfaces occupied
by the outer skins of the two conductors. By use of Equation (3.31) this means that

$$\frac{D}{2} = \frac{d}{2}\,\frac{1+k^2}{1-k^2} \qquad a = d\,\frac{k}{1-k^2}$$

which, solving for d, gives

$$d = (D^2 - 4a^2)^{1/2}$$

Upon inserting this value for d in (3.30) one obtains the expression for the potential anywhere
in the space surrounding the two parallel conductors, namely

$$\Phi(x,y) = \frac{\chi}{4\pi\epsilon_0}\ln\frac{x^2 + \left[y + \frac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2}{x^2 + \left[y - \frac{1}{2}(D^2 - 4a^2)^{1/2}\right]^2}$$

In particular, if (x,y) be chosen to correspond to any point on the surface of the upper
conductor, such as (0, D/2 − a), the potential of the upper cylinder is found to be

$$\Phi_+ = \frac{\chi}{2\pi\epsilon_0}\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\}$$

Since the median plane is at zero potential, it follows that the lower conductor is at a
potential Φ₋ = −Φ₊. This can also be deduced by inserting an appropriate point, such as
(0, −D/2 + a), into (3.30). The potential difference between the two conductors is therefore

$$V = 2\Phi_+ = \frac{\chi}{\pi\epsilon_0}\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\} = \frac{\chi}{\pi\epsilon_0}\cosh^{-1}\frac{D}{2a}$$

If D ≫ a, as is often the case in practice, then to first order

$$V = \frac{\chi}{\pi\epsilon_0}\ln\frac{D}{a}$$

These results are useful in a discussion of two-wire transmission lines.
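
The two-wire results can be cross-checked numerically. In this sketch (function names and sample dimensions are assumptions), the image spacing d = (D² − 4a²)^½ is inserted into the line-charge potential, sampled around the skin of the upper conductor, and compared with half of the cosh⁻¹ expression for V.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def image_spacing(D, a):
    """Spacing d of the equivalent line charges: (D^2 - 4 a^2)^(1/2)."""
    return math.sqrt(D * D - 4.0 * a * a)


def line_pair_potential(chi, d, x, y):
    """Potential of +chi at y = +d/2 and -chi at y = -d/2, as in (3.30)."""
    ratio = (x * x + (y + d / 2.0) ** 2) / (x * x + (y - d / 2.0) ** 2)
    return chi / (4.0 * math.pi * EPS0) * math.log(ratio)


def two_wire_voltage(chi, D, a):
    """Potential difference V = (chi / pi eps0) arccosh(D / 2a)."""
    return chi / (math.pi * EPS0) * math.acosh(D / (2.0 * a))


def two_wire_voltage_far(chi, D, a):
    """First-order form V = (chi / pi eps0) ln(D / a), intended for D >> a."""
    return chi / (math.pi * EPS0) * math.log(D / a)


def upper_conductor_samples(chi, D, a, count=16):
    """Potential at points around the skin of the upper conductor, centered at (0, D/2)."""
    d = image_spacing(D, a)
    step = 2.0 * math.pi / count
    return [line_pair_potential(chi, d, a * math.cos(i * step), D / 2.0 + a * math.sin(i * step))
            for i in range(count)]
```

Every sample should equal V/2, confirming that the conductor skin is an equipotential of the equivalent line-charge pair.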



The problem of a uniform line charge parallel to a conducting cylinder is quite evidently
within the scope of this analysis and is left as an exercise.

EXAMPLE 3.15
The method of images can be applied to the problem of calculating the field due to a discrete
charge in the presence of a spherically shaped conductor, upon recognition of the fact that
any two discrete particles, statically separated and bearing charges of opposite sign but
arbitrary magnitude, give rise to one equipotential surface which is a sphere. To see this,
refer to the figure, in which P₁ and P₂ are the positions of the two charges q₁ and q₂, and O
is a point on the line which connects them. The potential at the point P is

$$\Phi(P) = \frac{1}{4\pi\epsilon_0}\left(\frac{q_1}{\mathscr{r}_1} + \frac{q_2}{\mathscr{r}_2}\right)$$

A zero potential surface is defined by the condition

$$\frac{\mathscr{r}_2}{\mathscr{r}_1} = -\frac{q_2}{q_1}$$

and this surface will be a sphere if the constant ratio 𝓇₂/𝓇₁ permits the distance a from O to
P to be a constant also. This will be true if the triangles OPP₁ and OP₂P are similar, for then

$$\frac{\mathscr{r}_2}{\mathscr{r}_1} = \frac{r_2}{a} = \frac{a}{r_1}$$

in which r₁ and r₂ are the fixed distances from O to the two charges. Thus an equipotential
surface exists which is spherical, with center at O and with a radius a which is the geometric
mean of the distances from O to each charge.
As can be seen by turning this problem around, if a grounded spherical conductor of
radius a is in the presence of an external discrete charge q₁, placed a distance r₁ from the
center of the sphere, the field for this system can be computed by replacing the sphere with
a discrete charge

$$q_2 = -\frac{a}{r_1}\,q_1$$

this equivalent charge being positioned a distance

$$r_2 = \frac{a^2}{r_1}$$

from O, in the direction toward q₁.
This solution can be generalized to permit any value Φ for the potential of the spherical
conductor by placing an additional equivalent charge

$$q_3 = 4\pi\epsilon_0 a\,\Phi$$

at the point O. Letting q₃ = 0 gives the case just discussed, that of a grounded sphere. Letting
q₃ = −q₂ gives the case of an electrically neutral sphere.
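
The sphere construction of Example 3.15 can be tested directly. The sketch below assumes an image charge q₂ = −(a/r₁)q₁ at distance r₂ = a²/r₁ from the center, as implied by the similar-triangle relations, plus an optional center charge q₃ = 4πε₀aΦ; θ is the polar angle of the surface point measured from the line joining O to q₁, and the function names are illustrative.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def sphere_image(q1, r1, a):
    """Image charge and its distance from the center O for a grounded sphere of radius a."""
    return -q1 * a / r1, a * a / r1


def potential_on_sphere(q1, r1, a, theta, phi_sphere=0.0):
    """Potential at a surface point from q1, its image q2, and the center charge q3."""
    q2, r2 = sphere_image(q1, r1, a)
    q3 = 4.0 * math.pi * EPS0 * a * phi_sphere
    dist1 = math.sqrt(a * a + r1 * r1 - 2.0 * a * r1 * math.cos(theta))
    dist2 = math.sqrt(a * a + r2 * r2 - 2.0 * a * r2 * math.cos(theta))
    return (q1 / dist1 + q2 / dist2 + q3 / a) / (4.0 * math.pi * EPS0)
```

Sampling θ between 0 and π should return the prescribed sphere potential at every point, zero for the grounded case.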

3.9 POISSON'S EQUATION

If a satisfactory model of the electrostatic system in question results from assuming
that the volume charge density function ρ is well-behaved (i.e., has continuous first
derivatives), then it follows that D₀ is well-behaved also. In this event, the divergence
theorem can be applied to (3.35) giving

$$\int_{V_G}\nabla\cdot\mathbf{D}_0\,dV = \oint_{S_G}\mathbf{D}_0\cdot d\mathbf{S} = \int_{V_G}\rho\,dV \tag{3.38}$$

in which V_G is the volume bounded by the closed surface S_G. Since V_G is completely
arbitrary, the integrands of the two volume integrals in (3.38) must equate point by
point; thus

$$\nabla\cdot\mathbf{D}_0 = \rho \tag{3.39}$$

But one can also write

$$\mathbf{D}_0 = \epsilon_0\mathbf{E} = \epsilon_0(-\nabla\Phi)$$

so that (3.39) can be rewritten

$$\nabla^2\Phi = -\frac{\rho}{\epsilon_0} \tag{3.40}$$
This important differential equation is due to Poisson. It relates the spatial derivatives
of electrostatic potential at a point to the volume charge density at the point and can
be viewed as the differential form of Gauss' law. The voltage distributions inside
vacuum tubes are solutions to (3.40) and it is the basis for analysis of electron beam
compositions, for design problems in electron beam shaping, and for the determination
of electron densities in plasmas.
The formal solution to this differential equation has already been found and is given
by (3.18). However, Equation (3.18) is principally useful in problems for which the
charge distribution is known beforehand and one desires to find the potential function.
There are also problems in which one begins by knowing neither the potential function
nor the charge distribution and wishes to determine both; in such cases it is often
advantageous to begin with (3.40) and seek a particular solution.
EXAMPLE 3.16
Consider two parallel plates composed of conducting material. As shown in the figure, one
plate has its interior surface situated at x = 0 and it is to be supposed that this plate is
heated so that it emits electrons into the interspace. The other plate has its interior surface
at x = l. A constant potential difference is applied between the plates by a battery so that
the unheated plate is Vb volts above the heated plate, thus attracting the emitted electrons.
A steady time-independent current results and this device can be recognized as a rudimen-
tary model of a diode. It is desired to find the distribution of electrons and potential in the
interspace and also the connection between current and plate voltage.
One may wonder why this problem, which involves moving charges, is treated in a
chapter on electrostatics. However, since the current is time-independent, at any point
between the plates there always must be as much charge arriving per unit time as leaving, so
the amount of charge at the point remains constant. Thus, even though the identity of the
charges in a volume element $dV$ keeps changing, the amount of charge $\rho\,dV$ does not.
Therefore, $\rho$ may be a function of space but not of time and Poisson's Equation (3.40) may
be used to deduce $\Phi$, which also will be a function of space but not of time.

[Figure: parallel-plate diode with the heated plate at $x = 0$ ($\Phi = 0$) and plate separation $l$.]

For simplicity the plate dimensions will be taken large compared to $l$ so that variations
of voltage and charge density in the transverse directions may be ignored. Then (3.40)
becomes

$$\frac{d^2\Phi}{dx^2} = -\frac{1}{\epsilon_0}\,\rho(x)$$

It will be assumed that the electrons, under the repulsive action of the electron cloud already
in the interspace, are barely able to get out of the cathode, emerging with negligible
initial energy. Then if $v(x)$ is the electron velocity at a distance $x$ from the heated plate
(cathode),

$$\tfrac{1}{2}mv^2 = e\,\Phi(x)$$

relates the kinetic energy of the electron to the work that has been done on it by the field,
as it moves from the cathode through a distance x. In the above expression - e and m are the
electronic charge and mass and the cathode has been assumed grounded.
Further, since $\rho$ is the volume density of charge, $\rho v\,dA\,dt$ is the charge in a tube of cross
section $dA$ and of directed length $v\,dt$. All this charge will pass through the tube in time $dt$.
Defining current as charge/sec passing a cross section, one can write

$$L\,dA = \frac{\rho v\,dA\,dt}{dt}$$

in which $L$ is the current density, expressed in amp/m². Thus

$$L = \rho v$$

is the relation between current and the flow of charge. In this problem, $L$ is a constant, being
independent of both time and space; $\rho$ and $v$ are both functions of $x$ but their product is not.
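With the aid of these two auxiliary relations, Poisson's equation can be rewritten

$$\frac{d^2\Phi}{dx^2} = -\frac{L}{\epsilon_0}\left(\frac{m}{2e}\right)^{1/2}\Phi^{-1/2}$$

which is readily integrated to give

$$\Phi(x) = V_b\left(\frac{x}{l}\right)^{4/3}$$

a solution which satisfies $\Phi(0) = 0$ and $\Phi(l) = V_b$ as well as the condition $d\Phi/dx = 0$ at
$x = 0$, imposed by the assumption that the electrons barely get out of the cathode.
From this it follows that

$$v(x) = \left(\frac{2e}{m}V_b\right)^{1/2}\left(\frac{x}{l}\right)^{2/3}$$

$$\rho(x) = -\tfrac{4}{9}\,\epsilon_0 V_b\,(x\,l^2)^{-2/3}$$

so that

$$L = -K V_b^{3/2} \tag{3.41}$$

in which

$$K = \frac{4}{9}\left(\frac{2e}{m}\right)^{1/2}\frac{\epsilon_0}{l^2} = \frac{2.33\times 10^{-6}}{l^2}$$

with $l$ expressed in meters.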
Equation (3.41) is known as the Child-Langmuir law and is obtained for any geometry
of cathode and plate; the only factor affected by a change in geometry is the parameter
$K$, and its variability is not great. This nonlinear relation between current and voltage in a
diode is a most vital characteristic, being responsible, as an example, for a useful technique
in signal detection. Equation (3.41) has been verified extensively by experiments employing
a variety of geometries.
The presence of a minus sign in (3.41) may seem surprising but it can be traced to the
equation $L = \rho v$. The electrons have velocities in the positive X direction but their charge
density $\rho$ is negative so that $L$ is negative; that is, it constitutes a current in the negative X
direction.
Plots of $\Phi$, $\rho$, and $v$ versus $x$ can be found in the graph.

[Figure: plots of $\Phi$, $\rho$, and $v$ versus $x$.]
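As a rough numerical check of this example, the sketch below evaluates the parameter $K$ from modern values of the physical constants and confirms by finite differences that the four-thirds-power potential satisfies the space-charge form of Poisson's equation. The function names, the sample spacing and voltage, and the step size are illustrative assumptions, not part of the original text.

```python
import math

EPS0 = 8.8541878128e-12      # permittivity of free space, F/m
E_CHARGE = 1.602176634e-19   # magnitude of the electronic charge, C
E_MASS = 9.1093837015e-31    # electronic mass, kg


def child_langmuir_k(l):
    """Parameter K of (3.41) for a plate spacing l in meters."""
    return (4.0 / 9.0) * EPS0 * math.sqrt(2.0 * E_CHARGE / E_MASS) / l**2


def phi(x, l, vb):
    """Space-charge-limited potential, Phi = Vb (x/l)^(4/3)."""
    return vb * (x / l) ** (4.0 / 3.0)


def poisson_residual(x, l, vb, h=1e-5):
    """Relative mismatch between the two sides of the rewritten Poisson equation."""
    current = -child_langmuir_k(l) * vb**1.5  # current density L from (3.41)
    lhs = (phi(x + h, l, vb) - 2.0 * phi(x, l, vb) + phi(x - h, l, vb)) / h**2
    rhs = -(current / EPS0) * math.sqrt(E_MASS / (2.0 * E_CHARGE * phi(x, l, vb)))
    return abs(lhs - rhs) / abs(rhs)
```

For example, `child_langmuir_k(1.0)` reproduces the numerical factor quoted for $K$ to three significant figures, and `poisson_residual(0.005, 0.01, 100.0)` should be small compared to unity for a 1 cm gap at 100 volts.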

3.10 LAPLACE'S EQUATION

For those electrostatic problems in which the charge distribution is known completely,
Equation (3.18) can be used to determine the potential function and the relation
$\mathbf{E} = -\nabla\Phi$ can then be used to deduce the field intensity. It has been noted earlier,
however, that problems in which the charge distribution is not known beforehand arise
frequently. When this is the case the potential function in regions containing charge
can be obtained by solving Poisson's equation. Unfortunately, solutions to this equa-
tion are very difficult to obtain except for a limited class of relatively simple situations.
However, an extensive number of practical problems exists in which the charges are
confined to the surfaces of conductors, or otherwise constrained to occupy a limited
region. Under such conditions it is often advantageous first to determine the potential
distribution in the adjacent charge-free regions. For such regions, Poisson's equation
reduces to
$$\nabla^2\Phi = 0 \tag{3.42}$$
which is known as the Laplace equation after its discoverer.
Solutions to Laplace's equation must be chosen to match the conditions at the
boundaries of the charge-free regions, which is the link whereby the charges of the
system affect the potential distributions. For example, if all the boundaries of the
charge-free regions are conducting surfaces, then the constant potential over each
of these surfaces might be specified. This is called a Dirichlet problem. Solving Laplace's
equation for the potential subject to these boundary conditions, and forming the
gradient, permits one to determine $\mathbf{D}_0$ at all boundary points; by this means the charge
distributions over the conducting surfaces are deduced.
Alternatively, the charge distributions over all the boundary surfaces may be speci-
fied, which is equivalent to stating the normal derivative of potential as a boundary
condition. This is called a Neumann problem. Solving Laplace's equation for the
potential, subject to these boundary conditions, yields not only the potential distribu-
tion throughout the charge-free region but also the potential of each conducting surface
forming a boundary.
It is of value to know that the solutions to Laplace's equation so obtained are unique.
To see that this is the case, imagine that two functions $\Phi_1$ and $\Phi_2$ have been found,
each of which satisfies Laplace's equation in the charge-free regions, and each of which
satisfies the boundary conditions. Then, since Laplace's equation is linear, their difference
$\Phi_1 - \Phi_2$ is also a solution and one can use the divergence theorem to write

$$\oint_S (\Phi_1 - \Phi_2)\,\nabla(\Phi_1 - \Phi_2)\cdot d\mathbf{S} = \int_V \nabla\cdot\left[(\Phi_1 - \Phi_2)\,\nabla(\Phi_1 - \Phi_2)\right]dV \tag{3.43}$$

in which $S$ is the totality of boundary surfaces of the charge-free regions and $V$ is the
entire charge-free volume.
But the integrand in the surface integral of (3.43) is the product of $\Phi_1 - \Phi_2$ and the
normal derivative of $\Phi_1 - \Phi_2$. One or the other of these factors is zero on any surface
element $dS$ because both $\Phi_1$ and $\Phi_2$ are assumed to satisfy the boundary conditions.
Hence the surface integral vanishes and

$$0 = \int_V \left[(\Phi_1 - \Phi_2)\,\nabla^2(\Phi_1 - \Phi_2) + |\nabla(\Phi_1 - \Phi_2)|^2\right]dV$$

in which use has been made of the vector identity
$\nabla\cdot(\psi\mathbf{A}) = \psi\,\nabla\cdot\mathbf{A} + \mathbf{A}\cdot\nabla\psi$. Because $\Phi_1 - \Phi_2$ satisfies
Laplace's equation everywhere in $V$ this reduces to

$$\int_V |\nabla(\Phi_1 - \Phi_2)|^2\,dV = 0 \tag{3.44}$$

Since the integrand of (3.44) can nowhere be negative, it follows that

$$\nabla(\Phi_1 - \Phi_2) = 0$$

Thus $\Phi_1$ and $\Phi_2$ can differ from each other at most by an additive constant. This
constant will have no influence on potential differences between points in $V$ and
disappears in taking the gradient. Thus, $\Phi_1$ and $\Phi_2$ yield the same electric field
distribution and the solution is unique.
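A discrete analogue of this uniqueness property can be observed numerically. The sketch below is an illustration only, not the continuous proof: it relaxes Laplace's equation on a small square grid by the Gauss-Seidel method, starting from two different interior guesses but with the same Dirichlet boundary data. The grid size, sweep count, function names, and the boundary data in `lid` are all hypothetical choices.

```python
def relax_laplace(boundary, guess, n=12, sweeps=2000):
    """Gauss-Seidel relaxation of Laplace's equation on an n-by-n grid.

    boundary(i, j) gives the fixed potential on the edge nodes (a Dirichlet
    problem); guess is the arbitrary starting value for the interior nodes.
    """
    edge = (0, n - 1)
    phi = [[boundary(i, j) if (i in edge or j in edge) else guess
            for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # each interior node becomes the average of its four neighbors
                phi[i][j] = 0.25 * (phi[i - 1][j] + phi[i + 1][j]
                                    + phi[i][j - 1] + phi[i][j + 1])
    return phi


def max_difference(a, b):
    """Largest node-by-node difference between two grids."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def lid(i, j):
    """Hypothetical boundary data: one edge at 1 volt, the others grounded."""
    return 1.0 if i == 0 else 0.0


phi_a = relax_laplace(lid, guess=0.0)
phi_b = relax_laplace(lid, guess=5.0)
```

Both runs settle to the same grid values to within rounding error, as `max_difference(phi_a, phi_b)` shows; the starting guess leaves no trace once the boundary values are fixed.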

3.11 SOLUTIONS TO LAPLACE'S EQUATION IN RECTANGULAR COORDINATES

When the electrostatic potential $\Phi$ is expressed as a function of the three Cartesian
coordinates $x$, $y$, $z$, Laplace's equation takes the form†

$$\frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} + \frac{\partial^2\Phi}{\partial z^2} = 0 \tag{3.45}$$

† Cf. Mathematical Supplement, Sec. V.16.

Using the method of separation of variables, one can assume a product solution of the
form

$$\Phi(x,y,z) = f_1(x)\,f_2(y)\,f_3(z) \tag{3.46}$$

Since $\Phi$ is a real function, it is convenient to assume that the functions $f_i$ are real also.
Upon substituting (3.46) into (3.45) one obtains

$$\frac{1}{f_3}\frac{d^2f_3}{dz^2} = -\frac{1}{f_1}\frac{d^2f_1}{dx^2} - \frac{1}{f_2}\frac{d^2f_2}{dy^2} \tag{3.47}$$

Since the right side of (3.47) is at most a function of $x$ and $y$, whereas the left side
can be a function only of $z$, it follows that both sides must be equal to the same constant.
For convenience this constant, which may have any real value, will be designated by
$k_z^2$. Then

$$\frac{d^2f_3}{dz^2} - k_z^2 f_3 = 0 \tag{3.48}$$

$$-\frac{1}{f_1}\frac{d^2f_1}{dx^2} = \frac{1}{f_2}\frac{d^2f_2}{dy^2} + k_z^2 \tag{3.49}$$

The left side of (3.49) is at most a function of $x$ whereas the right side can be a
function only of $y$. Consequently, both sides must equal a constant, say $k_x^2$, so that

$$\frac{d^2f_1}{dx^2} + k_x^2 f_1 = 0 \tag{3.50}$$

$$\frac{d^2f_2}{dy^2} + k_y^2 f_2 = 0 \tag{3.51}$$

in which

$$k_x^2 + k_y^2 = k_z^2 \tag{3.52}$$

If no one of the three constants $k_x$, $k_y$, $k_z$ is zero, appropriate solutions of (3.48),
(3.50), and (3.51) give

$$\Phi(x,y,z) = \{e^{\pm jk_xx}\}\{e^{\pm jk_yy}\}\{e^{\pm k_zz}\} \tag{3.53}$$

in which the brace notation is intended to signify

$$\{e^{\pm jk_xx}\} = a\,e^{jk_xx} + b\,e^{-jk_xx}$$

with $a$ and $b$ constants; etc.

If any one of the separation constants is zero, for example, if $k_x = 0$, then

$$f_1(x) = cx + d \tag{3.54}$$

and the appropriate factor in (3.53) is replaced by this linear solution.


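Since Laplace's equation is linear, any sum of solutions of the type (3.53) is also a
valid representation for $\Phi(x,y,z)$. A particularly useful combination, applicable when
the potential is repetitive in intervals $L_x$ and $L_y$ in the X and Y directions is

$$\Phi(x,y,z) = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left(a_{mn}e^{k_zz} + b_{mn}e^{-k_zz}\right)\exp\left(j2\pi\frac{mx}{L_x}\right)\exp\left(j2\pi\frac{ny}{L_y}\right),\qquad k_z = 2\pi\left[\left(\frac{m}{L_x}\right)^2 + \left(\frac{n}{L_y}\right)^2\right]^{1/2} \tag{3.55}$$

The complex constant coefficients $a_{mn}$ and $b_{mn}$ may be determined from the boundary
conditions by the usual Fourier techniques. This formulation can be extended to
nonrepetitive geometries by replacing (3.55) with a Fourier integral.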
EXAMPLE 3.17
Consider again the case of two parallel conducting plates, as first treated in Example 3.16,
but now assume that neither plate is heated, although they still differ in potential by $V_b$
volts, the plate at $x = l$ being at the higher potential. This is now a parallel plate capacitance
problem, and with no electron emission occurring, Laplace's equation applies for the region
between the plates. Assuming transverse dimensions large compared to $l$, $\Phi$ can be taken
independent of $y$ and $z$ and Equation (3.54) is an appropriate solution. Inserting the
boundary conditions that $\Phi(0) = 0$ and $\Phi(l) = V_b$ gives

$$\Phi(x) = V_b\,\frac{x}{l} \tag{3.56}$$

and thus the potential increases linearly from one plate to the other. This should be
contrasted to the heated cathode case of Example 3.16 in which the space charge caused the
potential to increase as the four-thirds power of distance. The two potential distributions
are shown in the figure.

[Figure: $\Phi(x)$ versus $x$ for the two cases, the linear distribution of (3.56) and the four-thirds-power distribution of Example 3.16.]

From (3.56) one can deduce that the electric field is

$$\mathbf{E} = -\nabla\Phi = -\mathbf{1}_x\frac{d\Phi}{dx} = -\mathbf{1}_x\frac{V_b}{l} \tag{3.57}$$

Therefore the electric field is uniform between the plates and

$$\mathbf{D}_0 = \epsilon_0\mathbf{E} = -\mathbf{1}_x\frac{\epsilon_0V_b}{l} \tag{3.58}$$

The plate at $x = l$ has a uniform surface charge density given by

$$\sigma_0 = D_0 = \frac{\epsilon_0V_b}{l} \tag{3.59}$$

there being an equal and opposite distribution on the plate at $x = 0$. The subscript on $\sigma$ is
a reminder that these plates are in a vacuum: later this configuration will be reconsidered in
the presence of a dielectric.

EXAMPLE 3.18
As an illustration of the applicability of harmonic solutions, consider an array of thin
conducting strips lying in the $z = 0$ plane. Each strip is $a/2$ units wide in the X direction and
infinitely long in the Y direction. Alternate strips are at potentials of $+V$ and $-V$ volts,
and insulated from each other by negligibly thin spacings b. This geometry is suggested by
the figure. It is desired to find the potential distribution in the upper half space $z > 0$.
First of all it is evident by applying (3.55) that one should select $n = 0$ to be compatible
with no variations of potential in the Y direction. Also, $a_{m0}$ should be set equal to zero for
all $m$ since the associated exponential term in $z$ diverges as $z \to \infty$. With these simplifications,
(3.55) reduces to

$$\Phi(x,z) = \sum_{m=-\infty}^{\infty} b_m \exp\left(j2\pi\frac{mx}{a}\right)\exp\left(-2\pi\frac{|m|z}{a}\right)$$

and this solution is subject to the boundary condition that the potential is a square wave in
$x$ when $z = 0$. Thus $\Phi(x,0)$ must agree with the potential plot indicated by the graph.

[Figure: strip array in the $z = 0$ plane with X, Y, Z axes; graph of $\Phi(x,0)$, a square wave in $x$ alternating between $+V$ and $-V$.]

This is a well-known problem in Fourier series$^{15}$ and the coefficients are given by

$$b_m = \frac{1}{a}\int_{-a/2}^{a/2}\Phi(x,0)\exp\left(-j\frac{2\pi mx}{a}\right)dx = \frac{V}{j\pi m}\,(1 - \cos m\pi) = -b_{-m}$$

for $m \ne 0$ ($b_0 = 0$). Therefore

$$\Phi(x,z) = \sum_{m=1}^{\infty} 2jb_m \sin\frac{2\pi mx}{a}\exp\left(-\frac{2\pi mz}{a}\right)$$

$$= \frac{2V}{\pi}\sum_{m=1}^{\infty}\frac{1 - \cos m\pi}{m}\sin\frac{2\pi mx}{a}\exp\left(-\frac{2\pi mz}{a}\right)$$

$$= \frac{4V}{\pi}\left[\sin\frac{2\pi x}{a}\exp\left(-\frac{2\pi z}{a}\right) + \frac{1}{3}\sin\frac{6\pi x}{a}\exp\left(-\frac{6\pi z}{a}\right) + \frac{1}{5}\sin\frac{10\pi x}{a}\exp\left(-\frac{10\pi z}{a}\right) + \cdots\right]$$

This is a series in which the higher harmonic terms decay very rapidly in the +Z direction.
One does not need to be very far above the plane $z = 0$ in order to find a potential
distribution which is almost a pure sinusoid in the X direction.
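The rapid decay of the higher harmonics is easy to see numerically. The sketch below sums the odd-harmonic series of this example; the function names, the number of harmonics retained, and the sample points are illustrative assumptions. It is assumed, as in the series above, that the strip occupying $0 < x < a/2$ is the one at $+V$.

```python
import math


def strip_potential(x, z, v=1.0, a=1.0, harmonics=200):
    """Partial sum of the series for Phi(x, z) over the first odd harmonics."""
    total = 0.0
    for k in range(harmonics):
        m = 2 * k + 1  # only odd m survive, since 1 - cos(m*pi) = 2 for odd m
        total += (math.sin(2.0 * math.pi * m * x / a)
                  * math.exp(-2.0 * math.pi * m * z / a) / m)
    return 4.0 * v / math.pi * total


def fundamental(x, z, v=1.0, a=1.0):
    """First term of the series alone."""
    return (4.0 * v / math.pi * math.sin(2.0 * math.pi * x / a)
            * math.exp(-2.0 * math.pi * z / a))
```

Just above the middle of a strip, for instance at `strip_potential(0.25, 0.001)`, the partial sum is close to the strip potential $V$; at a height of half a period, `strip_potential(0.25, 0.5)` and `fundamental(0.25, 0.5)` differ by well under one percent.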

3.12 SOLUTIONS TO LAPLACE'S EQUATION IN CYLINDRICAL COORDINATES


For problems involving boundaries which are coordinate surfaces in a cylindrical geometry,
it is advantageous to express Laplace's equation in terms of the cylindrical variables
$(r,\phi,z)$. For this case Equation (V.86) of the Mathematical Supplement becomes

$$\frac{\partial^2\Phi}{\partial r^2} + \frac{1}{r}\frac{\partial\Phi}{\partial r} + \frac{1}{r^2}\frac{\partial^2\Phi}{\partial\phi^2} + \frac{\partial^2\Phi}{\partial z^2} = 0 \tag{3.60}$$

15 See, e.g., I. S. Sokolnikoff and R. M. Redheffer, Mathematics of Physics and Modern Engineering,
pp. 180-181, McGraw-Hill Book Company, New York, 1958.


Once again if the method of separation of variables is used, a product of three real
functions can be assumed in the form

$$\Phi(r,\phi,z) = f_1(r)\,f_2(\phi)\,f_3(z) \tag{3.61}$$

which leads to the three ordinary differential equations

$$\frac{d^2f_3}{dz^2} - k^2f_3 = 0 \tag{3.62}$$

$$\frac{d^2f_2}{d\phi^2} + \nu^2f_2 = 0 \tag{3.63}$$

$$\frac{d^2f_1}{dr^2} + \frac{1}{r}\frac{df_1}{dr} + \left(k^2 - \frac{\nu^2}{r^2}\right)f_1 = 0 \tag{3.64}$$

in which the separation constants $k^2$ and $\nu^2$ are arbitrary real numbers. Some of this
arbitrariness is removed upon writing the solution for (3.63) as

$$f_2(\phi) = \{e^{\pm j\nu\phi}\}$$

and thus recognizing that $\nu$ must be an integer $n$ if the range of $\phi$ is unrestricted and the
potential is to be single-valued. Imposing this condition, one may write

$$f_2(\phi) = \{e^{\pm jn\phi}\} \tag{3.65}$$

Ordinarily, for the special case $n = 0$, the linear solution

$$f_2(\phi) = c\phi + d \tag{3.66}$$

would be indicated. However, the requirement that $\Phi$ be single-valued imposes the
constraint that the constant $c$ in (3.66) be zero. The remaining constant $d$ can be
accommodated in (3.65) by permitting that solution to apply also for the case $n = 0$. Thus (3.65)
will be used for $n = 0, \pm 1, \pm 2, \ldots$.
The solution of (3.62) proceeds in similar fashion, giving

$$f_3(z) = \{e^{\pm kz}\} \qquad k \ne 0 \tag{3.67}$$

$$f_3(z) = c'z + d' \qquad k = 0 \tag{3.68}$$

However, no integer restriction exists on the allowable values of $k$; indeed, since $k^2$ can
be any real constant, $k$ can be any pure real number or any pure imaginary number.
If both $k$ and $n$ are zero, (3.64) has the simple solution

$$f_1(r) = a\ln r + b \qquad n = k = 0 \tag{3.69}$$

whereas if only $k$ equals zero

$$f_1(r) = \{r^{\pm n}\} \qquad k = 0 \tag{3.70}$$

which can be verified by substitution.

For $k \ne 0$ it is best to proceed by introducing the substitution variable $v = kr$,
which converts (3.64) to

$$\frac{d^2f_1}{dv^2} + \frac{1}{v}\frac{df_1}{dv} + \left(1 - \frac{n^2}{v^2}\right)f_1 = 0 \tag{3.71}$$

This can be recognized as Bessel's differential equation.


It will be assumed that the reader is familiar with the details of the method used for
solving (3.71) and only the principal results will be stated here.$^{16}$ The assumption of a
power series representation for $f_1(v)$ leads to the conclusion that (3.71) has two
independent solutions given by

$$J_n(kr) = \sum_{m=0}^{\infty}\frac{(-1)^m(kr/2)^{n+2m}}{m!\,(n+m)!} \tag{3.72}$$

$$Y_n(kr) = \frac{2}{\pi}\left(\gamma + \ln\frac{kr}{2}\right)J_n(kr) - \frac{1}{\pi}\sum_{m=0}^{n-1}\frac{(n-m-1)!}{m!}\left(\frac{2}{kr}\right)^{n-2m} - \frac{1}{\pi}\sum_{m=0}^{\infty}\frac{(-1)^m(kr/2)^{n+2m}}{m!\,(n+m)!}\left(1 + \frac{1}{2} + \cdots + \frac{1}{m} + 1 + \frac{1}{2} + \cdots + \frac{1}{n+m}\right) \tag{3.73}$$

where $\gamma = 0.5772$


These solutions are known as Bessel functions of the first and second kind. They
appear formidable but are rarely needed in these forms in practice, since both functions
are tabulated.$^{17}$ For $k$ real, the two functions are oscillatory, this feature being shown in
Figure 3.12 for the first few values of $n$.
For large arguments ($kr \gg 1, n$),

$$J_n(kr) \to \sqrt{\frac{2}{\pi kr}}\,\cos\left(kr - \frac{n\pi}{2} - \frac{\pi}{4}\right) \tag{3.74}$$

$$Y_n(kr) \to \sqrt{\frac{2}{\pi kr}}\,\sin\left(kr - \frac{n\pi}{2} - \frac{\pi}{4}\right) \tag{3.75}$$

For this reason it is convenient to introduce particular linear combinations of $J_n$ and
$Y_n$ through the definitions

$$H_n^{(1)}(kr) = J_n(kr) + jY_n(kr) \tag{3.76}$$

$$H_n^{(2)}(kr) = J_n(kr) - jY_n(kr) \tag{3.77}$$

These combinations also form a fundamental set of solutions to Bessel's equation and
are known as Bessel functions of the third kind, or more commonly as Hankel functions.
For large arguments they have the asymptotic forms
$(2/\pi kr)^{1/2}\exp[\pm j(kr - n\pi/2 - \pi/4)]$ and thus, when combined with the harmonic time
function $e^{j\omega t}$, represent incoming and outgoing cylindrical waves. Their principal utility
will arise later in the discussion of time-varying fields.

16 Many excellent discussions of Bessel's equation and the properties of its solutions exist in the
literature. For example, the interested reader is referred to J. Irving and N. Mullineux, Mathematics in
Physics and Engineering, pp. 75-82, 128-174, Academic Press, New York, 1959.
17 See, e.g., the tables appended to G. N. Watson, A Treatise on the Theory of Bessel Functions, 2d ed.,
pp. 666-752, Cambridge Press, London, 1952.



FIGURE 3.12 Lower order Bessel functions of the first and second kind.

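For small arguments ($kr \ll 1$),

$$J_n(kr) \to \frac{1}{n!}\left(\frac{kr}{2}\right)^n \tag{3.78}$$

$$Y_n(kr) \to \begin{cases} \dfrac{2}{\pi}\left(0.5772 + \ln\dfrac{kr}{2}\right) & n = 0 \\[2ex] -\dfrac{(n-1)!}{\pi}\left(\dfrac{2}{kr}\right)^n & n \ne 0 \end{cases} \tag{3.79}$$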

Regardless of the value of the index n, the Bessel functions of the second kind are seen
to possess a singularity at r = 0 and thus they must be excluded from the solutions to
physical problems in regions containing the Z axis, unless a line source exists at r = O.
If $k$ is imaginary, the series solutions (3.72) and (3.73) are still valid, but for
convenience a pair of modified Bessel functions is employed. Letting $k = j\ell$, with $\ell$ real,
since the series (3.72) is even or odd according to the nature of $n$, it is possible to define a
real function by the relation

$$I_n(\ell r) = j^{-n}J_n(j\ell r) \tag{3.80}$$

If one attempts similarly to modify $Y_n$, a study of the series (3.73) reveals that a
comparably simple definition will yield a complex function, a result which is unwieldy
when one recalls that an effort is being made to represent a real potential function $\Phi$.
However, this difficulty can be avoided by introducing the function

$$K_n(\ell r) = \frac{\pi}{2}\,j^{n+1}\left[J_n(j\ell r) + jY_n(j\ell r)\right] \tag{3.81}$$

Inspection of the complex sum of the two series (3.72) and (3.73) for imaginary
argument reveals that $K_n(\ell r)$ is a real function; further, $I_n$ and $K_n$ comprise an independent
set of solutions to Bessel's equation and their asymptotic forms for large argument are
the simple expressions

$$I_n(\ell r) \to \frac{e^{\ell r}}{\sqrt{2\pi\ell r}} \tag{3.82}$$

$$K_n(\ell r) \to \sqrt{\frac{\pi}{2\ell r}}\;e^{-\ell r} \tag{3.83}$$

where $\ell r \gg 1, n$.
The functions $I_n$ and $K_n$ do not oscillate and only $K_n$ is well-behaved at infinity. The
graphical forms for the first few modified Bessel functions are shown in Figure 3.13.
For low values of $n$ they are widely tabulated.$^{18}$
In summary of these results, $\Phi(r,\phi,z)$ may be composed of a suitable product of three
factors from among:

$$\begin{aligned}
f_1(r) &= a\ln r + b && n = k = 0\\
&= \{r^{\pm n}\} && k = 0\\
&= a_nJ_n(kr) + b_nY_n(kr) && k \text{ real, nonzero}\\
&= a_nI_n(\ell r) + b_nK_n(\ell r) && k = j\ell \text{ imaginary, nonzero}\\
f_2(\phi) &= \{e^{\pm jn\phi}\} && n = 0, \pm 1, \pm 2, \ldots\\
f_3(z) &= c'z + d' && k = 0\\
&= \{e^{\pm kz}\} && k \text{ real, nonzero}\\
&= \{e^{\pm j\ell z}\} && k = j\ell \text{ imaginary, nonzero}
\end{aligned} \tag{3.84}$$

One can observe from (3.84) that oscillatory functions in $r$ combine with nonoscillatory
functions in $z$ and vice versa.
Since Laplace's equation is linear, products formed from the factors in (3.84) can be
summed to give more general solutions. Of particular value are the summations of
products which comprise complete orthogonal sets. For example, if the potential is
repetitive in the Z direction with a characteristic length $L_z$, then an appropriate
formulation is

$$\Phi(r,\phi,z) = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty}\left[a_{mn}I_n\!\left(\frac{2\pi|m|r}{L_z}\right) + b_{mn}K_n\!\left(\frac{2\pi|m|r}{L_z}\right)\right]e^{jn\phi}\,e^{j2\pi mz/L_z} \tag{3.85}$$

This can be recognized as a double Fourier series in $\phi$ and $z$. The complex coefficients
$a_{mn}$ and $b_{mn}$ can be determined in the usual way through knowledge of the boundary
conditions. The terms for $n = 0$, $k = 0$, must be treated individually in that the linear
forms in (3.84) should be substituted where needed.

18 See, e.g., G. N. Watson, loc. cit.

FIGURE 3.13 Lower order modified Bessel functions.
An orthogonal set of functions can also be generated in the radial direction. The
procedure can be illustrated with the function $J_1(kr)$ which is plotted in Figure 3.14a out to
its first root $\gamma_{11}$, in Figure 3.14b out to its second root $\gamma_{12}$, and in Figure 3.14c out
to its third root $\gamma_{13}$. If each of these curves is stretched (or compressed) to a common
length $kr_0$ and multiplied by $\sqrt{kr}$, the result is the family of curves shown in Figures
3.14d-f. These curves are plots of the functions $\sqrt{kr}\,J_1(\gamma_{1m}r/r_0)$ in the interval
$0 \le r \le r_0$. The functions form part of an orthogonal set, and this procedure can be
followed for any value of $n$ since

$$\int_0^{r_0} r\,J_n\!\left(\gamma_{nm}\frac{r}{r_0}\right)J_n\!\left(\gamma_{np}\frac{r}{r_0}\right)dr = \frac{r_0^2}{2}\,J_{n+1}^2(\gamma_{nm})\,\delta_{mp} \tag{3.86}$$

for all integral values of $n$, and for any integral values of $m \ge 1$, $p \ge 1$. In (3.86) the
symbol $\delta_{mp}$ is the Kronecker delta and has the value unity if $m = p$, but is otherwise
zero. Equation (3.86) is known as the orthogonality relation for the Bessel functions of
the first kind. Its derivation can be found in Appendix C together with a collection of
the more useful recursion relations and differential and integral formulas which connect
different Bessel functions.

FIGURE 3.14 Construction of orthogonal Bessel functions. Panels (a)-(c): $J_1$ plotted out to its first, second, and third roots; panels (d)-(f): the corresponding stretched and weighted curves.

When this type of expansion is appropriate, the potential function can be expressed in
the form

$$\Phi(r,\phi,z) = \sum_{n=-\infty}^{\infty}\sum_{m=1}^{\infty}\left[a_{mn}e^{\gamma_{nm}z/r_0} + b_{mn}e^{-\gamma_{nm}z/r_0}\right]J_n\!\left(\gamma_{nm}\frac{r}{r_0}\right)e^{jn\phi} \tag{3.87}$$

The complex coefficients $a_{mn}$ and $b_{mn}$ can be evaluated for the given boundary conditions
with the aid of (3.86) and the usual Fourier formula associated with the series in $\phi$.
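The orthogonality relation for Bessel functions of the first kind can be spot-checked without tables. The sketch below is an illustration under stated assumptions: it sums the power series for $J_n$ directly, locates roots by a simple scan and bisection, and evaluates the weighted overlap integral by Simpson's rule. The function names, step sizes, and term counts are hypothetical choices, and the direct series is adequate only for the moderate arguments used here.

```python
import math


def bessel_j(n, x, terms=60):
    """J_n(x) summed directly from its power series."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m * (x / 2.0) ** (n + 2 * m)
                  / (math.factorial(m) * math.factorial(n + m)))
    return total


def bessel_roots(n, count, step=0.1):
    """First positive roots of J_n, found by scanning for sign changes and bisecting."""
    roots = []
    x = step  # start past the trivial root at x = 0
    while len(roots) < count:
        a, b = x, x + step
        if bessel_j(n, a) * bessel_j(n, b) < 0.0:
            for _ in range(60):
                mid = 0.5 * (a + b)
                if bessel_j(n, a) * bessel_j(n, mid) <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
        x += step
    return roots


def overlap(n, gamma_m, gamma_p, r0=1.0, panels=400):
    """Simpson's-rule value of the integral of r J_n(gamma_m r/r0) J_n(gamma_p r/r0) over 0..r0."""
    h = r0 / panels

    def f(r):
        return r * bessel_j(n, gamma_m * r / r0) * bessel_j(n, gamma_p * r / r0)

    total = f(0.0) + f(r0)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0
```

With `roots = bessel_roots(1, 3)`, the value `overlap(1, roots[0], roots[1])` is negligibly small, while `overlap(1, roots[0], roots[0])` agrees with `0.5 * bessel_j(2, roots[0]) ** 2`, the normalization for $r_0 = 1$.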

EXAMPLE 3.19
Consider the case of two concentric conducting cylinders with the figure of Example 3.9
once again applicable. With the inner cylinder maintained at a potential $V_b$ volts above the
outer cylinder, let it be required to find the potential distribution in the space between
cylinders.
By symmetry, the answer should be independent of $\phi$ and $z$, so the proper selection from
(3.84) is

$$\Phi(r) = A\ln r + B$$

Use of the boundary conditions $\Phi(b) = 0$, $\Phi(a) = V_b$ gives

$$\Phi(r) = V_b\,\frac{\ln(r/b)}{\ln(a/b)}$$

From this

$$\mathbf{D}_0 = -\epsilon_0\nabla\Phi = -\mathbf{1}_r\,\frac{\epsilon_0V_b}{r\ln(a/b)} = \mathbf{1}_r\,\frac{\epsilon_0V_b}{r\ln(b/a)}$$

so that

$$\sigma = D_0(a) = \frac{\epsilon_0V_b}{a\ln(b/a)}$$

and the charge per unit length on the inner cylinder is

$$2\pi a\,\sigma = \frac{2\pi\epsilon_0V_b}{\ln(b/a)}$$

all of which is harmonious with the results of Example 3.9.


EXAMPLE 3.20
As indicated by the figure, a grounded conducting cylinder is immersed in what had been a
uniform electric field $\mathbf{E}_0 = \mathbf{1}_xE_0$. The cylinder axis coincides with the Z axis and its radius
is $r_0$. Find the potential distribution external to the cylinder.

[Figure: grounded cylinder of radius $r_0$ on the Z axis in a uniform field $\mathbf{E}_0$ directed along X; field point $P(r,\phi)$.]

Since the problem has no $z$ dependence, $k = 0$ and the proper selection from (3.84) is

$$\Phi(r,\phi) = \sum_{n=-\infty}^{\infty}{}' \left(a_nr^n + b_nr^{-n}\right)\left(c_ne^{jn\phi} + d_ne^{-jn\phi}\right) + a\ln r + b$$

in which $\sum'$ signifies that the term $n = 0$ is excluded from the summation. Symmetry
conditions indicate that the solution should be even in $\phi$ and therefore the above expression
reduces to

$$\Phi(r,\phi) = \sum_{n=1}^{\infty}\left(a_nr^n + b_nr^{-n}\right)\cos n\phi + a\ln r + b$$

The boundary condition at large $r$ is such that there should still be a uniform field $\mathbf{E}_0$, since
the effect of the cylinder is local. Therefore the potential at large $r$ should be

$$\Phi = -E_0x = -E_0r\cos\phi$$

For this reason $a = b = 0$ and $a_n = 0$, $n \ne 1$, with $a_1 = -E_0$. Thus

$$\Phi(r,\phi) = -E_0r\cos\phi + \sum_{n=1}^{\infty} b_nr^{-n}\cos n\phi$$

The constants $b_n$ are determined by the boundary condition that $\Phi(r_0,\phi) = 0$. This gives

$$0 = -E_0r_0\cos\phi + \sum_{n=1}^{\infty} b_nr_0^{-n}\cos n\phi$$

and therefore $b_n = 0$, $n \ne 1$, whereas $b_1 = E_0r_0^2$. The final form of the solution is

$$\Phi(r,\phi) = \left(-E_0r + \frac{E_0r_0^2}{r}\right)\cos\phi$$
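The closed-form result of Example 3.20 can be verified by direct substitution. The following sketch is illustrative only: the function names, the unit values of $E_0$ and $r_0$, and the finite-difference step are assumptions. It evaluates the potential and a centered-difference approximation to the cylindrical Laplacian of (3.60) with no $z$ dependence.

```python
import math


def cylinder_potential(r, phi, e0=1.0, r0=1.0):
    """Potential outside a grounded cylinder of radius r0 in a uniform field e0 along x."""
    return (-e0 * r + e0 * r0 * r0 / r) * math.cos(phi)


def laplacian_polar(f, r, phi, h=1e-3):
    """Centered-difference estimate of the two-dimensional Laplacian in (r, phi)."""
    d2r = (f(r + h, phi) - 2.0 * f(r, phi) + f(r - h, phi)) / h**2
    dr = (f(r + h, phi) - f(r - h, phi)) / (2.0 * h)
    d2phi = (f(r, phi + h) - 2.0 * f(r, phi) + f(r, phi - h)) / h**2
    return d2r + dr / r + d2phi / r**2
```

The potential vanishes on the cylinder, for example `cylinder_potential(1.0, 0.7)`, the estimate `laplacian_polar(cylinder_potential, 2.0, 0.7)` is zero to within the finite-difference error, and far from the cylinder the potential approaches $-E_0r\cos\phi$.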

EXAMPLE 3.21
Consider a grounded hollow cylindrical can of radius $r_0$ and height $h$, as shown in the figure.
Its lid has been removed just sufficiently to be insulated from the can, and raised to a
potential $V_0$. Find the potential distribution within the cylinder.

[Figure: cylindrical can of height $h$ with its axis along Z.]

This problem can be simplified from the outset by recognizing that the geometry is
$\phi$ symmetric and therefore that $n = 0$. Also, since the potential must vanish at $r = r_0$ and
be finite at $r = 0$, only the $J_0$ functions will suffice. Thus the representation (3.87) is
appropriate in the form

$$\Phi(r,z) = \sum_{m=1}^{\infty} A_m\sinh\left(\gamma_{0m}\frac{z}{r_0}\right)J_0\!\left(\gamma_{0m}\frac{r}{r_0}\right)$$

in which the exponentials in $z$ have been combined to give the hyperbolic sine in accordance
with the requirement that $\Phi = 0$ over the bottom of the can.
The coefficients $A_m$ can be evaluated with the aid of (3.86) and the boundary conditions
at $z = h$. Since

$$\Phi(r,h) = V_0 = \sum_{m=1}^{\infty} A_m\sinh\left(\gamma_{0m}\frac{h}{r_0}\right)J_0\!\left(\gamma_{0m}\frac{r}{r_0}\right)$$

it follows, upon multiplying both sides by $rJ_0(\gamma_{0p}r/r_0)$ and integrating, that

$$\int_0^{r_0} V_0\,r\,J_0\!\left(\gamma_{0p}\frac{r}{r_0}\right)dr = \sum_{m=1}^{\infty} A_m\sinh\left(\gamma_{0m}\frac{h}{r_0}\right)\int_0^{r_0} r\,J_0\!\left(\gamma_{0m}\frac{r}{r_0}\right)J_0\!\left(\gamma_{0p}\frac{r}{r_0}\right)dr = A_p\sinh\left(\gamma_{0p}\frac{h}{r_0}\right)\frac{r_0^2}{2}\,J_1^2(\gamma_{0p})$$

Use of Equation (C.14) from Appendix C gives

$$\frac{d}{d(\gamma_{0p}r/r_0)}\left[\gamma_{0p}\frac{r}{r_0}\,J_1\!\left(\gamma_{0p}\frac{r}{r_0}\right)\right] = \gamma_{0p}\frac{r}{r_0}\,J_0\!\left(\gamma_{0p}\frac{r}{r_0}\right)$$

so that the above integral yields

$$A_p = \frac{2V_0}{\gamma_{0p}\,J_1(\gamma_{0p})\sinh(\gamma_{0p}h/r_0)}$$

and the potential is given by

$$\Phi(r,z) = 2V_0\sum_{m=1}^{\infty}\frac{J_0(\gamma_{0m}r/r_0)\,\sinh(\gamma_{0m}z/r_0)}{\gamma_{0m}\,J_1(\gamma_{0m})\,\sinh(\gamma_{0m}h/r_0)}$$

For specific values of $r_0$ and $h$ the table of roots in Appendix C may be used to determine the
relative richness of harmonics in the above series.

EXAMPLE 3.22
In a variation of the preceding problem, the cylinder is insulated from the bottom lid as
well as the top lid. If the two lids are grounded and the cylinder is kept at a potential $V_0$,
the internal field can be determined by utilizing the representation (3.85). This choice is
dictated by the need to have the potential vanish at $z = 0, h$, a condition which cannot be
satisfied by hyperbolic functions of $z$. Once again there is $\phi$ symmetry, making $n = 0$, and
the functions $K_0$ cannot be used because the potential is finite at $r = 0$. Therefore

$$\Phi(r,z) = \sum_{m=1}^{\infty} A_m\sin\frac{m\pi z}{h}\;I_0\!\left(\frac{m\pi r}{h}\right)$$

Since $\Phi(r_0,z) = V_0$,

$$2\int_0^h V_0\sin\frac{p\pi z}{h}\,dz = \sum_{m=1}^{\infty} A_m\,I_0\!\left(\frac{m\pi r_0}{h}\right)\int_{-h}^{h}\sin\frac{m\pi z}{h}\,\sin\frac{p\pi z}{h}\,dz$$

in which the mirror potential $-V_0$ has been assumed for convenience in the range
$-h < z < 0$. Thus

$$A_p = \frac{2V_0(1 - \cos p\pi)}{p\pi\,I_0(p\pi r_0/h)}$$

and the potential is given by the expression

$$\Phi(r,z) = \frac{4V_0}{\pi}\sum_{m}{}'\,\frac{I_0(m\pi r/h)}{m\,I_0(m\pi r_0/h)}\,\sin\frac{m\pi z}{h}$$

in which $\sum'$ denotes odd values of $m$ only.

3.13 SOLUTIONS TO LAPLACE'S EQUATION IN SPHERICAL COORDINATES

When the potential problem of interest involves boundaries which are coordinate
surfaces in a spherical geometry, Laplace's equation is best written in terms of the
spherical variables $(r,\theta,\phi)$; Equation (V.86) of the Mathematical Supplement then
becomes

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\Phi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial\Phi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\Phi}{\partial\phi^2} = 0 \tag{3.88}$$

A product solution in the form

$$\Phi(r,\theta,\phi) = f_1(r)\,f_2(\theta)\,f_3(\phi) \tag{3.89}$$

can be assumed in which the functions $f_i$ are real. This leads to the separation

$$\frac{\sin^2\theta}{f_1}\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \frac{\sin\theta}{f_2}\frac{d}{d\theta}\left(\sin\theta\,\frac{df_2}{d\theta}\right) + \frac{1}{f_3}\frac{d^2f_3}{d\phi^2} = 0 \tag{3.90}$$

Since the last term is a function only of $\phi$ whereas the first two terms are not, it follows
that

$$\frac{1}{f_3}\frac{d^2f_3}{d\phi^2} = -m^2 \tag{3.91}$$

in which $m$ must be an integer if the potential is to be single-valued. Thus

$$f_3(\phi) = c_m\cos m\phi + d_m\sin m\phi \qquad m = 0, \pm 1, \pm 2, \ldots \tag{3.92}$$

Replacement of $(1/f_3)\,d^2f_3/d\phi^2$ by $-m^2$ in (3.90) and division by $\sin^2\theta$ gives

$$\frac{1}{f_1}\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \frac{1}{f_2\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{df_2}{d\theta}\right) - \frac{m^2}{\sin^2\theta} = 0 \tag{3.93}$$

Because only the first term of (3.93) is a function of $r$, this term must equal a real
constant which shall be designated $n(n + 1)$, for reasons which will emerge shortly.
Thus

$$\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) - n(n+1)\,f_1 = 0 \tag{3.94}$$

$$\frac{d}{d\theta}\left(\sin\theta\,\frac{df_2}{d\theta}\right) + \left[n(n+1)\sin\theta - \frac{m^2}{\sin\theta}\right]f_2 = 0 \tag{3.95}$$

Equation (3.94) is readily solved and gives

$$f_1(r) = ar^n + br^{-(n+1)} \tag{3.96}$$

Solution of (3.95) is facilitated by making the substitution $u = \cos\theta$ which leads to

$$(1 - u^2)\frac{d^2f_2}{du^2} - 2u\,\frac{df_2}{du} + \left[n(n+1) - \frac{m^2}{1 - u^2}\right]f_2 = 0 \tag{3.97}$$

The functions which satisfy this equation are the associated Legendre functions and
the two independent solutions are normally designated $P_n^m(u)$ and $Q_n^m(u)$. The latter
have singularities at the poles $\theta = 0, \pi$ and must be excluded if the polar axis is part
of the region of interest. In all that follows this will be assumed to be the case. Appendix
D includes a discussion of the manner in which (3.97) is solved, together with a
development of the major properties of the functions $P_n^m(u)$, and only the principal results
will be stated here.
SECTION 13 Solutions to Laplace's Equotum in Spherical Coordinates 167

The assumption of a power series solution of (3.97) for the case m == 0 leads to the
conclusion that, if n is an integer, f2(U) can be expressed as a polynomial which is
well-behaved in the entire region -1 ~ u ~ 1 and given by

1 dn
Pn(u) = - n - (u 2 - l)n (3.98)
2 n ! dun

The first few of these polynomials are

Po(u) == 1

and these functions are plotted in Figure 3.15.

FIGURE 3.15 Legendre polynomials. ($P_n(u)$ plotted against $u$ for $-1 \le u \le 1$.)

If all positive integral values of $n$ are included, the Legendre polynomials generated by (3.98) constitute a complete orthogonal set in the interval $[-1, 1]$, and for this reason nonintegral values of $n$ normally are not considered.
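As an illustration of (3.98), the following minimal sketch generates Legendre polynomials by carrying out the differentiation in Rodrigues' formula on exact rational coefficients, and then checks orthogonality over $[-1, 1]$ by exact integration. The helper names (`poly_mul`, `poly_diff`, `legendre`, `integrate`) are illustrative choices and do not come from the text; polynomials are stored as coefficient lists with the lowest power first.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_diff(p):
    """Differentiate a polynomial coefficient list once."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def legendre(n):
    """Coefficients of P_n(u) obtained from Rodrigues' formula (3.98)."""
    p = [Fraction(1)]
    for _ in range(n):  # build (u^2 - 1)^n
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])
    for _ in range(n):  # differentiate n times
        p = poly_diff(p)
    scale = Fraction(1, 2 ** n * factorial(n))
    return [scale * c for c in p]


def integrate(p, lo=-1, hi=1):
    """Exact integral of a polynomial over [lo, hi]."""
    return sum(c * (Fraction(hi) ** (k + 1) - Fraction(lo) ** (k + 1)) / (k + 1)
               for k, c in enumerate(p))


if __name__ == "__main__":
    for n in range(4):
        print(n, legendre(n))
    # Orthogonality: the integral of P_2 P_3 vanishes; that of P_3 P_3 is 2/(2*3 + 1).
    print(integrate(poly_mul(legendre(2), legendre(3))))
    print(integrate(poly_mul(legendre(3), legendre(3))))
```

Exact fractions are used so that the orthogonality check is free of rounding error; for large $n$ a recurrence relation would be the more economical route.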
168 Electrostatics in Free Space CHAPTER 3

For $m \ne 0$, the associated Legendre function $P_n^m(u)$ satisfies (3.97) and is given by

$$P_n^m(u) = (1 - u^2)^{m/2}\frac{d^m P_n(u)}{du^m} \tag{3.99}$$

Since $P_n(u)$ is an $n$th-order polynomial, $m$ cannot exceed $n$ in value. A variety of recurrence formulas connecting associated Legendre functions and/or their derivatives for different values of the indices can be found in Appendix D, together with a list of the specific functions generated from (3.99) for low values of $m$ and $n$.
The associated Legendre functions are also orthogonal in $[-1, 1]$, the normalization integral being

$$\int_{-1}^{1} P_n^m(u)\,P_l^m(u)\,du = \frac{2(n + m)!}{(2n + 1)(n - m)!}\,\delta_{nl} \tag{3.100}$$

When (3.89) is expanded in terms of the solutions which have been found for the constituent functions, one obtains

$$\Phi(r,\theta,\phi) = \sum_{n=0}^{\infty}\sum_{m=0}^{n}\left[a_n r^n + b_n r^{-(n+1)}\right]P_n^m(\cos\theta)\left[c_m\cos m\phi + d_m\sin m\phi\right] \tag{3.101}$$

The combination $P_n^m(\cos\theta)[c_m\cos m\phi + d_m\sin m\phi]$ is called a spherical harmonic. Being orthogonal in both $\cos\theta$ and $\phi$, it is suitable for the expansion of arbitrary functions of $\theta$ and $\phi$ in spherical coordinates in exactly the same way that a double Fourier series is used in two dimensions in rectangular coordinates.
EXAMPLE 3.23
Imagine a uniform electric field of strength $E_0$ into which an insulated conducting sphere of radius $r_0$ is placed. For convenience take the polar axis parallel to the original field and accept the resulting potential of the sphere as the zero reference. What is the field distribution in the region exterior to the sphere?
Equation (3.101) can be used with the simplification that $m = 0$ because of $\phi$ symmetry. Then

$$\Phi(r,\theta) = \sum_{n=0}^{\infty}\left[a_n r^n + b_n r^{-(n+1)}\right]P_n(\cos\theta)$$

$$\Phi(r_0,\theta) = 0 = \sum_{n=0}^{\infty}\left[a_n r_0^n + b_n r_0^{-(n+1)}\right]P_n(\cos\theta)$$

Multiplying both sides of the second of these equations by $P_l(\cos\theta)$ and integrating gives

$$0 = \sum_{n=0}^{\infty}\left[a_n r_0^n + b_n r_0^{-(n+1)}\right]\int_{-1}^{1}P_n(\cos\theta)\,P_l(\cos\theta)\,d(\cos\theta) = \left[a_l r_0^l + b_l r_0^{-(l+1)}\right]\frac{2}{2l + 1}$$

this second result arising from the normalization integral (D.24) in Appendix D. Thus

$$b_l = -a_l\,r_0^{2l+1}$$

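Since the electric field must be $\mathbf{1}_z E_0$ at points remote from the sphere,

$$\lim_{r\to\infty}\Phi(r,\theta) = \lim_{r\to\infty}\sum_{n=0}^{\infty}a_n r^n\left[1 - \left(\frac{r_0}{r}\right)^{2n+1}\right]P_n(\cos\theta) = \lim_{r\to\infty}(-E_0 r\cos\theta)$$

Therefore only $a_1$ is nonzero and $a_1 = -E_0$. The potential distribution is then

$$\Phi(r,\theta) = -E_0\left[1 - \left(\frac{r_0}{r}\right)^3\right]r\cos\theta$$

The field intensity is given by $\mathbf{E} = -\nabla\Phi$, that is,

$$\mathbf{E} = \mathbf{1}_r E_0\left[1 + 2\left(\frac{r_0}{r}\right)^3\right]\cos\theta - \mathbf{1}_\theta E_0\left[1 - \left(\frac{r_0}{r}\right)^3\right]\sin\theta$$

The surface density of charge induced on the sphere is

$$\sigma(\theta) = \epsilon_0 E_r(r_0) = 3\epsilon_0 E_0\cos\theta$$

Flux lines and equipotentials are shown in the figure. It can be observed that the sphere exerts little influence at distances larger than one radius from its surface.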
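The results of this example lend themselves to a quick numerical check. The sketch below, under the assumption of unit values for $E_0$ and $r_0$ unless others are supplied, evaluates the potential and the radial field, and integrates the induced surface charge over the sphere; the total should vanish because the sphere is insulated and initially uncharged. Function names and the value used for $\epsilon_0$ are illustrative choices.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space in farads/m


def potential(r, theta, E0=1.0, r0=1.0):
    """Phi(r, theta) outside a conducting sphere placed in a uniform field E0."""
    return -E0 * (1.0 - (r0 / r) ** 3) * r * math.cos(theta)


def radial_field(r, theta, E0=1.0, r0=1.0):
    """E_r = -dPhi/dr for the potential above."""
    return E0 * (1.0 + 2.0 * (r0 / r) ** 3) * math.cos(theta)


def surface_charge(theta, E0=1.0, r0=1.0):
    """sigma(theta) = eps0 * E_r(r0) = 3 eps0 E0 cos(theta)."""
    return EPS0 * radial_field(r0, theta, E0, r0)


def total_induced_charge(E0=1.0, r0=1.0, steps=2000):
    """Midpoint-rule integral of sigma over the sphere; expected to vanish."""
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):
        th = (i + 0.5) * dtheta
        ring_area = 2.0 * math.pi * r0 ** 2 * math.sin(th) * dtheta
        total += surface_charge(th, E0, r0) * ring_area
    return total


if __name__ == "__main__":
    print(potential(1.0, 0.3))       # zero on the sphere
    print(surface_charge(0.0) / EPS0)  # 3 E0 at the pole
    print(total_induced_charge())    # zero to rounding error
    # Perturbation of the uniform-field potential falls off as (r0/r)^3.
    for r in (1.0, 2.0, 3.0):
        print(r, (1.0 / r) ** 3)
```

The last loop makes the closing remark of the example quantitative: at one radius from the surface ($r = 2r_0$) the correction term in the potential is already down to one-eighth of the uniform-field term.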


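EXAMPLE 3.24
The potential of a point charge $q_1$ at a distance $r_1$ from the origin of coordinates has $\phi$ symmetry if the polar axis is chosen to pass through the charge. With reference to the figure, the potential at the point $P(r,\theta,\phi)$ is

$$\Phi = \frac{q_1}{4\pi\epsilon_0\ell}$$

in which, in accordance with the cosine law,

$$\frac{1}{\ell} = \frac{1}{r_1}\left[\left(\frac{r}{r_1}\right)^2 + 1 - 2\,\frac{r}{r_1}\cos\theta\right]^{-1/2} = \frac{1}{r}\left[\left(\frac{r_1}{r}\right)^2 + 1 - 2\,\frac{r_1}{r}\cos\theta\right]^{-1/2}$$

But since

$$(1 - 2ut + t^2)^{-1/2} = \sum_{n=0}^{\infty}t^n P_n(u)$$

is a generating function for Legendre polynomials if $|t| < 1$ (cf. Appendix D), it follows that

$$\frac{1}{\ell} = \frac{1}{r_1}\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) \qquad r < r_1$$

$$\frac{1}{\ell} = \frac{1}{r}\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) \qquad r > r_1$$

The potential of the point charge can therefore be expressed as

$$\Phi(r,\theta) = \frac{q_1}{4\pi\epsilon_0 r_1}\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) \qquad r < r_1$$

$$\Phi(r,\theta) = \frac{q_1}{4\pi\epsilon_0 r}\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) \qquad r > r_1$$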

This formulation can be used to find the potential distribution due to a point charge outside a grounded conducting sphere. If the polar axis is taken through the point charge $q_1$, which is a distance $r_1$ from the center of the sphere, it follows that the charge induced on the sphere has a $\phi$-symmetric distribution. Using (3.101) with $m = 0$ to represent the part of the potential due to the charges on the sphere, and using the above series to represent the potential due to $q_1$, one can write

$$\Phi(r,\theta) = \sum_{n=0}^{\infty}b_n r^{-(n+1)}P_n(\cos\theta) + \frac{q_1}{4\pi\epsilon_0 r_1}\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) \qquad a \le r < r_1$$

$$\Phi(r,\theta) = \sum_{n=0}^{\infty}b_n r^{-(n+1)}P_n(\cos\theta) + \frac{q_1}{4\pi\epsilon_0 r}\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) \qquad r > r_1$$

in which $a$ is the radius of the sphere. The radial functions have been chosen appropriately to satisfy the finiteness condition at infinity.
Since $\Phi(a,\theta) = 0$, the orthogonality of the Legendre polynomials leads to

$$b_n a^{-(n+1)} = -\frac{q_1}{4\pi\epsilon_0 r_1}\left(\frac{a}{r_1}\right)^n$$

from which it follows that

$$\sum_{n=0}^{\infty}b_n r^{-(n+1)}P_n(\cos\theta) = \frac{q_2}{4\pi\epsilon_0 r}\sum_{n=0}^{\infty}\left(\frac{r_2}{r}\right)^n P_n(\cos\theta)$$

with

$$q_2 = -\frac{a}{r_1}\,q_1 \qquad r_2 = \frac{a^2}{r_1}$$

Thus the grounded sphere is equivalent to a second point charge at an interior point on the polar axis. This result is in agreement with Example 3.15.
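The equivalence just stated can be checked numerically. The following sketch sums the Legendre series for the sphere-charge part of the potential, using the coefficients $b_n$ found from the boundary condition, and compares it with the closed-form potential of an image charge $-q_1 a/r_1$ placed at the distance $a^2/r_1$ from the center. All quantities are multiplied by $4\pi\epsilon_0$ for convenience; the function names, default parameter values, and truncation order are illustrative assumptions.

```python
import math


def legendre_values(n_max, u):
    """P_0(u) ... P_n_max(u) from the standard three-term recurrence."""
    vals = [1.0, u]
    for n in range(1, n_max):
        vals.append(((2 * n + 1) * u * vals[n] - n * vals[n - 1]) / (n + 1))
    return vals[: n_max + 1]


def induced_series(r, theta, q1=1.0, r1=2.0, a=1.0, n_max=60):
    """4*pi*eps0 times the sum of b_n r^-(n+1) P_n(cos theta), valid for r >= a."""
    P = legendre_values(n_max, math.cos(theta))
    total = 0.0
    for n in range(n_max + 1):
        b_n = -(q1 / r1) * (a / r1) ** n * a ** (n + 1)  # from Phi(a, theta) = 0
        total += b_n * r ** -(n + 1) * P[n]
    return total


def image_charge(r, theta, q1=1.0, r1=2.0, a=1.0):
    """4*pi*eps0 times the potential of the image charge on the polar axis."""
    q2, r2 = -q1 * a / r1, a * a / r1
    return q2 / math.sqrt(r * r + r2 * r2 - 2.0 * r * r2 * math.cos(theta))


def point_charge(r, theta, q1=1.0, r1=2.0):
    """4*pi*eps0 times the potential of q1 itself."""
    return q1 / math.sqrt(r * r + r1 * r1 - 2.0 * r * r1 * math.cos(theta))


if __name__ == "__main__":
    for r, th in [(1.0, 0.4), (1.5, 0.7), (3.0, 2.0)]:
        print(r, th, induced_series(r, th), image_charge(r, th))
    # On the sphere the two contributions cancel.
    print(point_charge(1.0, 0.4) + image_charge(1.0, 0.4))
```

The series converges geometrically with ratio $a^2/(r_1 r)$, so sixty terms are far more than needed for the default values used here.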

EXAMPLE 3.25
A conducting spherical shell of radius $a$ is divided into four equal sectors as shown in the figure. These sectors are insulated from each other by small gaps and are alternately at the potentials $+V_0$ and $-V_0$. Find the potential distribution in the region external to the sphere.

The general expansion (3.101) may be used once again with $a_n = 0$ to satisfy conditions at infinity. Then since the potential must be an odd function of $\phi$,

$$\Phi(r,\theta,\phi) = \sum_{n=0}^{\infty}\sum_{m=0}^{n}d_{mn}\sin m\phi\;r^{-(n+1)}P_n^m(\cos\theta)$$

At $r = a$, the potential is independent of $\theta$ and is a square wave function in $\phi$, and thus

$$\int_{-1}^{1}\int_0^{2\pi}\Phi(a,\phi)\,P_l^p(u)\sin p\phi\;du\,d\phi = \int_{-1}^{1}\int_0^{2\pi}P_l^p(u)\sin p\phi\sum_{n=0}^{\infty}\sum_{m=0}^{n}d_{mn}a^{-(n+1)}\sin m\phi\,P_n^m(u)\;du\,d\phi$$

$$= \pi a^{-(l+1)}\frac{2(l + p)!}{(2l + 1)(l - p)!}\,d_{pl}$$

by virtue of (D.30) and the Fourier normalization formula,

$$\int_0^{2\pi}\sin m\phi\,\sin p\phi\;d\phi = \pi\,\delta_{mp}$$

The left side of the above equation reduces to

$$V_0\left[4\int_0^{\pi/2}\sin p\phi\;d\phi\right]\int_{-1}^{1}P_l^p(u)\,du = \frac{8V_0}{p}\int_{-1}^{1}P_l^p(u)\,du$$

in which $p$ is restricted to the values $2(2s + 1)$ with $s = 0, 1, 2, \ldots$. Therefore

$$d_{pl} = \frac{4a^{l+1}V_0}{\pi p}\,\frac{(2l + 1)(l - p)!}{(l + p)!}\,F_l^p$$

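in which $F_l^p = \int_{-1}^{1}P_l^p(u)\,du$. Since $P_l^p(u)$ has the parity of $l + p$, $F_l^p$ vanishes unless $l + p$ is even. The dominant term in this series is therefore the one with $p = 2$, $l = 2$, for which $F_2^2 = 4$. Thus

$$\Phi(r,\theta,\phi) \cong \frac{5V_0}{3\pi}\left(\frac{a}{r}\right)^3\sin 2\phi\;P_2^2(\cos\theta) = \frac{5V_0}{\pi}\left(\frac{a}{r}\right)^3\sin 2\phi\,\sin^2\theta$$

for $r \gg a$.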

3.14 GREEN'S FUNCTIONS


A technique for solving boundary-value problems which has the virtue of systematization arises from the use of Green's second integral theorem. With reference to Section V.21 of the Mathematical Supplement, if $\Phi(\xi,\eta,\zeta)$ and $\Psi(\xi,\eta,\zeta)$ are well-behaved functions in a volume $V$ which is enclosed by a surface $S$, this theorem leads to the identity

$$\int_V\left(\Phi\nabla_S^2\Psi - \Psi\nabla_S^2\Phi\right)dV = \int_S\left(\Phi\nabla_S\Psi - \Psi\nabla_S\Phi\right)\cdot d\mathbf{S} \tag{3.102}$$

in which

$$\nabla_S = \mathbf{1}_x\frac{\partial}{\partial\xi} + \mathbf{1}_y\frac{\partial}{\partial\eta} + \mathbf{1}_z\frac{\partial}{\partial\zeta}$$

Let $\Phi$ be the potential function being sought and let $\Psi = G$ be the Green's function defined by

$$G = \frac{1}{4\pi\epsilon_0\ell} + \chi \tag{3.103}$$

with $\nabla_S^2\chi = 0$ throughout $V$ and $\ell = [(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2]^{1/2}$. $G$ can be interpreted as the potential due to a unit charge placed at $(x,y,z)$ plus the potential due to a source system exterior to $V$. Because of the singularity in $G$, if $(x,y,z)$ lies within $V$, a small sphere of radius $\varepsilon$ can be erected around $(x,y,z)$ and its volume excluded from $V$, or alternatively the singularity can be represented by the Dirac delta function. The latter approach gives

$$\nabla_S^2 G = -\frac{\delta(\mathbf{r} - \mathbf{r}_S)}{\epsilon_0} \tag{3.104}$$

in which $\mathbf{r}$ and $\mathbf{r}_S$ are drawn from the origin to the point $(x,y,z)$ and to the point $(\xi,\eta,\zeta)$ respectively.
Since $\nabla_S^2\Phi = -\rho/\epsilon_0$ in $V$, (3.102) becomes

$$-\int_V\left[\Phi\,\frac{\delta(\mathbf{r} - \mathbf{r}_S)}{\epsilon_0} - G\,\frac{\rho}{\epsilon_0}\right]dV = \int_S\left(\Phi\frac{\partial G}{\partial n} - G\frac{\partial\Phi}{\partial n}\right)dS$$

wherein $n$ is a spatial variable in the direction of the outward-drawn normal. This yields the general result

$$\Phi(x,y,z) = \int_V G\rho\;dV + \epsilon_0\int_S\left(G\frac{\partial\Phi}{\partial n} - \Phi\frac{\partial G}{\partial n}\right)dS \tag{3.105}$$

if $(x,y,z)$ is within $V$; otherwise the left side of (3.105) is zero. If the Green's function $G$ is known, Equation (3.105) gives the solution for the wanted potential function $\Phi$ in terms of its sources within $V$ and the values of $\Phi$ and its normal derivative on $S$.
Several classes of problems which can be solved by this formulation are worthy of mention:
1. If $G$ is chosen so that $\chi = 0$, if $S$ consists of a single surface which goes to infinity, and if $\Phi$ decreases with $\ell$ as $\ell \to \infty$ (as it will for any finite source distribution), then the surface integrals in (3.105) vanish and the familiar result (3.18) is obtained.
2. If $\Phi$ has no sources in $V$, so that $\nabla_S^2\Phi = 0$ in $V$, then (3.105) reduces to

$$\Phi(x,y,z) = \epsilon_0\int_S\left(G\frac{\partial\Phi}{\partial n} - \Phi\frac{\partial G}{\partial n}\right)dS \tag{3.106}$$

However, it already has been noted in Section 3.10, in connection with the uniqueness proof for solutions to Laplace's equation, that if $\nabla_S^2\Phi = 0$ in $V$, knowledge of $\Phi$ or $\partial\Phi/\partial n$ on $S$ is sufficient to determine $\Phi$ everywhere in $V$. Thus (3.106) as it stands is over-determined, and additional conditions may be imposed. The most commonly imposed conditions are (a) $G = 0$ on $S$ (the Dirichlet problem) and (b) $\partial G/\partial n = 0$ on $S$ (the Neumann problem).^19
In the Dirichlet problem the Green's function may be considered to arise from a unit charge placed at $(x,y,z)$ plus a system of "image" charges so positioned outside $S$ as to cause $G = 0$ on $S$. Another interpretation is to imagine (for the purpose of determining $G$) that $S$ is a grounded conducting surface which contains an induced surface charge distribution due to the presence of a unit charge at $(x,y,z)$. For any Dirichlet problem

$$\Phi(x,y,z) = -\epsilon_0\int_S\Phi\,\frac{\partial G}{\partial n}\,dS \tag{3.107}$$

in which $\epsilon_0\,\partial G/\partial n$ is the induced surface charge distribution.
Similarly, in the Neumann problem, the Green's function may be considered to be due to a unit charge placed at $(x,y,z)$, plus a distribution of charges exterior to $V$ such that $\partial G/\partial n = 0$ on $S$. For any Neumann problem

$$\Phi(x,y,z) = \epsilon_0\int_S G\,\frac{\partial\Phi}{\partial n}\,dS \tag{3.108}$$

19 See J. D. Jackson, Classical Electrodynamics, p. 19, John Wiley and Sons, Inc., New York, 1962, for a less restrictive Neumann condition.

The advantage of both formulas (3.107) and (3.108) is that if $G$ is once found for a particular geometry (i.e., a particular shape of surface $S$) an entire class of problems is formally solved.
3. Since Poisson's equation is linear, solutions may be superimposed so that, if $\Phi$ has sources in $V$, a generalized Dirichlet problem gives

$$\Phi(x,y,z) = \int_V G\rho\;dV - \epsilon_0\int_S\Phi\,\frac{\partial G}{\partial n}\,dS \tag{3.109}$$

in which $G = 0$ on $S$. Similarly, a generalized Neumann problem gives

$$\Phi(x,y,z) = \int_V G\rho\;dV + \epsilon_0\int_S G\,\frac{\partial\Phi}{\partial n}\,dS \tag{3.110}$$

with $\partial G/\partial n = 0$ on $S$.
4. Finally, mixed boundary conditions are possible, with $G = 0$ over $S'$, a part of $S$, and $\partial G/\partial n = 0$ over $S''$, the other part of $S$. The appropriate expressions for $\Phi$ are then natural extensions of the results already given.
EXAMPLE 3.26
If an arbitrary potential distribution $\Phi(a,\theta',\phi')$ is established over a spherical surface of radius $a$ by means of a source system external to the sphere, the potential anywhere within the sphere can be determined with the aid of (3.107). On the basis of the results of Example 3.15, $G$ will be due to a unit charge placed at the point $(r,\theta,\phi)$ and an image charge of strength $-a/r$ placed at the point $(a^2/r,\theta,\phi)$. Then

$$\Phi(r,\theta,\phi) = -\frac{1}{4\pi}\int_0^{\pi}\int_0^{2\pi}\Phi(a,\theta',\phi')\,\frac{\partial}{\partial a}\left(\frac{1}{\ell_2} - \frac{a/r}{\ell_1}\right)a^2\sin\theta'\;d\theta'\,d\phi'$$

If $\ell_1$ and $\ell_2$ are expressed as functions of $a$ and $r$ through use of the law of cosines, the integrand may be arranged in a form which is suitable for evaluation.
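The Dirichlet condition $G = 0$ on the sphere can be confirmed numerically for the charge-plus-image construction described above. In the sketch below, which works with $4\pi\epsilon_0 G$, the unit charge sits at an interior point $(r,\theta,\phi)$ and the image of strength $-a/r$ sits at $(a^2/r,\theta,\phi)$. The function names and sample points are illustrative assumptions, and the distances are named by their source (`d_charge`, `d_image`) rather than by the subscripts used in the text.

```python
import math


def to_cartesian(r, theta, phi):
    """Convert spherical coordinates (r, theta, phi) to (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def distance(p, q):
    """Euclidean distance between two Cartesian points."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def green_factor(charge_point, eval_point, a=1.0):
    """4*pi*eps0*G at eval_point for a unit charge at charge_point (0 < r < a).

    Both points are given as (r, theta, phi) tuples.
    """
    r, theta, phi = charge_point
    charge_xyz = to_cartesian(r, theta, phi)
    image_xyz = to_cartesian(a * a / r, theta, phi)
    x = to_cartesian(*eval_point)
    d_charge = distance(x, charge_xyz)
    d_image = distance(x, image_xyz)
    return 1.0 / d_charge - (a / r) / d_image


if __name__ == "__main__":
    inside = (0.4, 0.9, 0.3)
    for th, ph in [(0.1, 0.0), (1.0, 2.0), (2.1, 4.0), (3.0, 5.5)]:
        print(green_factor(inside, (1.0, th, ph)))  # zero on the sphere r = a
```

With $G$ in hand for the spherical geometry, (3.107) reduces any interior Dirichlet problem for the sphere to a surface quadrature.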

3.15 SOLUTIONS TO LAPLACE'S EQUATION IN TWO DIMENSIONS WITH THE USE OF CONFORMAL MAPPING

A large class of two-dimensional potential problems fits the condition that, with proper orientation of the coordinate axes, $\Phi$ is independent of $z$ and Laplace's equation reduces to

$$\frac{\partial^2\Phi}{\partial x^2} + \frac{\partial^2\Phi}{\partial y^2} = 0 \tag{3.111}$$

When this is the case, a powerful method of attack utilizing the theory of functions of a complex variable may be brought to bear on the problem. Using the real variables $x$ and $y$, one may define the complex variable $\mathfrak{z} = x + jy$, in which $j = \sqrt{-1}$. It is then convenient to associate any given value of $\mathfrak{z}$ with a point in the $xy$ plane, as shown in Fig. 3.16a, and thus refer to this as a representation in the complex $\mathfrak{z}$ plane. The coordinates may also be expressed in polar form through the transformation equations

$$R = (x^2 + y^2)^{1/2} \qquad \phi = \tan^{-1}\left(\frac{y}{x}\right) \tag{3.112}$$

and then

$$\mathfrak{z} = Re^{j\phi} \tag{3.113}$$

FIGURE 3.16 Functions in a complex plane. (a) $\mathfrak{z}$ plane; (b) $\mathfrak{w}$ plane.
Imagine that additionally there is a different complex variable defined by the relations

$$\mathfrak{w} = u + jv = \mathcal{R}e^{j\varphi} \tag{3.114}$$

$\mathfrak{w}$ can similarly be represented by points in a complex plane, either in terms of rectangular coordinates $u$, $v$, or polar coordinates $\mathcal{R}$, $\varphi$. (See Figure 3.16b.) If $\mathfrak{w}$ is some function of $\mathfrak{z}$, such that for each assigned value of $\mathfrak{z}$ there is some rule which specifies a corresponding value of $\mathfrak{w}$, this relationship can be symbolized by writing

$$\mathfrak{w} = f(\mathfrak{z}) \tag{3.115}$$

Then if $\mathfrak{z}$ is permitted to vary continuously, its representative point in the $\mathfrak{z}$ plane may trace out a curve $C$, as shown. In general, the corresponding values of $\mathfrak{w}$ will trace out a curve $\mathcal{C}$ in the $\mathfrak{w}$ plane.
A small change $\Delta\mathfrak{z}$ in the independent complex variable $\mathfrak{z}$ will occasion a corresponding change $\Delta\mathfrak{w}$ in the dependent complex variable $\mathfrak{w}$. The derivative of $f(\mathfrak{z})$ may then be defined in the usual way as the limit of $\Delta\mathfrak{w}/\Delta\mathfrak{z}$, namely:

$$\frac{d\mathfrak{w}}{d\mathfrak{z}} = \lim_{\Delta\mathfrak{z}\to 0}\frac{f(\mathfrak{z} + \Delta\mathfrak{z}) - f(\mathfrak{z})}{\Delta\mathfrak{z}} \tag{3.116}$$

However, unlike the case of the derivative of a function of a real variable, there is no unique path in the $\mathfrak{z}$ plane along which $\mathfrak{z} + \Delta\mathfrak{z}$ must approach $\mathfrak{z}$, which is to say that $\Delta\mathfrak{z}$ may have any direction. If the derivative $d\mathfrak{w}/d\mathfrak{z}$ has a unique value at $\mathfrak{z}$, regardless of the path along which $\Delta\mathfrak{z}$ is chosen, the function $\mathfrak{w}(\mathfrak{z})$ is said to be analytic, or regular, at $\mathfrak{z}$.
If the value of the derivative is to be independent of the direction of $\Delta\mathfrak{z}$, the same result must be obtained if $\mathfrak{z}$ is changed solely in the $x$ direction, or solely in the $y$ direction. Since $\mathfrak{w} = u(x,y) + jv(x,y)$, in the former case

$$\frac{d\mathfrak{w}}{d\mathfrak{z}} = \lim_{\Delta x\to 0}\frac{u(x + \Delta x, y) + jv(x + \Delta x, y) - u(x,y) - jv(x,y)}{\Delta x} = \frac{\partial u}{\partial x} + j\frac{\partial v}{\partial x} \tag{3.117}$$

whereas in the latter case

$$\frac{d\mathfrak{w}}{d\mathfrak{z}} = \lim_{\Delta y\to 0}\frac{u(x, y + \Delta y) + jv(x, y + \Delta y) - u(x,y) - jv(x,y)}{j\,\Delta y} = \frac{\partial v}{\partial y} - j\frac{\partial u}{\partial y} \tag{3.118}$$

If $\mathfrak{w}$ is to be analytic at $\mathfrak{z}$, it is therefore a necessary condition that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \tag{3.119}$$

$$\frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \tag{3.120}$$

These are known as the Cauchy-Riemann equations. Since any path through $\mathfrak{z}$ can be expressed as the linear sum of the displacements $\Delta x$ and $j\,\Delta y$, it follows that the Cauchy-Riemann equations are also a sufficient condition that $\mathfrak{w}(\mathfrak{z})$ be analytic at $\mathfrak{z}$.
EXAMPLE 3.27
Consider the function $\mathfrak{w} = \mathfrak{z}^3$, which gives $u + jv = (x + jy)^3 = (x^3 - 3xy^2) + j(3x^2y - y^3)$, so that

$$u = x^3 - 3xy^2 \qquad v = 3x^2y - y^3$$

$$\frac{\partial u}{\partial x} = 3x^2 - 3y^2 \qquad \frac{\partial u}{\partial y} = -6xy$$

$$\frac{\partial v}{\partial x} = 6xy \qquad \frac{\partial v}{\partial y} = 3x^2 - 3y^2$$

The Cauchy-Riemann equations are seen to be satisfied for all values of $x$ and $y$, and thus $\mathfrak{w} = \mathfrak{z}^3$ is an analytic function for all points in the complex $\mathfrak{z}$ plane.
It is left as an exercise to show that $\mathfrak{z}^n$ is analytic ($n \ge 0$) and thus that the series $\sum_{n=0}^{\infty}a_n\mathfrak{z}^n$ is analytic and can therefore be used to represent a general class of regular functions.
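The Cauchy-Riemann test of this example can also be carried out numerically. The sketch below estimates the four partial derivatives by central differences and returns the two residuals $\partial u/\partial x - \partial v/\partial y$ and $\partial v/\partial x + \partial u/\partial y$, which should vanish for an analytic function. The function names, the step size, and the sample point are illustrative choices; the complex conjugate is included as a familiar non-analytic counterexample.

```python
def partials(f, x, y, h=1e-6):
    """Central-difference estimates of du/dx, du/dy, dv/dx, dv/dy for w = f(x + jy)."""
    dfdx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2.0 * h)
    dfdy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2.0 * h)
    return dfdx.real, dfdy.real, dfdx.imag, dfdy.imag


def cauchy_riemann_residuals(f, x, y):
    """Return (du/dx - dv/dy, dv/dx + du/dy); both vanish where f is analytic."""
    ux, uy, vx, vy = partials(f, x, y)
    return ux - vy, vx + uy


if __name__ == "__main__":
    cube = lambda z: z ** 3
    print(partials(cube, 1.2, -0.7))                  # compare 3x^2 - 3y^2, -6xy, 6xy, 3x^2 - 3y^2
    print(cauchy_riemann_residuals(cube, 1.2, -0.7))  # both near zero
    print(cauchy_riemann_residuals(lambda z: z.conjugate(), 1.2, -0.7))  # first residual is 2
```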

When the complex derivative defined by (3.116) exists, it may be found by the same rules which are used for functions of a real variable. As examples

$$\frac{d}{d\mathfrak{z}}(\cos\mathfrak{z}) = -\sin\mathfrak{z} \qquad \frac{d}{d\mathfrak{z}}(\ln\mathfrak{z}) = \mathfrak{z}^{-1}$$

Returning to the Cauchy-Riemann equations, if (3.119) is differentiated with respect to $x$ and (3.120) with respect to $y$ and the difference taken, one obtains

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{3.121}$$

Alternatively, if the differentiation is reversed, there results

$$\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0 \tag{3.122}$$

Thus both the real and imaginary components of an analytic function of a complex variable satisfy Laplace's equation in two dimensions. It is for this reason that the theory of functions of a complex variable is so rich in applications to potential problems.
For a problem in which one of the two components of $\mathfrak{w}$ is chosen to represent the potential function $\Phi$, it is interesting to note that the other component is related to the electric flux. To see this relation, let $u(x,y)$ be chosen as the potential function for a particular two-dimensional problem. Then since $\mathbf{D}_0 = -\epsilon_0\nabla u$,

$$D_{0x} = -\epsilon_0\frac{\partial u}{\partial x} \qquad D_{0y} = -\epsilon_0\frac{\partial u}{\partial y} \tag{3.123}$$

and the vector

$$\mathbf{1}_x\frac{\partial u}{\partial x} + \mathbf{1}_y\frac{\partial u}{\partial y} \tag{3.124}$$

is perpendicular to the contour $u(x,y)$ = constant and tangent to the flux line which passes through the point $(x,y)$. But

$$dv = \frac{\partial v}{\partial x}dx + \frac{\partial v}{\partial y}dy$$

which, with the aid of the Cauchy-Riemann equations, can be written

$$dv = -\frac{\partial u}{\partial y}dx + \frac{\partial u}{\partial x}dy \tag{3.125}$$

In moving along a contour $v$ = constant, $dv = 0$ and the displacements $dx$ and $dy$ are in the ratio $(\partial u/\partial x) : (\partial u/\partial y)$, which is a vector parallel to (3.124). Thus the lines $v$ = constant coincide with the electric flux lines.
Further, combination of (3.123) with (3.125) gives

$$dv = \frac{1}{\epsilon_0}\left(D_{0y}\,dx - D_{0x}\,dy\right) \tag{3.126}$$

If $d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy$ is the general displacement implied in (3.126), then a surface element

$$d\mathbf{S} = d\boldsymbol{\ell}\times\mathbf{1}_z = \mathbf{1}_x\,dy - \mathbf{1}_y\,dx$$

can be composed which is a rectangle, with a side $d\ell$ in the $xy$ plane and a side of unit length in the $z$ direction. The flux through this surface element is

$$d\psi = \mathbf{D}_0\cdot d\mathbf{S} = D_{0x}\,dy - D_{0y}\,dx \tag{3.127}$$

Comparison of (3.126) and (3.127) reveals that

$$d\psi = -\epsilon_0\,dv \tag{3.128}$$

and, except for an integration constant (which can be made zero by choosing the reference for flux at $v = 0$),

$$\psi = -\epsilon_0 v \tag{3.129}$$

This equation can be interpreted as saying that the total electric flux between the contours $v = v_1$ and $v = v_2$, per unit length in the $z$ direction, is $-\epsilon_0(v_2 - v_1)$. Not only are the contours of $v$ the flux lines themselves, but the value of $v$ can be made a measure of the total flux.
If the foregoing is represented in the complex $\mathfrak{w}$ plane of Figure 3.16b, a very simple picture emerges. The horizontal equispaced grid of lines $v$ = constant traces the electric flux density, and the vertical equispaced grid of lines $u$ = constant gives the equipotentials. This is the electrostatic field picture for the region between parallel conducting plates which have been oppositely charged (cf. Example 3.17). However, the connection to real space requires the knowledge of $u$ and $v$ as functions of $x$ and $y$.
This connection may be viewed in terms of the transformation function $\mathfrak{w} = f(\mathfrak{z})$ relating the contours $C$ and $\mathcal{C}$, as described earlier and shown in Figures 3.16a and b. If $\mathcal{C}$ is a line $u$ = constant, then $C$ will be an equipotential contour in real space; if $\mathcal{C}$ is a line $v$ = constant, $C$ will be a flux line in real space. If the correct transformation equation $f(\mathfrak{z})$ is found, these equipotentials and flux lines will be the solution to the problem under consideration.
It has been seen already that the function $f(\mathfrak{z})$ must be analytic if $u$ and $v$ are to satisfy Laplace's equation. But then the derivative must be independent of path and $d\mathfrak{w}$ is uniquely related to $d\mathfrak{z}$ by the expression

$$d\mathfrak{w} = f'(\mathfrak{z})\,d\mathfrak{z} \tag{3.130}$$

If $f'(\mathfrak{z})$ is considered in polar form, so that $f'(\mathfrak{z}) = Ae^{j\alpha}$, then Equation (3.130) states that the magnitude of $d\mathfrak{w}$ is $A$ times the magnitude of $d\mathfrak{z}$ and that the angle of $d\mathfrak{w}$ is the angle of $d\mathfrak{z}$ augmented by $\alpha$. Therefore the entire infinitesimal region in the neighborhood of a point $\mathfrak{w}$ is similar to the infinitesimal region in the vicinity of the corresponding point $\mathfrak{z}$, merely being magnified by the scale factor $A$ and rotated through an angle $\alpha$. For this reason, if two curves $C$ and $C'$ intersect at a certain angle in the $\mathfrak{z}$ plane, the transformed curves $\mathcal{C}$ and $\mathcal{C}'$ will intersect at the same angle in the $\mathfrak{w}$ plane, since both have been rotated by the angle $\alpha$. Transformations possessing this property of angle preservation are said to be conformal, and every analytic function is therefore a conformal transformation. As a particular example of this conformal property, the angle between a flux line and an equipotential contour in the $\mathfrak{z}$ plane is 90 deg; the angle between a $u$ line and a $v$ line in the $\mathfrak{w}$ plane is also 90 deg.
The problem of determining a two-dimensional electrostatic field distribution is thus seen to be equivalent to finding the correct analytic function $f(\mathfrak{z})$ which will transform the sought-for flux-potential map into a simple rectangular grid in the $\mathfrak{w}$ plane. As is so often the case in analysis, the inverse problem is simpler, namely, to study a known function $f(\mathfrak{z})$ and see what physical potential problem it represents.
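The angle-preserving property described above can be illustrated with a small numerical experiment. In the sketch below, two short displacements through a point of the $\mathfrak{z}$ plane are mapped by an analytic function, and the angle between the images is compared with the angle between the originals; the magnification $A = |f'(\mathfrak{z})|$ is recovered at the same time. The function names, the choice $f(\mathfrak{z}) = \mathfrak{z}^3$, and the sample point are illustrative assumptions.

```python
import cmath


def image_direction(f, z0, direction, h=1e-6):
    """Central-difference image, per unit length, of a short step along `direction` at z0."""
    step = h * direction / abs(direction)
    return (f(z0 + step) - f(z0 - step)) / (2.0 * h)


def angle_between(d1, d2):
    """Signed angle from direction d1 to direction d2, in radians."""
    return cmath.phase(d2 / d1)


if __name__ == "__main__":
    f = lambda z: z ** 3
    z0 = 1.0 + 0.5j
    d1, d2 = 1.0 + 0.0j, cmath.exp(0.8j)  # two directions 0.8 rad apart
    w1 = image_direction(f, z0, d1)
    w2 = image_direction(f, z0, d2)
    print(angle_between(d1, d2), angle_between(w1, w2))  # the two angles agree
    print(abs(w1), 3.0 * abs(z0) ** 2)                   # magnification A = |f'(z0)|
    print(cmath.phase(w1), 2.0 * cmath.phase(z0))        # rotation alpha = arg f'(z0)
```

The property fails only where $f'(\mathfrak{z})$ vanishes or does not exist, which for this choice of function is the origin.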

EXAMPLE 3.28
If $n$ is a positive real number (not necessarily an integer) the function

$$\mathfrak{w} = \mathfrak{z}^n = R^n e^{jn\phi}$$

is analytic and has real and imaginary components given by

$$u = R^n\cos n\phi$$
$$v = R^n\sin n\phi$$

It is evident from the exponential form of this function that a semi-infinite straight line, drawn from the origin in the $\mathfrak{z}$ plane at an angle $\phi$, will transform into a semi-infinite straight line, drawn from the origin in the $\mathfrak{w}$ plane at an angle $n\phi$. Therefore this transformation is useful in determining the fields near conducting corners. As examples, consider the grounded interior and exterior corners shown in the figure. For the interior corner, the boundaries are the semi-infinite lines at $\phi = 0, \pi/2$. If $v$ is chosen as the potential and $n = 2$, these lines transform into the $\mathfrak{w}$ plane as the two halves of the $v = 0$ line, thus satisfying the condition of zero potential. Then in the $\mathfrak{z}$ plane, the potential distribution is

$$v(R,\phi) = R^2\sin 2\phi$$

and the flux lines can be found by letting $u$ = constant. Both fields are plotted in the figure.
Similarly for the exterior angle, since the boundaries are the semi-infinite lines at $\phi = 0, 3\pi/2$, if $n = \tfrac{2}{3}$, once again these boundaries transform into the $\mathfrak{w}$ plane as the two halves of the $v = 0$ line. The potential distribution is therefore

$$v(R,\phi) = R^{2/3}\sin\tfrac{2}{3}\phi$$

and $u$ = constant gives the flux lines. These fields are also shown in the figure.
This solution is applicable to corners of any angle.
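To make the closing remark concrete, the sketch below evaluates the corner potential $v = R^n\sin n\phi$ for a corner whose conducting faces lie at $\phi = 0$ and $\phi = \beta$. It assumes the exponent is chosen as $n = \pi/\beta$, which reproduces $n = 2$ for the interior right angle and $n = 2/3$ for the exterior corner treated above. A finite-difference Laplacian in polar coordinates is included as a check; the function names and step size are illustrative.

```python
import math


def corner_potential(R, phi, corner_angle):
    """v(R, phi) = R^n sin(n phi), with n = pi / corner_angle (assumed choice of n)."""
    n = math.pi / corner_angle
    return R ** n * math.sin(n * phi)


def laplacian_polar(func, R, phi, h=1e-3):
    """Finite-difference estimate of the two-dimensional Laplacian in polar coordinates."""
    d2R = (func(R + h, phi) - 2.0 * func(R, phi) + func(R - h, phi)) / h ** 2
    dR = (func(R + h, phi) - func(R - h, phi)) / (2.0 * h)
    d2phi = (func(R, phi + h) - 2.0 * func(R, phi) + func(R, phi - h)) / h ** 2
    return d2R + dR / R + d2phi / R ** 2


if __name__ == "__main__":
    interior, exterior = math.pi / 2.0, 1.5 * math.pi
    # Both faces of each corner are at zero potential.
    print(corner_potential(2.0, 0.0, interior), corner_potential(2.0, interior, interior))
    print(corner_potential(2.0, 0.0, exterior), corner_potential(2.0, exterior, exterior))
    # The potential satisfies Laplace's equation between the faces.
    ext = lambda R, phi: corner_potential(R, phi, exterior)
    print(laplacian_polar(ext, 1.5, 1.0))
```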

EXAMPLE 3.29
Next consider the function

$$\mathfrak{w} = \cos^{-1}\mathfrak{z}$$

which gives

$$x + jy = \cos(u + jv) = \cos u\cosh v - j\sin u\sinh v$$
$$x = \cos u\cosh v$$
$$y = -\sin u\sinh v$$

from which it follows that

$$\frac{x^2}{\cosh^2 v} + \frac{y^2}{\sinh^2 v} = 1$$

$$\frac{x^2}{\cos^2 u} - \frac{y^2}{\sin^2 u} = 1$$

The first of these equations, for constant $v$, gives a family of confocal ellipses with the foci at $x = \pm 1$. The second equation, for constant $u$, yields a family of confocal hyperbolas. The two sets of contours are orthogonal, as shown in the first figure.
Inspection of this figure reveals a variety of problems which may be solved by this transformation. If $v$ is chosen to represent the potential, one can solve for: (1) the field between two confocal elliptic cylinders, or between an elliptic cylinder and a flat strip stretched between its foci; (2) the field external to a charged elliptic cylinder, including the limiting case of a flat strip extending between the foci.
If $u$ is chosen to represent the potential, one can solve for: the field between two confocal hyperbolic cylinders, including the cases that one or both of them is a plane ($u = \pi/2$ and/or $u = 0$ and/or $u = \pi$). The special cases include two perpendicular charged plates separated by a gap and two coplanar charged plates separated by a gap.
As an illustration, consider the case of two semi-infinite conducting planes, both lying in the $xz$ plane, and separated by a gap of width $d$, as shown in the second figure. The two conductors are assumed to be equally but oppositely charged, with the right plate at potential zero and the left plate at potential $V_0$.
Choosing $u$ to represent the potential function, one sees that if $d = 2$ and $V_0 = \pi$, the preceding development applies without modification and $u(x,y)$ is given implicitly by

$$\frac{x^2}{\cos^2 u} - \frac{y^2}{\sin^2 u} = 1$$

However, it is a simple matter to scale this solution since if

$$\nabla^2\Phi(x,y) = 0$$

then

$$\nabla^2\left[K\Phi(kx,ky)\right] = 0$$

with $K$ and $k$ arbitrary constants. In other words, both potential and distance can be scaled linearly without affecting a solution to Laplace's equation. Therefore if general values of $d$ and $V_0$ are used in this problem, the solution for $u(x,y)$ is contained in the equation

$$\frac{x^2}{\cos^2\left(\dfrac{\pi u}{V_0}\right)} - \frac{y^2}{\sin^2\left(\dfrac{\pi u}{V_0}\right)} = \left(\frac{d}{2}\right)^2$$

The flux lines, given by $v$ = constant, scale similarly and both fields are sketched in the figure in the upper half of space. The plots in the lower half would be similar.
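The implicit relation for $u(x,y)$ can be avoided in numerical work by evaluating the inverse cosine directly. The sketch below scales the coordinates by $2/d$, takes the real part of $\cos^{-1}$ of the resulting complex number, and scales the result by $V_0/\pi$. It relies on the principal branch of `cmath.acos`, whose real part lies between 0 and $\pi$; the function name and default values are illustrative assumptions.

```python
import cmath
import math


def gap_potential(x, y, V0=math.pi, d=2.0):
    """Potential u(x, y) of two coplanar plates separated by a gap of width d.

    The right plate (x >= d/2, y = 0) is at zero and the left plate at V0.
    """
    w = cmath.acos(complex(2.0 * x / d, 2.0 * y / d))
    return V0 * w.real / math.pi


if __name__ == "__main__":
    # With d = 2 and V0 = pi the point must lie on the hyperbola of the unscaled solution.
    u = gap_potential(0.3, 0.4)
    print(0.3 ** 2 / math.cos(u) ** 2 - 0.4 ** 2 / math.sin(u) ** 2)  # close to 1
    print(gap_potential(0.0, 0.0, V0=10.0, d=0.5))   # midway across the gap: V0/2
    print(gap_potential(2.0, 0.0, V0=10.0))          # on the right plate: 0
    print(gap_potential(-2.0, 0.0, V0=10.0))         # on the left plate: V0
```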

3.16 THE SCHWARZ TRANSFORMATION

Example 3.28 dealt with the function $\mathfrak{w} = \mathfrak{z}^n$, which was seen to map an angular section of the $\mathfrak{z}$ plane into the upper half of the $\mathfrak{w}$ plane. This angular section was bounded by the semi-infinite lines $\phi = 0$ and $\phi = \pi/n$ (cf. Equation (3.112) for the meaning of $\phi$) and was thus controlled by $n$. Field distributions for grounded corners could be deduced readily by means of this transformation.
A generalization of this technique is known as the Schwarz transformation and permits the interior of a polygon in the $\mathfrak{z}$ plane to be mapped into the upper half of the $\mathfrak{w}$ plane. (The polygon need not be closed.) To see how this is accomplished, consider the inverse transformation

$$\frac{d\mathfrak{z}}{d\mathfrak{w}} = K(\mathfrak{w} - a_1)^{\alpha_1}(\mathfrak{w} - a_2)^{\alpha_2}\cdots(\mathfrak{w} - a_N)^{\alpha_N} \tag{3.131}$$

in which $K$ is a constant, possibly complex, and the parameters $a_n$, $\alpha_n$ are real, but as yet otherwise undetermined. This transformation is analytic everywhere except at the points $\mathfrak{w} = a_n$. Therefore if $\mathfrak{w}$ is caused to trace out the segment of the real axis in the $\mathfrak{w}$ plane which lies between $a_{n-1}$ and $a_n$, Equation (3.131) indicates the phase of $d\mathfrak{z}$ is constant and thus the corresponding contour in the $\mathfrak{z}$ plane is also a straight-line segment.
The angle $\phi_n$ which this $\mathfrak{z}$ segment makes with the real axis is equal to the argument of $d\mathfrak{z}/d\mathfrak{w}$ evaluated at the $n$th segment. But this is given by

$$\arg\left(\frac{d\mathfrak{z}}{d\mathfrak{w}}\right) = \arg K + \alpha_1\arg(\mathfrak{w} - a_1) + \cdots + \alpha_N\arg(\mathfrak{w} - a_N)$$

If the points $a_n$ are graduated so that $a_{n-1} < a_n$ for all $n$, and if the $n$th segment is bounded by $a_{n-1}$ and $a_n$, then it follows that for any point on the real axis, $\arg(\mathfrak{w} - a_n)$ equals $\pi$ or zero according to whether or not $\mathfrak{w} < a_n$. Thus

$$\phi_n = \arg K + \pi(\alpha_n + \alpha_{n+1} + \cdots + \alpha_N) \tag{3.132}$$

If (3.132) is subtracted from a similar expression for $\phi_{n+1}$ one obtains

$$\phi_{n+1} - \phi_n = -\pi\alpha_n$$

By referring to Figure 3.17, one sees that all points on the real axis segment $a_n - a_{n-1}$ in the $\mathfrak{w}$ plane map into a line segment of slope $\phi_n$ in the $\mathfrak{z}$ plane; similarly the segment $a_{n+1} - a_n$ maps into the segment of slope $\phi_{n+1}$. The interior angle $\beta_n$ is given by

$$\beta_n = \pi - (\phi_{n+1} - \phi_n) = \pi(1 + \alpha_n) \tag{3.133}$$

Hence Equation (3.131) may be written

$$\frac{d\mathfrak{z}}{d\mathfrak{w}} = K\prod_{n=1}^{N}(\mathfrak{w} - a_n)^{(\beta_n/\pi) - 1} \tag{3.134}$$

This is the Schwarz transformation for a polygon with internal angles $\beta_n$. The complex constant $K$ controls the relative scale and orientation of the figure in the $\mathfrak{z}$ plane.

FIGURE 3.17 Mapping of a polygon. (a) $\mathfrak{z}$ plane; (b) $\mathfrak{w}$ plane.

This transformation is useful if (3.134) is easily integrable. The difficulties obviously increase with the number of sides, and the inverse nature of the transformation causes some inconvenience, in that it would be more desirable to use as independent variables the $\mathfrak{z}$ coordinates which describe the polygon under consideration. Despite these limitations, the approach is a powerful one and will yield the field distributions around a variety of grounded segmented shapes. If one or more of the vertices of the polygon are at infinity, different parts of the boundary need not be at the same potential.
EXAMPLE 3.30
The field between two semi-infinite conducting planes charged to a potential difference $V_0$ may be solved by using the Schwarz transformation. With reference to the figure, part (a) shows the actual geometry being considered, and part (b) shows a polygon† in the $\mathfrak{z}$ plane which will tend to the actual geometry as the points $b_2$ and $b_4$ tend to $-\infty$. This limiting polygon can be mapped into a $\mathfrak{w}$ plane by recognizing that the "interior" angles tend to the limits $\beta_2 = 0$, $\beta_1 = \beta_3 = \beta_4 = 2\pi$. Then by use of (3.134)

$$\mathfrak{z} = K\int(\mathfrak{w} - a_1)(\mathfrak{w} - a_2)^{-1}(\mathfrak{w} - a_3)(\mathfrak{w} - a_4)\,d\mathfrak{w} + K'$$

in which $K'$ is a constant of integration. This may be written

$$\mathfrak{z} = -Ka_4\int\frac{(\mathfrak{w} - a_1)(\mathfrak{w} - a_3)}{\mathfrak{w} - a_2}\left(1 - \frac{\mathfrak{w}}{a_4}\right)d\mathfrak{w} + K'$$

† This polygon is exterior to the contour shown so as to include the region in which the field is desired, and comprises the nondotted area of the plane.

Figure panels: (a) $\mathfrak{z}$ plane; (b) $\mathfrak{z}$ plane; (c) $\mathfrak{w}$ plane; (d) $\mathfrak{w}$ plane.

If now $a_4 \to \infty$ so that the entire real axis in the $\mathfrak{w}$ plane may be utilized, then $K$ can be permitted to tend to zero in such a way that $-K''$ is the limiting value of $Ka_4$. Upon making the further choices $a_1 = -1$, $a_2 = 0$, $a_3 = +1$, the above expression becomes

$$\mathfrak{z} = K''\int\frac{\mathfrak{w}^2 - 1}{\mathfrak{w}}\,d\mathfrak{w} + K'$$

and the $u$ axis is divided into four segments as shown in part (c) of the figure. Integration gives

$$\mathfrak{z} = K''\left[\frac{\mathfrak{w}^2}{2} - \ln\mathfrak{w}\right] + K'$$

The constants may be evaluated by using the information that $\mathfrak{z} = 0 + j1$ when $\mathfrak{w} = -1$ and that $\mathfrak{z} = 0 + j0$ when $\mathfrak{w} = +1$. This gives

$$\mathfrak{z} = \frac{1}{\pi}\left[\frac{1 - \mathfrak{w}^2}{2} + \ln\mathfrak{w}\right]$$

as the final form for the transformation.
It may be observed that the negative half of the $u$ axis is an equipotential of value $V_0$ and that the positive half of the $u$ axis is an equipotential of value zero. Therefore the potential distribution in the upper half of the $\mathfrak{w}$ plane is given by

$$\Phi = \frac{V_0}{\pi}\,\varphi$$

in which $\varphi$ is measured counterclockwise from the $u$ axis. If $\mathfrak{w}$ is written in the form $\mathfrak{w} = \mathcal{R}e^{j\varphi}$, setting $\varphi$ = constant traces out an equipotential, whereas setting $\mathcal{R}$ = constant traces out a flux line. This is shown in part (d) of the first figure. The corresponding $\mathfrak{z}$ traces may be found from the transformation, and lead to the flux map shown in the second figure.
This solution may be used to deduce the fringing capacitance of a parallel plate condenser. (Cf. Example 3.32.)
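The final transformation of this example is easily explored numerically. The sketch below evaluates $\mathfrak{z} = (1/\pi)[(1 - \mathfrak{w}^2)/2 + \ln\mathfrak{w}]$ with the principal branch of the logarithm, confirms the two conditions used to fix the constants, and shows that the negative and positive halves of the $u$ axis map onto the lines $y = 1$ and $y = 0$ respectively, with $x \le 0$ on both. The potential in the $\mathfrak{w}$ plane is taken as $V_0\varphi/\pi$, as follows from the boundary values on the $u$ axis. Function names are illustrative assumptions.

```python
import cmath
import math


def z_of_w(w):
    """Map a point of the upper half w plane into the z plane (principal logarithm)."""
    return ((1.0 - w * w) / 2.0 + cmath.log(w)) / math.pi


def potential_in_w(w, V0=1.0):
    """Potential V0 * phi / pi, with phi measured counterclockwise from the u axis."""
    return V0 * cmath.phase(w) / math.pi


if __name__ == "__main__":
    print(z_of_w(complex(1.0, 0.0)))    # 0 + j0
    print(z_of_w(complex(-1.0, 0.0)))   # 0 + j1
    for t in (0.2, 0.5, 2.0, 5.0):
        # Negative u axis maps to the plate at y = 1, positive u axis to the plate at y = 0.
        print(t, z_of_w(complex(-t, 0.0)), z_of_w(complex(t, 0.0)))
    # One equipotential (phi = pi/2, potential V0/2) traced into the z plane.
    for radius in (0.5, 1.0, 2.0):
        w = cmath.rect(radius, math.pi / 2.0)
        print(potential_in_w(w), z_of_w(w))
```

Points on the negative $u$ axis are entered as `complex(-t, 0.0)` so that the logarithm returns the phase $+\pi$ appropriate to the upper half plane.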

3.17 CAPACITANCE

The concept of electric flux which originates on positive charge and terminates on negative charge has already been introduced, and has been seen to be a useful pictorialization of Gauss' law. It also has been noted that when charge resides statically on a conductor, it does so on the outer surface. The flux lines are then normal to the surface on the vacuum side of the interface and are confined to the vacuum side, there being no field within the conductor. This fact led to the conclusion that at each point on the outer surface of the conductor $D_0 = \sigma$, with $D_0$ the flux density and $\sigma$ the surface charge density.

FIGURE 3.18 Capacitance of two arbitrary conductors.

Let these ideas be applied to the case of two conductors in free space, as shown in Fig. 3.18. Each conductor has an arbitrary shape and their relative position and orientation is also arbitrary. If one conductor contains a net excess charge $Q$ and the other a net excess charge $-Q$, all the flux lines leaving one conductor terminate on the other, as suggested by the figure. From a knowledge of $\mathbf{D}_0(x,y,z)$ in the intervening space, one could deduce $\mathbf{E}(x,y,z)$ and then determine the potential difference between the two conductors by computing the line integral of longitudinal $\mathbf{E}$ along an arbitrary path extending from one conductor to the other.
Suppose that one were to double $Q$ and $-Q$. This would double $\mathbf{D}_0$ everywhere and double $\mathbf{E}$ everywhere, thus doubling the voltage difference between the two conductors. From this it follows that the ratio of the charge on one of the conductors to the voltage between them is a constant. This constant is a useful index of the charge storage capability and is called capacitance.†
The conclusion just reached is valid for arbitrary conductors and the general
definition of electrostatic capacitance is therefore

$$C_0 = \frac{Q}{V} \tag{3.135}$$

in which $Q$ is the charge in coulombs and $V$ is the potential difference in volts. The
subscript on $C_0$ is a reminder that the intervening space is a vacuum. Capacitance is
measured in units of coul/volt, more commonly called farads.
EXAMPLE 3.31
Other illustrative examples have included a variety of situations involving two conductors
containing equal and opposite charges. Making use of the results of Example 3.9, one may
conclude that the capacitance per unit length of two concentric cylinders is

$$C_0 = \epsilon_0\,\frac{2\pi}{\ln (b/a)}$$

From Example 3.14, the capacitance per unit length of two parallel tubular conductors of
radius $a$ and spacing $D$ is

$$C_0 = \epsilon_0\,\frac{\pi}{\cosh^{-1}(D/2a)}$$

From Example 3.17, the capacitance per unit area of two closely spaced parallel plates is

$$C_0 = \epsilon_0\,\frac{1}{l}$$

These expressions for capacitance all contain the multiplicative factor $\epsilon_0$ and indicate
why the units for $\epsilon_0$ are ordinarily taken as farads/m.
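
The three closed-form results of Example 3.31 can be evaluated with a few lines of Python. This is only a sketch: the function names and the sample dimensions are illustrative assumptions, not part of the text, and the spacing $D$ of the parallel cylinders is assumed here to be measured center to center.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, farads/m


def coax_per_length(a, b):
    """Concentric cylinders, inner radius a and outer radius b (farads/m)."""
    return 2.0 * math.pi * EPS0 / math.log(b / a)


def two_wire_per_length(a, spacing):
    """Two parallel cylinders of radius a (farads/m).

    Assumption: spacing is the center-to-center distance D.
    """
    return math.pi * EPS0 / math.acosh(spacing / (2.0 * a))


def parallel_plate_per_area(l):
    """Closely spaced parallel plates a distance l apart (farads/m^2)."""
    return EPS0 / l


# Hypothetical dimensions in meters, chosen only for illustration.
print("coaxial  :", coax_per_length(1.0e-3, 5.0e-3), "F/m")
print("two-wire :", two_wire_per_length(1.0e-3, 20.0e-3), "F/m")
print("plates   :", parallel_plate_per_area(1.0e-3), "F/m^2")
```

Each function returns a quantity proportional to $\epsilon_0$, which is the point made in the closing sentence of the example.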

EXAMPLE 3.32
The expression given above for capacitance per unit area of two parallel plates leads to the
approximate result

$$C_0 = \epsilon_0\,\frac{A}{l}$$

as the total capacitance, if the area of each plate is $A$. However, this result neglects fringing
and assumes that the $D_0$ field is uniform and does not extend beyond the plate edges, as
suggested by the figure. The true field is more like the flux map shown in Example 3.30,
which indicates an extension of the field beyond the plate edges and some additional charge
storage on the back sides of the plates. A more accurate expression for capacitance may be
deduced with the aid of the Schwarz transformation there derived.

† The two conductors are said to comprise a capacitor or condenser.

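[Figure: two semi-infinite parallel plates in the $z$ plane, one at $\Phi = V_0$ and the other at $\Phi = 0$, with the $x$ axis indicated.]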

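For two semi-infinite parallel plates, separated in distance by an amount $l$, and in potential
by an amount $V_0$, a transformation to the $w$ plane led to the result that

$$\Phi = \frac{V_0}{\pi}\,\varphi$$

with $w = R e^{j\varphi}$. The charge density in $w$ space is therefore

$$\sigma = -\epsilon_0\,\frac{\partial \Phi}{\partial n} = -\frac{\epsilon_0}{u}\,\frac{\partial \Phi}{\partial \varphi} = -\frac{\epsilon_0 V_0}{\pi u}$$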

The total charge on the lower plate, per unit length perpendicular to the $z$ plane, between
$x = 0$ and $x = \xi$, and accounting for both the inner and outer surfaces, is

$$Q(\xi) = -\frac{\epsilon_0 V_0}{\pi}\int_{u_1}^{u_2}\frac{du}{u}$$

in which $u_1$ and $u_2$ straddle the point $u = a_3$ in such a way that

$$\xi + j0 = \frac{l}{\pi}\left[\frac{1 - u_1^2}{2} + \ln u_1\right] = \frac{l}{\pi}\left[\frac{1 - u_2^2}{2} + \ln u_2\right]$$

these expressions arising from the Schwarz transformation.


If $|\xi| \gg l$, a good approximate solution may be obtained to these transcendental
equations, for then $u_1 \ll 1$ and $u_2 \gg 1$. Hence $(1 - u_1^2)/2$ is negligible compared to $\ln u_1$ and
$\ln u_2 + \tfrac{1}{2}$ may be neglected in comparison to $-u_2^2/2$. Thus

$$\ln u_1 \approx \frac{\pi \xi}{l} \qquad\qquad \ln u_2 \approx \frac{1}{2}\ln\left(-\frac{2\pi \xi}{l}\right)$$

$$Q(\xi) \approx -\frac{\epsilon_0 V_0}{\pi}\left[\frac{\pi |\xi|}{l} + \frac{1}{2}\ln\left(\frac{2\pi |\xi|}{l}\right)\right]$$

The first term in this expression for charge is the value which would occur if there were no
fringing and the second term is therefore the correction due to fringing.
This result may be applied to the practical problem of a parallel plate capacitor of area
$A = ab$ and spacing $l$. If it is assumed that $a \gg l$, $b \gg l$, the fringing charge is approximately

$$Q_f \approx -\frac{\epsilon_0 V_0}{\pi}\left[b\,\ln\frac{\pi a}{l} + a\,\ln\frac{\pi b}{l}\right]$$

in which the effects at the four edges have been summed, with $|\xi|$ chosen as $a/2$ or $b/2$
as appropriate. The total capacitance is then given by

$$C_0 = \epsilon_0\,\frac{A}{l}\left[1 + \frac{\ln(\pi a/l)}{\pi a/l} + \frac{\ln(\pi b/l)}{\pi b/l}\right]$$

As a specific illustration of this result, if the two plates are 2 in. × 4 in. and spaced 0.1 in.
apart, the capacitance is about 10 percent higher due to fringing.
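
The 10 percent figure can be checked directly from the bracketed factor in the capacitance formula, $1 + \ln(\pi a/l)/(\pi a/l) + \ln(\pi b/l)/(\pi b/l)$. The short sketch below does this; the function name is an assumption made for illustration, and the formula is only meaningful when $a \gg l$ and $b \gg l$.

```python
import math


def fringing_factor(a, b, l):
    """Ratio of the fringing-corrected capacitance to eps0 * A / l.

    a, b: plate dimensions; l: plate spacing (same length units).
    Valid only for a >> l and b >> l, as assumed in the example.
    """
    xa = math.pi * a / l
    xb = math.pi * b / l
    return 1.0 + math.log(xa) / xa + math.log(xb) / xb


# Plates 2 in. x 4 in., spaced 0.1 in. apart (only ratios of lengths matter).
factor = fringing_factor(2.0, 4.0, 0.1)
print(f"capacitance is {100.0 * (factor - 1.0):.1f} percent higher due to fringing")
```

Only the ratios $a/l$ and $b/l$ enter, so the dimensions may be left in inches.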

3.18 MULTICAPACITOR SYSTEMS

The results just obtained for a capacitor consisting of two conducting bodies oppositely
charged may be extended to the case of many conductors. Let otherwise empty space
be populated by $N$ conducting bodies whose outer surfaces are designated as $S_n$. These
conducting bodies may have arbitrary size, shape, orientation, and position, and their
general distribution is suggested by Figure 3.19. Without loss of generality, one of the
conductors may be considered so vast as to be an "earth," that is, an infinite reservoir
of both types of charge, and at potential zero.

FIGURE 3.19 A system of conducting bodies.
As consequences of the uniqueness theorem for solutions to Laplace's equation,
two general propositions may be established for this system of conductors:

1. If the electrostatic potential of every conductor is specified, there is only one
distribution of electric charges which will yield these potentials.
2. If the total charge on each conductor is specified, there is only one way in which
the charges can distribute themselves over the surfaces $S_n$ in order to be in
equilibrium.

The first proposition is based on the fact that, if all the boundary potentials are
prescribed, a unique electrostatic potential distribution, $\Phi(x,y,z)$, exists in the intervening
space. But then a unique $\partial\Phi/\partial n$ is established everywhere, including all points
contiguous to $S_n$. Since the charge distribution $\sigma$ is proportional to $\partial\Phi/\partial n$ at such points, it
follows that $\sigma$ is a unique distribution over all the conducting boundaries $S_n$.
The second proposition may be established by a similar argument. Each conductor
is an equipotential surface once its total charge is in an equilibrium distribution. But
this leads to a unique $\Phi(x,y,z)$, a unique $\partial\Phi/\partial n$, and thus a unique $\sigma$, with $\int_{S_n} \sigma\, dS_n$
equalling the total charge on the $n$th body.
These two results may be summarized by saying that the distribution of electric
charge over the outer conducting surfaces $S_n$ is fully specified if one knows either (1) the
potential of each conductor or (2) the total charge on each conductor.
Now suppose that there are two equilibrium distributions of charge:

1. A distribution $\sigma$ giving total charges $Q_1, Q_2, \ldots$ on the different conductors,
with their potentials being $V_1, V_2, \ldots$
2. A distribution $\sigma'$ giving total charges $Q_1', Q_2', \ldots$ on the different conductors,
with their potentials being $V_1', V_2', \ldots$

Since Laplace's equation is linear, these distributions may be superposed with the
result that $\sigma + \sigma'$ will give a total charge $Q_n + Q_n'$ on $S_n$, its potential being $V_n + V_n'$.
Clearly this conclusion may be extended to the superposition of any number of charge
distributions.
As a particular application of the foregoing, if charges $Q_1, Q_2, \ldots, Q_N$ give rise
to potentials $V_1, V_2, \ldots, V_N$ on the $N$ conducting bodies, then charges $kQ_1, kQ_2,
\ldots, kQ_N$ will cause potentials $kV_1, kV_2, \ldots, kV_N$, with $k$ any real constant.
Suppose next that a positive unit charge is placed on the first conductor with all
other conductors left uncharged, and that this produces the potentials

$$p_{11},\ p_{21},\ \ldots,\ p_{N1}$$

on the $N$ conductors respectively. Then if a charge $Q_1$ is placed on the first conductor,
with all others left uncharged, this will cause potentials

$$p_{11}Q_1,\ p_{21}Q_1,\ \ldots,\ p_{N1}Q_1$$

Similarly, if placing a positive unit charge on the $n$th conductor and leaving the
others uncharged produces potentials

$$p_{1n},\ p_{2n},\ \ldots,\ p_{Nn}$$

then placing $Q_n$ on $S_n$ and maintaining the other bodies uncharged will yield potentials

$$p_{1n}Q_n,\ p_{2n}Q_n,\ \ldots,\ p_{Nn}Q_n$$

If these distributions are superposed, the effect of charges $Q_1, Q_2, \ldots, Q_N$ on the
$N$ bodies is to cause their potentials to become $V_1, V_2, \ldots, V_N$ where

$$\begin{aligned}
V_1 &= p_{11}Q_1 + p_{12}Q_2 + \cdots + p_{1N}Q_N \\
V_2 &= p_{21}Q_1 + p_{22}Q_2 + \cdots + p_{2N}Q_N \\
&\ \,\vdots \\
V_N &= p_{N1}Q_1 + p_{N2}Q_2 + \cdots + p_{NN}Q_N
\end{aligned} \tag{3.136}$$


These equations give the potentials in terms of the charges. The factors $p_{ij}$ are called
coefficients of potential; they are purely geometrical quantities which depend on the
size, shape, orientation, and position of the various conductors. Except for a few simple
geometries, the calculation of the coefficients is quite involved, but their values may
be deduced experimentally with little difficulty.
Some of the properties of Equations (3.136) may be brought out with the aid of
Green's reciprocation theorem. This theorem is concerned with a set of $M$ point
charges $q_m$, placed at positions where the potentials due to the other $M - 1$ charges
are given by a set of numbers $\Phi_m$. These potentials may be written

$$\Phi_m = \frac{1}{4\pi\epsilon_0}\,{\sum_{n=1}^{M}}{}' \;\frac{q_n}{\xi_{mn}}$$

in which $\xi_{mn}$ is the distance from $q_n$ to $q_m$, and the prime on the summation sign
indicates that the term for $n = m$ is deleted from the sum.
Alternatively, if a different set of charges $q_m'$ is placed at the same points, the
potentials will be

$$\Phi_m' = \frac{1}{4\pi\epsilon_0}\,{\sum_{n=1}^{M}}{}' \;\frac{q_n'}{\xi_{mn}}$$

Upon multiplying the first of these summations by $q_m'$, the second by $q_m$, and then
summing each resulting expression over the index $m$, one obtains

$$\sum_{m=1}^{M} \Phi_m q_m' = \frac{1}{4\pi\epsilon_0}\sum_{m=1}^{M}\,{\sum_{n=1}^{M}}{}' \;\frac{q_n q_m'}{\xi_{mn}} \tag{3.137}$$

$$\sum_{m=1}^{M} \Phi_m' q_m = \frac{1}{4\pi\epsilon_0}\sum_{m=1}^{M}\,{\sum_{n=1}^{M}}{}' \;\frac{q_n' q_m}{\xi_{mn}} \tag{3.138}$$

The right sides of (3.137) and (3.138) are equal to each other, since either can be
converted to the other by an interchange of the summation indices $m$ and $n$. Thus

$$\sum_{m=1}^{M} \Phi_m q_m' = \sum_{m=1}^{M} \Phi_m' q_m \tag{3.139}$$

which is Green's reciprocation theorem. It may be extended to a set of $N$ conducting
bodies whose potentials are $V_n$, and which possess total charges $Q_n$, by combining
all the points of a common potential in (3.139) into a single term. This gives

$$\sum_{n=1}^{N} Q_n' V_n = \sum_{n=1}^{N} Q_n V_n' \tag{3.140}$$
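
Green's reciprocation theorem in its point-charge form (3.139) is easy to confirm numerically. The sketch below evaluates both sides for two different sets of charges placed at the same four points; the positions, the charge values, and the function names are arbitrary illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, farads/m


def potentials(points, charges):
    """Potential at each charge position due to all the OTHER charges."""
    k = 1.0 / (4.0 * math.pi * EPS0)
    result = []
    for m, pm in enumerate(points):
        total = 0.0
        for n, pn in enumerate(points):
            if n == m:
                continue  # the primed sum omits the n = m term
            total += charges[n] / math.dist(pm, pn)
        result.append(k * total)
    return result


def reciprocity_sides(points, q, q_prime):
    """Return the left and right sides of the reciprocation theorem."""
    phi = potentials(points, q)
    phi_prime = potentials(points, q_prime)
    lhs = sum(p * qp for p, qp in zip(phi, q_prime))
    rhs = sum(pp * qq for pp, qq in zip(phi_prime, q))
    return lhs, rhs


# Hypothetical positions (meters) and charges (coulombs).
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
q_a = [1e-9, 2e-9, 3e-9, 4e-9]
q_b = [4e-9, 1e-9, 2e-9, 5e-9]
left, right = reciprocity_sides(pts, q_a, q_b)
print(left, right)
```

The two printed numbers agree to rounding error, reflecting the symmetry of $1/\xi_{mn}$ under interchange of $m$ and $n$.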

Consider now the special case that $Q_i = 1$, all other conductors being uncharged, so
that (3.136) gives

$$V_1 = p_{1i},\quad V_2 = p_{2i},\quad \ldots,\quad V_N = p_{Ni}$$

If instead, $Q_j' = 1$, all other conductors being uncharged, then

$$V_1' = p_{1j},\quad V_2' = p_{2j},\quad \ldots,\quad V_N' = p_{Nj}$$

and application of (3.140) yields

$$\sum_{n=1}^{N} Q_n' V_n = p_{ji} \qquad\qquad \sum_{n=1}^{N} Q_n V_n' = p_{ij}$$

so that

$$p_{ij} = p_{ji} \tag{3.141}$$

and the coefficients of potential are symmetrical, with only $N(N + 1)/2$ of them being
independent. Other properties of these coefficients may be deduced from their basic
definition. Since the $p_{ij}$ are the potentials at the surfaces $S_i$ due to a positive unit charge
on $S_j$, all the $p_{ij}$ must be positive. Further, the conductor possessing the charge must be
at the most positive potential and thus

$$p_{jj} \ge p_{ij} \tag{3.142}$$
EXAMPLE 3.33
Equation (3.141) may be interpreted in words by saying that the potential to which $S_i$ is
raised by placing unit charge on $S_j$, all other bodies being uncharged, is the same as the
potential to which $S_j$ is raised by placing unit charge on $S_i$, all other bodies being
uncharged. This is, of course, still true if $S_i$ and $S_j$ are the only two bodies in the system.
As a special case of this result, let the first conductor be reduced to a point $P$ and suppose
that the system contains additionally only a second conductor. Then the potential to which
the conductor is raised by placing a unit charge at $P$, with the conductor itself uncharged,
is the same as the potential which would be found at $P$ if unit charge were placed on the
conductor.
Specifically, let the conductor be a sphere and let the point $P$ be a distance $r$ from its
center. Since a unit charge on the sphere causes a potential $1/4\pi\epsilon_0 r$ at $P$, if a unit charge is
placed at $P$, the uncharged sphere will assume a potential $1/4\pi\epsilon_0 r$.
Equations (3.136) comprise a linear set of $N$ equations which may be solved to give

$$\begin{aligned}
Q_1 &= c_{11}V_1 + c_{12}V_2 + \cdots + c_{1N}V_N \\
Q_2 &= c_{21}V_1 + c_{22}V_2 + \cdots + c_{2N}V_N \\
&\ \,\vdots \\
Q_N &= c_{N1}V_1 + c_{N2}V_2 + \cdots + c_{NN}V_N
\end{aligned} \tag{3.143}$$

in which the coefficients $c_{ij}$ represent appropriate ratios of two determinants involving
the $p_{ij}$'s. Thus the $c_{ij}$'s are also purely geometrical quantities, depending on the size,
shape, orientation, and position of each conducting body. $c_{ii}$ is called a coefficient of
capacitance, and $c_{ij}$ $(i \ne j)$ is called a coefficient of electrostatic induction.
It follows from Green's reciprocation theorem that

$$c_{ij} = c_{ji} \tag{3.144}$$

If the $j$th conductor is raised to a positive potential while all the other bodies are
grounded, $Q_j$ must be positive, but all other charges must be negative. Therefore

$$c_{ij} \le 0 \qquad (i \ne j) \tag{3.145}$$

Furthermore, since the total charge of the system cannot be negative in this situation,
for any value of the index $j$,

$$\sum_{i=1}^{N} c_{ij} \ge 0 \tag{3.146}$$

Equations (3.143) may be rewritten in a more revealing form by making the
substitutions

$$C_{ii} = \sum_{j=1}^{N} c_{ij} \qquad\qquad C_{ij} = -c_{ij} \quad (i \ne j)$$

which leads to

$$\begin{aligned}
Q_1 &= C_{11}V_1 + C_{12}(V_1 - V_2) + \cdots + C_{1N}(V_1 - V_N) \\
Q_2 &= C_{21}(V_2 - V_1) + C_{22}V_2 + \cdots + C_{2N}(V_2 - V_N) \\
&\ \,\vdots
\end{aligned} \tag{3.147}$$

The quantities $C_{ii}$ and $C_{ij}$ are known as the self-capacitance of the $i$th body and the
mutual capacitance between the $i$th and $j$th bodies. $C_{ij} = C_{ji}$, by virtue of (3.144),
and all the $C_{ii}$'s and $C_{ij}$'s are positive because of the defining relations and the results
(3.145) and (3.146).
An interpretation of (3.147) may be undertaken with reference to the first of these
equations. The total charge $Q_1$ has a component $C_{11}V_1$ which may be attributed to a
capacitance $C_{11}$ between the first body and ground, since $V_1$ is the absolute potential.
Additionally, there is a charge $C_{12}(V_1 - V_2)$ residing on the first body, which may be
attributed to a capacitance $C_{12}$ between the first and second bodies, with this
capacitance charged to a voltage difference $V_1 - V_2$. Since $C_{21} = C_{12}$, there is an equal
and opposite charge $C_{21}(V_2 - V_1)$ residing on the second body. A number of flux
lines $C_{12}(V_1 - V_2)$ connects these two bodies, originating on $C_{12}(V_1 - V_2)$ and
terminating on $C_{21}(V_2 - V_1)$. Similar explanations may be offered for the other terms
$C_{1n}(V_1 - V_n)$ occurring in the first equation. Thus the entire set of Equations (3.147)
may be interpreted in terms of a capacitance $C_{ii}$ between the $i$th body and ground, plus
capacitances $C_{ij}$ between the $i$th body and each other body in the system. These
capacitances are purely geometrical quantities and often can best be determined by
experiment.
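
The chain from coefficients of potential to the capacitance network of (3.147) can be sketched numerically. The example below is illustrative only: it assumes three small spheres whose separations are much larger than their radii, so that $p_{ii} \approx 1/4\pi\epsilon_0 a_i$ and $p_{ij} \approx 1/4\pi\epsilon_0 d_{ij}$ (a crude approximation, not a result from the text), and all names and dimensions are hypothetical. The matrix of $c_{ij}$ is obtained by inverting the matrix of $p_{ij}$, and $C_{ii}$ is taken as the row sum of the $c_{ij}$ with $C_{ij} = -c_{ij}$.

```python
import math

EPS0 = 8.8541878128e-12
K = 1.0 / (4.0 * math.pi * EPS0)


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small dense matrices)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def capacitance_network(p):
    """Return (c, C): coefficients of capacitance/induction and the network values."""
    c = invert(p)
    n = len(c)
    big_c = [[sum(c[i]) if i == j else -c[i][j] for j in range(n)] for i in range(n)]
    return c, big_c


def charges_from_network(big_c, v):
    """Evaluate the right sides of (3.147)."""
    n = len(v)
    return [big_c[i][i] * v[i]
            + sum(big_c[i][j] * (v[i] - v[j]) for j in range(n) if j != i)
            for i in range(n)]


# Hypothetical geometry: sphere radii and center separations in meters.
radii = [0.01, 0.02, 0.015]
dist = [[0.0, 1.0, 1.5], [1.0, 0.0, 2.0], [1.5, 2.0, 0.0]]
p = [[K / radii[i] if i == j else K / dist[i][j] for j in range(3)] for i in range(3)]
c, big_c = capacitance_network(p)
print(charges_from_network(big_c, [100.0, 0.0, -50.0]))
```

For this weakly coupled geometry the computed $c_{ij}$ $(i \ne j)$ come out negative and every $C_{ii}$ and $C_{ij}$ positive, in line with (3.145) and (3.146).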

3.19 ELECTROSTATIC STORED ENERGY

Since electric charges exert forces on each other, work is performed when they move.
In particular, energy normally is required to assemble a system of charges into a given
distribution. This energy may be said to be stored in the system. The technique
employed in Section 3.8 to develop the method of images provides a simple means for
calculating this stored energy.
Let it be desired to find the electrostatic energy stored in the charge system of
Figure 3.11a. If the charge $Q_{ext}$ is allowed to collapse onto the surface $\Phi_0$, forming the
distribution $\sigma'$, the external field disappears. If, in addition, the charge $Q_{int}$ is allowed
to collapse onto the surface $\Phi_0$, forming the distribution $\sigma''$, the internal field also
disappears. Since $\sigma'' = -\sigma'$, the net charge everywhere on $\Phi_0$ will be zero and the system
has become electrically neutral.
This provides an excellent starting point for the creation of the system of Figure
3.11a. Consider the family of surfaces $S_0, S_1, S_2, \ldots$, which is to become the family
of equipotentials $\Phi_0, \Phi_1, \Phi_2, \ldots$, in the final system, with $\Phi_0$ the innermost of these
surfaces. One begins by placing the charge distributions $\sigma'$ and $\sigma''$ on $S_0$. The charges
comprising $\sigma'$ are then moved to $S_1$ and changed to the distribution $\sigma_1 = D_1$, in which
$D_1$ is the flux density distribution the final system is to have over $S_1$. The charges of $\sigma_1$
are next moved to $S_2$ and changed to the distribution $\sigma_2 = D_2$, in which $D_2$ is the flux
density distribution the final system is to have over $S_2$. If this process is continued to
completion, all the charges comprising $Q_{ext}$ will be in their proper places and the field
external to $S_0$ will be precisely that of the final system. Since $\sigma''$ is still on $S_0$, the field
internal to $S_0$ still will be identically zero.
How much energy is expended in moving the charges of $\sigma'$ from $S_0$ to $S_1$? Consider a
surface element $dS_0$ in $S_0$, as shown in Figure 3.20. The charge $\sigma'\,dS_0$ is to be transferred
from $dS_0$ to a surface element $dS_1$ in $S_1$ such that $\sigma'\,dS_0 = \sigma_1\,dS_1$. Let $d\rho\,dS_0$ be the
amount of charge which is transferred at a time when $\rho\,dS_0$ units of charge have already
been transferred. At this time the density of charge on $dS_0$ is $\sigma'' + (\sigma' - \rho) = -\rho$
and the density of charge on $dS_1$ is (to first order) $+\rho$. If $d\ell$ is the distance from $dS_0$
to $dS_1$, the work done in transferring the charge $d\rho\,dS_0$ is

$$d^4W = \frac{\rho}{\epsilon_0}\,d\rho\,dS_0\,d\ell$$

FIGURE 3.20 Transfer of charge.

The work required to transfer all of the charge $\sigma'\,dS_0$ is therefore

$$d^3W = \frac{1}{2}\,\frac{\sigma'^2}{\epsilon_0}\,dS_0\,d\ell = \frac{1}{2}\,\epsilon_0 E^2\,dS_0\,d\ell \tag{3.148}$$

Thus the work done in moving all the charges comprising $\sigma'$ from $S_0$ to $S_1$ is

$$dW = \tfrac{1}{2}\epsilon_0 \int_{V_1 - V_0} E^2\,dV$$

in which $V_1 - V_0$ is the volume between $S_1$ and $S_0$.


It follows that the energy stored in the system as the charges of $\sigma'$ are moved from $S_0$
to their final positions is given by

$$W_{ext} = \tfrac{1}{2}\epsilon_0 \int_{V_{ext}} E^2\,dV \tag{3.149}$$

with $V_{ext}$ the entire volume external to $S_0$.
If $S_0, S_1', S_2', \ldots$, is the family of surfaces which is to become the family of
equipotentials $\Phi_0, \Phi_1', \Phi_2', \ldots$, with $S_0$ the outermost surface, the charges comprising $\sigma''$
may now be transferred successively from $S_0$ to $S_1'$, $S_1'$ to $S_2'$, etc., until they reach their
proper final positions. The work necessary to do this will be

$$W_{int} = \tfrac{1}{2}\epsilon_0 \int_{V_{int}} E^2\,dV \tag{3.150}$$

with $V_{int}$ the entire volume internal to $S_0$. The conclusion is reached therefore that the
total electrostatic energy stored in a system of static charges surrounded by free space
is given by

$$W_E = \tfrac{1}{2}\epsilon_0 \int_V E^2\,dV = \tfrac{1}{2}\int_V \mathbf{E}\cdot\mathbf{D}_0\,dV \tag{3.151}$$

in which the volume integration extends throughout all space. This suggests that the
energy is stored in the field with a volume density of $\epsilon_0 E^2/2$ joule/m³, but of course no
experimental verification of this is possible. Only the integrated form (3.151) is
susceptible to check. However, this interpretation will prove very attractive when
time-varying fields are considered in Chapter 5.
If the ultimate position of all the charge $Q_{ext}$ is such that it is distributed over the
surface of a single conductor, and if similarly $Q_{int}$ finally ends by being distributed over
the surface of a second conductor, an electrostatic system such as the one shown in Figure
3.18 will have been created. For such a case it is interesting to return to (3.148) and write

$$d^3W = \tfrac{1}{2}(\sigma'\,dS_0)(E\,d\ell)$$

as the work required to transfer the charge $\sigma'\,dS_0$ to the surface element $dS_1$. The work
required to transfer this charge to its ultimate proper place on the conductor is therefore

$$d^2W = \tfrac{1}{2}\,\sigma'\,dS_0 \int E\,d\ell = \tfrac{1}{2}\,\sigma'\,dS_0\,(\Phi_A - \Phi_0)$$

in which $\Phi_A$ is the potential of the conductor. The work invested to transfer all of $Q_{ext}$
to the first conductor is then

$$W_{ext} = \tfrac{1}{2}(\Phi_A - \Phi_0)\int \sigma'\,dS_0 = \tfrac{1}{2}(\Phi_A - \Phi_0)\,Q_{ext}$$

Similarly, the work required to move all the charge $Q_{int}$ from $S_0$ to the second conductor
is

$$W_{int} = \tfrac{1}{2}(\Phi_B - \Phi_0)\,Q_{int}$$

in which $\Phi_B$ is the potential of the second conductor. Letting $Q = Q_{ext} = -Q_{int}$, and
letting $V = \Phi_A - \Phi_B$ be the voltage difference of the two conductors, one can conclude
that

$$W_E = \tfrac{1}{2}\,QV \tag{3.152}$$

is the total energy stored. Since $Q = C_0 V$, with $C_0$ the capacitance, this result may be
written in the alternative forms

$$W_E = \tfrac{1}{2}\,C_0 V^2 \tag{3.153}$$

$$W_E = \frac{1}{2}\,\frac{Q^2}{C_0} \tag{3.154}$$

These equations probably already are familiar from circuit analysis, and it is to be
noted that the derivation just given is valid for any arbitrary pair of conductors.
EXAMPLE 3.34
If a parallel plate capacitor of plate area $A$ and spacing $l$ is charged to a potential difference
$V_b$, the electric field is uniform (neglecting fringing) of value

$$E = \frac{V_b}{l}$$

By virtue of (3.151), the energy stored in the capacitor is

$$W_E = \frac{1}{2}\,\epsilon_0\left(\frac{V_b}{l}\right)^2 A l = \frac{1}{2}\left(\epsilon_0\,\frac{A}{l}\right)V_b^2$$

But it already has been noted that the capacitance is $C_0 = \epsilon_0 A/l$ and therefore

$$W_E = \tfrac{1}{2}\,C_0 V_b^2$$

in agreement with (3.153).
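
The same agreement between the field form (3.151) and the circuit form (3.154) can be checked for a geometry in which the field is not uniform. The sketch below uses two concentric spherical shells carrying charges $\pm Q$, for which the field between the shells is $Q/4\pi\epsilon_0 r^2$ and the capacitance is taken as the standard result $4\pi\epsilon_0 r_1 r_2/(r_2 - r_1)$. The function names, the charge, and the radii are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, farads/m


def field_energy(q, r1, r2, steps=20000):
    """Integrate (1/2) eps0 E^2 over the region between the shells (midpoint rule)."""
    h = (r2 - r1) / steps
    total = 0.0
    for k in range(steps):
        r = r1 + (k + 0.5) * h
        e = q / (4.0 * math.pi * EPS0 * r * r)           # radial field magnitude
        total += 0.5 * EPS0 * e * e * 4.0 * math.pi * r * r  # energy per unit radius
    return total * h


def circuit_energy(q, r1, r2):
    """Q^2 / (2 C0) with C0 the concentric-sphere capacitance (assumed formula)."""
    c0 = 4.0 * math.pi * EPS0 * r1 * r2 / (r2 - r1)
    return 0.5 * q * q / c0


# Hypothetical values: 1 nC on shells of radius 10 cm and 20 cm.
print(field_energy(1e-9, 0.1, 0.2), circuit_energy(1e-9, 0.1, 0.2))
```

The field vanishes inside the inner shell and outside the outer one, so the integration over "all space" reduces to the region between the shells.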

The energy stored in a multicapacitor system may be deduced by a generalization
of the argument leading to (3.152). If $\Phi_0$ is at absolute zero potential, and $\sigma'\,dS_0$ is
an element of charge which will ultimately reside on a conducting surface $S_n$, then

$$d^2W = \tfrac{1}{2}\,\sigma'\,dS_0\,V_n$$

is the work needed to move this charge from $S_0$ to $S_n$, where the potential is to be $V_n$.
When all elements of charge in $\sigma'$ and $\sigma''$ are moved to their final positions, the energy
stored is

$$W_E = \frac{1}{2}\sum_{n=1}^{N} Q_n V_n \tag{3.155}$$

Use of (3.143) allows (3.155) to be written

$$W_E = \frac{1}{2}\sum_{m=1}^{N}\sum_{n=1}^{N} c_{nm} V_m V_n \tag{3.156}$$
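
The equivalence of (3.155) and (3.156) can be illustrated with a two-conductor system. The coefficient values below are hypothetical numbers chosen only to satisfy (3.144) through (3.146); they do not describe any particular geometry, and the function names are assumptions.

```python
def charges(c, v):
    """Q_n = sum over m of c_nm V_m, as in (3.143)."""
    return [sum(c_nm * v_m for c_nm, v_m in zip(row, v)) for row in c]


def energy_from_charges(q, v):
    """(3.155): one half the sum of Q_n V_n."""
    return 0.5 * sum(qn * vn for qn, vn in zip(q, v))


def energy_from_coefficients(c, v):
    """(3.156): one half the double sum of c_nm V_m V_n."""
    n = len(v)
    return 0.5 * sum(c[i][j] * v[j] * v[i] for i in range(n) for j in range(n))


# Hypothetical coefficients in farads and potentials in volts.
c = [[3e-12, -1e-12],
     [-1e-12, 2e-12]]
v = [10.0, -5.0]
q = charges(c, v)
print(q, energy_from_charges(q, v), energy_from_coefficients(c, v))
```

Both energy expressions give the same value because the second is simply the first with (3.143) substituted for the charges.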

The field expression for electrostatic stored energy, (3.151), may be converted to
still another form which leads to a generalized geometric formula for capacitance. Using
the relation $\mathbf{E} = -\nabla\Phi$ gives

$$W_E = \tfrac{1}{2}\epsilon_0 \int_V \mathbf{E}\cdot(-\nabla\Phi)\,dV$$

which, through application of the vector identity (V.107), becomes

$$W_E = \tfrac{1}{2}\epsilon_0 \int_V \Phi\,\nabla\cdot\mathbf{E}\,dV - \tfrac{1}{2}\epsilon_0 \int_V \nabla\cdot(\Phi\mathbf{E})\,dV$$

Substitution of $\epsilon_0\nabla\cdot\mathbf{E} = \rho$ into the first integral and application of the divergence
theorem to the second yields

$$W_E = \tfrac{1}{2}\int_V \rho\,\Phi\,dV - \tfrac{1}{2}\epsilon_0 \oint_S \Phi\,\mathbf{E}\cdot d\mathbf{S}$$

But $S$ may be taken as a sphere at infinity, and since $E$ decreases as $\xi^{-2}$ and $\Phi$ decreases
as $\xi^{-1}$, the surface integral is seen to vanish. Therefore

$$W_E = \tfrac{1}{2}\int_V \rho\,\Phi\,dV \tag{3.157}$$

The electrostatic potential function $\Phi$ may be expressed as the integral

$$\Phi = \int_{V'} \frac{\rho'\,dV'}{4\pi\epsilon_0\,\xi} \tag{3.158}$$

wherein primes are used so as to be able to distinguish between the contributions
to the integrals in (3.157) and (3.158). Thus the electrostatic stored energy may also be
expressed in the form

$$W_E = \frac{1}{2}\int_V\int_{V'} \frac{\rho\,\rho'}{4\pi\epsilon_0\,\xi}\,dV\,dV' \tag{3.159}$$

in which $\xi$ is the distance between the volume elements $dV$ and $dV'$, and the integration
is to be performed twice throughout all of space containing elements of charge.
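This result may be applied to situations in which equal and opposite amounts of
charge $[Q, -Q]$ reside on two conducting bodies whose exterior surfaces are $S_1$ and $S_2$.
The surface charge densities $\sigma_1(\xi,\eta,\zeta)$ on $S_1$ and $\sigma_2(\xi,\eta,\zeta)$ on $S_2$ are both linearly
proportional to $Q$ so that one may write

$$\sigma_1(\xi,\eta,\zeta) = Q\,f_1(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ on $S_1$, and

$$\sigma_2(\xi,\eta,\zeta) = Q\,f_2(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ on $S_2$, with $f_1$ and $f_2$ functions which give the normalized charge
distribution. Under these conditions, (3.159) may be written

$$W_E = \frac{Q^2}{2}\int_{S_1}\int_{S_1'} \frac{f_1 f_1'}{4\pi\epsilon_0\,\xi}\,dS_1\,dS_1' + \frac{Q^2}{2}\int_{S_2}\int_{S_2'} \frac{f_2 f_2'}{4\pi\epsilon_0\,\xi}\,dS_2\,dS_2' + Q^2\int_{S_1}\int_{S_2} \frac{f_1 f_2}{4\pi\epsilon_0\,\xi}\,dS_1\,dS_2$$

wherein $f_1'$ implies $f_1(\xi',\eta',\zeta')$, etc. It is evident from (3.154) that the capacitance of
this system must be given by

$$\frac{1}{C_0} = \int_{S_1}\int_{S_1'} \frac{f_1 f_1'}{4\pi\epsilon_0\,\xi}\,dS_1\,dS_1' + \int_{S_2}\int_{S_2'} \frac{f_2 f_2'}{4\pi\epsilon_0\,\xi}\,dS_2\,dS_2' + 2\int_{S_1}\int_{S_2} \frac{f_1 f_2}{4\pi\epsilon_0\,\xi}\,dS_1\,dS_2 \tag{3.160}$$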

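EXAMPLE 3.35
Two concentric conducting shells of radii $r_1$ and $r_2$ form a capacitor whose capacitance can
be determined with the aid of (3.160). On the basis of the figure, let charges $-Q$ and $+Q$
be placed on the inner and outer shells; then $f_1(\xi,\eta,\zeta) = -1/4\pi r_1^2$ and $f_2(\xi,\eta,\zeta) = 1/4\pi r_2^2$
are constants and

$$I = 2\int_{S_1}\int_{S_2} \frac{f_1 f_2}{4\pi\epsilon_0\,\xi}\,dS_1\,dS_2 = -\frac{1}{2\pi\epsilon_0}\left(\frac{1}{4\pi r_1^2}\right)\left(\frac{1}{4\pi r_2^2}\right)\int_{S_1}\int_{S_2} \frac{dS_1\,dS_2}{\xi}$$

[Figure: concentric spherical shells carrying charges $-Q$ (inner) and $+Q$ (outer).]

Let $dS_2$ be the zenith area element shown in the figure, and let $dS_1$ be the ring $2\pi r_1^2 \sin\theta\,d\theta$.
Then, since

$$\xi^2 = r_1^2 + r_2^2 - 2 r_1 r_2 \cos\theta$$

$$2\xi\,d\xi = 2 r_1 r_2 \sin\theta\,d\theta$$

$$\frac{dS_1}{\xi} = 2\pi\,\frac{r_1}{r_2}\,d\xi$$

it follows that

$$I = -\frac{2}{4\pi\epsilon_0 r_2}$$

Following the same procedure, one finds that the other two integrals in (3.160) give $+1/4\pi\epsilon_0 r_1$
and $+1/4\pi\epsilon_0 r_2$ and therefore

$$\frac{1}{C} = \frac{1}{4\pi\epsilon_0 r_1} - \frac{1}{4\pi\epsilon_0 r_2}$$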
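
The three surface integrals of Example 3.35 can also be evaluated numerically as a check on the closed-form result. In the sketch below each double integral is reduced, by spherical symmetry, to the area of one shell times a single integral over rings of the other; the ring integral is done with a simple midpoint rule. Function names, step count, and the radii are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, farads/m


def ring_integral(r_src, r_obs, steps=4000):
    """Integral of dS/xi over a sphere of radius r_src, as seen from a point
    at distance r_obs from the center (rings of area 2 pi r_src^2 sin(theta) d(theta))."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        xi = math.sqrt(r_src ** 2 + r_obs ** 2 - 2.0 * r_src * r_obs * math.cos(theta))
        total += 2.0 * math.pi * r_src ** 2 * math.sin(theta) / xi
    return total * h


def inverse_capacitance(r1, r2):
    """Sum of the three terms in the geometric formula for 1/C."""
    k = 1.0 / (4.0 * math.pi * EPS0)
    f1 = -1.0 / (4.0 * math.pi * r1 ** 2)   # normalized density on inner shell
    f2 = 1.0 / (4.0 * math.pi * r2 ** 2)    # normalized density on outer shell
    area1 = 4.0 * math.pi * r1 ** 2
    area2 = 4.0 * math.pi * r2 ** 2
    term11 = k * f1 * f1 * area1 * ring_integral(r1, r1)
    term22 = k * f2 * f2 * area2 * ring_integral(r2, r2)
    term12 = 2.0 * k * f1 * f2 * area2 * ring_integral(r1, r2)
    return term11 + term22 + term12


# Hypothetical radii in meters.
r1, r2 = 0.1, 0.2
print(1.0 / inverse_capacitance(r1, r2), 4.0 * math.pi * EPS0 * r1 * r2 / (r2 - r1))
```

The midpoint rule never samples $\theta = 0$, so the integrable singularity in the two self terms causes no difficulty.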

3.20* THE MAXWELL-McALISTER EXPERIMENT

This section and the next are concerned with improvements in accuracy which
Maxwell and McAlister and later Plimpton and Lawton made in the Cavendish electric
force experiment (described in Section 3.1). Though not adding further to the
electrostatic theory just presented, a discussion of these experiments enhances confidence
in the postulation of the inverse square law, and affords the opportunity to consider
Maxwell's analysis of the accuracy of such experiments.

FIGURE 3.21 The Maxwell-McAlister apparatus. (Figure labels: two thin spherical metal shells, air, insulators.)

* This section may be omitted without loss in continuity of the technical presentation.

With McAlister's help in the laboratory, Maxwell repeated the Cavendish
experiment in 1878, using an apparatus of improved design. The need Cavendish had to
remove the outer hemispheres was avoided by introducing a trap door which provided
access to the inner globe, and through which the testing electrode of an electrometer
could be inserted to detect the presence of charge on the inner globe. As suggested by
Figure 3.21, the outer hemispheres were sealed together and placed on an insulating
stand. The inner globe was spaced and insulated from the outer shell, in a concentric
position, through the use of a piece of ebonite tubing. The trap door in the outer shell
was so constructed that, in its closed position, it formed an electrical connection
between the two spheres. The trap door could be lifted by an insulating thread, thus
breaking this electrical connection. The detector inserted through the resulting opening
was a version of Thomson's quadrant electrometer, a much more sensitive instrument
than the pith-ball electrometer available to Cavendish a century earlier. The case of
this electrometer and one of its electrodes were permanently grounded, and the testing
electrode was also kept grounded except when used to test the potential of the inner
globe. To estimate the original charge on the outer shell, a small brass ball was placed
on an insulating stand at a distance of about 60 cm from the center of the shell.
The procedure followed was to close the trap door, thus connecting the outer shell
to the inner globe, and then to charge the outer shell positively from a condenser which
was brought in from another room for this purpose and then promptly removed. After
this, in Maxwell's words²⁰

The small brass ball was then connected to earth for an instant, so as to give it a negative
charge by induction, and was then left insulated. The lid was then lifted up by means of the
silk string, so as to take away the communication between the shell and the globe. The shell
was then discharged and kept connected to earth. The testing electrode of the electrometer
was then disconnected from earth, and made to pass through the hole in the shell so as to
touch the globe within without touching the shell.
Not the slightest deflexion of the electrometer could be observed.

Because of the relative sizes of the small brass ball and the outer shell, and their
separation distance, Maxwell and McAlister knew that at the time the brass ball
had been momentarily grounded, it had taken on an induced negative charge which
was approximately 1/54th of the positive charge which had been applied to the outer
shell. Later, when the outer shell was grounded, it actually retained a small positive
charge, through induction, and due to the presence of the insulated, negatively charged
brass ball. This small positive charge was computed to be about 1/9th of the negative
charge on the brass ball, or 1/486th of the original charge applied to the shell system.
Thus at this stage of the experiment, the outer and inner shells were insulated from
each other, the outer shell was at ground potential and possessed a positive charge
approximately 1/486th of the original charge, the small brass ball was insulated and
contained a negative charge approximately 1/54th as big as the original charge applied
to the shell system, and the electrometer indicated no charge on the inner globe.
To test the sensitivity of the instrumentation, the outer shell was disconnected from
ground and connected instead to the electrometer. Being still at ground potential
(due to the presence of the negatively charged brass ball) the outer shell caused no
deflection of the electrometer. However, at this juncture, the small brass ball was
grounded, thus raising the potential of the outer shell and producing a large deflection
of the electrometer.
Calling this observed deflection $D$, and letting $d$ be the largest deflection which
could escape detection, Maxwell and McAlister then knew that the maximum charge
which resided on the inner globe was $\pm\tfrac{1}{486}(d/D)$th of the original charge applied
to the shell system. Thus Maxwell concludes

. . . we know that the potential of the globe at the end of the first part of the experiment
cannot differ from zero by more than

$$\pm\frac{1}{486}\,\frac{d}{D}\,V$$

where $V$ is the potential of the shell when first charged.
But it appears from the mathematical theory that if the law of repulsion had been as
$r^{-(2+\delta)}$, the potential of the globe when tested would have been

$$0.1478\,\delta V$$

Hence $\delta$ cannot differ from zero by more than $\pm\tfrac{1}{72}(d/D)$.
Now, even in a rough experiment, $D$ was certainly more than $300d$. In fact, no sensible
value of $d$ was ever observed. We may therefore conclude that $\delta$, the excess of the true index
above 2, must either be zero, or must differ from zero by less than

$$\pm\frac{1}{21600}$$

20 J. Clerk Maxwell, ed., The Scientific Papers of the Honourable Henry Cavendish, vol. 1, pp. 404-409,
revised by J. Larmor, Cambridge University Press, 1921. (Note 19 of Notes by the Editor.)
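
The arithmetic behind the quoted bounds can be retraced in a few lines. The sketch below uses only the numbers that appear in the passage (1/54, 1/9, 0.1478, and the ratio $D > 300d$); the variable and function names are assumptions made for illustration.

```python
from fractions import Fraction

# Fraction of the original charge left on the grounded outer shell:
# 1/54 (induced on the brass ball) times 1/9 (retained by the shell).
residual_fraction = Fraction(1, 54) * Fraction(1, 9)

# Coefficient relating the inner-globe potential to delta * V, as quoted.
COEFFICIENT = 0.1478


def delta_bound(d_over_D):
    """Largest |delta| consistent with an undetected deflection ratio d/D."""
    return float(residual_fraction) * d_over_D / COEFFICIENT


print(residual_fraction)                  # 1/486
print(1.0 / delta_bound(1.0))             # close to 72
print(1.0 / delta_bound(1.0 / 300.0))     # close to 21600
```

The unrounded values are slightly below 72 and 21600, which is consistent with Maxwell quoting them as round figures.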

The mathematical theory to which Maxwell refers in the above passage is developed
in the remainder of his Note 19. This development is slightly paraphrased, with
modified notation, as follows:
Assume that the law of force between two electric charges a distance $r$ apart is
$qq'F(r)$ in which $q$ and $q'$ are algebraic quantities representing the amounts of charge
and $F(r)$ is a function to be determined. Since the system is conservative, the potential
energy can be written

$$W(r) = qq'\int_r^{\infty} F(R)\,dR \tag{3.161}$$

in which the result is independent of the path. This may be expressed in the form

$$W(r) = qq'\,\frac{f'(r)}{r} \tag{3.162}$$

so that

$$f(r) = \int r\left[\int_r^{\infty} F(R)\,dR\right]dr \tag{3.163}$$

Imagine in the foregoing that the charge $q'$ is successively elements of a charge $Q_a$
uniformly distributed over the outer sphere (Figure 3.22) of radius $a$ and a charge $Q_b$
which is uniformly distributed over the inner sphere of radius $b$. The potential energies
between the charge $q$ and each of these elements of charge may be added.
Let $\sigma_a$ and $\sigma_b$ be the respective surface charge densities so that

$$Q_a = 4\pi a^2\sigma_a \qquad\qquad Q_b = 4\pi b^2\sigma_b$$

Referring to the figure, one can consider first an element of charge at $P'$, residing in the
element of area $a^2 \sin\theta\,d\theta\,d\phi$. (The usual spherical coordinates are implied.) Placing
the charge $q$ at $P$, a distance $c$ from the center, and letting $r = PP'$, one finds that

$$r^2 = a^2 - 2ac\cos\theta + c^2 \tag{3.164}$$

and that the potential energy between this element and $q$ is, from (3.162)

$$q\,\sigma_a a^2 \sin\theta\,d\theta\,d\phi\;\frac{f'(r)}{r}$$

FIGURE 3.22 Geometry for Maxwell's analysis. (Figure labels: two shells, air.)

The potential energy between $q$ and all the charge on the outer shell is therefore

$$W_a = \int_0^{2\pi}\int_0^{\pi} \frac{qQ_a}{4\pi}\,\frac{f'(r)}{r}\,\sin\theta\,d\theta\,d\phi$$

Since $r$ is independent of $\phi$, this becomes

$$W_a = \frac{qQ_a}{2}\int_0^{\pi} \frac{f'(r)}{r}\,\sin\theta\,d\theta \tag{3.165}$$

However, the differential of (3.164) gives $r\,dr = ac\sin\theta\,d\theta$ so that (3.165) may be
rewritten

$$W_a = \frac{qQ_a}{2ac}\int_{a-c}^{a+c} f'(r)\,dr = \frac{qQ_a}{2ac}\,[f(a + c) - f(a - c)]$$

When this procedure is repeated for the inner sphere, one obtains

$$W_b = \frac{qQ_b}{2bc}\,[f(c + b) - f(c - b)]$$

If $q$ is taken to be a unit charge, the electric potential at $P$ is

$$V(c) = \frac{Q_a}{2ac}\,[f(a + c) - f(a - c)] + \frac{Q_b}{2bc}\,[f(c + b) - f(c - b)] \tag{3.166}$$
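
The outer-shell term of (3.166) can be compared with a direct numerical evaluation of (3.165), and the comparison also shows why the experiment is a test of the inverse square law. The sketch below assumes a force law $F(r) = r^{-(2+\delta)}$, for which $f'(r)/r = r^{-(1+\delta)}/(1+\delta)$ and $f(r) = r^{1-\delta}/(1-\delta^2)$; it works in the unnormalized units of this analysis (no factor $4\pi\epsilon_0$), takes $q = 1$, and uses hypothetical values for $a$, $c$, and $\delta$.

```python
import math


def f(r, delta):
    """f(r) for the assumed force law F(r) = r**-(2 + delta)."""
    return r ** (1.0 - delta) / (1.0 - delta ** 2)


def shell_potential_closed(c, a, q_a, delta):
    """Closed form: (Q_a / 2ac) [f(a + c) - f(a - c)], valid for 0 < c < a."""
    return q_a / (2.0 * a * c) * (f(a + c, delta) - f(a - c, delta))


def shell_potential_numeric(c, a, q_a, delta, steps=4000):
    """Midpoint-rule evaluation of (Q_a / 2) * integral of (f'(r)/r) sin(theta) d(theta)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        r = math.sqrt(a * a - 2.0 * a * c * math.cos(theta) + c * c)
        total += r ** (-1.0 - delta) / (1.0 + delta) * math.sin(theta)
    return 0.5 * q_a * total * h


a, q_a = 1.0, 1.0
for delta in (0.0, 0.1):
    inner = shell_potential_closed(0.2, a, q_a, delta)
    outer = shell_potential_closed(0.8, a, q_a, delta)
    print(delta, inner, outer)
```

With $\delta = 0$ the potential is the same at every interior point, so an uncharged inner globe connected to the shell stays uncharged; with $\delta \ne 0$ the interior potential varies with $c$.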

From this, it follows that the potential of the outer sphere is

$$V(a) = \frac{Q_a}{2a^2}\,f(2a) + \frac{Q_b}{2ab}\,[f(a + b) - f(a - b)] \tag{3.167}$$

whereas the potential of the inner sphere is

$$V(b) = \frac{Q_b}{2b^2}\,f(2b) + \frac{Q_a}{2ab}\,[f(a + b) - f(a - b)] \tag{3.168}$$

In the Cavendish experiment, the two spheres first were joined by a short wire
and charged to a common potential $V_1$ above ground. Putting $V(a) = V(b) = V_1$
into the above equations, and solving for $Q_b$, the charge on the inner sphere, one obtains

$$Q_b = 2V_1 b\, \frac{b f(2a) - a\,[f(a+b) - f(a-b)]}{f(2a) f(2b) - [f(a+b) - f(a-b)]^2} \tag{3.169}$$

The two spheres were next disconnected from each other and the outer sphere grounded.
At this point, the charge on the outer sphere changed to $Q_a'$, which can be determined
from

$$V(a) = 0 = \frac{Q_a'}{2a^2}\, f(2a) + \frac{Q_b}{2ab}\,[f(a+b) - f(a-b)]$$

At the same time, the potential of the inner sphere became

$$V(b) = V_2 = \frac{Q_b}{2b^2}\, f(2b) + \frac{Q_a'}{2ab}\,[f(a+b) - f(a-b)]$$

Elimination of $Q_a'$ from these two equations yields the relation

$$V_2 = V_1 \left[1 - \frac{a}{b}\,\frac{f(a+b) - f(a-b)}{f(2a)}\right] \tag{3.170}$$
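The algebra leading to (3.169) and (3.170) can be cross-checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it assumes the power-law form $f(r) = r^{1-\delta}/(1-\delta^2)$ introduced later in the text, uses hypothetical function names, and takes illustrative radii a = 2.5, b = 2.0 in arbitrary units. It solves (3.167) and (3.168) directly by Cramer's rule, then repeats the grounding step, and compares the results with the closed forms.

```python
def f(r, delta):
    """Assumed form f(r) = r**(1 - delta) / (1 - delta**2)."""
    return r ** (1.0 - delta) / (1.0 - delta ** 2)


def g(a, b, delta):
    """Shorthand for f(a + b) - f(a - b)."""
    return f(a + b, delta) - f(a - b, delta)


def solve_charges(a, b, delta, va, vb):
    """Solve (3.167)-(3.168) for (Q_a, Q_b) by Cramer's rule."""
    m11 = f(2 * a, delta) / (2 * a * a)   # coefficient of Q_a in V(a)
    m12 = g(a, b, delta) / (2 * a * b)    # coefficient of Q_b in V(a)
    m21 = m12                             # coefficient of Q_a in V(b)
    m22 = f(2 * b, delta) / (2 * b * b)   # coefficient of Q_b in V(b)
    det = m11 * m22 - m12 * m21
    qa = (va * m22 - m12 * vb) / det
    qb = (m11 * vb - m21 * va) / det
    return qa, qb


def qb_closed_form(a, b, delta, v1):
    """Q_b from (3.169)."""
    gg = g(a, b, delta)
    num = b * f(2 * a, delta) - a * gg
    den = f(2 * a, delta) * f(2 * b, delta) - gg * gg
    return 2.0 * v1 * b * num / den


def v2_two_step(a, b, delta, v1):
    """Charge both shells to v1, then ground the outer shell with Q_b held fixed."""
    _, qb = solve_charges(a, b, delta, v1, v1)
    gg = g(a, b, delta)
    qa_new = -qb * (a / b) * gg / f(2 * a, delta)  # from V(a) = 0
    return qb * f(2 * b, delta) / (2 * b * b) + qa_new * gg / (2 * a * b)


def v2_closed_form(a, b, delta, v1):
    """V_2 from (3.170)."""
    return v1 * (1.0 - (a / b) * g(a, b, delta) / f(2 * a, delta))


if __name__ == "__main__":
    a, b, delta, v1 = 2.5, 2.0, 0.01, 1.0  # illustrative values only
    print(solve_charges(a, b, delta, v1, v1)[1], qb_closed_form(a, b, delta, v1))
    print(v2_two_step(a, b, delta, v1), v2_closed_form(a, b, delta, v1))
```

Setting delta to zero in this sketch gives $Q_b = 0$ and $V_2 = 0$, the null result expected for an exact inverse square law.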

It is this potential $V_2$ of the inner globe which the electrometer was used to detect.†
Next assume, with Cavendish, that the law of electric force is some inverse power
of the distance which differs but little from the inverse square; that is, let

$$F(r) = r^{-(2+\delta)}$$

so that

$$f(r) = \frac{r^{1-\delta}}{1 - \delta^2}$$

Since $\delta$ is assumed small, $r^{-\delta}$ can be expanded in the rapidly converging power series

$$r^{-\delta} = 1 - \delta \ln r + \frac{(\delta \ln r)^2}{2!} + \cdots$$

(This is merely a Maclaurin series in powers of $\delta$. Cf. the Mathematical Supplement,
Part I.)
Use of the first two terms of this expansion in (3.170) gives the first-order result

$$V_2 = \frac{\delta}{2}\, V_1 \left[\frac{a}{b} \ln\frac{a+b}{a-b} - \ln\frac{4a^2}{a^2 - b^2}\right] \tag{3.171}$$
Insertion in (3.171) of the values used by Maxwell and McAlister for the radii a and b
gives $V_2 = 0.1478\,\delta V_1$, the relation used by Maxwell in the previous quotation. It was
in this manner that Maxwell was able to determine the bound $\delta = \pm 1/21{,}600$.
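The quality of the first-order approximation can be gauged by comparing (3.171) with the exact expression (3.170). The following is a minimal sketch under stated assumptions: the radii a = 2.5 and b = 2.0 are illustrative only (they are not the Maxwell-McAlister values, which are not listed here), and the function names are hypothetical. Because only ratios of f appear in (3.170), the factor $1/(1-\delta^2)$ cancels and the result does not depend on the unit of length.

```python
import math


def exact_ratio(a, b, delta):
    """V_2/V_1 from (3.170) with f(r) proportional to r**(1 - delta)."""
    p = 1.0 - delta
    return 1.0 - (a / b) * ((a + b) ** p - (a - b) ** p) / (2.0 * a) ** p


def first_order_ratio(a, b, delta):
    """V_2/V_1 from the first-order result (3.171)."""
    bracket = (a / b) * math.log((a + b) / (a - b)) - math.log(4.0 * a * a / (a * a - b * b))
    return 0.5 * delta * bracket


if __name__ == "__main__":
    a, b = 2.5, 2.0  # illustrative radii, arbitrary units
    for delta in (1e-1, 1e-2, 1e-3):
        exact = exact_ratio(a, b, delta)
        approx = first_order_ratio(a, b, delta)
        print(f"delta = {delta:g}: exact = {exact:.6e}, first order = {approx:.6e}")
```

The two values approach each other as delta shrinks, which is the sense in which the truncated series is adequate for the very small exponents of interest.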
† In the Maxwell-McAlister version of this experiment, the presence of the small charged brass ball
adds another term to the expressions for V(a) and V(b), but this term cancels out in (3.169) and
(3.170), and thus (3.170) is valid both for the original Cavendish experiment and for the later
Maxwell-McAlister experiment.

One can carry this analysis further and assume that $Q_b = 0$ and thus that $V_2 = 0$.
Equation (3.170) then gives

$$b f(2a) - a f(a+b) + a f(a-b) = 0$$

Holding a fixed, letting b vary, and differentiating twice with respect to b, one obtains

$$f''(a+b) = f''(a-b)$$

Since this must be true for any b < a, it follows that $f''(r) = K_1$ and $f'(r) = K_1 r + K_2$,
in which $K_1$ and $K_2$ are constants. Hence,

$$\int_r^\infty F(R)\,dR = \frac{f'(r)}{r} = K_1 + \frac{K_2}{r}$$

from which

$$F(r) = \frac{K_2}{r^2} \tag{3.172}$$

Thus on the basis of the assumption of a null result in the Cavendish experiment, the
electric force law is the inverse square. This proof is due to Maxwell. It borrows from a
procedure first used by Laplace who showed that no function of the distance except
the inverse square satisfies the condition that a uniform spherical mass shell exerts no
gravitational force on a particle within it.
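The intermediate steps of this argument can be written out explicitly. The block below is only a worked restatement of the differentiation with respect to b and of the final differentiation with respect to r; it introduces no new assumptions.

```latex
\begin{align*}
\frac{d}{db}\bigl[\,b f(2a) - a f(a+b) + a f(a-b)\,\bigr]
  &= f(2a) - a f'(a+b) - a f'(a-b) = 0, \\
\frac{d^2}{db^2}\bigl[\,b f(2a) - a f(a+b) + a f(a-b)\,\bigr]
  &= -a f''(a+b) + a f''(a-b) = 0
  \;\Longrightarrow\; f''(a+b) = f''(a-b), \\
f''(r) = K_1
  &\;\Longrightarrow\; f'(r) = K_1 r + K_2, \\
\int_r^\infty F(R)\,dR = \frac{f'(r)}{r} = K_1 + \frac{K_2}{r}
  &\;\Longrightarrow\; F(r) = -\frac{d}{dr}\Bigl(K_1 + \frac{K_2}{r}\Bigr) = \frac{K_2}{r^2}.
\end{align*}
```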

3.21 * THE PLIMPTON-LAWTON EXPERIMENT

The accuracy of the Maxwell-McAlister result stood until 1936 when Plimpton and
Lawton of Worcester Polytechnic Institute undertook the task of attempting a more
exact measurement.²¹ Using modern equipment, they were able to show that the
electric force must be an inverse power of the distance, $r^{-(2+\delta)}$, in which $\delta$ is bounded
by $\pm 2 \times 10^{-9}$.
Plimpton and Lawton examined Maxwell's version of the Cavendish experiment
very carefully. It was at first believed that by using a greater charging potential on the
outer sphere and a more sensitive electrometer, one could increase the accuracy of
Maxwell's method. This did not prove to be the case; in fact, it was concluded that
Maxwell had apparently reached the limit attainable by his method, even granting
the sensitivity of modern equipment. Maxwell's method suffered from two limitations:
(1) radioactive contamination of the metal surfaces makes spontaneous ionization
possible, and this could affect the charge on the inner globe during measurement, and
(2) contact potentials establish a lower bound on the detectable voltage of the inner
sphere. The second effect is the more severe of the two.
The contact potential difficulty was eliminated completely and the spontaneous
ionization problem reduced by an ingenious modification of the apparatus. Plimpton
and Lawton placed the detector inside the inner globe and thus were able to make
permanent connections between the electrometer and the inner globe. This eliminated
contact potentials entirely. It was also possible to seal the inner globe inside the outer
one, thus reducing contamination of its surface. The time duration of the data-taking
was drastically curtailed, thereby decreasing the accumulated effects of spontaneous
ionization.
* This section may be omitted without loss in continuity of the technical presentation.
²¹ S. J. Plimpton and W. E. Lawton, "A Very Accurate Test of Coulomb's Law of Force between
Charges," Phys Rev, 50, 1066-1077; 1936.
The apparatus used by Plimpton and Lawton is shown in Figure 3.23. The outer
globe consisted of two hemispheres 5 feet in diameter, soldered together, and mounted
on a porcelain insulator. A slat floor was constructed inside the outer sphere and on it
was placed the detector, housed in copper boxes. These copper boxes formed the lower
half of the inner globe, the upper half being a hemisphere 4 feet in diameter, mounted on
Pyrex glass insulators and connected to the detector boxes. (Plimpton and Lawton showed
that this deformation of the geometry used by Cavendish did not invalidate the
applicability of Maxwell's analysis, as given in Section 3.20.)

[Figure 3.23 labels: mirror, telescope, central rheostat, condenser generator, 110 a.c.]
FIGURE 3.23 The Plimpton-Lawton apparatus.

The detector used was a five-stage amplifier operating a galvanometer. This assembly
was suspended on rubber to avoid microphonics. The Johnson noise of the input resistor
caused an indication of only ~~ microvolt. The galvanometer was viewed through a
conducting window in the outer sphere, this being simply a glass-bottomed vessel
filled with a salt water solution.
During preliminary investigations, it was found that when switches were opened or
closed, the galvanometer deflected, due to magnetic field surges. A quasistatic procedure
was devised to circumvent this difficulty. The outer globe was charged by a sinusoidal
voltage source whose frequency was adjusted to the resonance of the galvanometer.
This greatly enhanced the sensitivity of the instrumentation as well as the signal-to-
noise ratio. It was found that a frequency of about 2 cycles per second submerged the
galvanometer fluctuations due to inductive effects below the Johnson noise.
Plimpton and Lawton were unable to find a commercially available generator at
such a low frequency; therefore they designed and constructed their own. The
time-varying voltage was developed by moving the center plate of a tri-plate condenser
connected to a suitable power supply.
The calibration procedure was simplified by employing a high resistance potentiometer,
as shown in Figure 3.24. During calibration, a small known fraction of the
charging voltage was applied to the inner hemisphere. The potentiometer was varied
until the smallest detectable voltage was determined. This voltage was consistently
less than one microvolt.

[Figure 3.24 labels: from condenser generator, oscilloscope.]
FIGURE 3.24 Method of calibration.

During the actual experiment, the outer sphere was charged with a voltage which
was always in excess of 3,000 volts. Although many trials were made, no detectable
deflection was ever observed in the galvanometer.
To adapt Maxwell's analysis to the Plimpton-Lawton experiment, one can return
to the expressions for the potentials of the two spheres, namely, Equations (3.167)
and (3.168). Since Plimpton and Lawton were using such a low frequency (2 cps), these
equations are still valid in their case, only now there is an implied time dependence of
$e^{j\omega t}$. Solving these two equations for $Q_b$ gives

$$Q_b = \frac{2b^2 f(2a)\,V(b) - 2ab\,[f(a+b) - f(a-b)]\,V(a)}{f(2a) f(2b) - [f(a+b) - f(a-b)]^2} \tag{3.173}$$

Since the detector is connected between the inner and outer spheres, the current
flowing through the extremely high input resistance R of the detector is $\dot{Q}_b = j\omega Q_b$.
Thus

$$V(a) - V(b) = j\omega Q_b R \tag{3.174}$$

Elimination of $Q_b$ from Equations (3.173) and (3.174) gives

$$\left\{\frac{f(2a) f(2b) - [f(a+b) - f(a-b)]^2}{j\omega R} + 2ab\,[f(a+b) - f(a-b)]\right\} V(a)
= \left\{2b^2 f(2a) + \frac{f(2a) f(2b) - [f(a+b) - f(a-b)]^2}{j\omega R}\right\} V(b)$$

Since R is so large, this reduces to

$$\frac{a}{b}\,[f(a+b) - f(a-b)]\,V(a) = f(2a)\,V(b)$$

from which it follows that

$$V(a) - V(b) = V(a) \left[1 - \frac{a}{b}\,\frac{f(a+b) - f(a-b)}{f(2a)}\right] \tag{3.175}$$

But this is the same expression which Maxwell obtained for the static case. Now, however,
the voltages are sinusoidal, and the potential of the inner sphere is being measured
with respect to the outer sphere rather than ground.
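The passage to the large-R limit can be illustrated numerically. This is a hedged sketch, not part of the original analysis: it works in the normalized units of the derivation above (the constant of proportionality in the force law is taken as absorbed, so the product $\omega R$ is treated as a dimensionless parameter), assumes the power-law form $f(r) = r^{1-\delta}/(1-\delta^2)$, and uses hypothetical function names and illustrative radii. It computes V(b)/V(a) from the relation obtained by eliminating $Q_b$ and compares it with the limiting ratio implied by (3.175).

```python
def f(r, delta):
    """Assumed form f(r) = r**(1 - delta) / (1 - delta**2)."""
    return r ** (1.0 - delta) / (1.0 - delta ** 2)


def potential_ratio(a, b, delta, omega_r):
    """V(b)/V(a) after eliminating Q_b between (3.173) and (3.174).
    omega_r stands for the product omega * R in the normalized units assumed here."""
    g = f(a + b, delta) - f(a - b, delta)
    d = f(2 * a, delta) * f(2 * b, delta) - g * g
    leak = d / (1j * omega_r)  # term that vanishes as R grows
    return (leak + 2 * a * b * g) / (2 * b * b * f(2 * a, delta) + leak)


def limiting_ratio(a, b, delta):
    """V(b)/V(a) in the limit of very large R, equivalent to (3.175)."""
    g = f(a + b, delta) - f(a - b, delta)
    return (a / b) * g / f(2 * a, delta)


if __name__ == "__main__":
    a, b, delta = 2.5, 2.0, 1e-3  # illustrative values only
    for omega_r in (1.0, 1e3, 1e6):
        gap = abs(potential_ratio(a, b, delta, omega_r) - limiting_ratio(a, b, delta))
        print(f"omega*R = {omega_r:g}: |difference from large-R limit| = {gap:.3e}")
```

The difference shrinks in proportion to $1/(\omega R)$, which is why a very high detector input resistance lets the static expression carry over to the 2 cps measurement.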
Once again, the assumption can be made that the electric force law is of the form
$r^{-(2+\delta)}$ with $\delta$ small, yielding the first-order result

$$V(a) - V(b) = \frac{\delta}{2}\, V(a) \left[\frac{a}{b} \ln\frac{a+b}{a-b} - \ln\frac{4a^2}{a^2 - b^2}\right]$$

When the values of a and b used by Plimpton and Lawton† are inserted in this expression,
one obtains

$$V(a) - V(b) \cong \frac{\delta}{12}\, V(a) \tag{3.176}$$

Their measurements indicated that $V(a) - V(b)$ was not greater than one-half microvolt
even for $V(a)$ as great as 3,000 volts. This yields the result that $\delta$ is bracketed
by the limits $\pm 2 \times 10^{-9}$.
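The arithmetic behind this bound is short enough to reproduce. The sketch below is illustrative only: the function name is hypothetical, and the coefficient 1/12 is taken from (3.176) as read here. It inverts (3.176) using the one-half microvolt detection limit and the 3,000-volt charging potential quoted above.

```python
def delta_bound(dv_max, v_outer, coefficient=1.0 / 12.0):
    """Largest |delta| consistent with |V(a) - V(b)| <= dv_max,
    assuming V(a) - V(b) = coefficient * delta * V(a) as in (3.176)."""
    return dv_max / (coefficient * v_outer)


if __name__ == "__main__":
    # 0.5 microvolt detection limit, 3,000 volt charging potential
    bound = delta_bound(0.5e-6, 3000.0)
    print(f"|delta| <= {bound:.1e}")  # prints |delta| <= 2.0e-09
```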
Because of the great accuracy of this determination, Plimpton and Lawton even
investigated possible effects due to gravity. This influence was shown to cause a potential
difference between the spheres of less than $10^{-10}$ volts, an effect which could be
neglected.
The Plimpton-Lawton determination of the law of electric force stands as a model
of precise experimentation and provides the most confident basis for a development
of an electromagnetic theory which uses the inverse square law as a postulate.

† The value used for b needs to be a suitable average, since their inner surface was not entirely
spherical.

PROBLEMS

3.1 Two particles of equal mass m and equal charge q are suspended from a common point by
light strings of equal length. Find the angle of separation of the two strings.
3.2 Two small spheres are placed 1 m apart and a charge of 1 coul is placed on each. Will two
strong men be able to hold the spheres in position? (Define a strong man as one weighing
200 lbs and able to lift twice his weight.) How many excess electrons could be placed on
each sphere before the men begin to feel a sense of achievement?
3.3 An uncharged conductor of volume V is placed in a uniform electric field of strength E.
What force does it experience?
3.4 A quantity of negative charge $-Ze$ is distributed uniformly throughout a sphere of radius
$r_a$. A positive point charge $+Ze$ is situated at an arbitrary point within the negative cloud
of charge. Find the force on the positive charge.
3.5 Use the Dirac delta function to express a surface charge density distribution in the form
of a volume distribution. Insert this expression in (3.11) and (3.18) to verify (3.13) and
(3.26).
3.6 A metal shell of radius a contains a charge $Q_a$ and a second metal shell of radius b is given
a charge $Q_b$. If these two shells are then connected by a wire, in which direction will current
flow?

3.7 Use an energy argument to show that the charge on a conductor resides on its outer
surface.

3.8 A discrete system of N charges $q_n$ are at the points $(\xi_n, \eta_n, \zeta_n)$. Find the energy which can
be extracted from this system when each charge is allowed to move infinitely far away
from all the others.

3.9 Rutherford, in 1911, gave a satisfactory explanation for the deflection of $\alpha$ particles by
adopting as a model of the atom a uniformly distributed cloud of negative electricity of
amount $-Ze$, contained in a sphere of radius $r_a$, with a positive charge $Ze$ at its center.
He obtained the following expressions for electric field and potential at any point within
the atom

$$E(r) = \frac{Ze}{4\pi\epsilon_0} \left(\frac{1}{r^2} - \frac{r}{r_a^3}\right)$$

$$\Phi(r) = \frac{Ze}{4\pi\epsilon_0} \left(\frac{1}{r} - \frac{3}{2r_a} + \frac{r^2}{2r_a^3}\right)$$

Verify that these expressions are correct.
3.10 Find the equation for the equipotential surfaces due to an electric doublet. Plot a few of
the contours to scale. Are the results valid near the dipole? In what way should they be
modified?
3.11 A dipole of moment p is located in a uniform electric field of strength E. Assume that p
is initially perpendicular to E and say that in this position the dipole has zero potential
energy. Then show that if the dipole is rotated into any new position, its potential energy
has changed to - p · E.
3.12 A flat circular ring of radius a contains a uniform charge density $\lambda$ coul/m. Find the
potential and field intensity at any point along the axis.
3.13 Use the result of the previous problem to deduce the potential and field intensity at any
point along the axis of a disc of radius a which has a uniform surface charge density $\sigma$.
3.14 Two extensive metal plates are parallel and oppositely charged, each being insulated, with
the intervening space being a vacuum. If the plates are pulled further apart, explain why
their potential difference increases. Suppose that the initial separation was 1 mm and that
initially the potential difference was 1,000 volts. If the ultimate separation is 1 m, what
is the final voltage? This effect accounts for the high voltage of some lightning discharges.
3.15 Three thin concentric conducting spherical shells of radii a = 1 m, b = 2 m, and c = 4 m
are originally uncharged and insulated from each other by a vacuum. A total charge of +3
coul is then placed on the middle shell B. Next, A and C are electrically connected by a
thin wire which goes through a negligibly small hole in B without touching B. After the
wire is removed, the total charge on A and C is measured. Predict the results of these
measurements.
3.16 Use Gauss' law to show that the average potential over any spherical surface in a
charge-free region is equal to the value of the potential at the center of the sphere.
3.17 With the aid of Gauss' law, determine the electric field distribution inside and outside
an electron beam of circular cross section of radius a. Assume the beam possesses a uniform
current density $\iota$ amp/m² and that the electrons are moving at a constant velocity $\mathbf{v} = \mathbf{1}_z v$.
(This problem may be treated electrostatically even though the charges are moving
because the time-average amount of charge at any position in the beam is a constant.)

3.18 Point charges q and -q are placed at the points A, B. The flux line which leaves A, making
an angle $\alpha$ with AB, meets the plane which bisects AB at right angles, in a point P. Show
that

$$\sin\frac{\alpha}{2} = \sqrt{2}\, \sin\frac{\angle PAB}{2}$$

Hint: When this flux line is rotated about AB as axis, the surface thus generated encloses
no net charge.
3.19 A point charge of +q coul is placed d m above an infinite grounded conducting plane in
otherwise empty space. Let P be that point in the plane nearest to q, and with P as center
draw a circle in the plane of radius r. If the circular area thus formed contains one-quarter
of the total induced charge in the plane, find the value of r.
3.20 Two infinite grounded conducting planes intersect at right angles and a point charge q is
placed a distance d from each plane. Find the surface charge distribution in each plane.
3.21 Let the electric field intensity at the surface of a thin spherical conducting shell be E.
Show that if an extremely small hole pierces the shell, then the electric field at the mouth
of the hole is ½E.
3.22 Use the method of images to derive a field expression for the system consisting of a uniform
line charge parallel to and external to a right circular conducting cylinder. Consider the
general case in which the cylinder is at an arbitrary potential.
3.23 A grounded conducting tube of infinite length has a circular cross section of inner radius
10 cm. A line charge of 2 microcoul/m is placed parallel to and at a distance of 5 cm from
the axis of the tube. Determine the surface charge density distribution induced on the
inner surface of the tube. What is the net force per unit length acting on the tube?
3.24 Consider an electrified system consisting of a metallic sphere of radius a and excess charge
Q together with a point charge q a distance b > a from the center of the sphere. Find the
force on q and show that under certain circumstances it can be attractive even if q and Q
are of like sign.
3.25 A hollow conducting sphere has an internal radius of 1 m. A point charge of 1 microcoul
is placed within the sphere at a distance of 50 cm from the center. Find the surface charge
density distribution on the sphere and determine the net electrostatic force experienced
by the sphere.
3.26 Two equal charges q are placed at equal distances d from the center of a grounded con-
ducting shell of radius a. If a > d and the charges are on the same diameter, find the net
force on each charge.
3.27 A hollow conductor is formed by a quarter of a sphere and two perpendicular diametral
planes. Find the image of a charge placed at any internal point.
3.28 A spherical conducting shell of radius b is insulated and uncharged and surrounds a
spherical conductor of radius a, the distance c between their centers being small. The inner
conductor contains a charge Q. Find the potential distribution between conductors and
the surface charge density.
3.29 A cylindrical volume of radius b, extending to infinity in both axial directions, contains
a space charge of constant density $\rho_0$. Find the field intensity E(r) and the electrostatic
potential $\Phi(r)$ for any radial distance.
3.30 Two infinite coaxial cylindrical conducting shells of radii a and b bound a uniform space
charge density $\rho_0$. Determine the potential distribution in this intervening region if the
inner and outer cylinders are held at potentials $\Phi(a) = 0$ and $\Phi(b) = V_0$ respectively.

3.31 Determine the trajectory of an outer electron in an electron beam of circular cross section
subject to spreading caused by space charge repulsions within the beam. Assume axial
velocities to be constant and radial velocities such that the beam diverges symmetrically
with no crossing of trajectories. Obtain the radial electric field as a function of r and relate
this to the radial acceleration; then integrate and determine r as a function of axial
distance. Assume that for $z \le 0$ the beam is confined to $r_m$ with no radial velocity.

[Figure for Problem 3.31: beam profile with labels $r_m$ and z.]

3.32 How long does it take an electron to travel from cathode to plate in a planar diode? Insert
typical values for the parameters and compute the time in microseconds.

3.33 A spherical volume of radius b contains a space charge density $\rho(r) = b^2 - r^2$. Find the
field intensity E(r) and the electrostatic potential $\Phi(r)$ for any radial distance. Check your
results by substitution in Poisson's equation.

3.34 A thin conducting spherical shell of radius a contains a total excess charge Q, and has its
inner surface coated with a thin insulating film. An equal amount of charge is distributed
throughout its hollow interior such that

r < a

Find the charge distribution within the sphere and the surface charge density on its outer
surface. What is the absolute potential of the sphere? Of the point at its center?

3.35 Use Laplace's equation to show that the electrostatic potential cannot have a maximum
or a minimum value at any point in space not occupied by an electric charge. Then show
that if $\Phi$ is maximum at a point, the point must be occupied by a positive charge, whereas
a negative charge must be at a point where the potential is a minimum.
3.36 Two infinite parallel conducting plates are separated by a distance b as shown in the
figure. A very thin conducting septum, infinitely long and of height d, is connected to the
grounded plate, with the other plate kept at a constant potential $V_0$. Solve for the
potential distribution between plates.

[Figure for Problem 3.36: plate separation b; grounded plate at $\Phi = 0$.]

3.37 Two L-shaped conducting channels are placed near each other so as to form a narrow
longitudinal slit, as shown in the figure. The two channels are kept at a difference in
potential $V_0$. Assume that the structure is infinite in the z direction and solve Laplace's
equation to obtain a solution for $\Phi$ in the region between the channels.

[Figure for Problem 3.37: axes x and y; channels at $\Phi = V_0$ and $\Phi = 0$; slit; dimension b.]

3.38 Find the potential distribution and electric field between two half-plane conductors set
at an angle $\phi_0$ but not quite touching. Ignore edge effects and assume that one plate is
grounded with the other at a potential $\Phi(\phi_0) = V_0$. What is the charge distribution?

[Figure for Problem 3.38: half-planes at $\Phi = 0$ and $\Phi = V_0$.]

3.39 Find the potential distribution between a four-segment commutator of radius $r_1$ and a
grounded concentric cylinder of radius $r_2 > r_1$. Alternate segments of the commutator
are at the potentials $\pm V_0$ volts, as shown in the figure.

3.40 A spherical conducting shell of radius a is divided into two hemispheres by a narrow
equatorial gap. If the hemispheres are kept at a difference in potential $V_0$, find the
potential distribution both inside and outside the shell.
3.41 For the quartered spherical shell of Example 3.25, find the potential distribution inside
the shell.
3.42 Find the Green's functions suitable for solving Dirichlet-type boundary-value problems
involving a rectangular box.
3.43 Repeat the preceding problem for a cylindrical box.

3.44 A line charge of density $\lambda$ coul/m is located symmetrically inside a 90-deg grounded
conducting corner, being a distance d from each face, as shown in the figure. By using a
suitable conformal transformation, and then employing the image principle, find the potential
distribution of this system.

[Figure for Problem 3.44: grounded corner at $\Phi = 0$.]

3.45 Repeat the above problem if the line charge is placed symmetrically inside a grounded
trough, as shown in the figure.

[Figure for Problem 3.45: grounded trough at $\Phi = 0$; axis x.]

3.46 With the aid of a Schwarz transformation, find the potential and field distributions in the
region between the two right-angle conducting wedges shown in the figure.

[Figure for Problem 3.46: wedges at $\Phi = V_0$ and $\Phi = 0$; dimension b.]

3.47 A capacitor is formed of three concentric cylinders of which the inner and outer are
connected together. Neglecting end effects, obtain a formula for the capacitance per unit
length.
3.48 Use a Schwarz transformation to determine the change in capacitance, for the geometry
shown, over the value which would be obtained if a uniform field existed in both parallel
plane regions.

[Figure for Problem 3.48: conductors at $\Phi = V_0$ and $\Phi = 0$; dimensions a and b.]

3.49 A sandwich line consists of three parallel plane conductors, as shown in the figure.
Assuming these conductors are infinitely long in a direction perpendicular to the paper,
and neglecting fringing, find the coefficients of capacitance per unit length.

[Figure for Problem 3.49: three parallel plane conductors; dimensions a and b.]

3.50 With the aid of a transformation in the complex plane, find the coefficients of capacitance
per unit length for the geometry shown in the figure.

3.51 Show that the energy stored in the field of a coaxial capacitor is consistent with the
formula $Q^2/2C$. Repeat this calculation for a spherical capacitor.
3.52 Consider an electrostatic system consisting of N conducting bodies possessing charges
$Q_n$ and potentials $V_n$. Prove Thomson's theorem, which states that the charge will
distribute itself so that when in equilibrium the electrostatic stored energy in the field is
a minimum.
3.53 A system of N conductors is charged in any manner and then charges are transferred
among the conductors until they are all brought to the same potential V. Show that there
has been a decrease in the stored electrostatic energy equal to what would be the energy
of the system if each of the original potentials had been decreased by an amount V.
3.54 Under the assumption that the error in measuring the angular position of ball a (see
Figure 3.4) was so much larger than any other error that it was the determining factor in
t
accuracy, and that Coulomb could measure this position within ± deg, to what accuracy
did he determine the inverse square law?
3.55 A proof of the inverse square law, assuming a null result in the Cavendish experiment, was
provided by Maxwell based on a formulation of potentials. Can you give an alternative
proof based on force?
3.56 What was the size of the auxiliary small brass ball used by Maxwell and McAlister?
3.57 From the description given of the Plimpton-Lawton experiment, what is the upper bound
on the total charge residing on the inner globe?
3.58 What was the average diameter of the inner closed surface in the Plimpton-Lawton
experiment?
CHAPTER 4
Magnetostatics in Free Space
MAGNETOSTATICS, as a topic within electromagnetic theory, is usually introduced
by drawing upon experimental evidence to postulate either the Biot-Savart law or
Ampere's circuital law. The theory of magnetic fields due to time-independent current
distributions in free space is then developed. Following this, the behavior of magnetic
materials is considered, usually in terms of aggregations of atomic current loops or
equivalent magnetic dipoles. A satisfactory description of all gross static magnetic
effects may be achieved in this manner.
The approach to be presented in this chapter differs in several respects from the
above. First of all, no new experimentally based postulates will be introduced. Instead,
the previously obtained results of special relativity and electrostatics will be used to
derive the Biot-Savart law. The procedure will be to consider the force exerted on a test
charge by a system of charges which are at rest relative to an observer O'. To a second
observer O, in constant motion relative to O', this charge system is drifting at a constant
velocity, and thus can take on the appearance of a steady current. The second observer
O detects a slightly different force to be acting on the test charge. This slight difference
is determinable through the force transformation law and proves to be the seat of
magnetostatics.
After the force transformation equations have been used to transform Coulomb's
law and thus establish the Biot-Savart law, the chapter proceeds conventionally with
the introduction of the magnetostatic vector potential function and the derivation
of Ampere's circuital law. The magnetostatic vector potential function is found to
satisfy Poisson's equation, leading to the solution of a class of boundary-value problems,
in analogy with what was presented in Chapter 3 in the case of electrostatics.
As illustrations of the theory, a variety of problems is solved, including the far field
of a small current loop. This result forms the building block for an explanation of
magnetic effects in materials. However, the discussion of magnetic materials will be
deferred until Chapter 7, in order to be able to include time-varying effects.

4.1 * HISTORICAL SURVEY


* This section may be omitted without loss in continuity of the technical presentation.

Man's awareness of magnetic effects appears to be almost as old as recorded history,
but most of the early knowledge was concerned with the properties of permanent
magnets. The subject of magnetic fields caused by electric currents had a very
well-defined beginning in the winter of 1819-1820. During that period, Professor Hans
Christian Oersted (1777-1851) of the University of Copenhagen experimented with the
placement of a closed electric circuit near a compass needle. He had been motivated
in this study by the observation that a compass needle fluctuated erratically during a
thunderstorm. Accordingly, he set up an apparatus consisting of a galvanic battery
and a short-circuiting wire. Apparently during one of his lectures Oersted placed the
wire at right angles to a compass needle, but observed no effect. At the end of this
lecture the thought occurred to him to place the wire parallel to the needle. This action
immediately caused a pronounced deflection in the needle. After putting together a
more powerful galvanic battery, Oersted assembled some of his colleagues as witnesses
and repeated the experiment. Excerpts of his own account¹ of what was observed
follow:
The opposite ends of the galvanic battery were joined by a metallic wire, which, for short-
ness sake, we shall call the uniting conductor . . . . To the effect which takes place in this
conductor and in the surrounding space, we shall give the name of the conflict of electricity.
Let the straight part of this wire be placed horizontally above the magnetic needle,
properly suspended, and parallel to it . . . . Things being in this state, the needle will be
moved . . . .
If the distance of the uniting conductor does not exceed three-quarters of an inch from
the needle, the declination of the needle makes an angle of about 45°. If the distance is
increased, the angle diminishes proportionally. The declination likewise varies with the
power of the battery . . . .
The effect of the uniting conductor passes to the needle through glass, metals, wood, water,
resin, stoneware, stones; . . . . The effects, therefore, which take place in the conflict of
electricity are very different from the effects of either of the electricities.
If the uniting conductor be placed in a horizontal plane under the magnetic needle,
all the effects are the same as when it is above the needle, only they are in the opposite
direction . . . .
After noting that a rotation of the wire would be tracked by a rotation of the magnetic
needle, and that no effect was observed for needles made of brass, glass, or gum lac,
Oersted offered a few observations in the nature of an explanation of the phenomenon:
The electric conflict acts only on the magnetic particles of matter. All non-magnetic
bodies appear penetrable by the electric conflict, while magnetic bodies, or rather their
magnetic particles, resist the passage of this conflict. Hence they can be moved by the
impetus of the contending powers.
It is sufficiently evident from the preceding facts that the electric conflict is not confined
to the conductor, but dispersed pretty widely in the circumjacent space.
From the preceding facts we may likewise collect that this conflict performs circles; for
without this condition, it seems impossible that the one part of the uniting conductor, when
placed below the magnetic pole, should drive it towards the east, and when placed above it
towards the west . . . .
There has been some debate as to the extent of the honor which should be accorded
Oersted for this discovery. A principal factor occasioning this debate is a letter from
Hansteen (one of Oersted's students) to Faraday, in which he says:²
1 H. C. Oersted, "Experiments on the Effect of a Current of Electricity on the Magnetic Needle,"
a pamphlet dated July 21, 1820, distributed privately to scientists and scientific societies. English
translation in Ann. of Philosophy, 16, 273-276; 1820.
2 Bence Jones, The Life and Letters of Faraday, vol. 2, pp. 389-392, Longmans, Green and Company,
London, 1870.
218 Magnetostatics in Free Space CHAPTER 4

Professor Oersted was a man of genius, but he was a very unhappy experimenter: he
could not manipulate instruments . . . . Oersted tried to place the wire of his galvanic
battery perpendicular over the magnetic needle, but remarked no sensible motion. Once,
after the end of his lecture, as he had used a strong galvanic battery to other experiments, he
said, "Let us now once, as the battery is in activity, try to place the wire parallel with the
needle"; as this was made, he was quite struck with perplexity by seeing the needle make
a great oscillation . . . . Thus the great detection was made; and it has been said, not
without reason, that "he tumbled over it by accident." He had not before any more idea
than any other person that the force should be transversal. But as Lagrange has said of
Newton in a similar occasion, "such accidents only meet persons who deserve them."

Considerable weight has been given to this letter by Hansteen, because he was appar-
ently a witness to the original discovery. However, the letter was written in 1857,
almost forty years after the fact, and six years after Oersted's passing. True, Oersted's
own account, paraphrased above, was completely lacking in quantitative determina-
tion, but all the salient features of the phenomenon had quite clearly been investigated,
including the dependence on current strength, distance of separation, and even shielding
effects. The inference of a circular distribution of magnetic field lines was certainly an
able deduction. As Lenard³ has pointed out, the fact that Oersted had a battery and
compass needle on the table indicates he was looking for such an effect and that the
discovery cannot fairly be labeled as a pure accident. But whatever the true circumstances
were surrounding this discovery, it was one of the most important in the history
of science, linking for the first time the fields of electricity and magnetism.
Oersted's discovery was promptly enlarged by others. The academician Arago learned
of it while traveling abroad and, upon his return to Paris, described the effect at a meet-
ing of the French Academy on September 11, 1820. This news excited the interest of
several investigators, and the next discovery was announced by Andre-Marie Ampere
(1775-1836) just one week later. Reasoning that, if magnets exert forces on each other,
and if electric currents exert forces on magnets, then two electric currents should
interact, Ampere devised an experiment in which⁴
. . . in parallel directions, two straight parts of two conducting wires joined the terminals
of two voltaic piles; the one being fixed, and the other suspended from points and made very
mobile by a counterpoise, being able to approach or withdraw while still retaining its
parallelism with the first wire, I have then observed that upon passing an electric current
through each of them, they mutually attract if the two currents are in the same direction,
and that they repel each other when, instead, (the currents) are in opposite directions.
Meanwhile, Jean-Baptiste Biot (1774-1862) and Felix Savart (1791-1841) repeated
Oersted's experiments, and announced to the Academy at the October 30th meeting
that they had determined a law of force which governed the effect. The following brief
notice of their announcement was printed in the Journal de Physique:⁵
The beautiful observations of M. Oersted, combined with precise measurements of torsion
and oscillation, give the following expression for the action exercised at a distance on an
austral or boreal magnetic pole, by a nearby thin copper wire, of great length, connected to
the two terminals of a voltaic apparatus. From the point of the pole draw a perpendicular
3 P. Lenard, Great Men of Science, p. 214, The Macmillan Company, Inc., New York, 1933.
4 A. M. Ampere, "Memoir on the Mutual Action of Two Electric Currents," Annales de Chimie et
Physique, 15, pp. 59-76; 1820.
5 J de Phys (Paris), 91, 151; 1820. See also Ann de Chimie et Physique, 15, 222-223; 1820.
SECTION 1 Historical Survey 219

to the axis of the wire. The force acting on the pole is perpendicular to the axis of the wire.
Its intensity is proportional to the reciprocal of the distance. The nature of its action is the
same as if a magnetic needle were to be placed tangentially to the contour of the wire (in
place of the wire), in which case the austral and boreal magnetic poles would be acted upon
in opposite senses, but always along the same straight line determined by the preceding
construction.

The best source for the details of the experiment which established this law is Biot's
Précis Élémentaire de Physique.⁶ The method used can be understood with reference to
Figure 4.1a, which is a reproduction of Biot's original drawing. Shown is a compass
6 The third edition of this text was printed in Paris in 1824. An English translation is embodied in
J. Farrar's Elements of Electricity, Magnetism, and Electromagnetism, printed by Hilliard and Metcalf,
Cambridge, Massachusetts, 1826.

FIGURE 4.1 The Biot-Savart experiments. [Figure not reproduced; panels (a), (b), and (c).]

needle AB, which can freely pivot about its center point, and which is placed a distance
$r$ from a long, straight wire CMZ. A permanent magnet A'B' (not shown) is positioned
nearby in such a way as to cancel the effect of the earth's magnetic field. The equilibrium
position of the needle is then found to be perpendicular to the wire axis. If the
needle is pictured as having two equal and opposite magnetic poles at its extremities,
the forces exerted by the current on these poles are thus equal, opposite, and circumferential.
If then the needle is displaced from equilibrium by a small angle $\theta$, as shown in
Figure 4.1b, a restoring couple is experienced by the needle, and its equation of motion
is

$$-F(r)\,L\sin\theta = I\ddot{\theta}$$

in which $L$ is the length of the needle and $I$ is its moment of inertia. For small displacements,
harmonic oscillations will occur of period

$$T = 2\pi\sqrt{\frac{I}{F(r)\,L}}$$

Thus, in Biot's words,


. . . if we compare in this way, the squares of the periods, for different distances of the
uniting wire from the needle, supposing always the condition of isochronism to be fulfilled,
we shall obtain the ratios of the component forces exerted in these different cases by the
uniting wire, parallel to the direction of equilibrium about which the needle oscillates.
Upon performing this experiment, Biot and Savart obtained data which is reproduced
in Table 4.1. The last column of data was calculated under the assumption that the
law was inverse with distance. Since the errors were alternately positive and negative,
irregular, and greater for the larger distances, Biot and Savart concluded that the
law had been fairly established.

TABLE 4.1
DATA FOR THE BIOT-SAVART EXPERIMENT

Distance of        Duration of ten oscillations, sec
the wire, mm       Observed        Calculated

     15             30.00           30.99
     20             33.50           33.88
     40             48.85           48.62
     50             54.75           53.74
     60             56.75           59.40
    120             89.00           84.25
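
The inverse-distance law can be checked roughly against the observed column of Table 4.1. The sketch below is illustrative only. It assumes that the restoring force is due entirely to the wire, so that the period scales as the square root of the distance, T = k sqrt(r). The function name `fit_sqrt_law` and the least-squares criterion are choices made here, not Biot and Savart's own procedure, and the fitted values are not expected to reproduce the Calculated column exactly.

```python
import math

# Table 4.1: wire distance (mm) and observed duration of ten oscillations (sec).
distances_mm = [15, 20, 40, 50, 60, 120]
observed_s = [30.00, 33.50, 48.85, 54.75, 56.75, 89.00]


def fit_sqrt_law(r, t):
    """Least-squares k for the one-parameter model t = k * sqrt(r).

    Assumption (not from the text): a single constant k, i.e. the wire
    alone supplies the restoring force F(r) proportional to 1/r.
    """
    numerator = sum(ti * math.sqrt(ri) for ri, ti in zip(r, t))
    denominator = sum(r)  # sum of sqrt(r)**2
    return numerator / denominator


k = fit_sqrt_law(distances_mm, observed_s)
predicted_s = [k * math.sqrt(r) for r in distances_mm]
residuals_s = [obs - pred for obs, pred in zip(observed_s, predicted_s)]

if __name__ == "__main__":
    print(f"fitted k = {k:.3f} sec per sqrt(mm)")
    for r, obs, pred, res in zip(distances_mm, observed_s, predicted_s, residuals_s):
        print(f"{r:4d} mm  observed {obs:6.2f}  model {pred:6.2f}  residual {res:+6.2f}")
```

Residuals that change sign irregularly from row to row, rather than drifting steadily in one direction, are the kind of pattern Biot and Savart took as support for the law.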

Biot extended this experiment significantly by inquiring what the action must be
on the compass needle due to an infinitesimal length of the wire. Since the influence
of the entire straight wire varied as $r^{-1}$, and since $r^{-1}$ is the integral of $r^{-2}$, he felt that
each element of the wire should make a contribution to the total force which is pro-
portional to the inverse square of its distance from the needle. However, he realized that

the contribution might also depend on the orientation of the element relative to the
needle, and devised an experiment to deduce this relation. Referring to Figure 4.1c,
Biot introduced an additional V-shaped wire with its apex close to the central point
of the first wire. He then determined the period of the compass needle as a function of
$r$, with a steady current alternately passing through the straight wire and the bent
wire. The difference in period under the action of the two wires could be explained⁷ by
the assumption that the contribution from a single current element $I\,d\ell$ was proportional
to $(\sin\omega)/\mathscr{r}^2$. The discovery of this fact led Biot to proclaim
. . . the elementary action of any lamina whatever (is) proportional to $\sin\omega/\mathscr{r}^2$; and
uniting with this expression, which is founded upon experiment, the knowledge of the
absolute direction of the force which is perpendicular to the plane drawn through each
distance and through the direction of each longitudinal element of the wire under con-
sideration, we may assign by calculation the total resultant of the action exerted by a wire,
or by any portion of a wire, whether straight or curved, limited or indefinite.
In present-day notation, this result is equivalent to saying that a system of steady
currents $I$ creates a magnetic field at point $(x,y,z)$ given by

$$\mathbf{B}(x,y,z) \propto \int \frac{I\,d\boldsymbol{\ell} \times \boldsymbol{\mathscr{r}}}{\mathscr{r}^3}$$

and that if a magnetic pole of strength $m$ is placed at $(x,y,z)$ it will experience a force
$m\mathbf{B}$. In the above formula, $\mathscr{r}$ is the distance from the element $d\boldsymbol{\ell}$ to $(x,y,z)$. This important
equation is known as the Biot-Savart law and is often taken as the experimental
postulate on which magnetostatics is based.
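
The link between the inverse-square element law and the inverse-distance law for a whole wire can be illustrated numerically. The following is a minimal sketch, assuming a straight wire on the z axis, a field point at (r, 0, 0), unit current, and a unit constant of proportionality. The finite half-length of 1000 stands in for "great length", and the helper names `cross` and `wire_field` are hypothetical.

```python
def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def wire_field(r, half_length=1000.0, n=100_000):
    """Midpoint-rule sum of dl x sep / |sep|**3 along a wire on the z axis.

    The field point is (r, 0, 0); sep runs from the element to the field
    point. Current and the constant of proportionality are set to 1
    (assumptions made for illustration).
    """
    dz = 2.0 * half_length / n
    total = [0.0, 0.0, 0.0]
    for i in range(n):
        z = -half_length + (i + 0.5) * dz
        dl = (0.0, 0.0, dz)
        sep = (r, 0.0, -z)
        dist_cubed = (r * r + z * z) ** 1.5
        contribution = cross(dl, sep)
        for j in range(3):
            total[j] += contribution[j] / dist_cubed
    return tuple(total)


if __name__ == "__main__":
    for r in (1.0, 2.0, 4.0):
        bx, by, bz = wire_field(r)
        # by * r stays close to 2, so the total falls off as 1/r.
        print(f"r = {r:3.1f}  B = ({bx:.3e}, {by:.6f}, {bz:.3e})  r*By = {by * r:.6f}")
```

Only the circumferential (y) component survives, and its product with r is nearly constant, which is the inverse-distance behavior reported to the Academy.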
Ampere, following his announcement of the discovery of the force between two
currents, continued his investigations, and later that year published a memoir⁸ which
succeeded in clarifying much of what was known about electricity at that time. He
distinguished phenomena involving electricity at rest from phenomena involving
electricity in motion, introducing for the former the name electrostatics, and for the
latter the name electrodynamics. He also distinguished between electric tension (volt-
age) and electric current. At that time, people were accustomed to speak of the con-
duction and flow of electricity, but since the two-fluid theory was popular, considerable
confusion existed with respect to the nature of the flow process. Ampere decided that
he would call the whole process an electric current, without regard to its inner nature,
and with the direction of the current defined as the direction in which the positive fluid
was presumed to move. This made the electric current something definite in terms of
which phenomena could be described.
The concept of electric potential, or tension, had been privately appreciated by
Cavendish, and had been admirably developed for electrostatics by Poisson. Ampere
noted that electric tension was observable in a voltaic pile before the circuit was closed,
being detectable through use of an electrometer or electroscope, instruments which
Ampere labeled as measurers of tension. As for the current itself, Ampere felt that it
was best measured by means of its magnetic effects, and he introduced for this purpose
an instrument which he called a galvanometer, an instrument which is still in use today.
To Ampere, tension appeared as a cause, and current as an effect. Noting that as
7 See Farrar, ibid., pp. 334-339. See also Problem 4.3 at the end of this chapter.
8 Ibid., pp. 59-68.

soon as the effect appears through completion of the circuit, the tension "disappears,
or at least becomes very small," Ampere then made the interesting observation⁹
The currents of which I speak self-accelerate until the inertia of the electric fluids and the
resistance that they encounter due to the imperfections in even the best conductors cause
equilibrium with the electromotive force, after which they continue indefinitely at a con-
stant speed such that this force remains at a constant intensity; but they cease entirely at
the instant that the circuit is interrupted.

Ohm's law, which was to be enunciated seven years later, is thus seen to be not far off
in Ampere's thinking.
With a clear definition of current, and a means for measuring it, Ampere continued
his researches over the next three years, and in 1825 collected his results in a lengthy
memoir¹⁰ which must rank as one of the most distinguished in the history of science.
In this memoir, Ampere concerned himself with the problem of determining the law
of force between two current elements. A wide variety of experiments on an assortment
of wire geometries had led him to four conclusions about the force interaction between
currents:
1. The action of one current on another is unchanged in magnitude, but reversed
in direction, when the direction of the current is reversed.
2. The effect of a conductor bent or twisted in any small manner is the same as if
the contour were smoothed out.
3. The force exerted by a closed circuit on a current element is always normal to the
element.
4. If all dimensions of a circuit are changed proportionally, with the currents
unchanged, the forces retain their original values.
When Ampere added to these four conditions the natural assumption that the force
$d^2\mathbf{F}$ between two current elements $I\,d\boldsymbol{\ell}$ and $I'\,d\boldsymbol{\ell}'$ is along their connecting line (an
assumption consistent with Newton's gravitational theory and the Coulomb-Poisson
electrostatic theory), he was able, by an astute piece of analysis, to establish the force
law

$$d^2\mathbf{F} \propto II'\,\boldsymbol{\mathscr{r}}\left[\frac{2\,(d\boldsymbol{\ell}\cdot d\boldsymbol{\ell}')}{\mathscr{r}^3} - \frac{3\,(d\boldsymbol{\ell}\cdot\boldsymbol{\mathscr{r}})(d\boldsymbol{\ell}'\cdot\boldsymbol{\mathscr{r}})}{\mathscr{r}^5}\right]$$

in which $\mathscr{r}$ is the distance separating the two current elements. A clear exposition of the
analysis leading to this formula, as well as the experimental basis for Ampere's four
conditions, can be found in Mason and Weaver.¹¹
If Ampere's third condition is formulated in terms of a field concept, one may write

$$d\mathbf{F} = I'\,d\boldsymbol{\ell}' \times \mathbf{B}$$

in which $d\mathbf{F}$ is the force exerted on the current element $I'\,d\boldsymbol{\ell}'$ and $\mathbf{B}$ is the field caused
by the closed circuit. From this, if $I\,d\boldsymbol{\ell}$ is an element of the closed circuit, according
9 Ibid., p. 64.
10 A. M. Ampere, "On the Mathematical Theory of Electrodynamic Phenomena Uniquely Deduced
from Experiment," Mem Acad, 175-388; 1825.
11 M. Mason and W. Weaver, The Electromagnetic Field, pp. 176-183, The University of Chicago Press,
1929. Reprinted by Dover Publications, Inc., New York.



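to the Biot-Savart law, it exerts a force on $I'\,d\boldsymbol{\ell}'$ given by

$$d^2\mathbf{F} \propto II'\,\frac{d\boldsymbol{\ell}' \times (d\boldsymbol{\ell} \times \boldsymbol{\mathscr{r}})}{\mathscr{r}^3}$$

which does not agree with Ampere's formula for $d^2\mathbf{F}$.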


A lively controversy ensued for some time as to which of these formulas for differential
force is correct. Various investigators have shown¹² that for closed circuits
$\oint\oint d^2\mathbf{F}$ gives the same answer when starting with either formula. In Ampere's time it
was not possible to decide the question by experiment. Now that the motion of free
charges can be studied under the influence of magnetic fields, the decision is clearly
in favor of the Biot-Savart formula. Ampere's difficulty can be traced to the improper
assumption that the elemental force acts along the connecting line.
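
The agreement of the two differential formulas for a closed source circuit can be checked numerically. The sketch below is illustrative and rests on stated assumptions: the source circuit is a unit circle in the xy plane, currents and the constant of proportionality are set to 1, the separation vector runs from the source element to the test element, and Ampere's expression carries the overall sign that makes parallel currents attract under that convention. The function name `loop_forces` is hypothetical.

```python
import math


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def loop_forces(field_point, test_element, n=2000):
    """Force of a closed unit-circle circuit on one test current element.

    Returns (ampere, biot_savart): the same source loop summed with
    Ampere's central-force formula and with the Biot-Savart formula.
    """
    f_ampere = [0.0, 0.0, 0.0]
    f_biot = [0.0, 0.0, 0.0]
    dphi = 2.0 * math.pi / n
    for i in range(n):
        phi = i * dphi
        source = (math.cos(phi), math.sin(phi), 0.0)
        dl = (-math.sin(phi) * dphi, math.cos(phi) * dphi, 0.0)
        sep = tuple(field_point[j] - source[j] for j in range(3))
        dist = math.sqrt(dot(sep, sep))
        # Ampere: force directed along the connecting line.
        scalar = (2.0 * dot(dl, test_element) / dist ** 3
                  - 3.0 * dot(dl, sep) * dot(test_element, sep) / dist ** 5)
        # Biot-Savart: dl' x (dl x sep) / dist**3.
        transverse = cross(test_element, cross(dl, sep))
        for j in range(3):
            f_ampere[j] -= scalar * sep[j]
            f_biot[j] += transverse[j] / dist ** 3
    return tuple(f_ampere), tuple(f_biot)


if __name__ == "__main__":
    point = (2.0, 0.3, 0.5)
    element = (0.2, -0.5, 0.8)
    ampere, biot = loop_forces(point, element)
    print("Ampere     :", ampere)
    print("Biot-Savart:", biot)
    print("component along the element:", dot(biot, element))
```

Element by element the two integrands differ, but their difference is an exact differential along the source circuit, so the closed-loop sums coincide. The resulting force is normal to the test element, in keeping with Ampere's third condition.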
Unlike Biot, who regarded magnetic poles as fundamental, Ampere considered
magnetism to be basically an electrical phenomenon. He viewed a magnetized rod as
equivalent to a coil carrying an uninterrupted current. He showed that two solenoids
deflect each other in exactly the same way as do two magnetized rods, and was even
able to show that a single current loop, when free to move, sets itself like a compass
needle with respect to the earth's magnetic field. Ultimately, Ampere came to the view
that every magnetic molecule is really a small permanent circular current. This view-
point was much too advanced for his contemporaries. The meager knowledge of atomic
structure would not permit the conception of permanent currents within materials
without a source of power. However, the impression produced by this memoir was deep
and lasting, and today Ampere's views of these phenomena form the core of magnetic
theory. He is properly credited with authorship of the force law $d\mathbf{F} = I'\,d\boldsymbol{\ell}' \times \mathbf{B}$,
even though Biot and Savart deserve citation for the correct formulation of B in terms
of the current elements in a closed circuit. Ampere himself extended the applicability
of this formula by showing that a permanent magnet will exert a force on a current.
His achievements were truly remarkable and Maxwell, writing a half century later,
labeled his memoir "one of the most brilliant achievements in science." As a fitting
tribute, the unit for electric current and the circuital law linking magnetic field and
current are named in his honor.
During this same period Faraday made a discovery of the greatest practical impor-
tance. His interest had been aroused in electromagnetism in April 1821 when Wollaston,
a colleague at the Royal Institution, attempted to make a current-carrying wire revolve
around its own axis in the presence of a magnet. Although the experiment was unsuc-
cessful, it piqued Faraday's interest. He began by reading what had been done by
Oersted, Ampere, Biot and Savart, and others, and repeated many of their experiments.
Finally, upon repeating Wollaston's experiment, he noted:¹³
. . . Magnets of different power brought perpendicularly to this wire did not make it
revolve as Dr. Wollaston expected, but thrust it from side to side . . . . The effort of the
wire is always to pass off at a right angle from the pole, indeed to go in a circle round it; . . .
a single magnet pole in the centre of one of the circles should make the wire continually
turn round. Arranged a magnet needle in a glass tube with mercury about it and by a cork,
water, etc., supported a connecting wire so that the upper end should go into the silver cup
12 Cf., e.g., Mason and Weaver, ibid., pp. 183-185.
13 Faraday's Diary, vol. 1, pp. 49-50, being entries in his laboratory notebook for September 3rd, 1821.
Published by G. Bell and Sons, Ltd., London, 1932.

and its mercury and the lower move in the channel of mercury round the pole of the nee-
dle . . . . In this way got the revolution of the wire round the pole of the magnet . . . .
Very Satisfactory, but make more sensible apparatus.

This was the first electric motor. The next day Faraday improved on it and shortly
thereafter invented the commutator. But he left to others the reduction to practice.
Magnetostatic theory was advanced and placed more in analogy with electrostatic
theory by Franz Neumann (1798-1895) of Konigsberg, who introduced the concept
of the magnetic vector potential function A. Neumann discovered the utility of this
formulation¹⁴ while devising a theory based on Faraday's emf law, and the A which
he defined is the more general time-varying function which will be encountered in
Chapter 5. However, its time-independent counterpart facilitates the solution of many
magnetostatic problems and will be used extensively in the sections to follow.
An entirely different approach to magnetostatic theory was first perceived by Leigh
Page (1884-1952) of Yale University.¹⁵ Adopting the principles of special relativity
and Coulomb's law as fundamental postulates, he began with a system of charges at
rest relative to an observer 0'. Upon introducing a second observer 0, in constant motion
relative to 0', he observed that the charge system took on the appearance to 0 of a
steady current. Upon transforming the Coulomb force, Page noted that the force on a
test charge, as observed by 0, was slightly different from the value determined by 0'.
He was able to show that this small difference precisely accounted for the magnetic field.
Not only was this demonstration additional evidence in support of the validity of
special relativity, but it also further illumined the basic unity of all electrical phe-
nomena, whether they are due to charges at rest or in motion. A generalization of Page's
development will form the core of the present chapter.

4.2 THE TRANSFORMATION OF ELECTRIC FORCE

Let an electric charge $q_1'$ be at rest at an arbitrary point $(x_1', y_1', z_1')$ in the X'Y'Z' coordinate
system and let a moving† charge $q'$ be instantaneously at the position

$$\mathbf{r}' = \mathbf{1}_x x' + \mathbf{1}_y y' + \mathbf{1}_z z'$$

The coordinates $x'$, $y'$, $z'$ may be arbitrary functions of time so that, in general, the
charge $q'$ has a velocity $\mathbf{v}'(t')$. This situation is depicted in Figure 4.2.
An observer O', stationary in X'Y'Z', will determine the force exerted by $q_1'$ on $q'$
through application of Coulomb's law, obtaining

$$\mathbf{F}' = \frac{q' q_1'}{4\pi\epsilon_0}\,\frac{\boldsymbol{\mathscr{r}}'}{(\mathscr{r}')^3} \tag{4.1}$$

in which $\boldsymbol{\mathscr{r}}' = \mathbf{1}_x(x' - x_1') + \mathbf{1}_y(y' - y_1') + \mathbf{1}_z(z' - z_1')$ and it is assumed that the
charges are in free space.
Let the coordinate axes of an XYZ system be aligned with the corresponding axes
† For the moment this statement means that this charge had a value of $q'$ when at rest in X'Y'Z'.
It shall be seen shortly that it is convenient to consider charge to be an invariant.
14 F. E. Neumann, "The Mathematical Laws for Induced Electric Currents," Berlin Abhandlungen,
p. 1; 1845. Also p. 1; 1848. Reprinted as nos. 10 and 36 of Ostwald's Klassiker.
15 L. Page, "A Derivation of the Fundamental Relations of Electrodynamics From Those of Electrostatics,"
Am J Sci, 34, 57-68; 1912.


FIGURE 4.2 Notation for a moving charge acted on by a fixed charge. [Figure not reproduced.]

of the X'Y'Z' system and let the X axis slide along the X' axis in the negative direction
at a speed $u$. Upon the coincidence of the two origins let $t = t' = 0$. If $\mathbf{v}_1$ and $\mathbf{v}$ are the
velocities of $q_1'$ and $q'$ with respect to an observer O who is stationary in XYZ, then
(4.2)
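From Equations (2.77), the force exerted on $q'$ by $q_1'$, as determined by O, is given by

$$\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m \tag{4.3}$$

in which

$$\mathbf{F}_e = \mathbf{1}_x F_x' + \mathbf{1}_y \kappa F_y' + \mathbf{1}_z \kappa F_z' \tag{4.4}$$

$$\mathbf{F}_m = \frac{\kappa u}{c^2}\left[\mathbf{1}_x (v_y F_y' + v_z F_z') - \mathbf{1}_y v_x F_y' - \mathbf{1}_z v_x F_z'\right] \tag{4.5}$$

with $\kappa = (1 - u^2/c^2)^{-1/2}$.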

The division of $\mathbf{F}$ into two parts is arbitrary but useful in that $\mathbf{F}_e$ contains all the
terms not dependent on the motion of $q'$ relative to XYZ, whereas $\mathbf{F}_m$ contains all the
terms which do depend on $\mathbf{v}$. The subscripts on the forces $\mathbf{F}_e$ and $\mathbf{F}_m$ refer to the fact
that they shall be designated the coulomb force and the magnetostatic force for reasons
which shall become evident.
Two cases of this force transformation will now be considered.
Case 1: $\mathbf{v} = 0$.
In this case, $q_1'$ is static in X'Y'Z' and $q'$ is static in XYZ. The force exerted by $q_1'$ on
$q'$, as determined by O, is simply $\mathbf{F}_e$ given by (4.4).

Suppose that one wishes to apply the concept of electric flux density to this situation.
The reader will recall that in Chapter 3 the idea of electric flux was introduced in
connection with a system of fixed charges. Consistent with that discussion, observer O'
can say, since $q_1'$ is at rest in his coordinate system, that the electric flux density associated
with $q_1'$ is given by

$$\mathbf{D}_0' = \frac{q_1'}{4\pi}\,\frac{\boldsymbol{\mathscr{r}}'}{(\mathscr{r}')^3} \tag{4.6}$$

in which $\boldsymbol{\mathscr{r}}'$ is drawn from $q_1'$ to a field point P' where $\mathbf{D}_0'$ is being determined.
O' can say further that if $q'$ is instantaneously at the point P', it interacts with the
field $\mathbf{D}_0'$ so as to experience a force

$$\mathbf{F}' = q'\,\frac{\mathbf{D}_0'}{\epsilon_0} \tag{4.7}$$

which is consistent with (4.1).


One needs to proceed cautiously, however, in picturing force as the interaction of
charge and electric flux, when adopting the viewpoint of observer O. For him, the charge
$q_1'$ is not at rest. To discuss the concept of electric flux associated with a moving charge
requires an extension of the original definition of electric flux.
It is useful to explore the consequences of enlarging the original definition by assuming
that a quantity of electric charge and the total electric flux associated with it are
invariants. The O' observer, for whom $q_1'$ is at rest, will picture the electric flux as
emanating from $q_1'$ with a spherically uniform distribution. Referring to Figure 4.3a, he
can find the components of $\mathbf{D}_0'$ at any point P' by erecting small displacements $P'P_1'$
and $P'P_2'$ as shown. When the segment $P'P_1'$ is rotated around the line L' as axis, it
cuts out a band of area $S_1'$, as shown in Figure 4.3b. If all the electric flux which pierces
this band is counted and divided by the area $S_1'$, O' obtains the transverse flux density
$[(D_{0y}')^2 + (D_{0z}')^2]^{1/2}$. Similarly, when the segment $P'P_2'$ is rotated about L', it cuts out
a ring of area $S_2'$ as shown in Figure 4.3c. If all the electric flux which pierces this ring
is counted and divided by the area $S_2'$, the longitudinal flux density $D_{0x}'$ is obtained.
The O observer, for whom the charge $q_1$ has the velocity $\mathbf{1}_x u$, sees all longitudinal
dimensions of X'Y'Z' contracted by the factor $(1 - u^2/c^2)^{1/2}$ and all transverse dimensions
unaltered, thus picturing the instantaneous situation shown in Figure 4.3d.
Under the assumption that charge and total flux are invariants, O will count the same
number of flux lines piercing the area $S_1$ (generated by rotating $PP_1$ about L as in
Figure 4.3e) that O' counts piercing the area $S_1'$. However, O finds the area $S_1$ to be
smaller than $S_1'$, the relation being

$$S_1 = S_1'/\kappa$$

Thus O concludes that the transverse flux density is

$$(D_{0y}^2 + D_{0z}^2)^{1/2} = \kappa\left[(D_{0y}')^2 + (D_{0z}')^2\right]^{1/2} \tag{4.8}$$

By rotating $PP_2$ about L as axis (Figure 4.3f), O generates the same area as does O'
and counts the same number of piercing flux lines and thus concludes that

$$D_{0x} = D_{0x}' \tag{4.9}$$

FIGURE 4.3 Comparison of flux densities. [Figure not reproduced; panels (a) through (f).]

If O considers the force exerted by $q_1$ on $q$ to be computable in the usual way, in terms
of an interaction between $q$ and the flux field of $q_1$, he can write, by virtue of (4.8) and
(4.9),

$$F_x = q\,\frac{D_{0x}}{\epsilon_0} = q\,\frac{D_{0x}'}{\epsilon_0} = F_x'$$

$$F_y = q\,\frac{D_{0y}}{\epsilon_0} = q\,\frac{\kappa D_{0y}'}{\epsilon_0} = \kappa F_y'$$

$$F_z = q\,\frac{D_{0z}}{\epsilon_0} = q\,\frac{\kappa D_{0z}'}{\epsilon_0} = \kappa F_z'$$


But these equations agree with (4.4). Therefore by defining charge and total electric
flux to be invariants, and by making use of the relativistic contraction of length, one
can extend the validity of the relation

$$\mathbf{F}_e = q\,\frac{\mathbf{D}_0}{\epsilon_0} \tag{4.10}$$

to include the case that $\mathbf{D}_0$ is the flux density due to a charge in constant translatory
motion.

FIGURE 4.4 Interaction of electric field and charge in relative motion. [Figure not reproduced; the labels read "Moving charge," "Stationary charge," and "Stationary field."]

The flux distribution as visualized by the two observers is illustrated in Figure 4.4.
For observer O' the field is stationary and spherically symmetrical; the charge $q'$ is
moving through this field at the constant velocity $-\mathbf{1}_x u$. For observer O the charge $q$
is stationary and the field is moving past it at the constant velocity $\mathbf{1}_x u$; the field
exhibits some longitudinal compression due to its motion. For both observers the force
is time-varying: for observer O' this is because $q'$ keeps moving into new regions of
different static field intensity; for observer O it is because, at the static position occupied
by $q$, the field intensity keeps changing with time.

Since measurements of distance by the two observers can be connected through the
relation

$$\mathscr{r}' = \left[\kappa^2 (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2\right]^{1/2}$$

it follows that (4.8), (4.9), and (4.10) can be combined to give

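$$\mathbf{F}_e = \frac{\kappa\, q q_1}{4\pi\epsilon_0}\,\frac{\mathbf{1}_x (x - x_1) + \mathbf{1}_y (y - y_1) + \mathbf{1}_z (z - z_1)}{\left[\kappa^2 (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2\right]^{3/2}} \tag{4.11}$$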

in which Fe has been expressed entirely in terms of quantities measured in XYZ.
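
The route from Coulomb's law in X'Y'Z' to a force written purely in XYZ quantities can be traced numerically. The sketch below is illustrative and makes several assumptions: units are chosen so that $qq_1/4\pi\epsilon_0 = 1$ and $c = 1$, the closed form used in `force_closed_form` is the expression obtained here by combining (4.1), (4.4), and the distance relation above, and all function names are hypothetical.

```python
import math


def kappa(u, c=1.0):
    """Contraction factor (1 - u**2/c**2)**(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def force_via_primed_frame(sep, u):
    """F_e from Coulomb's law (4.1) in X'Y'Z', then the transformation (4.4).

    sep = (x - x1, y - y1, z - z1) as measured in XYZ.
    Assumed units: q*q1/(4*pi*eps0) = 1 and c = 1.
    """
    k = kappa(u)
    sep_primed = (k * sep[0], sep[1], sep[2])  # longitudinal distance is longer in X'Y'Z'
    dist_primed = math.sqrt(sum(s * s for s in sep_primed))
    f_primed = tuple(s / dist_primed ** 3 for s in sep_primed)
    return (f_primed[0], k * f_primed[1], k * f_primed[2])


def force_closed_form(sep, u):
    """F_e written entirely in XYZ quantities (same unit assumptions)."""
    k = kappa(u)
    dist_primed = math.sqrt((k * sep[0]) ** 2 + sep[1] ** 2 + sep[2] ** 2)
    return tuple(k * s / dist_primed ** 3 for s in sep)


if __name__ == "__main__":
    u = 0.6
    print("two-step :", force_via_primed_frame((0.4, -1.1, 0.7), u))
    print("closed   :", force_closed_form((0.4, -1.1, 0.7), u))
    ahead = force_closed_form((1.0, 0.0, 0.0), u)[0]
    abeam = force_closed_form((0.0, 1.0, 0.0), u)[1]
    # At equal distance the transverse force exceeds the longitudinal one by kappa**3.
    print("abeam / ahead =", abeam / ahead, " kappa**3 =", kappa(u) ** 3)
```

The ratio printed at the end is one way to quantify the longitudinal compression of the field pictured in Figure 4.4.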

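Case 2: $\mathbf{v} \neq 0$.
Under this condition, in X'Y'Z', $q_1'$ is at rest and $q'$ has the general velocity $\mathbf{v}'(t')$.
In XYZ, $q_1$ has the constant velocity $\mathbf{v}_1 = \mathbf{1}_x u$ and $q$ is no longer at rest, but rather has
the general velocity $\mathbf{v}(t)$. The force exerted by $q_1$ on $q$, as determined by O, is now given
by $\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m$ in which $\mathbf{F}_e$ has the same value as in Case 1 and, from (4.5),

$$\mathbf{F}_m = \kappa\,\frac{\mathbf{v}}{c} \times \left(\frac{\mathbf{v}_1}{c} \times \mathbf{F}'\right) \tag{4.12}$$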
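
That the component form (4.5) and the double vector product (4.12) are the same quantity can be confirmed with a short numerical check. This is a sketch under the assumption $c = 1$, with arbitrary sample values. The function names `fm_components` and `fm_vector` are hypothetical.

```python
import math


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def fm_components(v, f_primed, u, c=1.0):
    """Magnetostatic force written out by components, as in (4.5)."""
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    vx, vy, vz = v
    _, fy, fz = f_primed  # F'_x does not enter (4.5)
    scale = k * u / c ** 2
    return (scale * (vy * fy + vz * fz), -scale * vx * fy, -scale * vx * fz)


def fm_vector(v, f_primed, u, c=1.0):
    """The same force as the double vector product (4.12), with v1 = 1_x u."""
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    v1_over_c = (u / c, 0.0, 0.0)
    v_over_c = tuple(component / c for component in v)
    return tuple(k * component for component in cross(v_over_c, cross(v1_over_c, f_primed)))


if __name__ == "__main__":
    v = (0.1, 0.4, -0.2)
    f_primed = (0.3, -1.2, 0.7)
    print("(4.5) :", fm_components(v, f_primed, 0.6))
    print("(4.12):", fm_vector(v, f_primed, 0.6))
```

Because the outer product is taken with $\mathbf{v}$, the force $\mathbf{F}_m$ is always perpendicular to the velocity of the test charge.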

If the idea is retained that charge and total electric flux are invariants, then it
is still true that $\mathbf{F}_e = q\mathbf{D}_0/\epsilon_0$, but the total force $\mathbf{F}$ on $q$ is no longer equal to $\mathbf{F}_e$. One
could discard the assumption of invariance of charge and flux and require that both
vary in such a way that the transformation linking $\mathbf{D}_0'$ and $\mathbf{D}_0$ yields the relation
$\mathbf{F} = \mathbf{F}_e + \mathbf{F}_m = q\mathbf{D}_0/\epsilon_0$. But it is apparent from a study of (4.12) that this would
require that the flux field due to an electric system (the rigidly translating charge $q_1$)
would depend on the motion of a test charge $q$ which was not a part of the system.
Such an unwieldy definition has no utility. Therefore, the postulate that charge
and electric flux are invariants will be adopted generally and an additional
vector field will be introduced to account for the force $\mathbf{F}_m$. The reader has perhaps
already surmised that this will be the magnetic field.

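In summation, if $q_1$ is moving in XYZ at constant velocity, the force exerted by $q_1$
on the arbitrarily moving charge $q$ can be expressed by O as

$$\mathbf{F} = q\,\frac{\mathbf{D}_0}{\epsilon_0} + \kappa\,\frac{\mathbf{v}}{c} \times \left(\frac{\mathbf{v}_1}{c} \times \frac{q\mathbf{D}_0'}{\epsilon_0}\right) \tag{4.13}$$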

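wherein use has been made of (4.7). Since $\mathbf{v}_1$ is X directed, this equation may be converted
into a form containing only XYZ quantities through utilization of (4.8), with
the result

$$\mathbf{F} = q\left[\frac{\mathbf{D}_0}{\epsilon_0} + \mathbf{v} \times \left(\frac{\mathbf{v}_1 \times \mathbf{D}_0}{c^2 \epsilon_0}\right)\right] \tag{4.14}$$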

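in which the nature of the field $\mathbf{D}_0$ associated with the moving charge $q_1$ is precisely as
described in Case 1 and pictured in Figure 4.4.
Let a new vector field $\mathbf{B}(x,y,z,t)$ be defined by the relation

$$\mathbf{B} = \frac{\mathbf{v}_1 \times \mathbf{D}_0}{c^2 \epsilon_0} \tag{4.15}$$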

in which it is noted that $\mathbf{B}$ is a function of time (by virtue of the fact that $\mathbf{D}_0$ is time-varying)
and depends on the source system (the moving charge $q_1$) but not on the charge
$q$. $\mathbf{B}$ is called the magnetic flux density function and has the units of webers per square
meter. A weber is 1 volt-sec and these units will take on more meaning in Chapter 5.
Substitution of (4.15) in (4.14) gives

$$\mathbf{F} = q[\mathbf{E} + \mathbf{v} \times \mathbf{B}] \tag{4.16}$$

in which $\mathbf{E} = \mathbf{D}_0/\epsilon_0$. This important equation is known as the Lorentz force law. So far, it has been formulated
only for the simplest source system of a single charge $q_1$ in constant translational
motion, exerting a force $\mathbf{F}$ on an arbitrarily moving charge $q$. Now a generalization
of this result will be undertaken.
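
The passage from (4.13) to the Lorentz form (4.16) can be checked numerically. The sketch below is illustrative and rests on stated assumptions: $\mathbf{E} = \mathbf{D}_0/\epsilon_0$ and $\mathbf{B} = \mathbf{v}_1 \times \mathbf{D}_0/c^2\epsilon_0$ (the form that matches the differential relation (4.18) of the next section), the flux density transformed according to (4.8) and (4.9) component by component, and units with $q = \epsilon_0 = c = 1$ by default. The function name `lorentz_check` is hypothetical.

```python
import math


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_check(d0_primed, v, u, q=1.0, eps0=1.0, c=1.0):
    """Return the force computed from (4.13) and from q(E + v x B)."""
    k = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    v1 = (u, 0.0, 0.0)
    # (4.9) and (4.8): longitudinal component unchanged, transverse ones scaled by kappa.
    d0 = (d0_primed[0], k * d0_primed[1], k * d0_primed[2])

    # Equation (4.13), which still contains the primed flux density.
    inner = cross(tuple(x / c for x in v1), tuple(q * d / eps0 for d in d0_primed))
    outer = cross(tuple(x / c for x in v), inner)
    force_413 = tuple(q * d0[i] / eps0 + k * outer[i] for i in range(3))

    # Fields in XYZ (assumed definitions, see the lead-in) and the Lorentz force.
    e_field = tuple(d / eps0 for d in d0)
    b_field = tuple(x / (c * c * eps0) for x in cross(v1, d0))
    v_cross_b = cross(v, b_field)
    force_416 = tuple(q * (e_field[i] + v_cross_b[i]) for i in range(3))
    return force_413, force_416


if __name__ == "__main__":
    f13, f16 = lorentz_check((0.3, -1.2, 0.7), (0.1, 0.4, -0.2), 0.6)
    print("(4.13):", f13)
    print("(4.16):", f16)
```

The agreement depends on $\mathbf{v}_1$ being X directed: only the transverse components of the flux density enter $\mathbf{v}_1 \times \mathbf{D}_0$, and those are exactly the components scaled by $\kappa$.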

4.3 THE FIELDS DUE TO A CLOSED CIRCULATING CHARGE SYSTEM

If there is a system of charges $q_1 \ldots q_N$ at rest in X'Y'Z', instead of the single charge
$q_1$, the charges of this system will have rigid translatory motion through XYZ at the
common velocity $\mathbf{1}_x u$. By use of the principle of linear superposition, the total field $\mathbf{D}_0'$
due to this charge system can be found by the methods of Chapter 3, and the total field
$\mathbf{D}_0$ can then be found with the aid of Equations (4.8) and (4.9). An observer at rest in
XYZ can determine a magnetic field $\mathbf{B}(x,y,z,t)$ due to the moving charge system by
using (4.15) and can then compute the force on a test charge $q$ moving at a velocity $\mathbf{v}$
by employing (4.16). In other words, the results of the previous section are applicable
to a system of rigidly translating charges as well as to the single translating charge $q_1$.
Admittedly, a rigidly translating system of charges is not a physically realizable
source; however, it may be used as a constituent part of any real system of charges
and currents. As an illustration, consider the closed system of circulating charges shown
in Figure 4.5. It is assumed that the motion of these charges is such that the amount of
charge in a given volume element is always the same, albeit the identity of the charge
keeps changing. It is further assumed that the charge velocity associated with a particular
volume element is unchanging, so that the flow can be characterized by a static
volume charge distribution ρ(ξ,η,ζ) and by a static velocity distribution v1(ξ,η,ζ). This
circulating stream of charge constitutes a time-independent current density ι(ξ,η,ζ) = ρv1.
(Cf. Example 3.16.) If a separate test charge q is instantaneously at the point (x,y,z)
with a velocity v(t), the force F which the circulating charge system exerts on q can be
found as follows:
The circulating charge system in XYZ can be shown to be equivalent to a linear
superposition of static charge distributions in all other Lorentzian frames X'Y'Z'
which move at a variety of constant velocities u with respect to XYZ. (See Appendix E.)
Therefore, by superposition, the results of the previous section are extendable
to this source system. The force on q can then be thought of as being composed of differential
contributions due to each source element in the circulating charge system.
Specifically, the charge ρ(ξ,η,ζ) dξ dη dζ, which is moving at the velocity v1(ξ,η,ζ), can
be said to exert a force on q given by
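$$ d\mathbf{F} = q(d\mathbf{E} + \mathbf{v} \times d\mathbf{B}) \tag{4.17} $$

in which

$$ d\mathbf{E} = \frac{d\mathbf{D}_0}{\epsilon_0} \qquad d\mathbf{B} = \frac{\mathbf{v}_1 \times d\mathbf{D}_0}{c^2 \epsilon_0} \tag{4.18} $$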

The electric flux field dD0 at (x,y,z), due to the moving charge ρ dV, would be time-varying
except that other charges keep moving into dV and assuming the velocity v1,
thus assuring a steady contribution to the field at (x,y,z).

FIGURE 4.5 Force due to a circulating charge system.

All of the circulating charges can be taken into account by integrating (4.17) to
obtain
F = q[E + v X B] (4.19)
which is a generalization of the Lorentz force law (4.16), based on the principle of
superposition. The fields contained in (4.19) are given by

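$$ \mathbf{E}(x,y,z) = \int_V \frac{d\mathbf{D}_0}{\epsilon_0} \tag{4.20} $$

$$ \mathbf{B}(x,y,z) = \int_V \frac{\mathbf{v}_1 \times d\mathbf{D}_0}{c^2 \epsilon_0} \tag{4.21} $$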

in which V is the volume occupied by the circulating charge system. These fields are
static because each elemental electric flux field dD0 is time-independent. This fact

can be appreciated by still another argument. Imagine that the charge q is at a particular
point (x,y,z) with a particular velocity v. If, at a later time, q is once again at
this same point with the same velocity, it will experience the same force as before
because the state of the circulating charge system is unchanged. Thus the force F in
(4.19) is a function of time only because of the motion of q, and not because of any time
variance of E and B.
Since charge and electric flux have been defined as invariants, it follows that Gauss'
law is applicable to this situation so that

$$ \nabla \cdot \mathbf{D}_0 = \rho \tag{4.22} $$

Also, since ρ(ξ,η,ζ) is static (because of the nature of the circulating charge system under
discussion), it further follows that

$$ \mathbf{D}_0(x,y,z) = \int_V \frac{\rho\,\boldsymbol{\xi}\,dV}{4\pi\xi^3} \tag{4.23} $$

with ξ drawn from (ξ,η,ζ) to (x,y,z). One may conclude that the closed circulating
system of charges gives rise to a static electric field which does not differ from what
would occur if the charges were at rest. This should be contrasted with the case of the
electric field associated with a single charge, which was found to depend on the charge's
motion.
It follows further that, since the volume V is arbitrary,

$$ d\mathbf{D}_0 = \frac{\rho\,\boldsymbol{\xi}\,dV}{4\pi\xi^3} \tag{4.24} $$

Returning to (4.20) and (4.21), one can write the static fields in the forms

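$$ \mathbf{E}(x,y,z) = \int_V \frac{\rho\,\boldsymbol{\xi}\,dV}{4\pi\epsilon_0\xi^3} \tag{4.25} $$

$$ \mathbf{B}(x,y,z) = \int_V \frac{\rho\,\mathbf{v}_1 \times \boldsymbol{\xi}\,dV}{4\pi c^2 \epsilon_0 \xi^3} \tag{4.26} $$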

If the source system is specified, these integrals may be evaluated and the results
inserted in (4.19) to obtain the force on q.

4.4 THE BIOT-SAVART LAW

Inspection of Equations (4.19), (4.25), and (4.26) reveals that if v1/c and v/c are much
less than unity, which is usually the case, then |v × B| ≪ E, and a good approximation
to the Lorentz force on q is to ignore the B field altogether. This may be a valid conclusion
for the system of uncompensated circulating charges discussed in Section 4.3.
However, imagine now that an additional system of charges with distribution −ρ(ξ,η,ζ)
is superimposed on the first, but that the individual charges of the second system do not
move. Then each moving charge q1 of the circulating system finds itself alongside a
charge −q1 of the noncirculating system. The charge pair [q1, −q1] exerts equal and
opposite coulomb forces on q but only the moving charge q1 contributes to the magnetostatic
force F_m acting on q. Under these conditions the net force on q is

$$ \mathbf{F}_m = q\mathbf{v} \times \mathbf{B} \tag{4.27} $$

with B given by (4.26). In this case, ignoring B is tantamount to ignoring the entire
force on q.
With the two systems of sources superimposed in this manner, one circulating and
the other not, every volume element dV is electrically neutral. This situation describes
conditions which prevail inside conductors through which steady currents are flowing.
The drift velocities of the electrons have the distribution v1(ξ,η,ζ). The moving electronic
charge ρ dV is compensated by the stationary ionic charge −ρ dV. A time-independent
current density ι = ρv1 amp/m² flows through the volume element dV
(cf. Example 3.16). Although the individual electrons have drift velocities so low that
v1/c may be as small as 10⁻¹⁰, there are usually so many electrons participating in the
current flow inside conductors that the calculation of B from (4.26) yields a value
which is often not insignificant.
Let a substitution constant μ0, called the permeability of free space, be defined by the
relation

$$ \mu_0^{-1} = c^2 \epsilon_0 \tag{4.28} $$

In MKS rationalized units, μ0 = 4π × 10⁻⁷ henries/m; this unit will become more
meaningful when the circuit concept of inductance is introduced. With this substitution,
(4.26) can be written

$$ \mathbf{B}(x,y,z) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta) \times \boldsymbol{\xi}\,dV}{4\pi\mu_0^{-1}\xi^3} \tag{4.29} $$

This is the Biot-Savart law and permits computation of the B field arising from any
distribution of steady currents. Equation (4.27) can then be used to find the force
which this field exerts on a charge q moving at a velocity v. Because the magnetic field B
due to the system of steady currents is time-independent, this subject is given the name
magnetostatics.
EXAMPLE 4.1
As an illustration of the use of (4.29), consider the case of a long straight wire in free space,
extending along the Z axis from −z1 to +z1 and carrying a time-independent current I.
Then ι dV = ιA dℓ = I dℓ = 1_z I dζ is a current element, with A the small cross-sectional
area of the wire. The magnetic flux density at a point (r,φ,0) in cylindrical coordinates will
be

$$ \mathbf{B}(r,\phi,0) = \int_{-z_1}^{z_1} \frac{\mathbf{1}_z I\,d\zeta \times \boldsymbol{\xi}}{4\pi\mu_0^{-1}\xi^3} = \mathbf{1}_\phi \frac{I r}{4\pi\mu_0^{-1}} \int_{-z_1}^{z_1} \frac{d\zeta}{(r^2 + \zeta^2)^{3/2}} $$

in which the current element is situated at the source point (0,0,ζ), as shown in the figure.
The above expression integrates readily to give

$$ \mathbf{B}(r,\phi,0) = \mathbf{1}_\phi \frac{I}{2\pi\mu_0^{-1} r}\,\frac{z_1}{(r^2 + z_1^2)^{1/2}} $$

[Figure: straight wire along the Z axis, showing the current element at the source point (0,0,ζ) and the field point P(r,φ,0).]

For points not too near the ends of the wire, and not too far removed from the wire, so that
r ≪ z1,

$$ \mathbf{B} = \mathbf{1}_\phi \frac{I}{2\pi\mu_0^{-1} r} $$

In such regions the magnetic flux density can be mapped as a system of concentric circular
lines which thin out with distance from the wire as r⁻¹.
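The finite-wire result of this example can be checked numerically. The following Python sketch sums the integrand of (4.29) along the wire with a midpoint rule and compares it with the integrated expression and with the long-wire limit. The function names, the step count, and the sample values used in the checks are illustrative assumptions, not part of the text; μ0 is taken as 4π × 10⁻⁷ henries/m.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, henries/m


def b_phi_numeric(current, r, z1, steps=20000):
    """Midpoint-rule sum of (4.29) for a wire on the Z axis from -z1 to +z1.

    The field point is (r, phi, 0); only the phi component is nonzero.
    """
    d_zeta = 2.0 * z1 / steps
    total = 0.0
    for k in range(steps):
        zeta = -z1 + (k + 0.5) * d_zeta          # source point (0, 0, zeta)
        # |1_z d_zeta x xi| / xi^3 with xi = 1_r r - 1_z zeta
        total += r * d_zeta / (r * r + zeta * zeta) ** 1.5
    return MU0 * current * total / (4.0 * math.pi)


def b_phi_closed(current, r, z1):
    """Integrated result for the finite wire in the z = 0 plane."""
    return MU0 * current / (2.0 * math.pi * r) * z1 / math.hypot(r, z1)


def b_phi_long_wire(current, r):
    """Limiting form for r << z1."""
    return MU0 * current / (2.0 * math.pi * r)
```

For an assumed wire with z1 = 1 m observed at r = 5 cm, the closed form and the long-wire limit differ by roughly one part in a thousand, which gives a feel for how quickly the approximation r ≪ z1 becomes accurate.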
EXAMPLE 4.2
A freely moving charge q enters a region in which a steady magnetic field exists, being
described by the equation B = 1_z B0, with B0 independent of spatial coordinates as well as
time. If the entering velocity is the constant v0 = 1_x v0x + 1_y v0y + 1_z v0z, find the subsequent
motion of the charge.
The equation of motion is given by

$$ \mathbf{F}_m = q\mathbf{v} \times \mathbf{B} = \frac{d}{dt}(m\mathbf{v}) $$

If the velocity of the charge is never so great that its mass m need be considered relativistically,
this equation can be broken down into the components

$$ q v_y B_0 = m\dot{v}_x \qquad -q v_x B_0 = m\dot{v}_y \qquad 0 = m\dot{v}_z $$

These equations integrate to give

$$ v_x = v_{0x}\cos\omega_b t + v_{0y}\sin\omega_b t $$
$$ v_y = v_{0y}\cos\omega_b t - v_{0x}\sin\omega_b t $$
$$ v_z = v_{0z} $$

in which ω_b = qB0/m is known as the cyclotron frequency. The charge thus follows a helical
path parallel to the Z axis, the radius of the helix being

$$ r_0 = \frac{(v_{0x}^2 + v_{0y}^2)^{1/2}}{\omega_b} $$

One interesting aspect of this solution is that what would otherwise be the lateral drift of the
charge has been converted to a circular motion, which can be very tight if B0 is large. Further,
if (x0,y0,z0) is a point in the trajectory, then so too is the point (x0, y0, z0 + 2πv0z/ω_b).
Thus, if a group of charges is injected at the point (x0,y0,z0), with random initial transverse
velocities but a common v0z, they will all come to a "focus" at 2πv0z/ω_b units of distance
further along the Z axis. This principle is used in the design of many electron devices,
including some cathode ray tubes.
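The solution of this example lends itself to a short numerical check. The Python sketch below evaluates the velocity components, the helix radius, and the focusing distance, and includes a finite-difference test that the stated velocities satisfy m dv/dt = qv × B. All function names are hypothetical, and the radius and focusing distance are returned as positive lengths by using |ω_b|, an assumption made here so that negative charges are handled as well.

```python
import math


def cyclotron_frequency(q, b0, m):
    """omega_b = q*B0/m (nonrelativistic); its sign follows the sign of q."""
    return q * b0 / m


def velocity(t, v0, q, b0, m):
    """Velocity at time t for the entering velocity v0 = (v0x, v0y, v0z)."""
    wb = cyclotron_frequency(q, b0, m)
    v0x, v0y, v0z = v0
    c, s = math.cos(wb * t), math.sin(wb * t)
    return (v0x * c + v0y * s, v0y * c - v0x * s, v0z)


def helix_radius(v0, q, b0, m):
    """Radius of the helix, taken as a positive length."""
    return math.hypot(v0[0], v0[1]) / abs(cyclotron_frequency(q, b0, m))


def focus_distance(v0, q, b0, m):
    """Axial advance during one cyclotron period, 2*pi*v0z/|omega_b|."""
    return 2.0 * math.pi * v0[2] / abs(cyclotron_frequency(q, b0, m))


def residual(t, v0, q, b0, m, h=1e-6):
    """Largest mismatch between m*dv/dt (central difference) and q v x B."""
    vp = velocity(t + h, v0, q, b0, m)
    vm = velocity(t - h, v0, q, b0, m)
    vx, vy, _ = velocity(t, v0, q, b0, m)
    accel = [(p - n) / (2.0 * h) for p, n in zip(vp, vm)]
    force = (q * vy * b0, -q * vx * b0, 0.0)     # q v x (1_z B0)
    return max(abs(m * a - f) for a, f in zip(accel, force))
```

Because the transverse speed is constant, the radius depends only on the entering transverse velocity, while the focusing distance depends only on v0z; this is why charges with random transverse velocities but a common v0z reconverge at the same axial position.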

EXAMPLE 4.3
In 1879 Dr. Edwin Hall of Harvard observed that when a conductor carrying a steady
current is placed transverse to a magnetic field, as indicated in the figure, a transverse
charge separation occurs in the conductor. This phenomenon, called the Hall effect, has
proved useful in the determination of charge densities in materials, including semiconductors.
It can be explained by the following argument:

[Figure: conductor carrying a steady current, placed transverse to a magnetic field; X and Y axes shown.]

Let the magnetic field be locally uniform and given by B = 1_y B_y. Let the current flow in
a conductor of rectangular cross section, and let it have the uniform value ι = 1_x ι_x. Then
the conduction electrons have an average drift velocity v = 1_x v_x and

$$ \iota_x = \rho v_x = -n e v_x $$

in which −e is the electronic charge and n is the volume density of conduction electrons.
These electrons experience an average force given by

$$ \mathbf{f} = -e\mathbf{v} \times \mathbf{B} = -\mathbf{1}_z e v_x B_y $$

This force causes a charge separation in the Z direction until oppositely charged layers are
built up on the top and bottom faces of just the proper value to cause a compensating force.
If E_H is the electric field caused by this charge separation, then

$$ 0 = -eE_H - e v_x B_y = -eE_H + \frac{\iota_x B_y}{n} $$

so that the free charge per unit volume is

$$ ne = \frac{\iota_x B_y}{E_H} $$

Since all the quantities on the right side of this equation can be measured, the number of
free electrons per atom can be determined for various metals in this manner. If the technique
is applied to a p-type semiconductor, the Hall field E_H is reversed, indicating that the current
is caused predominantly by positive carriers.
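The force balance of this example translates directly into a small calculation. The Python sketch below converts between the Hall field, the current density, and the carrier density using ne = ι_x B_y / E_H. The function names are hypothetical, and any numerical inputs a reader supplies (for instance a carrier density of order 10²⁸ to 10²⁹ per cubic meter for a good metal) are illustrative assumptions rather than data from the text.

```python
E_CHARGE = 1.602176634e-19  # magnitude of the electronic charge, coulombs


def hall_field(j_x, b_y, n):
    """E_H = iota_x * B_y / (n e), from 0 = -e E_H + iota_x B_y / n."""
    return j_x * b_y / (n * E_CHARGE)


def carrier_density(j_x, b_y, e_hall):
    """n = iota_x * B_y / (e E_H), in carriers per cubic meter."""
    return j_x * b_y / (E_CHARGE * e_hall)


def drift_velocity(j_x, n):
    """v_x = -iota_x / (n e) for conduction electrons of charge -e."""
    return -j_x / (n * E_CHARGE)
```

A measured E_H together with the known ι_x and B_y gives n through `carrier_density`; a reversed sign of E_H, as for a p-type semiconductor, would show up here as a negative result and signals positive carriers.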

4.5 THE MAGNETIC FIELD INTENSITY

In Chapter 3, when using Coulomb's law, it was convenient to introduce the concept
of an electric field by splitting the force expression into two factors, namely,

$$ \mathbf{F}_e = q\mathbf{E} = q \int_V \frac{\rho\,dV\,\boldsymbol{\xi}}{4\pi\epsilon_0\xi^3} \tag{4.30} $$

Similarly, it has proved convenient, when discussing the force on a moving charge, to
introduce the concept of a magnetic field by writing

$$ \mathbf{F}_m = q\mathbf{v} \times \mathbf{B} = q\mathbf{v} \times \int_V \frac{\boldsymbol{\iota}\,dV \times \boldsymbol{\xi}}{4\pi\mu_0^{-1}\xi^3} \tag{4.31} $$

The forms of these two integrals suggest a certain analogy between B and E, with
ρ dV being the sources for E and ι dV being the sources for B. The comparison between
magnetostatics and electrostatics is further heightened by the introduction of a new
vector function H0, analogous to D0, defined by the relation

$$ \mathbf{H}_0 = \mu_0^{-1}\mathbf{B} \tag{4.32} $$

in which the zero subscript is a reminder that the discussion so far excludes magnetic
materials. H0 is called the magnetic intensity, and when (4.32) is combined with the
Biot-Savart law one finds that

$$ \mathbf{H}_0(x,y,z) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta) \times \boldsymbol{\xi}\,dV}{4\pi\xi^3} \tag{4.33} $$

Thus the units of H0 are amperes/m.

The manner in which E and B enter the Lorentz force law, with E acting on a charge
element q to give a force, and B acting on a current element qv to give a force; the similar
manner in which E and B are related to the sources ρ dV and ι dV; the similarity
between the defining relations D0 = ε0E and H0 = μ0⁻¹B; these all serve to point up the fact
that B plays a role similar to E and that H0 is analogous to D0. It is unfortunate that
this point was not fully appreciated during the early evolvement of electrical theory,
since awareness of the analogy would be enhanced by use of the reciprocal of μ0 rather
than μ0 itself. In this text μ0⁻¹ shall be used wherever convenient in order to emphasize
this duality.
The value of introducing H0 will begin to emerge shortly when Ampere's circuital
law (which is analogous to Gauss' law) is established. The principal utility of introducing
D and H arises when dielectric and permeable materials are discussed (cf. Chapters
6 and 7).

4.6 THE FORCE BETWEEN CURRENTS


If the solitary moving charge q is replaced by a volume charge element ρ_a dV_a so that

$$ q\mathbf{v} \rightarrow \rho_a \mathbf{v}\,dV_a = \boldsymbol{\iota}_a\,dV_a $$

with ι_a dV_a a current element which is not necessarily time-independent, Equation
(4.27) becomes

$$ d\mathbf{F}_m = \boldsymbol{\iota}_a\,dV_a \times \mathbf{B} \tag{4.34} $$

Equation (4.34) gives the elemental force on a general current element ι_a dV_a due to a
steady magnetic field B(x,y,z). This field is caused by a distribution of steady current
elements and is deducible from the Biot-Savart law. Equation (4.34) is a differential
form of what is often called Ampere's force law.
The total force on all the current elements in the circuit of which ι_a dV_a is a part
can be written

$$ \mathbf{F}_m = \int_{V_a} \boldsymbol{\iota}_a \times \left[ \int_{V_b} \frac{\boldsymbol{\iota}_b \times \boldsymbol{\xi}\,dV_b}{4\pi\mu_0^{-1}\xi^3} \right] dV_a \tag{4.35} $$

in which ι_b is the steady current distribution giving rise to the field B and ξ is drawn
from dV_b to dV_a. The two volumes V_a and V_b may overlap. For example, a closed
circuit of steady current can exert a magnetic force on itself.
EXAMPLE 4.4
A simple case of magnetic interaction of some importance involves two long thin straight
parallel wires carrying steady currents I1 and I2 and separated a distance d. This situation
can be approximated by assuming the wires to be vanishingly thin and infinitely long. Then
(4.34) gives

as the force on a length dζ2 of the second wire, due to all the current elements in the first
wire. (No current element in the second wire experiences a force due to any other current
element in the second wire because 1_z × ξ ≡ 0 within a single straight wire.)
No loss in generality arises from taking dζ2 at the position ζ2 = 0, as shown in the figure,
and writing for ξ the relation

If by F one means the force per unit length on the second wire, then

(4.36)

Thus the force between the wires is attractive if the currents are in the same direction;
otherwise it is repulsive. This simple formula can be used to define the unit of current, the
ampere, in terms of a mechanical measurement of force.

[Figure: two parallel wires separated by a distance d.]

When (4.36) is considered in conjunction with Coulomb's law, each is seen to contain
two electrical quantities, either q and ε0 or I and μ0⁻¹. But q and I are related through a time
derivative and μ0⁻¹ and ε0 are related by (4.28). Thus in reality (4.36) and Coulomb's law
each contain the same two electrical quantities, and these two force laws taken together
permit the definition of all electrical quantities in terms of the units of mass, length, and
time, indicating that a fourth fundamental unit is unnecessary.
This is not to deny that a fourth fundamental unit is convenient nor to suggest that mass
is more fundamental than charge. One could equally well start with electricity instead of
gravitation and conclude by being able to define mass in terms of the units of charge,
length, and time.
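As a numerical companion to this example, the Python sketch below evaluates the force per unit length between two long parallel wires. It assumes the standard magnitude I1 I2 / (2π μ0⁻¹ d), which is what the field of Example 4.1 combined with (4.34) yields; the function name and the sign convention (positive for attraction) are choices made here for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, henries/m


def force_per_unit_length(i1, i2, d):
    """Force per unit length between two long parallel wires, in N/m.

    Positive values mean attraction (currents in the same direction);
    negative values mean repulsion (currents in opposite directions).
    """
    return MU0 * i1 * i2 / (2.0 * math.pi * d)
```

With 1 ampere in each wire and a separation of 1 meter, the function returns very nearly 2 × 10⁻⁷ N/m, the figure on which the mechanical definition of the ampere was long based.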

4.7 THE TIME-INDEPENDENT MAGNETIC VECTOR POTENTIAL FUNCTION

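If ι(ξ,η,ζ) and ξ are other than very simple functions, the evaluation of B from (4.29)
can be extremely difficult. This same situation was encountered with the electric field
in Chapter 3 and was eased by the introduction of the scalar electric potential function,
whose gradient gave −E. By analogy one is led to wonder whether B can be expressed
alternatively as a vector derivative of a potential function. That this is possible can be
demonstrated by the following argument:
Since

$$ \boldsymbol{\xi} = \mathbf{1}_x(x - \xi) + \mathbf{1}_y(y - \eta) + \mathbf{1}_z(z - \zeta) $$

the relation

$$ \nabla_F\left(\frac{1}{\xi}\right) = -\frac{\boldsymbol{\xi}}{\xi^3} = -\nabla_S\left(\frac{1}{\xi}\right) \tag{4.37} $$

can be used to rewrite (4.29) in the form

$$ \mathbf{B} = -\frac{1}{4\pi\mu_0^{-1}} \int_V \boldsymbol{\iota} \times \nabla_F\left(\frac{1}{\xi}\right) dV \tag{4.38} $$

In (4.37),

$$ \nabla_F = \mathbf{1}_x\frac{\partial}{\partial x} + \mathbf{1}_y\frac{\partial}{\partial y} + \mathbf{1}_z\frac{\partial}{\partial z} \qquad \nabla_S = \mathbf{1}_x\frac{\partial}{\partial \xi} + \mathbf{1}_y\frac{\partial}{\partial \eta} + \mathbf{1}_z\frac{\partial}{\partial \zeta} $$

are the gradient operators with respect to the coordinates of the field point and the source point
respectively.
The vector identity (V.109) can be utilized to obtain

$$ \nabla_F \times \left(\frac{\boldsymbol{\iota}}{\xi}\right) = \frac{1}{\xi}\nabla_F \times \boldsymbol{\iota} + \nabla_F\left(\frac{1}{\xi}\right) \times \boldsymbol{\iota} $$

But ∇_F × ι ≡ 0 because ι is a function only of the source variables ξ, η, ζ. Thus (4.38)
can be written

$$ \mathbf{B} = \frac{1}{4\pi\mu_0^{-1}} \int_V \nabla_F \times \left(\frac{\boldsymbol{\iota}}{\xi}\right) dV $$

However, since the limits of integration are also independent of the field point P(x,y,z),
the order of integration and differentiation may be inverted to give

$$ \mathbf{B} = \nabla_F \times \int_V \frac{\boldsymbol{\iota}\,dV}{4\pi\mu_0^{-1}\xi} $$

Therefore, it is convenient to define a magnetic vector potential function A by the
expression

$$ \mathbf{A}(x,y,z) = \int_V \frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,dV}{4\pi\mu_0^{-1}\xi} \tag{4.39} $$

from which

$$ \mathbf{B} = \nabla \times \mathbf{A} \tag{4.40} $$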
In almost every case it is simpler to compute A first and take its curl to find B rather
than to compute B directly from (4.29).

One important consequence of (4.40) is the fact that

$$ \nabla \cdot \mathbf{B} \equiv 0 \tag{4.41} $$

This follows from the vector identity ∇ · ∇ × A ≡ 0 (cf. Mathematical Supplement,
Equation V.111). Because of the defining relation (4.32) it also follows that

$$ \nabla \cdot \mathbf{H}_0 \equiv 0 \tag{4.42} $$
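EXAMPLE 4.5
For the long straight wire of Example 4.1, the magnetic vector potential function is simply

$$ \mathbf{A} = \mathbf{1}_z \int_{-z_1}^{z_1} \frac{I\,d\zeta}{4\pi\mu_0^{-1}[r^2 + (z - \zeta)^2]^{1/2}} $$

which integrates to give

$$ \mathbf{A} = \mathbf{1}_z \frac{I}{4\pi\mu_0^{-1}} \ln \frac{(z + z_1) + [r^2 + (z + z_1)^2]^{1/2}}{(z - z_1) + [r^2 + (z - z_1)^2]^{1/2}} $$

Taking the curl in cylindrical coordinates, and then inserting the point P(r,φ,0), one obtains

$$ \mathbf{B}(r,\phi,0) = \mathbf{1}_\phi \frac{I}{2\pi\mu_0^{-1} r}\,\frac{z_1}{(z_1^2 + r^2)^{1/2}} $$

in agreement with Example 4.1.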
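The route from A to B taken in this example can be reproduced numerically. In the Python sketch below, A_z is the logarithmic expression for the finite wire, and the curl is approximated by a central difference, using the fact that for A = 1_z A_z(r,z) the only nonzero curl component is B_φ = −∂A_z/∂r. The function names and the step size h are assumptions made for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, henries/m


def a_z(current, r, z, z1):
    """Axial vector potential of a wire on the Z axis from -z1 to +z1."""
    num = (z + z1) + math.hypot(r, z + z1)
    den = (z - z1) + math.hypot(r, z - z1)
    return MU0 * current / (4.0 * math.pi) * math.log(num / den)


def b_phi_from_curl(current, r, z, z1, h=1e-6):
    """B_phi = -dA_z/dr, estimated with a central difference of width 2h."""
    upper = a_z(current, r + h, z, z1)
    lower = a_z(current, r - h, z, z1)
    return -(upper - lower) / (2.0 * h)


def b_phi_example_4_1(current, r, z1):
    """Closed-form field in the z = 0 plane, for comparison."""
    return MU0 * current / (2.0 * math.pi * r) * z1 / math.hypot(r, z1)
```

Agreement between `b_phi_from_curl` at z = 0 and `b_phi_example_4_1` illustrates the point of the section: the scalar-like integral for A is evaluated once, and B follows by differentiation.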

EXAMPLE 4.6
Imagine a small circular loop of radius a carrying a steady uniform current I, as suggested
by the figure. Let localized spherical coordinates (r,θ,φ) be set up with origin at the center
of the loop, and such that the loop lies in the θ = π/2 plane. It is desired to find the magnetic
field at a remote point P(r,θ,φ) such that r ≫ a.

[Figure: circular current loop in the XY plane, showing the current element Ia dβ and the remote point P.]

Consider the current element Ia dβ situated β deg beyond the φ plane. The contributions
to A of this element and its twin, which is β deg in front of the φ plane, sum to only an A_φ
component. By thus arranging all the current elements in pairs one can conclude that
A(r,θ,φ) = 1_φ A_φ(r,θ,φ) and the task has simplified to one of finding A_φ(r,θ,φ). By symmetry,
A_φ is not a function of φ, so there is no loss in generality resulting from placing
P(r,θ,φ) in the YZ plane. Then

$$ \mathbf{1}_\phi I a\,d\beta = -\mathbf{1}_x I a \cos\beta\,d\beta - \mathbf{1}_y I a \sin\beta\,d\beta $$

and, from (4.39),

$$ A_\phi(r,\theta) = 2\int_0^\pi \frac{I a \cos\beta\,d\beta}{4\pi\mu_0^{-1}\xi} $$

in which

$$ \xi = [(a\sin\beta)^2 + (r\sin\theta - a\cos\beta)^2 + (r\cos\theta)^2]^{1/2} $$

is the distance from the element Ia dβ to the point P. Since r ≫ a,

$$ \frac{1}{\xi} = \frac{1}{r}\left(1 + \frac{a}{r}\sin\theta\cos\beta - \frac{a^2}{2r^2} + \cdots\right) $$

If terms of higher order than r⁻² are neglected,

$$ A_\phi(r,\theta) = \frac{I a}{2\pi\mu_0^{-1} r}\int_0^\pi \left(1 + \frac{a}{r}\sin\theta\cos\beta\right)\cos\beta\,d\beta $$

Integration gives

$$ A_\phi(r,\theta) = \frac{\pi a^2 I \sin\theta}{4\pi\mu_0^{-1} r^2} \tag{4.43} $$

It is useful to define what is known as the magnetic moment m of this small current
element. m is chosen to have a magnitude equal to the area of the loop times the current
and a direction perpendicular to the plane of the loop. Thus m = πa²I; the direction of m
obeys the right-hand rule, which means that if the fingers of the right hand are placed along
the loop in the direction of current flow, then the right thumb points in the direction of m.
With this definition, Equation (4.43) may be written

$$ \mathbf{A} = \frac{\mathbf{m} \times \mathbf{r}}{4\pi\mu_0^{-1} r^3} \tag{4.44} $$

in which r is drawn from the center of the loop to the point P.
The use of (4.40) yields

$$ \mathbf{B} = \frac{m}{4\pi\mu_0^{-1} r^3}(\mathbf{1}_r\,2\cos\theta + \mathbf{1}_\theta \sin\theta) \tag{4.45} $$

Equation (4.45) has special significance since it is found to be in the same form as (3.29);
thus there is a duality between electric dipoles and small current loops.
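The quality of the remote-point approximation in this example can be gauged numerically. The Python sketch below evaluates the integral for A_φ without expanding 1/ξ and compares it with the dipole form (4.43); it also returns the field components of (4.45). The function names and the midpoint-rule step count are assumptions made for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, henries/m


def a_phi_integral(current, a, r, theta, steps=2000):
    """A_phi from the unexpanded integral, 2 * integral over beta in [0, pi]."""
    d_beta = math.pi / steps
    total = 0.0
    for k in range(steps):
        beta = (k + 0.5) * d_beta
        xi = math.sqrt((a * math.sin(beta)) ** 2
                       + (r * math.sin(theta) - a * math.cos(beta)) ** 2
                       + (r * math.cos(theta)) ** 2)
        total += math.cos(beta) / xi * d_beta
    return 2.0 * MU0 * current * a * total / (4.0 * math.pi)


def a_phi_dipole(current, a, r, theta):
    """Equation (4.43), with magnetic moment m = pi a^2 I."""
    m = math.pi * a * a * current
    return MU0 * m * math.sin(theta) / (4.0 * math.pi * r * r)


def b_dipole(current, a, r, theta):
    """(B_r, B_theta) from equation (4.45)."""
    m = math.pi * a * a * current
    scale = MU0 * m / (4.0 * math.pi * r ** 3)
    return (2.0 * scale * math.cos(theta), scale * math.sin(theta))
```

The two expressions for A_φ differ by a relative amount of order (a/r)², so the dipole form is already very accurate once the observation distance exceeds the loop radius by a factor of ten or so.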
EXAMPLE 4.7
If the small circular loop of the previous example is immersed in a region of uniform magnetic
field B0, it experiences a torque tending to align its magnetic moment with B0. This
effect can be appreciated by referring to the figure, in which the loop is seen edge on and the
uniform field is indicated by flux lines. The current in the loop is assumed to be coming out of
the paper on the left side and into the paper on the right, and therefore m is upward, as
shown.
Application of Equation (4.34) leads to the conclusion that the B0 field exerts forces on the
left-hand and right-hand current elements which are outward, causing a couple which tends
to rotate the loop so that m will be parallel to B0. A quantitative expression for this couple
can be derived as follows:
With no loss in generality, the uniform field may be assumed not to have an X component,
in which case it can be given by

$$ \mathbf{B}_0 = B_0(\mathbf{1}_y \sin\theta_0 + \mathbf{1}_z \cos\theta_0) $$

with θ0 a constant polar angle measured from the Z axis.

A current element I dℓ situated at the latitudinal angle φ can be represented by

$$ I\,d\boldsymbol{\ell} = I a\,d\phi(-\mathbf{1}_x \sin\phi + \mathbf{1}_y \cos\phi) $$

and, according to (4.34), this current element experiences a force

$$ d\mathbf{F}_m = I a B_0\,d\phi(-\mathbf{1}_x \sin\phi + \mathbf{1}_y \cos\phi) \times (\mathbf{1}_y \sin\theta_0 + \mathbf{1}_z \cos\theta_0) $$
$$ \phantom{d\mathbf{F}_m} = I a B_0\,d\phi(\mathbf{1}_x \cos\theta_0 \cos\phi + \mathbf{1}_y \cos\theta_0 \sin\phi - \mathbf{1}_z \sin\theta_0 \sin\phi) $$

This force causes a torque around the center point of the loop (cf. Example V.9, Mathematics
Supplement) given by

$$ d\mathbf{T} = \mathbf{r} \times d\mathbf{F}_m $$

in which r = 1_x a cos φ + 1_y a sin φ is drawn from the center of the loop to the current
element. Therefore,

$$ d\mathbf{T} = I a^2 B_0\,d\phi(-\mathbf{1}_x \sin\theta_0 \sin^2\phi + \mathbf{1}_y \sin\theta_0 \sin\phi \cos\phi) $$

The total torque about the center point of the loop can be determined by integration:

$$ \mathbf{T} = \int d\mathbf{T} = -\mathbf{1}_x I a^2 B_0 \sin\theta_0 \int_0^{2\pi} \sin^2\phi\,d\phi = -\mathbf{1}_x(\pi a^2 I)B_0 \sin\theta_0 = \mathbf{m} \times \mathbf{B}_0 \tag{4.46} $$

Equation (4.46) indicates that the equilibrium position for the current loop occurs when its
magnetic moment is aligned with the field. If the loop is rotated from this equilibrium
alignment, its potential energy is increased. The energy which must be supplied to rotate
the loop from an initial angle θ1 to a final angle θ2 is given by

$$ U = \int_{\theta_1}^{\theta_2} T\,d\theta = \int_{\theta_1}^{\theta_2} m B_0 \sin(\theta - \theta_0)\,d\theta = -m B_0[\cos(\theta_2 - \theta_0) - \cos(\theta_1 - \theta_0)] $$

If the zero reference level for the potential energy U is taken as occurring when the magnetic
moment is transverse to the B0 field (θ1 = θ0 + π/2), then

$$ U = -m B_0 \cos(\theta_2 - \theta_0) = -\mathbf{m} \cdot \mathbf{B}_0 \tag{4.47} $$

in which the final direction of m is used in the dot product in (4.47).
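The torque integration of this example can be mirrored step by step in code. The Python sketch below sums r × (I dℓ × B0) around the loop and compares the total with m × B0; a companion function returns the potential energy −m · B0. The helper names, the step count, and the use of plain tuples for vectors are assumptions made for illustration.

```python
import math


def cross(u, v):
    """Cross product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-component tuples."""
    return sum(p * q for p, q in zip(u, v))


def uniform_field(b0, theta0):
    """B0 with no X component, at polar angle theta0 from the Z axis."""
    return (0.0, b0 * math.sin(theta0), b0 * math.cos(theta0))


def moment(current, a):
    """Magnetic moment of a loop of radius a in the XY plane: 1_z pi a^2 I."""
    return (0.0, 0.0, math.pi * a * a * current)


def torque_by_integration(current, a, b_vec, steps=720):
    """Sum dT = r x (I dl x B0) over current elements around the loop."""
    d_phi = 2.0 * math.pi / steps
    total = [0.0, 0.0, 0.0]
    for k in range(steps):
        phi = (k + 0.5) * d_phi
        r = (a * math.cos(phi), a * math.sin(phi), 0.0)
        i_dl = (-current * a * math.sin(phi) * d_phi,
                current * a * math.cos(phi) * d_phi,
                0.0)
        d_torque = cross(r, cross(i_dl, b_vec))
        total = [t + dt for t, dt in zip(total, d_torque)]
    return tuple(total)


def potential_energy(m_vec, b_vec):
    """U = -m . B0, with zero reference when m is transverse to B0."""
    return -dot(m_vec, b_vec)
```

Only the X component of the summed torque survives, with magnitude πa²I B0 sin θ0, and the energy is lowest when m and B0 are parallel, consistent with (4.46) and (4.47).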
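The magnetic vector potential function A has the additional important property
that its divergence is zero. This can be seen by returning to (4.39) and writing

$$ \nabla_F \cdot \mathbf{A} = \frac{1}{4\pi\mu_0^{-1}} \int_V \nabla_F \cdot \left[\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)}{\xi}\right] dV $$

in which differentiation inside the integral sign is permissible because the volume limits
are not functions of x, y, z. Use of the vector identity (V.107) gives

$$ \nabla_F \cdot \left(\frac{\boldsymbol{\iota}}{\xi}\right) = \frac{1}{\xi}\nabla_F \cdot \boldsymbol{\iota} + \boldsymbol{\iota} \cdot \nabla_F\left(\frac{1}{\xi}\right) $$

But ∇_F · ι ≡ 0, since ι is a function only of ξ, η, ζ; therefore

$$ \nabla_F \cdot \mathbf{A} = \frac{1}{4\pi\mu_0^{-1}} \int_V \boldsymbol{\iota} \cdot \nabla_F\left(\frac{1}{\xi}\right) dV = -\frac{1}{4\pi\mu_0^{-1}} \int_V \boldsymbol{\iota} \cdot \nabla_S\left(\frac{1}{\xi}\right) dV $$

Use of the same vector identity gives

$$ \nabla_S \cdot \left(\frac{\boldsymbol{\iota}}{\xi}\right) = \frac{1}{\xi}\nabla_S \cdot \boldsymbol{\iota} + \boldsymbol{\iota} \cdot \nabla_S\left(\frac{1}{\xi}\right) $$

However, ∇_S · ι ≡ 0 because the currents are time-independent and the net efflux of
current from a volume element must be zero. Therefore,

$$ \nabla_F \cdot \mathbf{A} = -\frac{1}{4\pi\mu_0^{-1}} \int_V \nabla_S \cdot \left(\frac{\boldsymbol{\iota}}{\xi}\right) dV $$

The divergence theorem is now applicable and permits the conversion to

$$ \nabla_F \cdot \mathbf{A} = -\frac{1}{4\pi\mu_0^{-1}} \oint_S \frac{\boldsymbol{\iota} \cdot d\mathbf{S}}{\xi} $$

in which S is the closed surface bounding V. Since V can be maintained finite and yet
made large enough that none of the currents of the system intersects S, it follows that
one can make ι ≡ 0 on S. This means that

$$ \nabla \cdot \mathbf{A} \equiv 0 \tag{4.48} $$

as asserted.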
EXAMPLE 4.8
In Example 4.5 the magnetic vector potential function for a long straight wire carrying a
steady current I was found to be

$$ \mathbf{A} = \mathbf{1}_z \frac{I}{4\pi\mu_0^{-1}} \ln \frac{(z + z_1) + [r^2 + (z + z_1)^2]^{1/2}}{(z - z_1) + [r^2 + (z - z_1)^2]^{1/2}} $$

and therefore the divergence of A is

$$ \nabla \cdot \mathbf{A} = \frac{\partial A_z}{\partial z} = \frac{I}{4\pi\mu_0^{-1}} \left\{ \frac{1 + (z + z_1)[r^2 + (z + z_1)^2]^{-1/2}}{(z + z_1) + [r^2 + (z + z_1)^2]^{1/2}} - \frac{1 + (z - z_1)[r^2 + (z - z_1)^2]^{-1/2}}{(z - z_1) + [r^2 + (z - z_1)^2]^{1/2}} \right\} $$

$$ \phantom{\nabla \cdot \mathbf{A}} = \frac{I}{4\pi\mu_0^{-1}} \left\{ [r^2 + (z + z_1)^2]^{-1/2} - [r^2 + (z - z_1)^2]^{-1/2} \right\} $$

This expression for ∇ · A is not quite zero and the reason can be traced to conditions at the
two ends of the wire. There the current has been assumed to end abruptly and ∇ · ι ≠ 0,
which violates a condition imposed in deriving the result ∇ · A = 0. If one were to include
the steady currents in the remainder of the circuit, of which this long wire is a part, then a
null value for the divergence of A would be obtained. Alternatively, if the ends of the wire
recede to ±∞ it is seen that ∇ · A → 0 for finite z.
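The residual divergence found in this example can be examined numerically. The Python sketch below evaluates the closed-form expression for ∇ · A, checks it against a central difference of A_z with respect to z, and lets one see the value shrink as the wire is lengthened. The function names and the step size h are assumptions made for illustration.

```python
import math

MU0 = 4 * math.pi * 1e-7  # permeability of free space, henries/m


def a_z(current, r, z, z1):
    """Axial vector potential of a wire on the Z axis from -z1 to +z1."""
    num = (z + z1) + math.hypot(r, z + z1)
    den = (z - z1) + math.hypot(r, z - z1)
    return MU0 * current / (4.0 * math.pi) * math.log(num / den)


def div_a_closed(current, r, z, z1):
    """Closed-form divergence of A for the abruptly terminated wire."""
    return MU0 * current / (4.0 * math.pi) * (
        1.0 / math.hypot(r, z + z1) - 1.0 / math.hypot(r, z - z1))


def div_a_numeric(current, r, z, z1, h=1e-5):
    """dA_z/dz estimated with a central difference of width 2h."""
    return (a_z(current, r, z + h, z1) - a_z(current, r, z - h, z1)) / (2.0 * h)
```

The divergence vanishes in the symmetry plane z = 0 and is otherwise controlled by the difference in distance to the two wire ends, which is why it fades as z1 grows with z held fixed.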

4.8 AMPERE'S CIRCUITAL LAW


The fact that ∇ · A ≡ 0 opens the way to the proof of an important theorem, the result
of which is known as Ampere's circuital law. Recall that in electrostatics the equations

$$ \nabla^2\Phi = -\frac{\rho}{\epsilon_0} \qquad \Phi = \int_V \frac{\rho\,dV}{4\pi\epsilon_0\xi} $$

were encountered, the first being a differential equation for the electric potential, and
the second its solution. But from (4.39), a component of A, for example the y component,
is given by

$$ A_y = \int_V \frac{\iota_y\,dV}{4\pi\mu_0^{-1}\xi} $$

and therefore must satisfy Poisson's equation, namely,

$$ \nabla^2 A_y = -\frac{\iota_y}{\mu_0^{-1}} \tag{4.49} $$

If both sides of (4.49) are multiplied by 1_y and the result is added to similar terms
involving the x and z components, one obtains

$$ \nabla^2\mathbf{A} = -\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{4.50} $$

and A(x,y,z) is seen to satisfy a vector form of Poisson's equation.

Since B ≡ ∇ × A, use of the vector identity (V.113) gives

$$ \nabla \times \mathbf{B} = \nabla \times \nabla \times \mathbf{A} = \nabla(\nabla \cdot \mathbf{A}) - \nabla^2\mathbf{A} = -\nabla^2\mathbf{A} $$

by virtue of (4.48). Thus

$$ \nabla \times \mathbf{B} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}} $$

which means that

$$ \nabla \times \mathbf{H}_0 = \boldsymbol{\iota} \tag{4.51} $$

This is the differential form of Ampere's circuital law. Integration gives

$$ \int_S \nabla \times \mathbf{H}_0 \cdot d\mathbf{S} = \int_S \boldsymbol{\iota} \cdot d\mathbf{S} $$

in which S is an open surface bounded by the closed contour C. Application of Stokes'
theorem yields

$$ \oint_C \mathbf{H}_0 \cdot d\boldsymbol{\ell} = \int_S \boldsymbol{\iota} \cdot d\mathbf{S} = I_{\text{enclosed}} \tag{4.52} $$

This is the integral form of Ampere's circuital law and it plays the same role in magnetostatics
that Gauss' law does in electrostatics.
EXAMPLE 4.9
The results of Example 4.1 can be used to deduce that the magnetic field due to a steady
current in a long straight wire is

$$ \mathbf{H}_0 = \mathbf{1}_\phi \frac{I}{2\pi r} $$

at points not too far removed from the wire nor too near its ends. Let a closed contour C be
erected which encloses such a wire. An element of length along C, expressed in cylindrical
coordinates, is (cf. Example V.17)

$$ d\boldsymbol{\ell} = \mathbf{1}_r\,dr + \mathbf{1}_\phi\,r\,d\phi + \mathbf{1}_z\,dz $$

and therefore,

$$ \oint_C \mathbf{H}_0 \cdot d\boldsymbol{\ell} = \oint_C \left(\frac{I}{2\pi r}\right)(r\,d\phi) = \frac{I}{2\pi}\int_0^{2\pi} d\phi = I $$

which agrees with Ampere's law.
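The contour in this example need not be circular, which a short numerical experiment makes concrete. The Python sketch below evaluates the line integral of H0 around an arbitrary closed polygon in a transverse plane, for a long straight wire on the Z axis. The function names, the polygon representation as a list of (x, y) vertices, and the midpoint-rule step count are assumptions made for illustration; the contour must not pass through the wire itself.

```python
import math


def h_field(current, x, y):
    """(H_x, H_y) of H0 = 1_phi I / (2 pi r) for a wire along the Z axis."""
    r2 = x * x + y * y
    scale = current / (2.0 * math.pi * r2)
    return (-scale * y, scale * x)       # 1_phi = (-y/r, x/r)


def circulation(current, vertices, steps_per_side=2000):
    """Midpoint-rule line integral of H0 . dl around a closed polygon."""
    total = 0.0
    count = len(vertices)
    for i in range(count):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % count]
        dx = (x1 - x0) / steps_per_side
        dy = (y1 - y0) / steps_per_side
        for k in range(steps_per_side):
            t = (k + 0.5) / steps_per_side
            hx, hy = h_field(current, x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            total += hx * dx + hy * dy
    return total
```

A counterclockwise rectangle that surrounds the wire returns a circulation very close to I, while one that lies entirely to one side of the wire returns a value very close to zero, in keeping with (4.52).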



EXAMPLE 4.10
Consider an infinitely long cylindrical tube, shown in cross section in the figure, which
carries a uniformly distributed axial current I. What magnetic field is caused by this system
of sources?

[Figure: cross section of the cylindrical tube, with inner radius a and outer radius b.]

An answer can be given to this question by first noting that symmetry requires that H0 be
independent of φ; since A has only an axial component, ∇ × A has only a φ component.
Therefore the magnetic field is a function H_φ(r).
Next imagine that a concentric circular contour C of radius r has been constructed in a
transverse z plane. If r ≤ a, ∮_C H_φ dℓ = 0 and therefore,

$$ H_\phi \equiv 0 \qquad r \le a $$

If a ≤ r ≤ b, some current is enclosed by C. The uniform current density is given by

$$ \iota_z = \frac{I}{\pi(b^2 - a^2)} $$

and thus

$$ \int_0^{2\pi} H_\phi(r)\,r\,d\phi = \int_a^r \frac{I}{\pi(b^2 - a^2)}\,2\pi r'\,dr' $$

$$ 2\pi r H_\phi(r) = I\,\frac{r^2 - a^2}{b^2 - a^2} $$

$$ H_\phi(r) = \frac{I}{2\pi r}\,\frac{r^2 - a^2}{b^2 - a^2} \qquad a \le r \le b $$

Finally, if r ≥ b, all the current is enclosed by C and

$$ H_\phi(r) = \frac{I}{2\pi r} \qquad b \le r $$

Interior points of the tube are shielded; at all exterior points the field acts as though the
entire current were concentrated on the axis.
By superposition, if two concentric conducting tubes carry equal and opposite steady
currents, which are uniformly distributed, the field between them is

$$ H_\phi(r) = \frac{I}{2\pi r} $$

Throughout the hollow interior of the inner tube and outside the outer tube the field is
everywhere zero.
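The three regions of this example map naturally onto a piecewise function. The Python sketch below returns H_φ(r) for a single tube and, by superposition, for two concentric tubes carrying equal and opposite currents. The function names and the argument order are assumptions made for illustration.

```python
import math


def h_phi_tube(current, a, b, r):
    """H_phi(r) for a tube of inner radius a and outer radius b carrying I."""
    if r <= a:
        return 0.0                                   # hollow interior is shielded
    if r <= b:
        enclosed_fraction = (r * r - a * a) / (b * b - a * a)
        return current / (2.0 * math.pi * r) * enclosed_fraction
    return current / (2.0 * math.pi * r)             # all current enclosed


def h_phi_coaxial(current, a_in, b_in, a_out, b_out, r):
    """Two concentric tubes: +I in the inner tube, -I in the outer tube."""
    return (h_phi_tube(current, a_in, b_in, r)
            - h_phi_tube(current, a_out, b_out, r))
```

The field is continuous at r = a and r = b, and for the coaxial pair it reduces to I/(2πr) between the tubes and to zero outside the outer tube, matching the statements of the example.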
EXAMPLE 4.11
Consider the long thin and tightly wound circular cylindrical solenoid shown in cross
section in the figure. Let a-a' represent the central transverse plane with P any point
(external or internal) in a-a'. Let the first task be to find B(P). If I is the steady current in
the winding, coming out of the paper at the upper cross section of each turn, symmetrically
disposed pairs of current elements I dℓ and I dℓ' can be found, such as the two shown in the
figure at distances ξ and ξ' from P. These two current elements will make contributions to
the magnetic flux density at P which can be written

$$ d\mathbf{B} = \frac{I\,d\boldsymbol{\ell} \times \boldsymbol{\xi}}{4\pi\mu_0^{-1}\xi^3} $$

and which are shown in the figure. By symmetry these two contributions sum to a longitudinal
component of B only. When all the current elements in the solenoid are paired in
this fashion, it is evident that the entire B field is longitudinal at every point in the central
transverse plane a-a'.

[Figure: cross section of the solenoid, showing the transverse planes a-a' and b-b'.]

Next consider the field B(P1) at an internal point in a noncentral transverse plane, such
as b-b'. One can begin to construct B(P1) by once again considering pairs of current elements,
this time symmetrically disposed about P1. After awhile, all of the current elements
to the left of P1 will have been used up, but there will still be some left over far to the right
of P1. However, since the solenoid is long and thin, these leftover current elements can be
considered to advantage in pairs of a different sort, such as I dℓ' and I dℓ''. (See figure.)
If b-b' is not too near either end, the position vectors drawn from I dℓ' to P1 and from I dℓ''
to P1 must be almost parallel as well as almost of equal length. Since the two current
elements are oppositely directed, it follows that their paired contribution to B(P1) is
negligible. Thus if the transverse plane b-b' is sufficiently remote from an end, B is essentially
longitudinal at all interior points of b-b'.

[Figure: solenoid with the coordinate system and the dotted contour 1234.]

With this information about the nature of the field inside the solenoid, Ampere's circuital
law can be applied to the contour 1234 shown dotted in the second figure. Since this contour
encloses no current, and since B is essentially perpendicular to the legs 23 and 14, it follows
that

$$ \int_1^2 \mathbf{B} \cdot d\boldsymbol{\ell} = \int_4^3 \mathbf{B} \cdot d\boldsymbol{\ell} $$

and thus that B is uniform over a transverse plane inside the solenoid, provided this transverse
plane is not too near either end.
Finally, B can be deduced at a point P far removed from the solenoid, with the use of the
coordinate system indicated in the second figure. If a is the radius of the solenoid, L its
length, and r the distance to the remote point P, then a ≪ L ≪ r. Let n be the number of
turns per unit length of the solenoid, so that n dζ is the number of turns of a flat loop at the
source position ζ. On the basis of Equation (4.44), the contribution to A at P for this flat
loop is

$$ d\mathbf{A} = \frac{\pi a^2 I n\,d\zeta\,(\mathbf{1}_z \times \boldsymbol{\xi})}{4\pi\mu_0^{-1}\xi^3} $$

in which ξ = 1_x x + 1_y y + 1_z(z − ζ) is the position vector drawn from the center of this
loop to the distant point P(x,y,z).
Since r ≫ L, ξ varies insignificantly as ζ ranges from −L/2 to +L/2. Thus

$$ \mathbf{A} \cong \frac{\pi a^2 I n L}{4\pi\mu_0^{-1} r^3}\,\mathbf{1}_z \times \mathbf{r} $$

and

$$ \mathbf{B} = \nabla \times \mathbf{A} \cong \frac{\pi a^2 I n L}{4\pi\mu_0^{-1} r^3}(\mathbf{1}_r\,2\cos\theta + \mathbf{1}_\theta \sin\theta) $$

and it is as though the entire solenoid were concentrated in the XY plane.
These conclusions permit an approximate sketching of the B field for a long slender
solenoid, with the result suggested by the third figure.

4.9 BOUNDARY-VALUE PROBLEMS IN MAGNETOSTATICS


In connection with Equation (4.50), it has been noted that A satisfies a vector form
of Poisson's equation; in regions removed from the current sources this reduces
to a vector form of Laplace's equation. Therefore, all the techniques discussed in
Chapter 3 pertinent to solving ∇²Φ = 0 would appear to be applicable to boundary-value
problems concerning A. Unfortunately, the situation is not that simple, due to the
vector nature of A. As discussed in Section V.16 of the Mathematical Supplement, the
Laplacian of a vector function generally involves more than the Laplacian of its scalar
component functions; additional terms may arise through the spatial derivatives of the
unit vectors. Only in rectangular coordinates is this not the case, because the unit
vectors have constant directions. In all other coordinate systems, the change in direc-
tion of these unit vectors with spatial position adds terms which complicate the solution
of the differential equation. For example, in cylindrical coordinates

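$$\nabla^2\mathbf{A} = \mathbf{1}_r\left(\nabla^2 A_r - \frac{2}{r^2}\frac{\partial A_\phi}{\partial\phi} - \frac{A_r}{r^2}\right) + \mathbf{1}_\phi\left(\nabla^2 A_\phi + \frac{2}{r^2}\frac{\partial A_r}{\partial\phi} - \frac{A_\phi}{r^2}\right) + \mathbf{1}_z\nabla^2 A_z \tag{4.53}$$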

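and in spherical coordinates

$$\begin{aligned}
\nabla^2\mathbf{A} = {} & \mathbf{1}_r\left(\nabla^2 A_r - \frac{2A_r}{r^2} - \frac{2}{r^2\sin\theta}\frac{\partial(A_\theta\sin\theta)}{\partial\theta} - \frac{2}{r^2\sin\theta}\frac{\partial A_\phi}{\partial\phi}\right) \\
& + \mathbf{1}_\theta\left(\nabla^2 A_\theta - \frac{A_\theta}{r^2\sin^2\theta} + \frac{2}{r^2}\frac{\partial A_r}{\partial\theta} - \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial A_\phi}{\partial\phi}\right) \\
& + \mathbf{1}_\phi\left(\nabla^2 A_\phi - \frac{A_\phi}{r^2\sin^2\theta} + \frac{2}{r^2\sin\theta}\frac{\partial A_r}{\partial\phi} + \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial A_\theta}{\partial\phi}\right)
\end{aligned} \tag{4.54}$$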

It is apparent that one is generally confronted with the problem of solving more
complicated differential equations than the Laplacian of a scalar function. These
equations can be mixed, and will take different forms as the type of symmetry is
changed. For this reason the techniques tend to be more specialized than was found
to be the case when solving for the electrostatic potential function. A few examples
will serve to illustrate possible approaches.
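The origin of the extra terms in (4.53) can be sketched in a few lines. The only inputs are the standard cylindrical relations $\partial\mathbf{1}_r/\partial\phi = \mathbf{1}_\phi$ and $\partial\mathbf{1}_\phi/\partial\phi = -\mathbf{1}_r$; the working below is offered as an illustration under that assumption and is not a passage from the Mathematical Supplement.

```latex
\begin{align*}
\frac{\partial^2}{\partial\phi^2}\left(\mathbf{1}_r A_r\right)
  &= \mathbf{1}_r\left(\frac{\partial^2 A_r}{\partial\phi^2} - A_r\right)
   + 2\,\mathbf{1}_\phi\,\frac{\partial A_r}{\partial\phi} \\
\frac{\partial^2}{\partial\phi^2}\left(\mathbf{1}_\phi A_\phi\right)
  &= \mathbf{1}_\phi\left(\frac{\partial^2 A_\phi}{\partial\phi^2} - A_\phi\right)
   - 2\,\mathbf{1}_r\,\frac{\partial A_\phi}{\partial\phi}
\end{align*}
```

Dividing by r², as the φ-derivative part of the Laplacian requires, reproduces the terms $-A_r/r^2$ and $-(2/r^2)\,\partial A_\phi/\partial\phi$ in the $\mathbf{1}_r$ component of (4.53), and the terms $-A_\phi/r^2$ and $+(2/r^2)\,\partial A_r/\partial\phi$ in the $\mathbf{1}_\phi$ component. No such terms arise for $\mathbf{1}_z A_z$, because $\mathbf{1}_z$ has a constant direction.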
EXAMPLE 4.12
A simple configuration in cylindrical coordinates has been treated in Example 4.10, that of
an infinitely long cylindrical tube carrying a uniformly distributed axial current. Such a
current distribution yields an A which is entirely axial and a function only of r. But if
$\mathbf{A} = \mathbf{1}_z A_z(r)$, inspection of (4.53) reveals that $\nabla^2\mathbf{A} = \mathbf{1}_z\nabla^2 A_z$. Therefore, in this case Poisson's
Equation (4.50) is simply

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dA_z}{dr}\right) = \begin{cases} -\dfrac{I}{\pi\mu_0^{-1}(b^2-a^2)} & a < r < b \\[2ex] 0 & r < a,\ r > b \end{cases}$$

The solutions to these equations may be written

$$A_z(r) = \begin{cases} C_1 & r < a \\[1ex] -\dfrac{I r^2}{4\pi\mu_0^{-1}(b^2-a^2)} + C_2\ln r + C_3 & a < r < b \\[2ex] C_4\ln r + C_5 & r > b \end{cases}$$

in which the $C_i$ are constants of integration. Determination of the values of $C_1$, $C_3$, and $C_5$
is not important, since they vanish in taking the curl of A to find the magnetic field. The
requirement that $\partial A_z/\partial r$ be continuous across the interfaces leads to the evaluations

$$C_2 = \frac{I a^2}{2\pi\mu_0^{-1}(b^2-a^2)} \qquad C_4 = -\frac{I}{2\pi\mu_0^{-1}}$$
When these values for the constants are substituted in the above expressions for A z, per-
formance of the curl operation yields expressions for the magnetic field in the three regions
which are in agreement with the results of Example 4.10.
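The agreement with Ampere's circuital law can be confirmed numerically. The sketch below is an illustration under stated assumptions rather than part of the original example: it writes μ0 in place of 1/μ0⁻¹ with its SI value, sets the additive constants C1, C3, and C5 to zero, takes C2 = μ0 I a²/[2π(b² - a²)] and C4 = -μ0 I/(2π) from continuity of ∂A_z/∂r at r = a and r = b, and uses hypothetical function names. It then forms B_φ = -dA_z/dr by central difference and compares it with the circuital-law result in each region.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space in H/m (SI value, assumed here)


def a_z(r, current, a, b):
    """Axial vector potential of the tube, additive constants set to zero."""
    k = MU0 * current / (2 * math.pi * (b * b - a * a))
    if r < a:
        return 0.0
    if r <= b:
        return -0.5 * k * r * r + k * a * a * math.log(r)
    return -MU0 * current / (2 * math.pi) * math.log(r)


def b_phi_from_potential(r, current, a, b, h=1e-7):
    """B_phi = -dA_z/dr by central difference (use away from r = a and r = b)."""
    return -(a_z(r + h, current, a, b) - a_z(r - h, current, a, b)) / (2 * h)


def b_phi_ampere(r, current, a, b):
    """B_phi from Ampere's circuital law, for comparison."""
    if r < a:
        enclosed = 0.0
    elif r <= b:
        enclosed = current * (r * r - a * a) / (b * b - a * a)
    else:
        enclosed = current
    return MU0 * enclosed / (2 * math.pi * r)


if __name__ == "__main__":
    current, a, b = 10.0, 0.01, 0.02  # amperes, metres, metres (illustrative values)
    for r in (0.005, 0.015, 0.04):
        print(r, b_phi_from_potential(r, current, a, b), b_phi_ampere(r, current, a, b))
```

Because the additive constants were dropped, this A_z is not continuous at the interfaces, which is why the difference quotient is only evaluated at interior points of each region.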

EXAMPLE 4.13
A problem in spherical coordinates which can be extended to several practical situations
involves a φ-directed sheet of current lying in a thin spherical shell of radius a. If δ is the
thickness of the shell, then j = ιδ amp/m can be taken as the lineal current density in the
surface of the shell. It will be assumed that $\mathbf{j} = \mathbf{1}_\phi\, j_\phi(\theta)$; that is, the current density will not
be taken as a function of φ.
It follows that A will have only a φ component, which is a function of r and θ but not φ.
Inspection of (4.54) indicates that for this case

$$\nabla^2\mathbf{A} = \mathbf{1}_\phi\left(\nabla^2 A_\phi - \frac{A_\phi}{r^2\sin^2\theta}\right) = 0$$

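for points not in the shell. Expansion of the Laplacian operator gives

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial A_\phi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial A_\phi}{\partial\theta}\right) - \frac{A_\phi}{r^2\sin^2\theta} = 0$$

Upon assuming that $A_\phi = f_1(r)f_2(\theta)$ one obtains

$$\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) - n(n+1)f_1 = 0$$

$$\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + \left[n(n+1)\sin\theta - \frac{1}{\sin\theta}\right]f_2 = 0$$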

in which n(n + 1) is a separation constant. Both of these equations were encountered in
Chapter 3 in connection with solutions of Laplace's equation in spherical coordinates. The
most general appropriate solution is

$$\mathbf{A} = \mathbf{1}_\phi\sum_{n=1}^{\infty} a_n\left(\frac{r}{a}\right)^{n} P_n^1(\cos\theta) \qquad r < a$$

$$\mathbf{A} = \mathbf{1}_\phi\sum_{n=1}^{\infty} a_n\left(\frac{a}{r}\right)^{n+1} P_n^1(\cos\theta) \qquad r > a$$

with these series constituting a complete orthogonal set.


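Performance of the curl operation yields

$$\mathbf{B} = \mathbf{1}_r\,\frac{1}{a}\sum_{n=1}^{\infty} n(n+1)\,a_n\left(\frac{r}{a}\right)^{n-1} P_n(\cos\theta) - \mathbf{1}_\theta\,\frac{1}{a}\sum_{n=1}^{\infty}(n+1)\,a_n\left(\frac{r}{a}\right)^{n-1} P_n^1(\cos\theta) \qquad r < a$$

$$\mathbf{B} = \mathbf{1}_r\,\frac{1}{a}\sum_{n=1}^{\infty} n(n+1)\,a_n\left(\frac{a}{r}\right)^{n+2} P_n(\cos\theta) + \mathbf{1}_\theta\,\frac{1}{a}\sum_{n=1}^{\infty} n\,a_n\left(\frac{a}{r}\right)^{n+2} P_n^1(\cos\theta) \qquad r > a$$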

If a contour C is drawn in a plane of constant φ, straddling the shell as shown in the figure, application
of Ampere's circuital law gives

$$j_\phi(\theta)\,a\,d\theta = \mu_0^{-1}\,a\,d\theta\left[\frac{1}{a}\sum_{n=1}^{\infty} n\,a_n P_n^1 + \frac{1}{a}\sum_{n=1}^{\infty}(n+1)\,a_n P_n^1\right]$$

so that

$$j_\phi(\theta) = \frac{\mu_0^{-1}}{a}\sum_{n=1}^{\infty}(2n+1)\,a_n P_n^1(\cos\theta)$$

is the lineal current density, expressed as a sum of orthogonal terms. If the current distribution
is specified, the normalization integral for Legendre polynomials can be used to deduce
the constants $a_n$.
The case in which all of the $a_n$ coefficients are zero except for n = 1 is particularly interesting,
for then inside the shell $B_r = B\cos\theta$ and $B_\theta = -B\sin\theta$ and the field strength is
uniform. The current distribution required to achieve this effect varies as sin θ.
All the foregoing can be extended to problems involving φ-type currents flowing in
spherical volumes by considering such volumes to be composed of nesting spherical shells;
the results given here then become a prototype solution.
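The n = 1 case can be checked by evaluating the interior series directly. The sketch below rests on several assumptions that are not stated in the example: the associated Legendre function is taken as P_n^1(cos θ) = sin θ dP_n/d(cos θ), with no extra sign factor; μ0 is written for 1/μ0⁻¹ with its SI value; `coeffs[n-1]` holds a_n; and the function names are hypothetical. It sums the interior field B_r, B_θ and the sheet current j_φ from the series for the shell of radius `radius`.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space in H/m (SI value, assumed here)


def legendre_and_derivative(n_max, x):
    """Return lists P_n(x) and dP_n/dx for n = 0..n_max from standard recurrences."""
    p = [1.0, x]
    dp = [0.0, 1.0]
    for n in range(1, n_max):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
        dp.append(dp[n - 1] + (2 * n + 1) * p[n])
    return p[: n_max + 1], dp[: n_max + 1]


def field_inside(coeffs, radius, r, theta):
    """(B_r, B_theta) for r < radius, summed from the interior series."""
    x, s = math.cos(theta), math.sin(theta)
    p, dp = legendre_and_derivative(len(coeffs), x)
    b_r = 0.0
    b_t = 0.0
    for n, a_n in enumerate(coeffs, start=1):
        scale = a_n * (r / radius) ** (n - 1) / radius
        b_r += n * (n + 1) * scale * p[n]
        b_t -= (n + 1) * scale * s * dp[n]  # s * dp[n] is P_n^1(cos theta)
    return b_r, b_t


def sheet_current(coeffs, radius, theta):
    """j_phi(theta) = (1 / (mu0 a)) sum (2n + 1) a_n P_n^1(cos theta)."""
    x, s = math.cos(theta), math.sin(theta)
    _, dp = legendre_and_derivative(len(coeffs), x)
    total = sum((2 * n + 1) * a_n * s * dp[n] for n, a_n in enumerate(coeffs, start=1))
    return total / (MU0 * radius)


if __name__ == "__main__":
    a1, radius = 1e-3, 0.1  # illustrative values
    b_r, b_t = field_inside([a1], radius, 0.05, 0.7)
    # Cartesian z component and the component perpendicular to z
    print(b_r * math.cos(0.7) - b_t * math.sin(0.7), b_r * math.sin(0.7) + b_t * math.cos(0.7))
```

With only a_1 present the field is B = 2a_1/a along z at every interior point, and the corresponding sheet current is j_φ = (3/2)(B/μ0) sin θ under the conventions assumed here.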

Several other techniques have proved helpful in the solution of magnetostatic
boundary-value problems. The differential form of Ampere's law yields ∇ × B = 0
away from the sources, and thus in such regions B may be expressed as the gradient
of a scalar potential function in much the same manner as found in electrostatics. This
technique has been widely used when describing magnetic fields in terms of equivalent
magnetic dipoles.¹⁶
Since ∇ · A = 0 it is possible to introduce a vector function W by the relation
A = ∇ × W. In turn, W has proved to be expressible as a series of orthogonal functions,
and a variety of problems are solvable by this technique,¹⁷ including the spherical
shell of current discussed in Example 4.13.

4.10 COMPOSITE FIELDS

At this stage in the analysis, it is possible to formulate an expression for the force F
on a charge q which is moving through XYZ at a velocity v(t), when that force is
contributed to by a composite of three sets of sources:

1. A static volume charge distribution ρ1(ξ,η,ζ).
2. A system of uncompensated charges ρ2(ξ,η,ζ) dV moving through space at the
constant† velocities v2(ξ,η,ζ).
3. A system of compensated charges ρ3(ξ,η,ζ) dV moving at the constant velocities
v3(ξ,η,ζ). There are stationary charges -ρ3(ξ,η,ζ) dV providing the compensation,
and one may talk conveniently of the charge pairs (ρ3 dV, -ρ3 dV).

Through the use of the Dirac delta function, these volume charge densities can equally
well represent surface and lineal distributions, or discrete point charges.
The force on q is given by the Lorentz force law

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$$

in which

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi\epsilon_0}\int_V \frac{\rho\,\mathbf{R}\,dV}{R^3} \tag{4.55}$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi\mu_0^{-1}}\int_V \frac{\boldsymbol{\iota}\times\mathbf{R}\,dV}{R^3} \tag{4.56}$$

with ρ = ρ1 + ρ2 and ι = ρ2v2 + ρ3v3, and with R the vector drawn from the source point
(ξ,η,ζ) to the field point (x,y,z). The fields E and B, as given by (4.55) and
(4.56), satisfy the differential relations

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0} \qquad \nabla\times\mathbf{E} = 0 \qquad \nabla\cdot\mathbf{B} = 0 \qquad \nabla\times\mathbf{B} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{4.57}$$

These composite fields, which are due to the most general aggregation of time-independent
sources, are therefore the most general electrostatic and magnetostatic
fields obtainable.

† By this it should be recalled one means that the charge ρ2 dV, which is instantaneously in that
volume element dV which contains the point (ξ,η,ζ) has, for the moment, the particular velocity
v2(ξ,η,ζ). The identity of the charge in dV keeps changing, but on the time-average there is always
charge at this position with this velocity. Alternatively, if the progress of a specific charge is followed,
it will be found to occupy a succession of positions, momentarily taking on a progression of velocities
v2, which need have neither the same magnitudes nor directions.
16 M. Abraham and R. Becker, The Classical Theory of Electricity and Magnetism, 2d English ed.,
Chap. 7, Hafner Publishing Company, New York, 1949.
17 W. R. Smythe, Static and Dynamic Electricity, pp. 260-271, McGraw-Hill Book Company, New
York, 1939.
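Once E and B are known at the position of the charge, evaluating the Lorentz force is a matter of one cross product. The following minimal sketch assumes SI units and fields already evaluated at the charge; the helper names `cross` and `lorentz_force` and the numerical values are illustrative assumptions, not quantities taken from the text.

```python
def cross(u, v):
    """Cartesian cross product u x v of two 3-component tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def lorentz_force(q, e_field, velocity, b_field):
    """F = q (E + v x B), with all vectors given as (x, y, z) tuples in SI units."""
    v_cross_b = cross(velocity, b_field)
    return tuple(q * (e + c) for e, c in zip(e_field, v_cross_b))


if __name__ == "__main__":
    q_electron = -1.602e-19      # coulombs
    e_field = (100.0, 0.0, 0.0)  # volts per metre (illustrative)
    velocity = (1.0e6, 0.0, 0.0) # metres per second (illustrative)
    b_field = (0.0, 0.0, 1.0e-3) # webers per square metre (illustrative)
    print(lorentz_force(q_electron, e_field, velocity, b_field))
```

The electric term acts along E regardless of the motion, while the magnetic term is always perpendicular to both v and B and vanishes for a charge at rest.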

REFERENCES

1. Abraham, M., and R. Becker, The Classical Theory of Electricity and Magnetism, 2d
English ed., Hafner Publishing Company, New York, 1949.
2. Corson, D. R., and P. Lorrain, Introduction to Electromagnetic Fields and Waves, W. H.
Freeman and Company, San Francisco, 1962.
3. Duckworth, H. E., Electricity and Magnetism, Holt, Rinehart and Winston, Inc., New
York, 1960.
4. Langmuir, R. V., Electromagnetic Fields and Waves, McGraw-Hill Book Company, New
York, 1961.
5. Page, L., and N. I. Adams, Jr., Electrodynamics, D. Van Nostrand and Company, Inc.,
New York, 1940.
6. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-Wesley
Publishing Company, Inc., Reading, Massachusetts, 1956.
7. Shedd, P. C., Fundamentals of Electromagnetic Waves, Prentice-Hall, Inc., Englewood
Cliffs, New Jersey, 1954.
8. Smythe, W. R., Static and Dynamic Electricity, McGraw-Hill Book Company, New York,
1939.
9. Whittaker, E., A History of the Theories of Aether and Electricity, Vol. 1, Thomas Nelson
and Sons, Ltd., London, 1951.

PROBLEMS

4.1 Two straight horizontal insulated aluminum wires carry equal steady currents in opposite
directions. If each wire is 0.5 cm in diameter and one wire lies on top of the other, what
current will barely cause separation?
4.2 A rectangular loop consists of a U-shaped conductor and a sliding bar. Find the force on
the bar as a function of the dimensions of the loop and the strength of the steady cur-
rent I which flows in the loop.
4.3 Referring to Figure 4.1c, Biot bent the second wire into the form of a right angle and
found that the period of oscillation of the needle, with a steady current I passing through
the bent wire, was T1. Upon passing half this much current through the straight vertical
wire, he measured a period T2. The ratio T2/T1 was independent of the distance r and had a
mean value of 0.917. Is this result compatible with the Biot-Savart law? How should the
ratio of periods vary with the apex angle of the bent wire?
4.4 A circular loop carrying a current I and a long straight wire carrying a current I' lie in
the same plane. Show that the mutual force is proportional to II' (sec a - 1) with a the
angle subtended by the circle at the nearest point of the straight wire.
4.5 If in the last problem the straight wire is placed perpendicular to the plane of the loop,
show that a torque exists tending to set the two wires in the same plane. Does your
answer depend on whether or not the straight wire is within the loop?
4.6 Show that the net self-force on a plane circular loop carrying a steady current is zero.
4.7 Show that a simple loop of arbitrary shape carrying a steady current tends to assume the
form of a plane circular loop.
4.8 Two circular wires of radii a and b have a common center and are free to turn on an axis
which is a diameter of both. Find the torque existing between these coils if they carry
steady currents Ia and Ib and are (1) at right angles, and (2) in the same plane.
4.9 For a current-carrying circular loop of radius a, show that the rate of change of field
along the axis is constant at a distance a/2 from the center. Thus show that if two identical
coils are placed coaxially a distance apart equal to their radii, an extended region
exists in which the magnetic field strength is essentially constant. Two such coils arranged
in this manner are known as Helmholtz coils.
4.10 A long thin solenoid of length L and radius a carries a steady current I through its N turns.
Find the B field along the axis.
4.11 A high-speed electron has a dynamic mass which is 1.5 times its rest mass. At what
radius of curvature will it travel in a perpendicular magnetic field of strength B0 =
0.5 webers/m²?
4.12 In a parallel plane diode operating at a constant potential Vo, the electrons normally
travel directly across from cathode to plate. If a transverse uniform magnetic field of
strength B0 is interposed, what must be the value of B0 just to prevent the electrons from
reaching the plate?
4.13 If ions of mass M and charge q are injected transversely into a region of uniform magnetic
field B0, after having accelerated through a potential V0, show that they will travel a
circular path of radius

$$R = \left[2\,\frac{V_0}{B_0^2}\,\frac{M}{q}\right]^{1/2}$$

If a variety of ions of different mass and the same charge are collected after having
traveled through a semicircle, they will be separated laterally due to their different orbital
radii. This is the principle of the mass spectroscope.
4.14 In a Wien velocity filter, ions of a particular velocity Vo are not deflected in passing
through a region containing steady electric and magnetic fields. How are the fields
arranged to accomplish this?
4.15 What magnetic moment results if a spherical conductor of radius a, possessing a net
charge Q, rotates with a constant angular velocity w?
4.16 Find the magnetic field everywhere if two infinite coaxial cylindrical shells of radii a < b
carry equal and opposite steady currents I.
4.17 An infinitely long cylindrical shell is segmented into four equal quarter-circles. These four
segments carry axial currents in alternate directions of uniform lineal density j amp/m.
If the cylinder radius is a, find the magnetic field B at all points.
4.18 A fine wire is wound in the form of a flat spiral of N turns, shaped like a disc of radius a.
Find the magnetic dipole moment if a steady current I flows through this winding.
4.19 A toroidal coil consists of a large number N of turns of wire and carries a steady current I.
Use Ampere's circuital law to determine the field at any point inside the toroid. If a and b
are the inner and outer radii of the toroid, find the percent variation in B over a cross
section as a function of b/ a.
4.20 A fine wire is wound in a single layer of N closely spaced turns on the surface of an insu-
lating spherical shell, such that the axes of the turns coincide with the polar axis of the
sphere. If a steady current I is passed through the winding, find the magnetic field
everywhere.
CHAPTER 5
Electromagnetics in Free Space
AS THE STRUCTURE of the word implies, electromagnetics is concerned with interrelated
electric and magnetic fields, an effect which occurs when the two fields are time-varying.
This interrelation is normally introduced by accepting Faraday's emf law as an experi-
mental postulate and adding to it the continuity equation for current, from which
ultimately Maxwell's equations may be deduced. However, the approach to be pre-
sented here will not require this additional experimental postulate. Instead, the most
general static and electric fields will be created in one coordinate system and the
resulting force expression transformed to another (moving) coordinate system. The
transformed force expression will be recognized as a generalization of the Lorentz
force law, and permits the definition of time-varying electric and magnetic fields. These
fields are then shown to be related through Maxwell's equations, one consequence of
which is Faraday's emf law. This approach provides the additional satisfaction of
identifying the electromagnetic fields in the Lorentz force law and in Maxwell's
equations as one and the same, an identity which can only be postulated in the cus-
tomary derivation.
After Maxwell's equations have been established, the vector Green's theorem is used
to obtain a general solution for the electromagnetic field. Conditions at infinity are
studied, and convergence is demonstrated for real sources. The wavelike nature of the
general solution is demonstrated and then Poynting's theorem is derived to show the
energy content transported by these waves. The chapter continues with discussions of
solutions to the vector wave equation in rectangular, cylindrical, and spherical coordi-
nate systems and concludes with a Minkowskian formulation of the field equations.

5.1 * HISTORICAL SURVEY


The two major discoveries on which the theory of time-varying electromagnetic fields
is ordinarily based were made by Michael Faraday (1791-1867) and James Clerk Maxwell
(1831-1879). Faraday's discovery was experimental and consisted of the significant
observation that a changing magnetic field would induce an electric field. Maxwell
was led by an analogy to the theoretical conclusion that the converse was also true,
namely, that a changing electric field would induce a magnetic field. In this respect,
time-varying electric fields play the same role as conduction currents, and Maxwell
combined the two into a total current which he showed to be continuous. The mathematical
formulation of all these effects and their interrelations constitutes what is
known as Maxwell's theory.
* This section may be omitted without loss in continuity of the technical presentation.
Faraday was undoubtedly motivated in his discovery by what appears to have been
a basic tenet of his scientific philosophy-that every cause and effect has its converse.
Thus since Oersted's experiment and many developments which followed had clearly
shown that electricity can produce magnetic effects, it was reasonable to expect that
magnetism should be able to produce electricity. Faraday attacked this problem many
times without success. His laboratory notebook contains an entry dated December 28,
1824 describing an experiment in which a magnet was placed inside a helical coil
" . . . but in no case did the magnet seem to affect the current so as to alter its intensity
as shewn upon a magnetic needle placed under a distant part of it. . ."
Again, on November 28, 1825, his laboratory notes refer to a battery-connected wire
" . . . parallel to which was another similar wire separated from it only by two thick-
nesses of paper. The ends of the latter wire attached to a galvanometer exhibited no
action." Replacing either straight wire by a helix also had no effect.
A third try was recorded on April 22, 1828. Faraday suspended a copper ring by a
thread and placed a bar magnet inside the ring but could detect no induced current.
Faraday's efforts were paralleled by those of many other scientists, but no one was
having any appreciable measure of success. The difficulty lay in the fact that everyone
was looking for the creation of a steady current. Perhaps the most significant discovery
had been made by Arago in 1824. He suspended a magnetic compass needle over a
copper plate and set it into oscillation, noting that the presence of the copper plate
enhanced the damping. Upon eliminating air disturbances and rotating the copper
plate, Arago was able to make the needle revolve also, and even showed that this
dragging effect depended on the conductivity of the rotating plate. Faraday repeated
Arago's experiment in 1825 but, despite the suggestiveness of the results, the true
explanation of the phenomenon eluded both investigators.
Finally, on August 29, 1831, six years after his first attempt, Faraday discovered the
effect he had been seeking. His notes for that day state:¹
Have had an iron ring made (soft iron), iron round and 7/8 inches thick and ring 6 inches in
external diameter. Wound many coils of copper wire round one half, the coils being separated
by twine and calico -- there were three lengths of wire each about 24 feet long and they
could be connected as one length or used as separate lengths. By trial with a trough each
was insulated from the other. Will call this side of the ring A. On the other side but
separated by an interval was wound wire in two pieces together amounting to about 60 feet
in length, the direction being as with the former coils; this side call B.
Charged a battery of 10 pr. plates 4 inches square. Made the coil on B side one coil and
connected its extremities by a copper wire passing to a distance and just over a magnetic
needle (3 feet from iron ring). Then connected the ends of one of the pieces on A side with
battery; immediately a sensible effect on needle. It oscillated and settled at last in
original position. On breaking connection of A side with Battery again a disturbance of the
needle.
Made all the wires on A side one coil and sent current from battery through the whole.
Effect on needle much stronger than before.
1 Faraday's Diary, vol. 1, p. 367. Published by G. Bell and Sons, Ltd., London, 1932.

This discovery of transformer action quickly led Faraday to an appreciation of the
entire effect. On September 24th he tried a different experiment. Using a remote helix
and compass needle as indicator, he wrapped a helical coil around a soft iron cylinder
and built up an apparatus which he described as follows:²

The iron cylinder and helix . . . . All the wires made into one helix and
these connected with the indicating helix at distance by copper wire: Then the
iron placed between the poles of bar magnets as . . . in fig. Every time the
magnetic contact at N or S was made or broken there was magnetic motion
at the indicating helix, the effect being as in former cases not permanent, but
a mere momentary push or pull. But if the electric communication (i.e. by the
copper wire) was broken then these disjunctions and contacts produced no effect
whatever. Hence here distinct conversion of Magnetism into Electricity.

On October 1st Faraday repeated the transformer experiment but
with a wooden core, and once again obtained the same effect, though
enough weaker that he had to substitute a galvanometer for the indicating
helix. He concluded: "Hence there is an inducing effect without the
presence of iron . . . ."
Finally, on October 17th, Faraday performed the most significant experiment
of all. He prepared a helical wire in the form of a cylinder and then³

. . . . a cylindrical bar magnet 3/4 inch in diameter and 8 1/2 inches in length had one end
just inserted into the end of the helix cylinder -- then it was quickly thrust in the whole
length and the galvanometer needle moved -- then pulled out and again the needle moved but
in the opposite direction. This effect was repeated every time the magnet was put in or out
and therefore a wave of Electricity was so produced from mere approximation of a magnet
and not from its formation in situ.

As noted in Chapter 3, Faraday preferred to think of all electric and magnetic effects
in terms of lines of force, having been first attracted to this view by observing the
disposition of iron filings in the neighborhood of a permanent magnet. He thus sought
to explain this new phenomenon of induced electricity in terms of an interaction with
magnetic flux lines. His raw thoughts on this subject are contained in an entry in the
laboratory notebook dated August 1, 1851, which contains the passages⁴

The force of a given magnet is definite and may be considered as represented by its
curves . . . . The curves . . . exist within the magnet as well as without; but within they
are in the contrary or return direction . . . . Whatever the condition of the interior of the
magnet: it has . . . the same kind and amount of power as the outside, and so is in full
analogy and similitude with an electro helix.
The intensity of the curves of a magnet vary greatly at different distances from the
magnet But the amount of force is definite and the same for every section of all the
curves .
Hence it follows that whether the curves are intersected directly or obliquely makes no
difference provided they are intersected. The effect depends upon the number of curves
intersected. A wire moving obliquely may intersect fewer curves and therefore have a
feebler current evolved in it; but if it intersected only the same curves directly across,
it would have no larger a current.
So with a given moving wire or with a given wire under which a magnet is moving, the
quantity of electricity generated is directly as the amount of curves passed over or through.
With the same curves therefore it varies directly with the velocity of the motion.

2 Ibid., vol. 1, p. 372.
3 Ibid., vol. 1, pp. 375-376.
4 Ibid., vol. 5, pp. 409-411.

This explanation of induction as being due to the relative motion between magnetic
lines of force and a conductor was refined by Faraday and included in a paper read
to the Royal Society later that year.⁵ It was given mathematical articulation by Maxwell
as the equation

$$e = -\frac{d}{dt}\int_S B_n\,dS$$

in which e is the emf induced in a contour C and $\int_S B_n\,dS$ is the total magnetic flux
enclosed by C. If the contour C is occupied by a conductor, e is the source of the resulting
induced electric current. In the above, S is an open surface erected on C as boundary,
and $B_n$ is the normal component of flux density, thus representing the number of
magnetic lines of force per unit area. This famous equation is known as Faraday's emf
law.
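A small numerical illustration of the emf law, e = -d/dt ∫ B_n dS, may help fix the sign and the dependence on the rate of change. The sketch assumes a plane loop of area `area` turning at angular rate `omega` in a uniform field `b`, so that the enclosed flux is b·area·cos(ωt); the function names and numerical values are illustrative assumptions rather than anything drawn from Faraday's or Maxwell's papers.

```python
import math


def flux(t, b=0.2, area=0.01, omega=2 * math.pi * 50):
    """Flux through a plane loop whose normal makes the angle omega * t with a uniform B."""
    return b * area * math.cos(omega * t)


def emf(t, h=1e-7, **loop):
    """e = -d(flux)/dt, estimated by a central difference in time."""
    return -(flux(t + h, **loop) - flux(t - h, **loop)) / (2 * h)


def emf_exact(t, b=0.2, area=0.01, omega=2 * math.pi * 50):
    """Closed form for the same loop: e = b * area * omega * sin(omega * t)."""
    return b * area * omega * math.sin(omega * t)


if __name__ == "__main__":
    for t in (0.0, 0.003, 0.005):
        print(t, emf(t), emf_exact(t))
```

The emf vanishes at the instant the flux is greatest and peaks when the flux passes through zero, in keeping with Faraday's observation that the effect depends on the rate at which the curves are intersected.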
After his initial discovery of induction, Faraday continued to experiment with the
phenomenon. On October 28, 1831, he invented the first direct-current generator,
consisting of a copper plate rotating between magnetic poles, with an external circuit
attached between the center and rim of the plate. Through the years Faraday designed
and tested a variety of such generators, and his entry for October 11, 1851 describes
a machine consisting of a rotating wire rectangle with a commutator attached, this being
the prototype of the modern electric generator.
Faraday also discovered the phenomenon of self-induction (in 1834), unaware that
Joseph Henry (1797-1878) had made an independent discovery of the effect two years
earlier.†
It is impossible in a survey this brief to do justice to the painstaking, thorough
manner in which Faraday carefully built and enlarged his knowledge of electrical
phenomena. The interested reader is encouraged to read extensive sections of Faraday's
laboratory notebook in order to gain a full flavor of his accomplishments. As for the
discovery of electromagnetic induction itself, this must rank as one of the most important
contributions ever made to scientific knowledge.

† In fairness to Henry, it should be stated that during this period he and Faraday independently discovered
many important electromagnetic phenomena, including self- and mutual induction and many
of the principles of electric machines. Henry also developed the electromagnetic relay, perfected an
electromagnetic telegraph, and showed that voltage could be stepped up or down by properly proportioning
the coils in a transformer. Henry's lack of promptness in announcing the results of his
experiments has probably been the primary cause of his neglect, but the remoteness of the New World
from the Old, in those days of slow communications, was a contributing factor. Faraday's achievements
were more promptly disseminated to the European centers of learning, and news of Henry's
accomplishments often bore the appearance of mere confirmation of what Faraday had already done.
In the stimulation of further scientific inquiry by others, Faraday's influence was inestimably greater.
5 M. Faraday, "On Lines of Magnetic Force; Their Definite Character; and Their Distribution Within
a Magnet and Through Space," Phil Trans Roy Soc (London), 142, 25-56; 1852.

Although Faraday's law of induction was readily accepted, his explanation in terms
of lines of force fell mainly on deaf ears. The scientists of his day had been reared on
theories of action at a distance, theories which had enjoyed wide success in describing
a variety of electric and magnetic phenomena, as well as gravitational effects. The
eminent Astronomer Royal, Sir George Biddell Airy, declared that he could "hardly
imagine anyone who knows the agreement between observation and calculation based
on action at a distance to hesitate an instant between this simple and precise action
on the one hand and anything so vague and varying as lines of force on the other."
Maxwell was only twenty-four when he undertook to overcome this objection and
place Faraday's ideas on a firm mathematical basis. In the introduction to his first
paper on electricity, he stated that⁶

. . . the limit of my design is to show how, by a strict application of the ideas and methods
of Faraday, the connection of the very different orders of phenomena which he has dis-
covered may be clearly placed before the mathematical mind.

After defining a single line of force as a curve in space whose direction at each point is
that of the force on a positive charge, or the force on an elementary north magnetic
pole, whichever the case may be, Maxwell continued

. . . We might in the same way draw other lines of force, till we had filled all space with
curves indicating by their direction that of the force at any assigned point.
We should thus obtain a geometrical model of the physical phenomena, which would
tell us the direction of the force, but we should still require some method of indicating the
intensity of the force at any point. If we consider these curves not as mere lines, but as
fine tubes of variable section carrying an incompressible fluid, then, since the velocity of
the fluid is inversely as the section of the tube, we may make the velocity vary according
to any given law, by regulating the section of the tube, and in this way we might represent
the intensity of the force as well as its direction by the motion of the fluid in these tubes.

Maxwell then pointed out that if the force law involves distance to the inverse square,
there would be no interstices between his tubes of force.

. . . The tubes will then be mere surfaces, directing the motion of a fluid filling up the
whole space. It has been usual to commence the investigation of the laws of these forces
by at once assuming that the phenomena are due to attractive or repulsive forces acting
between certain points. We may, however, obtain a different view of the subject, and one
more suited to our difficult inquiries, by adopting for the definition of the forces of which
we treat, that they may be represented in magnitude and direction by the uniform motion
of an incompressible fluid.

With this conception, Maxwell proceeded to show that all results obtained for static
charges or permanent magnets, using action-at-a-distance formulas, were also obtain-
able in terms of the distribution of tubes of force. Upon pointing out the equivalence
of a steady current element and a magnetic dipole, he was also able to extend this con-
clusion to magnetic phenomena caused by time-independent currents. However, in
discussing induced electric currents, Maxwell admitted

. . . The idea of the electro-tonic state,† however, has not yet presented itself to my mind
in such a form that its nature and properties may be clearly explained without reference
to mere symbols, and therefore I propose in the following investigation to use symbols
freely, and to take for granted the ordinary mathematical operations. By a careful study
of the laws of elastic solids and of the motions of viscous fluids, I hope to discover a method
of forming a mechanical conception of this electro-tonic state adapted to general reasoning.

6 J. C. Maxwell, "On Faraday's Lines of Force," read to the Cambridge Philosophical Society on
December 10, 1855 and February 11, 1856. Reprinted in Scientific Papers, vol. 1, pp. 155-229, Cambridge
University Press, London, 1890.

Maxwell then concluded this first paper with an extensive mathematical development
in which the vector potential emerged as being representative of the electrotonic state,
its curl giving the magnetic field, and its time derivative yielding the induction effect.
He also showed that the curl of the magnetic field at any point was equal to the current
density at that point.
This first electrical paper by Maxwell can fairly be described as principally achieving
mathematical expression for all known electric and magnetic phenomena in terms of
Faraday's physical conceptions. It exhibits Maxwell's characteristic fondness for
models, a fondness which had led him to construct a top to illustrate the dynamics of a
rigid body rotating about a fixed point, and to construct a model of Saturn's rings
(now in the Cavendish Laboratory) to illustrate the motion of the satellites in the
rings. This rich physical imagination was now to lead Maxwell to his most important
discovery, through an extension of the tube of force model so as to explain the electro-
tonic state. This extension was accomplished in a second paper which appeared six
years later in the Philosophical Magazine, in which he offered the introductory remark⁷
I propose now to examine magnetic phenomena from a mechanical point of view, and to
determine what tensions in, or motions of, a medium are capable of producing the mechani-
cal phenomena observed. If, by the same hypothesis, we can connect the phenomena of
magnetic attraction with electromagnetic phenomena and with those of induced currents,
we shall have found a theory which, if not true, can only be proved to be erroneous by
experiments which will greatly enlarge our knowledge of this part of physics.

It has already been noted that Faraday looked upon electrostatic and magnetic induction
as taking place along curved lines of force. He imagined these lines to be ropes of
molecules starting from a charged conductor or magnet, and acting on other nearby
bodies. These ropes of molecules were in tension, tending to shorten and at the same
time bulge out laterally. Thus the charged conductor or magnet tends to draw bodies
to itself, contracting its lines of force like the fibers of a muscle. Maxwell sought to
represent this longitudinal tension and transverse pressure in terms of equivalent
conditions in a fluid medium.
Let us now suppose that the phenomena of magnetism depend on the existence of a
tension in the direction of the lines of force, combined with a hydrostatic pressure; or in
other words, a pressure greater in the equatorial than in the axial direction: the next question
is, what mechanical explanation can we give of this inequality of pressures in a fluid or
mobile medium? The explanation which most readily occurs to the mind is that the excess
of pressure in the equatorial direction arises from the centrifugal force of vortices or eddies
in the medium having their axes in directions parallel to the lines of force . . . .
† Faraday called the state into which any body was thrown, due to the presence of a magnetic field,
the electrotonic state, and explained induction as being due to changes in the electrotonic state.
7 J. C. Maxwell, "On Physical Lines of Force," Phil Mag, 21, 161-175, 281-291, 338-348; 1861.
Reprinted in Scientific Papers, vol. 1, pp. 451-513, Cambridge University Press, London, 1890.

We shall suppose at present that all the vortices in any one part of the field are revolving
in the same direction about axes nearly parallel, but that in passing from one part
of the field to another, the direction of the axes, the velocity of rotation, and the density
of the substance of the vortices are subject to change. We shall investigate the resultant
mechanical effect upon an element of the medium, and from the mathematical expression
of this resultant we shall deduce the physical character of its different component parts.

In order to have adjacent vortices rotating in the same direction, Maxwell next sup-
posed that there exist between them a large number of minute spherical bodies which
roll, without sliding, in contact with the surfaces of the vortices. These particles, which
Maxwell assumed to constitute electricity, thus play the role of idler wheels. Under
this construction, for example, the static magnetic field of a permanent magnet can be
envisioned as consisting of vortices which fill the tubes of force, with the rotational
velocity of a vortex proportional to the strength of the field and thus varying with tube
cross section. With adjacent vortices in the magnetic field rotating at the same speed
in the same direction, the particles between them rotate idly but remain in the same
position. However, if a change should occur in the magnetic field, this would mean that
one of the vortices began rotating faster than the other, and thus the particles between
them would change position, indicating an electric current. In this way, Maxwell's
model demonstrated the creation of electric currents due to changes in the magnetic
field; hydrodynamical considerations of the relations between the rotational velocities
of adjacent vortices and the displacement of the idler particles led to a mathematical
statement of Faraday's emf law.
It was precisely at this point that the great value of the model became apparent.
If a change in vortex motion can cause a displacement of the idler particles, then the
converse should be true-a displacement of the idler particles should occasion a change
in vortex motion. Cause and effect are interchangeable. A changing magnetic field can
create an electric field; a changing electric field should produce a magnetic field. Max-
well was reaching the heart of his greatest contribution when, in Part 3 of the paper, he
said⁸
According to our theory, the particles which form the partitions between the cells (vor-
tices) constitute the matter of electricity. The motion of these particles constitutes an
electric current; the tangential force with which the particles are pressed by the matter of
the cells is electromotive force, and the pressure of the particles on each other corresponds
to the tension or potential of the electricity.
If we can now explain the condition of a body with respect to the surrounding medium
when it is said to be "charged" with electricity, and account for the force acting between
electrified bodies, we shall have established a connexion between all the principal phenomena
of electrical science.

After pointing out that electromotive force (voltage due to magnetic effects) is the
same thing as electric tension (voltage due to charge separation), Maxwell distinguished
between conductors and insulators, concluding
Here then we have two independent qualities of bodies, one by which they allow of the
passage of electricity through them, and the other by which they allow of electrical action
being transmitted through them without any electricity being allowed to pass. A conducting
body may be compared to a porous membrane which opposes more or less resistance
to the passage of a fluid, while a dielectric is like an elastic membrane which may be
impervious to the fluid, but transmits the pressure of the fluid on one side to that on the
other.

8 Ibid., p. 490.

Maxwell next discussed the relation between conduction current and potential in a
conductor and then went on to say
Electromotive force acting on a dielectric produces a state of polarization of its parts
. . . . In a dielectric under induction, we may conceive that the electricity in each molecule
is so displaced that one side is rendered positively, and the other negatively electrical, but
that the electricity remains entirely connected with the molecule, and does not pass from
one molecule to another.
The effect of this action on the whole dielectric mass is to produce a general displace-
ment of the electricity in a certain direction. This displacement does not amount to a
current, because when it has attained a certain value it remains constant, but it is the
commencement of a current, and its variations constitute currents in the positive or nega-
tive direction, according as the displacement is increasing or diminishing. The amount of
the displacement depends on the nature of the body, and on the electromotive force . . . .

Thus Maxwell introduced for the first time the concept that variations in position of
bound charge were equivalent in their effect to a conduction current. By letting motion
of the idler particles of his model represent either or both, and finding the varia-
tion in vortex velocity due to a particle displacement, he arrived at a generalization of
Ampere's circuital law.
The importance of this generalization cannot be overstated. If motion of the idler
particles could only represent conduction current, then an electrical disturbance could
only propagate through a conductive medium. But with the concept of displacement
current, field changes could be transmitted through dielectric media, including air, and
even including free space (which Maxwell considered to be an ether).
Maxwell recognized that a finite velocity would be associated with the propagation
of any disturbance through his model medium. He described the mechanism of propaga-
tion by imagining that a translational motion of one layer of idler particles would
initiate a change in angular velocity of the contiguous vortices. These in turn would set
the next layer of idler particles into translational motion, and in this manner the
disturbance would be transferred through a sequence of layers. Maxwell computed
the kinetic and potential energy which were transferred in this fashion, thus obtaining
a velocity of transport. By associating kinetic energy and potential energy with the
magnetic and electric fields respectively, he deduced that the velocity of propagation
of an electromagnetic disturbance was governed by the electrostatic permittivity and
magnetostatic permeability of the supporting medium. Upon using the values for
these constants, determined for air by Kohlrausch and Weber, Maxwell deduced that
the velocity of an electromagnetic disturbance should be 193,088 mi/sec. He then
concluded
. . . the velocity of light in air, as determined by M. Fizeau, is . . . 195,647 miles per second.
The velocity of transverse undulations in our hypothetical medium . . . agrees so
exactly with the velocity of light calculated from the optical experiments of M. Fizeau, that
we can scarcely avoid the inference that light consists in the transverse undulations of the same
medium which is the cause of electric and magnetic phenomena.
This discovery may be likened to an earlier occasion when Newton first tested his
law of universal gravitation by making calculations on the distance of the moon. It
was Newton's misfortune to use an inaccurate value for the diameter of the earth, and
this led to such poor agreement that he put the theory aside for nearly two decades.
Maxwell was spared a similar disappointment in that both his value and Fizeau's
were in error in the same direction.
It should be remembered that at this time no one had ever wittingly generated or
detected electromagnetic waves. The concept was completely new, as was the notion
of a displacement current. To link light to these hypothetical phenomena was a flash
of brilliance seldom equalled in the history of science. It was not to be until eight years
after Maxwell's death that these hypotheses would receive substantiation through the
experiments of Hertz.
Maxwell next discarded the model which had served so well as a scaffolding with
which to erect his theory, and in a third paper entitled "A Dynamical Theory of the
Electromagnetic Field," presented the theory completely in electrical terms.⁹ The
properties of the field are described in terms of 20 equations, which include the relation
between displacement current and conduction current, and the continuity equation
linking charge to current, as well as what are now known conventionally as Maxwell's
equations. This paper was so carefully written that it later appears almost intact in
his Treatise.
These accomplishments, added to his contributions in color vision and molecular
theory, have earned Maxwell the place as the greatest theoretical physicist of the nine-
teenth century. At a centenary in 1931 honoring his birth, Max Planck reviewed the
evolution of man's knowledge of electrical phenomena and concluded¹⁰ by saying of
Maxwell

. . . it was his task to build and complete the classical theory, and in so doing he achieved
greatness unequalled. His name stands magnificently over the portal of classical physics,
and we can say this of him: by his birth, James Clerk Maxwell belongs to Edinburgh,
by his personality he belongs to Cambridge, by his work he belongs to the whole world.

Maxwell's equations, as has already been noted in Chapter 2, played a central role
in the development of the theory of special relativity. Lorentz used them as an invariant
to derive the transformation which bears his name, and Einstein devoted much of his
first paper to the same subject. In the sections to follow, this process will in effect be
reversed. The Lorentz transformation has already been established in terms of funda-
mental considerations of length and time measurements. The Lorentz equations will be
used to derive Maxwell's equations from a transformation of Coulomb's law.

5.2 THE TRANSFORMATION EQUATIONS FOR ELECTRIC AND MAGNETIC FIELDS

Suppose that an observer O', stationary in X'Y'Z', has created most general composite
electrostatic and magnetostatic fields, E'(x',y',z') and B'(x',y',z'). He can do this
through the use of three types of sources: (1) a static system of charges, (2) a steady
current consisting of the flow of uncompensated charges, and (3) a steady current
against the background of static compensating charges. This situation is suggested in
Figure 5.1. Formulas for the static fields arising from a composition of these three
types of sources were given in Section 4.10. If, in the presence of these fields, a charge q
is moving through X'Y'Z' at a velocity v', observer O' can say that the force on q is

$$\mathbf{F}' = q(\mathbf{E}' + \mathbf{v}' \times \mathbf{B}') \tag{5.1}$$

which is a use of the Lorentz force law.

[Figure 5.1: axes X', Y', Z' with three source groups labeled "Static charges," "Uncompensated circulating charges," and "Compensated circulating charges," and a charge at (x',y',z') moving with velocity v'(t').]
FIGURE 5.1 Composite sources causing most general static fields B'(x',y',z') and E'(x',y',z')
which interact with a moving charge q.

9 First read to the Royal Society in 1864. Published in Phil Trans Roy Soc (London), 155; 1865.
Reprinted in Scientific Papers, vol. 1, pp. 526-597, Cambridge University Press, London, 1890.
10 James Clerk Maxwell, A collection of commemorative essays, p. 65, The Macmillan Company,
New York, 1931.


Imagine that a second observer O is stationary in a frame XYZ which is in constant
motion with respect to X'Y'Z' such that the respective axes are aligned and X is sliding
along the -X' axis at speed u. Then the Lorentz transformation equations (2.40) are
applicable and observer O will deduce that the force on q is F, where

$$\mathbf{F} = \left[\mathbf{1}_x F'_x + K(\mathbf{1}_y F'_y + \mathbf{1}_z F'_z)\right] + K\,\mathbf{v} \times \left(\mathbf{1}_x \frac{u}{c^2} \times \mathbf{F}'\right) \tag{5.2}$$

in which $K = (1 - u^2/c^2)^{-1/2}$ and v(t) is the velocity of q in XYZ. Equation (5.2) is
merely a restatement of (2.76) and the present development in some ways parallels
the opening development of Chapter 4. Two cases of (5.2) will now be considered.

Case 1: v = 0.
In this case, q is static in XYZ and the force F is just the bracketed term in (5.2).
Observer O, who is accustomed to the idea that magnetic fields exert forces only on
moving charges, will ascribe this force to an electric field, since q is not moving relative
to O. Thus he defines an electric field such that

$$q\mathbf{E} = \mathbf{1}_x F'_x + K(\mathbf{1}_y F'_y + \mathbf{1}_z F'_z) \tag{5.3}$$

This electric field depends on time as well as spatial position because the sources of O'
are moving relative to O.
With q stationary in XYZ, $\mathbf{v}' = -\mathbf{1}_x u$ and (5.1) gives

$$F'_x = qE'_x \qquad F'_y = q(E'_y + uB'_z) \qquad F'_z = q(E'_z - uB'_y) \tag{5.4}$$

Insertion of (5.4) in (5.3) yields the field transformation equations

$$E_x = E'_x \qquad E_y = K(E'_y + uB'_z) \qquad E_z = K(E'_z - uB'_y) \tag{5.5}$$

The electric field in XYZ is seen to be contributed to by both the electric and magnetic
fields of X'Y'Z', and in relative amounts controlled by u.
Case 2: v ≠ 0.
In this case q has an arbitrary motion v(t) in XYZ and the force F is the entire
expression (5.2). The charge q also has an arbitrary motion v'(t') in X'Y'Z', and
(5.1) gives

$$\begin{aligned}
F'_x &= q(E'_x + v'_y B'_z - v'_z B'_y) \\
F'_y &= q(E'_y + v'_z B'_x - v'_x B'_z) \\
F'_z &= q(E'_z + v'_x B'_y - v'_y B'_x)
\end{aligned} \tag{5.6}$$

If the velocity transformation equations (2.50) are used to replace the components of
v' by those of v in (5.6), and the resulting equations are inserted in (5.2), one obtains

$$\begin{aligned}
\mathbf{F} = q\Big\{ &\mathbf{1}_x\Big[E'_x + \frac{Ku}{c^2}(v_y E'_y + v_z E'_z) + K(v_y B'_z - v_z B'_y)\Big] \\
&+ \mathbf{1}_y\Big[K\Big(1 - \frac{u v_x}{c^2}\Big)E'_y + v_z B'_x - K(v_x - u)B'_z\Big] \\
&+ \mathbf{1}_z\Big[K\Big(1 - \frac{u v_x}{c^2}\Big)E'_z - v_y B'_x + K(v_x - u)B'_y\Big]\Big\}
\end{aligned} \tag{5.7}$$

Observer O can introduce a magnetic field B(x,y,z,t) to account for the fact that the
force on q is different because it is now in motion through XYZ. This magnetic field
will be time-varying because the sources of O' are moving relative to O. If observer O
chooses to define B(x,y,z,t) such that the Lorentz force law is still valid, then he can
write

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \tag{5.8}$$

in which qE is the force on q when it is at rest in XYZ, and thus E is as given in Case 1
by Equations (5.5). The magnetic field B must be such that (5.7) and (5.8) equate.
Upon comparing components of these two equations, one finds that

$$B_x = B'_x \qquad B_y = K\Big(B'_y - \frac{u}{c^2}E'_z\Big) \qquad B_z = K\Big(B'_z + \frac{u}{c^2}E'_y\Big) \tag{5.9}$$

These transformation equations, together with (5.5), form a set from which O can
determine the fields which interact with q to produce the force given by the Lorentz
equation (5.8). The sources which have produced these fields are of a restricted class,
being time-independent in X'Y'Z', but this restriction will be lifted shortly.
Upon properly combining (5.5) and (5.9) one can establish that the inverse transformation
is

$$E'_x = E_x \qquad E'_y = K(E_y - uB_z) \qquad E'_z = K(E_z + uB_y) \tag{5.10}$$

$$B'_x = B_x \qquad B'_y = K\Big(B_y + \frac{u}{c^2}E_z\Big) \qquad B'_z = K\Big(B_z - \frac{u}{c^2}E_y\Big) \tag{5.11}$$
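
The transformation pair can be checked numerically. The following is a minimal sketch, not part of the original development: it takes (5.5) and (5.9) in the forms E_x = E'_x, E_y = K(E'_y + uB'_z), E_z = K(E'_z - uB'_y) and B_x = B'_x, B_y = K(B'_y - uE'_z/c²), B_z = K(B'_z + uE'_y/c²), and assumes that the velocity transformation (2.50) has the usual form for a relative speed u along x. All function names and the sample values in the usage note are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_k(u):
    """K = (1 - u^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def fields_seen_by_o(e_p, b_p, u):
    """Fields in XYZ from those in X'Y'Z', per (5.5) and (5.9)."""
    k = lorentz_k(u)
    e = (e_p[0], k * (e_p[1] + u * b_p[2]), k * (e_p[2] - u * b_p[1]))
    b = (b_p[0], k * (b_p[1] - u * e_p[2] / C**2), k * (b_p[2] + u * e_p[1] / C**2))
    return e, b


def fields_seen_by_o_prime(e, b, u):
    """Inverse transformation, per (5.10) and (5.11)."""
    k = lorentz_k(u)
    e_p = (e[0], k * (e[1] - u * b[2]), k * (e[2] + u * b[1]))
    b_p = (b[0], k * (b[1] + u * e[2] / C**2), k * (b[2] - u * e[1] / C**2))
    return e_p, b_p


def lorentz_force(q, e, b, v):
    """F = q(E + v x B), the form shared by (5.1) and (5.8)."""
    v_cross_b = cross(v, b)
    return tuple(q * (ei + wi) for ei, wi in zip(e, v_cross_b))


def velocity_seen_by_o_prime(v, u):
    """Assumed form of the velocity transformation (2.50), XYZ to X'Y'Z'."""
    k = lorentz_k(u)
    d = 1.0 - u * v[0] / C**2
    return ((v[0] - u) / d, v[1] / (k * d), v[2] / (k * d))


def force_seen_by_o(f_p, v, u):
    """Force transformation (5.2): bracketed term plus K v x (1_x u/c^2 x F')."""
    k = lorentz_k(u)
    extra = cross(v, cross((u / C**2, 0.0, 0.0), f_p))
    return (f_p[0] + k * extra[0], k * f_p[1] + k * extra[1], k * f_p[2] + k * extra[2])
```

For any sample fields and any speeds below c, applying `fields_seen_by_o` and then `fields_seen_by_o_prime` should return the starting fields, and `lorentz_force` evaluated with the XYZ fields should agree with `force_seen_by_o` applied to the force computed in X'Y'Z'.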

5.3 THE TRANSFORMATION EQUATIONS FOR THE SOURCE DENSITIES

It will be desirable to relate the fields E(x,y,z,t) and B(x,y,z,t) to the time-independent
sources ρ'(x',y',z') and ι'(x',y',z') created by O'. However, it is convenient first to transform
these sources into their XYZ equivalents. This can be done by considering an
arbitrary volume element dV_0, in which an amount of charge ρ_0 dV_0 is at rest. This
volume element is assumed to be moving through X'Y'Z' at velocity w', and through
XYZ at velocity w. To O', this volume element has a size $dV' = dV_0[1 - (w')^2/c^2]^{1/2}$
and to O it has a size $dV = dV_0(1 - w^2/c^2)^{1/2}$. Since charge is an invariant,

$$\rho'\,dV' = \rho_0\,dV_0 = \rho\,dV$$

from which

$$\frac{\rho}{\rho'} = \frac{dV'}{dV} = \left[\frac{1 - (w')^2/c^2}{1 - w^2/c^2}\right]^{1/2} \tag{5.12}$$

The velocity transformation equations (2.50) give

$$w^2 = \left(1 + \frac{u w'_x}{c^2}\right)^{-2}\left\{\left[(w')^2 - (w'_x)^2\right]\left(1 - \frac{u^2}{c^2}\right) + (w'_x + u)^2\right\}$$

which may be used to convert (5.12) to the form

$$\rho(x,y,z,t) = K\left(1 + \frac{u w'_x}{c^2}\right)\rho'(x',y',z') \tag{5.13}$$

Since ι(x,y,z,t) = ρ(x,y,z,t)w(x,y,z), another use of the velocity transformation (2.50)
yields

$$\iota_x = K(\iota'_x + u\rho') \qquad \iota_y = \iota'_y \qquad \iota_z = \iota'_z \tag{5.14}$$

Equations (5.13) and (5.14) are called the source transformation equations and will be
of assistance in the determination of the dependence of E and B on the sources.
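
As an illustrative check of the source transformation, the sketch below compares the density ratio obtained from the contracted volumes in (5.12) with the closed form (5.13), and compares the current density ρw with the component form ι_x = K(ι'_x + uρ'), ι_y = ι'_y, ι_z = ι'_z. It works in units where c = 1 and assumes the standard form of the velocity transformation (2.50); the function names are hypothetical.

```python
import math

C = 1.0  # units in which c = 1 (an assumption made for this sketch)


def lorentz_k(u):
    """K = (1 - u^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (u / C) ** 2)


def velocity_in_xyz(w_prime, u):
    """Assumed form of (2.50): velocity in XYZ from velocity in X'Y'Z'."""
    k = lorentz_k(u)
    d = 1.0 + u * w_prime[0] / C**2
    return ((w_prime[0] + u) / d, w_prime[1] / (k * d), w_prime[2] / (k * d))


def density_ratio_from_volumes(w_prime, u):
    """rho / rho' computed from the two contracted volumes, as in (5.12)."""
    w = velocity_in_xyz(w_prime, u)
    w_sq = sum(comp * comp for comp in w)
    wp_sq = sum(comp * comp for comp in w_prime)
    return math.sqrt((1.0 - wp_sq / C**2) / (1.0 - w_sq / C**2))


def sources_in_xyz(rho_prime, iota_prime, u):
    """Charge and current densities in XYZ, per (5.13) and (5.14).

    With iota' = rho' w', the expression K(rho' + u iota'_x / c^2) is the
    same as K(1 + u w'_x / c^2) rho'.
    """
    k = lorentz_k(u)
    rho = k * (rho_prime + u * iota_prime[0] / C**2)
    iota = (k * (iota_prime[0] + u * rho_prime), iota_prime[1], iota_prime[2])
    return rho, iota
```

Choosing any w' with magnitude below c and any u below c, the ratio returned by `density_ratio_from_volumes` should match `rho / rho_prime` from `sources_in_xyz`, and each component of the returned current density should equal ρ times the corresponding component of `velocity_in_xyz`.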

5.4 MAXWELL'S EQUATIONS

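The relations between the static fields E'(x',y',z'), B'(x',y',z') and the time-independent
sources ρ'(x',y',z'), ι'(x',y',z') are already known, being given by

$$\begin{aligned}
\nabla' \times \mathbf{E}' &= 0 & \nabla' \cdot \mathbf{E}' &= \rho'/\epsilon_0 \\
\nabla' \times \mathbf{B}' &= \boldsymbol{\iota}'/\mu_0^{-1} & \nabla' \cdot \mathbf{B}' &= 0
\end{aligned} \tag{5.15}$$

If all the quantities in these four equations are converted to their XYZ equivalents,
with the assistance of the transformations developed in Sections 5.2 and 5.3, the result
will be a set of equations in which the dependence of E and B on the sources is displayed.
To see how this is accomplished, consider any function f of the four coordinate
variables. Upon making use of the Lorentz Equations (2.40), one can establish that

$$\begin{aligned}
\frac{\partial f}{\partial x'} &= \frac{\partial f}{\partial x}\frac{dx}{dx'} + \frac{\partial f}{\partial t}\frac{dt}{dx'} = K\frac{\partial f}{\partial x} + \frac{Ku}{c^2}\frac{\partial f}{\partial t} \\
\frac{\partial f}{\partial t'} &= \frac{\partial f}{\partial t}\frac{dt}{dt'} + \frac{\partial f}{\partial x}\frac{dx}{dt'} = K\frac{\partial f}{\partial t} + Ku\frac{\partial f}{\partial x} \\
\frac{\partial f}{\partial y'} &= \frac{\partial f}{\partial y} \qquad\qquad \frac{\partial f}{\partial z'} = \frac{\partial f}{\partial z}
\end{aligned} \tag{5.16}$$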
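Application of these formulas to the curl of E' gives terms such as

$$\frac{\partial E'_x}{\partial z'} - \frac{\partial E'_z}{\partial x'} = \frac{\partial E'_x}{\partial z} - K\frac{\partial E'_z}{\partial x} - \frac{Ku}{c^2}\frac{\partial E'_z}{\partial t}$$

which, with the use of (5.10), can be written

$$\frac{\partial E'_x}{\partial z'} - \frac{\partial E'_z}{\partial x'} = \frac{\partial E_x}{\partial z} - K^2\left(\frac{\partial E_z}{\partial x} + u\frac{\partial B_y}{\partial x}\right) - \frac{K^2 u}{c^2}\left(\frac{\partial E_z}{\partial t} + u\frac{\partial B_y}{\partial t}\right)$$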
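Upon determining all three components in this manner, one may write

$$\begin{aligned}
\nabla' \times \mathbf{E}' = 0 ={}& \mathbf{1}_x\left[K\left(\frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z}\right) + Ku\left(\nabla\cdot\mathbf{B} - \frac{\partial B_x}{\partial x}\right)\right] \\
&+ \mathbf{1}_y\left[\left(\frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x}\right) + (1 - K^2)\frac{\partial E_z}{\partial x} - K^2 u\frac{\partial B_y}{\partial x} - \frac{K^2 u}{c^2}\frac{\partial E_z}{\partial t} - \frac{K^2 u^2}{c^2}\frac{\partial B_y}{\partial t}\right] \\
&+ \mathbf{1}_z\left[\left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right) - (1 - K^2)\frac{\partial E_y}{\partial x} - K^2 u\frac{\partial B_z}{\partial x} + \frac{K^2 u}{c^2}\frac{\partial E_y}{\partial t} - \frac{K^2 u^2}{c^2}\frac{\partial B_z}{\partial t}\right]
\end{aligned} \tag{5.17}$$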
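This result can be simplified considerably. If f is any function of x', y', z' but not of t',
it follows from the second of Equations (5.16) that

$$\frac{\partial f}{\partial t} = -u\frac{\partial f}{\partial x}$$

When this relation is used in (5.17) one obtains

$$0 = \mathbf{1}_x K\left[(\nabla\times\mathbf{E})_x + \frac{\partial B_x}{\partial t} + u\,\nabla\cdot\mathbf{B}\right] + \mathbf{1}_y\left[(\nabla\times\mathbf{E})_y + \frac{\partial B_y}{\partial t}\right] + \mathbf{1}_z\left[(\nabla\times\mathbf{E})_z + \frac{\partial B_z}{\partial t}\right] \tag{5.18}$$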

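Further simplification is possible through determination of ∇·B. Since

$$\frac{\partial B_x}{\partial x} + \frac{\partial B_y}{\partial y} + \frac{\partial B_z}{\partial z} = K\frac{\partial B_x}{\partial x'} - \frac{Ku}{c^2}\frac{\partial B_x}{\partial t'} + \frac{\partial B_y}{\partial y'} + \frac{\partial B_z}{\partial z'}$$

use of (5.9) and recognition of the fact that partial derivatives with respect to t' are
all zero leads to

$$\nabla\cdot\mathbf{B} = K\,\nabla'\cdot\mathbf{B}' - \frac{Ku}{c^2}\left(\frac{\partial E'_z}{\partial y'} - \frac{\partial E'_y}{\partial z'}\right) \tag{5.19}$$

The right side of (5.19) is zero by virtue of (5.15) and thus

$$\nabla\cdot\mathbf{B} = 0 \tag{5.20}$$

which means that (5.18) can be written

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \tag{5.21}$$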
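When this procedure is repeated for the curl of B' one obtains

$$\begin{aligned}
\nabla'\times\mathbf{B}' = \frac{\boldsymbol{\iota}'}{\mu_0^{-1}} ={}& \mathbf{1}_x\left\{K\left[\left(\frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z}\right) - \frac{1}{c^2}\frac{\partial E_x}{\partial t}\right] - \frac{Ku}{c^2}\nabla\cdot\mathbf{E}\right\} \\
&+ \mathbf{1}_y\left[\left(\frac{\partial B_x}{\partial z} - \frac{\partial B_z}{\partial x}\right) - \frac{1}{c^2}\frac{\partial E_y}{\partial t}\right] + \mathbf{1}_z\left[\left(\frac{\partial B_y}{\partial x} - \frac{\partial B_x}{\partial y}\right) - \frac{1}{c^2}\frac{\partial E_z}{\partial t}\right]
\end{aligned} \tag{5.22}$$

Once again reduction is possible since

$$\nabla\cdot\mathbf{E} = K\,\nabla'\cdot\mathbf{E}' + Ku\left(\frac{\partial B'_z}{\partial y'} - \frac{\partial B'_y}{\partial z'}\right) = K\left(\frac{\rho'}{\epsilon_0} + u\frac{\iota'_x}{\mu_0^{-1}}\right) = \frac{K\rho'}{\epsilon_0}\left(1 + u w'_x\frac{\epsilon_0}{\mu_0^{-1}}\right) = \frac{K\rho'}{\epsilon_0}\left(1 + \frac{u w'_x}{c^2}\right)$$

Use of (5.13) yields

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0} \tag{5.23}$$

and then use of (5.14) means that (5.22) can be written

$$\nabla\times\mathbf{B} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}} + \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t} \tag{5.24}$$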
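
The result can be spot-checked for one simple source. The sketch below is an illustration under stated assumptions, not the text's own procedure: it places a single point charge at rest at the origin of X'Y'Z', forms E and B in XYZ from (5.5) and (5.9) with B' = 0, and evaluates the source-free forms of (5.20), (5.21), (5.23) and (5.24) by central differences at a point away from the charge. Units with c = 1 and q/(4πε₀) = 1 are assumed, x' = K(x - ut) is taken as the relevant Lorentz equation, and all names are hypothetical.

```python
import math

C = 1.0   # units in which c = 1 (assumption)
U = 0.6   # relative speed u of the two frames (sample value)
K = 1.0 / math.sqrt(1.0 - U**2 / C**2)


def fields(x, y, z, t):
    """Return (E, B) in XYZ for a unit point source at rest at the X'Y'Z' origin."""
    x_p = K * (x - U * t)                      # assumed Lorentz equation for x'
    r_cubed = (x_p * x_p + y * y + z * z) ** 1.5
    e_p = (x_p / r_cubed, y / r_cubed, z / r_cubed)   # Coulomb field, q/(4 pi eps0) = 1
    e = (e_p[0], K * e_p[1], K * e_p[2])              # (5.5) with B' = 0
    b = (0.0, -K * U * e_p[2] / C**2, K * U * e_p[1] / C**2)  # (5.9) with B' = 0
    return e, b


def partial(which, component, axis, point, h=1e-4):
    """Central difference of one field component.

    which: 0 for E, 1 for B; component: 0..2; axis: 0..3 meaning x, y, z, t.
    """
    hi = list(point)
    lo = list(point)
    hi[axis] += h
    lo[axis] -= h
    return (fields(*hi)[which][component] - fields(*lo)[which][component]) / (2.0 * h)


def curl(which, point):
    return (partial(which, 2, 1, point) - partial(which, 1, 2, point),
            partial(which, 0, 2, point) - partial(which, 2, 0, point),
            partial(which, 1, 0, point) - partial(which, 0, 1, point))


def divergence(which, point):
    return sum(partial(which, i, i, point) for i in range(3))


def time_derivative(which, point):
    return tuple(partial(which, i, 3, point) for i in range(3))


def residuals(point):
    """Largest violations of the four equations at a source-free point (x, y, z, t)."""
    curl_e, curl_b = curl(0, point), curl(1, point)
    de_dt, db_dt = time_derivative(0, point), time_derivative(1, point)
    faraday = max(abs(a + b) for a, b in zip(curl_e, db_dt))
    ampere = max(abs(a - b / C**2) for a, b in zip(curl_b, de_dt))
    return faraday, ampere, abs(divergence(0, point)), abs(divergence(1, point))
```

All four residuals should be small compared with the field magnitudes, limited only by the finite-difference step.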

All these results are relativistically exact. When collected together, they are known
as Maxwell's equations and can be written in the form

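$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t} \tag{5.25a}$$

$$\nabla\times\mathbf{H}_0 = \boldsymbol{\iota} + \frac{\partial\mathbf{D}_0}{\partial t} \tag{5.25b}$$

$$\nabla\cdot\mathbf{D}_0 = \rho \tag{5.25c}$$

$$\nabla\cdot\mathbf{B} = 0 \tag{5.25d}$$

in which

$$\mathbf{D}_0 = \epsilon_0\mathbf{E} \tag{5.26}$$

$$\mathbf{H}_0 = \mu_0^{-1}\mathbf{B} \tag{5.27}$$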

The sources ρ, ι and the fields due to them, E, B, D₀, H₀, are all time-varying. However,
they are due to a restricted class of sources, namely those consisting of static charges
and steady currents in X'Y'Z'. But this restriction can be lifted by a simple argument.
If a second set of steady sources exists in another coordinate system X"Y"Z", they
will give rise to additional time-varying fields in XYZ which will also satisfy (5.25).
By superposition, the sum of the sources in X'Y'Z' and in X"Y"Z" will give rise to
fields in XYZ which satisfy (5.25). If this is generalized to include steady sources in all
Lorentzian frames, including those traveling at any speed in any direction relative to
XYZ, the sum of such sources can result in completely general distributions ρ(x,y,z,t)
and ι(x,y,z,t). This fact is demonstrated in Appendix E. Thus Equations (5.25) have
the widest validity and can form the basis for the study of all types of electromagnetic
fields. Of course, observer O need not rely on the steady sources of O', O", etc., to
establish his time-varying electromagnetic fields, but can do this equally well himself
by direct creation of the time-varying sources ρ and ι.
Integral forms of Maxwell's equations follow readily from (5.25) with the aid of
Stokes' theorem and the divergence theorem. They are:

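$$\oint_C \mathbf{E}\cdot d\mathbf{l} = -\int_S \frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S} \tag{5.28a}$$

$$\oint_C \mathbf{H}_0\cdot d\mathbf{l} = \int_S \left(\boldsymbol{\iota} + \frac{\partial\mathbf{D}_0}{\partial t}\right)\cdot d\mathbf{S} \tag{5.28b}$$

$$\oint_S \mathbf{D}_0\cdot d\mathbf{S} = \int_V \rho\,dV \tag{5.28c}$$

$$\oint_S \mathbf{B}\cdot d\mathbf{S} = 0 \tag{5.28d}$$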
The first of these equations is often called Faraday's emf law and states that the line
integral of longitudinal E around any closed path is equal to minus the time rate of
change of magnetic flux enclosed. The second equation is a generalization of Ampere's
circuital law and casts ∂D₀/∂t in the same role as ι. This point was first appreciated by
Maxwell, who gave to D₀ the name displacement. For this reason, ∂D₀/∂t is called the
displacement current. The third equation is a generalization of the integral form of
Poisson's equation, and the fourth integral states that at all times the total magnetic
flux piercing any closed surface is zero.
If the divergence of Equation (5.25a) is formed, the left side is zero because of the
vector identity (V.111); this is matched by the right side, which is also zero by virtue
of (5.25d). Similarly, if the divergence of (5.25b) is taken, one obtains

$$\nabla\cdot\nabla\times\mathbf{H}_0 = \nabla\cdot\left(\boldsymbol{\iota} + \frac{\partial\mathbf{D}_0}{\partial t}\right) = 0$$

indicating that the total current is continuous. Thus

$$\nabla\cdot\boldsymbol{\iota} = -\frac{\partial}{\partial t}(\nabla\cdot\mathbf{D}_0)$$

Use of (5.25c) converts this to

$$\nabla\cdot\boldsymbol{\iota} = -\frac{\partial\rho}{\partial t} \tag{5.29}$$

which is known as the continuity equation. In words, ∇·ι dV is the net efflux of current
from a volume element dV, and -(∂ρ/∂t) dV is the time rate of decrease of charge within
dV. It is quite natural that these two quantities should be equal.
The continuity equation, which links charge and current, in no way denies the existence
of charge without current, since it involves only ∂ρ/∂t. A static charge distribution
ρ(x,y,z) satisfies (5.29) with no current flow.
EXAMPLE 5.1
Consider two rectangular conducting blocks, as shown in the figure, separated by a small
distance l so as to form a parallel plate capacitor. Assume a uniform electron flow in the
direction indicated, so that $\boldsymbol{\iota} = \mathbf{1}_z\iota_z$ is upward in both blocks. If A is the cross-sectional
area of each block, charge is accumulating on the adjacent faces at a rate $\iota_z A$ coul/sec.

[Figure: two conducting blocks separated by a gap l, showing the z axis, the current density ι, and the direction of electron flow.]

Therefore the total flux between faces, neglecting fringing, is increasing at the rate of
$\iota_z A$ lines/sec, or

$$\frac{\partial\mathbf{D}_0}{\partial t} = \boldsymbol{\iota}$$

Thus the conduction current in the blocks is exactly replaced by a displacement current
in the interspace and the total current is continuous, in agreement with ∇·(ι + ∂D₀/∂t) = 0,
as derived above.

If this entire development, beginning with Section 5.2, had been undertaken by
starting with steady sources in X'Y'Z', and asking what fields would be determined
by an observer O*, in a coordinate system X*Y*Z* which was moving at a speed u*
relative to X'Y'Z', all the same results would once again be obtained. Time-varying
fields E*, B*, due to time-varying source distributions ρ*, ι* would be found to satisfy
Maxwell's equations. The question could then be raised as to the relations between the
fields E, B observed by O and the fields E*, B* observed by O*. It is easy to show that
these two sets of fields are related by the previously obtained transformations (5.5)
and (5.9). A proof can be found in Appendix F.

5.5 INTEGRAL SOLUTIONS OF MAXWELL'S EQUATIONS IN TERMS OF THE SOURCES

Since Maxwell's equations are linear in free space, no loss in generality results from
assuming that time variations are harmonic and represented by $e^{j\omega t}$. The angular
frequency ω may be a component of a Fourier series or a Fourier integral, thus bringing
arbitrary time dependence within the purview of the following analysis. Accordingly,
if f(x,y,z,t) is any field component or source component, it will be assumed that
$f(x,y,z,t) = f(x,y,z)e^{j\omega t}$. In this case, Maxwell's equations can be written

$$\nabla\times\mathbf{E} = -j\omega\mathbf{B} \tag{5.30}$$

$$\nabla\times\mathbf{H}_0 = \boldsymbol{\iota} + j\omega\mathbf{D}_0 \tag{5.31}$$

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0} \tag{5.32}$$

$$\nabla\cdot\mathbf{B} = 0 \tag{5.33}$$

and the continuity equation becomes

$$\nabla\cdot\boldsymbol{\iota} = -j\omega\rho \tag{5.34}$$

In all the above equations, E = E(x,y,z), etc., the time-dependence being suppressed.
E, B, etc., are now complex vectors. (See Mathematical Supplement, Section V.23.)
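
The replacement of time differentiation by multiplication with jω can be illustrated with a few lines of code. This is a minimal sketch that assumes the usual convention (not spelled out in this passage) that the physical quantity is the real part of the complex amplitude times $e^{j\omega t}$; the function names are hypothetical.

```python
import cmath


def instantaneous(phasor, omega, t):
    """Real time-domain value, assuming f(t) = Re{F exp(j omega t)}."""
    return (phasor * cmath.exp(1j * omega * t)).real


def derivative_from_phasor(phasor, omega, t):
    """Time derivative obtained by multiplying the phasor by j omega."""
    return instantaneous(1j * omega * phasor, omega, t)


def derivative_by_differences(phasor, omega, t, h=1e-6):
    """Central-difference time derivative, used only as an independent check."""
    return (instantaneous(phasor, omega, t + h)
            - instantaneous(phasor, omega, t - h)) / (2.0 * h)
```

For any complex amplitude and angular frequency, the two derivative functions should agree to within the finite-difference error, which is the property that turns (5.21) into (5.30).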
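Additionally, if the curl of (5.21) and of (5.24) is taken, and if then (5.21) and (5.24)
are used to eliminate either E or B, one obtains the vector wave equations

$$\nabla\times\nabla\times\mathbf{E} + \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = -\frac{1}{\mu_0^{-1}}\frac{\partial\boldsymbol{\iota}}{\partial t} \tag{5.35a}$$

$$\nabla\times\nabla\times\mathbf{B} + \frac{1}{c^2}\frac{\partial^2\mathbf{B}}{\partial t^2} = \nabla\times\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{5.35b}$$

For an $e^{j\omega t}$ time-dependence, these become

$$\nabla\times\nabla\times\mathbf{E} - k^2\mathbf{E} = -j\omega\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{5.36a}$$

$$\nabla\times\nabla\times\mathbf{B} - k^2\mathbf{B} = \nabla\times\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{5.36b}$$

in which $k = \omega\sqrt{\mu_0\epsilon_0}$ is called the propagation constant, for reasons which will emerge
shortly. These last two equations can be integrated through use of a technique first
introduced by Stratton and Chu, and based on a vector formulation of Green's second
identity.¹¹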

FIGURE 5.2 Geometry for the vector Green's theorem.

Consider a region V, bounded by the surfaces S_1 ... S_N as shown in Figure 5.2.
Let F and G be two vector functions of position in this region, each continuous and
having continuous first and second derivatives everywhere within V and on the
boundary surfaces S_i. Using the vector identity

$$\nabla\cdot[\mathbf{A}\times\mathbf{B}] = \mathbf{B}\cdot\nabla\times\mathbf{A} - \mathbf{A}\cdot\nabla\times\mathbf{B}$$

and letting A = F while B = ∇ × G, one obtains

$$\nabla\cdot[\mathbf{F}\times\nabla\times\mathbf{G}] = \nabla\times\mathbf{G}\cdot\nabla\times\mathbf{F} - \mathbf{F}\cdot\nabla\times\nabla\times\mathbf{G}$$

whereas, if A = G and B = ∇ × F, there results

$$\nabla\cdot[\mathbf{G}\times\nabla\times\mathbf{F}] = \nabla\times\mathbf{F}\cdot\nabla\times\mathbf{G} - \mathbf{G}\cdot\nabla\times\nabla\times\mathbf{F}$$

If the difference in these results is integrated over the volume V one obtains

$$\int_V (\mathbf{F}\cdot\nabla\times\nabla\times\mathbf{G} - \mathbf{G}\cdot\nabla\times\nabla\times\mathbf{F})\,dV = \int_V \nabla\cdot[\mathbf{G}\times\nabla\times\mathbf{F} - \mathbf{F}\times\nabla\times\mathbf{G}]\,dV$$

If one lets $\mathbf{1}_n$ be the inward-drawn normal from any boundary surface S_i into the volume
V, use of the divergence theorem gives

$$\int_V (\mathbf{F}\cdot\nabla\times\nabla\times\mathbf{G} - \mathbf{G}\cdot\nabla\times\nabla\times\mathbf{F})\,dV = -\int_{S_1\ldots S_N} (\mathbf{G}\times\nabla\times\mathbf{F} - \mathbf{F}\times\nabla\times\mathbf{G})\cdot\mathbf{1}_n\,dS \tag{5.37}$$

The minus sign arises because $\mathbf{1}_n$ points into V rather than out of it. This result is the
vector Green's theorem.

11 J. A. Stratton and L. J. Chu, "Diffraction Theory of Electromagnetic Waves," Phys Rev, 56, 99-107;
July 1939. Also, see the excellent treatment in S. Silver, Microwave Antenna Theory and Design,
MIT Rad. Lab. Series, vol. 12, pp. 80-89, McGraw-Hill Book Company, New York, 1949. The present
development differs from Silver's principally in the nonuse of fictitious magnetic currents and charges.

Suppose that the E and B of (5.36a) and (5.36b) meet the conditions required of the
function F in V and let G be the vector Green's function defined by

$$\mathbf{G} = \frac{e^{-jkr}}{r}\,\mathbf{a} = \psi\mathbf{a} \tag{5.38}$$

in which a is an arbitrary constant vector and r is the distance from an arbitrary point
P(x,y,z) within V to any point (ξ,η,ζ) within V or on S_i.
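
A property of the scalar factor ψ that underlies its use here is that, away from P, it satisfies the scalar Helmholtz equation ∇²ψ + k²ψ = 0. The sketch below checks this numerically with a second-order finite-difference Laplacian. It is only an illustration: P is placed at the origin, the value of k is an arbitrary sample, and the function names are hypothetical.

```python
import cmath
import math

K_WAVE = 2.0 * math.pi  # sample propagation constant k (hypothetical value)


def psi(x, y, z, k=K_WAVE):
    """psi = exp(-j k r) / r, with r measured from a point P taken as the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    return cmath.exp(-1j * k * r) / r


def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference Laplacian of a scalar function f(x, y, z)."""
    total = (f(x + h, y, z) + f(x - h, y, z)
             + f(x, y + h, z) + f(x, y - h, z)
             + f(x, y, z + h) + f(x, y, z - h)
             - 6.0 * f(x, y, z))
    return total / (h * h)


def helmholtz_residual(x, y, z, k=K_WAVE):
    """Value of (del^2 + k^2) psi at a point away from P; ideally zero."""
    return laplacian(lambda a, b, c: psi(a, b, c, k), x, y, z) + k * k * psi(x, y, z, k)
```

At any point not too close to P, the magnitude of `helmholtz_residual` should be small compared with k²|ψ|. The singular behavior at r = 0 is the reason the point P must be excluded from the region of integration.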
G as defined by (5.38) satisfies the conditions of the vector Green's theorem everywhere
except at P. Therefore, one can surround P by a sphere Σ of radius δ and consider
that portion V' of V which is bounded by the surfaces S_1 ... S_N, Σ. Letting
E = F, one obtains

$$\int_{V'} (\mathbf{E}\cdot\nabla_S\times\nabla_S\times\psi\mathbf{a} - \psi\mathbf{a}\cdot\nabla_S\times\nabla_S\times\mathbf{E})\,dV = -\int_{S_1\ldots S_N,\Sigma} (\psi\mathbf{a}\times\nabla_S\times\mathbf{E} - \mathbf{E}\times\nabla_S\times\psi\mathbf{a})\cdot\mathbf{1}_n\,dS \tag{5.39}$$

in which, since ψ is a function of (x,y,z) as well as (ξ,η,ζ), it is necessary to distinguish
between differentiation with respect to these two sets of variables by subscripting the
operators so that

$$\nabla_S = \mathbf{1}_x\frac{\partial}{\partial\xi} + \mathbf{1}_y\frac{\partial}{\partial\eta} + \mathbf{1}_z\frac{\partial}{\partial\zeta}$$

and

$$\nabla_P = \mathbf{1}_x\frac{\partial}{\partial x} + \mathbf{1}_y\frac{\partial}{\partial y} + \mathbf{1}_z\frac{\partial}{\partial z}$$

It is shown in Appendix G that both sides of this equation may be transformed so
that a is brought outside the integral signs, the result being

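$$\mathbf{a}\cdot\int_{V'}\left(j\omega\psi\frac{\boldsymbol{\iota}}{\mu_0^{-1}} - \frac{\rho}{\epsilon_0}\nabla_S\psi\right)dV - \mathbf{a}\cdot\int_{S_1\ldots S_N,\Sigma}(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi\,dS = -\mathbf{a}\cdot\int_{S_1\ldots S_N,\Sigma}\left[j\omega\psi(\mathbf{1}_n\times\mathbf{B}) - (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi\right]dS \tag{5.40}$$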

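Since a is arbitrary, it follows that the integrals on the two sides of (5.40) may be
equated, yielding

$$\int_{V'}\left(j\omega\psi\frac{\boldsymbol{\iota}}{\mu_0^{-1}} - \frac{\rho}{\epsilon_0}\nabla_S\psi\right)dV - \int_{S_1\ldots S_N}\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right]dS$$
$$= \int_{\Sigma}\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right]dS \tag{5.41}$$

where, for convenience, the surface integral over the sphere Σ is displayed separately.
It is further shown in Appendix G that the right side of (5.41) reaches the limit
-4πE(x,y,z), with (x,y,z) the coordinates of the point P, as Σ shrinks to zero. Therefore
the limiting value of (5.41) is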

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V\left(\frac{\rho}{\epsilon_0}\nabla_S\psi - j\omega\psi\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)dV + \frac{1}{4\pi}\int_{S_1\ldots S_N}\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right]dS \tag{5.42}$$

This important formula gives E at any point in the volume V in terms of the sources
within V plus the field values on the surfaces which bound V.
One may proceed in a similar fashion, by letting B = F, and deduce a companion
formula for B(x,y,z). Alternatively, the curl of (5.42) may be taken and then (5.30)
employed to obtain B. By either procedure, one finds that

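$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_{V} \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\nabla_S\psi\, dV
+ \frac{1}{4\pi}\int_{S_1 \ldots S_N} \left[\frac{j\omega\psi}{c^2}(\mathbf{1}_n\times\mathbf{E}) + (\mathbf{1}_n\times\mathbf{B})\times\nabla_S\psi + (\mathbf{1}_n\cdot\mathbf{B})\nabla_S\psi\right] dS \tag{5.43}$$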

Inspection of the volume integrals in (5.42) and (5.43) reveals that B is given in
terms of the current sources only, whereas the expression for E contains terms involving
both the currents and the charges. However, the continuity equation (5.34) may be
used to give

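$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_{V} \frac{1}{j\omega\epsilon_0}\left[(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi + k^2\psi\,\boldsymbol{\iota}\right] dV
+ \frac{1}{4\pi}\int_{S_1 \ldots S_N} \left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right] dS \tag{5.44}$$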

Equations (5.43) and (5.44) constitute a solution of Maxwell's field equations in terms
of the current sources within $V$ and the field values over the bounding surfaces $S_i$.

5.6 CONDITIONS AT INFINITY

Let it now be assumed that the surface $S_N$ of Figure 5.2 becomes a large sphere of
radius $\mathcal{R}$ centered at the point $P$. $\mathcal{R}$ initially will be taken great enough to enclose all
the sources $\boldsymbol{\iota}$ and $\rho$ of the fields; ultimately $\mathcal{R}$ will be permitted to become infinitely
large. Under these circumstances, consider the contributions to (5.43) and (5.44) of the
surface integrals over $S_N$.
If $\mathbf{1}_{\mathcal{R}}$ is a unit vector directed outward along the radius of the spherical surface $S_N$,
so that $\mathbf{1}_{\mathcal{R}} = -\mathbf{1}_n$, one may write for the appropriate part of (5.43)

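$$\begin{aligned}
\frac{1}{4\pi}\int_{S_N} &\left[\frac{j\omega\psi}{c^2}(\mathbf{1}_n\times\mathbf{E}) + (\mathbf{1}_n\times\mathbf{B})\times\nabla_S\psi + (\mathbf{1}_n\cdot\mathbf{B})\nabla_S\psi\right] dS \\
&= \frac{1}{4\pi}\int_{S_N} \left[-\frac{j\omega}{c^2}(\mathbf{1}_{\mathcal{R}}\times\mathbf{E}) + (\mathbf{1}_{\mathcal{R}}\times\mathbf{B})\times\mathbf{1}_{\mathcal{R}}\left(jk + \frac{1}{\mathcal{R}}\right) + (\mathbf{1}_{\mathcal{R}}\cdot\mathbf{B})\,\mathbf{1}_{\mathcal{R}}\left(jk + \frac{1}{\mathcal{R}}\right)\right] \frac{e^{-jk\mathcal{R}}}{\mathcal{R}}\, dS \\
&= \frac{1}{4\pi}\int_{S_N} \left\{-\frac{j\omega}{c^2}(\mathbf{1}_{\mathcal{R}}\times\mathbf{E}) - \left(jk + \frac{1}{\mathcal{R}}\right)\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{1}_{\mathcal{R}}\times\mathbf{B}) - (\mathbf{1}_{\mathcal{R}}\cdot\mathbf{B})\,\mathbf{1}_{\mathcal{R}}\right]\right\} \frac{e^{-jk\mathcal{R}}}{\mathcal{R}}\, dS \\
&= \frac{1}{4\pi}\int_{S_N} \left\{-\frac{j\omega}{c^2}\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{E}) - c\mathbf{B}\right] + \frac{\mathbf{B}}{\mathcal{R}}\right\} \frac{e^{-jk\mathcal{R}}}{\mathcal{R}}\, dS
\end{aligned} \tag{5.45}$$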

Similarly, the appropriate part of (5.44) becomes

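$$\frac{1}{4\pi}\int_{S_N} \left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})\right] dS
= \frac{1}{4\pi}\int_{S_N} \left\{j\omega\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{B}) + \frac{\mathbf{E}}{c}\right] + \frac{\mathbf{E}}{\mathcal{R}}\right\} \frac{e^{-jk\mathcal{R}}}{\mathcal{R}}\, dS \tag{5.46}$$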

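If $\mathcal{R} \to \infty$, since the surface of the sphere increases as $\mathcal{R}^2$, the surface integral (5.45)
will vanish if

$$\lim_{\mathcal{R}\to\infty} \mathcal{R}\,\mathbf{B} \ \text{is finite} \tag{5.47}$$

$$\lim_{\mathcal{R}\to\infty} \mathcal{R}\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{E}) - c\mathbf{B}\right] = 0 \tag{5.48}$$

Similarly, the surface integral (5.46) will vanish if

$$\lim_{\mathcal{R}\to\infty} \mathcal{R}\,\mathbf{E} \ \text{is finite} \tag{5.49}$$

$$\lim_{\mathcal{R}\to\infty} \mathcal{R}\left[(\mathbf{1}_{\mathcal{R}}\times\mathbf{B}) + \frac{\mathbf{E}}{c}\right] = 0 \tag{5.50}$$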

The relations (5.47) through (5.50) are known as the Sommerfeld conditions at infinity.
Expressions (5.47) and (5.49) are commonly called the finiteness conditions (Endlichkeit
Bedingungen) and expressions (5.48) and (5.50) are customarily given the name
of radiation conditions (Ausstrahlung Bedingungen). The finiteness conditions require
that $\mathbf{E}$ and $\mathbf{B}$ diminish as $\mathcal{R}^{-1}$ while the radiation conditions require that they bear the
relation to each other found in wave propagation in regions remote from the sources.
(See Section 5.7.)
It is now possible to demonstrate the extremely important result that real sources,
confined to a finite volume, always give rise to fields which satisfy the Sommerfeld
conditions. To see this, consider Equations (5.43) and (5.44) when the only boundary
surface is the large sphere $S_N$ whose radius will be permitted to become infinitely
large. It shall be assumed that the real sources $\boldsymbol{\iota}$ and $\rho$ are finite and confined to a finite
volume $V_0$. With the surface $S_N$ becoming an infinite sphere, the volume $V$ in (5.43)
and (5.44) also becomes infinite, but no convergence difficulties arise with the volume
integrals because the sources are all within $V_0$.
Borrowing from the results of Section 5.9, the fields over $S_N$ will consist of outgoing
waves whose power density is $\mathbf{E}\times\mathbf{H}_0$ watts/m$^2$. Since the surface area of $S_N$ is increasing
as $\mathcal{R}^2$, if there is even the minutest loss in $V$, the law of conservation of energy
requires that $\mathbf{E}$ and $\mathbf{H}_0$ diminish more rapidly than $\mathcal{R}^{-1}$, and thus conditions (5.47)-(5.50)
are satisfied. One can then conclude that in an unbounded region, $\mathbf{B}(x,y,z)$ and
$\mathbf{E}(x,y,z)$ are given solely by the volume integrals which appear in (5.43) and (5.44).
A check on this conclusion for the limiting case of no loss in V may be obtained
through an ordering of the terms which comprise the volume integrals. To see this,
select as origin an arbitrary point in $V_0$ and let $\mathbf{r}$ be the vector drawn from the origin
to the field point $P(x,y,z)$; the vector drawn from the source element to $P$ will be
labeled $\boldsymbol{\ell}$. Then

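$$\nabla_S\psi = \left(jk + \frac{1}{\ell}\right)\frac{e^{-jk\ell}}{\ell}\,\mathbf{1}_\ell, \qquad
(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi = (\boldsymbol{\iota}\cdot\nabla_S)\left[\left(jk + \frac{1}{\ell}\right)\frac{e^{-jk\ell}}{\ell}\,\mathbf{1}_\ell\right]$$

in which spherical coordinates $(\ell,\theta',\phi')$ centered at $P$ have been employed and

$$\mathbf{1}_\ell = \frac{\boldsymbol{\ell}}{\ell}$$

Performing the indicated differentiations, one obtains

$$(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi = \left[-k^2(\boldsymbol{\iota}\cdot\mathbf{1}_\ell)\,\mathbf{1}_\ell + \left(\frac{jk}{\ell} + \frac{1}{\ell^2}\right)\left(3(\boldsymbol{\iota}\cdot\mathbf{1}_\ell)\,\mathbf{1}_\ell - \boldsymbol{\iota}\right)\right]\frac{e^{-jk\ell}}{\ell} \tag{5.51}$$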

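The functions $\psi$, $\nabla_S\psi$, and $(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi$ are all seen to involve polynomials in the variable
$\ell^{-1}$. Retain for the moment only first-order terms; then substitution in (5.43) and
(5.44) gives

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_{V} jk\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_\ell\,\frac{e^{-jk\ell}}{\ell}\, dV \tag{5.52}$$

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_{V} \frac{1}{j\omega\epsilon_0}\left[-k^2(\boldsymbol{\iota}\cdot\mathbf{1}_\ell)\,\mathbf{1}_\ell + k^2\boldsymbol{\iota}\right]\frac{e^{-jk\ell}}{\ell}\, dV \tag{5.53}$$

But

$$\ell = \left[(x-\xi)^2 + (y-\eta)^2 + (z-\zeta)^2\right]^{1/2}
= \left[(r\sin\theta\cos\phi - \xi)^2 + (r\sin\theta\sin\phi - \eta)^2 + (r\cos\theta - \zeta)^2\right]^{1/2}$$

in which now conventional spherical coordinates $(r,\theta,\phi)$ centered at the origin have
been introduced. As $P$ becomes remote, $\ell$ can be expressed in the rapidly converging
series

$$\ell = r - (\xi\sin\theta\cos\phi + \eta\sin\theta\sin\phi + \zeta\cos\theta) + O(r^{-1}) \tag{5.54}$$

Similarly,

$$\ell^{-1} = r^{-1} + O(r^{-2}), \qquad \lim_{r\to\infty}\mathbf{1}_\ell = \mathbf{1}_r$$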

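and thus as $r$ becomes very large, Equations (5.52) and (5.53) may be written

$$\mathbf{B}(x,y,z) = \frac{jk}{4\pi}\,\frac{e^{-jkr}}{r}\int_{V} \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_r\, e^{jk\text{л}}\, dV + O(r^{-2}) \tag{5.55}$$

$$\mathbf{E}(x,y,z) = \frac{j\omega}{4\pi}\,\frac{e^{-jkr}}{r}\int_{V} \mathbf{1}_r\times\left(\mathbf{1}_r\times\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right) e^{jk\text{л}}\, dV + O(r^{-2}) \tag{5.56}$$

in which л $= \xi\sin\theta\cos\phi + \eta\sin\theta\sin\phi + \zeta\cos\theta$.†

If one were to go back and include all the terms in the expressions for $\nabla_S\psi$ and
$(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi$, they would alter the results (5.55) and (5.56) only at the level of $O(r^{-2})$.
Therefore these two expressions for $\mathbf{B}$ and $\mathbf{E}$ may be taken as exact.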
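The expansion (5.54) that underlies these far-field forms can be checked numerically. The sketch below is illustrative only: the source point and observation angles are arbitrary assumed values, and the function names are hypothetical. It compares the exact distance with $r$ minus the directional position л and confirms that the discrepancy falls off roughly as $r^{-1}$.

```python
import math

def exact_distance(r, theta, phi, source):
    """Exact distance from the source point (xi, eta, zeta) to the field point (r, theta, phi)."""
    xi, eta, zeta = source
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return math.sqrt((x - xi) ** 2 + (y - eta) ** 2 + (z - zeta) ** 2)

def directional_position(theta, phi, source):
    """Projection of the source position on the radial direction (the Russian 'ell' of the text)."""
    xi, eta, zeta = source
    return (xi * math.sin(theta) * math.cos(phi)
            + eta * math.sin(theta) * math.sin(phi)
            + zeta * math.cos(theta))

def truncation_error(r, theta, phi, source):
    """Magnitude of the O(1/r) remainder left after the first two terms of (5.54)."""
    approx = r - directional_position(theta, phi, source)
    return abs(exact_distance(r, theta, phi, source) - approx)

# Assumed, purely illustrative values (not taken from the text).
source = (0.3, -0.2, 0.5)
theta, phi = 1.0, 0.7
errors = [truncation_error(r, theta, phi, source) for r in (10.0, 100.0, 1000.0)]
```

Each tenfold increase in `r` should reduce the entries of `errors` by about a factor of ten, which is the behavior the remainder term $O(r^{-1})$ describes.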
In considering the expressions (5.55) and (5.56) with respect to the Sommerfeld
conditions, one notices that the terms of $O(r^{-2})$ and below satisfy all four conditions
and thus concern may be focused on the explicit first-order terms. But

$$\lim_{r\to\infty} r\mathbf{B} = \frac{jk}{4\pi}\lim_{r\to\infty} e^{-jkr}\int_{V} \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_r\, e^{jk\text{л}}\, dV \tag{5.57}$$

and, since the volume integral is a function of the source coordinates and the angular
direction to $P$, but not of $r$, this limit is finite. A similar argument establishes that
$\lim_{r\to\infty} r\mathbf{E}$ is also finite and thus both finiteness conditions are satisfied.

Further,

$$\lim_{r\to\infty} r\left[(\mathbf{1}_r\times\mathbf{B}) + \frac{\mathbf{E}}{c}\right]
= \lim_{r\to\infty} \frac{e^{-jkr}}{4\pi}\int_{V} \left[jk\,\mathbf{1}_r\times\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\mathbf{1}_r + \frac{j\omega}{c}\,\mathbf{1}_r\times\mathbf{1}_r\times\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right] e^{jk\text{л}}\, dV \tag{5.58}$$

The integrand in (5.58) is identically zero and therefore condition (5.50) is satisfied.
In like manner, the condition (5.48) is found to be satisfied also. This supports the
argument that any system of real sources confined to a finite volume $V_0$ gives rise to an
electromagnetic field at infinity which satisfies Sommerfeld's conditions, that the surface
integral over an infinite sphere $S_N$ gives a null contribution, and that in an unbounded
region the electromagnetic field at any point $P$, near or remote, is given
precisely by

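$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_{V} \frac{1}{j\omega\epsilon_0}\left[(\boldsymbol{\iota}\cdot\nabla_S)\nabla_S\psi + k^2\psi\,\boldsymbol{\iota}\right] dV \tag{5.59}$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_{V} \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\nabla_S\psi\, dV \tag{5.60}$$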

Suppose now that parts of the volume $V_0$ are excluded from $V$ by the finite, regular
closed surfaces $S_1 \ldots S_i \ldots$. These surfaces may exclude some of the sources from
$V$ or not, but their presence does not alter the results at infinity. However, now the
more general expressions (5.43) and (5.44) apply, and one may conclude by saying that
(5.43) and (5.44) are valid even if the volume $V$ is infinite, so long as real sources in a
finite volume are assumed. If the volume $V$ is infinite, the surface at infinity need not
be considered.

† This symbol is the Russian lower-case "ell" and may be called the directional position of the source
point.

5.7 THE POTENTIAL FUNCTIONS

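If the volume $V$ is totally unbounded, Equations (5.42) and (5.43) give

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_{V} \left(\frac{\rho}{\epsilon_0}\nabla_S\psi - j\omega\psi\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right) dV \tag{5.61}$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\int_{V} \frac{\boldsymbol{\iota}}{\mu_0^{-1}}\times\nabla_S\psi\, dV \tag{5.62}$$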

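Since $\nabla_P\psi = -\nabla_S\psi$, and since $\boldsymbol{\iota}$ and the limits of integration are functions of $(\xi,\eta,\zeta)$,
but not of $(x,y,z)$, these integrals may be written

$$\mathbf{E} = -\nabla_P\int_{V} \frac{\rho\,\psi}{4\pi\epsilon_0}\, dV - j\omega\int_{V} \frac{\boldsymbol{\iota}\,\psi}{4\pi\mu_0^{-1}}\, dV \tag{5.63}$$

$$\mathbf{B} = \nabla_P\times\int_{V} \frac{\boldsymbol{\iota}\,\psi}{4\pi\mu_0^{-1}}\, dV \tag{5.64}$$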

Therefore it is convenient to introduce two potential functions by the defining relations

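$$\mathbf{A}(x,y,z,t) = \int_{V} \frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\, e^{j(\omega t - k\ell)}}{4\pi\mu_0^{-1}\,\ell}\, dV \tag{5.65}$$

$$\Phi(x,y,z,t) = \int_{V} \frac{\rho(\xi,\eta,\zeta)\, e^{j(\omega t - k\ell)}}{4\pi\epsilon_0\,\ell}\, dV \tag{5.66}$$

in which the time factor $e^{j\omega t}$ has been reinserted and $e^{-jk\ell}/\ell$ has been substituted for $\psi$.
$\mathbf{A}$ is known as the magnetic vector potential function and $\Phi$ is known as the electric
scalar potential function.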
Since $k = \omega/c$, one may write

$$\exp[j(\omega t - k\ell)] = \exp[j\omega(t - \ell/c)]$$

Therefore each current element in the integrand of (5.65), and each charge element in
the integrand of (5.66), makes a contribution to the potential at $(x,y,z)$ at time $t$ which
is in accord with the value it had at the earlier time $t - \ell/c$. But this is consistent with
the idea that it takes a time $\ell/c$ for a disturbance to travel from $(\xi,\eta,\zeta)$ to $(x,y,z)$. For
this reason, (5.65) and (5.66) are often called the retarded potentials.
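The equivalence of the phase factor and the retarded-time factor may be illustrated with a few lines of arithmetic. The following is a minimal sketch under assumed values (a 100 MHz source observed 300 m away); the function names and numbers are hypothetical and serve only to show that the two exponentials coincide and that the delay is the travel time at the speed of light.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in free space, m/s

def phase_factor(omega, t, distance):
    """exp[j(omega*t - k*distance)] with k = omega/c, as in (5.65) and (5.66)."""
    k = omega / C
    return cmath.exp(1j * (omega * t - k * distance))

def retarded_factor(omega, t, distance):
    """The same factor written as exp[j*omega*(t - distance/c)]."""
    return cmath.exp(1j * omega * (t - distance / C))

def retardation_delay(distance):
    """Time taken by a disturbance to travel the given distance."""
    return distance / C

# Assumed illustrative values, not taken from the text.
omega = 2.0 * math.pi * 1.0e8   # 100 MHz
t = 1.0e-6                      # observation time, s
distance = 300.0                # source-to-field-point distance, m

mismatch = abs(phase_factor(omega, t, distance) - retarded_factor(omega, t, distance))
delay = retardation_delay(distance)
```

Here `mismatch` is at the level of rounding error and `delay` is very nearly one microsecond, so the contribution observed at time `t` reflects the state of the source about one microsecond earlier.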
From (5.63) and (5.64),

$$\mathbf{E} = -\nabla\Phi - \dot{\mathbf{A}} \tag{5.67}$$

$$\mathbf{B} = \nabla\times\mathbf{A} \tag{5.68}$$

in which the subscripts on the del operators have been dropped, since $\mathbf{A}$ and $\Phi$ are
functions only of $(x,y,z)$ and not also of $(\xi,\eta,\zeta)$.

The differential equations satisfied by $\mathbf{A}$ and $\Phi$ may be deduced by taking the
divergence of (5.67) and the curl of (5.68), which leads to

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\ddot{\mathbf{A}} = -\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{5.69}$$

$$\nabla^2\Phi - \frac{1}{c^2}\ddot{\Phi} = -\frac{\rho}{\epsilon_0} \tag{5.70}$$

these results being valid whether $\boldsymbol{\iota}$ and $\rho$ are harmonic functions of time, or more general
time functions representable by Fourier integrals. A proof may be found in Appendix H.
At points away from the sources, it is unnecessary to solve for both $\Phi$ and $\mathbf{A}$ (unless
static source distributions are involved). One need find only $\mathbf{A}$, then use (5.68) to
obtain $\mathbf{B}$, and then use (5.24) to deduce $\mathbf{E}$. The latter may be rewritten

$$\mathbf{E} = \frac{c}{jk}\,\nabla\times\nabla\times\mathbf{A} \tag{5.71}$$

It is interesting to observe that for time-independent sources ($\omega = k = 0$), Equations
(5.66) and (5.67) reduce to the electrostatic relations encountered in Chapter 3, whereas
Equations (5.65) and (5.68) reduce to the magnetostatic relations developed in Chapter 4.
For the more general time-harmonic case, if all the sources are confined to a finite
volume $V_0$, and if $\ell$ from any point in $V_0$ to $(x,y,z)$ is much bigger than the maximum
dimension of $V_0$, then (5.65) may be approximated by replacing $\ell$ with $r$ in the denominator
of the integrand, and by replacing $\ell$ with (5.54) in the phase factor, where $\mathbf{r}$ is
drawn from an origin in $V_0$ to $(x,y,z)$. This gives the far-field approximation

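$$\mathbf{A}(x,y,z,t) = \frac{e^{j(\omega t - kr)}}{4\pi\mu_0^{-1}\, r}\int_{V} \boldsymbol{\iota}(\xi,\eta,\zeta)\, e^{jk\text{л}}\, dV \tag{5.72}$$

in which, as before, л $= \xi\sin\theta\cos\phi + \eta\sin\theta\sin\phi + \zeta\cos\theta$. To this same
approximation, using (5.68) and (5.71), one obtains

$$\mathbf{B} = -jk\,\mathbf{1}_r\times\mathbf{A} \tag{5.73}$$

$$\mathbf{E} = -c\,\mathbf{1}_r\times\mathbf{B} = -j\omega\mathbf{A}_T \tag{5.74}$$

with $\mathbf{A}_T$ that part of $\mathbf{A}$ which is transverse to the radial direction $\mathbf{1}_r$.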


A study of Equations (5.72) through (5.74) shows that the fields are in the form of an
outgoing spherical wave

$$\frac{e^{j(\omega t - kr)}}{4\pi\mu_0^{-1}\, r} \tag{5.75}$$

which diminishes as the reciprocal of the distance, and that this wave is modified by
the directional weighting function

$$\mathbf{a}(\theta,\phi) = \int_{V} \boldsymbol{\iota}(\xi,\eta,\zeta)\, e^{jk\text{л}}\, dV \tag{5.76}$$

For this reason $\mathbf{a}(\theta,\phi)$ may be called the field pattern, and is closely related to the
power radiation pattern of the system of sources, as will become evident shortly.

From the form of (5.75), it is apparent that the wave is propagating in the radial
direction at such a speed that a point of constant phase satisfies the relation

$$\omega t - kr = \text{constant}$$

which gives

$$v_{ph} = \frac{dr}{dt} = \frac{\omega}{k} = c$$

as the phase velocity of the wave.

Further, (5.73) and (5.74) indicate that both $\mathbf{B}$ and $\mathbf{E}$ are transverse to the direction
of propagation and that in the transverse plane they are perpendicular to each other,
their magnitudes being in the ratio $E/B = c$.
These properties are common to all time-varying electromagnetic fields in free space
at points remote from the sources.
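These three properties follow from (5.73) and (5.74) by vector algebra alone, whatever the complex vector $\mathbf{A}$ may be. The sketch below applies the two relations to an arbitrary, assumed value of $\mathbf{A}$ and an arbitrary direction; the helper names and the sample numbers are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in free space, m/s

def cross(a, b):
    """Cross product of two 3-vectors (components may be complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Unconjugated dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))

def norm(a):
    """Euclidean norm of a possibly complex 3-vector."""
    return math.sqrt(sum(abs(p) ** 2 for p in a))

def far_fields(A, theta, phi, k):
    """Apply B = -jk 1_r x A (5.73) and E = -c 1_r x B (5.74)."""
    r_hat = (math.sin(theta) * math.cos(phi),
             math.sin(theta) * math.sin(phi),
             math.cos(theta))
    B = tuple(-1j * k * comp for comp in cross(r_hat, A))
    E = tuple(-C * comp for comp in cross(r_hat, B))
    return r_hat, E, B

# Assumed illustrative vector potential and direction (not from the text).
A = (1.0 + 2.0j, -0.5j, 0.3)
r_hat, E, B = far_fields(A, theta=0.9, phi=2.1, k=2.0 * math.pi)
```

For any choice of `A`, the products `dot(r_hat, E)`, `dot(r_hat, B)` and `dot(E, B)` vanish to rounding error, and `norm(E) / norm(B)` equals `C`.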
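EXAMPLE 5.2
A simple source of great practical importance is the half-wave dipole. It may be assumed
to consist of a filamentary current disposed along the Z axis, as shown in the figure, with
an amplitude distribution which is spatially sinusoidal.

[Figure: half-wave dipole along the Z axis, with current distribution $I(\zeta) = I_m\cos k\zeta$]

Thus one may describe the current distribution by the equation

$$I(\zeta,t) = I_m\cos k\zeta\; e^{j\omega t}$$

in which $I_m$ is the amplitude of the current at the central feeding terminals, which are
assumed to be negligibly separated.
Use of (5.76) gives

$$\mathbf{a}(\theta) = \mathbf{1}_z I_m\int_{-\lambda/4}^{\lambda/4} \cos k\zeta\; e^{jk\zeta\cos\theta}\, d\zeta
= \mathbf{1}_z\,\frac{2I_m}{k}\,\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin^2\theta}$$

so that

$$\mathbf{A}_T = -\mathbf{1}_\theta\sin\theta\, A_z = -\mathbf{1}_\theta\,\frac{I_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\, kr}\,\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}$$

from which

$$\mathbf{E} = \mathbf{1}_\theta\,\frac{j\omega I_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\, kr}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]
= \mathbf{1}_\theta\,\frac{jcI_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\, r}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]$$

$$\mathbf{B} = \frac{1}{c}\,\mathbf{1}_r\times\mathbf{E} = \mathbf{1}_\phi\,\frac{jI_m e^{j(\omega t - kr)}}{2\pi\mu_0^{-1}\, r}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]$$

The directional weighting function $[\cos(\frac{\pi}{2}\cos\theta)]/\sin\theta$ is plotted in polar form in the
second figure, for a half-plane $\phi$ = constant. The three-dimensional field pattern may be
obtained by rotating this plot around the Z axis, and bears some resemblance to a torus.

[Figure: polar plot of $\cos(\frac{\pi}{2}\cos\theta)/\sin\theta$]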
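The closed form obtained from (5.76) for the half-wave dipole can be confirmed by direct numerical integration. The sketch below is a minimal check, assuming a unit current amplitude and using Simpson's rule; the function names, the number of subintervals, and the sample wavenumber are hypothetical choices.

```python
import cmath
import math

def pattern_by_quadrature(theta, k, n=2000):
    """Simpson's-rule estimate of the integral of cos(k*zeta)*exp(j*k*zeta*cos(theta))
    over -lambda/4 <= zeta <= lambda/4, for unit current amplitude (n must be even)."""
    lower, upper = -0.5 * math.pi / k, 0.5 * math.pi / k   # +/- a quarter wavelength
    h = (upper - lower) / n

    def integrand(zeta):
        return math.cos(k * zeta) * cmath.exp(1j * k * zeta * math.cos(theta))

    total = integrand(lower) + integrand(upper)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(lower + i * h)
    return total * h / 3.0

def pattern_closed_form(theta, k):
    """The closed form (2/k) cos((pi/2) cos(theta)) / sin(theta)^2 quoted in Example 5.2."""
    return 2.0 * math.cos(0.5 * math.pi * math.cos(theta)) / (k * math.sin(theta) ** 2)

k = 2.0 * math.pi            # assumed: wavelength of 1 m
theta = math.pi / 3.0        # assumed observation angle
numeric = pattern_by_quadrature(theta, k)
analytic = pattern_closed_form(theta, k)
```

The imaginary part of `numeric` vanishes by symmetry of the current about the origin, and its real part should agree with `analytic` to many decimal places.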

5.8 MAGNETIC STORED ENERGY

With the aid of Faraday's emf law, it is now possible to derive a relation for the
energy stored in a magnetic field. To this end, consider first a charge $q$ which is part
of a current system giving rise to an electromagnetic field $\mathbf{E}$, $\mathbf{B}$. If $\mathbf{v}(t)$ is the instantaneous
velocity of the charge, in time $dt$ it suffers a displacement $\mathbf{v}\,dt$ and experiences
a force $q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$. The work done on the charge during this displacement is

$$dW = q(\mathbf{E} + \mathbf{v}\times\mathbf{B})\cdot\mathbf{v}\, dt = q\mathbf{v}\cdot\mathbf{E}\, dt$$

Therefore the power being supplied to the charge by the field is

$$P = \frac{dW}{dt} = q\mathbf{v}\cdot\mathbf{E} \tag{5.77}$$

If, in place of $q$, one considers all the charge $\rho_1\,dV$ in a volume element $dV$ which
possesses the instantaneous velocity $\mathbf{v}_1(t)$, the power being supplied to this charge is

$$d^3P_1 = \rho_1\mathbf{v}_1\cdot\mathbf{E}\, dV = \boldsymbol{\iota}_1\cdot\mathbf{E}\, dV$$

Similarly, for all the charge $\rho_2\,dV$ in the same volume element $dV$, which has the different
instantaneous velocity $\mathbf{v}_2(t)$, the power being supplied is $\boldsymbol{\iota}_2\cdot\mathbf{E}\,dV$. Upon superimposing
the contributions for all the charges in $dV$, one obtains

$$d^3P = \boldsymbol{\iota}\cdot\mathbf{E}\, dV \tag{5.78}$$

in which $\boldsymbol{\iota} = \boldsymbol{\iota}_1 + \boldsymbol{\iota}_2 + \cdots$.
Next, consider a distribution of current density $\boldsymbol{\iota}(\xi,\eta,\zeta)$ which has established a
steady magnetic field $\mathbf{B}(x,y,z)$. Let $\boldsymbol{\iota}'(\xi,\eta,\zeta,t)$ be an intermediate value of the current
density as it is slowly raised from zero to its final value $\boldsymbol{\iota}$, and let $\mathbf{B}'(x,y,z,t)$ be the
corresponding intermediate value of the magnetic field. As suggested by Figure 5.3, let
$\mathbf{B}'\cdot d\mathbf{S}$ be a tube of flux through the point $P(x,y,z)$ and let $C$ be the contour of this tube.
FIGURE 5.3 Energy build-up in a magnetic field.



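If $S'$ is an open surface with $C$ as its sole boundary, then

$$\int_{S'} \boldsymbol{\iota}'\cdot d\mathbf{S}'$$

is the total current linked by $C$.
Let $C'$ be the contour of one of the tubes of current $\boldsymbol{\iota}'\cdot d\mathbf{S}'$ which pierces $S'$. Then
the tube of flux $\mathbf{B}'\cdot d\mathbf{S}$ induces an electric field along $C'$ given by

$$\oint_{C'} d^2\mathbf{E}\cdot d\boldsymbol{\ell}' = -\dot{\mathbf{B}}'\cdot d\mathbf{S}$$

This electric field opposes the growth of the current $\boldsymbol{\iota}'\cdot d\mathbf{S}'$ and energy must be supplied
by the current to the field at the rate

$$d^4P = -(\boldsymbol{\iota}'\cdot d\mathbf{S}')\oint_{C'} d^2\mathbf{E}\cdot d\boldsymbol{\ell}' = (\boldsymbol{\iota}'\cdot d\mathbf{S}')(\dot{\mathbf{B}}'\cdot d\mathbf{S})$$

which is a use of (5.78).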


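When all the tubes of current piercing $S'$ are included

$$d^2P = -\int_{V'} d^2\mathbf{E}\cdot\boldsymbol{\iota}'\, dV' = \dot{\mathbf{B}}'\cdot d\mathbf{S}\int_{S'} \boldsymbol{\iota}'\cdot d\mathbf{S}'$$

in which $V'$ is the region of current flow for all the tubes of current which pierce $S'$.
Since the field is changing so slowly that $\dot{\mathbf{D}}_0'$ may be neglected in comparison to $\boldsymbol{\iota}'$,

$$\oint_{C} \mathbf{H}_0'\cdot d\boldsymbol{\ell} = \int_{S'} \boldsymbol{\iota}'\cdot d\mathbf{S}'$$

and it follows that

$$d^2P = \int_{\delta V} \mathbf{H}_0'\cdot\dot{\mathbf{B}}'\, dV$$

wherein $\delta V$ is the volume of the tube of flux whose contour is $C$. Upon including all
the tubes of flux, one obtains for the power being supplied to the entire field

$$P = \int_{V} \mathbf{H}_0'\cdot\dot{\mathbf{B}}'\, dV = \frac{d}{dt}\int_{V} \frac{1}{2}\mu_0^{-1}B'^2\, dV \tag{5.79}$$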

with $V$ the volume of all space.

If $W_m$ is the energy stored in the magnetic field, so that $P = dW_m/dt$, then

$$W_m = \frac{1}{2}\mu_0^{-1}\int_{V} B^2\, dV = \frac{1}{2}\int_{V} \mathbf{B}\cdot\mathbf{H}_0\, dV \tag{5.80}$$

in which $\mathbf{B}$ now has its final steady value. Equation (5.80) is a companion formula to
(3.151), which gave the electrostatic stored energy for a steady electric field distribution.
EXAMPLE 5.3
It was shown in Chapter 4 at the end of Example 4.10, that if two long concentric conducting
tubes carry equal steady currents in opposite directions, the field between them is
given by

$$B_\phi(r) = \frac{I}{2\pi\mu_0^{-1}\, r}$$

in which $r$ ranges from $a$, the outer radius of the inner tube, to $b$, the inner radius of the
outer tube. $I$ is the total current in either tube.
The energy stored between tubes, per unit length, is

$$W_m = \frac{1}{2}\mu_0^{-1}\int_a^b \left(\frac{I}{2\pi\mu_0^{-1}\, r}\right)^2 2\pi r\, dr
= \frac{I^2}{4\pi\mu_0^{-1}}\int_a^b \frac{dr}{r}
= \frac{I^2}{4\pi\mu_0^{-1}}\ln\frac{b}{a}$$
As an example, if a = tin., b = t in., and I = 1 amp, the stored energy in the magnetic
field, per meter length of the two concentric tubes, is 0.07 micro-joules.
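The logarithmic result may be evaluated and cross-checked against a direct quadrature of the energy density. The sketch below is illustrative only: it assumes a radius ratio $b/a = 2$ (a hypothetical choice, made because it reproduces a figure of about 0.07 micro-joules per meter for a current of 1 amp), and it writes the permeability as `MU0`, which corresponds to $1/\mu_0^{-1}$ in the notation of the text.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space, H/m (classical defined value)

def energy_closed_form(current, a, b):
    """Stored magnetic energy per unit length, (mu0 I^2 / 4 pi) ln(b/a), in J/m."""
    return MU0 * current ** 2 / (4.0 * math.pi) * math.log(b / a)

def energy_by_quadrature(current, a, b, n=20000):
    """Midpoint-rule integral of the energy density B^2/(2 mu0) over the annulus a < r < b."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        r = a + (i + 0.5) * h
        b_phi = MU0 * current / (2.0 * math.pi * r)       # field between the tubes
        total += b_phi ** 2 / (2.0 * MU0) * 2.0 * math.pi * r
    return total * h

# Assumed radii: only the ratio b/a matters; b = 2a is a hypothetical choice.
a = 0.25 * 0.0254
b = 2.0 * a
w_closed = energy_closed_form(1.0, a, b)
w_numeric = energy_by_quadrature(1.0, a, b)
```

With these assumed values both `w_closed` and `w_numeric` come to about 6.9e-8 joules per meter.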

5.9 POYNTING'S THEOREM

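Consideration can next be given to the power balance in a time-varying electromagnetic
field. Assume that there is a system of impressed sources $\boldsymbol{\iota}^i$ which causes an
electromagnetic field $\mathbf{E}^i$, $\mathbf{B}^i$, and that in response to this impressed field there is an induced
system† of currents $\boldsymbol{\iota}^r$ creating an additional field $\mathbf{E}^r$, $\mathbf{B}^r$. The total current density and
field at any point is therefore

$$\boldsymbol{\iota} = \boldsymbol{\iota}^i + \boldsymbol{\iota}^r, \qquad \mathbf{E} = \mathbf{E}^i + \mathbf{E}^r, \qquad \mathbf{B} = \mathbf{B}^i + \mathbf{B}^r$$

In accordance with (5.78), the total field $\mathbf{E}$ reacts on the impressed source density $\boldsymbol{\iota}^i$ in
such a way that, if power is being supplied to the field, it must be at the rate

$$d^3P = -\mathbf{E}\cdot\boldsymbol{\iota}^i\, dV$$

But from Maxwell's equations,

$$\boldsymbol{\iota}^i = \nabla\times\mathbf{H}_0 - \frac{\partial\mathbf{D}_0}{\partial t} - \boldsymbol{\iota}^r$$

so that

$$d^3P = \left[-\mathbf{E}\cdot\nabla\times\mathbf{H}_0 + \frac{\partial}{\partial t}\left(\frac{1}{2}\epsilon_0E^2\right) + \mathbf{E}\cdot\boldsymbol{\iota}^r\right] dV \tag{5.81}$$

Application of the vector identity (V.108) gives

$$\nabla\cdot(\mathbf{E}\times\mathbf{H}_0) = \mathbf{H}_0\cdot\nabla\times\mathbf{E} - \mathbf{E}\cdot\nabla\times\mathbf{H}_0$$

and this result, coupled with (5.25a) yields

$$-\mathbf{E}\cdot\nabla\times\mathbf{H}_0 = \nabla\cdot(\mathbf{E}\times\mathbf{H}_0) + \mathbf{H}_0\cdot\left(\frac{\partial\mathbf{B}}{\partial t}\right) \tag{5.82}$$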

† The decomposition of the total current system into impressed and response current densities is
arbitrary, but often forms a natural division. As an illustration, the currents which flow in the dipole
of Example 5.2 may be considered as response currents, whereas the currents which flow in the generator
and transmission line leading up to the terminals of the dipole may be taken as the impressed
source system.

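Therefore (5.81) may be rewritten

$$d^3P = \left[\frac{\partial}{\partial t}\left(\frac{1}{2}\epsilon_0E^2 + \frac{1}{2}\mu_0^{-1}B^2\right) + \mathbf{E}\cdot\boldsymbol{\iota}^r + \nabla\cdot(\mathbf{E}\times\mathbf{H}_0)\right] dV \tag{5.83}$$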

This result gives the power balance in a volume element $dV$. The left side of (5.83)
is the instantaneous power being supplied by the sources to $dV$. The factor
$\frac{1}{2}\epsilon_0E^2 + \frac{1}{2}\mu_0^{-1}B^2$ has been shown to represent the density of energy stored in static electric and
magnetic fields. If it is assumed that this factor bears the same interpretation for
dynamic fields (and since it is a point function, this is a most reasonable assumption),
then the term

$$\frac{\partial}{\partial t}\left(\frac{1}{2}\epsilon_0E^2 + \frac{1}{2}\mu_0^{-1}B^2\right)$$

may be identified as the time rate of change of the density of stored energy.
The factor $\mathbf{E}\cdot\boldsymbol{\iota}^r$ represents the power density being absorbed from the field by the
response current $\boldsymbol{\iota}^r$. If, for example, the response current is flowing in a conductor, this
term accounts for ohmic loss. Alternatively, if $\boldsymbol{\iota}^r$ is due to freely moving charges, $\mathbf{E}\cdot\boldsymbol{\iota}^r$
accounts for their increase in kinetic energy.
When the law of conservation of energy is invoked, it follows that the term
$\nabla\cdot(\mathbf{E}\times\mathbf{H}_0)$ may be interpreted as the volume density of power leaving $dV$.
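This conclusion may be seen from another point of view by integrating (5.83).
With the aid of the divergence theorem, one may write

$$P = \frac{d}{dt}\int_{V} \left(\frac{1}{2}\epsilon_0E^2 + \frac{1}{2}\mu_0^{-1}B^2\right) dV + \int_{V} \mathbf{E}\cdot\boldsymbol{\iota}^r\, dV + \int_{S} \mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} \tag{5.84}$$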

The left side of (5.84) represents the entire instantaneous power being supplied by all
the sources. The first integral on the right side of this equation accounts for the time
rate of change of the entire stored energy of the field. The second integral stands for the
power being absorbed by the system of response currents. The last integral therefore
represents the entire instantaneous power flow outward across the surface $S$ bounding
the volume $V$. For this reason, one may define the Poynting vector as

$$\boldsymbol{\mathcal{P}} = \mathbf{E}\times\mathbf{H}_0 \tag{5.85}$$

and place upon it the interpretation that it gives in magnitude and direction the
instantaneous rate of energy flow per unit area at a point. This is Poynting's theorem.
Since the units of $\mathbf{E}$ and $\mathbf{H}_0$ are volts per meter and amperes per meter respectively,
it is seen that the units of $\boldsymbol{\mathcal{P}}$ are watts per square meter.
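EXAMPLE 5.4
For the field of the half-wave dipole treated in Example 5.2, at points remote from the
dipole (the far-field), Equations (5.73) and (5.74) are applicable and therefore

$$\mathbf{H}_0 = \mu_0^{-1}\mathbf{B} = \frac{\mu_0^{-1}}{c}\,\mathbf{1}_r\times\mathbf{E} = \frac{1}{\eta}\,\mathbf{1}_r\times\mathbf{E}_T$$

in which $\eta = (\mu_0^{-1}\epsilon_0)^{-1/2} = 377$ ohms is called the impedance of free space. Therefore the
Poynting vector (5.85) may be written for this case in the form

$$\boldsymbol{\mathcal{P}} = \mathbf{E}_T\times\frac{1}{\eta}\left(\mathbf{1}_r\times\mathbf{E}_T\right) = \mathbf{1}_r\,\frac{E_T^2}{\eta}
= \mathbf{1}_r\,\frac{\eta I_m^2}{(2\pi r)^2}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]^2\sin^2(\omega t - kr)$$

since $\mathcal{R}e\,[je^{j(\omega t - kr)}] = -\sin(\omega t - kr)$. If $\bar{\boldsymbol{\mathcal{P}}}$ is the time-average value of $\boldsymbol{\mathcal{P}}$, then

$$\bar{\boldsymbol{\mathcal{P}}} = \mathbf{1}_r\,\frac{\eta I_m^2}{8\pi^2r^2}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]^2$$

and one sees that there is a steady radial flow of energy away from the dipole. The total
average power being radiated may be determined by integrating $\bar{\boldsymbol{\mathcal{P}}}$ over the surface of a large
sphere $S$ centered at the dipole. This gives

$$P_{rad} = \int_{S} \bar{\boldsymbol{\mathcal{P}}}\cdot d\mathbf{S}
= \int_0^\pi \frac{\eta I_m^2}{8\pi^2r^2}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]^2 2\pi r^2\sin\theta\, d\theta
= \frac{\eta I_m^2}{4\pi}\int_0^\pi \frac{\left[\cos\left(\frac{\pi}{2}\cos\theta\right)\right]^2}{\sin\theta}\, d\theta
= \frac{\eta I_m^2}{4\pi}\,(1.2186)$$

in which the integral has been evaluated by first expanding the integrand in a power series.
As a specific illustration, if $I_{eff} = 0.707I_m = 1$ amp, since $\eta = 377$ ohms, the radiated
power is $P_{rad} = 73$ watts.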
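The pattern integral and the radiated power quoted in this example may be reproduced numerically. The sketch below uses a simple midpoint rule in place of the power-series evaluation mentioned in the text; the function names and the number of subintervals are assumptions made for illustration.

```python
import math

ETA = 377.0  # impedance of free space in ohms, as quoted in the text

def pattern_integrand(theta):
    """cos^2((pi/2) cos(theta)) / sin(theta), the integrand of the radiated-power integral."""
    return math.cos(0.5 * math.pi * math.cos(theta)) ** 2 / math.sin(theta)

def pattern_integral(n=20000):
    """Midpoint-rule estimate over 0 < theta < pi (the midpoints avoid the 0/0 endpoints)."""
    h = math.pi / n
    return h * sum(pattern_integrand((i + 0.5) * h) for i in range(n))

def radiated_power(i_eff):
    """Average radiated power of the half-wave dipole for an rms terminal current i_eff."""
    i_m = math.sqrt(2.0) * i_eff            # peak current, since I_eff = 0.707 I_m
    return ETA * i_m ** 2 / (4.0 * math.pi) * pattern_integral()

integral_value = pattern_integral()
power_watts = radiated_power(1.0)
```

The quadrature gives a value close to 1.219 for the integral, in agreement with the quoted 1.2186 to about three decimal places, and a radiated power of approximately 73 watts for an rms current of 1 amp.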

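Cases such as the preceding example, in which the currents and fields are varying
harmonically in time, occur so frequently and have such importance as to deserve
special discussion. Expressing all quantities in the form of a complex spatial vector
function multiplied by $e^{j\omega t}$, such as

$$\mathbf{E}(x,y,z,t) = \mathcal{R}e\;\boldsymbol{\mathcal{E}}(x,y,z)\,e^{j\omega t}$$

one may write

$$\begin{aligned}
\boldsymbol{\mathcal{P}} = \mathbf{E}\times\mathbf{H}_0 &= \tfrac{1}{4}\left(\boldsymbol{\mathcal{E}}e^{j\omega t} + \boldsymbol{\mathcal{E}}^*e^{-j\omega t}\right)\times\left(\boldsymbol{\mathcal{H}}_0e^{j\omega t} + \boldsymbol{\mathcal{H}}_0^*e^{-j\omega t}\right) \\
&= \tfrac{1}{4}\left(\boldsymbol{\mathcal{E}}\times\boldsymbol{\mathcal{H}}_0^* + \boldsymbol{\mathcal{E}}^*\times\boldsymbol{\mathcal{H}}_0\right) + \tfrac{1}{4}\left(\boldsymbol{\mathcal{E}}\times\boldsymbol{\mathcal{H}}_0\,e^{j2\omega t} + \boldsymbol{\mathcal{E}}^*\times\boldsymbol{\mathcal{H}}_0^*\,e^{-j2\omega t}\right) \\
&= \tfrac{1}{2}\mathcal{R}e\,(\mathbf{E}\times\mathbf{H}_0^*) + \tfrac{1}{2}\mathcal{R}e\,(\mathbf{E}\times\mathbf{H}_0)
\end{aligned} \tag{5.86}$$

in which, in the last line, $\mathbf{E}$ and $\mathbf{H}_0$ denote the complex fields including the time factor $e^{j\omega t}$.
The term $\frac{1}{2}\mathcal{R}e\,(\mathbf{E}\times\mathbf{H}_0^*)$ is independent of time and thus represents the time-average
value of $\boldsymbol{\mathcal{P}}$, giving

$$\bar{\boldsymbol{\mathcal{P}}} = \tfrac{1}{2}\mathcal{R}e\,(\mathbf{E}\times\mathbf{H}_0^*) \tag{5.87}$$

The term $\frac{1}{2}\mathcal{R}e\,(\mathbf{E}\times\mathbf{H}_0)$ contains the factor $e^{j2\omega t}$ and thus represents the oscillating
portion of Poynting's vector. $\boldsymbol{\mathcal{P}}$ may therefore be interpreted at a point as consisting
of a steady flow of energy density plus a flow which surges back and forth at frequency
$2\omega$.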
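The statement that the steady part of the flow equals one-half the real part of the complex product can be checked by averaging the instantaneous product over one period. The sketch below does this for a single pair of transverse components; the complex amplitudes, the frequency, and the function names are arbitrary assumed values chosen for illustration.

```python
import cmath
import math

def instantaneous_product(e_amp, h_amp, omega, t):
    """Product of the real instantaneous fields Re(E e^{jwt}) and Re(H e^{jwt})."""
    e_t = (e_amp * cmath.exp(1j * omega * t)).real
    h_t = (h_amp * cmath.exp(1j * omega * t)).real
    return e_t * h_t

def time_average(e_amp, h_amp, omega, n=1000):
    """Average of the instantaneous product over one period, sampled at n midpoints."""
    period = 2.0 * math.pi / omega
    samples = (instantaneous_product(e_amp, h_amp, omega, (i + 0.5) * period / n)
               for i in range(n))
    return sum(samples) / n

def phasor_average(e_amp, h_amp):
    """One-half the real part of E H*, the form given in (5.87)."""
    return 0.5 * (e_amp * h_amp.conjugate()).real

# Assumed complex amplitudes of one E component and the perpendicular H component.
e_amp = 3.0 + 4.0j
h_amp = 0.01 - 0.02j
omega = 2.0 * math.pi * 1.0e6

averaged = time_average(e_amp, h_amp, omega)
from_phasors = phasor_average(e_amp, h_amp)
```

For these values both `averaged` and `from_phasors` equal -0.025, the negative sign simply indicating that the assumed phases correspond to net flow in the negative direction.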
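Similarly

$$\tfrac{1}{2}\epsilon_0E^2 = \tfrac{1}{2}\epsilon_0\mathbf{E}\cdot\mathbf{E}
= \tfrac{1}{2}\epsilon_0\left[\tfrac{1}{2}\left(\boldsymbol{\mathcal{E}}e^{j\omega t} + \boldsymbol{\mathcal{E}}^*e^{-j\omega t}\right)\cdot\tfrac{1}{2}\left(\boldsymbol{\mathcal{E}}e^{j\omega t} + \boldsymbol{\mathcal{E}}^*e^{-j\omega t}\right)\right]
= \tfrac{1}{4}\epsilon_0\mathbf{E}\cdot\mathbf{E}^* + \tfrac{1}{4}\epsilon_0\mathcal{R}e\,(\mathbf{E}\cdot\mathbf{E}) \tag{5.88}$$

and

$$\tfrac{1}{2}\mu_0^{-1}B^2 = \tfrac{1}{4}\mu_0^{-1}\mathbf{B}\cdot\mathbf{B}^* + \tfrac{1}{4}\mu_0^{-1}\mathcal{R}e\,(\mathbf{B}\cdot\mathbf{B}) \tag{5.89}$$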

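The terms $\frac{1}{4}\epsilon_0\mathbf{E}\cdot\mathbf{E}^*$ and $\frac{1}{4}\mu_0^{-1}\mathbf{B}\cdot\mathbf{B}^*$ are independent of time and represent the
time-average stored energies; their time derivatives are zero. The terms $\frac{1}{4}\epsilon_0\mathcal{R}e\,(\mathbf{E}\cdot\mathbf{E})$ and
$\frac{1}{4}\mu_0^{-1}\mathcal{R}e\,(\mathbf{B}\cdot\mathbf{B})$ oscillate at a frequency $2\omega$ and they represent the variable components
of the stored energy.
Finally,

$$\mathbf{E}\cdot\boldsymbol{\iota}^r = \tfrac{1}{2}\left(\boldsymbol{\mathcal{E}}e^{j\omega t} + \boldsymbol{\mathcal{E}}^*e^{-j\omega t}\right)\cdot\tfrac{1}{2}\left(\boldsymbol{\mathcal{I}}^re^{j\omega t} + \boldsymbol{\mathcal{I}}^{r*}e^{-j\omega t}\right)
= \tfrac{1}{2}\mathcal{R}e\;\mathbf{E}\cdot\boldsymbol{\iota}^{r*} + \tfrac{1}{2}\mathcal{R}e\;\mathbf{E}\cdot\boldsymbol{\iota}^r \tag{5.90}$$

Here again, the term $\frac{1}{2}\mathcal{R}e\;\mathbf{E}\cdot\boldsymbol{\iota}^{r*}$ represents the time-average power density being
absorbed by the response currents; the term $\frac{1}{2}\mathcal{R}e\;\mathbf{E}\cdot\boldsymbol{\iota}^r$ oscillates at a frequency $2\omega$
and represents the energy density being cyclically absorbed and released by the
response currents.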
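With this formulation, Equation (5.84) may be rewritten in two parts. The time-average
power balance is seen to be

$$\bar{P} = \tfrac{1}{2}\mathcal{R}e\int_{V} \mathbf{E}\cdot\boldsymbol{\iota}^{r*}\, dV + \tfrac{1}{2}\mathcal{R}e\int_{S} \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} \tag{5.91}$$

whereas the time-variable part, oscillating at a frequency $2\omega$, may be written

$$P(2\omega) = \frac{d}{dt}\int_{V} \left[\tfrac{1}{4}\epsilon_0\mathcal{R}e\,(\mathbf{E}\cdot\mathbf{E}) + \tfrac{1}{4}\mu_0^{-1}\mathcal{R}e\,(\mathbf{B}\cdot\mathbf{B})\right] dV
+ \tfrac{1}{2}\mathcal{R}e\int_{V} \mathbf{E}\cdot\boldsymbol{\iota}^r\, dV + \tfrac{1}{2}\mathcal{R}e\int_{S} \mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} \tag{5.92}$$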
Thus, on the time average, the sources supply power only to that component of the
response currents in phase with the electric field, represented by the first integral
in (5.91), and to the net energy flow out of the volume V across the surface S. In
addition, the sources may have to furnish energy and take it back at the cyclic rate 2w
if the right side of (5.92) is not zero. However, in many practical circumstances, the
individual integrals in (5.92) may not be in phase, but may be adjusted purposely
so that they cancel each other, thus "matching" the generator.
EXAMPLE 5.5
In a volume $V$ away from all currents, Equations (5.91) and (5.92) give

$$\tfrac{1}{2}\mathcal{R}e\int_{S} \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} = 0$$

$$\tfrac{1}{2}\mathcal{R}e\int_{S} \mathbf{E}\times\mathbf{H}_0\cdot d\mathbf{S} = -\frac{d}{dt}\int_{V} \left[\tfrac{1}{4}\epsilon_0\mathcal{R}e\,(\mathbf{E}\cdot\mathbf{E}) + \tfrac{1}{4}\mu_0^{-1}\mathcal{R}e\,(\mathbf{B}\cdot\mathbf{B})\right] dV$$

The first equation says that the average power flow into $V$ equals the average power flow
out. This is as it should be since $V$ consists of free-space. The second equation says that
the energy which surges back and forth across $S$ accommodates the cyclic variation of
stored energy within $V$.
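As a specific illustration, let $V$ be sufficiently remote from the currents so that the fields
may be described by (5.73) and (5.74). If then all dimensions of $V$ are small compared to $r$
(the distance to the currents), $\mathbf{A}$ assumes the simple form throughout $V$ of

$$\mathbf{A} = \boldsymbol{\alpha}_0\,e^{j(\omega t - kz)}$$

with $\boldsymbol{\alpha}_0$ a constant, and the local Z axis chosen in the direction of propagation.
It then follows that

$$\mathbf{E} = \mathbf{1}_x\left(\mathcal{E}_0e^{-jkz}\right)e^{j\omega t}$$

$$\mathbf{B} = \mathbf{1}_y\left(\frac{\mathcal{E}_0}{c}\,e^{-jkz}\right)e^{j\omega t}$$

in which $\mathcal{E}_0$ is taken to be a real constant, and the X and Y axes have been oriented appropriately
in a transverse plane. Thus

$$\tfrac{1}{4}\epsilon_0\mathcal{R}e\,(\mathbf{E}\cdot\mathbf{E}) = \tfrac{1}{4}\epsilon_0\mathcal{R}e\;\mathcal{E}_0^2e^{j2(\omega t - kz)} = \tfrac{1}{4}\epsilon_0\mathcal{E}_0^2\cos 2(\omega t - kz)$$

$$\tfrac{1}{4}\mu_0^{-1}\mathcal{R}e\,(\mathbf{B}\cdot\mathbf{B}) = \tfrac{1}{4}\,\frac{\mu_0^{-1}}{c^2}\,\mathcal{E}_0^2\cos 2(\omega t - kz) = \tfrac{1}{4}\epsilon_0\mathcal{R}e\,(\mathbf{E}\cdot\mathbf{E})$$

The time-varying energies stored in the electric and magnetic fields have the same peak
values. The same is true of the time-average values.
Further,

$$\tfrac{1}{2}\mathbf{E}\times\mathbf{H}_0^* = c\,\mathbf{1}_z\left(\tfrac{1}{2}\epsilon_0\mathcal{E}_0^2\right)$$

$$\tfrac{1}{2}\mathcal{R}e\,(\mathbf{E}\times\mathbf{H}_0) = c\,\mathbf{1}_z\left(\tfrac{1}{2}\epsilon_0\mathcal{E}_0^2\right)\cos 2(\omega t - kz)$$

The first of these two expressions has only a real part, is independent of spatial position,
and gives the time-average value of the Poynting vector. It is interesting to note that the
average power crossing unit transverse area is equal to the average energy stored in a
volume $c$ units long and unit area in cross section. Since the waves are propagating at a
speed $c$, this is a most reasonable result.
The two integrals at the beginning of this example may be applied to the case for which
$V$ is a rectangular volume of square and unit cross section, one-quarter wavelength long
in the Z direction. Then the first integral becomes

$$\tfrac{1}{2}\mathcal{R}e\int_{S} \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} = c\left(\tfrac{1}{2}\epsilon_0\mathcal{E}_0^2\right)\int_{S} \mathbf{1}_z\cdot d\mathbf{S}$$

This surface integral has contributions only over the two transverse surfaces, these contributions
being equal and opposite, thus giving the required null result.
For the second integral,

$$c\left(\tfrac{1}{2}\epsilon_0\mathcal{E}_0^2\right)\int_{S} \mathbf{1}_z\cos 2(\omega t - kz)\cdot d\mathbf{S} = -c\,\epsilon_0\mathcal{E}_0^2\cos 2\omega t$$

$$-\frac{d}{dt}\int_{V} \left(\tfrac{1}{2}\epsilon_0\mathcal{E}_0^2\right)\cos 2(\omega t - kz)\, dV = -\frac{d}{dt}\left(\frac{\epsilon_0}{2k}\,\mathcal{E}_0^2\sin 2\omega t\right) = -\frac{\omega}{k}\,\epsilon_0\mathcal{E}_0^2\cos 2\omega t$$

and thus the two sides of the second equation of this example are seen to agree.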
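The agreement found for the quarter-wavelength volume may also be verified numerically. The sketch below evaluates the oscillating flux through the two transverse faces and compares it with the negative time derivative of the oscillating stored energy, the latter obtained by a midpoint rule in $z$ and a central difference in $t$. The frequency, field amplitude, time step, and function names are assumed values chosen only for illustration.

```python
import math

EPS0 = 8.8541878128e-12   # permittivity of free space, F/m
C = 299_792_458.0         # speed of light in free space, m/s

def surface_flux(e0, omega, t):
    """Oscillating outward flux through a unit-cross-section box, 0 <= z <= lambda/4."""
    k = omega / C
    quarter_wave = 0.5 * math.pi / k

    def s_z(z):
        # z component of (1/2) Re(E x H0) for the plane wave of this example
        return C * 0.5 * EPS0 * e0 ** 2 * math.cos(2.0 * (omega * t - k * z))

    # Outward normal is +1_z on the far face and -1_z on the near face.
    return s_z(quarter_wave) - s_z(0.0)

def oscillating_energy(e0, omega, t, n=4000):
    """Midpoint-rule integral of (1/2) eps0 E0^2 cos 2(wt - kz) over the box."""
    k = omega / C
    h = (0.5 * math.pi / k) / n
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h
        total += 0.5 * EPS0 * e0 ** 2 * math.cos(2.0 * (omega * t - k * z))
    return total * h

def minus_energy_rate(e0, omega, t, dt):
    """Central-difference estimate of -d/dt of the oscillating stored energy."""
    later = oscillating_energy(e0, omega, t + dt)
    earlier = oscillating_energy(e0, omega, t - dt)
    return -(later - earlier) / (2.0 * dt)

# Assumed illustrative values: 1 GHz, unit field amplitude, t = 0.1 ns.
omega = 2.0 * math.pi * 1.0e9
e0, t = 1.0, 1.0e-10
flux = surface_flux(e0, omega, t)
rate = minus_energy_rate(e0, omega, t, dt=1.0e-14)
expected = -C * EPS0 * e0 ** 2 * math.cos(2.0 * omega * t)
```

The three quantities `flux`, `rate` and `expected` should coincide to within the accuracy of the numerical differentiation.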

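All the principal features of the preceding discussion of Poynting's theorem for time-harmonic
fields may be retained by deriving a complex form of Equation (5.84). If
one assumes that all fields and currents are expressible as complex vector functions of
position, multiplied by the time factor $e^{j\omega t}$, one can let

$$d^3\mathcal{P} = -\mathbf{E}\cdot\boldsymbol{\iota}^{i*}\, dV \tag{5.93}$$

be defined as the complex power being supplied by the sources to the field. This concept
of complex power will require and receive subsequent interpretation. Then, since

$$\boldsymbol{\iota}^{i*} = \nabla\times\mathbf{H}_0^* - \frac{\partial\mathbf{D}_0^*}{\partial t} - \boldsymbol{\iota}^{r*}
= \nabla\times\mathbf{H}_0^* + j\omega\mathbf{D}_0^* - \boldsymbol{\iota}^{r*}$$

one may write

$$d^3\mathcal{P} = \left[-\mathbf{E}\cdot\nabla\times\mathbf{H}_0^* - j2\omega\left(\tfrac{1}{2}\epsilon_0\mathbf{E}\cdot\mathbf{E}^*\right) + \mathbf{E}\cdot\boldsymbol{\iota}^{r*}\right] dV$$

Once again, use of (V.108) and (5.25a) gives

$$-\mathbf{E}\cdot\nabla\times\mathbf{H}_0^* = \nabla\cdot(\mathbf{E}\times\mathbf{H}_0^*) + \mathbf{H}_0^*\cdot\left(\frac{\partial\mathbf{B}}{\partial t}\right)
= \nabla\cdot(\mathbf{E}\times\mathbf{H}_0^*) + j2\omega\left(\tfrac{1}{2}\mu_0^{-1}\mathbf{B}\cdot\mathbf{B}^*\right)$$

and therefore

$$d^3\mathcal{P} = \left[j2\omega\left(\tfrac{1}{2}\mu_0^{-1}\mathbf{B}\cdot\mathbf{B}^* - \tfrac{1}{2}\epsilon_0\mathbf{E}\cdot\mathbf{E}^*\right) + \mathbf{E}\cdot\boldsymbol{\iota}^{r*} + \nabla\cdot(\mathbf{E}\times\mathbf{H}_0^*)\right] dV \tag{5.94}$$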
If this expression is integrated the result is

$$\mathcal{P} = j4\omega(W_m - W_e) + \int_{V} \mathbf{E}\cdot\boldsymbol{\iota}^{r*}\, dV + \int_{S} \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} \tag{5.95}$$

in which, by virtue of (5.88) and (5.89), $W_m$ and $W_e$ are the time-average values of the
total energies stored in the magnetic and electric fields in the volume $V$.
One-half the real part of (5.95) is seen to be identical with (5.91) and gives the time-average
power being delivered by the sources, that is,

$$\bar{P} = \tfrac{1}{2}\mathcal{R}e\;\mathcal{P} \tag{5.96}$$

No equivalent simple interpretation may be placed upon the imaginary part of (5.95)
in the general case. However, in regions away from the sources

$$\tfrac{1}{2}\mathcal{I}m\int_{S} \mathbf{E}\times\mathbf{H}_0^*\cdot d\mathbf{S} = 2\omega(W_e - W_m) \tag{5.97}$$

Example 5.5 contained an illustration of (5.97) in that $\mathbf{E}\times\mathbf{H}_0^*$ did not have an imaginary
part and the time-average values of electric and magnetic stored energies were
found to be equal.
Because of the utility of the preceding formulation, it is customary when dealing
with time-harmonic fields to define a complex Poynting vector by the relation

$$\boldsymbol{\mathcal{P}} = \mathbf{E}\times\mathbf{H}_0^* \tag{5.98}$$

from whence it follows, by use of (5.87), that the time-average value of energy flow
at a point, per unit transverse area, is given by

$$\bar{\boldsymbol{\mathcal{P}}} = \tfrac{1}{2}\mathcal{R}e\;\boldsymbol{\mathcal{P}} \tag{5.99}$$

EXAMPLE 5.6
In the application of Equation (5.95) to radiation problems, all points of the surface $S$
are customarily remote from the currents, so that (5.73) and (5.74) are applicable and

$$\boldsymbol{\mathcal{P}} = \mathbf{1}_r\,\frac{\eta k^2}{(4\pi r)^2}\left(a_\theta a_\theta^* + a_\phi a_\phi^*\right) \tag{5.100}$$

in which $a_\theta$ and $a_\phi$ are components of $\mathbf{a}(\theta,\phi)$ as given by (5.76).

It should be noted that $\boldsymbol{\mathcal{P}}$ in (5.100) has only a real component and is therefore twice the
time-average energy flow in watts per square meter. At a fixed large distance $r$ from the system
of currents, $\boldsymbol{\mathcal{P}}$ is a function of $\theta$ and $\phi$ and is known as the power radiation pattern.
The directional dependence of $\boldsymbol{\mathcal{P}}$ is controlled by the current distribution through (5.76).
If $\boldsymbol{\mathcal{P}}(\theta,\phi)$ is specified, (5.76) becomes an integral equation involving the sought-for current
distribution $\boldsymbol{\iota}(\xi,\eta,\zeta)$; this defines an antenna synthesis problem. If $\boldsymbol{\iota}(\xi,\eta,\zeta)$ is specified,
(5.76) is an integral solution for the power pattern $\boldsymbol{\mathcal{P}}(\theta,\phi)$; this defines an antenna analysis
problem. These subjects have been treated extensively in the literature.$^{12}$
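For the specific case of the half-wave dipole of Example 5.2, since

$$\mathbf{a} = -\mathbf{1}_\theta\,\frac{2I_m}{k}\,\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}$$

use of (5.99) and (5.100) gives

$$\bar{\boldsymbol{\mathcal{P}}} = \tfrac{1}{2}\mathcal{R}e\;\boldsymbol{\mathcal{P}} = \mathbf{1}_r\,\frac{\eta I_m^2}{8\pi^2r^2}\left[\frac{\cos\left(\frac{\pi}{2}\cos\theta\right)}{\sin\theta}\right]^2$$

which agrees with the result found in Example 5.4.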

5.10 SOLUTIONS TO THE WAVE EQUATION IN RECTANGULAR COORDINATES-UNGUIDED WAVES

Equations (5.35) may be looked upon as dynamic analogs to Poisson's equation. The
developments in Sections 5.5 and 5.6 have revealed that solutions to these equations
in regions remote from the currents may have a wavelike nature. But at points not
occupied by currents, (5.35a) and (5.35b) reduce to

$$\nabla\times\nabla\times\mathbf{E} + \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0 \tag{5.101a}$$

$$\nabla\times\nabla\times\mathbf{B} + \frac{1}{c^2}\frac{\partial^2\mathbf{B}}{\partial t^2} = 0 \tag{5.101b}$$

and these homogeneous vector differential equations may be likened to Laplace's
equation. Because the general solutions to (5.101) are wavelike (as will be seen shortly),
(5.101a) and (5.101b) are customarily referred to as the homogeneous vector wave
equations.

12 See, e.g., H. Jasik, ed., Antenna Engineering Handbook, McGraw-Hill Book Company, New York,
1961. Also, R. C. Hansen, ed., Microwave Scanning Antennas, Academic Press, New York, 1964.

Since $\nabla\cdot\mathbf{E} = 0$, $\nabla\cdot\mathbf{B} = 0$ away from the currents, these equations further reduce to

$$\nabla^2\mathbf{E} - \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = 0 \tag{5.102a}$$

$$\nabla^2\mathbf{B} - \frac{1}{c^2}\frac{\partial^2\mathbf{B}}{\partial t^2} = 0 \tag{5.102b}$$

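If $f$ is any component of $\mathbf{E}$ or $\mathbf{B}$, then in rectangular coordinates

$$ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} + \frac{\partial^2 f}{\partial (jct)^2} = 0 \tag{5.103} $$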

and this is seen to be a four-dimensional form of Laplace's equation. Using the method
of separation of variables, in exactly the same manner that it was employed in Section
3.11, one obtains as a primitive solution of (5.103)

$$ f = e^{j\omega t - j\mathbf{k}\cdot\mathbf{r}} \tag{5.104} $$

with $\mathbf{r}$ drawn from the origin to the point $(x,y,z)$ and

$$ \mathbf{k} = \mathbf{1}_x k_x + \mathbf{1}_y k_y + \mathbf{1}_z k_z \tag{5.105} $$

$$ k^2 = k_x^2 + k_y^2 + k_z^2 = \frac{\omega^2}{c^2} \tag{5.106} $$

The solution (5.104) is recognized as representing a uniform plane wave, in that all
points in a plane transverse to $\mathbf{k}$ have the same amplitude and a common phase. The
wave propagates in the direction of $\mathbf{k}$ at the velocity of light and has a wavelength $\lambda$
given by

$$ \lambda = \frac{2\pi}{k} \tag{5.107} $$
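As a rough numerical illustration (not from the original text), the primitive solution (5.104) can be tested against the scalar wave equation by central differences. The function names, the sample wave vector, and the step size below are all illustrative choices.

```python
import cmath
import math

C = 299792458.0  # velocity of light in m/s

def plane_wave(kx, ky, kz):
    """Return f(x, y, z, t) = exp(j*omega*t - j*k.r), with omega = c*|k|, plus omega and wavelength."""
    k = math.sqrt(kx * kx + ky * ky + kz * kz)
    omega = C * k

    def f(x, y, z, t):
        return cmath.exp(1j * (omega * t - (kx * x + ky * y + kz * z)))

    return f, omega, 2.0 * math.pi / k

def wave_residual(f, x, y, z, t, h):
    """Central-difference estimate of laplacian(f) - (1/c^2) d2f/dt2 at one point."""
    f0 = f(x, y, z, t)
    lap = (f(x + h, y, z, t) + f(x - h, y, z, t)
           + f(x, y + h, z, t) + f(x, y - h, z, t)
           + f(x, y, z + h, t) + f(x, y, z - h, t) - 6.0 * f0) / h**2
    dt = h / C  # time step matched to the spatial step
    d2t = (f(x, y, z, t + dt) + f(x, y, z, t - dt) - 2.0 * f0) / dt**2
    return lap - d2t / C**2

f, omega, lam = plane_wave(3.0, 4.0, 12.0)  # |k| = 13 rad/m
res = wave_residual(f, 0.1, 0.2, 0.3, 1e-9, 1e-3)
print(f"wavelength = {lam:.4f} m, |residual| = {abs(res):.2e} (compare k^2 = 169)")
```

The residual is small compared with $k^2|f|$ and shrinks as the step `h` is reduced, which is what one expects of a discretization error rather than a failure of the solution.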
If attention is restricted to uniform plane waves propagating in the positive and
negative X directions, Equation (5.104) indicates the fundamental solution

$$ f(x,t) = a_\omega e^{j\omega(t - x/c)} + b_\omega e^{j\omega(t + x/c)} \tag{5.108} $$

in which $a_\omega$ and $b_\omega$ are constants.
By linear superposition, and with the aid of the Fourier integral theorem, a general
solution may be constructed from (5.108) of the form

$$ f(x,t) = f_1(x - ct) + f_2(x + ct) \tag{5.109} $$

in which $f_1$ and $f_2$ are arbitrary functions. The forms of these functions show clearly
that any spatial waveform existing at a time $t_1$ is preserved and merely displaced a
distance $c(t_2 - t_1)$ at a later time $t_2$, indicating undistorted propagation at the velocity
of light.
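The displacement property of (5.109) can be seen in a few lines. This is a small sketch with an arbitrarily chosen Gaussian for $f_1$ and $f_2$ set to zero; neither choice comes from the text.

```python
import math

C = 299792458.0  # velocity of light in m/s

def f1(u):
    """Arbitrary waveform chosen for illustration (a Gaussian pulse)."""
    return math.exp(-u * u)

def f(x, t):
    """Forward-traveling part of (5.109), with f2 taken as zero."""
    return f1(x - C * t)

t1, t2 = 0.0, 1.0e-8
shift = C * (t2 - t1)  # displacement predicted by the text
xs = [0.25 * i - 5.0 for i in range(41)]

# Compare the waveform at t1 with the waveform at t2 read off at shifted positions.
max_diff = max(abs(f(x, t1) - f(x + shift, t2)) for x in xs)
print(f"shift = {shift:.4f} m, largest mismatch = {max_diff:.2e}")
```

The mismatch is at the level of floating-point rounding: the pulse at the later time is the earlier pulse moved bodily through $c(t_2 - t_1)$.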
If waves propagating in all directions are considered, three-dimensional Fourier
integrals may be used to fabricate arbitrary spatial distributions. In particular, a
solution

$$ f(x,y,z,t) = g_1(y,z)f_1(x - ct) + g_2(y,z)f_2(x + ct) \tag{5.110} $$

may be constructed with $g_1$ and $g_2$ arbitrary functions. This is seen to represent
nonuniform plane waves propagating in the X direction at the velocity of light, with
arbitrary amplitude distributions in a transverse plane. By insertion of (5.110) in
(5.103) it is evident that $g_1$ and $g_2$ satisfy the two-dimensional Laplace's equation.
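If one returns to the constituent solution (5.104), which applies for any component
of $\mathbf{E}$ or $\mathbf{B}$, it follows that on putting the components together, a uniform plane wave
may be represented by

$$ \mathbf{E}(x,y,z,t) = \mathbf{1}_E E_0 e^{j\omega t - j\mathbf{k}\cdot\mathbf{r}} \tag{5.111} $$

$$ \mathbf{B}(x,y,z,t) = \mathbf{1}_B B_0 e^{j\omega t - j\mathbf{k}\cdot\mathbf{r}} \tag{5.112} $$

wherein $\mathbf{1}_E$ and $\mathbf{1}_B$ are unit vectors and $E_0$ and $B_0$ are complex constants. Since $\nabla\cdot\mathbf{E} \equiv 0$
and $\nabla\cdot\mathbf{B} \equiv 0$, it follows that both $\mathbf{1}_E$ and $\mathbf{1}_B$ must be transverse to $\mathbf{k}$, and for this
reason the electromagnetic wave is said to be transverse.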
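$\mathbf{E}$ and $\mathbf{B}$ are related through Maxwell's equations, and insertion of (5.111) and
(5.112) in (5.25a) gives

$$ (\mathbf{k}\times\mathbf{1}_E)E_0 - \mathbf{1}_B\,\omega B_0 = 0 $$

which requires that

$$ \mathbf{1}_B = \mathbf{1}_k\times\mathbf{1}_E \tag{5.113} $$

$$ B_0 = \frac{k}{\omega}E_0 = \frac{E_0}{c} \tag{5.114} $$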

Therefore the E and B fields are crossed, both being transverse to the direction of
propagation, and their amplitudes are in the ratio c. These properties are held in
common with spherical waves at great distances from the sources, as has already been
noted in Section 5.7. But this is hardly surprising, since spherical waves at great radii
of curvature are well-approximated by plane waves.
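Relations (5.113) and (5.114) translate directly into a small helper. The sketch below is illustrative only; the function names and the transversality tolerance are assumptions, not part of the text.

```python
import math

C = 299792458.0  # velocity of light in m/s

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def magnetic_partner(k_vec, e_dir, e0):
    """Return (1_B, B0) for a uniform plane wave, following (5.113) and (5.114)."""
    k_hat = unit(k_vec)
    e_hat = unit(e_dir)
    if abs(dot(k_hat, e_hat)) > 1e-12:
        raise ValueError("the electric field must be transverse to k")
    return cross(k_hat, e_hat), e0 / C

# Wave traveling along +Z with E along X: B points along +Y with amplitude E0/c.
b_hat, b0 = magnetic_partner((0.0, 0.0, 2.0), (1.0, 0.0, 0.0), 3.0)
print(b_hat, b0)
```

A non-transverse choice of `e_dir` is rejected, mirroring the requirement that $\mathbf{1}_E$ be perpendicular to $\mathbf{k}$.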
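The power density of this uniform plane wave is given by

$$ \mathcal{P} = \tfrac{1}{2}\,\mathrm{Re}\,\mathbf{E}\times\mathbf{H}_0^* = c\,\mathbf{1}_k\left(\tfrac{1}{2}\epsilon_0|E_0|^2\right) \tag{5.115} $$

and many of the remarks put forth in Example 5.5, in which a plane wave approximation
was made, are applicable to this case.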
By linear superposition, the above results for a uniform plane wave may be general-
ized to the case of a nonuniform plane wave through use of Fourier integrals. The devel-
opment parallels what has already been said for a single component of the field.
EXAMPLE 5.7
Consider a uniform plane wave traveling in the +X direction and imagine that it encounters
a flat conducting surface in the plane $x = 0$. If it is a good conductor, practically
all the energy in the incident wave will be reflected. This situation may be idealized by
assuming that the conductor is a perfect reflector, meaning that $\tfrac{1}{2}\,\mathrm{Re}\,\mathbf{E}\times\mathbf{H}_0^* \equiv 0$ within
the conductor.
Assume that the Y axis is oriented parallel to the incident electric field $\mathbf{E}^i$. Then from
(5.111)

$$ \mathbf{E}^i(x,t) = \mathbf{1}_y E_0^i e^{j\omega t - jkx} $$

whereas from (5.12) and (5.13),

$$ \mathbf{B}^i(x,t) = \mathbf{1}_z \frac{E_0^i}{c} e^{j\omega t - jkx} $$

Equation (5.115) gives for the incident power density

$$ \mathcal{P}^i = c\,\mathbf{1}_x\left(\tfrac{1}{2}\epsilon_0|E_0^i|^2\right) $$

Let the reflected wave be represented by

$$ \mathbf{E}^r(x,t) = \mathbf{1}_y E_0^r e^{j\omega t + jkx} $$

Then, since the power flow for the reflected wave must be in the -X direction, Equations
(5.12) and (5.13) give

$$ \mathbf{B}^r(x,t) = -\mathbf{1}_z \frac{E_0^r}{c} e^{j\omega t + jkx} $$

Because no field exists inside the idealized conductor, the total electric field at $x = 0^-$
must vanish, so that

$$ \mathbf{E}^i(0^-,t) + \mathbf{E}^r(0^-,t) = \mathbf{1}_y e^{j\omega t}(E_0^i + E_0^r) \equiv 0 $$

which requires that $E_0^r = -E_0^i$. This in turn satisfies the condition that the power density
in the reflected wave equals the incident power density.
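The total magnetic field just in front of the idealized conductor is therefore

$$ \mathbf{B}^i(0^-,t) + \mathbf{B}^r(0^-,t) = \mathbf{1}_z e^{j\omega t}\left(\frac{E_0^i}{c} - \frac{E_0^r}{c}\right) = \mathbf{1}_z \frac{2E_0^i}{c} e^{j\omega t} $$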

whereas the total magnetic field just inside the conductor is zero. This discontinuity in the
magnetic field is accommodated by a sheet of current which flows in the surface of the
conductor. This current sheet has been induced by the incident wave and is the source of the
reflected wave. Its strength may be deduced by recourse to the dynamic form of Ampere's
circuital law, (5.28b). With reference to the figure, if a rectangular contour is chosen in the
XZ plane such that its long legs, of length $l$, are just inside and just outside the conducting
surface, then

[Figure: incident and reflected waves in free space meeting the surface of the conductor; X and Z axes shown.]

$$ \oint \mathbf{H}_0\cdot d\mathbf{l} = 2l\,\mu_0^{-1}\,\frac{E_0^i}{c}\,e^{j\omega t} $$

This must equal the total conduction current enclosed by the contour, since $\dot{\mathbf{D}}_0$ will make
no contribution if the short legs of the contour are reduced to infinitesimals. Therefore,
the total conduction current enclosed is

$$ I = 2lc\,\epsilon_0 E_0^i e^{j\omega t} $$

and a lineal current density $j = I/l$ flows in the conducting surface such that

$$ \mathbf{j} = \mathbf{1}_y\,2c\,\epsilon_0 E_0^i e^{j\omega t} $$
In causing the flow of energy in the electromagnetic wave to be turned around, the
conductor suffers a reaction which may be computed from the Ampere force law. Since

$$ d^3\mathbf{F} = \boldsymbol{\iota}\times\mathbf{B}\,dV $$

if the areal current density $\boldsymbol{\iota}$ is "collapsed" into the surface to give a lineal current density $\mathbf{j}$,
one obtains

$$ d^2\mathbf{F} = \mathbf{j}\times\mathbf{B}\,dS $$

so that the conductor experiences a pressure due to the wave given by

$$ \mathbf{p} = \frac{d^2\mathbf{F}}{dS} = \mathbf{j}\times\mathbf{B} $$

Since the sheet of current is immersed in a magnetic field whose spatial average value is
$\mathbf{1}_z(E_0^i/c)e^{j\omega t}$, the pressure is

and this has an average value

$$ \bar{\mathbf{p}} = \mathbf{1}_x\,2\left(\tfrac{1}{2}\epsilon_0|E_0^i|^2\right) $$

which is twice the energy density in the wave (cf. Example 5.5).
This result is consistent with the viewpoint that the energy possessed by the incident
wave in a column $c$ units long and unit cross section is $c\left(\tfrac{1}{2}\epsilon_0|E_0^i|^2\right)$. According to the
mass-energy equivalence formula, this may be equated to $mc^2$. But then this much of the wave
has a momentum equal to

$$ mc = \tfrac{1}{2}\epsilon_0|E_0^i|^2 $$

and this much momentum is reversed in 1 sec against unit area of the conductor, causing
a radiation pressure of $2mc$.
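To attach numbers to the result of Example 5.7, the following sketch evaluates the energy density, the incident power density, and the average pressure on a perfect reflector. The function names, the value of `EPS0`, and the sample field strength are illustrative assumptions.

```python
EPS0 = 8.8541878128e-12  # assumed value of the permittivity of free space, F/m
C = 299792458.0          # velocity of light in m/s

def energy_density(e0):
    """Energy density (1/2) eps0 |E0|^2 in J/m^3, as used in Example 5.7."""
    return 0.5 * EPS0 * abs(e0) ** 2

def incident_power_density(e0):
    """Magnitude of the incident power density, c times the energy density."""
    return C * energy_density(e0)

def pressure_on_perfect_reflector(e0):
    """Average radiation pressure in N/m^2: twice the energy density."""
    return 2.0 * energy_density(e0)

e0 = 1000.0  # peak incident field in V/m, an arbitrary illustrative value
u = energy_density(e0)
s = incident_power_density(e0)
p = pressure_on_perfect_reflector(e0)
print(f"u = {u:.3e} J/m^3, power density = {s:.1f} W/m^2, pressure = {p:.3e} N/m^2")
```

Written in terms of the incident power density, the pressure is $2\,|\mathcal{P}^i|/c$, which is the momentum-reversal argument of the example in another form.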

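If one returns to (5.111) and (5.112), which are the expressions for a uniform plane
wave, and the Z axis is chosen in the direction of propagation, then

$$ \mathbf{1}_E E_0 = \mathbf{1}_x E_1 + \mathbf{1}_y E_2 $$

in which $E_1$ and $E_2$ are complex constants. Therefore

$$ \mathbf{E} = \mathbf{1}_x E_1 e^{j(\omega t - kz)} + \mathbf{1}_y E_2 e^{j(\omega t - kz)} \tag{5.116} $$

$$ \mathbf{B} = -\mathbf{1}_x \frac{E_2}{c} e^{j(\omega t - kz)} + \mathbf{1}_y \frac{E_1}{c} e^{j(\omega t - kz)} \tag{5.117} $$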
Inspection of these equations reveals that $(E_x,B_y)$ and $(E_y,B_x)$ are linearly independent
fields. $(E_x,B_y)$ is said to be an X-polarized wave and $(E_y,B_x)$ is called a Y-polarized
wave, the designation referring to the spatial direction of the electric field. The total
field $(\mathbf{E},\mathbf{B})$ is the superposition of these two cross-polarized waves.
If the complex factors $E_1$ and $E_2$ have the same phase, then $\mathbf{E}$ at any point in space
oscillates along a directional line which makes a constant angle $\varphi$ with the X axis, this
angle being given by $\varphi = \tan^{-1}(E_2/E_1)$. Under this condition, the wave is said to be
linearly polarized.
If $E_1$ and $E_2$ have the same magnitude, but their phases differ by 90 deg, then $\mathbf{E}$ at
any point in space does not oscillate. Its magnitude is constant, but its direction rotates
at the angular velocity $\omega$. To see this, let $E_2 = \mp jE_1$, so that (5.116) yields

$$ \mathbf{E}(z,t) = \mathrm{Re}\,(\mathbf{1}_x \mp j\mathbf{1}_y)E_1 e^{j(\omega t - kz)} = E_1[\mathbf{1}_x\cos(\omega t - kz) \pm \mathbf{1}_y\sin(\omega t - kz)] \tag{5.118} $$

in which, for simplicity, the phase of $E_1$ has been chosen as zero. Therefore,

$$ |\mathbf{E}(z,t)| = E_1[\cos^2(\omega t - kz) + \sin^2(\omega t - kz)]^{1/2} = E_1 $$

a result which is independent of time and position. Further, the direction of $\mathbf{E}$ makes an
angle $\varphi$ with the X axis given by

$$ \varphi = \tan^{-1}\frac{\pm\sin(\omega t - kz)}{\cos(\omega t - kz)} = \pm(\omega t - kz) \tag{5.119} $$

At a fixed point $(x,y,z)$, $\varphi$ changes linearly with time at the angular rate $\pm\omega$. If the
thumb of the right hand is placed in the direction of propagation, $\mathbf{E}$ thus either rotates
in the direction indicated by the other fingers, or counter to this direction. When the
rotation of $\mathbf{E}$ agrees with the direction of the fingers ($E_2 = -jE_1$), the wave is said
to be right-handed circularly polarized; if $\mathbf{E}$ rotates counter to the finger direction
($E_2 = +jE_1$), the wave is said to be left-handed circularly polarized.
Alternatively, if time is held fixed, and the direction of $\mathbf{E}$ is viewed as a function of $z$,
Equation (5.119) indicates that the locus of the tip of $\mathbf{E}$ is a helix whose axis is the
Z axis, the $z$ length of one turn being a wavelength. The helix resembles either a left-hand
thread or a right-hand thread, depending on whether $E_2$ lags or leads $E_1$.
The stored energy and the energy flow associated with either a right-handed or left-handed
circularly polarized wave are given by

$$ W_e = \tfrac{1}{2}\epsilon_0|\mathbf{E}(z,t)|^2 = \tfrac{1}{2}\epsilon_0 E_1^2 = W_m \tag{5.120} $$

$$ \mathcal{P} = \mathbf{E}\times\mathbf{H}_0 = \mathbf{1}_z\frac{E_1^2}{\eta} \tag{5.121} $$

Therefore, the energy density and the energy flow are both independent of time and
space, a characteristic not shared by linearly polarized plane waves.
If one returns again to the general solution (5.116), and if $E_1$ and $E_2$ have arbitrary
relative amplitudes and phases, at any point in space the tip of $\mathbf{E}$ describes a locus
which is an ellipse, and for this reason the wave is said to be elliptically polarized. It is
left as an exercise to develop the properties of such plane waves, including the useful
fact that any elliptically polarized wave may be represented by appropriate amounts of
left-handed and right-handed circularly polarized waves.
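The decomposition into circular components mentioned above can be sketched numerically. The code below follows the handedness convention stated in the text ($E_2 = -jE_1$ right-handed, $E_2 = +jE_1$ left-handed); the function names, the tolerance, and the classification labels are illustrative assumptions. It treats a 180 deg phase difference as linear polarization as well.

```python
def circular_components(e1, e2):
    """Split 1x*E1 + 1y*E2 into a_r*(1x - j*1y) + a_l*(1x + j*1y)."""
    a_r = 0.5 * (e1 + 1j * e2)  # right-handed amplitude (E2 = -j E1 gives a_l = 0)
    a_l = 0.5 * (e1 - 1j * e2)  # left-handed amplitude (E2 = +j E1 gives a_r = 0)
    return a_r, a_l

def classify(e1, e2, tol=1e-9):
    """Name the polarization state of the plane wave (5.116)."""
    a_r, a_l = circular_components(e1, e2)
    scale = max(abs(e1), abs(e2))
    if scale == 0.0:
        return "no field"
    if abs(a_l) <= tol * scale:
        return "right-handed circular"
    if abs(a_r) <= tol * scale:
        return "left-handed circular"
    if abs(abs(a_r) - abs(a_l)) <= tol * scale:
        return "linear"  # equal circular amplitudes
    return "elliptical"

for e1, e2 in [(1.0, -1j), (1.0, 1j), (1.0, 2.0), (1.0, 0.5j)]:
    print(e1, e2, "->", classify(e1, e2))
```

Adding the two circular amplitudes recovers $E_1$, and $-j$ times their difference recovers $E_2$, so the split loses no information.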

5.11 RECTILINEAR GUIDED WAVES

Many structures, such as two-wire lines, coaxial cables, waveguides, and dielectric
rods, have been found to possess the property of being able to guide electromagnetic
waves from one point to another. When this guiding occurs along a straight-line path,
the problem is amenable to analysis, for then every component of the electromagnetic
wave may be represented in the form

$$ f(u,v)\,e^{j(\omega t - k_z z)} \tag{5.122} $$

in which $z$ is chosen as the propagation direction and $u$, $v$ are generalized orthogonal
coordinates in a transverse plane. (See Mathematical Supplement, Section V.11.)
Under this assumption, in a source-free region, Maxwell's equations become

$$ \frac{1}{h_2}\frac{\partial E_z}{\partial v} + jk_zE_v = -j\omega B_u $$

$$ -jk_zE_u - \frac{1}{h_1}\frac{\partial E_z}{\partial u} = -j\omega B_v $$

$$ \frac{1}{h_1h_2}\left[\frac{\partial}{\partial u}(h_2E_v) - \frac{\partial}{\partial v}(h_1E_u)\right] = -j\omega B_z \tag{5.123} $$

wherein $h_1$ and $h_2$ are the scale factors associated with $u$ and $v$.
These equations can be solved for the transverse field components, yielding

(5.124)

in which $k^2 = \omega^2\mu_0\epsilon_0 = (\omega/c)^2 = (2\pi/\lambda)^2$ is the square of the free-space wave number.
Equations (5.124) indicate that, in general, the entire electromagnetic field can be
determined from knowledge of the longitudinal components.
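The elimination that expresses the transverse components in terms of the longitudinal ones can be sketched as follows. This is a reconstruction under the assumption that both curl equations, $\nabla\times\mathbf{E} = -j\omega\mathbf{B}$ and $\nabla\times\mathbf{B} = (j\omega/c^2)\mathbf{E}$, are written for fields of the form (5.122); the grouping of terms is illustrative and may differ from the book's (5.124).

```latex
% Transverse parts of curl E = -j\omega B and curl B = (j\omega/c^2) E
\begin{align*}
\frac{1}{h_2}\frac{\partial E_z}{\partial v} + jk_z E_v &= -j\omega B_u, &
-jk_z E_u - \frac{1}{h_1}\frac{\partial E_z}{\partial u} &= -j\omega B_v,\\
\frac{1}{h_2}\frac{\partial B_z}{\partial v} + jk_z B_v &= \frac{j\omega}{c^2} E_u, &
-jk_z B_u - \frac{1}{h_1}\frac{\partial B_z}{\partial u} &= \frac{j\omega}{c^2} E_v.
\end{align*}
% Eliminating B_v between the second and third equations, and B_u between
% the first and fourth, with k^2 = \omega^2/c^2:
\begin{align*}
E_u &= -\frac{1}{k^2-k_z^2}\left(\frac{jk_z}{h_1}\frac{\partial E_z}{\partial u}
      + \frac{j\omega}{h_2}\frac{\partial B_z}{\partial v}\right), &
E_v &= \frac{1}{k^2-k_z^2}\left(-\frac{jk_z}{h_2}\frac{\partial E_z}{\partial v}
      + \frac{j\omega}{h_1}\frac{\partial B_z}{\partial u}\right),\\
B_u &= -\frac{1}{k^2-k_z^2}\left(\frac{jk_z}{h_1}\frac{\partial B_z}{\partial u}
      - \frac{j\omega}{c^2 h_2}\frac{\partial E_z}{\partial v}\right), &
B_v &= \frac{1}{k^2-k_z^2}\left(-\frac{jk_z}{h_2}\frac{\partial B_z}{\partial v}
      - \frac{j\omega}{c^2 h_1}\frac{\partial E_z}{\partial u}\right).
\end{align*}
```

Every transverse component carries the factor $1/(k^2 - k_z^2)$, which is the origin of the special case $k_z = k$ discussed next. With $h_1 = 1$ and $h_2 = r$ the expressions for $E_u$ and $E_v$ take the cylindrical forms quoted in (5.141).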
An important exception to this occurs when the field is propagating in the Z direction
at the velocity of light, for then $k_z = k$ and Equations (5.124) have a pole, unless
$E_z = B_z \equiv 0$. Once again the conclusion is reached that electromagnetic waves
propagating in free space at the velocity of light are transverse.
If this case ($k_z = k$) is pursued further, with the longitudinal field components zero,
Maxwell's equations (5.123) give

$$ E_u = cB_v \qquad E_v = -cB_u \tag{5.125} $$

so that

$$ \mathbf{E}\cdot\mathbf{B} = (\mathbf{1}_u cB_v - \mathbf{1}_v cB_u)\cdot(\mathbf{1}_u B_u + \mathbf{1}_v B_v) \equiv 0 \tag{5.126} $$


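and thus the transverse electric and magnetic fields are orthogonal. In addition, if one
writes

$$ \mathbf{E} = \boldsymbol{\mathcal{E}}(u,v)e^{j(\omega t - kz)} \qquad \mathbf{B} = \boldsymbol{\mathcal{B}}(u,v)e^{j(\omega t - kz)} \tag{5.127} $$

with the implication being that $\boldsymbol{\mathcal{E}}$ and $\boldsymbol{\mathcal{B}}$ are transverse two-dimensional static fields,
then

$$ \nabla\times\boldsymbol{\mathcal{E}} = \begin{vmatrix} \dfrac{\mathbf{1}_u}{h_2} & \dfrac{\mathbf{1}_v}{h_1} & \dfrac{\mathbf{1}_z}{h_1h_2} \\ \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial z} \\ h_1\mathcal{E}_u & h_2\mathcal{E}_v & 0 \end{vmatrix} = 0 \tag{5.128} $$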
Therefore $\boldsymbol{\mathcal{E}}$ may be expressed as the negative gradient of an electrostatic potential
function. For this reason, if any two-dimensional static electric field, such as those
found in Chapter 3, is put in motion at the velocity of light, the result is a valid dynamic
solution to Maxwell's equations. The rich storehouse of solved two-dimensional electrostatic
problems is thus available for consideration in the creation of rectilinear guided
waves.
Furthermore, since $\boldsymbol{\mathcal{B}}$ is orthogonal to $\boldsymbol{\mathcal{E}}$, it follows that the flux lines of $\boldsymbol{\mathcal{B}}$ lie in the
equipotential surfaces of any two-dimensional electrostatic problem. Field maps such
as those in Example 3.29 may be viewed as giving an electrostatic field and its equipotentials,
or alternatively, as the transverse electric and magnetic fields of a propagating
wave.
EXAMPLE 5.8
In Example 3.14, the image principle was used to determine the electrostatic potential
distribution due to two infinitely long, parallel tubular conductors, each of diameter $2a$
and center-to-center spacing $D$, when the upper cylinder contained a net charge $+\varkappa$ coul/m
and the lower cylinder contained a net charge $-\varkappa$ coul/m. With the coordinate system
arranged as in the figure (see next page), the potential was given by

$$ \Phi(x,y) = \frac{\varkappa}{4\pi\epsilon_0}\ln\left\{\frac{x^2 + [y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]^2}{x^2 + [y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]^2}\right\} $$

When the coordinates of a point on the upper conductor were inserted in this expression,
the potential of the upper conductor was deduced. When the same thing was done for the
lower conductor, the potential difference was found to be

$$ V = \frac{\varkappa}{\pi\epsilon_0}\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\} $$

The electrostatic field caused by this static charge distribution may be determined from
$\boldsymbol{\mathcal{E}} = -\nabla\Phi$, giving

$$ \boldsymbol{\mathcal{E}}(x,y) = \frac{V}{2\ln\{D/2a + [(D/2a)^2 - 1]^{1/2}\}}\,\mathbf{f}(x,y) $$

in which

$$ \mathbf{f}(x,y) = \frac{\mathbf{1}_x x + \mathbf{1}_y[y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]}{x^2 + [y - \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]^2} - \frac{\mathbf{1}_x x + \mathbf{1}_y[y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]}{x^2 + [y + \tfrac{1}{2}(D^2 - 4a^2)^{1/2}]^2} $$
If this static electric field is put in motion along the cylinders at the velocity of light
(this assumes the cylinders are perfectly conducting and in a free-space environment),
then the electric field distribution becomes

[Figure: two parallel cylindrical conductors, each of diameter 2a, with center-to-center spacing D; X axis shown.]

$$ \mathbf{E}(x,y,z,t) = \boldsymbol{\mathcal{E}}(x,y)e^{j(\omega t - kz)} = \frac{\mathbf{f}(x,y)\,V e^{j(\omega t - kz)}}{2\ln\{D/2a + [(D/2a)^2 - 1]^{1/2}\}} $$
and a voltage wave can be imagined to travel along the twin cylinders in conjunction with
the electric field.
The accompanying magnetic field may be deduced from Maxwell's equations (5.123) or
from (5.125) and is given by

Using the integral form (5.28b) of Maxwell's second equation, and taking a contour in the
XY plane which coincides with the perimeter of the upper conductor, one is able to determine
the current flow in the upper conductor. Since $\dot{\mathbf{D}}_0 \equiv 0$ within a perfect conductor,
the result is that

$$ I_{\text{enclosed}} = Ie^{j(\omega t - kz)} = \oint\mathbf{H}_0\cdot d\mathbf{l} = \frac{Ve^{j(\omega t - kz)}}{\dfrac{c\mu_0}{\pi}\ln\{D/2a + [(D/2a)^2 - 1]^{1/2}\}} $$

so that the complex current amplitude, $I$, is linearly proportional to the complex voltage
amplitude, $V$. The current in the upper conductor also is seen to be a wave; a counter
current flows in the lower conductor. This two-conductor system, which guides the
electromagnetic wave rectilinearly, is called a two-wire transmission line.
The ratio $V/I$ is of some interest, and is called the characteristic impedance of the
transmission line. It is given by

$$ Z = \frac{V}{I} = 120\ln\left\{\frac{D}{2a} + \left[\left(\frac{D}{2a}\right)^2 - 1\right]^{1/2}\right\}\ \text{ohms} $$

in which the numerical values of $\mu_0$ and $\epsilon_0$ have been inserted. $Z$ is seen to be pure real
and to have a value governed by the geometry of the twin conductors. If a finite length
of this two-wire line is terminated by a lumped resistor of value $R = Z$, a wave traveling
along the wires, upon reaching the resistive termination, will be totally absorbed, since the
voltage-to-current ratio in the resistor is exactly the value required by the wave. If $R \neq Z$,
there must be a reflection.
This procedure may be repeated for a variety of transmission line geometries. Several
cases are included among the problems at the end of this chapter.
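The characteristic impedance formula of Example 5.8 is easily evaluated. The following is a minimal sketch; the function name and the sample dimensions are illustrative, and the factor 120 is taken from the text as the rounded value of $c\mu_0/\pi$.

```python
import math

def two_wire_impedance(spacing, radius):
    """Characteristic impedance in ohms of a two-wire line in free space.

    spacing is the center-to-center distance D and radius is a, in the same units.
    """
    ratio = spacing / (2.0 * radius)
    if ratio <= 1.0:
        raise ValueError("conductors overlap or touch: D must exceed 2a")
    return 120.0 * math.log(ratio + math.sqrt(ratio * ratio - 1.0))

# The logarithm equals the inverse hyperbolic cosine of D/2a, so choosing
# D/2a = cosh(2.5) gives exactly 300 ohms.
a = 1.0e-3                       # conductor radius in meters (illustrative)
d = 2.0 * a * math.cosh(2.5)     # spacing that yields 300 ohms
print(f"D = {d * 1e3:.2f} mm, Z = {two_wire_impedance(d, a):.1f} ohms")
```

Because $Z$ depends only on the ratio $D/2a$, scaling both dimensions together leaves the impedance unchanged.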

Returning to Equations (5.124), if $k_z \neq k$, one can use these equations to find the
transverse-field components if the longitudinal components are known. Since these
equations are linear functions of $E_z$ and $B_z$, partial transverse fields due to $E_z$ alone
may be determined by setting $B_z \equiv 0$. Such fields are called transverse magnetic, or
more briefly TM waves. Similarly, a second set of partial transverse fields due to $B_z$
alone may be determined by setting $E_z \equiv 0$. These fields are called transverse electric,
or TE waves. The most general solution is then an arbitrary sum of the two sets of
partial fields.
Because the spatial derivatives of $\mathbf{1}_u$ and $\mathbf{1}_v$ are transverse, the vector wave equation
(5.102) has a separable Z component (cf. Mathematical Supplement, Section V.16)
which may be written

$$ \frac{1}{h_1h_2}\left(\frac{\partial}{\partial u}\frac{h_2}{h_1}\frac{\partial}{\partial u} + \frac{\partial}{\partial v}\frac{h_1}{h_2}\frac{\partial}{\partial v}\right)E_z + (k^2 - k_z^2)E_z = 0 \tag{5.129a} $$

$$ \frac{1}{h_1h_2}\left(\frac{\partial}{\partial u}\frac{h_2}{h_1}\frac{\partial}{\partial u} + \frac{\partial}{\partial v}\frac{h_1}{h_2}\frac{\partial}{\partial v}\right)B_z + (k^2 - k_z^2)B_z = 0 \tag{5.129b} $$

Solutions to (5.129) which fit the boundary conditions of the problem under considera-
tion may be inserted in (5.124) to determine the transverse-field components, thereby
completing the description of the electromagnetic waves which are traveling along the
guiding structure.
EXAMPLE 5.9
A rectangular waveguide consists of a hollow pipe, usually made of good conductor, with a
rectangular cross section, as shown in the figure. This waveguide will support both TM and
TE waves, as may be seen by the following argument:
In Cartesian coordinates, (5.129a) reduces to

$$ \frac{\partial^2E_z}{\partial x^2} + \frac{\partial^2E_z}{\partial y^2} + (k^2 - k_z^2)E_z = 0 $$

If the walls are good conductors, $E_z \approx 0$ against each wall. If this were not so, the currents
induced in the walls would be so high as not to match properly the tangential magnetic
field. This condition will be modeled by choosing as boundary conditions $E_z \equiv 0$ in each of
the four walls.

[Figure: rectangular waveguide cross section with interior dimensions a and b; X and Z axes shown.]

Then the suitable primitive solution of the above wave equation is

$$ E_z = \frac{2}{\sqrt{ab}}\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}\,e^{j(\omega t - k_zz)} \tag{5.130} $$

in which $m$ and $n$ are independent positive integers (greater than zero). $E_z$, as given by
(5.130), and the four transverse-field components associated with it, determinable from
(5.124), together are called the TM$_{mn}$ mode for rectangular waveguide. In (5.130) the
factor $2/\sqrt{ab}$ is included for normalization purposes, such that

$$ \int_0^a\!\!\int_0^b \psi_{mn}\psi_{rs}\,dx\,dy = \delta_{mn}^{rs} $$

wherein

$$ \psi_{mn} = \frac{2}{\sqrt{ab}}\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b} \tag{5.131} $$

and the Kronecker delta, $\delta_{mn}^{rs}$, equals unity if $r = m$ and $s = n$, being otherwise zero.
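The normalization stated above can be confirmed by direct numerical integration. This is a small sketch with illustrative names and arbitrarily chosen guide dimensions; a midpoint rule is used for the double integral.

```python
import math

def psi(m, n, x, y, a, b):
    """Mode function of (5.131): (2/sqrt(ab)) sin(m pi x/a) sin(n pi y/b)."""
    return (2.0 / math.sqrt(a * b)
            * math.sin(m * math.pi * x / a)
            * math.sin(n * math.pi * y / b))

def overlap(m, n, r, s, a, b, steps=100):
    """Midpoint-rule estimate of the integral of psi_mn * psi_rs over the cross section."""
    dx, dy = a / steps, b / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        for j in range(steps):
            y = (j + 0.5) * dy
            total += psi(m, n, x, y, a, b) * psi(r, s, x, y, a, b)
    return total * dx * dy

a, b = 0.02286, 0.01016  # illustrative interior dimensions in meters
print(f"<11|11> = {overlap(1, 1, 1, 1, a, b):.6f}")
print(f"<12|21> = {overlap(1, 2, 2, 1, a, b):.2e}")
print(f"<21|23> = {overlap(2, 1, 2, 3, a, b):.2e}")
```

Identical index pairs integrate to unity and any mismatch in either index integrates to zero, which is the content of the Kronecker delta.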
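Upon substituting (5.130) in the wave equation, one finds that $k_z = \pm\beta_{mn}$ where

$$ \beta_{mn} = \left[k^2 - \left(\frac{m\pi}{a}\right)^2 - \left(\frac{n\pi}{b}\right)^2\right]^{1/2} \tag{5.132} $$

and this mode will propagate only if

$$ \left(\frac{2\pi}{\lambda}\right)^2 > \left(\frac{m\pi}{a}\right)^2 + \left(\frac{n\pi}{b}\right)^2 $$

If this condition is not met, the mode will attenuate exponentially. For given interior
dimensions $(a,b)$, the higher the values of $m$ and $n$, the shorter must be the free-space
wavelength $\lambda$ in order to achieve propagation.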
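The propagation condition can be explored with a short calculation. In this sketch the axial wave number is taken as $[k^2 - (m\pi/a)^2 - (n\pi/b)^2]^{1/2}$, which follows from inserting (5.130) in the wave equation; the function names and the sample dimensions and frequency are illustrative assumptions.

```python
import cmath
import math

C = 299792458.0  # velocity of light in m/s

def beta_mn(freq, m, n, a, b):
    """Axial wave number for mode (m, n): real when propagating, imaginary when cut off."""
    k = 2.0 * math.pi * freq / C
    kc_squared = (m * math.pi / a) ** 2 + (n * math.pi / b) ** 2
    return cmath.sqrt(k * k - kc_squared)

def cutoff_frequency(m, n, a, b):
    """Frequency in Hz below which mode (m, n) attenuates instead of propagating."""
    return 0.5 * C * math.sqrt((m / a) ** 2 + (n / b) ** 2)

a, b = 0.02286, 0.01016  # illustrative interior dimensions in meters
freq = 10.0e9            # illustrative operating frequency in Hz
for m, n in [(1, 0), (2, 0), (0, 1), (1, 1)]:
    beta = beta_mn(freq, m, n, a, b)
    state = "propagates" if beta.imag == 0.0 else "attenuates"
    print(f"({m},{n}): cutoff {cutoff_frequency(m, n, a, b) / 1e9:6.3f} GHz, {state}")
```

When the mode is cut off, the physically meaningful root is the one that makes the field decay in the direction of travel; `cmath.sqrt` returns only one of the two roots, so the sign must be chosen to suit the problem.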
The most general solution for $E_z$ consists of a linear superposition of terms like (5.130)
for all possible values of $m$ and $n$, that is,

$$ E_z = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}K_{mn}\psi_{mn}(x,y)\,e^{j(\omega t - \beta_{mn}z)} \tag{5.133} $$

in which the $K_{mn}$ are arbitrary complex constants. Equations (5.124) may be used to
obtain expressions for the four transverse-field components associated with (5.133). The
resulting collection of five equations describes the most general combination of TM modes
traveling in the positive Z direction in a rectangular waveguide. Reversing the sign before
$\beta_{mn}$ will give a similar solution for propagation in the negative Z direction.
In like manner, in Cartesian coordinates (5.129b) reduces to

$$ \frac{\partial^2B_z}{\partial x^2} + \frac{\partial^2B_z}{\partial y^2} + (k^2 - k_z^2)B_z = 0 $$

In order to satisfy the boundary conditions that $E_x \equiv 0$ in the top and bottom walls, and
that $E_y \equiv 0$ in the side walls, the suitable primitive solution of this wave equation is

$$ B_z = \Psi_{mn}(x,y)\,e^{j(\omega t - \beta_{mn}z)} \tag{5.134} $$

in which

$$ \Psi_{mn} = \frac{2}{\sqrt{ab}}\cos\frac{m\pi x}{a}\cos\frac{n\pi y}{b} \tag{5.135} $$

and $m$ and $n$ are independent positive integers. (One or the other can be zero, but not
both.) $\beta_{mn}$ once again is given by (5.132). $B_z$, as given by (5.134), and the four transverse-field
components associated with it, determinable from (5.124), are together called the
TE$_{mn}$ mode for rectangular waveguide.
The most general solution for $B_z$ consists of a linear superposition of terms like (5.134)
for all possible values of $m$ and $n$, that is,

$$ B_z = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}K_{mn}\Psi_{mn}(x,y)\,e^{j(\omega t - \beta_{mn}z)} \tag{5.136} $$

with the $K_{mn}$ arbitrary complex constants. Use of Equations (5.124) will yield expressions
for the four transverse-field components associated with (5.136). The resulting collection
of five equations describes the most general combination of TE modes traveling in the
positive Z direction in a rectangular waveguide. Reversing the sign before $\beta_{mn}$ will give a
similar solution for propagation in the negative Z direction.
A study of (5.132) reveals that, if $a > b$, the selection $m = 1$, $n = 0$ yields the lowest
possible value of $k$ which will permit propagation. Therefore the TE mode for which $m = 1$,
$n = 0$ (i.e., the TE$_{10}$ mode) will propagate at a lower frequency (longer free-space
wavelength) than any other TE mode, and than any TM mode. This means that, at a given
frequency, it is possible to choose $a$ and $b$ such that only the TE$_{10}$ mode will propagate.
It is left as an exercise to show that to ensure this condition, $\lambda/2 < a < \lambda$, $b < \lambda/2$.
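The single-mode condition can be checked by brute force, without giving away the exercise. The sketch below lists every mode whose cutoff wavelength, taken here as $2/[(m/a)^2 + (n/b)^2]^{1/2}$, exceeds the operating wavelength, and compares the result with the stated inequalities. Function names, the index limit, and the sample dimensions are illustrative; $a > b$ is assumed.

```python
import math

def propagating_modes(wavelength, a, b, max_index=5):
    """List the (kind, m, n) modes that propagate at the given free-space wavelength."""
    modes = []
    for m in range(max_index + 1):
        for n in range(max_index + 1):
            if m == 0 and n == 0:
                continue  # excluded for both mode families
            cutoff_wavelength = 2.0 / math.sqrt((m / a) ** 2 + (n / b) ** 2)
            if wavelength < cutoff_wavelength:
                # TM modes need both indices nonzero; TE modes allow one zero index.
                kinds = ["TE"] if (m == 0 or n == 0) else ["TE", "TM"]
                modes.extend((kind, m, n) for kind in kinds)
    return modes

def only_te10(wavelength, a, b):
    """The condition quoted in the text, assuming a > b."""
    return wavelength / 2.0 < a < wavelength and b < wavelength / 2.0

a, b = 0.02286, 0.01016  # illustrative interior dimensions in meters
for wavelength in (0.03, 0.02):
    print(wavelength, only_te10(wavelength, a, b), propagating_modes(wavelength, a, b))
```

At the longer wavelength only the TE$_{10}$ mode appears in the list; at the shorter one the inequalities fail and additional modes propagate.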
Because of this unique feature of the TE$_{10}$ mode, it is particularly useful when
electromagnetic energy must be conveyed with a well-defined field distribution. From (5.136) and
(5.124), the field components of this fundamental mode are given by

$$ E_y = -\frac{j\omega}{\pi/a}K\sin\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)} $$

$$ B_x = \frac{j\beta_{10}}{\pi/a}K\sin\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)} \tag{5.137} $$

$$ B_z = K\cos\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)} $$

with $K$ an arbitrary complex constant and

$$ \beta_{10} = \left[k^2 - \left(\frac{\pi}{a}\right)^2\right]^{1/2} \tag{5.138} $$

Inspection of Equations (5.137) reveals that the electric flux lines for the TE$_{10}$ mode
go straight across from one broad wall to another, originating on positive charge and
terminating on negative charge. The instantaneous electric flux pattern at one cross section
and charge distribution along one broad wall are shown in part (a) of the second figure.

[Figure: (a) Electric field and charge distribution; (b) Magnetic field; (c) Current flow.]

The magnetic flux lines are closed loops which lie in planes parallel to the broad walls
and which encircle the y-directed displacement current $\dot{\mathbf{D}}_0$. Some of these flux lines are
shown in part (b) of the second figure. The magnetic flux density against the waveguide
walls is associated with current flow in the walls which may be deduced from the integral
form of Maxwell's second equation, (5.28b). If perfectly conducting walls are assumed,
$\dot{\mathbf{D}}_0 \equiv 0$ within the conductor, and a lineal current density flows in the walls of amount

$$ \mathbf{j} = \mathbf{1}_n\times\mathbf{H}_0 \tag{5.139} $$

in which $\mathbf{1}_n$ is a unit vector normal to the wall and pointing into the interior of the
waveguide. Part (c) of the second figure shows an instantaneous plot of some of the current lines.

5.12 SOLUTIONS TO THE WAVE EQUATION IN CYLINDRICAL COORDINATES

Equations (5.102) were used as the point of departure in deducing wavelike solutions
to the field equations in rectangular coordinates. This was a relatively direct procedure
because of the simple form taken by the Laplacian of a vector in Cartesian frames of
reference. However, reinspection of Equation (4.52) reveals that the task is not so
simple in circular cylindrical coordinates, except when dealing with an axial field
component. For this reason, the approach to be adopted in the following study of
cylindrical waves assumes that

$$ \mathbf{E}(r,\phi,z,t) = \boldsymbol{\mathcal{E}}(r,\phi)\,e^{j\omega t - jk_zz} \tag{5.140a} $$

$$ \mathbf{B}(r,\phi,z,t) = \boldsymbol{\mathcal{B}}(r,\phi)\,e^{j\omega t - jk_zz} \tag{5.140b} $$

in which $k_z$ is the wave number in the Z direction and $\omega$ is the angular frequency. By
treating $k_z$ as a parameter, traveling or standing waves in the Z direction may be
represented by appropriate linear combinations of fields of the type given by (5.140a)
and (5.140b). Use of the Fourier integral theorem will embrace a still wider variety of
physically realizable distributions within this formulation.
With $\mathbf{E}$ and $\mathbf{B}$ assumed in the form (5.140), the analysis of Section 5.11 is pertinent,
and in this coordinate geometry (5.124) gives

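$$ \mathcal{E}_r = -\frac{1}{k^2 - k_z^2}\left(jk_z\frac{\partial\mathcal{E}_z}{\partial r} + \frac{j\omega}{r}\frac{\partial\mathcal{B}_z}{\partial\phi}\right) $$

$$ \mathcal{E}_\phi = \frac{1}{k^2 - k_z^2}\left(-\frac{jk_z}{r}\frac{\partial\mathcal{E}_z}{\partial\phi} + j\omega\frac{\partial\mathcal{B}_z}{\partial r}\right) \tag{5.141} $$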
so that, if the longitudinal field components can be found, the transverse components
are generally determinable.
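The wave equations (5.129) are applicable and in cylindrical coordinates give

$$ \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\mathcal{E}_z}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\mathcal{E}_z}{\partial\phi^2} + (k^2 - k_z^2)\mathcal{E}_z = 0 $$

$$ \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial\mathcal{B}_z}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\mathcal{B}_z}{\partial\phi^2} + (k^2 - k_z^2)\mathcal{B}_z = 0 \tag{5.142} $$

By assuming either $\mathcal{E}_z$ or $\mathcal{B}_z$ to be expressible in the form $f_1(r)f_2(\phi)$, one achieves the
separation

$$ r\frac{d}{dr}\left(r\frac{df_1}{dr}\right) + [(k^2 - k_z^2)r^2 - k_\phi^2]f_1 = 0 \tag{5.143} $$

$$ \frac{d^2f_2}{d\phi^2} + k_\phi^2f_2 = 0 \tag{5.144} $$

with $k_\phi^2$ a separation constant.

If $k_\phi$ is limited to the integral values $n = 0, \pm1, \pm2, \ldots$ (which provides a complete
Fourier series), then letting $v = \sqrt{k^2 - k_z^2}\,r$ converts (5.143) to

$$ \frac{d^2f_1}{dv^2} + \frac{1}{v}\frac{df_1}{dv} + \left(1 - \frac{n^2}{v^2}\right)f_1 = 0 $$

which is seen to be identical with (3.71) and is recognized as Bessel's differential equation.
This equation and its solutions were discussed in Chapter 3 and Appendix C. The
solutions may be given in many forms, including Bessel functions of the first and second
kinds, modified Bessel functions, and Hankel functions. Because of the asymptotic
forms (3.74) and (3.75), the Hankel functions are particularly convenient when representing
radial waves.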
By virtue of the foregoing, $E_z(r,\phi,z,t)$ and $B_z(r,\phi,z,t)$ may be composed of suitable
products of the factors

$$ f_1(v) = Z_n(v) $$
$$ f_2(\phi) = \{e^{\pm jn\phi}\} $$
$$ f_3(z) = e^{-jk_zz} \tag{5.145} $$
$$ f_4(t) = e^{j\omega t} $$

with $Z_n(v)$ representing suitable cylinder Bessel functions. The usual Fourier techniques
may be used to generalize $f_2$, $f_3$, and $f_4$, and orthogonal expansions are also
available for $f_1$ (cf. Chapter 3).
If the axis $r = 0$ is included, unless it contains sources, $f_1(v)$ must be expressed in
terms of $J_n(v)$ alone. If only a sector in the $\phi$ direction is being considered, $n$ need not
be an integer; the corresponding Bessel functions have a non-integral index and are
unlikely to be tabulated.
An elementary wave function consists of the product of the above four factors with
$n$, $k_z$, and $k$ (or $\omega$) specified, this triplet of numbers serving as identification. More
general solutions may then be given by summing on these three indices. The specific
solutions for $E_z$ and $B_z$ will differ in the values attached to the different elementary
wave functions through imposition of the boundary conditions.
EXAMPLE 5.10
Assume that a tubular sheet of current $\mathbf{1}_z\,je^{j\omega t}$ amp/m flows in the cylindrical surface $r = a$
between the limits $z = \pm\infty$, with $j$ a complex constant. What field does this source create
in the region $r > a$?
Because of the disposition of the currents, $\mathbf{B}$ must be transverse to the Z axis, and the
entire solution may be given in terms of $E_z$. By symmetry, the resultant field must be
independent of $\phi$ and $z$; thus $n = k_z = 0$. Further, at large radial distances

$$ H_0^{(1,2)}(kr) \rightarrow \sqrt{\frac{2}{\pi kr}}\;e^{\pm j(kr - \pi/4)} $$

with the plus sign applying for $H_0^{(1)}$ and the minus sign for $H_0^{(2)}$; since this source system
causes outgoing waves, only $H_0^{(2)}$ need be selected. Therefore

$$ E_z(r,t) = b_0H_0^{(2)}(kr)\,e^{j\omega t} $$

with $k = \omega/c$ and the constant $b_0$ yet to be determined.
Using (5.141), one finds that $E_r \equiv 0$, $E_\phi \equiv 0$, $B_r \equiv 0$, and

and the fields are given by

The ratio of the field components is

$$ \frac{E_z}{B_\phi} = -jc\,\frac{H_0^{(2)}(kr)}{H_1^{(2)}(kr)} $$

and this ratio approaches $-c$ as $r \rightarrow \infty$. This same ratio also has been observed for the
components of rectangular plane waves.
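The time-average power density is

$$ \mathcal{P} = \tfrac{1}{2}\,\mathrm{Re}\,\mathbf{E}\times\mathbf{H}_0^* = \mathbf{1}_r\,\frac{\eta|j|^2}{2}\,\frac{\mathrm{Re}\,[\,jH_0^{(2)}(kr)H_1^{(1)}(kr)\,]}{|H_1^{(2)}(ka)|^2} $$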

At large radial distances this formula may be reduced by using the asymptotic expressions
for the Hankel functions. Further simplification is possible if ka is large, in which case

EXAMPLE 5.11
A circular cylindrical waveguide consists of a hollow conducting tube of inner radius $a$. Find
the general expression for transverse magnetic waves propagating axially inside this tube.
Because the axis $r = 0$ is included in the region of interest, $J_n(v)$ must be chosen as the
radial function and (5.145) gives

$$ E_z = J_n(v)\begin{Bmatrix}\cos n\phi\\ \sin n\phi\end{Bmatrix}e^{j(\omega t - k_zz)} $$

If the walls are assumed to be composed of a perfect conductor, $E_z(a,\phi,z,t) \equiv 0$, which
requires that

$$ J_n\!\left(\sqrt{k^2 - k_z^2}\;a\right) = 0 $$

If the roots of $J_n$ are designated by $\gamma_{n1}, \gamma_{n2}, \ldots, \gamma_{nm}, \ldots$ then

$$ (k^2 - k_z^2)a^2 = \gamma_{nm}^2 $$

and the propagation constant $k_z$ may have a sequence of values given by

$$ \beta_{nm} = k_z = \sqrt{k^2 - \frac{\gamma_{nm}^2}{a^2}} $$

Therefore there is a doubly infinite set of transverse magnetic modes which can exist in a
circular cylindrical waveguide, and these modes are distinguished by the indices $n$, $m$. For
the TM$_{nm}$ mode,

$$ E_z = J_n\!\left(\gamma_{nm}\frac{r}{a}\right)\begin{Bmatrix}\cos n\phi\\ \sin n\phi\end{Bmatrix}e^{j(\omega t - \beta_{nm}z)} $$

with the other field components deducible from (5.141). Whether or not the TM$_{nm}$ mode
will propagate is governed by whether $\beta_{nm}$ is real or imaginary, which is determined by
whether or not $ka$ is greater than $\gamma_{nm}$.
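The roots $\gamma_{nm}$ are tabulated, but they can also be found with nothing more than the standard library. The sketch below evaluates $J_n$ from Bessel's integral, $J_n(x) = \frac{1}{\pi}\int_0^\pi\cos(n\tau - x\sin\tau)\,d\tau$, and locates sign changes by bisection. This is an illustrative numerical route, not the method of the text; the function names, scan step, and tube radius are assumptions.

```python
import math

C = 299792458.0  # velocity of light in m/s

def bessel_j(n, x, steps=400):
    """J_n(x) for integer n from Bessel's integral, using the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        total += math.cos(n * tau - x * math.sin(tau))
    return total * h / math.pi

def bessel_roots(n, count, dx=0.05, x_max=50.0):
    """First `count` positive roots of J_n, found by scanning and bisection."""
    roots = []
    x = dx  # start just above zero so that the root at x = 0 (n > 0) is skipped
    prev = bessel_j(n, x)
    while len(roots) < count and x < x_max:
        x_next = x + dx
        cur = bessel_j(n, x_next)
        if prev * cur < 0.0:
            lo, hi, f_lo = x, x_next, prev
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                f_mid = bessel_j(n, mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            roots.append(0.5 * (lo + hi))
        x, prev = x_next, cur
    return roots

def tm_cutoff_frequency(gamma, radius):
    """Frequency at which ka equals the root gamma, for a tube of the given radius."""
    return gamma * C / (2.0 * math.pi * radius)

gamma_01 = bessel_roots(0, 1)[0]
print(f"gamma_01 = {gamma_01:.6f}")
print(f"TM01 cutoff for a = 1 cm: {tm_cutoff_frequency(gamma_01, 0.01) / 1e9:.3f} GHz")
```

Above the cutoff frequency $ka$ exceeds $\gamma_{nm}$ and $\beta_{nm}$ is real; below it the mode attenuates.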

5.13 SOLUTIONS TO THE WAVE EQUATION IN SPHERICAL COORDINATES

If Equations (5.102) are used as the starting point for the deduction of wavelike solutions
to the field equations in spherical coordinates, a study of Equation (-1 ..14) indicates
the difficulty of the task. All three field components are involved in all three
components of the vector wave equation and separation is possible only in a few particular
situations of symmetry. Still another technique must be found for the solution
of (5.102) by indirect means.

The approach to be followed begins with a study of the scalar wave equation

$$ \nabla^2\Phi - \frac{1}{c^2}\frac{\partial^2\Phi}{\partial t^2} = 0 \tag{5.146} $$

in which $\Phi(r,\theta,\phi,t)$ is a scalar function expressed in spherical coordinates. Letting time
variations be accounted for by writing $\Phi = \Psi(r,\theta,\phi)e^{j\omega t}$ yields

$$ (\nabla^2 + k^2)\Psi = 0 \tag{5.147} $$

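which may be separated by assuming $\Psi = f_1(r)f_2(\theta)f_3(\phi)$. This gives

$$ \frac{\sin^2\theta}{f_1}\frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + \frac{\sin\theta}{f_2}\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + k^2r^2\sin^2\theta + \frac{1}{f_3}\frac{d^2f_3}{d\phi^2} = 0 $$

which breaks into the pieces

$$ \frac{d}{dr}\left(r^2\frac{df_1}{dr}\right) + [k^2r^2 - n(n+1)]f_1 = 0 \tag{5.148} $$

$$ \frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{df_2}{d\theta}\right) + \left[n(n+1) - \frac{m^2}{\sin^2\theta}\right]f_2 = 0 \tag{5.149} $$

$$ \frac{d^2f_3}{d\phi^2} + m^2f_3 = 0 \tag{5.150} $$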
with $m$ and $n$ separation constants. If the field is to be single-valued and a complete
azimuthal region is being considered, $m$ must be an integer; this choice for $m$ also gives
a complete Fourier series representation to $f_3$.
Equation (5.149) was encountered in Section 3.13 and is recognized as being the
associated Legendre equation. If the axis $\theta = 0, \pi$ is to be included, it has finite solutions
$P_n^m(\cos\theta)$ whose properties are described in Appendix D. Restricting $n$ to zero and the
positive integers provides a complete orthogonal set of functions from which to construct $f_2$.
If the substitution $f_1 = (kr)^{-1/2}F(r)$ is made, Equation (5.148) may be transformed to

$$ \frac{d^2F}{dr^2} + \frac{1}{r}\frac{dF}{dr} + \left[k^2 - \frac{(n+\tfrac{1}{2})^2}{r^2}\right]F = 0 \tag{5.151} $$

which may be recognized as Bessel's differential equation, being in the same form as
(3.64). It gives rise to the solutions

$$ f_1(r) = (kr)^{-1/2}Z_{n+1/2}(kr) \tag{5.152} $$

in which $Z_{n+1/2}(kr)$ is a cylinder function of half-order, i.e., an appropriate choice
among Bessel functions of the first and second kinds, Hankel functions, etc. These
functions have the same asymptotic behavior, recurrence relations, and orthogonal
properties as the cylinder functions of whole order (cf. Appendix C).
It is customary to define a spherical Bessel function by the notation

(5.153)
308 Eleciromaqnetics in Free Space CHAPTER 5

and with this terminology one may construct solutions for $\Psi$ by forming products
of the partial solutions

$$f_1(r) = z_n(kr), \qquad f_2(\theta) = P_n^m(\cos\theta), \qquad f_3(\phi) = e^{\pm jm\phi} \tag{5.154}$$
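As a quick numerical sanity check on the separation above, the following sketch evaluates the left side of the radial equation (5.148) for the closed form $j_1(x) = \sin x/x^2 - \cos x/x$ using central differences. The wavenumber `k`, the sample radii, and the step `h` are arbitrary illustrative choices, not values from the text.

```python
import math

def j1(x):
    """Closed form of the spherical Bessel function j_1(x), x > 0."""
    return math.sin(x) / x**2 - math.cos(x) / x

def radial_residual(f, n, k, r, h=1e-4):
    """Left side of (5.148) for f_1(r) = f(k r):
    d/dr(r^2 f_1') + [k^2 r^2 - n(n+1)] f_1, using central differences in r."""
    d1 = (f(k * (r + h)) - f(k * (r - h))) / (2 * h)
    d2 = (f(k * (r + h)) - 2 * f(k * r) + f(k * (r - h))) / h**2
    return r**2 * d2 + 2 * r * d1 + (k**2 * r**2 - n * (n + 1)) * f(k * r)

k = 2.0  # hypothetical wavenumber, rad/m
residuals = [radial_residual(j1, 1, k, r) for r in (0.5, 1.0, 2.5, 4.0)]
print(residuals)  # each entry should be close to zero for n = 1
```

Using the wrong order (for example n = 0 with $j_1$) leaves a residual of order unity, which is an easy way to confirm that the check is not trivially satisfied.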

Next consider the vector function

$$\mathbf{G} = \mathbf{1}_r r \times \nabla\Psi = -\nabla\times(\mathbf{r}\Psi) \tag{5.155}$$

in which $\Psi$ is a solution to the scalar wave equation (5.147). It is shown in Appendix I
that G satisfies (5.102) and is therefore an appropriate solution for either E or B.
By itself, G cannot represent a general field, since it has no component in the r direction.
However, if $\Psi_1$ and $\Psi_2$ are two independent solutions to the scalar wave equation,
then a general solution may be constructed by choosing

$$\mathbf{B}_1 = -\frac{1}{j\omega}\nabla\times\mathbf{E}_1$$

with the total field given by $\mathbf{E} = \mathbf{E}_1 + \mathbf{E}_2$, $\mathbf{B} = \mathbf{B}_1 + \mathbf{B}_2$. In this manner the total field
is expressed as the sum of two partial fields, one of which is TE with respect to the
radial direction, the other being TM. Since $\Psi_1$ and $\Psi_2$ are expressible in terms of
complete sets of orthogonal functions, this is a broadly useful representation.
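EXAMPLE 5.12
A spherical cavity consists of a conducting shell of inner radius a. Find the expressions for
those resonant fields in this cavity which are without an $E_r$ component.
For such fields, $\Psi_1$ must be of the form

$$\Psi_1 = j_n(kr)\,P_n^m(\cos\theta)\begin{Bmatrix}\cos m\phi\\ \sin m\phi\end{Bmatrix}e^{j\omega t}$$

in which $j_n$ has been chosen for the Bessel function to ensure regularity at r = 0. Then

$$E_\theta = -\frac{1}{\sin\theta}\frac{\partial\Psi_1}{\partial\phi} = \mp\frac{jm}{\sin\theta}\Psi_1$$

$$E_\phi = \frac{\partial\Psi_1}{\partial\theta}$$

$$B_r = \frac{n(n+1)}{j\omega r}\Psi_1$$

$$B_\theta = \frac{1}{j\omega r}\frac{\partial}{\partial r}\left(r\frac{\partial\Psi_1}{\partial\theta}\right)$$

$$B_\phi = \frac{1}{j\omega r\sin\theta}\frac{\partial}{\partial r}\left(r\frac{\partial\Psi_1}{\partial\phi}\right)$$

and, if a perfect conductor is assumed, $E_\theta$ and $E_\phi$ must vanish identically at r = a, which
requires that

$$j_n(ka) = 0$$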
SECTION 14 Inductance 309

If $\gamma_{nm}$ is the mth root of $j_n$, such that $j_n(\gamma_{nm}) = 0$, then k has the allowed values

$$k = \frac{\gamma_{nm}}{a}$$

which determines the resonant frequencies for the TE modes in the cavity.
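The allowed values $k = \gamma_{nm}/a$ can be illustrated numerically. The sketch below is a minimal pure-Python root finder, assuming a hypothetical cavity radius of 0.1 m and the free-space relation $f = ck/2\pi$; the upward recurrence used for $j_n$ is adequate for the low orders shown but is not a general-purpose implementation.

```python
import math

C = 299_792_458.0  # speed of light in free space, m/s

def spherical_jn(n, x):
    """Spherical Bessel function j_n(x), x > 0, by upward recurrence
    from the closed forms of j_0 and j_1 (fine for small n)."""
    j_prev = math.sin(x) / x
    if n == 0:
        return j_prev
    j_curr = math.sin(x) / x**2 - math.cos(x) / x
    for m in range(1, n):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr

def roots_of_jn(n, count, x_max=50.0, step=0.01):
    """First `count` positive roots of j_n: scan for sign changes, then bisect."""
    roots = []
    x_lo = step
    f_lo = spherical_jn(n, x_lo)
    while len(roots) < count and x_lo < x_max:
        x_hi = x_lo + step
        f_hi = spherical_jn(n, x_hi)
        if f_lo * f_hi < 0.0:
            a, b, fa = x_lo, x_hi, f_lo
            for _ in range(60):
                mid = 0.5 * (a + b)
                fm = spherical_jn(n, mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
        x_lo, f_lo = x_hi, f_hi
    return roots

radius = 0.1  # hypothetical cavity radius a, metres
for n in (1, 2):
    for gamma in roots_of_jn(n, 2):
        freq = C * gamma / (2 * math.pi * radius)
        print(f"n={n}  gamma={gamma:.6f}  f={freq / 1e9:.4f} GHz")
```

The case n = 0 is omitted because $P_0$ is a constant, so the tangential fields derived from $\Psi_1$ vanish identically.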

5.14 INDUCTANCE

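The results of Section 5.8, concerned with magnetic stored energy, may be expressed
in an alternative manner. When the vector identity (V.108) is employed, if A is the
magnetic vector potential function, such that $\mathbf{B} = \nabla\times\mathbf{A}$, then

$$\nabla\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot\nabla\times\mathbf{A} - \mathbf{A}\cdot\nabla\times\mathbf{B} = \mathbf{B}\cdot\mathbf{B} - \mathbf{A}\cdot\nabla\times\mathbf{B}$$

and (5.80) may be written

$$W_m = \tfrac{1}{2}\mu_0^{-1}\int_V \nabla\cdot(\mathbf{A}\times\mathbf{B})\,dV + \tfrac{1}{2}\mu_0^{-1}\int_V \mathbf{A}\cdot\nabla\times\mathbf{B}\,dV$$

$$\phantom{W_m} = \tfrac{1}{2}\mu_0^{-1}\oint_S (\mathbf{A}\times\mathbf{B})\cdot d\mathbf{S} + \tfrac{1}{2}\mu_0^{-1}\int_V \mathbf{A}\cdot\nabla\times\mathbf{B}\,dV$$

But S may be taken as a sphere at infinity, and since A decreases as $r^{-1}$ and B decreases
as $r^{-2}$, the surface integral is seen to vanish. Therefore

$$W_m = \tfrac{1}{2}\mu_0^{-1}\int_V \mathbf{A}\cdot\nabla\times\mathbf{B}\,dV \tag{5.156}$$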

For static magnetic fields, $\nabla\times\mathbf{B} = \boldsymbol{\iota}/\mu_0^{-1}$ and

$$\mathbf{A} = \int_{V'}\frac{\boldsymbol{\iota}'\,dV'}{4\pi\mu_0^{-1}r} \tag{5.157}$$

wherein primes are used so as to be able to distinguish between contributions to the
integrals in (5.156) and (5.157). Thus the stored magnetic energy may also be expressed
by

$$W_m = \frac{1}{2}\int_V\int_{V'}\frac{\boldsymbol{\iota}\cdot\boldsymbol{\iota}'}{4\pi\mu_0^{-1}r}\,dV\,dV' \tag{5.158}$$

in which r is the distance between the volume elements dV and dV', and the integration
is to be performed twice throughout all of space containing current elements. This
development should be compared to the similar analysis presented for electrostatic
energy in Section 3.19; expressions (3.159) and (5.158) are seen to be completely
analogous.
Equation (5.158) may be applied to current systems of which the prototype is
indicated by Figure 5.4. Shown are two distinct circuital volumes $V_1$ and $V_2$ which
contain steady current distributions, the rest of space being source-free. If $I_1$ is the
total current flow at some reference cross section in $V_1$, and if similarly $I_2$ is the total
current flow at some reference cross section in $V_2$, it will often occur that the current
density at any point in $V_1$ is linearly proportional to $I_1$, whereas the current density
at any point in $V_2$ is linearly proportional to $I_2$. In such cases,

$$\boldsymbol{\iota}(\xi,\eta,\zeta) = I_1\mathbf{f}_1(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ in $V_1$, and

$$\boldsymbol{\iota}(\xi,\eta,\zeta) = I_2\mathbf{f}_2(\xi,\eta,\zeta)$$

for any point $(\xi,\eta,\zeta)$ in $V_2$, with $\mathbf{f}_1$ and $\mathbf{f}_2$ functions which give the normalized current
intensities.

FIGURE 5.4 Self- and mutual inductance.

Under these conditions, (5.158) may be written

$$W_m = \frac{I_1^2}{2}\int_{V_1}\int_{V_1'}\frac{\mathbf{f}_1\cdot\mathbf{f}_1'}{4\pi\mu_0^{-1}r}\,dV\,dV' + I_1I_2\int_{V_1}\int_{V_2'}\frac{\mathbf{f}_1\cdot\mathbf{f}_2'}{4\pi\mu_0^{-1}r}\,dV\,dV' + \frac{I_2^2}{2}\int_{V_2}\int_{V_2'}\frac{\mathbf{f}_2\cdot\mathbf{f}_2'}{4\pi\mu_0^{-1}r}\,dV\,dV' \tag{5.159}$$

wherein $\mathbf{f}_1'$ implies $\mathbf{f}_1(\xi',\eta',\zeta')$, etc. The integrals appearing in (5.159) depend only on
the normalized distributions of the two systems of currents, and for a given conductor
configuration are constants, so that one may write

$$W_m = \tfrac{1}{2}L_{11}I_1^2 + M_{12}I_1I_2 + \tfrac{1}{2}L_{22}I_2^2 \tag{5.160}$$

in which $L_{11}$ and $L_{22}$ are called self-inductances and $M_{12}$ is called a mutual inductance,
their units being given the name henries. This development can be extended readily to
situations in which the volumes $V_1$ and $V_2$ overlap and/or in which there is any number
of separately identifiable volumes containing current systems.
EXAMPLE 5.13
Find the mutual inductance between two coaxial, coplanar filamentary loops of radii a and
b, as shown in the figure. Assume $b \gg a$.
From (5.159),

$$M = \int_{V_1}\int_{V_2'}\frac{\mathbf{f}_1\cdot\mathbf{f}_2'}{4\pi\mu_0^{-1}r}\,dV\,dV'$$

In this case, the current densities may be taken as uniform over the cross-sectional areas
$S_1$ and $S_2$ of the inner and outer loops. Then

and

in which C1 and C2 are the median contours of the two loops.


Upon first considering that element $d\boldsymbol{\ell}_1$ which is on the X axis, one sees that by
symmetry, the elements $d\boldsymbol{\ell}_2'$ may be taken in pairs symmetrically disposed with respect to the
X axis such that if

$$\mathbf{1}_{\phi'}\,d\ell_2' = -\mathbf{1}_x b\sin\phi'\,d\phi' + \mathbf{1}_y b\cos\phi'\,d\phi'$$

then the X components may be discarded. This gives

$$M = \oint_{C_1}d\ell_1\int_0^{\pi}\frac{2b\cos\phi'\,d\phi'}{4\pi\mu_0^{-1}r}$$

With $b \gg a$, $r^{-1} \approx b^{-1} + a\cos\phi'/b^2$ and thus

$$M \approx \oint_{C_1}d\ell_1\int_0^{\pi}\frac{2ab\cos^2\phi'\,d\phi'}{4\pi\mu_0^{-1}b^2} = \frac{\pi a^2}{2\mu_0^{-1}b}$$
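The quality of the $b \gg a$ approximation can be gauged with a short numerical sketch. For two coaxial, coplanar circular filaments the double contour integral reduces to a single integral over the angular separation $\psi$, which is evaluated below by the midpoint rule and compared with $\pi a^2/(2\mu_0^{-1}b)$. The loop radii are hypothetical values chosen only for illustration.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m

def mutual_inductance_numeric(a, b, n=4000):
    """M = (mu0 a b / 2) * int_0^{2 pi} cos(psi) / sqrt(a^2 + b^2 - 2ab cos(psi)) dpsi,
    the filamentary double contour integral after one angle is integrated out."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        psi = (i + 0.5) * h
        total += math.cos(psi) / math.sqrt(a * a + b * b - 2 * a * b * math.cos(psi))
    return 0.5 * MU0 * a * b * total * h

def mutual_inductance_approx(a, b):
    """Result of Example 5.13 for b >> a, written with mu0 in place of 1/mu0^{-1}."""
    return MU0 * math.pi * a * a / (2 * b)

a, b = 0.005, 0.1  # hypothetical loop radii in metres (a/b = 0.05)
ratio = mutual_inductance_numeric(a, b) / mutual_inductance_approx(a, b)
print(ratio)  # slightly above 1; the excess shrinks as a/b decreases
```

The leading correction is of relative order $(a/b)^2$, so the closed form is already accurate to about 0.1 percent at a/b = 0.05.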

EXAMPLE 5.14
A wire of circular cross section whose radius is b is bent into the form of a closed circular
ring of mean radius a. Find its self-inductance if $a \gg b$.
If one again makes use of (5.159) and (5.160), then

$$L = \int_V\int_{V'}\frac{\mathbf{f}_1\cdot\mathbf{f}_1'}{4\pi\mu_0^{-1}r}\,dV\,dV'$$

The geometry identifying the two volume elements dV and dV' is shown in the figures
(cross sections at $\theta'$ and at $\theta' + \theta$), from which it follows that

$$\mathbf{f}_1\cdot\mathbf{f}_1' = f_1f_1'\cos\theta$$
$$dV = (a\,d\theta)(\rho\,d\phi\,d\rho)$$
$$dV' = (a\,d\theta')(\rho'\,d\phi'\,d\rho')$$
Upon letting $\theta = 2\eta + \pi$ one finds that

$$r^2 = \{[2a + \rho\cos(\phi+\phi') + \rho'\cos\phi']^2 + [\rho\sin(\phi+\phi') - \rho'\sin\phi']^2\}(1 - k^2\sin^2\eta) \approx 4a^2(1 - k^2\sin^2\eta)$$

in which

$$k^2 = \frac{4(a + \rho'\cos\phi')[a + \rho\cos(\phi+\phi')]}{[\rho\sin(\phi+\phi') - \rho'\sin\phi']^2 + [2a + \rho\cos(\phi+\phi') + \rho'\cos\phi']^2}$$

so that

$$1 - k^2 \approx \frac{\rho^2 + (\rho')^2 - 2\rho\rho'\cos\phi}{4a^2}$$

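Let it be assumed that $f_1$ is a function of $\rho$ but not of $\theta$ nor $\phi$. Then

$$d^2L = \frac{a}{4\pi\mu_0^{-1}}\,\rho f_1(\rho)\,d\rho\;\rho' f_1(\rho')\,d\rho'\int_0^{2\pi}d\theta'\int_0^{2\pi}d\phi'\int_0^{2\pi}d\phi\int_{-\pi/2}^{\pi/2}\frac{2\sin^2\eta - 1}{(1 - k^2\sin^2\eta)^{1/2}}\,d\eta$$

$$\phantom{d^2L} = \frac{2\pi a}{\mu_0^{-1}}\,\rho f_1(\rho)\,d\rho\;\rho' f_1(\rho')\,d\rho'\int_0^{2\pi}\left[-\frac{2}{k^2}E(k) + \left(\frac{2}{k^2} - 1\right)K(k)\right]d\phi$$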

wherein E(k) and K(k) are elliptic integrals. Since $a \gg b$, it follows that $k^2 \approx 1$ and

$$E(k) = \int_0^{\pi/2}\sqrt{1 - k^2\sin^2\eta}\,d\eta \approx \int_0^{\pi/2}\cos\eta\,d\eta = 1$$

To find K(k), let $\beta = \pi/2 - \eta$ and then

$$K(k) = \int_0^{\beta_0}\frac{d\beta}{\sqrt{\sin^2\beta + \cos^2\beta - k^2\cos^2\beta}} + \int_{\beta_0}^{\pi/2}\frac{d\beta}{\sqrt{1 - k^2\cos^2\beta}}$$

with $1 - k^2 \ll \beta_0 \ll 1$. This division of the integral into two parts is done so that $\sin\beta$
may be approximated by $\beta$ when $\beta \le \beta_0$. If one places $\sin\beta = \beta$, $\cos\beta = 1$ in the first
integral, and k = 1 in the second, the result is that

$$K(k) \approx \ln\frac{4}{\sqrt{1 - k^2}}$$

and therefore

$$\int_0^{2\pi}\left[-\frac{2}{k^2}E(k) + \left(\frac{2}{k^2} - 1\right)K(k)\right]d\phi \approx \int_0^{2\pi}\left(\ln 4 - 2 - \ln\sqrt{1 - k^2}\right)d\phi$$

$$= 2\pi(\ln 4 - 2) - \int_0^{2\pi}\ln\sqrt{\frac{\rho^2 + (\rho')^2 - 2\rho\rho'\cos\phi}{4a^2}}\,d\phi$$

which means

$$d^2L = \frac{4\pi^2a}{\mu_0^{-1}}\left(\ln\frac{8a}{\rho} - 2\right)\rho f_1(\rho)\,d\rho\;\rho' f_1(\rho')\,d\rho'$$

Two cases of interest may now be distinguished. First, if the wire is assumed to have
infinite conductivity, so that all the current flows on its surface, then $f_1(\rho) = (2\pi b)^{-1}\delta(\rho - b)$,
and

$$L = \frac{a}{\mu_0^{-1}}\left(\ln\frac{8a}{b} - 2\right)$$

Second, if the current is assumed to be uniformly distributed over any cross section of the
wire, then $f_1 = (\pi b^2)^{-1}$ and

The difference in these results is slight under normal circumstances.
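The two approximations used for the elliptic integrals in this example, $E(k) \approx 1$ and $K(k) \approx \ln(4/\sqrt{1-k^2})$ for $k^2$ near 1, can be checked with the sketch below. K is computed from the arithmetic-geometric mean and E by direct midpoint quadrature; the ring dimensions in the last lines are hypothetical, and the inductance function implements only the surface-current result quoted above.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m

def ellip_k(k):
    """Complete elliptic integral K(k) = pi / (2 * AGM(1, sqrt(1 - k^2)))."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def ellip_e(k, n=200_000):
    """Complete elliptic integral E(k) by midpoint quadrature of sqrt(1 - k^2 sin^2 eta)."""
    h = (math.pi / 2) / n
    return h * sum(math.sqrt(1.0 - (k * math.sin((i + 0.5) * h)) ** 2) for i in range(n))

def ring_inductance_surface_current(a, b):
    """Self-inductance of a thin ring (a >> b) with all current on the wire surface."""
    return MU0 * a * (math.log(8 * a / b) - 2.0)

k_comp = 1e-2                      # sqrt(1 - k^2), small when a >> b
k = math.sqrt(1.0 - k_comp**2)
K_exact, K_approx = ellip_k(k), math.log(4.0 / k_comp)
E_exact = ellip_e(k)
print(K_exact, K_approx, E_exact)

L_ring = ring_inductance_surface_current(0.1, 0.001)  # hypothetical a = 10 cm, b = 1 mm
print(L_ring)
```

Both approximations improve as $1 - k^2$ decreases, which is the regime $a \gg b$ assumed in the example.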

The concepts of capacitance and inductance may be extended to situations involving
time-varying currents and charges by the following argument: Let time variations
of the fields and sources be of the form $e^{j\omega t}$ ($\omega$ may be only one Fourier component of the
total spectrum). Then the potential functions are

$$\mathbf{A}(x,y,z,t) = \int_V\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{j(\omega t - kr)}}{4\pi\mu_0^{-1}r}\,dV$$

$$\Phi(x,y,z,t) = \int_V\frac{\rho(\xi,\eta,\zeta)\,e^{j(\omega t - kr)}}{4\pi\epsilon_0 r}\,dV$$

If the currents and charges are disposed throughout a system of conductors, and if no
point (x,y,z) in the system is an appreciable part of a wavelength removed from any
other point $(\xi,\eta,\zeta)$, then retarded time may be ignored between two such points, and

$$\mathbf{A}(x,y,z)e^{j\omega t} = e^{j\omega t}\int_V\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)}{4\pi\mu_0^{-1}r}\,dV \tag{5.161}$$

$$\Phi(x,y,z)e^{j\omega t} = e^{j\omega t}\int_V\frac{\rho(\xi,\eta,\zeta)}{4\pi\epsilon_0 r}\,dV \tag{5.162}$$

Upon deleting the factors $e^{j\omega t}$, one sees that A and $\Phi$ have the same forms as were
encountered in magnetostatics and electrostatics, except that now $\boldsymbol{\iota}$ and $\rho$ are, in
general, complex quantities.
The condition that the extent of the conductor system be small compared to a
wavelength characterizes a lumped circuit. If within the circuit the regions where
conduction current and displacement current dominate are distinct and separate, the
analyses leading to Equations (3.159) and (5.158) may be repeated, with the potential
functions (5.161) and (5.162) replacing their static counterparts. The net result is
that the same formulas for capacitance and inductance are obtained. One concludes
that the static formulas for capacitance and inductance are valid so long as the
frequency is low enough to ensure that the circuit dimensions are small compared to
the wavelength.
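The lumped-circuit condition can be made concrete by evaluating the retardation factor $e^{-jkr}$ that is dropped in passing to (5.161) and (5.162). The sketch below assumes free-space propagation and a hypothetical circuit dimension of 5 cm; the function name and the two test frequencies are illustrative only.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in free space, m/s

def retardation_factor(distance_m, frequency_hz):
    """exp(-j k r) for two points a distance r apart, with k = 2 pi f / c."""
    k = 2 * math.pi * frequency_hz / C
    return cmath.exp(-1j * k * distance_m)

size = 0.05  # hypothetical largest circuit dimension, metres
for f in (1e6, 1e9):
    factor = retardation_factor(size, f)
    wavelengths = size * f / C
    print(f"f = {f:.0e} Hz: size = {wavelengths:.2e} wavelengths, "
          f"|exp(-jkr) - 1| = {abs(factor - 1):.3e}")
```

At 1 MHz the factor differs from unity by about one part in a thousand, so the static formulas apply; at 1 GHz the same circuit spans a sixth of a wavelength and retardation can no longer be ignored.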
SECTION L5 T'ransjormaiion of the Integral Solutions 315

5.15 TRANSFORMATION OF THE INTEGRAL SOLUTIONS TO FORMS SUITABLE FOR WAVEGUIDE PROBLEMS

In the development leading to the integral solutions for E and B, given by Equations
(5.42) and (5.43), the Green's function $\psi = e^{-jkr}/r$ was employed. By retracing the steps
leading to (5.42) and (5.43), the reader will have no difficulty in convincing himself
that, if the more general Green's function

$$G(x,y,z,\xi,\eta,\zeta) = \frac{e^{-jkr}}{r} + g(x,y,z,\xi,\eta,\zeta) \tag{5.163}$$

be used in place of $\psi$, with g any function which satisfies

(5.164)

everywhere in the volume V and over the bounding surfaces $S_1 \ldots S_N$, then (5.42)
and (5.43) are once again obtained, with G replacing $\psi$ everywhere. In particular, if V
is a source-free region bounded by the single closed surface S, then at any point (x,y,z)
within V,

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\oint_S\left[(\mathbf{1}_n\cdot\mathbf{E})\nabla_S G + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S G - j\omega G(\mathbf{1}_n\times\mathbf{B})\right]dS \tag{5.165}$$

$$\mathbf{B}(x,y,z) = \frac{1}{4\pi}\oint_S\left[\frac{j\omega G}{c^2}(\mathbf{1}_n\times\mathbf{E}) + (\mathbf{1}_n\times\mathbf{B})\times\nabla_S G + (\mathbf{1}_n\cdot\mathbf{B})\nabla_S G\right]dS \tag{5.166}$$

in which $\mathbf{1}_n$ is the inward-drawn normal.


These equations may be applied to waveguide problems in the following manner:
let any cross section of a cylindrical waveguide be represented as in Figure 5.5, with C
the contour of the cross section, A the cross-sectional surface, u the outward-drawn
normal direction, and v the peripheral direction.

FIGURE 5.5 Cross-sectional geometry of cylindrical waveguide.

Let the closed surface S alluded to in (5.165) and (5.166) be taken to consist of the
interior surface of a portion of the waveguide, extending from $\zeta = \zeta_1$ to $\zeta = \zeta_2$, plus
end caps of area A at $\zeta = \zeta_1$ and at $\zeta = \zeta_2$. Let the sources of the electromagnetic
waves within the waveguide be fields externally impressed over holes in the waveguide
walls. Further, let these holes be confined to a finite axial extent of the guide. Then, if a
small loss is assumed in the medium (an air-filled guide, instead of an evacuated guide,
for example), and if $\zeta_1 \to -\infty$, $\zeta_2 \to +\infty$, it follows that the contributions to E(x,y,z)
and B(x,y,z) in (5.165) and (5.166) due to the integrals over the two end caps are
negligible. This condition will be assumed, and the surface S in (5.165) and (5.166) will
be taken to consist of the interior waveguide surface of infinite axial extent.
For convenience, different Green's functions will be used in (5.165) and (5.166). If one
selects

$$G_1(x,y,z,u,v,\zeta) = g_1(x,y,z,u,v,\zeta) + \frac{e^{-jkr}}{r} \tag{5.167}$$

for use in (5.165), and further stipulates that $G_1 = 0$ on S (the Dirichlet condition),
then the Z component of (5.165) becomes

(5.168)

in which $G_1$ satisfies

(5.169)

with $P_F$ the field point (x,y,z) and $P_S$ the source point $(u,v,\zeta)$. Only $E_z$ need be found
from (5.165), since all other field components of TM waveguide modes may be expressed
in terms of $E_z$ (cf. Section 5.11).
In like manner, if one selects

$$G_2(x,y,z,u,v,\zeta) = g_2(x,y,z,u,v,\zeta) + \frac{e^{-jkr}}{r} \tag{5.170}$$

for use in (5.166) and further stipulates that $\partial G_2/\partial u = 0$ on S (the Neumann condition),
then (5.166) yields the z component

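$$B_z(x,y,z) = \frac{1}{4\pi}\int_S\frac{1}{j\omega}\left\{E_v(u,v,\zeta)\left[\frac{\partial^2}{\partial\zeta^2} + k^2\right]G_2 - E_\zeta(u,v,\zeta)\frac{\partial^2G_2}{\partial v\,\partial\zeta}\right\}dv\,d\zeta \tag{5.171}$$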

in which $G_2$ satisfies

$$(\nabla_S^2 + k^2)G_2 = -4\pi\delta(P_F - P_S) \tag{5.172}$$

In reaching the result (5.171), Maxwell's equations have been used to replace the
components of B by terms involving $E_v$ and $E_z$, and several integrations by parts have been
employed to transfer the differentiations in the kernel to the Green's function.
Knowledge of tangential E everywhere on the waveguide walls permits determination
of all field components at all points, through use of (5.168) and (5.171). When the walls
are made of good conductor, $E_{\tan} \approx 0$ except over the holes in the walls, and the extent
of the integration is thereby reduced.
If the waveguide has a simple cross section, such as rectangular or circular, the
Green's functions $G_1$ and $G_2$ can be expressed as complete series of orthonormal
functions, thereby greatly simplifying the analysis. As an illustration, the Green's functions
for a rectangular waveguide are derived in Appendix J. When the results are
substituted in (5.168) and (5.171), one obtains

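$$E_z(x,y,z) = -\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{\psi_{mn}(x,y)}{2j\beta_{mn}}\int_S\frac{\partial\psi_{mn}(\xi,\eta)}{\partial u}\,E_\zeta(\xi,\eta,\zeta)\,e^{-j\beta_{mn}|z-\zeta|}\,dv\,d\zeta \tag{5.173}$$

$$B_z(x,y,z) = -\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\frac{(m\pi/a)^2 + (n\pi/b)^2}{2\omega\beta_{mn}}\,\Psi_{mn}(x,y)\int_S\Psi_{mn}(\xi,\eta)\,E_v(\xi,\eta,\zeta)\,e^{-j\beta_{mn}|z-\zeta|}\,dv\,d\zeta$$
$$\qquad\qquad \pm\, j\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\frac{\Psi_{mn}(x,y)}{2\omega}\int_S\frac{\partial\Psi_{mn}(\xi,\eta)}{\partial v}\,E_\zeta(\xi,\eta,\zeta)\,e^{-j\beta_{mn}|z-\zeta|}\,dv\,d\zeta \tag{5.174}$$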
in which the upper sign is used in the second sum of (5.174) if $z > \zeta$, the lower sign
being used if $z < \zeta$. $\psi_{mn}$ and $\Psi_{mn}$ have been defined previously by Equations (5.131)
and (5.135).
Equations (5.173) and (5.174) are known as the generating functions for rectangular
waveguide and permit the determination of the fields anywhere within the guide if the
fields are completely known on the walls. These generating functions were first obtained
by Stevenson.13
EXAMPLE 5.15
Two identical rectangular waveguides are joined so as to have a common broad wall, as
suggested by the figure. It is assumed that the dimensions a and b are so chosen that only
the $TE_{10}$ mode will propagate at the frequency being considered (cf. Example 5.9). A
pair of small crossed slots is cut in the common broad wall at a distance $x_1$ from the nearest
side wall. In what is to follow, it will be shown that if $x_1$ is properly chosen, this assembly
is a matched directional coupler. By this one means that if a $TE_{10}$ mode is fed into port 1,
no reflected waves are detected at ports 1 and 2, but waves are detected at ports 3 and 4,
their relative amplitudes being controlled by the size of the crossed slots.

[Figure: directional coupler and slot dimensions]

13 A. F. Stevenson, "Theory of Slots in Rectangular Waveguides," J App Phys, 19, 24-38; January
1948.
If the results of Example 5.9 are utilized, a $TE_{10}$ mode incident on the crossed slots from
port 1 may be represented by

$$B_z = K_0\cos\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}$$

in which $K_0$ is the complex amplitude of the incident wave at z = 0. The other magnetic
field component of this incident wave is

$$B_x = \frac{j\beta_{10}}{\pi/a}K_0\sin\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}$$

At the center of the slots $(x_1,0,0)$, the magnetic field components are

$$B_x = \frac{j\beta_{10}}{\pi/a}K_0\sin\frac{\pi x_1}{a}\,e^{j\omega t} \qquad B_z = K_0\cos\frac{\pi x_1}{a}\,e^{j\omega t}$$

If $x_1$ is chosen so that

$$\tan\frac{\pi x_1}{a} = \frac{\pi}{\beta_{10}a} = \left[\left(\frac{2a}{\lambda}\right)^2 - 1\right]^{-1/2}$$

then at the center of the slots, $B_x$ and $B_z$ will have equal amplitudes and will be in time
quadrature. With this condition assumed, if the slots were not present, the X and Z components
of current density in the broad wall at $(x_1,0,0)$ also would have equal amplitudes and be in
time quadrature. If the slots are narrow and small ($l \gg d$ but $l \ll \lambda$), the conduction
current which each slot interrupts is replaced mainly by a displacement current across each
slot. This means that electric fields which are in time quadrature are induced across the
two slots by the incident wave. These electric fields may be approximated by the expressions

in which E is the complex induced amplitude at the center of each slot and it is assumed
that the electric field goes to zero at both ends of each slot. The first electric field expression
applies for the Z-directed slot, the second for the X-directed slot.
By virtue of (5.174), the back-scattered $TE_{10}$ mode which appears as a reflected wave at
port 1 is given by

B(l)
z
1r
iwt
Ee
}wf3loa 2b
cos 1rX
a
{~ f
cos ~ cos
a sal
7r 7rr
e-itllolz-rl d~ dr

-
(.J
fJIO
f. 1r ~
SIn -
Sal
cos 7T"(~ - Xl) e-j{jlolz-t\ d·5t d~r }
SECTION 16 A Minkowskian Formulation of the Field Equations 319

If the substitution is made that $\xi' = \xi - x_1$ the factor in braces becomes

$$\left\{\frac{\pi}{a}\cos\frac{\pi x_1}{a}\int_0^{l/2}\cos\frac{\pi\zeta}{l}\cos\beta_{10}\zeta\,d\zeta - \beta_{10}\sin\frac{\pi x_1}{a}\int_0^{l/2}\cos\frac{\pi\xi'}{l}\cos\frac{\pi\xi'}{a}\,d\xi'\right\}$$

Since $\sin(\pi x_1/a) = (\pi/\beta_{10}a)\cos(\pi x_1/a)$, and since $l \ll \lambda$, so that $\cos\beta_{10}\zeta \approx 1$ and $\cos(\pi\xi'/a) \approx 1$
throughout the integration interval, it follows that

$$B_z^{(1)} = 0$$

and there is no back-scattering to port 1. Similarly, $B_z^{(2)} = 0$. However, for ports 3 and 4,
the upper sign must be used in the second integral of (5.174) and one obtains

$$B_z^{(3)} = B_z^{(4)} = K_1\cos\frac{\pi x}{a}\,e^{j(\omega t - \beta_{10}z)}$$

in which
u» E
Kl=--
a 3b c

The total field emerging from port 3 is given by $K_0 + K_1$; at port 4 it is given by $K_1$.
The mode amplitude $K_1$ is seen to increase with l so that the fraction of power diverted to
port 4 may be controlled by slot size. Experimentation has revealed that an extensive
dynamic range of power diversion is feasible with crossed slots of this type, making them
suitable for use not only as a waveguide coupler, but also as a circularly polarized radiator
(with the upper guide removed).14
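The slot position follows directly from the condition $\tan(\pi x_1/a) = [(2a/\lambda)^2 - 1]^{-1/2}$. The sketch below solves it for a hypothetical air-filled guide with broad dimension a = 22.86 mm operated at 10 GHz (values chosen for illustration, not taken from the example) and confirms that $B_x$ and $B_z$ then have equal magnitudes at the slot center.

```python
import math

C = 299_792_458.0  # speed of light in free space, m/s

def slot_offset(a, frequency_hz):
    """Distance x1 from the side wall satisfying tan(pi x1 / a) = [(2a/lambda)^2 - 1]^(-1/2)."""
    lam = C / frequency_hz
    ratio = (2 * a / lam) ** 2 - 1.0
    if ratio <= 0.0:
        raise ValueError("TE10 mode is cut off at this frequency")
    return (a / math.pi) * math.atan(ratio ** -0.5)

def te10_field_magnitudes(a, frequency_hz, x):
    """(|Bx|, |Bz|) of the incident TE10 wave at position x, normalized to |K0| = 1."""
    k = 2 * math.pi * frequency_hz / C
    beta10 = math.sqrt(k * k - (math.pi / a) ** 2)
    bx = beta10 / (math.pi / a) * math.sin(math.pi * x / a)
    bz = math.cos(math.pi * x / a)
    return abs(bx), abs(bz)

a = 0.02286   # hypothetical broad-wall dimension, metres
f = 10e9      # hypothetical operating frequency, Hz
x1 = slot_offset(a, f)
bx, bz = te10_field_magnitudes(a, f, x1)
print(f"x1 = {x1 * 1e3:.3f} mm, |Bx| = {bx:.6f}, |Bz| = {bz:.6f}")
```

Because $\beta_{10}$ depends on frequency, the equal-amplitude condition holds exactly only at the design frequency.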

5.16* A MINKOWSKIAN FORMULATION OF THE FIELD EQUATIONS

Shortly after the appearance of Einstein's first paper on relativity, Hermann Minkowski
(1864-1909) recognized that a considerable clarification in notation was possible
if the variable jct were treated as a fourth dimension and the equations of physics
restated accordingly.15 Thus, for example, the proper distance (2.47) could then be
written

$$ds^2 = dx_1^2 + dx_2^2 + dx_3^2 + dx_4^2 \tag{5.175}$$

in which, in place of the coordinates x, y, z, t, the new coordinates

$$x_1 = x \qquad x_2 = y \qquad x_3 = z \qquad x_4 = -jct \tag{5.176}$$

have been introduced, and the two events occasioning (5.175) have been assumed to be
an infinitesimal distance and time apart. Equation (5.175) is seen to be an extension to
four dimensions of the familiar expression for differential distance in three dimensions.
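A small numerical sketch may help make (5.175) and (5.176) concrete. It forms the four coordinates with $x_4 = -jct$ for a hypothetical pair of events, applies the standard Lorentz boost along X (quoted here as an assumption, since it is not rederived in this section), and confirms that the sum of squares is unchanged.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def to_minkowski(x, y, z, t):
    """Map a separation (x, y, z, t) to (x1, x2, x3, x4) with x4 = -j c t, as in (5.176)."""
    return (complex(x), complex(y), complex(z), -1j * C * t)

def interval_squared(dx):
    """Sum of the squares of the four coordinate differences, as in (5.175)."""
    return sum(d * d for d in dx)

def boost_x(x, y, z, t, v):
    """Standard Lorentz boost along X with speed v (assumed form, |v| < c)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return (gamma * (x - v * t), y, z, gamma * (t - v * x / C**2))

event = (3.0, -2.0, 5.0, 1.0e-8)        # hypothetical separation: metres, metres, metres, seconds
boosted = boost_x(*event, v=0.6 * C)

s2 = interval_squared(to_minkowski(*event))
s2_boosted = interval_squared(to_minkowski(*boosted))
print(s2, s2_boosted)  # equal up to rounding; the imaginary parts are zero
```

Since $x_4^2 = -c^2t^2$, the four-dimensional sum of squares is the familiar $x^2 + y^2 + z^2 - c^2t^2$.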
With this notation, functional derivatives with respect to $x_4$ assume the same form
as those with respect to the spatial variables. As an illustration, the scalar wave
equation becomes a four-dimensional version of Laplace's equation. The laws of dynamics
may be recast in a static form and the governing equations of electrical phenomena also
assume a simple and elegant structure, as shall be seen by what follows.

* This section may be omitted without loss in continuity of the technical presentation.
14 A. J. Simmons, "Circularly Polarized Slot Radiators," IRE Trans Antennas Propagation, AP-5
(1), 31-36; January 1957.
15 H. Minkowski, Space and Time, an address delivered at the 80th Assembly of German Natural
Scientists and Physicians, at Cologne, September 21, 1908. A translation may be found in The
Principle of Relativity, Dover Publications, Inc., New York.
In the Preface to his Electrodynamics, Sommerfeld indicated his admiration of
Minkowski's formulation by saying
. . . After I had heard Hermann Minkowski's lecture on "Space and Time" in 1909 in
Cologne, I carefully developed the four-dimensional form of electro-dynamics as an
apotheosis of Maxwell's theory . . . in return, this has always met with an enthusiastic reception
on the part of my audience.

Sommerfeld's presentation of this material is especially appealing because of its
conciseness and clarity, and the ensuing development is patterned after his approach
wherever appropriate. Recently another excellent treatment of the subject has been
offered by L. J. Chu.16 For applications of this formulation to other branches of physics,
such as dynamics, the reader is referred to the literature of those fields.17
Confining attention to a rephrasing of the governing equations of electromagnetics,
let a four-dimensional generalization of the Laplacian operator be defined by

$$\Box^2 = \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2} + \frac{\partial^2}{\partial x_4^2} \tag{5.177}$$

Additionally, let the potentials $\mathbf{A}$ and $\Phi$, defined by (5.65) and (5.66), be combined to
form the four-potential $\mathsf{A}$ whose components are

$$A_1 = A_x \qquad A_2 = A_y \qquad A_3 = A_z \qquad A_4 = \frac{\Phi}{jc} \tag{5.178}$$

With these definitions, the differential equations (5.69) and (5.70) become

$$\Box^2\mathsf{A} = -\frac{\mathsf{I}}{\mu_0^{-1}} \tag{5.179}$$

with $\mathsf{I}$ called the four-current density and possessing the components

(5.180)

Equation (5.179) is the four-dimensional wave equation relating the potential function
to its sources.
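Equation (H.5), which connects the potentials $\mathbf{A}$ and $\Phi$, may be written

$$\nabla\cdot\mathbf{A} = \frac{\partial A_1}{\partial x_1} + \frac{\partial A_2}{\partial x_2} + \frac{\partial A_3}{\partial x_3} = -\frac{1}{c^2}\dot{\Phi} = -\frac{\partial A_4}{\partial x_4}$$

which gives

$$\Box\cdot\mathsf{A} = 0 \tag{5.181}$$

wherein $\Box$ is the four-vector $\left(\dfrac{\partial}{\partial x_1}, \dfrac{\partial}{\partial x_2}, \dfrac{\partial}{\partial x_3}, \dfrac{\partial}{\partial x_4}\right)$. Thus the four-dimensional divergence
of the four-potential function is identically zero.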
16 R. M. Fano, L. J. Chu, and R. B. Adler, Electromagnetic Fields, Energy and Forces, Appendix 1,
John Wiley and Sons, Inc., New York, 1960.
17 See, e.g., H. Goldstein, Classical Mechanics, Addison-Wesley Publishing Company, Inc., Reading,
Massachusetts, 1953.

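Turning now to Equations (5.67) and (5.68), which relate the field vectors to the
potentials, one may write

(5.182)

These equations suggest the utility of introducing the four-dimensional curl

$$\operatorname{curl}_{mn}\mathsf{A} = \frac{\partial A_n}{\partial x_m} - \frac{\partial A_m}{\partial x_n} \tag{5.183}$$

Since $\operatorname{curl}_{mm} = 0$ and $\operatorname{curl}_{mn} = -\operatorname{curl}_{nm}$, it follows that $\operatorname{curl}_{mn}\mathsf{A}$ is an antisymmetric
tensor with six distinct components which differ from zero. Equations (5.182) may then be
written in the tensor form

$$\operatorname{curl}\mathsf{A} = \begin{bmatrix}
0 & \left(\dfrac{\partial A_2}{\partial x_1} - \dfrac{\partial A_1}{\partial x_2}\right) & \left(\dfrac{\partial A_3}{\partial x_1} - \dfrac{\partial A_1}{\partial x_3}\right) & \left(\dfrac{\partial A_4}{\partial x_1} - \dfrac{\partial A_1}{\partial x_4}\right) \\
\left(\dfrac{\partial A_1}{\partial x_2} - \dfrac{\partial A_2}{\partial x_1}\right) & 0 & \left(\dfrac{\partial A_3}{\partial x_2} - \dfrac{\partial A_2}{\partial x_3}\right) & \left(\dfrac{\partial A_4}{\partial x_2} - \dfrac{\partial A_2}{\partial x_4}\right) \\
\left(\dfrac{\partial A_1}{\partial x_3} - \dfrac{\partial A_3}{\partial x_1}\right) & \left(\dfrac{\partial A_2}{\partial x_3} - \dfrac{\partial A_3}{\partial x_2}\right) & 0 & \left(\dfrac{\partial A_4}{\partial x_3} - \dfrac{\partial A_3}{\partial x_4}\right) \\
\left(\dfrac{\partial A_1}{\partial x_4} - \dfrac{\partial A_4}{\partial x_1}\right) & \left(\dfrac{\partial A_2}{\partial x_4} - \dfrac{\partial A_4}{\partial x_2}\right) & \left(\dfrac{\partial A_3}{\partial x_4} - \dfrac{\partial A_4}{\partial x_3}\right) & 0
\end{bmatrix} = \mathcal{F} \tag{5.184}$$

in which $\mathcal{F}$ is an antisymmetric tensor possessing six distinct components different
from zero, and properly may be called the field tensor. It is given by

$$\mathcal{F} = [F_{mn}] = \begin{bmatrix}
0 & B_z & -B_y & \dfrac{jE_x}{c} \\
-B_z & 0 & B_x & \dfrac{jE_y}{c} \\
B_y & -B_x & 0 & \dfrac{jE_z}{c} \\
-\dfrac{jE_x}{c} & -\dfrac{jE_y}{c} & -\dfrac{jE_z}{c} & 0
\end{bmatrix} \tag{5.185}$$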
Similarly, the Maxwell equations can be cast in a four-dimensional form. If the
operation

$$\operatorname{div}_m\mathcal{F} = \sum_{n=1}^{4}\frac{\partial F_{mn}}{\partial x_n} \tag{5.186}$$

applied to tensors is given the name reduction or divergence, it is seen to reduce a
four-dimensional tensor to a four-vector, and may be contrasted to the operation $\Box\,\cdot$
introduced earlier, which reduced a four-vector to a scalar. Upon applying the operation
(5.186) to the tensor $\mathcal{F}$ given in (5.185), one obtains

$$\operatorname{div}\mathcal{F} = \frac{\mathsf{I}}{\mu_0^{-1}} \tag{5.187}$$

The first three components of this equation embody the Maxwell equation (5.24) and
the last component is a rephrasing of (5.23).
Since for any antisymmetric tensor $\mathcal{T}$ the components are related such that

$$T_{mn} = -T_{nm}$$

it follows that

$$\Box\cdot\operatorname{div}\mathcal{T} = 0$$

and thus from (5.187) one is able to deduce that

$$\Box\cdot\mathsf{I} = 0 \tag{5.188}$$

which is a restatement of the continuity equation (5.29).
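The step from antisymmetry to $\Box\cdot\operatorname{div}\mathcal{T} = 0$ rests on the fact that the second-derivative operator $\partial^2/\partial x_m\,\partial x_n$ is symmetric in m and n, so its double contraction with an antisymmetric array vanishes term by term. The sketch below illustrates this algebraic fact with randomly generated 4 x 4 arrays standing in for the tensor components and the symmetric operator; the arrays are purely illustrative and carry no physical meaning.

```python
import random

def double_contraction(T, S):
    """Sum over m and n of T[m][n] * S[m][n] for 4 x 4 arrays."""
    return sum(T[m][n] * S[m][n] for m in range(4) for n in range(4))

random.seed(0)
raw_t = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
raw_s = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]

# Antisymmetric stand-in for the tensor components T_mn = -T_nm.
T = [[raw_t[m][n] - raw_t[n][m] for n in range(4)] for m in range(4)]
# Symmetric stand-in for the operator d^2 / (dx_m dx_n).
S = [[raw_s[m][n] + raw_s[n][m] for n in range(4)] for m in range(4)]

antisym_result = double_contraction(T, S)
generic_result = double_contraction(raw_t, S)
print(antisym_result, generic_result)  # first is zero to rounding, second generally is not
```

Each pair of terms (m, n) and (n, m) cancels, which is why the result holds for any antisymmetric tensor and not only for the field tensor.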


Finally, let the dual of $\mathcal{F}$ be defined by the relation

$$\mathcal{F}^* = [F^*_{mn}] = \begin{bmatrix}
0 & \dfrac{jE_z}{c} & -\dfrac{jE_y}{c} & B_x \\
-\dfrac{jE_z}{c} & 0 & \dfrac{jE_x}{c} & B_y \\
\dfrac{jE_y}{c} & -\dfrac{jE_x}{c} & 0 & B_z \\
-B_x & -B_y & -B_z & 0
\end{bmatrix} \tag{5.189}$$

which is formed by an interchange of the real and imaginary constituents of $\mathcal{F}$. The
reduction of this tensor is seen to give

$$\operatorname{div}\mathcal{F}^* = 0 \tag{5.190}$$

The first three components of (5.190) comprise the Maxwell equation (5.21) whereas
the fourth component is a restatement of (5.20).
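To recapitulate, a complete representation of Maxwell's electromagnetic theory for
free space is therefore embodied in the equations

$$\operatorname{div}\mathcal{F} = \frac{\mathsf{I}}{\mu_0^{-1}} \qquad \operatorname{div}\mathcal{F}^* = 0 \tag{5.191}$$

The field tensor $\mathcal{F}$ may be found from the four-potential $\mathsf{A}$ through the relation

$$\mathcal{F} = \operatorname{curl}\mathsf{A} \tag{5.192}$$

whereas the potential $\mathsf{A}$ is deducible from the equations

$$\Box^2\mathsf{A} = -\frac{\mathsf{I}}{\mu_0^{-1}} \qquad \Box\cdot\mathsf{A} = 0 \tag{5.193}$$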
Problems 323

For further development and use of this notation the interested reader is referred
to Sommerfeld's Electrodynamics.

REFERENCES

1. Campbell, L., and W. Garnett, The Life of James Clerk Maxwell, Macmillan and Company,
London, 1882.
2. Crowther, J. G., Men of Science, W. W. Norton and Company, New York, 1936.
3. Fano, R. M., L. J. Chu, and R. B. Adler, Electromagnetic Fields, Energy and Forces, John
Wiley and Sons, Inc., New York, 1960.
4. Glazebrook, R. T., James Clerk Maxwell and Modern Physics, Cassell and Company, Ltd.,
London, 1896.
5. Harrington, R. F., Time-Harmonic Electromagnetic Fields, McGraw-Hill Book Company,
New York, 1961.
6. Jackson, J. D., Classical Electrodynamics, John Wiley and Sons, Inc., New York, 1962.
7. Jordan, E. C., Electromagnetic Waves and Radiating Systems, Prentice-Hall, Inc.,
Englewood Cliffs, New Jersey, 1950.
8. Jones, Bence, The Life and Letters of Faraday, Longmans, Green and Company, London,
1870.
9. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism,
Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1955.
10. Ramo, S., and J. R. Whinnery, Fields and Waves in Modern Radio, 2nd ed., John Wiley
and Sons, Inc., New York, 1953.
11. Shedd, P. C., Fundamentals of Electromagnetic Waves, Prentice-Hall, Inc., Englewood
Cliffs, New Jersey, 1954.
12. Sommerfeld, A., Electrodynamics, Academic Press, Inc., New York, 1952.
13. Stratton, J. A., Electromagnetic Theory, McGraw-Hill Book Company, New York, 1941.
14. Whittaker, E., A History of the Theories of Aether and Electricity, vol. 1, Thomas Nelson
and Sons, Ltd., London, 1951.

PROBLEMS

5.1 Because of the result of Appendix E that the most general sources t(x,y,z,t) and p(x,Y,z,t)
in XYZ may be built up from static charge distributions in all other Lorentzian frames, it
follows that one may derive Maxwell's equations by starting only with p' (x' ,y' ,z') in
X' y' Z', without including t' (x' ,y' ,z'). To see this, assume only a static charge distribution
in X' Y' Z' and parallel the development of Sections 5.2-5.4 to obtain the X~connected
portions of Maxwell's equations. Then invoke superposition to obtain equations (5.25).
This procedure has the advantage in rigor of basing the derivation of the general field
equations solely on electrostatics, and it then permits all the results of Chapter 4 to be-
come special cases of the more general theory. In particular, Equation (4.29), which
expresses the Biot-Savart law, is seen to be a limiting form of Equation (5.60), with
k = O.
5.2 Let B = F in the vector Green's theorem (5.37) and show that B is expressible in the
form (5.43).
5.3 By taking the curl of (5.42), show that B may be written in the form (5.43).
5.4 Use the continuity equation to show that (5.42) may be converted to (5.44).
5.5 Using Fourier integral theory, show that the general form for the retarded potential $\Phi$, as
given by (H.7), is a natural extension of the harmonic form (H.2).
5.6 Find the field pattern for a dipole which is one wavelength long if the current
distribution is

5.7 Use Poynting's theorem to determine the total radiated power for the full-wave dipole of
Problem 5.6.
5.8 Find the stored magnetostatic energy per unit length for the system consisting of two thin
parallel conducting tubes of radius a and center-to-center spacing D, if they carry equal
and opposite steady currents I.
5.9 Static charges and steady currents can set up time-independent electric and magnetic
fields in a common region such that $\mathcal{P} = \mathbf{E}\times\mathbf{H}$ is not identically zero, but still no net
power flow exists. Show that under these conditions $\oint_S\mathcal{P}\cdot d\mathbf{S} = 0$ for any closed surface S
in the region.
5.10 If two uniform plane waves of common polarization but different angular frequencies $\omega_1$
and $\omega_2$ propagate simultaneously in the same direction, show that the net time-average
power flow is equal to the sum of the individual time-average power flows.
5.11 Find the radiation pressure if a plane wave is normally incident on a perfectly absorbing
plane screen.
5.12 Show that an elliptically polarized plane wave may be decomposed into appropriate
amounts of right-handed and left-handed circularly polarized waves.
5.13 Determine expressions for the instantaneous stored energy density and Poynting's vector
for an elliptically polarized plane wave. Check that your answers have the proper limits
for linearly and circularly polarized waves.
5.14 A circularly polarized plane wave is normally incident on a perfectly conducting plane
screen. What can be said about the polarization of the reflected wave?
5.15 Establish the law of reflection for a linearly polarized wave of arbitrary polarization
incident on a perfectly conducting screen at an angle a with respect to the normal.
5.16 Repeat the analysis of Example 5.8 for the case of a coaxial transmission line consisting of
two concentric circular cylindrical shells of radii a and b, with b > a.
5.17 Repeat the analysis of Example 5.8 for the case of two semi-infinite planar conducting
sheets which lie in the same plane but are separated by a constant gap width a.
5.18 A rectangular cavity of dimensions a, b, c is excited in the mode

$$E_z = \sin\frac{\pi x}{a}\sin\frac{\pi y}{b}\,e^{j\omega t}$$

If $H_z = E_x = E_y \equiv 0$ and the walls are perfectly conducting, find the resonant frequency
$\omega$ and the force exerted on each face.
5.19 Determine an expression for the resonant frequency for a TM mode in a cylindrical cavity
of radius a and length l.
5.20 Determine the integral of the Poynting vector over any cross section of the circular
cylindrical waveguide of Example 5.11 for any transverse magnetic mode.
5.21 Establish the expressions for the field components and the allowed frequencies for the
resonant TM modes in a spherical cavity.
5.22 Deduce the self-inductance per meter of two parallel wires, each of radius a and
center-to-center spacing D ($D \gg a$). Assume that the wires carry equal and opposite harmonic
currents, and that $D \ll \lambda$, conditions often encountered in practice, such as in telephonic
communications.
5.23 Calculate the mutual inductance between the two coils shown in the figure. Assume that
$a \gg b$ and that the respective numbers of turns are $N_a$ and $N_b$.

[Figure: the two coils, separated by a distance d]

5.24 For any antisymmetric tensor $\mathcal{T}$, show that $\Box\cdot\operatorname{div}\mathcal{T} = 0$.
5.25 In a Minkowskian formulation of the field equations, define a suitable field tensor whose
terms represent the components of H and D.
CHAPTER 6
l)ieleclric lklalerials
WITH RESPECT to some aspects of their electrical behavior, materials 111ay be classified
as conductors, semiconductors, and dielectrics or insulators. (See Section 8.2.) An ideal
dielectric is a material which possesses no free charges and thus completely inhibits
the passage of steady electric current. Since many real materials of practical importance
approach this idealization, it constitutes a useful model on which to base an analysis
of electric behavior, as shall be seen in subsequent sections of this chapter.
Although electrically neutral, any dielectric is composed of molecules, which in turn
are composed of charged particles (nuclei and electrons), and these particles are usually
affected by the presence of an electric field. Such a field influences the positively and
negatively charged parts of a molecule oppositely, and these parts may suffer oppositely
directed displacements from their equilibrium positions, thus causing the molecule to
become polarized. These displacements are limited by strong restoring forces, caused
by the altered charge distribution within the molecule, such that the charge shift is
seldom more than a small fraction of a molecular diameter. In such cases the molecule
may be viewed as an elementary electric dipole (or several dipoles) whose distant field
can be calculated using the techniques developed in earlier chapters. When the contri-
butions due to all the molecular dipoles are summed, the resultant is often found to
alter the field distribution significantly from the value it had in the absence of the
dielectric, both for points inside and outside the dielectric.
The dipole behavior of a molecule may arise from three distinct causes. First, the
electron cloud of a constituent atom may shift relative to its nucleus due to the presence
of an electric field. This induced effect is called electronic polarization. Second, the
molecular structure may be due to an arrangement of oppositely charged ions, which
can shift from their equilibrium positions under the action of an electric field, thus
giving rise to ionic polarization. Third, the molecule may consist of an arrangement of
atoms which, in the absence of an electric field, is a randomly oriented permanent
electric dipole. The presence of the field then causes a partial orientation of the
permanent molecular dipole, causing a net polarization, and this phenomenon is termed
orientational polarization. All three effects may be present in a given material.
To account for these three sources of dielectric behavior, the material may be treated
as though it were a collection of dipole moments $\mathbf{p}_n$ in a vacuum, with consideration
of the detailed composition of $\mathbf{p}_n$ deferred to a later discussion. Accordingly, a
distribution of static dipole moments will first be assumed and an expression derived for the
total electric field due to static primary charges and the assemblage of dipoles. This
expression will then be used to explain the manner in which dielectric materials affect
capacitance, to generalize the meaning of the electric flux density $\mathbf{D}_0$, and to deduce a
relation for the local field at the site of any molecular dipole. Consideration will then
be given to the problem of connecting, for a linear dielectric material, the strengths
of the local field and the induced moments $\mathbf{p}_n$. This will be done for electronic, ionic, and
orientational polarization and will be seen to permit simplifications of the expression
for the total field; additionally, it will lead to the relation $\mathbf{D} = \epsilon\mathbf{E}$, with the permittivity
factor $\epsilon$ serving to describe dielectric behavior for large classes of materials.
Some attention next will be given to nonlinear materials, notably ferroelectric
crystals, in which the polarization is not only not linearly proportional to the applied
field, but also depends on the prior history of excitation.
For linear dielectric materials, the theory will then be extended to the case of time-
varying fields, at which point the necessity to include dielectric losses will arise. The
concept of a complex permittivity will be introduced whose imaginary part accounts
for these losses, and the dependence of permittivity on frequency will be considered
for all three types of polarization.
Finally, the free-space form of Maxwell's equations derived in Chapter 5 will be
extended to apply in regions occupied by dielectric materials. Additional extensions
of Maxwell's equations will occur at the ends of Chapter 7 (for magnetic materials) and
Chapter 8 (for conductive materials).

6.1 * HISTORICAL SURVEY

The recognition of a distinction between two classes of materials, conductors and
insulators, dates from 1729 when Stephen Gray discovered the phenomenon of electric
conduction.¹ Most common substances were soon categorized with respect to this
property; the metals were identified as good conductors, and many excellent insulators
were known and widely used by electrical experimenters of the eighteenth century.
It will be remembered from Chapter 3 that during this period Henry Cavendish became
interested in electrostatics, and was the first to observe that the presence of an insulator
between the plates of a condenser increased its capacity to store charge for a given
voltage. He measured the relative dielectric constants of many common substances,
such as shellac, beeswax, ebonite, and paraffin, and performed experiments which
indicated that the dielectric constant was independent of voltage (for glass) and
independent of temperature (for rosin).² However, Cavendish's papers were still
unpublished in 1837 when Michael Faraday rediscovered the effect.
Faraday was led to this problem by researches he had conducted on the decomposition
of chemical compounds placed between electrodes. Contrary to the behavior of
liquid electrolytes, he observed that when a solid substance such as sulphur was used,
it did not conduct electricity and was in no way decomposed; yet its presence between
the electrodes did cause an effect in that the charge stored on each electrode was
increased over the value found when air was the intervening medium. Faraday pursued
this discovery, selecting shellac and sulphur as the two insulating substances best
* This section may be omitted without loss in continuity of the technical presentation.
1 S. Gray, "Several Experiments Concerning Electricity," Phil Trans Roy Soc (London), 37, 18-44;
Feb., 1731.
2 H. Cavendish, Electrical Researches, ed. by J. C. Maxwell. See particularly Notes 15 and 27.
Cambridge University Press, London, 1879.
suited for experimental study of the phenomenon. Using two identical spherical
capacitors, he left one air-filled and fitted a hemispherical shellac cup between the
conducting shells of the other. Upon comparing the ratio of charge to voltage for the
two condensers, Faraday concluded³
. . . assuming the capacity of the air apparatus as 1, that of the shell-lac apparatus would
be tfi or 1.55 . . . this by no means expresses the relation of lac to air. The lac only
occupies one half of the space . . . if the effect of the two upper halves of the globes be
abstracted, then the comparison of the shell-lac power in the lower half of the one, with
the power of the air in the lower half of the other, will be as 2: 1 . . . . I cannot resist the
conclusion that shell-lac does exhibit a case of specific inductive capacity.

This coefficient of specific inductive capacity, which Faraday introduced as a
quantitative measure of capacity enhancement, is today called the relative dielectric
constant. For sulphur he determined it to be 2.24, and then extended his experiments
to include a variety of insulators, both liquid and gaseous, as well as solid.
To explain this phenomenon, Faraday formed a physical conception of the action of
insulators, based on an idea originally put forward by Davy some years earlier to
describe the behavior of a voltaic pile. Davy had supposed that prior to chemical
decomposition, the molecules of the liquid electrolyte became electrically polarized.
Faraday supported this hypothesis by noting⁴
When I discovered the general fact that electrolytes refused to yield their elements to a
current when in the solid state, though they gave them forth freely if in the liquid con-
dition, I thought I saw an opening to the elucidation of inductive action . . . . For let
the electrolyte be water, a plate of ice being coated with platina foil on its two surfaces,
and these coatings connected with any continued source of the two electrical powers, the
ice will charge like a Leyden arrangement, representing a case of common induction, but
no current will pass. If the ice be liquefied, the induction will fall to a certain degree, be-
cause a current can now pass; but its passing is dependent upon a peculiar molecular arrange-
ment of the particles . . . .
Faraday then drew the inference
. . . As, therefore, in the electrolytic action, induction appeared to be the first step, and
decomposition the second . . . as the induction was the same in its nature as that through
air, glass, wax, . . . produced by any of the ordinary means: and as the whole effect in
the electrolyte appeared to be an action of the particles thrown into a peculiar or polarized
state, I was led to suspect that common induction itself was in all cases an action of con-
tiguous particles . . . .
In the following year Faraday elaborated on this idea, saying⁵
The particles of an insulating dielectric whilst under induction may be compared to a
series of small magnetic needles, or more correctly still to a series of small insulated con-
ductors. If the space round a charged globe were filled with a mixture of an insulating
dielectric, as oil of turpentine or air, and small globular conductors, as shot, the latter
being at a little distance from each other so as to be insulated, then these would in their
condition and action exactly resemble what I consider to be the condition and action of
3 M. Faraday, Experimental Researches in Electricity, vol. 1, Sec. 1252-1270, Bernard Quaritch,
Publisher, London, 1839.
4 Ibid., Sec. 1164.
5 Ibid., Sec. 1679.


the particles of the insulating dielectric itself. If the globe were charged, these little con-
ductors would all be polar; if the globe were discharged, they would all return to their
normal state, to be polarized again upon the recharging of the globe.

This insight is all the more remarkable when one remembers the primitive state of
atomic theory in 1838. With this model, Faraday was able to deduce that the polarization
of the dielectric would be opposite to the influence causing it, thus requiring more
primary charge to maintain the same voltage. This provided an explanation for the
increase of capacity due to the presence of a dielectric.
In drawing an analogy between dielectric polarization and the behavior of a series
of small magnetic needles, Faraday established a link to a successful theory of
magnetization promulgated fourteen years earlier by Poisson.⁶ Poisson had adopted Coulomb's
doctrine of two magnetic fluids as the basis for his analysis. These fluids were presumed
to neutralize each other unless magnetically excited, in which case they moved to
opposite ends of the individual elements inside a magnetic body, but were incapable of
passing from one element to the next. This polarization of the two magnetic fluids
then caused a magnetic field distribution which was derivable as the gradient of a
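potential function $\Phi_m$. Poisson showed this potential function to be given by the
expression

$$\Phi_m = \oint_S \frac{\mathbf{M}\cdot d\mathbf{S}}{|\boldsymbol{\xi}|} + \int_V \frac{(-\nabla\cdot\mathbf{M})}{|\boldsymbol{\xi}|}\,dV$$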

with the first integral taken over the surface of the magnetic body, the second taken
throughout its volume, and M the polarization density, or magnetization. This formula
shows that the magnetic field produced by the body is the same as would be caused
by a fictitious distribution of magnetic charges, consisting of a surface layer whose
density is the normal component of M, plus a volume distribution of density $-\nabla\cdot\mathbf{M}$.
With this interpretation, Poisson was able to explain the magnetic phenomena known
at that time, and this theory was one of his most significant achievements.
Accepting Faraday's conception of an analogous electric polarization in insulating
materials, all that was needed to provide a theory for dielectric behavior was to trans-
late Poisson's theory of induced magnetism into electrostatic language. This was
done independently by Lord Kelvin⁷ and F. O. Mossotti⁸ and the essence of their
development will be found in Section 6.3. In his memoir, Mossotti also showed the
manner in which the dielectric constant of a material depends on its mass density.
This relation, known as the Clausius-Mossotti equation, was derived independently by
Clausius some years later,⁹ and will be considered in Section 6.12.
With the growth of atomic theory, the nature of polarization mechanisms became
more clearly understood, and both electronic and ionic polarization were recognized
as forms of induction. It was appreciated that an atom or ion pair would become
polarized under the action of a local electric field, and that this local field was con-
tributed to not only by the external sources, but also by the induced dipoles in all other
atoms or ion pairs of the material. Lorentz derived a particularly useful expression for
6 S. D. Poisson, "Memoir on the Theory of Magnetism," Mem Acad Sci (Paris), ser. 2, 5, 247-338;
February 1824.
7 W. Thomson, Cambridge and Dublin Math J, 1, 75; November 1845.
8 F. O. Mossotti, Arch des Sc Phys (Geneva), 6, 193; 1847. Mem della Soc Ital Modena (2), 14, 49; 1850.
9 R. Clausius, Die mechanische Wärmelehre, vol. 2, pp. 62-97, Vieweg, 1879.


the local field in terms of the externally applied field and the polarization density.¹⁰ If
one uses the Lorentz expression, it is possible to obtain an explicit relation between the
externally applied field and the induced polarization, thus accounting for the alteration
in field distribution due to the presence of the material.
Orientational polarization, due to the existence of permanent dipole moments in
certain molecules, was first hypothesized by Debye¹¹ in 1912. He used this concept to
explain the high static dielectric constants of water, alcohol, and similar liquids.
Borrowing from an earlier analysis of Langevin, concerned with the analogous problem
of orientation of magnetic dipoles in a permeable material, Debye was able to demon-
strate a temperature dependence for the dielectric constant of substances containing
polar molecules. He extended the Clausius-Mossotti equation to include orientational
polarization, and provided a technique whereby experimental data giving dielectric
constant versus temperature could be used to separate the orientational polarizability
from the electronic and ionic contributions.
Maxwell had introduced the displacement vector D to represent the polarization of a
medium, as has already been noted in Chapter 5, and this has proved to be a highly
satisfactory means for characterizing dielectric behavior. In general, it is usually
possible to connect D to the electric field by the relation $\mathbf{D} = \epsilon_r\epsilon_0\mathbf{E}$ in which $\epsilon_r$ is the
dimensionless relative dielectric constant ($\epsilon_r$ may be a tensor). This functional relation
between D and E is controlled by the polarization properties of the material and thus
serves as a description of these properties. A central aim in any dielectric theory is to
establish the dependence of $\epsilon_r$ on the properties of materials and the state variables.
The last half century has seen an accumulation of data giving $\epsilon_r$ for a vast number
of materials as a function of several parameters, such as frequency and temperature.¹²
Theory and experiment are found to be in good agreement for most gases. In solids and
liquids, where the molecules are closer together and the local field is more complex,
agreement is less satisfactory, and subtle refinements of the theory are required.
Considerable progress has been made with this problem in the last few decades and
this is an area of research showing continued interest.

6.2 THE ELECTRIC MOMENT OF A NEUTRAL SYSTEM OF CHARGES

In the sections which follow, the electric polarization of materials will be used as the
basis for an explanation of their dielectric behavior. A fundamental concept associated
with the phenomenon of polarization is the electric moment of a system of charges $q_i$.
If $\mathbf{r}_i$ is the position vector of the ith charge with respect to some fixed point, $q_i\mathbf{r}_i$ is
defined as its electric moment with respect to that point. To generalize, the electric
moment of the system of charges $q_i$ relative to a fixed origin is given by

$$\mathbf{p} = \sum_i q_i\mathbf{r}_i \tag{6.1}$$

When the net charge of the system is zero, this definition is independent of the choice

10 H. A. Lorentz, The Theory of Electrons, Sec. 117, Teubner Publishing Company, Leipzig, 1909.
Reprinted by Dover Publications, Inc., New York, 1952.
11 P. Debye, Polar Molecules, Dover Publications, Inc., New York, 1945.
12 E.g., see A. R. von Hippel, ed., Dielectric Materials and Applications, John Wiley and Sons, Inc.,
New York, 1954.


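of origin, for if $\sum_i q_i = 0$ and the origin is displaced an amount $-\Delta\mathbf{r}$, the new value for
electric moment is

$$\mathbf{p} + \Delta\mathbf{p} = \sum_i q_i(\mathbf{r}_i + \Delta\mathbf{r}) = \sum_i q_i\mathbf{r}_i + \Delta\mathbf{r}\sum_i q_i = \mathbf{p}$$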

Under this condition of charge neutrality, the electric moment may be written in
another form by introducing electric "centers of gravity" for the positive and negative
charges via the equations
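
$$\sum_{+} q_i\mathbf{r}_i = \mathbf{r}_1\sum_{+} q_i = \mathbf{r}_1 Q$$

$$\sum_{-} q_i\mathbf{r}_i = \mathbf{r}_2\sum_{-} q_i = -\mathbf{r}_2 Q$$

with $\mathbf{r}_1$ and $\mathbf{r}_2$ the equivalent positions of the total positive and negative charges
$Q$, $-Q$. With these results (6.1) may be rewritten as

$$\mathbf{p} = Q(\mathbf{r}_1 - \mathbf{r}_2) = Q\mathbf{d} \tag{6.2}$$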

in which d is the directed distance from the center of negative charge to the center of
positive charge.
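
The two properties just derived, independence of the choice of origin for a neutral system and the equivalent form (6.2), can be checked numerically. What follows is a minimal Python sketch under stated assumptions: the charge values, positions, and the helper names `electric_moment` and `center_of_charge` are illustrative and do not come from the text.

```python
# Illustrative check of Equations (6.1) and (6.2) for a neutral set of point charges.
# All numerical values below are hypothetical.

def electric_moment(charges, positions, origin=(0.0, 0.0, 0.0)):
    """Return p = sum_i q_i (r_i - origin) as a 3-tuple, per Equation (6.1)."""
    p = [0.0, 0.0, 0.0]
    for q, r in zip(charges, positions):
        for k in range(3):
            p[k] += q * (r[k] - origin[k])
    return tuple(p)


def center_of_charge(charges, positions, sign):
    """Return (center, total) for the charges whose sign matches `sign`."""
    picked = [(q, r) for q, r in zip(charges, positions) if q * sign > 0]
    total = sum(q for q, _ in picked)
    center = tuple(sum(q * r[k] for q, r in picked) / total for k in range(3))
    return center, total


charges = [2.0, 1.0, -1.5, -1.5]  # net charge is zero
positions = [(0.0, 0.0, 0.1), (0.2, 0.0, 0.0), (0.0, 0.1, 0.0), (0.0, 0.0, -0.1)]

# Same moment about two different origins (neutral system).
p_a = electric_moment(charges, positions)
p_b = electric_moment(charges, positions, origin=(5.0, -3.0, 2.0))

# Equivalent form p = Q d, with d drawn from the negative to the positive center.
r1, Q = center_of_charge(charges, positions, +1)
r2, _ = center_of_charge(charges, positions, -1)
p_qd = tuple(Q * (r1[k] - r2[k]) for k in range(3))

print(p_a, p_b, p_qd)
```

If the net charge were not zero, `p_b` would differ from `p_a` by the origin shift times the net charge, which is the term that vanishes in the derivation above.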
A simple example to which this analysis is applicable is the case of a single atom,
with $\mathbf{r}_1$ the equivalent center of the nucleus charge and $\mathbf{r}_2$ the equivalent center of the
electron cloud. If the atom is polarized, these positions do not coincide, and d has a
value different from zero. If on the time average d is a constant, the atom is statically
polarized; alternatively, under the influence of a time-harmonic electric field, d will be
oscillatory. This formulation is also applicable to a pair of ions, with d drawn from the
center of charge of the negative ion to the comparable point in the positive ion.
If several neutral systems of charge are considered jointly, their combined electric
moment may be written

$$\mathbf{p} = \sum_j \mathbf{p}_j \tag{6.3}$$

in which $\mathbf{p}_j = q_j\mathbf{d}_j$, with $q_j$ the total positive charge of the jth system, and $\mathbf{d}_j$ the directed
distance from the center of negative charge of the jth system to the center of positive
charge of the jth system. This formulation is applicable, for example, to the case of a
molecule in which the jth subsystem is a neutral atom. It is also applicable to a molecule
in which the jth subsystem is an ion pair, or to a group of ions which are arranged in
ion pairs. In even greater generality, it is applicable to any neutral system of charges
which can be divided into subsystems of charge which are also neutral. It is in this
larger sense that Equation (6.3) will be applied to the polarization of dielectric materials.

6.3 THE STATIC MACROSCOPIC ELECTRIC FIELD
DUE TO A VOLUME OF POLARIZED DIELECTRIC MATERIAL

The definition of E in vacuo, as introduced in Chapter 3 and generalized in Chapters 4
and 5, may be extended to apply to matter in a manner given by Lorentz in his theory
of electrons. If matter is pictured as consisting of electrons and positively charged
nuclei, since these elementary particles are extremely small compared to atomic
dimensions, at the microscopic level the picture is one of a region of space which is
essentially vacuous, with a very small part occupied by particles. The electric field
associated with these particles is customarily called the microscopic electric field and
denoted by the symbol e; a microscopic magnetic field b is also associated with the
motions of the particles. These microscopic field intensities satisfy Maxwell's equations
in vacuo as given in Chapter 5. However, the rapid variation in space and time of e and b
is not ordinarily a feature of interest. For this reason, the local average values of e and b
are introduced, with the averages taken over sufficient distance and time to smooth out
the microscopic variations. These averaged field intensities are called the macroscopic
fields and represented by the symbols E and B. It will be shown in due course that these
macroscopic fields also satisfy Maxwell's equations for all types of materials, whether
they be dielectric, magnetic, or conductive. Though established from a different view-
point, these macroscopic fields are identical with those introduced by Maxwell in his
phenomenological theory which treated matter as a continuum.
In the case of dielectric materials, a convenient starting point for the development
of a theory is to assume that the material is ideal, meaning that there is no free charge
and that all regions of the material are electrically neutral. The fundamental "building
block" of a material will be designated by the generic term molecule, this term standing
for a neutral atom, or several atoms joined in homopolar, covalent, or ionic bonds, or
a small group of ion pairs, etc. (whichever is characteristic of the material in question).
Some examples would be

1. Neutral atom: noble gases such as Ar, Ne
2. Homopolar bond: O₂, N₂
3. Covalent bond: SiC
4. Ion pair: HCl
5. Group of ion pairs: Solid NaCl†

Since a fundamental characteristic of any molecule under the above definition is its
electrical neutrality, if the molecule possesses an electric moment, this moment may be
written in the form of Equation (6.3).
In developing a dielectric theory, it will be advantageous to consider static conditions
first, since they are less complex and lead to some useful conclusions which simplify
the later discussion of time-varying effects. Therefore, the time-average positions of
the centers of positive and negative charge within the nth molecule will be assumed.
Then if within this molecule a subgroup of charge $-q_{n1}$ is centered a distance $d_{n1}$ from
a subgroup of charge $+q_{n1}$, this will constitute a dipole within the molecule, and will
contribute an electric moment, or dipole moment, $\mathbf{p}_{n1} = q_{n1}\mathbf{d}_{n1}$, with $\mathbf{d}_{n1}$ the directed
displacement from $-q_{n1}$ to $+q_{n1}$. If both the magnitude and direction of $\mathbf{d}_{n1}$ are caused
† In cases such as this, there is no unique way to define the molecule. Solid NaCl has a cubical lattice
structure in which Na⁺ and Cl⁻ ions alternate in a three-dimensional array. One could imagine a
rectangular cell containing one pair of these ions as constituting a molecule, but this has some
difficulties in that a single Na⁺ ion is bonded to six Cl⁻ ions in the solid state. Alternatively, one could
imagine a larger cubical cell which has Cl⁻ ions at all eight corners and at the centers of all six faces,
and Na⁺ ions at the centers of all twelve edges, as well as at the volumetric center of the cell. The
corner ions belong equally to eight cells, the edge ions equally to four cells, and the face ions equally
to two cells, so that this molecule is equivalent to four ion pairs of (Na⁺, Cl⁻). This conception has
the advantage of including in the molecule all of the principal bonds involving the central Na⁺ ion,
although it does have the unnatural feature of partial ions. Either conception of the solid NaCl
molecule is electrically neutral.
by an external field, $\mathbf{p}_{n1}$ denotes either electronic or ionic polarization. If only the
direction of $\mathbf{d}_{n1}$ is influenced by an external field, $\mathbf{p}_{n1}$ is representative of orientational
polarization. At this stage in the analysis, it will not be necessary to define the nature and
cause of the dipole moments explicitly, and $\mathbf{p}_{n1}$ will be taken to represent any of the
three types of polarization.

FIGURE 6.1 Notation for dipole moments.

The nth molecule may consist of many subgroups of displaced charge $(q_{ni}, -q_{ni})$, the
displacements being $\mathbf{d}_{ni}$, in which case a total dipole moment is defined for the molecule
by the relation

$$\mathbf{p}_n = \sum_i \mathbf{p}_{ni} = \sum_i q_{ni}\mathbf{d}_{ni} \tag{6.4}$$

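With reference to Figure 6.1, if $\boldsymbol{\xi}_n$ is drawn from the site of $\mathbf{p}_n$ to a distant point $(x,y,z)$,
then the dipole moment $\mathbf{p}_n$ causes a potential at $(x,y,z)$ given by

$$\Phi_n = \frac{\mathbf{p}_n\cdot\boldsymbol{\xi}_n}{4\pi\epsilon_0|\boldsymbol{\xi}_n|^3} \tag{6.5}$$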

this result being merely a rephrasing of Example 3.5.
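
The far-field character of (6.5) can be illustrated by comparing it with the exact potential of two opposite point charges. The sketch below is a hedged illustration only: the charge, separation, and field point are hypothetical values, and the function names are assumptions introduced here.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def dipole_potential(p, xi):
    """Equation (6.5): p . xi / (4 pi eps0 |xi|^3), xi drawn from dipole to field point."""
    dist = math.sqrt(sum(c * c for c in xi))
    p_dot_xi = sum(a * b for a, b in zip(p, xi))
    return p_dot_xi / (4.0 * math.pi * EPS0 * dist**3)


def two_charge_potential(q, d, xi):
    """Exact potential of +q at +d/2 and -q at -d/2, with xi measured from the midpoint."""
    r_plus = math.dist(xi, tuple(0.5 * c for c in d))
    r_minus = math.dist(xi, tuple(-0.5 * c for c in d))
    return q / (4.0 * math.pi * EPS0) * (1.0 / r_plus - 1.0 / r_minus)


q = 1.602e-19                 # hypothetical charge magnitude, C
d = (0.0, 0.0, 1.0e-10)       # hypothetical displacement from -q to +q, m
xi = (3.0e-9, 0.0, 4.0e-9)    # field point 5 nm away, about 50 times |d|

p = tuple(q * c for c in d)   # dipole moment p = q d
phi_dipole = dipole_potential(p, xi)
phi_exact = two_charge_potential(q, d, xi)
relative_error = abs(phi_dipole - phi_exact) / abs(phi_exact)

print(phi_dipole, phi_exact, relative_error)
```

Moving the field point closer to the charge pair makes the relative error grow, which is why the derivation that follows requires the field point to be remote from every dipole.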


It is usually the case that the volume V of the dielectric material is large enough so
that it may be subdivided into a great number N of macroscopic volume elements dV,
with N big enough so that the methods of the integral calculus may be employed,
whereas dV is large enough to contain many molecules.† When this is so, a continuum
† This use of the symbol dV is open to criticism in that dV is not in this case a mathematical
infinitesimal. However, if another symbol were used to emphasize its macroscopic nature, the subsequent
integral formulations would suffer from the awkwardness of unfamiliar notation.
approximation may be utilized. Let $\mathbf{P}(\xi,\eta,\zeta)\,dV$ represent the vector sum of all the
dipole moments in dV, so that P is the volume density of dipole moments. By this, one
means that if there are M molecules within dV, then

$$\mathbf{P}\,dV = \sum_{n=1}^{M}\mathbf{p}_n$$

P is often called the polarization density, or simply the polarization.


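Since the position vectors $\boldsymbol{\xi}_n$ drawn from the various molecules within dV to the
distant point $(x,y,z)$ are essentially indistinguishable, one may write for the potential at
$(x,y,z)$, caused by all the dipoles within dV,

$$d\Phi = \frac{\mathbf{P}\cdot\boldsymbol{\xi}}{4\pi\epsilon_0|\boldsymbol{\xi}|^3}\,dV$$

with $\boldsymbol{\xi} = \mathbf{1}_x(x-\xi) + \mathbf{1}_y(y-\eta) + \mathbf{1}_z(z-\zeta)$ drawn from dV to $(x,y,z)$. From this
it follows that the potential at $(x,y,z)$ due to all the elementary dipoles in the entire
volume V of dielectric material is given by

$$\Phi(x,y,z) = \frac{1}{4\pi\epsilon_0}\int_V \frac{\mathbf{P}\cdot\boldsymbol{\xi}}{|\boldsymbol{\xi}|^3}\,dV \tag{6.6}$$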

Implicit in the derivation of (6.6) is the assumption that $(x,y,z)$ is sufficiently remote
from every dipole in V to satisfy the condition that $|\boldsymbol{\xi}| \gg d_{ni}$ for all n and for every i.
This condition is clearly met for points $(x,y,z)$ outside the material, and the remainder
of the analysis will first be carried through under this restriction.
The Field at an Exterior Point. Distinguishing del operators with respect to the
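source point $(\xi,\eta,\zeta)$ and the field point $(x,y,z)$ by the definitions

$$\nabla_S = \mathbf{1}_x\frac{\partial}{\partial\xi} + \mathbf{1}_y\frac{\partial}{\partial\eta} + \mathbf{1}_z\frac{\partial}{\partial\zeta}$$

$$\nabla_F = \mathbf{1}_x\frac{\partial}{\partial x} + \mathbf{1}_y\frac{\partial}{\partial y} + \mathbf{1}_z\frac{\partial}{\partial z}$$

one finds that

$$\nabla_S\left(\frac{1}{|\boldsymbol{\xi}|}\right) = \frac{\boldsymbol{\xi}}{|\boldsymbol{\xi}|^3}$$

and thus that

$$\Phi(x,y,z) = \frac{1}{4\pi\epsilon_0}\int_V \mathbf{P}\cdot\nabla_S\left(\frac{1}{|\boldsymbol{\xi}|}\right)dV \tag{6.7}$$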

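Use of the vector identity

$$\nabla_S\cdot\left(\frac{\mathbf{P}}{|\boldsymbol{\xi}|}\right) = \frac{\nabla_S\cdot\mathbf{P}}{|\boldsymbol{\xi}|} + \mathbf{P}\cdot\nabla_S\left(\frac{1}{|\boldsymbol{\xi}|}\right)$$

and then the divergence theorem converts (6.7) to the form

$$\Phi(x,y,z) = \int_S \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0|\boldsymbol{\xi}|} + \int_V \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0|\boldsymbol{\xi}|}\,dV \tag{6.8}$$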

in which S is the bounding surface of the dielectric material. The electric field caused
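by this polarized volume of material is therefore

$$\mathbf{E}(x,y,z) = -\nabla_F\left[\int_S \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0|\boldsymbol{\xi}|} + \int_V \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0|\boldsymbol{\xi}|}\,dV\right] \tag{6.9}$$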

A very interesting interpretation may be given to Equations (6.8) and (6.9) in terms
of equivalent charges. The electrical effect of the dielectric material at exterior points
is seen to be the same as though a volume density of bound electric charge $\rho_P = -\nabla_S\cdot\mathbf{P}$
were distributed throughout V, together with a surface density of bound electric charge
$\sigma_P = P_n$ distributed over S (with $P_n$ the normal component of P). This concept of
equivalent charges will prove useful in describing many aspects of dielectric behavior.
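
The equivalent-charge interpretation can be tried on a simple case. The sketch below assumes a uniformly polarized sphere (an example chosen here, not one worked in the text): the bound volume charge vanishes, only the surface density $\sigma_P = P\cos\theta$ remains, and the surface term of (6.8) evaluated on the polar axis should agree with the potential of a single dipole of moment $\tfrac{4}{3}\pi R^3 P$ at the center. The radius, polarization, field point, and function names are all illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def potential_from_bound_charge(P, R, z0, n_theta=2000):
    """Surface term of Equation (6.8) on the polar axis at distance z0 > R.

    P is uniform and along the polar axis, so the bound volume charge is zero
    and the bound surface charge density is sigma = P cos(theta).
    Midpoint-rule integration over rings of constant theta.
    """
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sigma = P * math.cos(theta)
        ring_area = 2.0 * math.pi * R**2 * math.sin(theta) * d_theta
        dist = math.sqrt(R**2 + z0**2 - 2.0 * R * z0 * math.cos(theta))
        total += sigma * ring_area / (4.0 * math.pi * EPS0 * dist)
    return total


def potential_from_total_dipole(P, R, z0):
    """On-axis potential of a point dipole of moment (4/3) pi R^3 P at the center."""
    p_total = (4.0 / 3.0) * math.pi * R**3 * P
    return p_total / (4.0 * math.pi * EPS0 * z0**2)


P, R, z0 = 1.0e-6, 0.01, 0.03  # hypothetical: C/m^2, m, m
phi_bound = potential_from_bound_charge(P, R, z0)
phi_dipole = potential_from_total_dipole(P, R, z0)

print(phi_bound, phi_dipole)
```

The agreement reflects the fact that the bound-charge picture and the direct summation over dipoles are two descriptions of the same exterior potential.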
The Field at an Interior Point. The previous analysis may be adapted to points
(x,y,z) within the dielectric by treating separately a small region surrounding the
interior point in question. With reference to Figure 6.2, let a spherical surface $S_\delta$ of
radius $\delta$ be erected around $(x,y,z)$ as center, thus separating a volume $V_\delta$ from the
remainder of the dielectric. $V_\delta$ should be chosen large enough to satisfy the condition
that $\delta$ be much greater than any $d_{ni}$, but it should be small enough that P is essentially
uniform throughout $V_\delta$. These conditions will normally be satisfied if $\delta$ is an order of
magnitude larger than the linear dimensions of the macroscopic volume element dV
described in the discussion following Equation (6.5).

FIGURE 6.2 The field at an interior point.

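If $V'$ is all the volume of dielectric except $V_\delta$, then

$$\mathbf{E}'(x,y,z) = -\nabla_F\left[\int_{S+S_\delta} \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0|\boldsymbol{\xi}|} + \int_{V'} \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0|\boldsymbol{\xi}|}\,dV\right] \tag{6.10}$$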
is the electric field at $(x,y,z)$ due to all the dielectric, except for the contribution
of the polarized molecules within $V_\delta$. If now $(x,y,z)$ is permitted to range over the
subvolume dV, consisting of the macroscopic volume element which is at the center of $V_\delta$,
no sensible change will occur in the value of $\mathbf{E}'(x,y,z)$ computed from (6.10), since dV
is so small compared to $V_\delta$. Thus Equation (6.10) also gives the average electric field
throughout a macroscopic volume element, due to all the dipoles outside $V_\delta$.
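It is shown in Appendix K that all the dipoles within $V_\delta$ cause an average electric
field throughout $V_\delta$ of amount

$$\mathbf{E}_i = -\frac{1}{4\pi\epsilon_0\delta^3}\sum_{n=1}^{M}\mathbf{p}_n \tag{6.11}$$

in which the sum is over all the dipole moments contained in $V_\delta$. But if P is the density
of dipole moments in $V_\delta$, then

$$\frac{4}{3}\pi\delta^3\,\mathbf{P} = \sum_{n=1}^{M}\mathbf{p}_n$$

so that

$$\mathbf{E}_i = -\frac{\mathbf{P}}{3\epsilon_0} \tag{6.12}$$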

Since $V_\delta$ is a small local volume, if the molecules within $V_\delta$ are alike and uniformly
distributed, it follows that (6.12) also gives the average electric field throughout the
macroscopic volume element dV, due to all the dipoles within $V_\delta$.† Therefore if $\mathbf{E}(x,y,z)$
is interpreted to mean the total average field in dV, or more briefly, the macroscopic field,
then

$$\mathbf{E}(x,y,z) = -\nabla_F\left[\int_{S+S_\delta} \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0|\boldsymbol{\xi}|} + \int_{V'} \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0|\boldsymbol{\xi}|}\,dV\right] - \frac{\mathbf{P}}{3\epsilon_0} \tag{6.13}$$

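This result can be simplified. Since $V_\delta$ is so small that P is uniform over $S_\delta$, one may
take a polar axis parallel to P, as indicated in Fig. 6.3, and conclude that $P_n = -P\cos\theta$
over $S_\delta$. (Note that dS is directed into $V_\delta$.) Then

$$\begin{aligned}
-\nabla_F\int_{S_\delta}\frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0|\boldsymbol{\xi}|}
&= \int_{S_\delta}\frac{(\mathbf{P}\cdot d\mathbf{S})\,\boldsymbol{\xi}}{4\pi\epsilon_0|\boldsymbol{\xi}|^3} \\
&= \int_0^{2\pi}\!\!\int_0^{\pi}\frac{(-P\cos\theta)(\delta^2\sin\theta\,d\theta\,d\phi)\left[-(\mathbf{1}_x\delta\sin\theta\cos\phi + \mathbf{1}_y\delta\sin\theta\sin\phi + \mathbf{1}_z\delta\cos\theta)\right]}{4\pi\epsilon_0\delta^3} \\
&= \mathbf{1}_z\frac{P}{2\epsilon_0}\int_0^{\pi}\cos^2\theta\,\sin\theta\,d\theta = \frac{\mathbf{P}}{3\epsilon_0}
\end{aligned} \tag{6.14}$$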

Therefore the equivalent bound surface charge over $S_\delta$ makes an equal and opposite
contribution to the macroscopic field at $(x,y,z)$, when compared to the contribution
made by the polarized molecules within $V_\delta$.
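
The value obtained in (6.14) can be confirmed by evaluating the surface integral numerically. The following is a rough sketch using a midpoint rule in both angles; the polarization magnitude, the radius, the grid size, and the function name are assumptions made for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m


def field_from_cavity_surface(P, delta, n_theta=200, n_phi=200):
    """Numerically evaluate the surface integral of Equation (6.14).

    P is taken along the polar (z) axis, dS is directed into the sphere,
    and the field point is the center of the sphere of radius delta.
    Returns the Cartesian components (Ex, Ey, Ez).
    """
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    ex = ey = ez = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        # P . dS with the inward normal: P_n = -P cos(theta)
        p_dot_ds = (-P * cos_t) * delta**2 * sin_t * d_theta * d_phi
        weight = p_dot_ds / (4.0 * math.pi * EPS0 * delta**3)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            # Vector from the surface element to the center of the sphere
            ex += weight * (-delta * sin_t * math.cos(phi))
            ey += weight * (-delta * sin_t * math.sin(phi))
            ez += weight * (-delta * cos_t)
    return ex, ey, ez


P = 1.0e-6        # hypothetical polarization magnitude, C/m^2
delta = 1.0e-7    # hypothetical radius, m (the result does not depend on it)

Ex, Ey, Ez = field_from_cavity_surface(P, delta)
expected = P / (3.0 * EPS0)

print(Ex, Ey, Ez, expected)
```

The radius cancels between the surface element and the inverse-square factor, so changing `delta` should leave the result unchanged to within rounding, consistent with (6.14) containing no $\delta$.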
† Most materials have this locally homogeneous property. This includes polycrystalline and
amorphous solids and many crystals, notably those possessing a cubical lattice structure. It also includes
gases and liquids, since the thermal agitation of their molecules assures a time-average homogeneity.
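Additionally, since $V_\delta$ is so small, $\nabla_S\cdot\mathbf{P}$ is constant throughout $V_\delta$ and thus

$$\nabla_F\int_{V_\delta}\frac{(\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0|\boldsymbol{\xi}|}\,dV = \int_{V_\delta}\frac{(\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0}\,\nabla_F\!\left(\frac{1}{|\boldsymbol{\xi}|}\right)dV = \frac{(\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0}\int_{V_\delta}\nabla_F\!\left(\frac{1}{|\boldsymbol{\xi}|}\right)dV = 0 \tag{6.15}$$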

FIGURE 6.3 Surface integral over $S_\delta$.

Therefore (6.15) may be added to the right side of (6.13) without affecting the value
of $\mathbf{E}(x,y,z)$. When this is done, and the explicit value for the surface integral obtained
in (6.14) is utilized, there results

$$\mathbf{E}(x,y,z) = -\nabla_F\left[\int_S \frac{\mathbf{P}\cdot d\mathbf{S}}{4\pi\epsilon_0|\boldsymbol{\xi}|} + \int_V \frac{(-\nabla_S\cdot\mathbf{P})}{4\pi\epsilon_0|\boldsymbol{\xi}|}\,dV\right]$$
But this is identical with (6.9) and thus the macroscopic field is given by the same
formula whether $(x,y,z)$ is an interior point or an exterior point.

EXAMPLE 6.1
In Chapter 3 the electrical system consisting of two oppositely charged, parallel,
conducting, closely-spaced plates was considered extensively. In Example 3.17 it was found
that if both plates were unheated, were separated by a distance l, and differed in potential
by $V_b$ volts, then

$$\Phi(x) = \frac{V_b\,x}{l} \qquad \mathbf{E} = -\mathbf{1}_x\frac{V_b}{l}$$

were the field expressions, with the coordinate system oriented as shown in the figure.
The charge density on the positive plate was $\sigma_0 = D_0 = \epsilon_0 V_b/l$, and in Example 3.31 the
capacitance per unit area was found to be $C_0 = \epsilon_0/l$.

Imagine now that a homogeneous, isotropic dielectric material completely fills the space
between the plates. With the battery still connected across the plates, the voltage
difference must remain $V_b$ volts. By symmetry considerations, one concludes that the potential
distribution is still $\Phi(x) = V_b x/l$ and that the electric field is still uniform, being given by
$\mathbf{E} = -\mathbf{1}_x V_b/l$. Therefore, the dielectric becomes uniformly polarized in the presence of this
electric field, as suggested by the figure. The polarization is directed down because the
positive charges are shifted in that direction, the negative charges being displaced upward.

[Figure: parallel-plate capacitor filled with dielectric; the positively charged plate is at $\Phi = V_b$ and the other plate at $\Phi = 0$.]

If a rectangular volume $V$, whose cross section is shown dotted in the figure, is selected
for application of Equation (6.8), one finds that $\nabla_S\cdot\mathbf{P} \equiv 0$ within $V$ and that $P_n \equiv 0$
over $S$ except for the two faces perpendicular to $x$. Over the face at $x = l$, $P_n = -P$,
whereas over the face at $x = 0$, $P_n = +P$, with $P$ the uniform volume density of induced
electric dipole moments. Thus the entire volume of dielectric behaves as though there were
uniform and opposite surface charge densities on its two faces contiguous to the metallic
plates. These equivalent charge densities are opposite in sign to the original primary charge
densities $(\sigma_0, -\sigma_0)$ placed on the plates by the battery.
Since the potential and electric field distributions between the plates must be the same
as before the dielectric was inserted, it follows that the battery must have delivered enough
additional charge to the two plates so as just to cancel the effect of the dielectric. Therefore
the new primary charge densities on the plates are $[(\sigma_0 + P), -(\sigma_0 + P)]$. The capacitance
per unit area is now

$$C = \frac{\sigma_0 + P}{V_b} = C_0\left(1 + \frac{P}{\sigma_0}\right) > C_0$$

This increase in capacitance may be considerable, depending on the strength of the
induced $P$ for a given $\sigma_0$. Later sections of this chapter will be concerned with a
determination of $P$ as a function of external stimulus.
This conclusion that the capacitance has been increased due to the presence of a dielectric
is consistent with an energy argument. According to Equation (3.152), the energy stored in
the capacitor, per unit plate area, is now $W = \tfrac{1}{2}(\sigma_0 + P)V_b$, and this increase in stored
energy can be traced to the fact that it took work to cause a relative displacement of the
positive and negative charges comprising the dielectric.
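
As a rough numerical illustration of Example 6.1, the sketch below evaluates $C_0$, $C$, and the stored energy per unit area. The function names and the sample values of $V_b$, $l$, and $P$ are assumptions made for illustration only; in particular, $P$ is simply supplied as an input, since its dependence on the field is not established until later sections.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m


def capacitance_per_area(V_b, l, P=0.0):
    """Return (C0, C) per unit plate area for plate spacing l and voltage V_b.

    P is the magnitude of the uniform polarization (C/m^2), assumed known.
    """
    sigma0 = EPS0 * V_b / l        # primary charge density with no dielectric
    C0 = sigma0 / V_b              # equals eps0 / l
    C = (sigma0 + P) / V_b         # equals C0 * (1 + P / sigma0)
    return C0, C


def stored_energy_per_area(V_b, l, P=0.0):
    """Energy per unit plate area, W = (1/2)(sigma0 + P) V_b."""
    sigma0 = EPS0 * V_b / l
    return 0.5 * (sigma0 + P) * V_b


V_b, l = 100.0, 1.0e-3             # assumed: 100 V across a 1 mm gap
sigma0 = EPS0 * V_b / l
C0, C = capacitance_per_area(V_b, l, P=2.0 * sigma0)  # assumed P = 2 sigma0
print(C / C0)
print(stored_energy_per_area(V_b, l, P=2.0 * sigma0) / stored_energy_per_area(V_b, l))
```

With the assumed $P = 2\sigma_0$, both the capacitance and the stored energy come out three times their free-space values, in line with the factor $(1 + P/\sigma_0)$.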

6.4 A GENERALIZATION OF $\mathbf{D}_0$

Consider next a general electrostatic system consisting of primary charges plus electric
dipoles which represent dielectric materials. One can imagine that a primary volume
charge density $\rho(\xi,\eta,\zeta)$ occupies the volume $V_1$ and that an assortment of dielectric
materials occupies the volume $V_2$, being bounded by the surface $S_2$. $V_1$ and $V_2$ may
overlap and neither of them need be only one simply connected region. If $\mathbf{P}(\xi,\eta,\zeta)$ is the
volume density of electric dipole moments in $V_2$, then the total macroscopic electric
field $\mathbf{E}$ at a point $(x,y,z)$ will be

$$\mathbf{E}(x,y,z) = \mathbf{E}_1(x,y,z) + \mathbf{E}_2(x,y,z) \tag{6.16}$$
in which

(6.17)

and

(6.18)

If the entire electrostatic system, including the dielectric materials, is viewed as a
distribution of primary charges of density $\rho$ and equivalent bound charges of density
$\rho_P = -\nabla_S\cdot\mathbf{P}$ in a vacuum, a total macroscopic electric flux density may be defined by

$$\begin{aligned}
\mathbf{D}_0(x,y,z) &= \epsilon_0\mathbf{E}(x,y,z) \\
&= \epsilon_0\mathbf{E}_1(x,y,z) + \epsilon_0\mathbf{E}_2(x,y,z) \\
&= \mathbf{D}_{01}(x,y,z) + \mathbf{D}_{02}(x,y,z)
\end{aligned} \tag{6.19}$$

In (6.19) $\mathbf{D}_{01}$ is the density of electric flux which originates on the positive charges
of the primary distribution $\rho$ and terminates on its negative charges. Similarly $\mathbf{D}_{02}$ is
the density of electric flux which originates on the positive charges of the equivalent
distribution $\rho_P$ and terminates on its negative charges. $\mathbf{D}_{02}$ is a macroscopic field and
does not have the detailed microscopic structure of a flux field associated with the
individual charges which comprise the molecular dipoles. Both the flux densities $\mathbf{D}_{01}$
and $\mathbf{D}_{02}$ satisfy Gauss' law and are derivable through formation of the gradient of a
scalar potential function. Therefore

$$\begin{aligned}
\nabla\cdot\mathbf{D}_{01} &= \rho \\
\nabla\times\mathbf{D}_{01} &\equiv 0 \\
\nabla\cdot\mathbf{D}_{02} &= -\nabla\cdot\mathbf{P} \\
\nabla\times\mathbf{D}_{02} &\equiv 0
\end{aligned} \tag{6.20}$$
In most problems of practical significance, one is interested in the total electric field,
which determines the force on a charge, and in the electric flux density $\mathbf{D}_{01}$, which is
linked to the primary sources $\rho\,dV$ through Gauss' law. Equation (6.19) can be
rewritten to display these two quantities of interest in the form

$$\mathbf{D}_{01} = \epsilon_0\mathbf{E} - \mathbf{D}_{02} \tag{6.21}$$

$\mathbf{D}_{02}$ is the extraneous quantity in this equation, and it is advantageous to explore the
implications of defining a macroscopic electric flux density field $\mathbf{D}$ by the relation

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} \tag{6.22}$$

The substitution of $\mathbf{P}$ for $-\mathbf{D}_{02}$ is clearly suggested by the third of Equations (6.20).

In making the definition (6.22) it is to be emphasized that $\mathbf{E}$ still has the meaning of
total macroscopic electric field and that $\mathbf{P}$ still represents the macroscopic volume
density of electric dipole moments. Outside $V_2$, where $\mathbf{P} \equiv 0$,

$$\mathbf{D} = \mathbf{D}_{01} + \mathbf{D}_{02} \tag{6.23}$$

whereas inside $V_2$

$$\mathbf{D} = \mathbf{D}_{01} + \mathbf{D}_{02} + \mathbf{P} \tag{6.24}$$

Thus in terms of the earlier conception of electric flux associated with charges in free
space (i.e., the primary charges $\rho$ and the equivalent bound charges $\rho_P$), $\mathbf{D}$ equals the
total macroscopic flux density $\mathbf{D}_0$ outside the dielectric materials, but not inside.
Furthermore, taking the curl of (6.22) and utilizing (6.20) gives

$$\nabla\times\mathbf{D} = \nabla\times\mathbf{P} \tag{6.25}$$

whereas taking the divergence yields

$$\nabla\cdot\mathbf{D} = \rho \tag{6.26}$$

Therefore $\mathbf{D}$, as defined by (6.22), may not be irrotational, but it does satisfy Gauss'
law in terms of the primary sources only, and one may write

$$\oint_S \mathbf{D}\cdot d\mathbf{S} = \int_V \rho\,dV \tag{6.27}$$

and conclude that $\mathbf{D}$ has discontinuities only at points occupied by primary charges.
Equation (6.22) will be adopted as the defining relation for the generalized electric
flux density function $\mathbf{D}$, with the understanding that this equation refers to the total
macroscopic $\mathbf{E}$ everywhere, whether inside or outside the materials, but that it only
refers to the total macroscopic $\mathbf{D}_0$ field outside the materials. It is an equation of cardinal
importance in the theory of dielectrics, because usually it can be converted to the form
$\mathbf{D} = \epsilon_r\epsilon_0\mathbf{E}$, since for most materials $\mathbf{P}$ is expressible as a function of $\mathbf{E}$. The relative
dielectric constant $\epsilon_r$ then serves the role of representing the macroscopic electric
behavior of the dielectric.
Besides containing the polarization density $\mathbf{P}$ explicitly, Equation (6.22) has several
other advantages. Outside the materials, (6.22) gives $\mathbf{D} = \epsilon_0\mathbf{E}$, which is a natural
relation, whereas (6.21) does not similarly connect $\mathbf{D}_{01}$ and $\mathbf{E}$. In the absence of materials,
(6.22) reduces everywhere to (3.32) and thus all the earlier discussion of electric fields
due only to primary charges becomes a special case of the present generalization. But
an even more important feature is that $\mathbf{D}$, as defined by (6.22), shares with $\mathbf{D}_{01}$ the
property of satisfying Gauss' law, thus providing a cause and effect relation with the
system of primary charges.
EXAMPLE 6.2
Imagine that a parallel plate capacitor is charged by a battery until its plates have densities
$(\sigma, -\sigma)$ and that then the battery is removed. If a block of dielectric is inserted between the
plates, as shown in the figure, the $\mathbf{D}$ field is unaffected, since the charges on the plates
cannot change. However, the $\mathbf{E}$ field is affected, because the induced polarization in the
dielectric produces a field in the opposite direction. The equivalent bound surface charges
$(\sigma_P, -\sigma_P)$ on the dielectric faces are also shown in the $\mathbf{E}$ plot. If the $\mathbf{D}_0$ field were shown,
it would be everywhere proportional to $\mathbf{E}$ through the factor $\epsilon_0$, and just like $\mathbf{E}$ would be
discontinuous at the dielectric faces, due to the flux lines originating on $\sigma_P$ and terminating
on $-\sigma_P$.

[Figure: two panels. Left, D field lines between plate charges $\sigma$ and $-\sigma$. Right, E field lines, with the bound surface charges $-\sigma_P$ and $\sigma_P$ shown on the dielectric faces.]
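
The bookkeeping of Example 6.2 can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, all quantities are treated as magnitudes normal to the plates, and the polarization $P$ in the block is taken as a given input rather than computed from the field.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m


def fields_after_insertion(sigma, P):
    """Normal field magnitudes for an isolated capacitor (battery removed).

    sigma : primary charge density on the plates, C/m^2
    P     : polarization magnitude inside the dielectric block, C/m^2 (assumed known)
    Returns (D, E_gap, E_dielectric, sigma_p).
    """
    D = sigma                   # Gauss' law (6.27): D is fixed by primary charge only
    E_gap = D / EPS0            # outside the block P = 0, so D = eps0 * E
    E_diel = (D - P) / EPS0     # inside the block, from D = eps0 * E + P
    sigma_p = P                 # magnitude of bound surface charge on each face
    return D, E_gap, E_diel, sigma_p


sigma = 1.0e-6                  # assumed plate charge density, C/m^2
P = 0.6e-6                      # assumed polarization, C/m^2
D, E_gap, E_diel, sigma_p = fields_after_insertion(sigma, P)
print(E_gap, E_diel)
```

The drop in $E$ across each dielectric face is $\sigma_P/\epsilon_0$, which is the discontinuity that the $\mathbf{E}$ (and $\mathbf{D}_0$) plots display while $\mathbf{D}$ stays continuous.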

EXAMPLE 6.3
Conceptually, the macroscopic field vectors $\mathbf{E}$ and $\mathbf{D}$, which appear in (6.22), may be
measured inside a dielectric with the aid of small cavities constructed within the dielectric.
With reference to the first figure, let it be desired to determine $\mathbf{E}$ at the point $A$, and
imagine that through the removal of material, a long, thin, needle-shaped cavity of
arbitrary orientation has been created with $A$ as its central point. For the moment, let it be
assumed that compensating charges of volume density $-\nabla\cdot\mathbf{P}$ and surface density $P_n$ have
been placed in the cavity and on its walls, so that the macroscopic field everywhere is the
same as it was before removal of part of the dielectric. Since $\nabla\times\mathbf{E} \equiv 0$, it follows that
around the rectangular contour 1234,

$$\oint_{1234} \mathbf{E}\cdot d\mathbf{l} = 0$$

If the cavity is small, and the legs 12 and 34 are negligible, this means that $E_{14} = E_{23}$.
In other words, the longitudinal component of electric field within the cavity is the same as
the corresponding component of electric field in the adjacent dielectric.
Now consider the compensating charge distributions $-\nabla\cdot\mathbf{P}$ and $P_n$, within the cavity
and on its walls. If the cavity is small, $-\nabla\cdot\mathbf{P}$ is uniform within the cavity, and by
symmetry these volume charges contribute nothing to the electric field at point $A$. If the
cavity is also slender, the surface charges comprising $P_n$ can, at most, contribute to the
transverse electric field at $A$. Thus removal of these surface and volume charges will not
affect the longitudinal field at $A$. Therefore if a small needle-shaped volume of dielectric is
removed in the neighborhood of any point $A$, and no compensating charges are introduced,
a measurement of the longitudinal electric field at $A$ will give the same value for that
component of the field as existed in the dielectric before removal of the material. Three
perpendicular orientations of the needle-shaped cavity can yield measures of all three
components of the electric field within the dielectric.

Next consider the creation of a waferlike cavity, with $A$ as its central point, as shown in
the second figure. The circular faces of this cavity are large compared to its cylindrical rim,
and its orientation is arbitrary. If, once again, volume and surface charges of densities
$-\nabla\cdot\mathbf{P}$ and $P_n$ are introduced to compensate for the removal of the dielectric to form the
cavity, the total field everywhere is as before; a measurement at point $A$ of the electric field
component $E_n$ along the cylindrical axis of the wafer will yield the same value as existed in
the dielectric with all the material present.
If the cavity is small, $-\nabla\cdot\mathbf{P}$ is uniform within the cavity, and by symmetry the volume
charges contribute nothing to the electric field at $A$; their removal will not change the
reading of the field strength. On the other hand, the surface charges on the walls of the
cavity may make a substantial contribution to the electric field at $A$. If the wafer cavity is
extremely thin, the surface charges on the cylindrical rim can be ignored. However, then
the surface charges on the two circular faces, being equal and opposite, cause a field at $A$
which is like that between the faces of a parallel plate capacitor. This field is along the
cylindrical axis of the cavity and of strength $-P_n/\epsilon_0$. Removal of these surface charges
would cause the axial component of electric field to change to the value $E_n + P_n/\epsilon_0$. If
Equation (6.22) is resolved into components, one may write $D_n/\epsilon_0 = E_n + P_n/\epsilon_0$.
Therefore if the axial component of electric field is measured at the center of a waferlike cavity,
with no compensating charges present, multiplication of this field reading by $\epsilon_0$ gives the
component of $\mathbf{D}$ in the axial direction. Three perpendicular orientations of the cavity will
then yield all three components of $\mathbf{D}$.
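
The two idealized cavity measurements of Example 6.3 can be mimicked numerically. In the sketch below the function names, the choice of the coordinate axes as the three cavity orientations, and the sample vectors $\mathbf{E}$ and $\mathbf{P}$ are all assumptions for illustration; the cavities are treated as ideal (infinitely slender needle, infinitely thin wafer).

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m

AXES = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def dot(a, b):
    """Dot product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def needle_reading(E, axis):
    """Longitudinal field read in an uncompensated needle cavity along `axis`."""
    return dot(E, axis)


def wafer_reading(E, P, axis):
    """Axial field read in an uncompensated wafer cavity: E_n + P_n / eps0."""
    return dot(E, axis) + dot(P, axis) / EPS0


def measure_E_and_D(E, P):
    """Three needle readings give E; three wafer readings times eps0 give D."""
    E_meas = tuple(needle_reading(E, ax) for ax in AXES)
    D_meas = tuple(EPS0 * wafer_reading(E, P, ax) for ax in AXES)
    return E_meas, D_meas


E = (1.0e4, 0.0, -2.0e4)      # assumed macroscopic field in the dielectric, V/m
P = (3.0e-7, 1.0e-7, 0.0)     # assumed polarization, C/m^2
E_meas, D_meas = measure_E_and_D(E, P)
print(E_meas)
print(D_meas)
```

Note that $\mathbf{P}$ is deliberately not taken parallel to $\mathbf{E}$ here; the cavity argument relies only on the defining relation (6.22), not on any particular material law.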

6.5 THE LOCAL FIELD

The defining relation (6.22), which gives $\mathbf{D}$ in terms of $\mathbf{E}$ and $\mathbf{P}$, is a macroscopic
equation whose further interpretation must await the linking of $\mathbf{P}$ to its causes. But this
linkage occurs at the microscopic level, and thus $\mathbf{P}$ forms a bridge between the
macroscopic and microscopic theories of dielectric behavior.

To develop the connection between $\mathbf{P}$ and its causes, it is useful to introduce the
concept of the local field, $\mathbf{E}_{loc}$, which will be defined as the average field intensity acting
on a given molecule within the dielectric. $\mathbf{E}_{loc}$ may be determined by removing the
molecule in question, maintaining all other molecules in their time-averaged polarized
positions, and calculating the space-averaged electrostatic field in the cavity previously
occupied by the removed molecule. If $V_m$ is the volume of the cavity, then

$$\mathbf{E}_{loc} = \frac{1}{V_m}\int_{V_m} \mathbf{e}\,dV - \frac{1}{V_m}\int_{V_m} \mathbf{e}_m\,dV \tag{6.28}$$

in which $\mathbf{e}$ is the total time-averaged field at a point in $V_m$ and $\mathbf{e}_m$ is the time-averaged
field at the same point just due to the charge distribution of the molecule in question.
If the dielectric material is locally homogeneous, the first integral in (6.28) is
approximately equal to the macroscopic field $\mathbf{E}$. And if $V_m$ may be chosen as a spherical volume
of radius $r_m$, then the results of Appendix K give

$$\frac{1}{V_m}\int_{V_m} \mathbf{e}_m\,dV = -\frac{\mathbf{p}}{4\pi\epsilon_0 r_m^3} \tag{6.29}$$

with $\mathbf{p}$ the total dipole moment of the removed molecule. If $N = 1/V_m$ is the local
volume density of molecules, then assuming parallel and equal polarization for all
local molecules gives

$$\mathbf{P} = N\mathbf{p} = \frac{\mathbf{p}}{\tfrac{4}{3}\pi r_m^3} = 3\epsilon_0\,\frac{\mathbf{p}}{4\pi\epsilon_0 r_m^3}$$

so that the space-averaged self-field is $-\mathbf{P}/3\epsilon_0$. With these substitutions, (6.28) becomes

$$\mathbf{E}_{loc} = \mathbf{E} + \frac{\mathbf{P}}{3\epsilon_0} \tag{6.30}$$

Equation (6.30) was first derived by Lorentz¹³ using the definition that the local field
was the value found at the center of the molecule rather than averaged over a molecular
volume. It requires the assumption that the molecules are alike and locally distributed
in a uniform fashion. It is approximate in the sense that the field right at one of the
dipole moments $\mathbf{p}_{ni}$ of the $n$th molecule may not be accurately given by the field
averaged over the entire volume associated with that molecule. An additional difficulty
is the fact that if all space is to be considered filled with molecular volumes (which is
implicit in the relation $V_m = 1/N$), then $V_m$ cannot be spherical. Despite these
approximations, when (6.30) is used to predict dielectric behavior, the results are in
satisfactory agreement with experiment for many substances.
Although the integral in (6.29) may not actually reduce to precisely $-\mathbf{P}/3\epsilon_0$ in all
cases, it is arguable that the average field in $V_m$ due to the polarized molecule contained
in $V_m$ should be proportional to the dipole moment of the molecule, which in turn is
proportional to $\mathbf{P}$ by definition. Therefore

$$\frac{1}{V_m}\int_{V_m} \mathbf{e}_m\,dV = -\gamma\,\frac{\mathbf{P}}{\epsilon_0}$$

13 H. A. Lorentz, Theory of Electrons, Dover Publications, Inc., New York, 1952.



in which $\gamma$ is called the internal field constant, and does not necessarily have the
value one-third. In this case (6.30) assumes the somewhat more general form

$$\mathbf{E}_{loc} = \mathbf{E} + \left(\frac{\gamma}{\epsilon_0}\right)\mathbf{P} \tag{6.31}$$

A best fit between experimental data and a theory based on (6.31) may then provide a
means for determining the value of $\gamma$.
Upon adoption of this formulation of the local field, it is possible to obtain relations
between $\mathbf{P}$ and its causes for each of the three types of polarization, and from these
results to determine a functional relation between $\mathbf{D}$ and $\mathbf{E}$.
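
A short numerical sketch of (6.30) and (6.31) follows. It treats $\mathbf{E}$ and $\mathbf{P}$ as parallel so that scalars suffice; the function names and the sample values of $r_m$, $p$, and $E$ are assumptions for illustration. The sketch also checks that the self-field average of (6.29) equals $-P/3\epsilon_0$ when $N = 1/V_m$ for a spherical $V_m$.

```python
import math

EPS0 = 8.854e-12  # permittivity of free space, F/m


def local_field(E, P, gamma=1.0 / 3.0):
    """Local field per (6.31); gamma = 1/3 recovers the Lorentz form (6.30)."""
    return E + gamma * P / EPS0


def self_field_average(p, r_m):
    """Space-averaged field of the molecule's own dipole over a sphere, per (6.29)."""
    return -p / (4.0 * math.pi * EPS0 * r_m ** 3)


r_m = 2.0e-10                                   # assumed molecular radius, m
p = 1.0e-30                                     # assumed molecular dipole moment, C*m
N = 1.0 / (4.0 / 3.0 * math.pi * r_m ** 3)      # molecules per m^3, from N = 1/V_m
P = N * p                                       # polarization, C/m^2

E = 1.0e6                                       # assumed macroscopic field, V/m
print(self_field_average(p, r_m), -P / (3.0 * EPS0))
print(local_field(E, P), local_field(E, P, gamma=0.25))
```

Passing a different `gamma` shows how a fitted internal field constant would shift the local field away from the Lorentz value.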

6.6 ELECTRONIC POLARIZATION

The developments contained in this and the next two sections will show that, under
suitable assumptions, a linear relation exists between the strength of the local field
and the induced polarization. This linearity presumes field strengths which are not
excessive and is applicable to most dielectric materials. It includes all three polarization
mechanisms (electronic, ionic, and orientational), the first of which will now be
considered.
In the presence of an electric field, the electron cloud and nucleus of an atom tend to
shift in opposite directions, causing electronic polarization. This effect can be related
to the electric field causing it by assuming that the electron cloud has a time-average
distribution which is a function only of radial distance from the cloud's center, and then
making use of Coulomb's law. Although such a classical model is admittedly crude, and
one should properly use quantum mechanical expressions for the cloud distribution, this
classical approach has the virtue of simplicity and yields results of the right order of
magnitude.
With reference to Figure 6.4, let the volume charge density in the electron cloud be
some general function $\rho(r)$ which drops to zero at a radius $r_a$. It will be assumed that the
nucleus is displaced a distance $d$ relative to the center of the electron cloud, due to the
presence of a field $E_{loc}$. If $Ze$ is the total charge on the nucleus, with $Z$ the atomic number
and $e$ the proton charge, then $ZeE_{loc}$ is the force on the nucleus due to all other charges
except its own electron cloud. This force must be balanced by the restoring force which
the electron cloud impresses on the nucleus.

[Figure panels: electron cloud charge distribution $\rho(r)$; polarized atom, with the nucleus displaced a distance $d$.]

FIGURE 6.4 Polarization of a single atom.
A spherical shell of charge $4\pi r^2\rho(r)\,dr$ will exert a force at points beyond $r$ as though
the charge were concentrated at the center, and will exert no force at points within $r$.
Thus the nucleus will experience a restoring force due only to that part of the electron
cloud within the radius $d$. According to Coulomb's law, this force will be

$$\frac{Ze}{4\pi\epsilon_0 d^2}\int_0^d 4\pi r^2\rho(r)\,dr$$

so that, for an equilibrium displacement,

$$ZeE_{loc} - \frac{(Ze)^2}{\epsilon_0 d^2}\int_0^d r^2 f(r)\,dr = 0 \tag{6.32}$$

in which $f(r) = -\rho(r)/Ze$ is the normalized absolute charge density in the electron
cloud, in the sense that

$$\int_0^{r_a} 4\pi r^2 f(r)\,dr = 1$$

Equation (6.32) may be rewritten in the form

$$E_{loc} = \frac{Ze}{\epsilon_0 d^2}\int_0^d r^2 f(r)\,dr$$

If the displacement $d$ is small compared to the effective atomic radius $r_a$, this can be
approximated by

$$E_{loc} = \frac{Zef(0)}{\epsilon_0 d^2}\int_0^d r^2\,dr = \frac{Zed}{3\epsilon_0}\,f(0) \tag{6.33}$$

in which case the equilibrium displacement is seen to be linearly proportional to the
local field intensity causing the displacement. But $Zed = p_e$ is the dipole moment of
the atom, and therefore

$$p_e = \alpha_e E_{loc} \tag{6.34}$$

with $\alpha_e = 3\epsilon_0/f(0)$ called the electronic polarizability of the atom.

To gain a feeling for the magnitude of $f(0)$, imagine that $\rho(r)$ as sketched in Figure 6.4
is constant out to the radius $r_a$ and then zero, so that

$$f(r) = \frac{3}{4\pi r_a^3}, \qquad r \le r_a$$

an expression which is seen to be normalized. Then $f(0)$ also is given by this expression
and

$$\alpha_e = 4\pi\epsilon_0 r_a^3 \tag{6.35}$$

If the atom whose electronic polarizability is being considered is only one of two
or more atoms comprising a molecule, the radial symmetry assumed above is, of course,
invalid. The displacement for a given local field will depend on the orientation of the
molecule with respect to the field. However, assuming knowledge of the effect of nearby
atoms and the probability distribution of orientations of the molecule, a suitable
average value for $\alpha_e$ could be deduced, which would differ from (6.35) only by some
multiplicative factor. Therefore Equation (6.34), which indicates an induced electronic
dipole moment proportional to the local field, is generally valid, even though explicit
expressions for $\alpha_e$ may be too difficult to obtain in complicated cases.
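
To get a feel for the magnitudes involved, the uniform-cloud model behind (6.35) can be evaluated numerically. The sketch below is an illustration under stated assumptions: the function names, the midpoint-rule quadrature, and the sample values ($r_a = 10^{-10}$ m, $Z = 1$, $E_{loc} = 10^6$ V/m) are not taken from the text. It computes $\alpha_e$, the resulting displacement $d = \alpha_e E_{loc}/Ze$, and then checks that this $d$ balances the force equation (6.32).

```python
import math

EPS0 = 8.854e-12      # permittivity of free space, F/m
E_CHARGE = 1.602e-19  # proton charge, C


def alpha_e_uniform(r_a):
    """Electronic polarizability of a uniform cloud of radius r_a, per (6.35)."""
    return 4.0 * math.pi * EPS0 * r_a ** 3


def f_uniform(r, r_a):
    """Normalized charge density f(r) of the uniform cloud."""
    return 3.0 / (4.0 * math.pi * r_a ** 3) if r <= r_a else 0.0


def restoring_field(d, Z, r_a, steps=1000):
    """Right side of the rewritten (6.32): (Ze / eps0 d^2) * integral of r^2 f(r)."""
    h = d / steps
    integral = 0.0
    for i in range(steps):
        r = (i + 0.5) * h          # midpoint of the i-th radial slice
        integral += r * r * f_uniform(r, r_a) * h
    return Z * E_CHARGE * integral / (EPS0 * d ** 2)


def displacement(E_loc, Z, r_a):
    """Equilibrium displacement d from p_e = Z e d = alpha_e * E_loc."""
    return alpha_e_uniform(r_a) * E_loc / (Z * E_CHARGE)


r_a, Z, E_loc = 1.0e-10, 1, 1.0e6   # assumed illustrative values
d = displacement(E_loc, Z, r_a)
print(alpha_e_uniform(r_a), d, restoring_field(d, Z, r_a))
```

For these assumed values the displacement is several orders of magnitude smaller than $r_a$, which is the condition under which the approximation (6.33) was made.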

6.7 IONIC POLARIZATION

Two different atoms X and Y may join together as a molecule by forming a chemical
bond. This bond may be caused by the transfer of electrons from one atom to the other
or through the sharing of electrons between the atoms. When a transfer of electrons is
involved, the bond is said to be ionic; if the electrons are being shared, the bond is called
covalent. A simple ionic bond between one X atom and one Y atom quite obviously
results in a permanent dipole moment. A covalent bond can also exhibit a permanent
dipole moment if the "center of gravity" of the shared electrons does not coincide with
the "center of gravity" of the remainder of the charges in the two atoms.

FIGURE 6.5 Ionic bonds in molecules, panels (a), (b), and (c).

Depending on the kind of bond and how many atoms of each type are in the complete
molecule, a variety of possibilities arises. Three of the more simple arrangements
involving ionic bonds are suggested by Figure 6.5. In (a) the molecule XY is seen to have
a dipole moment, but in (b) the in-line disposition of atoms yields a net dipole moment
of zero for the molecule XY$_2$. However, if the atoms are arranged as in (c), the
molecule XY$_2$ does have a net moment. Real molecules can be found to fit all three of these
cases; examples are (a) HCl, (b) CO$_2$, and (c) H$_2$O.
Molecules are classified as polar or non-polar according to whether or not they possess
a permanent dipole moment. In the presence of an electric field, polar molecules
experience a torque tending to align them with the field. This is orientational
polarization, and will be treated more fully in Section 6.8. Non-polar molecules are not
subjected to this aligning torque, but the presence of a field can induce in them a type
of polarization which can be explained with the aid of Figure 6.6. In (a) the in-line
XY$_2$ molecule is shown again, but this time in the presence of an electrostatic field,
which causes relative displacements of the charged atoms in the directions shown.
The two dipole moments no longer cancel, and the net moment is $2\Delta p$. This effect is
known as ionic polarization.

[Figure panels: (a) in-line molecule in the field $E_{loc}$, with dipole moments $p - \Delta p$ and $p + \Delta p$; (b) obliquely oriented molecule, with dipole moments $p - \Delta p'$ and $p + \Delta p'$.]

FIGURE 6.6 Ionic polarization.

However, in many materials the XY$_2$ molecules are randomly oriented and the more
general situation is shown in Figure 6.6b, where the displacement is due to the field
component $E_{loc}\cos\theta$. A quantitative estimate of this displacement may be obtained
in the following way: Let $d_a$ be the normal interatomic spacing between the center
of the X atom and the center of either Y atom and let $2q$ be the net time-average charge
on the X atom. Then in the absence of an external electric field, if the ions were a
distance $r$ apart, the lower Y ion would experience an attractive force

$$F_a = \frac{(2q)(q)}{4\pi\epsilon_0 r^2} + \frac{(-q)(q)}{4\pi\epsilon_0 (2r)^2} = \frac{7q^2}{16\pi\epsilon_0 r^2}$$

due to the positive X ion and the other negative Y ion.
To prevent the molecule from collapsing, there must also be repulsive forces between
the ions. These repulsive forces occur when the electron shells of adjacent ions overlap,
and they increase strongly for small decreases in $r$. Following Born, one can make the
simple assumption that the repulsive force experienced by the lower Y ion, due to the
electron cloud of the X ion, is

$$F_r = \frac{K}{r^{n+1}}$$

in which $K$ and $n$ are as yet undetermined constants whose values depend on the ions
being considered. When the molecule is in equilibrium, $F_r = F_a$ and $r = d_a$ and thus

$$K = \frac{7q^2 d_a^{\,n-1}}{16\pi\epsilon_0}$$

Therefore the total force on the lower Y ion, due to the other ions in the molecule, is

$$F = \frac{7q^2}{16\pi\epsilon_0 r^2}\left[1 - \left(\frac{d_a}{r}\right)^{n-1}\right] \tag{6.36}$$

and this is seen to be zero for the equilibrium spacing $r = d_a$.

If, in the presence of $E_{loc}$, the X ion shifts a distance $\Delta d$ closer to the upper Y ion, the
force experienced by the lower Y ion, due to the other two ions in the molecule, is
changed to

$$F = \frac{(2q)(q)}{4\pi\epsilon_0 (d_a + \Delta d)^2} + \frac{(-q)(q)}{4\pi\epsilon_0 (2d_a)^2} - \frac{7q^2 d_a^{\,n-1}}{16\pi\epsilon_0 (d_a + \Delta d)^{n+1}}
\approx \frac{[7(n+1) - 16]\,q^2\,\Delta d}{16\pi\epsilon_0 d_a^3}$$

This force must balance the external force $-qE_{loc}\cos\theta$ exerted on the lower Y ion and
thus, since $2\Delta p' = 2q\,\Delta d$ is the induced dipole moment for the entire molecule,

$$2\Delta p' = \frac{32\pi\epsilon_0 d_a^3\cos\theta}{7(n+1) - 16}\,E_{loc} \tag{6.37}$$

If all orientations of the molecule are equally likely, and the average value of $2\Delta p'$
is denoted by $p_i$, then since the average value of $\cos\theta$ is $2/\pi$,

$$p_i = \alpha_i E_{loc} \tag{6.38}$$

in which $\alpha_i$ is called the ionic polarizability of the molecule and is given by

$$\alpha_i = \frac{64\epsilon_0 d_a^3}{7(n+1) - 16} \tag{6.39}$$
The value of $n$ to be used in this formula depends on the number of electrons in the
outer shells of the ions. Using an approximate analysis of the interaction between
closed-shell electronic distributions, Pauling deduced¹⁴ a set of values of $n$ as a function
of ion size which are given in Table 6.1. The numbers in this table should be used by
taking the average of the values of $n$ for the two types of ions occurring in the molecule.
For example, carbon disulphide would call for a choice of $n = 7$. Some experimental
work by Slater¹⁵ on alkali halides indicates reasonable agreement with this averaging
procedure of the $n$ values in Pauling's table.

TABLE 6.1
REPULSIVE EXPONENT n

Type of closed
shell structure    Representative ions        n

He                 Li+  Be2+                  5
Ne                 O2-  F-  Na+  Mg2+         7
A                  S2-  Cl-  K+  Ca2+         9
Kr                 Br-                        10
Xe                 I-                         12

14 L. Pauling, The Nature of the Chemical Bond, p. 339, Cornell University Press, Ithaca, New York,
1948.
When the relative sizes of typical values of $r_a$ and $d_a$ are taken into consideration,
formulas (6.35) and (6.39) indicate comparable values for electronic and ionic polarizability.
Equation (6.38), which is seen to be in the same form as (6.34), is applicable to more
complex non-polar molecules than the one represented by Figure 6.6. It is also
applicable to ionic crystals. (Cf. Example 6.4.) The analysis differs in these more general
cases only in that the coefficient in (6.37) is slightly altered, which, in turn, makes a
small change in the magnitude of $\alpha_i$. Thus ionic polarization manifests itself
quantitatively in the same manner as electronic polarization.
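
The ionic polarizability (6.39) and the linearization that leads to it can be explored numerically. The sketch below is illustrative only: the function names, the sample spacing $d_a = 1.5 \times 10^{-10}$ m, the choice $q = e$, and the pairing of an He-type ion with an A-type ion (chosen because its average reproduces the $n = 7$ quoted above) are all assumptions. The exponent values themselves are those of Table 6.1.

```python
import math

EPS0 = 8.854e-12      # permittivity of free space, F/m
E_CHARGE = 1.602e-19  # proton charge, C

# Repulsive exponents from Table 6.1, keyed by closed-shell type.
REPULSIVE_EXPONENT = {"He": 5, "Ne": 7, "A": 9, "Kr": 10, "Xe": 12}


def average_exponent(shell_x, shell_y):
    """Average n for the two ion types in the molecule."""
    return 0.5 * (REPULSIVE_EXPONENT[shell_x] + REPULSIVE_EXPONENT[shell_y])


def alpha_i(d_a, n):
    """Ionic polarizability of the in-line XY2 molecule, per (6.39)."""
    return 64.0 * EPS0 * d_a ** 3 / (7.0 * (n + 1.0) - 16.0)


def force_exact(delta_d, d_a, q, n):
    """Force on the lower Y ion after the X ion shifts by delta_d (unexpanded form)."""
    r = d_a + delta_d
    from_x = 2.0 * q * q / (4.0 * math.pi * EPS0 * r ** 2)
    from_y = -q * q / (4.0 * math.pi * EPS0 * (2.0 * d_a) ** 2)
    born = -7.0 * q * q * d_a ** (n - 1) / (16.0 * math.pi * EPS0 * r ** (n + 1))
    return from_x + from_y + born


def force_linear(delta_d, d_a, q, n):
    """First-order approximation of force_exact in delta_d."""
    return (7.0 * (n + 1.0) - 16.0) * q * q * delta_d / (16.0 * math.pi * EPS0 * d_a ** 3)


n = average_exponent("He", "A")   # assumed pairing; gives n = 7
d_a = 1.5e-10                     # assumed interatomic spacing, m
q = E_CHARGE                      # assumed ionic charge unit
delta_d = 1.0e-4 * d_a            # small displacement
print(n, alpha_i(d_a, n))
print(force_exact(delta_d, d_a, q, n) / force_linear(delta_d, d_a, q, n))
```

The printed force ratio should be very close to unity for such a small displacement, and the value of $\alpha_i$ can be compared with $4\pi\epsilon_0 r_a^3$ from (6.35) to see that the two polarizabilities are of comparable size.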
EXAMPLE 6.4
An ionic crystal such as NaCl is a cubical array of (alternately) positive and negative ions.
Under the influence of a static axial field the positive and negative ions shift oppositely,
approximately as suggested by the figure, until the restoring forces balance the impressed
force and equilibrium is reestablished. The balance of forces on a central positive ion may be
determined as follows:
[Figure: cubical array of ions in the field $\mathbf{E}$ directed along $x$; the positive ion $1^+$ (charge $+ze$) is at the origin between two neighboring negative ions (charge $-ze$), with normal spacing $d_a$ and relative displacement $\Delta d$.]

15 J. C. Slater, "Compressibility of the Alkali Halides," Phys. Rev. 23, 488-500; April 1924.

Let the equilibrium relative displacement be $\Delta d$, with $d_a$ the normal interionic distance,
and $\Delta d \ll d_a$. If $z$ is the valence number, $+ze$ is the net charge on a positive ion, $-ze$ the
net charge on a negative ion. The positive ion $1^+$ at the origin has suffered no asymmetrical
displacement relative to the other positive ions, so by symmetry the other positive ions
exert a null net force on $1^+$. Also, by symmetry, the negative ions exert a net force on $1^+$
which has only an $x$ component. Consider first the negative ion which is normally at the
position $(kd_a, ld_a, md_a)$ with $k$, $l$, and $m$ integers. This ion causes a force on $1^+$ whose $x$
component is

$$f = \frac{(ze)^2}{4\pi\epsilon_0}\,\frac{kd_a + \Delta d}{[(kd_a + \Delta d)^2 + (ld_a)^2 + (md_a)^2]^{3/2}}$$

Using a binomial expansion and ignoring terms higher than first order, one obtains

$$f = \frac{(ze)^2}{4\pi\epsilon_0}\left[\frac{k}{(k^2 + l^2 + m^2)^{3/2}\,d_a^2} - \frac{\Delta d}{d_a^3}\,\frac{2k^2 - l^2 - m^2}{(k^2 + l^2 + m^2)^{5/2}}\right]$$

The zero-order term is seen to be odd in $k$ and thus, when all negative ions are
considered, there is only a first order force $F_1$, given by

$$F_1 = -\frac{(ze)^2\,\Delta d}{4\pi\epsilon_0 d_a^3}\sum_k\sum_l\sum_m \frac{2k^2 - l^2 - m^2}{(k^2 + l^2 + m^2)^{5/2}}$$

in which the sum is over all allowable values of the indices $k$, $l$, $m$, these values
corresponding to the sites of negative ions. This series sums to zero and thus

$$F_1 = 0$$

and, to first order, the array of displaced ions does not contribute a restoring field.
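
The vanishing of the first-order lattice sum can be illustrated by brute force. The sketch below is an assumption-laden illustration: the function name is hypothetical, the negative-ion sites are taken to be those with $k + l + m$ odd (consistent with the positive ion $1^+$ at the origin), and the infinite sum is truncated to a cube $|k|, |l|, |m| \le N$, a truncation that preserves the cubic symmetry responsible for the cancellation.

```python
def first_order_lattice_sum(N):
    """Sum of (2k^2 - l^2 - m^2) / (k^2 + l^2 + m^2)^(5/2) over negative-ion
    sites (k + l + m odd) inside the cube |k|, |l|, |m| <= N."""
    total = 0.0
    for k in range(-N, N + 1):
        for l in range(-N, N + 1):
            for m in range(-N, N + 1):
                if (k + l + m) % 2 == 0:
                    continue  # even sum: positive-ion site (or the origin)
                s = k * k + l * l + m * m
                total += (2 * k * k - l * l - m * m) / s ** 2.5
    return total


def zero_order_lattice_sum(N):
    """Sum of k / (k^2 + l^2 + m^2)^(3/2) over the same sites; odd in k."""
    total = 0.0
    for k in range(-N, N + 1):
        for l in range(-N, N + 1):
            for m in range(-N, N + 1):
                if (k + l + m) % 2 == 0:
                    continue
                s = k * k + l * l + m * m
                total += k / s ** 1.5
    return total


print(first_order_lattice_sum(10), zero_order_lattice_sum(10))
```

Both printed values should be zero to within floating-point rounding: the zero-order sum because it is odd in $k$, and the first-order sum because permuting $k$, $l$, $m$ maps the set of sites onto itself while the three permuted numerators add to zero.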
However, the balance of interactions of the electron cloud of $1^+$ with those of the two
neighboring negative ions $1^-$ and $2^-$ has also been disturbed by the displacement, giving
rise to a restoring force $F_2$. Once again, use of the Born approximation gives

$$F_2 = \frac{K}{(d_a - \Delta d)^{n+1}} - \frac{K}{(d_a + \Delta d)^{n+1}} \approx \frac{2K(n+1)\,\Delta d}{d_a^{\,n+2}}$$

The restoring force in the absence of an external field is $K/d_a^{\,n+1}$ and this force must balance
the attraction exerted on $1^+$ by all the ions (both positive and negative) to one side of $1^+$
when no external field is present. Thus

$$\frac{K}{d_a^{\,n+1}} = \frac{(ze)^2}{4\pi\epsilon_0 d_a^2}\left[\sum{}^{\prime}\,\frac{k}{(k^2 + l^2 + m^2)^{3/2}} - \sum{}^{\prime\prime}\,\frac{k}{(k^2 + l^2 + m^2)^{3/2}}\right]$$

in which $\sum'$ is over all terms for which $k + l + m$ is odd and $\sum''$ is over all terms for which
$k + l + m$ is even. When these sums are evaluated, their difference gives

$$K = 0.29\,\frac{(ze)^2 d_a^{\,n-1}}{4\pi\epsilon_0}$$

and therefore

$$F_2 = 0.58\,\frac{(ze)^2 (n+1)\,\Delta d}{4\pi\epsilon_0 d_a^3}$$

In the presence of a static electric field, the balance of forces on $1^+$ is thus

$$zeE_{loc} = 0.58\,\frac{(ze)^2 (n+1)\,\Delta d}{4\pi\epsilon_0 d_a^3}$$
If one lets $p_i = ze\,\Delta d$ stand for the induced dipole moment, it follows that

$$p_i = \alpha_i E_{loc}$$

in which the ionic polarizability $\alpha_i$ is given by

$$\alpha_i = \frac{4\pi\epsilon_0 d_a^3}{0.58(n+1)}$$

an expression which is seen to be similar in form to (6.39).

6.8 ORIENTATIONAL POLARIZATION


If a homogeneous polar material is considered, all molecules have the same
permanent dipole moment and, in the absence of an exciting field, these moments are often
randomly oriented. When an electrostatic field is present, each molecule experiences a
torque which tends to align its dipole moment with the field. Were it not for thermal
agitation, in such materials all of the molecules would become aligned; but their
collisions with each other keep breaking up this pattern, so that on the time-average
there is only a partial alignment. A quantitative indication of this effect may be
deduced by first considering a permanent dipole $\mu = qd$, as shown in Figure 6.7a, which
makes an angle $\theta$ with respect to the field. The energy added to the system consisting
of the dipole and the sources of $E_{loc}$, when an external agency rotates the dipole from
an initial setting $\theta_1$ to a new position $\theta$, is

$$W = \int_{\theta_1}^{\theta} qE_{loc}\,d\,\sin\beta\;d\beta = -\mu E_{loc}(\cos\theta - \cos\theta_1)$$

[Figure panels: (a) permanent dipole at angle $\theta$ in the field $E_{loc}$; (b) plot of $W(\theta)$.]

FIGURE 6.7 Permanent dipole in a local field.
Therefore the potential energy of the permanent dipole moment $\mu$ is a minimum for
$\theta = 0$ and a maximum for $\theta = \pi$; in other words, alignment with the field is the
preferred orientation. If the reference for potential energy is taken as $\theta_1 = \pi/2$, then

$$W = -\boldsymbol{\mu}\cdot\mathbf{E}_{loc} \tag{6.40}$$

and the energy versus angular position may be plotted as shown in Figure 6.7b.
FIGURE 6.8 Calculation of $N(\theta)$.

Let $N_v$ be the number of molecules in a macroscopic volume element and assume
that all pointing directions would be equally probable for the permanent dipole
moments of these $N_v$ molecules were $E_{loc} = 0$. In the band of directions $2\pi\sin\theta\,d\theta$
shown in Figure 6.8, one would find oriented a fraction $2\pi\sin\theta\,d\theta/4\pi$ of the $N_v$ dipoles,
this being the ratio of the band area to that of the entire unit sphere. However, the
presence of $E_{loc}$ causes a weighting by energy of the pointing directions such that

$$N(\theta)\,d\theta = K\,2\pi\sin\theta\;e^{-W(\theta)/kT}\,d\theta \tag{6.41}$$

is the number of dipole moments pointing in this band of directions. The factor $K$
appearing in (6.41) is a normalizing constant and $k = 1.38 \times 10^{-23}$ joules/deg is
known as Boltzmann's constant. The temperature $T$ is expressed in degrees Kelvin
and the weighting factor $e^{-W(\theta)/kT}$ is a consequence of the Maxwell-Boltzmann
statistics.¹⁶ Since $\int_0^{\pi} N(\theta)\, d\theta = N_v$, one finds that $K = aN_v / 4\pi \sinh a$, wherein $a = \mu E_{loc}/kT$.
The net dipole moment for all those molecules whose dipoles point in the band of
directions $2\pi \sin\theta\, d\theta$ is $\mu \cos\theta\, N(\theta)\, d\theta$ and therefore the equivalent orientational dipole
moment per molecule is given by the expression

$$p_o = \frac{1}{N_v} \int_0^{\pi} \mu \cos\theta\, N(\theta)\, d\theta = \frac{\mu}{2a \sinh a} \int_{-a}^{a} v e^{v}\, dv$$

in which the substitution variable $v = a \cos\theta$ has been introduced. This integral gives

$$p_o = \mu \left( \coth a - \frac{1}{a} \right) = \mu \mathcal{L}(a) \qquad (6.42)$$

16 See, e.g., F. W. Sears, An Introduction to Thermodynamics, the Kinetic Theory of Gases, and
Statistical Mechanics, 2d ed., Chaps. 12 and 14, Addison-Wesley Publishing Company, Reading,
Massachusetts, 1953.

The function $\mathcal{L}(a)$ defined by (6.42) is known as the Langevin function; it first arose
in an analysis by Langevin in 1905 of the similar problem of the orientation of magnetic
dipoles in a steady magnetic field. It is plotted in Figure 6.9.

FIGURE 6.9 The Langevin function $\mathcal{L}(a)$ versus $a$.

For large values of $a = \mu E_{loc}/kT$, the Langevin function is seen to approach unity. This
corresponds to low temperatures and gives $p_o \approx \mu$, indicating that all the permanent
dipole moments are essentially aligned with the field. At very low temperatures this is a
result one would expect, since it is thermal agitation which is interfering with the tendency
to alignment. However, at normal temperatures $\mathcal{L}(a)$ is quite small. This may be appreciated by
noting that $\mu$ has a value which, in order of magnitude, is the charge on a proton
multiplied by a distance of one angstrom, i.e., $10^{-10}$ meters. (Because of the convenient
size of this product, dipole moments are often expressed in debye units, with one debye
unit equal to $10^{-10}$ esu angstroms or $3.33 \times 10^{-30}$ coulomb meters.) If one assumes a
polar gas in which each molecule has a permanent dipole moment of one debye unit,
at room temperature for a local field even so strong as ten million volts per meter,
$a \approx 10^{-2}$. For values of $a$ this small, $\mathcal{L}(a) \approx a/3$. If this approximation is used, (6.42)
becomes

$$p_o = \frac{\mu^2 E_{loc}}{3kT} \qquad (6.43)$$

Thus at normal temperatures the equivalent orientational dipole moment per molecule
is inversely proportional to temperature, proportional to the square of the permanent
dipole moment per molecule, and linearly proportional to the local field. This result
assumes a material whose dipole moments are randomly oriented in the absence of an
external field.
Under the conditions leading to (6.43) one may write

$$p_o = \alpha_o E_{loc} \qquad (6.44)$$

in which $\alpha_o = \mu^2/3kT$ is called the orientational polarizability of the molecule.
Thus it is seen that all three types of induced polarization can be linearly proportional
to the local field.
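The size of $a$ and the quality of the approximation $\mathcal{L}(a) \approx a/3$ can be checked numerically. The following is a minimal sketch, assuming a room temperature of 300 deg K (the text does not give a number) and using the constants quoted above; the function names are illustrative only.

```python
import math

K_BOLTZMANN = 1.38e-23   # joules/deg, the value quoted in the text
DEBYE = 3.33e-30         # coulomb meters per debye unit, as quoted in the text


def langevin(a: float) -> float:
    """Langevin function L(a) = coth(a) - 1/a of Equation (6.42)."""
    if abs(a) < 1e-4:
        # Series form a/3 - a^3/45 avoids cancellation between two large terms.
        return a / 3.0 - a**3 / 45.0
    return 1.0 / math.tanh(a) - 1.0 / a


def alignment_parameter(mu: float, e_loc: float, temperature: float) -> float:
    """Dimensionless ratio a = mu * E_loc / (k T)."""
    return mu * e_loc / (K_BOLTZMANN * temperature)


# Assumed illustration: one debye unit, 1e7 volts per meter, T = 300 deg K.
a_room = alignment_parameter(DEBYE, 1.0e7, 300.0)
exact = langevin(a_room)
approx = a_room / 3.0

print(f"a = {a_room:.3e}")
print(f"L(a) = {exact:.6e}, a/3 = {approx:.6e}")
print(f"L(50) = {langevin(50.0):.4f}")
```

With these inputs $a$ comes out just under $10^{-2}$, and the exact and linearized values of $\mathcal{L}(a)$ agree to within a few parts per million, which is why (6.43) is adequate at ordinary temperatures.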

6.9 DIELECTRIC SUSCEPTIBILITY, PERMITTIVITY, AND RELATIVE DIELECTRIC CONSTANT

Materials whose molecules possess permanent dipole moments which are randomly
oriented in the absence of a field may exhibit all three types of polarizability:
electronic, ionic, and orientational. In such cases one may write for the average polarization
per molecule

$$p = \alpha E_{loc} \qquad (6.45)$$

in which

$$p = p_e + p_i + p_o \qquad (6.46)$$

is the sum of the electronic, ionic, and orientational dipole moments, and

$$\alpha = \alpha_e + \alpha_i + \alpha_o \qquad (6.47)$$

is the total polarizability, being the sum of the three contributions. $\alpha_e$ and $\alpha_i$ include
the effects of all atoms and ions in the molecule and must be suitably averaged over all
orientations of the molecule with respect to the field; $\alpha_o = \mu^2/3kT$ under normal
conditions of temperature, with $\mu$ the permanent dipole moment per molecule.
If $N$ is the density of molecules per cubic meter, then

$$P = Np = N\alpha E_{loc} \qquad (6.48)$$

is the volume density of dipole moments. If one invokes (6.31), this may be rewritten as

$$P = \frac{N\alpha}{1 - \gamma N\alpha/\epsilon_0}\, E \qquad (6.49)$$

Thus under the restrictions of this analysis the polarization $P$ is linearly proportional
to the macroscopic electric field. For this reason one may write

$$P = \chi_e \epsilon_0 E \qquad (6.50)$$

in which $\chi_e$ is a dimensionless constant, called the dielectric susceptibility, and is given by

$$\chi_e = \frac{N\alpha/\epsilon_0}{1 - \gamma N\alpha/\epsilon_0} \qquad (6.51)$$

From (6.22) the flux density is

$$D = \epsilon_0 E + \chi_e \epsilon_0 E = \epsilon_0 (1 + \chi_e) E = \epsilon E \qquad (6.52)$$

with $\epsilon = \epsilon_0 (1 + \chi_e)$ called the permittivity of the medium. The relative dielectric
constant $\epsilon_r$ is defined by the relation

$$\epsilon_r = \frac{\epsilon}{\epsilon_0} = 1 + \chi_e \qquad (6.53)$$

and, like $\chi_e$, is a dimensionless quantity.
If one returns to Example 6.1, and a dielectric material satisfying the conditions
of this analysis is placed between the parallel plates of the capacitor, then the
capacitance is changed so that

$$\frac{C}{C_0} = 1 + \chi_e = \epsilon_r \qquad (6.54)$$

and thus the relative dielectric constant is an easily measured quantity. If one has
determined the capacitance in a vacuum and then in the presence of the material
medium, it is then possible, through the use of (6.51), to obtain an estimate of the total
polarizability $\alpha$, if the particle density is known and a value (such as one-third) is
assumed for $\gamma$.
The results of this section are also applicable to materials in which one or more of the
partial polarizabilities of (6.47) is zero.
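The two directions of this calculation, from polarizability to susceptibility by (6.51) and back from a measured $\epsilon_r = C/C_0$ to $\alpha$, can be sketched as follows. The numerical inputs are hypothetical values chosen only to exercise the round trip, and the function names are not from the text.

```python
EPS0 = 8.854e-12  # farad/meter


def susceptibility(n: float, alpha: float, gamma: float = 1.0 / 3.0) -> float:
    """Equation (6.51): chi_e = (N alpha / eps0) / (1 - gamma N alpha / eps0)."""
    x = n * alpha / EPS0
    if gamma * x >= 1.0:
        raise ValueError("gamma * N * alpha / eps0 must be below 1 for (6.51) to apply")
    return x / (1.0 - gamma * x)


def polarizability(eps_r: float, n: float, gamma: float = 1.0 / 3.0) -> float:
    """Invert (6.51) and (6.53) for alpha, given a measured eps_r = C / C0."""
    chi = eps_r - 1.0
    return EPS0 * chi / (n * (1.0 + gamma * chi))


# Hypothetical inputs, not taken from the text.
n_demo = 3.0e28        # molecules per cubic meter (assumed)
alpha_demo = 2.0e-40   # farad m^2 (assumed)

chi_demo = susceptibility(n_demo, alpha_demo)
eps_r_demo = 1.0 + chi_demo
alpha_back = polarizability(eps_r_demo, n_demo)

print(f"chi_e = {chi_demo:.4f}, eps_r = {eps_r_demo:.4f}")
print(f"recovered alpha = {alpha_back:.3e} farad m^2")
```

Setting `gamma=0.0` reproduces the dilute-gas limit in which the local field equals the macroscopic field.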
EXAMPLE 6.5
The measurement of dielectric constant through a capacitance experiment is limited to
frequencies at which the circuit has dimensions small compared to a free-space wavelength.
When this is the case, a circuit such as the one shown in the figure may be employed.
Neglecting loss, the resonant frequency is

$$f_0 = \frac{[L(C_s + C)]^{-1/2}}{2\pi}$$

If $C_s$ is a calibrated condenser which may be retuned to maintain resonance after insertion
of the dielectric specimen, the change in $C_s$ gives the change in capacity $C$ due to the
dielectric, from which the susceptibility may be deduced. If the frequency is low enough,
this quasistatic procedure yields the static dielectric constant. The power loss in the
dielectric as a function of frequency may be determined by measuring the sharpness of tuning
near resonance.

[Figure: resonant circuit containing the inductance $L$, driven by a variable frequency generator.]

At microwave frequencies a different procedure is used. A dielectric specimen perhaps a
tenth of a wavelength long is inserted in a waveguide and the change in VSWR noted.
From this the velocity of propagation $v$ in the specimen may be deduced; since $v = (\mu_0 \epsilon)^{-1/2}$,
the permittivity is thereby determined.
At light frequencies, still another procedure is followed. The index of refraction is
determined for the specimen, usually in a minimum deviation prism experiment, though total
reflection and interferometric techniques are sometimes employed. Since Snell's law gives
$\sin i = n \sin r$, with $i$ and $r$ the incident and refracted angles, if a light wave is incident
from a vacuum on the specimen, the refractive index $n$ is the ratio of light velocities in free
space and in the specimen. But this is also the square root of the relative dielectric constant.

6.10 THE STATIC DIELECTRIC CONSTANT OF GASES

For most gases, under normal conditions of pressure and temperature, $N\alpha/\epsilon_0 \ll 1$.
This conclusion will be substantiated in the illustrative examples to follow, and is due
to the relatively low particle density; it does not extend to liquids and solids. However,
when this condition is met, a significant simplification of the analysis is possible. The
formula for dielectric susceptibility becomes

$$\chi_e = N\alpha/\epsilon_0 \ll 1 \qquad (6.55)$$

and the relative dielectric constant is therefore close to unity. The local field is given by

$$E_{loc} = E + \frac{\gamma}{\epsilon_0} P = E(1 + \gamma \chi_e) \approx E \qquad (6.56)$$

This result may be explained by saying that the molecules are far enough apart to
cause a low polarization density P; the nearest neighbors of a molecule do not exert a
sufficient effect to cause its local field to differ substantially from the average macro-
scopic field.
These simplifications can be applied to a variety of special cases.
Monoatomic Gases. The rare gases, such as helium and argon, are monoatomic
under normal conditions, and thus the principal polarization mechanism is a relative
shift of nucleus and electron cloud, and $\alpha = \alpha_e$. A measurement of capacitance, first
in a vacuum and then in a rare gas medium, will permit a deduction of $\alpha_e$.
As an illustration, the relative static dielectric constant of helium, measured at one
atmosphere and 0°C, is found to be $C/C_0 = \epsilon_r = 1.0000684$. Under these conditions
the particle density is approximately $N = 2.7 \times 10^{25}$ atoms/m³, so from (6.55)

$$\alpha_e = 0.22 \times 10^{-40} \text{ farad m}^2$$

Through use of (6.35), an estimate of the radius of the helium atom is

$$r_a \approx 0.6 \times 10^{-10} \text{ meters}$$

which is the right order of magnitude. Thus, even though the classical atom model
(used in Section 6.6 to obtain a quantitative index of electronic polarizability) is rather
crude, the results are quite reasonable. As further illustration, Table 6.2 lists the
polarizabilities, under comparable conditions, of many of the rare gases. The
polarizability is seen to increase as the atoms increase in size, which is in accordance with
Equation (6.35). One also observes from this table that even for xenon, $N\alpha_e/\epsilon_0 \approx 10^{-3}$,
which is consistent with the assumption leading to (6.55).
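The helium figures can be reproduced with a few lines of arithmetic. This is a sketch under one stated assumption: Equation (6.35) is not reproduced in this section, so the form $\alpha_e = 4\pi\epsilon_0 r_a^3$ used below is inferred from the combination of (6.35) and (6.55) quoted in Example 6.6.

```python
import math

EPS0 = 8.854e-12   # farad/meter
N_GAS = 2.7e25     # atoms per cubic meter at 0 deg C and one atmosphere (from the text)

# Helium: measured eps_r, then (6.55) in the form chi_e = N alpha_e / eps0.
eps_r_helium = 1.0000684
alpha_e_helium = EPS0 * (eps_r_helium - 1.0) / N_GAS

# Radius estimate, ASSUMING (6.35) reads alpha_e = 4 pi eps0 r_a^3.
r_a_helium = (alpha_e_helium / (4.0 * math.pi * EPS0)) ** (1.0 / 3.0)

# Table 6.2 polarizabilities, in units of 1e-40 farad m^2.
table_6_2 = {"He": 0.201, "Ne": 0.390, "Ar": 1.62, "Kr": 2.46, "Xe": 3.99}
n_alpha_over_eps0 = {gas: N_GAS * a * 1e-40 / EPS0 for gas, a in table_6_2.items()}

print(f"alpha_e(He) = {alpha_e_helium:.3e} farad m^2")
print(f"r_a(He)     = {r_a_helium:.2e} meters")
for gas, ratio in n_alpha_over_eps0.items():
    print(f"{gas}: N alpha_e / eps0 = {ratio:.2e}")
```

Even the largest entry, xenon, gives a ratio of order $10^{-3}$, so the dilute-gas approximation holds across the table.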

TABLE 6.2
$\alpha_e \times 10^{40}$ (farad m²) FOR RARE GASES¹⁷

Gas            He      Ne      Ar     Kr     Xe
α_e × 10⁴⁰     0.201   0.390   1.62   2.46   3.99

17 From L. Pauling, "Many-Electron Atoms and Ions," Proc Roy Soc (London), A114, 181-211; 1927.

The average distance between molecules is $N^{-1/3}$. Under normal conditions of pressure
and temperature, this is approximately $3 \times 10^{-9}$ meters for a monatomic gas, which is
about twenty times the atomic diameter. This low population density accounts for the
low value of $\chi_e$, and Equation (6.55) then indicates that the dielectric susceptibility
is linearly dependent on particle density (so long as $N\alpha_e/\epsilon_0 \ll 1$), an effect which has
been verified by experiment.
Using the foregoing analysis, one may also deduce the relative shift $d$ of the nucleus
and electron cloud of a single atom. For an electric field of $10^6$ volts/meter, Equation
(6.34) gives for a helium atom

$$p \approx 0.2 \times 10^{-34} \text{ coul-m}$$

Since the charge is $Ze = 3.2 \times 10^{-19}$ coul, the displacement is approximately

$$d = 6 \times 10^{-17} \text{ meters}$$

This is only about one millionth of the atomic radius, which serves to justify some of the
earlier assumptions about dipole moments. For reasonable values of electric field, the
atoms of a rare gas are only minutely perturbed.
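The spacing and displacement estimates above amount to three short calculations, sketched here. Equation (6.34) is not reproduced in this section; the code assumes it has the form $p = \alpha_e E$, which is consistent with the numbers quoted.

```python
ELECTRON_CHARGE = 1.6e-19   # coulombs
N_GAS = 2.7e25              # atoms per cubic meter under normal conditions (from the text)

# Average distance between atoms, N^(-1/3).
spacing = N_GAS ** (-1.0 / 3.0)

# Induced dipole moment of a helium atom, ASSUMING (6.34) is p = alpha_e * E.
alpha_e_helium = 0.2e-40    # farad m^2, rounded value used in the text
field = 1.0e6               # volts/meter
p_induced = alpha_e_helium * field

# Displacement d = p / (Z e), with Z = 2 for helium.
charge = 2 * ELECTRON_CHARGE
displacement = p_induced / charge

print(f"spacing      = {spacing:.2e} meters")
print(f"p            = {p_induced:.1e} coul-m")
print(f"displacement = {displacement:.2e} meters")
```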
The classical development in Section 6.6 suggests that the electronic polarizability
$\alpha_e$ is dependent only on distribution of charge in the electron cloud, as represented
by the function $f(r)$. If this cloud distribution remains the same, $\alpha_e$ should be a
constant. From this one would predict that $\alpha_e$ (and hence $\epsilon_r$) is independent of temperature
for a rare gas, unless the temperature is so high as to excite the atoms above their
ground state. This prediction is also borne out by experiment.
EXAMPLE 6.6
Measurements of the atomic diameter of argon in its condensed state at 50°K give
$2r_a = 3.84 \times 10^{-10}$ meters. Using these data, what is the relative dielectric constant of argon
under standard conditions of pressure and temperature?
At $T = 273$°K and $p = 1$ atm the particle density of argon gas is approximately
$N = 2.7 \times 10^{25}$ atoms/m³. Combination of (6.35) and (6.55) gives

$$\chi_e = 4\pi N r_a^3 = 2.4 \times 10^{-3}$$

so that

$$\epsilon_r = 1 + \chi_e = 1.0024$$

Direct measurement of the relative dielectric constant of argon in a capacitance experiment
gives $\epsilon_r = 1.000545$, so the above estimation of $\chi_e$ is high by a factor of 4. The chief
source of error can be traced to the assumption, which led to Equation (6.35), that the
electron cloud has a uniform density. If one were to assume instead that the electron cloud
is relatively more dense near its center, this would increase $f(0)$ and decrease $\chi_e$, thus
bringing the above calculation into closer accord with direct experiment. Further refinement
would require use of a quantum mechanical model for the electron cloud.
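The arithmetic of Example 6.6, including the size of the overestimate, can be checked with the sketch below. It uses only the numbers quoted in the example; the variable names are illustrative.

```python
import math

N_GAS = 2.7e25              # atoms per cubic meter at 273 deg K and 1 atm (from the text)
r_a_argon = 3.84e-10 / 2.0  # meters, half the measured atomic diameter

# Combination of (6.35) and (6.55) as quoted in the example: chi_e = 4 pi N r_a^3.
chi_estimate = 4.0 * math.pi * N_GAS * r_a_argon**3
eps_r_estimate = 1.0 + chi_estimate

# Directly measured value quoted in the example.
chi_measured = 1.000545 - 1.0
overestimate = chi_estimate / chi_measured

print(f"estimated chi_e = {chi_estimate:.2e}")
print(f"estimated eps_r = {eps_r_estimate:.4f}")
print(f"estimate / measurement = {overestimate:.1f}")
```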

Non-polar Gases. For a non-polar gas whose molecules have a structure such as
that indicated by Figure 6.6, there may be ionic as well as electronic polarization.
The relative displacement of nucleus and electron cloud will be different for the several
types of atoms, and the total electronic polarizability will be the sum of the partial
effects. Both the electronic and ionic polarizabilities need to be averaged over all
possible orientations of a molecule, with the results represented by mean values of $\alpha_e$
and $\alpha_i$. Then if $N$ is the density of molecules per m³, the polarization density is given by

$$P = N(\alpha_e + \alpha_i) E_{loc} \qquad (6.57)$$

Because $\alpha_i$ is comparable to $\alpha_e$, $P/\epsilon_0 E$ is quite small for normal particle densities, and
the approximations (6.55) and (6.56) are once again valid.
Ionic polarizability depends on the structure of a single molecule and thus shares
the property with electronic polarization of being temperature insensitive unless the
temperature is extremely high. For this reason non-polar gases such as CH₄ have
permittivities which are constant over wide temperature ranges, as may be seen from
Figure 6.11.

EXAMPLE 6.7
The measured dielectric susceptibility of CO₂ at 0°C and one atmosphere is $0.985 \times 10^{-3}$.
Find the total polarizability and estimate what fraction of it is due to ionic displacement.
Use of (6.55) gives

$$\alpha = \alpha_e + \alpha_i = \frac{\epsilon_0 \chi_e}{N} = \frac{8.854 \times 10^{-12} \times 0.985 \times 10^{-3}}{2.7 \times 10^{25}} = 3.2 \times 10^{-40} \text{ farad m}^2$$

If one assumes that the bond consists of C⁴⁺ with 2O²⁻, then the carbon ion has a shell
structure like helium, and the oxygen ions resemble neon. Referring to Table 6.1, the
repulsive coefficient is therefore $n = 6$. The rotation-vibration spectrum of CO₂ yields the
information¹⁸ that the internuclear distance is $d_a = 1.16$ Å. Using Equation (6.39) one
obtains

$$\alpha_i = \frac{64 \times 8.854 \times 10^{-12} \times (1.16)^3 \times 10^{-30}}{7(6 + 1) - 16} = 0.27 \times 10^{-40} \text{ farad m}^2$$

This indicates that the ionic polarizability is about 9 percent of the total polarizability.
This fractional polarizability has also been determined experimentally. If the dielectric
constant is measured at visible light frequencies (in a refraction experiment), the relatively
heavy ions cannot follow the field variations, and $\alpha_e$ alone is contributing. Upon comparing
this result with the measurement of dielectric constant at very low frequencies, where both
$\alpha_e$ and $\alpha_i$ are contributing, one is able to separate the two polarizabilities. The average of
the data of three independent investigations¹⁹ gives $\alpha_i$ to be 11 percent of the total
polarizability, so the above rough estimate is the right order of magnitude.

18 G. Herzberg, Infrared and Raman Spectra, p. 398, D. Van Nostrand Company, Inc., New York,
1945.
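The two numerical steps of Example 6.7 can be reproduced as follows. Equation (6.39) is not shown in this section, so the sketch simply evaluates the arithmetic printed in the example; writing the denominator as `7 * (n + 1) - 16` is an assumption about how the repulsive coefficient enters.

```python
EPS0 = 8.854e-12   # farad/meter
N_GAS = 2.7e25     # molecules per cubic meter at 0 deg C and one atmosphere

# Total polarizability from (6.55): alpha = eps0 * chi_e / N.
chi_co2 = 0.985e-3
alpha_total = EPS0 * chi_co2 / N_GAS

# Ionic polarizability, evaluating the expression printed in the example.
n_repulsive = 6       # repulsive coefficient quoted from Table 6.1
d_a = 1.16e-10        # internuclear distance in meters (1.16 angstroms)
alpha_ionic = 64.0 * EPS0 * d_a**3 / (7 * (n_repulsive + 1) - 16)

ionic_fraction = alpha_ionic / alpha_total

print(f"alpha total = {alpha_total:.2e} farad m^2")
print(f"alpha ionic = {alpha_ionic:.2e} farad m^2")
print(f"ionic share = {100.0 * ionic_fraction:.1f} percent")
```

The unrounded ratio comes out slightly below the 9 percent quoted, which is within the rounding of the two intermediate values.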

Polar Gases. For a gas composed of polyatomic molecules possessing permanent
dipole moments, all three types of polarization are present and under normal conditions
the polarization density is expressible as

$$P = N(\alpha_e + \alpha_i + \mu^2/3kT) E = \chi_e \epsilon_0 E$$

so that the dielectric susceptibility is temperature-dependent, being given by

$$\chi_e = \frac{N}{\epsilon_0} (\alpha_e + \alpha_i + \mu^2/3kT)$$

If $\chi_e$ is plotted as a function of $T^{-1}$ for a polar gas, the curve is a straight line, as
indicated by Figure 6.10, the intercept being $N(\alpha_e + \alpha_i)/\epsilon_0$ and the slope $N\mu^2/3k\epsilon_0$.

FIGURE 6.10 Dielectric susceptibility vs. temperature for a polar gas ($\chi_e$ plotted against $T^{-1}$).

When experimental data for real gases are plotted in this fashion, it is possible to separate
the contribution due to orientational polarization and make an estimate of the permanent
dipole moment per molecule. Figure 6.11 shows such a representation of the experimental
results for a variety of gases and supports the straight line prediction of the
theory.

EXAMPLE 6.8
With reference to Figure 6.11, measurement of the intercept and slope for the plot
representing CH₂Cl₂ gives

$$\frac{N(\alpha_e + \alpha_i)}{\epsilon_0} = 0.7 \times 10^{-3}$$

$$\frac{N\mu^2}{3k\epsilon_0} = 1.5$$

Therefore

$$\frac{\alpha_e + \alpha_i}{\mu^2/3kT} = \frac{0.7 \times 10^{-3}\, T}{1.5}$$
19 C. T. Zahn, Phys Rev, 27, 459; 1926. H. H. Uhlig, J. G. Kirkwood, and F. G. Keyes, J Chem Phys,
1, 155; 1933. H. A. Stuart, Z Physik, 51, 490; 1928.

FIGURE 6.11 Experimental data of static dielectric susceptibility vs.
temperature for various gases (curves for CH₃Cl, CH₂Cl₂, and CCl₄, plotted against 1000/T).
[After R. Sanger, Phys. Z., 27, 556; 1926.]

and thus at room temperatures the orientational polarizability is roughly seven times as
great as the electronic and ionic polarizabilities combined.
Further,

$$\mu = \left( \frac{4.5\, k \epsilon_0}{N} \right)^{1/2}$$

so that, using $N = 2.7 \times 10^{25}$ molecules/m³ as the particle density under standard
conditions, one finds that

$$\mu = 4.5 \times 10^{-30} \text{ coul-m} = 1.35 \text{ debye units}$$

is the permanent dipole moment of a CH₂Cl₂ molecule.
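The reduction of the straight-line data in Example 6.8 can be sketched in a few lines. The intercept and slope are the values read from Figure 6.11 in the example; the temperature of 300 deg K used for the room-temperature comparison is an assumption, since the text does not give a number.

```python
import math

K_BOLTZMANN = 1.38e-23   # joules/deg
EPS0 = 8.854e-12         # farad/meter
N_GAS = 2.7e25           # molecules per cubic meter under standard conditions
DEBYE = 3.33e-30         # coulomb meters per debye unit

intercept = 0.7e-3       # N (alpha_e + alpha_i) / eps0, from the example
slope = 1.5              # N mu^2 / (3 k eps0), from the example


def chi_e(temperature: float) -> float:
    """Straight-line model: chi_e = intercept + slope / T."""
    return intercept + slope / temperature


# Permanent dipole moment from the slope.
mu = math.sqrt(3.0 * K_BOLTZMANN * EPS0 * slope / N_GAS)
mu_debye = mu / DEBYE

# Orientational part relative to electronic plus ionic part at an assumed 300 deg K.
room_temperature = 300.0
orientational_ratio = (slope / room_temperature) / intercept

print(f"mu = {mu:.2e} coul-m = {mu_debye:.2f} debye units")
print(f"orientational / (electronic + ionic) at 300 deg K = {orientational_ratio:.1f}")
print(f"chi_e(300) = {chi_e(room_temperature):.2e}")
```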

The procedure illustrated by Example 6.8 has been utilized to deduce the permanent
dipole moment $\mu$ for many gases, and Table 6.3 lists the results for some of the more
common cases.

TABLE 6.3
PERMANENT DIPOLE MOMENTS

Molecule   μ, Debye units      Molecule   μ, Debye units
HCl        1.03                CS₂        0
HBr        0.79                CH₄        0
H₂O        1.81                CCl₄       0
H₂S        0.92                CH₃Cl      1.89
CO         0.12                NO         0.1
CO₂        0                   NO₂        0.4

A study of this table permits some inferences to be made about molecular structure.
The expected symmetry of the CH₄ and CCl₄ molecules would lead to the prediction
of no net permanent dipole moment, and this is seen to be the case. However, the zero
moment for CO₂, in face of the fact that CO does have a moment, implies that CO₂
has the in-line molecular structure of Figure 6.5b. A similar conclusion may be drawn
about the structure of CS₂. Alternatively, molecules such as H₂O, H₂S, and NO₂, which
do have net dipole moments, must have the structure suggested by Figure 6.5c, with
the two bonds making an angle other than 180 deg.

6.11 THE STATIC DIELECTRIC CONSTANT OF SOLIDS AND LIQUIDS

Unlike the situation normally encountered in gases, where a large susceptibility is
precluded because the molecules are many molecular diameters apart, in the case of
solids and liquids the molecules are contiguous. This high particle density often results
in a substantial value of $P$ per unit applied field, thus causing the local electric field to
be significantly different from the macroscopic electric field. Concomitantly, the
dielectric susceptibility is not small, and the approximate expressions (6.55) and (6.56)
for $\chi_e$ and $E_{loc}$ are not applicable, being superseded by the more exact relations (6.31)
and (6.51).
In applying these more exact relations to solids, three classes of materials may be
identified:

1. Solids exhibiting only electronic polarizability
2. Solids which manifest electronic and ionic polarizabilities
3. Solids which possess orientational as well as electronic and ionic polarizabilities

These three types of solids will be considered in turn.


Elemental Solid Dielectrics. These are materials consisting of a single type of
atom, such as diamond, sulphur, etc. It is apparent that such materials contain neither
ions nor permanent dipoles, and thus can exhibit only electronic polarization. If one
makes use of (6.51), the relative dielectric constant is

$$\epsilon_r = 1 + \chi_e = \frac{1 + (1 - \gamma) \dfrac{N\alpha_e}{\epsilon_0}}{1 - \gamma \dfrac{N\alpha_e}{\epsilon_0}} \qquad (6.58)$$

Measurement of the dielectric constant of such solids, together with an assumed value
for $\gamma$, permits an inference of the electronic polarizability.

EXAMPLE 6.9
At 25°C, the relative dielectric constant of sulphur is 3.75. Estimate the electronic
polarizability if the density of sulphur at this temperature is 2.05 gms/cc.
Insertion of $\gamma = 1/3$ in (6.58) gives

$$\frac{N\alpha_e}{\epsilon_0} = 1.43$$

Since the atomic weight of sulphur is 32, one kg-mole contains $N_A = 6.02 \times 10^{26}$ atoms
and weighs 32 kg. This many atoms occupies a volume

$$V = \frac{32,000}{2.05} = 15,600 \text{ cm}^3$$

and therefore the atom density is

$$N = \frac{6.02 \times 10^{26}}{0.0156} = 3.86 \times 10^{28} \text{ atoms/m}^3$$

The electronic polarizability is thus

$$\alpha_e = \frac{1.43 \times 8.854 \times 10^{-12}}{3.86 \times 10^{28}} = 3.28 \times 10^{-40} \text{ farad m}^2$$

With reference to Table 6.2, since sulphur is close to argon in the periodic table, this result
is seen to be the right order of magnitude, but larger than one would expect for a free
sulphur atom. This increase of electronic polarizability in the solid state is due to the fact
that the bonding of adjacent atoms affects the valence electrons. Similar calculations for
the diamond-structure solids, such as silicon and germanium, also give a somewhat higher
value for $\alpha_e$ than would be predicted for the corresponding free atom.
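The chain of steps in Example 6.9 can be followed in the sketch below, which inverts (6.58) for $N\alpha_e/\epsilon_0$ and then computes the atom density from the mass density. Because it carries unrounded intermediate values, the final figure differs slightly in the third digit from the one printed in the example.

```python
EPS0 = 8.854e-12      # farad/meter
AVOGADRO = 6.02e26    # atoms per kg-mole, as used in the text

eps_r = 3.75              # relative dielectric constant of sulphur at 25 deg C
gamma = 1.0 / 3.0         # assumed local-field factor
density = 2050.0          # kg per cubic meter (2.05 gms/cc)
atomic_weight = 32.0      # kg per kg-mole

# Invert (6.58): with chi = eps_r - 1, N alpha_e / eps0 = chi / (1 + gamma * chi).
chi = eps_r - 1.0
n_alpha_over_eps0 = chi / (1.0 + gamma * chi)

# Atom density from N = N_A * density / atomic weight.
n_atoms = AVOGADRO * density / atomic_weight

alpha_e = n_alpha_over_eps0 * EPS0 / n_atoms

print(f"N alpha_e / eps0 = {n_alpha_over_eps0:.3f}")
print(f"N = {n_atoms:.3e} atoms per cubic meter")
print(f"alpha_e = {alpha_e:.3e} farad m^2")
```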

Since the distance between atoms in a solid is only slightly affected by temperature,
the quantities $\alpha_e$, $\gamma$, and $N$ which appear in (6.58) are temperature insensitive, and
one would expect $\epsilon_r$ for an elemental solid dielectric to be essentially constant over a
wide temperature range. This prediction is confirmed by experiment.
Ionic Non-polar Solid Dielectrics. Generally, those solids which contain more than
one type of atom, but no permanent dipoles, evidence ionic as well as electronic
polarizability. Prominent among such solids are the ionic crystals, such as the alkali halides.
The structure of these crystals is characterized by a regular three-dimensional
alternation of positive and negative ions, and hence the entire crystal has no permanent
dipole moment. However, in the presence of an external field, the positive ion lattice
will suffer a displacement relative to the negative ion lattice, resulting in ionic
polarization. Additionally, both ion types will show electronic polarization, so that the total
polarization density may be written as the sum of these contributions, in the form
$P = P_e + P_i$. The static relative dielectric constant $\epsilon_{rs}$ is related to $P$ by the expression

$$P_e + P_i = \epsilon_0 (\epsilon_{rs} - 1) E \qquad (6.59)$$

However, if the dielectric constant is measured at light frequencies (by a refraction
experiment), the ions are too heavy to follow the field variations, and $P_i = 0$. Denoting
the relative dielectric constant under these conditions by $\epsilon_{rl}$ one can write

$$P_e = \epsilon_0 (\epsilon_{rl} - 1) E \qquad (6.60)$$

Combining (6.59) and (6.60) yields for the ratio of polarization densities

$$\frac{P_i}{P_e} = \frac{\epsilon_{rs} - \epsilon_{rl}}{\epsilon_{rl} - 1} \qquad (6.61)$$

Therefore measurements of the dielectric constant under quasistatic and optical
conditions provide an indication of the relative strengths of ionic and electronic
polarization.
Table 6.4 lists these data for many of the alkali halides. The ionic contribution is seen
typically to be three times as great as the electronic contribution.

TABLE 6.4
STATIC AND OPTICAL DIELECTRIC CONSTANTS FOR ALKALI HALIDES

        ε_rs     ε_rl    P_i/P_e            ε_rs    ε_rl    P_i/P_e
LiF     9.27     1.92    8.0        KF      6.05    1.85    4.9
LiCl    11.05    2.75    4.7        KCl     4.68    2.13    2.3
LiBr    12.1     3.16    4.1        KBr     4.78    2.33    1.8
LiI     11.03    3.80    2.6        KI      4.94    2.69    1.3
NaF     6.0      1.74    5.8        RbF     5.91    1.93    4.3
NaCl    5.62     2.25    2.7        RbCl    5.0     2.19    2.4
NaBr    5.99     2.62    2.0        RbBr    5.0     2.33    2.0
NaI     6.60     2.91    1.9        RbI     5.0     2.63    1.5
Because $P_i$, like $P_e$, is dependent on atomic structure and particle density (quantities
which normally are unaffected by temperature), ionic solids also reveal susceptibilities
which are independent of temperature.
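The ratio column of Table 6.4 follows directly from the two measured dielectric constants through the ratio of ionic to electronic polarization, $(\epsilon_{rs} - \epsilon_{rl})/(\epsilon_{rl} - 1)$. A short sketch, using a handful of rows from the table, shows the calculation; the function name is illustrative.

```python
def ionic_to_electronic_ratio(eps_rs: float, eps_rl: float) -> float:
    """Ratio P_i / P_e = (eps_rs - eps_rl) / (eps_rl - 1)."""
    return (eps_rs - eps_rl) / (eps_rl - 1.0)


# (static, optical) relative dielectric constants taken from Table 6.4.
alkali_halides = {
    "LiF": (9.27, 1.92),
    "NaCl": (5.62, 2.25),
    "KCl": (4.68, 2.13),
    "RbI": (5.0, 2.63),
}

ratios = {
    name: ionic_to_electronic_ratio(eps_rs, eps_rl)
    for name, (eps_rs, eps_rl) in alkali_halides.items()
}

for name, ratio in ratios.items():
    print(f"{name}: P_i / P_e = {ratio:.2f}")
```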
Polar Solids. For solids whose molecules possess permanent dipole moments, the
total polarization contains three contributions, and may be written $P = P_e + P_i + P_o$,
with $P_o$ representing the orientational contribution. Unfortunately, no adequate
quantitative theory exists which relates $P_o$ to its stimulus in the case of solids. The
reason for this is that, unlike the molecules of a liquid or a gas which can rotate freely,
the molecules of a solid are often constrained by the stability of the structure and the
directional character of the bonding. This constraint varies from one material to
another, and impedes the alignment of the permanent dipoles with the electric field
in a manner which, unlike thermal agitation, is not random, and cannot be expressed
generally. Therefore the discussion of polar solids must be limited principally to some
qualitative remarks.
In solids the ease with which a molecule can rotate depends on its shape and on its
interactions with its neighbors. Apparently the more symmetrical the molecule is, the
more freely it will rotate. Non-polar solid methane (CH₄), which is highly symmetrical,
exhibits this feature and so does solid hydrogen. However, less symmetrical molecules
such as H₂S and HCl do not. Indeed, instead of displaying a rotational characteristic,
they appear to have several stable orientations and thus to obey an order-disorder
theory.²⁰
20 N. L. Alpert, "Study of Phase Transitions by Means of Nuclear Magnetic Resonance Phenomena,"
Phys Rev, 75, 398-410; February 1, 1949.

Polar solids whose molecules have a discrete number of allowed orientations show a
dielectric susceptibility with the same temperature dependence that one would expect
if the molecules could rotate freely. To see this, imagine that there are $M$ allowed
orientations which make angles $\theta_i$ with the field $E_{loc}$. If $U_i$ is the energy of a molecule in
the $i$th state in the absence of an external field, then the relative population of the
field-free states is

$$e^{-U_i/kT}$$

No net orientational polarization in the absence of a field implies that

$$\sum_{i=1}^{M} \mu \cos\theta_i \, e^{-U_i/kT} = 0 \qquad (6.62)$$

When the field is present, the relative population becomes

$$e^{-(U_i - \mu E_{loc} \cos\theta_i)/kT}$$

and the net polarization per molecule is therefore

$$p_o = \frac{\sum_{i=1}^{M} \mu \cos\theta_i \, e^{-(U_i - \mu E_{loc} \cos\theta_i)/kT}}{\sum_{i=1}^{M} e^{-(U_i - \mu E_{loc} \cos\theta_i)/kT}}$$

When $\mu E_{loc}/kT \ll 1$, this reduces to

$$p_o = \frac{\sum_{i=1}^{M} \mu \cos\theta_i \, e^{-U_i/kT} [1 + \mu E_{loc} \cos\theta_i / kT]}{\sum_{i=1}^{M} e^{-U_i/kT}} = \frac{\mu^2 E_{loc}}{kT} \, \frac{\sum_{i=1}^{M} \cos^2\theta_i \, e^{-U_i/kT}}{\sum_{i=1}^{M} e^{-U_i/kT}}$$

in which (6.62) has been used to drop the terms that vanish to first order in the field.

FIGURE 6.12 Relative dielectric constant of solid H₂S vs. temperature $T$ (°K) at
5 kcps. [After Smyth and Hitchcock, J. Am. Chem. Soc., 56, 1084; 1934.]

If $U_i/kT$ is also small, as is usually the case, then

$$p_o = \frac{\mu^2 E_{loc}}{kT} \, \frac{\sum_{i=1}^{M} \cos^2\theta_i}{M} \qquad (6.63)$$

which, except for a difference in the multiplicative factor, is the same as expression
(6.43). For this reason, one would expect a polar solid to exhibit a $T^{-1}$ temperature
dependence, whether the orientational mechanism is a free rotation or a set of states.
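The discrete-orientation result can be compared against the exact Boltzmann sum for a simple case. The sketch below uses a hypothetical set of six allowed orientations along the positive and negative directions of three orthogonal axes, with the field along one axis and equal field-free energies; this choice is an illustration, not a model taken from the text. For it, $\sum \cos^2\theta_i / M = 1/3$, so (6.63) coincides with (6.43).

```python
import math


def p_o_exact(mu, e_loc, k_t, cosines, energies):
    """Net moment per molecule from the Boltzmann-weighted sum over allowed states."""
    weights = [
        math.exp(-(u - mu * e_loc * c) / k_t) for c, u in zip(cosines, energies)
    ]
    numerator = sum(mu * c * w for c, w in zip(cosines, weights))
    return numerator / sum(weights)


def p_o_small_field(mu, e_loc, k_t, cosines):
    """Equation (6.63): valid when mu*E_loc/kT and U_i/kT are both small."""
    return mu**2 * e_loc / k_t * sum(c * c for c in cosines) / len(cosines)


# Hypothetical six-state model in normalized units (mu = 1, kT = 1).
cosines = [1.0, -1.0, 0.0, 0.0, 0.0, 0.0]   # cos(theta_i) for field along one axis
energies = [0.0] * 6                        # equal field-free energies U_i
mu, k_t, e_loc = 1.0, 1.0, 0.01             # so mu * E_loc / kT = 0.01

exact_value = p_o_exact(mu, e_loc, k_t, cosines, energies)
approx_value = p_o_small_field(mu, e_loc, k_t, cosines)

print(f"exact  p_o = {exact_value:.8f}")
print(f"(6.63) p_o = {approx_value:.8f}")
```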

FIGURE 6.13 Relative dielectric constant of nitrobenzene vs. temperature $T$ (°K), with the
solid region and melting point marked. [After Smyth and Hitchcock, J. Am. Chem. Soc., 54, 4631; 1932.]

This temperature dependence is shown clearly in substances such as solid HCl and
H₂S. The relative dielectric constant of solid hydrogen sulfide is plotted as a function
of temperature in Figure 6.12. $\epsilon_r$ is seen to increase as the temperature is lowered until,
at 127°K, a slight jump occurs, presumably occasioned by a change in structure and a
reconstitution of allowed states. The temperature dependence is once again evident
until 103.5°K is reached, at which point the permanent dipole moments are apparently
"frozen" and no orientational effect remains.
In considering liquid dielectrics, once again three classes of materials may be
identified. Those liquids which have only electronic, or at most electronic and ionic
polarizabilities, may be discussed in a manner analogous to what has already been done in the
case of non-polar solids. For the same reasons, such liquids customarily display
dielectric constants which are temperature insensitive. Appreciable susceptibilities are found
in some of these liquids because of their favorable molecular structure and high particle
density.
An additional difficulty arises when one attempts to apply the theory to polar liquids.
Although molecular rotation is normally free, so that the Langevin analysis of net
orientation appears pertinent, an actual rotation of a given molecule affects its nearby
neighbors and alters $E_{loc}$, thus further complicating the calculation of $\alpha_o$. This effect
was first appreciated by Onsager²¹ in 1936 and approximate theories have been
developed which take it into account.²² The prediction of these theories is that $\alpha_o$ is still
proportional to $\mu^2/kT$, the only difference being a change in the value of the
proportionality constant.
This $T^{-1}$ temperature dependence of the dielectric constant is evident in the liquid
phases of polar materials such as HCl and H₂S. The case of nitrobenzene is shown in
Figure 6.13. A rise in $\epsilon_r$ is seen to accompany a lowering of the temperature until
solidification apparently sets the permanent dipoles in fixed orientations.

6.12 THE CLAUSIUS-MOSSOTTI EQUATION

If the Lorentz expression for the local field, Equation (6.30), is assumed, so that
$\gamma = 1/3$, then the general polarizability Equation (6.51) may be written in the form

$$\frac{\chi_e}{\chi_e + 3} = \frac{N\alpha}{3\epsilon_0} = \frac{\epsilon_r - 1}{\epsilon_r + 2} \qquad (6.64)$$

If $\rho_M$ is the mass density of the dielectric material, then the molecular density $N$ is
given by

$$N = \frac{N_A \rho_M}{M} \qquad (6.65)$$

in which $N_A$ is Avogadro's number and $M$ is the molecular weight. In MKS units,
$N_A = 6.02 \times 10^{26}$ is the number of molecules in one kilogram molecular weight, and
$M$ is the kilogram molecular weight (as an illustration, $M = 32$ kilograms for oxygen);
$\rho_M$ is expressed in kg/m³. If (6.65) is substituted in (6.64) one obtains

$$\frac{M}{\rho_M} \, \frac{\epsilon_r - 1}{\epsilon_r + 2} = \frac{N_A \alpha}{3\epsilon_0} \qquad (6.66)$$

This is the Debye generalization of the Clausius-Mossotti equation, the earlier version
of which had not included the effects of orientational polarization. It relates the
dielectric constant to the mass density of the material, and is written in such a form that the
right side (called the molar polarizability) is a function only of temperature. Therefore
the left side is independent of density.
21 L. Onsager, "Electric Moments of Molecules in Liquids," J Am Chem Soc, 58, 1486-1493; 1936.
22 H. Frohlich, Theory of Dielectrics, 2d ed., pp. 33-50, Oxford Press, London, 1958.

This equation is valid only for dielectrics of low density, for which inaccuracies in
the Lorentz local field expression have little effect. Thus principal agreement between
(6.66) and experiment is found for gases at not too excessive pressures. The form (6.66)
is really not pertinent to liquids and solids, in that their densities are not readily
changed, and then not substantially. Equation (6.64), which is often also called the
Clausius-Mossotti formula, gives fair agreement with experiment for non-polar liquids
in which short range forces are negligible, and also for some simple non-polar solids for
which the Lorentz local field expression is a good approximation.²³
The case of gaseous hydrogen is illustrated by Table 6.5, in which the calculated
values of molar polarizability listed in the last column are seen to be almost constant
over a wide range in density.

TABLE 6.5
DIELECTRIC CONSTANT OF HYDROGEN VERSUS DENSITY AT 24.9°C 24

Pressure         Density $\rho_M$    $\epsilon_r$    $\dfrac{M}{\rho_M}\,\dfrac{\epsilon_r - 1}{\epsilon_r + 2}$
(atmospheres)    (kg/m³)
    7.96            0.324            1.00192         3.946
   30.03            1.206            1.00730         4.026
   88.13            3.421            1.02083         4.030
  255.04            8.984            1.05540         4.038
  478.78           14.955            1.09310         4.026
  814.62           21.755            1.13766         4.032
 1425.36           30.357            1.19500         4.022

24 After A. Michels, P. Sanders, and A. Schipper, "The Dielectric Constant of Hydrogen,"
Physica, 2, 753-756; 1935.
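The near-constancy of the last column of Table 6.5 can be checked directly from the density and dielectric-constant columns. The sketch below is only illustrative: the value $M = 2.0$ kilograms per kilogram-mole for hydrogen is an assumption (the table does not state it), and the names `TABLE_6_5` and `molar_polarizability` are introduced here for convenience. With that assumed $M$, the computed values multiplied by $10^3$ appear to agree with the listed column to within a few units in the third decimal place, which suggests that the tabulated figures carry a factor of $10^{-3}$ in m³ per kilogram-mole; that scale is an inference, not something the table states.

```python
# Recompute the last column of Table 6.5 from its density and eps_r columns.
# Assumption (not stated in the table): M = 2.0 kg per kilogram-mole for hydrogen.
M_H2 = 2.0

# Rows: (pressure in atmospheres, density in kg/m^3, eps_r, listed last column)
TABLE_6_5 = [
    (7.96, 0.324, 1.00192, 3.946),
    (30.03, 1.206, 1.00730, 4.026),
    (88.13, 3.421, 1.02083, 4.030),
    (255.04, 8.984, 1.05540, 4.038),
    (478.78, 14.955, 1.09310, 4.026),
    (814.62, 21.755, 1.13766, 4.032),
    (1425.36, 30.357, 1.19500, 4.022),
]


def molar_polarizability(density, eps_r, molar_mass=M_H2):
    """Return (M / rho_M) * (eps_r - 1) / (eps_r + 2), the heading of the last column."""
    return (molar_mass / density) * (eps_r - 1.0) / (eps_r + 2.0)


computed = [molar_polarizability(rho, eps) for _, rho, eps, _ in TABLE_6_5]

for (pressure, _, _, listed), value in zip(TABLE_6_5, computed):
    # The factor 1e3 is the inferred scale of the tabulated column.
    print(f"{pressure:8.2f} atm   computed {value * 1e3:.3f}   listed {listed:.3f}")
```

The density changes by a factor of nearly one hundred over the table while the computed quantity varies by only a few parts in a thousand once the lowest-pressure row is set aside.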

If the dielectric is a mixture, consisting of a homogeneous blend of a number of
different types of molecules, the Clausius-Mossotti equation may be extended readily
to the form

$$\frac{\epsilon_r - 1}{\epsilon_r + 2} = \frac{1}{3\epsilon_0}\sum_n N_n\alpha_n \tag{6.67}$$

in which $N_n$ and $\alpha_n$ are the molecular density and the total polarizability of the nth
species.
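As a rough illustration of how the mixture form might be used, the sketch below solves it for $\epsilon_r$ given a list of $(N_n, \alpha_n)$ pairs. The function name `mixture_eps_r` and the two-species numbers are hypothetical, chosen only to give a gas-like result; they are not data from the text.

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m


def mixture_eps_r(species):
    """Solve (eps_r - 1)/(eps_r + 2) = sum(N_n * alpha_n) / (3 * eps0) for eps_r.

    species: iterable of (N_n, alpha_n) pairs in SI units (m^-3 and F*m^2).
    """
    x = sum(n * alpha for n, alpha in species) / (3.0 * EPS0)
    if x >= 1.0:
        # The Clausius-Mossotti form has no finite solution at or beyond this point.
        raise ValueError("sum of N_n * alpha_n is too large for a finite eps_r")
    return (1.0 + 2.0 * x) / (1.0 - x)


# Hypothetical two-species gas (illustrative values only).
example_mixture = [(2.0e25, 1.0e-40), (0.5e25, 2.0e-40)]
eps_r_mixture = mixture_eps_r(example_mixture)
print(eps_r_mixture)
```

Only the sum of the products $N_n\alpha_n$ enters, so any two blends with the same sum are predicted to have the same dielectric constant.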

EXAMPLE 6.10
If Equation (6.66) is re-solved for the relative dielectric constant, one obtains

$$\epsilon_r = \frac{1 + 2N_A\alpha\rho_M/(3M\epsilon_0)}{1 - N_A\alpha\rho_M/(3M\epsilon_0)}$$

23 For penetrating discussions of the validity of the Clausius-Mossotti formula, see H. Frohlich,

Theory of Dielectrics, 2d ed., Appendix A3, Oxford Press, London, 1958, and C. J. F. Bottcher,
Theory of Electric Polarisation, pp. 199-212, Elsevier Publishing Company, Amsterdam, 1952.

and this equation may be plotted, with the result suggested by the figure. As the density
increases, the dielectric constant is seen to climb more and more sharply, and tend to
infinity as the density approaches the critical value $3M\epsilon_0/(N_A\alpha)$. This so-called polarization
catastrophe is not approached in the case of most real materials because the approximations
inherent in the derivation of the Clausius-Mossotti equation become invalid long before
such densities are reached. However, it provides an interesting insight to the behavior of
ferroelectric crystals, as shall be seen in Section 6.14. Many gases conform to the lower
portions of the curve, especially those with small polarizabilities.

[Figure: relative dielectric constant $\epsilon_r$ plotted against density $\rho_M$.]
A typical gas under standard conditions has a relative dielectric constant in the neighbor-
hood of 1.001. For such a gas, if one asks by what factor the density must be increased to
raise $\epsilon_r$ to the value 2, use of the above equation gives the result 750. This requires that
the molecules be 9 times closer together, which is approaching the liquid state. It is apparent
that an inordinate change in the state variables of a gas is needed before a significant change
in its susceptibility will occur.
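The figures 750 and 9 quoted in this example can be reproduced in a few lines. At fixed temperature the quantity $(\epsilon_r - 1)/(\epsilon_r + 2)$ is proportional to density, so the required density factor is simply the ratio of that quantity at the two dielectric constants. The sketch below assumes nothing beyond that proportionality; the function name is introduced here for illustration.

```python
def clausius_mossotti_ratio(eps_r):
    """Return (eps_r - 1) / (eps_r + 2), which is proportional to density."""
    return (eps_r - 1.0) / (eps_r + 2.0)


eps_initial = 1.001  # typical gas under standard conditions, as quoted in the example
eps_target = 2.0

density_factor = clausius_mossotti_ratio(eps_target) / clausius_mossotti_ratio(eps_initial)
# Mean spacing scales as the inverse cube root of the number density.
spacing_factor = density_factor ** (1.0 / 3.0)

print(f"density factor:  {density_factor:.1f}")
print(f"spacing factor:  {spacing_factor:.2f}")
```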

EXAMPLE 6.11
If the Clausius-Mossotti equation is written for non-polar materials under quasistatic
conditions ($\alpha = \alpha_e + \alpha_i$) and at light frequencies ($\alpha = \alpha_e$) and the difference is taken, one
obtains

$$\frac{M}{\rho_M}\left[\frac{\epsilon_{rs} - 1}{\epsilon_{rs} + 2} - \frac{\epsilon_{rl} - 1}{\epsilon_{rl} + 2}\right] = \frac{N_A\alpha_i}{3\epsilon_0} = \Pi_i$$

this result being known as the molar ionic polarizability. If attention is directed to the
solid alkali halides, and the expression found for $\alpha_i$ in Example 6.4 is used, one finds that

$$\frac{\epsilon_{rs} - 1}{\epsilon_{rs} + 2} - \frac{\epsilon_{rl} - 1}{\epsilon_{rl} + 2} = \frac{N\alpha_i}{3\epsilon_0} = \frac{2\pi}{1.74(n + 1)}$$

since $2d_0^3$ is the volume occupied by one molecule, so that $2Nd_0^3 = 1$. Use of the experimental
values of dielectric constant from Table 6.4 provides a check of this equation.
Comparing the last two columns of Table 6.6 reveals that the agreement is reasonably good
for some of the salts listed and fair for the others.

TABLE 6.6
MOLAR IONIC POLARIZABILITIES OF VARIOUS ALKALI HALIDES

          Repulsive         Experimental                                                                      Theoretical
          coefficient, n    $\dfrac{\epsilon_{rs} - 1}{\epsilon_{rs} + 2} - \dfrac{\epsilon_{rl} - 1}{\epsilon_{rl} + 2}$
LiF        6                0.50                                                                              0.52
LiCl       7                0.40                                                                              0.45
NaF        7                0.43                                                                              0.45
NaCl       8                0.31                                                                              0.40
KCl        9                0.28                                                                              0.36
KI        10.5              0.21                                                                              0.31
RbBr      10                0.27                                                                              0.33
RbI       11                0.22                                                                              0.30
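The theoretical column of Table 6.6 follows from the expression $2\pi/[1.74(n + 1)]$ of Example 6.11, and a short script makes the comparison with experiment explicit. This is a sketch only: the list `TABLE_6_6` simply transcribes the table, and the function name `theoretical_value` is introduced here for illustration.

```python
import math

# Rows: (salt, repulsive coefficient n, experimental value, listed theoretical value)
TABLE_6_6 = [
    ("LiF", 6, 0.50, 0.52),
    ("LiCl", 7, 0.40, 0.45),
    ("NaF", 7, 0.43, 0.45),
    ("NaCl", 8, 0.31, 0.40),
    ("KCl", 9, 0.28, 0.36),
    ("KI", 10.5, 0.21, 0.31),
    ("RbBr", 10, 0.27, 0.33),
    ("RbI", 11, 0.22, 0.30),
]


def theoretical_value(n):
    """Return 2*pi / (1.74 * (n + 1)), the right side of the Example 6.11 result."""
    return 2.0 * math.pi / (1.74 * (n + 1.0))


for salt, n, experimental, listed in TABLE_6_6:
    predicted = theoretical_value(n)
    ratio = experimental / predicted
    print(f"{salt:5s} n={n:<5} predicted {predicted:.3f}  listed {listed:.2f}  exp/theory {ratio:.2f}")
```

The ratio of experimental to theoretical value is closest to unity for the fluorides and falls off for the heavier salts, in line with the remark that agreement is reasonably good for some salts and only fair for others.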

6.13 PRIMARY STATIC CHARGES IN AN INFINITE,
HOMOGENEOUS, ISOTROPIC MEDIUM

The results achieved so far in this development of a theory of dielectric behavior permit
consideration of a problem of some theoretical interest. Imagine that a system of static
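primary charges of density $\rho$ is contained within a finite volume $V_1$, this volume being
part of an infinite homogeneous and isotropic medium. In this event the total macroscopic
field at any point in space is given by

$$\mathbf{E} = -\nabla_F\left[\int_{V_1}\frac{\rho\,dV}{4\pi\epsilon_0\ell} + \oint_{S_\infty}\frac{P_n\,dS}{4\pi\epsilon_0\ell} - \int_{V_\infty}\frac{\nabla_S\cdot\mathbf{P}\,dV}{4\pi\epsilon_0\ell}\right] \tag{6.68}$$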

in which it has been assumed that the field caused by the primary charges has induced
in the medium a static polarization density P which is linearly proportional to the field.
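The surface integral over $S_\infty$ vanishes because $S_\infty$ may be taken to be a sphere of
radius $R \to \infty$, and over $S_\infty$ the polarization $\mathbf{P}$ is $R$-dependent to a greater negative
power than $-2$.† Therefore (6.68) may be written

$$\mathbf{E} = -\nabla_F\int_{V_\infty}\frac{(\rho - \nabla_S\cdot\mathbf{P})\,dV}{4\pi\epsilon_0\ell} \tag{6.69}$$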

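But $\mathbf{P}$ has been assumed to be linearly proportional to the macroscopic $\mathbf{E}$ field so that

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} = \epsilon_0\mathbf{E} + \chi_e\epsilon_0\mathbf{E}$$

$$\nabla\cdot\mathbf{D} = \rho = \epsilon_0\nabla\cdot\mathbf{E} + \nabla\cdot\mathbf{P} = \epsilon_0(1 + \chi_e)\nabla\cdot\mathbf{E}$$

$$\rho - \nabla\cdot\mathbf{P} = \epsilon_0\nabla\cdot\mathbf{E} = \frac{\rho}{1 + \chi_e}$$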
† This conclusion may be reached by permitting the minutest bit of loss in the medium during
establishment of the field.

Making this substitution in (6.69), one obtains

$$\mathbf{E} = -\nabla_F\int_{V_1}\frac{\rho\,dV}{4\pi\epsilon\ell} \tag{6.70}$$

and it is as though no polarized molecules were present, but Coulomb's law and all its
consequences were valid with $\epsilon_0$ replaced by $\epsilon$ and only the primary charges considered.
For gases such as air in which $\chi_e$ is very small, the impractical restriction of Chapter 3,
that only primary charges in a vacuum would be considered, may be removed with no
practical alteration in any of the results.

6.14 FERROELECTRIC CRYSTALS

For all the dielectric materials considered in the preceding sections, the displacements
of centers of charge and the accompanying polarization were linearly proportional to
the externally applied electric field, and disappeared upon removal of the field. However,
not all dielectric materials behave in this manner. In recent years several classes
of ionic crystals have been discovered to have the property that, when the centers of
positive and negative charge have been pulled apart sufficiently, they remain locked
in their new positions, causing the molecules to become permanently polarized. These
crystals then display polarization without benefit of an external field, and are said to
be spontaneously polarized.
It is possible to cause a specimen of such material to become uniformly polarized,
but a more common initial situation is one in which subvolumes of the specimen, called
domains, are individually uniformly polarized, with the direction of polarization varying
from one domain to another in a random manner.25 When the specimen is in this
condition it shows no net bulk polarization. If now an external electric field is applied,
those domains whose polarization is aligned with the field grow at the expense of the
other domains, as individual molecules change the direction of their polarization. A
plot of the net polarization, as the intensity of the applied electric field is increased,
will take the form of the curve oab in Figure 6.14. At first the slope of this curve keeps
increasing, as the growth of favorable domains assists the external field in the realigning
process. Then an inflection point is reached, and the slope decreases as saturation sets
in until, at point b, only favorably oriented domains are left and no further realign-
ment is possible. Additional increase of the electric field then causes only a slight linear
change in P (portion bc), this occurring because of the normal electronic and ionic
polarization effects discussed previously. An extrapolation of the linear segment bc
intersects the P axis at a value P s , which is the spontaneous polarization density each
domain individually possessed in the random state, and which the organized specimen
now exhibits.
If the electric field is decreased, some nonaligned domains spring up and begin to
grow, and the curve cbe is traced out. However, when the E field is totally removed,
a remanent polarization oe still exists and the specimen shows a bulk polarization in
the absence of an external field. The electric field must be reversed and reach a value
25 These domains can be observed through a microscope using polarized light. See, e.g., P. W. Forsbergh, Jr.,
"Domain Structures and Phase Transitions in Barium Titanate," Phys Rev, 76, 1187-1201; 1949.

fo before the net polarization is reduced to zero. If the reversed field is further increased,
the curve fg is followed, saturation once again setting in when those domains with
polarization parallel to E have grown to engulf the entire specimen. Another reversal
of the electric field results in the curve ghic, thus completing the cycle.
The similarities between this process and the corresponding one involving ferro-
magnetic materials explain why dielectrics which display this hysteresis effect are called
ferroelectrics. The choice of the word ferroelectric is not completely appropriate because
the microscopic mechanisms are dissimilar, as will become evident in Chapter 7.

FIGURE 6.14 Hysteresis curve for ferroelectric crystal. [Axes: polarization $P$ versus applied field $E$.]

The nature of the hysteresis curve of Figure 6.14 points up the fact that there is no
linear relation between polarization and field for ferroelectric crystals, and that the
relation is not even single-valued. One can define a differential dielectric "constant"
by the equation

$$\epsilon_r = 1 + \frac{1}{\epsilon_0}\frac{dP}{dE} \tag{6.71}$$

but $\epsilon_r$ is not a constant, and its value depends on the previous history of the specimen,
as well as on the value of $E$. In speaking of "the dielectric constant" of a ferroelectric
crystal, what is usually meant is the value of $\epsilon_r$ deduced from (6.71) when $dP/dE$ is
found for the virgin curve oab at the origin. It is a characteristic of ferroelectric crystals
that $\epsilon_r$ so determined is quite high, with values in the range 500-5,000 being not
uncommon.
This phenomenon of spontaneous polarization has some features akin to orientational
polarization, but the two mechanisms differ in several fundamental ways.
Orientational polarization disappears when the external field is removed; spontaneous
polarization does not. Orientational polarization is due to permanent dipole moments
possessed by the individual molecules. These molecules orient themselves in a random

succession of directions due to thermal agitation and therefore orientational polarizability
is temperature dependent. However, in ferroelectric crystals, the individual
molecules have permanent dipole moments which, within a given domain, are all
aligned steadily, thermal agitation not affecting this orientation.
Ferroelectric crystals do exhibit a different type of temperature effect, however. The
spontaneous polarization usually disappears above a certain characteristic temperature
$\theta_f$, called the ferroelectric Curie temperature. The reason for this can be traced
to a change in crystal structure such that individual molecules no longer possess permanent
dipole moments. Above the Curie temperature the dielectric constant is found
to vary with temperature in such a way as to obey the Curie-Weiss law

$$\epsilon = \frac{C}{T - \theta} \tag{6.72}$$

in which $C$ and $\theta$ are constants characteristic of the crystal in question. $\theta$ is usually a
few degrees below the Curie temperature $\theta_f$.

FIGURE 6.15 Spontaneous polarization $P_s$ of potassium dihydrophosphate,
KH₂PO₄, versus temperature $T$ (°K). [After Arx and Bantle, Helv Phys Acta, 16, 211; 1943.]

Several different groups of ferroelectric crystals may be classified on the basis of
their chemical composition, structure, and electrical behavior:
1. Dihydrogen phosphates and arsenates of the alkali metals. Typical of this group
is KH₂PO₄, whose spontaneous polarization density as a function of temperature is
given in Figure 6.15. The Curie temperature is seen to be 123°K, which is representative
of the entire group, and so low as to limit seriously their practical applications. The
shape of this curve is similar to one showing the spontaneous magnetization of
iron, as shall be seen in Chapter 7.
2. The tartrates. Characteristic of this group, and the first solid to be recognized
as possessing ferroelectric properties, is Rochelle salt (NaKC₄H₄O₆ · 4H₂O). This salt
was first prepared in 1672 by the French pharmacist Seignette who lived in La Rochelle,
and for this reason ferroelectricity is referred to by some authors as Seignette electricity.
Figure 6.16 gives the spontaneous polarization density of Rochelle salt as a
function of temperature and reveals two transition temperatures, at 255°K and 296°K.
In the ferroelectric phase the crystal is monoclinic but above 296°K and below 255°K
it has an orthorhombic structure. The spontaneous polarization occurs along one
polar axis only and therefore the dipole moment is limited to two opposite directions,
resulting in a relatively simple domain structure. When measured along this axis, the
relative dielectric constant is found to reach values as great as 4,000 in the neighborhood
of the two transition temperatures. Transverse to this axis, the relative dielectric
constant is very low.

FIGURE 6.16 Spontaneous polarization of Rochelle salt versus temperature $T$ (°K).
[After Hablützel, Helv Phys Acta, 12, 489; 1939.]

3. The GASH group. So-named after guanidine aluminum sulphate hexahydrate
(CN₃H₆)Al(SO₄)₂ · 6H₂O, which was discovered to be ferroelectric in 1955, this group
contains a wide variety of crystals. They have the common property of possessing
hydrogen bonds of the type O-H-O or O-H-N which link together deformable
ions such as (SO₄)²⁻, (SeO₄)²⁻, etc. The list includes certain alums and many glycine
compounds. GASH-type crystals are characteristically "soft," which means that they
are water soluble, have a low melting or decomposition temperature, and are physically
soft at room temperature. They exhibit relative dielectric constants of the order
of 5 or 6 and do not have a detectable Curie point, presumably because elevation of
the temperature occasions the loss of water of crystallization, thus preventing reproducible
results. GASH itself is trigonal and has a spontaneous polarization of 0.35 microcoul/cm²
at room temperature. An attractive feature is its square hysteresis loop.
4. The oxygen octahedron group. Prominent among these crystals is barium titanate,
BaTiO₃. Above the Curie temperature $\theta_f = 393$°K, it possesses the cubical structure
suggested by Figure 6.17. This is called the perovskite structure, after the prototype
mineral CaTiO₃. The Ba²⁺ ions are seen to occupy the corners of a cube, the centers
of the six faces being occupied by O²⁻ ions. The oxygen ions thus form an octahedron,
at the center of which is found the Ti⁴⁺ ion.
Below the Curie temperature the structure is no longer cubic. The material becomes
spontaneously polarized in a direction parallel to one of the cube edges, and along this

direction the crystal expands, whereas transverse to this direction it contracts, resulting
in a tetragonal structure. The Ti⁴⁺ ion is no longer at the center of gravity of the
O²⁻ ions, having suffered a sizable relative displacement which, coupled with its 4e
charge, can account for the high resulting polarization density. Since there are six possible
directions of spontaneous polarization, the domain structure of barium titanate is
more complex than that of Rochelle salt.

FIGURE 6.17 The perovskite crystal structure of barium titanate. [Legend: Ba²⁺, Ti⁴⁺, and O²⁻ ions.]

Figure 6.18 depicts the spontaneous polarization density and relative dielectric constant
of BaTiO₃ versus temperature, when measured along a cube edge. Two additional
transition temperatures are observed. These occur at 278°K, where the spontaneous
polarization changes to a direction parallel to a face diagonal, and at 193°K, where
it changes direction again and becomes parallel to a body diagonal.
The possibility of spontaneous polarization in these ferroelectric crystals is suggested
by Equation (6.49). If $N\alpha\gamma/\epsilon_0 = 1$, the indication is that a finite value of $P$ can exist
with $E = 0$. To achieve this condition, one requires a high value of the product of $N$
(liquid or solid state), of $\gamma$ (large local field), and of the polarizability $\alpha$. This favorable
combination of factors is peculiar to a limited class of crystals.
A possible cause for the existence of a Curie temperature in these crystals may also
be traced to Equation (6.49), which can be combined with (6.50) and (6.53) to give

$$\epsilon_r - 1 = \frac{N\alpha/\epsilon_0}{1 - \gamma\dfrac{N\alpha}{\epsilon_0}} \tag{6.73}$$

If it is assumed that $\alpha$ and $\gamma$ are independent of temperature, then $\epsilon_r$ depends on temperature
in a manner controlled by $N$. But the particle density $N$, which is inversely
proportional to the volume, is related to the volume coefficient of expansion $\lambda$ by the
expression

$$\frac{dV/V}{dT} = \lambda = -\frac{dN/N}{dT} \tag{6.74}$$
Therefore as the temperature is decreased, the particle density increases. If at some

temperature $T$ the quantity $N\alpha\gamma/\epsilon_0$ is only slightly less than unity, cooling the crystal
to a lower temperature $\theta_f$ may cause sufficient contraction to make $N\alpha\gamma/\epsilon_0 = 1$, with
the result that spontaneous polarization may occur. In a small range of temperature
just above $\theta_f$, this would suggest that changes in the dielectric constant are related
to the volume expansion of the crystal.

FIGURE 6.18 Spontaneous polarization and dielectric constant
(along cube edge) of barium titanate as functions of temperature.
[Two panels: $P_s$ and $\epsilon_r$ versus $T$ (°K).]

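If Equation (6.73) is differentiated with respect to temperature, one obtains

$$\frac{dN/N}{dT} = -\lambda = \frac{d\epsilon_r/dT}{(\epsilon_r - 1)[\gamma(\epsilon_r - 1) + 1]}$$

In the neighborhood of $\theta_f$ the relative dielectric constant $\epsilon_r \gg 1$; since $\gamma$ is of the order
of unity,

$$\frac{1}{\epsilon_r^2}\frac{d\epsilon_r}{dT} \approx -\lambda\gamma$$

Formation of the integral of this expression gives

$$\int_{\epsilon_r(\theta_f)}^{\epsilon_r}\frac{d\epsilon_r}{\epsilon_r^2} = -\lambda\gamma\int_{\theta_f}^{T}dT$$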

Since $\epsilon_r(\theta_f) \to \infty$, this reduces to

$$\epsilon_r = \frac{1/\lambda\gamma}{T - \theta_f} \tag{6.75}$$

which is the form of the Curie-Weiss law (6.72); the agreement is not perfect, in that
experimentally $\theta$ is usually found to be several degrees below the Curie temperature
$\theta_f$. However, $(\lambda\gamma)^{-1}$ appears to be the right order of magnitude for the Curie constant
$C$. For example, barium titanate has an observed Curie constant of approximately $10^5$
and an expansion coefficient $\lambda = 3 \times 10^{-5}$ per degree, which would give $\gamma = 1/3$, a
reasonable value.
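The barium titanate estimate, and the link between (6.75) and the differential relation it was integrated from, can be checked numerically. The sketch below takes the two figures quoted above ($C \approx 10^5$, $\lambda = 3 \times 10^{-5}$ per degree) and the Curie temperature of 393°K; the evaluation temperature of 403°K and all function and variable names are illustrative assumptions.

```python
LAMBDA = 3.0e-5         # volume coefficient of expansion, per degree (quoted above)
CURIE_CONSTANT = 1.0e5  # observed Curie constant of barium titanate (quoted above)
THETA_F = 393.0         # Curie temperature of barium titanate, degrees K

# Identify C with 1 / (lambda * gamma) and solve for the local field constant.
gamma = 1.0 / (LAMBDA * CURIE_CONSTANT)


def eps_r_curie_weiss(temperature, theta_f=THETA_F, lam=LAMBDA, gam=gamma):
    """Relative dielectric constant from (6.75), valid only for temperature > theta_f."""
    return (1.0 / (lam * gam)) / (temperature - theta_f)


# Check that (6.75) satisfies d(eps_r)/dT = -lambda * gamma * eps_r**2
# at an arbitrarily chosen temperature 10 degrees above theta_f.
T = 403.0
h = 1.0e-3
numerical_slope = (eps_r_curie_weiss(T + h) - eps_r_curie_weiss(T - h)) / (2.0 * h)
expected_slope = -LAMBDA * gamma * eps_r_curie_weiss(T) ** 2

print(f"gamma = {gamma:.4f}")
print(f"eps_r at {T} K = {eps_r_curie_weiss(T):.0f}")
print(f"slope: numerical {numerical_slope:.2f}, expected {expected_slope:.2f}")
```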
The high value of the polarizability $\alpha$ exhibited by all ferroelectrics may be explored
further by returning to a discussion of crystal structure. In the case of barium titanate,
the cubical cell of Figure 6.17 has been measured and is found to have an edge dimension
of 4.00 Å. Since Figure 6.18 indicates a spontaneous polarization density at room temperature
of 0.16 coul/m², the equivalent dipole moment per unit cell of barium titanate
is

$$p = P_s\,dV = (0.16)(4 \times 10^{-10})^3 \approx 10^{-29}\ \text{coul-m}$$

If one assumes that this dipole moment is entirely due to a shift of the titanium ion
relative to the tetragonal array of oxygen ions, then this displacement is

$$d = \frac{10^{-29}}{4 \times 1.6 \times 10^{-19}} = 0.16\ \text{Å}$$

which is a reasonable result. The actual situation may be more complicated, since some
relative shift of the barium ions and the oxygen ions is possible. 26
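The arithmetic of this estimate is short enough to verify directly. The sketch below uses only the numbers quoted in the text (cell edge 4.00 Å, $P_s = 0.16$ coul/m², ionic charge $4e$ with $e = 1.6 \times 10^{-19}$ coul); the variable names are introduced here for illustration.

```python
P_S = 0.16             # spontaneous polarization density, coul/m^2
CELL_EDGE = 4.00e-10   # edge of the cubical cell, m
E_CHARGE = 1.6e-19     # electronic charge, coul
TI_VALENCE = 4         # the titanium ion carries a charge of 4e

# Dipole moment per unit cell: polarization density times cell volume.
dipole_per_cell = P_S * CELL_EDGE ** 3

# Displacement of the titanium ion needed to produce that moment on its own.
displacement_m = dipole_per_cell / (TI_VALENCE * E_CHARGE)
displacement_angstrom = displacement_m * 1.0e10

print(f"dipole moment per cell: {dipole_per_cell:.3e} coul-m")
print(f"titanium displacement:  {displacement_angstrom:.2f} angstrom")
```

The displacement comes out as a few percent of the cell edge, which is why the text regards it as reasonable.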
It is interesting to observe that if all the permanent dipole moments in liquid water
were aligned, the result would be a polarization density $P_0 \approx 0.10$ coul/m², a figure
which is comparable to the spontaneous polarization density for barium titanate. The
significant difference is that in water the dipole moments are randomly oriented whereas
within a domain of barium titanate they are all parallel. Another feature of interest
in barium titanate is that its electronic polarizability is not insignificant.
The unusual properties of ferroelectric crystals have led to their use in a variety
of practical applications. The nonlinear relation between P and E permits the design
of devices such as rectifiers; the polarity of the remanent polarization (oe or oh in
Figure 6.14) permits the storage of binary information in a computer memory; and
the high relative dielectric constant permits large capacities in small volumes.

6.15 PIEZOELECTRICS

In Example 6.4 the polarizability of NaCl was examined by assuming that a static
electric field was applied parallel to one axis of the cubical crystal. Relative displacements
of the negative and positive ionic lattices were then hypothesized and a new
ments of the negative and positive ionic lattices were then hypothesized and a new
balance of forces was established, from which an equilibrium displacement could be
deduced. If second-order effects had also been considered, it would have been found
26 See, e.g., A. J. Dekker, Solid State Physics, Sec. 8.6, Prentice-Hall, Inc., Englewood Cliffs, New
Jersey, 1957.

that the longitudinal shift of the two sets of ions caused transverse force components
which resulted in a slight lateral compression. This in turn permitted an elongation
of the crystal in a direction parallel to the field. If the electric field had been reversed,
the same mechanical deformation would have occurred. This phenomenon is called
electrostriction and is common to all dielectric solids.

FIGURE 6.19 Mechanical deformation of NaCl crystal.

If the crystal possesses a center of symmetry for the constituent charges, this is not
a reciprocal effect, in the sense that a mechanical deformation will not cause a net
polarization of the specimen. The reason for this is that an applied mechanical force
acts on elements of mass but does not distinguish between the two types of charge.
Taking NaCl as an example, Figure 6.19 shows the deformation of a local region of the
crystal when it is placed in compression. The arrows indicate the directions of displacement
of the six Cl⁻ ions relative to the central Na⁺ ion. By symmetry these displacements
are equal and opposite in pairs, so that the electrical center of the six Cl⁻ ions
still coincides with the position of the Na⁺ ion and there is no polarization.

However, not all crystals contain a center of symmetry. Consider for example the
hexagonal array of ions shown in Figure 6.20a. Starting from any ion, if a vector is
drawn to any other ion, the negative of this vector does not reach a similar ion, so this
crystal is not center-symmetric. In the undisturbed state the equivalent centers of
the three positive and three negative ions, shown connected by solid bond lines, coin-
cide at the point P and there is no polarization. However, if the crystal is placed in
compression, as shown in Figure 6.20b, the hexagon flattens. Since the transverse
expansion is smaller than the longitudinal contraction, the equivalent center of nega-
tive charge shifts to the right, the equivalent center of positive charge shifts to the left,

FIGURE 6.20 Mechanical deformation of a crystal lacking a center of symmetry:
(a) undisturbed; (b) in compression; (c) in tension.

and a net transverse polarization appears. Alternatively, if the crystal is placed in
tension, the hexagon elongates, as shown in Fig. 6.20c, and a transverse polarization
of the opposite sense occurs.
This phenomenon, in which a mechanical deformation causes an electric polarization,
is called piezoelectricity, after the Greek word piezein which means to press. The effect
was discovered in 1880 by Pierre and Jacques Curie and is reciprocal: an electric field which causes
the polarization of Figure 6.20b will also cause a contraction transverse to itself; an
electric field which results in the polarization of Figure 6.20c will similarly cause an
elongation transverse to itself.
Of the 32 different classes of crystals, 20 lack a center of symmetry and hence are
possible piezoelectrics. (Quartz and Rochelle salt are two common materials which
exhibit this effect.) Such crystals are important because they permit the conversion
of mechanical energy into electrical energy and vice versa. Practical applications
include the crystal pickup in a phonograph, the stable frequency source, and the trans-
ducer, in which an electrical voltage is used to drive a piezoelectric crystal at its me-
chanical resonant frequency, thus serving as a source of ultrasonic radiation.
It should also be noted that when a piezoelectric crystal is heated or cooled, a charge
separation appears across its faces. This effect is known as pyroelectricity and the charges

are evidence of an internal polarization, caused by the strain associated with thermal
expansion or contraction of the crystal.
Finally, mention should be made of a class of materials known as electrets. These
are organic materials which, when solidified in the presence of a steady electric field,
appear to have a net electric moment "frozen in." Despite the fact that these moments
may persist for long periods of time, it is believed that this is a metastable state.

6.16 TIME-HARMONIC FIELDS AND COMPLEX PERMITTIVITY

In the preceding sections dielectric materials were studied when under the influence
of static electric fields. Three polarization mechanisms were identified (electronic, ionic,
orientational) in which the induced dipole moments normally were linearly propor-
tional to the exciting field, and disappeared when the field was removed. A fourth
mechanism (spontaneous polarization) was found to occur in a limited class of mate-
rials, the ferroelectric crystals, and in such materials the relation between polarization
density and external field was generally nonlinear. Beginning with this section, these
earlier static results will be extended to the case in which the exciting field is time-
harmonic, but the analysis will be restricted to linear materials. Thus electronic, ionic,
and orientational polarization will be considered, but the reaction of ferroelectric
crystals to a time-harmonic excitation will not be treated.
If an externally applied time-harmonic electric field has persisted within a dielectric
for a sufficient length of time, the induced polarization density P must also be periodic
in time. However, the displacement of charges associated with this polarization usually
shows some inertia, so that P may not be in phase with the local field. It is also possible
that P may not be parallel to the local field. An example of this was noted in the case
of the hexagonal crystal structure of Figure 6.20. However, it will be assumed in all
that follows that the magnitude of $\mathbf{P}$ is linearly proportional to the magnitude of the
local electric field.
It is shown in Appendix L that if $\mathbf{P}(\xi,\eta,\zeta)e^{j\omega t}$ is the induced polarization, with $\mathbf{P}$ a
complex vector, then the macroscopic potential function due to all the dipole moments
within a dielectric specimen of volume $V$ is

$$\Phi(x,y,z,t) = \oint_S\frac{\{P_n\}\,dS}{4\pi\epsilon_0\ell} + \int_V\frac{-\nabla_S\cdot\{\mathbf{P}\}\,dV}{4\pi\epsilon_0\ell} \tag{6.76}$$

in which $S$ is the bounding surface of the specimen and $\ell$ is drawn from $dS$ or $dV$ to
$(x,y,z)$. $\{\mathbf{P}\}$ is the time-retarded value of $\mathbf{P}$. This result is seen to be similar to the
static formula (6.8), the only difference being that the polarization is now time-harmonic
and retardation must be incorporated. Equation (6.76) normally is valid for
points within the dielectric as well as without, and permits the interpretation that the
dielectric specimen is equivalent in its electrical action to a surface charge distribution
$P_n$ plus a volume charge distribution $-\nabla_S\cdot\mathbf{P}$.

Paralleling the development of Section 6.4, one can consider a distribution of primary
charges, of density $\rho(\xi,\eta,\zeta)e^{j\omega t}$, occupying a volume $V_1$, and an assortment of dielectric
materials which fill a volume $V_2$, with $\mathbf{P}(\xi,\eta,\zeta)e^{j\omega t}$ the density of induced dipole moments.
Both $\rho$ and $\mathbf{P}$ are in general complex functions and the resulting macroscopic electric
380 Dielectric Materials CHAPTER 6

field $\mathbf{E}$ is such that

$$\nabla\cdot(\epsilon_0\mathbf{E}) = \rho - \nabla\cdot\mathbf{P}$$

It is once again advantageous to introduce a generalized macroscopic electric flux
density function $\mathbf{D}$ by the relation

$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} \tag{6.77}$$

In (6.77) all three vectors are in general complex and $\mathbf{P}$ need not be parallel to $\mathbf{E}$, in
which case $\mathbf{D}$ is not parallel to either of them. Once again, $\mathbf{D}$ satisfies the divergence
relation

$$\nabla\cdot\mathbf{D} = \rho$$

and is thus a flux field whose discontinuities are associated with the positions of the
primary charges.
Many other results of the static theory may be extended to the time-harmonic case.
Equation (6.31), which connects the local and external fields via the polarization, is
valid if the external field constant $\gamma$ is taken to be a complex constant (or perhaps a
complex tensor). Equations (6.48) and (6.49) also apply if the polarizability $\alpha$ is treated
as complex, so that once again $\mathbf{P} = \chi_e\epsilon_0\mathbf{E}$ and (6.77) may be written

$$\mathbf{D} = \epsilon\mathbf{E}$$

in which

$$\epsilon = \epsilon_0(1 + \chi_e) = \epsilon' - j\epsilon'' \tag{6.78}$$

is the complex permittivity, $\epsilon'$ and $\epsilon''$ being its real and imaginary parts; $\chi_e$ is now the
complex dielectric susceptibility. As before, the relative dielectric constant $\epsilon_r$ can be
defined by $\epsilon_r = \epsilon/\epsilon_0$. It will be found in what is to follow that $\alpha$, $\chi_e$, $\epsilon$, and $\epsilon_r$ are all
complex functions of frequency, and, in general, each may be a tensor, since $\mathbf{P}$ is not
necessarily parallel to $\mathbf{E}$.

6.17 TIME-HARMONIC ELECTRONIC POLARIZABILITY


The complex polarizability $\alpha$ can be considered first for simple materials, such as monoatomic
gases and elemental solids, in which the only polarization mechanism is electronic.
Returning to the atomic model of Section 6.6, let it be assumed that a single
atom consists of a nucleus, of charge $+Ze$, and an electron cloud of total charge $-Ze$
distributed uniformly throughout a sphere of radius $r_a$. Since the nucleus is several
thousand times more massive than the electron cloud, the motion forced on the nucleus
by an external time-harmonic field can be ignored in comparison to the periodic oscillations
of the cloud itself. If attention is focused first on the case of monoatomic hydrogen,
the differential equation which describes the motion of the electron cloud can be
deduced by the following argument: Let the cloud be displaced an amount $\mathbf{r}$ relative
to the nucleus, after which all external forces are removed. In accordance with Equation
(6.32), the cloud will experience a restoring force given by

$$\mathbf{F}_r = -\frac{e^2}{4\pi\epsilon_0 r_a^3}\,\mathbf{r} = -a\mathbf{r}$$

in which $a$ is called the restoration coefficient. If there were no damping, the resulting
motion would be governed by the differential equation of an harmonic oscillator,

namely,

$$m\ddot{\mathbf{r}} = -a\mathbf{r} \tag{6.79}$$

with $m$ the mass of an electron. Solutions to this equation are periodic at a natural
angular frequency $\omega_0 = (a/m)^{1/2}$. If one takes $r_a$ as $0.5 \times 10^{-10}$ m (a good approximation
for the ground state) and uses $m = 0.91 \times 10^{-30}$ kg for the mass of an electron,
the estimate can be made that $10^{16} < \omega_0 < 10^{17}$ and therefore that the natural resonant
frequency of electronic oscillation for a single hydrogen atom lies in the ultraviolet
portion of the spectrum.
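The estimate of the natural frequency can be reproduced from the restoring-force expression. The sketch below uses $r_a = 0.5 \times 10^{-10}$ m and an electron mass of $0.91 \times 10^{-30}$ kg; the rounded values of $e$, $\epsilon_0$, and the speed of light are standard constants supplied here, and the variable names are illustrative.

```python
import math

E_CHARGE = 1.602e-19   # electronic charge, coul
EPS0 = 8.854e-12       # permittivity of free space, F/m
M_ELECTRON = 0.91e-30  # electron mass, kg
R_A = 0.5e-10          # radius of the electron cloud, m
C_LIGHT = 3.0e8        # speed of light, m/s (rounded)

# Restoration coefficient a = e^2 / (4 pi eps0 r_a^3).
restoration_coefficient = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * R_A ** 3)

# Natural angular frequency of the undamped oscillator.
omega_0 = math.sqrt(restoration_coefficient / M_ELECTRON)

# Free-space wavelength at that angular frequency.
wavelength = 2.0 * math.pi * C_LIGHT / omega_0

print(f"a       = {restoration_coefficient:.0f} N/m")
print(f"omega_0 = {omega_0:.2e} rad/s")
print(f"lambda  = {wavelength * 1e9:.0f} nm")
```

The wavelength comes out at a few tens of nanometers, well short of the visible band, which is consistent with placing the resonance in the ultraviolet.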
Equation (6.79) is inexact because it does not account for the damping caused by
emission of energy due to the time variation of velocity of the electron cloud. It is
shown in Appendix ~\I that this damping can properly be included by taking as the
differential equation of free motion
mi = -aT - 2br (6.80)

in which the damping constant b is given by

(6.81)

It therefore follows that if a forcing electric field Eloc(x,Y,z)eiwt is present, the equation
of motion of the cloud satisfies
nii == - aT - 2br - eEloceiwt (6.82)
in which E 10c may be treated as a complex constant in the small region occupied by the
atom, This differential equation has the particular solution T(t) == CRe Aeiwt, with the
complex amplitude A given by
__ (elm)E 1oc
A (6.83)
w
2
- w~ - jw(2blm)

If the forcing field has been present long enough, the damped complementary solution
to the homogeneous equation (6.80) may be ignored in comparison to the particular
solution (6.83). It then follows that the induced dipole moment (cf. Appendix L) is
given by
$$\mathbf{p}(t) = -e\,\mathbf{r}(t) = \mathrm{Re}\,(-e\mathbf{A}e^{j\omega t}) \tag{6.84}$$
so that the complex electronic polarizability is
$$\alpha_e = \frac{e^2/m}{\omega_0^2[1 - (\omega/\omega_0)^2] + j(2b\omega_0/m)(\omega/\omega_0)} \tag{6.85}$$

If $\alpha_e$ is separated into its real and imaginary parts according to the formula
$$\alpha_e = \alpha_e' - j\alpha_e''$$
and displayed as a function of frequency, the curves of Figure 6.21 result. The real part
is seen to be positive for $\omega/\omega_0 < 1$ and negative thereafter. For $\omega/\omega_0$ small it reduces to
the static polarizability given by (6.35), and it has two resonant peaks, symmetrically
disposed around $\omega/\omega_0 = 1$ and separated by $2b\omega_0/m$. The imaginary part has a single
resonance centered around $\omega/\omega_0 = 1$ and it will be seen in Section 6.20 that this gives
rise to absorption of energy by the atom from the exciting field. Outside the region
$$1 - \frac{2b\omega_0}{m} < \frac{\omega}{\omega_0} < 1 + \frac{2b\omega_0}{m}$$
the imaginary part of the electronic polarizability is seen to be negligible; the real part
has essentially the static value below this region and is negligible above it. Upon placing
numerical values in (6.81), one finds that $2b\omega_0/m \ll 1$ and therefore this resonance
is confined to a tight frequency region in the ultraviolet part of the spectrum.
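The qualitative features of the polarizability curve can be reproduced numerically. The sketch below evaluates the denominator in the form $\omega_0^2 - \omega^2 + j\omega(2b/m)$, which follows from (6.83). Because the expression (6.81) for b is not reproduced here, the damping value used is purely illustrative (an assumption, not a value from the text), as are the function and variable names.

```python
E_CHARGE = 1.602e-19    # magnitude of the electron charge, C (rounded)
M_ELECTRON = 0.91e-30   # electron mass, kg (rounded)


def alpha_e(w, w0, b, m=M_ELECTRON):
    """Complex electronic polarizability, (e^2/m) / (w0^2 - w^2 + j w 2b/m)."""
    return (E_CHARGE**2 / m) / complex(w0**2 - w**2, w * 2.0 * b / m)


w0 = 4.5e16                      # natural frequency from the hydrogen estimate, rad/sec
b = 1.0e-3 * M_ELECTRON * w0     # ASSUMED light damping; not taken from (6.81)

alpha_static = E_CHARGE**2 / (M_ELECTRON * w0**2)   # equals e^2/a, the static value
well_below = alpha_e(0.01 * w0, w0, b)
at_resonance = alpha_e(w0, w0, b)
above = alpha_e(1.5 * w0, w0, b)

print("static value      :", alpha_static)
print("well below w0     :", well_below)
print("at w0             :", at_resonance)
print("above w0          :", above)
```

Far below resonance the real part approaches the static value and the imaginary part is negligible; at $\omega = \omega_0$ the real part vanishes and the polarizability is purely of the form $-j\alpha_e''$; above resonance the real part is negative.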

FIGURE 6.21 Complex electronic polarizability of a single hydrogen atom.

When the higher energy eigenstates of a hydrogen atom are considered in this fashion,
the classical equivalent is to choose different values for $r_a$, which leads to a sequence of
harmonic oscillators with restoration coefficients $a_i$, damping coefficients $b_i$, and resonant
frequencies $\omega_i$. The strengths of these harmonic oscillators are weighted, and the
result is a plot of complex electronic polarizability versus frequency which displays a
series of resonances of varied heights, spaced throughout the ultraviolet portion of the
spectrum: each of these resonances has the shape of Figure 6.21. At visible frequencies
and below, the electronic polarizability of a hydrogen atom is, for all practical purposes,
a pure real number given by the static formula (6.35).
If an atom of higher atomic number Z is considered, the spectral series of absorption
lines will be altered because of the more complex structure of the electron cloud.
Individual elements will exhibit electronic polarizabilities with resonances at characteristic
frequencies, these frequencies predominantly occurring in the ultraviolet or
beyond. At frequencies below the ultraviolet, the total electronic polarizability will be
essentially the static real value.

6.18 COMPLEX IONIC POLARIZABILITY; TIME-HARMONIC PERMITTIVITY OF NON-POLAR MATERIALS

If two ions are set into harmonic motion by a time-harmonic field, the analysis of the
effect parallels what has been presented for electronic oscillations. In the case of ions
the restoring force is also (to first-order) proportional to the displacement from equilibrium,
with the restoration coefficient approximately given by
$$a = \frac{(ze)^2(n + 1)}{4\pi\epsilon_0 d_i^3}$$
in which z is the ion valence, n the repulsive exponent, and $d_i$ the spacing between ions.
If m is the mass of an ion, the natural resonant frequency is once again given by
$\omega_0 = (a/m)^{1/2}$. Using typical values, one finds that $10^{14} < \omega_0 < 10^{15}$ rad/sec so that natural
ionic oscillations are in the infrared part of the spectrum. As with the electronic case,
there is a small damping factor so that $\alpha_i$ is complex in the neighborhood of $\omega_0$, having
the static value below resonance and being essentially zero above resonance. For complex
molecules, a series of resonances in the infrared will be noted.
In materials which exhibit only electronic and ionic polarization, the complex relative
dielectric constant is given by

€r == 1 + 1 + (1 - ~)Na/€o
X e == - - - - - - -
1 - ~Na/€o

in which $\alpha = (\alpha_e' - j\alpha_e'') + (\alpha_i' - j\alpha_i'')$ is the sum of the complex electronic and ionic
polarizabilities per molecule, this equation arising from (6.51). For millimeter wavelengths
and longer, $\epsilon_r$ for such materials is frequency independent, temperature independent
(if the molecule density N is constant), and equal to the static value discussed
in earlier sections of this chapter. These materials normally will show a complex
permittivity only in the infrared and ultraviolet regions of the spectrum and then only
in a series of isolated narrow bands of frequency. In the visible spectrum the ionic contribution
will have dropped out, and the dielectric constant will equal what the static
value would be if there were only electronic polarization. Beyond the ultraviolet region
the permittivity is that of free space.

6.19 DIPOLAR RELAXATION


In the previous two sections the frequency dependence of electronic and ionic polariza-
tion has been discussed. It was found that dielectric materials which have only these
two kinds of polarizability show a permittivity which is frequency independent (and
equal to the static real value) at all wavelengths longer than infrared. However, many
materials do not fit this category. Due to the frequency dependence of their orientational
polarization (or other causes yielding similar effects), some materials exhibit complex
permittivities in the microwave range of frequencies and below, and thus give rise to
dielectric losses at wavelengths of practical interest to those concerned with electro-
magnetics. These materials include many liquids and glassy substances and even some
crystalline materials.
For such materials if a constant electric field E is suddenly applied, the polarization
shows considerable inertia and will not reach its static value immediately but instead
will approach it gradually, as suggested by Figure 6.22a. If after the ultimate value of
polarization P2 has been achieved the field is suddenly removed, the polarization will
not vanish immediately but will decay as shown in Fig. 6.22b.
This behavior can be explained by arguing that the almost instantaneous change in
polarization (either $P_1$ or $P_2 - P_3$) is due to the electronic and ionic contributions and
that the slow buildup or decay ($P_2 - P_1$ or $P_3$) is due to an orientational or equivalent
effect, with which there is associated relatively much more inertia. It has been noted
that the natural angular frequencies of electronic and ionic oscillations exceed $10^{14}$ rad/sec,
so the rise to $P_1$ or the fall from $P_2$ to $P_3$ can be expected to occur in a time interval
of the order of $10^{-14}$ sec. The orientational relaxation, on the other hand, may vary
from $10^{-12}$ sec to many hours, depending on the structure of the material and its
temperature.

FIGURE 6.22 Build-up and decay of polarization upon sudden application or removal of steady field. Panel (a): build-up of P; panel (b): decay of P.

According to this analysis, if $\epsilon_{ei}$ is the (real) permittivity of the material at a frequency
$\omega_1$ just below the first infrared resonance, and if $\epsilon_s$ is the static (real) permittivity, then
$$P_1 = P_2 - P_3 = (\epsilon_{ei} - \epsilon_0)E \tag{6.86}$$
$$P_2 = (\epsilon_s - \epsilon_0)E \tag{6.87}$$
If the shape of the build-up or decay curve in Figure 6.22 is known, it is possible with the
aid of the boundary conditions (6.86) and (6.87) to determine the complex permittivity
$$\epsilon(\omega) = \epsilon'(\omega) - j\epsilon''(\omega)$$
in the range of angular frequencies below $\omega_1$. To see how this is accomplished,²⁷ imagine
that during the time interval between u and u + du a rectangular pulse of strength
E(u) has been applied to the dielectric, being zero outside this interval. By virtue of
the equation
$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P} \tag{6.88}$$
it follows that a displacement D will arise as a result of the application of this pulse,
which will be present at time t > u + du, but which will gradually subside. D is therefore
a function of t - u such that one may write
$$\mathbf{D}(t - u) = \mathbf{E}(u)f(t - u)\,du \qquad t > u + du \tag{6.89}$$
where $f(t - u)$ is the decay function which describes the gradual subsidence of D. In
general, $f(t - u)$ may be a tensor function.
As suggested by Figure 6.22, D contains a part which can follow the field almost
instantaneously and which is expressible by $\epsilon_{ei}\mathbf{E}$. Thus
$$\mathbf{D}(t - u) = \epsilon_{ei}\mathbf{E}(u) + \mathbf{E}(u)f(0)\,du \qquad u < t < u + du \tag{6.90}$$
27 This analysis draws upon a development given by Fröhlich, Theory of Dielectrics, 2d ed., pp. 4-8, 70-73, Oxford Press, London, 1958.

in which it is assumed that f is essentially constant during the short time interval du.
Imagine that at a later time interval between u' and u' + du' an additional rectangular
pulse E(u') is applied to the dielectric. Under the assumption of a linear theory, the
principle of superposition will apply and the resulting displacement D(t - u') will add
linearly to the earlier displacement D(t - u). If this process is extended to a continuous
time-dependent field E(t) which is initiated at the time t = 0, then the total displacement
D(t) at any time t > 0 is given by
$$\mathbf{D}(t) = \epsilon_{ei}\mathbf{E}(t) + \int_0^t \mathbf{E}(u)f(t - u)\,du \tag{6.91}$$
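If this development is applied to time-harmonic fields so that $\mathbf{E} = \mathrm{Re}\,\mathbf{E}(x,y,z)e^{j\omega t}$, then
$$\mathbf{D}(x,y,z,t) - \epsilon_{ei}\mathbf{E}(x,y,z)e^{j\omega t} = \mathbf{E}(x,y,z)\int_0^t f(t - u)e^{j\omega u}\,du$$
$$= \mathbf{E}(x,y,z)\int_0^t f(v)e^{j\omega(t - v)}\,dv$$
$$= \mathbf{E}(x,y,z)e^{j\omega t}\int_0^t f(v)e^{-j\omega v}\,dv \tag{6.92}$$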
in which v = t - u is a substitution variable. If it is assumed that the time-harmonic
field has existed long enough so that all transients have died out, and D is also periodic,
this implies that t is large enough so that f(t) is essentially zero. But this means that the
upper limit of integration in (6.92) can be extended to infinity with negligible error.
When this is done, (6.92) may be written in the form
$$\mathbf{D} = \mathbf{E}\left[\epsilon_{ei} + \int_0^\infty f(v)e^{-j\omega v}\,dv\right] \tag{6.93}$$
Upon defining the complex permittivity by the relation $\mathbf{D} = \epsilon\mathbf{E}$, one finds that
$$\epsilon(\omega) = \epsilon'(\omega) - j\epsilon''(\omega) = \epsilon_{ei} + \int_0^\infty f(v)e^{-j\omega v}\,dv \tag{6.94}$$
Thus if the decay function f(v) is known, the frequency dependent portion of the complex
permittivity may be deduced.
For many materials the simple exponential function
$$f(v) = Ae^{-v/\tau} \tag{6.95}$$
is found to be appropriate. In this equation $\tau$ is independent of time but may be
influenced by temperature; it is called the relaxation time and its reciprocal is referred to
as the relaxation frequency. A is a constant which will now be determined.
If the special case of equilibrium in a static field is considered then $\omega = 0$, $\mathbf{D} = \epsilon_s\mathbf{E}$,
and (6.94) gives
$$\epsilon_s = \epsilon_{ei} + \int_0^\infty Ae^{-v/\tau}\,dv = \epsilon_{ei} + A\tau$$
so that $A = (\epsilon_s - \epsilon_{ei})/\tau$. Thus

$$f(v) = \frac{\epsilon_s - \epsilon_{ei}}{\tau}\,e^{-v/\tau} \tag{6.96}$$
When this decay function is inserted in (6.94), one obtains
$$\epsilon(\omega) = \epsilon_{ei} + \frac{\epsilon_s - \epsilon_{ei}}{1 + j\omega\tau} \tag{6.97}$$
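The step from the integral (6.94) to the closed form (6.97) can be confirmed numerically. The sketch below is an illustration under stated assumptions: the integral is truncated at a finite upper limit and evaluated by the trapezoidal rule, permittivities are expressed in arbitrary relative units, and all names and numerical values are hypothetical.

```python
import cmath
import math


def permittivity_from_decay(decay, omega, eps_ei, v_max, steps=20000):
    """Trapezoidal estimate of eps_ei + integral_0^v_max decay(v) exp(-j omega v) dv."""
    h = v_max / steps
    total = 0.5 * (decay(0.0) + decay(v_max) * cmath.exp(-1j * omega * v_max))
    for k in range(1, steps):
        v = k * h
        total += decay(v) * cmath.exp(-1j * omega * v)
    return eps_ei + total * h


def exponential_decay(eps_s, eps_ei, tau):
    """Decay function (6.96): ((eps_s - eps_ei)/tau) exp(-v/tau)."""
    return lambda v: (eps_s - eps_ei) / tau * math.exp(-v / tau)


def debye_closed_form(omega, eps_s, eps_ei, tau):
    """Closed form (6.97)."""
    return eps_ei + (eps_s - eps_ei) / (1.0 + 1j * omega * tau)


# Illustrative values only: eps_s = 5 eps_ei / 4, tau = 1, omega tau = 2.
EPS_S, EPS_EI, TAU, OMEGA = 5.0, 4.0, 1.0, 2.0
numeric = permittivity_from_decay(
    exponential_decay(EPS_S, EPS_EI, TAU), OMEGA, EPS_EI, v_max=40.0 * TAU
)
closed = debye_closed_form(OMEGA, EPS_S, EPS_EI, TAU)
print("numerical integral:", numeric)
print("closed form       :", closed)
```

The same routine accepts any other decay function, which is one way to explore materials that do not follow a single relaxation time.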

If this result is separated into its real and imaginary parts, it is found that
$$\epsilon'(\omega) = \epsilon_{ei} + \frac{\epsilon_s - \epsilon_{ei}}{1 + \omega^2\tau^2} \tag{6.98}$$
$$\epsilon''(\omega) = (\epsilon_s - \epsilon_{ei})\,\frac{\omega\tau}{1 + \omega^2\tau^2} \tag{6.99}$$
These equations are customarily referred to as the Debye equations, and they are
illustrated in Figure 6.23 as functions of $\omega\tau$ for the case $\epsilon_s = 5\epsilon_{ei}/4$. The real part $\epsilon'$ is
seen to change smoothly from the value $\epsilon_s$ to the value $\epsilon_{ei}$ as $\omega$ is increased. The imaginary
part $\epsilon''$ has a broad maximum at $\omega\tau = 1$. Irrespective of the values of $\epsilon_s$ and $\epsilon_{ei}$, this
resonance is such that a decade change in $\omega\tau$ either side of unity causes a fivefold
decrease in $\epsilon''$.
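The statements about the Debye curves are easy to verify by direct evaluation. The following minimal sketch evaluates (6.98) and (6.99) in relative units with $\epsilon_s = 5\epsilon_{ei}/4$; the function name and the sample values are assumptions made for illustration.

```python
def debye_parts(w_tau, eps_s, eps_ei):
    """Return (eps', eps'') from the Debye equations for a given value of omega*tau."""
    spread = eps_s - eps_ei
    eps_real = eps_ei + spread / (1.0 + w_tau**2)
    eps_imag = spread * w_tau / (1.0 + w_tau**2)
    return eps_real, eps_imag


EPS_EI = 4.0            # illustrative value, arbitrary units
EPS_S = 5.0             # eps_s = 5 eps_ei / 4, as in Figure 6.23

real_at_peak, imag_at_peak = debye_parts(1.0, EPS_S, EPS_EI)
_, imag_decade_above = debye_parts(10.0, EPS_S, EPS_EI)
_, imag_decade_below = debye_parts(0.1, EPS_S, EPS_EI)

print("eps' at omega tau = 1 :", real_at_peak)
print("eps'' at omega tau = 1:", imag_at_peak)
print("ratio, decade above   :", imag_at_peak / imag_decade_above)
print("ratio, decade below   :", imag_at_peak / imag_decade_below)
```

Both ratios equal 101/20 = 5.05 regardless of the permittivity values chosen, which is the fivefold decrease described above.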

FIGURE 6.23 Debye curves for a dielectric material with a single relaxation time ($\epsilon_s = 5\epsilon_{ei}/4$). Curves of $\epsilon'$ and $\epsilon''$ are plotted against $\omega\tau$ from 0 to 5.

From these observations it may be concluded that maximum absorption of energy
from the field by such dielectric materials will occur at the relaxation frequency
$\omega = \tau^{-1}$. Sufficiently below this frequency, the absorption is negligible and the permittivity
has the real value $\epsilon_s$; sufficiently above the relaxation frequency, the absorption is
once again negligible and the permittivity has the smaller real value $\epsilon_{ei}$. At the relaxation
frequency, $\epsilon = (\epsilon_s + \epsilon_{ei})/2 - j(\epsilon_s - \epsilon_{ei})/2$. For most materials to which this analysis
applies, $(\epsilon_s - \epsilon_{ei})/(\epsilon_s + \epsilon_{ei}) \ll 1$ so that this dielectric constant is only slightly complex.
Several physical models fit the assumption that the decay function f(v) is representable
by a simple exponential term containing a single relaxation time $\tau$. Prominent among
these models is the case of a volume density N of permanent dipoles of strength $\mu$ in a
liquid. In the presence of an externally applied periodic field, these dipoles attempt to
rotate back and forth in synchronism with the field, but are impeded in this effort by
their inertia and the thermal effects of their neighbors. The result is that there is a net
orientational polarization density $Np_o$ which is periodic but lags the field, being dependent
both on frequency and temperature. The analysis is similar to the one already
presented for static fields in Section 6.11 and gives results which agree best with experiment
for dilute solutions of dipolar molecules in non-polar liquids.²⁸
Another physical model which fits the theory is concerned with dipolar solids.
Imagine such a solid to consist of permanently polarized molecules, each of which, due
to the internal crystalline field, possesses a number of equilibrium orientations which
are separated by potential barriers. In the simplest case only two equilibrium positions
(1) and (2) exist, with opposite dipole directions. At very low temperatures these
dipoles fall into an ordered arrangement due to interactions with each other. But as
the temperature is raised, the extent of this ordering decreases, changing from long-distance
to short-distance, and finally disappearing altogether at sufficiently high
temperatures.
If one assumes that the temperature is sufficiently elevated, an individual dipole is
just as likely to be found in position (1) as in position (2), under the proviso that the
potential barrier is symmetrical. However, if an external static field E is introduced,
the energies which a dipole has in the two allowed states will now differ by $2\boldsymbol{\mu}\cdot\mathbf{E}$, in
which $\boldsymbol{\mu}$ is the dipole moment in state (1), $-\boldsymbol{\mu}$ being the dipole moment in state (2).
(Cf. Equation 6.40.) For this reason the equilibrium distribution will be altered and
there will now be more molecules in one state than in the other.
28 An analysis may be found in H. Fröhlich, ibid., pp. 83-90. Section 11 of Fröhlich's text also contains an excellent discussion of other physical models which fit the relaxation theory.

Let $N_1$, $N_2$ be the numbers of molecules in the two states at a given time and let
$w_{12}\,dt$ be the probability that a molecule in state (1) makes a transition to state (2)
during the time interval dt; similarly, let the probability for the reverse process be
$w_{21}\,dt$. Then
$$\dot{N}_1 = -N_1 w_{12} + N_2 w_{21} \qquad \dot{N}_2 = -N_2 w_{21} + N_1 w_{12} \tag{6.100}$$
At equilibrium $\dot{N}_1 = \dot{N}_2 = 0$ which gives $N_1/N_2 = w_{21}/w_{12}$. However, in equilibrium
$N_1$ and $N_2$ must satisfy the Boltzmann distribution and thus
$$\frac{N_1}{N_2} = e^{2\boldsymbol{\mu}\cdot\mathbf{E}/kT}$$
If $\boldsymbol{\mu}\cdot\mathbf{E} \ll kT$, the perturbation from two equally populated states is small and
$$w_{12} = K\left(1 - \frac{\boldsymbol{\mu}\cdot\mathbf{E}}{kT}\right) \qquad w_{21} = K\left(1 + \frac{\boldsymbol{\mu}\cdot\mathbf{E}}{kT}\right) \tag{6.101}$$

with K a constant whose value can be determined in the following manner: In the
absence of the field E let $\tau$ be the average time between when a dipole arrives in state
(1) and when it next arrives in state (2). Then in unit time, $w_{12}^0 N_1$ molecules leave state
(1), the same number arriving back, in which $w_{12}^0 = w_{21}^0$ are the transition probabilities
for equally populated states. The average time per round trip per molecule is therefore
$2\tau = N_1/w_{12}^0 N_1 = 1/w_{12}^0 = 1/w_{21}^0$. With $\boldsymbol{\mu}\cdot\mathbf{E}/kT$ small, $w_{12}$ and $w_{21}$ differ but little
from their field-free values $w_{12}^0$ and $w_{21}^0$. Therefore $K = 1/2\tau$.
Consider next the application of a time-harmonic field $\mathbf{E}e^{j\omega t}$. Equations (6.100) are
once again applicable, but $N_1$ and $N_2$ are now functions of time. Using for (6.101)
$$w_{12} = \frac{1}{2\tau}\left[1 - \left(\frac{\boldsymbol{\mu}\cdot\mathbf{E}}{kT}\right)e^{j\omega t}\right]$$
one may write for (6.100)
$$2\tau\dot{N}_1 = j2\omega\tau N_1 = -(N_1 - N_2) + (N_1 + N_2)(\boldsymbol{\mu}\cdot\mathbf{E}/kT)e^{j\omega t}$$
$$2\tau\dot{N}_2 = j2\omega\tau N_2 = (N_1 - N_2) - (N_1 + N_2)(\boldsymbol{\mu}\cdot\mathbf{E}/kT)e^{j\omega t}$$
These equations have the solution
$$N_1 - N_2 = \frac{N_1 + N_2}{1 + j\omega\tau}\left(\frac{\boldsymbol{\mu}\cdot\mathbf{E}}{kT}\right)e^{j\omega t} \tag{6.102}$$
If $N_1 + N_2 = N$ is the number of molecules per unit volume, then the net polarization
density is
$$\mathbf{P} = (N_1 - N_2)\boldsymbol{\mu} = \frac{(N\boldsymbol{\mu})(\boldsymbol{\mu}\cdot\mathbf{E}/kT)}{1 + j\omega\tau}\,e^{j\omega t} \tag{6.103}$$
and the orientational effect of these permanent dipoles in the solid makes a contribution
to the permittivity which is in the form of the second term of (6.97). Therefore this
model conforms with the assumption that the decay function is exponential with a
single relaxation time.
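The two-state result can be checked by integrating the rate equations directly. Subtracting the two equations for $N_1$ and $N_2$ gives $\tau\,d(N_1 - N_2)/dt = -(N_1 - N_2) + N(\mu\cdot E/kT)\cos\omega t$ for a real drive. The sketch below integrates this with a fourth-order Runge-Kutta step and compares the late-time value with the real part of (6.102). The numerical values, the scalar treatment of $\mu\cdot E/kT$, and all names are assumptions made for illustration.

```python
import cmath
import math


def simulate_population_difference(n_total, x, omega, tau, t_end, steps):
    """Integrate tau * dDelta/dt = -Delta + n_total * x * cos(omega t), Delta(0) = 0."""
    def rhs(t, delta):
        return (-delta + n_total * x * math.cos(omega * t)) / tau

    h = t_end / steps
    t, delta = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(t, delta)
        k2 = rhs(t + h / 2, delta + h * k1 / 2)
        k3 = rhs(t + h / 2, delta + h * k2 / 2)
        k4 = rhs(t + h, delta + h * k3)
        delta += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return delta


def steady_state_difference(n_total, x, omega, tau, t):
    """Real part of the steady-state solution N x exp(j omega t) / (1 + j omega tau)."""
    return (n_total * x * cmath.exp(1j * omega * t) / (1 + 1j * omega * tau)).real


# Illustrative values: x = mu.E/kT << 1, omega tau = 2, run for 30 relaxation times.
N_TOTAL, X, OMEGA, TAU, T_END = 1.0, 0.01, 2.0, 1.0, 30.0
simulated = simulate_population_difference(N_TOTAL, X, OMEGA, TAU, T_END, steps=30000)
predicted = steady_state_difference(N_TOTAL, X, OMEGA, TAU, T_END)
print("simulated N1 - N2:", simulated)
print("steady state     :", predicted)
```

After many relaxation times the transient has decayed and the simulated difference agrees with the steady-state expression, including the phase lag produced by the factor $1/(1 + j\omega\tau)$.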
The result (6.103) is based on the premise that in the absence of an external field the
two states are equally probable for any molecule. This premise loses its validity as
the temperature is lowered and short-distance order sets in. Interestingly, as the temperature
is lowered still further so that the order is long-distance, the analysis once
again becomes valid. The reason for this is that when the lattice is completely ordered,
it contains two types of sites in which the dipoles are oppositely directed. Each molecule
has a second equilibrium position in which its energy is slightly higher than when in the
ordered position. But this differential in energy grows smaller as the temperature is
lowered and finally becomes negligible, so that the premise leading to (6.103) is once
again justified.
If there are several potential barriers separating more than two directions of a permanent
dipole, then there should be a variety of $\tau$ values associated with the different
switching times. Similarly, if ions can occupy more than one position, these positions
being separated by potential barriers, a relaxation phenomenon occurs, leading to a
result in the form of (6.103). This latter effect appears to be the case in glassy substances
whose ions can be displaced over one or more interatomic distances.

FIGURE 6.24 Relative permittivity of ice vs. temperature and frequency, in cps. The plot shows $\epsilon_r$ (0 to 90) against T (°C) from -70 to 0. [After Smyth and Hitchcock, J Am Chem Soc, 54, 4631, 1932.]

The temperature dependence of permittivity for materials which fit assumptions such
as have been used in the foregoing analysis is more complicated than appears explicitly
in (6.103). The particle density N is somewhat temperature-dependent in solids, but of
even more significance is the dependence on temperature of the relaxation time $\tau$. For
example, in ice $\tau$ increases sevenfold as the temperature is lowered from -5°C to
-22°C. The manner in which permittivity varies with temperature and frequency in
ice is shown in Figure 6.24. It is observed that at a given temperature as the frequency
is raised, the permittivity decreases until finally the orientational contribution vanishes;
at lower temperatures the frequency which must be reached is not so high because $\tau$ is
greater. It is reasonable that a permanent dipole should be able to switch its direction
in less time at higher temperatures. In water at room temperature the relaxation frequency
is as high as $3 \times 10^{10}$ cps.
The Debye equations (6.98) and (6.99) are of such a form that knowledge for all
angular frequencies $\omega$ of either $\epsilon'$ or $\epsilon''$ permits complete determination of the other. This
conclusion is reached for any f(v) and not just for (6.95), which led to the Debye equations.
The general dependence of $\epsilon'$ and $\epsilon''$ on each other may be expressed by a pair of
integrals, known as the Kronig-Kramers relations.²⁹

6.20 DIELECTRIC LOSSES

If a static electric field is established in a region containing linear dielectric materials,
the stored energy may be deduced by the technique used in Section 3.19. All the primary
charges may be assembled in neutralizing pairs, with proper distribution over what is to
become an equipotential surface $\Phi_0$. The primary charges can then be moved along
what are to become D lines to their ultimate positions. Elements of work can be computed
by using the component of E parallel to the D lines. The result is that the stored
energy is given by
$$W_E = \frac{1}{2}\int_V \mathbf{E}\cdot\mathbf{D}\,dV \tag{6.104}$$
in which V is the volume of all of space. This differs from the earlier free-space formula
(3.151) only in that D has replaced $\mathbf{D}_0$. Each of these flux densities is associated with
surface densities of primary charge, which are the agencies whereby the energy is
introduced to the system, even though some of it ends up being stored in the induced
dipoles within the dielectrics. In (3.151) E and $\mathbf{D}_0$ were parallel and the dot product
notation was somewhat superfluous. However, in (6.104), for points within a dielectric,
E and D are not necessarily parallel, and it is only the component of E along the D
lines which is effective in the transfer of energy. For isotropic materials, in which E and
D are parallel, (6.104) reduces to
$$W_E = \frac{1}{2}\int_V \epsilon_s E^2\,dV \tag{6.105}$$

wherein $\epsilon_s$ is the static nontensor dielectric constant. For this reason it is often stated
that the volume density of electrostatic stored energy is $\epsilon_s E^2/2$. This statement is misleading
unless $\epsilon_s$ is independent of temperature. It can be shown³⁰ that (6.105) generally
represents the change in free energy of the system.†

† The free energy is an exact function of the state variables whose changes represent the maximum work which can be done in an isothermal process.
29 See, e.g., C. Kittel, Quantum Theory of Solids, pp. 405-406, John Wiley and Sons, Inc., New York, 1963.
30 E.g., H. Fröhlich, op. cit., pp. 9-12.

If, after the static fields E, D of (6.104) have been established, the primary charge
distribution is changed infinitesimally by starting with additional infinitesimal charge
pairs on $\Phi_0$ and moving them to their destinations along the D lines, then the additional
energy put into the system is
$$dW = \int_V \mathbf{E}\cdot d\mathbf{D}\,dV \tag{6.106}$$
in which $dD = d\sigma$ is the charge density in transport during the process of changing the
primary charge distribution. If this process takes a time dt, then
$$dW = \int_V \mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}\,dt\,dV \tag{6.107}$$

Imagine that, in a succession of infinitesimal time increments such as this, the
primary charge system is cyclically varied so that E and D, as they appear in (6.107),
are time-harmonic. Then
$$\mathbf{E} = \mathbf{E}(x,y,z)\cos[\omega t + \beta(x,y,z)]$$
$$\mathbf{D} = \epsilon'\mathbf{E}(x,y,z)\cos[\omega t + \beta(x,y,z)] + \epsilon''\mathbf{E}(x,y,z)\sin[\omega t + \beta(x,y,z)]$$
in which $\mathbf{E}(x,y,z)$ is a real vector function, $\beta(x,y,z)$ is a real phase angle, and $\epsilon'$, $\epsilon''$ are
the real and imaginary parts of the complex permittivity. (They may be tensors.)
Over one complete cycle the energy supplied to the system is
$$\oint dW = \int_V \mathbf{E}(x,y,z)\cdot\epsilon''\mathbf{E}(x,y,z)\,dV \int_0^{2\pi}\cos^2[\omega t + \beta(x,y,z)]\,d(\omega t) = \pi\int_V \mathbf{E}(x,y,z)\cdot\epsilon''\mathbf{E}(x,y,z)\,dV \tag{6.108}$$
According to (6.104), the energy stored in the system at the end of a cycle is the same
as at the beginning, since E and D have reverted to their initial values. Thus (6.108)
gives the energy lost per cycle within the dielectric materials in the form of heat.
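The cycle integral leading to (6.108) can be confirmed for a scalar permittivity at a single point. The sketch below integrates $E\,dD$ over one period of the phase $\theta = \omega t + \beta$ by the midpoint rule and compares the result with $\pi\epsilon''E^2$; the numerical values and the function name are illustrative assumptions.

```python
import math


def energy_per_cycle(eps_real, eps_imag, e_amp, steps=10000):
    """Midpoint-rule estimate of the cycle integral of E dD per unit volume.

    E = e_amp cos(theta) and D = eps_real e_amp cos(theta) + eps_imag e_amp sin(theta),
    with theta = omega t + beta running over one full period.
    """
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        e_field = e_amp * math.cos(theta)
        d_d_dtheta = e_amp * (-eps_real * math.sin(theta) + eps_imag * math.cos(theta))
        total += e_field * d_d_dtheta * h
    return total


# Illustrative values in arbitrary consistent units.
EPS_REAL, EPS_IMAG, E_AMP = 2.0, 0.3, 5.0
numeric_loss = energy_per_cycle(EPS_REAL, EPS_IMAG, E_AMP)
expected_loss = math.pi * EPS_IMAG * E_AMP**2
print("numerical cycle integral:", numeric_loss)
print("pi * eps'' * E^2        :", expected_loss)
```

The term proportional to $\epsilon'$ integrates to zero over a cycle, so only $\epsilon''$ contributes to the loss.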
The heat produced per second per unit volume in the dielectric may therefore be
expressed by
$$L = \frac{\omega}{2}\,\mathbf{E}\cdot\epsilon''\mathbf{E} \tag{6.109}$$
If the dielectric is isotropic, so that $\mathbf{D} \parallel \mathbf{E}$ and $\epsilon'$, $\epsilon''$ are real scalars, then
$$L = \frac{\omega\epsilon''}{2}E^2 = \frac{\omega\epsilon'}{2}E^2\tan\delta \tag{6.110}$$
in which
$$\tan\delta = \frac{\epsilon''}{\epsilon'} \tag{6.111}$$
with $\delta$ the phase angle of the complex permittivity. Because of the form of (6.110), $\delta$
is also called the loss angle and $\tan\delta$ the loss tangent. Formula (6.110) is a convenient
form in which to express the loss because for many materials $\epsilon'$ is essentially constant
over wide frequency spectra, and $\tan\delta$ then becomes the loss index of the
material.
Dielectric loss at frequencies below the infrared is usually due to dipolar relaxation
in one of the forms discussed in Section 6.19. (In some dielectric materials there may
also be a small amount of loss due to free electrons which cause a slight conductivity.
This subject will be considered more fully in Chapter 8.) An extensive compilation of
dielectric data in the frequency range from 100 cps to $2.5 \times 10^{10}$ cps has been made by
von Hippel³¹ and selections from his data are reproduced in Table 6.7. One observes
that $\tan\delta$ is characteristically small and that $\epsilon'$ is reasonably constant, in keeping with
the interpretation of the Debye equations as displayed in Figure 6.23. One or more
resonances may be noted in $\tan\delta$, but they are all very broad, which is also consistent
with the theory.

TABLE 6.7
THE REAL PART OF THE RELATIVE DIELECTRIC CONSTANT AND THE LOSS
TANGENT VERSUS FREQUENCY FOR VARIOUS DIELECTRIC MATERIALS
(Values for tan δ are multiplied by 10^4.)

                                          Frequency in cps
Material              T (°C)  Quantity    10^2    10^4    10^6    10^8    10^10

Zirconium porcelain   25      ε'/ε0       6.44    6.35    6.32    6.30    6.18
                              tan δ       59      31      23      25      57
Borosilicate glass    25      ε'/ε0       4.05    4.05    4.05    4.05    4.05
                              tan δ       13.6    4.4     5.8     8.0     15.5
Mycalex 400           25      ε'/ε0       7.47    7.42    7.39    ....    7.12
                              tan δ       29      16      13      ....    33
Bakelite BM-120       25      ε'/ε0       4.87    4.62    4.36    3.95    3.68
                              tan δ       300     200     280     380     410
Laminated fiberglas   24      ε'/ε0       14.2    7.2     5.3     4.8     4.37
                              tan δ       2500    1600    460     260     360
Nylon FM 10001        25      ε'/ε0       3.84    3.64    3.36    3.17    3.02
                              tan δ       115     262     232     175     107
Lucite                23      ε'/ε0       3.20    2.75    2.63    2.58    2.57
                              tan δ       620     315     145     67      49
Plexiglas             27      ε'/ε0       3.40    2.95    2.76    ....    2.59
                              tan δ       605     300     140     ....    67
Polystyrene           25      ε'/ε0       2.56    2.56    2.56    2.55    2.54
                              tan δ       <0.5    <0.5    0.7     <1      4.3
Teflon                22      ε'/ε0       2.1     2.1     2.1     2.1     2.08
                              tan δ       <5      <3      <2      <2      3.7
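As an illustration of how the tabulated values feed into the loss formula (6.110), the sketch below computes the heat produced per second per unit volume for two entries of Table 6.7 at 10^10 cps. The field amplitude is a hypothetical value chosen only for the example, the permittivity of free space is rounded, and the function name is not from the text.

```python
import math

EPS0 = 8.854e-12   # permittivity of free space, F/m (rounded)


def heat_density(freq_cps, eps_rel, tan_delta, e_amp):
    """Loss per second per unit volume, (omega eps' / 2) E^2 tan(delta), in W/m^3."""
    omega = 2.0 * math.pi * freq_cps
    return 0.5 * omega * eps_rel * EPS0 * e_amp**2 * tan_delta


E_AMP = 1.0e4      # ASSUMED field amplitude, V/m (hypothetical, for illustration)
FREQ = 1.0e10      # cps

# Table 6.7 entries at 10^10 cps; the tabulated tan(delta) values are scaled by 10^4.
teflon = heat_density(FREQ, eps_rel=2.08, tan_delta=3.7e-4, e_amp=E_AMP)
bakelite = heat_density(FREQ, eps_rel=3.68, tan_delta=410e-4, e_amp=E_AMP)

print(f"Teflon          : {teflon:.3e} W/m^3")
print(f"Bakelite BM-120 : {bakelite:.3e} W/m^3")
print(f"ratio           : {bakelite / teflon:.1f}")
```

For equal field amplitude and frequency the ratio of the two losses depends only on the product of $\epsilon'/\epsilon_0$ and $\tan\delta$, which is why low-loss materials such as Teflon are preferred at microwave frequencies.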
The dielectric losses occasioned by ionic vibrations are customarily referred to as
infrared absorption. Similarly, those losses associated with the electrons are called
optical absorption. Most of these losses are normally in the ultraviolet and for transparent
materials they are totally so. However, in some materials they also occur in the
visible spectrum, thus giving rise to the colors of these materials. In ionic crystals this
may happen if electrons occupy some of the sites normally filled by negative ions, thus
causing a natural frequency intermediate between ultraviolet and infrared.

31 A. R. von Hippel, ed., Dielectric Materials and Applications, John Wiley and Sons, Inc., New York, 1954.

6.21 MAXWELL'S EQUATIONS FOR DIELECTRIC MATERIALS

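If a dielectric material is considered at the microscopic level to consist of an aggregation
of charged particles in motion in a vacuum, then the free-space Maxwell's equations
$$\nabla\times\mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla\times\mathbf{B} = \frac{\boldsymbol{\iota}_t}{\mu_0^{-1}} + \frac{\dot{\mathbf{E}}}{c^2} \tag{6.112}$$
are applicable, with $\boldsymbol{\iota}_t = \boldsymbol{\iota} + \boldsymbol{\iota}_b$ representing the total current density due to motions
of both primary charges ($\boldsymbol{\iota}$) and the bound charges ($\boldsymbol{\iota}_b$) within the dielectric. However, a
more convenient form of these equations may be devised as follows: If
$$\mathbf{P}\,dV = \sum_{n=1}^{M}\mathbf{p}_n = \sum_{n=1}^{2M} q_n\mathbf{r}_n$$
is the total instantaneous dipole moment within a macroscopic volume element dV
(cf. Sections 6.2, 6.3), then
$$\dot{\mathbf{P}}\,dV = \sum_{n=1}^{2M} q_n\dot{\mathbf{r}}_n = \sum_{n=1}^{2M} q_n\mathbf{v}_n$$
in which $\mathbf{v}_n$ is the instantaneous velocity of the nth charge within dV. This expression
is seen to be the instantaneous current in dV due to the motions of the bound charges.
When the thermal velocity components (which average to zero) are subtracted out,
$$\dot{\mathbf{P}}\,dV = \boldsymbol{\iota}_b\,dV$$
in which $\boldsymbol{\iota}_b$ is the equivalent instantaneous ordered current density representing the
contributions due to the motions of all the bound charges within dV. Thus $\dot{\mathbf{P}} = \boldsymbol{\iota}_b$ is
the contribution to the total current density $\boldsymbol{\iota}_t$ made by the time-varying dipole moments
within the dielectric. Since
$$\mathbf{D} = \epsilon_0\mathbf{E} + \mathbf{P}$$
it follows that
$$\epsilon_0\dot{\mathbf{E}} = \dot{\mathbf{D}} - \boldsymbol{\iota}_b$$
If the dielectric material is nonmagnetic (this restriction will be removed in Chapter 7),
then, since $\epsilon_0 c^2 = \mu_0^{-1}$,
$$\nabla\times\mathbf{B} = \frac{\boldsymbol{\iota}_t}{\mu_0^{-1}} + \frac{\epsilon_0\dot{\mathbf{E}}}{\epsilon_0 c^2} = \frac{\boldsymbol{\iota}_t + \dot{\mathbf{D}} - \boldsymbol{\iota}_b}{\mu_0^{-1}} = \frac{\boldsymbol{\iota} + \dot{\mathbf{D}}}{\mu_0^{-1}}$$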
Maxwell's other equation remains intact in the form given in (6.112) so that at
points within nonmagnetic dielectric materials, Maxwell's equations may be written in
the form
$$\nabla\times\mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla\times\mathbf{B} = \frac{\boldsymbol{\iota} + \dot{\mathbf{D}}}{\mu_0^{-1}} \tag{6.113}$$
The auxiliary relations become
$$\nabla\cdot\mathbf{D} = \rho \qquad \nabla\cdot\mathbf{B} = 0 \tag{6.114}$$
in which $\mathbf{D} = \epsilon\mathbf{E}$, with the permittivity $\epsilon$ capable of all the diverse characteristics discussed
earlier in this chapter.
These results will be generalized further in the next two chapters to include magnetic
and conductive materials.
REFERENCES

1. Böttcher, C. J. F., Theory of Electric Polarization, Elsevier Publishing Company, Amsterdam, 1952.
2. Corson, D., and P. Lorrain, Introduction to Electromagnetic Fields and Waves, W. H. Freeman and Company, San Francisco, 1962.
3. Dekker, A. J., Solid State Physics, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1957.
4. Dekker, A. J., Electrical Engineering Materials, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1959.
5. Fröhlich, H., Theory of Dielectrics, 2d ed., Clarendon Press, Oxford, 1958.
6. Kittel, C., Introduction to Solid State Physics, 2d ed., John Wiley and Sons, Inc., New York, 1956.
7. Plonsey, R., and R. E. Collin, Principles and Applications of Electromagnetic Fields, McGraw-Hill Book Company, New York, 1961.
8. Reitz, J. R., and F. J. Milford, Foundations of Electromagnetic Theory, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1960.
9. Van Vleck, J. H., Theory of Electric and Magnetic Susceptibilities, Oxford University Press, London, 1932.
10. von Hippel, A. R., ed., Dielectric Materials and Applications, John Wiley and Sons, Inc., New York, 1954.
11. Whittaker, E. T., A History of the Theories of the Aether and Electricity, Vol. 1, Thomas Nelson and Sons, Ltd., London, 1951.

PROBLEMS

6.1 What charge distribution over a spherical surface is equivalent in the exterior region to
a dipole of moment p at its center?
6.2 An electron is placed 30 angstroms from an atom of polarizability $\alpha = 10^{-40}$ farad m².
Find the dipole moment induced in the atom and the resulting force on the free electron.
What initial acceleration will this cause the electron to experience?
6.3 Find the energy stored in an atom of polarizability a if it is placed in a uniform field of
strength Eo.
6.4 Find the electrostatic energy stored in a system consisting of two identical dipoles which
are parallel and coaxial. Let their strength be p = qd and their separation be l, with l ≫ d.
6.5 The molecules of a polar gas have a permanent dipole moment of 1.5 debye units. What
externally applied field strength is needed to cause an orientational polarization which is
0.5 percent of the saturation value? Assume that the temperature is 20°C.
6.6 The dipole moment of a water molecule is 1.87 debye units. Neglecting electronic and
ionic polarizabilities, estimate the dielectric constant of water at O°C.
6.7 Show that the interionic force between the positive and negative ions of a salt will de-
crease by an order of magnitude when the salt is dissolved in water. This effect plays a
prominent role in keeping the salt in solution.
6.8 With the aid of Figure 6.11, estimate the permanent dipole moment of CH₃Cl and the
relative strength of its orientational polarizability.
6.9 A commercial 400 volt d.c. paper-insulated tubular capacitor for use in electronic com-
ponents has 0.1 microfarads of capacitance. Exclusive of the exterior plastic shell it con-
sists of two sheets of foil 0.0005 in. thick, interleaved by two sheets of paper 0.001 in.
thick, with the assembly wound into a circular cylinder 16 in. in diameter and 1 ~ in. long.
What is the relative dielectric constant of the paper?
6.10 Find the polarization density in the capacitor of the preceding problem, if a paper of
relative dielectric constant 2.4 is used and 400 volts is applied.
6.11 If a uniform field Eo is set up in a dielectric of permittivity €, find the strength of the
field in a spherical cavity inside the dielectric.
6.12 Figure 6.18 indicates two drops in the spontaneous polarization of $\mathrm{BaTiO_3}$, as measured
along a cube edge. These occur when the direction of polarization changes from being
parallel to a cube edge, to being parallel to a face diagonal, and then to being parallel to
a body diagonal. Use the experimental data of Figure 6.19 to show that the total spontaneous
polarization is essentially constant below 300°K.
6.13 Barium titanate has an index of refraction n = 2.4. Assuming that the Clausius-Mossotti
equation is applicable, show that the electronic polarizability is approximately one and
a half times as great as the spontaneous ionic polarizability.
6.14 A charge q is placed at the center of a dielectric shell of inner radius a, outer radius b, and
relative dielectric constant $\epsilon_r$. What is the change in energy of the system if the charge is
removed to infinity?
6.15 Show that a conducting metallic sphere of radius $a$ has a polarizability $\alpha = 4\pi\epsilon_0 a^3$. If an
artificial dielectric is constructed by using N such conducting spheres per unit volume,
find the relative static dielectric constant. What would be the frequency dependence of
the permittivity of this array?
6.16 Show that Equation (L.3) of Appendix L is applicable to the case of permanent dipoles
which oscillate about central pointing directions with {P} representing the retarded
orientation of the polarization density.
6.17 In parallel with the development in Appendix L, find the magnetic vector potential
function due to a collection of oscillating dipoles which represent a dielectric material.
Show that the equivalent time-harmonic currents are consistent with the equivalent time-
harmonic charge densities of Equation (L.3).
CHAPTER 7
Magnetic Materials
WITH RESPECT to their magnetic behavior, materials may be classified as diamagnetic,
paramagnetic, ferromagnetic, antiferromagnetic, or ferrimagnetic; distinctions will be
drawn among these five types of materials in later sections of this chapter. Briefly, the
manifestation of magnetic behavior may be attributed to electron orbital motion, to
electron spin, and to nuclear spin. All three of these causes are representable by equiv-
alent atomic currents flowing in circular loops. Unlike the currents considered in Chapter
4, which were caused by the macroscopic transport of charge, these atomic currents are
bound to the individual molecules, and the phenomenon in many respects is analogous
to that of the bound dipole charges which formed the basis for an explanation of di-
electric behavior. These equivalent atomic currents cause magnetic fields which are
calculable by the methods developed in Chapter 4. The aggregate effect of the atomic
currents in all the molecules of a material specimen may be to produce a significant
macroscopic magnetic field.
An equivalent atomic current loop may be characterized by its magnetic moment m.
For diamagnetic materials, the net magnetic moment of each molecule in the absence
of an external magnetic field is zero. The presence of an external field induces a slight
net magnetic moment, and this effect is akin to electronic polarization in dielectric
materials. The induced magnetic moments translate into a relative permeability slightly
less than unity.
In paramagnetic materials, the molecules possess net permanent magnetic moments
which are randomly oriented. The presence of an external field causes some net align-
ment of these magnetic moments, in balance with the thermal agitation. This effect is
analogous to orientational polarization in dielectric materials. Normally, in paramagnetic
materials, adjacent molecules exert little magnetic influence on each other and
such materials exhibit a relative permeability only slightly in excess of unity.
Ferromagnetic materials are composed of molecules possessing magnetic moments of
equal strength. The interactions of adjacent molecules are so strong that all of the
magnetic moments align within a given domain of the material even in the absence of
an external field. This effect is similar to what occurs in ferroelectric crystals. These
materials have large relative permeabilities, values in the range $10^4$ to $10^5$ being not
uncommon. Antiferromagnetic materials differ in that adjacent molecules have magnetic
moments of equal strength but the alignment is antiparallel. In ferrimagnetic
materials, adjacent magnetic moments are of two different strengths as well as being
antiparallel. These four conditions of the permanent magnetic moments are suggested
for a one-dimensional model in Figure 7.1.

[Figure 7.1 Alignment of magnetic moments for different types of magnetic materials. Panels: Paramagnetic, Ferromagnetic, Antiferromagnetic, Ferrimagnetic.]

To account for all three causes of magnetic behavior and their possible occurrence in
any of the five types of magnetic materials, an arbitrary specimen will be treated as
though it were a general collection of magnetic moments m in a vacuum, with consideration
of the detailed composition of m deferred until later in the chapter where specific
types of materials are discussed. Static conditions will be treated first, and an expression
will be obtained for the total magnetic field caused by externally impressed currents
plus a distribution of atomic magnetic moments. This expression will then be used to
generalize the relation between B and H and to deduce the local field at the site of any
molecule. Consideration will then be given to the problem of relating the strengths of
the local field and the net magnetic moment density M. For many materials this will
permit simplifications of the expression for the total field; additionally, it will lead to the
relation $\mathbf{H} = \mu^{-1}\mathbf{B}$, with the permeability factor $\mu$ serving to describe the magnetic
behavior of the various types of magnetic materials.
The permeability will be investigated for all five classes of materials under static (or
quasistatic) conditions, and then the results will be generalized to time-harmonic
situations. Resonance phenomena will be described and hysteresis losses treated.
Finally, the free-space form of Maxwell's equations derived in Chapter 5 will be ex-
tended to apply to regions occupied by magnetic materials in a manner analogous to
what was done for dielectric materials in Chapter 6.

7.1 * HISTORICAL SURVEY

An awareness of the existence of magnetized and magnetizable materials can be traced
back to the ancients, who were familiar with lodestone and its power to attract iron.
Thus Plato, in the dialogue Ion, invests Socrates with the words

. . . this gift you have of speaking well on Homer is not an art; it is a power divine, im-
pelling you like the power in the stone Euripides called the magnet . . . . This stone does
not simply attract the iron rings, just by themselves; it also imparts to the rings a force
enabling them to do the same thing as the stone itself, that is, to attract another ring, so that
sometimes a chain is formed, quite a long one, of iron rings suspended from one another.
For all of them, however, their power depends on that lodestone.
* This section may be omitted without loss in continuity of the technical presentation.

Despite this long history of acquaintance with the powers of natural magnets, the
introduction of the compass (which was the first practical application of magnetism)
did not occur until the Middle Ages.¹ The date and authorship of this invention are not
known, but primitive forms of the compass were commonly employed in northwestern
Europe by the end of the twelfth century. About one century later, and apparently
independently, the Chinese also discovered that the directive power of a magnet could
be applied to navigational purposes.
The science of magnetism can be said to date from 1269, for in that year Pierre
de Maricourt (Peregrinus) announced the discovery of an important property of lodestones.
In his own words:²

So you must know that this stone bears in itself the similitude of the heavens, the method
of proving which I will explain clearly how to find . . . there are two points in the heavens
more noteworthy than the rest, because the celestial sphere turns about them as upon axes.
One of these is named the Arctic or North pole, whilst the remaining one is named the
Antarctic or Southern. So in this stone you should thoroughly comprehend there are two
points of which one is called the North, the remaining one the South. To the general dis-
covery of these two points you may attain by manifold industry . . . one way is to have
this stone rounded with a tool with which crystals and other stones are rounded. Afterwards
let a needle . . . be placed over the stone, and along the length of the needle let a line be
marked out dividing the stone along the middle. Afterwards let the needle be placed in
another position over the stone, and mark the stone with a line again in the same way
according to that position. And if you wish, you shall do this in several places or positions,
and without doubt all the lines of this kind will meet in two points, just as all the meridian
circles of the World meet in the two opposite poles of the World. Know you then that one is
the North, the other the South . . . .

Peregrinus went on to the discussion of further experiments in which he showed that
the two poles were also the points of greatest concentration of magnetic strength. His
terminology has prevailed to this day, and this conception of a polarization effect in
magnets has made a lasting impression and forms the basis for many subsequent
theories of magnetization.
It is difficult for the present-day reader to appreciate the superstitious awe with
which medievalists regarded the magnet and its seemingly supernatural ability to
attract other bodies. Thus curative properties for all sorts of afflictions were ascribed
to it. Gout, dropsy, convulsions, and even marital disputes were believed to give way
to its magical powers. Another common belief which prevailed for centuries was that a
magnet would lose its directive properties if rubbed with garlic. The rise of the scientific
method included consideration of such matters, as one can see in this passage from the
writings of Giambattista della Porta:³

1 A summary of what is known about the discovery of the compass, with bibliography, has been given
by D. G. Knapp in "Origins of Geomagnetic Science," Chap. 6 of Magnetism of the Earth, Publication
40-1, Coast and Geodetic Survey, U.S. Dept. of Commerce, Washington, D.C., 1962.
2 Petrus Peregrinus, De Magnete, Chap. 4, English translation by Sylvanus P. Thompson, Chiswick
Press, London, 1902.
3 John Baptist Porta, Natural Magick, English translation of Magiae Naturalis, 20 volumes, published
in London, 1658.

It is a common opinion among seamen that onions and garlic are at odds with the lode-
stone; and steersmen and such as tend the mariner's card are forbidden to eat onions or
garlic lest they make the index of the poles drunk. But when I tried all these things, I found
them to be false; for not only breathing . . . upon the lodestone after eating of garlic did
not stop its virtues; but even when it was anointed all over with the juice of the garlic, it performed
its office as well as if it had never been touched with it, and I could observe almost
not the least difference.

Although dispelling such beliefs also gained the attention of William Gilbert (1540-
1603), it is fortunate that he still found time for more significant inquiries. Gilbert
was the first to appreciate that the earth itself is a giant spherical magnet. He went so
far as to magnetize a small iron ball and demonstrate that it possessed a magnetic field
similar to that of the earth. The action of a compass needle was then readily explained
as merely another example of the principle that like poles of different magnets repel,
whereas unlike poles attract.⁴
The seventeenth century witnessed a widened interest in magnetic investigations.
Among the accomplishments of that period may be mentioned the demonstration by
A. Kirchner (1601-1680) that the two poles of a magnet have equal strength. This was
done by measuring the force required to pull a piece of iron away from either pole.
N. Cabeo (1585-1650) revealed an inductive effect when he noted that an unmagnetized
needle floating freely on water would align itself with the earth's magnetic meridian.
H. Gellibrand (1597-1636) discovered the secular variation of the magnetic declination.
Descartes offered the first theoretical explanation of magnetic phenomena by attempt-
ing to embrace all known effects within his theory of vortices. He assumed that the fluid
matter of a vortex entered a magnet at one pole and emerged at the other, acting on
nearby pieces of iron because the molecules of the iron presented a special resistance
to its motion.
In the Principia, Isaac Newton speculated that the law of force for a bar magnet
was the inverse cube of distance.⁵ However, John Michell (1724-1793) was the first
to enunciate a correct law for the force between magnetic poles, stating:⁶

Whenever any Magnetism is found, whether in the Magnet itself, or any piece of Iron,
etc., excited by the Magnet, there are always found two Poles, which are generally called
North and South . . . . Each Pole attracts or repels exactly equally, at equal distances,
in every direction . . . . The Attraction and Repulsion of Magnets decreases, as the
Squares of the distances from the respective poles increase.

Michell based the statement of this law on his own experimental observations and those
of several contemporaries. The validity of the inverse square relationship was later
reinforced by the refined experiments of Coulomb.
The prevalence at that time of fluid theories of electricity naturally led to efforts
to construct similar theories of magnetism. A one-fluid theory was proposed by Aepinus
in 1759, in which the poles were presumed to be regions in which the magnetic fluid
was present in excess or deficiency of the normal amount. A two-fluid theory was
favored by Brugmans and Wilcke, with elements of one fluid repulsing each other, but
attracting all elements of the other fluid. The names austral and boreal were given to
these fluids and Coulomb adopted this two-fluid idea, using it to explain why a magnet,
upon being broken in two, becomes two magnets each with a pair of poles, rather than
two half-magnets each with a single pole. According to Coulomb,⁷ this effect could be
explained by imagining the two magnetic fluids to be trapped in equal amounts within
the molecules of magnetic bodies, with no possibility of transfer of either fluid from one
molecule to the next. In the unmagnetized state every molecule of the body has both
its fluids uniformly distributed, and magnetization occurs when the austral and boreal
fluids retreat to opposite ends of each molecule.

4 W. Gilbert, De Magnete, London, 1600. An English translation by P. F. Mottelay was published in
1893, reprinted by Dover Publications, Inc., New York, 1958.
5 Cf. Book III, Prop. VI, Theorem VI, Cor. V. This conclusion is correct for distances large compared
with the length of the magnet.
6 J. Michell, A Treatise of Artificial Magnets, London, 1750.

This polarization hypothesis of Coulomb was used by Poisson in 1824 as the basis
for the first successful theory of magnetism.⁸ Poisson's development has already been
described in Section 6.1 and its dual in terms of electric dipoles was presented in detail
in Section 6.3. Upon representing the effect of a magnetized body in terms of equivalent
surface and volume distributions of magnetic charge, Poisson was able to explain satisfactorily
all the magnetic phenomena known at that time. In addition to describing the
behavior of a permanent magnet, he accounted for the properties of temporary magnets
by deriving an expression for the distribution of induced magnetization. He also offered
an explanation for the observed fact that some materials are more highly magnetizable
than others, introducing a quantitative index of this property which is akin to the
modern factor of permeability.
The Coulomb-Poisson conception that a polarized magnetic molecule is the primitive
element in any magnetized specimen was favored by many subsequent investigators.
With some refinements, it still forms the basis of a mathematical theory of magnetism
which is acceptable for most calculations. However, a different model, conceived
by Ampere,⁹ was destined to supersede it. Stimulated by his discoveries relating magnetic
fields to the shapes and dispositions of current-carrying conductors, Ampere
chose to consider all of magnetism as being basically an electrical phenomenon. In his
view, each of Poisson's magnetic molecules owed its properties to a small internal current
which circulated perpetually. Aggregations of these elemental circulating currents,
properly aligned, could account for the action of permanent magnets. Although Ampere
did not develop a complete theory which would rival Poisson's, he did treat extensively
the equivalence of permanent magnets and electric circuits. It was to be some time
before this approach was carried much further; the notion of a perpetual molecular
current which encountered no resistance was too revolutionary to be accepted readily
by his contemporaries.
A new class of magnetic materials was discovered by Faraday in 1845. Earlier that
same year, he had determined that the plane of polarization of a beam of light was
rotated when the beam travelled through a bar of heavy glass in the presence of a
longitudinal magnetic field. In an extension of this experiment he noted:¹⁰

7 C. A. Coulomb, "Seventh Memoir on Electricity and Magnetism," Mem Acad, 488; 1789.
8 S. D. Poisson, "Memoir on the Theory of Magnetism," Mem Acad Sci, Ser. 2, 5, 247-338; February
1824.
9 A. M. Ampere, "On the Mathematical Theory of Electrodynamic Phenomena Uniquely Deduced
from Experiment," Mem Acad, 6, 367; 1825.
10 M. Faraday, Experimental Researches in Electricity, vol. 3, Sec. 2253-2258, Bernard Quaritch,
London, 1855.

The bar of silicated borate of lead, or heavy glass already described as the substance in
which magnetic forces were first made effectually to bear on a ray of light . . . was suspended
centrally between the magnetic poles, and left until the effect of torsion was over.
The magnet was then thrown into action by making contact at the voltaic battery: immediately
the bar moved, turning round its point of suspension, into a position across the magnetic
curve or line of force . . . . Here then we have a magnetic bar which . . . points
perpendicularly to the lines of magnetic force.

Faraday also observed that a cube of the heavy glass would tend to move out of the
magnetic field. This behavior was contrary to that of an iron bar, which would align
itself with the field, and to that of a cube of iron, which would seek the strongest region
of the field. Faraday offered an explanation of the behavior of such materials by supposing
that magnetic induction caused in them a contrary state to that which it produced
in magnetic matter, and for this reason he called them diamagnetic.
Upon attempting to picture this behavior in terms of lines of magnetic force, Faraday
was led to distinguish between diamagnetic materials and those which were normally
magnetic. He introduced the word paramagnetic to describe the latter, and suggested
that a paramagnetic body possessed a high conducting power for magnetic flux, thus
causing the lines of force to crowd into it in preference to its surroundings. Diamagnetic
materials, on the other hand, were pictured as having a low conducting power for the
flux lines so that the density of lines within a diamagnetic body was low compared to
the surroundings.
Two years later, Wilhelm Weber (1804-1890) offered a detailed explanation¹¹ of
diamagnetism. He assumed the existence of Amperian molecular circuits, and invoked
Faraday's emf law to argue that currents should be induced in these circuits if a time-
varying magnetic field were applied. Since the induction would result in currents whose
fields were opposed to the stimulus, this would neatly account for diamagnetic behavior.
According to this argument, all bodies exhibit diamagnetism. Weber accepted this
conclusion, and then assumed further that paramagnetic substances additionally
possessed permanent molecular currents which were the cause of their paramagnetism.
A material whose permanent molecular currents were large would be normally magnetic
to such a high degree that the weak diamagnetic effect due to induced currents would
be masked completely. Weber was so satisfied with this explanation that he used it as a
reason to reject the Coulomb-Poisson hypothesis of polarizable magnetic fluids, saying
in the same article:

Through the discovery of diamagnetism, the hypothesis of electric molecular currents
in the interior of bodies is corroborated; the hypothesis of magnetic fluids in the interior of
bodies is refuted.

In 1871 Weber reformulated his theory of magnetism, taking as a new model for an
Amperian molecular current one in which an electric charge was pictured as orbiting
around a fixed charge of opposite sign. This hypothesis preceded by a quarter of a
century Thomson's discovery of the electron and by a third of a century the development
of the Rutherford-Bohr atom. It was adopted in 1905 by P. Langevin (1872-1946)
as the basis for the first electron theory of magnetism.¹² Langevin's theory provided a
satisfactory explanation for the distinctions between diamagnetism and paramagnetism,
including the temperature independence of the former and the $T^{-1}$ temperature
dependence of the latter, effects which had been observed experimentally by P. Curie
(1859-1906) a decade earlier.¹³

11 W. Weber, "On the Relationship of the Science of Diamagnetism With the Sciences of Magnetism
and Electricity," Ann Phys, 87, 145-189; 1852.

In introducing the terms diamagnetic and paramagnetic Faraday had grouped all
materials whose permeabilities exceed that of free space in the paramagnetic category.
The need for subdivision of this classification became apparent in 1907 when P. E.
Weiss (1865-1932) pointed out that spontaneous magnetization exists in highly magnetic
materials in the absence of an external field.¹⁴ He called this the ferromagnetic
property, and this term has ever since been used to characterize such materials. The
adjective paramagnetic is now confined to those materials which have relative permeabilities
only slightly in excess of unity and which normally do not exhibit spontaneous
magnetization.
Weiss accounted for the possibility of a gross demagnetized state of a ferromagnetic
material by postulating the existence of regions (now called domains) each of which is
always magnetized to saturation. He supposed that the directions of magnetization
in the various domains were different, and randomly oriented in the absence of an
external field, in which case the gross magnetization of all the domains considered
together was zero. To explain the uniaxial saturation magnetization in any domain,
Weiss hypothesized that a strong, local molecular field forced the parallel alignment of
adjacent Amperian current loops. As a consequence, he was able to show that a ferromagnetic
material, when heated to a certain critical temperature $T_c$ (called its Curie
point), ceased to be ferromagnetic, and that for higher temperatures the magnetic
susceptibility was proportional to $(T - T_c)^{-1}$. This latter result is called the Curie-Weiss
law.
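As a purely numerical illustration of the Curie-Weiss proportionality, the following Python sketch evaluates a susceptibility of the form $C/(T - T_c)$ above the Curie point. The constant C, the function name, and the sample temperatures are assumptions introduced here for illustration only; they are not taken from the text or from measured data.

```python
def curie_weiss_susceptibility(temperature_k, curie_point_k, curie_constant):
    """Return C / (T - Tc), the Curie-Weiss form valid only for T > Tc."""
    if temperature_k <= curie_point_k:
        # Below or at the Curie point the material is ferromagnetic and
        # this proportionality does not apply.
        raise ValueError("the Curie-Weiss law applies only for T > Tc")
    return curie_constant / (temperature_k - curie_point_k)


# Hypothetical values, chosen only to show the trend.
TC = 1000.0   # assumed Curie point, kelvin
C = 1.0       # assumed proportionality constant, kelvin

for t in (1010.0, 1100.0, 2000.0):
    print(t, curie_weiss_susceptibility(t, TC, C))
```

The printed values fall off as the temperature moves away from the Curie point, and the function refuses temperatures at or below it, where the law is not meant to hold.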
Experimental evidence in support of Weiss' domain theory for ferromagnetic materials
was first provided by Barkhausen¹⁵ in 1919, who observed jumps in the magnetization
of a ferromagnetic specimen which are associated with irregular fluctuations
in the motion of the domain walls. Even more direct experimental evidence results
from a technique introduced by Bitter.¹⁶ In this technique one prepares a colloidal
suspension of ferromagnetic particles and spreads it on a carefully prepared surface
of the specimen; the strong local magnetic fields at the domain boundaries cause the
particles to collect there, creating a pattern which is seen easily under a microscope.
The hypothesis by Weiss of a strong local field which aligns all the magnetic moments
in a domain is similar to the Lorentz local field theory for dielectrics (cf. Section 6.5).
However, in order to account for the extremely large values of relative permeability
in ferromagnetic materials, if one writes for the local field $\mathbf{B}_{\mathrm{loc}} = \mathbf{B} + (\gamma - 1)\mathbf{M}/\mu_0^{-1}$,
with M the magnetization density, it develops that $\gamma$ is several orders of magnitude
larger than the value predicted by a Lorentz-type theory based on ordinary electromagnetic
assumptions. This difficulty was overcome in 1928 by Heisenberg,¹⁷ who
showed that the large local field may be explained in terms of quantum-mechanical
exchange forces. These forces are electrostatic, but due to constraints imposed by the
Pauli exclusion principle are equivalent to extremely strong coupling between electron
spins. Heisenberg's theory has been given experimental confirmation through studies
of the gyromagnetic effect.¹⁸ In these studies the ratio of magnetic moment to angular
momentum is determined. Theory predicts a characteristically different ratio if the
magnetism is due to electron orbital motion as opposed to spin; the experimental data
agree with the spin prediction.

12 P. Langevin, "Magnetism and the Theory of Electrons," Ann Chim (Paris), VIII, 5, 70-127; 1905.
13 P. Curie, "Magnetic Properties of Bodies at Diverse Temperatures," Ann Chim (Paris), VII, 5,
289-405; 1895.
14 P. E. Weiss, "The Hypothesis of the Molecular Field and the Ferromagnetic Property," J de Phys
(Paris), Ser. 4, 6, 661-690; 1907.
15 H. Barkhausen, "Two Phenomena Discovered with the Help of a New Amplifier," Phys Zeit, 20,
401-403; 1919.
16 F. Bitter, "Experiments on the Nature of Ferromagnetism," Phys Rev, 41, 507-515; August 15, 1932.

Another magnetic phenomenon whose explanation requires the quantum theory is
the spatial quantization of atoms under the influence of an applied magnetic field.
The theory predicts that the angular momentum of the atom along the field axis must
be an integral or half-integral multiple of $h/2\pi$, with $h$ Planck's constant. The experiments
of Stern and Gerlach on the deflection of atoms in a nonhomogeneous magnetic
field have confirmed this.¹⁹ A discrete rather than continuous set of deflections was
observed, giving conclusive evidence that the atoms can assume only particular orien-
tations in the presence of the field.
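The discrete set of orientations can be enumerated in a few lines. The sketch below assumes the standard quantum-mechanical result, not stated explicitly above, that for an angular momentum quantum number j the field-axis component takes the 2j + 1 values -j, -j + 1, ..., j in units of $h/2\pi$. The function name is hypothetical.

```python
from fractions import Fraction


def allowed_projections(j):
    """List the field-axis angular momentum components, in units of h/(2*pi).

    j must be a non-negative integer or half-integer; the result runs from
    -j to +j in unit steps (an assumed standard result, see the lead-in).
    """
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a non-negative integer or half-integer")
    count = int(2 * j) + 1
    return [-j + k for k in range(count)]


# A half-integral case and an integral case.
print(allowed_projections(Fraction(1, 2)))
print(allowed_projections(1))
```

Exact fractions are used instead of floating-point numbers so that half-integral multiples are represented without rounding.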
In the Heisenberg theory for ferromagnetic materials the exchange forces are positive,
accounting for the parallel alignment of adjacent magnetic moments. There is
no restriction in the theory which requires that these exchange forces be positive for
all materials; if they are negative, an antiparallel alignment of neighboring spins is
favored. Materials which meet this latter condition are said to be antiferromagnetic.
The theory of such materials was first investigated by Neel²⁰ and Bitter,²¹ and later
extended by Van Vleck.²² The first material shown to be antiferromagnetic is manganese
oxide; this was demonstrated in 1938 by Bizette, Squire, and Tsai.²³
Most magnetic materials are good conductors, and thus will not pass electromagnetic
waves easily, particularly at the higher frequencies. In an effort to overcome this
limitation Snoek, Verwey, and their co-workers at the Philips Laboratories instituted
a search during the 1940s for new ferromagnetic materials which would have high
enough electrical resistivities to be suitable for practical applications requiring wave
passage. They were successful²⁴ in developing a group of materials, called ferrites,
which are obtained by replacing the ferrous ion $\mathrm{Fe^{2+}}$ in magnetite ($\mathrm{FeO \cdot Fe_2O_3}$) by
another divalent metal ion such as Mn, Cd, Co, Zn, Ni, Cu, or Mg. Mixed ferrites were
also obtained by using combinations of these ions. The resistivities achieved are in the
range $10^0$ to $10^4$ ohm-m, compared with $10^{-7}$ ohm-m for iron.
17 W. Heisenberg, "On the Theory of Ferromagnetism," Z Phys, 49, 619-636; 1928.
18 See, e.g., S. J. Barnett, "Gyromagnetic and Electron-Inertia Effects," Rev Mod Phys, 7, 129-166;
1935. Also, Proc Am Acad Arts Sci, 75, 109; 1944.
19 W. Gerlach and O. Stern, "The Experimental Detection of Quantized Orientations in a Magnetic
Field," Z Phys, 9, 349-352; 1922.
20 L. Neel, "Influence of Molecular Field Fluctuations on the Magnetic Properties of Bodies," Ann
Phys (Paris), Ser. 10, 18, 5-105; 1932.
21 F. Bitter, "A Generalization of the Theory of Ferromagnetism," Phys Rev, 54, 79-86; 1938.
22 J. H. Van Vleck, "On the Theory of Antiferromagnetism," J Chem Phys, 9, 85-90; 1941.
23 H. Bizette, C. F. Squire, and B. Tsai, "The Transition Point λ of the Magnetic Susceptibility of
MnO," Comp Rend, 207, 449-450; 1938.
24 For reviews of these developments see J. J. Went and F. W. Gorter, "Magnetic and Electrical
Properties of Ferroxcube Materials," Philips Tech Rev, 13, 181; 1952. Also see A. Fairweather, F. F.
Roberts, and J. E. Welch, "Ferrites," Rept Prog Phys, 15, 142-172; 1952.

Ferrites have been found to have the spinel structure (after the mineral $\mathrm{MgAl_2O_4}$),
which is a composite of tetrahedral and octahedral arrangements of ions. Neel has
treated these materials theoretically²⁵ by assuming that a negative interaction exists
between the ions at the tetrahedral sites (A sites) and those at the octahedral sites
(B sites), with this interaction causing an antiparallel spin alignment of the A and B
ions. The theory also includes consideration of the weaker AA and BB interactions,
which turn out to be negative as well. The ferromagnetic behavior of ferrites is thus
explained, surprisingly, in terms of three antiferromagnetic interactions. Neel termed
materials which behave in this manner ferrimagnetic.

7.2 THE STATIC MACROSCOPIC MAGNETIC FIELD
DUE TO A VOLUME OF POLARIZED MAGNETIC MATERIAL

A satisfactory theory of the magnetic behavior of materials can be based on the conception
that individual atoms or molecules contribute magnetic effects because of (a)
the orbital motion of their electrons, (b) electron spin, and (c) nuclear spin. Following
Ampere, one can represent each of these causes by the magnetic moment of an equiva-
lent current loop. In this way the resulting magnetic theory will parallel in many
respects the dielectric theory presented in Chapter 6.
Consider a specimen of material of arbitrary composition whose volume V is bounded
by the surface S. In determining the magnetic field caused by the material, each mole-
cule of the specimen will be replaced by an elementary current loop (or several current
loops), these current loops being treated as though they were in a vacuum, the total
magnetic field of all the current loops being the same as that of the material specimen
they replaced. The magnetic field will be significant if enough of these current loops
are similarly oriented.
In some materials this similar orientation exists independent of any external stimulus.
Such materials are appropriately called permanent magnets. They may occur naturally
or be formed by metallurgical processes. Other materials are magnetically neutral until
subjected to an external magnetic field. The response to a given stimulus varies from
one material to the next, and even for one material the response may vary in a compli-
cated way with changes in the direction or strength of the external field. For most
materials the magnetic response is very small and in this sense they differ little from
free space. A strong magnetic behavior is exhibited mainly by a small group of metals,
chief of which are iron, nickel, cobalt, and their alloys.
As a consequence of this diversity in magnetic response among materials, no attempt
will be made beforehand to link the distribution of equivalent current loops with any
external field; the discussion of such cause-and-effect relations will be deferred to a
later point in the development of the theory. Accordingly, the initial broad assump-
tion will be made that an arbitrary distribution of current loops exists throughout V.
As in the case of the dielectric theory, static conditions will be considered first.
Through use of the results of Example 4.6, if m is the magnetic moment of the molecule
situated at the point (ξ,η,ζ), then the magnetic vector potential function due to
this molecule is

$$\mathbf{A} = \frac{\mathbf{m} \times \mathbf{r}}{4\pi\mu_0^{-1} r^3} \tag{7.1}$$

in which r is drawn from (ξ,η,ζ) to the distant point (x,y,z) where A is being determined.

25 L. Neel, "Magnetic Properties; Ferrimagnetism and Antiferromagnetism," Ann Phys (Paris), Ser. 12, 13, 137-198; 1948.

It is usually the case that the volume V of the magnetic material is large enough so
that it may be subdivided into a great number N of macroscopic volume elements dV,
with N big enough so that the methods of the integral calculus may be employed,
whereas dV is large enough to contain many molecules. (The notation of Figure 6.2 is
applicable.) When this is so, a continuum approximation may be employed. Let
M(ξ,η,ζ) dV represent the vector sum of all the magnetic moments in dV, so that M is
the volume density of magnetic moments. Then

$$d\mathbf{A} = \frac{\mathbf{M}(\xi,\eta,\zeta) \times \mathbf{r}}{4\pi\mu_0^{-1} r^3}\,dV$$

is the magnetic vector potential function due to all the molecules in dV. From this it
follows that the potential at (x,y,z) due to all the molecules in the entire volume V of
magnetic material is

$$\mathbf{A}(x,y,z) = \int_V \frac{\mathbf{M}(\xi,\eta,\zeta) \times \mathbf{r}}{4\pi\mu_0^{-1} r^3}\,dV \tag{7.2}$$

In (7.2) M has the dimensions amperes/meter; it is customarily called the magnetization
density, or simply the magnetization, and is a macroscopic function. Implicit in
this derivation is the assumption that (x,y,z) is sufficiently remote from each equivalent
current loop to satisfy the condition that r ≫ a, with a the loop radius.† This
condition clearly is met for points (x,y,z) outside the material, and the remainder of
the analysis will first be carried through under this restriction.
The Field at an Exterior Point. If one again distinguishes del operators with respect
to the source point (ξ,η,ζ) and the field point (x,y,z) by the symbols ∇_S and ∇_F, then
the relation ∇_S(1/r) = r/r³ may be used to convert (7.2) to the form

$$\mathbf{A} = \int_V \frac{\mathbf{M} \times \nabla_S(1/r)}{4\pi\mu_0^{-1}}\,dV$$

Use of the vector identity (V.109) then gives

$$\mathbf{A} = \int_V \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1} r}\,dV - \int_V \frac{\nabla_S \times (\mathbf{M}/r)}{4\pi\mu_0^{-1}}\,dV$$

But from Problem V.29 (at the end of Appendix V),

$$-\int_V \frac{\nabla_S \times (\mathbf{M}/r)}{4\pi\mu_0^{-1}}\,dV = \int_S \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1} r}$$

† The equivalent loop diameter need not exceed the molecular diameter for any of the three atomic
causes of magnetism.

and thus

$$\mathbf{A} = \int_S \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1} r} + \int_V \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1} r}\,dV \tag{7.3}$$

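in which S is the bounding surface of the material. The magnetic field caused by this
specimen of magnetized material is therefore

$$\mathbf{B} = \nabla_F \times \left[ \int_S \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1} r} + \int_V \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1} r}\,dV \right] \tag{7.4}$$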

An interesting interpretation of Equations (7.3) and (7.4) may be offered. The action
of the magnetic specimen is as though an areal current density ι = ∇ × M amperes/m²
were distributed throughout V, and as though a lineal current density j = M × 1_n
amperes/m were distributed over S, with 1_n the outward-drawn unit normal vector.†
This concept of equivalent current densities will prove useful in describing many
aspects of magnetic behavior.
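
The equivalent-current picture can be illustrated with a minimal sketch for a uniformly magnetized cylinder. The magnetization value and the function name `surface_current_density` are assumptions chosen only for illustration. With M uniform, ∇ × M vanishes in the interior, so only the surface density j = M × 1_n remains.

```python
import numpy as np

def surface_current_density(M, n_hat):
    """Equivalent lineal current density j = M x 1_n, in A/m."""
    return np.cross(M, n_hat)

# Assumed uniform axial magnetization of a cylinder, A/m (illustrative value).
M = np.array([0.0, 0.0, 8.0e5])

# Side wall at azimuth 0: the outward normal is +x, so j points along +y,
# i.e. circumferentially around the axis.
j_side = surface_current_density(M, np.array([1.0, 0.0, 0.0]))

# End cap: the outward normal is parallel to M, so no surface current flows.
j_cap = surface_current_density(M, np.array([0.0, 0.0, 1.0]))

print("side wall j =", j_side)
print("end cap  j =", j_cap)
```

The side-wall current circulates about the axis in the right-handed sense relative to M, which is the solenoidal sheet of current that a uniformly magnetized rod is equivalent to.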
The Field at an Interior Point. The previous analysis may be adapted to points
(x,y,z) within the magnetic material by treating separately a small region surrounding
the interior point in question. Let a spherical surface S_δ, of radius δ, be erected around
(x,y,z) as center, thus separating a volume V_δ from the remainder of the material. V_δ
should be chosen large enough to satisfy the condition δ ≫ a, with a the radius of any
equivalent current loop, but it should be small enough that M is essentially uniform
throughout V_δ. These conditions normally will be satisfied if δ is an order of magnitude
larger than the linear dimensions of a macroscopic volume element dV.
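If V′ is all the volume of magnetic material except V_δ, then

$$\mathbf{B}'(x,y,z) = \nabla_F \times \left[ \int_{S+S_\delta} \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1} r} + \int_{V'} \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1} r}\,dV \right] \tag{7.5}$$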

is the magnetic field at (x,y,z) due to all the specimen except for the contributions
of the molecules within V_δ. If now (x,y,z) is permitted to range over the subvolume dV,
consisting of the macroscopic volume element which is at the center of V_δ, no sensible
change will occur in the value of B′(x,y,z) computed from (7.5), since dV is so small
compared to V_δ. Thus Equation (7.5) also gives the average magnetic field throughout a
macroscopic volume element, due to all the molecules outside V_δ.
It is shown in Appendix N that all the current loops within V_δ cause an average
magnetic field throughout V_δ of amount

$$\mathbf{B}_i = \frac{2}{3}\,\frac{\mathbf{M}}{\mu_0^{-1}} \tag{7.6}$$

Since V_δ is a small local volume, if the molecules within V_δ are alike and uniformly
distributed, it follows that (7.6) also gives the average magnetic field throughout the
macroscopic volume element dV, due to all the current loops within V_δ. Therefore,‡
if B(x,y,z) is interpreted to mean the total average field in dV, or more briefly the

† If at a point in S a line of length dl is drawn in S transverse to the current flow, and the total
current crossing dl is counted and found to be dI, then j = dI/dl is the lineal current density. j is
given the direction of the current flow.
‡ Most magnetic materials have this locally homogeneous property.

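macroscopic field, then

$$\mathbf{B}(x,y,z) = \nabla_F \times \left[ \int_{S+S_\delta} \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1} r} + \int_{V'} \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1} r}\,dV \right] + \frac{2}{3}\,\frac{\mathbf{M}}{\mu_0^{-1}} \tag{7.7}$$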

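This result can be simplified. Since V_δ is so small that M is uniform over S_δ, one may
take a polar axis parallel to M so that M = 1_r M cos θ − 1_θ M sin θ. Noting that dS
is directed into V_δ, it follows that

$$
\begin{aligned}
\nabla_F \times \int_{S_\delta} \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1} r}
&= \int_{S_\delta} \frac{(\mathbf{M} \times d\mathbf{S}) \times \mathbf{r}}{4\pi\mu_0^{-1} r^3} \\
&= \int_{S_\delta} \frac{[(\mathbf{1}_r M\cos\theta - \mathbf{1}_\theta M\sin\theta) \times (-\mathbf{1}_r\,dS)] \times (-\mathbf{1}_r\,\delta)}{4\pi\mu_0^{-1}\delta^3} \\
&= \int_{S_\delta} \frac{\mathbf{1}_\theta M\sin\theta\,dS}{4\pi\mu_0^{-1}\delta^2} = -\frac{2}{3}\,\frac{\mathbf{M}}{\mu_0^{-1}}
\end{aligned}
\tag{7.8}
$$

Therefore the equivalent lineal current density over S_δ makes an equal and opposite
contribution to the macroscopic field at (x,y,z) when compared to the contribution
made by the molecules within V_δ.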
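
The factor −2/3 in the surface integral over the small sphere can be checked numerically. The sketch below is an illustrative check only (the function name and the midpoint rule are assumptions, not part of the text). It integrates the component of 1_θ sin θ along the polar axis over the sphere; the transverse components cancel by symmetry in azimuth, and the radius δ drops out.

```python
import numpy as np

def cavity_surface_term(n_theta=2000):
    """Axial component, in units of M / mu0^-1, of the integral over S_delta
    of 1_theta * sin(theta) dS / (4 pi delta^2)."""
    dtheta = np.pi / n_theta
    theta = (np.arange(n_theta) + 0.5) * dtheta   # midpoint rule in theta
    # (1_theta . 1_z) = -sin(theta); dS = delta^2 sin(theta) dtheta dphi.
    integrand = -np.sin(theta) * np.sin(theta) * np.sin(theta)
    # The azimuthal integration contributes a factor 2*pi.
    return 2.0 * np.pi * np.sum(integrand) * dtheta / (4.0 * np.pi)

print(cavity_surface_term())  # close to -2/3
```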
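Additionally, since V_δ is so small, ∇_S × M is constant throughout V_δ and thus

$$\nabla_F \times \int_{V_\delta} \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1} r}\,dV = \int_{V_\delta} \frac{(\nabla_S \times \mathbf{M}) \times \mathbf{r}}{4\pi\mu_0^{-1} r^3}\,dV = \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1}} \times \int_{V_\delta} \frac{\mathbf{r}}{r^3}\,dV = 0 \tag{7.9}$$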

Therefore (7.9) may be added to the right side of (7.7) without affecting the value of
B(x,y,z). When this is done, and the explicit value for the surface integral obtained in
(7.8) is utilized, there results

$$\mathbf{B}(x,y,z) = \nabla_F \times \left[ \int_S \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1} r} + \int_V \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1} r}\,dV \right] \tag{7.10}$$

But this is identical with (7.4) and thus the macroscopic field is given by the same
formula, whether (x,y,z) is an interior point or an exterior point.
EXAMPLE 7.1
The long, thin cylindrical compass needle suggested by the figure has an almost-uniform
axial magnetization, which is indicated by the array of arrows. If Equation (7.3) is applied
to this specimen, one can conclude that, to the extent M is uniform throughout V,
∇_S × M = 0 in V. Further, M × dS = 0 over the end surfaces and M × dS is circumferential
on the cylindrical surface and transverse to the long dimension. Thus the entire
specimen is equivalent to a long, slender solenoid with many closely spaced turns carrying
a steady current: the equivalent ampere-turns per unit length is M. The field of this compass
needle is therefore similar to the one shown in the flux plot of Example 4.11.

[Figure: long, thin compass needle with axial magnetization indicated by an array of arrows.]

If the flux lines external to this compass needle were assumed to originate on positive
magnetic charge and to terminate on negative magnetic charge, one can see how the right
end would act like a north magnetic pole and the left end like a south magnetic pole.
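
The solenoid equivalence of Example 7.1 can be made concrete with the standard on-axis field formula for a finite solenoid, taking the ampere-turns per unit length equal to M. The sketch below is illustrative only; the needle dimensions and magnetization are assumed values, and `needle_axis_field` is a hypothetical helper name.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space, H/m

def needle_axis_field(z, M, length, radius):
    """On-axis B (tesla) of a uniformly magnetized cylinder, treated as a
    finite solenoid carrying M ampere-turns per metre.
    z is measured along the axis from the center of the needle."""
    zp = z + length / 2.0
    zm = z - length / 2.0
    return 0.5 * MU0 * M * (zp / math.hypot(zp, radius)
                            - zm / math.hypot(zm, radius))

# Assumed values: 5 cm needle, 0.5 mm radius, M = 8e5 A/m.
M, LENGTH, RADIUS = 8.0e5, 0.05, 0.0005
b_center = needle_axis_field(0.0, M, LENGTH, RADIUS)
b_end = needle_axis_field(LENGTH / 2.0, M, LENGTH, RADIUS)
print(b_center / (MU0 * M), b_end / (MU0 * M))
```

For a slender needle the axial field is close to M/μ₀⁻¹ at the center and falls to about half that value at the end faces, where the flux begins to spread out as from a pole.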
The original work of Poisson, in which magnetic effects were explained in terms of
positive and negative magnetic fluids, was noted in the historical introduction. Equal
amounts of these fluids were assumed to be within each molecule and to separate into
magnetic dipoles when a specimen became magnetized. The resulting fields were com-
puted in a manner precisely analogous to the procedure followed for dielectric mate-
rials in Chapter 6, in which polarized electric charges were assumed to be the cause of
dielectric behavior.
This analytical approach to magnetic behavior has been widely used, but it suffers
from several disadvantages. First, the equivalent atomic current concept of Ampere
appears to be closer to reality. Second, the expressions for internal and external macroscopic
fields are not the same under a magnetic charge formulation,²⁶ whereas it has
just been demonstrated that they are the same when a current formulation is used.
Third, certain difficulties arise when interpreting the results of combining magnetic
fields due to primary currents and magnetic materials, when the former is expressed
in terms of electric charge transport and the latter in terms of magnetic dipoles. This
can be appreciated by tracing the arguments of Section 7.3 under a magnetic charge
formulation.
Despite these difficulties, there are boundary value problems in which the magnetic
charge concept is useful and offers certain simplifications. These problems are usually
concerned with the fields outside the materials, in which case the aforementioned
difficulties are avoided. The simplifications occur because, like electric charge effects,
the fields due to magnetic charges may be given in terms of a scalar potential function.
For the reader who is interested in this alternative development, a derivation may be
found in many textbooks.²⁷

7.3 A GENERALIZATION OF H₀


Consider next a general magnetostatic system consisting of primary steady currents
plus equivalent current loops which represent magnetic materials. One can imagine
that a primary current density ι(ξ,η,ζ) occupies the volume V₁ and that an assortment
of magnetic materials occupies the volume V₂. V₁ and V₂ may overlap and neither of
them need be only one simply connected region. If M(ξ,η,ζ) is the volume density of
magnetic moments in V₂, then the total macroscopic magnetic field B at a point (x,y,z)
will be

$$\mathbf{B}(x,y,z) = \mathbf{B}_1(x,y,z) + \mathbf{B}_2(x,y,z) \tag{7.11}$$

in which

$$\mathbf{B}_1 = \nabla_F \times \int_{V_1} \frac{\boldsymbol{\iota}\,dV}{4\pi\mu_0^{-1} r}$$

and

$$\mathbf{B}_2 = \nabla_F \times \left[ \int_{S_2} \frac{\mathbf{M} \times d\mathbf{S}}{4\pi\mu_0^{-1} r} + \int_{V_2} \frac{\nabla_S \times \mathbf{M}}{4\pi\mu_0^{-1} r}\,dV \right]$$

If the entire magnetostatic system, including the magnetic materials, is viewed as a


26 See, e.g., D. R. Corson and P. Lorrain, Introduction to Electromagnetic Fields and Waves, Chap. 7,
W. H. Freeman and Company, San Francisco, 1962.

27 See, e.g., J. R. Reitz and F. J. Milford, Foundations of Electromagnetic Theory, Chap. 10,
Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1960.



distribution of primary currents of density ι and equivalent currents of density ∇ × M
in a vacuum, a total macroscopic H₀ field may be defined by

$$
\begin{aligned}
\mathbf{H}_0(x,y,z) &= \mu_0^{-1}\mathbf{B}(x,y,z) \\
&= \mu_0^{-1}\mathbf{B}_1(x,y,z) + \mu_0^{-1}\mathbf{B}_2(x,y,z) \\
&= \mathbf{H}_{01}(x,y,z) + \mathbf{H}_{02}(x,y,z)
\end{aligned}
\tag{7.12}
$$
In (7.12) H₀₁ is the field which encircles the primary current distribution ι. Similarly,
H₀₂ is the field associated with the equivalent currents ∇ × M. H₀₂ is a macroscopic
field and does not have the detailed microscopic structure of a magnetic field associated
with the movements of the individual charges within the molecules. Both of the fields
H₀₁ and H₀₂ satisfy Ampere's circuital law and are derivable through formation of the
curl of a vector potential function. Therefore,

$$
\begin{aligned}
\nabla \cdot \mathbf{H}_{01} &= 0 & \nabla \times \mathbf{H}_{01} &= \boldsymbol{\iota} \\
\nabla \cdot \mathbf{H}_{02} &= 0 & \nabla \times \mathbf{H}_{02} &= \nabla \times \mathbf{M}
\end{aligned}
\tag{7.13}
$$

In most problems of practical significance one is interested in the total B field which
determines the force on a moving charge, and in the H₀₁ field which is linked to the
primary sources ι dV through Ampere's circuital law. Equation (7.12) can be rewritten
to display these two quantities of interest in the form

$$\mathbf{H}_{01} = \mu_0^{-1}\mathbf{B} - \mathbf{H}_{02} \tag{7.14}$$

H₀₂ is the extraneous quantity in this equation, and it is advantageous to explore the
implications of defining a macroscopic magnetic field H by the relation

$$\mathbf{H} = \mu_0^{-1}\mathbf{B} - \mathbf{M} \tag{7.15}$$
The substitution of M for H₀₂ is clearly suggested by the last of Equations (7.13). In
making the definition (7.15) it is to be emphasized that B still has the meaning of total
macroscopic magnetic flux density and that M still represents the macroscopic volume
density of magnetic moments. Outside V₂, where M = 0,

$$\mathbf{H} = \mathbf{H}_{01} + \mathbf{H}_{02} \tag{7.16}$$

whereas inside V₂

$$\mathbf{H} = \mathbf{H}_{01} + \mathbf{H}_{02} - \mathbf{M} \tag{7.17}$$
Thus in terms of the earlier conception of magnetic fields associated with currents
in free space, H equals the total macroscopic H₀ field outside the magnetic materials,
but not inside. Furthermore, taking the divergence of (7.15) and utilizing (7.13) gives

$$\nabla \cdot \mathbf{H} = -\nabla \cdot \mathbf{M} \tag{7.18}$$

whereas taking the curl yields

$$\nabla \times \mathbf{H} = \boldsymbol{\iota} \tag{7.19}$$

Therefore H as defined by (7.15) may not be continuous but it does satisfy Ampere's
circuital law in terms of the primary sources only, and one may write

$$\oint_C \mathbf{H} \cdot d\boldsymbol{\ell} = \int_S \boldsymbol{\iota} \cdot d\mathbf{S} \tag{7.20}$$

and conclude that H is governed explicitly by the primary current distribution.



Equation (7.15) will be adopted as the defining relation for the generalized magnetic
field function H, with the understanding that this equation refers to the total macroscopic
B everywhere, whether inside or outside the materials, but that it refers only
to the total macroscopic H₀ field outside the materials. It is an equation of cardinal
importance in the theory of magnetic materials and can often be converted to the form
H = μ⁻¹B, since for many materials M is expressible as a function of B. The permeability
factor μ then serves the role of representing the macroscopic magnetic behavior
of the material.
Besides containing the magnetization M explicitly, Equation (7.15) has several
other advantages. Outside the materials (7.15) gives H = μ₀⁻¹B which is a natural
relation, whereas (7.14) does not similarly connect H₀₁ and B. In the absence of materials
(7.15) reduces everywhere to (4.42) and thus all the earlier discussion of steady
magnetic fields due only to primary currents becomes a special case of the present
generalization. But an even more important feature is that H, as defined by (7.15),
shares with H₀₁ the property of satisfying Ampere's circuital law, thus providing a
cause and effect relation with the system of primary currents.
Needlelike and waferlike cavities can be constructed inside a specimen of magnetic
material in order to measure B and H. The procedure is analogous to what was done
for dielectric materials in Example 6.3.
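
A short numerical sketch may help fix the meaning of the defining relation H = μ₀⁻¹B − M. The test case is an assumption brought in for illustration: a uniformly magnetized sphere with no primary currents, for which the interior field is B = (2/3)M/μ₀⁻¹. The function name `h_field` and the magnetization value are hypothetical.

```python
import numpy as np

MU0 = 4.0e-7 * np.pi  # permeability of free space, H/m

def h_field(B, M):
    """Generalized magnetic field H = B / mu0 - M (A/m)."""
    return B / MU0 - M

# Assumed uniform magnetization of a sphere, A/m (illustrative value).
M = np.array([0.0, 0.0, 1.0e5])

# Interior of the sphere: B = (2/3) mu0 M, so H = -M/3 opposes M.
B_inside = (2.0 / 3.0) * MU0 * M
H_inside = h_field(B_inside, M)

# Any exterior point has M = 0, so H is simply B / mu0 there.
B_outside = np.array([0.0, 0.0, 1.0e-3])   # arbitrary illustrative value
H_outside = h_field(B_outside, np.zeros(3))

print(H_inside, H_outside)
```

Inside the sphere H and B point in opposite directions even though no primary current is present, which illustrates that H coincides with the free-space field μ₀⁻¹B only outside the material.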

7.4 THE LOCAL FIELD

The defining relation (7.15), which gives H in terms of B and M, is a macroscopic
equation whose further interpretation must await the linking of M to its causes. But
this linkage occurs at the microscopic level, and thus M forms a bridge between the
macroscopic and microscopic theories of magnetic behavior.
To develop the connection between M and its causes, it is useful to introduce the
concept of the local field, B_loc, which will be defined as the average field intensity in the
region occupied by a given molecule within the material, due to all external sources
plus every other molecule, but excluding the molecule in question. B_loc may therefore
be determined by removing the given molecule, maintaining all other molecules in
their time-averaged states, and calculating the space-averaged magnetostatic field
in the cavity previously occupied by the removed molecule. If V_m is the volume of the
cavity, then

$$\mathbf{B}_{\mathrm{loc}} = \frac{1}{V_m}\int_{V_m} \bar{\mathbf{b}}\,dV - \frac{1}{V_m}\int_{V_m} \bar{\mathbf{b}}_m\,dV \tag{7.21}$$

in which b̄ is the total time-averaged field at a point in V_m and b̄_m is the time-averaged
field at the same point due just to the molecule in question.
If the magnetic material is locally homogeneous, the first integral in (7.21) is approximately
equal to the macroscopic field B. And if V_m can be chosen as a spherical volume
of radius r_m, then the results of Appendix N give

$$\frac{1}{V_m}\int_{V_m} \bar{\mathbf{b}}_m\,dV = \frac{2}{3}\,\frac{\mathbf{m}}{\mu_0^{-1} V_m} \tag{7.22}$$

with m the total magnetic moment of the removed molecule. If N = 1/V_m is the local
volume density of molecules, then assuming parallel and equal magnetization for all
local molecules gives

$$\mathbf{M} = N\mathbf{m} = \frac{\mathbf{m}}{V_m}$$

so that the space-averaged self-field is 2M/3μ₀⁻¹. With these substitutions (7.21)
becomes

$$\mathbf{B}_{\mathrm{loc}} = \mathbf{B} - \frac{2\mathbf{M}}{3\mu_0^{-1}} \tag{7.23}$$

This equation is similar to the Lorentz field derived in Section 6.5 for electrically
polarized molecules and suffers from the same limitations. However, once again one
may argue that, even if the integral in (7.22) does not reduce precisely to 2M/3μ₀⁻¹ in
all cases, it is defensible that the average self-field in V_m should be proportional to m,
which in turn is proportional to M by definition. Therefore

$$\frac{1}{V_m}\int_{V_m} \bar{\mathbf{b}}_m\,dV = (1 - \gamma)\,\frac{\mathbf{M}}{\mu_0^{-1}}$$

in which γ is called the internal field constant.† In this case (7.21) assumes the more
general form

$$\mathbf{B}_{\mathrm{loc}} = \mathbf{B} + (\gamma - 1)\,\frac{\mathbf{M}}{\mu_0^{-1}} \tag{7.24}$$

Equation (7.24) was first postulated by Pierre Weiss. (Cf. Section 7.1.) It will be
interesting to note subsequently that for some magnetic materials γ will be several
orders of magnitude larger than the value 1/3 predicted by the Lorentz-like derivation
given above, and culminating in (7.23).

7.5 MAGNETIC SUSCEPTIBILITY


In general, the average magnetization per molecule m in any magnetic specimen is
functionally related to the local field. This dependence may be described by the equation

$$\mathbf{m} = \alpha\,\mathbf{B}_{\mathrm{loc}} \tag{7.25}$$

in which α is called the magnetic polarizability and may be a simple scalar or a tensor,
but is often a function of the strength of the local field. If N is the density of molecules
per cubic meter, then

$$\mathbf{M} = N\mathbf{m} = N\alpha\,\mathbf{B}_{\mathrm{loc}} \tag{7.26}$$

is the magnetization density. With the use of (7.24) this may be rewritten in the form

$$\mathbf{M} = \frac{N\alpha}{1 - (\gamma - 1)N\alpha/\mu_0^{-1}}\,\mathbf{B} \tag{7.27}$$

an equation which is seen to be similar to (6.49), derived earlier for dielectric materials.

† (1 − γ) is used as the proportionality constant here to conform with tradition. Because of this
choice (7.24) is equivalent to H_loc = H + γM, which is the customary way of expressing the local field,
and which unfortunately emphasizes H rather than B.

Equation (7.27) indicates a functional relation between M and B. The magnetic
susceptibility χ_m is defined in such a way that this relation may also be written

$$\mathbf{M} = \frac{\chi_m}{1 + \chi_m}\,\mu_0^{-1}\mathbf{B} \tag{7.28}$$

A study of either (7.23) or (7.24) reveals that the dimensions of M and B differ by μ₀⁻¹
and therefore that χ_m is dimensionless.†
Upon combination of (7.27) and (7.28) the magnetic susceptibility may be expressed
in terms of the polarizability and the internal field constant, namely,

$$\chi_m = \frac{N\alpha/\mu_0^{-1}}{1 - \gamma N\alpha/\mu_0^{-1}} \tag{7.29}$$

This equation is seen to be completely analogous to (6.51).
Through the combination of (7.15) and (7.28), the H field is given by

$$\mathbf{H} = \mu_0^{-1}\mathbf{B} - \frac{\chi_m}{1 + \chi_m}\,\mu_0^{-1}\mathbf{B} = \mu^{-1}\mathbf{B} \tag{7.30}$$

in which

$$\mu = \mu_0 (1 + \chi_m) \tag{7.31}$$

is called the permeability of the material. The relative permeability μ_r is defined by

$$\mu_r = \frac{\mu}{\mu_0} = 1 + \chi_m \tag{7.32}$$

and, like χ_m, is a dimensionless quantity.
Because of the nature of the polarizability α, all the quantities χ_m, μ, and μ_r may be
scalars, or they may be tensors, depending on the material in question. In materials
which are strongly magnetic, they almost invariably will be dependent on the strength
of the field. At the macroscopic level the induced magnetization in a material specimen
may be represented through use of Equations (7.28) and (7.31) or (7.32) by any one of
the quantities χ_m, μ, or μ_r.
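
The chain from polarizability to permeability can be traced numerically. The following is a minimal sketch for the scalar, field-independent case only; the molecular density, the polarizability, and the function names are assumptions chosen for illustration. It also confirms that the magnetization from (7.27) agrees with the one obtained from (7.28) when χ_m is taken from (7.29).

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space, H/m

def magnetization(B, N, alpha, gamma):
    """M from (7.27): M = N*alpha*B / (1 - (gamma - 1)*N*alpha*mu0)."""
    x = N * alpha * MU0          # N*alpha / mu0^-1, dimensionless
    return N * alpha * B / (1.0 - (gamma - 1.0) * x)

def susceptibility(N, alpha, gamma):
    """chi_m from (7.29)."""
    x = N * alpha * MU0
    return x / (1.0 - gamma * x)

def relative_permeability(N, alpha, gamma):
    """mu_r from (7.32)."""
    return 1.0 + susceptibility(N, alpha, gamma)

# Assumed illustrative values: N in molecules/m^3, alpha chosen so that
# N*alpha*mu0 = 1e-3, and the Lorentz-like internal field constant 1/3.
N = 8.5e28
ALPHA = 1.0e-3 / (N * MU0)
GAMMA = 1.0 / 3.0
B = 0.5  # tesla

chi = susceptibility(N, ALPHA, GAMMA)
M_from_727 = magnetization(B, N, ALPHA, GAMMA)
M_from_728 = chi / (1.0 + chi) * B / MU0
print(chi, relative_permeability(N, ALPHA, GAMMA), M_from_727, M_from_728)
```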
EXAMPLE 7.2
Consider a coil wound on a soft iron core, as shown in the figure, and carrying a time-independent
current I. This steady current causes a B₁ field upward inside the coil and
thus induces a nearly uniform magnetization M upward in the iron core. The M distribution
is equivalent to a solenoidal sheet of current in the cylindrical surface of the core,

† The reader may wonder why χ_m was not introduced by using the simpler equation M = χ_m μ₀⁻¹B,
in parallel with (6.50). The reason for this is that the traditional development of magnetic theory links
χ_m to H rather than to B, the defining relation being M = χ_m H, which is equivalent to (7.28).
The reader is also cautioned against a possible source of confusion in approaching the literature
of this field for the first time. Many investigators express their results in cgs units, measuring H in
oersteds. They may choose to give the magnetic moment per cc, or per gram, or per mole of the
substance being measured. The corresponding susceptibilities are then χ e.m.u./cc, or χ′ e.m.u./gm,
etc. The dimensionless susceptibility χ_m is related to these other quantities by χ_m = 4πχ = 4πρχ′, etc.
(ρ is the density.)
It should also be noted that some authors use the symbol χ_m to denote susceptibility per mole.
In this text the subscripts e and m are used to distinguish between the dimensionless electric susceptibility
χ_e and the dimensionless magnetic susceptibility χ_m.

[Figure for Example 7.2: coil carrying current I wound on a soft iron core; contour C shown dotted.]

in the same direction as I. Thus the B₂ field caused by M enhances the B₁ field caused by I;
the total field B is greater everywhere because of the presence of the iron core.
Applying Ampere's circuital law (7.20) to the contour C, shown dotted in the figure,
one obtains

$$\oint_C \mathbf{H} \cdot d\boldsymbol{\ell} = NI = \int_{C_{\mathrm{air}}} \mathbf{H} \cdot d\boldsymbol{\ell} + \int_{C_{\mathrm{iron}}} \mathbf{H} \cdot d\boldsymbol{\ell}$$

in which C_air and C_iron are the parts of the contour C which are outside and inside the
iron core, and N is the total number of turns of the coil. If (7.30) applies in the iron, then

$$NI = \int_{C_{\mathrm{air}}} \mu_0^{-1}\mathbf{B} \cdot d\boldsymbol{\ell} + \int_{C_{\mathrm{iron}}} \mu^{-1}\mathbf{B} \cdot d\boldsymbol{\ell}$$

With the length of the contour C designated by L, the average value of B along C is

$$B_{\mathrm{av}} = \frac{1}{L}\oint_C \mathbf{B} \cdot d\boldsymbol{\ell}$$

By virtue of this equation, the upper and lower bounds on B_av are such that

$$\frac{NI}{\mu_0^{-1} L} \le B_{\mathrm{av}} \le \frac{NI}{\mu^{-1} L}$$

The lower bound occurs with no iron core, and the upper bound occurs when the core
extends around to include the entire contour C. Since μ⁻¹ is usually so much less than μ₀⁻¹,
the range in B_av can be considerable. Because B is continuous, it follows that B just outside
the core is the same as B just inside the core. Therefore B along C_air keeps increasing
as C_iron is made a larger and larger part of C. It is for this reason that iron cores in electrical
machines are designed to occupy as great a portion of the path as possible, leaving the
minimum air gap permissible under mechanical considerations. For a given current in the
coil wound on the iron core, a current-carrying moving conductor in the air gap then
experiences the greatest force through its interaction with this maximized B field. This
force provides the driving torque in a motor and the reaction torque in a generator.
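
The effect of the air gap in Example 7.2 can be sketched numerically under a simplifying assumption that is not made in the text: B is taken to have the same magnitude everywhere along C and to be tangent to it, and the iron is treated as linear with a fixed relative permeability. The circuital law then reduces to NI = B(l_air/μ₀ + l_iron/μ). All numerical values and the function name are illustrative.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space, H/m

def average_flux_density(turns, current, l_iron, l_air, mu_r):
    """B along the contour, assuming uniform |B| tangent to C and linear iron:
    N*I = B * (l_air / mu0 + l_iron / (mu_r * mu0))."""
    return MU0 * turns * current / (l_air + l_iron / mu_r)

# Assumed values: 500 turns, 0.5 A, 0.4 m contour, relative permeability 2000.
TURNS, CURRENT, PATH, MU_R = 500, 0.5, 0.4, 2000.0

b_no_iron = average_flux_density(TURNS, CURRENT, 0.0, PATH, MU_R)
b_small_gap = average_flux_density(TURNS, CURRENT, PATH - 0.002, 0.002, MU_R)
b_all_iron = average_flux_density(TURNS, CURRENT, PATH, 0.0, MU_R)

print(b_no_iron, b_small_gap, b_all_iron)
```

Even a 2 mm gap in a 0.4 m path lowers B by roughly an order of magnitude relative to the closed-core value in this sketch, which is why the gap is kept as small as mechanical considerations allow.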

7.6 MEASUREMENT OF SUSCEPTIBILITY

The method used to measure magnetic susceptibility depends on whether or not the
relative permeability is close to unity. For diamagnetic and paramagnetic substances,
the commonly used techniques involve a determination of the force exerted on a specimen
by a nonuniform field. This force is most easily expressed as a spatial derivative
of the stored energy.
The derivation in Section 5.8 of an expression for the magnetic energy stored in a
region of free space is still valid when magnetic materials are present, since H (like
H₀₁) satisfies Ampere's circuital law and all other steps in the proof are identical. Thus
as a generalization of (5.79), one may write for the power being supplied to an entire
magnetic field in a volume V,

$$P = \int_V \mathbf{H} \cdot \dot{\mathbf{B}}\,dV$$

If various subregions of V contain materials which satisfy (7.30), then one may write

$$P = \int_V \mu^{-1}\mathbf{B} \cdot \dot{\mathbf{B}}\,dV = \frac{\partial}{\partial t}\int_V \tfrac{1}{2}\mu^{-1}B^2\,dV$$

From this it follows that the magnetic stored energy is

$$W_m = \tfrac{1}{2}\int_V \mu^{-1}B^2\,dV \tag{7.33}$$

in which the permeability μ is not necessarily a constant throughout V.
Let this result be applied to the Gouy apparatus shown in Figure 7.2 in which a
slender magnetic specimen is suspended from a balance so as to hang between the pole
pieces of a magnet. If V is taken large enough to encompass the magnet, all of its
sensible field, and the specimen, the magnetic stored energy may be written

$$W_m = \tfrac{1}{2}\int_V \mu^{-1}\,[B(\xi,\eta,\zeta) + \delta B(\xi,\eta,\zeta,z)]^2\,d\xi\,d\eta\,d\zeta$$

in which B is the field distribution in the absence of the specimen, and δB is the additional
field due to the presence of the specimen, with z the vertical coordinate of some
reference point in the specimen (say the midpoint).

FIGURE 7.2 Gouy apparatus.

If (V₀,μ₀), (V₁,μ₁), and (V₂,μ₂) are the volume and permeability, respectively, of the
air region of V, the magnet region, and the specimen region, then

$$W_m = \tfrac{1}{2}\int_{V_0+V_2} \mu_0^{-1}(B + \delta B)^2\,dV + \tfrac{1}{2}\int_{V_1} \mu_1^{-1}(B + \delta B)^2\,dV + \tfrac{1}{2}\int_{V_2} (\mu_2^{-1} - \mu_0^{-1})(B + \delta B)^2\,dV$$
in which the free-space integral has been added and subtracted over the volume V₂.
In the absence of a specimen, V₀ + V₂ is the air region of V, and with δB = 0, the
first two integrals in the preceding expression give the total stored energy with no
specimen present. For a small diamagnetic or paramagnetic specimen (μ₂ ≈ μ₀), δB is
negligibly small everywhere, and only the third integral of the preceding expression is a
sensible function of the position of the specimen. Therefore the force on the specimen is

$$
\begin{aligned}
F_z = -\frac{dW_m}{dz} &= -\frac{d}{dz}\int_{V_2} \frac{1}{2}\left(\frac{\mu_2^{-1}}{\mu_0^{-1}} - 1\right)\mu_0^{-1}B^2\,dV \\
&= \frac{d}{dz}\int_{z-l/2}^{z+l/2} \tfrac{1}{2}\chi_m A\,\mu_0^{-1}B^2\,d\zeta = \tfrac{1}{2}\chi_m A\,\mu_0^{-1}\,(B_a^2 - B_b^2)
\end{aligned}
\tag{7.34}
$$

in which a thin cylindrical specimen of length l and cross section A has been assumed,
and B_a and B_b are the field intensities at the top and bottom of the specimen.

Typically, a specimen may be 10 to 15 cm long, so that the top end might be in a
fringe field of about 100 gauss when the bottom end is in a central field of 10,000 gauss;
B_a may then be neglected in the above formula. The specimen is weighed with the
electromagnet not excited, and again with the field turned on. The difference in weight
is F_z; with A and B_b known, χ_m may be deduced. The method is applicable to solid
samples and also to liquids and powders placed in a glass tube.
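
The reduction of a Gouy measurement can be sketched as follows. This is an illustration only: the specimen cross section and the susceptibility used to generate the synthetic force are assumed values, and both function names are hypothetical. The force formula is (7.34), written with μ₀ in the denominator in place of the factor μ₀⁻¹.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space, H/m

def gouy_force(chi_m, area, b_top, b_bottom):
    """Vertical force from (7.34): F_z = 0.5*chi_m*A*(B_a^2 - B_b^2)/mu0."""
    return 0.5 * chi_m * area * (b_top**2 - b_bottom**2) / MU0

def gouy_susceptibility(force_z, area, b_top, b_bottom):
    """Invert (7.34) for chi_m, given the measured change in weight F_z."""
    return 2.0 * MU0 * force_z / (area * (b_top**2 - b_bottom**2))

# Fields quoted in the text: 100 gauss at the top, 10,000 gauss at the bottom.
B_TOP, B_BOTTOM = 0.01, 1.0      # tesla
AREA = 1.0e-5                    # m^2, assumed cross section
CHI_ASSUMED = 2.0e-5             # assumed paramagnetic susceptibility

force = gouy_force(CHI_ASSUMED, AREA, B_TOP, B_BOTTOM)
chi_recovered = gouy_susceptibility(force, AREA, B_TOP, B_BOTTOM)
print(force, chi_recovered)
```

With these numbers the force is a few times 10⁻⁵ newton, directed downward for a paramagnetic sample (toward the stronger field), which indicates the balance sensitivity the method requires.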
Instead of measuring directly the susceptibility of highly permeable materials, such
as iron, the B-H curve is often deduced by forming a specimen in the shape of a torus,
on which two windings are placed. A d.c. source in series with a set of standard resistors
is connected to the primary winding, and a ballistic galvanometer to the secondary
winding. As the resistors are shorted out one by one, the primary current increases in
steps, as does H in the torus. An ammeter is used to measure the current, and Ampere's
circuital law to deduce H. As each resistor is shorted out, the impulse in the galvanometer
is used to determine B. If the resistors are removed sequentially, then reinserted one by
one, and if then the d.c. source is reversed and the process is repeated, a hysteresis loop
is determined. The incremental susceptibility may be deduced from the slope of the
hysteresis curve, and is normally not a constant.

7.7 DIAMAGNETISM
The diamagnetic effect in materials causes a slight negative contribution to susceptibility
and is attributable to alterations in electron orbital motion due to the presence of an
external magnetic field. While the external field is being established, the growth in its
flux density induces changes in electronic motion which in turn create a reaction field
opposed to the stimulus. When the external field reaches a steady value, so too does the
reaction field.
In terms of a classical model, this steady reaction field can be pictured as due to
circular orbital motion at constant speed of each electron in every atom of the material
specimen. Although this conception is somewhat crude when compared to a quantum
mechanical model, it greatly facilitates the calculation of magnetic moments and gives
results which agree with quantum deductions.
Using the classical model, consider first the case of a single electron moving along a
circular path of radius r at a constant speed v, as indicated by Figure 7.3. It will be
assumed that a uniform, steady field B_loc exists in the region, and is perpendicular to
the plane of the orbit, directed out of the paper in the figure. The electron experiences
radial forces due to (1) the electrostatic attraction of the nucleus, and (2) its interaction
with the magnetic field. Newton's force law yields

$$-\mathbf{1}_r\,\frac{e^2}{4\pi\epsilon_0 r^2} - e\,\mathbf{v} \times \mathbf{B}_{\mathrm{loc}} = -\mathbf{1}_r\,\frac{mv^2}{r} \tag{7.35}$$

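in which m and −e are the mass and charge of the electron, respectively. With the angular
velocity ω of the electron defined so that v = ω × r (cf. Section V.8), it follows that ω
is into the plane of the paper in Figure 7.3, and that (7.35) may be rewritten as

$$\boldsymbol{\omega} \times \mathbf{v} - \frac{e}{m}\,\mathbf{B}_{\mathrm{loc}} \times \mathbf{v} = -\frac{e^2\,\mathbf{r}}{4\pi\epsilon_0 m r^3}$$

When both sides of this equation are crossed with v, the result is

$$\left(\boldsymbol{\omega} \times \mathbf{v} - \frac{e}{m}\,\mathbf{B}_{\mathrm{loc}} \times \mathbf{v}\right) \times \mathbf{v} = -\frac{e^2}{4\pi\epsilon_0 m r^3}\,\mathbf{r} \times (\boldsymbol{\omega} \times \mathbf{r})$$

which reduces to

$$\left(1 - \frac{e^2}{4\pi\epsilon_0\,\omega^2 m r^3}\right)\boldsymbol{\omega} = \frac{e}{m}\,\mathbf{B}_{\mathrm{loc}} \tag{7.36}$$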

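If the angular velocity and orbital radius in the absence of a magnetic field are ω₀
and r₀, respectively, then (7.36) yields for this case

$$\omega_0^2 = \frac{e^2}{4\pi\epsilon_0 m r_0^3} \tag{7.37}$$

Since r₀ is of the order 10⁻¹⁰ meters, (7.37) indicates that ω₀ ≈ 10¹⁶ rad/sec. Values of
B_loc normally do not exceed 100 webers/m² (10⁶ gauss) and thus eB_loc/m ≈ 10¹³. Under
these circumstances, a study of (7.36) discloses that ω is close to ω₀. If (7.36) is written
in the form (with r taken equal to r₀)

$$\frac{e}{m}\,\mathbf{B}_{\mathrm{loc}} = \left[1 - \frac{\omega_0^2}{(\omega_0 + \delta\omega)^2}\right](\boldsymbol{\omega}_0 + \delta\boldsymbol{\omega})$$

upon expansion the first-order result is

$$\delta\boldsymbol{\omega} = \frac{e}{2m}\,\mathbf{B}_{\mathrm{loc}} \tag{7.38}$$

so that, to first order,

$$\boldsymbol{\omega} = \boldsymbol{\omega}_0 + \frac{e}{2m}\,\mathbf{B}_{\mathrm{loc}} = \boldsymbol{\omega}_0 + \boldsymbol{\omega}_L \tag{7.39}$$

FIGURE 7.3 Electron in circular orbit about nucleus.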

The term ω_L = eB_loc/2m is called the Larmor angular frequency, and is the change in
angular velocity the electron experiences due to the presence of the magnetic field. It is
in the same direction as the magnetic field, and is independent of the sense of rotation of
the electron in the orbit. Although ω_L is small compared to ω₀, it will soon be seen to
play a significant role in the phenomenon of diamagnetism.
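
The orders of magnitude quoted for the orbital and Larmor angular frequencies can be reproduced with a few lines. In this sketch the zero-field orbital frequency follows from balancing the Coulomb attraction against the centripetal requirement, ω₀² = e²/(4πε₀mr₀³). The physical constants are standard values; the choices r₀ = 10⁻¹⁰ m and B_loc = 1 weber/m² and the function names are assumptions for illustration.

```python
import math

E_CHARGE = 1.602176634e-19    # electron charge magnitude, C
M_ELECTRON = 9.1093837015e-31 # electron mass, kg
EPS0 = 8.8541878128e-12       # permittivity of free space, F/m

def orbital_angular_velocity(r0):
    """Zero-field angular velocity of a circular orbit of radius r0 (rad/s)."""
    return math.sqrt(E_CHARGE**2 / (4.0 * math.pi * EPS0 * M_ELECTRON * r0**3))

def larmor_angular_frequency(b_loc):
    """Larmor angular frequency e*B_loc/(2m) (rad/s)."""
    return E_CHARGE * b_loc / (2.0 * M_ELECTRON)

omega_0 = orbital_angular_velocity(1.0e-10)   # assumed orbital radius, m
omega_L = larmor_angular_frequency(1.0)       # assumed local field, Wb/m^2

print(omega_0, omega_L, omega_L / omega_0)
```

For these values ω₀ is about 1.6 × 10¹⁶ rad/sec and ω_L is about 8.8 × 10¹⁰ rad/sec, so the fractional change in angular velocity is of the order of a few parts in a million.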
This electronic orbital motion corresponds to a charge −e passing any point in the
orbit ω/2π times per second, or to an equivalent current I = ωe/2π. The magnetic
moment caused by this orbital motion is therefore

$$\mathbf{m}_{\mathrm{orb}} = -\frac{e\,\boldsymbol{\omega}}{2\pi}\,(\pi r^2) = -\tfrac{1}{2}\,e r^2\,\boldsymbol{\omega} \tag{7.40}$$

the minus sign occurring because the electronic motion in one direction corresponds to a
positive current flow in the opposite direction.
Since the angular momentum ℒ_orb of the electron about the nucleus is

$$\boldsymbol{\mathcal{L}}_{\mathrm{orb}} = \mathbf{r} \times m\mathbf{v} = \mathbf{r} \times m(\boldsymbol{\omega} \times \mathbf{r})$$

it follows that ℒ_orb = mr²ω and thus that

$$\mathbf{m}_{\mathrm{orb}} = -\frac{e}{2m}\,\boldsymbol{\mathcal{L}}_{\mathrm{orb}} \tag{7.41}$$

The orbital magnetic moment of an electron is therefore oppositely directed to its
angular momentum, these quantities being in the ratio (−e/2m).
Consider now a material specimen which, in the absence of any applied magnetic
field, exhibits no magnetic effect whatsoever. One may conclude from this that the
orbital planes of all the electrons within the specimen have tilts which are completely
random, thus giving a net magnetic moment which is zero. When the field is applied,
one may consider the resulting effect in the following manner: Imagine that the orbital
motion of a particular electron is viewed in terms of its projection on a plane transverse
to B_loc and in terms of its projection on a plane parallel to B_loc. Only the motion in the
transverse plane is affected by the field. Thus when one considers all the electrons in
the specimen, the projected orbits in planes parallel to B_loc are random and give a null
effect. The projected orbits in a plane perpendicular to B_loc are affected in an ordered
way and do give an effect.
Within a single atom of the material, if there is an electron whose projected orbital
motion in a plane transverse to $B_{loc}$ is counterclockwise at angular velocity
$$\omega_{+i} = \omega_{0i} + \frac{eB_{loc}}{2m}$$
and radius $r_i$, it is equally likely that on the time average there will be another electron
whose motion is clockwise at angular velocity $\omega_{-i} = -\omega_{0i} + eB_{loc}/2m$ and the same
radius. These electrons pair to give a net magnetic moment which, according to (7.40), is
$$m_i = -\tfrac{1}{2}\, e r_i^2 \left(\omega_{0i} + \frac{eB_{loc}}{2m}\right) - \tfrac{1}{2}\, e r_i^2 \left(-\omega_{0i} + \frac{eB_{loc}}{2m}\right) = -\frac{e^2 r_i^2 B_{loc}}{2m} \tag{7.42}$$

If there are $N$ atoms/m$^3$ and $Z$ is the number of electrons per atom, the net magnetization is
$$M = -N\,\frac{e^2 B_{loc}}{2m} \sum_{i=1}^{Z/2} r_i^2 \tag{7.43}$$

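If the $XY$ plane is chosen to be perpendicular to $B_{loc}$, the mean square distance from the
field axis is
$$\overline{r_i^2} = \frac{2}{Z} \sum_{i=1}^{Z/2} r_i^2 = \frac{2}{Z} \sum_{i=1}^{Z/2} \left(x_i^2 + y_i^2\right) = \overline{x^2} + \overline{y^2}$$
If the atom is assumed to be spherical and to have a mean square radius
$\overline{r^2} = \overline{x^2} + \overline{y^2} + \overline{z^2}$, then since $\overline{x^2} = \overline{y^2} = \overline{z^2}$,
$$\overline{r_i^2} = \tfrac{2}{3}\, \overline{r^2}$$
and Equation (7.43) may be written
$$M = -\left(\frac{Z N e^2\, \overline{r^2}}{6m}\right) B_{loc} \tag{7.44}$$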

In diamagnetic materials (i.e., in materials which exhibit this induced magnetization
but no other magnetic effects) the magnetization is so weak, and the coupling between
atoms so slight, that $B_{loc} \approx B$ and $|\chi_m| \ll 1$. For such materials use of (7.28) gives
$$\chi_m = -\frac{\mu_0 Z N e^2\, \overline{r^2}}{6m} \tag{7.45}$$
Equation (7.45) is the Langevin expression for diamagnetic susceptibility. Although
derived on the basis of classical arguments, it is in agreement with the result obtained
by means of a quantum mechanical analysis.$^{28}$ If one assumes as representative values,
$Z = 10$ and $N = 5 \times 10^{28}$ atoms/m$^3$, this expression predicts that $\chi_m$ will be in the
order of $10^{-5}$.
The susceptibility of diamagnetic materials may be determined experimentally by
measuring the force on a specimen when it is placed in a nonuniform magnetic field.$^{29}$
The results of such measurements for a variety of materials are given in Table 7.1. The
agreement between these data and the prediction drawn from Equation (7.45) is seen
to be quite good. The theory can be improved$^{30}$ by using more realistic assumptions in
calculating $\overline{r^2}$.

TABLE 7.1
SUSCEPTIBILITIES OF SOME DIAMAGNETIC MATERIALS
AT ROOM TEMPERATURE AND ATMOSPHERIC PRESSURE

Material        χ_m × 10^5      Material            χ_m × 10^5
Bismuth         -1.66           Selenium            -1.7
Copper          -0.95           Silicon             -0.3
Diamond         -2.2            Silver              -2.6
Graphite        -12             Sodium              -0.24
Germanium       -0.8            Aluminum oxide      -0.5
Gold            -3.6            Barium chloride     -2.0
Mercury         -3.2            Sodium chloride     -1.2

28 See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., p. 577, John Wiley and Sons, Inc.,
New York, 1956.
29 Cf. Sec. 7.6 for a typical procedure. A survey of experimental methods may be found in L. F. Bates,
Modern Magnetism, 3rd ed., Cambridge University Press, London, 1951.
The similarities between the present discussion of diamagnetism and the earlier
treatment of electronic polarizability in dielectrics are striking (cf. Section 6.6). A
further parallel arises when one considers temperature effects. Like electronic
polarizability, diamagnetism is also due to alterations in the relative positions and motions of
the parts of individual atoms caused by an external field. So long as the temperature
is not so excessive as to cause a change in electronic configuration, the diamagnetic
effect should be independent of temperature. This conclusion is supported by experiment.
EXAMPLE 7.3
A hydrogen atom in its ground state (1s) has a mean square orbital radius for its electron
given by
$$\overline{r^2} = 3a_0^2 = 3(0.529 \times 10^{-10})^2$$
in which $a_0$ is the Bohr radius expressed in meters. If the molal specific volume of H$_2$ gas
under standard conditions of pressure and temperature is taken to be 22.4 m$^3$/kg-mole,
then use of Avogadro's number yields the result that there are $2.68 \times 10^{25}$ hydrogen
molecules/m$^3$ under these conditions. If one assumes that each hydrogen atom in an
H$_2$ molecule behaves diamagnetically as though it were monatomic, then the value
$N = 5.36 \times 10^{25}$ atoms/m$^3$ may be used in formula (7.45) to compute the diamagnetic
susceptibility of hydrogen gas. When this is done, one obtains
$$\chi_m = -\frac{(1.6 \times 10^{-19})^2 (5.36 \times 10^{25}) \times 3 \times (0.529 \times 10^{-10})^2}{6 \times 9.1 \times 10^{-31} \times (4\pi \times 10^{-7})^{-1}} = -2.65 \times 10^{-9}$$
This figure is about $10^{-4}$ times the values listed in Table 7.1, but the difference can be
attributed to the fact that $Z = 1$ and to the fact that the atom density in a typical gas
is only about $10^{-3}$ of the value for a typical solid or liquid.
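The arithmetic of this example is easy to reproduce. The Python sketch below evaluates the
Langevin expression in the form implied by the numbers above, $\chi_m = -\mu_0 N Z e^2 \overline{r^2}/6m$, using the
same rounded constants as the example. The function name is illustrative, and the value of
$\overline{r^2}$ used for the "representative solid" is an assumption, since the text quotes only $Z$ and $N$.

```python
import math

MU0 = 4 * math.pi * 1e-7   # permeability of free space, henrys/m

def langevin_diamagnetic_susceptibility(n_atoms, z, r2_mean, e=1.6e-19, m=9.1e-31):
    """chi_m = -mu0 * N * Z * e^2 * <r^2> / (6 m), with rounded e and m as defaults."""
    return -MU0 * n_atoms * z * e**2 * r2_mean / (6 * m)

# Hydrogen gas of Example 7.3: Z = 1, N = 5.36e25 atoms/m^3, <r^2> = 3 a0^2.
a0 = 0.529e-10
chi_h2 = langevin_diamagnetic_susceptibility(n_atoms=5.36e25, z=1, r2_mean=3 * a0**2)
print(f"hydrogen gas: chi_m = {chi_h2:.2e}")

# Representative solid: Z = 10, N = 5e28 atoms/m^3.
# ASSUMPTION: <r^2> = (1e-10 m)^2, a typical atomic dimension.
chi_solid = langevin_diamagnetic_susceptibility(n_atoms=5e28, z=10, r2_mean=(1e-10)**2)
print(f"representative solid: chi_m = {chi_solid:.1e}")
```

The first result reproduces the figure of the example. The second falls in the $10^{-5}$ range,
in line with the order of magnitude of the entries in Table 7.1.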

7.8 PERMANENT MAGNETIC MOMENTS


In the previous section diamagnetic effects were seen to arise in materials due to
alterations in electronic orbital motion induced by an external magnetic field. Thus
diamagnetism is present in all materials, and is in no way dependent on whether or
not the individual molecules possess permanent magnetic moments. However, the
explanation of all other magnetic phenomena (i.e., paramagnetism and the various forms
of ferromagnetism) rests on the assumption that the molecules of materials which
exhibit these phenomena do contain permanent magnetic moments. For this reason it
is desirable to review the origin of molecular magnetization in order to determine the
manner in which permanent moments can occur.
It will be assumed that the reader has some familiarity with quantum mechanics and
is acquainted with the solutions of Schrodinger's equation applicable to a hydrogen
atom.$^{31}$ These solutions are characterized by a set of quantum numbers having the
following properties:

1. The principal quantum number $n$ determines the electron orbital energy. It can
have the positive integral values $n = 1, 2, 3, 4, \ldots$. The corresponding electronic
shells are called respectively the K, L, M, N, ... shells.
2. The orbital angular momentum of the electron is determined by the quantum
number $l$ which can assume the values $l = 0, 1, 2, \ldots, (n - 1)$. The corresponding
spectroscopic designations for these $l$ states are s, p, d, f, g, .... The value of the
orbital angular momentum is given by $\hbar[l(l + 1)]^{1/2}$, in which $\hbar = h/2\pi = 1.054 \times 10^{-34}$
joule sec is known as the reduced Planck's constant.
3. The azimuthal quantum number $m_l$ indicates the allowed components of orbital
angular momentum along a given direction (e.g., the direction of a magnetic field). It
may have the values $m_l = 0, \pm 1, \pm 2, \ldots, \pm l$.
4. In addition an electron is found to have spin, and the angular momentum associated
with this spin has the allowed values $\pm\hbar/2$ along a magnetic field direction. For
this reason a fourth quantum number $m_s$ is introduced and is permitted the values $\pm\tfrac{1}{2}$.
5. The probability density distribution for the electron's position is governed in the
radial dimension by $n$ and $l$ and in the angular coordinates by $l$ and $m_l$.

30 C. Kittel, op. cit., p. 209.
31 An excellent introductory treatment is given by C. W. Sherwin, Introduction to Quantum Mechanics,
Holt, Rinehart and Winston, New York, 1960.
Upon invoking the Pauli exclusion principle (which states that no two electrons of
the same atom may have the same four quantum numbers), one is able to extend these
solutions qualitatively to other elements than hydrogen and to explain the arrangement
of the elements in the periodic table.$^{32}$ If the quantum numbers for the ground state of
hydrogen are chosen to be $n = 1$, $l = 0$, $m_l = 0$, $m_s = \tfrac{1}{2}$, then when the above
development is extended to helium, which has two electrons, its ground state must be such that
the additional electron assumes the set of quantum numbers $n = 1$, $l = 0$, $m_l = 0$,
$m_s = -\tfrac{1}{2}$. This fills the K shell so that when lithium is considered the third electron in
its ground state must adopt the principal quantum number $n = 2$, thus initiating the
population of the L shell. In this manner a progression of sets of quantum numbers is
seen to accompany the additional electrons which go into the formation of the higher
elements. Table 7.2 indicates the numbers of electrons found in the various $n$ and $l$
states for the first 36 elements in the periodic table.
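The counting that underlies this filling of shells can be made explicit. The following Python
sketch enumerates every set of four quantum numbers permitted by the rules listed above and
counts the states in each shell. The function and variable names are illustrative only.

```python
from fractions import Fraction

def allowed_states(n):
    """All (n, l, m_l, m_s) sets permitted for principal quantum number n."""
    half = Fraction(1, 2)
    return [(n, l, ml, ms)
            for l in range(n)                  # l = 0, 1, ..., n - 1
            for ml in range(-l, l + 1)         # m_l = -l, ..., +l
            for ms in (half, -half)]           # m_s = +1/2, -1/2

shell_names = {1: "K", 2: "L", 3: "M", 4: "N"}
for n, name in shell_names.items():
    print(f"{name} shell (n = {n}): {len(allowed_states(n))} states")

# The 3d subshell is the subset of the M shell with l = 2.
states_3d = [s for s in allowed_states(3) if s[1] == 2]
print(f"3d subshell: {len(states_3d)} states")
```

Since the exclusion principle allows one electron per state, the K and L shells hold 2 and 8
electrons, which is why they close at helium ($Z = 2$) and neon ($Z = 10$).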
In the light of these relationships, the various contributions to a possible permanent
magnetic moment of a free atom or ion may be assessed as indicated in the following
sections.
1. Electron orbital motion. Equation (7.41) gives the relation between the orbital
angular momentum of an electron and its orbital magnetic moment. Though developed
through use of a classical argument, it is equally valid under a quantum mechanical
derivation. Thus the quantum numbers $l$ and $m_l$ govern the orbital contribution an
electron makes to the magnetic moment. For example, if $l = 2$, the total orbital angular
momentum is
$$\mathcal{L} = \hbar[l(l + 1)]^{1/2} = \sqrt{6}\,\hbar$$
and the allowed components of angular momentum along a field direction may be
deduced from Figure 7.4. If for convenience the field is assumed to be in the $Z$ direction,
then it is evident that
$$\mathcal{L}_z = \frac{m_l}{[l(l + 1)]^{1/2}}\,\mathcal{L} = m_l \hbar \tag{7.46}$$

32 See, e.g., L. V. Azaroff and J. J. Brophy, Electronic Processes in Materials, Chap. 3, McGraw-Hill
Book Company, New York, 1963.

TABLE 7.2
THE ARRANGEMENT OF ELECTRONS IN THE FIRST 36 ELEMENTS

                     K     L          M               N
                     n=1   n=2        n=3             n=4
Atomic               l=0   l=0  l=1   l=0  l=1  l=2   l=0  l=1  l=2  l=3
number Z   Element   s     s    p     s    p    d     s    p    d    f

1          H         1
2          He        2

3          Li        2     1
4          Be        2     2
5          B         2     2    1
6          C         2     2    2
7          N         2     2    3
8          O         2     2    4
9          F         2     2    5
10         Ne        2     2    6

11         Na        2     2    6     1
12         Mg        2     2    6     2
13         Al        2     2    6     2    1
14         Si        2     2    6     2    2
15         P         2     2    6     2    3
16         S         2     2    6     2    4
17         Cl        2     2    6     2    5
18         A         2     2    6     2    6

19         K         2     2    6     2    6          1
20         Ca        2     2    6     2    6          2
21         Sc        2     2    6     2    6    1     2
22         Ti        2     2    6     2    6    2     2
23         V         2     2    6     2    6    3     2
24         Cr        2     2    6     2    6    5     1
25         Mn        2     2    6     2    6    5     2
26         Fe        2     2    6     2    6    6     2
27         Co        2     2    6     2    6    7     2
28         Ni        2     2    6     2    6    8     2
29         Cu        2     2    6     2    6    10    1
30         Zn        2     2    6     2    6    10    2
31         Ga        2     2    6     2    6    10    2    1
32         Ge        2     2    6     2    6    10    2    2
33         As        2     2    6     2    6    10    2    3
34         Se        2     2    6     2    6    10    2    4
35         Br        2     2    6     2    6    10    2    5
36         Kr        2     2    6     2    6    10    2    6

From (7.41), the allowed values of orbital magnetic moment along the field direction
are seen to be
$$\ldots,\; -2\,\frac{e\hbar}{2m},\; -\frac{e\hbar}{2m},\; 0,\; \frac{e\hbar}{2m},\; 2\,\frac{e\hbar}{2m},\; \ldots$$
The quantity $m_0 = (e\hbar/2m) = 9.27 \times 10^{-24}$ amp m$^2$ is commonly called a Bohr
magneton. The orbital magnetic moment of an electron along a field direction
may then be said to have the value $-m_l$ Bohr magnetons.
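As a numerical check, the short Python sketch below evaluates the Bohr magneton from rounded
constants and lists the allowed field-direction components of orbital magnetic moment for the
$l = 2$ case discussed above. The names are illustrative, and the constants are rounded values
rather than quantities taken from the text (apart from $\hbar$).

```python
import math

E_CHARGE = 1.602e-19   # electron charge, coulombs
E_MASS = 9.109e-31     # electron mass, kg
HBAR = 1.054e-34       # reduced Planck's constant, joule sec

bohr_magneton = E_CHARGE * HBAR / (2 * E_MASS)   # e*hbar/(2m), amp m^2

def orbital_moment_components(l):
    """Allowed components along the field, -m_l Bohr magnetons, in amp m^2."""
    return [-ml * bohr_magneton for ml in range(-l, l + 1)]

l = 2
total_L = HBAR * math.sqrt(l * (l + 1))          # magnitude of orbital angular momentum
print(f"Bohr magneton = {bohr_magneton:.3e} amp m^2")
print(f"total orbital angular momentum for l = 2: {total_L / HBAR:.3f} hbar")
for ml, mz in zip(range(-l, l + 1), orbital_moment_components(l)):
    print(f"m_l = {ml:+d}: moment along field = {mz / bohr_magneton:+.0f} Bohr magnetons")
```

For $l = 2$ there are five allowed components, one for each value of $m_l$, even though the
total angular momentum has the single magnitude $\sqrt{6}\,\hbar$.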

FIGURE 7.4 Allowed components of orbital angular momentum along a field direction.

If one recalls the discussion leading to the construction of the periodic table in the
form of Table 7.2, it is apparent that a completely filled electronic shell makes no net
contribution to the orbital magnetic moment of an atom, this being due to the fact that
electrons in the shell may be paired in terms of their equal and opposite $m_l$ quantum
numbers. A resultant orbital magnetic moment can occur only in atoms containing
incompletely filled electronic shells. Even then, the resultant for a large number of
similar atoms may be zero if the occurrence of a positive value for $m_l$ is just as frequent
as the occurrence of the corresponding negative value.
Among those elements containing incompletely filled shells particular interest
attaches to the transition elements 21 through 28 (the iron group). The partially filled
3d set of states for these elements indicates that individual atoms have a net orbital
magnetic moment, and this is confirmed by experiment. Despite this, the contribu-
tions these moments make to the magnetic properties of the iron group elements in the
solid state prove to be negligible. This result may be explained by noting that, in the
iron group, the partially filled shell lies near the outer boundary of the atom and thus
is strongly influenced by neighboring atoms. The interaction of adjacent atoms proves
so strong that the orbital magnetic moments of the individual atoms cannot orient
themselves in an external field. These moments are therefore "quenched" and their
immobility is similar to that of the rigid permanent electric dipole moments found in
some dielectrics.
This behavior may be contrasted with that of the transition elements 58 through 71
(the rare earths), whose incomplete shells lie deeper inside the atom, and are thus
shielded from the influence of neighboring atoms. The orbital magnetic moments of
these rare earth elements are not quenched and do make a contribution to the magnetic
susceptibility.
2. Electron spin. Since the spin angular momentum of an electron can be $\pm\hbar/2$
along a field direction, Equation (7.41) would suggest that the electron spin contributes
a magnetic moment of $\pm\tfrac{1}{2}$ Bohr magneton in this direction. However, (7.41),
though applicable to orbital angular momentum, is not valid for spin. It has been
found that the proper relation is
$$\mathbf{m}_{spin} = -2.0023\,\frac{e}{2m}\,\boldsymbol{\mathcal{L}}_{spin} \approx -\frac{e\hbar}{2m}\,\mathbf{1}_{spin} \tag{7.47}$$

Therefore the electron spin gives rise to approximately one Bohr magneton along the
direction (or opposite) of a field B.
If one refers again to the discussion concerning the construction of the periodic table,
the Pauli exclusion principle is seen to require that a completely filled electronic shell
have equal numbers of electrons whose spins are aligned and counteraligned with a
magnetic field direction. For this reason only partially filled shells can make a net spin
contribution to the magnetic moment of an atom. Herein lies the explanation for the
strong magnetic behavior of the iron group of transition elements. Table 7.2 shows an
orderly filling of the K, L, and M shells for the first 18 elements. Beginning with
potassium, the 3d states are first passed over while the two 4s states are filled, and
then with scandium the population of the 3d states begins. There is a total of ten 3d
states ($m_l = 0, \pm 1, \pm 2$; $m_s = \pm\tfrac{1}{2}$) and these are filled in such a way that the first five
electrons have like spins; not until the sixth electron is added does any spin cancellation
occur.
This process is illustrated by Table 7.3 in which the spins of the 3d electrons are
schematically represented as either up or down. Two variances in the orderly progres-
sion of population may be noted: chromium and copper each add two electrons to the
3d level at the expense of one electron from the 4s level (cf. Table 7.2). The net number
of Bohr magnetons (in round figures) due to spin is noted in the last column for each of
the elements in this series. The free atom of manganese may be seen to have the
largest spin magnetic moment.
It should be emphasized that Table 7.3 refers to single free atoms of the transition
elements. The situation is more complicated in the solid state, since adjacent atoms can
exert a strong local effect on the orientation of the magnetic moments of their neighbors.
Table 7.4 lists the net magnetic moment per atom in the solid state for several
of the iron group elements. The effect of neighboring atoms is seen to reduce the magnetic
moment from the value which would occur with a single free atom.
3. Nuclear spin. Since the nucleus of an atom contains a net charge, it is also a
contributor to magnetic moment through the agency of nuclear spin. It is found that
the angular momentum associated with nuclear spin is of the same order of magnitude
as the angular momenta connected to electron spin and electron orbital motion. However,
since the nuclear mass is three orders of magnitude larger than the mass of an
electron, the magnetic moment caused by nuclear spin is of the order of $10^{-3}$ Bohr
magnetons. For this reason the nuclear contribution to magnetic susceptibility is almost
invariably overshadowed by the electronic contributions.

TABLE 7.3
POPULATION OF THE 3d STATE FOR THE IRON GROUP TRANSITION ELEMENTS

Atomic                   Number of       Spin direction of     Net spin magnetic moment
number     Element       3d electrons    3d electrons          in Bohr magnetons

21         Scandium      1               ↑                     1
22         Titanium      2               ↑↑                    2
23         Vanadium      3               ↑↑↑                   3
24         Chromium      5               ↑↑↑↑↑                 4†
25         Manganese     5               ↑↑↑↑↑                 5
26         Iron          6               ↑↑↑↑↑↓                4
27         Cobalt        7               ↑↑↑↑↑↓↓               3
28         Nickel        8               ↑↑↑↑↑↓↓↓              2
29         Copper        10              ↑↑↑↑↑↓↓↓↓↓            1†

† Including the contribution from the single 4s electron.

4. Spin-orbit coupling. If the nuclear contribution is neglected, the total magnetic
moment of a single atom or ion may be determined by adding the orbital and spin
contributions of the constituent electrons. An electron possessing the quantum numbers
$m_l$ and $m_s$ will contribute essentially $m_l + 2m_s$ Bohr magnetons of magnetic
moment along an external field direction. The total magnetic moment along this direction
may then be found by summing the individual values of $m_l + 2m_s$ associated with
the different electrons present in the atom. Filled shells make no net contribution, so
the summation can be confined to incompletely filled shells.

TABLE 7.4
MAGNETIC MOMENT PER ATOM IN THE SOLID STATE FOR IRON GROUP
TRANSITION ELEMENTS

Element Sc Ti V Cr Mn Fe Co Ni

m in Bohr magnetons ... 0 0.4 1.1 2.22 1.71 0.61
The result of this computation may be related to the total angular momentum of the
electron cloud. If vectors $\boldsymbol{\mathcal{J}}$, $\boldsymbol{\mathcal{L}}$, $\boldsymbol{\mathcal{S}}$, of lengths $\hbar[J(J + 1)]^{1/2}$, $\hbar[L(L + 1)]^{1/2}$, $\hbar[S(S + 1)]^{1/2}$
are chosen to represent, respectively, the total angular momentum, the total orbital
angular momentum, and the total spin contribution to momentum of the electron
cloud, then
$$\boldsymbol{\mathcal{J}} = \boldsymbol{\mathcal{L}} + \boldsymbol{\mathcal{S}} \tag{7.48}$$
The quantities $L$, $S$, and $J$ are the magnitudes of the algebraic sums of the $m_l$ values,
the $m_s$ values, and the $m_l + m_s$ values for the individual electrons; i.e.,
$$L = \left|\sum m_l\right| \qquad S = \left|\sum m_s\right| \qquad J = \left|\sum (m_l + m_s)\right|$$
Which values of $m_l$ and $m_s$ the individual electrons of the atom are allowed to assume
is governed by the Pauli exclusion principle and the energy state of the atom. In general
the allowed quantum numbers are such that $L$ and $S$ are not independent and
therefore it is proper to say that there is spin-orbit coupling. For a single isolated atom
or ion in the ground state, the values of $S$, $L$, and $J$ can be determined through
application of Hund's rules:
1. The electron spins are so oriented as to yield the maximum value of $S$ consistent
with the Pauli exclusion principle.
2. The orbital angular momenta are arranged so as to maximize $L$ for the value of $S$
found above.
3. If a partially filled shell is less than half occupied, $J = L - S$; if it is more than
half occupied, $J = L + S$.
The first of these three rules is evident in the electronic configuration of Table 7.3.
The total angular momentum of a free atom or ion can assume only a discrete set of
orientations with respect to the direction of a field B. If a construction identical with
that given in Figure 7.4 is used, the allowed orientations are such that the component
of the total angular momentum along the field direction is $\hbar$ multiplied by one of the
numbers
$$-J,\; -(J - 1),\; \ldots,\; (J - 1),\; J \tag{7.49}$$
The experimental discovery by Stern and Gerlach of this quantization in orientation
has been noted in Section 7.1.
The orbital contribution to magnetic moment for an individual electron is related to
the electron's orbital angular momentum by Equation (7.41). Similarly, the spin
contribution which this electron makes to magnetic moment is related to its spin angular
momentum by Equation (7.47). When these relations are summed over all the electrons
comprising the cloud of the free atom or ion, one may write
$$\mathbf{m} = -g_J\,\frac{e}{2m}\,\boldsymbol{\mathcal{J}} \tag{7.50}$$
in which $\mathbf{m}$ is the magnetic moment for the entire free atom or ion, and $g_J$ is called the
Landé splitting factor. If each electron spin is weighted with a $g$ factor of 2, in accordance
with the approximate form of (7.47), and if each orbital contribution is weighted
with a $g$ factor of 1, in accordance with (7.41), it develops that $g_J$ is given by the Landé
formula.$^{33}$
$$g_J = 1 + \frac{J(J + 1) + S(S + 1) - L(L + 1)}{2J(J + 1)} \tag{7.51}$$

33 A derivation may be found in G. Herzberg, Atomic Spectra and Atomic Structure, 2d ed., pp. 109-111,
Dover Publications, Inc., New York, 1944.

The name splitting factor for $g_J$ arises from the fact that the presence of a magnetic
field causes the potential energy to depend on orientation of the magnetic moment,
according to the relation $U = -\mathbf{m} \cdot \mathbf{B}$ (cf. Example 4.7). With the aid of (7.50), this
gives
$$U = -g_J\,\frac{e\hbar}{2m}\,pB = -g_J\, p\, m_0 B \tag{7.52}$$
in which $p$ is one of the quantum numbers in the sequence (7.49). Therefore successive
allowed orientations of the magnetic moment with respect to the external field direction
have energy levels which differ by $g_J$ Bohr magnetons per gauss. This difference
in energy levels is evident in the spectrum (Zeeman effect) and the splitting of the
spectral lines is the cause for the naming of $g_J$.
In a crystalline solid the coupling of adjacent atoms modifies the magnetic moment
per atom and the quantization of orientation. The splitting of spectral lines is then
denoted by the more general spectroscopic splitting factor g. The Lande factor gJ
defined above may be looked upon as the spectroscopic splitting factor for the special
case of an isolated atom or ion.
EXAMPLE 7.4
Consider the Fe$^{2+}$ ion which, according to Table 7.2, has six electrons in the 3d shell and
an empty 4s shell. According to the first of Hund's rules, five of the 3d electrons will align
their spins, the sixth will oppose, and thus $S = 2$. The five aligned electrons must take the
$m_l$ values $-2, -1, 0, +1, +2$, respectively, but the sixth electron may take any of these
values. The second of Hund's rules then gives $L = 2$. Since the 3d shell is more than half
occupied, the third of Hund's rules gives $J = 4$. Use of (7.51) yields the information that
the Landé splitting factor is $g_J = 1.5$ for an Fe$^{2+}$ ion.
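The bookkeeping in this example lends itself to a short program. The Python sketch below
applies Hund's three rules to a single partially filled shell and then evaluates the Landé
formula $g_J = 1 + [J(J+1) + S(S+1) - L(L+1)]/2J(J+1)$ with exact fractions. The function names
are illustrative, and the filling order used (spin-up states first, largest $m_l$ first) is
one convenient way of realizing the first two rules, not a procedure prescribed by the text.

```python
from fractions import Fraction

def hund_ground_state(n_electrons, l):
    """Return (S, L, J) for n_electrons in one partially filled shell of orbital number l."""
    capacity = 2 * (2 * l + 1)
    if not 0 < n_electrons < capacity:
        raise ValueError("the shell must be partially filled")
    ml_order = range(l, -l - 1, -1)                    # +l, ..., -l
    half = Fraction(1, 2)
    # Rule 1: spin-up states are occupied first. Rule 2: largest m_l first.
    slots = [(ml, half) for ml in ml_order] + [(ml, -half) for ml in ml_order]
    occupied = slots[:n_electrons]
    S = abs(sum(ms for _, ms in occupied))
    L = abs(sum(ml for ml, _ in occupied))
    # Rule 3: J = L - S up to half filling (where L = 0), otherwise J = L + S.
    J = abs(L - S) if 2 * n_electrons <= capacity else L + S
    return S, L, J

def lande_g(S, L, J):
    """Lande splitting factor; requires J > 0."""
    S, L, J = Fraction(S), Fraction(L), Fraction(J)
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

S, L, J = hund_ground_state(6, 2)                      # Fe2+: six electrons in the 3d shell
print(f"Fe2+ (3d6): S = {S}, L = {L}, J = {J}, g_J = {lande_g(S, L, J)}")

S, L, J = hund_ground_state(3, 2)                      # three electrons in the 3d shell
print(f"3d3: S = {S}, L = {L}, J = {J}, g_J = {lande_g(S, L, J)}")
```

For six 3d electrons the sketch returns $S = 2$, $L = 2$, $J = 4$ and $g_J = 3/2$, in agreement
with the example. A configuration with $J = 0$ has no defined splitting factor, so `lande_g`
should not be called in that case.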

7.9 PARAMAGNETISM
The paramagnetic effect occurs in certain materials whose individual atoms possess
permanent magnetic moments which are randomly oriented in the absence of an
external magnetic field. The presence of an external field causes some net alignment
of these magnetic moments with the field direction, and thus some magnetization of
the specimen. Under normal circumstances this magnetization is slight and the
paramagnetic substance exhibits only a small positive susceptibility.
The discussion of the preceding section indicates that permanent atomic magnetic
moments occur only in atoms or ions whose electronic configuration includes one or more
incompletely filled shells. For this reason paramagnetic materials are found among
those compounds containing transition group elements. Of these the iron group
compounds and the rare earth group compounds have received the most extensive
investigation.
An analysis of the behavior of paramagnetic materials can be undertaken by
considering a specimen containing $N$ atoms per unit volume, the total angular momentum
quantum number of each atom being $J$ (cf. Section 7.8). In accordance with (7.50)
there is a magnetic moment per atom $\mathbf{m}$, associated with this angular momentum,
whose magnitude is
$$m = g_J [J(J + 1)]^{1/2}\, m_0 = p_{eff}\, m_0 \tag{7.53}$$
in which $p_{eff}$ is the effective number of Bohr magnetons per atom.
If these atomic magnetic moments were assumed to be freely rotating, the classical
Langevin theory would be applicable, with the development paralleling what has
been presented in Section 6.8 for orientational polarization in dielectric materials. From
Example 4.7 the magnetic potential energy of an atom would be $-\mathbf{m} \cdot \mathbf{B}_{loc}$. For
paramagnetic materials this could be written $-\mathbf{m} \cdot \mathbf{B}$, since the local interactions are weak
and $\mathbf{B}_{loc}$ is essentially equal to the macroscopic field $\mathbf{B}$. The net magnetization per unit
volume would then be
$$M = N p_{eff}\, m_0\, \mathcal{L}(a) \tag{7.54}$$
with $\mathcal{L}(a) = \coth a - 1/a$ the Langevin function, and $a = p_{eff}\, m_0 B/kT$.


However, as noted in Section 7.8, the atoms are not freely rotating but are restricted
to a discrete set of orientations relative to the direction of the applied magnetic field.
The possible components of magnetic moment along the field direction are given by
$p g_J m_0$ in which $p$ has one of the values listed in the sequence (7.49).
The magnetic potential energy of an atom when it is in one of these positions is then
given by (7.52) and the relative probability of finding the atom so oriented is
$\exp(p g_J m_0 B/kT)$. Summing over the probability distribution of $N$ atoms per unit volume
gives for the magnetization
$$M = N\, \frac{\sum_{p=-J}^{J} p g_J m_0\, e^{p g_J m_0 B/kT}}{\sum_{p=-J}^{J} e^{p g_J m_0 B/kT}} \tag{7.55}$$
Since both numerator and denominator of (7.55) are geometric and/or arithmetic
progressions, they can be summed to give
$$M = N g_J J m_0\, \mathcal{B}_J(x) \tag{7.56}$$
in which
$$\mathcal{B}_J(x) = \frac{2J + 1}{2J} \coth\left[\frac{(2J + 1)x}{2J}\right] - \frac{1}{2J} \coth\left(\frac{x}{2J}\right) \tag{7.57}$$
is called the Brillouin function, with $x = g_J J m_0 B/kT$.


For large $J$ the Brillouin function is plotted in Figure 7.5, and is seen to be essentially
equivalent to the Langevin function (Figure 6.9). The magnetization formula (7.56)
then reduces to the classical expression (7.54).
A study of Figure 7.5 reveals that as $x$ becomes large, $\mathcal{B}_J(x)$ tends to a limit of
unity and therefore the magnetization density approaches the saturation value
$M_{sat} = N p_{eff}\, m_0$. One interpretation which may be placed on this result is that, at a
constant temperature, as $B$ is increased, a greater percentage of the atomic magnetic
moments spend a greater percentage of the time being aligned with the field. The
saturation magnetization corresponds to total alignment of all atomic magnetic moments.
The formulas also indicate that alignment is easier as the temperature is lowered.
Even for small $J$, normal temperatures and field values imply that $x \ll 1$ and the
Brillouin function (7.57) then reduces to $\mathcal{B}_J(x) = (J + 1)x/3J$. (See Problem 7.10
at the end of this chapter.) The magnetization, as given by (7.56), simplifies to
$$M = \frac{N g_J^2 J(J + 1)\, m_0^2 B}{3kT} = \frac{N p_{eff}^2\, m_0^2 B}{3kT} \tag{7.58}$$
This is also the classical limit of (7.54) when $a$ is small, and thus the quantum and classical
theories yield the same expression for magnetization under normal circumstances.
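The limiting behavior described above can be verified numerically. The following Python sketch
implements the Brillouin function in the form
$\mathcal{B}_J(x) = \frac{2J+1}{2J}\coth\frac{(2J+1)x}{2J} - \frac{1}{2J}\coth\frac{x}{2J}$ and the Langevin function $\coth a - 1/a$,
and then compares them in the three regimes of interest. The function names and the sample
values of $J$ and $x$ are illustrative choices only.

```python
import math

def brillouin(J, x):
    """Brillouin function B_J(x) for x > 0."""
    a = (2 * J + 1) / (2 * J)
    b = 1 / (2 * J)
    return a / math.tanh(a * x) - b / math.tanh(b * x)   # coth(u) = 1/tanh(u)

def langevin(a):
    """Classical Langevin function coth(a) - 1/a for a > 0."""
    return 1 / math.tanh(a) - 1 / a

J = 2
small_x = 0.01
print(f"small x:  B_J = {brillouin(J, small_x):.6f}, (J+1)x/3J = {(J + 1) * small_x / (3 * J):.6f}")
print(f"large x:  B_J = {brillouin(J, 50.0):.6f}  (saturation)")
print(f"large J:  B_J = {brillouin(500, 1.0):.4f}, Langevin = {langevin(1.0):.4f}")
```

The small-$x$ value agrees with the linear form $(J + 1)x/3J$, the large-$x$ value approaches
unity, and for large $J$ the Brillouin and Langevin functions nearly coincide.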
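Using (7.28), and recognizing that the susceptibility of paramagnetic materials is
small, one may write
$$\chi_m = \frac{M}{\mu_0^{-1} B} = N \mu_0\, g_J J m_0\, \frac{\mathcal{B}_J(x)}{B} \tag{7.59}$$
For $x \ll 1$, this reduces to
$$\chi_m = \frac{N \mu_0\, p_{eff}^2\, m_0^2}{3kT} = \frac{C}{T} \tag{7.60}$$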

in which $C$ is called the Curie constant; Equation (7.60) is known as the Curie law of
paramagnetism. It may be contrasted to (6.57), in which the orientational contribution
to dielectric susceptibility was seen to be $N/3kT\epsilon_0$ times the square of the permanent
electric dipole moment per atom. A good experimental verification of the Curie law
of paramagnetism is offered in Figure 7.6.†

FIGURE 7.5 The Brillouin function for large J. [Plot of $\mathcal{B}_J(x)$ against $x$ for $x$ from 0 to 7.]
An estimate of the size of the Curie constant may be obtained by taking $N \approx 4 \times 10^{28}$
atoms/m$^3$ and $p_{eff} = 2$, in which case $C \approx \tfrac{1}{2}$ °K. This means that at room temperature
one would expect paramagnetic materials to have a susceptibility of the order of $2 \times 10^{-3}$.
This prediction of the theory is borne out by the experimental data given in Table 7.5.
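This estimate can be reproduced with a few lines of Python. The sketch below assumes that the
Curie constant is the small-$x$ limit of (7.59), namely $C = N\mu_0 p_{eff}^2 m_0^2/3k$, and takes room
temperature to be 300 °K. Both are stated assumptions, and the constants are rounded values.

```python
import math

MU0 = 4 * math.pi * 1e-7    # permeability of free space, henrys/m
K_BOLTZ = 1.381e-23         # Boltzmann's constant, joules/deg K
BOHR_MAGNETON = 9.27e-24    # m_0, amp m^2

def curie_constant(n_atoms, p_eff):
    """C = N * mu0 * (p_eff * m_0)^2 / (3k), ASSUMED from the small-x limit of (7.59)."""
    return MU0 * n_atoms * (p_eff * BOHR_MAGNETON)**2 / (3 * K_BOLTZ)

C = curie_constant(n_atoms=4e28, p_eff=2.0)
chi_room = C / 300.0        # Curie law, chi_m = C/T, at an assumed T = 300 deg K
print(f"C = {C:.2f} deg K")
print(f"chi_m at room temperature = {chi_room:.1e}")
```

The result is a Curie constant somewhat below one-half degree and a room-temperature
susceptibility between $10^{-3}$ and $2 \times 10^{-3}$, the same order as the entries of Table 7.5.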
In measuring the susceptibility of a paramagnetic substance, one must accept the
fact that there is present a diamagnetic contribution. However, the diamagnetic and
paramagnetic contributions are additive, and the results of Section 7.7 indicate that
the diamagnetic contribution is normally two orders of magnitude below the
paramagnetic contribution at room temperature and thus can be ignored.
† Miss Hupse's original results were expressed in terms of $\chi'$, the susceptibility in e.m.u. per gram.
The dimensionless susceptibility $\chi_m$ has been determined using the relation $\chi_m = 4\pi\rho\chi'$, with $\rho$ the
density in gm/cc. In the temperature range shown, $\rho$ varies but little.

FIGURE 7.6 Reciprocal paramagnetic susceptibility of powdered CuSO₄·K₂SO₄·6H₂O
vs. temperature. [After J. C. Hupse, Physica, 9, 633; July 1942.]

Experimental verification of the magnetization formula (7.56) has also been achieved
by holding a variety of paramagnetic materials at a low temperature and varying the
applied field. A clear illustration of the saturation effect in $\mathcal{B}_J(x)$ is found in Figure 7.7.
Agreement between theory and experiment is seen to be excellent.

TABLE 7.5
THE ROOM TEMPERATURE SUSCEPTIBILITIES
OF SOME PARAMAGNETIC MATERIALS

Material        10^3 χ_m      Material      10^3 χ_m
FeSO₄           2.8           Fe₂O₃         1.4
NiSO₄           1.2           Cr₂O₃         1.7
MnSO₄           3.6           FeCl₂         3.7
CoSO₄·H₂O       2.0           CrCl₃         1.5

With the aid of Hund's rules and the Landé formula for $g_J$, the effective number of
Bohr magnetons can be calculated for various particles through use of (7.53). When
this is done for the trivalent positive rare earth ions, results such as those shown in
Table 7.6 are obtained. Representative experimental values are also listed, and the
agreement is seen to be quite good.

FIGURE 7.7 Magnetic moment vs. B/T for gadolinium sulfate octahydrate, for ferric
ammonium alum, and for potassium chromium alum. [After W. E. Henry, Phys. Rev., 88, 559;
1952.] [Data taken at 1.30, 2.00, 3.00, and 4.21 °K are compared with Brillouin curves; the
abscissa is B/T in webers/m$^2$/°K.]
When this same procedure is applied to the iron group ions, no such favorable
comparison with experiment is obtained, as is evident by comparing the third and fifth
columns of Table 7.7. However, if one assumes that the orbital momentum of these
ions is "quenched" by strong neighbor interactions (i.e., the orbital part of the magnetic
moment is unable to respond to an external field), this is tantamount to setting
$L = 0$, letting $J = S$ and $g_J = 2$. When $p_{eff}$ is calculated on this basis, the result is as
given in the fourth column of Table 7.7. These calculations are seen to be in much
better agreement with experiment.

TABLE 7.6
THE EFFECTIVE MOMENT IN BOHR MAGNETONS FOR VARIOUS
TRIVALENT RARE EARTH IONS AT ROOM TEMPERATURE

Ion       Configuration     p_eff = g_J[J(J+1)]^(1/2)     p_eff(exp)
Ce³⁺      4f¹5s²p⁶          2.54                          2.4
Nd³⁺      4f³5s²p⁶          3.62                          3.5
Gd³⁺      4f⁷5s²p⁶          7.94                          8.0
Dy³⁺      4f⁹5s²p⁶          10.63                         10.6
Er³⁺      4f¹¹5s²p⁶         9.59                          9.5
Yb³⁺      4f¹³5s²p⁶         4.54                          4.5
This difference in behavior of the rare earth and iron group salts is reasonable when
one considers the relative positions of the incomplete shells. In the rare earth C0111-
pounds the 4f shell is responsible for the paramagnetic effect, and this shell is well-
shielded from neighboring atoms by the completed outer 5s and 5p shells. However,

TABLE 7.7
THE EFFECTIVE MOMENT IN BOHR MAGNETONS FOR VARIOUS IRON GROUP IONS

Ion            Configuration   p_eff = g_J[J(J+1)]^(1/2)   p_eff = 2[S(S+1)]^(1/2)   p_eff(exp)†

Ti3+, V4+      3d^1            1.55                        1.73                      1.8
V3+            3d^2            1.63                        2.83                      2.8
Cr3+, V2+      3d^3            0.77                        3.87                      3.8
Mn3+, Cr2+     3d^4            0                           4.90                      4.9
Fe3+, Mn2+     3d^5            5.92                        5.92                      5.9
Fe2+           3d^6            6.70                        4.90                      5.4
Co2+           3d^7            6.54                        3.87                      4.8
Ni2+           3d^8            5.59                        2.83                      3.2
Cu2+           3d^9            3.55                        1.73                      1.9

† Representative values

in the iron group ions, the 3d shell is responsible for the paramagnetism; this is the
outermost shell and thus exposed to the intense local fields caused by neighboring
atoms. These local fields are responsible for the quenching of the orbital contribution
to magnetic moment.
EXAMPLE 7.5
For the Cr3+ ion, which has three electrons in the 3d shell, Hund's rules give S = 3/2 and
L = 3. Since this shell is less than half-occupied, J = L - S = 3/2. Use of the Landé formula
yields g_J = 0.4. From this, it follows that g_J[J(J+1)]^(1/2) = 0.77 and 2[S(S+1)]^(1/2) = 3.87,
which are two of the entries in Table 7.7. The low value of calculated g_J, namely 0.4, is seen
to be at wide variance with the value

$$\frac{p_{\text{eff}}(\text{exp})}{[S(S+1)]^{1/2}} = 1.97$$

calculated through use of the experimental data and the assumption that orbital angular
momentum is quenched.
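
The arithmetic of Example 7.5 can be checked with a short script. This is a minimal sketch, not part of the original text: the function names (`hund_ground_state`, `lande_g`, `p_eff_full`, `p_eff_spin_only`) are hypothetical, and the script assumes the standard statements of Hund's rules and the Landé formula g_J = 1 + [J(J+1) + S(S+1) - L(L+1)] / 2J(J+1).

```python
from fractions import Fraction
from math import sqrt


def hund_ground_state(l, n):
    """Return (S, L, J) for n electrons in a shell of orbital quantum number l.

    Assumes Hund's rules in their usual form (hypothetical helper).
    """
    half = 2 * l + 1              # number of orbitals in the shell
    capacity = 2 * half           # maximum number of electrons
    if not 0 < n < capacity:
        raise ValueError("shell must be partially filled")
    # Rule 1: maximize S (unpaired electrons are counted as holes past half filling).
    unpaired = n if n <= half else capacity - n
    S = Fraction(unpaired, 2)
    # Rule 2: maximize L by filling m_l = l, l-1, ..., -l, then repeating.
    m_values = list(range(l, -l - 1, -1))
    L = abs(sum(m_values[i % half] for i in range(n)))
    # Rule 3: J = |L - S| up to half filling, J = L + S beyond it.
    J = abs(L - S) if n <= half else L + S
    return S, L, J


def lande_g(S, L, J):
    """Lande g factor; returns 0.0 when J = 0, where the moment vanishes."""
    if J == 0:
        return 0.0
    return float(1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1)))


def p_eff_full(S, L, J):
    """Effective moment g_J [J(J+1)]^(1/2) in Bohr magnetons."""
    return lande_g(S, L, J) * sqrt(float(J * (J + 1)))


def p_eff_spin_only(S):
    """Effective moment 2 [S(S+1)]^(1/2) for a quenched orbital moment."""
    return 2.0 * sqrt(float(S * (S + 1)))


# Cr3+: three electrons in the 3d shell (l = 2).
S, L, J = hund_ground_state(2, 3)
print(S, L, J, lande_g(S, L, J), p_eff_full(S, L, J), p_eff_spin_only(S))
```

Calling `hund_ground_state(3, n)` treats the 4f shell of the rare earth ions in the same way.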

EXAMPLE 7.6
Since the limiting value of the Brillouin function is unity, Equation (7.56) indicates that
the saturation magnetization for a paramagnetic substance is

$$M_{\text{sat}} = N g_J J m_0$$

Under general conditions of temperature and applied field, the fractional magnetization is
then

$$\frac{M(x)}{M_{\text{sat}}} = \mathcal{B}_J(x)$$

At normal field intensities and temperatures, this reduces to

$$\frac{M(x)}{M_{\text{sat}}} = \frac{J+1}{J}\,\frac{x}{3} = \frac{[(J+1)/J]^{1/2}\, p_{\text{eff}}\, m_0 B}{3kT}$$

M(x)/M_sat may be interpreted as the fraction of atoms which have their magnetic
moments aligned with the B field. At room temperature, for the representative value
[(J+1)/J]^(1/2) p_eff = 2, the above equation gives

$$\frac{M(x)}{M_{\text{sat}}} = 1.5 \times 10^{-3}\, B$$

For the weak field B = 10⁻⁶ webers/m² (10⁻² gauss), only one atom in 10⁹ is seen to be
aligned. Even for the strong field B = 1 weber/m² (10⁴ gauss), only about one atom in a
thousand is aligned. This result for paramagnetic materials will be seen to be in sharp
contrast with the behavior of ferromagnetic substances, all of whose magnetic moments can be
aligned at room temperature.
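
The numbers quoted in Example 7.6 follow directly from the weak-field formula. The sketch below is illustrative only: the function name `fractional_magnetization` and the choice T = 300 °K for "room temperature" are assumptions, and the constants are rounded CODATA values.

```python
K_BOLTZMANN = 1.380649e-23     # joules per degree Kelvin
BOHR_MAGNETON = 9.2740e-24     # amp-m^2 (joules per weber/m^2)


def fractional_magnetization(B, T=300.0, moment_factor=2.0):
    """Weak-field M/M_sat = moment_factor * m_0 * B / (3 k T).

    moment_factor stands for [(J+1)/J]^(1/2) * p_eff, taken as 2 in the example.
    B is in webers/m^2 and T in degrees Kelvin.
    """
    return moment_factor * BOHR_MAGNETON * B / (3.0 * K_BOLTZMANN * T)


for B in (1e-6, 1.0):
    print(f"B = {B:g} webers/m^2 -> M/M_sat = {fractional_magnetization(B):.2e}")
```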

7.10 PROPERTIES OF FERROMAGNETIC MATERIALS

A substance is said to be ferromagnetic if it can exhibit a spontaneous magnetic
moment, that is, a magnetic moment even in the absence of an applied field. For a
given ferromagnetic material this spontaneous magnetization can occur only below a
critical temperature T_c, called the ferromagnetic Curie temperature. Well above the
Curie temperature, such materials are found to behave paramagnetically, and to have a
well-defined susceptibility which follows the Curie-Weiss law, namely,

$$\chi_m = \frac{C}{T - \theta} \tag{7.61}$$

in which C is the Curie constant and θ is called the paramagnetic Curie temperature.
θ is usually somewhat higher than T_c; for the five elements which display ferromagnetism,
the comparison of these Curie temperatures is given in Table 7.8. Ferromagnetic
alloys and compounds also conform to this pattern, as do some compounds
which do not even contain ferromagnetic elements.

TABLE 7.8
CURIE TEMPERATURES OF FERROMAGNETIC ELEMENTS

             Fe      Co      Ni      Gd      Dy

T_c (°K)     1043    1393    631     289     105

θ (°K)       1093    1428    650

A typical plot relating χ_m to temperature above the Curie point for a ferromagnetic
material is shown in Figure 7.8. Comparison with Figure 7.6, which typifies a paramagnetic
material, reveals a similarity in slope, and comparable values of χ_m for a given
temperature level. The principal difference in behavior between the two types of

[Figure 7.8: reciprocal susceptibility (vertical axis, 0 to 300) plotted against T (°K) (horizontal axis, 600 to 800).]

FIGURE 7.8 Reciprocal susceptibility of nickel vs. temperature. [After P. Weiss and R. Forrer,
Ann Phys. (Paris), 5, 153; 1926.]

material is that, for a truly paramagnetic material, θ = 0; also, for a ferromagnetic
material, as θ is approached from above, χ_m⁻¹ becomes a nonlinear function of temperature.
Stoner³⁴ has offered an explanation of this nonlinear behavior in the neighborhood
of the Curie point.
34 E. C. Stoner, Proc Leeds Phil Lit Soc, 3, 457; 1938.

Below the Curie temperature T_c ferromagnetic materials show a marked increase
in susceptibility, and also have a B-H characteristic which displays the familiar
hysteresis effect. This behavior is indicated by Figure 7.9 and can be explained in the
following way: Beginning with a virgin specimen (B = H = 0), if H is increased
(through application of a current in a coil wrapped around the specimen, for example),
B at first increases reversibly, and the curve oa is traced out. Further slight increases

[Figure 7.9: B plotted against H, showing the virgin curve from the origin o and the closed loop through the labeled points, with the saturation levels B_sat and -B_sat marked.]

FIGURE 7.9 B-H hysteresis loop of a ferromagnetic specimen.

in H cause large increases in B, until at position c on the curve saturation has set in.
When position d is reached, further increases in H cause negligible change in B, and a
saturation field B_sat is clearly delineated. If now H is decreased, the curve de results,
and a remanent field B_r is observed with H = 0. Since there is no longer an external
excitation, the specimen has become spontaneously magnetized. From (7.15) the magnetization
at this point on the curve is M = μ₀⁻¹B_r.
Reversal of the external current will cause a negative H field, resulting in further
decrease of B, and producing the curve segment ef. At a value -H_c, called the coercive

force, the B field is reduced to zero. Further decrease of H creates the segment fg, with
reverse saturation of the B field occurring at g. Another reversal of H traces out gijd,
the second half of the cycle, this being an inverse image of defg.
The incremental permeability of the specimen may be defined as the slope of the
B-H curve, and is obviously a function of B and the prior history of the specimen.
This incremental permeability may be extremely large, values of relative permeability
as high as 10⁵ being not uncommon. The initial permeability of the specimen is defined
as the slope of the virgin curve oab at the origin.
The coercive force varies widely from one ferromagnetic material to another and
is a property of great practical importance. In permanent magnet materials it should
be high, whereas in transformer materials it should be as low as possible. The wide
range of H_c is evident from a study of Table 7.9, in which the principal properties of
many ferromagnetic materials are listed.

TABLE 7.9
DATA FOR FERROMAGNETIC MATERIALS

High-permeability materials

Material            Percent composition               Maximum relative   Saturation flux density   Coercive force
                                                      permeability       B_sat (webers/m²)         H_c (amp/m)

Iron                99.91 Fe                          5,000              2.15                      80
Purified iron       99.95 Fe                          180,000            2.15                      4
Cold rolled steel   98.5 Fe                           2,000              2.10                      145
78 Permalloy        21.2 Fe, 78.5 Ni, 0.3 Mn          100,000            1.07                      4
Mu metal            18 Fe, 75 Ni, 2 Cr, 5 Cu          100,000            0.65                      4
Supermalloy         15.7 Fe, 79 Ni, 5 Mo, 0.3 Mn      800,000            0.80                      0.16

Permanent-magnet materials

Material              Percent composition                    Remanent flux density   Coercive force
                                                             B_r (webers/m²)         H_c (amp/m)

Carbon steel          98.1 Fe, 1 Mn, 0.9 C                   1.0                     4,000
Tungsten steel        94 Fe, 5 W, 0.3 Mn, 0.7 C              1.03                    5,600
Remalloy              71 Fe, 17 Mo, 12 Co                    1.05                    20,000
Alnico II (sintered)  64.5 Fe, 10 Al, 17 Ni, 2.5 Co, 6 Cu    0.69                    41,600
Alnico V              53 Fe, 8 Al, 14 Ni, 24 Co, 3 Cu        1.25                    44,000
Platinum-Cobalt       77 Pt, 23 Co                           0.45                    208,000

7.11 THE WEISS THEORY OF FERROMAGNETISM

Inspection of Table 7.9 indicates that the remanent flux density B_r for permanent
magnet materials is typically about 1 weber/m². Since H = 0 at this point on the
hysteresis curve, (7.15) gives M = μ₀⁻¹B ≈ 10⁶ amp/m. If each atom in the material
has a magnetic moment of the order of one Bohr magneton, then the atom density is
N = M/m₀ ≈ 10²⁹ atoms/m³, if all the atomic magnetic moments are aligned. Since
this is the right order of magnitude for the atom density of solid materials, it follows
that such a high remanent flux density can be explained only by assuming a spontaneous
magnetization in which essentially all the atoms of the material are so oriented
that their magnetic moments are parallel. This behavior in the absence of any external
magnetic field whatsoever is radically different from that which occurs in paramagnetic
materials. In Example 7.6 it was observed that even with so strong an external field
as 1 weber/m² (10,000 gauss), only one atom in a thousand within a paramagnetic
material has its magnetic moment aligned with the field.
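
The order-of-magnitude estimate in this paragraph can be reproduced as follows. This is a sketch with hypothetical function names; it assumes only M = B_r/μ₀ at H = 0 and one Bohr magneton per atom, as stated in the text.

```python
from math import pi

MU_0 = 4.0e-7 * pi            # webers/amp-m
BOHR_MAGNETON = 9.2740e-24    # amp-m^2


def magnetization_from_remanence(B_r):
    """M = B_r / mu_0 when H = 0 (amp/m)."""
    return B_r / MU_0


def aligned_atom_density(M, moment_in_bohr_magnetons=1.0):
    """Atom density N = M / (moment per atom), assuming full alignment (atoms/m^3)."""
    return M / (moment_in_bohr_magnetons * BOHR_MAGNETON)


M = magnetization_from_remanence(1.0)    # B_r of about 1 weber/m^2
N = aligned_atom_density(M)
print(f"M = {M:.2e} amp/m, N = {N:.2e} atoms/m^3")
```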
In order to interpret this phenomenon of total spontaneous magnetization in ferro-
magnetic materials, Pierre Weiss, in 1907, advanced the hypothesis that a strong
"molecular field," or local field, exists within the material and is the cause for total
alignment of the atomic magnetic moments. To account for the fact that even below
the Curie point a ferromagnetic specimen does not always exhibit so great an external
field as B_r, Weiss assumed as a second hypothesis that a macroscopic region of the
specimen may contain, if T < T_c, a number of subregions (domains), each of which
is spontaneously magnetized. All the atomic magnetic moments within one domain
are pictured as being aligned, but the direction of magnetization might vary from one
domain to the next. With the domains randomly magnetized the net bulk magnetiza-
tion would be zero; with all domains reinforcing, the net bulk magnetization would
result in the field B_r.
With these two hypotheses Weiss was able to explain the temperature dependence
of susceptibility above the Curie point, and the hysteresis effect below the Curie point.
His original theory made use of the classical Langevin analysis based on freely rotating
atomic magnetic dipoles. In the presentation which follows Weiss' theory is slightly
modified to account for the quantized nature of the field-directed component of atomic
magnetic moment.
The Weiss expression for the local field has been introduced in Section 7.4. In terms
of B fields it may be written

$$B_{\text{loc}} = B + (\gamma - 1)\mu_0 M \tag{7.62}$$

in which B is the macroscopic total field, defined in Section 7.3, and M is the magnetization,
defined in Section 7.2; γ is called the internal field constant, or the Weiss
constant.†
Suppose that the ferromagnetic specimen contains N atoms/m³, each with a total
angular momentum quantum number J. Repeating the analysis of Section 7.9, one
finds that the magnetization is once again given by

$$M = N g_J J m_0\, \mathcal{B}_J(x) \tag{7.63}$$

except that now

$$x = \frac{g_J J m_0 B_{\text{loc}}}{kT} = g_J J m_0\, \frac{B + (\gamma - 1)\mu_0 M}{kT} \tag{7.64}$$

† The original Weiss hypothesis was that H_loc = H + γM, but this is completely equivalent to (7.62).
Through use of (7.15), either may be deduced from the other.

This analysis is distinguished from the development previously given for paramagnetic
materials in that it is no longer assumed that B_loc ≈ B.
It is now desirable to consider separately two temperature regions.
Ferromagnetic Materials at High Temperatures. If T > T_c, the thermal agitation
is so great that the local field is not sufficient to maintain alignment of adjacent atomic
magnetic moments. The domain structure disappears, and individual magnetic moments
are free to assume any allowed orientation with respect to B_loc; the behavior is
completely analogous to paramagnetism except that the local field is not negligibly
different from the applied field. The low magnetization at these high temperatures
corresponds to x ≪ 1 in (7.64). But then the Brillouin function may be approximated
by ℬ_J(x) = (J + 1)x/3J and (7.63) becomes

$$M = \frac{[N(p_{\text{eff}} m_0)^2/3kT]\, B}{1 - \mu_0(\gamma - 1)N(p_{\text{eff}} m_0)^2/3kT} \tag{7.65}$$

With the application of (7.28) the magnetic susceptibility is found to be

$$\chi_m = \frac{N\mu_0(p_{\text{eff}} m_0)^2/3k}{T - \gamma N\mu_0(p_{\text{eff}} m_0)^2/3k} \tag{7.66}$$

which is seen to be in the form of the Curie-Weiss law (7.61). The paramagnetic Curie
temperature is therefore

$$\theta = \frac{\gamma N\mu_0(p_{\text{eff}} m_0)^2}{3k} \tag{7.67}$$

and the Curie constant is C = θ/γ.
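
The step from the magnetization (7.65) to the Curie-Weiss form can be verified numerically. The sketch below assumes that (7.15) is B = μ₀(H + M) and that (7.28) defines χ_m = M/H, which is how those equations are used here; the function names and the sample values of N, p_eff, and γ are illustrative assumptions, not data from the text.

```python
from math import pi

MU_0 = 4.0e-7 * pi            # webers/amp-m
K_BOLTZMANN = 1.380649e-23    # joules per degree Kelvin
BOHR_MAGNETON = 9.2740e-24    # amp-m^2


def curie_weiss_parameters(N, p_eff, gamma):
    """Return (C, theta) with theta from (7.67) and C = theta / gamma."""
    C = N * MU_0 * (p_eff * BOHR_MAGNETON) ** 2 / (3.0 * K_BOLTZMANN)
    return C, gamma * C


def susceptibility_from_local_field(T, B, N, p_eff, gamma):
    """chi_m = M/H, with M from (7.65) and H from B = mu_0 (H + M)."""
    a = N * (p_eff * BOHR_MAGNETON) ** 2 / (3.0 * K_BOLTZMANN * T)
    M = a * B / (1.0 - MU_0 * (gamma - 1.0) * a)
    H = B / MU_0 - M
    return M / H


# Illustrative (assumed) values: N in atoms/m^3, p_eff in Bohr magnetons.
N, p_eff, gamma, T = 8.5e28, 1.0, 5000.0, 1500.0
C, theta = curie_weiss_parameters(N, p_eff, gamma)
chi_direct = susceptibility_from_local_field(T, 1.0, N, p_eff, gamma)
print(theta, chi_direct, C / (T - theta))
```

The two printed susceptibilities should agree to rounding error for any T above θ, which is the content of (7.66).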
Ferromagnetic Materials at Temperatures Below the Curie Point. The Curie-Weiss
law (7.66) quite obviously cannot hold for T ≤ θ, for then the magnetic susceptibility
would pass through a pole and become negative. The presence of a pole in (7.66)
suggests the possibility that spontaneous magnetization can occur (i.e., that a finite
value of M is possible for H = 0). To develop this argument further, one can return
to (7.63) and consider the magnetization for values of x which are not necessarily small.
Since H = 0 for spontaneous magnetization, B = μ₀M. Insertion of this result in (7.64)
gives

$$x = \frac{\gamma \mu_0\, g_J J m_0\, M}{kT} \tag{7.68}$$

The variable x may be eliminated from (7.63) and (7.68) by simultaneous solution,
which permits the magnetization to be expressed as a function of temperature. This
elimination is facilitated by first normalizing both expressions so that they become

$$\frac{M(x)}{M_{\text{sat}}} = \mathcal{B}_J(x) = \frac{J+1}{3J}\,\frac{T}{\theta}\,x \tag{7.69}$$

in which M_sat is the limiting maximum value of (7.63), occurring when x → ∞. These
two functional forms of the fractional magnetization are plotted in Figure 7.10 for the
case J = 1 and for several values of T/θ. An intersection is seen to occur only for
T < θ. If this intersection is used to determine x, the fractional magnetization may be
plotted as a function of temperature, the result being as indicated in Figure 7.11. Also
shown in this figure are the results of following this procedure for the case J = 1/2 and
[Figure 7.10: M(x)/M_sat (vertical axis, 0 to 1.0) plotted against x (horizontal axis, 0 to 4.0), showing the Brillouin curve and the straight lines of (7.69); an intersection is marked for T < θ.]

FIGURE 7.10 Intersection of two functional representations of fractional magnetization.

[Figure 7.11: M(x)/M_sat (vertical axis, 0 to 1.0) plotted against T/θ (horizontal axis, 0 to 1.0), with experimental points for Fe, Ni, and Co.]

FIGURE 7.11 The spontaneous magnetization of iron, nickel, and cobalt vs. temperature.

for the classical Langevin case J → ∞. The experimental data for iron, cobalt, and
nickel are seen to fit the J = 1/2 curve quite well. Considering the fact that these three
materials have widely different paramagnetic Curie temperatures and saturation
magnetizations, the agreement between theory and experiment is even more significant.
The fit between experimental data and the J = 1/2 curve in Figure 7.11 also supports
the argument that the orbital contribution to magnetic moment is quenched in the
iron group elements (cf. Section 7.9). Further support comes from gyromagnetic
experiments, in which the g_J factor of a specimen is determined by the effect that
magnetization and angular momentum have on each other.³⁵
From the saturation magnetization M_sat and the volume density of atoms, one can
calculate the effective number of Bohr magnetons per atom for various materials. The
results of such calculations for the iron group elements were given in Table 7.4. The
nonintegral values may be explained by noting that, in the solid state, atomic energy
levels are broadened into bands, and adjacent atoms affect the electronic structure (and
thus the magnetization) of their neighbors.
Finally, the Weiss theory as embodied in Figure 7.10 indicates that the spontaneous
magnetization goes to zero when T = θ, for then an intersection of the two curves is
no longer possible. Thus this theory does not differentiate between the ferromagnetic
Curie temperature T_c and the paramagnetic Curie temperature θ. Table 7.8 reveals
that experimentally these two Curie points are not the same for real materials. However,
the difference is not great and the theory does afford a reasonable explanation
for all the principal observed phenomena.

EXAMPLE 7.7
If the Curie point θ = 1093°K is selected for iron from Table 7.8, at room temperature, for
J = 1, the right side of (7.69) gives

$$\frac{J+1}{3J}\,\frac{T}{\theta}\,x = 0.185x$$

A plot of this straight line on Figure 7.10 provides an intersection with the ℬ_J(x) curve at
a point for which M(x)/M_sat > 0.99. This result would be modified slightly if the more
appropriate value J = 1/2 were used, but the indication would still be that, at room
temperature, iron is spontaneously magnetized to a degree very close to the saturation value.
For cobalt, which has a higher Curie point, the approach to saturation is even greater at
room temperature. For nickel, with a Curie point of only 650°K, the fractional magnetization
at room temperature is in the neighborhood of 95 percent.
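
The graphical intersection used in Example 7.7 can also be found numerically. The following is a minimal sketch rather than the author's procedure: it assumes the standard closed form of the Brillouin function, ℬ_J(x) = [(2J+1)/2J] coth[(2J+1)x/2J] - (1/2J) coth(x/2J), and locates the nonzero root of (7.69) by bisection. The function names are hypothetical, and room temperature is taken as 300 °K.

```python
from math import tanh


def coth(u):
    return 1.0 / tanh(u)


def brillouin(J, x):
    """Brillouin function in its standard closed form (assumed, not quoted from the text)."""
    if x == 0.0:
        return 0.0
    a = (2.0 * J + 1.0) / (2.0 * J)
    b = 1.0 / (2.0 * J)
    return a * coth(a * x) - b * coth(b * x)


def spontaneous_fraction(J, T_over_theta, iterations=200):
    """Solve B_J(x) = [(J+1)/3J] (T/theta) x for x > 0 and return M/M_sat."""
    if T_over_theta <= 0.0:
        return 1.0
    if T_over_theta >= 1.0:
        return 0.0                      # no intersection away from the origin
    slope = (J + 1.0) / (3.0 * J) * T_over_theta

    def gap(x):
        return brillouin(J, x) - slope * x

    lo, hi = 1e-3, 1.0 / slope          # the line reaches 1 at x = 1/slope
    if gap(lo) <= 0.0:
        return 0.0                      # too close to theta to resolve in this sketch
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return brillouin(J, 0.5 * (lo + hi))


print("iron,   J = 1:", spontaneous_fraction(1.0, 300.0 / 1093.0))
print("nickel, J = 1:", spontaneous_fraction(1.0, 300.0 / 650.0))
```

Sweeping `T_over_theta` from 0 to 1 for a fixed J traces out a curve of the kind described for Figure 7.11.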

7.12 THE WEISS FIELD CONSTANT AND THE EXCHANGE INTEGRAL


An estimation of the value of the Weiss constant γ may be obtained by returning to
Table 7.8 and noting the experimentally determined values of the paramagnetic Curie
temperature θ. For iron it is seen that θ = 1093°K. Insertion of this value and other
appropriate quantities in (7.67) yields the result that γ ≈ 5,000. This is four orders
of magnitude higher than the classical value γ = 1/3 predicted by the Lorentz theory
(cf. Section 7.4).
35 See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., pp. 408-410, John Wiley and Sons,
Inc., New York, 1956.

In spontaneously magnetized iron, H = 0, B = μ₀M, and thus B_loc = γμ₀M ≈ 6,000
webers/m² (60 million gauss). This intense local field cannot be explained classically;
the field produced by the magnetic moment of a neighboring atom is only
~ p_eff m₀/4πμ₀⁻¹a³ ≈ 10⁻¹ webers/m². Proper addition of the contributions due to all significant
neighbors will produce a field which still falls short of B_loc by four orders of magnitude.†
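
These order-of-magnitude figures can be reproduced from (7.67) and the dipole-field estimate. The inputs below are assumptions made for illustration and are not quoted from the text: an atom density N = 8.5 × 10²⁸ atoms/m³ for iron, an effective moment of one Bohr magneton per atom, and a neighbor spacing a = 2.5 × 10⁻¹⁰ m. The function names are hypothetical.

```python
from math import pi

MU_0 = 4.0e-7 * pi            # webers/amp-m
K_BOLTZMANN = 1.380649e-23    # joules per degree Kelvin
BOHR_MAGNETON = 9.2740e-24    # amp-m^2


def weiss_constant(theta, N, p_eff):
    """gamma from (7.67): theta = gamma N mu_0 (p_eff m_0)^2 / 3k."""
    return 3.0 * K_BOLTZMANN * theta / (N * MU_0 * (p_eff * BOHR_MAGNETON) ** 2)


def local_field(gamma, M):
    """B_loc = gamma mu_0 M for a spontaneously magnetized specimen (H = 0)."""
    return gamma * MU_0 * M


def neighbor_dipole_field(p_eff, a):
    """Order-of-magnitude dipole field mu_0 p_eff m_0 / (4 pi a^3)."""
    return MU_0 * p_eff * BOHR_MAGNETON / (4.0 * pi * a ** 3)


N_IRON = 8.5e28               # atoms/m^3 (assumed)
gamma = weiss_constant(1093.0, N_IRON, 1.0)
B_loc = local_field(gamma, N_IRON * BOHR_MAGNETON)   # M taken as N m_0 (assumed)
B_dip = neighbor_dipole_field(1.0, 2.5e-10)
print(f"gamma = {gamma:.0f}, B_loc = {B_loc:.0f} webers/m^2, B_dipole = {B_dip:.2f} webers/m^2")
```

With these inputs the Weiss constant and local field come out in the thousands and the dipole field near 10⁻¹ webers/m², matching the magnitudes quoted above; the exact values depend on the assumed N, p_eff, and a.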
An adequate explanation of the physical origin of the intense local field was first
presented by Heisenberg³⁶ and is based on the quantum mechanical exchange integral.
Under suitable assumptions, the energy of interaction of atoms a and b, having spins
S_a and S_b, contains a term

$$V_{\text{ex}} = -2\mathcal{J}\, \mathbf{S}_a \cdot \mathbf{S}_b \tag{7.70}$$

in which V_ex is the exchange energy and 𝒥 is the exchange integral and is governed by
the overlap of the electron clouds of the two atoms. The spatial derivatives of this
exchange energy give the exchange forces, which prove to be intense and account for
the high value of B_loc; these exchange forces have an electrostatic origin, but the effect
is as though there were direct interaction of the spins S_a and S_b. When 𝒥 is positive, the
exchange forces are in such directions as to cause parallel alignment of S_a and S_b; this
is the case in ferromagnetic materials. However, 𝒥 can also be negative, in which case
S_a and S_b tend to line up in an antiparallel arrangement. This occurs in antiferromagnetic
and ferrimagnetic materials. (Cf. Sections 7.13-7.14.)
An approximate relation connecting the exchange integral 𝒥 and the paramagnetic
Curie temperature θ may be established for ferromagnetic materials as follows: Imagine
that an atom in the specimen has z nearest neighbors, with each of which it shares an
exchange energy given by (7.70), and that its exchange energy with more distant neighbors
may be ignored. With all the spins equal and parallel, the total exchange energy is

$$U_e = -2z\mathcal{J}S(S+1) \tag{7.71}$$

This energy would be expended if the atom in question were rotated so that its spin
was at right angles to the spins of its nearest neighbors. But in a spontaneously magnetized
specimen this would amount to rotating a magnetic moment of strength

$$p_{\text{eff}}\, m_0 = g_J m_0 [J(J+1)]^{1/2} = g_J m_0 [S(S+1)]^{1/2}$$

through a local field whose intensity is B_loc = γμ₀M_sat. If the result of Example 4.7
is once again employed, it follows that
(7.72)
Since for a ferromagnetic material, L = 0 and J = S, the saturation magnetization
is given as the limit of (7.63) in the form M_sat = N g_J S m₀. Using this result and combining
(7.72) with (7.71) one obtains
(7.73)
wherein application has been made of (7.67). Therefore the exchange integral is related
† In the case of ferroelectric materials, the electric dipole moment is two orders of magnitude higher,
and dipole-dipole interactions thus provide a satisfactory explanation for the observed field constants,
which for such dielectrics are in the range of one-third.
36 The Heisenberg theory is treated completely in J. H. Van Vleck, Theory of Electric and Magnetic
Susceptibility, Chap. 12, Clarendon Press, Oxford, 1932.

to the paramagnetic Curie temperature through the approximate expression

$$\mathcal{J} = \frac{3k\theta}{2zS(S+1)} \tag{7.74}$$

indicating that a high value for the exchange energy is manifested by a high Curie
temperature.
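
As a rough numerical illustration of (7.74), the sketch below evaluates the exchange integral for iron. The inputs z = 8 (the nearest-neighbor count of a body-centered cubic lattice) and S = 1/2 are assumptions chosen for the illustration, and the function name is hypothetical.

```python
K_BOLTZMANN = 1.380649e-23          # joules per degree Kelvin
ELECTRON_VOLT = 1.602176634e-19     # joules


def exchange_integral(theta, z, S):
    """Approximate exchange integral from (7.74), in joules."""
    return 3.0 * K_BOLTZMANN * theta / (2.0 * z * S * (S + 1.0))


J_ex = exchange_integral(theta=1093.0, z=8, S=0.5)   # iron, assumed z and S
print(f"{J_ex:.3e} joules = {J_ex / ELECTRON_VOLT:.4f} eV")
```

The result is a few hundredths of an electron volt, and it scales linearly with θ, which is the dependence the text points out.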

7.13 FERROMAGNETIC DOMAINS

As mentioned earlier, in order to explain the fact that a ferromagnetic specimen may
exist in a state which exhibits no bulk magnetization, whereas a weak external magnetic

[Figure 7.12: two panels. (a) Single crystal. Domain walls dotted. (b) Polycrystal. Crystal walls solid. Domains not shown. Individual crystals magnetized.]

FIGURE 7.12 Examples of zero net magnetization.

field is capable of producing saturation magnetization in the same specimen, Weiss
introduced the notion that the volume of a specimen may be divided into domains.
Below the Curie temperature each domain is envisioned as being spontaneously magnetized,
with the degree of magnetization being appropriate to the temperature T of the
specimen. The bulk magnetization is given by the vector sum of the individual domain
magnetizations. Examples of how the bulk magnetization can be zero in single crystals
and polycrystals are given in Figure 7.12.

Ample experimental evidence exists to support the domain hypothesis. A graphic
technique due to Bitter involves the preparation of a colloidal suspension of a fine
ferromagnetic powder. When a drop of this suspension is placed on the carefully prepared
surface of the ferromagnetic crystal being studied, the colloidal particles gather
along the domain boundaries. This occurs because adjacent domains are magnetized
in different directions, resulting in strong local fields at the interfaces. A photomicrograph
reveals the powder distribution, and thus the domain boundaries. An example
of this is shown in Figure 7.13. The origin of domains may be explained by invoking

[Figure 7.13: five powder-pattern photographs, taken at H = 475, 875, 4,000, 10,600, and 27,800 amp/m.]

FIGURE 7.13 Bitter powder patterns on surface of single crystal of silicon-iron, showing
variation of domain size with magnetic field; H horizontal in figure. [After Bates, Modern
Magnetism, 3d ed., p. 465, Cambridge University Press, London, 1951.]

the principle of minimum energy. A magnetized crystal has stored magnetic energy
associated with its magnetization. If the crystal can arrange itself into domains which
are oppositely magnetized, the overall crystal magnetization will be reduced, as will
the stored magnetic energy. But the boundary surfaces between oppositely magnetized
domains require energy to be maintained, since the exchange forces favor parallel
and oppose antiparallel orientations of the magnetization. Additionally, most crystals
are more easily magnetized along certain crystallographic axes (easy directions) than
they are along intermediate (hard) directions, the difference in energy for these two
states being called the anisotropy energy. A perfect crystal will tend to form into
domains whose shapes and number are such as to minimize the sum of the magnetic
energy, the domain wall energy, and the anisotropy energy.

As an illustration, the single crystal shown in Figure 7.12a could represent cobalt,
which is a hexagonal crystal, and has only one easy axis of magnetization. With the
domain structure shown, there is no net magnetization and thus zero magnetic energy.
The long domains are magnetized in easy directions, but the small closure domains at
the ends are magnetized in hard directions. If this type of domain structure is retained
but the number of domains is increased, the fraction of the volume occupied by closure
domains will decrease, as will the anisotropy energy. However, the number of boundaries
will increase along with the boundary energy. These two competing tendencies will

[Figure 7.14: magnetization curve plotted against H, with the field H_1 marked on the H axis and regions labeled "Reversible domain wall motion" and "Irreversible domain wall motion".]

FIGURE 7.14 Typical magnetization curve showing regions
in which different magnetization processes dominate.

achieve equilibrium when the number of domains is such as to minimize the total
energy.
In iron, which is a cubic crystal, the easy directions of magnetization are parallel
to the cube edges. Nickel also has a cubical crystal structure but the preferred axes
of magnetization are the body diagonals. In each of these elements it is possible for
both principal domains and closure domains to be magnetized along different easy
directions. Domain size is then determined by minimizing the boundary energy plus
the magnetostrictive energy, the latter arising from the difference in elongation
of the principal and closure domains, which causes an elastic strain.
Of course, these idealized pictures must be modified to account for the presence of
impurities, lattice imperfections, etc., but the basic explanation that energy minimiza-
tion accounts for the origin and size of domains seems well established. In real materials
the size of individual domains varies widely, with typical values lying in the range
10⁻² to 10⁻⁶ cm³.
Consider such a specimen of real ferromagnetic material which is below its Curie
temperature and has arranged its internal domain structure so as to display no net
magnetization. Upon application of an external field, the individual atomic magnetic

moments tend to align with the field, upsetting the balance of internal forces which
had resulted in minimum energy. It is found that the volume growth of favorably
oriented domains occurs more easily than the rotation of magnetization from an easy
to a hard direction, and thus domain wall motion occurs for relatively weak applied
fields, whereas rotation requires high fields. This behavior is indicated by Figure 7.14,
in which the magnetization curve of a virgin specimen is shown, with the different
regions noted in which one or the other process dominates. Observe that for fields less
than H_1, removal of the field permits the specimen to return to its original domain
structure.
One is able to exert some control over the coercive force H_c by taking advantage of
the fact that domain wall motion occurs more easily than magnetization rotation. In
materials designed for use as permanent magnets, a large value of H_c is desired. Through
suppression of domain boundary motion, a high coercivity is assured; this may be accomplished
by using materials consisting of two metallurgical phases which give a heterogeneous
structure on a very fine scale.
Alternatively, materials used in transformer cores should have a high permeability
and a low hysteresis loss. Since the latter is proportional to B_r H_c (cf. Section 7.16),
reduction of H_c is desirable. This can be achieved by making the material pure, homogeneous,
and well-oriented to facilitate domain wall motion. The result is not only a
lowered H_c but a higher permeability as well. The inverse relation between coercivity
and permeability is clearly demonstrated by the data in Table 7.9.

7.14 ANTIFERROMAGNETISM

The Heisenberg theory of ferromagnetism is based on the quantum mechanical exchange
integral, and a positive value of that integral corresponds to parallel alignment
of adjacent spins (cf. Section 7.12). However, if the exchange integral is negative, a
tendency for antiparallel spin alignment exists. This occurs, for example, in certain
materials in which the interatomic distance is small. Counteraligned spins are found
in both antiferromagnetic and ferrimagnetic substances. These two classes of materials
are distinguished by the fact that the two sets of opposed spins have unequal moments
in a ferrimagnetic specimen whereas they are of equal strength in an antiferromagnetic
material (see Figure 7.1).
A theoretical treatment of antiferromagnetism by Néel preceded the experimental
discovery of the phenomenon (cf. Section 7.1). Polycrystalline manganese oxide is the
first antiferromagnetic substance to have been identified by experiment and one of
its principal properties is the manner in which the susceptibility varies with temperature.
As seen in Figure 7.15, a maximum in the susceptibility occurs at a temperature
T_N (called the Néel temperature) and this behavior is characteristic of all antiferromagnetic
materials. It may be explained qualitatively on the basis of a crystal model
containing two types of atoms, A and B, distributed over two interleaved lattices.
At low temperatures, due to the strong internal fields, the A spins tend to "lock in"
antiparallel to the B spins and an external field has little effectiveness in inducing a net
magnetization. As the temperature is raised, the thermal energy tends to unlock the
spin pattern and an external field can cause a greater net magnetization, this being
manifested by a higher susceptibility. Finally, at the critical temperature T_N the spins
are completely freed, and above T_N the specimen behaves paramagnetically, exhibiting

a susceptibility which decreases as T⁻¹. A quantitative theory based on this model
will be discussed later in this section.
Neutron diffraction studies provide direct experimental evidence for the existence
of the antiferromagnetic spin arrangement. A neutron beam which is incident on a
crystal suffers scattering by the atomic nuclei but also interacts with the ordered spin
lattice, this latter interaction giving rise to additional diffraction lines. These extra
lines have an intensity which diminishes as the temperature is raised because the antiferromagnetic
order is decreasing; finally, at T_N the extra lines disappear completely
from the diffraction pattern. Shull³⁷ and his co-workers were the first investigators to

[Figure 7.15: magnetic susceptibility (vertical axis, 7.5 to 10.0) plotted against T (°K) (horizontal axis, 50 to 300).]

FIGURE 7.15 Magnetic susceptibility of MnO vs. temperature in 5,000 gauss
field. [After Bizette, Squire, and Tsai, Comp Rend, 207, 449; 1938.]

employ this technique and have used it successfully to determine the spin arrangements
in both ferromagnetic and antiferromagnetic substances.
In terms of the two sublattice model, assuming that all the nearest neighbors of an
A atom are B atoms and vice versa, the local fields at the two sites may, in a further
use of (7.24), be written

$$B_a = B - \alpha\mu_0 M_a - \beta\mu_0 M_b \tag{7.75}$$

$$B_b = B - \beta\mu_0 M_a - \alpha\mu_0 M_b \tag{7.76}$$

in which α and β are internal field constants. It is anticipated that β will be positive
in order to account for antiparallel alignment of spins. It is further anticipated that
|β| > |α|, since all of an atom's nearest neighbors are of the opposite type. However,
no prior prediction is being made about the sign of α.
It now becomes advantageous to consider several temperature regions.
1. The temperature region T > T_N. In this region the magnetizations M_a and M_b
are weak and the material is behaving paramagnetically. A repetition of the analysis
37 C. G. Shull and J. S. Smart, "Detection of Antiferromagnetism by Neutron Diffraction," Phys Rev,
76, 1256-1257; October 15, 1949.

of Section 7.9 then leads to the result that

$$ \mathbf{M}_a = \left[\frac{N(p_{\mathrm{eff}}m_0)^2}{3kT}\right]\mathbf{B}_a \tag{7.77} $$

in which N is the density of A atoms and p_eff is the number of Bohr magnetons per
A atom.
If it is assumed that the density of B atoms is also N and that the net magnetic
moment per atom is the same for both types of atoms, then it follows that

$$ \mathbf{M}_b = \left[\frac{N(p_{\mathrm{eff}}m_0)^2}{3kT}\right]\mathbf{B}_b \tag{7.78} $$

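Upon adding these two equations and making use of (7.75) and (7.76) one obtains

$$ \mathbf{M} = \mathbf{M}_a + \mathbf{M}_b = \left[\frac{N(p_{\mathrm{eff}}m_0)^2}{3kT}\right]\left[2\mathbf{B} - (\beta + \alpha)\mu_0\mathbf{M}\right] \tag{7.79} $$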

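which can be solved for M to give

$$ \mathbf{M} = \frac{2N(p_{\mathrm{eff}}m_0)^2}{3k}\,\frac{\mathbf{B}}{T + (\beta + \alpha)\mu_0 N(p_{\mathrm{eff}}m_0)^2/3k} $$

so that

$$ \chi_m = \frac{M}{\mu_0^{-1}B} = \frac{C}{T + \theta} \tag{7.80} $$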
in which the Curie constant is
$$ C = \frac{2N(p_{\mathrm{eff}}m_0)^2}{3k\mu_0^{-1}} \tag{7.81} $$

and the paramagnetic Curie temperature is

$$ \theta = \frac{(\beta + \alpha)C}{2} \tag{7.82} $$

Because the expectation is that β + α is positive, one would predict that θ is also
positive, and this is confirmed by experiment for a variety of materials.
It is interesting to compare the behavior of paramagnetic, ferromagnetic, and anti-
ferromagnetic materials in a temperature range in which they are all acting para-
magnetically. This is done in Figure 7.16, where the intercepts with the T axis clearly
indicate the differences in the nature of the paramagnetic Curie temperature.
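2. The Néel temperature T_N. At this temperature, the magnetization is still weak
enough for the above analysis to be valid. If there is no external magnetic field present,
at T_N one may combine (7.77) and (7.75) to give

$$ \mathbf{M}_a = -\left[\frac{N(p_{\mathrm{eff}}m_0)^2}{3kT_N}\right]\left(\alpha\mu_0\mathbf{M}_a + \beta\mu_0\mathbf{M}_b\right) $$

which may be rewritten in the form

$$ \left(T_N + \frac{\alpha C}{2}\right)\mathbf{M}_a + \frac{\beta C}{2}\,\mathbf{M}_b = 0 \tag{7.83} $$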

[Figure: reciprocal susceptibility plotted against T; the T-axis intercepts are marked −θ (antiferro) and θ (ferro).]
FIGURE 7.16 Comparison of behavior of reciprocal magnetic susceptibility
with temperature above Curie points for three classes of materials.

Similarly, (7.78) and (7.76) combine to give

$$ \frac{\beta C}{2}\,\mathbf{M}_a + \left(T_N + \frac{\alpha C}{2}\right)\mathbf{M}_b = 0 \tag{7.84} $$

These two equations will yield nontrivial solutions for M_a and M_b only if the
determinant of the coefficients vanishes. But this leads to

$$ T_N = \frac{(\beta - \alpha)C}{2} \tag{7.85} $$

Comparing (7.82) and (7.85), one sees that the two sublattice model does not predict
the same value for the Néel temperature and the paramagnetic Curie temperature.
This is in agreement with experiment, as can be seen from Table 7.10, which lists
measured values of T_N and θ for a variety of antiferromagnetic materials. It may be
observed from this table that not only are T_N and θ widely different but also that T_N
is generally lower than θ. This suggests that α is positive, which implies that not only
the AB interactions, but also the AA and BB interactions are antiferromagnetic.
3. The temperature region T < T_N. Below the Néel temperature T_N, it is necessary
to distinguish the behavior of a single crystal from that of a polycrystal. In a single
crystal, due to anisotropy, there will be one or more preferred directions along which
the spins will tend to align themselves. If an external field is applied perpendicular to a
natural spin direction, the magnetic moments M_a and M_b are turned by the field so as
to make an angle 2φ with each other, as shown in Figure 7.17. The local fields at the two
sites are still given by (7.75) and (7.76), and in equilibrium M_a should be aligned with

TABLE 7.10
NÉEL TEMPERATURE T_N AND PARAMAGNETIC CURIE TEMPERATURE θ
FOR CERTAIN ANTIFERROMAGNETIC MATERIALS

Substance    T_N (°K)    θ (°K)
NiCl₂           50          68
CoF₂            38          53
FeF₂            79         117
MnF₂            72         113
NiF₂            73         116
FeO            198         570
MnO            122         610
MnO₂            84         316
MnS            165         528

B_a, and M_b should be aligned with B_b. Thus

$$ \mathbf{M}_a \times \mathbf{B}_a = 0 = \mathbf{M}_a \times (\mathbf{B} - \alpha\mu_0\mathbf{M}_a - \beta\mu_0\mathbf{M}_b) $$

$$ 0 = \mathbf{M}_a \times (\mathbf{B} - \beta\mu_0\mathbf{M}_b) $$

This requires that the component of B − βμ₀M_b which is perpendicular to M_a be zero, or

$$ B\cos\phi - \beta\mu_0 M_b \sin 2\phi = 0 \tag{7.86} $$

When M_b is crossed into (7.76), Equation (7.86) is seen to hold for M_a as well. Therefore
the net magnetization in the direction of B is

$$ M = (M_a + M_b)\sin\phi = \frac{2B\cos\phi\,\sin\phi}{\beta\mu_0 \sin 2\phi} = \frac{B}{\beta\mu_0} \tag{7.87} $$

The transverse susceptibility is therefore given by

$$ \chi_m^{\perp} = \frac{\mu_0 M}{B} = \frac{1}{\beta} \tag{7.88} $$

This formula is independent of temperature, but of course the model is imperfect,


and the concept of an array of counteraligned spins is less valid as the temperature

FIGURE 7.17 Calculation of perpendicular susceptibility for antiferromagnetic material below T_N.

departs from absolute zero. An actual case is shown in Figure 7.18, where χ_m^⊥ is seen to
decrease somewhat as T_N is approached.
If the applied field is parallel to a natural spin direction, the calculation of the
longitudinal susceptibility χ_m^∥ is more complicated, and involves statistical methods which

[Figure: molar susceptibility plotted against T (°K) over the range 0 to 200 °K.]
FIGURE 7.18 Molar magnetic susceptibility of MnF₂. [After Griffel and Stout, J Chem Phys, 18,
1455; 1950.]

use Brillouin functions. It is apparent that

$$ \chi_m^{\parallel}(0^{\circ}\mathrm{K}) = 0 $$

since all spins at absolute zero are either parallel or antiparallel to the field, thus
experiencing no torque. Calculations by Van Vleck³⁸ show that the longitudinal
susceptibility increases regularly from its null value at absolute zero until it reaches
the value

$$ \chi_m^{\parallel}(T_N) = \chi_m^{\perp}(T_N) $$

This conclusion is in agreement with the experimental curve of Figure 7.18.
For polycrystalline materials, the susceptibility below T_N is an average value lying
between χ_m^∥ and χ_m^⊥, which accounts for the rising portion of the curve in Figure 7.15.

EXAMPLE 7.8
From the data offered in Table 7.10, the relative size of the internal field constants α and β
may be determined. Taking the ratio of (7.85) and (7.82), one obtains

$$ \frac{T_N}{\theta} = \frac{\beta - \alpha}{\beta + \alpha} $$
38 J. H. Van Vleck, "On the Theory of Antiferromagnetism," J Chem Phys, 9, 85-90; January 1941.

which may be solved for (β/α) to give

$$ \frac{\beta}{\alpha} = \frac{1 + (T_N/\theta)}{1 - (T_N/\theta)} $$

Calculation from this formula gives the following data:

Material   NiCl₂   CoF₂   FeF₂   MnF₂   NiF₂   FeO   MnO   MnO₂   MnS
β/α         6.5     6.1    5.2    4.5    4.4   2.1   1.5    1.7   1.9

When one recalls that β and α are measures of the relative influence on the local field of
nearest neighbors and next-nearest neighbors, these are seen to be reasonable ratios. Their
variability is due partly to the differences in lattice dimensions.
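
The arithmetic of Example 7.8 is easily repeated by machine. The following is a minimal sketch, assuming only the measured values of Table 7.10 and the ratio formula of the example; the names `TABLE_7_10` and `beta_over_alpha` are illustrative and do not appear in the text.

```python
# (T_N, theta) in degrees Kelvin, transcribed from Table 7.10.
TABLE_7_10 = {
    "NiCl2": (50, 68),
    "CoF2": (38, 53),
    "FeF2": (79, 117),
    "MnF2": (72, 113),
    "NiF2": (73, 116),
    "FeO": (198, 570),
    "MnO": (122, 610),
    "MnO2": (84, 316),
    "MnS": (165, 528),
}


def beta_over_alpha(t_neel, theta):
    """Return beta/alpha from T_N/theta = (beta - alpha)/(beta + alpha)."""
    ratio = t_neel / theta
    return (1 + ratio) / (1 - ratio)


for substance, (t_neel, theta) in TABLE_7_10.items():
    print(f"{substance:6s} beta/alpha = {beta_over_alpha(t_neel, theta):.2f}")
```

The printed ratios agree with the tabulated ones to within about 0.1, the residual difference being a matter of how the last digit was rounded.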

7.15 FERRIMAGNETISM


The previous section was concerned with antiferromagnetic materials, whose magnetic
properties could be explained in terms of two sublattices of equal and opposite spins.
There is no fundamental reason why the two opposite sets of spins must always be equal.
When they are not, one encounters a different class of materials called ferrimagnetics.
These materials also have a Curie point, and below this critical temperature the unequal
spin systems tend to "lock in" with an antiparallel orientation, resulting in a net
spontaneous magnetization. In this respect, they resemble ferromagnetic substances.
However, one should note carefully the basic difference in the mechanism. Ferro-
magnetism is associated with a positive value of the exchange integral and parallel
alignment of all the spins. Ferrimagnetism, like antiferromagnetism, corresponds to a
negative value of the exchange integral and to antiparallel spin alignment.
Prominent among ferrimagnetic materials are the ferrites, a group of compounds
whose composition may be represented by the chemical formula XO·Fe₂O₃, in which X
is a divalent ion such as Cd²⁺, Co²⁺, Cu²⁺, Fe²⁺, Mg²⁺, Mn²⁺, Ni²⁺, Zn²⁺ (or a mixture
of these ions).† Because they are oxides, ferrites have a lower density than
metallic substances and also have a much higher resistivity. Depending on the specific
composition, they have resistivities in the range 10⁰ to 10⁴ ohm-m, which is comparable
to the resistivity of a semiconductor, and which is many orders of magnitude above the
value (10⁻⁷ ohm-m) for iron. For this reason, ferrites are attractive for use in transformer
cores at frequencies beyond the range where the eddy current losses in iron cores are
prohibitive. They are also widely used in microwave applications where the low depth
of penetration of iron prevents its use.
The process of producing ferrites involves mixing the various desired oxides in the
proportions which are required to yield a given set of properties. The oxides are then
put under pressure (sometimes with a binder) and sintered at temperatures as high as
1700°K. The result is a ceramic-like material which can be ground to a desired shape.
The electric and magnetic properties vary widely with composition.
† One of these ferrites, FeO·Fe₂O₃ (magnetite), bears the distinction of having been the first ferrous
material known to mankind, with references to it dating back through antiquity.

Through careful X-ray diffraction and neutron diffraction analyses, the crystal lattice
of ferrites has been found to have the spinel structure (after the mineral spinel, MgAl₂O₄)
with a unit cell which is cubical and approximately 8.4 Å on a side. The spinel structure
is indicated in Figure 7.19, and is found to contain 32 O²⁻ ions, 16 Fe³⁺ ions, and 8
divalent ions. The oxygen ions are so distributed as to form 64 tetrahedral interstices
(A sites) and 32 octahedral interstices (B sites). Eight of the tetrahedral sites are
occupied, as are sixteen of the octahedral sites, thus accounting for the twenty-four
metallic ions in the unit cell.

[Figure: portion of the spinel unit cell, with a legend for oxygen, a metallic ion at a tetrahedral site, and a metallic ion at an octahedral site; the tetrahedral interstices number 64 per unit cell and the octahedral interstices 32 per unit cell.]
FIGURE 7.19 The spinel structure.

Since the O²⁻ ions have only completely filled shells, and therefore no net spin, the
magnetic properties of ferrites are due to the metallic ions. A good illustration is
magnetite, in which the 8 Fe²⁺ ions occupy half of the octahedral sites, and the 16 Fe³⁺
ions are evenly divided between tetrahedral and octahedral sites. Since each Fe²⁺ ion
has 4 Bohr magnetons and each Fe³⁺ ion has 5 Bohr magnetons (cf. Table 7.3; the 4s
electrons are removed first), and since the AB interaction is antiferromagnetic, one
would expect the A sites to have a net magnetic moment of 40m₀ per unit cell, and the
B sites to have a counteraligned magnetic moment of 72m₀ per unit cell. The saturation
magnetization M_sat (which is the spontaneous magnetization at 0°K) should therefore
be 72 − 40 = 32m₀ per unit cell. The experimental value is 32.64m₀, so the agreement
is quite good.
If the value 8.4 Å is used for the dimension of the unit cell, the above numbers
translate into a saturation magnetization M_sat for magnetite of 5 × 10⁵ amp/m. This
is smaller than the M_sat values typically encountered in iron and cobalt materials by a

factor of 2-5. The lower value of saturation magnetization in ferrites may be explained
by noting that the concentration of magnetic ions is smaller and that the magnetiza-
tions at the A and B sites oppose.
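
The bookkeeping for magnetite can be checked with a few lines of arithmetic. The sketch below assumes the site occupancies and ionic moments quoted in the text, together with the standard value of the Bohr magneton (9.274 × 10⁻²⁴ amp-m², a figure not quoted in this passage); all variable names are illustrative.

```python
BOHR_MAGNETON = 9.274e-24   # m0 in A*m^2 (standard value, assumed here)
CELL_EDGE = 8.4e-10         # unit-cell edge in metres (8.4 angstroms)

MOMENT_FE2 = 4              # Bohr magnetons per Fe2+ ion
MOMENT_FE3 = 5              # Bohr magnetons per Fe3+ ion

# A (tetrahedral) sites hold 8 Fe3+ ions per unit cell.
a_sites = 8 * MOMENT_FE3
# B (octahedral) sites hold 8 Fe3+ and 8 Fe2+ ions per unit cell.
b_sites = 8 * MOMENT_FE3 + 8 * MOMENT_FE2
# The two sublattices are counteraligned, so the moments subtract.
net_per_cell = b_sites - a_sites

# Magnetization is the net moment divided by the cell volume.
m_sat = net_per_cell * BOHR_MAGNETON / CELL_EDGE**3

print(a_sites, b_sites, net_per_cell)
print(f"M_sat = {m_sat:.2e} A/m")
```

The result, close to 5 × 10⁵ amp/m, reproduces the figure given in the text.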
In 1948, Néel³⁹ proposed a theory which accounts for the principal features of ferri-
magnetism, and which is based on the hypothesis that there exists a negative interaction
between the metallic ions at the tetrahedral (A) sites and the metallic ions at the
octahedral (B) sites, this interaction being the cause of a tendency for the A and B ions
to adopt an antiparallel spin alignment. The essential features of Néel's theory may be
appreciated by considering the ferrite XO·Fe₂O₃, whose unit cell contains y Fe³⁺ ions
on tetrahedral sites and (16 − y) Fe³⁺ ions on octahedral sites; the divalent X ions are
then distributed such that (8 − y) of them occupy tetrahedral sites and y of them are
at octahedral sites. The formula for this ferrite may be written

$$ \mathrm{Fe}^{3+}_{x}\,\mathrm{X}^{2+}_{1-x}\left[\mathrm{Fe}^{3+}_{2-x}\,\mathrm{X}^{2+}_{x}\right]\mathrm{O}_4 $$

in which x = y/8 is a discrete variable having the range 0 ≤ x ≤ 1. The bracketed
portion of this formula refers to the occupancy of the octahedral sites.
If the divalent ions are nonmagnetic, the net magnetization may be written

$$ \mathbf{M} = x\mathbf{M}_a + (2 - x)\mathbf{M}_b \tag{7.89} $$

in which M_a and M_b are the magnetization densities of the A sites and B sites for the
special case that there are 8 Fe³⁺ ions at tetrahedral sites and 8 Fe³⁺ ions at octahedral
sites per unit cell (x = 1).
The mathematical expressions for the local fields at the two sites were postulated by
Néel in the form

$$ \mathbf{B}_a = \mathbf{B} - \mu_0\gamma\left[(2 - x)\mathbf{M}_b - \alpha x\mathbf{M}_a\right] \tag{7.90} $$

$$ \mathbf{B}_b = \mathbf{B} - \mu_0\gamma\left[x\mathbf{M}_a - \beta(2 - x)\mathbf{M}_b\right] \tag{7.91} $$

in which B is the total macroscopic field, −μ₀γ(2 − x)M_b or −μ₀γxM_a accounts for the
negative AB interaction, and μ₀γαxM_a, μ₀γβ(2 − x)M_b represent the AA and BB
interactions respectively. The factors α, β, γ are internal field constants, and these
equations are based on the same assumption of linearity which Weiss had earlier intro-
duced in his theory of ferromagnetism.
Above the Curie point the magnetizations M_a and M_b should have an inverse tem-
perature dependence and follow a paramagnetic Curie law. It seems reasonable to
write for this temperature region,

$$ \mathbf{M}_a = \frac{C\mu_0^{-1}\mathbf{B}_a}{2T} \qquad \mathbf{M}_b = \frac{C\mu_0^{-1}\mathbf{B}_b}{2T} \tag{7.92} $$

in which C is a Curie constant. Upon combining Equations (7.89)-(7.92), one finds that

$$ \frac{1}{\chi_m} = \frac{T}{C} + \frac{1}{\chi_0} - \frac{\sigma}{T - \theta} \tag{7.93} $$

39 L. Néel, "Magnetic Properties of Ferrites; Ferrimagnetism and Antiferromagnetism," Ann Phys
(Paris), 3, 137-198; 1948.

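in which

$$ \frac{1}{\chi_0} = \frac{\gamma}{4}\left[2x(2 - x) - \alpha x^2 - \beta(2 - x)^2\right] \tag{7.94} $$

$$ \sigma = \frac{\gamma^2}{16}\,Cx(2 - x)\left[x(1 + \alpha) - (2 - x)(1 + \beta)\right]^2 \tag{7.95} $$

$$ \theta = \frac{\gamma}{4}\,Cx(2 - x)(2 + \alpha + \beta) \tag{7.96} $$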

Equation (7.93) indicates that, if 1/χ_m is plotted versus T above the Curie point, the
resulting curve is convex. This is in agreement with experiment, and a plot of experi-
mental data in this form may be used to determine the constants C, χ₀, σ, and θ which
occur in (7.93). If x is known, (7.94)-(7.96) may be used to obtain the internal field
constants α, β, and γ. This procedure has been followed for several ferrites with the
interesting conclusion that α and β are both negative, so that all three interactions are
antiferromagnetic, with the AB interaction dominating.
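
The forward direction of this procedure, from internal field constants to the constants of the 1/χ_m curve, may be sketched as follows. The code assumes that the curve has the hyperbolic form 1/χ_m = T/C + 1/χ₀ − σ/(T − θ), with 1/χ₀, σ, and θ given by (7.94)-(7.96); the function names and the numerical values in the usage lines are hypothetical and are chosen only for illustration.

```python
def neel_constants(C, x, alpha, beta, gamma):
    """Return (1/chi_0, sigma, theta) from Equations (7.94)-(7.96)."""
    inv_chi0 = (gamma / 4) * (2 * x * (2 - x) - alpha * x**2 - beta * (2 - x) ** 2)
    sigma = (gamma**2 / 16) * C * x * (2 - x) * (
        x * (1 + alpha) - (2 - x) * (1 + beta)
    ) ** 2
    theta = (gamma / 4) * C * x * (2 - x) * (2 + alpha + beta)
    return inv_chi0, sigma, theta


def inverse_susceptibility(T, C, inv_chi0, sigma, theta):
    """Assumed hyperbolic law: 1/chi_m = T/C + 1/chi_0 - sigma/(T - theta)."""
    return T / C + inv_chi0 - sigma / (T - theta)


# Hypothetical constants in arbitrary units, for illustration only.
inv_chi0, sigma, theta = neel_constants(C=1.0, x=1.0, alpha=-0.5, beta=-0.5, gamma=2.0)
print(inv_chi0, sigma, theta)
print(inverse_susceptibility(2.0, 1.0, inv_chi0, sigma, theta))
```

In this symmetric trial case (x = 1, α = β) the constant σ vanishes and the curve degenerates to a straight line, which is the behavior of an antiferromagnetic material with two equivalent sublattices.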

[Figure: reciprocal susceptibility plotted against T (°K) over the range 800 to 1100 °K.]
FIGURE 7.20 Reciprocal magnetic susceptibility of magnetite vs.
temperature. [After C. Kittel, Introduction to Solid State Physics,
2d ed., p. 445, John Wiley and Sons, Inc., New York, 1956.]

The foregoing analysis is also applicable to ferrites in which the divalent ions are
magnetic, provided that all divalent atoms are on octahedral sites (x = 1) and the
magnetization M b includes the divalent contribution. Magnetite is an example of such
a ferrite, and its reciprocal susceptibility as a function of temperature is plotted in
Figure 7.20. The convex curvature predicted by the theory is clearly indicated.

7.16 TIME-VARYING PHENOMENA


Previous sections of this chapter have dealt with the behavior of magnetic materials
under the influence of a static or quasistatic external field. Three internal causes for
magnetic effects in materials have been noted (electron orbital motion, electron spin,
and nuclear spin), and five classes of materials have been identified (diamagnetic,
paramagnetic, ferromagnetic, antiferromagnetic, and ferrimagnetic). In what follows
the static results obtained in these earlier sections will be extended, in principle, to
include time-varying phenomena, and then several cases of practical interest will be
discussed.

If primary current sources ι(x,y,z) and their associated field B₁(x,y,z) induce a mag-
netization density M(x,y,z) in a material specimen, then if ι and B₁ become time-
varying, in general M will vary with time also. Leaving aside for the moment the ques-
tion of a specific relationship between B₁(x,y,z,t) and M(x,y,z,t), the macroscopic
response field B₂(x,y,z,t) caused by the aggregation of time-varying magnetic moments
may be deduced in a manner completely analogous to what was done for electric dipole
moments in Chapter 6 (specifically, cf. Appendix L). The result is that B₂ is given by
the expression

$$ \mathbf{B}_2(x,y,z,t) = \nabla_F \times \left[\int_S \frac{\{\mathbf{M}\} \times d\mathbf{S}}{4\pi\mu_0^{-1}\xi} + \int_V \frac{\nabla_S \times \{\mathbf{M}\}}{4\pi\mu_0^{-1}\xi}\,dV\right] \tag{7.97} $$

in which S is the bounding surface of the specimen, V its volume, and ξ is the distance
from dS or dV to the field point (x,y,z). {M} is the time-retarded value of M. This
result is seen to be similar to the static formula (7.4), the only difference being that the
magnetization density is now time-varying and retardation must be included. Equation
(7.97) normally is valid for points within the magnetic specimen as well as without and
permits the interpretation that the specimen is equivalent in its magnetic effect to a
lineal current density j = M × 1_n on S plus an areal current density ι = ∇ × M in V.
In parallel with the development of Section 7.3, a macroscopic H field may be related
to the magnetization M and the total field B = B₁ + B₂ by the defining equation

$$ \mathbf{H} = \mu_0^{-1}\mathbf{B} - \mathbf{M} \tag{7.98} $$

In (7.98) all three vectors are time-varying and M may not be parallel to B, in which
case H is not parallel to either of them.
Effective use of Equations (7.97) and (7.98) requires knowledge of the relation
between B and M. Unfortunately, unlike the analogous situation in dielectric materials
(for most of which P and E were found to be linearly connected), no simple general
relationship exists between M and B in those materials whose magnetization is sizeable.
Thus if one writes H = μ⁻¹B = B/μ₀(1 + χ_m), the common result is that the dynamic
magnetic susceptibility χ_m is a nonlinear complex tensor when the fields are time-
harmonic. Despite this difficulty, many dynamic cases are amenable to analysis,
particularly those involving small incremental fields. The remainder of this section will
be devoted to a discussion of several of these cases.
1. Diamagnetic susceptibility. Diamagnetic materials have been seen to have slight
negative static magnetic susceptibilities of the order of 10⁻⁵. The phenomenon of dia-
magnetism is similar to electronic polarizability in dielectrics, and therefore by analogy
one would expect the dynamic diamagnetic susceptibility χ_m to exhibit a resonance at a
characteristic frequency f_m. In the neighborhood of f_m a complex value of χ_m should be
found, indicating some absorption; below f_m the static value of χ_m should prevail,
whereas above f_m a negligible value of χ_m should result. However, the diamagnetic
effect is so small that this phenomenon is of little practical interest and has not been
widely investigated.
2. Cyclotron resonance. If the diamagnetic material is a semiconductor, a different
resonance phenomenon arises when the specimen is placed in a region containing both a
static magnetic field B₀ and an electromagnetic field (E₁, B₁) whose electric intensity is
perpendicular to B₀. A free electron or mobile hole (cf. Section 8.11) within the
semiconductor will experience forces due to these fields such that its equation of motion
between collisions is

$$ q\left[\mathbf{E}_1 + \mathbf{v} \times (\mathbf{B}_0 + \mathbf{B}_1)\right] = m^*\dot{\mathbf{v}} \tag{7.99} $$

in which q = ±e depending on whether the mobile carrier is a hole or electron, and m*
is the effective mass of the hole or electron.†
Since E₁ = (c/n)B₁, and since the index of refraction n of the semiconductor is close
to unity, for normal values of v the term v × B₁ may be neglected in comparison to E₁.
Upon taking B₀ in the Z direction and E₁ in the X direction, (7.99) may be expanded
to give

$$ \begin{aligned} qE_1e^{j\omega t} + qv_yB_0 &= m^*\dot{v}_x \\ -qv_xB_0 &= m^*\dot{v}_y \\ 0 &= m^*\dot{v}_z \end{aligned} \tag{7.100} $$
In (7.100) it has been assumed that the steady magnetic field B₀ is uniform throughout
the region occupied by the semiconductor specimen, and that the wavelength of the
electromagnetic field (E₁, B₁) is large compared to the size of the specimen, so that the
instantaneous phase of E₁ may be assumed to be uniform throughout the specimen.
The third of Equations (7.100) indicates a constant drift in the Z direction. The first
two equations may be combined to give

$$ \ddot{v}_x + \omega_c^2 v_x = j\omega\omega_c\,\frac{E_1}{B_0}\,e^{j\omega t} \tag{7.101} $$

$$ \ddot{v}_y + \omega_c^2 v_y = -\omega_c^2\,\frac{E_1}{B_0}\,e^{j\omega t} \tag{7.102} $$

in which ω_c = qB₀/m* is called the cyclotron frequency. These equations have the
particular solution

$$ v_x = -\frac{j\omega\omega_c}{\omega^2 - \omega_c^2}\,\frac{E_1}{B_0}\,e^{j\omega t} \qquad v_y = \frac{\omega_c^2}{\omega^2 - \omega_c^2}\,\frac{E_1}{B_0}\,e^{j\omega t} \tag{7.103} $$

which can be recognized as elliptical motion in the XY plane with a resonance at
ω = ω_c. When collisions are taken into account this resonance is limited to a finite
amplitude. Therefore if the frequency of the E₁ field is varied, the absorption of
energy by the specimen from the field will show a characteristic peak at the cyclotron
frequency. Knowledge of B₀ then permits determination of the effective mass of the
free electrons or holes.‡ The result is found to depend on crystal orientation, but
typically the effective masses of holes and electrons are found to be in the range 0.1 to
0.35 times the mass of a free electron.⁴⁰
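
The inference of an effective mass from an observed absorption peak amounts to inverting ω_c = qB₀/m*. A minimal sketch follows; the resonance frequency and field strength in the usage lines are hypothetical values chosen only for illustration, and the physical constants are standard values not quoted in the text.

```python
import math

ELECTRON_CHARGE = 1.602e-19   # e in coulombs (standard value)
ELECTRON_MASS = 9.109e-31     # free-electron mass in kilograms (standard value)


def effective_mass_ratio(resonance_hz, b0):
    """Return m*/m from the cyclotron condition omega_c = e * B0 / m*.

    resonance_hz: frequency of the absorption peak, in cycles per second
    b0:           steady magnetic field, in webers per square metre
    """
    omega_c = 2 * math.pi * resonance_hz
    return ELECTRON_CHARGE * b0 / (omega_c * ELECTRON_MASS)


# Hypothetical measurement: absorption peak at 24 Gc in a field of 0.1 weber/m^2.
ratio = effective_mass_ratio(24e9, 0.1)
print(f"m*/m = {ratio:.3f}")
```

For these assumed numbers the ratio is about 0.12, which falls within the range of effective masses cited in the text.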
3. Paramagnetic relaxation. If a magnetic field is suddenly applied to a paramagnetic
specimen, the magnetization density M exhibits some inertia and does not
† The effective mass of an electron or hole in a semiconductor differs from the free space mass of an
isolated electron because the periodic potential within the crystalline semiconductor causes the average
motion of the electron or hole to be equivalent to the motion of a particle of altered mass against
a background of uniform potential.
‡ The specimen can be doped so that the density of either free electrons or holes dominates by many
orders of magnitude and thus the effective mass of only one or the other is being measured; cf. Sec. 8.11.
40 See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., pp. 371-379, John Wiley and Sons,
Inc., New York, 1956.



immediately reach its static value, but instead approaches it gradually. If after the
ultimate value of magnetization is essentially achieved the field is suddenly removed,
the magnetization will not vanish instantly but instead will decay with the entire
process resembling the trace of Figure 6.22. In the case of many paramagnetic materials
the inertia in this process is attributable to two relaxation mechanisms. The first of
these results from a spin-lattice interaction through which energy may be interchanged
between the spin system and the lattice vibrations. The second mechanism is due to
spin-spin interaction through which an individual spin can exchange energy with the
magnetic field caused by neighboring spins. The spin-spin relaxation time is typically
about 10⁻¹⁰ seconds and is temperature independent. The spin-lattice relaxation time is
normally much longer; it varies with the material and is strongly temperature dependent.
In analogy with dipolar relaxation in dielectrics (cf. Section 6.19), this phenomenon
may be described in terms of the spin-lattice relaxation time τ; a repetition of that
earlier analysis leads once again to the Debye equations, such that the complex
permeability of the paramagnetic specimen is given by

$$ \mu^{-1}(\omega) = \mu_a^{-1} + \frac{\mu_s^{-1} - \mu_a^{-1}}{1 + j\omega\tau} \tag{7.104} $$

in which μ_a is the permeability at a frequency f_a such that f_a⁻¹ is much less than the
spin-lattice relaxation time τ but much greater than the spin-spin relaxation time; μ_s is the
static permeability.
If one introduces the complex paramagnetic susceptibility χ_m = χ_m′ − jχ_m″ by the
defining equation μ = (1 + χ_m)μ₀, and recognizes that χ_a and χ_s are both small
compared to unity, (7.104) may be converted to give

$$ \chi_m' = \chi_a + \frac{\chi_s - \chi_a}{1 + \omega^2\tau^2} \qquad \chi_m'' = \frac{(\chi_s - \chi_a)\,\omega\tau}{1 + \omega^2\tau^2} \tag{7.105} $$

which is a generalization of a result first obtained by Gorter and Kronig.⁴¹ Equations
(7.105) indicate a resonance at ω = τ⁻¹. Another resonance would occur at a frequency
equal to the reciprocal of the spin-spin relaxation time, but (7.105) does not show this
and these equations are not valid for frequencies that high. Agreement between experi-
ment and (7.105) in the appropriate frequency range is quite good.⁴²
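
The frequency dependence implied by the Debye form may be examined numerically. The sketch below assumes that, for small susceptibilities, (7.104) reduces to χ_m = χ_a + (χ_s − χ_a)/(1 + jωτ), so that the absorption term χ_m″ peaks at ωτ = 1; the function name and the sample values of χ_s and τ are hypothetical.

```python
def debye_susceptibility(omega, tau, chi_s, chi_a=0.0):
    """Return (chi', chi'') for chi_m = chi_a + (chi_s - chi_a) / (1 + j*omega*tau).

    chi_m is written as chi' - j*chi'', so chi'' is positive for absorption.
    """
    denom = 1 + (omega * tau) ** 2
    chi_real = chi_a + (chi_s - chi_a) / denom
    chi_imag = (chi_s - chi_a) * omega * tau / denom
    return chi_real, chi_imag


# Hypothetical specimen: static susceptibility 2e-3, spin-lattice time 1e-6 second.
TAU = 1e-6
CHI_S = 2e-3
for omega in (0.5 / TAU, 1.0 / TAU, 2.0 / TAU):
    chi_real, chi_imag = debye_susceptibility(omega, TAU, CHI_S)
    print(f"omega*tau = {omega * TAU:.1f}: chi' = {chi_real:.2e}, chi'' = {chi_imag:.2e}")
```

Setting `chi_a` to zero, as in these usage lines, corresponds to ignoring the spin-spin contribution.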
An interesting independent derivation of (7.105) has been provided by Casimir and
Du Pré using a thermodynamic argument.⁴³ A key point in their analysis is that the
spin-spin relaxation time is so short compared to the spin-lattice relaxation time that
the spin system can be treated as though it were always in thermodynamic equilibrium.
For this reason the aggregation of spins may be taken as a separate thermodynamic
41 C. J. Gorter and R. Kronig, "On the Theory of Absorption and Dispersion in Paramagnetic and
Dielectric Media," Physica, 3, 1009-1020; November 1936. In their analysis spin-spin relaxation was
ignored, which is equivalent to setting χ_a equal to zero.
42 L. J. F. Broer and C. J. Gorter, "Paramagnetic Dispersion in Gadolinium Salts," Physica, 10,
621-628; October 1943.
43 H. B. G. Casimir and F. K. Du Pré, "Thermodynamic Interpretation of Paramagnetic Relaxation
Phenomena," Physica, 5, 507-511; June 1938.



system, with its own temperature and entropy. This assumption gains its validity from
the fact that it leads to the Debye equations, and it further serves to explain the process
whereby temperatures below 1°K are attained by the adiabatic demagnetization of a
paramagnetic salt.
This cooling process is indicated by the entropy-temperature diagram of Figure 7.21.
The paramagnetic specimen is placed in good thermal contact with its surroundings
when the spin system has a temperature T 1 and an entropy S1. An external magnetic

[Figure: entropy plotted against temperature T, with T₂ and T₁ marked on the T axis.]
FIGURE 7.21 Adiabatic demagnetization.

field is then applied, increasing the ordering of the magnetic moments and thus
decreasing their entropy. Heat flows from the spin system to the lattice to the surroundings
and the isothermal path ab is traced out. The specimen is then insulated from its
surroundings and the external field is removed, resulting in the isentropic path bc and a
lower temperature T₂. Successive repetitions of this process have yielded temperatures
as low as 10⁻³ deg abs. The ultimate temperature attainable is limited principally by the
natural splitting of the spin energy levels (caused by the crystalline field).
4. Electron paramagnetic resonance. The discussion of Section 7.8 indicated that,
in a crystalline solid, the magnetic moments of the individual atoms or ions can adopt
only 2J + 1 orientations with respect to an external field direction. In a paramagnetic
material these magnetic moments are essentially independent and noninteracting. As a
consequence the 2J + 1 energy levels are approximately equally spaced in the presence
of a macroscopic field B₀, the increments being gm₀B₀ joules (cf. Equation 7.52).
If in addition to the steady field B₀ the specimen is in the presence of an electro-
magnetic wave of angular frequency ω, chosen to satisfy

$$ \hbar\omega = gm_0B_0 \tag{7.106} $$

then transitions among the 2J + 1 energy levels will be induced and the internal array
of magnetic moments can absorb energy from the wave. This phenomenon is known as
electron paramagnetic resonance. It may be observed by positioning a specimen inside

a rectangular waveguide and then placing the waveguide and specimen between the
pole faces of an electromagnet so that the field of the electromagnet is perpendicular to
the broad walls of the waveguide. When a TE₁₀ mode is passed down the waveguide
and detected at the far end, the detector readings show a dip as the field B₀ of the electro-
magnet is varied, the dip occurring at the value of B₀ which satisfies (7.106).
The first observation of this phenomenon was made by Zavoisky⁴⁴ in 1945 using the
paramagnetic salt CuCl₂·2H₂O. The clearly resolved resonance of Mn²⁺ obtained by
Cummerow and Halliday is illustrative of the experimental results which can be
achieved and is reproduced in Figure 7.22. Upon inserting in (7.106) the values

[Figure: absorption plotted against electromagnet field in webers/m² over the range 0 to 0.3.]
FIGURE 7.22 Paramagnetic resonance of Mn²⁺ ion in
MnSO₄·4H₂O. Observed at 2930 Mc. [After Cummerow
and Halliday, Phys Rev, 70, 433; 1946.]

ω/2π = 2,930 × 10⁶ and B₀ = 0.1125 derived from their experiment, one can determine
that g = 1.86 for the paramagnetic ion Mn²⁺. In an iron-group salt such as this, one
would expect the g factor to have a value close to the Landé factor g_J. The calculated
value of g_J is 2 so the agreement is quite satisfactory.
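
The g value quoted for Mn²⁺ follows directly from the resonance condition ħω = gm₀B₀, which may be written g = hf/(m₀B₀) with f = ω/2π. The sketch below uses the experimental frequency and field quoted in the text; Planck's constant and the Bohr magneton are standard values not quoted in this passage, and the function name is illustrative.

```python
PLANCK_H = 6.626e-34        # h in joule-seconds (standard value)
BOHR_MAGNETON = 9.274e-24   # m0 in A*m^2 (standard value)


def g_factor(frequency_hz, b0):
    """Return g from h * f = g * m0 * B0 (the resonance condition with f = omega / 2 pi)."""
    return PLANCK_H * frequency_hz / (BOHR_MAGNETON * b0)


# Values quoted in the text for the Mn2+ resonance.
g = g_factor(2930e6, 0.1125)
print(f"g = {g:.2f}")
```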
Electron paramagnetic resonance has proved to be an important research tool in
gaining further understanding of various interactions within solids and liquids. For
example, the crystalline electric field inside many solids affects the allowed orientations
of magnetic moments. Therefore an anisotropy in the measured g values is observed for
these materials as the crystal axes are changed with respect to the field direction. This
anisotropy permits deductions to be made about the crystalline field. Other investiga-
tions have been concerned with the detection of impurity atoms in semiconductors and
with studies of free radicals, excited molecules, and conduction electrons in metals.
5. Ferromagnetic resonance. Electron resonance also has been observed in ferromagnetic
substances, using the same absorption technique at microwave frequencies
which has just been described for paramagnetic materials. The ferromagnetic specimen
is usually a thin plate or disc, inserted in the waveguide so that its faces are transverse
to the longitudinal waveguide axis, and thus parallel to the B₀ field of the electromagnet.
For a constant microwave frequency, as the field of the electromagnet is varied, the

44 E. Zavoisky, "On the Absence of Anisotropy for Spin Magnetic Resonance," J Phys USSR, 9,
447-448; July 1945.

absorption traces a curve typified by Figure 7.23. This effect was first observed by
Griffiths⁴⁵ in 1946.
The analysis of ferromagnetic resonance differs from that of paramagnetic resonance
because the individual magnetic moments in a ferromagnetic specimen cannot respond
to an incident microwave field independently of each other. Consequently, it is appro-
priate to consider the interaction of the microwaves with the induced magnetization

[Figure: absorption plotted against electromagnet field in webers/m² over the range 0.68 to 0.80.]
FIGURE 7.23 Ferromagnetic resonance in nickel ferrite at 24 Gc.
[After Yager, Galt, Merritt, and Wood, Phys Rev, 80, 744; 1950.]

density M, rather than with the individual atomic magnetic moments. To this end, let
m be the magnetic moment of a single atom and let 𝒥 be the angular momentum of the
electron cloud, these quantities being related by m = −g(e/2m)𝒥 (cf. Equation 7.50).
In the presence of a magnetic field B_loc, the electron cloud experiences a torque m × B_loc
(cf. Example 4.7) and therefore the dynamic equation of motion of the cloud is

$$ \mathbf{m} \times \mathbf{B}_{\mathrm{loc}} = \frac{d\boldsymbol{\mathcal{J}}}{dt} = -\frac{2m}{ge}\,\frac{d\mathbf{m}}{dt} $$
45 J. H. E. Griffiths, "Anomalous HF Resistance of Ferromagnetic Metals," Nature, 158, 670-671;

November 9, 1946.

Upon multiplying both sides of this equation by the atom density N, one obtains

$$ \frac{d\mathbf{M}}{dt} = -\Gamma\mathbf{M} \times \mathbf{B}_{\mathrm{loc}} = -\Gamma\mathbf{M} \times \mathbf{B} = -\Gamma\mu_0\mathbf{M} \times \mathbf{H} \tag{7.107} $$

in which Γ = g(e/2m) is a substitution constant, called the gyromagnetic ratio, and the
two reductions in (7.107) are by virtue of (7.24) and (7.15).
If Z is selected as the longitudinal axis of the waveguide, with Y perpendicular to the
broad walls, the total field and magnetization may be written

$$ \mathbf{H} = \mathbf{1}_yH_0 + \mathbf{H}_1e^{j\omega t} \qquad \mathbf{M} = \mathbf{1}_yM_0 + \mathbf{M}_1e^{j\omega t} \tag{7.108} $$

in which H₀ is the quasistatic field of the electromagnet, M₀ is the magnetization it
induces, H₁ is the microwave field, and M₁ is the magnetization it induces. In a typical
experiment H₀ ≫ H₁, M₀ ≫ M₁. Making this assumption so that only first-order terms
are retained when (7.108) is inserted in (7.107), one finds that

$$ \begin{aligned} j\omega M_{1x} &= \Gamma\mu_0(M_{1z}H_0 - M_0H_{1z}) \\ j\omega M_{1y} &\approx 0 \\ j\omega M_{1z} &= \Gamma\mu_0(M_0H_{1x} - M_{1x}H_0) \end{aligned} \tag{7.109} $$
Since $H_{1z}$ is very much greater inside the ferromagnetic specimen than without, the
discontinuity in $M_{1z}$ at the material faces is accounted for essentially by the value of
$H_{1z}$ within the specimen. Thus one may replace $-H_{1z}$ by $M_{1z}$ in the first of Equations
(7.109) and then solve for the dynamic magnetic susceptibility in the X direction,
obtaining

$$\chi_x = \frac{M_{1x}}{H_{1x}} = \frac{M_0/H_0}{1 - (\omega/\omega_0)^2} \tag{7.110}$$

with $\omega_0 = g(e/2m)\sqrt{\mu_0 H_0 B_0}$ the ferromagnetic resonant frequency.
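As a rough numerical illustration of (7.110), the sketch below evaluates the resonant frequency $\omega_0 = g(e/2m)\sqrt{\mu_0 H_0 B_0}$ and the lossless susceptibility, and inverts the resonance condition to recover $g$ from a measured resonance. The function names and the sample field values are hypothetical, chosen only to exercise the formulas; they are not data from the text.

```python
import math

E_CHARGE = 1.602176634e-19   # magnitude of the electron charge, C
E_MASS = 9.1093837015e-31    # electron mass, kg
MU_0 = 4.0e-7 * math.pi      # permeability of free space, H/m


def resonant_omega(g, h0, b0):
    """Angular resonant frequency g (e/2m) sqrt(mu_0 H_0 B_0), in rad/s."""
    gamma = g * E_CHARGE / (2.0 * E_MASS)
    return gamma * math.sqrt(MU_0 * h0 * b0)


def chi_x(omega, omega0, m0, h0):
    """Lossless susceptibility of (7.110); it diverges at omega = omega0."""
    return (m0 / h0) / (1.0 - (omega / omega0) ** 2)


def g_from_resonance(omega0, h0, b0):
    """Invert the resonance condition to estimate g from a measured omega_0."""
    return omega0 * 2.0 * E_MASS / (E_CHARGE * math.sqrt(MU_0 * h0 * b0))


if __name__ == "__main__":
    # Hypothetical static field and magnetization in A/m (illustrative only).
    h0, m0 = 5.0e5, 2.0e5
    b0 = MU_0 * (h0 + m0)
    w0 = resonant_omega(2.2, h0, b0)
    print("f0 = %.2f GHz" % (w0 / (2.0 * math.pi) / 1e9))
    print("chi_x at half the resonant frequency = %.3f" % chi_x(0.5 * w0, w0, m0, h0))
```

Because (7.110) contains no loss term, the computed susceptibility is unbounded at resonance; a measured curve such as Figure 7.23 has a finite peak.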


Equation (7.110) indicates a resonance which is useful in determining the g values for
various ferromagnetic materials, and representative values include 2.15 for iron, 2.20
for nickel, and 2.22 for cobalt. These results are in satisfactory agreement with g values
obtained for these same elements using gyromagnetic experiments.⁴⁶
6. Hysteresis loss. If the external field applied to a ferromagnetic specimen contains
a time-harmonic component, the induced magnetization generally will consist of the
sum of steady and cyclical terms. The B-H relation will take on the appearance of the
closed loop shown in Figure 7.24, this loop being traced out once per cycle. The varia-
tions in magnetic moment are resisted by the crystal, and energy must flow from the
exciting field into the specimen to provide for this hysteresis loss.
The instantaneous power being supplied to a specimen of volume $V$ (cf. Section 7.6) is

$$P = \int_V \mathbf{H} \cdot \dot{\mathbf{B}}\,dV$$

so that the energy supplied per cycle is

$$W_m = \int_0^{\tau} P\,dt = \oint_C \int_V \mathbf{H} \cdot d\mathbf{B}\,dV \tag{7.111}$$
46 See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., pp. 408-410, John Wiley and Sons,
Inc., New York, 1956.



in which $C$ is the contour of the hysteresis loop and $\tau = 1/\nu$ is the period of the cyclic
field variations. If $\mathbf{H}$ and $\mathbf{B}$ are parallel, and if each is uniform throughout $V$, (7.111)
reduces to

$$W_m = V \oint_C H\,dB \tag{7.112}$$

and the integral in (7.112) can be recognized as the area enclosed by the hysteresis loop
in the BH plane. The time-average power supplied to account for hysteresis losses is
then simply $\nu$ times the value of $W_m$ computed from (7.112).
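The loop-area calculation can be sketched numerically as follows. This is a minimal illustration, assuming the loop is available as sampled (H, B) pairs taken in the order in which the loop is traced over one cycle; the function names and the elliptical test loop are hypothetical and stand in for measured data.

```python
import math


def loop_energy_density(h_samples, b_samples):
    """Trapezoidal estimate of the closed integral of H dB, in J/m^3 per cycle."""
    n = len(h_samples)
    total = 0.0
    for i in range(n):
        k = (i + 1) % n  # wrap around so the contour closes on itself
        total += 0.5 * (h_samples[i] + h_samples[k]) * (b_samples[k] - b_samples[i])
    return total


def hysteresis_power(h_samples, b_samples, volume, frequency):
    """Time-average power: frequency times volume times the loop area."""
    return frequency * volume * loop_energy_density(h_samples, b_samples)


if __name__ == "__main__":
    # Hypothetical elliptical loop: B lags H by a phase angle delta.
    h_max, b_max, delta = 100.0, 1.0, 0.2          # A/m, Wb/m^2, rad
    angles = [2.0 * math.pi * i / 1000 for i in range(1000)]
    h = [h_max * math.cos(a) for a in angles]
    b = [b_max * math.cos(a - delta) for a in angles]
    print("energy per cycle per unit volume:", loop_energy_density(h, b))
    print("power for 1e-5 m^3 at 60 cps:", hysteresis_power(h, b, 1.0e-5, 60.0))
```

For the elliptical test loop the exact area is $\pi H_{max} B_{max} \sin\delta$, which gives a convenient check on the numerical sum. A loop traced in the opposite sense would give the same magnitude with reversed sign.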

[FIGURE 7.24 Hysteresis loop with steady bias.]

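7. Tensor permeability in ferrites. If a ferrite specimen of arbitrary shape is placed
in a region containing both time-independent and time-harmonic magnetic fields, with
the steady field Z directed and much greater than the cyclic field, then the first-order
equations relating the magnetization and the fields are

$$\begin{aligned}
j\omega M_{1x} &= -\Gamma\mu_0 (M_{1y}H_0 - M_0 H_{1y}) \\
j\omega M_{1y} &= -\Gamma\mu_0 (M_0 H_{1x} - M_{1x}H_0) \\
j\omega M_{1z} &\approx 0
\end{aligned} \tag{7.113}$$

These equations arise from an expansion of (7.107), the development being similar to
that which led to (7.109). Simultaneous solution of these equations for $M_{1x}$ and $M_{1y}$
yields

$$\begin{aligned}
M_{1x} &= \frac{M_0/H_0}{1 - (\omega/\omega_0)^2}\left[H_{1x} + j\frac{\omega}{\omega_0}H_{1y}\right] \\
M_{1y} &= \frac{M_0/H_0}{1 - (\omega/\omega_0)^2}\left[-j\frac{\omega}{\omega_0}H_{1x} + H_{1y}\right]
\end{aligned} \qquad \omega_0 = \Gamma\mu_0 H_0 \tag{7.114}$$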

These equations may be written in matrix notation, namely,

$$\begin{pmatrix} M_{1x} \\ M_{1y} \end{pmatrix} = \frac{M_0/H_0}{1 - (\omega/\omega_0)^2}
\begin{pmatrix} 1 & j\omega/\omega_0 \\ -j\omega/\omega_0 & 1 \end{pmatrix}
\begin{pmatrix} H_{1x} \\ H_{1y} \end{pmatrix} \tag{7.115}$$

which can be recognized as being in the form $\mathbf{M}_1 = \chi_m \mathbf{H}_1$, with $\chi_m$ the complex tensor
magnetic susceptibility, since it is identifiable as $[M_0/H_0]/[1 - (\omega/\omega_0)^2]$ times the
matrix appearing in (7.115). From this it follows that the complex tensor permeability,
defined by $\mathbf{B}_1 = \mu\mathbf{H}_1 = \mu_0(1 + \chi_m)\mathbf{H}_1$, is given by the expression

$$\mu = \mu_0 \begin{pmatrix} \epsilon & j\chi & 0 \\ -j\chi & \epsilon & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{7.116}$$

in which

$$\epsilon = 1 + \frac{M_0/H_0}{1 - (\omega/\omega_0)^2} \tag{7.117}$$

$$\chi = \frac{\omega}{\omega_0}\,\frac{M_0/H_0}{1 - (\omega/\omega_0)^2} \tag{7.118}$$

Equation (7.116) indicates that in the presence of a static longitudinal field, the transverse
components of the time-harmonic field are coupled, in the sense that either $B_{1x}$
or $B_{1y}$ can give rise to both $H_{1x}$ and $H_{1y}$ components. This coupling has a resonant
feature at the angular frequency $\omega_0$.
EXAMPLE 7.9
Electromagnetic waves can propagate through a ferrite medium without suffering intolerable
attenuation because the conductivity is low. The interaction of the waves and the
ferrite is particularly interesting if the waves are circularly polarized and propagating in
the direction of the static field $\mathbf{B}_0$. If one lets $H_{1y} = \mp jH_{1x}$,† with $H_{1z} = 0$, Equation (7.116)
leads to

$$\begin{pmatrix} B_{1x} \\ B_{1y} \\ B_{1z} \end{pmatrix} = \mu_0
\begin{pmatrix} \epsilon & j\chi & 0 \\ -j\chi & \epsilon & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} H_{1x} \\ \mp jH_{1x} \\ 0 \end{pmatrix}$$

$$B_{1x} = \mu_0(\epsilon \pm \chi)H_{1x}$$

$$B_{1y} = \mu_0(-j\chi \mp j\epsilon)H_{1x} = \mu_0(\epsilon \pm \chi)H_{1y}$$

and therefore $\mathbf{B}_1 = \mu_0(\epsilon \pm \chi)\mathbf{H}_1$. This means that right and left circularly polarized waves
propagate through the ferrite as though it had a simple scalar permeability, but the value
of this permeability is different for the two senses of rotation. This fact has led to a variety
of useful microwave devices utilizing ferrites. Since $\epsilon$ and $\chi$ are both affected by $H_0$, varying
the d.c. magnetic field will cause a circularly polarized wave to suffer a variable phaseshift
in traversing a ferrite section. This phaseshifting feature is useful in its own right. In addition,
a linearly polarized wave may be treated as equal amounts of right and left circularly
polarized waves. Thus when a linearly polarized wave passes through the ferrite, its polarization
rotates; this feature permits the construction of circulators through the use of two
output ports in a waveguide section containing a ferrite, these ports being disposed at
right angles to each other. Control of $\epsilon$ and $\chi$ through $H_0$ determines which of these ports
is coupled to the emerging wave. The reader interested in these and other ferrite devices
is referred to the literature.⁴⁷

† This is the condition for circularly polarized waves. Cf. Sec. 5.10.
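The scalar-permeability result of Example 7.9 can be checked numerically. The sketch below builds the relative tensor of (7.116) from (7.117) and (7.118), applies it to the two circular polarizations, and confirms that B is parallel to H with the factors formed by the sum and difference of the diagonal and off-diagonal terms. The names `eps` and `chi` for those two terms, the function names, and the sample values of $\omega/\omega_0$ and $M_0/H_0$ are assumptions made for illustration.

```python
def permeability_terms(omega_ratio, m0_over_h0):
    """Return the diagonal and off-diagonal terms of (7.117) and (7.118), lossless case."""
    denom = 1.0 - omega_ratio ** 2
    eps = 1.0 + m0_over_h0 / denom
    chi = omega_ratio * m0_over_h0 / denom
    return eps, chi


def relative_b(eps, chi, h1):
    """Apply the relative permeability tensor of (7.116) to H1 = (H1x, H1y, H1z)."""
    hx, hy, hz = h1
    return (eps * hx + 1j * chi * hy,
            -1j * chi * hx + eps * hy,
            hz)


if __name__ == "__main__":
    # Hypothetical operating point: omega = 0.5 omega_0 and M0/H0 = 2.
    eps, chi = permeability_terms(0.5, 2.0)
    for label, hy in (("H1y = -j H1x", -1j), ("H1y = +j H1x", 1j)):
        bx, by, _ = relative_b(eps, chi, (1.0, hy, 0.0))
        print(label, "-> B1x/H1x =", bx, " B1y/H1y =", by / hy)
```

Both ratios printed for a given polarization agree, which is the statement that each circular polarization sees a scalar permeability; the two polarizations see different values.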

7.17 MAXWELL'S EQUATIONS FOR MAGNETIC MATERIALS

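If a collection of dielectric and/or magnetic materials† is considered at the microscopic
level to consist of an aggregation of charged particles in motion in a vacuum, then the
free space Maxwell's equations

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \mu_0^{-1}\,\nabla \times \mathbf{B} = \boldsymbol{\iota}_t + \epsilon_0\dot{\mathbf{E}}$$

are applicable at points within these materials, with $\boldsymbol{\iota}_t$ representing the total current
density. $\boldsymbol{\iota}_t$ is expressible as the linear sum

$$\boldsymbol{\iota}_t = \boldsymbol{\iota} + \boldsymbol{\iota}_b + \boldsymbol{\iota}_m$$

wherein $\boldsymbol{\iota}$ is the primary current density, $\boldsymbol{\iota}_b = \dot{\mathbf{P}}$ is the contribution made by the time-varying
dipole moments which represent the dielectric effect, and $\boldsymbol{\iota}_m = \nabla \times \mathbf{M}$ is the
contribution made by the time-varying magnetic moments which represent the magnetic
effect.
Proceeding as in Section 6.21, one may account for $\boldsymbol{\iota}_b$ by writing Maxwell's equations
in the form

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \mu_0^{-1}\,\nabla \times \mathbf{B} = \boldsymbol{\iota} + \boldsymbol{\iota}_m + \dot{\mathbf{D}} \tag{7.119}$$

Since

$$\mathbf{B} = \mu_0(\mathbf{H} + \mathbf{M})$$

it follows that

$$\boldsymbol{\iota}_m = \nabla \times \mathbf{M} = \mu_0^{-1}\,\nabla \times \mathbf{B} - \nabla \times \mathbf{H}$$

When this relation is substituted in (7.119) one obtains

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla \times \mathbf{H} = \boldsymbol{\iota} + \dot{\mathbf{D}} \tag{7.120}$$

which is the form of Maxwell's equations suitable for application to dielectric and/or
magnetic materials. The auxiliary relations remain

$$\nabla \cdot \mathbf{D} = \rho \qquad \nabla \cdot \mathbf{B} = 0 \tag{7.121}$$

These results will be generalized further in the next chapter to include conductive
materials.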

† This includes materials which are both dielectric and magnetic.
47 See, e.g., C. L. Hogan, "The Microwave Gyrator," BSTJ, 31, 1-31; January 1952. Also, R. F.
Soohoo, Theory and Applications of Ferrites, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1960.

REFERENCES

1. Corson, D. R., and P. Lorrain, Introduction to Electromagnetic Fields and Waves, W. H.
Freeman and Company, San Francisco, California, 1962.
2. Dekker, A. J., Solid State Physics, Prentice-Hall, Inc., Englewood Cliffs, New Jersey,
1957.
3. Dekker, A. J., Electrical Engineering Materials, Prentice-Hall, Inc., Englewood Cliffs,
New Jersey, 1959.
4. Katz, H. W., ed., Solid State Magnetic and Dielectric Devices, John Wiley and Sons, Inc.,
New York, 1959.
5. Kittel, C., Introduction to Solid State Physics, 2d ed., John Wiley and Sons, Inc., New
York, 1956.
6. Panofsky, W. K. H., and M. Phillips, Classical Electricity and Magnetism, Addison-
Wesley Publishing Company, Inc., Reading, Massachusetts, 1955.
7. Reitz, J. R., and F. J. Milford, Foundations of Electromagnetic Theory, Addison-Wesley
Publishing Company, Inc., Reading, Massachusetts, 1960.
8. Sears, F. W., An Introduction to Thermodynamics, Kinetic Theory of Gases, and Statistical
Mechanics, 2d ed., Addison-Wesley Publishing Company, Inc., Reading, Massachusetts,
1953.
9. Van Vleck, J. H., Theory of Electric and Magnetic Susceptibilities, Oxford University Press,
London, 1932.

PROBLEMS

7.1 A magnetic specimen has a permanent magnetization $\mathbf{M}(x,y,z)$ and occupies a volume $V$.
Its magnetic moment is defined by the integral $\int_V \mathbf{M}\,dV$. If the specimen is placed in a
uniform magnetic field $\mathbf{B}_0$, find the torque it experiences in terms of its magnetic moment.
7.2 A long straight wire of radius $a$ carries a steady current $I$ and is on the axis of a long
hollow cylinder of soft iron. The inner and outer radii of the cylinder are $b$ and $c$ and its
relative permeability is $\mu_r$. Find the total flux of $\mathbf{B}$ inside the cylinder, per unit length.
Determine the equivalent current density throughout the iron cylinder, including its inner
and outer surfaces. From this express $\mathbf{B}$ external to the cylinder.
7.3 Show that the force on an atomic current loop of strength $\mathbf{m}$, when placed in a non-uniform
magnetic field $\mathbf{B}$, is given by $\mathbf{F} = (\mathbf{m} \cdot \nabla)\mathbf{B}$.
7.4 In a development which parallels Example 6.3, discuss the types of cavities which could
be constructed inside a specimen of magnetic material in order to measure $\mathbf{B}$ and $\mathbf{H}$.
7.5 A spherical shell of radii $a$ and $b$ is permanently magnetized with a uniform magnetization
$\mathbf{M}$. Find $\mathbf{B}$ and $\mathbf{H}$ in all three regions.
7.6 A homogeneous spherical shell of magnetic material, of permeability $\mu < \mu_0$, is placed in
a field which was originally uniform at strength $\mathbf{B}_0$. Find the total field inside the shell.
7.7 A homogeneous sphere of magnetic material is uniformly magnetized to a strength $M$
by a coil on its surface, which carries a steady current $I$. How is the winding designed?
(The Westinghouse-Goudsmit mass spectrometer employs such a winding on a hollow
sphere.)
7.8 If an atomic model is assumed which consists of a nucleus surrounded by a spherical
charge cloud, all elements of the latter rotating about the nucleus at a common angular
velocity $\omega$, show that the orbital magnetic moment of such an atom is also given by (7.41).
7.9 A homogeneous sphere of mass $M$ has a charge $Q$ uniformly distributed over its surface
and is rotating about an axis through its center at angular velocity $\omega$. Find the gyromagnetic
ratio for this system (i.e., the ratio of magnetic moment to angular momentum).
7.10 Using the formula for the sum of a geometric progression, show that (7.55) reduces to
(7.56), with the Brillouin function given by (7.57). Then show that, for $x \ll 1$, the Brillouin
function is well-approximated by $(J + 1)x/3J$. Finally show that for $J$ large the Brillouin
and Langevin functions are essentially equivalent.
7.11 The Curie constant and paramagnetic Curie temperature which appear in the Curie-
Weiss law (7.61) may be deduced from experimental data relating $\chi_m$ to temperature.
For nickel, such data are shown in the plot of Figure 7.8. Use this curve to determine
$C$ and $\theta$ for nickel, and then use (7.67) to check the reasonableness of your results.
7.12 The trivalent ion of thulium, Tm³⁺, has the outer shell configuration $4f^{12}5s^{2}p^{6}$. Using
Hund's rules and the Landé formula, show that for this ion $S = 1$, $L = 5$, $J = 6$, and
$g_J = 1.167$. Thus show that $\mu_{eff} = 7.57$ Bohr magnetons. (The experimental value is 7.3.)
7.13 For several typical ferromagnetic materials, establish that $m_0 B_{loc} \approx kT_c$. Is this a reasonable
result?
7.14 Using the value $J = \tfrac{1}{2}$ and a construction similar to Figure 7.10, determine the fractional
magnetization for nickel at room temperature. What is the saturation magnetization?
7.15 From Figure 7.20 deduce the internal field constants $\alpha$, $\beta$, $\gamma$ for magnetite.
CHAPTER 8
Conductive Materials
ELECTRICAL CONDUCTIVITY varies widely from one class of materials to another; at
room temperature a typical insulator will display a conductivity in the range of $10^{-16}$
mhos/m, whereas for a semiconductor the value might be $10^{-2}$ mhos/m, and for a
metal such as silver the conductivity will be as high as $10^{8}$ mhos/m. Most of the discussion
to be undertaken in this chapter will be concerned with highly conductive materials,
though some of the results (such as Ohm's law) will be seen to apply to the less conductive
materials as well.
The band theory of solids will be invoked to explain the basic differences among
insulators, semiconductors, and conductors. The free electron theory will then be used
to describe the conduction process in metals. This theory views a metal, at the atomic
level, as consisting of a lattice of positive ions which is held together by an electron gas,
with motions of this gas accounting for the conductive properties. With the aid of this
model, a derivation of Ohm's law will be presented which assumes an electron-lattice
interaction that impedes the flow of the electron gas. A relaxation time and mean free
path will be deduced for the electrons with the aid of Fermi-Dirac statistics and an
atomic interpretation of Joule heat loss will also be developed. The temperature
dependence of the resistivity of metals will be considered, including the effects of
impurities. Some attention will be given to the heat capacity and thermal conductivity
of metals and the connection between thermal and electrical conductivities will be
developed. Consideration will then be given to semiconductors, and the manner in
which their conductivity varies with impurity concentration and temperature. The
chapter concludes with a discussion of the form of Maxwell's equations suitable for
conductive media.

8.1 * HISTORICAL SURVEY

The discovery that certain materials could be used to convey electricity from one place
to another was made by Stephen Gray¹ in 1729. His experiments were quasistatic and
originally dealt with a glass tube about three feet long, to one end of which he fitted a
cork. Upon rubbing the glass tube Gray found that the cork also became electrified, and
concluded that . . . "there was certainly an attractive Vertue communicated to the

* This section may be omitted without loss in continuity of the technical presentation.
1 S. Gray, "Several Experiments Concerning Electricity," Phil Trans Roy Soc (London), 37, 18-44;
February 1731.

Cork by the excited Tube." Stimulated by this result, Gray interposed a wooden rod
between the glass tube and cork and observed the same effect. Next he connected tube
and cork with iron or brass wire. Still the same effect, undiminished by the length of
wire. Finally, he tied one end of a length of hemp cord to the glass rod and the other
end to an ivory ball. Using lengths of cord as great as four hundred feet, Gray was able
to electrify the ball by rubbing the distant glass rod.
Among those to whom Gray first communicated this discovery of electrical conduc-
tion was J. T. Desaguliers (1683-1744), who continued the experiments after Gray's
death in 1736. Desaguliers determined² that only a limited class of materials, notably
the metals, could convey electricity easily, and to these materials he gave the name
conductors.
Further progress with the investigation of conductive properties was hampered by
the lack of a source capable of maintaining the flow of electricity. However, Beccaria
was able to show³ in 1753 that when an electric discharge was passed through a circuit
containing a tube of water the shock was more powerful if the tube cross section were
increased. And Henry Cavendish partially anticipated Ohm's law when he showed
that the resistance of a conductor is independent of the strength of the discharge. He
also established the manner in which a discharge divides itself among a set of conductors
in parallel, and determined several relative conductivities, saying in a memoir⁴
presented to the Royal Society in 1775:
It appears from some experiments, of which I propose shortly to lay an account before
this Society, that iron wire conducts about 400 million times better than rain or distilled
water-that is, the electricity meets with no more resistance in passing through a piece of
iron wire 400,000,000 inches long than through a column of water of the same diameter
only one inch long. Sea-water, or a solution of one part of salt in 30 of water, conducts
100 times, or a saturated solution of sea-salt about 720 times, better than rain-water.
The details of these experiments lay undisclosed for a century until Maxwell's posthu-
mous edition of Cavendish's papers appeared in 1879.
Invention of the first chemical battery by Alessandro Volta (1745-1827) was an
immediate stimulus to the study of conductivity. Motivated by Galvani's researches
on animal electricity, Volta developed a pile consisting of pairs of strips of dissimilar
metals immersed in brine or a weak acid electrolyte. When a circuit was formed by
connecting a wire across the pairs of strips, a continuous electric current was observed
to flow. This was one of the most important discoveries in the history of electrical
science, and was communicated to Sir Joseph Banks, President of the Royal Society
in London, in a letter dispatched by Volta from his home in Como, Italy, on March
20, 1800. This letter arrived in two sections and was ultimately read before the Society
on June 26th, being published in the Transactions later that year.⁵
The announcement of this discovery was so startling that scientists on both sides
2 J. T. Desaguliers, "Some Thoughts and Experiments Concerning Electricity," Phil Trans Roy Soc
(London), 41, 186-210; July 1739.
3 G. B. Beccaria, Dell' elettricismo artificiale e naturale, p. 113, Turin, 1753.
4 H. Cavendish, "An Account of Some Attempts to Imitate the Effects of the Torpedo by Electricity,"
Phil Trans Roy Soc (London), 66, 196-225; 1776.
5 A. Volta, "On the Electricity Excited by the Mere Contact of Conducting Substances of Different
Kinds," Phil Trans Roy Soc (London), 90, 403-436; 1800.



of the Atlantic immediately set forth to repeat and extend Volta's experiments. Even
before the delayed second section of Volta's letter reached England, Nicholson and
Carlisle had constructed a voltaic pile and with it effected the electrical decomposition
of water into its constituent gases. This achievement was then extended by Cruickshank,
who showed that metallic salt solutions could be similarly decomposed. Wollaston
next showed⁶ that water could also be decomposed by a discharge of frictional
electricity, thus inferring that the sources of voltaic electricity were common with
those of electrostatic phenomena.
These experiments attracted the attention of Humphry Davy (1778-1829), who at
about this time was appointed Professor of Chemistry at the Royal Institution in
London. Together with William Pepys, an instrument maker and Fellow of the Royal
Society, Davy designed and had constructed a succession of voltaic piles which were
the largest then in existence. The last of these was built in 1808 and consisted of 2,000
pairs of plates of zinc and copper, each plate being 6 in. square. With these batteries,
Davy melted iron wires up to a tenth inch in diameter and decomposed alkalis, obtain-
ing thereby potash and soda ash from which he extracted the new elements potassium
and sodium. He was also able to melt quartz, sapphire, and platinum, to evaporate
diamond, and to boil liquids such as water and oil. The new elements barium, strontium,
magnesium, and calcium were extracted from the decomposition of alkaline earths.
And Pepys in 1815 utilized the intense heat developed by the voltaic pile to melt iron
wire and diamond dust together, thus directly carburizing the iron and producing steel.
In 1821 Davy turned his attention to the problem of determining the ability of
various metals to conduct a voltaic current.⁷ He accomplished this by connecting a
voltaic battery across a circuit consisting of a column of water in parallel with the
metallic wire being investigated. When the length of wire was less than a certain critical
value, the division of current was such that the water ceased to decompose. Davy
measured the lengths and weights of wires of different materials which would cause
this critical condition; by comparing the results he was able to show that the critical
conductance of a wire was inversely proportional to its length l and directly proportional
to its cross sectional area A, though independent of the shape of the cross section.
Critical conductance could thus be expressed by the formula $G = \sigma(A/l)$ in which $\sigma$ is a
fundamental material property called the electrical conductivity. With this apparatus
Davy also was able to compare the conductivities of different metals, and determined
additionally that critical conductivity varied inversely with temperature.
A year earlier Ampere had provided a usable definition for current and devised an
instrument for measuring it, which he called a galvanometer. (Cf. Section 4.1.) He
distinguished between electric tension (voltage) and electric current, and observed that
electric tension existed in a voltaic battery before the circuit was closed, being detect-
able through the use of an electroscope. He viewed tension as a cause and current as an
effect. Ampère realized that a relation existed between the cause and the effect, but
neither he nor Davy appreciated that the relation was a simple ratio in proportion to
6 W. H. Wollaston, "Experiments on the Chemical Production and Agency of Electricity," Phil Trans
Roy Soc (London), 91, 427-434; June 1801.
7 H. Davy, "Further Researches on the Magnetic Phenomena Produced by Electricity; With Some
New Experiments on the Properties of Electrified Bodies in Their Relations to Conducting Powers and
Temperature," Phil Trans Roy Soc (London), 111, 425-439; July 1821.

Davy's critical conductivity figures. This final link in the chain was forged by Georg
Simon Ohm⁸ (1787-1854) in the year 1826.
Working with deficient apparatus, Ohm was nevertheless able to perform a series
of carefully devised and definitive experiments which firmly established the law of
conduction which now bears his name. Preliminary investigations using voltaic batteries
proved unsatisfactory, because the electric tension of such cells fluctuated with time
due to chemical changes. For this reason Ohm substituted as source a thermoelectric
battery, the principle of which had been discovered by Seebeck in 1821. Using strips
of copper and bismuth joined at their two ends, Ohm kept one point of contact in boiling
water and the other in ice, and thereby obtained a very stable current in any external
circuit he connected across the two points of contact. A magnetic needle was placed
over the circuit and suspended from a torsion balance so that the current strength
could be gauged by the torsion needed in the balance in order to preserve the pointing
direction of the needle.
In one series of experiments, Ohm prepared eight copper conductors of common
cross section but different lengths and placed them in turn across the battery, recording
the following data:

Length of conductor, in.              | 2   | 4   | 6    | 10   | 18  | 34   | 66 | 130
Strength of magnetic action, torsion  | 305 | 282 | 258¼ | 223½ | 178 | 124¾ | 78 | 44

He then went on to analyze these results, saying:

The numbers already given can be represented very satisfactorily by the equation

$$X = \frac{a}{b + x}$$

in which X is the strength of magnetic action when the conductor is used whose length is x,
and a and b are constants which represent magnitudes depending on the exciting force and
the resistance of the rest of the circuit. If for example we set b equal to 20¼ and a
equal to . . . 6800, we obtain by calculation the following results

305½ | 280½ | 259 | 224¾ | 177¾ | 125¼ | 79 | 45

If we compare these numbers found by calculation with the former set found by experiment,
it will appear that the differences are very small, and are of the order that one might expect
in researches of this kind.
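Ohm's arithmetic is easy to repeat. The following sketch evaluates his formula X = a/(b + x) at the eight conductor lengths of the table, taking a = 6800 from the quotation and reading the constant b as 20.25 (twenty and one quarter); that reading of the fraction is an assumption, and the function name is hypothetical.

```python
# Conductor lengths, in inches, from Ohm's table.
LENGTHS_IN = [2, 4, 6, 10, 18, 34, 66, 130]


def magnetic_action(x, a=6800.0, b=20.25):
    """Ohm's fitted relation X = a / (b + x); b = 20.25 is an assumed reading."""
    return a / (b + x)


predicted = [magnetic_action(x) for x in LENGTHS_IN]

if __name__ == "__main__":
    for length, value in zip(LENGTHS_IN, predicted):
        print("%4d in.  X = %7.2f" % (length, value))
```

In modern terms a plays the part of the driving voltage and b that of the resistance of the rest of the circuit, expressed as an equivalent length of the test wire.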

In this same paper Ohm reported four other trials for each of which the same values
of a and b gave comparable agreement. He then considered wires of different material
and different diameter and established the general validity of his formula, clearly
identifying the parameter a with the electroscopic force (voltage) of the battery. It
was then possible to deduce from his formula Davy's result that resistance is proportional
to length, and inversely proportional to cross-sectional area. Ohm also confirmed
the temperature dependence of conductivity previously reported by Davy.
8 G. S. Ohm, "Determination of Laws Whereby Metals Conduct Contact Electricity," J Chemie und
Physik (Schweigger's Journal), 46, 137; 1826.
Not yet satisfied, Ohm next made the important generalization that the law he had
discovered applied to any part of the circuit as well as to the entire length of wire. He
compared the flow of electricity to the flow of heat, and drew the parallel that electro-
scopic force played the same role with respect to current that temperature did with
respect to heat conduction. However, neither Ohm nor his contemporaries truly appre-
ciated the relation between the electroscopic force of a battery and the electrostatic
potential of Poisson. Several decades were to pass before this relation was widely
understood,⁹ and Ohm was forced to endure a long, bitter period during which the true
value of his work was neither recognized nor rewarded.
The law which connects the current flowing in a metallic conductor to the heat
evolved was determined by J. P. Joule (1818-1889) in the year 1841.¹⁰ This was accomplished
by coiling wires of different lengths, cross sections, and composition onto thin
glass tubes, and then immersing the resulting assemblies in separate beakers containing
measured quantities of water. When the same intensity of steady current was passed
through the different coils, the water was found to heat up to an equilibrium tempera-
ture which differed among the several beakers, but in such a way that the change in
temperature was proportional to the resistance of the coil in question. From this Joule
concluded
. . . that when a given quantity of voltaic electricity is passed through a metallic con-
ductor for a given length of time, the quantity of heat evolved by it is always proportional
to the resistance which it presents, whatever may be the length, thickness, shape or kind
of that metallic conductor.

Joule then reasoned:


On considering the above law, I thought that the effect produced by the increase of the
intensity of the electric current would be as the square of that element, for it is evident
that in that case the resistance would be augmented in a double ratio, arising from the
increase of the quantity of electricity passed in a given time, and also from tie increase of
the velocity of the same. vVe shall immediately see that this view is actually sustained by
experirnen t.

Joule then established this last feature of the law which bears his name by showing
that the temperature rise of a coiled wire immersed in water is proportional to the square
of the current passing through it.
The ease with which heat passes through a conductor also was found to depend on
its electrical conductivity, and in 1853 Wiedemann and Franz obtained the experimental
result that at any temperature the ratio of the thermal conductivity of a body to its
electrical conductivity is approximately the same for all metals, and that the value
of this ratio is proportional to the absolute temperature.¹¹
As the atomic nature of materials became better understood, these experimental laws
governing electrical conductivity were rendered susceptible to theoretical derivation.
9 The clarification was achieved mainly by Gustav Kirchhoff. See, e.g., "On a Derivation of Ohm's
Law which Agrees with Electrostatic Theory," Ann Phys, 78, 506-513; 1849.
10 J. P. Joule, "On the Heat Evolved by Metallic Conductors of Electricity," Phil Mag, 19, 260-265;
August 1841.
11 G. Wiedemann and R. Franz, "On the Heat Conductivity of Metals," Ann Phys, 89, 497-531; 1853.

Shortly after the discovery of the electron by J. J. Thomson in 1897, an atomic model
of a conductor was proposed by Drude in which free electrons were pictured as wander-
ing through a lattice of fixed positive ions. The application of an electric field would
cause the free electron gas to drift, resulting in a current, and electron-ion collisions
could then account for the resistance to this current flow. The work of several investigators,
based on this model, culminated in a theory by Lorentz¹² which used Maxwell-Boltzmann
statistics and employed the Boltzmann transport equation to derive Ohm's
law and obtain a specific formula for the conductivity. Thermal conductivity also came
within the scope of this analysis when one assumed that the thermal energy was trans-
ported by the free electrons. A major accomplishment of the theory was a confirmation
of the Wiedemann-Franz law.
Another success of the Maxwell-Boltzmann statistics was the derivation that the
specific heat capacity of a dielectric solid should be 3R, with R the ideal gas constant.
This result was based on a consequence of the statistics that each degree of freedom
of the particles of a system yielded a mean energy per particle of $kT/2$, with $k$ Boltzmann's
constant and $T$ the absolute temperature. It agreed with the experimental
result of Dulong and Petit which had been established in 1819. However, when the
same argument was applied to a conductor, the prediction was that the free electron
gas should make a contribution of 3R/2 to the specific heat capacity. Experiment did
not reveal this contribution, and matters were further complicated in that the specific
heat capacity of all solids was found to decrease as absolute zero is approached, in con-
tradiction to the theory.
These theoretical difficulties were overcome with introduction of the quantum statis-
tics. An explanation of the behavior of specific heat capacity at low temperatures was
offered by Einstein in 1906, and an improved theory was then put forward by P. Debye
in 1912. Later workers have refined this analysis so that theory and experiment are
now in satisfactory agreement. In the case of metals, when the Fermi-Dirac statistics
are applied to the free electron gas, the contribution to specific heat capacity is found
to be small, in agreement with experiment.
In 1928 A. Sommerfeld¹³ reworked Lorentz' theory of conduction in metals, retaining
the model of a free electron gas but replacing the Maxwell-Boltzmann statistics by
quantum statistics. The results were particularly satisfactory in the case of monovalent
metals, for which the assumption of a spatially uniform potential within the metal (an
inherent feature of the free electron model) is a good one. Better agreement with experi-
ment for a wide variety of solids has been achieved by assuming a spatially periodic
variation of potential in accordance with the lattice dimensions, this development
being part of what is known as the band theory of solids.
These refinements have produced an acceptable theory of solids with respect to
electric and thermal conductivity and specific heat capacity, including many temperature
effects and containing derivations of the laws of Ohm, Joule, and Wiedemann-Franz.
A clear delineation of the basic differences among materials which are classified
as insulators, semiconductors, or conductors has been a major triumph of the band
theory. In the case of semiconductors, use of the Fermi-Dirac statistics to determine
electron population in the different energy bands has led to a completely satisfactory
explanation of the dependence of conductivity on impurity concentration and
temperature.
12 H. A. Lorentz, Proc Amst Acad, 7, 438, 585, 684; 1904-1905.
13 A. Sommerfeld, "On the Electron Theory of Metals Based on the Fermi Statistics," Z Phys, 47,
1-32; 1928.
One experimental discovery of significance which is still engaging the attention of
theorists is the superconductivity effect uncovered by Kamerlingh Onnes of Leiden
in 1911. This phenomenon is confined principally to those metals which are not among
the better conductors at room temperature, and is characterized by a resistivity versus
temperature curve which suffers a sharp drop within a few degrees of absolute zero,
and then tends to zero as the temperature is reduced further. Applications to switching
and magnetic coils have heightened interest in these superconductive materials.

8.2 CLASSIFICATION OF CONDUCTIVE PROPERTIES UNDER THE BAND THEORY

As in Chapter 7, it will be assumed that the reader has some familiarity with quantum
mechanics and is acquainted with the solutions of Schrödinger's equation applicable
to a hydrogen atom. These solutions form a discrete set, with individual solutions
identified by four quantum numbers and a characteristic energy given quite accurately
by the formula

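$$E_n = -\frac{m}{2\hbar^2 n^2}\left(\frac{e^2}{4\pi\epsilon_0}\right)^2 \tag{8.1}$$

in which $m$ and $-e$ are the mass and charge of an electron, $\epsilon_0$ is the permittivity
of free space, $\hbar$ is the reduced Planck's constant, and $n$ is the principal quantum number.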
Using the quantum selection rules discussed in Section 7.8, one sees that if $n = 1$,
the quantum numbers $l$ and $m_l$ are constrained to be zero; $m_s$ can be either $+\tfrac{1}{2}$ or $-\tfrac{1}{2}$.
Thus there are two quantum states with an energy $E_1$ calculable from (8.1). Similarly,
if $n = 2$, the quantum number $l$ could be zero, in which case $m_l = 0$. Another possibility
is that $l = 1$, and then $m_l$ can have any one of the three values $0, \pm 1$. For all
these four allowed combinations of $l$ and $m_l$, $m_s$ can be $\pm\tfrac{1}{2}$, so there is a total of eight
quantum states possessing an energy $E_2$ calculable from (8.1). Proceeding in this
fashion for higher values of $n$, one can determine the totality of allowed quantum
states at each energy level. If one uses the spectroscopic designation of Table 8.1, this
information may be displayed in an energy level diagram, as shown in Figure 8.1.
Each short line segment represents one allowed state and a dot placed on a particular
line segment can then indicate a hydrogen atom whose electron is in that particular
state. As an illustration the case of an atom in the excited 2p state is shown in the
figure.

TABLE 8.1
THE CORRESPONDENCE OF SPECTROSCOPIC NOTATION AND QUANTUM NUMBERS

l                            0    1    2    3    4
Spectroscopic designation    s    p    d    f    g
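The counting argument above lends itself to a short enumeration. What follows is a minimal sketch in Python, assuming the usual selection rules ($l$ from 0 to $n - 1$, $m_l$ from $-l$ to $+l$, $m_s = \pm\tfrac{1}{2}$); the function name `hydrogen_states` is illustrative and does not come from the text.

```python
from fractions import Fraction

SPECTROSCOPIC = "spdfg"   # Table 8.1: designation for l = 0, 1, 2, 3, 4

def hydrogen_states(n):
    """List the allowed (n, l, m_l, m_s) combinations for principal quantum number n."""
    states = []
    for l in range(n):                      # l = 0, 1, ..., n - 1
        for m_l in range(-l, l + 1):        # m_l = -l, ..., +l
            for m_s in (Fraction(-1, 2), Fraction(1, 2)):
                states.append((n, l, m_l, m_s))
    return states

for n in range(1, 5):
    states = hydrogen_states(n)
    # Collect the distinct spectroscopic labels (1s, 2s, 2p, ...) present at this n.
    labels = sorted({f"{n}{SPECTROSCOPIC[l]}" for (_, l, _, _) in states})
    print(n, len(states), labels)
```

For $n = 1$ and $n = 2$ the sketch reproduces the two and eight states counted in the text.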
FIGURE 8.1 Approximate energy level diagram for hydrogen. (Vertical axis: energy. Levels n = 1 to 4, with columns l = 0 to 3 shown separately for $m_s = -\tfrac{1}{2}$ and $m_s = +\tfrac{1}{2}$.)

FIGURE 8.2 Energy levels for a single atom and for a pair of identical atoms with close spacing. (Vertical axis: energy. Levels 1s, 2s, 2p, 3s, 3p, 3d. Panels: (a) One atom; (b) Two atoms.)

Diagrams similar to Figure 8.1 may be constructed for other elements and displayed
in a somewhat simplified form, as shown in Figure 8.2a, wherein the segments representing
different quantum states have been coalesced into a single horizontal line.
Note that in this more general case, the energy of a state is dependent on the quantum
number $l$ as well as the quantum number $n$. (This is really true in the case of hydrogen
as well, since formula 8.1 is not quite precise.)
If two atoms of the same element are widely enough separated, their energy level
diagrams are identical, and each can be represented by Figure 8.2a. However, if the
atoms are brought close together, they interact. This introduces a coupling term into
Schrödinger's equation and causes the solutions to split into pairs, so that the energy
level diagram is represented by Figure 8.2b. The degree of splitting is governed by the
distance of separation. An important feature to note is that the total number of allowed
states is twice the number for an isolated atom, since there are now two atoms in the
system.

FIGURE 8.3 Energy bands in a crystalline solid. (Vertical axis: energy; horizontal axis: interatomic spacing. Levels 1s, 2s, 2p, 3s, 3p.)
This idea can be generalized to include the case of a crystal (i.e., a three-dimensional
array of atoms). If the interatomic distance is varied, an energy level diagram will take
the form suggested by Figure 8.3. As the separation of atoms decreases, the lines representing
energy states begin to separate and form bands. In a typical solid of atom
density $10^{29}$ per m$^3$, there will be approximately $10^{29}$ states in each band so that a band
may be looked upon as a quasicontinuous region of allowed energy states. Between
these allowed energy bands (unless they overlap) are forbidden regions or gaps, so-called
because no electron of any atom in the solid can have an energy corresponding to a level
in one of these forbidden regions.
Depending on the equilibrium value $r_0$ of the interatomic separation distance, two
particular allowed energy bands may or may not overlap. As indicated by Figure 8.3,
the lower energy states split into bands at a smaller atom spacing. This is reasonable,
since these states represent electrons closer to their nuclei and therefore less coupled
to electrons of other atoms. For this reason the higher energy bands overlap first, and
the diagram indicates a situation in which the equilibrium spacing is such that all but
the 1s and 2s states have split into bands, with the 3s and 3p bands overlapping.

FIGURE 8.4 A plot of the Fermi function vs. energy for several temperatures. (Vertical axis: $f(w)$; horizontal axis: $w$, with $w_F(0)$ marked.)

For any finite atom spacing, if one looks high enough on an energy diagram of the
type represented by Figure 8.3, overlapping bands will be encountered. This feature is
of little significance if, for the material in question, the overlapping bands of allowed
energy states normally are unoccupied by electrons. However, if two overlapping bands
normally are partially filled, electronic conduction can occur readily, as shall be seen
shortly. Therefore, it is important to be able to determine the degree of occupancy of
the allowed energy levels in the various bands. This can be accomplished with the aid
of the Fermi-Dirac statistics.
It is shown¹⁴ in texts concerned with quantum statistics that, for particles (such as
electrons) which obey the Pauli exclusion principle, the probability that a particular
quantum state having an energy $w$ is occupied is given by the Fermi-Dirac function

$$f(w) = \frac{1}{e^{(w - w_F)/kT} + 1} \tag{8.2}$$

in which $w_F$ is a temperature-dependent parameter known as the Fermi energy. When
(8.2) refers to the energy levels in a crystalline solid, the value of $w_F$ is also found to
depend on the material being considered.
¹⁴ See, e.g., F. W. Sears, An Introduction to Thermodynamics, the Kinetic Theory of Gases, and Statistical
Mechanics, 2d ed., Chap. 16, Addison-Wesley Publishing Company, Inc., Reading, Massachusetts,
1963.

In the temperature range $T \le 1000$°K, $kT$ does not exceed one-tenth of an electron volt
and, for most crystalline materials, $w_F/kT \gg 1$. Therefore, if $w \ll w_F$, the exponential
term in (8.2) is very small and $f(w) \approx 1$, indicating that all states having energies much
smaller than the Fermi energy essentially are completely occupied. Alternatively, if
$w \gg w_F$, the exponential term in (8.2) is quite large and $f(w)$ tends to zero exponentially
as $w$ increases. A plot of (8.2) is given in Figure 8.4 for several temperatures and it is
evident that states whose energies exceed $w_F$ by more than a few $kT$ are virtually unoccupied.
At absolute zero the distribution simplifies to a step function, with all states below
$w_F$ fully occupied and all states above $w_F$ totally empty.
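The limiting behavior of (8.2) described above can be checked numerically. The sketch below is illustrative only: the function name `fermi`, the choice of electron volts as the energy unit, and the sample Fermi energy of 7.0 eV are assumptions made for the example, and the value assigned exactly at $w = w_F$ when $T = 0$ is a convention adopted here.

```python
import math

K_BOLTZMANN_EV = 8.617e-5   # Boltzmann's constant in eV per degree Kelvin

def fermi(w, w_f, temperature):
    """Occupation probability from (8.2); energies in eV, temperature in degrees Kelvin."""
    if temperature == 0:
        # Step-function limit; 1/2 exactly at w_f is a convention chosen for this sketch.
        if w < w_f:
            return 1.0
        return 0.5 if w == w_f else 0.0
    x = (w - w_f) / (K_BOLTZMANN_EV * temperature)
    # Two algebraically equivalent forms, chosen so that exp() never overflows.
    if x > 0:
        return math.exp(-x) / (1.0 + math.exp(-x))
    return 1.0 / (1.0 + math.exp(x))

w_f = 7.0   # eV, an assumed Fermi energy of the order quoted for metals
for temperature in (0, 300, 1000):
    row = [fermi(w_f + dw, w_f, temperature) for dw in (-0.5, -0.1, 0.0, 0.1, 0.5)]
    print(temperature, row)
```

The occupation is one-half at $w = w_F$ for any nonzero temperature, and the transition from nearly full to nearly empty occupies an energy range of a few $kT$.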
If for a particular material S(w) dw is the number of allowed quantum states per
unit volume in the energy range from w to w + dw, it follows that
$$N(w)\,dw = f(w)S(w)\,dw = \frac{S(w)\,dw}{e^{(w - w_F)/kT} + 1} \tag{8.3}$$

is the number of electrons per unit volume of the material having energies between w
and w + dw.

FIGURE 8.5 Energy band diagrams for various positions of the Fermi level. (Top of lower
band denoted by $w_v$, bottom of upper band by $w_c$.) Panels: (a) Partially filled band: conductor. (b) Totally filled band: insulator or semiconductor. (c) Overlapping bands: conductor.

Since the Fermi level must lie either within an allowed band or between bands in a
forbidden region, three possible cases arise, as illustrated by Figure 8.5. If the Fermi
level lies within a band (Figure 8.5a), Equation (8.3) indicates that at 0°K the band is
totally filled below $w_F$ and completely empty above $w_F$. As the temperature is raised,
the population in this band spreads out somewhat but very few electrons are found in
the next higher band. A material exhibiting this type of electron distribution is a
conductor, since unpopulated energy states are available in the same band, to which
an electron can move. The change in energy which an electron requires in order to
move to one of these unpopulated states is slight, since the density of states within the
band is so great. An electron can gain this energy easily from an applied electric field
and therefore readily transfers its association from one atom to the next.
If the Fermi level lies between bands (Figure 8.5b), Equation (8.3) indicates that at
0°K, the lower band is completely filled and the upper band is empty. At a temperature
$T$, if the energy gap between bands is large compared to $kT$, the population distribution
is still essentially the same as for 0°K. Thus there are no empty states at a nearby energy
level to which an electron can move; conduction is very difficult and the material
is an insulator.† If the gap width is not too great, at a finite temperature some population
of the upper band by electrons occurs and these electrons are free to move to empty
states in the upper band. They are therefore conduction electrons and contribute to a
small current occasioned by the presence of an external electric field. The states which
they vacated in the lower band also provide a contribution to the conduction process.
Such behavior characterizes materials known as semiconductors.
Finally, if the Fermi level lies in a region where two bands overlap (Figure 8.5c),
one or both of the bands is only partially filled and conduction occurs easily.
These observations may be summarized by saying that a conductor is a material
whose electrons populate the allowed energy bands in such a way that there is an upper
band which is partially filled or an upper pair of overlapping bands which are partially
filled. Electrons in these bands can move from one state to another with only a small
change in energy and are therefore highly mobile. On the other hand, an insulator is a
material whose electrons populate the allowed energy bands in such a way that there is
an upper band which is completely filled. Conduction in an insulator could only occur if
some of these electrons could move up to the next higher band, but the energy gap is
too great to permit this at normal temperatures. In a semiconductor, this gap width
is not so great, and some small population of the next band occurs at normal tempera-
tures, permitting a slight current to flow in the presence of an electric field.

8.3 FREE ELECTRON THEORY OF METALS-OHM'S LAW

Since the conduction electrons in a good conductor lie in partially filled bands and can
move from one state to another with little change in energy, a suitable model of such
materials is one which pictures the conductor at the atomic level as consisting of a
lattice of positive ions coexisting with an electron gas. The ions are capable of vibra-
tions about their individual lattice sites but are not free to wander from one lattice
site to another. The electrons comprising the gas, on the other hand, are highly mobile
and able to move throughout the lattice against an almost-constant background poten-
tial, thus not belonging to any particular atom. In this picture, called the free electron
model, the number of electrons in the gas is not necessarily an integral multiple of the
number of atoms in the material; thus the average valence of the lattice ions is not
restricted to an integer. Also, the identities of the electrons making up the gas may
change with time, thereby accounting for electron-lattice interactions in which an
electron is either freed or captured.
† At room temperature, $kT$ is 1/40th of an electron volt. In a good insulator, such as diamond, the gap
width may be as great as 7 or 8 electron volts.

At a constant temperature T, the number of free electrons per unit volume should
have a time-independent value n. The average, or drift velocity of the electrons making
up the gas is given by

$$\mathbf{v}_d = \frac{1}{n}\sum_{i=1}^{n} \mathbf{v}_i \tag{8.4}$$

in which $n$ is assumed to be large enough to make this average meaningful and $\mathbf{v}_i$ is
the instantaneous velocity of the $i$th electron. In the absence of an applied electric field
$\mathbf{v}_d = 0$ since in equilibrium, in any speed range, there should be just as many electrons
moving in one direction as in any other.

FIGURE 8.6 Equivalent velocity history of average free electron. (Vertical axis: $v(t)$; horizontal axis: $t$, with interactions spaced $\tau$ apart.)

If a constant electric field $\mathbf{E}$ is applied to the conductor, it is found that a new equilibrium
condition is established in which a steady current density exists at each point in
the conductor, implying a time-independent, but nonzero value for the drift velocity $\mathbf{v}_d$.
This experimental result can be explained by assuming that each electron in the gas,
during each time interval marked by successive interactions it has with the lattice, is
accelerated by the applied field. The electrons gain momentum during these intervals
and then surrender some of their momentum to the lattice during an interaction. For a
given electron the succession of time intervals between lattice interactions is a random
sequence, as is its initial velocity after each interaction. The history of such encounters
also differs from one electron to the next. However, the net average effect is as though
each electron possessed the drift velocity $\mathbf{v}_d$.
One may account for the momentum transfer to the lattice by imagining that the
average electron has an equivalent velocity history as shown in Figure 8.6. Every $\tau$ sec
it interacts with the lattice, stopping momentarily and surrendering $m\mathbf{v}_d$ units of
momentum, with $m$ the electronic mass. Between interactions the equivalent constant
velocity $\mathbf{v}_d$ is maintained. It is apparent that the time-average velocity in Figure 8.6
is $\mathbf{v}_d$ if the interaction time is negligible compared to $\tau$. The period $\tau$ can be selected so
that the rate of momentum transfer per electron is properly given by $m\mathbf{v}_d/\tau$.
Since the time-average motion of the free electrons is unaccelerated, they experience
no net force and the effect of the external field is balanced by the rate of momentum
transfer so that

$$\frac{m\mathbf{v}_d}{\tau} = -e\mathbf{E} \tag{8.5}$$

with $e$ the electronic charge. The induced steady current density is given by

$$\boldsymbol{\iota} = -ne\mathbf{v}_d \tag{8.6}$$

and thus

$$\boldsymbol{\iota} = \frac{ne^2\tau}{m}\,\mathbf{E} \tag{8.7}$$

When the electrical conductivity $\sigma_c$ is defined by the relation

$$\boldsymbol{\iota} = \sigma_c\mathbf{E} \tag{8.8}$$

it follows that

$$\sigma_c = \frac{1}{\rho_c} = \frac{ne^2\tau}{m} \tag{8.9}$$

in which $\rho_c$, the reciprocal of $\sigma_c$, is called the resistivity.
Equation (8.7) has been derived under the assumption that the conduction electrons
are free to wander throughout the metal against a constant background potential.
This assumption implies an isotropic material and is most valid for monovalent metals.
For nonisotropic materials, the free electron mass m must be replaced by the effective
mass $m^*$ (cf. Section 7.16), and both $\tau$ and $m^*$ depend on the direction of application
of $\mathbf{E}$. In this more general case $\sigma_c$ as given by (8.9) becomes a tensor.
Inspection of Equation (8.9) reveals that the electrical conductivities of different
isotropic conductors are governed by the free electron density $n$ and by the average
time $\tau$ between momentum transfers. If one considers monovalent metals such as
lithium, copper, or silver, $n$ can be taken to be equal to the atom density. An experimental
determination of the electrical conductivity will then yield an estimation of $\tau$.
The experiment consists simply of choosing a cylindrical specimen of the material,
of length $l$ and cross-sectional area $A$, and establishing a uniform electric field $E$
throughout the specimen. Then from (8.8)

$$\iota A = I = \frac{\sigma_c A}{l}\,El = \frac{A}{\rho_c l}\,V = \frac{V}{R} \tag{8.10}$$

in which $I$ is the total current, $V$ is the voltage drop through the length $l$, and

$$R = \frac{\rho_c l}{A} \tag{8.11}$$

is the resistance of the specimen. Equation (8.10) is recognized as Ohm's law, an alternative
form of which is given by (8.8). Thus measurements of $V$, $I$, and the dimensions
of the specimen will permit a determination of $R$, $\rho_c$, and ultimately $\sigma_c$. The results of
this experiment in the case of several monovalent metals are listed in Table 8.2. It is
to be noted that the deduced values of $\tau$ are all in the range of $10^{-14}$ sec.
If instead of being constant $\mathbf{E}$ is a function of time which varies slowly compared
to $\tau$, the force equation (8.5) assumes the more general form

$$m\frac{d\mathbf{v}_d}{dt} + \frac{m\mathbf{v}_d}{\tau} = -e\mathbf{E} \tag{8.12}$$

This differential equation has the complementary solution

$$\mathbf{v}_d(t) = \mathbf{v}_d(0)\,e^{-t/\tau} \tag{8.13}$$

indicating that, if a steady electric field were suddenly removed, the drift velocity
would decay exponentially. For this reason $\tau$ customarily is called the relaxation time.
For the metals of Table 8.2 this relaxation time is seen to be exceedingly short.
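The relaxation behavior can be made concrete with the closed-form solution of (8.12) for a constant field. The following is a sketch under stated assumptions: the applied field of 1 volt per meter is hypothetical, the function names are illustrative, and the relaxation time is the copper entry of Table 8.2.

```python
import math

E_CHARGE = 1.6e-19     # magnitude of the electronic charge, coulombs
M_ELECTRON = 9.1e-31   # electronic mass, kilograms

def steady_drift_velocity(e_field, tau):
    """Steady-state drift velocity from (8.5): v_d = -e E tau / m."""
    return -E_CHARGE * e_field * tau / M_ELECTRON

def drift_velocity(t, tau, e_field, v_initial=0.0):
    """Solution of (8.12) for a constant field e_field applied from t = 0 onward."""
    v_steady = steady_drift_velocity(e_field, tau)
    return v_steady + (v_initial - v_steady) * math.exp(-t / tau)

tau = 2.7e-14    # sec, copper, Table 8.2
e_field = 1.0    # volts per meter, a hypothetical applied field
v_steady = steady_drift_velocity(e_field, tau)

# Field suddenly removed: the drift velocity relaxes toward zero as in (8.13).
for multiple in (0, 1, 2, 5):
    v = drift_velocity(multiple * tau, tau, 0.0, v_initial=v_steady)
    print(multiple, v / v_steady)
```

With the field removed, the printed ratios fall as $e^{-t/\tau}$, so that after a few relaxation times essentially no drift remains.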

TABLE 8.2
THE ELECTRICAL CONDUCTIVITY σ_c OF SOME MONOVALENT METALS AT 0°C

Metal    σ_c (mhos/m)     τ (sec) from (8.9)
Li       0.11 × 10^8      0.9 × 10^-14
Na       0.21 × 10^8      3.3 × 10^-14
K        0.15 × 10^8      4.4 × 10^-14
Cu       0.58 × 10^8      2.7 × 10^-14
Ag       0.62 × 10^8      3.8 × 10^-14

If $\mathbf{E}$ is time-harmonic, (8.12) becomes

$$m\left(j\omega + \frac{1}{\tau}\right)\mathbf{v}_d = -e\mathbf{E} \tag{8.14}$$

revealing that $-\mathbf{v}_d$ (and thus $\boldsymbol{\iota}$) will be in phase with $\mathbf{E}$ only if $\tau^{-1} \gg \omega$. Microwave
measurements in K-band waveguide at an angular frequency $\omega \approx 2 \times 10^{11}$ rad/sec
indicate that, even at this high frequency, $\boldsymbol{\iota}$ and $\mathbf{E}$ are essentially in phase in the copper
walls, thus providing supporting evidence that $\tau < 10^{-11}$ sec. For this reason Equations
(8.8) and (8.9) may be presumed to be valid for monovalent metals which are
good conductors, even if $\boldsymbol{\iota}$ and $\mathbf{E}$ are time-varying, and even at frequencies as high as
the upper end of the microwave range.
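The size of the phase lag implied by (8.14) can be estimated directly. In the sketch below the function name `drift_response` is illustrative; the frequency is the K-band figure quoted above and the relaxation time is the copper entry of Table 8.2.

```python
import cmath
import math

def drift_response(omega, tau):
    """Drift velocity at angular frequency omega relative to its static value.

    Solving (8.14) for v_d gives v_d = (-e E tau / m) / (1 + j omega tau).
    """
    return 1.0 / (1.0 + 1j * omega * tau)

omega = 2e11     # rad/sec, the K-band figure quoted in the text
tau = 2.7e-14    # sec, copper, Table 8.2

ratio = drift_response(omega, tau)
phase_lag_degrees = -math.degrees(cmath.phase(ratio))
print(abs(ratio), phase_lag_degrees)
```

For these values the magnitude is essentially unity and the lag is a fraction of a degree, consistent with the statement that the current density and the field are essentially in phase.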

EXAMPLE 8.1
By experiment, the resistance at room temperature of a silver wire is found to be 0.0281
ohms. Its measured length is 1.265 m and its mean measured diameter is 0.096 cm. It is
desired to estimate the relaxation time for this specimen.
Through use of (8.11)

$$\rho_c = \frac{RA}{l} = \frac{(0.0281)(\pi/4)(0.096 \times 10^{-2})^2}{1.265} = 1.61 \times 10^{-8}\ \text{ohm-m}$$

from which

$$\sigma_c = \frac{1}{\rho_c} = 0.62 \times 10^{8}\ \text{mhos/m}$$

in agreement with Table 8.2.
If one free electron per atom is assumed for silver, then $n = 5.86 \times 10^{28}$ per m$^3$ and (8.9) gives

$$\tau = \frac{\sigma_c m}{ne^2} = \frac{(0.62 \times 10^{8})(9.1 \times 10^{-31})}{(5.86 \times 10^{28})(1.6 \times 10^{-19})^2} = 3.8 \times 10^{-14}\ \text{sec}$$

which is also in agreement with Table 8.2.
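The arithmetic of Example 8.1 may be retraced in a few lines. This is a sketch using the rounded constants quoted in the example; the variable names are illustrative.

```python
import math

# Measured quantities from Example 8.1
resistance = 0.0281        # ohms
length = 1.265             # meters
diameter = 0.096e-2        # meters

# Constants as rounded in the example
n_silver = 5.86e28         # free electrons per cubic meter, one per atom
m_electron = 9.1e-31       # kilograms
e_charge = 1.6e-19         # coulombs

area = math.pi / 4 * diameter ** 2                 # cross-sectional area of the wire
rho_c = resistance * area / length                 # resistivity, from (8.11)
sigma_c = 1.0 / rho_c                              # conductivity
tau = sigma_c * m_electron / (n_silver * e_charge ** 2)   # relaxation time, from (8.9)

print(rho_c, sigma_c, tau)
```

Carrying full precision through the calculation changes the final digits slightly relative to the rounded values in the example, but the results agree to the two or three figures quoted.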

8.4 OHM'S LAW-ALTERNATE DERIVATION


By making use of Boltzmann's transport equation, one is able to derive Ohm's law
in a manner which differs from the approach taken in Section 8.3. In addition to pro-
viding further insight to the meaning of electrical conductivity, this second procedure
has the advantage of paralleling the development of thermal conductivity to be pre-
sented in Section 8.10, thus simplifying the establishment of the Wiedemann-Franz
law.
Once again a free electron gas is assumed to exist within the conductor. If one lets
$(v_x, v_y, v_z)$ represent the velocity of an electron, at time $t$ the quantity

$$f(x,y,z,v_x,v_y,v_z,t)\,dx\,dy\,dz\,dv_x\,dv_y\,dv_z$$

can be taken to represent the number of electrons in the spatial volume element
$dx\,dy\,dz$ which have their velocities lying in the range $v_x$ to $v_x + dv_x$, $v_y$ to $v_y + dv_y$,
$v_z$ to $v_z + dv_z$. The function $f$ is known as the distribution function in six-dimensional
space, or phase space (three spatial dimensions plus three dimensions for the velocity
components).
If there were no electron-lattice interactions, an electron which at time $t$ was at the
point $(x,y,z)$ and had velocity $(v_x,v_y,v_z)$ would at time $t + dt$ be at the point
$(x + v_x\,dt,\ y + v_y\,dt,\ z + v_z\,dt)$ and have the velocity
$(v_x + \dot v_x\,dt,\ v_y + \dot v_y\,dt,\ v_z + \dot v_z\,dt)$. Were it
not for collisions, all the electrons which had been in $dx \cdots dv_z$ would be in a new
volume element $dx' \cdots dv_z'$. Since the velocities and accelerations of all the electrons
in these two adjacent elements of phase space are essentially the same, to first order

$$dx\,dy\,dz\,dv_x\,dv_y\,dv_z = dx'\,dy'\,dz'\,dv_x'\,dv_y'\,dv_z'$$

If now one lets

$$\left[\frac{\partial f}{\partial t}\right]_{\text{coll}} dt\,dx'\,dy'\,dz'\,dv_x'\,dv_y'\,dv_z'$$

be the net number of electrons which are forced into $dx' \cdots dv_z'$ due to electron-lattice
interactions during the time interval $dt$, then it follows that

$$f(x + v_x\,dt,\ y + v_y\,dt,\ z + v_z\,dt,\ v_x + \dot v_x\,dt,\ v_y + \dot v_y\,dt,\ v_z + \dot v_z\,dt,\ t + dt) = f(x,y,z,v_x,v_y,v_z,t) + \left[\frac{\partial f}{\partial t}\right]_{\text{coll}} dt \tag{8.15}$$


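This equation may be recognized as containing a primitive form of a total differential,
from which there results

$$v_x\frac{\partial f}{\partial x} + v_y\frac{\partial f}{\partial y} + v_z\frac{\partial f}{\partial z} + \dot v_x\frac{\partial f}{\partial v_x} + \dot v_y\frac{\partial f}{\partial v_y} + \dot v_z\frac{\partial f}{\partial v_z} + \frac{\partial f}{\partial t} = \left[\frac{\partial f}{\partial t}\right]_{\text{coll}} \tag{8.16}$$

This is known as Boltzmann's transport equation.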


In many problems such as the one being treated here the collision mechanisms are
such that, if a stimulus is present and causes the distribution $f$, and then the stimulus
is removed, the collisions cause the particles to reach an equilibrium distribution $f_0$
under a relaxation process. In other words if the stimulus is removed at $t = 0$, then

$$(f - f_0)_t = (f - f_0)_{t=0}\,e^{-t/\tau} \tag{8.17}$$

in which $\tau$ is a relaxation time which characterizes the process.
In the absence of a stimulus collisions are the only cause of a time change in $f$ and
thus from (8.17)

$$\left[\frac{\partial f}{\partial t}\right]_{\text{coll}} = \frac{\partial}{\partial t}(f - f_0) = -\frac{f - f_0}{\tau} \tag{8.18}$$

In the development which follows it will be assumed that (8.18) is an appropriate
expression to insert in (8.16) to account for distribution changes due to collisions.
Imagine that through the agency of external sources a static electric field has been
applied to a conductor, and that coordinate axes have been chosen so that in a small
region of the conductor the field is $z$ directed. After sufficient time has elapsed, a time-independent
distribution $f(z,v_x,v_y,v_z)$ of the free electrons within the conductor will have
been established which differs from the field-free distribution $f_0(v_x,v_y,v_z)$. Since

$$\dot v_z = -\frac{e}{m}E_z$$

the Boltzmann equation becomes

$$v_z\frac{\partial f}{\partial z} - \frac{e}{m}E_z\frac{\partial f}{\partial v_z} = -\frac{f - f_0}{\tau} \tag{8.19}$$

For normal electric fields $f$ will differ only slightly from $f_0$, and one may substitute
$f = f_0$ in the left side of (8.19), obtaining

$$f = f_0 + \frac{\tau e}{m}E_z\frac{\partial f_0}{\partial v_z} \tag{8.20}$$
The distribution $f$ is thus seen, in first approximation, to depend only on the velocity
variables, and may be used in the following manner to deduce an expression for the
electric current:
With reference to Figure 8.7 let $dA$ be a small area constructed transverse to the
$z$ direction and consider the parallelepiped of length $v\,dt$ erected on $dA$ as a base. The
number of free electrons within this parallelepiped whose velocity components are

$$v_x = v\sin\theta\cos\phi \qquad v_y = v\sin\theta\sin\phi \qquad v_z = v\cos\theta$$

is

$$f(v_x,v_y,v_z)\,dx\,dy\,dz\,dv_x\,dv_y\,dv_z = f(v_x,v_y,v_z)\,v\cos\theta\,dA\,dt\,dv_x\,dv_y\,dv_z$$

since $dx\,dy\,dz = v\cos\theta\,dA\,dt$. This is the number of electrons with velocity $\mathbf{v}$ which
have passed through $dA$ in time $dt$. (Electron-lattice interactions change the identity
of some of these electrons but not their number.) The charge transported per unit area
per unit time is therefore

$$-e\,v\cos\theta\,f(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z = -e\,v_z\,f(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z$$

FIGURE 8.7 Calculation of flow through area element.

Upon considering in this manner the free electrons in the entire velocity range, one
concludes that the electric current density is

$$\iota = -e\iiint v_z\,f(v_x,v_y,v_z)\,dv_x\,dv_y\,dv_z = -e\iiint v_z f_0\,dv_x\,dv_y\,dv_z - \frac{\tau e^2}{m}E_z\iiint v_z\frac{\partial f_0}{\partial v_z}\,dv_x\,dv_y\,dv_z \tag{8.21}$$

Since $f_0$ is symmetrical in $v_z$, the first integral in (8.21) is zero; the second term may be
integrated by parts, giving

$$\iota = -\frac{\tau e^2}{m}E_z\left\{\iint \Big[v_z f_0\Big]_{v_z=-\infty}^{v_z=\infty} dv_x\,dv_y - \iiint f_0\,dv_x\,dv_y\,dv_z\right\} \tag{8.22}$$

The first of these integrals gives a null value; the second yields the result

$$\iota = \frac{ne^2\tau}{m}E_z \tag{8.23}$$

in which $n$ is the spatial electron volume density. Equation (8.23) is seen to be identical
with (8.7) when proper regard is given to vectorial directions, and is thus a statement
of Ohm's law.
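The integration by parts that leads from (8.21) to the final result can be checked numerically in one dimension. The sketch below assumes, purely for illustration, a Gaussian equilibrium distribution normalized to $n = 1$; the text does not specify the form of $f_0$ at this point, and the function names are hypothetical.

```python
import math

def f0(v, n=1.0, v_th=1.0):
    """Hypothetical one-dimensional equilibrium distribution: a Gaussian normalized to n."""
    return n / (v_th * math.sqrt(2.0 * math.pi)) * math.exp(-v ** 2 / (2.0 * v_th ** 2))

def df0_dv(v, n=1.0, v_th=1.0):
    """Analytic derivative of the Gaussian f0 with respect to v."""
    return -v / v_th ** 2 * f0(v, n, v_th)

def trapezoid(func, a, b, steps=20000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

# Integral of v * df0/dv over velocity, and minus the integral of f0 (that is, -n).
lhs = trapezoid(lambda v: v * df0_dv(v), -10.0, 10.0)
rhs = -trapezoid(f0, -10.0, 10.0)
print(lhs, rhs)
```

The two printed numbers agree, illustrating that the integral of $v_z\,\partial f_0/\partial v_z$ reduces to $-n$ once the boundary term vanishes, which is the step that converts the second term of (8.21) into the conductivity expression.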

8.5 THE MEAN TIME BETWEEN ELECTRON-LATTICE INTERACTIONS

Within the electron gas of an isotropic conductor, consider a large number $N_0$ of electrons,
all of these electrons being characterized by the fact that at the present time $t$,
each is about to have its next interaction with the lattice. Let $N(t_1 + dt_1)$ be the sub-group
of these electrons which have not suffered a lattice interaction since the earlier
time $t_1 + dt_1$. At the still earlier time $t_1$, this sub-group is further reduced to $N(t_1)$. It
follows therefore that the number

$$dN = N(t_1 + dt_1) - N(t_1)$$

is the count of electrons within $N_0$ which had their last interaction with the lattice
during the time interval $dt_1$ beginning at the time $t_1$.
It is reasonable to assume that $dN$ is proportional to both $N(t_1)$ and $dt_1$, so that one
may write

$$dN = \frac{N\,dt_1}{\tau_c} \tag{8.24}$$

in which $\tau_c$ is a constant of proportionality having the units of time. Integration of
(8.24) gives

$$N = N_0\,e^{-(t - t_1)/\tau_c} \tag{8.25}$$

The average time between lattice interactions for these $N_0$ electrons is given by

$$\frac{1}{N_0}\int (t - t_1)\,dN = \frac{1}{\tau_c}\int_{-\infty}^{t}(t - t_1)\,e^{-(t - t_1)/\tau_c}\,dt_1 = \tau_c$$

and thus the proportionality constant $\tau_c$ is the mean free time between electron-lattice
interactions, or more briefly, the mean collision time.
The drift velocity is the average velocity of all $N_0$ electrons just before their next
lattice interaction, namely

$$\mathbf{v}_d = \frac{1}{N_0}\int \mathbf{v}(t)\,dN_v \tag{8.26}$$

in which $dN_v$ is a function of $\mathbf{v}$ and the integration is over all velocity space.
In the case of isotropic materials since $v_d$ is normally so small compared to the average
speed of the free electrons, the scatter of these free electrons after their lattice
interactions is isotropic, and their aggregate momentum afterwards is zero. Therefore
the momentum they transfer to the lattice is their aggregate momentum just before
interaction. This means that the average momentum transferred per electron is

$$\langle m\mathbf{v}\rangle = \frac{1}{N_0}\int m\mathbf{v}(t)\,dN_v = m\mathbf{v}_d \tag{8.27}$$

For this reason one may say that the average momentum transferred to the lattice per
electron per unit time is $m\mathbf{v}_d/\tau_c$. Upon comparing this result with the analysis in
Section 8.3 or Section 8.4, one finds that for isotropic materials the mean collision time
is the same as the relaxation time; that is, $\tau_c = \tau$. This is not true for nonisotropic
materials, but such a discussion lies outside the scope of the present development.
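The identification of the proportionality constant with the mean collision time can be illustrated by a small Monte Carlo experiment. The sketch below assumes that the intervals since the last interaction follow the exponential law obtained by integrating (8.24); the function name, sample size, and seed are arbitrary choices made for the example.

```python
import random

def mean_interval(tau_c, samples=200_000, seed=0):
    """Average of exponentially distributed intervals with characteristic time tau_c."""
    rng = random.Random(seed)
    # random.expovariate takes the rate, that is, the reciprocal of the mean interval.
    total = sum(rng.expovariate(1.0 / tau_c) for _ in range(samples))
    return total / samples

tau_c = 3.8e-14   # sec, the silver entry of Table 8.2, used here only as a sample value
estimate = mean_interval(tau_c)
print(estimate / tau_c)
```

The printed ratio is close to unity, as expected if the average interval equals the constant appearing in (8.24).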

8.6 MEAN FREE PATH

A mean free path $\lambda$ for the electrons comprising the gas may be defined by the relation

$$\lambda = u\tau_c \tag{8.28}$$

in which $u$ is an appropriate average electron velocity. Since the electron gas obeys the
Pauli exclusion principle, in the absence of an external field the energy distribution
of electrons is given by the Fermi-Dirac formula,¹⁵ namely,

$$dn_w = \frac{4\pi}{h^3}(2m)^{3/2}\,\frac{w^{1/2}\,dw}{\exp[(w - w_F)/kT] + 1} \tag{8.29}$$

in which $dn_w$ is the number of electrons per unit volume whose energy lies in the range
from $w$ to $w + dw$. The quantities $h$, $k$, and $m$ which appear in (8.29) are Planck's constant,
Boltzmann's constant, and the electronic mass, and $w_F$ is a temperature-dependent
parameter known as the Fermi energy. (Cf. Section 8.2.) The value of $w_F$ must be
adjusted so that the integral of (8.29) over all possible energies gives the total number
of electrons per unit volume, $n$.
A relative plot of this distribution in momentum space for several values of the
absolute temperature T has been given in Figure 8.4. At absolute zero all states
are seen to be uniformly occupied out to an energy $w_F(0)$, beyond which all states
are empty. As the temperature is raised, this distribution is rounded off and $w_F$
becomes the energy of the half-populated state. For $kT \ll w_F$, the rounding off causes
a significant departure from the absolute zero distribution only in the energy range
$w_F - kT < w < w_F + kT$.
The value of the Fermi energy at absolute zero may be determined by considering
the distribution in momentum space. Since the free electron gas model of an
isotropic conductor implies a uniform potential distribution within the conductor, to
an energy $w$ possessed by a free electron there corresponds a momentum $p$, these quantities
being related by

$$\frac{1}{2}mv^2 = w = \frac{(mv)^2}{2m} = \frac{p^2}{2m}$$

Thus $4\pi p^2\,dp = 2\pi(2m)^{3/2}w^{1/2}\,dw$ and the distribution (8.29) is equivalent to

$$dn_p = \frac{2/h^3}{\exp[(w - w_F)/kT] + 1}\,4\pi p^2\,dp \tag{8.30}$$

Inspection of (8.30) reveals that, if $T = 0$°K, for $w < w_F(0)$ the distribution in momentum
space is

$$dn_p = \frac{2}{h^3}\,4\pi p^2\,dp \tag{8.31}$$

whereas for $w > w_F(0)$, $dn_p = 0$. Therefore the spherical shell of volume $4\pi p^2\,dp$ in
momentum space has a uniform population of electrons with density $(2/h^3)$ for all
shells of radius $0 \le p \le p_{F0}$, in which

$$p_{F0}^2 = 2m\,w_F(0) \tag{8.32}$$

Integration of (8.31) gives

$$n = \frac{8\pi}{3h^3}\,p_{F0}^3 \tag{8.33}$$

Combination of (8.32) and (8.33) yields for the Fermi energy level at absolute zero

$$w_F(0) = \frac{h^2}{8m}\left(\frac{3n}{\pi}\right)^{2/3} \tag{8.34}$$

A calculation of the Fermi zero energy may be undertaken using (8.34) if one assumes
a value for the free-electron density $n$. For monovalent metals, allowing one free electron
per atom, the results are listed in the second column of Table 8.3. It is to be noted that
$w_F(0)$ is characteristically several electron volts for these materials.
¹⁵ See, e.g., F. W. Sears, loc. cit.

TABLE 8.3
FERMI LEVEL ENERGY, VELOCITY, AND MEAN FREE
PATH FOR SOME MONOVALENT METALS

Metal    w_F(0) (eV)    u_F (m/sec)     λ (angstroms)
Li       4.7            1.3 × 10^6      115
Na       3.1            1.05 × 10^6     345
K        2.1            0.85 × 10^6     370
Cu       7.0            1.6 × 10^6      430
Ag       5.5            1.4 × 10^6      530
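The second column of Table 8.3 can be reproduced from (8.34). The sketch below does so for silver, taking the free-electron density from Example 8.1; the numerical constants are standard rounded values and the function name `fermi_energy_zero` is illustrative.

```python
import math

H_PLANCK = 6.626e-34      # Planck's constant, joule-sec
M_ELECTRON = 9.109e-31    # electronic mass, kilograms
EV = 1.602e-19            # joules per electron volt

def fermi_energy_zero(n):
    """Fermi energy at absolute zero from (8.34), in joules; n in electrons per cubic meter."""
    return H_PLANCK ** 2 / (8.0 * M_ELECTRON) * (3.0 * n / math.pi) ** (2.0 / 3.0)

n_silver = 5.86e28        # per cubic meter, one free electron per atom (Example 8.1)
w_f0_ev = fermi_energy_zero(n_silver) / EV
print(w_f0_ev)
```

The result is about 5.5 electron volts, in agreement with the silver entry of the table.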

It may be shown that the Fermi energy at any temperature is given by the expansion¹⁶

$$w_F(T) = w_F(0)\left\{1 - \frac{\pi^2}{12}\left[\frac{kT}{w_F(0)}\right]^2 + \cdots\right\}$$

Since at room temperature $kT \approx 0.025$ eV, it follows that at ordinary temperatures,
for good conductors with Fermi energies such as those given in Table 8.3, $kT/w_F(0) \ll 1$
and $w_F(T) \approx w_F(0)$. Under these circumstances one may gain a pictorial feeling for
the Fermi-Dirac distribution at room temperature for a free electron gas (within copper
for example) by imagining it to be the curve labeled $T_1$ in Figure 8.4, with the partially
populated energy levels confined essentially to a region less than one-tenth of $w_F$.
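The smallness of the temperature correction is easily quantified. The sketch below evaluates the first two terms of the expansion for copper at room temperature; the function name is illustrative, the Fermi energy is the copper entry of Table 8.3, and 293°K is an assumed value for room temperature.

```python
import math

K_BOLTZMANN_EV = 8.617e-5   # Boltzmann's constant in eV per degree Kelvin

def fermi_energy(temperature, w_f0):
    """First two terms of the expansion for w_F(T); energies in eV."""
    ratio = K_BOLTZMANN_EV * temperature / w_f0
    return w_f0 * (1.0 - math.pi ** 2 / 12.0 * ratio ** 2)

w_f0_copper = 7.0           # eV, Table 8.3
room_temperature = 293.0    # degrees Kelvin, an assumed value
relative_shift = 1.0 - fermi_energy(room_temperature, w_f0_copper) / w_f0_copper
print(relative_shift)
```

The fractional shift is of the order of one part in $10^5$, which supports treating $w_F(T)$ and $w_F(0)$ as equal at ordinary temperatures.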
So far this discussion of the Fermi-Dirac distribution of an electron gas has been
limited to the case in which no external electric field is present. Upon application of an
external field, the entire gas takes on a drift velocity $\mathbf{v}_d$ which slightly alters the energy
distribution. However, since the energy levels below $w_F - kT$ were essentially filled,
and since the energy levels above $w_F + kT$ were essentially empty, the principal change
in the distribution occurs around the Fermi level. Thus, for example, when a steady
electric field is removed and the drift velocity $\mathbf{v}_d$ begins to disappear, this is due to a
relaxation in the energy distribution of free electrons back to the form of Figure 8.4.
For this reason it is principally those electrons whose energies are close to $w_F$ which
participate in the relaxation process. The relaxation time $\tau$ occurring in (8.13) and in
the conductivity expression (8.9) therefore refers to the Fermi-level electrons.
¹⁶ See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., Chap. 10, John Wiley and Sons, Inc.,
New York, 1956. This result was first obtained by Sommerfeld.
Returning to (8.28), one sees that the mean free path for Fermi-level electrons in an
isotropic conductor is expressible as $\Lambda = u_F\tau$, in which $u_F$ is given adequately by the
relation $m u_F^2/2 = W_F(0)$. For the monovalent metals listed in Table 8.3, the values for
$u_F$ and $\Lambda$ calculated in this way are given in columns 3 and 4 of that table. On the
presumption that the average collision time for free electrons in any speed range $(v, v + dv)$
is inversely proportional to $v$, $\Lambda$ has the same value for all the free electrons. Thus even
though the mean free path $\Lambda$ has been computed by focusing attention on those electrons
whose energies are close to $W_F$, the results should be applicable to the entire free electron
gas.
At first sight the $\Lambda$ values listed in Table 8.3 seem startling, since they are two orders
of magnitude larger than the lattice dimensions of the metals to which they refer.
However, if one recalls the wavelike properties of electrons,¹⁷ an explanation can be
found in terms of waves passing through periodic structures. Just as a light wave
would propagate through a perfect crystal without attenuation, so would electron waves
pass through a perfectly periodic lattice without interaction. Thus it is not the lattice
sites themselves which cause the electron-lattice interactions, but rather it is the
deviations from a perfectly periodic lattice which impede the electron motion. These
deviations usually are caused chiefly by thermal lattice vibrations but also include
lattice defects and foreign impurity atoms. Boundaries, of course, also play an impeding
role.
Because the classical picture of a collision between an electron and a lattice ion is
seen not to be valid in describing this process, the term electron-lattice interaction has
been used in preference to collision. The subject of lattice imperfections will be taken
up again in Section 8.9 where the temperature dependence of resistivity is discussed.

EXAMPLE 8.2
The Fermi zero energy for silver may be determined from (8.34) by assuming one free
electron per atom, so that $n = 5.86 \times 10^{28}$ electrons/m³. Then

$$W_F(0) = \frac{(6.62 \times 10^{-34})^2}{8 \times 9.1 \times 10^{-31}} \left(\frac{3 \times 5.86 \times 10^{28}}{\pi}\right)^{2/3} = 8.85 \times 10^{-19} \text{ joules} = 5.5 \text{ ev}$$

From this it follows that

$$u_F = \left[\frac{2W_F(0)}{m}\right]^{1/2} = 1.39 \times 10^6 \text{ m/sec}$$

Using Table 8.2, one finds the mean free path in silver to be

$$\Lambda = u_F\tau = 5.3 \times 10^{-8} \text{ m} = 530 \text{ Å}$$


These results are all seen to be in accord with Table 8.3.
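
The arithmetic of Example 8.2 can be retraced with a short script. This is only a sketch under stated assumptions: the function names are illustrative, the constants are the rounded values used in the example, and the relaxation time for silver is an assumed figure of about $3.8 \times 10^{-14}$ sec chosen to be consistent with the quoted mean free path, since Table 8.2 is not reproduced here.

```python
import math

H = 6.62e-34    # Planck's constant, joule-sec (rounded value used in the example)
M_E = 9.1e-31   # electron mass, kg
EV = 1.6e-19    # joules per electron volt


def fermi_energy(n):
    """Fermi zero energy W_F(0) = (h^2 / 8m) (3n / pi)^(2/3), in joules."""
    return (H ** 2 / (8.0 * M_E)) * (3.0 * n / math.pi) ** (2.0 / 3.0)


def fermi_velocity(w_f):
    """Fermi velocity from m u_F^2 / 2 = W_F(0), in m/sec."""
    return math.sqrt(2.0 * w_f / M_E)


n_silver = 5.86e28     # free electrons per m^3, one per atom
tau_silver = 3.8e-14   # ASSUMED relaxation time, sec (stands in for Table 8.2)

w_f = fermi_energy(n_silver)
u_f = fermi_velocity(w_f)
mean_free_path = u_f * tau_silver

print(f"W_F(0) = {w_f:.3e} joules = {w_f / EV:.2f} ev")
print(f"u_F    = {u_f:.3e} m/sec")
print(f"Lambda = {mean_free_path:.3e} m")
```

Changing `n_silver` and `tau_silver` to the values for another monovalent metal should reproduce the corresponding row of Table 8.3 to within rounding.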

17 See, e.g., the electron diffraction experiments of Davisson and Germer, "Diffraction of Electrons by a Crystal of Nickel," Phys Rev, 30, 705-740; 1927.



8.7 JOULE'S LAW

As noted in Section 8.1, Joule determined experimentally that the heat developed per
second in a conducting wire of resistance $R$, carrying a current $I$, is $I^2R$. If one uses
(8.10) and (8.11), this may be put in the form

$$I^2R = \frac{V^2}{R} = \frac{(El)^2}{\rho_c l/A} = \frac{E^2}{\rho_c}\,lA = \sigma E^2\,lA$$

in which $l$ and $A$ are the length of the wire and its cross section. From this, one may
infer that the volume density of heat developed per second is

$$\sigma E^2 \qquad (8.35)$$

It is this form of Joule's law for which an elementary derivation will now be given.
Consider again, as in Section 8.5, an isotropic conductor in which each of a large
number of electrons $N_0$ is about to have its next interaction with the lattice at the
present time $t$. One of these electrons had velocity components $v_x$, $v_y$, $v_z$ as it came off its
last encounter with the lattice at an earlier time $t_1$. If an electric field of strength $E$ is
acting along the negative $X$ direction, its velocity components now are

$$v_x + \frac{eE}{m}(t - t_1), \qquad v_y, \qquad v_z$$

since it has been uniformly accelerated throughout this period of time. The increase in
energy of this electron between lattice interactions is therefore

$$\Delta W = eE v_x (t - t_1) + \frac{e^2}{2m} E^2 (t - t_1)^2 \qquad (8.36)$$

Assuming isotropic scattering, if (8.36) is averaged over all values of initial velocity
$v_x$, since $\langle v_x \rangle = 0$, one finds that

$$\langle \Delta W \rangle = \frac{e^2}{2m} E^2 (t - t_1)^2 \qquad (8.37)$$

If now this expression is further averaged over all earlier times $t_1$, using the distribution
function (8.25), one obtains

$$\langle\langle \Delta W \rangle\rangle = \frac{1}{N_0} \int \langle \Delta W \rangle\, dN = \frac{e^2E^2}{2m\tau_c} \int_{-\infty}^{t} (t - t_1)^2 e^{-(t - t_1)/\tau_c}\, dt_1$$

Upon performing the integration, one finds that

$$\langle\langle \Delta W \rangle\rangle = \frac{e^2 E^2 \tau_c^2}{m}$$

When one recalls that for isotropic scattering $\tau = \tau_c$, if there are $n$ free electrons per
m³, and if all these electrons transfer their excess kinetic energy to the lattice, then

$$\frac{n \langle\langle \Delta W \rangle\rangle}{\tau} = \frac{ne^2\tau}{m} E^2 = \sigma E^2 \qquad (8.38)$$


which agrees with the experimental statement of Joule's law, (8.35).
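
The double average in this derivation can be checked numerically. The sketch below assumes an exponential distribution of free times with mean $\tau$, averages the quadratic energy gain of (8.37) over it by the trapezoidal rule, and compares the resulting power density with $\sigma E^2$, taking $\sigma = ne^2\tau/m$. The function name and the numerical values of $n$, $\tau$, and $E$ are illustrative assumptions, not data from the text.

```python
import math

E_CHARGE = 1.6e-19   # electron charge magnitude, coulombs
M_E = 9.1e-31        # electron mass, kg


def mean_energy_gain(e_field, tau, steps=20000, span=40.0):
    """Average (e^2 E^2 / 2m) s^2 over free times s weighted by exp(-s/tau)/tau.

    Trapezoidal rule on 0 <= s <= span * tau; the neglected tail is negligible.
    """
    ds = span * tau / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * ds
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * s * s * math.exp(-s / tau) / tau
    return (E_CHARGE ** 2 * e_field ** 2 / (2.0 * M_E)) * total * ds


n = 5.86e28      # ASSUMED free electron density, per m^3
tau = 3.8e-14    # ASSUMED relaxation time, sec
e_field = 1.0    # ASSUMED field strength, volts per meter

gain = mean_energy_gain(e_field, tau)
closed_form = E_CHARGE ** 2 * e_field ** 2 * tau ** 2 / M_E

power_density = n * gain / tau             # energy handed to the lattice per m^3 per sec
sigma = n * E_CHARGE ** 2 * tau / M_E      # conductivity n e^2 tau / m

print(f"numerical gain {gain:.4e}  closed form {closed_form:.4e}")
print(f"power density  {power_density:.4e}  sigma E^2 {sigma * e_field ** 2:.4e}")
```

The two printed pairs should agree closely, which is the content of the statement that the microscopic model reproduces Joule's law.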



8.8 THE DEBYE THEORY OF SPECIFIC HEAT

The first law of thermodynamics is a statement of the principle of conservation of
energy and asserts that if an amount of heat $dQ$ is added to a system, this must be equal
to the increase $dU$ in the system's internal energy plus the work done by the system on
its surroundings. If the work done is mechanical in nature, the law takes the form

$$dQ = dU + p\,dV \qquad (8.39)$$

with $p$ the pressure exerted by the system and $dV$ its change in volume.
The internal energy $U$ is expressible formally as a function of any two of the state
variables: pressure, volume, and temperature. Thus one form of the total differential is

$$dU = \left(\frac{\partial U}{\partial V}\right)_T dV + \left(\frac{\partial U}{\partial T}\right)_V dT \qquad (8.40)$$

with the subscripts on the partial derivatives indicating which state variable is being
held fixed. Combination of these two equations gives

$$dQ = \left[p + \left(\frac{\partial U}{\partial V}\right)_T\right] dV + \left(\frac{\partial U}{\partial T}\right)_V dT \qquad (8.41)$$

The heat capacity of the system at constant volume will be represented by the symbol
$C_v$, and is defined as the amount of heat which can be absorbed per unit rise in
temperature under the restriction that the volume not change. In mathematical terms

$$C_v = \left(\frac{dQ}{dT}\right)_V \qquad (8.42)$$

It is sometimes advantageous to speak in terms of one mole of the system and to
distinguish molal values by using lower-case symbols for the heat absorbed, internal
energy, volume, etc. The heat capacity per mole at constant volume is thus designated
by $c_v$ and is customarily called the specific heat at constant volume. From (8.41) and
(8.42) it is given by

$$c_v = \left(\frac{\partial u}{\partial T}\right)_V \qquad (8.43)$$

This quantity can be ascertained for different materials by experiment and is generally
found to be a function of temperature. In 1912 Debye¹⁸ addressed himself to the problem
of determining a theoretical expression for $c_v$ which would agree with observed data.
The essentials of his theory are embodied in the following development.
Consider a solid in the shape of a cube of edge $L$, and imagine that the vibrations of
the atoms which occupy the lattice sites of this solid give rise to elastic standing waves
within the solid. These waves satisfy the differential equation

$$\nabla^2 f = \frac{1}{c_s^2} \frac{\partial^2 f}{\partial t^2} \qquad (8.44)$$

which has the standing wave solutions

$$f = \sin\left(\frac{l\pi x}{L}\right) \sin\left(\frac{m\pi y}{L}\right) \sin\left(\frac{n\pi z}{L}\right) \cos 2\pi\nu t \qquad (8.45)$$

in which $l$, $m$, $n$ are positive integers and $\nu$ is the frequency of oscillation; $c_s = \nu\lambda$ is
the propagation velocity, with $\lambda$ the wavelength. Insertion of (8.45) in (8.44) yields a
relation which dictates the allowed frequencies and wavelengths, namely,

$$\frac{\pi^2}{L^2}(l^2 + m^2 + n^2) = \frac{4\pi^2\nu^2}{c_s^2} = \frac{4\pi^2}{\lambda^2} \qquad (8.46)$$

18 P. Debye, "On the Theory of Specific Heat," Ann Phys, 39, 789-839; 1912.

The modes (8.45) may be put into one-to-one correspondence with the triplets of
positive integers $(l,m,n)$ and this correspondence permits determination of the number
of modes $Z(\nu)\,d\nu$ in the frequency range from $\nu$ to $\nu + d\nu$. If the triplets $(l,m,n)$ are
plotted as points in Cartesian coordinates, each point occupies a unit volume. Since

$$r^2 = l^2 + m^2 + n^2 = \left(\frac{2L\nu}{c_s}\right)^2$$

may be interpreted as the square of the distance out to the triplet $(l,m,n)$, then it
follows that

$$\frac{1}{8}(4\pi r^2\,dr)$$

being the volume in the first octant between the spheres of radii $r$ and $r + dr$, is
numerically equal to the number of triplets in the shell bounded by these spheres. But

$$\frac{1}{8}(4\pi r^2\,dr) = \frac{\pi}{2}\left(\frac{2L}{c_s}\right)^3 \nu^2\,d\nu = Z(\nu)\,d\nu \qquad (8.47)$$

and thus the frequency distribution of modes has been determined.
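
The counting argument behind (8.47) can be tested directly by enumerating integer triplets. The following is a minimal sketch: the function name and the shell radii are arbitrary choices for illustration, and the count is compared with the first-octant shell volume evaluated at the mean radius. Because the triplets lying on the coordinate planes are excluded, the count falls slightly below the volume estimate at finite radius, and the discrepancy shrinks as the radius grows.

```python
import math


def modes_within(radius):
    """Count triplets (l, m, n) of positive integers with l^2 + m^2 + n^2 <= radius^2."""
    r2 = radius * radius
    count = 0
    for l in range(1, radius + 1):
        for m in range(1, radius + 1):
            rest = r2 - l * l - m * m
            if rest < 1:
                break                    # no positive n remains for this or larger m
            count += math.isqrt(rest)    # number of n >= 1 with n^2 <= rest
    return count


r_inner, r_outer = 100, 110              # illustrative shell, in units of lattice points
counted = modes_within(r_outer) - modes_within(r_inner)

r_mid = 0.5 * (r_inner + r_outer)
dr = r_outer - r_inner
estimate = 4.0 * math.pi * r_mid ** 2 * dr / 8.0   # (1/8)(4 pi r^2 dr)

ratio = counted / estimate
print(f"counted {counted}, shell volume {estimate:.0f}, ratio {ratio:.4f}")
```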
In Debye's theory longitudinal and transverse elastic waves are permitted with
propagation velocities $c_l$ and $c_t$. Since there are two independent transverse axes, one
concludes as an extension of (8.47) that the distribution of modes, including longitudinal
and both transverse types, is

$$Z(\nu)\,d\nu = 4\pi V (c_l^{-3} + 2c_t^{-3})\,\nu^2\,d\nu \qquad (8.48)$$

in which $V = L^3$ is the volume of the specimen.
Debye assumed that each atom in the lattice had 3 degrees of vibrational freedom,
and that the number of allowed modes for the elastic waves was limited to $3N$, with $N$
the total number of atoms. This defined an upper frequency $\nu_D$ (called the Debye
frequency) which satisfies the relation

$$\int_0^{\nu_D} Z(\nu)\,d\nu = 3N$$

Using (8.48), one finds that

$$\nu_D^3 = \frac{9N}{4\pi V}(c_l^{-3} + 2c_t^{-3})^{-1} \qquad (8.49)$$

A feeling for the magnitude of $\nu_D$ may be obtained by choosing as a representative value
$N/V = 10^{28}$ atoms per m³, and assuming that $c_l = c_t = 10^3$ m/sec. This gives
$\nu_D \approx 10^{12}$ cps.
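
The order-of-magnitude estimate of the Debye frequency follows from (8.49) in one line. The sketch below uses the representative values quoted in the text; the function name is illustrative, and the values of Planck's and Boltzmann's constants are rounded figures supplied here only to convert the frequency into an equivalent temperature $h\nu_D/k$.

```python
import math

H = 6.62e-34         # Planck's constant, joule-sec (rounded)
K_BOLTZ = 1.38e-23   # Boltzmann's constant, joules per degree (rounded)


def debye_frequency(atoms_per_m3, c_l, c_t):
    """Debye frequency from (8.49): nu_D^3 = (9N / 4 pi V) / (c_l^-3 + 2 c_t^-3)."""
    velocity_sum = c_l ** -3 + 2.0 * c_t ** -3
    return (9.0 * atoms_per_m3 / (4.0 * math.pi * velocity_sum)) ** (1.0 / 3.0)


nu_d = debye_frequency(1.0e28, 1.0e3, 1.0e3)   # representative values from the text
theta = H * nu_d / K_BOLTZ                     # equivalent temperature h nu_D / k

print(f"nu_D  = {nu_d:.3e} cps")
print(f"h nu_D / k = {theta:.1f} deg K")
```

Real solids have sound velocities several times larger than the round figure assumed here, which is one reason the tabulated Debye temperatures are generally higher than this estimate suggests.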

Using Planck's formula that the average energy of an oscillator at temperature $T$ is
$h\nu/(e^{h\nu/kT} - 1)$, one may write for the vibrational energy of the solid

$$U = \int_0^{\nu_D} Z(\nu) \frac{h\nu}{e^{h\nu/kT} - 1}\,d\nu = \frac{9NkT}{(h\nu_D/kT)^3} \int_0^{x_D} \frac{x^3\,dx}{e^x - 1} \qquad (8.50)$$

in which $x = h\nu/kT$. It is convenient to introduce a characteristic temperature $\theta_D$,
called the Debye temperature, by the definition $\theta_D = h\nu_D/k$, which gives

$$U = \frac{9NkT}{(\theta_D/T)^3} \int_0^{\theta_D/T} \frac{x^3\,dx}{e^x - 1} \qquad (8.51)$$

For high temperatures ($T \gg \theta_D$), inspection of (8.51) reveals that $x \ll 1$ for the
entire range of integration. The denominator of the integrand may then be replaced
by $x$ and one obtains, for one mole,

$$u = 3N_A kT \qquad (8.52)$$

wherein $N_A$ is Avogadro's number, the number of atoms per mole. If one uses (8.43),

$$c_v = 3kN_A = 3R \qquad (8.53)$$

in which $R = kN_A$ is the ideal gas constant. This result agrees with the classical
Maxwell-Boltzmann theory, indicating that the latter is adequate at temperatures
sufficiently above $\theta_D$.
For low temperatures ($T \ll \theta_D$), the upper limit of integration in (8.51) may be
replaced by infinity and the integral assumes¹⁹ the value $\pi^4/15$. Then

$$u = \frac{3\pi^4 R}{5\theta_D^3}\,T^4 \qquad c_v = \frac{12\pi^4 R}{5} \left(\frac{T}{\theta_D}\right)^3 \qquad (8.54)$$


At intermediate temperatures the integral in (8.51) can be calculated by standard
techniques. Over the whole range of temperature one is able to construct the plot of
specific heat shown in Figure 8.8. The observed values of $c_v$ for any solid may be fitted
to this curve by selecting a value for the Debye temperature $\theta_D$. The result of doing
this for aluminum is indicated by the data points and the agreement is seen to be very
good. A list of Debye temperatures, determined in this manner for a variety of
materials, is given in Table 8.4.

19 See, e.g., Whittaker and Watson, Modern Analysis, 4th ed., p. 265, Cambridge University Press, London, 1935.

FIGURE 8.8 Debye curve for heat capacity of a solid. Data points are for aluminum ($\theta_D = 418$ °K). [Plot of specific heat, marked at $R$ and $2R$, against $T/\theta_D$ from 0 to 1.2.]
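
A curve of the kind shown in Figure 8.8 can be generated numerically. The sketch below is one possible approach, not the method used for the figure: it assumes the molal form of (8.51), differentiates it with respect to $T$ to obtain $c_v/R = 9(T/\theta_D)^3 \int_0^{\theta_D/T} x^4 e^x (e^x - 1)^{-2}\,dx$, and evaluates the integral by Simpson's rule. The function name and the number of intervals are illustrative choices.

```python
import math


def debye_cv_over_R(t_over_theta, intervals=2000):
    """Specific heat c_v / R at reduced temperature T / theta_D (Simpson's rule)."""
    x_max = 1.0 / t_over_theta
    h = x_max / intervals            # intervals must be even for Simpson's rule

    def integrand(x):
        if x == 0.0:
            return 0.0               # limiting value of x^4 e^x / (e^x - 1)^2 at x = 0
        # Written with exp(-x) so that large x cannot overflow.
        return x ** 4 * math.exp(-x) / math.expm1(-x) ** 2

    total = integrand(0.0) + integrand(x_max)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 9.0 * t_over_theta ** 3 * total * h / 3.0


high = debye_cv_over_R(10.0)                        # should approach 3, as in (8.53)
low = debye_cv_over_R(0.02)                         # should follow the T^3 law of (8.54)
low_limit = (12.0 * math.pi ** 4 / 5.0) * 0.02 ** 3

for ratio in (0.1, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2):
    print(f"T/theta_D = {ratio:.1f}   c_v/R = {debye_cv_over_R(ratio):.3f}")
```

Fitting measured data then amounts to choosing the single parameter $\theta_D$ that best aligns the observed points with this universal curve.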

TABLE 8.4
REPRESENTATIVE VALUES OF DEBYE TEMPERATURE IN °K

Substance      $\theta_D$     Substance      $\theta_D$     Substance      $\theta_D$
Beryllium      1160           Vanadium       273            Silver         225
Diamond C      2000           Chromium       402            Cadmium        300
Sodium         150            Iron           467            Tungsten       379
Magnesium      406            Cobalt         385            Platinum       229
Aluminum       418            Nickel         456            Gold           165
Silicon        658            Copper         339            Lead           95
Potassium      100            Zinc           308            Bismuth        117
Calcium        219            Germanium      366            NaCl           281
Titanium       278            Molybdenum     425            AgBr           144

The Debye theory has been found to be only approximate with deviations occurring
between theory and experiment at very low temperatures. However, the refinements
which improve on the Debye theory need not be a concern in the present discussion,
since they do not affect the arguments which are to follow. The reader interested in
these refinements is referred to the literature.²⁰
It is apparent from the development that the internal energy expression (8.51) refers
to the lattice energy, and does not include any contribution from a free electron gas
which might be present within a conductive material. However, the Fermi-Dirac
statistics may be used to show²¹ that the total internal energy of $N$ conduction
electrons is

$$U_e = \frac{3}{5} N W_F(0) \left\{1 + \frac{5\pi^2}{12}\left[\frac{kT}{W_F(0)}\right]^2 + \cdots\right\} \qquad (8.55)$$

For a conductor which has $fN_A$ free electrons per mole, the contribution to the specific
heat capacity at constant volume is therefore

$$c_v(\text{electrons}) = \frac{\pi^2}{2} fR \left[\frac{kT}{W_F(0)}\right] \qquad (8.56)$$

Since $kT/W_F(0)$ is so small at ordinary temperatures for most metals, and since the
fraction $f$ is normally close to unity for these metals, one sees from (8.56) that the free
electron gas normally makes a contribution to the total specific heat which is much
less than $R$. For this reason, at temperatures sufficiently above $\theta_D$, $c_v(\text{total}) \approx 3R$
whether the substance is a conductor or not. This conclusion is in agreement with the
Dulong-Petit experimental law. At temperatures so elevated that the contribution
(8.56) cannot be ignored, most metals are no longer solid and indeed may have
evaporated so that the entire analysis is inapplicable.

20 See, e.g., M. Blackman, "The Theory of the Specific Heat of Solids," Rep Prog Phys, 8, 11-29; 1941.
21 See, e.g., F. W. Sears, op cit., pp. 335-336.
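
The smallness of the electronic contribution is easy to quantify from (8.56). The sketch below evaluates it for copper at room temperature, using $kT \approx 0.025$ ev and the Fermi energy of 7.0 ev from Table 8.3; the function name is illustrative and $f = 1$ is an assumed value for a monovalent metal.

```python
import math


def electronic_cv_over_R(kt_ev, wf_ev, f=1.0):
    """Electronic specific heat in units of R, from (8.56): (pi^2 / 2) f kT / W_F(0)."""
    return 0.5 * math.pi ** 2 * f * kt_ev / wf_ev


ratio_cu = electronic_cv_over_R(0.025, 7.0)   # copper at room temperature, f = 1 assumed
share = ratio_cu / 3.0                        # fraction of the lattice value 3R

print(f"c_v(electrons)/R = {ratio_cu:.4f}")
print(f"relative to 3R   = {share:.2%}")
```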

8.9 THE TEMPERATURE DEPENDENCE OF THE RESISTIVITY OF METALS

Equation (8.24) may be rewritten in the form

$$\frac{dN/N}{dt_1} = \frac{1}{\tau} \qquad (8.57)$$

in which $\tau$ has replaced $\tau_c$, implying isotropic materials. (Cf. Section 8.5.) Written in
this manner, (8.57) may be given the interpretation that the fraction of free electrons
which suffer a lattice interaction per unit time is $1/\tau$. Stated in another way, $1/\tau$ is the
probability for isotropic scattering of a free electron per unit time.
Imagine a conductor whose lattice atoms do not vibrate, but which contains
imperfections in the forms of vacancies, interstitials, dislocations, and impurity atoms. These
imperfections cause scattering, with a probability which may be designated by $1/\tau_i$.
Alternatively, imagine a conductor which contains no imperfections but whose lattice
atoms do vibrate. These vibrations cause the instantaneous spacings of the atoms to be
slightly irregular so that the electron waves experience some scattering. Let the
probability of scattering in this case be denoted by $1/\tau_l$.
A real conductor exhibits both scattering mechanisms since it contains imperfections
and since its lattice atoms do vibrate. Because these two mechanisms are independent
the scattering probabilities are additive, and one may write

$$\frac{1}{\tau} = \frac{1}{\tau_i} + \frac{1}{\tau_l} \qquad (8.58)$$

If use is made of (8.9), the resistivity is given by

$$\rho_c = \frac{m}{ne^2}\left(\frac{1}{\tau_i} + \frac{1}{\tau_l}\right) \qquad (8.59)$$

Over a reasonable range of temperatures, the concentration and distribution of
imperfections, as well as the free electron density, are independent of temperature,
indicating that $\tau_l$ is the only temperature-dependent quantity in (8.59). One would
expect the scattering probability $1/\tau_l$ to be proportional to the collision cross section
of a lattice atom. In turn, the collision cross section is governed by the square of the
mean amplitude of atomic vibrations, and is thus proportional to the lattice energy.
For this reason (8.59) may be rewritten in the form

$$\rho_c = \rho_i + b'u \qquad (8.60)$$

in which $\rho_i = m/ne^2\tau_i$ is known as the residual resistivity (being due to imperfections),
$u$ is the molal internal energy, and $b'$ is a constant. At temperatures sufficiently above
the Debye temperature $\theta_D$, (8.52) indicates that $u \propto T$ and then (8.60) becomes

$$\rho_c = a + bT \qquad (8.61)$$

with $a$ and $b$ constants. This linear dependence of resistivity on temperature is known
as Matthiessen's rule. It is a result which agrees with observed data for most metals,
for $T \gg \theta_D$.

FIGURE 8.9 Resistance of an impure specimen of sodium as a function of temperature. [After MacDonald and Mendelssohn, Proc Roy Soc (London), A202, 103; 1950.] [Plot of relative resistance, scale 0 to 50, against $T$ (°K) from 0 to 22.]

An illustration of the effect indicated by (8.60) is shown in Figure 8.9. The resistance
of a specimen of sodium is plotted as a function of temperature, with the curve clearly
displaying a behavior which can be explained as being due to a residual resistance plus
a contribution which follows the trend of the specific internal energy.
Another illustration of the effect of imperfections on resistivity is provided by Figure
8.10. When nickel is added as an impurity atom to copper, the resistivity increases
markedly even though nickel is itself a good conductor. The decisive factor is the
symmetry disruption of the atoms filling the lattice.
FIGURE 8.10 Resistivity of Cu-Ni alloy vs. temperature as a function of percentage of Ni concentration. [After Linde, Ann. Phys, 15, 219; 1932.] [Plot of resistivity against $T$ (°K).]

EXAMPLE 8.3
With reference to Figure 8.9, the scattering probabilities due to imperfections and due to
thermal lattice vibrations may be estimated for the sodium specimen under consideration.
When contractions in the length and diameter of the specimen as the temperature is lowered
are ignored, the ordinates are resistivities relative to the resistivity at 290 °K. If one takes
for the latter $\rho_c$(290 °K) $= 4 \times 10^{-8}$ ohm-m, the residual resistivity is

$$\rho_i = 4 \times 10^{-4}\,\rho_c(290\ \text{°K}) = 16 \times 10^{-12} \text{ ohm-m}$$

This means that

$$\frac{1}{\tau_i} = \frac{ne^2\rho_i}{m} = \frac{(2.54 \times 10^{28})(1.6 \times 10^{-19})^2(16 \times 10^{-12})}{9.1 \times 10^{-31}} = 1.14 \times 10^{10}$$

The mean time between electron-lattice interactions thus would be about $10^{-10}$ sec if there
were only imperfections and no lattice vibrations.
At 290 °K

$$\rho_c = \frac{m}{ne^2\tau} = 4 \times 10^{-8}$$

so that

$$\frac{1}{\tau} = 0.285 \times 10^{14}$$

Therefore at room temperature, with both imperfections and lattice vibrations contributing
to the resistivity, the mean time between electron-lattice interactions is much shorter,
being about $3 \times 10^{-14}$ sec. Since

$$\frac{1}{\tau_l} = \frac{1}{\tau} - \frac{1}{\tau_i}$$

it follows that in this case $\tau_l \approx \tau$ and the lattice vibrations are the dominant cause of
resistivity at room temperature. Electron-lattice interactions due to lattice vibrations occur
roughly 2500 times as often as do interactions caused by imperfections in the specimen
(the ratio of $0.285 \times 10^{14}$ to $1.14 \times 10^{10}$).
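
The numbers in Example 8.3 follow from inverting the resistivity relation $\rho = m/ne^2\tau$ and then separating the two scattering rates with (8.58). The sketch below repeats that arithmetic with the values quoted in the example; the function and variable names are illustrative.

```python
E_CHARGE = 1.6e-19   # electron charge magnitude, coulombs
M_E = 9.1e-31        # electron mass, kg


def scattering_rate(n, resistivity):
    """Scattering probability per unit time, 1/tau = n e^2 rho / m, in 1/sec."""
    return n * E_CHARGE ** 2 * resistivity / M_E


n_sodium = 2.54e28          # free electrons per m^3, as used in the example
rho_290 = 4.0e-8            # resistivity at 290 deg K, ohm-m
rho_i = 4.0e-4 * rho_290    # residual resistivity read from Figure 8.9

rate_i = scattering_rate(n_sodium, rho_i)        # imperfections only
rate_total = scattering_rate(n_sodium, rho_290)  # both mechanisms at 290 deg K
rate_l = rate_total - rate_i                     # lattice vibrations, by (8.58)

print(f"1/tau_i = {rate_i:.3e} per sec")
print(f"1/tau   = {rate_total:.3e} per sec")
print(f"1/tau_l = {rate_l:.3e} per sec")
print(f"ratio of lattice to imperfection scattering = {rate_l / rate_i:.0f}")
```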

8.10 THERMAL CONDUCTIVITY OF METALS AND THE WIEDEMANN-FRANZ LAW

If the temperature is not uniform throughout a homogeneous, isotropic substance, a
flow of heat will occur in the direction opposite to the temperature gradient. This fact
is usually expressed by the equation

$$Q = -K\nabla T \qquad (8.62)$$

in which $Q$ is the areal density of heat flow in watts per m² and $K$ is the thermal
conductivity coefficient, expressed in watts per meter per degree.
For most materials the value of $K$ depends on temperature: the case of copper is
shown in Figure 8.11 as an illustration, and Table 8.5 lists measured thermal
conductivity values for some representative materials at two different temperatures. It may be
observed from this table that the metals are generally better conductors of heat than
the insulators, the difference being as great as two orders of magnitude. This is due to
the fact that the principal means for transferring heat energy in insulators is via lattice
vibrations. Although this mechanism is also operative in metals, the highly mobile
free electrons are the carriers of most of the heat being transferred²² and are responsible
for the greater thermal conductivity.

FIGURE 8.11 Thermal conductivity of copper vs. temperature. [Plot of thermal conductivity against $T$ (°K) from 0 to 100.]

The analysis which follows will be restricted to metals, and only the contribution
to thermal conductivity made by the free electrons will be taken into account. The
procedure will be to consider all the free electrons which pass in unit time through a
unit area transverse to the heat flow, in both directions, to determine their individual
transport energies, and thus by summing to deduce $Q$, the net energy flow per unit
time. The result will prove to be proportional to $\nabla T$ so that use of (8.62) will yield
an expression for $K$.

22 In some metals the thermal conductivity due to electrons may be one or two orders of magnitude greater than the contribution made by lattice vibrations. See, e.g., C. Kittel, Introduction to Solid State Physics, 2d ed., p. 149, John Wiley and Sons, Inc., New York, 1956.

Consider a metallic conductor in which a temperature distribution has been
established. The average energy of those free electrons which are in regions of higher
temperature will be greater than the average energy of those free electrons which are in
regions of lower temperature. The more energetic electrons will move toward the lower
temperature regions carrying this additional energy with them and causing a flow of
heat. If this loss of energy in the high temperature regions is replenished, and if the
gain in energy of the low temperature regions is siphoned off, the temperature
distribution can be maintained.

TABLE 8.5
THERMAL CONDUCTIVITY K FOR REPRESENTATIVE MATERIALS

Material            $K$ in watts/m/°K
Aluminum            250      220
Copper              565      385
Silver              420      410
Gold                300
Iron                180      90
Magnesium           185      170
Nickel              110      80
Sodium              150      135
NaCl                26       7
Potassium Alum      1.2      1.9
Carbon              4.2
Glass               1.0
Mica                0.8

Because of the difference in mean velocity associated with this differential in electron
energy, there is a tendency for the number of electrons which arrive per unit time in
the low temperature regions to exceed the number which leave, with the opposite
tendency prevailing in the high temperature regions. This tendency toward charge
accumulation causes a small internal electric field parallel to the temperature gradient,
which is just sufficient to prevent continued charge accumulation. The net result is
that as many electrons leave any region per unit time as arrive; in a low temperature
region, the arriving electrons bring with them more kinetic energy than the departing
electrons take away; the reverse situation exists in the high temperature regions.
With this picture of the mechanism in mind, choose coordinate axes such that in
some region of the conductor, the temperature is a function only of $z$, represented by
$T(z)$. In parallel with the development of Section 8.4, let $f_0(z,v_x,v_y,v_z)$ be the free electron
equilibrium distribution which would exist in the presence of this temperature gradient
if there were no electric field present, and let $f(z,v_x,v_y,v_z)$ be the free electron equilibrium
distribution with the compensating electric field also present. For this case Boltzmann's
transport equation becomes

$$v_z \frac{\partial f}{\partial z} - \frac{e}{m} E_z \frac{\partial f}{\partial v_z} = -\frac{f - f_0}{\tau} \qquad (8.63)$$

Since the temperature gradients and the electric fields they induce are normally very
small, one may substitute $f = f_0$ in the left side of (8.63) with negligible error, obtaining

$$f = f_0 + \tau \left(\frac{e}{m} E_z \frac{\partial f_0}{\partial v_z} - v_z \frac{\partial f_0}{\partial T} \frac{\partial T}{\partial z}\right) \qquad (8.64)$$

For the same reason $f_0$, as it appears in (8.64) and all subsequent equations, may now
be taken to be the Fermi-Dirac distribution $f_0(v_x,v_y,v_z)$.
The electric current density may be deduced from the transport of charge, and as
in Section 8.4 is given by

$$\iota_z = \iiint (-e) v_z f\,dv_x\,dv_y\,dv_z = -e\tau \iiint v_z \left(\frac{e}{m} E_z \frac{\partial f_0}{\partial v_z} - v_z \frac{\partial f_0}{\partial T} \frac{\partial T}{\partial z}\right) dv_x\,dv_y\,dv_z \qquad (8.65)$$

By analogy the flow of heat may be deduced from the transport of kinetic energy,
and is given by

$$Q = \iiint \left(\frac{m}{2} v^2\right) v_z f\,dv_x\,dv_y\,dv_z = \frac{m\tau}{2} \iiint v_z v^2 \left(\frac{e}{m} E_z \frac{\partial f_0}{\partial v_z} - v_z \frac{\partial f_0}{\partial T} \frac{\partial T}{\partial z}\right) dv_x\,dv_y\,dv_z \qquad (8.66)$$

Under the equilibrium requirement that $\iota_z = 0$, the electric field $E_z$ may be eliminated
by combining (8.65) and (8.66). When this is done, and one finds the partials $\partial f_0/\partial v_z$,
$\partial f_0/\partial T$ for the Fermi-Dirac distribution, evaluation of the integrals gives the
first-order expression²³

$$Q = -\frac{1}{3} \frac{n\pi^2 k^2 T\tau}{m} \frac{\partial T}{\partial z} \qquad (8.67)$$

From this result and (8.62) it follows that the thermal conductivity is

$$K = \frac{1}{3} \frac{n\pi^2 k^2 T}{m}\,\tau \qquad (8.68)$$

a formula which is seen to depend linearly on the absolute temperature.
The electrical and thermal conductivities may be compared by forming the ratio

$$L = \frac{K}{\sigma T} = \frac{1}{3}\left(\frac{\pi k}{e}\right)^2 \qquad (8.69)$$

This result is seen to be a universal constant and is a theoretical confirmation of the
Wiedemann-Franz law, established experimentally in 1853. The constant $L$ is known
as the Lorenz number and has the theoretical value $2.45 \times 10^{-8}$ watt-ohm/deg². The
experimental Lorenz numbers for various metals are listed in Table 8.6 and the
agreement with theory is seen to be quite good.

23 The reader interested in the details of this analysis is referred to A. H. Wilson, The Theory of Metals, 2d ed., pp. 200-201, Cambridge University Press, London, 1954.

TABLE 8.6
EXPERIMENTAL LORENZ NUMBERS

                $L \times 10^8$ watt-ohm/deg²                    $L \times 10^8$ watt-ohm/deg²
Metal           T = 273 °K    T = 373 °K      Metal         T = 273 °K    T = 373 °K
Copper          2.23          2.33            Tin           2.52          2.49
Zinc            2.31          2.33            Tungsten      3.04          3.20
Molybdenum      2.61          2.79            Platinum      2.51          2.60
Silver          2.31          2.37            Gold          2.35          2.40
Cadmium         2.42          2.43            Lead          2.47          2.56

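
The theoretical Lorenz number depends only on Boltzmann's constant and the electronic charge, so it can be evaluated in a few lines and set against a sample of the measured values at 273 °K from Table 8.6. The sketch below uses rounded constants, and the function and dictionary names are illustrative.

```python
import math

K_BOLTZ = 1.38e-23   # Boltzmann's constant, joules per degree (rounded)
E_CHARGE = 1.6e-19   # electron charge magnitude, coulombs


def lorenz_number():
    """Theoretical Lorenz number L = (1/3) (pi k / e)^2, in watt-ohm/deg^2."""
    return (math.pi * K_BOLTZ / E_CHARGE) ** 2 / 3.0


# A few of the 273 deg K entries of Table 8.6, in units of 1e-8 watt-ohm/deg^2.
measured_273 = {"Copper": 2.23, "Silver": 2.31, "Tungsten": 3.04, "Lead": 2.47}

L_theory = lorenz_number()
print(f"theoretical L = {L_theory:.3e} watt-ohm/deg^2")
for metal, value in measured_273.items():
    deviation = (value * 1e-8 - L_theory) / L_theory
    print(f"{metal:10s} {value:.2f}e-8   deviation {deviation:+.1%}")
```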
At low temperatures the theory given does not apply. It is no longer suitable to
neglect lattice contributions to K, and a relaxation time for the electrical and thermal
conductivities cannot be defined uniquely. For this reason neither theory nor experi-
ment conforms to the Wiedemann-Franz law unless the temperature of the specimen
is above the Debye temperature.

8.11 CONDUCTIVITY OF SEMICONDUCTORS


Homogeneous semiconductors are found to share with metals the property of support-
ing drift currents which conform to Ohm's law. The explanation of this conformance
differs for the two classes of materials. In earlier sections of this chapter, the free
electron model formed the basis of a theory of conduction which gives satisfactory
agreement with experiment in the case of metals, particularly those which are mono-
valent. However, the conduction process is more complex in a semiconductor. The free
electron model is not applicable and one must turn to the band theory for a suitable
explanation.
Referring to the discussion of Section 8.2, a crystalline solid whose Fermi level lies
in the forbidden region between two allowed energy bands is a semiconductor if the
energy gap between these bands is small. Silicon and germanium are notable examples
of such solids. Silicon has the atomic number 14 and a valence of 4. At normal
temperatures all of its energy bands are essentially filled through 2p (cf. Table 7.2). Its 3s and
3p bands have combined in a manner which can best be understood by imagining what
happens as the silicon atoms are brought closer together. At large atom separation
these two bands are separated and well-defined, with 3p above 3s. As the separation
decreases, the bands overlap and merge together. At still shorter interatomic spacing,
the merged levels split into two bands, each of which contains four states per atom.
This is the situation at the equilibrium spacing in solid silicon, and at absolute zero the
lower of these bands is completely filled, the upper band being void; a gap of 1.1 ev
separates these two bands. Similarly, germanium, with the atomic number 32, has a
valence of 4. At normal temperatures all of its energy bands are essentially filled through
3d. Its 4s and 4p bands have merged and resplit into two new bands which are separated
by a gap of 0.72 ev. At absolute zero the lower of these bands is completely filled and
the upper band is empty. It will be shown later in this section that for both of these
pure crystals the Fermi level at absolute zero lies half way between the last filled band
(called the valence band) and the first unfilled band (called the conduction band).
Because of their valence of 4, silicon and germanium solidify into crystals with a
structure similar to that of diamond, in which each atom forms a covalent bond with
four other atoms in a three-dimensional lattice. A two-dimensional representation of
this structure is indicated by Figure 8.12. At absolute zero all the bonds are complete,
but as the temperature is raised some of the bonds break, indicating that an occasional
electron has gained enough energy to detach itself, leaving behind an atom with a
vacancy, or hole.

FIGURE 8.12 Two-dimensional representation of silicon or germanium crystal structure.

In thermal equilibrium, in the absence of an external electric field, the detached
electrons will wander randomly through the crystal. So too will the holes, since it is an
unnatural state for an atom in the crystal to be incompletely bonded. An atom in this
state will "steal" an electron from one of its four neighbors, thus transferring the hole
to the neighbor, which in turn will pass the hole on to one of its four neighbors. In this
way the holes also move about randomly through the crystal. Occasionally a hole and a
detached electron encounter each other and recombine. This annihilation of
hole-electron pairs is balanced by their thermal generation so that a constant volume density
of mobile electrons and holes is maintained. Raising the temperature increases the
volume density of these charge carriers.
This condition can be represented in an energy band diagram, as shown in Figure
8.13. The Fermi-Dirac function (8.2) is plotted to the right of the bands and indicates
a partial population of the bottom of the conduction band at the expense of the top
of the valence band. Individual energy levels just below $W_v$ or just above $W_c$ are
randomly occupied, but on the time average, the fractional occupancy follows the Fermi
curve.
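
How sparse the population at the band edges is can be seen by evaluating the Fermi-Dirac function there. The sketch below assumes the usual form $1/(e^{(W - W_F)/kT} + 1)$ for the function cited as (8.2), places the Fermi level at mid-gap as stated for the pure crystal, and uses the germanium gap of 0.72 ev with $kT \approx 0.025$ ev. The function and variable names are illustrative. It gives occupation probabilities only; carrier densities would also require the density of available states, which is not treated here.

```python
import math


def fermi_dirac(w_minus_wf_ev, kt_ev=0.025):
    """Probability that a state lying w_minus_wf_ev above the Fermi level is occupied."""
    return 1.0 / (math.exp(w_minus_wf_ev / kt_ev) + 1.0)


GAP_GE = 0.72              # germanium energy gap, ev (from the text)
half_gap = GAP_GE / 2.0    # Fermi level ASSUMED at mid-gap for the pure crystal

p_electron_at_wc = fermi_dirac(+half_gap)        # occupancy at the conduction band edge
p_hole_at_wv = 1.0 - fermi_dirac(-half_gap)      # vacancy probability at the valence band edge

print(f"occupancy just above W_c: {p_electron_at_wc:.2e}")
print(f"vacancy just below W_v:   {p_hole_at_wv:.2e}")
```

The two probabilities come out equal, reflecting the symmetry of the Fermi curve about a mid-gap Fermi level.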
If an electric field is applied from left to right in Figure 8.12, the random motion of
mobile electrons and holes has superimposed on it an ordered drift of electrons to the
left and holes to the right; both of these drifts contribute to a current flow in the
direction of the electric field. In terms of the band diagram of Figure 8.13, since there are
filled and empty energy levels adjacent to each other just below $W_v$ and just above $W_c$,
the holes and conduction electrons need to acquire only a slight amount of energy from
the field to move from one level to another. Because of this, they are highly mobile
and will drift easily.
Despite this mobility, the drift current per unit applied field is extremely low in a
pure semiconductor. The reason for this is that the density of conduction electrons and
holes is very small. For example, in germanium at room temperature, the measured
conduction electron density is only $2.5 \times 10^{19}$ per m³. This is nine orders of magnitude
below the value for silver (cf. Example 8.2) and accounts for the great disparity in
the conductivities of the two materials.

FIGURE 8.13 Energy band diagram for pure semiconductor. [Electron energy plotted vertically, showing the conduction band and the Fermi level $W_F$; alongside, the probability from 0 to 1 that an electron occupies a quantum state.]

The conductivity of a semiconductor may be enhanced greatly by the addition of a
small trace of appropriately chosen impurity. For example, if a slight concentration
of an element of valence 5, such as arsenic or antimony, is introduced into molten
germanium, the crystal formed on cooling contains a dispersion of impurity atoms
which have substituted for germanium atoms at regular lattice sites. After forming
bonds with four neighbors, the impurity atom has one electron left over which is easily
detached; from then on this electron is capable of moving about through the crystal,
thus contributing to the conductivity. Because valence 5 atoms donate an electron
in this manner, they are called donors. Upon making its donation, the impurity atom
becomes an immobile positive ion. Thus donor impurities contribute conduction elec-
trons, but do not produce holes.
Similarly, if a slight concentration of an element of valence 3, such as gallium or
boron, is added to molten germanium, the resulting crystal contains a dispersion of
substituted impurity atoms. After forming bonds with four neighbors, the impurity
atom is shy one electron, which may be "stolen" from a germanium atom, thereby
creating a mobile hole. Because valence 3 atoms accept an electron from the crystal
in this manner, they are called acceptors. Upon accepting an electron, the impurity
atom becomes an immobile negative ion. Thus acceptor impurities contribute mobile
holes, but do not produce conduction electrons.
For impurity concentrations even so low as one part in a million, the mobile carriers
produced by impurities outweigh those produced thermally by at least several orders
of magnitude; the conductivity is then governed by the impurities. For this reason, a
semiconductor containing donors is said to be n-type since the conduction is due largely
to the negatively charged electrons. Similarly, a semiconductor containing acceptors
is called p-type because the conduction is principally due to the positively charged holes.
The energy band diagram for a semiconductor containing impurities is illustrated
by Figure 8.14. Shown is the case in which both donor and acceptor atoms are present
with the donors more abundant. Therefore the population of the conduction band is
increased with respect to the case of a pure semiconductor (cf. Figure 8.13) and the
Fermi level has risen to reflect this fact. The energy level the donor electrons had when
associated with the donor atoms is designated by $w_D$. Actually, this is a band of energy
levels, but the density of donor atoms is normally so small that their spacing is 10
lattice sites or more, causing the donor band to have negligible width. Similarly, the
energy level the acceptor electrons have when they become attached to the acceptor
atoms is designated by $w_A$. The energy level $w_D$ is shown close to $w_c$ and $w_A$ close to $w_v$,
indicating that ionization of both types of impurity atoms occurs easily.
The population densities in the various energy bands may be determined with the
aid of the Fermi-Dirac function. If $N_D$ is the volume density of donor impurity atoms
and $n_D$ the number of electrons per unit volume remaining in the states at $w_D$, then

$$n_D = \frac{N_D}{1 + e^{(w_D - w_F)/kT}} \tag{8.70}$$

which is a further use of (8.2). Under normal circumstances, $w_D - w_F \gg kT$ and (8.70)
reduces to

$$n_D \approx N_D e^{-(w_D - w_F)/kT} \tag{8.71}$$

indicating that $n_D$ is quite small and that most of the donor atoms are ionized.
Likewise, the density of electrons $n_A$ which have been raised to the states at $w_A$ is
given by

$$n_A = \frac{N_A}{1 + e^{(w_A - w_F)/kT}} \tag{8.72}$$

with $N_A$ the volume density of acceptor impurity atoms.

To find the volume density of electrons in the conduction band, it is convenient first
to determine the equivalent volume density of states $N_c$ concentrated at the energy
level $w_c$. Actually, the allowed states are spread out through the entire conduction band,
but the Fermi-Dirac function indicates that only those states close to $w_c$ have a significant
electron population. When the true density of states is combined with the
Fermi-Dirac function, it can be shown²⁴ that the equivalent volume density of states
at $w_c$ is

$$N_c = 2\left(\frac{2\pi m_e^* kT}{h^2}\right)^{3/2} \tag{8.73}$$

in which $m_e^*$ is the effective mass of a conduction electron in the presence of the periodic
potential of the crystal (cf. Section 7.16). The volume density of electrons in the conduction
band is then given by

$$n = N_c e^{-(w_c - w_F)/kT} \tag{8.74}$$

FIGURE 8.14 Energy band diagram for impure semiconductor. (Axes: electron energy versus the probability that an electron occupies a quantum state; the conduction band and the levels $w_F$, $w_A$, and $w_v$ are marked.)


24 See, e.g., L. V. Azaroff and J. J. Brophy, Electronic Processes in Materials, pp. 197-200, McGraw-Hill
Book Company, New York, 1963.


By similar reasoning, one can show that the volume density $p$ of mobile holes in
the valence band may be expressed as

$$p = N_v e^{-(w_F - w_v)/kT} \tag{8.75}$$

in which

$$N_v = 2\left(\frac{2\pi m_h^* kT}{h^2}\right)^{3/2} \tag{8.76}$$

is the equivalent volume density of states at the energy level $w_v$. In (8.76) $m_h^*$ is the
effective mass of holes in the presence of the periodic potential of the crystal.
These formulas permit determination of the population densities at all four energy
levels if these levels and the temperature are known, and if the Fermi energy $w_F$ has
been ascertained. A deduction of the value of $w_F$ proceeds from a statement of the
charge neutrality of every volume element of a homogeneous semiconductor. Since
the density of free electrons is $n$, that of negative acceptor ions is $n_A$, that of holes $p$,
and that of positive donor ions $N_D - n_D$, it follows that

$$n + n_A = p + N_D - n_D \tag{8.77}$$

which, with the aid of the preceding formulas, may be written

$$N_c e^{-(w_c - w_F)/kT} + \frac{N_A}{1 + e^{(w_A - w_F)/kT}} = N_v e^{-(w_F - w_v)/kT} + \frac{N_D}{1 + e^{(w_F - w_D)/kT}} \tag{8.78}$$

With all other quantities known, (8.78) may be solved for $w_F$. The general solution is
not of great interest, but three special cases are of practical importance.

Case 1: The pure, or intrinsic, semiconductor.

If $N_A = N_D = 0$, the semiconductor contains no impurities. If one designates $n_i$
as the free electron density for this case, $p_i$ the mobile hole density, and $w_i$ the Fermi
level, (8.77) gives $n_i = p_i$, and (8.78) yields

$$w_i = \frac{w_c + w_v}{2} + \frac{kT}{2} \ln\frac{N_v}{N_c} \tag{8.79}$$

Since the effective masses of electrons and holes are nearly equal, $N_v \approx N_c$ and (8.79)
indicates that the intrinsic Fermi level is approximately half way between the valence
and conduction bands for all reasonable temperatures.
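
The step from $n_i = p_i$ to (8.79) can be sketched as follows, assuming that (8.74) and (8.75) have the forms $n = N_c e^{-(w_c - w_F)/kT}$ and $p = N_v e^{-(w_F - w_v)/kT}$ (the forms implied by (8.88) and (8.89)), and that the impurity terms of (8.78) vanish with $N_A = N_D = 0$:

```latex
\begin{align*}
N_c\, e^{-(w_c - w_i)/kT} &= N_v\, e^{-(w_i - w_v)/kT} \\
-\frac{w_c - w_i}{kT} + \frac{w_i - w_v}{kT} &= \ln\frac{N_v}{N_c} \\
2 w_i &= w_c + w_v + kT \ln\frac{N_v}{N_c} \\
w_i &= \frac{w_c + w_v}{2} + \frac{kT}{2}\ln\frac{N_v}{N_c}
\end{align*}
```

The logarithmic term vanishes when $N_v = N_c$, which is why the intrinsic level sits at mid-gap when the two effective masses are equal.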

Case 2: An n-type semiconductor.

When only donor impurities are present, $N_A = 0$. With $n_n$ designated as the conduction
electron density for this case, $p_n$ the mobile hole density, and $w_n$ the Fermi level, if
the donor atoms are essentially all ionized (which is usually the case), then (8.77) gives

$$n_n = p_n + N_D \tag{8.80}$$

However, the product of (8.74) and (8.75) is independent of Fermi level and therefore

$$n_n p_n = n_i p_i \tag{8.81}$$

Under normal circumstances, $N_D \gg n_i$ so that, with the aid of (8.81), Equation (8.80)
becomes

$$n_n \approx N_D \tag{8.82}$$

showing that $n_n \gg p_n$.
Combination of (8.82) with (8.74) gives for the Fermi level

$$w_n = w_c - kT \ln\frac{N_c}{N_D} \tag{8.83}$$

Equation (8.83) indicates that the Fermi level rises as $N_D$ is increased, which reflects
the increased electron population in the conduction band.

Case 3: A p-type semiconductor.

When only acceptor impurities are present, $N_D = 0$. With $n_p$ chosen to represent the
conduction electron density for this case, $p_p$ the mobile hole density, and $w_p$ the Fermi
level, if the acceptor atoms are essentially all ionized (which is usually the case), then
(8.77) gives

$$n_p + N_A = p_p \tag{8.84}$$

Because normally $N_A \gg n_i$, and because further use of (8.81) gives

$$n_p p_p = n_i p_i \tag{8.85}$$

it follows that (8.84) reduces to

$$p_p \approx N_A \tag{8.86}$$

Combination of (8.86) with (8.75) gives for the Fermi level

$$w_p = w_v + kT \ln\frac{N_v}{N_A} \tag{8.87}$$

Equation (8.87) indicates that the Fermi level lowers as $N_A$ is increased, which is
consistent with the decreased electron population of the valence band.
For all three of the foregoing cases, the densities of mobile carriers, (8.74) and (8.75),
may be written

$$n = N_c e^{-(w_c - w_i)/kT} e^{(w_F - w_i)/kT} = n_i e^{(w_F - w_i)/kT} \tag{8.88}$$

$$p = N_v e^{-(w_i - w_v)/kT} e^{-(w_F - w_i)/kT} = p_i e^{-(w_F - w_i)/kT} \tag{8.89}$$

indicating that the conduction electron and hole densities are controlled by the shift
in Fermi level. This shift may be determined by combining (8.79) and (8.83) or (8.87).

EXAMPLE 8.4
Using the atomic weight of germanium, its measured density, and Avogadro's number,
one can determine that the atom density is $4.4 \times 10^{28}$ per m$^3$. If an arsenic impurity
concentration of $10^{-5}$ is added to pure germanium, this means that the volume density of
arsenic atoms is $N_D = 4.4 \times 10^{23}$ per m$^3$. The shift in Fermi level due to this impurity
concentration can be deduced from (8.88). One may write

$$n_n = N_D = 4.4 \times 10^{23} = n_i e^{(w_n - w_i)/kT} = 2.5 \times 10^{19}\, e^{(w_n - w_i)/kT}$$

$$w_n - w_i = kT \ln(17{,}600) = 9.78\,kT$$

At room temperature, $kT = 0.025$ eV and the rise in Fermi level is

$$w_n - w_i = 0.24 \text{ eV}$$

Since $w_c - w_v = 0.72$ eV, this shift amounts to 34 percent of the gap width.
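
The arithmetic of Example 8.4 can be retraced with a short script. This is only a sketch: the constant and function names are illustrative choices, the numerical inputs are those quoted in the example, and the donors are assumed to be fully ionized so that $n_n = N_D$.

```python
import math

# Values quoted in Example 8.4 (germanium at room temperature).
N_ATOMS = 4.4e28   # germanium atoms per m^3
IMPURITY = 1e-5    # arsenic concentration (fraction of atoms)
N_I = 2.5e19       # intrinsic conduction electron density, per m^3
KT_EV = 0.025      # kT at room temperature, eV
GAP_EV = 0.72      # gap width w_c - w_v, eV


def fermi_shift_in_kt(carrier_density, intrinsic_density):
    """Return (w_F - w_i)/kT from n = n_i * exp((w_F - w_i)/kT), as in (8.88)."""
    return math.log(carrier_density / intrinsic_density)


n_donor = N_ATOMS * IMPURITY                # N_D, assumed equal to n_n
shift_kt = fermi_shift_in_kt(n_donor, N_I)  # shift in units of kT
shift_ev = shift_kt * KT_EV                 # shift in eV

print(f"N_D       = {n_donor:.2e} per m^3")
print(f"w_n - w_i = {shift_kt:.2f} kT = {shift_ev:.2f} eV")
print(f"fraction of gap width = {shift_ev / GAP_EV:.0%}")
```

Changing `IMPURITY` shows how slowly the Fermi level moves with doping: each factor of ten in concentration shifts it by only $kT \ln 10$.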

The conduction electron-lattice and hole-lattice interactions in a semiconductor fit
the assumptions of the analysis presented in Section 8.4, the only differences being the
form of the velocity distribution function of mobile carriers and the dependence of
carrier acceleration on the periodic potential of the lattice. When these differences are
taken into account, the result is once again Equation (8.23), with the effective mass
replacing the electronic mass. Therefore the drift current density in a semiconductor
may be written

$$\mathbf{J} = \mathbf{J}_n + \mathbf{J}_p = \sigma_e \mathbf{E} + \sigma_h \mathbf{E} = \sigma \mathbf{E} \tag{8.90}$$

in which $\mathbf{J}_n$ is the current density due to free electron drift and $\mathbf{J}_p$ is the current density
due to hole drift. The conductivities $\sigma_e$ and $\sigma_h$ are given by

$$\sigma_e = \frac{n e^2 \tau_e}{m_e^*} \qquad \sigma_h = \frac{p e^2 \tau_h}{m_h^*} \tag{8.91}$$

with $\tau_e$ and $\tau_h$ the respective relaxation times. The total conductivity $\sigma$ is the sum of
$\sigma_e$ and $\sigma_h$. Because of the anisotropies in the relaxation times and effective masses, $\sigma$ is
in general a tensor.
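Average drift velocities for the conduction electrons and holes may be defined by
the flow equations (written here in terms of magnitudes)

$$J_n = n e v_n \qquad J_p = p e v_p \tag{8.92}$$

which can be combined with (8.90) to yield

$$\sigma_e = n e \frac{v_n}{E} \qquad \sigma_h = p e \frac{v_p}{E} \tag{8.93}$$

The drift velocities per unit field are called the mobilities, and are defined by the relations

$$\mu_n = \frac{v_n}{E} \qquad \mu_p = \frac{v_p}{E} \tag{8.94}$$

In terms of the mobilities, the total conductivity of a homogeneous semiconductor is

$$\sigma = e(n \mu_n + p \mu_p) \tag{8.95}$$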

The mobilities are somewhat dependent on temperature and impurity concentration,
and at room temperature representative values are given in Table 8.7. The principal
determining factors of conductivity in a semiconductor, as seen from (8.95), are the
volume densities of conduction electrons and holes.

TABLE 8.7
MOBILITIES AT 300°K

                 μ_n (m²/volt-sec)    μ_p (m²/volt-sec)
Silicon               0.12                 0.025
Germanium             0.36                 0.17

EXAMPLE 8.5
A cubical block of germanium 0.5 cm on a side contains a $10^{-5}$ arsenic concentration.
When a potential of 5.0 millivolts is applied across opposite faces of this specimen, the
measured current flow is 0.634 amp. From these data, the mobility of free electrons in
germanium may be deduced.
If one borrows the results of Example 8.4, the conduction electron density in this specimen
is $n_n = 4.4 \times 10^{23}$ per m$^3$. Therefore

$$p_n = \frac{n_i p_i}{n_n} = \frac{(2.5 \times 10^{19})^2}{4.4 \times 10^{23}} = 1.4 \times 10^{15}$$

and the conduction electron concentration is seen to be eight orders of magnitude higher
than the hole concentration. For this specimen (8.95) reduces to

$$\sigma = n_n e \mu_n = 7.04 \times 10^{4} \mu_n \text{ mhos/m}$$

Since one may also write for the conductivity

$$\sigma = \frac{J}{E} = \frac{0.634/(0.005)^2}{5 \times 10^{-3}/0.005} = 2.53 \times 10^{4} \text{ mhos/m}$$

it follows that

$$\mu_n = \frac{2.53 \times 10^{4}}{7.04 \times 10^{4}} = 0.36 \text{ m}^2/\text{volt-sec}$$

which agrees with the entry in Table 8.7.
It is interesting to note that this mobility is two orders of magnitude higher than the
mobility of electrons in commercial copper. The vastly greater conductivity of copper is
due solely to its free electron concentration.
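
The deduction in Example 8.5 can be retraced numerically. The sketch below uses only the figures quoted in the example; the variable and function names are illustrative, and the electronic charge is rounded to $1.60 \times 10^{-19}$ coulomb, as the factor $7.04 \times 10^4$ in the example implies.

```python
E_CHARGE = 1.60e-19  # electronic charge, coulombs (rounded as in the example)

side = 0.5e-2        # cube edge, m
voltage = 5.0e-3     # applied potential, volts
current = 0.634      # measured current, amp
n_n = 4.4e23         # conduction electron density from Example 8.4, per m^3
n_i = 2.5e19         # intrinsic density (n_i = p_i), per m^3


def conductivity_from_measurement(current, voltage, length, area):
    """sigma = J / E for a uniform specimen of given length and cross section."""
    current_density = current / area
    field = voltage / length
    return current_density / field


sigma = conductivity_from_measurement(current, voltage, side, side * side)
p_n = n_i * n_i / n_n                # hole density from n_n * p_n = n_i * p_i
mobility = sigma / (n_n * E_CHARGE)  # (8.95) with the hole term neglected

print(f"sigma = {sigma:.3e} mhos/m")
print(f"p_n   = {p_n:.2e} per m^3")
print(f"mu_n  = {mobility:.2f} m^2/volt-sec")
```

Neglecting the hole term is justified here because $p_n$ is some eight orders of magnitude smaller than $n_n$.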
For an intrinsic semiconductor, $n_i = p_i$ and (8.95) may be combined with (8.74) to give

$$\sigma = n_i e(\mu_n + \mu_p) = N_c e(\mu_n + \mu_p) e^{-(w_c - w_i)/kT}$$

With $w_i$ taken midway between $w_c$ and $w_v$, this may be written

$$\sigma = N_c e(\mu_n + \mu_p) e^{-(w_c - w_v)/2kT} \tag{8.96}$$

Equation (8.96) indicates an exponential dependence on temperature of the conductivity
of intrinsic semiconductors, a feature which distinguishes them from metallic conductors.
If $\sigma$, $\mu_n$, and $\mu_p$ are measured as functions of temperature, (8.96) may be used to deduce
the gap width $w_c - w_v$.
For an impure semiconductor in which the impurity concentration is large enough to
dominate the conduction process, one or the other of the two terms in (8.95) may be neglected.
The remaining term contains a factor $n = N_D$ or $p = N_A$, neither of which is temperature
sensitive. Since the mobilities also are not strongly dependent on temperature, the
conductivity of an impure semiconductor is found to be almost independent of temperature
(unless the temperature becomes high enough to cause the intrinsic effects to dominate those
governed by impurities).
In addition to the drift currents discussed in this section, which obey Ohm's law, a homogeneous
semiconductor can support diffusion currents if the local balance between electron-hole
annihilation and generation is upset. This will occur, for example, when two dissimilar
semiconductors have an interface, and this phenomenon is the controlling feature of the
semiconductor diode and the transistor. Diffusion currents in semiconductors are treated in
most texts on transistor theory.²⁵
25 See, e.g., the excellent discussion in R. D. Middlebrook, An Introduction to Junction Transistor
Theory, John Wiley and Sons, Inc., New York, 1957.

8.12 MAXWELL'S EQUATIONS FOR CONDUCTIVE MEDIA

If a collection of dielectric, and/or magnetic, and/or conductive materials† is considered
at the microscopic level to consist of an aggregation of charged particles in motion in a
vacuum, then the developments of Sections 6.21 and 7.17 have shown that Maxwell's
equations in the form

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla \times \mathbf{H} = \mathbf{J} + \dot{\mathbf{D}} \tag{8.97}$$

properly account for the dielectric and magnetic properties. When further it is appropriate
to express the primary current density in the form $\mathbf{J} = \sigma \mathbf{E}$, these equations
become

$$\nabla \times \mathbf{E} = -\dot{\mathbf{B}} \qquad \nabla \times \mathbf{H} = \sigma \mathbf{E} + \dot{\mathbf{D}} \tag{8.98}$$

The auxiliary relations are still

$$\nabla \cdot \mathbf{D} = \rho \qquad \nabla \cdot \mathbf{B} = 0 \tag{8.99}$$

In particular, if a medium is characterizable by the constitutive parameters $\epsilon$ and $\mu$,
as well as $\sigma$, via the relations

$$\mathbf{D} = \epsilon \mathbf{E} \qquad \mathbf{H} = \mu^{-1} \mathbf{B} \qquad \mathbf{J} = \sigma \mathbf{E} \tag{8.100}$$

and the fields are time-harmonic through the function $e^{j\omega t}$, then (8.98) and (8.99)
become

$$\nabla \times \mathbf{E} = -j\omega\mu \mathbf{H} \qquad \nabla \times \mathbf{H} = (\sigma + j\omega\epsilon)\mathbf{E}$$
$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon} \qquad \nabla \cdot \mathbf{H} = 0 \tag{8.101}$$

Equations (8.101) comprise the general form of Maxwell's equations for time-harmonic
fields in linear media and are the starting point for most practical applications of
electromagnetic theory. The interested reader is referred to the journal literature and
to the many fine texts which deal with these subjects. Some of the latter are included
among the references at the ends of Chapters 3, 4, and 5.

† This can include materials which have any two or all three of these properties.

PROBLEMS

8.1 What is the drift velocity of electrons in a copper wire under the influence of an electric
field of strength 100 volts/cm?
8.2 Mobility may be defined as the average drift velocity, per unit of electric field, adopted
by the free electrons in a conductor as they form a steady current flow. Find the mobility
in silver.
8.3 Repeat the previous problem for copper.
8.4 Find the relation between the mean free path and mobility of the free electrons within a
metal.
8.5 Estimate the relaxation time for copper if the measured resistance at room temperature of
1,000 ft of No. 8 wire is 0.641 ohms. (No. 8 wire has a diameter of 0.128 in.)
8.6 Find the Fermi zero energy for free electrons in copper assuming one free electron per
atom. Use this result to find the mean free path and compare your answers with Table 8.3.
8.7 Explain why the constant a in Equation (8.61) is not the residual resistivity.
8.8 For small concentrations of impurities, how should the resistivity of a metal depend on
impurity concentration? Is your answer consistent with the data of Figure 8.10?
8.9 An absolutely pure and perfect silver wire has a conductivity at room temperature (300°K)
of $0.60 \times 10^{8}$ mhos/m. Find the conductivity if the temperature is doubled.
8.10 A germanium specimen contains a $10^{-5}$ impurity concentration of acceptor atoms. Assuming
that the acceptor energy level is 0.05 eV above the valence band, find the fraction of
acceptor atoms which are not ionized at room temperature.
8.11 A rectangular specimen of silicon which contains a uniform concentration of impurity
atoms is shown in the figure and is being considered at room temperature. It has measured
dimensions a = 5 mm, b = 2 mm, c = 1 mm.

[Figure: the rectangular specimen with its x and y coordinate axes and the dimension b marked.]

(a) When a d.c. voltage is applied so as to make the face at x = a one volt higher in
potential than the face at x = 0, a current of 0.52 amp flows through the specimen.
What is the conductivity?
(b) When in addition a steady magnetic field of $10^{4}$ gauss is applied in the positive
Z direction, a Hall voltage of 0.0105 volts is measured in the Y direction, with the
face at y = b at the higher potential. (Cf. Example 4.3.) Is the sample p-type or
n-type? Explain.
(c) What is the mobility of holes in silicon at room temperature?
(d) What is the hole concentration?
(e) If the atom density of silicon is $5.0 \times 10^{22}$ per cm$^3$, what is the impurity concentration?
APPENDIX A
FRINGE SHIFT VERSUS ROTATION OF THE MICHELSON-MORLEY APPARATUS

IF THE Michelson-Morley apparatus is assumed to have equal arms ($l_1 = l_2 = l$) and
to be drifting through the ether at a speed $u \ll c$, a simple expression can be deduced
for the fringe shift as a function of rotational position of the apparatus. Imagine that
the ether is at rest in Figure A.1 and that the interferometer is moving upward as
indicated by the vector $\mathbf{u}$. Let $t_1'$ be the time it takes light to get from the half-silvered
mirror $P$ to the reflecting mirror $M_1$. This light suffers a horizontal displacement
$l \cos\theta$ and a vertical displacement $l \sin\theta + u t_1'$, because during its transit the mirror
$M_1$ has moved upward a distance $u t_1'$. Since the distance traveled through the ether is
also $c t_1'$, one can write

$$(c t_1')^2 = (l \cos\theta)^2 + (l \sin\theta + u t_1')^2$$

from which

$$t_1' = \frac{l}{c^2 - u^2}\left[u \sin\theta + (c^2 - u^2 \cos^2\theta)^{1/2}\right]$$

Similarly let $t_1''$ be the time required for this light to return from $M_1$ to $P$. During
transit it undergoes a horizontal displacement $l \cos\theta$ and a vertical displacement
$l \sin\theta - u t_1''$, because the half-silvered mirror $P$ has moved upward a distance $u t_1''$.
Thus

$$(c t_1'')^2 = (l \cos\theta)^2 + (l \sin\theta - u t_1'')^2$$

and

$$t_1'' = \frac{l}{c^2 - u^2}\left[-u \sin\theta + (c^2 - u^2 \cos^2\theta)^{1/2}\right]$$
Therefore the round-trip time is

$$t_1 = t_1' + t_1'' = \frac{2l}{c^2 - u^2}(c^2 - u^2 \cos^2\theta)^{1/2} \tag{A.1}$$

To find the time for light to travel from $P$ to $M_2$ and back to $P$, one need only replace
$\theta$ by $\theta - \pi/2$ in the above expression and obtain

$$t_2 = \frac{2l}{c^2 - u^2}(c^2 - u^2 \sin^2\theta)^{1/2} \tag{A.2}$$

The difference in phase of the two light components reaching the viewing telescope
is therefore

$$\Delta\phi = \frac{2\pi c}{\lambda}(t_1 - t_2) = \frac{4\pi l c}{\lambda(c^2 - u^2)}\left[(c^2 - u^2 \cos^2\theta)^{1/2} - (c^2 - u^2 \sin^2\theta)^{1/2}\right] \tag{A.3}$$

FIGURE A.1 Rotation of the Michelson-Morley apparatus.

The number of fringes shifted, $n$, can be found by dividing this phase difference by $2\pi$.
If $u \ll c$, the radicals can be expanded to give

$$n \approx -\frac{l u^2}{\lambda c^2} \cos 2\theta \tag{A.4}$$

and thus the fringe shifting occurs as a second harmonic of the rotational angle $\theta$.
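
The quality of the approximation (A.4) can be checked numerically against the unexpanded round-trip times. The sketch below assumes that the round-trip time for an arm at angle $\theta$ is $2l(c^2 - u^2\cos^2\theta)^{1/2}/(c^2 - u^2)$, as follows from adding $t_1'$ and $t_1''$. The arm length, wavelength, and drift speed are illustrative assumptions and are not taken from the text.

```python
import math


def round_trip_time(l, c, u, theta):
    """Round-trip time along an arm at angle theta (sum of t1' and t1'')."""
    return 2.0 * l * math.sqrt(c * c - (u * math.cos(theta)) ** 2) / (c * c - u * u)


def fringe_shift_exact(l, c, u, lam, theta):
    """Phase difference divided by 2*pi, using the unexpanded radicals."""
    t1 = round_trip_time(l, c, u, theta)
    t2 = round_trip_time(l, c, u, theta - math.pi / 2.0)
    return (c / lam) * (t1 - t2)


def fringe_shift_approx(l, c, u, lam, theta):
    """Second-order result (A.4)."""
    return -(l * u * u) / (lam * c * c) * math.cos(2.0 * theta)


# Illustrative (assumed) parameters: 11 m arm, 590 nm light, 30 km/sec drift.
L_ARM, C, U, LAM = 11.0, 3.0e8, 3.0e4, 5.9e-7

for degrees in (0, 30, 45, 60, 90):
    theta = math.radians(degrees)
    exact = fringe_shift_exact(L_ARM, C, U, LAM, theta)
    approx = fringe_shift_approx(L_ARM, C, U, LAM, theta)
    print(f"theta = {degrees:3d} deg  exact = {exact:+.6f}  (A.4) = {approx:+.6f}")
```

The exact form subtracts two nearly equal times, so its last few digits are limited by floating-point rounding rather than by the physics; the agreement with (A.4) should be read with that in mind.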
APPENDIX B
CLASSICAL DOPPLER SHIFT FROM A MOVING
SOURCE IN THE PRESENCE OF A MOVING ETHER

ASSUME THE existence of a luminiferous ether and the validity of the Galilean transformation
and let XYZ be a frame of reference at rest in the ether. Let X'Y'Z' be
another frame whose axes are respectively aligned to those of XYZ, whose origin coincides
with the XYZ origin at $t = t' = 0$, and which is in translative motion at the constant
velocity $\mathbf{u}$ through XYZ.
Additionally, assume that a source of light waves is moving through X'Y'Z' at the
constant velocity $\mathbf{v}$, and thus through XYZ at the constant velocity $\mathbf{u} + \mathbf{v}$. The
directions of $\mathbf{u}$ and $\mathbf{v}$ will be taken as arbitrary but attention will be restricted to the
case that both their magnitudes are small compared to the velocity of light $c$.
In Figure B.1 the large dots represent positions of the light source relative to XYZ
at two instants of time, one period apart. Some of the radiated energy from this source
travels in the direction represented by the unit vector $\mathbf{e}$. It is desired first to find the
wavelength and frequency of the light waves going in this direction.

FIGURE B.1 Radiation from a moving source.

Since XYZ is at rest in the ether, the ray velocity (or velocity of energy transport)
in this frame is the same as the phase velocity of the waves. The latter is $\mathbf{n}c$, in which
$\mathbf{n} = \mathbf{e}$ is a unit vector normal to the wavefront. In time $\tau$, the radiation which left the
source at $t = 0$ has traveled a distance $l = c\tau$ and the source has been displaced a
distance $(\mathbf{u} + \mathbf{v}) \cdot \mathbf{n}\tau$ in the direction of $\mathbf{e}$. Therefore the wavelength is

$$\lambda = [c - (\mathbf{u} + \mathbf{v}) \cdot \mathbf{n}]\tau \tag{B.1}$$

and the frequency is

$$\nu = \frac{c}{\lambda} = \nu_0 \frac{c}{c - (\mathbf{u} + \mathbf{v}) \cdot \mathbf{n}} \tag{B.2}$$
in which $\nu_0 = 1/\tau$ is the frequency of the light source as determined by an observer at
rest relative to the source.
These light waves, traveling in the direction of $\mathbf{e}$ in XYZ, can be described by the
equation

$$\psi = K \cos\phi \tag{B.3}$$

in which $K$ is the wave amplitude and

$$\phi = 2\pi\nu\left(t - \frac{\mathbf{r} \cdot \mathbf{n}}{c}\right) \tag{B.4}$$

is its phase. In (B.4) $\mathbf{r}$ is the position vector of the point $(x, y, z)$, that is

$$\mathbf{r} = \mathbf{1}_x x + \mathbf{1}_y y + \mathbf{1}_z z \tag{B.5}$$

Equation (B.4) has a useful physical interpretation. Imagine that the wavefront
which passes through the origin of XYZ at $t = 0$, traveling in the direction of $\mathbf{e}$, is
labeled. Let an observer $O$ be placed at the fixed point $(x, y, z)$ and let him note the time
at which the labeled wavefront passes him. This time will be

$$t_1 = \frac{\mathbf{r} \cdot \mathbf{n}}{c} \tag{B.6}$$

because the phase of this wavefront never changes and its phase was zero when it
passed through the origin. Thus at any later time $t$, the phase of the wavefront which
is then passing $O$ is given by

$$\phi = 2\pi\nu(t - t_1) \tag{B.7}$$

In short, to measure the phase of the wavefront passing him at any time $t$, $O$ need
take only the labeled wavefront as reference. This means further that the number of
wavecrests which pass $O$ between the time the labeled wavefront passed him and the
time $t$ is $\nu(t - t_1)$.
Now consider an observer $O'$ stationary in X'Y'Z' at the particular point $(x', y', z')$
which causes $O$ and $O'$ to coincide at the instant $t = t'$. If $O'$ counts the total number
of crests of the wave which pass him between the time the labeled wavefront passed
him and the time $t'$, he must get the same answer as $O$. But he will describe the wave
by the equation

$$\psi' = K' \cos\phi' \tag{B.8}$$

in which

$$\phi' = 2\pi\nu'\left(t' - \frac{\mathbf{r}' \cdot \mathbf{n}'}{c'}\right) \tag{B.9}$$

Since the labeled wavefront passed $O'$ at the time

$$t_1' = \frac{\mathbf{r}' \cdot \mathbf{n}'}{c'} \tag{B.10}$$

and the number of crests which have passed him since then is $\nu'(t' - t_1')$, it follows that
the phase of the plane wave is an invariant, that is

$$\phi = \phi' \tag{B.11}$$
From this invariance the characteristics of the wave as viewed from X'Y'Z' can be
determined, since

$$\nu'\left(t' - \frac{\mathbf{r}' \cdot \mathbf{n}'}{c'}\right) = \nu\left(t - \frac{\mathbf{r} \cdot \mathbf{n}}{c}\right) \tag{B.12}$$

must hold for all values of the spatial and temporal variables. If $x$, $y$, $z$, and $t$ are eliminated
from (B.12) through use of the Galilean equations

$$\mathbf{r} = \mathbf{r}' + \mathbf{u}t' \qquad t = t' \tag{B.13}$$

one obtains

$$\nu'\left[t' - \frac{\mathbf{r}' \cdot \mathbf{n}'}{c'}\right] = \nu\left[t' - \frac{(\mathbf{r}' + \mathbf{u}t') \cdot \mathbf{n}}{c}\right] \tag{B.14}$$

If coefficients are equated, the results are

$$\nu' = \nu\left(1 - \frac{\mathbf{u} \cdot \mathbf{n}}{c}\right) \tag{B.15}$$

$$\frac{\nu'}{c'} n_x' = \frac{\nu}{c} n_x, \text{ etc.} \tag{B.16}$$

From (B.16) it follows that

$$\frac{\nu'}{c'} = \frac{\nu}{c} \tag{B.17}$$

$$\mathbf{n}' = \mathbf{n} \tag{B.18}$$

Therefore the wavefront has the same direction in both coordinate systems and a combination
of (B.17) with (B.15) gives

$$c' = c - \mathbf{u} \cdot \mathbf{n} \tag{B.19}$$

Equation (B.19) gives the phase velocity in X'Y'Z'. The ray velocity is given by the
conventional Galilean formula $\mathbf{V}' = \mathbf{n}c - \mathbf{u}$.
Thus the phase velocity and ray velocity are not collinear in X'Y'Z'. If $\mathbf{e}'$ is a unit
vector in the direction of the ray velocity, then

$$\mathbf{n} = \mathbf{e}'\frac{V'}{c} + \frac{\mathbf{u}}{c} \tag{B.20}$$

and

$$\frac{V'}{c} = \mathbf{n} \cdot \mathbf{e}' - \frac{\mathbf{u} \cdot \mathbf{e}'}{c} \tag{B.21}$$

Since $\mathbf{n} \cdot \mathbf{e}'$ differs from unity only in second order, (B.20) can be written as

$$\mathbf{n} = \mathbf{e}'\left(1 - \frac{\mathbf{u} \cdot \mathbf{e}'}{c}\right) + \frac{\mathbf{u}}{c} \tag{B.22}$$

which is correct to first order.


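Upon combining Equations (B.2), (B.15), and (B.22) one obtains

$$\nu' = \nu_0\left\{1 - \frac{(\mathbf{v}/c) \cdot \mathbf{e}' - [(\mathbf{u}/c) \cdot \mathbf{e}'][(\mathbf{v}/c) \cdot \mathbf{e}'] + (\mathbf{u}/c) \cdot (\mathbf{v}/c)}{1 - (\mathbf{u}/c) \cdot \mathbf{e}' + [(\mathbf{u}/c) \cdot \mathbf{e}']^2 - (u/c)^2}\right\}^{-1} \tag{B.23}$$

If this equation is expanded (cf. Mathematical Supplement) and terms through the
second order only are retained, the result is that

$$\nu' = \nu_0\left[1 + \frac{\mathbf{v} \cdot \mathbf{e}'}{c} + \frac{(\mathbf{v} \cdot \mathbf{e}')^2}{c^2} + \frac{\mathbf{u} \cdot \mathbf{v}}{c^2}\right] \tag{B.24}$$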
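
The reduction of (B.23) to (B.24) can be checked numerically. The sketch below evaluates both forms, taking (B.23) to be $\nu_0\{1 - N/D\}^{-1}$ with $N = (\mathbf{v}/c)\cdot\mathbf{e}' - [(\mathbf{u}/c)\cdot\mathbf{e}'][(\mathbf{v}/c)\cdot\mathbf{e}'] + (\mathbf{u}/c)\cdot(\mathbf{v}/c)$ and $D = 1 - (\mathbf{u}/c)\cdot\mathbf{e}' + [(\mathbf{u}/c)\cdot\mathbf{e}']^2 - (u/c)^2$. The velocity vectors and ray direction are arbitrary illustrative values, in units where $c = 1$.

```python
def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def doppler_full(nu0, u, v, e_ray, c):
    """Observed frequency from the unexpanded form (B.23)."""
    ue = dot(u, e_ray) / c
    ve = dot(v, e_ray) / c
    uv = dot(u, v) / (c * c)
    uu = dot(u, u) / (c * c)
    numerator = ve - ue * ve + uv
    denominator = 1.0 - ue + ue * ue - uu
    return nu0 / (1.0 - numerator / denominator)


def doppler_second_order(nu0, u, v, e_ray, c):
    """Observed frequency from the second-order expansion (B.24)."""
    ve = dot(v, e_ray) / c
    uv = dot(u, v) / (c * c)
    return nu0 * (1.0 + ve + ve * ve + uv)


# Illustrative (assumed) values; speeds are of order 1e-4 of c.
C = 1.0
U = (1.0e-4, 2.0e-4, 0.0)    # velocity of X'Y'Z' through the ether
V = (3.0e-4, -1.0e-4, 2.0e-4)  # velocity of the source through X'Y'Z'
E_RAY = (0.6, 0.8, 0.0)      # unit vector e' along the ray

full = doppler_full(1.0, U, V, E_RAY, C)
second = doppler_second_order(1.0, U, V, E_RAY, C)
print(f"(B.23): {full:.15f}")
print(f"(B.24): {second:.15f}")
print(f"difference: {full - second:.3e}")
```

The two results should differ only at third order in the velocity ratios. Note that $\mathbf{u}$ enters (B.24) only through the product $\mathbf{u} \cdot \mathbf{v}$, so the ether drift has no first-order effect on the observed frequency.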
APPENDIX C
SOME PROPERTIES OF BESSEL FUNCTIONS

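THE BESSEL function of the first kind $J_n(v)$ (in which $v = kr$ and $n$ is integral or zero)
is the coefficient of $t^n$ in the power series expansion of the expression

$$\exp\left[\frac{v}{2}\left(t - \frac{1}{t}\right)\right] \tag{C.1}$$

and for this reason (C.1) is known as the generating function. This result may be
appreciated by writing

$$e^{vt/2} e^{-v/2t} = \left[1 + \frac{vt}{2} + \frac{1}{2!}\left(\frac{vt}{2}\right)^2 + \frac{1}{3!}\left(\frac{vt}{2}\right)^3 + \cdots + \frac{1}{n!}\left(\frac{vt}{2}\right)^n + \cdots\right]$$
$$\times \left[1 - \frac{v}{2t} + \frac{1}{2!}\left(\frac{v}{2t}\right)^2 - \frac{1}{3!}\left(\frac{v}{2t}\right)^3 + \cdots + \frac{(-1)^n}{n!}\left(\frac{v}{2t}\right)^n + \cdots\right]$$

from which it follows that the coefficient of $t^n$ is

$$\frac{(v/2)^n}{n!} - \frac{(v/2)^{n+2}}{1!(n+1)!} + \frac{(v/2)^{n+4}}{2!(n+2)!} - \cdots + (-1)^m \frac{(v/2)^{n+2m}}{m!(n+m)!} + \cdots$$

which is identical with (3.72). Thus

$$\exp\left[\frac{v}{2}\left(t - \frac{1}{t}\right)\right] = \sum_{n=-\infty}^{\infty} t^n J_n(v) \tag{C.2}$$

If in (C.2) one replaces $t$ by $-1/t$ the result is

$$\exp\left[\frac{v}{2}\left(-\frac{1}{t} + t\right)\right] = \sum_{n=-\infty}^{\infty} (-t)^{-n} J_n(v)$$

If the summation index $-n$ is substituted for $n$, this becomes

$$\exp\left[\frac{v}{2}\left(t - \frac{1}{t}\right)\right] = \sum_{n=-\infty}^{\infty} (-1)^n t^n J_{-n}(v) \tag{C.3}$$

Comparison of (C.3) and (C.2) reveals that

$$J_{-n}(v) = (-1)^n J_n(v) \tag{C.4}$$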
The Bessel function of the second kind, for integral order, may be defined by the
relation

$$Y_n(v) = \lim_{\nu \to n} \frac{\cos\nu\pi\, J_\nu(v) - J_{-\nu}(v)}{\sin\nu\pi} \tag{C.5}$$

and application of L'Hospital's rule to (C.5) leads to the series (3.73). It therefore
follows, since $Y_n$ is definable in terms of Bessel functions of the first kind, which obey
(C.4), that

$$Y_{-n}(v) = (-1)^n Y_n(v) \tag{C.6}$$

Because of the nature of the defining relations (3.76) and (3.77) the Hankel functions
also obey this law. However, the manner in which the modified Bessel functions are
defined leads to the result that

$$I_{-n}(w) = I_n(w) \qquad K_{-n}(w) = K_n(w) \tag{C.7}$$
with w == fro These formulas are useful when working with the orthogonal representa-
tions (3.85) and (3.87).
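When both sides of (C.2) are differentiated with respect to $t$, one obtains

$$\sum_{n=-\infty}^{\infty} n t^{n-1} J_n(v) = \frac{v}{2}\left(1 + \frac{1}{t^2}\right) \sum_{n=-\infty}^{\infty} t^n J_n(v)$$

If the expression on the right is arranged in powers of $t$ and the coefficients of $t^{n-1}$ are
equated, it is evident that

$$\frac{2n}{v} J_n(v) = J_{n-1}(v) + J_{n+1}(v) \tag{C.8}$$

If any two successive Bessel functions are known, the third in sequence can be deduced
from (C.8) and then this process may be repeated indefinitely.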
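
The repeated use of the recurrence can be sketched in a few lines. The code below assumes that (C.8) reads $(2n/v)J_n(v) = J_{n-1}(v) + J_{n+1}(v)$ and that $J_n$ is given by the ascending series whose general term is $(-1)^m (v/2)^{n+2m}/[m!\,(n+m)!]$. The function names are illustrative, and only the standard library is used.

```python
import math


def bessel_j(n, v, terms=60):
    """J_n(v) for integer n >= 0, summed from the ascending power series."""
    half = 0.5 * v
    term = half ** n / math.factorial(n)      # the m = 0 term
    total = term
    for m in range(1, terms):
        term *= -half * half / (m * (n + m))  # ratio of successive terms
        total += term
    return total


def bessel_j_upward(n_max, v):
    """Return [J_0(v), ..., J_n_max(v)], starting from J_0 and J_1 and
    applying J_{n+1} = (2n/v) J_n - J_{n-1} repeatedly."""
    values = [bessel_j(0, v), bessel_j(1, v)]
    for n in range(1, n_max):
        values.append(2.0 * n / v * values[n] - values[n - 1])
    return values[: n_max + 1]


v = 2.5
for n, value in enumerate(bessel_j_upward(5, v)):
    print(f"n = {n}: recurrence {value:+.10f}   series {bessel_j(n, v):+.10f}")
```

In floating-point arithmetic the upward recurrence loses accuracy once $n$ appreciably exceeds $v$, so a sketch like this is best confined to low orders or checked against the series, as is done here.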
Alternatively, if both sides of (C.2) are differentiated with respect to $v$, one obtains

$$\frac{1}{2}\left(t - \frac{1}{t}\right) \sum_{n=-\infty}^{\infty} t^n J_n(v) = \sum_{n=-\infty}^{\infty} t^n J_n'(v)$$

Upon equating coefficients of $t^n$ on the two sides of this identity, one finds that

$$J_n'(v) = \tfrac{1}{2}[J_{n-1}(v) - J_{n+1}(v)] \tag{C.9}$$

Equations (C.8) and (C.9) are known as recurrence relations and are also satisfied by
Bessel functions of the second and third kind. However, because of the nature of the
definitions (3.80) and (3.81), the modified Bessel functions satisfy

$$n I_n(w) = \frac{w}{2}[I_{n-1}(w) - I_{n+1}(w)] \tag{C.10}$$

$$I_n'(w) = \tfrac{1}{2}[I_{n-1}(w) + I_{n+1}(w)] \tag{C.11}$$

and

$$n K_n(w) = -\frac{w}{2}[K_{n-1}(w) - K_{n+1}(w)] \tag{C.12}$$

$$K_n'(w) = -\tfrac{1}{2}[K_{n-1}(w) + K_{n+1}(w)] \tag{C.13}$$

Upon eliminating either $J_{n-1}$ or $J_{n+1}$ from (C.8) and (C.9) one obtains

$$v J_n'(v) + n J_n(v) = v J_{n-1}(v)$$

$$v J_n'(v) - n J_n(v) = -v J_{n+1}(v)$$

which are equivalent to

$$\frac{d}{dv}[v^n J_n(v)] = v^n J_{n-1}(v) \tag{C.14}$$

$$\frac{d}{dv}[v^{-n} J_n(v)] = -v^{-n} J_{n+1}(v) \tag{C.15}$$

These differentiation formulas are also obeyed by the Bessel functions of the second and
third kind. For $n = 0$ the result is simply

$$J_0'(v) = -J_1(v) \tag{C.16}$$

Because of the difference in the recurrence relations, the modified Bessel functions
satisfy

$$\frac{d}{dw}[w^n I_n(w)] = w^n I_{n-1}(w) \tag{C.17}$$

$$\frac{d}{dw}[w^{-n} I_n(w)] = w^{-n} I_{n+1}(w) \tag{C.18}$$

and

$$\frac{d}{dw}[w^n K_n(w)] = -w^n K_{n-1}(w) \tag{C.19}$$

$$\frac{d}{dw}[w^{-n} K_n(w)] = -w^{-n} K_{n+1}(w) \tag{C.20}$$

If $v = kr$ is real, the $J_n(v)$ functions oscillate and each has a sequence of roots which
may be designated by $\gamma_{n1}, \gamma_{n2}, \ldots, \gamma_{nm}, \ldots$, such that $J_n(\gamma_{nm}) = 0$,
$m = 1, 2, 3, \ldots$
A family of functions

$$J_n\left(\gamma_{nm} \frac{v}{v_0}\right) \qquad m = 1, 2, 3, \ldots \tag{C.21}$$

can be generated with the property that each of these functions has a null at $v = v_0$; for
the $m$th function there are $m$ nulls in the interval $0 \le v \le v_0$. That the individual
members of the family (C.21) are orthogonal to each other may be seen by the following
argument:
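
Let

$$f_m(v) = \sqrt{v}\, J_n\left(\gamma_{nm} \frac{v}{v_0}\right)$$

$$f_p(v) = \sqrt{v}\, J_n\left(\gamma_{np} \frac{v}{v_0}\right)$$

be any two members of the family. By direct substitution they are seen to satisfy the
differential equations

$$f_m'' + \left(\frac{\gamma_{nm}^2}{v_0^2} - \frac{4n^2 - 1}{4v^2}\right) f_m = 0$$

$$f_p'' + \left(\frac{\gamma_{np}^2}{v_0^2} - \frac{4n^2 - 1}{4v^2}\right) f_p = 0$$

Multiplying the first of these by $f_p$ and the second by $f_m$ and subtracting furnishes the
identity

$$-\frac{\gamma_{nm}^2 - \gamma_{np}^2}{v_0^2} f_m f_p = f_m'' f_p - f_p'' f_m$$

Integration of both sides of this identity from 0 to $b$ yields

$$-\frac{\gamma_{nm}^2 - \gamma_{np}^2}{v_0^2} \int_0^b f_m f_p\, dv = [f_m' f_p]_0^b - \int_0^b f_m' f_p'\, dv - [f_p' f_m]_0^b + \int_0^b f_p' f_m'\, dv = [f_m' f_p - f_p' f_m]_0^b$$

which can be written

$$\frac{\gamma_{nm}^2 - \gamma_{np}^2}{v_0^2} \int_0^b v J_n\left(\gamma_{nm} \frac{v}{v_0}\right) J_n\left(\gamma_{np} \frac{v}{v_0}\right) dv = \frac{b}{v_0}\left[\gamma_{np} J_n\left(\gamma_{nm} \frac{b}{v_0}\right) J_n'\left(\gamma_{np} \frac{b}{v_0}\right) - \gamma_{nm} J_n\left(\gamma_{np} \frac{b}{v_0}\right) J_n'\left(\gamma_{nm} \frac{b}{v_0}\right)\right] \tag{C.22}$$

If $m \ne p$ and $b = v_0$, since $J_n(\gamma_{nm}) = J_n(\gamma_{np}) = 0$, the above formula reduces to

$$\int_0^{v_0} v J_n\left(\gamma_{nm} \frac{v}{v_0}\right) J_n\left(\gamma_{np} \frac{v}{v_0}\right) dv = 0 \tag{C.23}$$

and thus (C.21) is an orthogonal family of functions.
Upon differentiating (C.22) with respect to $\gamma_{nm}$ and then letting $m = p$ and $b = v_0$
one obtains

$$\int_0^{v_0} v J_n^2\left(\gamma_{nm} \frac{v}{v_0}\right) dv = \frac{v_0^2}{2}\left[J_n'(\gamma_{nm})\right]^2 \tag{C.24}$$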

In the development leading to (C.15) it was shown that

$$J_n'(v) = \frac{n}{v} J_n(v) - J_{n+1}(v)$$

and thus

$$J_n'(\gamma_{nm}) = -J_{n+1}(\gamma_{nm}) \tag{C.25}$$

Combining these last three results, one can express the orthogonality relation in the
form

$$\int_0^{v_0} v J_n\left(\gamma_{nm} \frac{v}{v_0}\right) J_n\left(\gamma_{np} \frac{v}{v_0}\right) dv = \frac{v_0^2}{2} J_{n+1}^2(\gamma_{nm})\, \delta_{mp} \tag{C.26}$$

in which $\delta_{mp}$ is the Kronecker delta and has the value unity if $m = p$, but is otherwise
zero.
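
The orthogonality relation can be checked by direct numerical integration. The sketch below assumes (C.26) in the form $\int_0^{v_0} v J_n(\gamma_{nm} v/v_0) J_n(\gamma_{np} v/v_0)\,dv = (v_0^2/2) J_{n+1}^2(\gamma_{nm})\,\delta_{mp}$, uses the first three roots of $J_0$ as listed in Table C.1, and evaluates $J_n$ from its ascending series. The function names, the choice $v_0 = 1$, and the use of Simpson's rule are illustrative assumptions.

```python
import math


def bessel_j(n, v, terms=60):
    """J_n(v) for integer n >= 0, summed from the ascending power series."""
    half = 0.5 * v
    term = half ** n / math.factorial(n)
    total = term
    for m in range(1, terms):
        term *= -half * half / (m * (n + m))
        total += term
    return total


def simpson(f, a, b, intervals=2000):
    """Composite Simpson's rule; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


GAMMA_0 = [2.4048, 5.5201, 8.6537]  # first three roots of J_0 (Table C.1)
V0 = 1.0


def overlap(n, gamma_m, gamma_p, v0=V0):
    """Left side of (C.26), evaluated numerically."""
    def integrand(v):
        return v * bessel_j(n, gamma_m * v / v0) * bessel_j(n, gamma_p * v / v0)
    return simpson(integrand, 0.0, v0)


def norm_from_c26(n, gamma_m, v0=V0):
    """Right side of (C.26) for m = p."""
    return 0.5 * v0 * v0 * bessel_j(n + 1, gamma_m) ** 2


for m, gamma_m in enumerate(GAMMA_0, start=1):
    row = [overlap(0, gamma_m, gamma_p) for gamma_p in GAMMA_0]
    print(m, ["%+.6f" % x for x in row], "norm %.6f" % norm_from_c26(0, gamma_m))
```

Because the tabulated roots are rounded to four decimal places, the off-diagonal integrals come out small rather than exactly zero.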
Some of the roots $\gamma_{nm}$ for the lower-order Bessel functions are listed in Table C.1.

TABLE C.1
THE ROOTS γ_nm OF J_n(v); J_n(γ_nm) = 0

 m \ n       0          1          2          3          4          5

   1      2.4048     3.8317     5.1356     6.3802     7.5883     8.7715
   2      5.5201     7.0156     8.4172     9.7610    11.0647    12.3386
   3      8.6537    10.1735    11.6198    13.0152    14.3725    15.7002
   4     11.7915    13.3237    14.7960    16.2235    17.6160    18.9801
   5     14.9309    16.4706    17.9598    19.4094    20.8269    22.2178
   6     18.0711    19.6159    21.1170    22.5827    24.0190    25.4303
APPENDIX D
THE ASSOCIATED LEGENDRE EQUATION

IN CHAPTER 3 the differential equation

(D.I)

was encountered and its solution will be undertaken here. As a first step, consider the
case m = 0 which results in the ordinary Legendre equation

$$(1 - u^2)\frac{d^2 g}{du^2} - 2u\frac{dg}{du} + n(n + 1)g = 0 \tag{D.2}$$

Let a solution to (D.2) be assumed in the form

$$g = \sum_{p=0}^{\infty} a_p u^{s+p} \tag{D.3}$$

in which $s$ is a constant. Then

$$\frac{dg}{du} = \sum_{p=0}^{\infty} (s + p) a_p u^{s+p-1} \qquad \frac{d^2 g}{du^2} = \sum_{p=0}^{\infty} (s + p)(s + p - 1) a_p u^{s+p-2}$$

and substitution of these terms in (D.2) gives

$$\sum_{p=0}^{\infty} (s + p)(s + p - 1) a_p u^{s+p-2} - \sum_{p=2}^{\infty} (s + p - 2)(s + p - 3) a_{p-2} u^{s+p-2}$$
$$- 2\sum_{p=2}^{\infty} (s + p - 2) a_{p-2} u^{s+p-2} + n(n + 1)\sum_{p=2}^{\infty} a_{p-2} u^{s+p-2} = 0$$

Since this result is to hold for all values of $u$, the coefficient of each power of $u$ must
separately equal zero and therefore

$$s(s - 1)a_0 = 0$$
$$(s + 1)s a_1 = 0$$
$$(s + p)(s + p - 1)a_p = [(s + p - 2)(s + p - 1) - n(n + 1)]a_{p-2}$$

If $s = 0$ the first two of these conditions are satisfied and the third condition becomes
the recursion formula

$$a_p = -\frac{(n - p + 2)(n + p - 1)}{p(p - 1)} a_{p-2} \tag{D.4}$$
The solution to (D.2) can then be written

$$g = a_0\left[1 - \frac{n(n+1)}{2!}u^2 + \frac{n(n-2)(n+1)(n+3)}{4!}u^4 - \cdots\right]$$

$$+\; a_1\left[u - \frac{(n-1)(n+2)}{3!}u^3 + \frac{(n-1)(n-3)(n+2)(n+4)}{5!}u^5 - \cdots\right] \tag{D.5}$$

For non-integral values of n both of the series in (D.5) converge except at u = ±1.
Since one series is odd and the other even, they represent linearly independent solutions
of (D.2), so that (D.5) is a general solution provided that |u| < 1. Nothing further
is added by choosing s = ±1, since each choice leads to one of the series in (D.5).
If n is an even integer, it is clear that the first series in (D.5) terminates and is thus
a polynomial, whereas if n is an odd integer the second series reduces to a polynomial.
If the arbitrary constants $a_0$ and $a_1$ are adjusted so as to give these polynomials the
value unity when u = 1, the Legendre polynomials are obtained, the first few of which
are:

$$P_0(u) = 1 \qquad P_1(u) = u = \cos\theta$$
$$P_2(u) = \tfrac{3}{2}u^2 - \tfrac{1}{2} = \tfrac{3}{4}\cos 2\theta + \tfrac{1}{4}$$
$$P_3(u) = \tfrac{5}{2}u^3 - \tfrac{3}{2}u = \tfrac{5}{8}\cos 3\theta + \tfrac{3}{8}\cos\theta$$

These polynomials can also be generated from Rodrigues' formula


$$P_n(u) = \frac{1}{2^n n!}\frac{d^n}{du^n}(u^2 - 1)^n \tag{D.6}$$
which may be verified by expansion.
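The construction described above, running the recursion (D.4) from $a_0$ or $a_1$ and then scaling so that the polynomial equals unity at u = 1, can be sketched in a few lines. This is an illustrative sketch only; the helper name `legendre_coeffs` is hypothetical, and exact rational arithmetic is used so that the coefficients can be compared directly with the polynomials listed in the text.

```python
from fractions import Fraction

def legendre_coeffs(n):
    """Coefficients [a_0, ..., a_n] of P_n(u), built from recursion (D.4)
    and scaled so that the polynomial equals 1 at u = 1."""
    a = [Fraction(0)] * (n + 1)
    start = n % 2                 # even n uses the a_0 series, odd n the a_1 series
    a[start] = Fraction(1)
    for p in range(start + 2, n + 1, 2):
        a[p] = -Fraction((n - p + 2) * (n + p - 1), p * (p - 1)) * a[p - 2]
    scale = sum(a)                # value of the unscaled polynomial at u = 1
    return [c / scale for c in a]

for n in range(4):
    print(n, [str(c) for c in legendre_coeffs(n)])
```

Because the series terminates when p exceeds n, the loop needs no convergence test.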
For n an integer, the nonterminating series in (D.5), with the constant suitably
adjusted, is known as the Legendre function of the second kind, $Q_n(u)$. These functions
are characterized by singularities at u = ±1 and must be excluded from the solutions
of physical problems in regions containing the polar axis. They will not be considered
in this appendix.
The Legendre polynomials $P_n(u)$ defined above satisfy (D.2), which may be written
in the form

$$\frac{d}{du}\left[(1 - u^2)\frac{dP_n}{du}\right] + n(n+1)P_n = 0 \tag{D.7}$$

If this equation is differentiated m times with respect to u one obtains


$$(1 - u^2)\frac{d^2 h}{du^2} - 2(m+1)u\frac{dh}{du} + [n(n+1) - m(m+1)]\,h = 0 \tag{D.8}$$

in which

$$h(u) = \frac{d^m P_n}{du^m}$$

When one lets $h(u) = (1 - u^2)^{-m/2} f_2(u)$, Equation (D.8) transforms into (D.1). Thus

$$f_2(u) = P_n^m(u) = (1 - u^2)^{m/2}\frac{d^m P_n(u)}{du^m} \tag{D.9}$$

is a solution to the associated Legendre Equation (D.1). The functions

$$P_n^m(u) = \frac{(1 - u^2)^{m/2}}{2^n n!}\frac{d^{n+m}}{du^{n+m}}(u^2 - 1)^n \tag{D.10}$$

are known as the associated Legendre functions of the first kind. Since $P_n$ is a polynomial
of order n, it follows that $P_n^m(u) = 0$ for m > n.
It is obvious that the functions $P_n^0(u)$ are identical with the polynomials $P_n(u)$
previously listed. If one uses (D.9),

$$P_1^1(u) = (1 - u^2)^{1/2} = \sin\theta$$
$$P_2^1(u) = 3u(1 - u^2)^{1/2} = \tfrac{3}{2}\sin 2\theta$$
$$P_3^1(u) = \tfrac{3}{2}(5u^2 - 1)(1 - u^2)^{1/2} = \tfrac{3}{8}\sin\theta + \tfrac{15}{8}\sin 3\theta$$
$$P_2^2(u) = 3(1 - u^2) = \tfrac{3}{2} - \tfrac{3}{2}\cos 2\theta$$
$$P_3^2(u) = 15u(1 - u^2) = \tfrac{15}{4}\cos\theta - \tfrac{15}{4}\cos 3\theta$$
$$P_3^3(u) = 15(1 - u^2)^{3/2} = \tfrac{45}{4}\sin\theta - \tfrac{15}{4}\sin 3\theta$$
A second generating function for the Legendre polynomials is given by the expression

$$f(u,t) = [1 - (2ut - t^2)]^{-1/2}$$

which can be expanded into the series (cf. Mathematical Supplement)

$$f(u,t) = 1 + \frac{(\tfrac{1}{2})}{1!}(2ut - t^2) + \frac{(\tfrac{1}{2})(\tfrac{3}{2})}{2!}(2ut - t^2)^2 + \cdots + \frac{(\tfrac{1}{2})(\tfrac{3}{2})\cdots[(2n-1)/2]}{n!}(2ut - t^2)^n + \cdots$$

If this is rearranged as a power series in t one obtains

$$f(u,t) = 1 + ut + \frac{3u^2 - 1}{2}\,t^2 + \frac{5u^3 - 3u}{2}\,t^3 + \cdots$$

and the coefficients of the different powers of t are recognized to be the Legendre polynomials,
so that

$$(1 - 2ut + t^2)^{-1/2} = \sum_{n=0}^{\infty} t^n P_n(u) \tag{D.11}$$

Differentiation of (D.11) with respect to t gives

$$\frac{u - t}{(1 - 2ut + t^2)^{3/2}} = \sum_{n=0}^{\infty} n\,t^{n-1} P_n(u) \tag{D.12}$$

which can be written

$$(u - t)\sum_{n=0}^{\infty} t^n P_n(u) = (1 - 2ut + t^2)\sum_{n=0}^{\infty} n\,t^{n-1} P_n(u)$$

Equating coefficients of $t^n$, one determines that

$$(n+1)P_{n+1}(u) - (2n+1)\,u\,P_n(u) + n\,P_{n-1}(u) = 0 \tag{D.13}$$

This recurrence relation will permit the determination of any Legendre polynomial if
two successive ones are known.
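The upward recurrence (D.13) lends itself to direct evaluation. The sketch below (function name `legendre` is hypothetical) starts from $P_0 = 1$ and $P_1 = u$ and then compares a truncated form of the series in (D.11) with the closed-form generating function at one arbitrarily chosen test point; the values of u and t are illustrative assumptions.

```python
def legendre(n, u):
    """P_n(u) by the upward recurrence (D.13), starting from P_0 = 1 and P_1 = u."""
    p_prev, p = 1.0, u
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k+1) P_{k+1} = (2k+1) u P_k - k P_{k-1}
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p

# Compare the truncated series of (D.11) with the closed form (test point is arbitrary).
u, t = 0.3, 0.4
series = sum(t**n * legendre(n, u) for n in range(60))
closed = (1.0 - 2.0 * u * t + t * t) ** -0.5
print(series, closed)
```

With |t| < 1 the terms decay geometrically, so sixty terms are ample at this test point.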

Differentiation of (D.11) with respect to u yields

$$\frac{t}{(1 - 2ut + t^2)^{3/2}} = \sum_{n=0}^{\infty} t^n P_n'(u) \tag{D.14}$$

which can be rearranged as

$$t\sum_{n=0}^{\infty} t^n P_n(u) = (1 - 2ut + t^2)\sum_{n=0}^{\infty} t^n P_n'(u)$$

The coefficient of $t^n$ gives

$$P_{n-1}(u) = P_n'(u) - 2u\,P_{n-1}'(u) + P_{n-2}'(u) \tag{D.15}$$


Knowledge of the derivatives of two successive Legendre polynomials will thus permit
determination of any other through the use of (D.15).
Alternatively, (D.14) can be rearranged with the aid of (D.12) to give

$$t\sum_{n=0}^{\infty} n\,t^{n-1} P_n(u) = (u - t)\sum_{n=0}^{\infty} t^n P_n'(u)$$

which yields the recursion formula

$$n\,P_n(u) = u\,P_n'(u) - P_{n-1}'(u) \tag{D.16}$$

from which the derivative of any Legendre polynomial can be determined if one
polynomial and its derivative are known.
Combination of (D.15) and (D.16) delivers the useful differentiation formula

$$(1 - u^2)\frac{dP_n}{du} = n\,P_{n-1}(u) - n\,u\,P_n(u) \tag{D.17}$$
Recurrence relations for the associated Legendre functions follow readily with the
aid of (D.10). Two of the more important formulas are

$$(n - m + 1)P_{n+1}^m - (2n+1)\,u\,P_n^m + (n + m)P_{n-1}^m = 0 \tag{D.18}$$

$$(1 - u^2)\frac{dP_n^m}{du} = (n + m)P_{n-1}^m - n\,u\,P_n^m \tag{D.19}$$

One of the most useful properties of the Legendre polynomials is their orthogonality
in the interval $-1 \le u \le 1$. This can be established by returning to the differential
equation (D.7). The two polynomials $P_l(u)$ and $P_n(u)$ satisfy this equation in the forms

$$\frac{d}{du}\left[(1 - u^2)P_l'(u)\right] + l(l+1)P_l(u) = 0 \tag{D.20}$$

$$\frac{d}{du}\left[(1 - u^2)P_n'(u)\right] + n(n+1)P_n(u) = 0 \tag{D.21}$$

Upon multiplying (D.20) and (D.21) by $P_n(u)$ and $P_l(u)$ respectively, subtracting, and
then integrating from −1 to +1, one obtains

$$(l - n)(l + n + 1)\int_{-1}^{1} P_l(u)P_n(u)\,du = \Big[(1 - u^2)\big[P_n(u)P_l'(u) - P_l(u)P_n'(u)\big]\Big]_{-1}^{1} = 0 \tag{D.22}$$

in which the right side of (D.22) has been achieved through integration by parts.
Therefore

$$\int_{-1}^{1} P_l(u)P_n(u)\,du = 0 \qquad l \neq n \tag{D.23}$$

and the Legendre polynomials are orthogonal.


To determine the value of this integral if l = n, the generating function (D.11) can
be used. Squaring both sides and integrating with respect to u gives

$$\int_{-1}^{1} (1 - 2ut + t^2)^{-1}\,du = \int_{-1}^{1} [P_0(u) + tP_1(u) + \cdots + t^n P_n(u) + \cdots]^2\,du$$

which becomes

$$\left[-\frac{1}{2t}\ln(1 - 2ut + t^2)\right]_{-1}^{1} = \sum_{n=0}^{\infty} t^{2n}\int_{-1}^{1} P_n^2(u)\,du$$

with the reduction of the right side occurring by virtue of (D.23). Insertion of the limits
yields

$$\frac{1}{t}\ln\frac{1 + t}{1 - t} = \sum_{n=0}^{\infty}\frac{2\,t^{2n}}{2n + 1} = \sum_{n=0}^{\infty} t^{2n}\int_{-1}^{1} P_n^2(u)\,du$$

in which the logarithmic function has been replaced by its series expansion. Equating
coefficients of like powers of t, one obtains

$$\int_{-1}^{1} P_n^2(u)\,du = \frac{2}{2n + 1} \tag{D.24}$$
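Both (D.23) and (D.24) are easy to check numerically. The following sketch is an illustration rather than part of the derivation: it evaluates the polynomials with the recurrence (D.13) and integrates with a composite Simpson rule. The helper names `legendre`, `simpson`, and `inner` are hypothetical, and the number of intervals is an arbitrary choice.

```python
def legendre(n, u):
    """P_n(u) by the recurrence (D.13)."""
    p_prev, p = 1.0, u
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p

def simpson(f, a, b, intervals=2000):
    """Composite Simpson rule on [a, b]; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def inner(l, n):
    """Integral of P_l(u) P_n(u) over -1 <= u <= 1."""
    return simpson(lambda u: legendre(l, u) * legendre(n, u), -1.0, 1.0)

print(inner(1, 3))                 # orthogonality, (D.23): should be near zero
print(inner(4, 4), 2.0 / 9.0)      # normalization, (D.24), for n = 4
```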

The associated Legendre functions $P_l^m$ and $P_n^m$, which satisfy

$$\frac{d}{du}\left[(1 - u^2)\frac{dP_l^m}{du}\right] + \left[l(l+1) - \frac{m^2}{1 - u^2}\right]P_l^m = 0 \tag{D.25}$$

$$\frac{d}{du}\left[(1 - u^2)\frac{dP_n^m}{du}\right] + \left[n(n+1) - \frac{m^2}{1 - u^2}\right]P_n^m = 0 \tag{D.26}$$

are also orthogonal in the same interval. This can be established by a repetition of the
foregoing procedure. If (D.25) and (D.26) are multiplied by $P_n^m$ and $P_l^m$ respectively, the
difference taken, and the result integrated, the result is that

$$\int_{-1}^{1} P_l^m(u)P_n^m(u)\,du = 0 \qquad l \neq n \tag{D.27}$$

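The normalization integral is

$$\int_{-1}^{1} [P_n^m(u)]^2\,du = \int_{-1}^{1} (1 - u^2)^m\,\frac{d^m P_n}{du^m}\,\frac{d^m P_n}{du^m}\,du$$

which reduces to

$$\int_{-1}^{1} [P_n^m(u)]^2\,du = -\int_{-1}^{1}\frac{d^{m-1}P_n}{du^{m-1}}\,\frac{d}{du}\left[(1 - u^2)^m\frac{d^m P_n}{du^m}\right]du \tag{D.28}$$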

after integration by parts. If in (D.8) one replaces m by m − 1 and multiplies through
by $(1 - u^2)^{m-1}$, there results

$$\frac{d}{du}\left[(1 - u^2)^m\frac{d^m P_n}{du^m}\right] = -(n - m + 1)(n + m)(1 - u^2)^{m-1}\frac{d^{m-1}P_n}{du^{m-1}}$$

Substitution of this expression in (D.28) gives

$$\int_{-1}^{1} [P_n^m(u)]^2\,du = (n + m)(n - m + 1)\int_{-1}^{1} (1 - u^2)^{m-1}\,\frac{d^{m-1}P_n}{du^{m-1}}\,\frac{d^{m-1}P_n}{du^{m-1}}\,du$$

$$= (n + m)(n - m + 1)\int_{-1}^{1} [P_n^{m-1}(u)]^2\,du \tag{D.29}$$

with the aid of (D.28). Use of the reduction formula (D.29) yields

$$\int_{-1}^{1} [P_n^m(u)]^2\,du = \frac{(n + m)!}{(n - m)!}\int_{-1}^{1} [P_n^0(u)]^2\,du$$

Finally, through the use of (D.24),

$$\int_{-1}^{1} P_n^m(u)P_l^m(u)\,du = \frac{2}{2n + 1}\,\frac{(n + m)!}{(n - m)!}\,\delta_{ln} \tag{D.30}$$

This result is of considerable importance since it provides the opportunity to expand
a function $f_2(u)$ in terms of associated Legendre polynomials with the coefficients
individually determinable from (D.30). This technique greatly facilitates the solution
of many boundary value problems.
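The normalization (D.30) can be checked numerically in the same spirit. The sketch below is an illustration under stated assumptions: it builds $P_n^m$ from the closed form $P_m^m = (2m-1)!!\,(1-u^2)^{m/2}$, which follows from (D.10) with n = m, together with the recurrence (D.18), and uses the sign convention of (D.9) (no alternating factor). The names `assoc_legendre`, `simpson`, and `inner_m` are hypothetical.

```python
import math

def assoc_legendre(n, m, u):
    """P_n^m(u) in the convention of (D.9), via P_m^m and the recurrence (D.18)."""
    if m > n:
        return 0.0
    pmm = (1.0 - u * u) ** (m / 2.0)
    for k in range(1, m + 1):
        pmm *= (2 * k - 1)               # builds the factor (2m-1)!!
    if n == m:
        return pmm
    p_prev, p = pmm, (2 * m + 1) * u * pmm   # P_m^m and P_{m+1}^m
    for k in range(m + 1, n):
        # (k-m+1) P_{k+1}^m = (2k+1) u P_k^m - (k+m) P_{k-1}^m
        p_prev, p = p, ((2 * k + 1) * u * p - (k + m) * p_prev) / (k - m + 1)
    return p

def simpson(f, a, b, intervals=2000):
    """Composite Simpson rule on [a, b]; `intervals` must be even."""
    h = (b - a) / intervals
    total = f(a) + f(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def inner_m(n, l, m):
    """Integral of P_n^m(u) P_l^m(u) over -1 <= u <= 1."""
    return simpson(lambda u: assoc_legendre(n, m, u) * assoc_legendre(l, m, u), -1.0, 1.0)

n, m = 3, 2
expected = 2.0 / (2 * n + 1) * math.factorial(n + m) / math.factorial(n - m)
print(inner_m(n, n, m), expected)   # normalization, (D.30)
print(inner_m(4, 2, 2))             # orthogonality, (D.27): should be near zero
```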
APPENDIX E
COMPOSITION OF GENERAL SOURCES

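LET A general current density distribution $\boldsymbol{\iota}(x,y,z,t)$ be represented by a fourfold
Fourier integral such that a component may be written

$$\iota_x(x,y,z,t) = \int_{-\infty}^{\infty} g_1(k_x)e^{jk_x x}\,dk_x \int_{-\infty}^{\infty} g_2(k_y)e^{jk_y y}\,dk_y \int_{-\infty}^{\infty} g_3(k_z)e^{jk_z z}\,dk_z \int_{-\infty}^{\infty} g_4(\omega)e^{j\omega t}\,d\omega \tag{E.1}$$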

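This may also be expressed as

$$\iota_x(x,y,z,t) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} g_x(k_x,k_y,k_z,\omega)\,e^{j(\omega t + k_x x + k_y y + k_z z)}\,dk_x\,dk_y\,dk_z\,d\omega \tag{E.2}$$

in which $g_x = g_1 g_2 g_3 g_4$. Proceeding in this manner for all three components, one is able
to represent the current density in the form

$$\boldsymbol{\iota}(x,y,z,t) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} \mathbf{g}(k_x,k_y,k_z,\omega)\,e^{j(\omega t + k_x x + k_y y + k_z z)}\,dk_x\,dk_y\,dk_z\,d\omega \tag{E.3}$$

with $\mathbf{g} = \mathbf{1}_x g_x + \mathbf{1}_y g_y + \mathbf{1}_z g_z$.
Similarly, a general charge density distribution $\rho(x,y,z,t)$ may be represented by

$$\rho(x,y,z,t) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(k_x,k_y,k_z,\omega)\,e^{j(\omega t + k_x x + k_y y + k_z z)}\,dk_x\,dk_y\,dk_z\,d\omega \tag{E.4}$$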

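The integrands of (E.3) and (E.4) are connected by the continuity equation, $\nabla\cdot\boldsymbol{\iota} = -\dot{\rho}$,
which gives

$$f = -\frac{1}{\omega}(\mathbf{k}\cdot\mathbf{g}) \tag{E.5}$$

wherein $\mathbf{k} = \mathbf{1}_x k_x + \mathbf{1}_y k_y + \mathbf{1}_z k_z$.
If the fictitious charge and current densities in the interval $(d\mathbf{k}, d\omega)$ are treated as an
independent entity which satisfies the flow equation $\boldsymbol{\iota} = \rho\mathbf{v}$, then the velocity of these
fictitious charges is

$$\mathbf{v}(\mathbf{k},\omega) = \frac{\mathbf{g}}{f} = -\frac{\omega\,\mathbf{g}}{\mathbf{k}\cdot\mathbf{g}} \tag{E.6}$$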
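A short numerical illustration of (E.5) and (E.6) may help. The sketch below assumes real-valued spectral amplitudes for simplicity (in general g and f are complex), and the function names and sample values of g, k, and ω are hypothetical choices made only to show that the resulting speed is not bounded by c.

```python
C = 299_792_458.0  # speed of light in m/s (SI value, used only for comparison)

def charge_amplitude(g, k, omega):
    """f from the continuity relation (E.5): f = -(k . g) / omega."""
    return -sum(ki * gi for ki, gi in zip(k, g)) / omega

def wave_velocity(g, k, omega):
    """v from (E.6): v = g / f = -omega * g / (k . g)."""
    f = charge_amplitude(g, k, omega)
    return tuple(gi / f for gi in g)

# Hypothetical spectral component: current amplitude along x, wave vector along x.
g = (1.0, 0.0, 0.0)
k = (1.0, 0.0, 0.0)        # rad/m
omega = 6.0e8              # rad/s
v = wave_velocity(g, k, omega)
speed = sum(vi * vi for vi in v) ** 0.5
print(v, speed > C)
```

For this choice the speed is ω/k = 6 × 10⁸ m/s, which exceeds c, in line with the remark that v ranges over all values.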

This velocity is independent of x, y, z, and t and is therefore a common velocity shared
by all the charges which give rise to the $(\mathbf{k},\omega)$ current and charge waves. In a coordinate
system traveling at the velocity v with respect to XYZ, these charges are at rest. As
k and ω are permitted to range over their complete spectra of values, (E.6) indicates
that all values of v will be encountered in the interval $0 \le v < \infty$. One may conclude
from this that arbitrary static charge distributions in all Lorentzian frames may be
combined to give the most general time-varying spatial distributions of current and
charge density in a particular Lorentzian frame.
Because the range of v is unrestricted, some of these fictitious charge distributions
are traveling through XYZ at speeds greater than light. This requires use of the
Lorentz transformation equations when v > c. Even though the transformation is then
nonphysical, this is mathematically admissible in the sense that all physical laws
formally transform properly under a Lorentz transformation regardless of the value of
v/c; it should also be recalled that the charge densities in the interval $(d\mathbf{k}, d\omega)$ are
fictitious. No intimation is intended that the real time-varying charges, which are the
sum of these fictitious static charge densities, are traveling at speeds in excess of c.
As a corollary of the above result, a steady current distribution $\boldsymbol{\iota}'(x',y',z')$ in X'Y'Z'
may be decomposed into static charge distributions in all other Lorentzian frames.
For this reason the most general sources $\boldsymbol{\iota}(x,y,z,t)$ and $\rho(x,y,z,t)$ in XYZ may also be
built up from static charge distributions $\rho'(x',y',z')$ plus static current distributions
$\boldsymbol{\iota}'(x',y',z')$ in all other Lorentzian coordinate systems X'Y'Z'.
APPENDIX F
GENERALIZATION OF THE FIELD TRANSFORMATION EQUATIONS

IN SECTION (5.2) it was established that, if the most general system of time-independent
sources existed in X'Y'Z', time-varying fields E(x,y,z,t) and B(x,y,z,t) would be experienced
in XYZ, with these fields contributing to the force on a moving charge q in
accordance with the Lorentz force law (5.8). The relations between the static fields
E', B' in X'Y'Z' and the time-varying fields E, B in XYZ were given by (5.5) and (5.9).
Imagine now a third Cartesian coordinate system X*Y*Z*, with its axes respectively
aligned to those of X'Y'Z', and with the X* axis sliding along the X' axis in the negative
direction at a constant speed u*. Through use of the velocity transformation equations
(2.50), X*Y*Z* is seen to be in motion with respect to XYZ at a velocity $\mathbf{1}_x u$ given by

$$\mathbf{1}_x u = \mathbf{1}_x\,\frac{u - u^*}{1 - uu^*/c^2} \tag{F.1}$$
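The composition rule (F.1) can be exercised with a few sample speeds. In this sketch the argument `u` is taken to be the speed of X'Y'Z' along X as seen from XYZ (an assumption carried over from Section 5.2, which lies outside this appendix), `u_star` is the speed at which X*Y*Z* slides in the negative X' direction, and the function name `frame_velocity` is hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s

def frame_velocity(u, u_star, c=C):
    """Velocity of X*Y*Z* along X relative to XYZ, from (F.1)."""
    return (u - u_star) / (1.0 - u * u_star / c**2)

# Equal and opposite slides cancel: X*Y*Z* is then at rest in XYZ.
print(frame_velocity(0.5 * C, 0.5 * C))
# A negative u_star adds to u, but the result stays below c.
print(frame_velocity(0.8 * C, -0.8 * C) / C)
```

The second case gives 1.6c/1.64, so the combined speed remains subluminal when both inputs are.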

In a parallel development to what was presented in Section 5.2, an observer O*, stationary
in X*Y*Z*, may define time-varying fields E*(x*,y*,z*,t*) and B*(x*,y*,z*,t*)
by the relations

$$E_y^* = \frac{E_y' + u^* B_z'}{[1 - (u^*)^2/c^2]^{1/2}} \qquad E_z^* = \frac{E_z' - u^* B_y'}{[1 - (u^*)^2/c^2]^{1/2}} \tag{F.2}$$

$$B_y^* = \frac{B_y' - \dfrac{u^*}{c^2}E_z'}{[1 - (u^*)^2/c^2]^{1/2}} \qquad B_z^* = \frac{B_z' + \dfrac{u^*}{c^2}E_y'}{[1 - (u^*)^2/c^2]^{1/2}} \tag{F.3}$$

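Elimination of the primed field components from among (5.5), (5.9), (F.2), and (F.3)
yields

$$E_y = \frac{E_y^* + u B_z^*}{(1 - u^2/c^2)^{1/2}} \qquad E_z = \frac{E_z^* - u B_y^*}{(1 - u^2/c^2)^{1/2}} \tag{F.4}$$

$$B_y = \frac{B_y^* - \dfrac{u}{c^2}E_z^*}{(1 - u^2/c^2)^{1/2}} \qquad B_z = \frac{B_z^* + \dfrac{u}{c^2}E_y^*}{(1 - u^2/c^2)^{1/2}} \tag{F.5}$$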

But these are the same transformation equations as (5.5) and (5.9) and thus the earlier
result has more general validity, linking fields as seen in two different Lorentzian frames
when the fields are time-varying in both frames. Of course, the sources which gave rise
to these fields are of a restricted class, being time-independent in X'Y'Z'. However,
time-independent sources in still another frame X''Y''Z'' would lead to the same result
(F.4) and (F.5). If time-independent sources in an infinite variety of Lorentzian frames
are superimposed, the most general time-varying sources can be created for observers
O and O*. By superposition, the total fields experienced by these two observers will
still satisfy (F.4) and (F.5). Thus these field transformation equations have a completely
general validity.
APPENDIX G
REDUCTION OF THE VECTOR GREEN'S FORMULA FOR E

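IN CHAPTER 5 the vector Green's theorem was used to establish the relation (5.39),
namely

$$\int_{V'} (\mathbf{E}\cdot\nabla_S\times\nabla_S\times\psi\mathbf{a} - \psi\mathbf{a}\cdot\nabla_S\times\nabla_S\times\mathbf{E})\,dV = \int_{S_1\ldots S_N,\Sigma} (\psi\mathbf{a}\times\nabla_S\times\mathbf{E} - \mathbf{E}\times\nabla_S\times\psi\mathbf{a})\cdot\mathbf{1}_n\,dS$$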

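This equation can be transformed in the following manner: Using the vector identity
(V.113) one may write

$$\nabla_S\times\nabla_S\times\psi\mathbf{a} = \nabla_S(\nabla_S\cdot\psi\mathbf{a}) - \nabla_S^2\,\psi\mathbf{a} \tag{G.1}$$

However,

$$\nabla_S\cdot\psi\mathbf{a} = \psi\,\nabla_S\cdot\mathbf{a} + \mathbf{a}\cdot\nabla_S\psi = \mathbf{a}\cdot\nabla_S\psi \tag{G.2}$$

since a is a constant vector. Also

$$\nabla_S^2\,\psi\mathbf{a} = \mathbf{a}\,\nabla_S^2\psi = -k^2\psi\mathbf{a} \tag{G.3}$$

because ψ satisfies the scalar wave equation $\nabla_S^2\psi + k^2\psi = 0$. Thus

$$\nabla_S\times\nabla_S\times\psi\mathbf{a} = \nabla_S(\mathbf{a}\cdot\nabla_S\psi) + k^2\psi\mathbf{a} \tag{G.4}$$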

Using both (G.4) and (5.3F», one obtains

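$$\mathbf{E}\cdot\nabla_S\times\nabla_S\times\psi\mathbf{a} - \psi\mathbf{a}\cdot\nabla_S\times\nabla_S\times\mathbf{E} = \mathbf{E}\cdot\nabla_S(\mathbf{a}\cdot\nabla_S\psi) + \mathbf{a}\cdot\left(j\omega\psi\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)$$

Further use of the identity (V.107) gives

$$\mathbf{E}\cdot\nabla_S(\mathbf{a}\cdot\nabla_S\psi) = \nabla_S\cdot[\mathbf{E}(\mathbf{a}\cdot\nabla_S\psi)] - (\mathbf{a}\cdot\nabla_S\psi)\,\nabla_S\cdot\mathbf{E}$$

$$= \nabla_S\cdot[\mathbf{E}(\mathbf{a}\cdot\nabla_S\psi)] - \frac{\rho}{\epsilon_0}\,\mathbf{a}\cdot\nabla_S\psi$$

so that the left side of (5.39) becomes

$$\int_{V'}\left\{\nabla_S\cdot[\mathbf{E}(\mathbf{a}\cdot\nabla_S\psi)] - \frac{\rho}{\epsilon_0}(\mathbf{a}\cdot\nabla_S\psi) + \mathbf{a}\cdot\left(j\omega\psi\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)\right\}dV$$

$$= \mathbf{a}\cdot\int_{V'}\left(j\omega\psi\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}} - \frac{\rho}{\epsilon_0}\nabla_S\psi\right)dV - \mathbf{a}\cdot\int_{S_1\ldots S_N,\Sigma} (\mathbf{1}_n\cdot\mathbf{E})\,\nabla_S\psi\,dS \tag{G.5}$$

in which the divergence theorem has been employed.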



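The constant vector a may also be taken out in front of the integral sign on the right
side of (5.39). Since, with the aid of (V.109) and the triple scalar product, one can write

$$[\mathbf{E}\times(\nabla_S\times\psi\mathbf{a})]\cdot\mathbf{1}_n = [\mathbf{E}\times(\nabla_S\psi\times\mathbf{a})]\cdot\mathbf{1}_n = [(\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi]\cdot\mathbf{a}$$

$$[\psi\mathbf{a}\times\nabla_S\times\mathbf{E}]\cdot\mathbf{1}_n = -j\omega\psi(\mathbf{a}\times\mathbf{B})\cdot\mathbf{1}_n = j\omega\psi\,\mathbf{a}\cdot(\mathbf{1}_n\times\mathbf{B})$$

it follows that

$$\int_{S_1\ldots S_N,\Sigma} (\psi\mathbf{a}\times\nabla_S\times\mathbf{E} - \mathbf{E}\times\nabla_S\times\psi\mathbf{a})\cdot\mathbf{1}_n\,dS = -\mathbf{a}\cdot\int_{S_1\ldots S_N,\Sigma} [j\omega\psi(\mathbf{1}_n\times\mathbf{B}) - (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi]\,dS \tag{G.6}$$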

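But (G.5) and (G.6) are equal to each other, being modified forms of the left and right
sides of (5.39). And since this is true for any arbitrary constant vector a, it follows that
the integrals themselves must be equal. Thus

$$\int_{V'}\left(j\omega\psi\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}} - \frac{\rho}{\epsilon_0}\nabla_S\psi\right)dV - \int_{S_1\ldots S_N} [(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})]\,dS$$

$$= \int_{\Sigma} [(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})]\,dS \tag{G.7}$$

where, for convenience, the surface integral over the sphere Σ has been split off.
Consider this integral. On the surface of the sphere Σ one has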

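$$(\nabla_S\psi)_\Sigma = \mathbf{1}_{\mathscr{r}}\left[\frac{\partial}{\partial\mathscr{r}}\left(\frac{e^{-jk\mathscr{r}}}{\mathscr{r}}\right)\right]_{\mathscr{r}=\delta} = -\mathbf{1}_n\left(jk + \frac{1}{\delta}\right)\frac{e^{-jk\delta}}{\delta} \tag{G.8}$$

If dΩ is the element of solid angle subtended at P by a surface element dS on Σ, then
the right side of (G.7) can be written

$$I = \int_0^{4\pi}\left[-(\mathbf{1}_n\cdot\mathbf{E})\mathbf{1}_n\left(jk + \frac{1}{\delta}\right)\frac{e^{-jk\delta}}{\delta} - (\mathbf{1}_n\times\mathbf{E})\times\mathbf{1}_n\left(jk + \frac{1}{\delta}\right)\frac{e^{-jk\delta}}{\delta} - j\omega(\mathbf{1}_n\times\mathbf{B})\frac{e^{-jk\delta}}{\delta}\right]\delta^2\,d\Omega$$

$$= -\delta e^{-jk\delta}\int_0^{4\pi} [jk(\mathbf{1}_n\cdot\mathbf{E})\mathbf{1}_n + jk(\mathbf{1}_n\times\mathbf{E})\times\mathbf{1}_n + j\omega(\mathbf{1}_n\times\mathbf{B})]\,d\Omega$$

$$\quad - e^{-jk\delta}\int_0^{4\pi} [(\mathbf{1}_n\cdot\mathbf{E})\mathbf{1}_n + (\mathbf{1}_n\times\mathbf{E})\times\mathbf{1}_n]\,d\Omega \tag{G.9}$$

Since both integrals in (G.9) are well-behaved at P, it follows that

$$\lim_{\delta\to 0} I = -\lim_{\delta\to 0} e^{-jk\delta}\int_0^{4\pi}\mathbf{E}\,d\Omega = -\mathbf{E}(P)\int_0^{4\pi} d\Omega = -4\pi\,\mathbf{E}(P) \tag{G.10}$$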

Next consider the limit, as δ → 0, of the left side of (G.7). Obviously the surface
integrals are well-behaved because P is restricted to be a point within V and thus not
on any of the bounding surfaces $S_i$. As V' → V, the volume integral is also well-behaved.
To see this, spherical coordinates may be introduced centered at P. Then
$dV = \mathscr{r}^2\sin\theta\,d\mathscr{r}\,d\theta\,d\phi$. Since ψ and $\nabla_S\psi$ contain terms involving $\mathscr{r}^{-1}$ and $\mathscr{r}^{-2}$ only, the
contribution of the volume element at $\mathscr{r} = 0$ to the volume integral in (G.7) is finite. Therefore,
the limiting value of (G.7) is

$$\mathbf{E}(x,y,z) = \frac{1}{4\pi}\int_V\left(\frac{\rho}{\epsilon_0}\nabla_S\psi - j\omega\psi\,\frac{\boldsymbol{\iota}}{\mu_0^{-1}}\right)dV$$

$$\quad + \frac{1}{4\pi}\int_{S_1\ldots S_N} [(\mathbf{1}_n\cdot\mathbf{E})\nabla_S\psi + (\mathbf{1}_n\times\mathbf{E})\times\nabla_S\psi - j\omega\psi(\mathbf{1}_n\times\mathbf{B})]\,dS \tag{G.11}$$

in which (x,y,z) are the coordinates of the point P, and it is to be remembered that a
time factor $e^{j\omega t}$ has been suppressed.
APPENDIX H
THE WAVE EQUATIONS FOR A AND Φ

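IN CHAPTER 5 the potential functions A and Φ were introduced by the defining relations

$$\mathbf{A}(x,y,z,t) = \int_V\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathscr{r})}}{4\pi\mu_0^{-1}\mathscr{r}}\,dV \tag{H.1}$$

$$\Phi(x,y,z,t) = \int_V\frac{\rho(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathscr{r})}}{4\pi\epsilon_0\mathscr{r}}\,dV \tag{H.2}$$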

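Upon taking the divergence of (H.1) one obtains

$$\nabla_F\cdot\mathbf{A} = \int_V\frac{e^{j\omega t}}{4\pi\mu_0^{-1}}\,\boldsymbol{\iota}\cdot\nabla_F\left(\frac{e^{-jk\mathscr{r}}}{\mathscr{r}}\right)dV \tag{H.3}$$

since $\boldsymbol{\iota}$ is not a function of (x,y,z). But

$$\boldsymbol{\iota}\cdot\nabla_F\left(\frac{e^{-jk\mathscr{r}}}{\mathscr{r}}\right) = -\boldsymbol{\iota}\cdot\nabla_S\left(\frac{e^{-jk\mathscr{r}}}{\mathscr{r}}\right) = \frac{e^{-jk\mathscr{r}}}{\mathscr{r}}\,\nabla_S\cdot\boldsymbol{\iota} - \nabla_S\cdot\left(\boldsymbol{\iota}\,\frac{e^{-jk\mathscr{r}}}{\mathscr{r}}\right)$$

in which use has been made of (V.107). If $\nabla_S\cdot\boldsymbol{\iota}$ is replaced by $-j\omega\rho$ in accordance with
(5.34), (H.3) may be written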

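$$\nabla_F\cdot\mathbf{A} = -j\omega\int_V\frac{\rho(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathscr{r})}}{4\pi\mu_0^{-1}\mathscr{r}}\,dV - \int_S\frac{\boldsymbol{\iota}(\xi,\eta,\zeta)\,e^{j(\omega t - k\mathscr{r})}}{4\pi\mu_0^{-1}\mathscr{r}}\cdot d\mathbf{S} \tag{H.4}$$

after the divergence theorem has been employed. Since S may be made large enough
to encompass all the sources without containing any of them in its surface, the second
integral in (H.4) vanishes and one is left with the conclusion that

$$\nabla\cdot\mathbf{A} = -\frac{j\omega}{c^2}\,\Phi = -\frac{1}{c^2}\,\dot{\Phi} \tag{H.5}$$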

Through application of the Fourier integral, if $\boldsymbol{\iota}$ and ρ are general functions of time,
one sees that

$$\mathbf{A}(x,y,z,t) = \int_V\frac{\boldsymbol{\iota}(\xi,\eta,\zeta,\,t - \mathscr{r}/c)}{4\pi\mu_0^{-1}\mathscr{r}}\,dV \tag{H.6}$$

$$\Phi(x,y,z,t) = \int_V\frac{\rho(\xi,\eta,\zeta,\,t - \mathscr{r}/c)}{4\pi\epsilon_0\mathscr{r}}\,dV \tag{H.7}$$

these integrals being natural extensions of (H.1) and (H.2). Because linear superposition
has been employed, it follows that A and Φ as given by (H.6) and (H.7) also satisfy
(H.5). Further, the fields E and B arising from the sources $\boldsymbol{\iota}(\xi,\eta,\zeta,t)$ and $\rho(\xi,\eta,\zeta,t)$
satisfy

$$\mathbf{E} = -\nabla\Phi - \dot{\mathbf{A}} \tag{H.8}$$
$$\mathbf{B} = \nabla\times\mathbf{A} \tag{H.9}$$

these equations being restatements of (5.67) and (5.68) but for the more general
potential functions (H.6) and (H.7).
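If one takes the divergence of (H.8) and the curl of (H.9), the result is

$$\nabla\cdot\mathbf{E} = -\nabla^2\Phi - \nabla\cdot\dot{\mathbf{A}} = \frac{\rho}{\epsilon_0}$$

$$\nabla\times\mathbf{B} = \nabla\times\nabla\times\mathbf{A} = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} = \frac{\boldsymbol{\iota}}{\mu_0^{-1}} + \frac{1}{c^2}\,\dot{\mathbf{E}}$$

which, with the aid of (H.5) and (H.8), become

$$\nabla^2\Phi - \frac{1}{c^2}\,\ddot{\Phi} = -\frac{\rho}{\epsilon_0} \tag{H.10}$$

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\,\ddot{\mathbf{A}} = -\frac{\boldsymbol{\iota}}{\mu_0^{-1}} \tag{H.11}$$

Thus both A and Φ satisfy the same type of differential equation, the solutions being
given formally by (H.6) and (H.7).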
APPENDIX I
VECTOR WAVE SOLUTIONS IN SPHERICAL COORDINATES

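CONSIDER the vector function

$$\mathbf{G} = \mathbf{r}\times\nabla\Psi = -\nabla\times(\mathbf{r}\Psi) \tag{I.1}$$

in which Ψ satisfies the scalar wave equation

$$\nabla^2\Psi + k^2\Psi = 0 \tag{I.2}$$

It is desired to show that G satisfies the vector wave equation

$$\nabla^2\mathbf{G} + k^2\mathbf{G} = 0 \tag{I.3}$$

the entire discussion being confined to spherical coordinates. Since

$$\nabla\Psi = \mathbf{1}_r\frac{\partial\Psi}{\partial r} + \mathbf{1}_\theta\frac{1}{r}\frac{\partial\Psi}{\partial\theta} + \mathbf{1}_\phi\frac{1}{r\sin\theta}\frac{\partial\Psi}{\partial\phi}$$

it follows that

$$\mathbf{G} = -\mathbf{1}_\theta\frac{1}{\sin\theta}\frac{\partial\Psi}{\partial\phi} + \mathbf{1}_\phi\frac{\partial\Psi}{\partial\theta} \tag{I.4}$$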

Using Equation (4Jj4), which gives the Laplacian of a vector in spherical coordinates,
one finds that

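$$\nabla^2\mathbf{G} = \mathbf{1}_\theta\left[\nabla^2\left(-\frac{1}{\sin\theta}\frac{\partial\Psi}{\partial\phi}\right) + \frac{1}{r^2\sin^3\theta}\frac{\partial\Psi}{\partial\phi} - \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial^2\Psi}{\partial\theta\,\partial\phi}\right]$$

$$\quad + \mathbf{1}_\phi\left[\nabla^2\left(\frac{\partial\Psi}{\partial\theta}\right) - \frac{2\cos\theta}{r^2\sin^3\theta}\frac{\partial^2\Psi}{\partial\phi^2} - \frac{1}{r^2\sin^2\theta}\frac{\partial\Psi}{\partial\theta}\right] \tag{I.5}$$

But

$$\nabla^2\left(-\frac{1}{\sin\theta}\frac{\partial\Psi}{\partial\phi}\right) = -\frac{1}{\sin\theta}\frac{\partial}{\partial\phi}(\nabla^2\Psi) + \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial^2\Psi}{\partial\theta\,\partial\phi} - \frac{1}{r^2\sin^3\theta}\frac{\partial\Psi}{\partial\phi}$$

and

$$\nabla^2\left(\frac{\partial\Psi}{\partial\theta}\right) = \frac{\partial}{\partial\theta}(\nabla^2\Psi) + \frac{1}{r^2\sin^2\theta}\frac{\partial\Psi}{\partial\theta} + \frac{2\cos\theta}{r^2\sin^3\theta}\frac{\partial^2\Psi}{\partial\phi^2}$$

and therefore (I.5) becomes

$$\nabla^2\mathbf{G} = -\mathbf{1}_\theta\frac{1}{\sin\theta}\frac{\partial}{\partial\phi}(\nabla^2\Psi) + \mathbf{1}_\phi\frac{\partial}{\partial\theta}(\nabla^2\Psi)$$

$$= \mathbf{1}_\theta\frac{1}{\sin\theta}\frac{\partial}{\partial\phi}(k^2\Psi) - \mathbf{1}_\phi\frac{\partial}{\partial\theta}(k^2\Psi) = -k^2\mathbf{G} \tag{I.6}$$

which was to be proved.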
APPENDIX J
GREEN'S FUNCTIONS FOR RECTANGULAR WAVEGUIDE

IN SECTION 5.15 it was shown that a useful formulation for TM modes in any cylindrical
waveguide resulted from the use of a Green's function $G_1$ which satisfied the
conditions:

1. $G_1 = g_1 - \dfrac{e^{-jk\mathscr{r}}}{\mathscr{r}}$
2. $G_1 = 0$ on S
3. $g_1$ is regular in V and on S and satisfies

for all (ξ,η,ζ) on S and any (x,y,z) in V.

4. $G_1$ satisfies
(J.1)

wherein $P_F$ and $P_S$ are the field point (x,y,z) and the source point (ξ,η,ζ) respectively,
δ is the Dirac delta function, V is the volume interior to the waveguide, and S is the interior
surface of the waveguide.
For the rectangular waveguide shown in the first figure of Example 5.9, assume

$$G_1 = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{2}{\sqrt{ab}}\sin\frac{m\pi\xi}{a}\sin\frac{n\pi\eta}{b}\,F_{mn}(\zeta,x,y,z) \tag{J.2}$$

This is a reasonable starting assumption, since each term in (J.2) separately satisfies
the boundary condition $G_1 = 0$ on S. All other terms in the complete Fourier series
in ξ and η do not satisfy this boundary condition.
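If (J.2) is substituted into the inhomogeneous wave equation (J.1), one obtains

$$-\sum\sum\left[\left(\frac{m\pi}{a}\right)^2 + \left(\frac{n\pi}{b}\right)^2\right]\frac{2}{\sqrt{ab}}\sin\frac{m\pi\xi}{a}\sin\frac{n\pi\eta}{b}\,F_{mn} + \sum\sum\frac{2}{\sqrt{ab}}\sin\frac{m\pi\xi}{a}\sin\frac{n\pi\eta}{b}\,\frac{\partial^2 F_{mn}}{\partial\zeta^2}$$

$$\quad + k^2\sum\sum\frac{2}{\sqrt{ab}}\sin\frac{m\pi\xi}{a}\sin\frac{n\pi\eta}{b}\,F_{mn} = 4\pi\,\delta(P_F - P_S) \tag{J.3}$$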
vab a b

Let

$$\mu_{mn}^2 = \left(\frac{m\pi}{a}\right)^2 + \left(\frac{n\pi}{b}\right)^2 \tag{J.4}$$

and let

$$\beta_{mn} = (k^2 - \mu_{mn}^2)^{1/2} \tag{J.5}$$

be the third or fourth quadrant root, in which k, the wave number, has a slight negative
imaginary component to account for losses which attenuate a wave as it travels along
the waveguide. The pertinent phasors are shown in Figure J.1.

FIGURE J.1 Propagation phasors.

Substitution of (J.5) in (J.3) gives

$$\sum\sum\left(\frac{\partial^2 F_{mn}}{\partial\zeta^2} + \beta_{mn}^2 F_{mn}\right)\frac{2}{\sqrt{ab}}\sin\frac{m\pi\xi}{a}\sin\frac{n\pi\eta}{b} = 4\pi\,\delta(P_F - P_S)$$

Multiplication by $(2/\sqrt{ab})\sin(r\pi\xi/a)\sin(s\pi\eta/b)$ and integration over (a,b) gives

$$\frac{\partial^2 F_{rs}}{\partial\zeta^2} + \beta_{rs}^2 F_{rs} = \frac{8\pi}{\sqrt{ab}}\sin\frac{r\pi x}{a}\sin\frac{s\pi y}{b}\,\delta(z - \zeta) \tag{J.6}$$

Thus

$$F_{rs} = b_{rs}(\zeta,z)\,\frac{2}{\sqrt{ab}}\sin\frac{r\pi x}{a}\sin\frac{s\pi y}{b} \tag{J.7}$$

and

$$G_1 = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{4}{ab}\sin\frac{m\pi\xi}{a}\sin\frac{n\pi\eta}{b}\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b}\,b_{mn}(\zeta,z) \tag{J.8}$$
m=ln=l ab a b a b

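with $b_{mn}(\zeta,z)$ satisfying

$$\frac{\partial^2 b_{mn}}{\partial\zeta^2} + \beta_{mn}^2\,b_{mn} = 4\pi\,\delta(z - \zeta) \tag{J.9}$$

Lagrange's method of variation of parameters¹ will be used to solve (J.9). Assume

$$b_{mn}(\zeta,z) = v_1 e^{-j\beta_{mn}\zeta} + v_2 e^{j\beta_{mn}\zeta} \tag{J.10}$$

Then

$$\frac{\partial b_{mn}}{\partial\zeta} = -j\beta_{mn}v_1 e^{-j\beta_{mn}\zeta} + j\beta_{mn}v_2 e^{j\beta_{mn}\zeta} + \frac{\partial v_1}{\partial\zeta}e^{-j\beta_{mn}\zeta} + \frac{\partial v_2}{\partial\zeta}e^{j\beta_{mn}\zeta} \tag{J.11}$$

Let

$$\frac{\partial v_1}{\partial\zeta}e^{-j\beta_{mn}\zeta} + \frac{\partial v_2}{\partial\zeta}e^{j\beta_{mn}\zeta} = 0 \tag{J.12}$$

and then

$$\frac{\partial^2 b_{mn}}{\partial\zeta^2} = -j\beta_{mn}\frac{\partial v_1}{\partial\zeta}e^{-j\beta_{mn}\zeta} + j\beta_{mn}\frac{\partial v_2}{\partial\zeta}e^{j\beta_{mn}\zeta} - \beta_{mn}^2 v_1 e^{-j\beta_{mn}\zeta} - \beta_{mn}^2 v_2 e^{j\beta_{mn}\zeta} \tag{J.13}$$

Substitution in (J.9) gives

$$-j\beta_{mn}\frac{\partial v_1}{\partial\zeta}e^{-j\beta_{mn}\zeta} + j\beta_{mn}\frac{\partial v_2}{\partial\zeta}e^{j\beta_{mn}\zeta} = 4\pi\,\delta(z - \zeta) \tag{J.14}$$

Equations (J.12) and (J.14) must be satisfied simultaneously by $v_1$ and $v_2$. Thus

$$\frac{\partial v_1}{\partial\zeta} = \frac{\begin{vmatrix} 0 & e^{j\beta_{mn}\zeta} \\ 4\pi\delta(z - \zeta) & j\beta_{mn}e^{j\beta_{mn}\zeta} \end{vmatrix}}{\begin{vmatrix} e^{-j\beta_{mn}\zeta} & e^{j\beta_{mn}\zeta} \\ -j\beta_{mn}e^{-j\beta_{mn}\zeta} & j\beta_{mn}e^{j\beta_{mn}\zeta} \end{vmatrix}} = -\frac{4\pi\,\delta(z - \zeta)\,e^{j\beta_{mn}\zeta}}{2j\beta_{mn}} \tag{J.15}$$

Similarly,

$$\frac{\partial v_2}{\partial\zeta} = \frac{4\pi\,\delta(z - \zeta)\,e^{-j\beta_{mn}\zeta}}{2j\beta_{mn}} \tag{J.16}$$

When one recalls that $\beta_{mn}$ is a third or fourth quadrant root, (J.10) requires that

$$v_1(\zeta,z) = 0 \quad \zeta < z \qquad\qquad v_1(\zeta,z) = C_1(z) \quad \zeta > z$$
$$v_2(\zeta,z) = 0 \quad \zeta > z \qquad\qquad v_2(\zeta,z) = C_2(z) \quad \zeta < z$$

For ζ > z,

$$v_1(\zeta,z) = -\frac{4\pi}{2j\beta_{mn}}\,e^{j\beta_{mn}z} \quad \zeta > z \qquad\qquad v_1(\zeta,z) = 0 \quad \zeta < z$$

¹ See, e.g., I. S. Sokolnikoff and R. M. Redheffer, Mathematics of Physics and Modern Engineering,
Section 28, McGraw-Hill Book Company, New York, 1958.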



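Similarly,

$$v_2(\zeta,z) = -\frac{4\pi}{2j\beta_{mn}}\,e^{-j\beta_{mn}z} \quad \zeta < z \qquad\qquad v_2(\zeta,z) = 0 \quad \zeta > z$$

Therefore, if one uses (J.10),

$$b_{mn}(\zeta,z) = -\frac{4\pi}{2j\beta_{mn}}\,e^{-j\beta_{mn}(\zeta - z)} \quad \zeta > z$$

$$b_{mn}(\zeta,z) = -\frac{4\pi}{2j\beta_{mn}}\,e^{-j\beta_{mn}(z - \zeta)} \quad z > \zeta$$

or,

$$b_{mn}(\zeta,z) = -\frac{4\pi}{2j\beta_{mn}}\,e^{-j\beta_{mn}|\zeta - z|} \tag{J.17}$$

for all ζ and z. When this result is substituted in (J.8), the Green's function $G_1$ is found
to be

(J.18)

in which

$$\psi_{mn}(x,y) = \frac{2}{\sqrt{ab}}\sin\frac{m\pi x}{a}\sin\frac{n\pi y}{b} \tag{J.19}$$

By a similar process, if one defines the quantity

$$\Psi_{mn}(x,y) = \frac{2}{\sqrt{ab}}\cos\frac{m\pi x}{a}\cos\frac{n\pi y}{b} \tag{J.20}$$

it may be shown that

LL
<Xl <Xl

G2 = 47r (J.21)
m=On=O

The proof of (J.21) is left as an exercise.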


APPENDIX K
THE AVERAGE ELECTROSTATIC FIELD INTENSITY INSIDE A
SPHERE CONTAINING AN ARBITRARY DIPOLE DISTRIBUTION

WITH REFERENCE to Figure K.1, let $q_1$ be one of the charges inside a spherical volume
of radius $r_0$, and for convenience choose the orientation of the zenith axis so that it
passes through the site of $q_1$. Then if $\bar{\mathbf{E}}_1$ is the average value throughout the spherical
volume V of the electrostatic field due to $q_1$, it is apparent from symmetry that $\bar{\mathbf{E}}_1$
has only a Z component. First one wishes to calculate

$$\bar{E}_{1z} = \frac{1}{V}\int_V E_{1z}\,dV \tag{K.1}$$

that is, the average field due to the single charge $q_1$.

FIGURE K.1 Average field inside a sphere.

With $r_1$ the distance from $q_1$ to the center of the sphere, let the volume V be divided
into two parts by the concentric spherical surface of radius $r_1$. Then if (x,y,z) is any
point within V, from the figure,

$$\mathscr{r}^2 = r^2 + r_1^2 - 2rr_1\cos\theta$$

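which leads to the expansions

$$\frac{1}{\mathscr{r}} = \frac{1}{r_1}\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) \qquad r < r_1$$

$$\frac{1}{\mathscr{r}} = \frac{1}{r}\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) \qquad r > r_1$$

(cf. Example 3.24). The potential due to the point charge $q_1$ can therefore be expressed
as

$$\Phi_1 = \frac{q_1}{4\pi\epsilon_0 r_1}\sum_{n=0}^{\infty}\left(\frac{r}{r_1}\right)^n P_n(\cos\theta) \qquad r < r_1$$

$$\Phi_1 = \frac{q_1}{4\pi\epsilon_0 r}\sum_{n=0}^{\infty}\left(\frac{r_1}{r}\right)^n P_n(\cos\theta) \qquad r > r_1$$

Then since

$$E_{1z} = -\cos\theta\,\frac{\partial\Phi_1}{\partial r} + \frac{\sin\theta}{r}\,\frac{\partial\Phi_1}{\partial\theta}$$

it follows, with the aid of (D.13) and (D.17), that

$$E_{1z} = -\frac{q_1}{4\pi\epsilon_0}\sum_{n=1}^{\infty} n\,\frac{r^{n-1}}{r_1^{n+1}}\,P_{n-1}(\cos\theta) \qquad r < r_1$$

$$E_{1z} = \frac{q_1}{4\pi\epsilon_0}\sum_{n=0}^{\infty} (n+1)\,\frac{r_1^n}{r^{n+2}}\,P_{n+1}(\cos\theta) \qquad r > r_1$$

The orthogonality relation (D.23) gives

$$\int_{-1}^{1} P_n(\cos\theta)P_0(\cos\theta)\,d(\cos\theta) = 0 \qquad n \neq 0$$

and thus, since

$$\bar{E}_{1z} = \frac{3}{4\pi r_0^3}\int_V E_{1z}\,r^2\sin\theta\,dr\,d\theta\,d\phi = \frac{3}{2r_0^3}\int_0^{r_0} r^2\,dr\int_{-1}^{1} E_{1z}P_0(\cos\theta)\,d(\cos\theta)$$

one may conclude that

$$\bar{E}_{1z} = -\frac{q_1 r_1}{4\pi\epsilon_0 r_0^3} \tag{K.2}$$

Had the charge $q_1$ been chosen to lie in any other direction than the zenith, the
answer would have been

$$\bar{\mathbf{E}}_1 = -\frac{q_1\mathbf{r}_1}{4\pi\epsilon_0 r_0^3} \tag{K.3}$$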

If additionally a charge $-q_1$ is at the position $\mathbf{r}_1'$, the average electric field due to both
charges is

$$\bar{\mathbf{E}} = -\frac{q_1(\mathbf{r}_1 - \mathbf{r}_1')}{4\pi\epsilon_0 r_0^3} = -\frac{q_1\mathbf{d}_1}{4\pi\epsilon_0 r_0^3} \tag{K.4}$$

in which $\mathbf{d}_1$ is the directed displacement from $-q_1$ to $+q_1$. But $q_1\mathbf{d}_1 = \mathbf{p}_1$ is the dipole
moment of the charge pair $(q_1, -q_1)$ and thus, if there are M dipoles contained within
the spherical volume V, they cause an average field throughout the volume given by

$$\bar{\mathbf{E}} = -\frac{1}{4\pi\epsilon_0 r_0^3}\sum_{n=1}^{M}\mathbf{p}_n \tag{K.5}$$
APPENDIX L

THE DYNAMIC MACROSCOPIC SCALAR POTENTIAL FUNCTION
DUE TO A VOLUME OF POLARIZED DIELECTRIC MATERIAL

CONSIDER a charge pair (q, −q) within a macroscopic element dV of a specimen of
dielectric material, and imagine that their relative separation is d cos ωt, with d the
maximum separation. Let the relative displacement be parallel to a direction characterized
by a unit vector $\mathbf{1}_z$, and let the instantaneous position of q be $z_1$, the instantaneous
position of −q be $z_2$. With reference to Figure L.1, if $q\,\delta(\zeta_1 - z_1)\,d\zeta_1$ and $-q\,\delta(\zeta_2 - z_2)\,d\zeta_2$
are taken to be the distributions of the two moving charges, with δ the Dirac delta
function, then the scalar potential at a distant point, due to this charge pair, is

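in which the braces indicate that time-retarded values are to be used. If neither charge
makes a great excursion about its central point, so that $r \gg |z_1|$, $r \gg |z_2|$, then

$$\Phi(r,\theta,t) = \frac{q}{4\pi\epsilon_0}\left[\int_{-\infty}^{\infty}\frac{\delta(\zeta_1 - \{z_1\})\,d\zeta_1}{r - \zeta_1\cos\theta} - \int_{-\infty}^{\infty}\frac{\delta(\zeta_2 - \{z_2\})\,d\zeta_2}{r - \zeta_2\cos\theta}\right]$$

$$= \frac{q}{4\pi\epsilon_0}\left(\frac{1}{r - \{z_1\}\cos\theta} - \frac{1}{r - \{z_2\}\cos\theta}\right) \approx \frac{q\cos\theta}{4\pi\epsilon_0 r^2}\,(\{z_1\} - \{z_2\})$$

Since $z_1 - z_2 = d\cos\omega t$, if d ≪ λ, as is usually the case, then

$$\{z_1\} - \{z_2\} = d\cos\omega\left(t - \frac{r}{c}\right)$$

If p is defined as having a magnitude qd and a direction parallel to $\mathbf{1}_z$, then

$$\Phi(r,\theta,t) = \frac{\mathbf{p}\cdot\mathbf{r}}{4\pi\epsilon_0 r^3}\cos\omega\left(t - \frac{r}{c}\right) \tag{L.1}$$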

with r drawn from the oscillating dipole to the distant point.


FIGURE L.1 Oscillating dipole.


This result is seen to be similar to the static case except that now the dipole moment
is oscillatory at angular frequency ω and retarded time must be used to deduce the
scalar potential.
Upon letting P dV represent the sum of the dipole moments due to all the charge
pairs within dV, one may write

$$\Phi(x,y,z,t) = \int_V\frac{\mathbf{P}(\xi,\eta,\zeta,\,t - \mathscr{r}/c)\cdot\boldsymbol{\mathscr{r}}}{4\pi\epsilon_0\mathscr{r}^3}\,dV \tag{L.2}$$

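in which $\boldsymbol{\mathscr{r}}$ is drawn from (ξ,η,ζ) to (x,y,z). Since $\nabla_S(1/\mathscr{r}) = \boldsymbol{\mathscr{r}}/\mathscr{r}^3$, use of the vector
identity

$$\nabla_S\cdot\left[\frac{\{\mathbf{P}\}}{\mathscr{r}}\right] = \frac{1}{\mathscr{r}}\,\nabla_S\cdot[\{\mathbf{P}\}] + [\{\mathbf{P}\}]\cdot\nabla_S\left(\frac{1}{\mathscr{r}}\right)$$

converts (L.2) to the form

$$\Phi(x,y,z,t) = \int_S\frac{[\{\mathbf{P}\}]\cdot d\mathbf{S}}{4\pi\epsilon_0\mathscr{r}} + \int_V\frac{(-\nabla_S\cdot[\{\mathbf{P}\}])}{4\pi\epsilon_0\mathscr{r}}\,dV \tag{L.3}$$

in which the divergence theorem has been used to obtain the first integral, and {P} is
the retarded value of P.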
the retarded value of P.
Equation (L.3) is seen to be similar to the static result (6.8) except that now P is
time-harmonic and retarded time must be used. This dynamic result is applicable at
interior points as well as exterior points. The proof follows the procedure used in Section
6.3 and requires that the radius of the sphere erected around an internal point be small
compared to a wavelength. This is normally a reasonable assumption.
Although the derivation just given is only applicable to electronic and ionic polariza-
tion, Equation (L.3) is also valid for orientational polarization. The proof of this asser-
tion is left as an exercise.
APPENDIX M
THE DAMPING CONSTANT OF A FREELY OSCILLATING DIPOLE

UNDER THE assumption that Equation (6.80) properly describes the motion of the
center of gravity of a freely oscillating electron cloud of total charge −e and mass m,
if the Z axis is aligned with the displacement, one may write

$$m\ddot{z}\dot{z} + az\dot{z} = -2b\dot{z}^2 \tag{M.1}$$

which yields

$$\frac{d}{dt}\left(\tfrac{1}{2}m\dot{z}^2 + \tfrac{1}{2}az^2\right) = -2b\dot{z}^2 \tag{M.2}$$

Therefore $P = 2b\dot{z}^2$ is the instantaneous time rate of energy loss by the atom through
radiation to its surroundings.
Solution of (6.80) gives

$$z(t) = z_0 e^{-(b/m)t}\cos\omega_0 t \tag{M.3}$$

wherein $z_0$ is the initial displacement of the electron cloud and

$$\omega_0 = \sqrt{\frac{a}{m} - \left(\frac{b}{m}\right)^2} \tag{M.4}$$

is the natural frequency of oscillation. From this it follows that

$$P = 2bz_0^2 e^{-(2b/m)t}\left(\frac{b}{m}\cos\omega_0 t + \omega_0\sin\omega_0 t\right)^2 \tag{M.5}$$

If $2b/m\omega_0 \ll 1$ (this will subsequently be shown to be the case), then the decay is very
small during one cycle, and the energy lost by the oscillating atom in one cycle at the
time t is

$$W = \frac{2bz_0^2}{\omega_0}\,e^{-(2b/m)t}\int_0^{2\pi}\left(\frac{b}{m}\cos\omega_0 t + \omega_0\sin\omega_0 t\right)^2 d(\omega_0 t)$$

$$= \frac{2\pi bz_0^2}{\omega_0}\left[\left(\frac{b}{m}\right)^2 + \omega_0^2\right]e^{-(2b/m)t}$$

$$\approx 2\pi b\,\omega_0 z_0^2\,e^{-(2b/m)t} \tag{M.6}$$

The radiation field associated with this emission of energy may be deduced from the
magnetic vector potential function

$$\mathbf{A} = \mathbf{1}_z\,\frac{-e\{\dot{z}\}}{4\pi\mu_0^{-1}r} \tag{M.7}$$

in which {z} is the retarded value of dz/ dt and Zo is assumed to be very small com pared
to the wavelength A = 27rcl woo Upon computing B in spherical coordinates from
B = V X A, one finds that only Bep contains a term with an 1'-1 dependency, and that
this term is given by

Rep = - ezo sin -1 e e-(b/m)(t-r/c) { [(-b


47r J..L 0 cr m
)2 - Wo2] cos Wo (1')
t - - + 2wo -b.
c m
SIn Wo ( t - -r)}
c
(lVI.8)
Through use of Poynting's theorem, the instantaneous density of power flow across
a spherical surface of large radius 1', centered at the dipole, is
CP = J.lo1cB~
so that the total power crossing the surface at time t is

d: = 6:~;lC e-(2b/m)(H!c) {[ (;y - w~] cos Wo (t - ~) + 2wo; sin Wo (t - ~) r


Through further use of the assumption that 2b/1nwo « 1, the energy which crosses
the surface in one cycle at the time t is therefore
2
W
e2z w 3
~ ~ e-(2b/m)(t-r!c) (lVI.9)
6J.lo c
But this should equal the energy which left the dipole during one cycle at the earlier
time t - ric. That energy can be found by using retarded time in (lV1.6). When this is
done and the two expressions are compared, one finds that
e2w~ e2w~
b= - - -3 (1\1.10)
127rJ..Lo 1 c 121l'" Eoc
To check the validity of an earlier assumption, this may be written in the form
$$\frac{2b}{m\omega_0} = \frac{e^2\omega_0}{6\pi\epsilon_0 m c^3} \approx 10^{-5}$$
in which the high value $\omega_0 = 10^{17}$ has been used. Thus the assumption that $2b/m\omega_0 \ll 1$
is entirely justified and (M.10) is a good approximation to the value of the damping
constant.
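
The order-of-magnitude check above can be repeated numerically. The following is a minimal sketch, assuming that $m$ is the electron mass and that rounded SI values of the constants are adequate; the function names are illustrative and do not come from the text. Under these assumptions the ratio comes out at a few times $10^{-7}$, comfortably below the order of magnitude quoted and far below unity.

```python
import math

# Physical constants in SI units (rounded; assumed values, not from the text).
E_CHARGE = 1.602176634e-19      # electron charge, C
EPSILON_0 = 8.8541878128e-12    # permittivity of free space, F/m
M_ELECTRON = 9.1093837015e-31   # electron mass, kg (assumed to be the m of the text)
C_LIGHT = 2.99792458e8          # speed of light, m/s


def damping_constant(omega_0):
    """Damping constant b from (M.10): b = e^2 w0^2 / (12 pi eps0 c^3)."""
    return E_CHARGE**2 * omega_0**2 / (12.0 * math.pi * EPSILON_0 * C_LIGHT**3)


def damping_ratio(omega_0, mass=M_ELECTRON):
    """Dimensionless ratio 2b / (m w0), which the derivation assumes is << 1."""
    return 2.0 * damping_constant(omega_0) / (mass * omega_0)


if __name__ == "__main__":
    omega_0 = 1.0e17  # rad/s, the "high value" used in the text
    print(f"b         = {damping_constant(omega_0):.3e} kg/s")
    print(f"2b/(m w0) = {damping_ratio(omega_0):.3e}")
```

Because the ratio is linear in $\omega_0$, any lower natural frequency satisfies the assumption even more strongly.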
APPENDIX N
THE AVERAGE MAGNETOSTATIC FIELD INTENSITY INSIDE A SPHERE
CONTAINING AN ARBITRARY DISTRIBUTION OF CURRENT LOOPS

CONSIDER a source point $(\xi,\eta,\zeta)$ defined by a position vector $\mathbf{r}'$, as shown in Figure
N.1, and a field point $(x,y,z)$ defined by a position vector $\mathbf{r}$. Let the angles at which $\mathbf{r}'$
and $\mathbf{r}$ point be $(\theta',\phi')$ and $(\theta,\phi)$ in conventional spherical coordinates and let $\gamma$ be the
angle between $\mathbf{r}'$ and $\mathbf{r}$. Then if $\ell = |\mathbf{r} - \mathbf{r}'|$, it follows from the law of cosines that

$$\frac{1}{\ell} = \left[r^2 + (r')^2 - 2rr'\cos\gamma\right]^{-1/2} \tag{N.1}$$

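As in Example (3.24), this result may be expressed in terms of one or the other of the
expansions

$$\frac{1}{\ell} = \frac{1}{r}\sum_{n=0}^{\infty}\left(\frac{r'}{r}\right)^n P_n(\cos\gamma) \qquad r > r' \tag{N.2}$$

$$\frac{1}{\ell} = \frac{1}{r'}\sum_{n=0}^{\infty}\left(\frac{r}{r'}\right)^n P_n(\cos\gamma) \qquad r < r' \tag{N.3}$$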

However, the addition theorem for spherical harmonics gives¹

$$P_n(\cos\gamma) = P_n(\cos\theta)P_n(\cos\theta') + 2\sum_{m=1}^{n}\frac{(n-m)!}{(n+m)!}P_n^m(\cos\theta)P_n^m(\cos\theta')\cos[m(\phi - \phi')] \tag{N.4}$$

so that both expansions may be written in terms of double spherical harmonics.
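
As a numerical illustration of (N.1) through (N.3), the sketch below sums the Legendre series and compares it with the closed-form inverse distance. It is a minimal check rather than part of the derivation; the function names and the truncation order `n_max` are assumptions introduced here, and the Legendre polynomials are generated with the standard three-term recurrence.

```python
import math


def legendre(n_max, x):
    """Return [P_0(x), ..., P_n_max(x)] using (n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}."""
    values = [1.0, x]
    for n in range(1, n_max):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: n_max + 1]


def inverse_distance_series(r, r_src, cos_gamma, n_max=60):
    """Partial sum of (N.2) when r > r_src, or of (N.3) when r < r_src."""
    big, small = max(r, r_src), min(r, r_src)
    p = legendre(n_max, cos_gamma)
    return sum((small / big) ** n * p[n] for n in range(n_max + 1)) / big


def inverse_distance_exact(r, r_src, cos_gamma):
    """Closed form (N.1) from the law of cosines."""
    return 1.0 / math.sqrt(r * r + r_src * r_src - 2.0 * r * r_src * cos_gamma)


if __name__ == "__main__":
    for r, r_src, cg in [(2.0, 1.0, 0.3), (0.5, 1.0, -0.7)]:
        print(inverse_distance_series(r, r_src, cg), inverse_distance_exact(r, r_src, cg))
```

The series converges geometrically in the ratio of the smaller to the larger radius, so convergence slows as the field point approaches the radius of the source point.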


These results may be applied to the case of a filamentary current loop of radius $a$,
situated centrally in the XY plane, as depicted in Figure N.2. For all the source points,
$\theta' = \pi/2$ and the magnetic vector potential function due to this loop may be found at
$\phi = 0$ with no loss in generality, since the answer is $\phi$-symmetric. One obtains

$$A_\phi(r,\theta) = \frac{Ia}{4\pi\mu_0^{-1}}\int_0^{2\pi}\frac{\cos\phi'\,d\phi'}{\ell} \tag{N.5}$$

which agrees with Example 4.6. Unlike that example, no approximations will be made
due to assumptions about the relative sizes of r and a, but instead the expansions of
¹ See, e.g., J. D. Jackson, Classical Electrodynamics, pp. 67-69, John Wiley and Sons, Inc., New York, 1962.
FIGURE N.1 Source and field point geometry.

FIGURE N.2 Circular current loop at origin.

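$\ell^{-1}$ will be employed to deduce exact expressions for $A_\phi$. For $r > a$,

$$A_\phi(r,\theta) = \frac{Ia}{4\pi\mu_0^{-1}}\,\frac{1}{r}\sum_{n=0}^{\infty}\left(\frac{a}{r}\right)^n\int_0^{2\pi}\cos\phi'\left[P_n(\cos\theta)P_n(0) + 2\sum_{m=1}^{n}\frac{(n-m)!}{(n+m)!}P_n^m(\cos\theta)P_n^m(0)\cos m\phi'\right]d\phi'$$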

This reduces to

(N.6)

(N.7)

To evaluate the radial component of B, one needs

from which it follows that

r > a

r < a

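The $\theta$ component of $\mathbf{B}$ is

$$B_\theta = \frac{\pi a^2 I}{2\pi\mu_0^{-1} r^3}\sum_{n=1}^{\infty}\left(\frac{a}{r}\right)^{n-1}\frac{P_n^1(0)}{n+1}\,P_n^1(\cos\theta) \qquad r > a$$

$$B_\theta = -\frac{\pi a^2 I}{2\pi\mu_0^{-1} r^3}\sum_{n=1}^{\infty}\left(\frac{r}{a}\right)^{n+2}\frac{P_n^1(0)}{n}\,P_n^1(\cos\theta) \qquad r < a$$

For $r \gg a$, the $n = 1$ term dominates and is seen to give the same field as was found in
Example 4.6.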
With these expressions for the magnetic field components, it is now possible to find
the average value of $\mathbf{B}$ throughout a spherical volume $V_\delta$. Referring to Figure N.3, let
the central point of $V_\delta$ lie in the XZ plane at the Cartesian position $(h,0,k)$, and let the
current loop (which is seen edge-on) lie entirely inside $V_\delta$. Then, since
$$\mathbf{B} = (B_r\sin\theta + B_\theta\cos\theta)(\mathbf{1}_x\cos\phi + \mathbf{1}_y\sin\phi) + (B_r\cos\theta - B_\theta\sin\theta)\mathbf{1}_z$$
one finds, for $r > a$,

$$\mathbf{B} = -\frac{\pi a^2 I}{2\pi\mu_0^{-1}}\sum_{n=1}^{\infty}\left(\frac{a}{r}\right)^{n-1}\frac{P_n^1(0)}{r^3}\left[(\mathbf{1}_x\cos\phi + \mathbf{1}_y\sin\phi)\left(\sin\theta\,P_n(\cos\theta) - \cos\theta\,\frac{P_n^1(\cos\theta)}{n+1}\right) + \mathbf{1}_z\left(\cos\theta\,P_n(\cos\theta) + \sin\theta\,\frac{P_n^1(\cos\theta)}{n+1}\right)\right] \tag{N.8}$$
FIGURE N.3 Current loop inside a spherical volume.

whereas, for r < a,

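$$\mathbf{B} = -\frac{\pi a^2 I}{2\pi\mu_0^{-1}}\sum_{n=1}^{\infty}\left(\frac{r}{a}\right)^{n+2}\frac{P_n^1(0)}{r^3}\left[(\mathbf{1}_x\cos\phi + \mathbf{1}_y\sin\phi)\left(\sin\theta\,P_n(\cos\theta) + \cos\theta\,\frac{P_n^1(\cos\theta)}{n}\right) + \mathbf{1}_z\left(\cos\theta\,P_n(\cos\theta) - \sin\theta\,\frac{P_n^1(\cos\theta)}{n}\right)\right] \tag{N.9}$$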

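Since
$$\mathbf{B}_{av} = \frac{1}{\frac{4}{3}\pi\delta^3}\int_0^{\pi}\int_0^{2\pi}\int_0^{a}\mathbf{B}\,r^2\sin\theta\,dr\,d\phi\,d\theta + \frac{1}{\frac{4}{3}\pi\delta^3}\int_0^{\pi}\int_0^{2\pi}\int_a^{r_1(\theta,\phi)}\mathbf{B}\,r^2\sin\theta\,dr\,d\phi\,d\theta \tag{N.10}$$

with $r_1(\theta,\phi)$ the distance from the center of the loop to a point on $S_\delta$, a study of (N.8)
and (N.9) reveals the following: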
1. The first integral of (N.10) does not contribute an X component nor a Y component
to $\mathbf{B}_{av}$ because of the $\phi$ periodicity of (N.8).
2. The first integral of (N.10) does not contribute a Z component to $\mathbf{B}_{av}$ except for
the term $n = 1$ because $\sin\theta = -P_1^1(\cos\theta)$ and $\cos\theta = P_1(\cos\theta)$ and the orthogonality
relation (D.30) eliminates all other terms.
3. The second integral of (N.10) does not contribute to $\mathbf{B}_{av}$ whatsoever. This is
because

$$r_1(\theta,\phi) = h\sin\theta\cos\phi + k\cos\theta + \left[(h\sin\theta\cos\phi + k\cos\theta)^2 + \delta^2 - h^2 - k^2\right]^{1/2}$$

is even in $\phi$. If the $r$ integration is performed first, the resulting integrand factor must
contain only even terms in $\phi$, each of which is representable by spherical harmonics
whose $\phi$ integrations are zero except for $m = 0$. Even for the case $m = 0$, only the Z
component need be considered, so the problem is reduced to an evaluation of integrals
of the type

$$P_n^1(0)\int_0^{\pi}P_l(\cos\theta)\left[\cos\theta\,P_n(\cos\theta) + \sin\theta\,\frac{P_n^1(\cos\theta)}{n+1}\right]\sin\theta\,d\theta$$

However, inspection of the expression for $r_1(\theta,\phi)$ reveals that the $m = 0$ component is
accompanied by an even function of $\theta$, so the index $l$ must be even. Since $P_n^1(0)$ is zero
unless $n$ is odd, the term in square brackets in the above integrand is an even function
of $\theta$. Therefore, the entire integrand is odd and the integral is zero for all allowed values
of $n$ and $l$.
Because of these simplifications, (N.10) reduces to

$$\mathbf{B}_{av} = \frac{3}{4\pi\delta^3}\,\frac{\mathbf{m}}{2\pi\mu_0^{-1}a^3}\int_{-1}^{1}\int_0^{2\pi}\int_0^{a}\left\{[P_1(u)]^2 + [P_1^1(u)]^2\right\}r^2\,dr\,d\phi\,du = \frac{\mathbf{m}}{2\pi\mu_0^{-1}\delta^3} \tag{N.11}$$

in which $\mathbf{m} = \mathbf{1}_z\pi a^2 I$ is the magnetic moment of the loop. This result is independent of
the position and orientation of the loop in $V_\delta$ and therefore, if a distribution of loops
exists in $V_\delta$, they contribute an average field in $V_\delta$ given by

$$\mathbf{B}_{av} = \frac{1}{2\pi\mu_0^{-1}\delta^3}\sum_{i=1}^{N}\mathbf{m}_i \tag{N.12}$$

In the special but important case that the distribution of loops is uniform in a region
containing $V_\delta$, and of volume density $\mathbf{M}$, those loops within $V_\delta$ contribute an average
field throughout $V_\delta$ of amount
$$\mathbf{B}_{av} = \frac{\mathbf{M}(4\pi\delta^3/3)}{2\pi\mu_0^{-1}\delta^3} = \frac{2}{3}\,\frac{\mathbf{M}}{\mu_0^{-1}} \tag{N.13}$$
This result includes the effects of those loops which are only partially in $V_\delta$. On the
average such loops are half within $V_\delta$ and half outside. Since only the integration over a
sphere of radius $a$ around a given loop contributes to $\mathbf{B}_{av}$, it follows that each partial
loop contributes to $\mathbf{B}_{av}$ according to that fraction of its "loop volume," $4\pi a^3/3$, which
is within $V_\delta$.
MATHEMATICAL SUPPLEMENT: PART I
TAYLOR'S SERIES

THE EXPANSION of functions into series representations is a commonly used and
effective analytical technique. In electromagnetic theory the function to be expanded
often depends on several variables, and thus it is desirable to develop such a technique
with adequate generality. Accordingly, this small supplement on series, after a brief
historical introduction, reviews several mean value theorems, derives Taylor's series for
functions of one variable, and then extends the result to cover multivariable functions.

S.1* HISTORICAL SURVEY

The series expansion


$$f(x + h) = f(x) + hf'(x) + \frac{h^2}{2!}f''(x) + \cdots$$
which bears his name was first enunciated by Brook Taylor (1685-1731) as early as
1712 in a letter to John Machin. Its first formal appearance was in his text Methodus
incrementorum directa et inversa which was published in London in the period 1715-1717.
This text also contains the easy consequence now known as Maclaurin's series, but
Taylor's proof of these expansions did not consider convergence and is worthless. The
importance of these expansions was not appreciated by analysts for over a half century
until Lagrange pointed out their applicability, and no rigorous proof of Taylor's
theorem was offered until Cauchy included a remainder term and tested for convergence
in 1821.
Colin Maclaurin (1698-1746), though an able mathematician, is improperly credited
with authorship of the expansion
$$f(x) = f(0) + xf'(0) + \frac{x^2}{2!}f''(0) + \cdots$$
which was contained in his Treatise of Fluxions published in Edinburgh in 1742. This
expansion is obviously a special case of Taylor's theorem, a point which was indicated
by Taylor 25 years earlier. Additionally, Maclaurin's expansion was apparently
discovered independently by James Stirling and is contained in his paper Methodus
differentialis sive Tractatus de summatione et interpolatione serierum infinitarum
published in London in 1730. The greater fame of Maclaurin and the wider circulation of
his Treatise are accountable for this miscredit.
* This section may be omitted without loss in continuity of the technical presentation.

S.2 MEAN VALUE THEOREMS

A discussion of Taylor's series builds on the base of several mean value theorems which
serve as lemmas. The first of these is the well-known

ROLLE'S THEOREM: Let $f(x)$ be a function of the real variable $x$ which possesses a
continuous first derivative over the interval $x_1 \le x \le x_2$. Let $a$ and $b$ be two points within the
interval† for which $f(a) = f(b) = 0$. Then at least one value of $x$ can be found between $a$ and
$b$, say $x = \xi$, for which $f'(\xi) = 0$.

Proof: The truth of this theorem is almost self-evident from a geometric display of the
function such as shown in Figure S.1. If the function is to be zero at $a$ and at $b$, it cannot
be ever-increasing, nor can it be ever-decreasing in the interval between $a$ and $b$. Where
the function changes over from increasing to decreasing, the first derivative must vanish.

FIGURE S.1 Rolle's theorem.

Rolle's theorem can be employed to establish the

THEOREM OF MEAN VALUE: Let $f(x)$ and $g(x)$ be two functions of the real variable $x$ which
possess continuous first derivatives throughout the interval $x_1 \le x \le x_2$. Let $a$ and $b$ be any
two points within this interval such that $g(a) \ne g(b)$. If $g'(x)$ is nowhere zero in the interval,
then for some value $x = \xi$ between $a$ and $b$,
$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)} \tag{S.1}$$
Proof: Define a function $h(x)$ by the relation

$$h(x) = \frac{f(b) - f(a)}{g(b) - g(a)}\,[g(x) - g(a)] - [f(x) - f(a)]$$


t In this and all subsequent theorems of this supplement, b can be either larger or smaller than a.
It can be observed that $h(x)$ is a function which satisfies all the requirements of Rolle's
theorem. It has a continuous first derivative in the interval and $h(a) = h(b) = 0$. Since

$$h'(x) = \frac{f(b) - f(a)}{g(b) - g(a)}\,g'(x) - f'(x)$$

it follows that for some $x = \xi$,
$$h'(\xi) = 0 = \frac{f(b) - f(a)}{g(b) - g(a)}\,g'(\xi) - f'(\xi)$$

which, upon rearrangement, yields the stated result.


A special case of this theorem of some importance occurs when $g(x) = x$. Then
Equation (S.1) reduces to
$$\frac{f(b) - f(a)}{b - a} = f'(\xi) \tag{S.2}$$
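
The existence statement (S.2) can be illustrated by locating a suitable $\xi$ numerically. The sketch below is a minimal example under stated assumptions: the helper name `mean_value_point`, the sampling density, and the tolerance are introduced here for illustration, and the derivative is supplied analytically by the caller. It scans the interval for a sign change of $f'(x)$ minus the chord slope and then bisects.

```python
import math


def mean_value_point(f, dfdx, a, b, samples=1000, tol=1e-12):
    """Find one xi between a and b with dfdx(xi) = (f(b) - f(a)) / (b - a)."""
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        # g vanishes exactly where the tangent is parallel to the chord.
        return dfdx(x) - slope

    xs = [a + (b - a) * k / samples for k in range(samples + 1)]
    for left, right in zip(xs, xs[1:]):
        if g(left) == 0.0:
            return left
        if g(left) * g(right) < 0.0:
            # Bisection on the bracketing subinterval.
            while abs(right - left) > tol:
                mid = 0.5 * (left + right)
                if g(left) * g(mid) <= 0.0:
                    right = mid
                else:
                    left = mid
            return 0.5 * (left + right)
    raise ValueError("no sign change found; try more samples")


if __name__ == "__main__":
    # For f(x) = x^3 on [0, 2] the chord slope is 4, so 3 xi^2 = 4.
    xi = mean_value_point(lambda x: x**3, lambda x: 3.0 * x**2, 0.0, 2.0)
    print(xi, 2.0 / math.sqrt(3.0))
```

The theorem guarantees only that at least one such point exists; the sketch returns the first one it brackets, and a tangency of $f'$ with the chord slope that produces no sign change would be missed by this simple scan.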

A significant generalization of the above theorem is embodied in the

EXTENDED THEOREM OF MEAN VALUE: Let $f(x)$ be any function of the real variable $x$
which, together with its first $n$ derivatives, is continuous in the interval $x_1 \le x \le x_2$. Let $a$
and $b$ be any two points within this interval. Then
$$f(b) = f(a) + \frac{b-a}{1!}f'(a) + \frac{(b-a)^2}{2!}f''(a) + \cdots + \frac{(b-a)^{n-1}}{(n-1)!}f^{(n-1)}(a) + \frac{(b-a)^n}{n!}f^{(n)}(\xi_n) \tag{S.3}$$

in which $\xi_n$ is some point between $a$ and $b$.

Proof: If one makes use of Equation (S.2), there is a point $\xi_0$ between $a$ and $b$ for which

$$f(b) - f(a) - \frac{(b-a)}{1!}f'(\xi_0) = 0$$

Define a constant $K_2$ by the equation

$$f(b) - f(a) - \frac{(b-a)}{1!}f'(a) - \frac{(b-a)^2}{2!}K_2 = 0 \tag{S.4}$$

and from this form the function
$$h(x) = f(x) - f(a) - \frac{(x-a)}{1!}f'(a) - \frac{(x-a)^2}{2!}K_2$$

The function $h(x)$ has a continuous first derivative in the interval, given by

$$h'(x) = f'(x) - f'(a) - (x-a)K_2$$

and since $h(a) = h(b) = 0$, Rolle's theorem applies. Thus there is a point $x = \xi_1$ between
$a$ and $b$ such that $h'(\xi_1) = 0$.
Furthermore, $h'(x)$ has a continuous first derivative in the interval, namely

$$h''(x) = f''(x) - K_2$$


and since $h'(a) = h'(\xi_1) = 0$, there must be a point $x = \xi_2$ between $a$ and $\xi_1$ for which

$$h''(\xi_2) = f''(\xi_2) - K_2 = 0$$

If this result is substituted in (S.4), one obtains

$$f(b) = f(a) + \frac{(b-a)}{1!}f'(a) + \frac{(b-a)^2}{2!}f''(\xi_2) \tag{S.5}$$

A constant $K_3$ can next be defined by the relation

$$f(b) - f(a) - \frac{(b-a)}{1!}f'(a) - \frac{(b-a)^2}{2!}f''(a) - \frac{(b-a)^3}{3!}K_3 = 0 \tag{S.6}$$

from which it follows by the above procedure that $K_3 = f'''(\xi_3)$, where $\xi_3$ lies between
$a$ and $b$. Continuing this process out to the $n$th derivative yields the result (S.3). The
last term of this series, namely

$$\frac{(b-a)^n}{n!}f^{(n)}(\xi_n)$$

is known as the remainder after $n$ terms. For the important case in which $f(x)$ is a
function with continuous derivatives of all orders, (S.3) becomes an infinite series as
$n \to \infty$. If the remainder goes to zero in this process, the series converges to the value
$f(b)$ and one may write

$$f(b) = \sum_{m=0}^{\infty}\frac{(b-a)^m}{m!}f^{(m)}(a) \tag{S.7}$$

EXAMPLE S.1
If $f(x) = \sin x$, the remainder does go to zero and the expansion (S.7) is applicable. If one
lets $a = \pi/4$ and $b = \pi/6$, it follows that $f(a) = 1/\sqrt{2}$ and $f(b) = 1/2$. (S.7) gives

$$f(b) = \frac{1}{\sqrt{2}}\left[1 - \frac{\pi}{12} - \frac{1}{2}\left(\frac{\pi}{12}\right)^2 + \frac{1}{6}\left(\frac{\pi}{12}\right)^3 + \cdots\right]$$
Use of only the first four terms of this series gives the approximation

$$f(b) \approx 0.4999$$
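
The arithmetic of Example S.1 can be reproduced with a few lines of code. This is a minimal sketch; the function name `taylor_sin` and its argument names are introduced here for illustration. It sums the first `terms` terms of (S.7) for $f(x) = \sin x$, using the fact that the derivatives of the sine cycle through $\sin$, $\cos$, $-\sin$, $-\cos$.

```python
import math


def taylor_sin(a, b, terms):
    """Partial sum of (S.7) for f(x) = sin x, expanded about a and evaluated at b."""
    derivatives = [
        math.sin,
        math.cos,
        lambda x: -math.sin(x),
        lambda x: -math.cos(x),
    ]
    h = b - a
    return sum(
        h**m / math.factorial(m) * derivatives[m % 4](a) for m in range(terms)
    )


if __name__ == "__main__":
    a, b = math.pi / 4.0, math.pi / 6.0
    for terms in (2, 4, 6, 8):
        print(terms, taylor_sin(a, b, terms))
    print("exact", math.sin(b))
```

With four terms the partial sum agrees with the value 0.4999 quoted in the example to four decimal places, and additional terms approach the exact value of one half.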

S.3 TAYLOR'S SERIES FOR ONE VARIABLE

If $f(x)$ and all its derivatives are continuous in the interval $x_1 \le x \le x_2$, and if $a$ and $x$
are any two points within this interval, it follows from (S.3) that

(S.8)

in which ~n is some point between a and x. If


for all $x$ within $(x_1, x_2)$, then

$$f(x) = \sum_{m=0}^{\infty}\frac{(x-a)^m}{m!}f^{(m)}(a) \tag{S.9}$$

is a convergent series representation for $f(x)$, valid within the entire interval. (S.9) is
known as the Taylor's series expansion of $f(x)$ about the point $a$.
The special case of this result in which $a = 0$ is known as Maclaurin's series, and can
be written
$$f(x) = \sum_{m=0}^{\infty}\frac{x^m}{m!}f^{(m)}(0) \tag{S.10}$$

Another useful form of Taylor's series results when $f(x + \Delta x)$ is expanded in a series
about the point $x$. A straight substitution in (S.9) gives

$$f(x + \Delta x) = \sum_{m=0}^{\infty}\frac{(\Delta x)^m}{m!}f^{(m)}(x) \tag{S.11}$$

Both $x$ and $x + \Delta x$ must be within the interval $(x_1, x_2)$.


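EXAMPLE S.2
Consider the function $f(x + \Delta x) = (x + \Delta x)^n$ in which $n$ is an integer. Then $f(x) = x^n$ and

$$f^{(m)}(x) = \frac{n!}{(n-m)!}\,x^{n-m} \qquad m \le n$$
$$f^{(m)}(x) = 0 \qquad m > n$$

Substitution in (S.11) gives

$$(x + \Delta x)^n = \sum_{m=0}^{n}\frac{n!}{m!\,(n-m)!}\,x^{n-m}(\Delta x)^m = x^n + nx^{n-1}(\Delta x) + \frac{n(n-1)}{2!}x^{n-2}(\Delta x)^2 + \cdots + nx(\Delta x)^{n-1} + (\Delta x)^n \tag{S.12}$$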

which can be recognized as the binomial expansion.
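
Example S.2 lends itself to a direct check. The following minimal sketch, with function names chosen here purely for illustration, evaluates the finite sum (S.12) from the derivatives of $x^n$ and compares it with the power computed directly; it assumes a non-negative integer $n$.

```python
import math


def power_derivative(n, m, x):
    """m-th derivative of x**n for a non-negative integer n."""
    if m > n:
        return 0.0
    return math.factorial(n) / math.factorial(n - m) * x ** (n - m)


def taylor_power(n, x, dx):
    """Sum (S.11) for f(x) = x**n; the series terminates after m = n."""
    return sum(
        dx**m / math.factorial(m) * power_derivative(n, m, x) for m in range(n + 1)
    )


if __name__ == "__main__":
    n, x, dx = 5, 2.0, 0.5
    print(taylor_power(n, x, dx), (x + dx) ** n)
    # Each coefficient n!/(m!(n-m)!) is the binomial coefficient.
    print([math.comb(n, m) for m in range(n + 1)])
```

Because every derivative beyond the $n$th vanishes, the Taylor series here is exact rather than approximate.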

S.4 TAYLOR'S SERIES FOR SEVERAL VARIABLES

The results of the previous section may be extended to functions of more than one
variable with little difficulty. Let $f(x,y)$ be any function which, together with all its
partial derivatives, is continuous in the interval $x_1 \le x \le x_2$, $y_1 \le y \le y_2$. If $(a,b)$ and
$(x,y)$ are any two points within this interval, then by Equation (S.9),

$$f(x,y) = \sum_{m=0}^{\infty}\frac{(x-a)^m}{m!}\,\frac{\partial^m f(a,y)}{\partial x^m} \tag{S.13}$$

But the functions of $y$ appearing on the right side of (S.13) also can be expanded in a
Taylor's series, namely,

(S.14)

so that (S.15)

All the series in (S.13), (S.14), and (S.15) must converge for all points in the interval
in order for this to be a valid procedure. When they do, (S.15) is known as a Taylor's
series expansion of $f(x,y)$ about the point $(a,b)$.
A useful alternative form of (S.15) arises when $f(x + \Delta x, y + \Delta y)$ is expanded in a
Taylor's series about $(x,y)$. Direct substitution in (S.15) gives

$$f(x + \Delta x, y + \Delta y) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\frac{(\Delta x)^m}{m!}\,\frac{(\Delta y)^n}{n!}\,\frac{\partial^{m+n}f(x,y)}{\partial x^m\,\partial y^n} \tag{S.16}$$
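
The double series (S.16) can be exercised on a function whose mixed partial derivatives are known in closed form. The sketch below is an illustration only: the test function $f(x,y) = \sin x\,e^{y}$, the truncation `order`, and the function names are assumptions made here and do not come from the text. For this choice, $\partial^{m+n}f/\partial x^m\,\partial y^n = \sin(x + m\pi/2)\,e^{y}$.

```python
import math


def f(x, y):
    """Illustrative test function, chosen because its derivatives are simple."""
    return math.sin(x) * math.exp(y)


def mixed_partial(m, n, x, y):
    """d^(m+n) f / dx^m dy^n for f = sin(x) exp(y); differentiation in y leaves exp(y) unchanged."""
    return math.sin(x + m * math.pi / 2.0) * math.exp(y)


def taylor_2d(x, y, dx, dy, order):
    """Truncation of (S.16), keeping m and n from 0 through `order`."""
    total = 0.0
    for m in range(order + 1):
        for n in range(order + 1):
            total += (
                dx**m / math.factorial(m)
                * dy**n / math.factorial(n)
                * mixed_partial(m, n, x, y)
            )
    return total


if __name__ == "__main__":
    x, y, dx, dy = 0.3, -0.1, 0.2, 0.4
    for order in (1, 2, 4, 8):
        print(order, taylor_2d(x, y, dx, dy, order), f(x + dx, y + dy))
```

Keeping only the terms with $m + n \le 1$ gives the linear approximation that Example S.3 applies to the triode.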

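Next, let $f(x,y,z)$ be any function which, together with all its partial derivatives, is
continuous in the interval $x_1 \le x \le x_2$, $y_1 \le y \le y_2$, $z_1 \le z \le z_2$. If $(a,b,c)$ and $(x,y,z)$
are any two points within this interval, then by (S.15),

$$f(x,y,z) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\frac{(x-a)^m}{m!}\,\frac{(y-b)^n}{n!}\,\frac{\partial^{m+n}f(a,b,z)}{\partial x^m\,\partial y^n} \tag{S.17}$$
whereas from (S.9),
$$\frac{\partial^{m+n}f(a,b,z)}{\partial x^m\,\partial y^n} = \sum_{p=0}^{\infty}\frac{(z-c)^p}{p!}\,\frac{\partial^{m+n+p}f(a,b,c)}{\partial x^m\,\partial y^n\,\partial z^p} \tag{S.18}$$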

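Combination of these results gives

$$f(x,y,z) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\sum_{p=0}^{\infty}\frac{(x-a)^m}{m!}\,\frac{(y-b)^n}{n!}\,\frac{(z-c)^p}{p!}\,\frac{\partial^{m+n+p}f(a,b,c)}{\partial x^m\,\partial y^n\,\partial z^p} \tag{S.19}$$

When it is assumed that the necessary convergence conditions are met, (S.19) is known
as the Taylor's series expansion of $f(x,y,z)$ about the point $(a,b,c)$.
In an alternative form,

$$f(x + \Delta x, y + \Delta y, z + \Delta z) = \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\sum_{p=0}^{\infty}\frac{(\Delta x)^m}{m!}\,\frac{(\Delta y)^n}{n!}\,\frac{(\Delta z)^p}{p!}\,\frac{\partial^{m+n+p}f(x,y,z)}{\partial x^m\,\partial y^n\,\partial z^p} \tag{S.20}$$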
The extension of these results to functions of four or more variables follows the same
procedure and can be predicted by inspection.
EXAMPLE S.3
In a vacuum triode, the plate current $i_b$ is a function of both the plate voltage $e_b$ and the
grid voltage $e_c$. In many applications the triode has a plate current which consists of a
time-independent, or d.c. component, and a time-varying component. The plate current
can then be expressed in the form
$$i_b = I_b + i_p$$
in which $I_b$ is the quiescent value and $i_p$ is the superimposed time-varying part. These two
component currents flow in response to the voltages $e_b = E_b + e_p$ and $e_c = E_c + e_g$, with
$(E_b, E_c)$ the quiescent portions and $(e_p, e_g)$ the time-varying portions. When Equation (S.16)
is applied to this situation, one obtains

If the triode is biased to operate in the linear portion of its characteristic, then all higher
order derivatives vanish and this expansion simplifies to

(S.21)

If one defines the plate conductance $g_p$ and transconductance $g_m$ by the relations

the time-varying part of (S.21) can be written


(S.22)
Equation (S.22) is the basis for a variety of equivalent circuits for the triode which are
distinguished by assumptions concerning the waveforms of the signal voltages and the
lumped elements placed in the grid and plate circuits.
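
The linearization described in Example S.3 can be sketched numerically. Everything specific in the block below is an assumption made for illustration: the three-halves-power plate characteristic, the values of `perveance` and `mu`, the operating point, and the usual definitions $g_p = \partial i_b/\partial e_b$ and $g_m = \partial i_b/\partial e_c$ evaluated at the quiescent point, with the first-order result taken as $i_p \approx g_p e_p + g_m e_g$. None of these numbers comes from the text.

```python
def plate_current(e_b, e_c, perveance=1.0e-3, mu=20.0):
    """Hypothetical three-halves-power triode law (an assumed model, not from the text)."""
    drive = e_c + e_b / mu
    return perveance * drive**1.5 if drive > 0.0 else 0.0


def small_signal_parameters(E_b, E_c, step=1.0e-4):
    """Central-difference estimates of g_p = di_b/de_b and g_m = di_b/de_c at (E_b, E_c)."""
    g_p = (plate_current(E_b + step, E_c) - plate_current(E_b - step, E_c)) / (2.0 * step)
    g_m = (plate_current(E_b, E_c + step) - plate_current(E_b, E_c - step)) / (2.0 * step)
    return g_p, g_m


if __name__ == "__main__":
    E_b, E_c = 250.0, -5.0   # assumed quiescent plate and grid voltages, V
    e_p, e_g = 2.0, 0.1      # assumed small time-varying parts, V
    g_p, g_m = small_signal_parameters(E_b, E_c)
    linear = g_p * e_p + g_m * e_g
    exact = plate_current(E_b + e_p, E_c + e_g) - plate_current(E_b, E_c)
    print(f"g_p = {g_p:.4e} S, g_m = {g_m:.4e} S")
    print(f"first-order i_p = {linear:.4e} A, exact change = {exact:.4e} A")
```

For this assumed characteristic the first-order estimate is close to, but not identical with, the exact change in plate current, because the model is not strictly linear; the difference is the contribution of the higher-order terms of (S.16) that the example sets to zero.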

REFERENCES

1. Cajori, F., A History of Mathematics, 2d ed., pp. 226-229, The Macmillan Company, New
York, 1919.
2. Love, C. E., and E. D. Rainville, Differential and Integral Calculus, 6th ed., pp. 439-447,
The Macmillan Company, New York, 1962.
3. Smith, D. E., History of Mathematics, vol. 1, pp. 449-454, Ginn and Company, New York,
1923.
MATHEMATICAL SUPPLEMENT: PART II
VECTORS

VECTOR analysis is a major part of the mathematical language of electromagnetic
theory and occupies an equally important position in many fields of science. Because
of the varied backgrounds in vector analysis possessed by those embarking on a study
of electromagnetics, this supplement is intended to meet the needs of several groups of
people. For those well-versed in the subject, an orientation in the notation will be the
principal purpose. For those whose experience in vector analysis has been largely
confined to Cartesian coordinate systems, the sections on generalized coordinates may prove
helpful, and perhaps some of the less commonly encountered integral theorems will be
worthy of attention. For those who are new to the subject, the supplement is designed
to cover all those vector topics needed in the exposition in the main part of this book.
A selection of problems is provided at the end as an aid to comprehension and
manipulative skill.
Following an historical review, the supplement begins with a discussion of vector
algebra which includes developments of the dot and cross products and their applica-
tions to physical problems. Vector differentiation is introduced, after which generalized
coordinate systems are discussed. Gradient, divergence, and curl are then defined,
physically interpreted, and their general forms developed. Various integral theorems
are treated, notably the divergence theorem, Stokes' theorem, and Green's theorems.
The supplement contains a list of useful vector identities and concludes with a summary
of important vector relations in the principal coordinate systems.

V.1* HISTORICAL SURVEY

Historically, the origins of the concept of a vector as a quantity possessing direction
as well as magnitude can be traced to early attempts to display the number system
pictorially. The notion of opposite directions was embodied in the representation of
positive and negative quantities as distances laid off to the right and left of a reference
point on a straight line. However, it is probably more meaningful to date the origin
of vector analysis from the work of John Wallis (1673) who used two successive directed
orthogonal displacements in a plane to represent the complex roots of quadratic
equations.
Wallis selected an origin on a horizontal axis and then laid off a distance to the right
or left of this origin algebraically proportional to the real part of the root. From the
* This section may be omitted without loss in continuity of the technical presentation.

point thus determined on the axis of reals, he then erected a perpendicular line, of
length proportional to the imaginary part of the root, and in one direction or the other,
depending on the sign of the imaginary part. In this way, Wallis achieved a one-to-one
correspondence between the points in a plane and the set of complex numbers: but
surprisingly, it did not occur to him to take the logical next step of introducing a verti-
cal axis of imaginaries.
It remained for Caspar Wessel, a Norwegian surveyor, to take this step over a century
later. In the modestly titled article On the Analytic Representation of Direction;
an Attempt, published in the Proceedings of the Royal Society of Denmark in 1799,
Wessel introduced an axis of imaginaries, constructed a directed line segment connect-
ing the points (0,0) and (a,b), and then associated the terminal point of this directed
line segment with the complex number a + jb. After defining addition and subtraction
such that (a,b) ± (c,d) == (a ± c, b ± d), Wessel proceeded to define the product of
two complex numbers in terms of an operation on the two line segments. He decided
that the product should be a new line segment whose length is the product of the
lengths of the two original line segments, with the new line segment making an angle
with the axis of reals which is the sum of the angles which the two original line segments
make with the axis of reals. This construction is consistent with the expansion

$$(a + jb)(c + jd) = ac - bd + j(ad + bc)$$

Thus Wessel introduced all the features of what is commonly called an Argand diagram.
Unfortunately he published his findings in a journal not widely read by scholars, and
most of the fame for this discovery went to J. R. Argand, who independently reached
similar conclusions in 1806.
Wessel sought without success to extend his rotational multiplication method to
three dimensions, and the French mathematician Servois made a similar unrewarding
attempt 15 years later. This extension was finally accomplished by Sir William Rowan
Hamilton (1805-1865) after much fruitless toil, through his willingness to break with
tradition and discard the commutative law of multiplication.
Hamilton was a widely gifted man, accomplished in the classics and languages at
the age of thirteen, famous at twenty-seven for his mathematical prediction of conical
refraction, and celebrated at thirty for his fundamental work in dynamics. From this
point in his career, Hamilton proceeded to devote all his energy and talent for the
remainder of his life to the subject of quaternions, whose invention he announced before
a meeting of the Royal Irish Academy in 1843.
A quaternion q may be written in the form

in which the $a_i$ are ordinary numbers and the $\mathbf{1}_i$ are fundamental units possessing
direction as well as magnitude. A quaternion is thus, in essence, the sum of a scalar and a
vector, the latter consisting of three independent unidimensional components. Hamilton
imposed the conditions that the fundamental unit vectors be subject to noncommutative
rules of multiplication:
$$\mathbf{1}_2\mathbf{1}_3 = -\mathbf{1}_3\mathbf{1}_2 = \mathbf{1}_1$$
$$\mathbf{1}_3\mathbf{1}_1 = -\mathbf{1}_1\mathbf{1}_3 = \mathbf{1}_2$$
$$\mathbf{1}_1\mathbf{1}_2 = -\mathbf{1}_2\mathbf{1}_1 = \mathbf{1}_3$$

with the additional requirement that

$$\mathbf{1}_1^2 = \mathbf{1}_2^2 = \mathbf{1}_3^2 = -1$$

As a consequence of the distributive law, he was able to show that the product of two
quaternions is a quaternion. Two special cases of this general result are noteworthy:
First, the product of a quaternion with a vector can be made to result in any arbitrary
vector, by suitable choice of the coefficients in the quaternion. Thus the quaternion
proved to be an operator capable of rotating a vector to any new arbitrary direction
and altering its length by any prescribed factor. This was the goal Hamilton had been
seeking originally in his effort to extend Wessel's construction to three dimensions.
Second, the product of two vectors is a quaternion, the scalar part of which is the
negative of what is now called the dot product, and the vector part of which is now
called the cross product. This result is rich in physical applicability, a point which was
well made by Hamilton in his textbook Elements of Quaternions, published in 1866 the
year after his death. In this text Hamilton also presented another of his inventions,
the del operator, and exhibited the concepts which are now known as gradient, diver-
gence, and curl. He considered this work to be his crowning achievement and did secure
one lifelong champion in P. G. Tait (1831-1901). Despite this, quaternion theory
never gained wide popularity in the scientific community. It was unnecessarily encum-
bered in that vectors were only a part of quaternions, and the results of most operations
gave mixed physical interpretations. What has endured from all of Hamilton's tre-
mendous labors in this field was his demonstration of a self-consistent algebra which
did not require the commutative law of multiplication to hold, and his introduction
of the del operator.
A year after Hamilton's first announcement of the quaternion theory, Grassman
(1809-1877) published in Germany a remarkable treatise, Die Lineale Ausdehnungslehre,
ein neuer Zweig der Mathematik, concerned with algebra in n-dimensional space.
Most of the ingredients of what are now called vector and tensor analysis were embodied
in Grassman's work as special cases. To appreciate the breadth of Grassman's view,
one must realize that, except for the contemporary work of Cayley, no one else at that
time was thinking beyond Euclid's three dimensions. Even Hamilton's quaternions,
while being a four-tuple, were restricted to three dimensions in their vector character.
Like Hamilton, Grassman was a widely gifted man. Accomplished in philosophy,
harmony, physics, and the Sanskrit classics, he was a pious husband, who supported a
wife and nine children from his meagre earnings as an elementary school teacher.
Grassman invested what time he could find, stretching over three decades of his career,
in the algebra which he proudly referred to as a new branch of mathematics. Its
development was a rich source of satisfaction to him, ranking perhaps only behind theology
as a central force in his life.
Briefly stated, Grassman's algebra is concerned with the concept of hyper-numbers,
of which an example is the polynomial

in which the $a_m$ coefficients are ordinary numbers and the quantities $\mathbf{1}_m$ are primary
units upon which Grassman imposed a variety of conditions. The sum of two such
hyper-numbers is given by

and multiplication and division of hyper-numbers by ordinary numbers follow the
conventional rules of algebra.
The product of two hyper-numbers can be written as

Grassman imposed the conditions

l~ = Imln = 0
to obtain what he called the inner product, and the conditions

l~ == 0
to obtain the outer product, thus rejecting, as did Hamilton, the inviolability of the
commutative law of multiplication.
In three dimensions, these results are recognized as the dot and cross product, and
Grassman explained their significance in great detail. He also considered higher prod-
ucts, including the triple scalar product associated with volume in three dimensions.
Another type of product, which Grassman called "open," or "indeterminate," led to
what is now called a matrix, and Grassman clearly anticipated the later work of Cayley
in this field. Quaternions can be included in this generalized vector algebra as a very
special case, and the theories of determinants and tensors are also embodied in the
general development.
Unlike Hamilton, Grassman received no honors in his lifetime. His philosophical
interests led him to endow his theory with the greatest possible generality, and the
first edition of Ausdehnungslehre (1844) was heavily burdened with philosophical
abstractions. The fact that it was all but ignored by mathematicians spurred Grassman
to greater efforts to gain recognition for his theory. Eighteen years later, he published
a second edition of Ausdehnungslehre, which was extensively revised and greatly
expanded but which was hardly less incomprehensible. The combination of a generalized
theory which broke with tradition to plow new and difficult ground, plus his own
obfuscations, doomed the second edition to the same fate as the first. Grassman
abandoned mathematics, 50 years ahead of his time, with the tribute that was his due
reserved for the twentieth century to bestow posthumously.
Probably the figure who had the most influence on the shaping of vector analysis
as it is now used was Josiah Willard Gibbs (1839-1903). Renowned for his work in
statistical mechanics and America's outstanding mathematical physicist of the nine-
teenth century, Gibbs was perhaps better qualified than either Hamilton or Grassman
to sense the mathematical needs of scientists. He blended the more useful features of
both men's work into a treatise entitled Elements of Vector Analysis, printed in pam-
phlet form (1881-1884) for the private use of his students. In the preface, Gibbs
acknowledged a similarity between his development and quaternions, but indicated
a stronger relation to the work of Grassman of which he was intimately aware. In
effect, Gibbs employed the three-dimensional form of Grassman's general algebra,
taking the fundamental units to be mutually orthogonal in space, and stripping
Grassman's development of the confusion inherent in its generality. As such, he was dealing
with vectors unshackled from their wedding to scalars, an intrinsic feature of the
quaternion theory. However, Gibbs did retain the various del operations introduced
by Hamilton and clearly illuminated their physical significance.
Although privately published, Gibbs's pamphlet became widely known and
precipitated a prolonged controversy over the relative merits of quaternions and vector
analysis. He received strong support in this controversy from Oliver Heaviside
(1850-1925) who published in 1893 a text entitled Electromagnetic Theory. A long chapter of
this book was devoted to the development of vector algebra and analysis with numerous
practical applications. His point of view was harmonious with that of Gibbs; they
principally differed in notation. The end result of this controversy over quaternions
versus vectors was predictable on practical grounds, and with the death of Tait in
1901, quaternions quietly moved into history.
In 1901, E. B. Wilson published an excellent and exhaustive text entitled Vector
Analysis based on the lectures of Gibbs, and this book was instrumental in establishing
its wide use as a mathematical tool. The first significant presentation of the vector
method to appear on the Continent was contained in Föppl's Geometrie der Wirbelfelder
(1897) which received extensive distribution in a revised version written by M. Abraham
and published in 1904. The impact of these two texts, and others which followed,
has placed vector analysis in its secure and rightful position as a standard part of the
mathematical education of students of science.

V.2 SCALARS AND VECTORS

Many physical quantities permit a mathematical description, and for some a
magnitude will suffice. Thus the temperature of a chemical solution, the size of an audience,
the entropy of a gas, the yield of a cornfield, can all be described by a real number. These
are known as scalar quantities.†

FIGURE V.1 Representation of a vector.

However, the statement that a shell is traveling at 2,000 ft/sec is incomplete. No
one will deny that in this instance the direction might prove to be a highly valuable
piece of information. Similarly, all displacements, velocities, accelerations, and forces
are completely defined only when both their magnitudes and directions are specified.
These are called vector quantities and the branch of mathematics which is concerned
with them is known as vector analysis.
† The discussion will be enlarged to include complex scalars and complex vectors in Section V.23.

The simplest example of a vector quantity is a displacement. It may be represented
pictorially by a directed line segment, as shown in Figure V.1. The length OA represents
the magnitude of the displacement; the orientation of the line OA and the arrow serve
to indicate the direction of the displacement. If the actual place of occurrence of the
displacement is significant, O can represent the initial point and A the terminal point.
As indicated by the use of the symbol a in Figure V.1, a vector will be denoted in this
text by boldface type and its magnitude by the same symbol in italic type.

V.3 THE ADDITION LAW FOR VECTORS


Addition of vectors is permissible if (1) they represent quantities of the same dimensions,
and if (2) they are either free or fixed and acting at the same point.†

FIGURE V.2 The sum of two vectors.

FIGURE V.3 The parallelogram law of vector addition.

The law of vector addition follows naturally from the concept of successive directed
displacements. For free vectors, if the tail of vector b is placed in coincidence with the
head of vector a, and a vector c is drawn from the tail of a to the head of b, then c is
said to be the sum of a and b and the addition operation is written

$$\mathbf{c} = \mathbf{a} + \mathbf{b}$$

This construction is shown in Figure V.2, from which it is apparent that an equally
valid method for determining the sum of two vectors would be through the use of a
parallelogram with sides a and b, as shown in Figure V.3. c can then be identified as a
† The problem being considered determines whether a vector is fixed or free. For example, the weights
of soldiers standing on a bridge may be represented by vectors directed downward at the appropriate
spots. To obtain the total live load on the bridge's supports, these vectors may be moved parallel to
themselves, until they are at a common point, and then summed. However, if the partial load at each
support is to be computed, the positions of the vectors are also important. They are the same vectors
in both problems, but in the first case they are free, whereas in the second case they are fixed.

diagonal of the parallelogram, with the tails of a, b, and c coincident. Since this alternative
method is also applicable to fixed vectors, it is the one which shall be adopted.
It is customarily referred to as the parallelogram law of vector addition.

FIGURE V.4 Vector subtraction.

Since the result of two successive directed displacements is independent of their
order, it follows that the commutative law holds for vector addition, namely that
a + b = b + a. This is consistent with the introduction of the parallelogram law,
and the fact that opposite sides of a parallelogram are equal and parallel.
If c = a + b = 2a = 2b, in which by the symbol 2a is meant a vector in the same
direction as a and twice as long, then it is evident from either Figure V.2 or Figure V.3

FIGURE V.5 The addition of three vectors.

that a = b. Thus two vectors are equal if they have the same magnitude and a common
direction.
If c = a + b = 0, c is said to be a null vector, and it is equally evident from the
two figures that a and b have the same magnitude but are oppositely directed. The
null equation can be rewritten in the form a = -b, providing the interpretation that a
vector and its negative have the same magnitude, but point in opposite directions.

This permits extension of the concept of vector addition to the case of subtraction of one
vector from another. By writing a - b = a + (-b) and using the construction indicated
in Figure V.4, the difference is obtained easily.
To add three free vectors, let them be connected as shown in Figure V.5. From
inspection, it follows that

$$\mathbf{a} + \mathbf{b} + \mathbf{c} = \mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c} = (\mathbf{b} + \mathbf{c}) + \mathbf{a} = \mathbf{b} + \mathbf{c} + \mathbf{a}$$

Thus in the addition of three free vectors, the commutative and associative laws hold,
and the order in which the three vectors are added is immaterial. These ideas are readily
extended to fixed vectors, and by induction, to the addition (or subtraction) of four or
more vectors.
EXAMPLE V.1
In the absence of wind, an airplane flying due west is able to average 200 mph. If a wind is
blowing from the northwest at 40 mph, at what bearing must the pilot set the course if the
plane is actually to be traveling west? What will be its ground speed?

To solve this problem, the navigator can draw a vector a 40 units long (by some convenient
scale) from the northwest, thus representing the displacement caused by the wind.
As shown in the figure, he can then account for his air speed by drawing a vector b 200 units
long, starting from the head of a, and such that the head of b is on the same horizontal line
as the tail of a. The resultant c can then be drawn from the tail of a to the head of b so
that c = a + b. This ensures that the resultant ground velocity is due west, as required.
By graphical measurements from the figure, or by trigonometry, the navigator finds that
the bearing $\theta$ is 8.1 deg north of west and that the ground speed is 170 mph.
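
The trigonometric route mentioned above can be sketched numerically. The following is only an illustration under stated assumptions: the function name `westward_course` is hypothetical, the X axis is taken to point east and the Y axis north, and the wind from the northwest is taken to blow at exactly 45 deg toward the southeast.

```python
import math

def westward_course(air_speed, wind_speed):
    """Return (bearing in degrees north of west, ground speed) for a plane that
    must track due west while a wind blows from the northwest (hypothetical helper)."""
    # Wind velocity components: it pushes the plane east and south.
    wind_east = wind_speed * math.cos(math.radians(45.0))
    wind_south = wind_speed * math.sin(math.radians(45.0))
    # The northward part of the air velocity must cancel the southward wind.
    theta = math.asin(wind_south / air_speed)
    # The westward air velocity is reduced by the eastward part of the wind.
    ground_speed = air_speed * math.cos(theta) - wind_east
    return math.degrees(theta), ground_speed

bearing, speed = westward_course(200.0, 40.0)
print(round(bearing, 1), round(speed))  # 8.1 170
```

The two returned values agree with the graphical result quoted in the example.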
EXAMPLE V.2
Many theorems of plane geometry can be proved with considerable economy by the use of
vector algebra. As an example, one can show that for any quadrilateral, if successive midpoints
of the sides are joined, the figure thus formed is a parallelogram.

To show this, imagine that the sides are free vectors, such that a is the directed line
segment drawn from $P_1$ to $P_2$, b is the directed line segment drawn from $P_2$ to $P_3$, etc.

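Since the quadrilateral is a closed figure, it follows that a + b + c + d = 0. But

$$\mathbf{e} = \tfrac{1}{2}\mathbf{d} + \tfrac{1}{2}\mathbf{a} \qquad \mathbf{f} = \tfrac{1}{2}\mathbf{b} + \tfrac{1}{2}\mathbf{c}$$

and therefore

$$\mathbf{e} + \mathbf{f} = 0 \qquad \mathbf{e} = -\mathbf{f}$$

Thus e and f are parallel and of equal length, and therefore the inscribed figure is a
parallelogram.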
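
As a numerical check of this argument, one can pick an arbitrary quadrilateral and form the two opposite sides of the inscribed figure. This is a sketch only: the helper names (`midpoint`, `sub`, `inscribed_sides`) and the sample vertices are assumptions introduced for illustration, with e taken to join the midpoints of d and a, and f the midpoints of b and c.

```python
def midpoint(p, q):
    """Midpoint of the segment joining points p and q."""
    return tuple((pi + qi) / 2 for pi, qi in zip(p, q))

def sub(p, q):
    """Vector drawn from point q to point p."""
    return tuple(pi - qi for pi, qi in zip(p, q))

def inscribed_sides(p1, p2, p3, p4):
    """Return the opposite sides e and f of the figure joining successive midpoints."""
    m_a = midpoint(p1, p2)  # midpoint of side a (P1 to P2)
    m_b = midpoint(p2, p3)  # midpoint of side b (P2 to P3)
    m_c = midpoint(p3, p4)  # midpoint of side c (P3 to P4)
    m_d = midpoint(p4, p1)  # midpoint of side d (P4 to P1)
    e = sub(m_a, m_d)       # equals d/2 + a/2
    f = sub(m_c, m_b)       # equals b/2 + c/2
    return e, f

e, f = inscribed_sides((0, 0), (4, 1), (5, 6), (-1, 3))
print(e, f)  # e and f are equal and opposite
```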

V.4 THE MULTIPLICATION OF VECTORS BY SCALARS


The product of any vector $\mathbf{a}$ with any real number $\alpha$ will be written as $\alpha\mathbf{a}$ and defined
to be a vector whose magnitude is $|\alpha|a$. Its direction is the same as that of $\mathbf{a}$ if $\alpha$ is positive,
and opposite to that of $\mathbf{a}$ if $\alpha$ is negative. This definition obviously includes division
of a vector by a scalar, since $\alpha$ can be written as the reciprocal of another scalar.
Up to this point, the discussion has been confined to real vectors, but the concept
of a vector can be extended usefully to the complex domain. This can be done by taking
the product of a real vector $\mathbf{a}$ with any complex number $\gamma = \alpha + j\beta$. The result can
be written $\gamma\mathbf{a}$, a complex vector whose real and imaginary parts, $\alpha\mathbf{a}$ and $\beta\mathbf{a}$, are
real vectors which obey all the rules for vectors so far enunciated. Thus the product
of a real vector and a real number becomes a special case of the product of a real vector
and a complex number. Complex vectors will be treated more fully in Section V.23.
All the rules of scalar algebra apply to the multiplication of a real vector and a real
scalar. As examples,

$$\alpha(\beta\mathbf{a}) = (\alpha\beta)\mathbf{a} = \beta(\alpha\mathbf{a})$$
$$(\alpha + \beta)\mathbf{a} = \alpha\mathbf{a} + \beta\mathbf{a}$$
$$\alpha(\mathbf{a} + \mathbf{b}) = \alpha\mathbf{a} + \alpha\mathbf{b}$$
It is now possible to explain why velocity, acceleration, and force are vectors which
obey the parallelogram law of addition. This law, as has been seen, is a logical statement
of the sum of directed displacements. But velocity can be defined as the limiting
ratio of displacement to time interval. Thus if

$$\mathbf{v}_1 = \lim_{\Delta t \to 0} \frac{\Delta\boldsymbol{\ell}_1}{\Delta t} \qquad \mathbf{v}_2 = \lim_{\Delta t \to 0} \frac{\Delta\boldsymbol{\ell}_2}{\Delta t}$$

then

$$\mathbf{v}_1 + \mathbf{v}_2 = \lim_{\Delta t \to 0} \frac{\Delta\boldsymbol{\ell}_1 + \Delta\boldsymbol{\ell}_2}{\Delta t}$$

and the sum of two velocities is found by summing two displacements (which sum
obeys the parallelogram law of addition), dividing by the scalar factor $\Delta t$, and taking
the limit. Similarly, summing two accelerations amounts to summing two velocity
increments (which sum has just been shown to obey the parallelogram law), dividing
by the factor $\Delta t$, and taking the limit. Therefore accelerations are vectors which obey
the parallelogram law of addition. In like manner one can show that forces, which
involve the time rate of change of momentum, also add according to the parallelogram
law, since momentum is the vector velocity multiplied by the scalar mass.

V.5 RESOLUTION INTO COMPONENTS


The parallelogram law of addition provides a useful way to decompose vectors. Suppose
a real vector $\mathbf{a}$ is acting at a point P. Choose three different lines $PP_1$, $PP_2$, and
$PP_3$ through P, not all coplanar. Find a vector $\mathbf{a}_1$ along $PP_1$ such that $\mathbf{a} - \mathbf{a}_1$ lies in
the plane containing $PP_2$ and $PP_3$. Then find vectors $\mathbf{a}_2$ and $\mathbf{a}_3$ along $PP_2$ and $PP_3$
respectively, such that $\mathbf{a}_2 + \mathbf{a}_3 = \mathbf{a} - \mathbf{a}_1$. This ensures that

$$\mathbf{a} = \mathbf{a}_1 + \mathbf{a}_2 + \mathbf{a}_3$$

and $\mathbf{a}$ is said to have been resolved into components along the lines $PP_1$, $PP_2$, and $PP_3$.
For the important case in which $PP_1$, $PP_2$, and $PP_3$ are mutually perpendicular,
the components are uniquely determined and form the sides of a rectangular parallelopiped,
as shown in Figure V.6. The magnitudes are then simply related by the expression
$a^2 = a_1^2 + a_2^2 + a_3^2$. To find $a_1$, $a_2$, or $a_3$, one need only drop a perpendicular from
the head of $\mathbf{a}$ to $PP_1$, $PP_2$, or $PP_3$ and measure the projection.

FIGURE V.6 Resolution of a vector into components.

It will often be convenient to describe the position of the point P in Cartesian coordinates.
The lines $PP_1$, $PP_2$, and $PP_3$ are then taken parallel to the X, Y, and Z axes
and the process is known as resolving $\mathbf{a}$ into its Cartesian components. For this case,
the notation can be improved by introducing unit vectors† $\mathbf{1}_x$, $\mathbf{1}_y$, and $\mathbf{1}_z$. These vectors
have a dimensionless magnitude of unity and are oriented parallel to the coordinate
axes in the direction of increasing x, y, and z. The quantity $\mathbf{1}_x a_x$ is then understood to
be a vector of magnitude $|a_x|$ which points along the X axis. It is positively or negatively
directed according to whether $a_x$ is positive or negative. With this convention it is
possible to find three real numbers $a_x$, $a_y$, and $a_z$ such that

$$\mathbf{a} = \mathbf{1}_x a_x + \mathbf{1}_y a_y + \mathbf{1}_z a_z$$

and

$$a^2 = a_x^2 + a_y^2 + a_z^2 \tag{V.1}$$

Addition or subtraction of two vectors then becomes a purely algebraic matter through
the relation

$$\mathbf{a} \pm \mathbf{b} = \mathbf{1}_x(a_x \pm b_x) + \mathbf{1}_y(a_y \pm b_y) + \mathbf{1}_z(a_z \pm b_z) \tag{V.2}$$
† Some authors use the symbols i, j, and k to represent these unit vectors. However, in texts on electro-
magnetic theory, this can cause confusion since i may stand for current density, j for the imaginary
unit, and k for the wave number.

EXAMPLE V.3
A river which is 2 mi wide flows due south at a speed which is to be determined. Let identical
rowers in identical rowboats start out from a common point by the west bank. As suggested
in the figure, rower A proceeds to the east bank and then returns to the starting point, with
his total route lying in an east-west line. Rower B goes straight south for 2 mi and then
turns around and rows back to the starting point. The difference in elapsed times is 0.357
hours. If each man rows at a steady rate of 2 mph relative to the water, what is the speed of
the river current?

This problem derives some importance from the fact that it is analogous to the Michelson-
Morley experiment (see Chapter 2). To solve it, let the X axis point east and the Y axis point
north and denote the river current by $-\mathbf{1}_y v_r$. Then for rower A on his first leg, the total velocity
relative to the ground is

$$\mathbf{1}_x v_{a1} = \mathbf{1}_x\, 2\cos\theta + \mathbf{1}_y (2\sin\theta - v_r)$$

in which $\theta$ is the angle north of east at which he must point his rowboat in order to proceed
due east. Thus

$$v_{a1} = 2\cos\theta \qquad v_r = 2\sin\theta$$

Similarly, on the return leg, rower A establishes a total velocity relative to the ground which
is given by

$$-\mathbf{1}_x v_{a2} = -\mathbf{1}_x\, 2\cos\theta' + \mathbf{1}_y (2\sin\theta' - v_r)$$

in which $\theta'$ is the angle north of west at which he must point his rowboat in order to proceed
due west. Thus

$$v_{a2} = 2\cos\theta' \qquad v_r = 2\sin\theta'$$

so that

$$\theta' = \theta \qquad v_{a2} = v_{a1}$$

Rower A spends the same time on the first leg that he does on the second, and his total
elapsed time is

$$t_1 = \frac{4}{v_{a1}} = \frac{2}{[1 - (v_r/2)^2]^{1/2}}$$

As for rower B, his downstream velocity is $-\mathbf{1}_y(2 + v_r)$ and his upstream velocity is
$\mathbf{1}_y(2 - v_r)$. Therefore, his total elapsed time is

$$t_2 = \frac{2}{2 + v_r} + \frac{2}{2 - v_r} = \frac{8}{4 - v_r^2} = \frac{2}{1 - (v_r/2)^2}$$

If the two times are compared, it is apparent that rower B takes longer. Since the difference
in elapsed times is known, one can write

$$t_2 - t_1 = 0.357 = \frac{2}{[1 - (v_r/2)^2]^{1/2}} \left\{ \frac{1}{[1 - (v_r/2)^2]^{1/2}} - 1 \right\}$$

Solving for the river current gives $v_r$ = 1.0 mph.
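
The last step can be carried out numerically. The sketch below is one possible approach, not the method used in the text: it evaluates the two elapsed times directly and locates the current by bisection. The function names `time_difference` and `solve_current` are hypothetical, and the bracketing interval is an assumption chosen to stay below the rowing speed of 2 mph.

```python
import math

def time_difference(v_r, width=2.0, row_speed=2.0):
    """Elapsed-time difference t2 - t1 (hours) for river current v_r (mph)."""
    # Rower A: ground speed across the river is sqrt(row_speed^2 - v_r^2), two legs.
    t1 = 2.0 * width / math.sqrt(row_speed**2 - v_r**2)
    # Rower B: one leg downstream, one leg upstream.
    t2 = width / (row_speed + v_r) + width / (row_speed - v_r)
    return t2 - t1

def solve_current(target, lo=0.0, hi=1.99, tol=1e-9):
    """Bisection on v_r; the difference grows steadily with v_r on this interval."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if time_difference(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(solve_current(0.357), 2))  # 1.0
```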

FIGURE V.7 Unit vectors in cylindrical coordinates.

Similarly, it may prove convenient to describe the position of the point P in cylindrical
coordinates. The lines $PP_1$, $PP_2$, and $PP_3$ are then taken in the radial, angular,
and axial directions, as shown in Figure V.7. Once again, unit vectors can be introduced
in the positive directions of these three coordinates so that $\mathbf{a}$ may be written

$$\mathbf{a} = \mathbf{1}_r a_r + \mathbf{1}_\phi a_\phi + \mathbf{1}_z a_z$$

However, an important difference is to be noted. Whereas in Cartesian coordinates
the unit vectors have directions which are independent of the (x,y,z) coordinates
of the point P, in cylindrical coordinates the directions of both $\mathbf{1}_r$ and $\mathbf{1}_\phi$ depend on
the r and $\phi$ coordinates of the point P. Thus in cylindrical coordinates, one can express
addition or subtraction of two vectors in the form

$$\mathbf{a} \pm \mathbf{b} = \mathbf{1}_r(a_r \pm b_r) + \mathbf{1}_\phi(a_\phi \pm b_\phi) + \mathbf{1}_z(a_z \pm b_z)$$

only if the vectors are acting at the same point. If they are fixed vectors, this is an
automatic condition. If they are free vectors, in moving one vector from its original
point to a common point, account must be taken of the shift in direction of $\mathbf{1}_r$ and $\mathbf{1}_\phi$.
In effect, the shifted vector must be re-resolved into new components. For this reason
the Cartesian system is the natural one in which to translate vectors.

FIGURE V.8 Unit vectors in spherical coordinates.

In like fashion the position of the point P can be described in spherical coordinates,
as shown in Figure V.8. The lines $PP_1$, $PP_2$, and $PP_3$ are then taken in the radial,
latitudinal, and azimuthal directions and unit vectors can be introduced in the positive
directions of these three coordinates so that $\mathbf{a}$ may be written

$$\mathbf{a} = \mathbf{1}_r a_r + \mathbf{1}_\theta a_\theta + \mathbf{1}_\phi a_\phi$$

Once again, addition or subtraction of two vectors can be expressed in the form

$$\mathbf{a} \pm \mathbf{b} = \mathbf{1}_r(a_r \pm b_r) + \mathbf{1}_\theta(a_\theta \pm b_\theta) + \mathbf{1}_\phi(a_\phi \pm b_\phi)$$

with the precaution that the vectors are acting at the same point, since the directions
of all three unit vectors depend on the position of the point P.
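
The precaution about curvilinear components can be illustrated with a small numerical sketch for the cylindrical case. It assumes the standard relation between cylindrical and Cartesian components at azimuth $\phi$; the function names `cyl_to_cart` and `add` are hypothetical.

```python
import math

def cyl_to_cart(a_r, a_phi, a_z, phi):
    """Cartesian components of a vector whose cylindrical components
    (a_r, a_phi, a_z) are referred to the unit vectors at azimuth phi."""
    a_x = a_r * math.cos(phi) - a_phi * math.sin(phi)
    a_y = a_r * math.sin(phi) + a_phi * math.cos(phi)
    return (a_x, a_y, a_z)

def add(a, b):
    """Componentwise sum, valid in Cartesian components."""
    return tuple(p + q for p, q in zip(a, b))

# Two unit radial vectors acting at points a quarter turn apart.
a = cyl_to_cart(1.0, 0.0, 0.0, 0.0)
b = cyl_to_cart(1.0, 0.0, 0.0, math.pi / 2)
true_sum = add(a, b)
print(math.hypot(true_sum[0], true_sum[1]))  # about 1.414
```

Adding the radial components directly would suggest a resultant of magnitude 2, whereas re-resolving both vectors into Cartesian components first gives the correct magnitude of about 1.414.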

V.6 MULTIPLICATION OF VECTORS-THE DOT PRODUCT

The multiplication of a vector by a scalar has already been introduced. Additionally,
it is convenient to define two operations involving the multiplication of a vector by a
vector, each of which has extensive physical application. The first of these operations
is called the dot product (or scalar product, or inner product) and is symbolized in the
form $\mathbf{a} \cdot \mathbf{b}$. By definition,

$$\mathbf{a} \cdot \mathbf{b} = ab\cos\theta \tag{V.3}$$

in which $\theta$ is the angle measured from $\mathbf{a}$ to $\mathbf{b}$, as shown in Figure V.9. Note that the
dot product can be positive or negative according to whether $\theta$ is less than or greater
than 90 deg.

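FIGURE V.9 Notation for the dot product.

The dot product is a permissible operation if the vectors are free, or are fixed and
acting at the same point. They need not have the same dimensions. Since
$\mathbf{b} \cdot \mathbf{a} = ba\cos(-\theta) = ba\cos\theta$, it follows that

$$\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a} \tag{V.4}$$

and thus the dot product is commutative.

FIGURE V.10 Distributive law for the dot product.

Pictorially, $\mathbf{a} \cdot \mathbf{b}$ may be interpreted as b times the projection on $\mathbf{b}$ of $\mathbf{a}$, or as a times
the projection on $\mathbf{a}$ of $\mathbf{b}$. Thus in forming the dot product $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c})$, it can be seen
from Figure V.10 that

$$\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = a\,\mathrm{proj}_a(\mathbf{b} + \mathbf{c}) = a\,\mathrm{proj}_a\mathbf{b} + a\,\mathrm{proj}_a\mathbf{c}$$

and therefore that

$$\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c} \tag{V.5}$$

Thus the dot product obeys the distributive law. This result can be used to expand
products such as

$$\mathbf{a} \cdot \mathbf{b} = (\mathbf{1}_x a_x + \mathbf{1}_y a_y + \mathbf{1}_z a_z) \cdot (\mathbf{1}_x b_x + \mathbf{1}_y b_y + \mathbf{1}_z b_z)$$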

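When it is recognized that $\mathbf{1}_x \cdot \mathbf{1}_x = 1$, $\mathbf{1}_x \cdot \mathbf{1}_y = 0$, etc., the expansion reduces to

$$\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z \tag{V.6}$$

Equation (V.6) is a statement of the dot product in terms of Cartesian components.
In particular

$$\mathbf{a} \cdot \mathbf{a} = a^2 = a_x^2 + a_y^2 + a_z^2 \tag{V.7}$$

The dot product may also be used to find the angle between two vectors through
the relation

$$\cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{ab} \tag{V.8}$$

Finally, if $\mathbf{a} \cdot \mathbf{b} = 0$ and a > 0, b > 0, then $\mathbf{a}$ must be perpendicular to $\mathbf{b}$.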
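EXAMPLE V.4
Let $\mathbf{a}$ be any vector, and let it make angles $\alpha$, $\beta$, $\gamma$ with the X, Y, and Z axes. Then, since
$\mathbf{1}_x$ is a unit vector,

$$\mathbf{1}_x \cdot \mathbf{a} = a_x = a\cos\alpha, \text{ etc.}$$

and

$$\mathbf{a} = a(\mathbf{1}_x\cos\alpha + \mathbf{1}_y\cos\beta + \mathbf{1}_z\cos\gamma)$$

from which it follows that

$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$$

The three quantities

$$\cos\alpha = \frac{a_x}{a} \qquad \cos\beta = \frac{a_y}{a} \qquad \cos\gamma = \frac{a_z}{a}$$

are called the direction cosines. All three are needed to determine uniquely the direction of $\mathbf{a}$.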
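
The component form of the dot product, the angle relation, and the direction cosines lend themselves to a short numerical sketch. The function names and the sample vectors below are assumptions chosen for illustration.

```python
import math

def dot(a, b):
    """Dot product from Cartesian components, as in (V.6)."""
    return sum(p * q for p, q in zip(a, b))

def magnitude(a):
    """Magnitude from (V.7)."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in degrees between two nonzero vectors, from (V.8)."""
    return math.degrees(math.acos(dot(a, b) / (magnitude(a) * magnitude(b))))

def direction_cosines(a):
    """Cosines of the angles a nonzero vector makes with the X, Y, and Z axes."""
    m = magnitude(a)
    return tuple(p / m for p in a)

a = (1.0, 2.0, 2.0)
b = (2.0, -2.0, 1.0)
print(dot(a, b))                               # 0.0, so a is perpendicular to b
print(angle_between(a, b))                     # approximately 90 deg
print(sum(c * c for c in direction_cosines(a)))  # approximately 1
```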

EXAMPLE V.5
The dot product is very useful in formulating energy problems. As an example, let a particle
of mass m move along the path C from point $P_1$ to point $P_2$ under the action of a force $\mathbf{F}$.
The path C need not lie in a plane and $\mathbf{F}$ need not be constant. At a general position on the
path, the mass m undergoes a displacement $d\boldsymbol{\ell}$. Since $d\boldsymbol{\ell}$ has a magnitude $d\ell$ and the direction
of the motion of the particle (i.e., tangent to the path), it follows that $\mathbf{F} \cdot d\boldsymbol{\ell}$ is the component
of $\mathbf{F}$ parallel to the displacement, multiplied by the displacement. But this is the element
of work dW done by $\mathbf{F}$ on the particle during the displacement $d\boldsymbol{\ell}$. Thus

$$W = \int_{P_1}^{P_2} dW = \int_{P_1}^{P_2} \mathbf{F} \cdot d\boldsymbol{\ell} \tag{V.9}$$

is the work done by the force $\mathbf{F}$ on the particle as it moves from $P_1$ to $P_2$.

As a specific illustration of this relation, let the particle move along the helix given by the
parametric equations

$$x = \cos t \qquad y = \sin t \qquad z = t$$

with the initial point $P_1$ taken to correspond to $t_1 = 0$ and the terminal point $P_2$ taken to
correspond to $t_2 = 2\pi$. Let the force be given by

$$\mathbf{F} = \mathbf{1}_x x + \mathbf{1}_y y - \mathbf{1}_z z^2$$

Then

$$d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy + \mathbf{1}_z\,dz = [\mathbf{1}_x(-\sin t) + \mathbf{1}_y\cos t + \mathbf{1}_z]\,dt$$
$$\mathbf{F} = \mathbf{1}_x\cos t + \mathbf{1}_y\sin t - \mathbf{1}_z t^2$$
$$\mathbf{F} \cdot d\boldsymbol{\ell} = (-\cos t\sin t + \sin t\cos t - t^2)\,dt$$
$$W = -\int_0^{2\pi} t^2\,dt = -\frac{(2\pi)^3}{3}$$

The negative sign attached to this answer means that on the average the force $\mathbf{F}$ was opposed
to the particle's motion. Thus rather than doing work on the particle, the force $\mathbf{F}$ was the
means whereby energy was taken from the particle.
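
The work integral for the helix can also be approximated numerically, which gives an independent check on the closed-form answer. This is a sketch using the midpoint rule; the function name `work_along_helix` and the number of subintervals are assumptions made for the illustration.

```python
import math

def work_along_helix(n=20000):
    """Midpoint-rule estimate of the line integral of F . dl over one turn of the helix."""
    t_start, t_end = 0.0, 2.0 * math.pi
    h = (t_end - t_start) / n
    total = 0.0
    for k in range(n):
        t = t_start + (k + 0.5) * h
        x, y, z = math.cos(t), math.sin(t), t
        force = (x, y, -z * z)                     # F = 1x x + 1y y - 1z z^2
        tangent = (-math.sin(t), math.cos(t), 1.0)  # dl/dt along the helix
        total += sum(f * d for f, d in zip(force, tangent)) * h
    return total

exact = -(2.0 * math.pi) ** 3 / 3.0
print(work_along_helix(), exact)  # both close to -82.68
```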

V.7 THE EQUATION OF A PLANE

The dot product is also useful in the derivation of the equation of a plane, a result
which is needed in the discussion of electromagnetic plane waves. Consider the plane
shown in Figure V.11. Let $P_0(x_0,y_0,z_0)$ be that point in the plane which is closest to the
origin, and let $P(x,y,z)$ be any other point in the plane. Further, let $\mathbf{a}$ be drawn from
the origin to $P_0$, and let $\mathbf{b}$ be drawn from $P_0$ to P. Then $\mathbf{a}$ is perpendicular to the plane,
whereas $\mathbf{b}$ lies in the plane, and thus

$$\mathbf{a} \cdot \mathbf{b} = x_0(x - x_0) + y_0(y - y_0) + z_0(z - z_0) = 0$$

which can be rearranged to give

$$\frac{x_0}{(x_0^2 + y_0^2 + z_0^2)^{1/2}}\,x + \frac{y_0}{(x_0^2 + y_0^2 + z_0^2)^{1/2}}\,y + \frac{z_0}{(x_0^2 + y_0^2 + z_0^2)^{1/2}}\,z = (x_0^2 + y_0^2 + z_0^2)^{1/2} \tag{V.10}$$

FIGURE V.11 Geometry of a plane.


This is the equation of the plane, and is in the standard form Ax + By + Cz = D.
As written, it reveals all the important properties which characterize the plane.
The coefficients of x, y, and z in Equation (V.10) are the direction cosines of any
line perpendicular to the plane and thus define its tilt. The constant term on the right
gives the distance from the origin to the plane. The equation of any plane can be put
in the form (V.10) merely by normalizing the coefficients. This is accomplished when
the sum of the squares of the coefficients of x, y, and z is unity.
It follows that the equations for a family of parallel planes will have identical x, y,
and z coefficients when written in the normalized form (V.10). The distance between
any two members of the family will then be the difference between the constant terms
on the right sides of the equations.

EXAMPLE V.6
What is the equation of a plane which is parallel to the plane 6x - 3y + 2z = 7 and three
times as far from the origin?
One can begin by normalizing the equation of the first plane. Since

$$[(6)^2 + (-3)^2 + (2)^2]^{1/2} = 7$$

the normalized form is

$$\tfrac{6}{7}x - \tfrac{3}{7}y + \tfrac{2}{7}z = 1$$

and thus the first plane is 1 unit from the origin. A suitable choice for the second plane is
therefore

$$\tfrac{6}{7}x - \tfrac{3}{7}y + \tfrac{2}{7}z = 3$$

An equally valid choice is the plane on the other side of the origin, for which one obtains the
equation

$$\tfrac{6}{7}x - \tfrac{3}{7}y + \tfrac{2}{7}z = -3$$
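
The normalization step can be sketched in a few lines. The function names `normalize_plane` and `scaled_parallel_plane` are hypothetical; the plane Ax + By + Cz = D is represented simply by its four coefficients.

```python
import math

def normalize_plane(A, B, C, D):
    """Divide through by the length of (A, B, C) so the x, y, z coefficients
    become direction cosines and the constant term is the distance from the origin."""
    n = math.sqrt(A * A + B * B + C * C)
    return (A / n, B / n, C / n, D / n)

def scaled_parallel_plane(plane, factor):
    """Parallel plane whose signed distance from the origin is multiplied by factor."""
    l, m, n, d = plane
    return (l, m, n, factor * d)

first = normalize_plane(6.0, -3.0, 2.0, 7.0)
print(first[3])                             # 1.0, the distance of the first plane
print(scaled_parallel_plane(first, 3.0))    # same coefficients, constant term 3
print(scaled_parallel_plane(first, -3.0))   # the plane on the other side of the origin
```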

V.8 MULTIPLICATION OF VECTORS-THE CROSS PRODUCT


The second operation to be introduced which involves the multiplication of a vector
by a vector is known as the cross product (or vector product or outer product) and is
symbolized in the form $\mathbf{a} \times \mathbf{b}$. By definition,

$$\mathbf{a} \times \mathbf{b} = ab\,|\sin\theta|\,\mathbf{1}_n \tag{V.11}$$

in which $\theta$ is the smaller angle measured from $\mathbf{a}$ to $\mathbf{b}$ as indicated in Figure V.12, and
$\mathbf{1}_n$ is a unit vector normal to the plane containing $\mathbf{a}$ and $\mathbf{b}$. Which direction $\mathbf{1}_n$ takes
perpendicular to the plane is determined by the right-hand rule. If a right-hand screw
were placed parallel to $\mathbf{1}_n$ and rotated in such a way as to take $\mathbf{a}$ through the angle $\theta$
into $\mathbf{b}$, the longitudinal displacement of the screw would be in the direction of $\mathbf{1}_n$. An
alternative method to determine the sense of $\mathbf{1}_n$ is to use the thumb, index, and middle
fingers of the right hand to indicate respectively the directions of $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{1}_n$.

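FIGURE V.12 Notation for the cross product.

From this definition, it is evident that taking the cross product of two vectors yields
a new vector perpendicular to both original vectors. Its magnitude is determined by
the magnitudes of the original vectors and the sine of the angle between them. The
distinctions between the scalar product and the vector product should be carefully
noted.
Forming the vector product is a permissible operation if the vectors are free, or fixed
and acting at the same point. They need not be of the same dimensions.
An immediate consequence of the rule for determining the sense of $\mathbf{1}_n$ is the relation

$$\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a} \tag{V.12}$$

and thus the commutative law is not obeyed. The vector product does conform to the
distributive law, however, and one may write

$$\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c} \tag{V.13}$$

This result may be established with the aid of Figure V.13. $\mathbf{b}$ and $\mathbf{c}$ (and hence $\mathbf{b} + \mathbf{c}$)
lie in a common plane which, in general, is tilted with respect to the plane of the paper,
since in the figure $\mathbf{a}$ is taken perpendicular to the plane of the paper. Their projections
in the plane of the paper are $\mathbf{b}'$, $\mathbf{c}'$, and $(\mathbf{b} + \mathbf{c})'$ respectively. It follows that

$$\mathbf{a} \times \mathbf{b} = \mathbf{a} \times \mathbf{b}' \qquad \mathbf{a} \times \mathbf{c} = \mathbf{a} \times \mathbf{c}' \qquad \mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times (\mathbf{b} + \mathbf{c})'$$

FIGURE V.13 Distributive law for the cross product.

But if the triangle of sides $\mathbf{b}'$, $\mathbf{c}'$, and $(\mathbf{b} + \mathbf{c})'$ is rotated 90 deg and altered by the
factor a, the result is a triangle of sides $\mathbf{a} \times \mathbf{b}'$, $\mathbf{a} \times \mathbf{c}'$, and $\mathbf{a} \times (\mathbf{b} + \mathbf{c})'$. Thus

$$\mathbf{a} \times (\mathbf{b} + \mathbf{c})' = \mathbf{a} \times \mathbf{c}' + \mathbf{a} \times \mathbf{b}'$$

and thus

$$\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c}$$

which is (V.13). This proof may be extended readily to $\mathbf{a} \times (\mathbf{b} + \mathbf{c} + \mathbf{d})$,
$(\mathbf{a} + \mathbf{b}) \times (\mathbf{c} + \mathbf{d})$, etc. In particular it may be applied to the expansion

$$\mathbf{a} \times \mathbf{b} = (\mathbf{1}_x a_x + \mathbf{1}_y a_y + \mathbf{1}_z a_z) \times (\mathbf{1}_x b_x + \mathbf{1}_y b_y + \mathbf{1}_z b_z)$$

When it is recognized that $\mathbf{1}_x \times \mathbf{1}_x = 0$, $\mathbf{1}_x \times \mathbf{1}_y = \mathbf{1}_z$, etc., this expansion reduces to

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{1}_x & \mathbf{1}_y & \mathbf{1}_z \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix} \tag{V.14}$$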

Equation (V.14) is oft-used and worth committing to memory. It can be observed
that $\mathbf{b} \times \mathbf{a}$ corresponds to an interchange of two rows of the determinant, which gives
the required change of sign.
The statement that $\mathbf{1}_x \times \mathbf{1}_y = \mathbf{1}_z$ implies the use of right-handed coordinate systems.
This convention has been adopted almost universally and will be followed in this
text. Thus for example, $(x,y,z)$, $(r,\phi,z)$, and $(r,\theta,\phi)$ will be the chosen order of writing
coordinates in Cartesian, cylindrical, and spherical systems. The positive directions
of the three coordinate axes will always be chosen so that $\mathbf{1}_x \times \mathbf{1}_y = \mathbf{1}_z$, $\mathbf{1}_r \times \mathbf{1}_\phi = \mathbf{1}_z$,
and $\mathbf{1}_r \times \mathbf{1}_\theta = \mathbf{1}_\phi$, etc. Strict adherence to this convention is necessary if confusions
of sign in formulas for vector products are to be avoided.
Finally, if $\mathbf{a} \times \mathbf{b} = 0$, and a > 0, b > 0, it follows that $\mathbf{a}$ and $\mathbf{b}$ are parallel.
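
The determinant form of the cross product translates directly into component arithmetic. The following sketch, with a hypothetical function name `cross` and sample vectors chosen for illustration, checks the right-handed unit-vector relation together with the anticommutative and distributive properties.

```python
def cross(a, b):
    """Cross product from Cartesian components: the expansion of the determinant (V.14)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    """Componentwise vector sum."""
    return tuple(p + q for p, q in zip(a, b))

unit_x, unit_y = (1, 0, 0), (0, 1, 0)
print(cross(unit_x, unit_y))  # (0, 0, 1), a right-handed system

a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)
print(cross(a, b), cross(b, a))  # equal and opposite, as in (V.12)
print(cross(a, add(b, c)) == add(cross(a, b), cross(a, c)))  # True, as in (V.13)
```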
EXAMPLE V.7
A parallelogram of sides a and b has an area given by $S = ab\sin\theta$, with $\theta$ an interior corner
angle. If the sides of the parallelogram are treated as vectors, this can be written $S = |\mathbf{a} \times \mathbf{b}|$.
It is even more useful in this case not to take the absolute value. The vector product itself
not only has a magnitude equal to the area of the parallelogram, but also points in a direction
perpendicular to the plane of the parallelogram. This direction serves to describe the
orientation in space of the plane surface bounded by the parallelogram, and thus more information
is provided when the relation

$$\mathbf{S} = \mathbf{a} \times \mathbf{b} \tag{V.15}$$

is used.

EXAMPLE V.8
The result of the previous illustrative example can be combined with the dot product to
express the volume of the parallelopiped shown in the figure. If the three sides are represented
by the free vectors $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$, the area of the base is $|\mathbf{b} \times \mathbf{c}|$
and the height is the projection of $\mathbf{a}$
on a line perpendicular to the base. Thus the volume of the parallelopiped is

$$V = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})$$

When one makes use of the cross product in the form (V.14), the volume can be written
simply as the determinant

$$V = \begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix}$$

This result is often called the scalar triple product.
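
The equivalence of the two forms of the volume can be checked numerically. In this sketch the function names (`cross`, `dot`, `triple_product`, `det3`) and the sample vectors are assumptions; note that the sign of the result depends on the handedness of the ordered set a, b, c, so the geometric volume is its absolute value.

```python
def cross(a, b):
    """Cross product from Cartesian components."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product from Cartesian components."""
    return sum(p * q for p, q in zip(a, b))

def triple_product(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))

def det3(a, b, c):
    """3 x 3 determinant with rows a, b, c, expanded along the first row."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

a, b, c = (2, 0, 0), (0, 3, 0), (0, 0, 4)
print(triple_product(a, b, c), det3(a, b, c))  # 24 24, a rectangular box of sides 2, 3, 4
```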

EXAMPLE V.9
Consider a force F acting on a pivoted rod as shown in the figure. The torque exerted by this
force is $T = Fr\sin\theta$, and it tends to rotate the rod counterclockwise. By letting $\mathbf{r}$ be a free
vector drawn from the pivot point out to the point of application of the force, one can write

$$\mathbf{T} = \mathbf{r} \times \mathbf{F} \tag{V.16}$$

The magnitude of $\mathbf{T}$ is the torque. The direction of $\mathbf{T}$ is the axis of rotation for the torque,
and application of the right-hand rule† yields the information as to whether the torque is
clockwise or counterclockwise. The formula (V.16) is a general result, not restricted to the
case that the rod and the force $\mathbf{F}$ lie in the plane of the paper.
† The right thumb is placed in the direction of $\mathbf{T}$ and the remaining four fingers then point in the
rotational direction of the torque.

EXAMPLE V.10
Imagine a body to be rotating about a fixed axis at an angular velocity $\omega$, as shown in the
figure. Let $\mathbf{r}$ be drawn from a fixed point O on the axis of rotation to a point P in the body.

The angular rotation may be described by the vector $\boldsymbol{\omega}$. The magnitude of $\boldsymbol{\omega}$ is the angular
velocity; its direction is the axis of rotation. The sense of $\boldsymbol{\omega}$ (up or down) is chosen to correspond
to the right-hand rule, thus yielding the proper direction of rotation (clockwise or
counterclockwise). It follows that the velocity of the point P is given by

$$\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r} \tag{V.17}$$

Equations (V.15), (V.16), and (V.17) illustrate the power of the vector method. In
a simple, terse vector formula, one is able to include all the information otherwise contained
in a scalar equation and its associated paragraph of explanation concerning
directions.
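
A brief numerical sketch shows how much information the velocity formula carries. The axis of rotation is assumed here to be the Z axis, and the function name `cross` and the sample values are chosen purely for illustration.

```python
def cross(a, b):
    """Cross product from Cartesian components."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

omega = (0.0, 0.0, 2.0)   # 2 rad/sec about the Z axis, counterclockwise seen from +Z
r = (3.0, 0.0, 5.0)       # point P, 3 units from the axis and 5 units above O
v = cross(omega, r)
print(v)  # (0.0, 6.0, 0.0)
```

The speed is the angular velocity times the distance of P from the axis, and the direction is tangent to the circle on which P travels; the height of P above the fixed point O has no effect.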

V.9 THE DERIVATIVE OF A VECTOR


Frequently the value (both magnitude and direction) of a vector will depend on one
or more variables. As examples, the velocity of an accelerating particle is a function
of time, the gravitational attraction of the earth is a function of height in the atmos-
phere, and the intensity of a radio signal depends on distance and direction from the
transmitting station.
The rate of change of such vector functions with respect to the functional variable
is often of considerable interest, and one is thereby led to the notion of a vector derivative.
It proves useful to adopt a definition for this derivative similar to the one used in
ordinary calculus. Thus suppose there is a vector $\mathbf{f}$ related to a scalar parameter s in
such a way that as s varies continuously, $\mathbf{f}$ does also. This dependency of $\mathbf{f}$ on s will be
indicated by writing $\mathbf{f}(s)$. If $\Delta\mathbf{f}$ denotes the increment in $\mathbf{f}$ due to an increment $\Delta s$ in
the parameter s, then the vector derivative $d\mathbf{f}/ds$ is defined by

$$\frac{d\mathbf{f}}{ds} = \lim_{\Delta s \to 0} \frac{\Delta\mathbf{f}}{\Delta s} = \lim_{\Delta s \to 0} \frac{\mathbf{f}(s + \Delta s) - \mathbf{f}(s)}{\Delta s} \tag{V.18}$$

It should be emphasized that the increment $\Delta\mathbf{f}$ is often not parallel to $\mathbf{f}$, indicating
that not only is the magnitude changing, but the direction as well. In such cases $d\mathbf{f}/ds$
is not in the same direction as $\mathbf{f}$. This is an essential feature of the vector derivative
which distinguishes it from the scalar derivatives encountered in earlier
studies of the calculus. One should guard against overlooking this feature.
Formulas for the vector derivatives of common functional combinations can be
established in a manner completely analogous to what is done in ordinary calculus.
Thus for example, if u is a continuous scalar function of the parameter s and if $\mathbf{v}$ is a
continuous vector function of the same parameter, then

$$\frac{d(u\mathbf{v})}{ds} = \lim_{\Delta s \to 0} \frac{(u + \Delta u)(\mathbf{v} + \Delta\mathbf{v}) - u\mathbf{v}}{\Delta s} = u\frac{d\mathbf{v}}{ds} + \mathbf{v}\frac{du}{ds} \tag{V.19}$$

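Similarly, if $u_1$, $u_2$, $\mathbf{v}_1$, and $\mathbf{v}_2$ are all continuous functions of the parameter s,

(V.20)

These two formulas can be used to establish the general derivative of $\mathbf{f}(s)$ in terms of its
Cartesian components. If one writes $\mathbf{f}(s)$ in the form

$$\mathbf{f}(s) = \mathbf{1}_x f_1(s) + \mathbf{1}_y f_2(s) + \mathbf{1}_z f_3(s)$$

since the unit vectors are not functions of s, it follows from (V.19) and (V.20) that

$$\frac{d\mathbf{f}}{ds} = \mathbf{1}_x\frac{df_1}{ds} + \mathbf{1}_y\frac{df_2}{ds} + \mathbf{1}_z\frac{df_3}{ds}$$
$$\frac{d^2\mathbf{f}}{ds^2} = \mathbf{1}_x\frac{d^2f_1}{ds^2} + \mathbf{1}_y\frac{d^2f_2}{ds^2} + \mathbf{1}_z\frac{d^2f_3}{ds^2} \tag{V.21}$$

Formula (V.21) is not, in general, extendable to other coordinate systems. For example,
if $\mathbf{f}$ were a function of the angular variable $\phi$ in cylindrical coordinates, and if one
expressed $\mathbf{f}$ in terms of its cylindrical components, namely,

$$\mathbf{f} = \mathbf{1}_r f_r + \mathbf{1}_\phi f_\phi + \mathbf{1}_z f_z$$

application of the formulas (V.19) and (V.20) would have to account for the fact that
$\mathbf{1}_r$ and $\mathbf{1}_\phi$ have directions which are functions of $\phi$.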
EXAMPLE V.11
With respect to the origin of a Cartesian coordinate system, the position of a particle as a
function of time can be expressed in the functional form $\mathbf{r}(t)$. Such a particle might be following
a path as shown in the figure. The instantaneous velocity of the particle is given by

$$\mathbf{v}(t) = \frac{d\mathbf{r}}{dt} = \lim_{\Delta t \to 0} \frac{\Delta\mathbf{r}}{\Delta t} = \lim_{\Delta t \to 0} \frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t}$$

and it is evident from the construction in the figure that $\mathbf{v}(t)$ is tangent to the path.
As a specific illustration, suppose that the trajectory is given by

in which the coefficients $k_i$ are constants. Then

and the particle is drifting at constant speed in its projection on the XY plane but has a
linearly changing velocity in the Z direction. Its acceleration is

a constant (which in some problems might be due to gravity).

EXAMPLE V.12
Consider a particle which is going around in a circular orbit of radius r0 at the constant angular
rate ω rad/sec. As suggested by the figure, a reference line can be established which intersects
the orbit at a position which the particle occupied at t = 0. The instantaneous angular
position of the particle can then be specified in terms of the angle φ by the relation φ = ωt so
that dφ/dt = ω.

If the origin of cylindrical coordinates is taken at the center of the orbit, the position vector
of the particle is

$$\mathbf{r}(t) = \mathbf{1}_r r_0$$

which is a function of time by virtue of the fact that the direction of the unit vector 1_r is a
function of time. When use is made of (V.19), the velocity of the particle is seen to be

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = r_0 \frac{d}{dt}(\mathbf{1}_r) = r_0 \frac{d}{d\phi}(\mathbf{1}_r)\frac{d\phi}{dt} = \omega r_0 \frac{d}{d\phi}(\mathbf{1}_r)$$

The angular derivative of 1_r can be determined with the aid of the vector diagram showing
1_r at successive positions dφ apart, from which it is evident that d(1_r) = 1_φ dφ, and thus

$$\frac{d}{d\phi}(\mathbf{1}_r) = \mathbf{1}_\phi$$

which leads to the result that v = 1_φ ω r0.

This problem could also have been solved in the manner of Example V.10.
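
As a quick numerical cross-check of this result, the sketch below differentiates the Cartesian components of the orbiting position vector and compares them with ω r0 1_φ, using 1_φ = -1_x sin φ + 1_y cos φ. The function names and the sample values are illustrative assumptions.

```python
import math


def orbit_position(t, r0, omega):
    """Cartesian (x, y) components of r(t) = 1_r r0 with phi = omega * t."""
    phi = omega * t
    return (r0 * math.cos(phi), r0 * math.sin(phi))


def orbit_velocity_numeric(t, r0, omega, dt=1e-6):
    """Central-difference estimate of dr/dt in Cartesian components."""
    x1, y1 = orbit_position(t + dt, r0, omega)
    x0, y0 = orbit_position(t - dt, r0, omega)
    return ((x1 - x0) / (2.0 * dt), (y1 - y0) / (2.0 * dt))


def orbit_velocity_formula(t, r0, omega):
    """v = 1_phi * omega * r0, with 1_phi resolved along the X and Y axes."""
    phi = omega * t
    return (-omega * r0 * math.sin(phi), omega * r0 * math.cos(phi))


if __name__ == "__main__":
    print(orbit_velocity_numeric(0.7, 2.0, 3.0))
    print(orbit_velocity_formula(0.7, 2.0, 3.0))
```

The two printed pairs should agree to within the finite-difference error, and the speed is ω r0 at every instant.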

V.10 TANGENT LINES AND TANGENT PLANES

As indicated in the previous section, a space curve can be described by the vector
function r(s), in which s is a parameter (not necessarily distance or time). Since dr/ds
is tangent to the space curve, this provides a means for developing the equation of the
tangent line. As an example, in rectangular coordinates the space curve could be
represented by

$$\mathbf{r} = \mathbf{1}_x f_1(s) + \mathbf{1}_y f_2(s) + \mathbf{1}_z f_3(s)$$

so that

$$\frac{d\mathbf{r}}{ds} = \mathbf{1}_x f_1'(s) + \mathbf{1}_y f_2'(s) + \mathbf{1}_z f_3'(s)$$

If P0(x0,y0,z0) is the point on the space curve corresponding to the parameter value
s0, and P(x,y,z) is any other point on the line which is tangent to the space curve at P0,
then the equation of this tangent line is simply

$$\frac{x - x_0}{f_1'(s_0)} = \frac{y - y_0}{f_2'(s_0)} = \frac{z - z_0}{f_3'(s_0)} \tag{V.22}$$

EXAMPLE V.13
Consider once again the helix of Example V.5, given by the parametric equations

$$x = \cos t \qquad y = \sin t \qquad z = t$$

A point P0 on this helix is given by the position vector

$$\mathbf{r}(t_0) = \mathbf{1}_x \cos t_0 + \mathbf{1}_y \sin t_0 + \mathbf{1}_z t_0$$

and, through use of (V.22), the line tangent to the helix at P0 is seen to satisfy the equation

$$\frac{x - \cos t_0}{-\sin t_0} = \frac{y - \sin t_0}{\cos t_0} = z - t_0$$

Consider next a scalar function F(x,y,z). The locus of points for which

$$F(x,y,z) = F(x_0,y_0,z_0) = K \tag{V.23}$$

with K a constant, is simply the collection of points P(x,y,z) for which the function F
has the same value that it does at the point P0(x0,y0,z0). This locus is usually a surface.
This fact may be appreciated by recognizing that for most functions F of practical
interest, (V.23) can be solved in the form

$$z = f(x,y) \tag{V.24}$$

Equation (V.24) determines a point P(x,y,z) for each value of x and y. When x and y
vary continuously over some region, P(x,y,z) will, in general, vary continuously over
the corresponding portion of a surface in space.
In what follows attention will be restricted to functions F for which (V.23) is the
equation of a surface. At the point P0, the total differential is

$$dF = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial z}\,dz \tag{V.25}$$

in which it is assumed implicitly that the three partial derivatives in (V.25) have been
evaluated at P0. If the neighboring point P1 reached by the increments dx, dy, and dz
is also on the surface defined by (V.23), then dF = 0. It is interesting to interpret this
as an equation of the form

$$\mathbf{N} \cdot d\boldsymbol{\ell} = 0 \tag{V.26}$$

in which

$$\mathbf{N} = \mathbf{1}_x \frac{\partial F}{\partial x} + \mathbf{1}_y \frac{\partial F}{\partial y} + \mathbf{1}_z \frac{\partial F}{\partial z} \tag{V.27}$$

and

$$d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy + \mathbf{1}_z\,dz \tag{V.28}$$

Since P0 and P1 are both points in the surface, dℓ (which connects them) is tangent
to a space curve which lies in the surface and passes through P0 and P1. N is perpendicular
to this space curve, and since P1 can be any neighboring point in the surface, it
follows that N is normal to all the space curves in the surface which pass through the
point P0, and is therefore normal to the surface itself at the point P0.
If one refers to the development of Section V.7, the equation of the tangent plane at
P0 is of the form Ax + By + Cz = D, and

$$A = \frac{\partial F}{\partial x} \qquad B = \frac{\partial F}{\partial y} \qquad C = \frac{\partial F}{\partial z}$$

The value of the constant D may be found by inserting the point P0 into the general
equation of the plane, which gives

$$x_0 \frac{\partial F}{\partial x} + y_0 \frac{\partial F}{\partial y} + z_0 \frac{\partial F}{\partial z} = D$$

Thus the equation of the tangent plane is

$$(x - x_0)\frac{\partial F}{\partial x} + (y - y_0)\frac{\partial F}{\partial y} + (z - z_0)\frac{\partial F}{\partial z} = 0 \tag{V.29}$$

This result could have been achieved in another way. Since 1_x(x - x0) + 1_y(y - y0) +
1_z(z - z0) is a vector which lies in the tangent plane, its dot product with N, a vector
normal to the plane, should be zero.
EXAMPLE V.14
Let the problem be posed to find the general expression for the plane tangent to a sphere of
radius r. Since the equation of a sphere can be written

$$F(x,y,z) = x^2 + y^2 + z^2 - r^2 = 0$$

it follows that, at the point P0(x0,y0,z0),

$$\frac{\partial F}{\partial x} = 2x_0 \qquad \frac{\partial F}{\partial y} = 2y_0 \qquad \frac{\partial F}{\partial z} = 2z_0$$

When one uses (V.29), if P(x,y,z) is any point in the tangent plane, then

$$2x_0(x - x_0) + 2y_0(y - y_0) + 2z_0(z - z_0) = 0$$

which can be rewritten as

$$x_0 x + y_0 y + z_0 z = r^2$$

This is the general expression for the equation of a plane tangent to a sphere of radius r at the
point (x0,y0,z0). As a special case, if the point of tangency is (r,0,0), then the equation of the
tangent plane is x = r, an expected result.
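
The recipe behind (V.29) is mechanical enough to sketch in a few lines: evaluate the partial derivatives of F at P0 to get A, B, C, then insert P0 to get D. The helper names below (`sphere_gradient`, `tangent_plane`) are hypothetical and serve only to illustrate the sphere of this example.

```python
def sphere_gradient(x, y, z):
    """Partial derivatives of F = x^2 + y^2 + z^2 - r^2 (r drops out)."""
    return (2.0 * x, 2.0 * y, 2.0 * z)


def tangent_plane(gradient, p0):
    """Return (A, B, C, D) of the plane A x + B y + C z = D tangent at p0."""
    a, b, c = gradient(*p0)
    d = a * p0[0] + b * p0[1] + c * p0[2]
    return (a, b, c, d)


if __name__ == "__main__":
    # Point (1, 2, 2) lies on the sphere of radius 3.
    print(tangent_plane(sphere_gradient, (1.0, 2.0, 2.0)))
    # Point of tangency (r, 0, 0) with r = 3 gives the plane x = 3.
    print(tangent_plane(sphere_gradient, (3.0, 0.0, 0.0)))
```

Dividing the first result by 2 gives x + 2y + 2z = 9, which is x0 x + y0 y + z0 z = r² for r = 3.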

V.11 GENERALIZED COORDINATES

It may be observed that the definitions of all vector operations so far introduced
(addition, subtraction, the scalar and vector products) are independent of any coordinate
system. This will also be true of all vector operations to be defined subsequently. This
feature of the vector method is one reason for its wide applicability to physical problems.
A physical law is independent of any coordinate system used to describe it; the
mathematics employed should reflect this independence.
This does not mean that one should never use coordinate systems with vector analysis.
On the contrary, it means that any admissible coordinate system may be used as the
frame of reference for vector analysis with the assurance that the results have physical
applicability. A common choice is Cartesian coordinates, and earlier sections have
shown the forms which some vector relations assume in that frame.
There are many other useful coordinate systems, and which one is employed depends
on the problem being considered. The physical symmetry usually suggests the proper
choice. Thus plane electromagnetic waves are most simply described in Cartesian
coordinates because the equation of a plane has its simplest form in that system.
Cylindrical coordinates are used to show the transfer of power in a coaxial cable, since
the cable consists of two concentric cylindrical conductors. Radiation from antennas is
best expressed in spherical coordinates because, from a great distance, antennas appear
to be point sources. The elliptical cross section of the electrodes of some modern vacuum
tubes indicates the choice of elliptical cylindrical coordinates, etc.
It is a needless expenditure of effort to derive the forms for all vector operations in
each and every coordinate system which is considered to be potentially useful. A
preferable procedure is the following:
1. Establish the form that the vector operation assumes in Cartesian coordinates.
2. Specify a general coordinate system by the transformation equations

$$u = g_1(x,y,z) \qquad v = g_2(x,y,z) \qquad w = g_3(x,y,z) \tag{V.30}$$

3. Transform the Cartesian expression found in Step 1 by means of Equations (V.30)
to obtain the general expression for the vector operation in (u,v,w) coordinates.
The specific expression desired may then be found by substituting the appropriate
coordinate system for (u,v,w) in the general expression deduced in Step 3.
The only restriction on the procedure just outlined is that (u,v,w) should constitute
an admissible coordinate system. Admissibility in this context arises from the notion
that physical space can be described by a Cartesian reference frame, in the sense that
to each physical point in space there corresponds a unique triplet of numbers (x,y,z).
Similarly, to each triplet of numbers (x,y,z) there corresponds a unique point in physical
space.† This notion is often stated more briefly by saying there is a one-to-one correspondence
between the Cartesian coordinates and physical space. If (u,v,w) is to be
an admissible coordinate system, there must also be a one-to-one correspondence
between points in physical space and the triplets of numbers (u,v,w). Mathematically,
this means that the functions (V.30) must be single-valued and defined for all values of
x, y, and z. This will ensure that to every triplet (x,y,z) there corresponds a unique
triplet (u,v,w). It means further that it must be possible to solve (V.30) in the form

$$x = G_1(u,v,w) \qquad y = G_2(u,v,w) \qquad z = G_3(u,v,w) \tag{V.31}$$

in which G1, G2, and G3 are single-valued functions, defined for all values of u, v, and w.
This will ensure that to every triplet (u,v,w) there corresponds a unique triplet (x,y,z).
When these conditions are met, (u,v,w) is a completely admissible coordinate system.
A criterion for admissibility can be developed with the aid of a geometric interpretation
of Equations (V.30). If the discussion of Section V.10 is recalled, u = g1(x,y,z)
may be thought of as a family of surfaces, with different members of the family identified
by different values of u. Similarly, v = g2(x,y,z) and w = g3(x,y,z) may be treated as
families of surfaces. These three families should be mutually intersecting so that the
three surfaces u0 = g1(x,y,z), v0 = g2(x,y,z), and w0 = g3(x,y,z) have in common the
single point P0(u0,v0,w0).
The surfaces u0 and v0 intersect in a line whose parametric equations are given by

$$x = G_1(u_0,v_0,w) \qquad y = G_2(u_0,v_0,w) \qquad z = G_3(u_0,v_0,w) \tag{V.32}$$

† Such a concept of space is usually called Newtonian.

This is called a w line because, along its length, w varies but u and v do not. It intersects
the v line

$$x = G_1(u_0,v,w_0) \qquad y = G_2(u_0,v,w_0) \qquad z = G_3(u_0,v,w_0) \tag{V.33}$$

and the u line

$$x = G_1(u,v_0,w_0) \qquad y = G_2(u,v_0,w_0) \qquad z = G_3(u,v_0,w_0) \tag{V.34}$$

at the point P0(u0,v0,w0). These are the three coordinate lines through the point. The
geometric features just described are suggested by Figure V.14.

FIGURE V.14 Generalized coordinates.

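The displacement

$$d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy + \mathbf{1}_z\,dz = \left[\mathbf{1}_x \frac{\partial G_1}{\partial w} + \mathbf{1}_y \frac{\partial G_2}{\partial w} + \mathbf{1}_z \frac{\partial G_3}{\partial w}\right] dw \tag{V.35}$$

obtained by forming differentials from Equations (V.32), lies in the w line at P0 if the
partial derivatives in (V.35) are evaluated at P0. Thus the vector

$$\mathbf{T}_w = \mathbf{1}_x \frac{\partial G_1}{\partial w} + \mathbf{1}_y \frac{\partial G_2}{\partial w} + \mathbf{1}_z \frac{\partial G_3}{\partial w} \tag{V.36}$$

is tangent to the w line at P0. Similarly, the vectors

$$\mathbf{T}_v = \mathbf{1}_x \frac{\partial G_1}{\partial v} + \mathbf{1}_y \frac{\partial G_2}{\partial v} + \mathbf{1}_z \frac{\partial G_3}{\partial v} \tag{V.37}$$

$$\mathbf{T}_u = \mathbf{1}_x \frac{\partial G_1}{\partial u} + \mathbf{1}_y \frac{\partial G_2}{\partial u} + \mathbf{1}_z \frac{\partial G_3}{\partial u} \tag{V.38}$$

are tangent to the v and u lines at P0 respectively.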



If no one of these three tangent vectors is a null vector, and if they are not coplanar,
then the three surfaces u0, v0, and w0 will intersect in a point P0 (rather than, say, in a
line). When the result of Example V.8 is employed, it follows that the necessary and
sufficient condition to ensure a one-point intersection is

$$J = \begin{vmatrix}
\dfrac{\partial G_1}{\partial u} & \dfrac{\partial G_2}{\partial u} & \dfrac{\partial G_3}{\partial u} \\
\dfrac{\partial G_1}{\partial v} & \dfrac{\partial G_2}{\partial v} & \dfrac{\partial G_3}{\partial v} \\
\dfrac{\partial G_1}{\partial w} & \dfrac{\partial G_2}{\partial w} & \dfrac{\partial G_3}{\partial w}
\end{vmatrix} \neq 0 \tag{V.39}$$

Since the inequality (V.39) is also the necessary and sufficient condition that (V.30)
can be derived from (V.31), it follows that the (u,v,w) coordinate system is completely
admissible if J ≠ 0 for all values of u, v, and w.
The determinant of (V.39) is known as the Jacobian.
EXAMPLE V.15
As an illustration of the Jacobian test for admissibility, consider the transformation to
cylindrical coordinates defined by

$$r = [x^2 + y^2]^{1/2} \qquad \phi = \arctan\frac{y}{x} \qquad z = z \tag{V.40}$$

These equations represent, in turn, a family of concentric cylinders, a family of half-planes
which have the Z axis in common, and a family of planes which are all perpendicular to the
Z axis. The geometric relations between cylindrical and Cartesian coordinates were shown
in Figure V.7.
Equations (V.40) may be solved to give

$$x = r\cos\phi \qquad y = r\sin\phi \qquad z = z \tag{V.41}$$

The Jacobian for cylindrical coordinates is thus

$$J = \begin{vmatrix}
\cos\phi & \sin\phi & 0 \\
-r\sin\phi & r\cos\phi & 0 \\
0 & 0 & 1
\end{vmatrix} = r \tag{V.42}$$

Except along the Z axis where r = 0, cylindrical coordinates are seen to be an admissible
coordinate transformation. The reason that the Z axis is inadmissible can be understood if
one considers the triplet (0,0,3) in Cartesian coordinates. This triplet corresponds to all the
triplets (0,φ,3) in cylindrical coordinates for 0 ≤ φ ≤ 2π. Thus a one-to-one correspondence
does not exist for this point or any other on the Z axis.
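
The Jacobian test lends itself to a numerical sketch. The code below approximates the partial derivatives in (V.39) by central differences and expands the 3×3 determinant by hand; the function names and the step size are assumptions made for illustration.

```python
import math


def cylindrical_to_cartesian(r, phi, z):
    """Equations (V.41): x = r cos(phi), y = r sin(phi), z = z."""
    return (r * math.cos(phi), r * math.sin(phi), z)


def jacobian(transform, u, v, w, step=1e-6):
    """Determinant (V.39) with each row estimated by central differences."""
    point = (u, v, w)
    rows = []
    for k in range(3):  # k = 0, 1, 2 differentiates with respect to u, v, w
        ahead, behind = list(point), list(point)
        ahead[k] += step
        behind[k] -= step
        rows.append([(p - q) / (2.0 * step)
                     for p, q in zip(transform(*ahead), transform(*behind))])
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


if __name__ == "__main__":
    print(jacobian(cylindrical_to_cartesian, 2.5, 0.8, 1.0))  # close to r = 2.5
    print(jacobian(cylindrical_to_cartesian, 0.0, 0.8, 1.0))  # vanishes on the Z axis
```

The same `jacobian` helper can be pointed at any other set of functions G1, G2, G3 to probe where a proposed coordinate system fails to be admissible.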

EXAMPLE V.16

Another type of transformation, of importance in classical mechanics, connects one Cartesian
coordinate system to another. With reference to the figure, let XYZ and X'Y'Z' be the two
sets of axes and let the origin of the primed coordinate system occupy the point (x0,y0,z0) in
the unprimed coordinate system at time t = 0. Further, let the primed origin have the constant
velocity u = 1_x u_x + 1_y u_y + 1_z u_z relative to the unprimed system. Also, let cos xx',
cos xy', ..., cos zz' be the cosines of the angles between the various axes of the two frames.

[Figure: the unprimed axes XYZ and the primed axes X'Y'Z'.]

If (x,y,z) is the position of a particle at time t, as seen by an observer O who is stationary in
the unprimed system, and if (x',y',z') is the position of the same particle, at the same time,†
as seen by an observer O' who is stationary in the primed system, then

$$\mathbf{r} = \mathbf{1}_x x + \mathbf{1}_y y + \mathbf{1}_z z$$

is the instantaneous position vector which O attaches to the particle, and

$$\mathbf{r}' = \mathbf{1}_{x'} x' + \mathbf{1}_{y'} y' + \mathbf{1}_{z'} z'$$

is the instantaneous position vector which O' attaches to the particle. Additionally, O
describes the instantaneous position of the primed origin by the vector

$$\mathbf{R} = \mathbf{1}_x (x_0 + u_x t) + \mathbf{1}_y (y_0 + u_y t) + \mathbf{1}_z (z_0 + u_z t)$$

If it is assumed that the two observers agree about distance measurements (another classical
assumption) then these position vectors can be connected by the relation

$$\mathbf{r}' = \mathbf{r} - \mathbf{R}$$

If this equation is dotted successively with 1_x', 1_y', and 1_z', the result is the coordinate
transformation

$$\begin{aligned}
x' &= (x - x_0 - u_x t)\cos xx' + (y - y_0 - u_y t)\cos yx' + (z - z_0 - u_z t)\cos zx' \\
y' &= (x - x_0 - u_x t)\cos xy' + (y - y_0 - u_y t)\cos yy' + (z - z_0 - u_z t)\cos zy' \\
z' &= (x - x_0 - u_x t)\cos xz' + (y - y_0 - u_y t)\cos yz' + (z - z_0 - u_z t)\cos zz'
\end{aligned} \tag{V.43}$$

Equations (V.43) are known as the most general Galilean transformation. Their physical
interpretation is that the primed system is moving relative to the unprimed system at a
speed U = (u_x² + u_y² + u_z²)^{1/2}. This motion is in an arbitrary direction with respect to the
XYZ axes and is also in an arbitrary direction with respect to the X'Y'Z' axes. Furthermore,
the primed origin is in an arbitrary position relative to the unprimed origin at t = 0. The
special case U = 0 yields the most general static transformation between two Cartesian
coordinate systems.
† This is a classical assumption that time is the same in both frames, an assumption which is challenged
in Einstein's special relativity. See Chap. 2.

The Jacobian can be deduced readily from (V.43) and is

$$J = \begin{vmatrix}
\cos xx' & \cos yx' & \cos zx' \\
\cos xy' & \cos yy' & \cos zy' \\
\cos xz' & \cos yz' & \cos zz'
\end{vmatrix} \tag{V.44}$$

It is left as an exercise to show that (V.44) is different from zero, and therefore that the
general Galilean transformation is completely admissible. (See Problem V.21 at the end of
this supplement.)

V.12 ELEMENTARY GEOMETRY IN GENERALIZED COORDINATES

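By virtue of Equation (V.8), the angle of intersection of a u line with a v line can be
expressed in the form

$$\cos\theta = \frac{\mathbf{T}_u \cdot \mathbf{T}_v}{|\mathbf{T}_u|\,|\mathbf{T}_v|}$$

in which T_u and T_v are given by (V.38) and (V.37). Therefore a necessary and sufficient
condition that the intersection of a u line with a v line be a right angle is

$$\frac{\partial G_1}{\partial u}\frac{\partial G_1}{\partial v} + \frac{\partial G_2}{\partial u}\frac{\partial G_2}{\partial v} + \frac{\partial G_3}{\partial u}\frac{\partial G_3}{\partial v} = 0 \tag{V.45}$$

Similarly,

$$\frac{\partial G_1}{\partial w}\frac{\partial G_1}{\partial v} + \frac{\partial G_2}{\partial w}\frac{\partial G_2}{\partial v} + \frac{\partial G_3}{\partial w}\frac{\partial G_3}{\partial v} = 0 \tag{V.46}$$

$$\frac{\partial G_1}{\partial w}\frac{\partial G_1}{\partial u} + \frac{\partial G_2}{\partial w}\frac{\partial G_2}{\partial u} + \frac{\partial G_3}{\partial w}\frac{\partial G_3}{\partial u} = 0 \tag{V.47}$$

are the necessary and sufficient conditions that the intersections of a w line with a v
line and a u line respectively be right angles.
A coordinate system for which Equations (V.45), (V.46), and (V.47) are satisfied at
all points is said to be orthogonal.† Computation of physical quantities is vastly simplified
in orthogonal coordinate systems, and in the remainder of this supplement only
orthogonal systems will be considered.
An example of this simplification is the expression for an element of length in generalized
coordinates. Since

$$d\ell = [(dx)^2 + (dy)^2 + (dz)^2]^{1/2}$$

forming the total differentials of Equations (V.31) yields the transformation to (u,v,w)
coordinates, namely,

$$\begin{aligned}
d\ell = \Bigg\{ & \left[\left(\frac{\partial G_1}{\partial u}\right)^2 + \left(\frac{\partial G_2}{\partial u}\right)^2 + \left(\frac{\partial G_3}{\partial u}\right)^2\right](du)^2
+ \left[\left(\frac{\partial G_1}{\partial v}\right)^2 + \left(\frac{\partial G_2}{\partial v}\right)^2 + \left(\frac{\partial G_3}{\partial v}\right)^2\right](dv)^2 \\
& + \left[\left(\frac{\partial G_1}{\partial w}\right)^2 + \left(\frac{\partial G_2}{\partial w}\right)^2 + \left(\frac{\partial G_3}{\partial w}\right)^2\right](dw)^2
+ 2\left[\frac{\partial G_1}{\partial u}\frac{\partial G_1}{\partial v} + \frac{\partial G_2}{\partial u}\frac{\partial G_2}{\partial v} + \frac{\partial G_3}{\partial u}\frac{\partial G_3}{\partial v}\right] du\,dv \\
& + 2\left[\frac{\partial G_1}{\partial u}\frac{\partial G_1}{\partial w} + \frac{\partial G_2}{\partial u}\frac{\partial G_2}{\partial w} + \frac{\partial G_3}{\partial u}\frac{\partial G_3}{\partial w}\right] du\,dw
+ 2\left[\frac{\partial G_1}{\partial v}\frac{\partial G_1}{\partial w} + \frac{\partial G_2}{\partial v}\frac{\partial G_2}{\partial w} + \frac{\partial G_3}{\partial v}\frac{\partial G_3}{\partial w}\right] dv\,dw \Bigg\}^{1/2}
\end{aligned}$$

† It should be noted that when a coordinate system is orthogonal not only do the coordinate lines
intersect at right angles but the coordinate surfaces do as well.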

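When attention is restricted to orthogonal coordinate systems, this reduces to

$$d\ell = [(h_1\,du)^2 + (h_2\,dv)^2 + (h_3\,dw)^2]^{1/2} \tag{V.48}$$

in which the h_i are called scale factors and are given by

$$h_1 = \left[\left(\frac{\partial G_1}{\partial u}\right)^2 + \left(\frac{\partial G_2}{\partial u}\right)^2 + \left(\frac{\partial G_3}{\partial u}\right)^2\right]^{1/2} \tag{V.49}$$

$$h_2 = \left[\left(\frac{\partial G_1}{\partial v}\right)^2 + \left(\frac{\partial G_2}{\partial v}\right)^2 + \left(\frac{\partial G_3}{\partial v}\right)^2\right]^{1/2} \tag{V.50}$$

$$h_3 = \left[\left(\frac{\partial G_1}{\partial w}\right)^2 + \left(\frac{\partial G_2}{\partial w}\right)^2 + \left(\frac{\partial G_3}{\partial w}\right)^2\right]^{1/2} \tag{V.51}$$

A general space curve can be characterized by the parametric equations

$$u = U(s) \qquad v = V(s) \qquad w = W(s) \tag{V.52}$$

and the length of a section of this curve is therefore

$$\ell = \int d\ell = \int_{s_1}^{s_2} \left[\left(h_1 \frac{du}{ds}\right)^2 + \left(h_2 \frac{dv}{ds}\right)^2 + \left(h_3 \frac{dw}{ds}\right)^2\right]^{1/2} ds \tag{V.53}$$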

EXAMPLE V.17
An element of length in cylindrical coordinates can be found if one chooses Equations
(V.41) to define the functions G1, G2, and G3 and performs the differentiations indicated in
Equations (V.49)-(V.51). One obtains

$$\begin{aligned}
h_1 &= [(\cos\phi)^2 + (\sin\phi)^2 + (0)^2]^{1/2} = 1 \\
h_2 &= [(-r\sin\phi)^2 + (r\cos\phi)^2 + (0)^2]^{1/2} = r \\
h_3 &= [(0)^2 + (0)^2 + (1)^2]^{1/2} = 1
\end{aligned}$$

from which

$$d\ell = [(dr)^2 + (r\,d\phi)^2 + (dz)^2]^{1/2}$$

With reference to Figure V.7, this result is consistent with the observation that elemental
displacements from the point P in the directions of the three unit vectors are dr, r dφ, and dz.
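
The differentiations in (V.49)-(V.51) can also be carried out numerically, which is a convenient way to check a hand calculation. The sketch below estimates the three scale factors by central differences for the cylindrical equations (V.41) and for the spherical equations (V.75); the function names and step size are illustrative assumptions.

```python
import math


def cylindrical(r, phi, z):
    """Equations (V.41)."""
    return (r * math.cos(phi), r * math.sin(phi), z)


def spherical(r, theta, phi):
    """Equations (V.75)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def scale_factors(transform, u, v, w, step=1e-6):
    """Estimate (h1, h2, h3) of (V.49)-(V.51) by central differences."""
    point = (u, v, w)
    factors = []
    for k in range(3):
        ahead, behind = list(point), list(point)
        ahead[k] += step
        behind[k] -= step
        partials = [(p - q) / (2.0 * step)
                    for p, q in zip(transform(*ahead), transform(*behind))]
        factors.append(math.sqrt(sum(d * d for d in partials)))
    return tuple(factors)


if __name__ == "__main__":
    print(scale_factors(cylindrical, 2.0, 0.7, 1.0))  # close to (1, r, 1)
    print(scale_factors(spherical, 2.0, 0.5, 1.2))    # close to (1, r, r sin(theta))
```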
EXAMPLE V.18
In cylindrical coordinates the parametric equations of a helix are

$$r = r_0 \qquad \phi = a_1 s \qquad z = a_2 s$$

in which r0, a1, and a2 are constants and s is a parameter. The length of one turn of this helix
can be determined with the aid of (V.53).

$$\ell = \int_{s_1}^{s_2} [(a_1 r_0)^2 + (a_2)^2]^{1/2}\,ds = (s_2 - s_1)\,[(a_1 r_0)^2 + a_2^2]^{1/2}$$

But a1 s2 - a1 s1 = 2π and therefore

$$\ell = 2\pi \left[r_0^2 + \left(\frac{a_2}{a_1}\right)^2\right]^{1/2}$$
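
A short numerical sketch of (V.53) for this helix follows. It assumes the parametrization r = r0, φ = a1 s, z = a2 s together with the cylindrical scale factors h1 = 1, h2 = r, h3 = 1, and compares a midpoint-rule sum with the closed form for one turn. The function names and sample constants are illustrative.

```python
import math


def helix_length_numeric(r0, a1, a2, s1, s2, n=1000):
    """Midpoint-rule evaluation of (V.53) for r = r0, phi = a1*s, z = a2*s."""
    ds = (s2 - s1) / n
    total = 0.0
    for _ in range(n):
        # dr/ds = 0, dphi/ds = a1, dz/ds = a2; scale factors are (1, r0, 1),
        # so the integrand is the same at every sample point.
        integrand = math.sqrt((1.0 * 0.0) ** 2 + (r0 * a1) ** 2 + (1.0 * a2) ** 2)
        total += integrand * ds
    return total


def helix_turn_length(r0, a1, a2):
    """Closed-form length of one turn: 2*pi*sqrt(r0^2 + (a2/a1)^2)."""
    return 2.0 * math.pi * math.sqrt(r0 ** 2 + (a2 / a1) ** 2)


if __name__ == "__main__":
    r0, a1, a2 = 1.5, 2.0, 0.5
    one_turn = 2.0 * math.pi / a1  # parameter range for which phi advances by 2*pi
    print(helix_length_numeric(r0, a1, a2, 0.0, one_turn))
    print(helix_turn_length(r0, a1, a2))
```

For this curve the integrand is constant, so the sum is trivial; the same loop structure applies unchanged when the derivatives du/ds, dv/ds, dw/ds vary along the curve.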


In generalized coordinates, the equation of a surface may be expressed in the form

$$w = w(u,v) \tag{V.54}$$

The family of coordinate surfaces u = g1(x,y,z) will intersect this surface in a grid of
lines. The family of coordinate surfaces v = g2(x,y,z) will also intersect it in a grid of
lines. These two grids will cross each other and thus divide the surface into a mesh of
surface elements. The area of the element cut out by the surfaces u0, u0 + du, v0, and
v0 + dv can be found as follows:
In Cartesian coordinates the parametric equations of the line of intersection of
(V.54) and u0 are

$$x = G_1[u_0,v,w(u_0,v)] \qquad y = G_2[u_0,v,w(u_0,v)] \qquad z = G_3[u_0,v,w(u_0,v)] \tag{V.55}$$

so that

$$\begin{aligned}
dx &= \frac{\partial G_1}{\partial v}\,dv + \frac{\partial G_1}{\partial w}\frac{\partial w}{\partial v}\,dv \\
dy &= \frac{\partial G_2}{\partial v}\,dv + \frac{\partial G_2}{\partial w}\frac{\partial w}{\partial v}\,dv \\
dz &= \frac{\partial G_3}{\partial v}\,dv + \frac{\partial G_3}{\partial w}\frac{\partial w}{\partial v}\,dv
\end{aligned}$$

The elemental length of the line (V.55) which is contained between the surfaces v0 and
v0 + dv is thus given by the vector

$$\mathbf{V}\,dv = \left[\mathbf{1}_x\left(\frac{\partial G_1}{\partial v} + \frac{\partial G_1}{\partial w}\frac{\partial w}{\partial v}\right) + \mathbf{1}_y\left(\frac{\partial G_2}{\partial v} + \frac{\partial G_2}{\partial w}\frac{\partial w}{\partial v}\right) + \mathbf{1}_z\left(\frac{\partial G_3}{\partial v} + \frac{\partial G_3}{\partial w}\frac{\partial w}{\partial v}\right)\right] dv \tag{V.56}$$

Similarly, the elemental length of the line of intersection of (V.54) and v0 which is contained
between the surfaces u0 and u0 + du can be expressed by the vector

$$\mathbf{U}\,du = \left[\mathbf{1}_x\left(\frac{\partial G_1}{\partial u} + \frac{\partial G_1}{\partial w}\frac{\partial w}{\partial u}\right) + \mathbf{1}_y\left(\frac{\partial G_2}{\partial u} + \frac{\partial G_2}{\partial w}\frac{\partial w}{\partial u}\right) + \mathbf{1}_z\left(\frac{\partial G_3}{\partial u} + \frac{\partial G_3}{\partial w}\frac{\partial w}{\partial u}\right)\right] du \tag{V.57}$$

The area of this surface element is the magnitude of the vector

$$d\mathbf{S} = \mathbf{U} \times \mathbf{V}\,du\,dv \tag{V.58}$$

If C is any simple closed curve lying wholly in the surface (V.54), the area enclosed
by C is

$$S = \int dS = \int_{u_1}^{u_2} du \int_{v_1(u)}^{v_2(u)} |\mathbf{U} \times \mathbf{V}|\,dv \tag{V.59}$$

in which v1(u) and v2(u) are the projections of the two partial curves into which C is
divided by the extreme values u1 and u2.
A common problem is the computation of an area in one of the coordinate surfaces.
This corresponds to setting w = w0 in (V.54). In this case (V.59) reduces to

$$S = \int_{u_1}^{u_2} du \int_{v_1(u)}^{v_2(u)} h_1 h_2\,dv \tag{V.60}$$

EXAMPLE V.19
Compute the total surface area of the figure bounded by the plane z = -2, the cylinder
x² + y² = 4, and the plane z = x + 2.

In cylindrical coordinates these three surfaces can be described by the equations z = -2,
r = 2, and z = r cos φ + 2. The first and second are seen to be coordinate surfaces and
(V.60) may be used. Thus

$$S_1 = \int_0^{2\pi} d\phi \int_0^2 r\,dr = 4\pi$$

$$S_2 = \int_0^{2\pi} d\phi \int_{-2}^{2\cos\phi + 2} 2\,dz = 2\int_0^{2\pi} (2\cos\phi + 4)\,d\phi = 16\pi$$

The third surface is not a coordinate surface so the more general expression (V.59) must be
employed. Making use first of (V.56) and (V.57), one obtains

$$\mathbf{U} = \mathbf{1}_x \cos\phi + \mathbf{1}_y \sin\phi + \mathbf{1}_z \cos\phi$$

$$\mathbf{V} = -\mathbf{1}_x r\sin\phi + \mathbf{1}_y r\cos\phi - \mathbf{1}_z r\sin\phi$$

so that

$$\mathbf{U} \times \mathbf{V} = r\left[\mathbf{1}_x(-\sin^2\phi - \cos^2\phi) + \mathbf{1}_y(-\sin\phi\cos\phi + \sin\phi\cos\phi) + \mathbf{1}_z(\cos^2\phi + \sin^2\phi)\right]$$

$$|\mathbf{U} \times \mathbf{V}| = \sqrt{2}\,r$$

$$S_3 = \sqrt{2}\int_0^{2\pi} d\phi \int_0^2 r\,dr = 4\pi\sqrt{2}$$

Thus S = S1 + S2 + S3 = 25.66π.
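
The area of the tilted cap can be confirmed by applying (V.59) numerically. The sketch below assumes the cylinder radius is 2 (as the integration limits in this example imply), builds U and V from (V.57) and (V.56) for the surface z = r cos φ + 2, and sums |U × V| with a midpoint rule. Function names and grid sizes are illustrative.

```python
import math


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tilted_plane_area(radius=2.0, n_r=200, n_phi=200):
    """Midpoint-rule estimate of (V.59) on z = r cos(phi) + 2 for r <= radius."""
    dr = radius / n_r
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            c, s = math.cos(phi), math.sin(phi)
            u_vec = (c, s, c)                # (V.57) with u = r, w = z
            v_vec = (-r * s, r * c, -r * s)  # (V.56) with v = phi, w = z
            n = cross(u_vec, v_vec)
            total += math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2) * dr * dphi
    return total


if __name__ == "__main__":
    s1 = 4.0 * math.pi    # flat base, from (V.60)
    s2 = 16.0 * math.pi   # cylindrical wall, from (V.60)
    s3 = tilted_plane_area()
    print(s3 / math.pi)              # close to 4*sqrt(2)
    print((s1 + s2 + s3) / math.pi)  # close to 25.66
```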

The volume of an object may be computed if one finds the expression for the volume
of a differential element and then sums the volumes of all the elements contained within
the boundary surface of the object. In generalized coordinates the differential element
consists of the region within the surfaces u0, u0 + du, v0, v0 + dv, w0, and w0 + dw. The
edges of this element are sections of coordinate lines and are given by the vectors

$$\mathbf{U}\,du = \left[\mathbf{1}_x \frac{\partial G_1}{\partial u} + \mathbf{1}_y \frac{\partial G_2}{\partial u} + \mathbf{1}_z \frac{\partial G_3}{\partial u}\right] du$$

$$\mathbf{V}\,dv = \left[\mathbf{1}_x \frac{\partial G_1}{\partial v} + \mathbf{1}_y \frac{\partial G_2}{\partial v} + \mathbf{1}_z \frac{\partial G_3}{\partial v}\right] dv$$

$$\mathbf{W}\,dw = \left[\mathbf{1}_x \frac{\partial G_1}{\partial w} + \mathbf{1}_y \frac{\partial G_2}{\partial w} + \mathbf{1}_z \frac{\partial G_3}{\partial w}\right] dw$$

When one makes use of Example V.8, the volume of this element is

$$(\mathbf{U} \times \mathbf{V}) \cdot \mathbf{W}\,du\,dv\,dw = J\,du\,dv\,dw \tag{V.61}$$

in which J is the Jacobian, given by (V.39). Since attention is being restricted to
orthogonal coordinate systems, J = h1 h2 h3, and the volume contained within a simple
closed surface S is given by

$$V = \int_{u_1}^{u_2} du \int_{v_1(u)}^{v_2(u)} dv \int_{w_1(u,v)}^{w_2(u,v)} h_1 h_2 h_3\,dw \tag{V.62}$$

EXAMPLE V.20
Find the volume of the figure bounded by the surfaces z = -2, x² + y² = 4, and z = x + 2.
(Cf. Example V.19.)

Using (V.62), one obtains

$$V = \int_0^2 dr \int_0^{2\pi} d\phi \int_{-2}^{r\cos\phi + 2} r\,dz = 16\pi$$
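
The triple integral can be spot-checked numerically. In the sketch below the inner z integration is done analytically (it simply gives the height of the column above each point of the base), and the remaining integral over r and φ is evaluated with a midpoint rule using h1 h2 h3 = r. The cylinder radius of 2 is taken from the limits of integration; the function name and grid sizes are illustrative.

```python
import math


def wedge_volume(radius=2.0, n_r=200, n_phi=200):
    """Midpoint-rule estimate of (V.62) between z = -2 and z = r cos(phi) + 2."""
    dr = radius / n_r
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            height = (r * math.cos(phi) + 2.0) - (-2.0)  # result of the z integral
            total += r * height * dr * dphi              # h1*h2*h3 = r
    return total


if __name__ == "__main__":
    print(wedge_volume() / math.pi)  # close to 16
```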

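V.13 ADDITION, SUBTRACTION, AND MULTIPLICATION IN GENERALIZED ORTHOGONAL COORDINATES

If attention is centered on the vector operations already defined, in generalized orthogonal
coordinates a vector may be expressed in terms of its components by writing

$$\mathbf{a} = \mathbf{1}_u a_u + \mathbf{1}_v a_v + \mathbf{1}_w a_w$$

in which the unit vectors are taken in the positive directions along the coordinate lines
at the point where a is acting.
Addition or subtraction of two vectors then takes the form

$$\mathbf{a} \pm \mathbf{b} = \mathbf{1}_u (a_u \pm b_u) + \mathbf{1}_v (a_v \pm b_v) + \mathbf{1}_w (a_w \pm b_w) \tag{V.63}$$

Since (u,v,w) is an orthogonal coordinate system,

$$\mathbf{a} \cdot \mathbf{b} = a_u b_u + a_v b_v + a_w b_w \tag{V.64}$$

and

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix}
\mathbf{1}_u & \mathbf{1}_v & \mathbf{1}_w \\
a_u & a_v & a_w \\
b_u & b_v & b_w
\end{vmatrix} \tag{V.65}$$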
Thus the general expressions for all these operations have the same form as the earlier
specific expressions encountered in Cartesian coordinates. Several precautions should
be noted, however. In (V.63) the vectors must have the same dimensions; this is not a
requirement in (V.64) and (V.65). In all three formulas the vectors must be either free
or fixed and acting at the same point. If they are free, care must be observed in trans-
lating one vector from its original point to a common point. If the directions of the
unit vectors change in the process, the vector must be re-resolved into components
appropriate to the new point. The reader is referred to a discussion of this difficulty
given in Section V.5.

V.14 GRADIENT

Let Ψ(x,y,z,t) be a scalar function of space and time with continuous first derivatives.
At any specific time, by assigning a sequence of constant values to Ψ, a family of surfaces
can be described. The locus of points P(x,y,z) which satisfy Ψ(x,y,z,t0) = K0, with K0
and t0 constants, is one of these surfaces. It has been shown in Section V.10 that a
vector normal to this surface is

$$\mathbf{1}_x \frac{\partial \Psi}{\partial x} + \mathbf{1}_y \frac{\partial \Psi}{\partial y} + \mathbf{1}_z \frac{\partial \Psi}{\partial z} \tag{V.66}$$

in which it is understood that the three partial derivatives have been evaluated at the
point in question in the surface. Though not so identified at the time, this normal
vector is called the gradient of Ψ.
Because of the form of (V.66), it is convenient to introduce a vector operator by the
definition

$$\nabla = \mathbf{1}_x \frac{\partial}{\partial x} + \mathbf{1}_y \frac{\partial}{\partial y} + \mathbf{1}_z \frac{\partial}{\partial z} \tag{V.67}$$

The symbol ∇ is widely called the del operator though some writers prefer the name
"nabla." Wilson gives an engaging footnote on the origins of these appellations.¹
Operation on a scalar function Ψ with the del operator (V.67) will then mean

$$\nabla\Psi = \left(\mathbf{1}_x \frac{\partial}{\partial x} + \mathbf{1}_y \frac{\partial}{\partial y} + \mathbf{1}_z \frac{\partial}{\partial z}\right)\Psi = \mathbf{1}_x \frac{\partial \Psi}{\partial x} + \mathbf{1}_y \frac{\partial \Psi}{\partial y} + \mathbf{1}_z \frac{\partial \Psi}{\partial z} \tag{V.68}$$

and the gradient of Ψ usually will be referred to by the symbolic shorthand ∇Ψ.
In addition to being normal to a surface over which Ψ is constant, the gradient has
several other interesting properties. Suppose P is a point in the surface Ψ(x,y,z,t0) = K0
and that the gradient ∇Ψ is computed at P. Let any space curve be drawn through P
and characterized by the parametric equations

$$x = f_1(s) \qquad y = f_2(s) \qquad z = f_3(s)$$

The differential change in Ψ along this space curve will be

$$d\Psi = \frac{\partial \Psi}{\partial x}\,dx + \frac{\partial \Psi}{\partial y}\,dy + \frac{\partial \Psi}{\partial z}\,dz$$

in which dx, dy, and dz are components of the displacement

$$d\boldsymbol{\ell} = \mathbf{1}_x\,dx + \mathbf{1}_y\,dy + \mathbf{1}_z\,dz = \left[\mathbf{1}_x \frac{df_1}{ds} + \mathbf{1}_y \frac{df_2}{ds} + \mathbf{1}_z \frac{df_3}{ds}\right] ds$$

which extends from the point P to a neighboring point on the space curve. Thus

$$\frac{d\Psi}{d\ell} = \frac{\partial \Psi}{\partial x}\frac{dx}{d\ell} + \frac{\partial \Psi}{\partial y}\frac{dy}{d\ell} + \frac{\partial \Psi}{\partial z}\frac{dz}{d\ell} = \nabla\Psi \cdot \mathbf{1}_T \tag{V.69}$$

¹ Somewhat paraphrased, after saying in the main text, "It has been found by experience that the
monosyllable del is so short and easy to pronounce that even in complicated formulae in which ∇
occurs a number of times no inconvenience to the speaker or hearer arises from the repetition." Wilson
adds in a footnote: "Some use the term Nabla owing to its fancied resemblance to an Assyrian harp.
Others have noted its likeness to an inverted Δ and have consequently coined the none too euphonious
name Atled by inverting the order of the letters in the word Delta. Föppl avoids any special designation
and refers to the symbol as 'die Operation ∇.' How this is to be read is not divulged." From
J. W. Gibbs and E. B. Wilson, Vector Analysis, p. 138, C. Scribner's Sons, Inc., New York, 1901.

in which 1_T is a unit vector tangent to the space curve at P and pointing in the direction
of dℓ.
It follows from (V.69) that the maximum spatial rate of change of Ψ is along that
space curve which is normal to the surface Ψ(x,y,z,t0) = K0, for then ∇Ψ · 1_T is optimized.
Since 1_T is a unit vector, this maximum value equals |∇Ψ|. Therefore the gradient
is not only normal to the surface Ψ = K0, but it has a magnitude and direction which
denote the maximum spatial rate of change of Ψ. It is these properties of the gradient
which are responsible for its name.
The physical meaning of gradient makes the task of finding its expression in generalized
coordinates a simple one. One need only find the rates of change of Ψ with
respect to distance in three mutually perpendicular directions, multiply these three
quantities by the appropriate unit vectors, and sum.
With reference to (V.48), which is the expression for an element of length in generalized
orthogonal coordinates, h1 du, h2 dv, and h3 dw can be recognized as being
differential lengths in the directions of the three coordinate lines. Since these three
coordinate lines are mutually perpendicular,

$$\nabla\Psi = \mathbf{1}_u \frac{1}{h_1}\frac{\partial \Psi}{\partial u} + \mathbf{1}_v \frac{1}{h_2}\frac{\partial \Psi}{\partial v} + \mathbf{1}_w \frac{1}{h_3}\frac{\partial \Psi}{\partial w} \tag{V.70}$$

is the formula for the gradient of Ψ in generalized orthogonal coordinates.

Because dℓ = [(dx)² + (dy)² + (dz)²]^{1/2} in Cartesian coordinates, it follows from
(V.48) that the scale factors are h1 = h2 = h3 = 1 in that system. If this special case is
inserted in (V.70), one obtains (V.68) as expected.
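
The recipe just described (rate of change per unit distance along each coordinate line, times the unit vector, summed) can be illustrated numerically in cylindrical coordinates, where h1 = 1, h2 = r, h3 = 1. The sketch below uses an arbitrary test function Ψ = x² y + z, chosen here purely for illustration, computes the cylindrical components by central differences, and resolves them along the Cartesian axes for comparison with the Cartesian form of the gradient. All names are hypothetical.

```python
import math


def psi_cartesian(x, y, z):
    """Arbitrary test function (an assumption for this demo): x^2 * y + z."""
    return x * x * y + z


def psi_cylindrical(r, phi, z):
    """The same function expressed through r, phi, z."""
    return psi_cartesian(r * math.cos(phi), r * math.sin(phi), z)


def gradient_cylindrical(r, phi, z, step=1e-6):
    """Components along 1_r, 1_phi, 1_z: derivative divided by scale factor."""
    d_r = (psi_cylindrical(r + step, phi, z) - psi_cylindrical(r - step, phi, z)) / (2.0 * step)
    d_phi = (psi_cylindrical(r, phi + step, z) - psi_cylindrical(r, phi - step, z)) / (2.0 * step)
    d_z = (psi_cylindrical(r, phi, z + step) - psi_cylindrical(r, phi, z - step)) / (2.0 * step)
    return (d_r / 1.0, d_phi / r, d_z / 1.0)  # scale factors h1 = 1, h2 = r, h3 = 1


def to_cartesian_components(g_r, g_phi, g_z, phi):
    """Resolve 1_r and 1_phi along the X and Y axes."""
    return (g_r * math.cos(phi) - g_phi * math.sin(phi),
            g_r * math.sin(phi) + g_phi * math.cos(phi),
            g_z)


if __name__ == "__main__":
    r, phi, z = 2.0, 0.6, 1.0
    x, y = r * math.cos(phi), r * math.sin(phi)
    print(to_cartesian_components(*gradient_cylindrical(r, phi, z), phi))
    print((2.0 * x * y, x * x, 1.0))  # Cartesian gradient of x^2 * y + z
```

Omitting the division by h2 = r is the most common slip when writing a gradient in curvilinear coordinates, and a comparison of this kind exposes it immediately.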
EXAMPLE V.21

A room 6 m square and 3 m high has a temperature distribution given by

$$T(x,y,z,t) = 300\left[1 + \frac{(z - 3)(x - 6)}{100}\,e^{-t}\right]$$

in which the origin of coordinates has been taken in a lower corner of the room with the Z
axis pointing up. The temperature is in degrees Kelvin, distances are in meters, and time is
in hours.
The gradient of this temperature distribution is

$$\nabla T = 3e^{-t}\left[\mathbf{1}_x (z - 3) + \mathbf{1}_z (x - 6)\right]$$

This result may be plotted by employing a useful technique known as flux mapping.
At every point in the room there is a value of the gradient possessing magnitude and direction.
If, at every point in a figure representing the room, lines are drawn in the direction of
the gradient, with the density of these lines numerically equal to the magnitude of the gradient,
all the information about gradient can be displayed in the figure. The arrows on the
lines indicate direction, and a practiced eye soon associates closely bunched lines with a high
value, and widely spaced lines with a low value. This technique can, of course, be utilized to
plot any vector function. The plot is often called a flux map or a field representation and the
lines themselves are called flux lines or field lines. Common usage of this technique has caused
vector functions and their field representations to become interchangeable concepts, in much
the same way that the equation of a line and the graph of a line have become equivalent.

The temperature gradient at t = 0 is shown plotted in the figure, in which a two-dimensional
representation is sufficient, since there is no y dependency. The bunching of the lines
indicates that the greatest spatial rate of change of temperature occurs near the Y axis.

[Figure: flux map of the temperature gradient at t = 0 in the XZ plane, with isotherms labeled from 300 to 350 in steps of 5.]

It is also possible to plot in this figure profiles of the surfaces over which the temperature
is constant, and several of these profiles are indicated. The flux lines representing gradient
are seen to be perpendicular to these isotherms in accordance with the earlier discussion
concerning properties of the gradient.
When one displays, such as has been done here with isotherms, a regularly spaced selection
of the members of the family of surfaces over each of which a scalar function is constant, the
resulting figure is said to be a field representation of the scalar function. Where the surfaces
are closely bunched, the function is changing rapidly as the point of interest moves from one
surface to the next. Where the surfaces are widely spaced, the function is changing slowly.
Of course, along one of these surfaces, the function remains constant. Because of this ability
to represent scalar functions by such a plot, the terms scalar field and scalar function have
also grown to be used interchangeably.
Since heat flows in the direction of temperature decrease, the figure suggests the presence
of a heat source concentrated near the Y axis with the flow of heat being along the flux lines
against the direction of the arrows. The exponential time factor indicates that this heat
source is dying out, with an ultimate uniform temperature of 300 deg predicted. If the field
map were plotted for a succession of times the shape would be unaffected, but the gradient
lines would thin out everywhere and the isotherms would spread further apart.
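
The numbers behind this example are easy to reproduce. The sketch below assumes the temperature distribution reads T = 300[1 + (z - 3)(x - 6)e^{-t}/100], evaluates the corresponding gradient 3e^{-t}[1_x(z - 3) + 1_z(x - 6)], and checks it against a central-difference estimate. The function names are illustrative.

```python
import math


def temperature(x, y, z, t):
    """Assumed temperature field in kelvin (x, z in meters, t in hours)."""
    return 300.0 * (1.0 + (z - 3.0) * (x - 6.0) * math.exp(-t) / 100.0)


def temperature_gradient(x, y, z, t):
    """Analytic gradient of the assumed field; there is no y dependence."""
    decay = 3.0 * math.exp(-t)
    return (decay * (z - 3.0), 0.0, decay * (x - 6.0))


def numeric_gradient(x, y, z, t, step=1e-5):
    """Central-difference estimate of the Cartesian gradient (V.68)."""
    gx = (temperature(x + step, y, z, t) - temperature(x - step, y, z, t)) / (2.0 * step)
    gy = (temperature(x, y + step, z, t) - temperature(x, y - step, z, t)) / (2.0 * step)
    gz = (temperature(x, y, z + step, t) - temperature(x, y, z - step, t)) / (2.0 * step)
    return (gx, gy, gz)


if __name__ == "__main__":
    print(temperature(0.0, 0.0, 0.0, 0.0))           # hottest point, on the Y axis
    print(temperature_gradient(0.0, 0.0, 0.0, 0.0))  # largest gradient magnitude
    print(temperature_gradient(0.0, 0.0, 0.0, 2.0))  # same direction, smaller magnitude
```

The gradient at a later time differs from the t = 0 value only by the factor e^{-t}, which is why the shape of the flux map is unaffected while the lines thin out.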

EXAMPLE V.22
Newton's gravitational law states that the force on a mass m due to other masses m_i is

$$\mathbf{f} = -Gm \sum_{i=1}^{N} \frac{m_i}{r_i^2}\,\mathbf{1}_{r_i} \tag{V.71}$$

in which G is a universal constant, r_i is the distance between m and m_i, and 1_{r_i} is a unit
vector drawn from m_i toward m. If a scalar function Φ is defined by the expression

$$\Phi(x,y,z,t) = -Gm \sum_{i=1}^{N} \frac{m_i}{r_i} \tag{V.72}$$

in which (x,y,z) are the instantaneous positional coordinates of m, then

$$\mathbf{f} = -\nabla\Phi \tag{V.73}$$

To show this let (x_i,y_i,z_i) be the instantaneous positional coordinates of m_i and then

But

and therefore

as was to be proved.
<f>(x,y,z,l) is called the potential energy function. If the mass m is 1110ved over a surface on
which <f> is constant, since the force on 'In is everywhere perpendicular to this surface, no work
is done on the mass 'In and the energy of the system is held constant. If the mass 'In is moved
from one potential surface to another, there is a component of the force on m which is along
the path and work is done on 'In, which changes the energy of the system. The value of <I> is
the energy potentially available if the mass m is removed infinitely far from the proximity
of all the other masses. Since gravitational forces are always attractive, this energy poten-
tially available is negative. Thus additional energy has to be put into the system of masses in
order to remove ni from the influence of the other bodies.
This serves to explain why <f> was defined, in (V.72), as the negati ve of a sum of in trinsi-
cally positive terms, It also becomes physically clear why f should be the negative gradient
of the potential energy function. The force on m is toward the remainder of the mass system,
whereas the potential energy decreases as 'In approaches the other masses.
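
The result of Example V.22 can be checked numerically. The following Python sketch is an illustration only and is not part of the original development: it assumes normalized units with G = 1, and the function names, source masses, and positions are hypothetical choices. It evaluates the force from (V.71) directly and compares it with a central-difference estimate of the negative gradient of the potential (V.72).

```python
import math

G = 1.0  # assumption: normalized units, chosen only for this demonstration


def potential(p, m, sources):
    """Phi = -G m sum(m_i / r_i), as in (V.72)."""
    return -G * m * sum(mi / math.dist(p, pi) for mi, pi in sources)


def force(p, m, sources):
    """f = -G m sum(m_i / r_i**2 * 1_ri), with 1_ri drawn from m_i toward m, as in (V.71)."""
    f = [0.0, 0.0, 0.0]
    for mi, pi in sources:
        r = math.dist(p, pi)
        for k in range(3):
            # (p[k] - pi[k]) / r is the k-th component of the unit vector 1_ri
            f[k] -= G * m * mi * (p[k] - pi[k]) / r**3
    return f


def minus_grad(p, m, sources, h=1e-4):
    """Central-difference estimate of -grad(Phi) at the point p."""
    out = []
    for k in range(3):
        hi, lo = list(p), list(p)
        hi[k] += h
        lo[k] -= h
        out.append(-(potential(hi, m, sources) - potential(lo, m, sources)) / (2 * h))
    return out


if __name__ == "__main__":
    # hypothetical masses m_i and their positions (x_i, y_i, z_i)
    sources = [(2.0, (1.0, 0.0, 0.0)), (3.0, (0.0, 2.0, 0.0)), (1.5, (-1.0, -1.0, 2.0))]
    p = (0.3, 0.4, 0.5)
    print(force(p, 1.0, sources))
    print(minus_grad(p, 1.0, sources))
```

The two printed vectors should agree to within the accuracy of the finite-difference step.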

EXAMPLE V.23
What are the expressions for gradient in spherical and cylindrical coordinates?
To answer this question for spherical coordinates, one can refer to Figure V.8 and write
the transformation equations linking spherical and Cartesian coordinates either in the form

$$r = [x^2 + y^2 + z^2]^{1/2} \qquad \theta = \arctan\frac{[x^2 + y^2]^{1/2}}{z} \qquad \phi = \arctan\frac{y}{x} \tag{V.74}$$

or in the form

$$x = r\sin\theta\cos\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\theta \tag{V.75}$$

Equations (V.74) are a specific example of (V.30) whereas Equations (V.75) are in the form
of (V.31). Using (V.49)-(V.51), one finds for spherical coordinates that

$$h_1 = 1 \qquad h_2 = r \qquad h_3 = r\sin\theta \tag{V.76}$$

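Thus a differential path length in spherical coordinates is

$$d\boldsymbol{\ell} = \mathbf{1}_r\,dr + \mathbf{1}_\theta\,r\,d\theta + \mathbf{1}_\phi\,r\sin\theta\,d\phi \tag{V.77}$$

With reference once again to Figure V.8, this result is consistent with the interpretation of
displacements from the point $P$, in the directions of the three unit vectors, caused by the
increments $dr$, $d\theta$, and $d\phi$.
When (V.70) is used, the expression for gradient in spherical coordinates is seen to be

$$\nabla\Phi = \mathbf{1}_r\frac{\partial\Phi}{\partial r} + \mathbf{1}_\theta\frac{1}{r}\frac{\partial\Phi}{\partial\theta} + \mathbf{1}_\phi\frac{1}{r\sin\theta}\frac{\partial\Phi}{\partial\phi} \tag{V.78}$$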

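To answer the question for cylindrical coordinates is a simple matter, since the scale factors
were found in Example V.17. Once again using the general form (V.70), one can determine
that the expression for gradient in cylindrical coordinates is

$$\nabla\Phi = \mathbf{1}_r\frac{\partial\Phi}{\partial r} + \mathbf{1}_\phi\frac{1}{r}\frac{\partial\Phi}{\partial\phi} + \mathbf{1}_z\frac{\partial\Phi}{\partial z} \tag{V.79}$$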
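
As a check on the spherical gradient of Example V.23, the Python sketch below applies the form (V.78) to a test function and compares the result with the Cartesian gradient. It is an illustration under stated assumptions, not part of the original text: the test function xy + z^2, the evaluation point, and all function names are hypothetical, and the derivatives are approximated by central differences.

```python
import math


def phi_cart(x, y, z):
    """Hypothetical test scalar field; its Cartesian gradient is (y, x, 2z)."""
    return x * y + z * z


def phi_sph(r, th, ph):
    """The same field expressed through the transformation (V.75)."""
    x = r * math.sin(th) * math.cos(ph)
    y = r * math.sin(th) * math.sin(ph)
    z = r * math.cos(th)
    return phi_cart(x, y, z)


def grad_spherical(f, r, th, ph, h=1e-5):
    """Components (g_r, g_theta, g_phi) of the gradient from (V.78)."""
    d_r = (f(r + h, th, ph) - f(r - h, th, ph)) / (2 * h)
    d_th = (f(r, th + h, ph) - f(r, th - h, ph)) / (2 * h)
    d_ph = (f(r, th, ph + h) - f(r, th, ph - h)) / (2 * h)
    return d_r, d_th / r, d_ph / (r * math.sin(th))


def to_cartesian(g, th, ph):
    """Resolve spherical components along the Cartesian unit vectors."""
    g_r, g_t, g_p = g
    st, ct, sp, cp = math.sin(th), math.cos(th), math.sin(ph), math.cos(ph)
    gx = g_r * st * cp + g_t * ct * cp - g_p * sp
    gy = g_r * st * sp + g_t * ct * sp + g_p * cp
    gz = g_r * ct - g_t * st
    return gx, gy, gz


if __name__ == "__main__":
    r, th, ph = 2.0, 0.7, 1.1
    print(to_cartesian(grad_spherical(phi_sph, r, th, ph), th, ph))
```

For the assumed test function the printed components should match (y, x, 2z) evaluated at the same point.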

V.15 DIVERGENCE

The discussion in Example V.21 established the idea that any vector function may be
represented by a field of lines whose density and direction at every point give the mag-
nitude and direction of the vector at that point. If one visualizes a general vector
function $\mathbf{A}(x,y,z,t)$ in this manner, and if a differential element of area $d\mathbf{S}$ is erected at a
point $P(x,y,z)$, then $\mathbf{A}\cdot d\mathbf{S}$ is instantaneously the number of field lines piercing $d\mathbf{S}$.
When time is imagined to be "stopped," if the element of area $d\mathbf{S}$ is oriented in a
succession of directions, $\mathbf{A}\cdot d\mathbf{S}$ will vary, and will be a maximum when $d\mathbf{S}$ is transverse
to the field or, in other words, when $d\mathbf{S}$ is in the same direction as $\mathbf{A}$. For any orientation
of $d\mathbf{S}$, the component of $\mathbf{A}$ in the direction of $d\mathbf{S}$ will be $\mathbf{A}\cdot d\mathbf{S}/dS$, which follows readily
from the definition of the dot product.
In physical problems it often occurs that the field lines which represent a vector
function are discontinuous, which implies that the number of lines which enters a
volume element is different from the number of lines which emerges. This is an important
effect which is measured by the divergence of the vector function.
To put this concept into more specific terms, let $\mathbf{A}(x,y,z,t)$ be any vector field and let
$\Delta V = \Delta x\,\Delta y\,\Delta z$ be a volume element which has as one of its corners the point $P(x,y,z)$.
This situation is depicted in Figure V.15. If the net efflux from $\Delta V$ is taken to mean the
excess of emerging lines over entering lines (the net efflux may be negative) then

$$\oint_S \mathbf{A}\cdot d\mathbf{S} \tag{V.80}$$

is numerically equal to this net efflux. $S$ is the six-sided surface surrounding $\Delta V$, and
$d\mathbf{S}$ must everywhere be chosen to have the direction of the outward-drawn normal† if
(V.80) is to be interpreted as efflux rather than influx.

† The convention of the outward-drawn normal will be adhered to in this text.

FIGURE V.15 Net efflux from a volume element.

Application of the mean value theorem gives
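$$\begin{aligned}
\oint_S \mathbf{A}\cdot d\mathbf{S} ={}& A_x(x+\Delta x,\ y+k_1\Delta y,\ z+k_2\Delta z,\ t)\,\Delta y\,\Delta z \\
&- A_x(x,\ y+k_3\Delta y,\ z+k_4\Delta z,\ t)\,\Delta y\,\Delta z \\
&+ A_y(x+k_5\Delta x,\ y+\Delta y,\ z+k_6\Delta z,\ t)\,\Delta x\,\Delta z \\
&- A_y(x+k_7\Delta x,\ y,\ z+k_8\Delta z,\ t)\,\Delta x\,\Delta z \\
&+ A_z(x+k_9\Delta x,\ y+k_{10}\Delta y,\ z+\Delta z,\ t)\,\Delta x\,\Delta y \\
&- A_z(x+k_{11}\Delta x,\ y+k_{12}\Delta y,\ z,\ t)\,\Delta x\,\Delta y
\end{aligned}$$

in which $0 \le k_i \le 1$. If these terms are expanded in a Taylor's series about the point
$P(x,y,z)$ (cf. Part I of this Supplement), there results

$$\begin{aligned}
\oint_S \mathbf{A}\cdot d\mathbf{S} ={}& \left(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right)\Delta x\,\Delta y\,\Delta z \\
&+ \frac{\partial A_x}{\partial y}(k_1 - k_3)\,\Delta y^2\,\Delta z + \frac{\partial A_y}{\partial x}(k_5 - k_7)\,\Delta x^2\,\Delta z \\
&+ \frac{\partial A_z}{\partial x}(k_9 - k_{11})\,\Delta x^2\,\Delta y + \frac{\partial A_x}{\partial z}(k_2 - k_4)\,\Delta y\,\Delta z^2 \\
&+ \frac{\partial A_y}{\partial z}(k_6 - k_8)\,\Delta x\,\Delta z^2 + \frac{\partial A_z}{\partial y}(k_{10} - k_{12})\,\Delta x\,\Delta y^2 \\
&+ \cdots
\end{aligned}$$

in which the undesignated terms are all higher order.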

The divergence of $\mathbf{A}$, written div $\mathbf{A}$, is defined as the limiting value of net efflux per
unit volume at $P$, that is

$$\operatorname{div}\mathbf{A} = \lim_{\Delta V\to 0}\frac{\oint_S \mathbf{A}\cdot d\mathbf{S}}{\Delta V} \tag{V.81}$$

Since $k_1 - k_3 \to 0$, etc., as $\Delta V \to 0$, it follows that

$$\operatorname{div}\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \tag{V.82}$$

Equation (V.82) is the expression for divergence in Cartesian coordinates. Since the
right side of (V.82) is precisely $\nabla\cdot\mathbf{A}$, in which $\nabla$ is the del operator for Cartesian
coordinates defined in (V.67), the divergence is usually symbolized in this way.
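
The limiting process in (V.81) can be imitated numerically. The Python sketch below is an illustration only, not part of the original text: the test field, the corner point, and the function names are hypothetical. It sums the outward flux over the six faces of a small cube having P as one corner, divides by the volume, and compares the quotient with the Cartesian expression (V.82) as the cube shrinks.

```python
def A(x, y, z):
    """Hypothetical test field (A_x, A_y, A_z)."""
    return (x * x * y, y * z, x * z * z)


def div_exact(x, y, z):
    """Divergence of the test field from (V.82)."""
    return 2 * x * y + z + 2 * x * z


def efflux_per_volume(field, p, d, n=8):
    """Net efflux from a cube of edge d with corner p, divided by its volume.

    Each face integral is approximated by the midpoint rule on an n-by-n grid.
    """
    x0, y0, z0 = p
    s = d / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            a = (i + 0.5) * s
            b = (j + 0.5) * s
            # pair of faces normal to X: outward normals are +1x and -1x
            total += field(x0 + d, y0 + a, z0 + b)[0] - field(x0, y0 + a, z0 + b)[0]
            # pair of faces normal to Y
            total += field(x0 + a, y0 + d, z0 + b)[1] - field(x0 + a, y0, z0 + b)[1]
            # pair of faces normal to Z
            total += field(x0 + a, y0 + b, z0 + d)[2] - field(x0 + a, y0 + b, z0)[2]
    return total * s * s / d**3


if __name__ == "__main__":
    p = (1.0, 2.0, 3.0)
    for d in (1e-1, 1e-2, 1e-3):
        print(d, efflux_per_volume(A, p, d), div_exact(*p))
```

The quotient approaches the value given by (V.82) at P as the edge length is reduced, with a discrepancy that shrinks in proportion to the edge.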
The procedure just followed may be repeated to obtain the expression for divergence
in generalized orthogonal coordinates. Let the volume element be contained within the
surfaces $u$, $u + \Delta u$, $v$, $v + \Delta v$, $w$, and $w + \Delta w$. Application of the mean value theorem
to each of the six sides of this volume element gives

$$\begin{aligned}
\oint_S \mathbf{A}\cdot d\mathbf{S} ={}& B_u(u+\Delta u,\ v+k_1\Delta v,\ w+k_2\Delta w,\ t)\,\Delta v\,\Delta w \\
&- B_u(u,\ v+k_3\Delta v,\ w+k_4\Delta w,\ t)\,\Delta v\,\Delta w \\
&+ B_v(u+k_5\Delta u,\ v+\Delta v,\ w+k_6\Delta w,\ t)\,\Delta u\,\Delta w \\
&- B_v(u+k_7\Delta u,\ v,\ w+k_8\Delta w,\ t)\,\Delta u\,\Delta w \\
&+ B_w(u+k_9\Delta u,\ v+k_{10}\Delta v,\ w+\Delta w,\ t)\,\Delta u\,\Delta v \\
&- B_w(u+k_{11}\Delta u,\ v+k_{12}\Delta v,\ w,\ t)\,\Delta u\,\Delta v
\end{aligned}$$

in which $B_u = h_2h_3A_u$, $B_v = h_1h_3A_v$, $B_w = h_1h_2A_w$, and $0 \le k_i \le 1$. If each term of this
result is expanded in a Taylor's series about the point $P(u,v,w)$, and if the limit is taken,
after division by $\Delta V = h_1h_2h_3\,\Delta u\,\Delta v\,\Delta w$, one obtains

$$\operatorname{div}\mathbf{A} = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}(h_2h_3A_u) + \frac{\partial}{\partial v}(h_1h_3A_v) + \frac{\partial}{\partial w}(h_1h_2A_w)\right] \tag{V.83}$$

which is the expression for divergence in generalized orthogonal coordinates.
When one recalls that for Cartesian coordinates $h_1 = h_2 = h_3 = 1$, substitution of
this special case in (V.83) yields the expected result (V.82).
EXAMPLE V.24
Find the expressions for divergence in cylindrical and spherical coordinates.
Since the scale factors for each of these systems already have been determined (see
Examples V.17 and V.23), this is a simple matter of substitution in Equation (V.83). For
cylindrical coordinates one obtains

$$\operatorname{div}\mathbf{A} = \frac{1}{r}\frac{\partial}{\partial r}(rA_r) + \frac{1}{r}\frac{\partial A_\phi}{\partial\phi} + \frac{\partial A_z}{\partial z} \tag{V.84}$$

and for spherical coordinates the result is

$$\operatorname{div}\mathbf{A} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2A_r) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,A_\theta) + \frac{1}{r\sin\theta}\frac{\partial A_\phi}{\partial\phi} \tag{V.85}$$
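
The spherical form (V.85) can be compared with the Cartesian form (V.82) for a particular field. The Python sketch below is an illustration only, not part of the original text: the test field, the evaluation point, and the function names are hypothetical. It resolves a Cartesian field into spherical components, applies (V.85) with central differences, and compares the result with the Cartesian divergence.

```python
import math


def field_cart(x, y, z):
    """Hypothetical test field; its Cartesian divergence is 2xy + z + 2xz."""
    return (x * x * y, y * z, x * z * z)


def spherical_components(r, th, ph):
    """(A_r, A_theta, A_phi) of the test field at the point (r, theta, phi)."""
    st, ct, sp, cp = math.sin(th), math.cos(th), math.sin(ph), math.cos(ph)
    ax, ay, az = field_cart(r * st * cp, r * st * sp, r * ct)
    a_r = ax * st * cp + ay * st * sp + az * ct
    a_th = ax * ct * cp + ay * ct * sp - az * st
    a_ph = -ax * sp + ay * cp
    return a_r, a_th, a_ph


def div_spherical(r, th, ph, h=1e-5):
    """Divergence from (V.85), with derivatives taken by central differences."""
    comp = spherical_components
    d_r = ((r + h) ** 2 * comp(r + h, th, ph)[0]
           - (r - h) ** 2 * comp(r - h, th, ph)[0]) / (2 * h)
    d_th = (math.sin(th + h) * comp(r, th + h, ph)[1]
            - math.sin(th - h) * comp(r, th - h, ph)[1]) / (2 * h)
    d_ph = (comp(r, th, ph + h)[2] - comp(r, th, ph - h)[2]) / (2 * h)
    return d_r / r**2 + d_th / (r * math.sin(th)) + d_ph / (r * math.sin(th))


if __name__ == "__main__":
    r, th, ph = 1.5, 0.8, 0.4
    x = r * math.sin(th) * math.cos(ph)
    y = r * math.sin(th) * math.sin(ph)
    z = r * math.cos(th)
    print(div_spherical(r, th, ph), 2 * x * y + z + 2 * x * z)
```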

Comparing Equations (V.79) and (V.84), one can say that in cylindrical coordinates
the gradient operator is

$$\mathbf{1}_r\frac{\partial}{\partial r} + \mathbf{1}_\phi\frac{1}{r}\frac{\partial}{\partial\phi} + \mathbf{1}_z\frac{\partial}{\partial z}$$

whereas the divergence operator is

$$\frac{1}{r}\frac{\partial}{\partial r}(r\,\mathbf{1}_r\,\cdot\ ) + \frac{1}{r}\frac{\partial}{\partial\phi}(\mathbf{1}_\phi\,\cdot\ ) + \frac{\partial}{\partial z}(\mathbf{1}_z\,\cdot\ )$$

Thus these operators are not the same. Similarly, if one were to compare (V.78) and
(V.85), the observation could be made that the gradient and divergence operators in
spherical coordinates are different. Indeed, only in Cartesian coordinates, and only
because in that system $h_1 = h_2 = h_3 = 1$, do the gradient and divergence operators
turn out to be identical. Yet it was this identity which caused the suggestion, when
Equation (V.82) was reached, to symbolize div $\mathbf{A}$ by $\nabla\cdot\mathbf{A}$.
The use of $\nabla\cdot\mathbf{A}$ as the symbol for the divergence of $\mathbf{A}$ is widespread, regardless of
the coordinate system being used, and this will cause no difficulty if one remembers
that $\nabla$ is a different operator for divergence than it is for gradient in every system
except Cartesian coordinates. In Section V.18 it will be found that the del operator for
curl takes still a third form.

V.16 THE LAPLACIAN OPERATOR


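The gradient of a scalar function is, in general, a vector function. As such, it may be
represented by field lines (cf. Example V.21) and these lines may originate or terminate
in volume elements, which gives meaning to the concept of the divergence of the
gradient of a scalar function. If $\Psi(x,y,z,t)$ is a general scalar function, then $\nabla\cdot\nabla\Psi$ will
symbolize the divergence of its gradient. The combination of expressions (V.70) and
(V.83) yields

$$\nabla\cdot\nabla\Psi = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}\left(\frac{h_2h_3}{h_1}\frac{\partial\Psi}{\partial u}\right) + \frac{\partial}{\partial v}\left(\frac{h_1h_3}{h_2}\frac{\partial\Psi}{\partial v}\right) + \frac{\partial}{\partial w}\left(\frac{h_1h_2}{h_3}\frac{\partial\Psi}{\partial w}\right)\right] \tag{V.86}$$

as the form for this operation in generalized orthogonal coordinates. In the case of
Cartesian coordinates this reduces to

$$\nabla\cdot\nabla\Psi = \left[\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right]\Psi \tag{V.87}$$

Since, by analogy with Equation (V.7),

$$\nabla\cdot\nabla = \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{V.88}$$

it is customary to write (V.87) as $\nabla^2\Psi$. The scalar operator $\nabla^2$, whose Cartesian form is
defined by (V.88), is called the Laplacian operator. From (V.86) the form the Laplacian
takes in generalized orthogonal coordinates is

$$\nabla^2 = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}\left(\frac{h_2h_3}{h_1}\frac{\partial}{\partial u}\right) + \frac{\partial}{\partial v}\left(\frac{h_1h_3}{h_2}\frac{\partial}{\partial v}\right) + \frac{\partial}{\partial w}\left(\frac{h_1h_2}{h_3}\frac{\partial}{\partial w}\right)\right] \tag{V.89}$$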

The expressions for the Laplacian in cylindrical and spherical coordinates are listed at
the end of this supplement.
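
A numerical check of the generalized form (V.89) is sketched below in Python. It is an illustration only, not part of the original text: the test function, the finite-difference stencil, and all names are assumptions. The scale factors of spherical coordinates are supplied as functions, and the result is compared with the Cartesian Laplacian of the same scalar field.

```python
import math


def psi_cart(x, y, z):
    """Hypothetical test function; its Cartesian Laplacian is 2y + 6z."""
    return x * x * y + z**3


def psi_sph(r, th, ph):
    """The same function in spherical coordinates."""
    return psi_cart(r * math.sin(th) * math.cos(ph),
                    r * math.sin(th) * math.sin(ph),
                    r * math.cos(th))


def laplacian_orthogonal(psi, scale, q, d=1e-3):
    """Laplacian from (V.89) for scale factors scale = (h1, h2, h3), at q = (u, v, w)."""
    h1, h2, h3 = scale

    def coef(k, p):
        a, b, c = h1(*p), h2(*p), h3(*p)
        return (b * c / a, a * c / b, a * b / c)[k]  # h2h3/h1, h1h3/h2, h1h2/h3

    total = 0.0
    for k in range(3):
        def shift(s, k=k):
            p = list(q)
            p[k] += s
            return p
        # difference of the bracketed fluxes at the half-steps on either side of q
        fwd = coef(k, shift(d / 2)) * (psi(*shift(d)) - psi(*q))
        bwd = coef(k, shift(-d / 2)) * (psi(*q) - psi(*shift(-d)))
        total += (fwd - bwd) / d**2
    return total / (h1(*q) * h2(*q) * h3(*q))


# scale factors for spherical coordinates: h1 = 1, h2 = r, h3 = r sin(theta)
SPHERICAL = (lambda r, th, ph: 1.0,
             lambda r, th, ph: r,
             lambda r, th, ph: r * math.sin(th))

if __name__ == "__main__":
    q = (1.5, 0.8, 0.4)
    y = q[0] * math.sin(q[1]) * math.sin(q[2])
    z = q[0] * math.cos(q[1])
    print(laplacian_orthogonal(psi_sph, SPHERICAL, q), 2 * y + 6 * z)
```

Passing three constant functions equal to unity as the scale factors reduces the same routine to the Cartesian form (V.88).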
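The Laplacian operator may also be applied to a vector, in which case one may write

$$\nabla^2\mathbf{A} = \frac{1}{h_1h_2h_3}\left(\frac{\partial}{\partial u}\frac{h_2h_3}{h_1}\frac{\partial}{\partial u} + \frac{\partial}{\partial v}\frac{h_1h_3}{h_2}\frac{\partial}{\partial v} + \frac{\partial}{\partial w}\frac{h_1h_2}{h_3}\frac{\partial}{\partial w}\right)(\mathbf{1}_uA_u + \mathbf{1}_vA_v + \mathbf{1}_wA_w) \tag{V.90}$$

The unit vectors $\mathbf{1}_u$, $\mathbf{1}_v$, and $\mathbf{1}_w$ are in general functions of $u$, $v$, and $w$ because their
directions may change from point to point even though their magnitudes remain unity.
Care must therefore be exercised in computing the derivatives.
In Cartesian coordinates the unit vectors are not functions of position, so (V.90)
reduces readily to

$$\nabla^2\mathbf{A} = \mathbf{1}_x\nabla^2A_x + \mathbf{1}_y\nabla^2A_y + \mathbf{1}_z\nabla^2A_z \tag{V.91}$$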
The Laplacian of a vector is given for cylindrical and spherical coordinates in Problems
V.26 and V.27 at the end of this Supplement.
The Laplacian of vector and scalar functions arises in the study of differential equa-
tions which stem from a wide variety of physical phenomena, chiefly those concerned
with wave motion.

V.17 THE DIVERGENCE THEOREM

By virtue of the definition of divergence, contained in Equation (V.81), if $\mathbf{A}$ is a vector
field, its net efflux from a volume element $dV$ is $\nabla\cdot\mathbf{A}\,dV$. This result may be integrated.
Let $S$ be a closed surface (not necessarily simply connected) which bounds a volume $V$.
The net efflux from $V$ is $\int_V \nabla\cdot\mathbf{A}\,dV$, being the number of field lines leaving $V$ minus
the number of field lines entering $V$. But this result can also be computed from $\oint_S \mathbf{A}\cdot d\mathbf{S}$
in which $d\mathbf{S}$ is an element of surface area in $S$ to which has been affixed an outward-drawn
unit normal vector. Thus

$$\int_V \nabla\cdot\mathbf{A}\,dV = \oint_S \mathbf{A}\cdot d\mathbf{S} \tag{V.92}$$

Equation (V.92) is known as the divergence theorem. It is valid when A has continuous
first derivatives throughout V and over S, and is applicable only when S is a closed
surface.
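
The divergence theorem (V.92) may be verified numerically for a simple region. The Python sketch below is an illustration only, not part of the original text: the unit cube, the test field, and the function names are hypothetical choices, and both integrals are approximated by the midpoint rule.

```python
import math


def A(x, y, z):
    """Hypothetical test field with continuous first derivatives."""
    return (y * math.sin(x), y * y * z, x * math.exp(z))


def div_A(x, y, z):
    """Divergence of the test field from (V.82)."""
    return y * math.cos(x) + 2 * y * z + x * math.exp(z)


def volume_integral(n=24):
    """Integral of div A over the unit cube 0 <= x, y, z <= 1."""
    s = 1.0 / n
    pts = [(i + 0.5) * s for i in range(n)]
    return sum(div_A(x, y, z) for x in pts for y in pts for z in pts) * s**3


def surface_integral(n=24):
    """Integral of A . dS over the six faces, with outward-drawn normals."""
    s = 1.0 / n
    pts = [(i + 0.5) * s for i in range(n)]
    total = 0.0
    for a in pts:
        for b in pts:
            total += A(1.0, a, b)[0] - A(0.0, a, b)[0]  # faces x = 1 and x = 0
            total += A(a, 1.0, b)[1] - A(a, 0.0, b)[1]  # faces y = 1 and y = 0
            total += A(a, b, 1.0)[2] - A(a, b, 0.0)[2]  # faces z = 1 and z = 0
    return total * s * s


if __name__ == "__main__":
    print(volume_integral(), surface_integral())
```

For this assumed field both integrals can also be evaluated in closed form as (sin 1)/2 + 1/2 + (e - 1)/2, which the two printed values should approximate.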

V.18 CURL

The last major vector operation to be introduced is curl. Its physical significance can
be anticipated by first discussing an example.
Suppose a small spoked wheel, free to turn on its shaft, is immersed in a stream of
water, as suggested by Figure V.16. If the spokes and hub are extremely thin, the
action of the water will be entirely on the rim. If the water is coursing past the rim
more rapidly on one side than on the other, the wheel will rotate. If the center of the
wheel is kept at the same point $P$ while the shaft is oriented in a succession of different
directions, it will be found in general that the wheel will rotate at a succession of different
speeds. For some orientation of the shaft, this rotational effect will be a maximum.
What is being measured is the tendency of the water to "curl" in the neighborhood of
$P$. If the wheel is made smaller and smaller it becomes less of a contaminating influence
on the effect under observation, and in the limit one would be measuring the curl of
the water right at the point $P$. This curl is a vector effect, since it involves the maximum
rotational speed of the wheel and also the orientation of the shaft when this maximum
speed occurs.

FIGURE V.16 Rotation of submerged wheel.

How can this effect be expressed mathematically? If $\mathbf{A}(x,y,z,t)$ describes the flow of
water, then the action at the rim will be proportional to $\oint_C \mathbf{A}\cdot d\boldsymbol{\ell}$ in which $d\boldsymbol{\ell}$ is an
increment of length along the rim and $C$ is the locus of points $(x,y,z)$ occupied by the rim at
time $t$. (The small circle attached to the integral sign indicates that a complete closed
contour is being taken and the integral $\oint_C \mathbf{A}\cdot d\boldsymbol{\ell}$ is often called the circulation.) In what
is to follow it will be seen that in the limit, as the size of the wheel shrinks to an
infinitesimal, $\oint_C \mathbf{A}\cdot d\boldsymbol{\ell}$ is a second-order differential, and thus the curl of the water at point
$P$ is appropriately measured by

$$\lim_{\Delta S\to 0}\frac{\oint_C \mathbf{A}\cdot d\boldsymbol{\ell}}{\Delta S} \tag{V.93}$$

in which $\Delta S$ is the area of the wheel. The value for curl obtained from (V.93) will
depend on the orientation of the wheel, and thus it is more proper to say that (V.93)
measures the curl of the water around an axis perpendicular to $\Delta S$.
Of course, the concept of curl has wider applicability than the example just cited.
Any vector field can exhibit curl if it yields a value upon insertion in (V.93). Thus the
curl of a general vector function $\mathbf{A}(x,y,z,t)$ at a point $P(\xi,\eta,\zeta)$ is defined by the following
process: Construct a unit vector $\mathbf{1}_n$ through $P$ in an arbitrary direction. Choose a
smooth surface $S$ through $P$ such that $\mathbf{1}_n$ is normal to $S$ at $P$. In $S$ construct any simple
closed path $C$ around $P$. The curl of $\mathbf{A}$ at $P$, around an axis in the direction of $\mathbf{1}_n$, is then
defined by the relation

$$(\operatorname{curl}\mathbf{A})_{\mathbf{1}_n} = \lim_{\Delta S\to 0}\frac{\oint_C \mathbf{A}\cdot d\boldsymbol{\ell}}{\Delta S} \tag{V.94}$$

$\Delta S$ is the area on $S$ enclosed by $C$ and the direction of integration along $C$ is given by
the right-hand rule.†
The limit in (V.94) must exist and be independent of the shapes of $S$ and $C$ if $(\operatorname{curl}\mathbf{A})_{\mathbf{1}_n}$
is to be defined uniquely. It can be shown that these conditions are met if $\mathbf{A}$ has
continuous first derivatives in the neighborhood of $P$.

FIGURE V.17 Circular contour and its projection.

To obtain $(\operatorname{curl}\mathbf{A})_{\mathbf{1}_n}$ in Cartesian coordinates, one may select as $S$ the plane through
$P(\xi,\eta,\zeta)$ perpendicular to $\mathbf{1}_n = \mathbf{1}_xn_x + \mathbf{1}_yn_y + \mathbf{1}_zn_z$. For $C$ a circle of radius $r$ can be
chosen with its center at $P$. The parametric equations of $C$ are

$$\begin{aligned}
x - \xi &= \frac{r}{(n_x^2 + n_y^2)^{1/2}}\,(n_xn_z\cos\psi - n_y\sin\psi) \\
y - \eta &= \frac{r}{(n_x^2 + n_y^2)^{1/2}}\,(n_yn_z\cos\psi + n_x\sin\psi) \\
z - \zeta &= -(n_x^2 + n_y^2)^{1/2}\,r\cos\psi
\end{aligned} \tag{V.95}$$

These equations can be established by recognizing that the projection of $C$ on the XY
plane (see Figure V.17) is an ellipse whose minor axis is in the direction of $\mathbf{1}_xn_x + \mathbf{1}_yn_y$
and whose center is at $(\xi,\eta,0)$. The parametric equations of this ellipse are

$$x' = rn_z\cos\psi \qquad y' = r\sin\psi$$

and a simple rotation of axes gives the parametric equations in terms of $x$ and $y$. Since
the equation of the plane $S$ is $n_x(x - \xi) + n_y(y - \eta) + n_z(z - \zeta) = 0$, the parametric
equation for $z$ is found by substitution.

† If the right thumb is placed in the direction of $\mathbf{1}_n$, the remaining fingers indicate the proper direction.
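From (V.95) an increment along the path $C$ can be determined to be

$$d\boldsymbol{\ell} = \frac{r\,d\psi}{(n_x^2 + n_y^2)^{1/2}}\left\{\mathbf{1}_x(-n_xn_z\sin\psi - n_y\cos\psi) + \mathbf{1}_y(-n_yn_z\sin\psi + n_x\cos\psi) + \mathbf{1}_z[(n_x^2 + n_y^2)\sin\psi]\right\} \tag{V.96}$$

Then, since

$$\oint_C \mathbf{A}\cdot d\boldsymbol{\ell} = \oint_C A_x(x,y,z,t)\,d\ell_x + \oint_C A_y(x,y,z,t)\,d\ell_y + \oint_C A_z(x,y,z,t)\,d\ell_z \tag{V.97}$$

if $A_x$ is expanded in a Taylor's series, namely,

$$A_x(x,y,z,t) = A_x(\xi,\eta,\zeta,t) + (x - \xi)\frac{\partial A_x}{\partial x} + (y - \eta)\frac{\partial A_x}{\partial y} + (z - \zeta)\frac{\partial A_x}{\partial z} + \cdots$$

and if similar expansions are obtained for $A_y$ and $A_z$, substitution in (V.97) gives

$$\oint_C \mathbf{A}\cdot d\boldsymbol{\ell} = \pi r^2\left[n_x\left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) + n_y\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) + n_z\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\right] + \cdots$$

When one divides by the area $\Delta S = \pi r^2$ and takes the limit, all the higher order terms
vanish, and thus

$$(\operatorname{curl}\mathbf{A})_{\mathbf{1}_n} = \mathbf{1}_n\cdot\left[\mathbf{1}_x\left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) + \mathbf{1}_y\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) + \mathbf{1}_z\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\right] \tag{V.98}$$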

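The direction and magnitude of the maximum curl are therefore given by the vector
which is dotted with $\mathbf{1}_n$ in (V.98). The curl about an axis parallel to the X axis can be
found by setting $n_y = n_z = 0$ and $n_x = 1$ in (V.98), etc. It is therefore completely
appropriate to treat curl $\mathbf{A}$ as a vector and write

$$\operatorname{curl}\mathbf{A} = \mathbf{1}_x\left(\frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}\right) + \mathbf{1}_y\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) + \mathbf{1}_z\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) \tag{V.99}$$

as the expression for the curl of a vector function in Cartesian coordinates. Since (V.99)
can be put in the alternative form

$$\operatorname{curl}\mathbf{A} = \begin{vmatrix} \mathbf{1}_x & \mathbf{1}_y & \mathbf{1}_z \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ A_x & A_y & A_z \end{vmatrix} \tag{V.100}$$

comparison with (V.14) suggests the notation curl $\mathbf{A} = \nabla\times\mathbf{A}$, in which $\nabla$ is the del
operator defined by (V.67).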
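
The definition (V.94) and the Cartesian result (V.98) can be compared numerically. The Python sketch below is an illustration only, not part of the original text: the test field, the point P, the direction of the unit normal, and the function names are hypothetical. It traverses the circle given parametrically by (V.95), accumulates the circulation using the increment (V.96), divides by the area of the circle, and compares the quotient with the component of (V.99) along the normal. The parametrization requires that n_x and n_y not both vanish.

```python
import math


def A(x, y, z):
    """Hypothetical test field."""
    return (y * z, x * x, x * y * z)


def curl_A(x, y, z):
    """Curl of the test field from (V.99)."""
    return (x * z, y - y * z, 2 * x - z)


def circulation_per_area(field, center, n, r, steps=720):
    """Circulation around the circle (V.95) of radius r, divided by pi r^2."""
    nx, ny, nz = n
    s = math.hypot(nx, ny)  # (nx^2 + ny^2)^(1/2); must be nonzero
    xi, eta, zeta = center
    dpsi = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        psi = (k + 0.5) * dpsi
        c, sn = math.cos(psi), math.sin(psi)
        # point on the contour C, from (V.95)
        x = xi + r / s * (nx * nz * c - ny * sn)
        y = eta + r / s * (ny * nz * c + nx * sn)
        z = zeta - s * r * c
        # increment of path length along C, from (V.96)
        f = r * dpsi / s
        dl = (f * (-nx * nz * sn - ny * c),
              f * (-ny * nz * sn + nx * c),
              f * (s * s * sn))
        total += sum(a * d for a, d in zip(field(x, y, z), dl))
    return total / (math.pi * r * r)


if __name__ == "__main__":
    n = (1 / 3, 2 / 3, 2 / 3)  # unit normal 1_n
    p = (1.0, 2.0, 3.0)
    print(circulation_per_area(A, p, n, 1e-2))
    print(sum(ni * ci for ni, ci in zip(n, curl_A(*p))))
```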

EXAMPLE V.25
Let the flow of water in a river be represented by the function $\mathbf{v} = \mathbf{1}_xKz^2$ with $K$ a constant.
Then the curl of the water is

$$\nabla\times\mathbf{v} = \mathbf{1}_y\frac{\partial v_x}{\partial z} = 2Kz\,\mathbf{1}_y$$

which indicates that a waterwheel with its axis transverse to the stream will turn, the
rotation being greatest near the top of the stream.

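Since the $x$, $y$, and $z$ components of (V.99) are precisely the values for curl which
would be found by orienting $\mathbf{1}_n$ parallel to the three coordinate axes in turn, this fact
suggests a simple way to find the expression for $\nabla\times\mathbf{A}$ in generalized coordinates.
Suppose that it is desired to find the curl of a general vector function $\mathbf{A}(u,v,w,t)$ at the
point $P(u,v,w)$. $S$ can be taken as the $u$ surface through $P$. The surfaces $v \pm \Delta v/2$ and
$w \pm \Delta w/2$ intersect $S$ in such a way as to form a "rectangle" around $P$. If this
"rectangle" is taken as the contour $C$, application of the law of the mean gives

$$\begin{aligned}
\oint_C \mathbf{A}\cdot d\boldsymbol{\ell} ={}& B_w\left(u,\ v + \frac{\Delta v}{2},\ w + k_1\Delta w\right)\Delta w - B_v\left(u,\ v + k_2\Delta v,\ w + \frac{\Delta w}{2}\right)\Delta v \\
&- B_w\left(u,\ v - \frac{\Delta v}{2},\ w + k_3\Delta w\right)\Delta w + B_v\left(u,\ v + k_4\Delta v,\ w - \frac{\Delta w}{2}\right)\Delta v
\end{aligned}$$

in which $B_v = h_2A_v$ and $B_w = h_3A_w$ and in which $-\tfrac{1}{2} \le k_i \le \tfrac{1}{2}$. If these terms are
expanded in Taylor's series about the point $(u,v,w)$, one obtains

$$\oint_C \mathbf{A}\cdot d\boldsymbol{\ell} = \left[\frac{\partial(h_3A_w)}{\partial v} - \frac{\partial(h_2A_v)}{\partial w}\right]\Delta v\,\Delta w + \cdots$$

Division by the enclosed area $\Delta S = h_2h_3\,\Delta v\,\Delta w$ and passage to the limit give the $u$ component of curl $\mathbf{A}$.
Cyclical interchange of the coordinates yields as the expression for curl $\mathbf{A}$ in generalized
orthogonal coordinates

$$\nabla\times\mathbf{A} = \begin{vmatrix} \dfrac{\mathbf{1}_u}{h_2h_3} & \dfrac{\mathbf{1}_v}{h_1h_3} & \dfrac{\mathbf{1}_w}{h_1h_2} \\ \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial w} \\ h_1A_u & h_2A_v & h_3A_w \end{vmatrix} \tag{V.101}$$

The specific forms which curl takes in cylindrical and spherical coordinates are listed
at the end of this supplement.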

V.19 STOKES' THEOREM

As an outgrowth of the definition of curl embodied in Equation (V.94), if $d\mathbf{S}$ is an
element of area bounded by the infinitesimal closed contour $dC$, then

$$\nabla\times\mathbf{A}\cdot d\mathbf{S} = \oint_{dC}\mathbf{A}\cdot d\boldsymbol{\ell} \tag{V.102}$$

This effect may be integrated. Consider the simple closed curve C shown in Figure
V.18. If a surface cap S is constructed with C as its sole boundary, then S can be divided
into surface elements dS. The direction of the outward-drawn normal associated with
dS is determined by first selecting a direction of integration along C. If the right
thumb is then placed along C in this direction, the remaining fingers thread S in the
direction of the outward-drawn normal.

FIGURE V.18 Stokes' theorem.

Let $\mathbf{A}$ be a vector function with continuous first derivatives in a region containing
$S$ and $C$. Then integration of (V.102) gives

$$\int_S \nabla\times\mathbf{A}\cdot d\mathbf{S} = \int\left(\oint_{dC}\mathbf{A}\cdot d\boldsymbol{\ell}\right) \tag{V.103}$$

in which the notation on the right side of (V.103) indicates that the line integral around
every elemental contour $dC$ which divides $S$ is to be included. With reference once
again to Figure V.18, it can be observed that whenever two adjacent surface elements
have a common boundary, a line integration is performed along the common boundary
for each element but in opposite directions. These integrations cancel, and only for those
elements which have an unshared boundary will there be a net contribution. All such
elements border on $C$ and the unshared parts of their boundaries comprise precisely all
of $C$. Thus (V.103) can be rewritten

$$\int_S \nabla\times\mathbf{A}\cdot d\mathbf{S} = \oint_C \mathbf{A}\cdot d\boldsymbol{\ell} \tag{V.104}$$

This important result is known as Stokes' theorem.
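
Stokes' theorem (V.104) can be verified numerically for a particular surface cap. The Python sketch below is an illustration only, not part of the original text: the contour (the unit circle in the XY plane), the cap (the upper unit hemisphere), the test field, and the function names are hypothetical choices. The contour is traversed counterclockwise as seen from positive z, so the right-hand rule selects the normal pointing away from the origin.

```python
import math


def A(x, y, z):
    """Hypothetical test field."""
    return (z - y, x, x * y)


def curl_A(x, y, z):
    """Curl of the test field from (V.99)."""
    return (x, 1.0 - y, 2.0)


def line_integral(steps=2000):
    """Circulation of A around the unit circle in the plane z = 0."""
    dt = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        ax, ay, _ = A(x, y, 0.0)
        total += (-ax * y + ay * x) * dt  # dl = (-sin t, cos t, 0) dt
    return total


def surface_integral(n_theta=200, n_phi=400):
    """Flux of curl A through the upper unit hemisphere."""
    dth = (math.pi / 2) / n_theta
    dph = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        th = (i + 0.5) * dth
        for j in range(n_phi):
            ph = (j + 0.5) * dph
            # on the unit sphere the position vector is also the unit normal
            nvec = (math.sin(th) * math.cos(ph),
                    math.sin(th) * math.sin(ph),
                    math.cos(th))
            c = curl_A(*nvec)
            total += sum(ci * ni for ci, ni in zip(c, nvec)) * math.sin(th) * dth * dph
    return total


if __name__ == "__main__":
    print(line_integral(), surface_integral())
```

For this assumed field both integrals have the exact value 2π.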

V.20 VECTOR IDENTITIES

The operations of gradient, divergence, and curl may be applied in many combinations.
One example of this has been noted in the case of the divergence of the gradient of a
scalar function, which led to the definition of the Laplacian operator. Several other
useful relations may be derived from the basic definitions. The reader may wish to
convince himself that if $\Phi$, $\Psi$, $\mathbf{A}$, and $\mathbf{B}$ are regular functions of space and time, then

$$\nabla(\Phi\Psi) \equiv \Psi\nabla\Phi + \Phi\nabla\Psi \tag{V.105}$$

$$\nabla(\mathbf{A}\cdot\mathbf{B}) \equiv (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} + \mathbf{A}\times(\nabla\times\mathbf{B}) + \mathbf{B}\times(\nabla\times\mathbf{A}) \tag{V.106}$$

$$\nabla\cdot(\Phi\mathbf{A}) \equiv \Phi\nabla\cdot\mathbf{A} + \nabla\Phi\cdot\mathbf{A} \tag{V.107}$$

$$\nabla\cdot(\mathbf{A}\times\mathbf{B}) \equiv \mathbf{B}\cdot\nabla\times\mathbf{A} - \mathbf{A}\cdot\nabla\times\mathbf{B} \tag{V.108}$$

$$\nabla\times\Phi\mathbf{A} \equiv \Phi\nabla\times\mathbf{A} + \nabla\Phi\times\mathbf{A} \tag{V.109}$$

$$\nabla\times(\mathbf{A}\times\mathbf{B}) \equiv \mathbf{A}\,\nabla\cdot\mathbf{B} - \mathbf{B}\,\nabla\cdot\mathbf{A} + (\mathbf{B}\cdot\nabla)\mathbf{A} - (\mathbf{A}\cdot\nabla)\mathbf{B} \tag{V.110}$$

$$\nabla\cdot(\nabla\times\mathbf{A}) \equiv 0 \tag{V.111}$$

$$\nabla\times(\nabla\Phi) \equiv 0 \tag{V.112}$$

$$\nabla\times\nabla\times\mathbf{A} \equiv \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} \tag{V.113}$$

It is sufficient to prove these identities in Cartesian coordinates. Since the basic
definitions of gradient, divergence, and curl are independent of any coordinate system, the
above relations will be true in all systems if they are true in one.
Equations (V.108) and (V.113) are especially important to the subject of
electromagnetic field theory and are worth committing to memory. Equation (V.111) offers
an interesting sidelight on the del operator since it indicates that $\nabla\times\mathbf{A}$ is
"perpendicular" to $\nabla$. In this respect $\nabla$ is no different from any other vector. Equation (V.113)
provides an alternative method for computing $\nabla^2\mathbf{A}$. (Cf. Problems V.26 and V.27 at the
end of this Supplement.)
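
A numerical spot check is no substitute for proof, but it can catch a sign error when one of these identities is being committed to memory. The Python sketch below is an illustration only, not part of the original text: the two test fields, the evaluation point, and the helper names are hypothetical. It evaluates both sides of (V.108) in Cartesian coordinates, with all derivatives approximated by central differences.

```python
import math


def A(x, y, z):
    """Hypothetical test field A."""
    return (y * z, x * x, x * y * z)


def B(x, y, z):
    """Hypothetical test field B."""
    return (math.sin(x), y * z, x + z * z)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def partial(f, p, k, h=1e-5):
    """Central-difference derivative of the vector field f along coordinate k."""
    hi, lo = list(p), list(p)
    hi[k] += h
    lo[k] -= h
    return tuple((a - b) / (2 * h) for a, b in zip(f(*hi), f(*lo)))


def div(f, p):
    """Divergence from (V.82)."""
    return sum(partial(f, p, k)[k] for k in range(3))


def curl(f, p):
    """Curl from (V.99)."""
    dx, dy, dz = (partial(f, p, k) for k in range(3))
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])


def a_cross_b(x, y, z):
    return cross(A(x, y, z), B(x, y, z))


if __name__ == "__main__":
    p = (0.7, -0.4, 1.2)
    lhs = div(a_cross_b, p)
    rhs = dot(B(*p), curl(A, p)) - dot(A(*p), curl(B, p))
    print(lhs, rhs)
```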

V.21 GREEN'S INTEGRAL THEOREMS

An integral formulation due to Green which is useful in the solution of boundary-value
problems can be presented in terms of two identities. If $\Phi$ and $\Psi$ are suitably behaved
scalar functions, then the divergence theorem gives

$$\int_V \nabla\cdot(\Phi\nabla\Psi)\,dV = \oint_S (\Phi\nabla\Psi)\cdot d\mathbf{S}$$

However, if one uses (V.107), the integrand on the left may be expanded, which gives

$$\int_V (\Phi\nabla^2\Psi + \nabla\Phi\cdot\nabla\Psi)\,dV = \oint_S (\Phi\nabla\Psi)\cdot d\mathbf{S} \tag{V.114}$$

This is called Green's first identity.
If the roles of $\Phi$ and $\Psi$ are interchanged, one can write

$$\int_V (\Psi\nabla^2\Phi + \nabla\Psi\cdot\nabla\Phi)\,dV = \oint_S (\Psi\nabla\Phi)\cdot d\mathbf{S} \tag{V.115}$$

Subtraction of (V.115) from (V.114) gives

$$\int_V (\Phi\nabla^2\Psi - \Psi\nabla^2\Phi)\,dV = \oint_S (\Phi\nabla\Psi - \Psi\nabla\Phi)\cdot d\mathbf{S} \tag{V.116}$$

which is known as Green's second identity. It may also be written in the form

$$\int_V (\Phi\nabla^2\Psi - \Psi\nabla^2\Phi)\,dV = \oint_S \left(\Phi\frac{\partial\Psi}{\partial n} - \Psi\frac{\partial\Phi}{\partial n}\right)dS \tag{V.117}$$

in which $n$ is a metric variable in the direction of the outward-drawn normal.
If one chooses $\Psi$ to be a point source function, the value of $\Phi$ at a point may be
expressed in terms of knowledge of $\Phi$ throughout a volume and over a bounding surface.
This is often a useful formulation, especially when the boundary conditions are simple.
The reader is referred to Chapter 3 for several examples of this technique.

V.22 SOLENOIDAL AND IRROTATIONAL VECTOR FIELDS

The identity $\nabla\cdot\nabla\times\mathbf{A} \equiv 0$ leads to the observation that any vector field is
divergenceless if it can be expressed as the curl of another vector field; that is, its flux lines are
continuous, neither originating nor terminating in any volume element. Such a field,
whose divergence is everywhere zero, is said to be solenoidal.
Similarly, the identity $\nabla\times\nabla\Phi \equiv 0$ indicates that if a vector field can be expressed
as the gradient of a scalar field, then the vector field is without curl. Such a field, whose
curl is everywhere zero, is said to be irrotational.
All vector fields can be categorized as belonging to one of three types:

1. Irrotational
2. Solenoidal
3. Neither irrotational nor solenoidal

A theorem due to Helmholtz which is concerned with static (time-independent) fields
states, in effect, that under certain conditions a field in the third category can be
expressed as a linear sum of fields from the first two categories. These conditions are so
reasonable as to include all cases of practical physical interest. An alternative way of
stating the Helmholtz theorem is to say that any physically realizable static vector
field is completely determined by its divergence and curl. A proof of this theorem may
be found in many textbooks.²
A corollary of the Helmholtz theorem is that any static irrotational vector field is
derivable from a scalar potential function. Thus if $\mathbf{E}(x,y,z)$ is a field such that $\nabla\times\mathbf{E} \equiv 0$,
then

$$\mathbf{E} = -\nabla\Phi \tag{V.118}$$

and application of Stokes' theorem gives

$$\oint_C \mathbf{E}\cdot d\boldsymbol{\ell} = \int_S \nabla\times(-\nabla\Phi)\cdot d\mathbf{S} \equiv 0 \tag{V.119}$$

Therefore the line integral of an irrotational static field around any closed contour is zero.

² See, e.g., R. Plonsey and R. Collin, Principles and Applications of Electromagnetic Fields, pp. 29-36,
McGraw-Hill Book Company, New York, 1961.

A second corollary of the Helmholtz theorem is that any static solenoidal vector
field is derivable from a vector potential function. Thus if $\mathbf{B}(x,y,z)$ is a field such that
$\nabla\cdot\mathbf{B} \equiv 0$, then

$$\mathbf{B} = \nabla\times\mathbf{A} \tag{V.120}$$

and application of the divergence theorem gives

$$\oint_S \mathbf{B}\cdot d\mathbf{S} \equiv 0 \tag{V.121}$$

in which $S$ is any closed surface.

V.23 COMPLEX VECTORS

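In Section V.4 the notion was introduced that a real vector $\mathbf{a}$ could be multiplied by a
complex number $\gamma = \alpha + j\beta$. This generated a complex vector $\gamma\mathbf{a}$ whose real and
imaginary parts, $\alpha\mathbf{a}$ and $\beta\mathbf{a}$, were real vectors. By prescribing that $\alpha\mathbf{a}$ and $\beta\mathbf{a}$ obey all
the rules for real vectors developed in this chapter, the entire formalism of vector
analysis can be extended to the complex domain. The rules for vector algebra and
complex number algebra combine in a logical manner to yield the results for all
operations on and by complex vectors. Thus, for example, if $\gamma = \alpha + j\beta$ and $\gamma' = \alpha' + j\beta'$
are complex numbers and $\mathbf{a}$ and $\mathbf{b}$ are real vectors, then

$$\gamma(\gamma'\mathbf{a}) = \gamma\gamma'\mathbf{a}$$

in which the multiplication process $\gamma\gamma'$ follows the rule for the product of complex
numbers. The distributive law holds so that

$$(\gamma + \gamma')\mathbf{a} = \gamma\mathbf{a} + \gamma'\mathbf{a}$$

$$\gamma(\mathbf{a} + \mathbf{b}) = \gamma\mathbf{a} + \gamma\mathbf{b}$$

For the addition law one obtains

$$\gamma\mathbf{a} \pm \gamma'\mathbf{b} = (\alpha\mathbf{a} \pm \alpha'\mathbf{b}) + j(\beta\mathbf{a} \pm \beta'\mathbf{b})$$

A complex vector may be resolved into components in the usual way, namely,

$$\gamma\mathbf{a} = \mathbf{1}_x\gamma a_x + \mathbf{1}_y\gamma a_y + \mathbf{1}_z\gamma a_z$$

and the dot product operations are simply

$$(\gamma\mathbf{a})\cdot(\gamma'\mathbf{b}) = \gamma\gamma'\,\mathbf{a}\cdot\mathbf{b}$$

$$\gamma\mathbf{a}\cdot(\gamma'\mathbf{b} + \gamma''\mathbf{c}) = \gamma\gamma'\,\mathbf{a}\cdot\mathbf{b} + \gamma\gamma''\,\mathbf{a}\cdot\mathbf{c}$$

$$\gamma\mathbf{a}\cdot\gamma^*\mathbf{a} = |\gamma|^2a^2$$

in which $\gamma^*$ is the complex conjugate of $\gamma$. Similarly, the cross product operations are

$$\gamma\mathbf{a}\times\gamma'\mathbf{b} = \gamma\gamma'(\mathbf{a}\times\mathbf{b})$$

$$\gamma\mathbf{a}\times(\gamma'\mathbf{b} + \gamma''\mathbf{c}) = \gamma\gamma'(\mathbf{a}\times\mathbf{b}) + \gamma\gamma''(\mathbf{a}\times\mathbf{c})$$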

Some of the most useful operations involve complex vector and scalar fields which
are functions of real variables. Thus if

$$\mathbf{A}(u,v,w,t) = \mathbf{A}_1(u,v,w,t) + j\mathbf{A}_2(u,v,w,t)$$

in which $\mathbf{A}$ is complex and $\mathbf{A}_1$ and $\mathbf{A}_2$ are its real and imaginary parts, then

$$\frac{\partial\mathbf{A}}{\partial u} = \frac{\partial\mathbf{A}_1}{\partial u} + j\frac{\partial\mathbf{A}_2}{\partial u}, \text{ etc.}$$

and calculation of such quantities as $\nabla\Phi$, $\nabla\cdot\mathbf{A}$, $\nabla\times\mathbf{A}$, and integrals containing these
quantities proceeds naturally if the real and imaginary parts are treated separately as
real vectors and then their complex sum is taken after the operations are concluded.
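
The algebraic rules for complex vectors are easily exercised with a language that supports complex arithmetic. The Python sketch below is an illustration only, not part of the original text: the numerical values and the helper names are hypothetical, and Python's built-in imaginary unit is written 1j. Note that the dot product is formed without conjugation, in keeping with the rules above; the conjugate appears only where it is written explicitly.

```python
def scale(g, a):
    """Product of the complex number g with the vector a."""
    return tuple(g * x for x in a)


def dot(a, b):
    """Dot product of component triples, with no complex conjugation."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


if __name__ == "__main__":
    g, gp = 2 + 3j, -1 + 0.5j                     # gamma and gamma'
    a, b = (1.0, 2.0, -1.0), (0.5, -3.0, 2.0)     # real vectors

    # (gamma a) . (gamma' b) = gamma gamma' (a . b)
    print(dot(scale(g, a), scale(gp, b)), g * gp * dot(a, b))

    # gamma a . gamma* a = |gamma|^2 a^2
    print(dot(scale(g, a), scale(g.conjugate(), a)), abs(g) ** 2 * dot(a, a))

    # gamma a x gamma' b = gamma gamma' (a x b)
    print(cross(scale(g, a), scale(gp, b)), scale(g * gp, cross(a, b)))
```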

SUMMARY OF IMPORTANT VECTOR RELATIONS

GENERALIZED ORTHOGONAL COORDINATES

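$$\mathbf{a} = \mathbf{1}_ua_u + \mathbf{1}_va_v + \mathbf{1}_wa_w$$

$$\mathbf{a} \pm \mathbf{b} = \mathbf{1}_u(a_u \pm b_u) + \mathbf{1}_v(a_v \pm b_v) + \mathbf{1}_w(a_w \pm b_w)$$

$$\mathbf{a}\cdot\mathbf{b} = a_ub_u + a_vb_v + a_wb_w$$

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{1}_u & \mathbf{1}_v & \mathbf{1}_w \\ a_u & a_v & a_w \\ b_u & b_v & b_w \end{vmatrix}$$

$$d\ell = [(h_1\,du)^2 + (h_2\,dv)^2 + (h_3\,dw)^2]^{1/2}$$

$$\nabla\Phi = \frac{\mathbf{1}_u}{h_1}\frac{\partial\Phi}{\partial u} + \frac{\mathbf{1}_v}{h_2}\frac{\partial\Phi}{\partial v} + \frac{\mathbf{1}_w}{h_3}\frac{\partial\Phi}{\partial w}$$

$$\nabla\cdot\mathbf{A} = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}(h_2h_3A_u) + \frac{\partial}{\partial v}(h_1h_3A_v) + \frac{\partial}{\partial w}(h_1h_2A_w)\right]$$

$$\nabla^2 = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial u}\left(\frac{h_2h_3}{h_1}\frac{\partial}{\partial u}\right) + \frac{\partial}{\partial v}\left(\frac{h_1h_3}{h_2}\frac{\partial}{\partial v}\right) + \frac{\partial}{\partial w}\left(\frac{h_1h_2}{h_3}\frac{\partial}{\partial w}\right)\right]$$

$$\nabla\times\mathbf{A} = \begin{vmatrix} \dfrac{\mathbf{1}_u}{h_2h_3} & \dfrac{\mathbf{1}_v}{h_1h_3} & \dfrac{\mathbf{1}_w}{h_1h_2} \\ \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial w} \\ h_1A_u & h_2A_v & h_3A_w \end{vmatrix}$$

$$\nabla\cdot\nabla\times\mathbf{A} \equiv 0$$

$$\nabla\times(\nabla\Phi) \equiv 0$$

$$\nabla\times\nabla\times\mathbf{A} = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$$

$$\nabla\cdot(\mathbf{A}\times\mathbf{B}) = \mathbf{B}\cdot\nabla\times\mathbf{A} - \mathbf{A}\cdot\nabla\times\mathbf{B}$$

$$\int_V \nabla\cdot\mathbf{A}\,dV = \oint_S \mathbf{A}\cdot d\mathbf{S}$$

$$\int_S \nabla\times\mathbf{A}\cdot d\mathbf{S} = \oint_C \mathbf{A}\cdot d\boldsymbol{\ell}$$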

CARTESIAN COORDINATES

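$$u = x \qquad v = y \qquad w = z$$

$$h_1 = 1 \qquad h_2 = 1 \qquad h_3 = 1$$

$$\nabla\Phi = \mathbf{1}_x\frac{\partial\Phi}{\partial x} + \mathbf{1}_y\frac{\partial\Phi}{\partial y} + \mathbf{1}_z\frac{\partial\Phi}{\partial z}$$

$$\nabla\cdot\mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}$$

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$

$$\nabla\times\mathbf{A} = \begin{vmatrix} \mathbf{1}_x & \mathbf{1}_y & \mathbf{1}_z \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ A_x & A_y & A_z \end{vmatrix}$$
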
CYLINDRICAL COORDINATES

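$$u = r \qquad v = \phi \qquad w = z$$

$$h_1 = 1 \qquad h_2 = r \qquad h_3 = 1$$

$$\nabla\Phi = \mathbf{1}_r\frac{\partial\Phi}{\partial r} + \frac{\mathbf{1}_\phi}{r}\frac{\partial\Phi}{\partial\phi} + \mathbf{1}_z\frac{\partial\Phi}{\partial z}$$

$$\nabla\cdot\mathbf{A} = \frac{1}{r}\frac{\partial}{\partial r}(rA_r) + \frac{1}{r}\frac{\partial A_\phi}{\partial\phi} + \frac{\partial A_z}{\partial z}$$

$$\nabla^2 = \frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2}{\partial\phi^2} + \frac{\partial^2}{\partial z^2}$$

$$\nabla\times\mathbf{A} = \begin{vmatrix} \dfrac{\mathbf{1}_r}{r} & \mathbf{1}_\phi & \dfrac{\mathbf{1}_z}{r} \\ \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\ A_r & rA_\phi & A_z \end{vmatrix}$$
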
SPHERICAL COORDINATES

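$$u = r \qquad v = \theta \qquad w = \phi$$

$$h_1 = 1 \qquad h_2 = r \qquad h_3 = r\sin\theta$$

$$\nabla\Phi = \mathbf{1}_r\frac{\partial\Phi}{\partial r} + \frac{\mathbf{1}_\theta}{r}\frac{\partial\Phi}{\partial\theta} + \frac{\mathbf{1}_\phi}{r\sin\theta}\frac{\partial\Phi}{\partial\phi}$$

$$\nabla\cdot\mathbf{A} = \frac{1}{r^2}\frac{\partial}{\partial r}(r^2A_r) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,A_\theta) + \frac{1}{r\sin\theta}\frac{\partial A_\phi}{\partial\phi}$$

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$

$$\nabla\times\mathbf{A} = \begin{vmatrix} \dfrac{\mathbf{1}_r}{r^2\sin\theta} & \dfrac{\mathbf{1}_\theta}{r\sin\theta} & \dfrac{\mathbf{1}_\phi}{r} \\ \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\phi} \\ A_r & rA_\theta & r\sin\theta\,A_\phi \end{vmatrix}$$
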

REFERENCES

1. Bell, E. T., The Development of Mathematics, 2d ed., pp. 199-211, McGraw-Hill Book
Company, New York, 1945.
2. Franklin, P., Methods of Advanced Calculus, McGraw-Hill Book Company, New York,
1944.
3. Gibbs, J. W., and E. B. Wilson, Vector Analysis, C. Scribner's Sons, Inc., New York, 1901.
4. Phillips, H. B., Vector Analysis, John Wiley and Sons, Inc., New York, 1933.
5. Wills, A. P., Vector Analysis with an Introduction to Tensor Analysis, pp. 1-111, Dover
Publications, Inc., New York, 1958.

PROBLEMS

V.1 A ship is traveling southwest at a speed of 20 knots relative to land. If the water current is
2 knots due north, what is the ship's velocity relative to the water?
V.2 Prove, with the aid of vector analysis, that the diagonals of a parallelogram bisect each
other.
V.3 Use vector analysis to show that for any triangle the line connecting the midpoints of
any two sides is parallel to the third and half its length.
V.4 Prove the law of cosines for a plane triangle with the aid of vector analysis.
V.5 Prove that the diagonals of a rhombus are perpendicular.
V.6 Find the equation of a plane through the point (2,3,4) and perpendicular to the vector
$\mathbf{a} = \mathbf{1}_x5 + \mathbf{1}_y2 - \mathbf{1}_z3$.
V.7 How far distant from the origin is the plane $3x + 4y + 5z = 12$? Write the equations
of the planes parallel to it and twice as far from the origin.
V.8 Write the equation for the family of planes perpendicular to those of Problem V.7 and
parallel to the Z axis.
V.9 Show that

$$\sin^2\theta = \frac{(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{a}\times\mathbf{b})}{(\mathbf{a}\cdot\mathbf{a})(\mathbf{b}\cdot\mathbf{b})}$$

V.10 Use the result of Problem V.9 to prove the identity

$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{a}\times\mathbf{b}) = \begin{vmatrix} \mathbf{a}\cdot\mathbf{a} & \mathbf{a}\cdot\mathbf{b} \\ \mathbf{a}\cdot\mathbf{b} & \mathbf{b}\cdot\mathbf{b} \end{vmatrix}$$

V.11 Deduce the law of sines for a plane triangle from the fact that $\mathbf{c}\times\mathbf{c} = \mathbf{c}\times(\mathbf{a} + \mathbf{b})$.
V.12 What is the necessary and sufficient condition that three free vectors be coplanar?
V.13 Show that

$$\frac{d}{ds}(\mathbf{u}\cdot\mathbf{v}) = \mathbf{u}\cdot\frac{d\mathbf{v}}{ds} + \mathbf{v}\cdot\frac{d\mathbf{u}}{ds}$$

V.14 Show that

$$\frac{d}{ds}(\mathbf{u}\cdot\mathbf{v}\times\mathbf{w}) = \mathbf{u}\cdot\mathbf{v}\times\frac{d\mathbf{w}}{ds} + \mathbf{u}\cdot\frac{d\mathbf{v}}{ds}\times\mathbf{w} + \frac{d\mathbf{u}}{ds}\cdot\mathbf{v}\times\mathbf{w}$$

V.I5 .~ top is rotating about a vertical axis at the constant rate of 50 rad/sec. Its point of con-
tact is moving on a horizontal plane in a circle of radius 2 feet at a constant angular veloc-
ity of io rad/sec. For a point in the top a distance -l- in. from the axis, find the velocity in
feet/sec as a function of time. (Reminder: The term velocity refers to a vector; speed is
used to designate the magnitude of a velocity.)
V.16 When air resistance is neglected, the trajectory of a shell is given by

r == l x l ,OOOt + l y ( l ,OOOt - 16.1t 2)


The point of observation is the gun, distances are in feet, and time is in seconds.
(a) Find the velocity v as a function of time.
(b) Find the acceleration a as a function of time.
{c) What is the gun elevation?
(d) What is the range of the shell?
V.17 If a particle travels along a space curve with velocity v, show that its acceleration is
given by
$$\mathbf{a} = \mathbf{1}_T \frac{dv}{dt} + \mathbf{1}_N \frac{v^2}{\rho}$$

in which $\mathbf{1}_T$ is a unit tangent vector, $\mathbf{1}_N$ is a unit normal vector in the principal plane, and
$\rho$ is the radius of curvature of the space curve at the point occupied by the particle.

V.18 A constant-rise helix is wrapped on a cone. Find the parametric equations which describe
it and write the equation of the line tangent to the helix at an arbitrary point.
V.19 Find the equation of the plane tangent to a paraboloid of revolution at an arbitrary point.
V.20 Show that spherical coordinates are orthogonal. Determine and explain the Jacobian.
V.21 With the aid of Equations (V.45)-(V.47), show that the Jacobian for the general Galilean
transformation is not zero.
V.22 When one assumes that the earth is a sphere, the equation of a line of latitude can be
expressed as
$$r = r_0, \qquad \theta = \theta_0, \qquad \phi = ks$$
Find the circumference at this latitude.
V.23 A right circular cone is truncated by a sphere whose center is at the apex of the cone.
Find the total surface area of the toplike body consisting of the cone and its spherical cap.
V.24 The two cylinders $x^2 + y^2 = a^2$ and $x^2 + z^2 = a^2$ intersect forming a common region.
Show that the area of this common region is $16a^2$.
V.25 Compute the volume of the body of Problem V.23.
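V.26 Show that in cylindrical coordinates

$$\nabla^2\mathbf{A} = \mathbf{1}_r\left(\nabla^2 A_r - \frac{2}{r^2}\frac{\partial A_\phi}{\partial\phi} - \frac{A_r}{r^2}\right) + \mathbf{1}_\phi\left(\nabla^2 A_\phi + \frac{2}{r^2}\frac{\partial A_r}{\partial\phi} - \frac{A_\phi}{r^2}\right) + \mathbf{1}_z\nabla^2 A_z$$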
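V.27 Show that in spherical coordinates

$$\nabla^2\mathbf{A} = \mathbf{1}_r\left(\frac{1}{r}\nabla^2(rA_r) - \frac{2}{r}\nabla\cdot\mathbf{A}\right) + \mathbf{1}_\theta\left(\nabla^2 A_\theta + \frac{2}{r^2}\frac{\partial A_r}{\partial\theta} - \frac{A_\theta}{r^2\sin^2\theta} - \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial A_\phi}{\partial\phi}\right) + \mathbf{1}_\phi\left(\nabla^2 A_\phi + \frac{2}{r^2\sin\theta}\frac{\partial A_r}{\partial\phi} + \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial A_\theta}{\partial\phi} - \frac{A_\phi}{r^2\sin^2\theta}\right)$$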

V.28 An interesting proof of (V.111) is possible through a wedding of Stokes' theorem and the
divergence theorem.
V.29 Show that

$$\int_V \nabla \times \mathbf{A}\, dV = -\oint_S \mathbf{A} \times d\mathbf{S}$$
V.30 Show that

$$\int_V \nabla\Phi\, dV = \oint_S \Phi\, d\mathbf{S}$$
V.31 Show that
$$\int_S \nabla\Phi \times d\mathbf{S} = -\oint_C \Phi\, d\mathbf{l}$$
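
As a numerical sanity check on Problems V.9 and V.10 (not a substitute for the proofs they ask for), the short Python sketch below evaluates both sides of the identity for a few random vectors. The helper names dot, cross, lagrange_sides, and sin_squared are illustrative only and do not appear in the text; the tolerance values are likewise an assumption.

```python
import math
import random


def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Vector product a x b in rectangular components."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lagrange_sides(a, b):
    """Return both sides of the identity in Problem V.10.

    Left side:  (a x b) . (a x b)
    Right side: the 2-by-2 determinant (a.a)(b.b) - (a.b)^2
    """
    c = cross(a, b)
    lhs = dot(c, c)
    rhs = dot(a, a) * dot(b, b) - dot(a, b) ** 2
    return lhs, rhs


def sin_squared(a, b):
    """sin^2 of the angle between a and b, as written in Problem V.9."""
    c = cross(a, b)
    return dot(c, c) / (dot(a, a) * dot(b, b))


# Compare the two sides for a handful of random (hypothetical) test vectors.
random.seed(0)
for _ in range(5):
    a = tuple(random.uniform(-1.0, 1.0) for _ in range(3))
    b = tuple(random.uniform(-1.0, 1.0) for _ in range(3))
    lhs, rhs = lagrange_sides(a, b)
    assert math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-12)
```

Agreement for random inputs only makes the identity plausible; the problems still call for a derivation from the definitions of the scalar and vector products.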
Author Index
Alhazen, 3-4, 15-16 Davisson, C., 488
Alpert, N. L., 363 Davy, H., 328, 469
Ampere, A. M., 218, 221-223, 244,400, 469 Debye, P., 33~ 366, 386, 490
Anaxagoras, 41 Desaguliers, J. T., 468
Arago, D. F., 38, 50, 218 Descartes, R., 4-8, 18-19
Aristotle, 2, 15 de Sitter, W., 34, 61
Arx, A., 372 Dirac, P. A. M., 126, 476
Avicenna, 3-4, 16 Dirichlet, P. G. L., 151, 173
Drude, P., 472
Dulong, P. L., 472
DuPre, F. K., 457
Bacon, F., 5, 16-17
Durbin, R. P., 76
Bacon, R., 3-4, 16
Bantle, W., 372
Barkhausen, H., 402
Egelstaff, P. A., 76
Bates, L. F., 443
Einstein, A., 12-13, 40, 61-62, 72
Beccaria, G. B., 468
Empedocles, 1-2, 15
Bessel, F. W., 156
Euclid, 3, 15
Biot, J. B., 218-221
Euler, L., 10
Bitter, F., 402-403, 443
Bizette, H., 403, 446
Blackman, M., 493
Fairweather, A., 403
Bradley, J., 20-23, 37
Faraday, M., 115, 223, 256-259, 327-329,
Brillouin, L., 428
400-401
Broer, L. J. F., 457
Fermi, E., 476
FitzGerald, G. F., 39
Fizeau, H., 11, 23-25, 38, 49-51, 82-83, 263
Cabeo, N., 399 Forrer, R., 434
Casimir, H. B. G., 457 Forsbergh, P. W., Jr., 370
Cauchy, A. L., 176, 557 Foucault, J. B. L., 11, 23, 25-26
Cavendish, H., 101-106, 221, 327, 468 Fourier, J. B. J., 153
Cedarholm, J. P., 83-85 Franklin, B., 10, 99
Chu, L. J., 273, 320 Franz, R., 471
Clausius, R., 329, 366 Fresnel, A. J., 11, 38, 50
Coulomb, C. A. de, 106-112, 329, 399-400
Cranshaw, T. E., 76
Cummerow, R. L., 459 Galileo, G., 4, 17
Curie, P., 372, 378, 402 Galt, J. K., 460

Gauss, K. F., 115, 134 Larmor, J., 418
Gellibrand, H., 399 Lawton, W. E., 117, 204-207
Gerlach, W., 403 Legendre, A. M., 166
Germer, L. H., 488 Lewis, G. N., 40, 85
Gibbs, J. W., 567-568 Linde, J. O., 495
Gilbert, W., 399 Loar, H. H., 76
Gordon, J. P., 83 Lorentz, H. A., 39, 59, 72, 264, 329-331, 343, 472
Gorter, C. J., 457
Gorter, F. W., 403
Grassman, H. G., 566-567
Gray, S., 327, 467-468 McAlister, D., 116, 199
Green, G., 113, 273, 315 MacDonald, D. K. C., 495
Griffel, M., 450 Maclaurin, C., 557
Griffiths, J. H. E., 460 Maxwell, J. C., 12, 38-39, 115-117, 256, 260-264
Mendelssohn, K., 495
Hablützel, J., 373
Merritt, F. R., 460
Hall, E., 235
Michell, J., 399
Halliday, D., 459
Michels, A., 367
Hamilton, W. R., 565-566
Michelson, A. A., 26-29, 39, 50, 51-58
Havens, W. W., Jr., 76
Miller, D. C., 57
Hay, H. J., 76
Millikan, R. A., 13
Heisenberg, W., 403, 441
Minkowski, H., 319
Helmholtz, H. L. F., 254
Møller, C., 84
Henry, J., 259
Morley, E. W., 28, 50, 51-58
Henry, W. E., 431
Mossbauer, R. L., 76
Heron, 15
Mossotti, F. O., 329, 366
Hertz, H., 12, 264
Hitchcock, C. S., 364-365, 389
Hogan, C. L., 464
Neel, L., 403-404, 445, 453
Hooke, R., 6-9, 19
Neumann, F. E., 151, 173, 224
Hupse, J. C., 430
Newton, I., 7-10, 20, 263-264, 399
Huygens, C., 8-10, 18, 20

Jakobsen, M., 76 Oersted, H. C., 217-218
Joule, J. P., 471, 489 Ohm, G. S., 470-471, 478
Onnes, K., 473
Onsager, L., 366
Kennedy, R. J., 40, 59
Kepler, J., 3-5, 18
Keyes, F. G., 359
Page, L., 224
Kirchhoff, G. R., 11
Pauling, L., 348, 357
Kirchner, A., 399
Peregrinus, 398
Kirkwood, J. G., 359
Petit, A. T., 472
Kittel, C., 454
Planck, M., 12-13, 40, 264
Kohlrausch, K. W. F., 263
Plato, 397
Kronig, R., 457
Plimpton, S. J., 117, 204-207
Poincare, H., 40, 59, 72
Lagrange, J. L., 112, 557 Poisson, S. D., 112-115, 148, 221, 329, 400
Langevin, P., 330, 353, 401, 419 Porta, J. B., 398
Laplace, P. S., 112, 114, 151 Poynting, J. H., 285

Priestley, J., 99-100 Thorndike, E. M., 59


Ptolemy, 3, 15 Tolman, R. C., 40, 85
Townes, C. H., 83-85
Tsai, B., 403, 446
Riemann, G. F. B., 176 Tycho, 21
Roberts, F. F., 403
Robison, J., 100-101
Roemer, O., 19-20 Uhlig, H. H., 359

Sanders, P., 367


Sanger, R., 360 Van Vleck, J. H., 403, 450
Savart, F., 218-221 Volta, A., 468-469
Schiffer, J. P., 76 Von Hippel, A. R., 392
Schipper, A., 367
Schulz, A., 76
Schwarz, H. A., 182 Wallis, J., 564
Shull, C. G., 446 Weber, W., 263, 401
Simmons, A. J., 319 Weiss, P. E., 402, 434
Slater, J. C., 349 Welch, .1. E., 403
Smart, J. S., 446 Went, J. J., 403
Smyth, C. P., 364-365, 389 Wessel, C., 565
Snell, W., 6 Whittaker, E., 114
Socrates, 397 Wiedemann, G., 471
Sommerfeld, A., 276, 320, 472 Wiegand, C. E., 76
Spees, A. H., 88-89 Wien, W., 254
Squire, C. F., 403, 446 Wollaston, W. H., 469
Steinberger, J., 76 Wood, E. A., 460
Stern, 0., 403
Stevenson, A. F., 317
Stokes, G. G., 612
Yager, W. A., 460
Stoner, E. C., 434
Stout, J. W., 450 Young, T., 10-11, 37-38
Stratton, J. A., 273
Stuart, H. A., 359
Zahn, C. T., 88-89, 359
Zavoisky, E., 459
Taylor, B., 557 Zeeman, P., 50, 427
Thomson, W. (Lord Kelvin), 116, 329 Zeiger, H. J., 83
Subject Index
Aberration of light, 20-23, 37-38 Cavendish experiment
Absolute motion, 48-49 Cavendish, 101-106
Absolute potential, 129 Maxwell-McAlister, 199-201
Absolute reference frame, 48 Maxwell's analysis, 201-204
Acceptors, 503 Plimpton-Lawton, 204-207
Acoustic power, 32-34 Cedarholm-Townes maser experiment, 83-85
Acoustic velocity, 31-32 Child-Langmuir law, 150
Acoustic wave equation, 29-32 Circular polarization, 296, 463-464
Acoustic waves, 29-34 Circular waveguide, 306
Adiabatic demagnetization, 458 Circulator, 463
Alkali halides, 349, 363, 369 Classical invariance
Ammonia beam maser, 83 of distance, 43
Ampere's circuital law, 244-245, 270 of mass, 44
Ampere's experiments, 222 Classical velocity transformation, 42-43, 45
Analytic function, 176 applications, 46-49
Anisotropy energy, 443 cars, 46-47
Antiferromagnetism, 403, 445-451 light, 47-48
Associated Legendre functions, 168, 307-308 sound, 47
Atomic magnetic moment, 240-241 Classification
Average field in sphere of conductive properties, 473-478
electric, 544-546 of magnetic materials, 396
magnetic, 552-556 Clausius-Mossotti equation, 366-369
Clock synchronization, 70-71
Coaxial cylinders, 137-138, 246-247, 284-285
Band theory, 473-478, 500-505
Coefficient
Barium titanate, 373
of capacitance, 192
Bessel functions, 157-165, 519-523
of electrostatic induction, 192
Bessel's equation, 156
of potential, 191
Biot-Savart experiment, 218-221
Coercive force, 435
Biot-Savart law, 221, 232-234
Compass, 398, 407
Bitter powder technique, 443
Complex paramagnetic susceptibility, 457
Bohr magneton, 423
Complex permittivity, 379-380
Boltzmann transport equation, 482-483
Complex Poynting vector, 290
Brillouin function, 428
Complex Poynting's theorem, 290
Complex vectors, 615-616
Capacitance, 186-193, 337-338 Composite static fields, 252-253
Cauchy-Riemann equations, 176 Composition of general sources, 530-531

Conditions at infinity, 275-279 Dipolar relaxation, 383-390


Conduction band, 477-478, 504 Dipole
Conductive materials, 467 electrostatic, 130-131
Conductor, 478 magnetic, 241
Conductor-vacuum interface, 140-142 Dipole moment
Conformal mapping, 175-186 electric, 330-331
Continuity equation, 271 magnetic, 241, 404
Contraction of length, 73-74 Dirac delta function, 126-127
Contraction hypothesis, 39-40, 59 Directional coupler, 318
Corpuscular theory of light, 6 Dirichlet problem, 151, 173
Coulomb experiment, 106-112 Displacement current, 263, 270
Coulomb's law Displacement vector, 116
formulation, 117-121 Divergence, 603-606
history, 99-112 Divergence theorem, 115, 607
Coulomb's torsion balance, 106-108 Divisions of space-time, 79-80
Covalent bond, 346 Domain wall energy, 443
Covariance of Newton's general force law, Domains, 437, 442-445
43-45 Donors, 503
Crossed-slot coupler, 317-319 Doppler effect
Curie law of paramagnetism, 429 classical, 32
Curie temperature relativistic, 96
ferroelectric, 372 Doppler shift
ferromagnetic, 433 classical, 515
paramagnetic, 433, 438, 442, 447 relativistic, 96
Curie-Weiss law Dulong-Petit law, 472, 493
electric, 372, 376 Dynamic length, 62-70
magnetic, 433, 438, 453
Curl, 607-611
Cyclotron frequency, 456 Effective mass, 456, 504
Cyclotron resonance, 455-456 Electrets, 379
Electric dipole
dynamic, 547
Damping constant (dipole), 550-551 static, 130-131
Debye equations, 386 Electric dipole moment, 330-331
Debye temperature, 492 Electric field
Debye theory of specific heat, 490-494 dynamic, 266
Debye unit, 353 static, 121-127
Definition Electric flux, 138
of electric field, 121 Electrical conductivity, 480
of magnetic field, 229 Electromagnetic induction, 259
Delta function, 126-127 Electromagnetic nature of light, 263
Diamagnetism, 401, 416-420 Electromagnetics, 256-325
Diamond structure, 501 Electron orbital motion, 421-424
Dielectric constant Electron paramagnetic resonance, 458-459
definition, 355 Electron spin, 424
of gases, 356-361 Electronic polarizability
of liquids, 365-366 static, 344-346
of solids, 361-365 time-harmonic (complex), 380-382
Dielectric losses, 390-393 Electrostatic field, 121
Dielectric susceptibility, 354 Electrostatic potential, 127-134
Dilatation of time, 75 Electrostatic stored energy, 193-197

Electrostatics, 98-215 Galilean transformation, 42, 45, 72


Electrostriction, 377 Gauss' divergence theorem, 115
Elliptical polarization, 296 Gauss' law, 134-136
Emission theories, 61 General sources, 530-531
Energy Generalized coordinates, 589-594
electrostatic, 193 Generalized electric flux density, 339
kinetic (relativistic), 89-91 Generalized magnetic intensity, 408-410
magnetic, 283-285 Germanium, 500-508
Energy gaps, 475 Gouy apparatus, 414
Equipotentials, 130, 140 Gradient, 598-603
Equivalence of mass and energy, 40, 91 Gravitational potential, 44
Equivalent magnetic charge, 408 Green's functions, 172-175, 315-317, 534-536, 540-543
Ether, 5, 34-35, 37-40, 47-49, 61
Ether drag, 50, 58-59 Green's integral theorems, 613-614
Event, definition, 71 Green's reciprocation theorem, 191
Exchange integral, 440-442 Guided waves, 297-303
Extended theorem of mean value, 559

Half-wave dipole, 281-282


Far-field approximation, 280 Hall effect, 235-236
Far-field pattern, 280 Hankel functions, 157
Faraday rotation, 463 Heat flow, 497
Faraday's emf law, 259, 270 Heisenberg's exchange theory, 441
Faraday's experiments, 257-259 Helmholtz coils, 254
Fermi-Dirac statistics, 476-477, 503-505 Hole-electron pairs, 501
Huygens' principle, 9
Fermi level, 477, 487
Hydrodynamic analog of electromagnetism,
Ferrimagnetism, 403-404, 451-454
261-263
Ferrites, 403, 451-454, 462-464
Hysteresis, 371, 416, 435, 461
Ferroelectric crystals, 370-376 Hysteresis loss, 461-462
Ferromagnetic domains, 442-445
Ferromagnetic resonance, 459-461
Ferromagnetism, 402, 436-445
Ideal gas law, 30
Field tensor, 322
Images, 142-148
Field transformation equations Impedance of free space, 286
to electromagnetics, 264-267 Impurities in semiconductors, 502
generalization, 532-533 Inductance, 309-314
to magnetostatics, 224-230 Inertial coordinate systems, 41
Fizeau's moving water experiment, 49-51,
82-83 Infrared absorption, 392
Flux lines, 115 Initial permeability, 436
Flux maps, 600 Insulator, 478
Integral solutions of Maxwell's equations,
Force between currents, 237-239
272-275
Force transformation law, 92-94
Interdependence of space and time, 61-70
Four-potential, 320
Interference of light, 10-11
Fourth fundamental unit, 238 Interference fringes, 51
Fractional magnetization, 438 Internal field constant
Free electron theory of metals, 478 electric, 344
Fringing, 188 magnetic, 411, 440

Invariance Lorentz force law, 230-231, 252, 265


of charge, 226, 229 Lorentz internal field
of electric flux, 226, 229 electric, 343
of transverse length, 62-66 magnetic, 411, 440
Inverse square law Lorentz transformation equations, 39, 70-73
formulation, 117-121 Loss tangent, 391
history, 99-112 Lumped circuit concept, 314
Maxwell's derivation, 204
Ionic bond, 346
Ionic crystals, 349, 362 Macroscopic electric field, 331-337
Ionic polarizability Macroscopic magnetic field, 404-408
static, 348 Magnetic current loop, 240-241
time-harmonic (complex), 382-383 Magnetic dipole moment, 241
Ionic polarization, 346-351 Magnetic flux lines, 258, 260
Irrotational vector fields, 614-615 Magnetic focusing, 235
Magnetic intensity, 236-237
Magnetic polarizability, 411
Joule's experiment, 471 Magnetic poles, 398
Joule's law, 471, 489 Magnetic stored energy, 283-285
Magnetic susceptibility, 411-414
Magnetization density, 405
Kennedy-Thorndike experiment, 59 Magnetostatic field, 231
Magnetostatic vector potential function, 239
Magnetostatics, 216-255
Lande splitting factor, 426 Maser, 83
Langevin function, 353, 440 Mass (variable), 40, 85-89
Laplace's equation, 112, 151-186, 249-252 Mass-energy equivalence, 40, 91
Laplacian operator, 606-607 Mass spectrometer, 465
Larmor angular frequency, 418 Mass spectroscope, 254
Law of inertia, 41 Mass transformation law, 91-92
Legendre functions, 524-529 Matthiesen's rule, 496
Legendre polynomials, 167 Maxwell-McAlister experiment, 199
Legendre's equation, 166 Maxwell's equations, 268-270, 393-394,
Length contraction, 73-74 464, 509
Light Mean free path (metals), 486
aberration of, 20-23, 37-38 Mean free time (metals), 485
corpuscular theory, 6 Measurernen t
electromagnetic nature, 12 of electrostatic fields, 341
interference, 10-11 of susceptibility
nature of, 1-14 electric, 355
polarization of, 8 magnetic, 414-416
spectrum, 7 of velocity of light, 19-29
transverse vibrations, 10 Meson decay, 76
velocity of, 14-29 Method of images, 142-148
wave theory, 6 Michelson interferometer, 28, 39, 51-58
Linear polarization, 296 Michelson-Morley experiment, 51-58, 513
Local field Minkowskian formulation, 319-323
electric, 342-344 Mobility, 507
magnetic, 410-411, 437 Modified Bessel functions, 159
Lorentz-FitzGerald contraction hypothesis, Molar polarizability, 366
39-40, 59 Momentum, 89-90
Momentum principle, 87 Polarization
Mossbauer effect, 76 of light, 8
Motion of source, influence on wave velocity, of waves, 296
32, 34, 47-48, 61-62 Polarized molecules, 326, 328
Moving water experiment of Fizeau, 49-51, Potential difference, 129
82-83 Potential energy of magnetic moment in a
Multicapacitor systems, 189-193 field, 243
Mutual capacitance, 193 Potential function
Mutual inductance, 310-311 scalar, electrostatic, 128
time-varying, electromagnetic, 279-280
vector, magnetostatic, 239
Nature of light, 1-14 Poynting vector, 286
Neel temperature, 447 Poynting's theorem, 285-291
Neumann problem, 151, 173 Principle of relativity, 40-46, 61
Neutron diffraction, 446, 452
Proper distance, 79
Newtonian space, 41
Proper time, 78
Nuclear spin, 424
Properties of ferromagnetic materials, 433-436
Oersted's experiment, 217 Pyroelectricity, 378
Ohm's experiments, 470
Ohm's law, 470-471, 478-485 Quantum states, 420-421
Optical absorption, 393 Quenching of magnetic moments, 424, 432
Orbital angular momentum, 421
Orientational polarizability, 354
Radiation integral, 280
Orientational polarization, 351-354, 371
Radiation pressure, 295
Orthogonal coordinates, 594
Recessional velocity, limit on, 82
Orthogonalized Bessel functions, 160-162
Rectangular waveguide, 300-303, 317-319,
540-543
Paramagnetic relaxation, 456-458 Regular function, 176
Paramagnetic resonance, 458-459 Relative dielectric constant, 355
Paramagnetism, 402, 427-433 Relative permeability, 412
Periodic table, 422 Relativistic velocity transformation, 81-82
Permanent magnetic moments, 420-427 Relativity principle, 40, 61
Permeability, 412 Relaxation time in metals, 481
Permittivity Remanent field, 435
free space, 120 Residual resistivity, 494
materials, 355 Resistivity, 480, 494
Photoelectric effect, 12-14 Retarded potentials, 279
Photons, 13 Robison experiment, 100
Piezoelectrics, 376-379 Rochelle salt, 372
Plane waves, 291-296 Rolle's theorem, 558
Plimpton-Lawton experiment, 204-207 Ruler experiments, 62-70
Poisson's equation, 114, 148-151, 245 Russell-Saunders coupling, 425-427
Poisson's theory, 112-114 Rutherford atomic model, 209
Polar molecules, 346
Polarizability Scalar potential function
electronic, 345 electrostatic, 128
ionic, 348 time-varying
orientational, 354 dielectric, 547-549
Polarization density, 334 free space, 279

Scattering Susceptibility
impurity, 494 dielectric, 354-356
lattice, 494 magnetic, 411-414
Schwarz transformation, 182 Symmetry experiments
Seignette electricity, 372 colliding balls, 85-87
Self-capacitance, 193 rulers, 62-70
Self-inductance, 310, 312-314 Synchronization of clocks, 70-71
Semiconductors, 478, 500-508
Separation of variables, 152, 156, 166
Silicon, 500-508 Table of vector relations, 616-617
Snell's law, 6 Taylor's series
Solenoidal vector fields, 614-615 for one variable, 560-561
Solenoids, 247-248, 407 for several variables, 561-563
Solutions Temperature dependence
to Laplace's equation of dielectric constant, 359, 363-365, 388
using conformal mapping, 175-186 of magnetic susceptibility, 429, 438, 447,
in cylindrical coordinates, 155-165 453
in rectangular coordinates, 152-155 of resistivity
in spherical coordinates, 165-172 in metals, 494-496
to wave equation in semiconductors, 508
in cylindrical coordinates, 303-306 Tensor conductivity, 507
in rectangular coordinates, 291-303 Tensor permeability in ferrites, 462-464
in spherical coordinates, 306-309 Theorem of mean value, 558
Sommerfeld conditions, 276 Thermal conductivity of metals, 496-500
Sound waves, 29-34 Time dilatation, 75
Source transformation law, 267 Time-space interdependence, 61-70
Sources of magnetic moments, 404 Torque on magnetic moment, 243
Space-time interdependence, 61-70 Transformation
Special relativity, 36-97 of electric force, 224-230
Specific heat, 490-494 of fields, 264-267
Specific inductive capacity, 328 of sources, 267
Spectroscopic notation, 473 Transformation law
Spherical Bessel functions, 307 for force, 92-94
Spherical cavity, 308 for mass, 91-92
Spinel structure, 452 Transformer action, 257-258
Spin-lattice interaction, 457 Transition elements
Spin-orbit coupling, 425-427 iron group, 423, 425, 432
Spin-spin interaction, 457 rare earths, 424, 432
Splitting factor, 427
Spontaneous polarization
electric, 370, 374-376 Unguided waves, 291-296
magnetic, 435, 437 Uniform plane waves, 291-296
Uniqueness theorem for electrostatic potential, 151-152
Stellar aberration, 20-23, 37-38
Stokes' theorem, 612
Stored energy
electromagnetic, 286 Valence band, 477-478, 505
electrostatic, 193 Variability
magnetostatic, 283-285 of longitudinal length, 66-68
Superconductivity, 473 of mass, 40, 85-89
Superposition of forces, 118 Vector analysis, 564-620
