Gary N. Felder
Associate Professor, Smith College
Kenny M. Felder
Math and Physics Teacher, Raleigh Charter High School
Copyright © 2016 John Wiley & Sons, Inc.
CONTENTS
Preface
Index
7in x 10in Felder fpref.tex V3 - February 27, 2015 5:09 P.M. Page xi
PREFACE
Exercises
In almost every section of our book you will find an “Exercise” and a set of “Problems.” What’s
the difference?
The simple answer is that the Problems are, for the most part, independent of each other.
You create an assignment that says “Do Section 18.3 Problems 2, 5, 9, 12, and 20.” By contrast,
an Exercise is an atomic block. You can assign a particular exercise, or you can skip that
exercise, but you can’t assign “Question 5” from the exercise because it only makes sense in
context.
More importantly, the two serve different purposes. Problems are meant to follow a
lecture, building and testing the students’ skills and understanding of the topics you have
discussed in class. Exercises are designed to facilitate active learning.
There are two types of exercises.
Motivating Exercises come at the beginning of each chapter. Their purpose is not to teach
math, but to give a practical example of why the student needs the techniques in this chapter.
Discovery Exercises come at the beginning of (almost) every section. Their purpose is to step
the student through a mathematical process, such as solving a differential equation or finding
a Taylor series. Instead of just being told how to do it, the students do it for themselves.
Some frequently asked questions:
∙ Do I need to assign all the exercises? No. If you are uncomfortable with the process, you
may want to try only one or two. We hope you will find them easy to use and valuable,
and over time you will use them more, but you will probably never use them all.
∙ At home or in class? Alone or in groups? Mix it up. See what works for you. We sometimes
assign them as homework due on the day we are going to cover the material,
and sometimes as an in-class exercise to begin the lecture. You can have students do
them individually or in groups, or a mix of the two. One professor we spoke to starts
them in class, and then has her students finish them at home—an approach we never
even thought of. You will probably keep your students’ interest better if you vary your
approach.
∙ How long do they take? Some are five minutes or less; some are twenty minutes or even
more. Very few of them should take the students more than half an hour.
∙ That was all pretty noncommittal. Do you have any solid advice at all? Actually, we do. First,
we hope you will use at least some of the exercises, because we believe they contribute
a valuable part of the learning process. Second, exercises should almost always be used
before you introduce a particular topic—not as a follow-up. You can start your lecture
by taking questions and finding out where the students got stuck.
Problems
There are problems at the end of every section, and there are also “additional problems” at
the end of every chapter. The “additional problems” give us an opportunity to ask questions
where the students don’t know exactly what’s being covered. (When you look at a partial
differential equation in “additional problems” you have to ask yourself: separation of variables?
Method of transforms? What’s the right approach here?)
We have resisted classifying the problems further, but you will find a general pattern something
like this.
∙ A “walk-through” steps the students carefully through the process we want them to
learn. We advise you to generally assign these problems.
∙ Next often comes a batch of straightforward, unmotivated calculation problems.
“Evaluate the following triple integrals in spherical coordinates.” They will generally
move from easier to harder.
∙ Then come word problems. Some of these are practical applications; some fill in details
that were left out of the explanation; some are just cool ideas that occurred to us while
we were brainstorming over Bailey’s Irish Cream.
∙ Finally, in some cases, there are “Explorations.” These are harder, more involved, and
often longer problems that may stretch beyond the presented material.
Problems without a computer icon (which are most of them) can be done entirely by hand,
and should generally require no integration technique beyond u-substitution or integration
by parts. A computer icon can mean anything from “This requires an integral that you can
do on your calculator” to “This involves heavy use of a computer algebra program such as
Mathematica, MATLAB, or Maple.” The problems are written in a platform-independent
way, and we provide no instruction on any of these computer tools in particular.
So the textbook becomes what Stuart Johnson, our first editor at Wiley, called a “Chinese
menu.” You look it over and you decide you want this chapter, that chapter, skip three and
pick up this one, and so on.
For this reason, we have endeavored as much as possible to make the chapters independent
of each other. You don’t need our vector calculus chapter to cover our linear algebra
chapter, or vice versa. References from one section to another within a chapter are common;
references from one chapter to another are rare.
That being said, there are exceptions. Most importantly, pretty much every chapter in
the book relies on the information presented in Chapter 1, Introduction to Ordinary Differential
Equations. The information is minimal: most of the techniques for solving ODEs
are deferred to later chapters. But the students have to know what a differential equation is,
whether they get it from our chapter or somewhere else. If you are not sure your students
come into the course highly comfortable with this material, please start with this chapter
before doing anything else. If you want to give it the most minimal treatment possible, you
can get away with doing the three sections “Overview of Differential Equations,” “Arbitrary
Constants,” and “Guess and Check, and Linear Superposition” only. We believe strongly,
however, that it is worth your class time to do more.
Beyond that, the first page of every chapter lists prerequisites. You should be able to use
that to make sure your students are ready for any given chapter.
Here is a specific example. In the section on Legendre polynomials in the special functions
chapter, one problem steps the students through the solution of Legendre’s differential
equation. If you just want your students to be able to work with Legendre polynomials, you
can certainly skip that problem. But if you want your students to follow the derivation of the
Legendre polynomials, that problem will force them through every step of the process.
That brings up the more general topic of proofs. It’s very rare for our book to prove a
theorem before we use it. Much more commonly we present a theorem, show students how
to use it, and then step them through the proof in a problem. The explanation will usually
point to that problem with language such as “You’ll show that this always works in Problem
14.” In some cases the problem doesn’t prove the theorem, but has the students show that it
works for some important cases.
There are two reasons for this unusual structure. First, we believe students follow a
proof better after they understand the result that is being proven. Second, we believe very
strongly that students follow a proof better if they work through it themselves instead of just
reading it.
One of the judgment calls you will have to make, therefore, is which of these proof
problems to assign. Our own opinion on this matter, for whatever that’s worth, is that the
importance of a proof is not based on the importance of the result it proves, but on the
technique that the proof demonstrates. It’s tremendously important for all students to know
that the derivative of sin x is cos x, but very few students can prove it, and that’s OK. On the
other hand, the proof that \( a_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\left(\frac{n\pi}{L}x\right)dx \) in a Fourier series involves an important
trick that teaches students what orthogonal functions are and how to find the coefficients
of many other such series, so it’s worth some investment of class and homework time.
CHAPTER 1
Introduction to Ordinary
Differential Equations
∙ evaluate a derivative.
∙ evaluate an indefinite integral using u-substitution or integration by parts and use an “initial
condition” to calculate the arbitrary constant (usually called C).
After you read this chapter, you should be able to …
Science is about predicting the behavior of systems. In many cases, we begin by knowing how
the system will change. For instance, classical physics is built around the equation F = ma:
the forces determine how the position changes, and from this information, physicists make
predictions about the actual position. In a similar way, an economist might try to predict the
price of gas by taking into account all the influences that will tend to drive the price up or
drive it down.
Mathematically, knowledge about how a variable will change means knowledge of its
derivatives. Therefore, many natural laws are expressible as equations about derivatives:
“differential equations.” Representing known laws as differential equations, and solving
those equations, is one of the most important tools in modern science and engineering.
A function of one independent variable, such as the position of a particle as a function
of time, may be modeled by an “ordinary differential equation” (or “ODE”). A function of
multiple independent variables, such as temperature in a room as a function of x, y, z, and t,
can be modeled by a “partial differential equation” (or “PDE”). This chapter, Chapter 10,
and Chapter 12 discuss ODEs; Chapter 11 discusses PDEs.
7in x 10in Felder c01.tex V3 - February 26, 2015 9:43 P.M. Page 2
(d) \( x = \sin(mt/k) \)
(e) \( x = \sin\left(\sqrt{m/k}\,t\right) \)
7. The function that you just found as a solution to \( d^2x/dt^2 = -(k/m)x \) offers a prediction
for the motion of the mass. Does this prediction correspond to the prediction you
made on physical grounds in Part 1?
(b) Which of the following functions could be y(x)? Hint: you should approach this by
trying each function in that equation. One function will work, and the others will
not. For the left side of the equation, find dy∕dx. For the right side, square y and
then multiply the answer by x. If the two sides come out the same, you have found
a solution!
i. \( y = \sqrt{x^2 + 9} \)
ii. \( y = e^{x+3} \)
iii. \( y = -2/\left(x^2 + 3\right) \)
iv. \( y = x^3/3 + 6 \)
There are three key questions you may be asking yourself about that equation and those
solutions.
∙ What do you mean, those functions all “work”? You validate solutions the same way
you would for an algebraic equation: by plugging them in. Let’s try the third one.
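\[ x(t) = \sin\left(\sqrt{\frac{k}{m}}\,t + \frac{\pi}{3}\right) \;\rightarrow\; x'(t) = \sqrt{\frac{k}{m}}\cos\left(\sqrt{\frac{k}{m}}\,t + \frac{\pi}{3}\right) \]
\[ \rightarrow\; x''(t) = -\frac{k}{m}\sin\left(\sqrt{\frac{k}{m}}\,t + \frac{\pi}{3}\right) \]
Thus we see that x″(t) equals x(t) multiplied by −k∕m, so this function satisfies
Equation 1.2.1.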
∙ What do these answers tell me about the mass on the spring? Each of these solutions
represents a possible motion for such a system. The actual motion depends on “initial
conditions.” For instance, if the spring begins at its relaxed position at rest, it will never
move: x(t) = 0. If the spring is pulled out to position x = 3 and released from rest, it
will begin to oscillate: \( x(t) = 3\cos\left(\sqrt{k/m}\,t\right) \).
∙ How do I find such answers on my own? We will begin to address this question in
this chapter, and we will address it further in our chapter on methods of solving ordinary
differential equations. For the moment, our emphasis is on understanding what
it means to write and solve a differential equation.
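The "plug it in" check from the first bullet can also be done numerically. The sketch below is only an illustration: the values of k and m are arbitrary choices (the text leaves them unspecified), the function names are made up for this example, and a finite-difference estimate stands in for the exact second derivative, so agreement is approximate rather than a proof.

```python
import math

# Illustrative values only; the text leaves k and m unspecified.
K = 2.0
M = 0.5

def x(t):
    """Candidate solution x(t) = sin(sqrt(k/m) t + pi/3)."""
    return math.sin(math.sqrt(K / M) * t + math.pi / 3)

def second_derivative(f, t, h=1e-3):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

def max_residual(times):
    """Largest |x''(t) + (k/m) x(t)| over the sample times."""
    return max(abs(second_derivative(x, t) + (K / M) * x(t)) for t in times)

sample_times = [0.1 * i for i in range(50)]
print(max_residual(sample_times))  # small, limited by the finite-difference step
```

A residual near zero at every sampled time is consistent with x″ = −(k∕m)x; a function that is not a solution would leave a residual comparable in size to x itself.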
\[ \frac{dM}{dt} = 8 \qquad (1.2.2) \]
If the units in that equation worry you—which they should!—hold onto that reservation; we
will have more to say about that issue below. Right now we want to focus on the fact that
Equation 1.2.2 can be understood in three different ways.
∙ Analytically, it says “Find a function M (t) whose derivative is the constant function 8.”
If you find a function, take its derivative, and get 8, you have found a solution.
∙ Graphically, it says “The slope of this graph is 8 everywhere.” (You may find this easier
to think about as dy∕dx = 8 but we urge you to become comfortable with other variable
names.)
∙ Logically, it says “The rate of growth of Shannon’s money is exactly 8 money-units
(dollars) every time-unit (hour).” More concisely, Shannon makes $8.00/hour. It is this
step—converting between equations and descriptions of reality—that requires human
intervention. Computers generally do a better job than we do of the calculations and
graphing.
These are not three different types of differential equations: they are three different perspectives
on the same differential equation. You can measure your understanding by how easily you
can convert between the three.
It may occur to you, from any of these three points of view, that there is not just one
solution to this equation. Analytically, the function M = 8t has the correct derivative, but so
do M = 8t − 5 and M = 8t + 𝜋. The “general solution” of this equation is M = 8t + C where
C is any constant. Graphically, you can visualize an infinite number of parallel lines, all with
constant slopes of 8.
What does all that tell us logically? We know how fast Shannon is making money, but we
don’t know how much she started with! If she started with $100, then M = 8t + 100 is the
correct function to describe her bank account. This idea—that a differential equation has
an infinite number of solutions, and you choose one based on “initial conditions”—will be
a recurring theme in this chapter.
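As a small illustration of how an initial condition selects one member of the family M = 8t + C, here is a minimal sketch. The function names are hypothetical, and the units (hours and dollars) follow the Shannon example.

```python
def money(t, C):
    """General solution M = 8t + C of dM/dt = 8 (t in hours, M in dollars)."""
    return 8 * t + C

def constant_for(initial_money, t0=0):
    """Pick C so that M(t0) equals the given starting amount."""
    return initial_money - 8 * t0

C = constant_for(100)      # Shannon starts with $100
print(C, money(10, C))     # the constant, and her balance after 10 hours
```

Every choice of C gives the same slope of 8; only the starting height of the line changes.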
Don’t confuse this with the similar-looking equation dM ∕dt = 1∕20. That would say “Every
year, M goes up by 1/20.” This says something closer to “Every year, M goes up by 1/20
of M.” With this small step (multiplying the right side of the equation by M, the dependent
variable) we have moved out of the world of introductory calculus. If you want to
solve Equation 1.2.2, you simply integrate both sides. That won’t work for Equation 1.2.3!
Nonetheless, we can think about our new equation in the same three ways that we did the old.
∙ Analytically, Equation 1.2.3 says “Find a function M (t) whose derivative is the original
function divided by 20.” You may want to think about this for a moment, and see if you
can find a function that works.
∙ Graphically, it says “The slope of this graph is 1/20 of the M -value.” This will not be
a line: as it goes higher, it will also get steeper. Once again, you may want to try your
hand at sketching such a curve.
∙ Logically, it says “Each time-unit (year), Shannon’s money grows by 1/20 of Shannon’s
money.” Your most important challenge is to convince yourself that this is exactly what
is meant by the phrase “5% interest.”
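The graphical reading can be turned into a rough numerical sketch: take many small time steps, and at each step move along a straight line whose slope is 1/20 of the current height. This assumes that Equation 1.2.3 is dM∕dt = M∕20 and uses a starting value of 1000; the step count and the function name are illustrative choices, not part of the text.

```python
def sketch_curve(start=1000.0, years=10.0, steps=1000):
    """Follow the slope rule dM/dt = M/20 in small straight-line steps."""
    dt = years / steps
    M = start
    points = [(0.0, M)]
    for i in range(1, steps + 1):
        slope = M / 20          # slope is 1/20 of the current height
        M += slope * dt         # move a short distance along that slope
        points.append((i * dt, M))
    return points

points = sketch_curve()
first_rise = points[1][1] - points[0][1]
last_rise = points[-1][1] - points[-2][1]
print(first_rise, last_rise)    # the later step rises more: the curve steepens
```

Plotting the points would give the curve the bullet describes: not a line, but a graph that gets steeper as it gets higher.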
Let’s take up the challenge analytically. We need a function that satisfies two criteria. Of
course M (t) must satisfy the differential equation, Equation 1.2.3. But it also has to satisfy
the “initial condition” M (0) = 1000; in other words, it must correctly predict that Shannon
starts with $1000. We approach such problems by first solving the differential equation, and
then tweaking the answer to meet the initial condition. So let’s start with this: how can you
take the derivative of a function and get the same function back, multiplied by 1/20?
The most obvious solution, perhaps, is M (t) = 0. Sure enough, the derivative is the original
function, multiplied by 1/20. Graphically, we get a horizontal line at zero; the slope (0) is
always perfectly proportional to the height (0). Logically, we learn that if Shannon starts with
no money, she will never get any money. It all works—but it doesn’t tell us anything about
Shannon’s actual bank account. We have found a solution to the differential equation, but
we cannot make it meet the initial condition.
After some trial and error you might hit on the next solution, \( M = e^{t/20} \). If you take the
derivative, you get \( M'(t) = (1/20)e^{t/20} \), which is indeed 1∕20 of the original function. Graphically,
an exponential function does exactly what we want: as it gets higher, it also gets steeper.
This solves Equation 1.2.3, but unfortunately starts at M(0) = 1. So our job now is to modify
this solution to meet the initial condition.
Many students next try \( M = e^{t/20} + 999 \). That gives us M(0) = 1000, which seems like a
good start. But be careful: this doesn’t solve the differential equation! \( M'(t) = (1/20)e^{t/20} \) as
before, but \( M/20 = (1/20)(e^{t/20} + 999) \); Equation 1.2.3 is not satisfied unless M′(t) and M∕20
come out the same.
What does work is \( M = 1000e^{t/20} \). M′(t) and M∕20 both give the same answer, \( 50e^{t/20} \). Furthermore,
M(0) = 1000 so it correctly predicts that Shannon starts with $1000. This is the only
function that satisfies both Equation 1.2.3 and the initial condition, so this is the function
that correctly models Shannon’s bank account.
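The four candidates discussed above can also be compared side by side with a short numerical check. This is a sketch under stated assumptions: Equation 1.2.3 is taken to be dM∕dt = M∕20, the derivative is estimated by a central difference rather than computed exactly, and the names, sample times, and tolerances are illustrative.

```python
import math

def derivative(f, t, h=1e-5):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

candidates = {
    "M = 0": lambda t: 0.0,
    "M = e^(t/20)": lambda t: math.exp(t / 20),
    "M = e^(t/20) + 999": lambda t: math.exp(t / 20) + 999,
    "M = 1000 e^(t/20)": lambda t: 1000 * math.exp(t / 20),
}

def satisfies_ode(f, times=(0.0, 1.0, 5.0, 20.0), tol=1e-4):
    """True if M'(t) matches M(t)/20 at every sample time."""
    return all(abs(derivative(f, t) - f(t) / 20) < tol for t in times)

def satisfies_initial_condition(f, tol=1e-9):
    """True if M(0) = 1000."""
    return abs(f(0.0) - 1000.0) < tol

def report():
    """Map each candidate to (solves the ODE?, meets the initial condition?)."""
    return {name: (satisfies_ode(f), satisfies_initial_condition(f))
            for name, f in candidates.items()}

for name, result in report().items():
    print(name, result)
```

Only the last candidate passes both tests, which matches the conclusion reached by hand: a function must satisfy the equation for all t and also hit the required starting value.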
If you were unable to find that solution on your own, don’t worry about it. We are going
to spend some of this chapter, plus two later chapters, learning how to find solutions. Focus
now on what it means to say that a given function solves a differential equation and an initial
condition.
Part of understanding “what it means” is testing your solutions to see if they make
sense. Since Shannon is earning 5% each year, we might expect that in her first year,
she will earn $1000 × 5% = $50. But in fact, our solution predicts that she ends up with
$1000e^{1/20} = $1051.27. Where did this small discrepancy come from? Our bank is not
literally giving her 5% every year (“compounded annually”); it is giving her interest all the
time, and then interest on her interest, and so on (“compounded continuously”), so after
a year she has earned slightly more than 5% of what she started with. This distinction is explored
in Problems 1.35 and 1.36. Apart from those two problems, we will ignore this distinction
in this chapter; if we say something grows by 10% each year, you should translate that to
df ∕dt = .1f (with time measured in years of course).
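The gap between the two kinds of compounding can be seen by computing both. The sketch below uses the standard compound-interest formula, which the text does not state explicitly, so treat it as an assumption brought in for illustration; the function names and the choice of monthly and daily compounding are likewise illustrative.

```python
import math

def compounded(principal, rate, years, periods_per_year):
    """Balance when interest is paid periods_per_year times a year."""
    n = periods_per_year
    return principal * (1 + rate / n) ** (n * years)

def continuous(principal, rate, years):
    """Balance under continuous compounding: the solution of dM/dt = rate * M."""
    return principal * math.exp(rate * years)

for label, n in [("annually", 1), ("monthly", 12), ("daily", 365)]:
    print(label, round(compounded(1000, 0.05, 1, n), 2))
print("continuously", round(continuous(1000, 0.05, 1), 2))
```

Annual compounding gives the $1050 of the naive estimate, continuous compounding gives the $1051.27 predicted by the differential equation, and more frequent compounding moves the first figure toward the second.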
Problem:
Consider the differential equation:
\[ \left(x^2 + 1\right)\frac{dy}{dx} = 1 - 2xy \qquad (1.2.4) \]
with initial condition y(0) = 6. Which of these three equations is a valid solution?
1. \( y = \frac{x+6}{x+1} \)   2. \( y = \frac{x}{x^2+1} \)   3. \( y = \frac{x+6}{x^2+1} \)
Solution:
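1. We substitute y = (x + 6)∕(x + 1) into both sides of the differential equation to see
if they come out the same.
\[ \left(x^2 + 1\right)\frac{dy}{dx} = \left(x^2 + 1\right)\frac{(x + 1) - (x + 6)}{(x + 1)^2} = -\frac{5\left(x^2 + 1\right)}{(x + 1)^2} \]
\[ 1 - 2xy = 1 - \frac{2x(x + 6)}{x + 1} \]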
Important note:
These two functions might be the same for some values of x, but that doesn’t make
y = (x + 6)∕(x + 1) a solution. The initial condition is about one particular x-value,
but the differential equation itself must be satisfied for all x-values. (Don’t set them
equal and solve for x: it won’t help.)
2. We can readily determine that \( y = x/(x^2 + 1) \) does not meet the initial condition,
since y(0) = 0. So this cannot be the solution we are looking for.
3. Once again we calculate both sides of the equation and compare. If \( y = (x + 6)/(x^2 + 1) \)
then we get this on the left.
\[ \left(x^2 + 1\right)\frac{dy}{dx} = \left(x^2 + 1\right)\frac{x^2 + 1 - 2x(x + 6)}{\left(x^2 + 1\right)^2} = \frac{x^2 + 1 - 2x(x + 6)}{x^2 + 1} \]
And we get this on the right.
\[ 1 - 2xy = \frac{x^2 + 1}{x^2 + 1} - \frac{2x(x + 6)}{x^2 + 1} = \frac{x^2 + 1 - 2x(x + 6)}{x^2 + 1} \]
For the function \( y = (x + 6)/(x^2 + 1) \), the quantities \( (x^2 + 1)(dy/dx) \) and 1 − 2xy do
come out the same; this function satisfies the differential equation. And what about
the initial condition? We find y(0) = 6 as it should, so we have found a solution. Later
in this chapter we will discuss guidelines for determining if this is the only solution.
In this case, it is.
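The same comparison can be sketched numerically for all three candidates. As with any such check, this is an illustration rather than a proof: the derivative is a finite-difference estimate, the sample x-values and tolerances are arbitrary choices, and the names are made up for this example.

```python
def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

candidates = {
    1: lambda x: (x + 6) / (x + 1),
    2: lambda x: x / (x * x + 1),
    3: lambda x: (x + 6) / (x * x + 1),
}

def satisfies_ode(y, xs=(0.0, 0.5, 1.0, 2.0, 3.0), tol=1e-6):
    """True if (x^2 + 1) y' matches 1 - 2xy at every sample x."""
    return all(abs((x * x + 1) * derivative(y, x) - (1 - 2 * x * y(x))) < tol
               for x in xs)

def satisfies_initial_condition(y, tol=1e-9):
    """True if y(0) = 6."""
    return abs(y(0.0) - 6.0) < tol

def report():
    """Map each candidate to (solves the ODE?, meets the initial condition?)."""
    return {n: (satisfies_ode(y), satisfies_initial_condition(y))
            for n, y in candidates.items()}

print(report())
```

The check tests the equation at several x-values, not just one, in line with the note that the differential equation must hold for all x. Candidate 1 meets the initial condition but fails the equation, candidate 2 passes the equation test but fails the initial condition, and candidate 3 passes both.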
Stepping Back
We began this section with the differential equation that is most important in classical
mechanics and mechanical engineering: Newton’s Second Law, F = ma. That doesn’t look
like a differential equation, but it is because a simply means dv∕dt, or equivalently \( d^2x/dt^2 \).
So when we know the forces that act on an object, this equation tells us how fast the object’s
velocity will change, and that in turn tells us how its position will change. We then moved
to economic examples, where terms such as “profit” and “interest” and “expenses” describe
how a bank account changes (in both positive and negative ways); from that we can compute
total amounts of money. In chemistry, “kinetics” describes the rate of a reaction, from which
you can compute the amounts of the reactants. Calculations on electric circuits begin with
“potential” that causes “current,” the time derivative of charge.
All these examples, and many others in every field of science and engineering, ask the
same fundamental question. “I know how fast this quantity is changing (because I have a
model of the forces causing it to change). Now, what is the function?” That is the question
that every differential equation asks. As you learn how to model physical situations with differential
equations, solve those equations, and interpret the answers, you are learning one
of the most useful tools for predicting the behavior of important systems.
1.20 The function f(x) has the peculiar property that its derivative is the same function as its square root. (Assume f(x) ≥ 0 for all x.)
(a) Express this sentence, “The derivative of f(x) and the square root of f(x) are the same function,” as a differential equation.
(b) Show that f(x) = 0 is a solution but f(x) = 1 is not.
(c) Now assume that f(0) = 1. Is f(x) an increasing function, a decreasing function, or flat? How can you tell?
(d) Is f(x) concave up, concave down, or neither? How can you tell?
1.21 The Explanation (Section 1.2.2) discussed two financial situations: one in which Shannon earns money at a consistent (linear) rate, and one in which she continuously earns interest on her money. Now suppose both are going on at the same time. Shannon makes $30,000/year salary, and her money always goes immediately into the bank, where it earns interest at an annual rate of 10%.
(a) Write the differential equation that says “Shannon’s money increases every year by $30,000 plus 10% of whatever she has.” Use M for the total of her money and t for time in years. (To make the answer look nicer, measure M in “thousands of dollars” instead of “dollars.”)
(b) Assuming Shannon starts with no money, show that the function \( M(t) = 300(e^{t/10} - 1) \) satisfies both the differential equation and the initial condition.
(c) We were able to write our solution in that form because we had already specified what units we were using for time and money. To write this solution with correct units, so that it would be valid for any units of time and money, we can write \( M(t) = D(e^{t/k} - 1) \), where D = 300 and k = 10. What are the units of D and k?
1.22 Little Benjamin sneaks onto his father’s computer every night. His father thinks Ben is playing video games. Actually, Ben is day trading, and he is very good at it. However much money he has at the beginning of the week, he always earns 10% of that in a week’s time. Unfortunately, Ben also has a bubblegum habit that costs him $15 a week.
(a) Write a differential equation that represents Ben’s net worth. Use M for his money and t for time (measured in weeks).
(b) How much money does Ben need in order to break even every week? (“Break even” means his net worth is neither increasing nor decreasing; in other words, dM∕dt is zero.)
(c) If Ben starts off with more money than you calculated in Part (b), what will happen to his money over time? Explain how your answer is based on your differential equation in Part (a).
(d) If Ben starts off with less money than you calculated in Part (b), what will happen to his money over time? Explain how your answer is based on your differential equation in Part (a).
1.23 Mary starts her career with an initial salary \( S_0 \). Each year she gets a 5% raise, so her salary one year later is \( 1.05 \times S_0 \), her salary the following year is \( 1.05 \times 1.05 \times S_0 = 1.05^2 \times S_0 \), and so on.
(a) Write an expression for Mary’s salary S(t). This will not be a differential equation, but a simple function of time.
(b) Assuming Mary never gains or loses money in any way other than earning her salary, write a differential equation for her amount of money M(t). (You will need to use your answer to Part (a) here.)
(c) Now assume that in addition to earning her salary, Mary also spends an amount E each year, which starts at \( E_0 \) and increases by 2% each year (because of inflation). Write a new differential equation for M(t).
(d) Finally, assume that Mary earns a salary S(t), spends an amount E(t), and is putting her money in the bank where it is earning 4% interest each year. Write a differential equation for M(t).
1.24 The chemical tetrahydrofrefurol decomposes in the presence of a biological agent. Research shows that the rate of decomposition is proportional to the remaining mass. When the biological agent is first introduced and the decomposition begins (t = 0), there are 100 kg of tetrahydrofrefurol. Use M for the mass of tetrahydrofrefurol and t for the amount of time since it began decomposing.
(a) Sketch a possible graph for the function M(t) on purely physical grounds. You don’t need to write any equations yet; just make sure your graph represents the idea that the more you have, the more you lose.
(b) Write a differential equation that represents the sentence “The amount of mass that
disappears in any given hour is proportional to the mass at that hour.” Your equation will introduce a constant of proportionality k.
(c) What are the units of k? Is it positive or negative? How do you know?
(d) The function M(t) = 0 should be a valid solution of your differential equation. Describe the scenario described by this solution.
(e) Find a function other than M(t) = 0 that solves your differential equation.
(f) Find a function that solves your differential equation and has the property that M(0) = 100. (If the graph of this function doesn’t look more or less like the graph you drew in Part (a), one of them is wrong; figure out which one, and fix it!)
(g) A different chemical follows the same differential equation, but with a much higher value of k. Compare its decay process to the decay of tetrahydrofrefurol.
1.25 As a snowball melts, its mass m decreases at a rate proportional to its surface area. The surface area of a snowball is directly proportional to \( m^{2/3} \).
(a) The problem tells us that dm∕dt is proportional to \( m^{2/3} \). Is the constant of proportionality positive or negative? How do you know?
(b) Write the differential equation. If the constant of proportionality is positive, call it \( k^2 \). If it is negative, call it \( -k^2 \).
(c) Show that \( m = (5 - k^2 t/3)^3 \) is a valid solution to your differential equation.
(d) What initial condition is implied by that solution?
1.26 A mass on a spring experiences a force \( F_{\text{spring}} = -kx \) where x is the displacement from the relaxed position and k is a positive constant. It also experiences the force of friction from the table the mass is sliding along: the force is \( F_{\text{friction}} = -3v/|v| \).
(a) −3v∕|v|? That’s a strange-looking force. What is its magnitude? Which way does it push?
(b) Using Newton’s Second Law \( F_{\text{total}} = ma \) and the definitions of velocity (dx∕dt) and acceleration (\( d^2x/dt^2 \)), write a second-order differential equation for the mass’s position as a function of time x(t).
(c) In general, the best way to work with absolute values mathematically is to split into two cases. When v ≥ 0 you can replace |v| with v in your differential equation. Show that in this case one valid solution to the differential equation is:
\[ x = 2\sin\left(\sqrt{\frac{k}{m}}\,t + \frac{\pi}{4}\right) - \frac{3}{k} \]
(d) When v < 0 you can replace |v| in your differential equation with −v. Find a solution to the differential equation that works in this case. (The technique of choice here is trial-and-error: play around until you find a function that works!)
1.27 Let y = f(x) represent one particular graph (not the only one!) that follows the differential equation \( dy/dx = y - x^2 \).
(a) f(x) goes through the point (−1, 1). What is the slope of the curve at that point?
(b) f(x) also goes through the point (−3, 5). What is the slope of the curve at that point?
(c) f(x) also goes through the point (0, 2). What is the slope of the curve at that point?
(d) Based on all the information you have accumulated, draw a possible sketch of f(x).
1.28 Let y = f(x) represent one particular graph (not the only one!) that follows the differential equation dy∕dx = (x + 2y)∕(x + 2).
(a) f(x) goes through the point (−3∕2, 3∕4). What is the slope of the curve at that point?
(b) Find \( d^2y/dx^2 \) as a function of x and y. In other words, find the derivative with respect to x of dy∕dx. When you first apply the quotient rule, your answer will have dy∕dx in it. You can then substitute dy∕dx = (x + 2y)∕(x + 2) to obtain a function of just x and y. Simplify as much as possible.
(c) At the point (−3∕2, 3∕4), is the function concave up or concave down?
(d) Based on your answers to (a) and (c), what can you conclude about this function at the point (−3∕2, 3∕4)?
1.29 A quantity Q evolves over time according to the differential equation dQ∕dt = Q(Q − 3)(Q − 5).
(a) If Q(0) = 3 then, at that point, dQ∕dt = 0. So Q will stay 3, which means dQ∕dt will stay 0 forever. Therefore Q = 3 is referred to as an “equilibrium solution” of the differential equation. What are the two other equilibrium solutions?
(b) If Q is between 0 and 3, is dQ∕dt positive or negative? Based on this, how do you expect Q to evolve over time?
(c) If Q is between 3 and 5, is dQ∕dt positive or negative? Based on this, how do you expect Q to evolve over time?
(d) If Q > 5, is dQ∕dt positive or negative? Based on this, how do you expect Q to evolve over time?
1.30 Two different objects are both moving along the x-axis. Both experience forces that are proportional to their distance from x = 0, so we can write F = kx. Since F = ma and \( a = d^2x/dt^2 \), we can write \( d^2x/dt^2 = (k/m)x \). However, there is one big difference. For Object A, k∕m = 1; for Object B, k∕m = −1.
(a) For Object A, \( d^2x/dt^2 = x \). Based directly on this differential equation, if the object starts at rest at x = 0, what will its acceleration be? What will happen over time?
(b) If Object A starts at rest at x = 1, will it accelerate to the right or left? Where will it go next, and what will happen to its acceleration? What will happen over time?
(c) If Object A starts at rest at x = −1, what will its acceleration be? Where will it go next, and what will happen to its acceleration? What will happen over time?
(d) For Object B, \( d^2x/dt^2 = -x \). If the object starts at rest at x = 0, what will its acceleration be? What will happen over time?
(e) If Object B starts at rest at x = 1, will it accelerate to the right or left? Where will it go next, and what will happen to its acceleration? What will happen over time?
(f) If Object B starts at rest at x = −1, what will its acceleration be? Where will it go next, and what will happen to its acceleration? What will happen over time?
1.31 [This problem depends on Problem 1.30.] In Problem 1.30 you looked at two objects: Object A follows the differential equation \( d^2x/dt^2 = x \), and Object B follows \( d^2x/dt^2 = -x \).
(a) Solve the differential equation for Object A “by inspection”: that is, think of a function x(t) that is its own second derivative. (Do not use x(t) = 0.) Does this function match the behavior you predicted? (It may or may not, depending on the function you choose.)
… work. Try it!) Does this function match the behavior you predicted? (It should.)
(c) Object C has k∕m = 9. Solve its differential equation by inspection. When k∕m is positive, what effect does its value have on the behavior of the system?
(d) Finally, Object D has k∕m = −9. Solve its differential equation by inspection. When k∕m is negative, what effect does its value have on the behavior of the system?
1.32 A quantity changes according to the differential equation \( dy/dx = y^2 \). You are going to draw three possible graphs that could represent such a quantity. All three graphs will be drawn on the domain 0 ≤ x ≤ 1.
(a) The first graph starts at the point (0, 0). Our differential equation promises us that the slope at that point is zero (the graph is horizontal), so move horizontally to the right (just a small distance) from that point.
(b) At the new point you reach, what is the slope, according to the differential equation? Based on that new slope, draw a little more of the graph.
(c) Keep going in this way to draw one possible graph.
(d) Now start over at the point (0, 1). At this point, our differential equation promises us a slope of 1, so move up-and-right a bit at a 45° angle.
(e) At the new point you reach, is the slope the same, higher, or lower than 1? Based on that new slope, draw a bit more of the graph.
(f) Keep going in this way to draw a second possible graph.
(g) Now draw a third graph in the same way, starting at the point (0, −1).
(h) For each of the three graphs you drew, is y increasing, decreasing, or staying constant?
You will explore this equation further in Section 1.9 Problem 1.198 (see felderbooks.com).
1.33 A “simple harmonic oscillator” (SHO) is any system that obeys a differential equation of the form \( d^2x/dt^2 = -\omega^2 x \) where 𝜔 is a constant. The simplest possible SHO is one for which \( d^2x/dt^2 = -x \).
(a) Show that x = sin t and x = cos t are
(b) Solve the differential equation for Object B both solutions to d 2 x∕dt 2 = −x.
“by inspection”: that is, think of a non-zero (b) Each of these solutions represents an oscil-
function x(t) that is negative its own second lation. The “period” of an oscillation is
derivative. (Hint: just throwing a nega- defined as the time it takes to complete
tive into your answer for Object A doesn’t one full cycle from its maximum value to
7in x 10in Felder c01.tex V3 - February 26, 2015 9:43 P.M. Page 14
its minimum value and back to the maximum again. What is the period of the oscillations described by \(d^2x/dt^2 = -x\)?
(c) If one object has position \(x = \sin t\) and a different object \(x = \cos t\), we have just seen that they obey the same differential equation and oscillate with the same period. If you observed the motion of both objects, how would they differ?
(d) Now suppose an SHO is described by \(d^2x/dt^2 = -4x\). Write a solution to this equation and find its period.
(e) What is the period of an SHO that obeys \(d^2x/dt^2 = -\omega^2 x\)?
(f) What is the period of a 2 kg mass oscillating on a spring with spring constant \(k = 200\) N/m? (Recall that the spring exerts a force \(F = -kx\) on the mass.)
1.34 Constrained Population Growth
The logistic equation is used to model population growth in a constrained environment. As an example, the differential equation \(dN/dt = (0.15)N(1 - N/7500)\) has been used to model the elephant population in South Africa's Kruger National Park.¹ \(N(t)\) is the number of elephants and t is measured in years.
(a) According to this model, what is \(dN/dt\) when \(N = 0\)? Explain what this result means in terms of the elephant population.
(b) According to this model, what is \(dN/dt\) when \(N = 7500\)? Explain what this result means in terms of the elephant population.
(c) According to this model, what happens to \(dN/dt\) when \(N > 7500\)? Explain what this result means in terms of the elephant population.
(d) Make a table of N and \(dN/dt\). Include N = 10, N = 1000, N = 2000, N = 3000, N = 4000, N = 5000, N = 6000, and N = 7000, plus the values you already calculated at N = 0 and N = 7500.
(e) Calculate the exact N-value at which the elephant population would grow the fastest. In other words, calculate the N-value that maximizes \(dN/dt\).
(f) The initial condition in this case is that, in 1905 (which we might as well call \(t = 0\)), the population was N = 10. Based on your answers above and this initial condition, describe qualitatively the kind of growth you would expect to see over time. Where would the population head over time?
(g) The solution to this differential equation and initial condition is \(N(t) = 7500e^{0.15t}/(e^{0.15t} + 749)\). Graph this function for \(0 \le t \le 100\) and make sure it matches your qualitative prediction from Part (f). (If it doesn't, figure out what went wrong!)
1.35 Exploration: Compound Interest (by hand)
In the Explanation (Section 1.2.2), Shannon put $1000 into a bank account with 5% interest. Modeling her account with the equation \(dM/dt = M/20\), we found that she earned slightly more than $1000 × 5% = $50 in her first year. Her balance is not incremented once a year, but incremented continuously, so she is always getting interest on the interest she just got. This distinction can be approached rigorously through a limit.
(a) Begin with an equation that literally models a 5% increase once a year. If Shannon begins the year with \(\$M_0\), then after one year, she has \(\$1.05M_0\). The following year she has \(1.05 \times \$1.05M_0 = \$1.05^2 M_0\). How much money does she have after t years?
(b) Now, suppose Shannon's interest is "compounded" ten times per year. This means that after each 1/10 of the year, she receives 1/10 of 5% interest (her money multiplies by 1.005). If she receives such interest ten times per year, how much does she have after one year? After t years?
(c) The above thinking leads to the formula for compound interest. If Shannon starts with \(M_0\) dollars and receives 5% interest compounded n times per year, then after t years she ends up with \(M_0(1 + 0.05/n)^{nt}\) dollars. To compound continuously, take the limit of this formula as \(n \to \infty\). (The most common trick for taking this limit starts by taking the natural log of the formula, using the laws of logs and l'Hôpital's rule to find the limit, and then raising e to your answer at the end.)
(d) Show that the answer to Part (c) is a valid solution to the differential equation \(dM/dt = M/20\) with the initial condition \(M(0) = M_0\).
(e) Now apply the same logic to a population growth scenario. If every rabbit produces on average ten new rabbits per year, then
1 Gallego, Samy (2003). Modelling Population Dynamics of Elephants. PhD thesis, Life and Environmental Sciences,
a starting population of 15 rabbits will produce 150 rabbits in their first year. What is \(R(t)\) in this case? (Make sure your function predicts \(R(1) = 15 + 150 = 165\).) If every rabbit produces on average 1 new rabbit every tenth of a year, what is \(R(t)\)? If every rabbit produces on average \(10/n\) rabbits n times per year, what is \(R(t)\)? Take the limit of your answer as \(n \to \infty\) and show that it satisfies the differential equation \(dR/dt = 10R\) and the initial condition \(R(0) = 15\).
(f) Based on your answer to Part (e), how long will it be until the Earth is covered with rabbits? Include whatever estimates you use or quantities you look up as part of your solution.
1.36 Exploration: Compound Interest (by computer)
In the Explanation (Section 1.2.2), Shannon put $1000 into a bank account with 5% interest. We solved the equation \(dM/dt = M/20\) and got \(M(t) = (\$1000)e^{t/20}\), and then noted that this causes her money to increase by more than 5% after a year because her balance keeps growing, and she immediately starts earning interest on that new balance. In this problem we take a numerical approach to the question: how much money does Shannon have after 20 years? In all cases, round off your answer to the nearest penny.
(a) If Shannon's money is "compounded annually" (that is, if she receives exactly 5% of her balance once a year), then the formula for her balance is \(M(t) = 1000 \times 1.05^t\). Compute her balance after 20 years.
(b) If Shannon's money is "compounded monthly" (that is, if she receives 1/12 of 5% of her balance every month), then the formula for her balance is \(M(t) = 1000 \times (1 + 0.05/12)^{12t}\). Compute her balance after 20 years.
(c) If Shannon's money is "compounded daily" (that is, if she receives 1/365 of 5% of her balance every day), then the formula for her balance is \(M(t) = 1000 \times (1 + 0.05/365)^{365t}\). (Ignore leap years.) Compute her balance after 20 years.
(d) Many banks advertise that your money is "compounded continuously." Use a computer to take the limit of the above process as \(n \to \infty\).
(e) Plug \(t = 20\) into the solution \(M(t) = (\$1000)e^{t/20}\). Comparing the result to your answers above, what kind of compounding does this solution represent?
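One way the computation in Problem 1.36 might be set up is sketched below. The function names `balance` and `continuous_balance` are our own illustrative choices, not from the text; the formulas are the ones quoted in Parts (a) through (e).

```python
import math

def balance(principal, rate, years, n):
    """Balance after `years` when interest is compounded n times per year."""
    return principal * (1 + rate / n) ** (n * years)

def continuous_balance(principal, rate, years):
    """Balance from the solution M(t) = M0 * e^(rate * t)."""
    return principal * math.exp(rate * years)

# Shannon's account: $1000 at 5% interest, examined after 20 years.
P, r, t = 1000.0, 0.05, 20

for label, n in [("annually", 1), ("monthly", 12), ("daily", 365)]:
    print(f"{label:>10}: {balance(P, r, t, n):.2f}")

# A very large n stands in for the limit n -> infinity (hypothetical choice).
print(f"{'n = 10^6':>10}: {balance(P, r, t, 10**6):.2f}")
print(f"{'continuous':>10}: {continuous_balance(P, r, t):.2f}")
```

Raising n should push the compounded balance up toward the continuous value without ever exceeding it, which is the pattern Part (e) asks you to notice.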
iv. y = − ln(−x) + 4
v. y = −4 ln(−x)
vi. y = − ln(−x + 4)
vii. y = − ln(−x + 7)
(b) Based on your answers, write a function that has a C in it, about which you can say, "This function is a valid solution to \(dy/dx = e^y\) for any value of the constant C."
(c) Confirm that your solution works for C = −3.
(d) Find the value of C for which y(0) = 0.
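If you would like to check your hand work on candidates iv through vii numerically, a minimal sketch follows. It estimates \(dy/dx\) by a central difference and compares it with \(e^y\); the helper names `residual` and `is_solution`, the step size, the tolerance, and the sample points (all negative, so every logarithm is defined) are our own assumptions.

```python
import math

# The four candidate functions iv-vii listed above.
candidates = {
    "iv":  lambda x: -math.log(-x) + 4,
    "v":   lambda x: -4 * math.log(-x),
    "vi":  lambda x: -math.log(-x + 4),
    "vii": lambda x: -math.log(-x + 7),
}

def residual(f, x, h=1e-6):
    """Central-difference estimate of dy/dx minus e^y at the point x."""
    dydx = (f(x + h) - f(x - h)) / (2 * h)
    return dydx - math.exp(f(x))

def is_solution(f, points=(-3.0, -2.0, -1.0), tol=1e-6):
    """True if dy/dx matches e^y (within tol) at every sample point."""
    return all(abs(residual(f, x)) < tol for x in points)

for name, f in candidates.items():
    print(name, is_solution(f))
```

A numerical check like this can only suggest that a function works; the algebraic check is still what proves it.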
We found a function and it works so we’re done, right? Not so fast: here is another one.
\(y = 5e^{t^2/2} - 3\)    another solution    (1.3.3)
Equations 1.3.2 and 1.3.3 are not the same function. They are not even simple permutations of each other, such as "add five" or "multiply by five." But they are both valid solutions to Equation 1.3.1. (Try them!) More generally we will show below that the function \(Ce^{t^2/2} - 3\) (where C is a constant) is a valid solution. We call C in that formula an "arbitrary constant" because the solution works no matter what value of C you choose. We say that \(Ce^{t^2/2} - 3\) is the "general solution" because it comprises all possible solutions. (We will say more about general solutions later.)
It’s easy to verify mathematically that this general solution works, but what does it mean
physically? A system governed by Equation 1.3.1 must have a particular y at any given t, right?
Yes it must, but the differential equation alone is not enough to determine those values. We
also need an “initial condition”—the value of y at some particular time, usually at t = 0. The
differential equation and the initial condition together determine the state of the system.
Problem:
Show that \(y = Ce^{t^2/2} - 3\) solves Equation 1.3.1 for any value of the constant C.
Solution:
For \(y = Ce^{t^2/2} - 3\), \(dy/dt = Ce^{t^2/2} \times t\). On the other side of Equation 1.3.1, \(t(y + 3) = t(Ce^{t^2/2})\). They come out the same so this function satisfies the differential equation, no matter what constant C is.
Problem:
Find the particular solution that satisfies the initial condition y(2) = 5.
Solution:
Plugging \(t = 2\) and \(y = 5\) into our general solution, \(5 = Ce^{2^2/2} - 3\) solves to give \(C = 8/e^2\). So \(y = (8/e^2)e^{t^2/2} - 3\) is the one and only function that satisfies both the differential equation and the given condition.
Problem:
Graph five different solutions to this differential equation. Include (and clearly
mark) the solution that passes through the point (2, 5).
Solution:
[Figure: several solution curves y(t) plotted against t.]
The plot shows the function \(y = Ce^{t^2/2} - 3\) for values of C from 0 to \(16/e^2\). The plot in blue is for \(C = 8/e^2\), which passes through the point (2, 5).
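The two claims in this example (the general solution works for any C, and \(C = 8/e^2\) passes through (2, 5)) can also be checked numerically. The sketch below is one possible way to do so. It takes Equation 1.3.1 to be \(dy/dt = t(y + 3)\), as used in the solution above. The five evenly spaced C-values are an assumption, since the text gives only the range from 0 to \(16/e^2\).

```python
import math

def y(t, C):
    """General solution y = C*exp(t^2/2) - 3 from the text."""
    return C * math.exp(t**2 / 2) - 3

def residual(t, C, h=1e-6):
    """dy/dt (central difference) minus the right-hand side t*(y + 3)."""
    dydt = (y(t + h, C) - y(t - h, C)) / (2 * h)
    return dydt - t * (y(t, C) + 3)

# Constant fixed by the condition y(2) = 5, i.e. 5 = C*e^2 - 3.
C_particular = 8 / math.exp(2)

# Five evenly spaced C-values from 0 to 16/e^2 (assumed spacing).
C_values = [k * 4 / math.exp(2) for k in range(5)]

for C in C_values:
    worst = max(abs(residual(t, C)) for t in (0.0, 0.5, 1.0, 1.5, 2.0))
    print(f"C = {C:.4f}: largest residual = {worst:.2e}")

print("y(2) for C = 8/e^2:", y(2, C_particular))
```

The residuals should be zero up to finite-difference error for every C, while only the middle value of C gives y(2) = 5.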
Problem:
Find the particular solution to x ′′ (t) = x with initial position x(0) = 1 and initial
velocity v(0) = 2.
Solution:
The functions x = e t , x = 0, and x = 3e −t are all valid solutions to this differential
equation. More generally, the following function is a valid solution for any values of
the arbitrary constants C and D (as you should verify for yourself).
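\(x(t) = Ce^t + De^{-t}\)
The initial conditions give \(x(0) = C + D = 1\) and \(x'(0) = C - D = 2\), so \(C = 3/2\) and \(D = -1/2\):
\(x(t) = \frac{3}{2}e^t - \frac{1}{2}e^{-t}\)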
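As a quick check on this example, the sketch below solves for the two arbitrary constants from the initial conditions and confirms numerically that the resulting function satisfies \(x''(t) = x\). The helper `second_derivative` and its step size are our own illustrative choices.

```python
import math

# Initial conditions from the problem: x(0) = 1 and v(0) = x'(0) = 2.
x0, v0 = 1.0, 2.0

# x(t) = C e^t + D e^-t gives x(0) = C + D and x'(0) = C - D.
C = (x0 + v0) / 2
D = (x0 - v0) / 2

def x(t):
    """Particular solution with the constants found above."""
    return C * math.exp(t) + D * math.exp(-t)

def second_derivative(f, t, h=1e-4):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

print("C =", C, "D =", D)
for t in (0.0, 0.5, 1.0):
    print(f"t = {t}: x'' - x = {second_derivative(x, t) - x(t):.2e}")
```

Two initial conditions were needed here because the general solution has two arbitrary constants.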
So how many arbitrary constants is enough? The answer depends on the “order” of the
differential equation. (This and other classification schemes are summarized in
Appendix A.)
is a third order differential equation because it has a third derivative, and it has no fourth derivative,
fifth derivative, etc.
The ideal solution to a differential equation is one that can match any possible initial
conditions. Such a function is called the “general solution.”
They’re all right! Remember that C means “Any number here will make a valid solution.”
2C and C − 5 also mean “any number” so they serve the same purpose.
Now our heroes are told to find the one-and-only solution that goes through the point
(1, 8). Plugging in x = 1 and y = 8, Curly concludes that C = 5. Larry argues that C = 5∕2
while Moe gets C = 10. They can avoid further lamentable violence, however, if they notice
that they all converged on the same solution, y = 3x + 5.
Please make sure you followed that last paragraph. In the solution to a differential equation, C is perfectly interchangeable with \(C/4\) or \(C^3\) or \(\ln(C + 17)\): you are free to choose whichever form is most convenient. When you match an initial condition you will find different C-values, but end up with the same specific function.
Here is a subtler example. Consider the following true fact.
The general solution of \(dy/dx = y\) is \(y = Ce^x\)    (1.3.4)
You can (and should) confirm for yourself that this function meets the first criterion of a general solution: it works for any value of C. We are also telling you that it meets the second criterion. Any possible solution of this differential equation can be written in the form \(Ce^x\).
An apparent exception is the function \(y = e^{x+7}\). It does work. (Try it!) And it doesn't look like \(Ce^x\). (Does it?) It seems our general solution isn't very general after all.
But it is, and the key is a bit of algebra. \(e^{x+7} = e^7 e^x\) so this does fit the template in Equation 1.3.4. The moral of this story is that any solution can be put into the form of the general solution, but it isn't always easy or obvious.
The general solution to this equation can be written in either of the following equivalent forms:
An SHO equation could look like \(x''(t) = -9x\) or \(x''(t) = -\sqrt{2q\epsilon/5e^w}\,x\), but the system is an SHO as long as the equation sets the second derivative of a function equal to a negative constant times that function.
Equations 1.3.6 and 1.3.7 look very different but each one is the general solution, and
remember what that means.
∙ Any solution to Equation 1.3.5 can be expressed in the form of Equation 1.3.6 with the
right constants A and B.
∙ Any solution can also be expressed in the form of Equation 1.3.7 with the right con-
stants C and 𝜙.
This implies that the two functions can be rearranged to look like each other; you will do so
in Problem 1.58.
In many cases Equation 1.3.6 is the simplest for matching initial conditions. For example,
A is just x(0). But Equation 1.3.7 more clearly shows the physical properties of the oscillator
because the arbitrary constants C and 𝜙 represent the amplitude and phase, respectively.
In either representation 𝜔 is the angular frequency, so the period is 2𝜋∕𝜔. Notice that 𝜔 is
not an arbitrary constant; it comes from the differential equation. For instance the equation
x ′′ = −9x always represents an oscillator with period 2𝜋∕3, but that oscillator can have any
amplitude and phase, which depend on the initial conditions.
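The relationship between the two forms can be made concrete with a short sketch. It assumes the forms are \(x = A\cos(\omega t) + B\sin(\omega t)\) and \(x = C\cos(\omega t + \phi)\). That assignment is our guess, chosen to be consistent with "A is just x(0)." The function name `amplitude_phase` and the sample initial conditions are likewise our own.

```python
import math

def amplitude_phase(A, B):
    """Convert A*cos(w t) + B*sin(w t) into C*cos(w t + phi).

    Expanding C*cos(w t + phi) gives A = C*cos(phi) and B = -C*sin(phi).
    """
    C = math.hypot(A, B)
    phi = math.atan2(-B, A)
    return C, phi

# Example: x'' = -9x, so omega = 3 comes from the equation itself.
omega = 3.0
x0, v0 = 4.0, 12.0          # hypothetical initial position and velocity

A = x0                      # x(0) = A
B = v0 / omega              # x'(0) = B * omega
C, phi = amplitude_phase(A, B)

print("period    =", 2 * math.pi / omega)
print("amplitude =", C)
print("phase     =", phi)

# Both forms should describe the same motion at every time.
for t in (0.0, 0.3, 1.0, 2.5):
    form_1 = A * math.cos(omega * t) + B * math.sin(omega * t)
    form_2 = C * math.cos(omega * t + phi)
    print(f"t = {t}: difference = {form_1 - form_2:.2e}")
```

Changing `x0` and `v0` changes the amplitude and phase but leaves the period alone, since the period depends only on \(\omega\).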
1.46 For each scenario write a differential equation to describe the situation, specify how many arbitrary constants would be in the general solution, and give one possible set of physical conditions that could be used to find the values of those arbitrary constants. For example, for a mass on a spring that exerts a force \(F = -kx\) you might write the following:
∙ \(m(d^2x/dt^2) = -kx\)
∙ Since this is a second-order equation the general solution will have two arbitrary constants.
∙ To specify their values you could use the position and velocity of the mass at \(t = 0\).
(a) A population of mice \(M(t)\). Each year every mouse has on average 4 babies, and predators kill 100 of the mice.
(b) A sample of radioactive material \(R(t)\). In some fixed interval of time half of the sample decays. (Since the half-life isn't specified here your differential equation will have an unknown constant in it, separate from the arbitrary constants that would appear in the solution if you were to solve it.)
(c) The temperature in a long hallway \(T(x)\) obeys the differential equation \(d^2T/dx^2 = 0\). (You don't have to write the differential equation for this one since we gave it to you.)
(d) A falling object experiences a constant gravitational force \(-mg\) and an upward drag force proportional to the object's speed.
1.47 Consider the differential equation \(x(dy/dx) = y \ln y\).
(a) Many differential equations can be solved for any given initial conditions. This equation, however, is restricted. For instance, we can see immediately that no solution is possible with the initial condition \(y(1) = -1\). Why? Another impossible condition is \(y(0) = 2\). Why? (Assume all variables are real.)
(b) One solution to this problem is the constant function \(y = 1\). Show that this function solves the equation by finding \(x(dy/dx)\) and \(y \ln y\) and showing they are the same.
(c) The general solution to this equation is \(y = e^{Cx}\). Show that this function solves the equation by finding \(x(dy/dx)\) and \(y \ln y\) and showing they are the same.
(d) If the function given in Part (c) is the general solution, it should include all solutions, including the one in Part (b). Show that it does by finding the value of C that leads to that solution.
(e) Of all the functions that solve this differential equation, only one of them has the property that \(y(2) = 5\). By plugging \(x = 2\) and \(y = 5\) into the general solution, find the value of C that matches this condition.
1.48 For the differential equation \(dy/dx = 1 - y^2\), the general solution is \(y = (Ce^{2x} - 1)/(Ce^{2x} + 1)\).
(a) List all of the following functions that are valid solutions to this differential equation. (You do not have to test each one out in the differential equation, which would be tedious. Instead, see which functions fit the "template" provided by the general solution.)
i. \(y = 0\)
ii. \(y = -1\)
iii. \(y = e^{2x}\)
iv. \(y = (3e^{2x} - 1)/(3e^{2x} + 1)\)
v. \(y = (-5e^{2x} - 1)/(-5e^{2x} + 1)\)
vi. \(y = (3e^{2x} - 1)/(-5e^{2x} + 1)\)
(b) Choose one of the functions that you said was a valid solution, and demonstrate that it does indeed satisfy the differential equation. (Go ahead, choose the easy one!)
(c) Find the function that satisfies this differential equation with the initial condition \(y(0) = 5\). (It is not one of the functions listed above.)
(d) Find \(\lim_{C \to \infty}\) of the general solution given in the problem and show that the resulting function is also a valid solution.
1.49 Two particular solutions of the equation \(4f''(z) + f(z) = 4f'(z)\) are \(f_1(z) = e^{z/2}\) and \(f_2(z) = ze^{z/2}\).
(a) Show that any linear combination \(Af_1(z) + Bf_2(z)\) is also a valid solution to the differential equation.
(b) Find the particular solution that satisfies the conditions \(f(2) = 14e\) and \(f'(2) = 12e\).
1.50 A simple harmonic oscillator follows the differential equation \(d^2x/dt^2 = -9x\).
(a) Show that the function \(x = C\sin(3t)\) is a valid solution of this equation for any value of the constant C.
(b) Show that it is impossible, using the solution from Part (a), to meet the initial conditions \(x(0) = 4\) and \(x'(0) = 12\).
(c) Show that \(x = C\sin(3t + \phi)\) is a valid solution to the differential equation for any values of the constants C and \(\phi\).
(d) Find the function that solves this differential equation and meets the initial conditions given in Part (b).
1.51 One of the simplest differential equations is the equation that says "The quantity x grows at a constant rate." Suppose that every one unit of time t, the quantity x grows by exactly k.
(a) Write a differential equation that expresses this rule.
(b) Write the general solution \(x(t)\) to your differential equation. Note that your solution will have two unrelated constants in it: the k from the equation, and the C from the solution.
(c) Choosing \(k = 3\), graph the different solutions for three different C-values.
(d) Choosing \(k = -3\), graph the solutions for the same three C-values.
(e) Explain in words: what kind of growth results from this differential equation? How does k affect the behavior of the system and how does C affect it?
1.52 A simple model for population growth is "the more you have, the more you get": put more mathematically, "the rate of growth of the population is proportional to the population itself."
(a) Write a differential equation that expresses this model. Use P for the population, t for the time, and k for the constant of proportionality.
(b) Solve your differential equation "by inspection": that is, look at it until you can think of any function that works. (For the moment, avoid the "trivial" solution \(P(t) = 0\). We'll come back to that one.)
(c) Now, start to modify your solution. If you add 5, does it still work? If you multiply by 3, does it still work? Your goal here is to arrive at the general solution: that is, a function that satisfies the original differential equation for any value of an arbitrary constant C.
(d) The solution \(P(t) = 0\) should be a special case of your general solution. Find a value of C for which your general solution becomes \(P(t) = 0\).
(e) Choosing \(k = 2\), graph three different solutions of this differential equation, using different starting values \(P(0)\). Make sure your graphs fit the claim that our differential equation began with: the higher the population, the faster the growth (i.e., the higher the slope)!
(f) Choosing \(k = 1/2\), graph solutions for the same three starting values \(P(0)\).
(g) Explain in words: what kind of growth results from this differential equation? How does k affect the behavior of the system and how does C affect it?
1.53 A tank contains 3 L of water. 30 g of salt are mixed evenly throughout the tank. Salt water is draining from the tank at a constant rate of 1 L/h. Let S be the amount of salt (measured in grams) remaining in the tank as a function of t (measured in hours).
(a) If nothing else is happening, write a differential equation for the function \(S(t)\). Begin by asking yourself: how much salt disappears from the tank each hour?
(b) Find the solution to your differential equation in Part (a) with the initial condition \(S(0) = 30\).
(c) Now assume that from the beginning fresh water is being dumped into the tank at a constant rate of 1 L/h, so the total volume of the tank is kept constant at 3 L, and the salt is always mixed evenly throughout the water. Write a different differential equation for \(S(t)\) in this scenario, once again beginning with the question: how much salt disappears from the tank each hour?
(d) Find the solution to your differential equation in Part (c) with the initial condition \(S(0) = 30\).
(e) Draw a quick sketch of your solution.
1.54 Water is leaking from a hole at the bottom of a tank. The height of water in the tank decreases at a rate proportional to the square root of the height.
(a) Why, physically, does the rate of leaking change over time, instead of being constant?
(b) Using h for the height of water in the tank and t for time, write the differential equation for \(h(t)\). Your equation will contain a constant of proportionality k.
(c) Is your constant k positive or negative? How can you tell? What are the units of k?
(d) Based on your differential equation, describe in words how the height of the water will change over time.
(e) Now suppose it is raining into the top of the tank at a rate of 4 in/h. Write a new differential equation for \(h(t)\).
(f) Your new differential equation has one special solution, an "equilibrium solution" where the amount of water flowing in and the amount of water flowing out are perfectly balanced. What is the height of the equilibrium solution? Hint: you can tell by
looking at your differential equation and thinking about \(dh/dt\) at equilibrium!
1.55 You are running a scientific experiment on the bacteria Veribadocillin. The amount of bacteria in your container is constant, and every second it produces 2 g of the chemical Situnsene. You have filters in your container that remove 2% of the Situnsene each second.
(a) Write a differential equation for the amount of Situnsene \(S(t)\) in the container.
(b) Verify that \(S(t) = e^{-0.02t} + 100\) is a solution to the equation you wrote.
(c) Use trial and error to find how you can modify this solution to write the general solution. You could try adding an arbitrary constant to S, multiplying S by an arbitrary constant, or adding or multiplying arbitrary constants to different parts of S. For each thing you try you should calculate both sides of the differential equation and check whether they match for any value of the arbitrary constant. Once you find a solution that works, you have the general solution.
(d) Find the solution \(S(t)\) if the experiment started with 200 g of Situnsene.
(e) You should find from your equation that at late times the amount of Situnsene approaches a constant. Find the value of this constant and explain why the amount of Situnsene can stay constant at this value.
(f) Draw a quick sketch of your solution.
Problems 1.56–1.58 consider how one general solution can often be written in many different forms.
1.56 \(dy/dx = y\) is a first-order linear differential equation.
(a) Show that \(y = (C_1 + C_2)e^x\) is a solution for any constants \(C_1\) and \(C_2\).
(b) The general solution to a first-order linear differential equation should have only one independent arbitrary constant. Explain why the solution in Part (a) does not violate this rule.
(c) Show that \(y = C_1 e^{x + C_2}\) is a solution for any constants \(C_1\) and \(C_2\).
(d) Explain how Part (c) also does not violate the rule that the general solution can have only one independent arbitrary constant.
(e) Write the general solution to this equation in a form containing only one arbitrary constant.
1.57 Consider the linear differential equation \(d^2x/dt^2 = x\).
(a) The general solution to this equation is \(x = Ae^t + Be^{-t}\), where A and B are arbitrary constants. Find the constants A and B given the initial conditions \(x(0) = 0\), \(x'(0) = 1\).
(b) An alternative way of expressing the general solution is \(x = C\sinh(t) + D\cosh(t)\), where \(\sinh(t) = (1/2)(e^t - e^{-t})\) and \(\cosh(t) = (1/2)(e^t + e^{-t})\) and C and D are arbitrary constants. Find the constants C and D given the initial conditions \(x(0) = 0\), \(x'(0) = 1\).
(c) Show that your constants in Parts (a) and (b) lead to the same solution (the only correct solution to this differential equation with these conditions).
1.58 An object moves according to the simple harmonic oscillator equation \(d^2x/dt^2 = -x\).
(a) Show that \(x = A\sin t + B\cos t\) is a valid solution.
(b) Show that \(x = C\sin(t + \phi)\) is a valid solution.
(c) Show that both of these general solutions are in fact the same, by expressing A and B in terms of C and \(\phi\) or vice versa. You may need to look up some trig identities.
For Problems 1.59–1.61, you need to know the following definitions.
∙ Position is x (we're working strictly in one dimension here), and time is t.
∙ Velocity is \(v = dx/dt\), and acceleration is \(a = dv/dt = d^2x/dt^2\).
∙ Force is F.
∙ These quantities are related by Newton's Second Law, \(F = ma\), where m, the mass, is generally a constant.
You may already know the solutions to some of these problems, but you're going to represent and solve them through differential equations.
1.59 A car is zooming down the road at a constant speed of 55 mph.
(a) Given that \(v = 55\), and given the definition of v, write a differential equation for \(x(t)\).
(b) Solve that differential equation: that is, write some function \(x(t)\) that makes that differential equation true. You may be able to write many different \(x(t)\) functions that work for this. Ultimately, you want to wind up with an answer that has an arbitrary constant C in it, to represent all possible solutions.
You now have an equation for \(x(t)\). But it has an arbitrary constant C in it. What does that mean? Given the information we have, "a car
moving at 55 mph," we don't know where the car is, until we get more information.
(c) Here comes more information: at time \(t = 0\), the car was at position \(x = -5\). Use that fact to find C and write the real equation for \(x(t)\): the one that will actually tell us where the car is.
(d) Describe in words the motion described by your \(x(t)\) function. Does it make sense with the physical situation? Why or why not?
1.60 A piano is dropped from the top of a building. Neglecting friction for the moment, the force is given by \(F = -mg\), where g is a positive constant.
(a) Write and solve an algebraic equation for \(a(t)\).
(b) Given that, write a differential equation for \(v(t)\). (The position x should not be in this equation.)
(c) Solve that differential equation. Once again, you may be able to write many different \(v(t)\) functions that work, so use an arbitrary constant to represent all possible solutions. Call your constant \(C_1\) instead of just C.
(d) Now, using your answer to Part (c) and the definition of velocity, write a differential equation for \(x(t)\).
(e) Solve that. Of course, the constant \(C_1\) will be in there, since it was part of v. But still, there will be many possible solutions. To represent them all, you will need a new arbitrary constant \(C_2\).
You now have an equation for \(x(t)\) that has two arbitrary constants, \(C_1\) and \(C_2\). We can find the first one based on the given information.
(f) Since the piano was "dropped" (not "thrown"), we can assume that \(v(0) = 0\). Use this to find the constant \(C_1\). Then rewrite your \(x(t)\) formula with only one constant, \(C_2\).
(g) Let's look at this graphically. Suppose \(g = 10\). Pick a value for \(C_2\) and draw a graph of your \(x(t)\) function. (You can do this using a calculator or computer, but it should also be easy by hand.) Then pick a different value for \(C_2\) and draw another graph. Continue until you have at least four graphs. What do they all have in common? How do they differ?
(h) To choose one specific function, we need one more piece of information that was not specified in the problem. What is it?
(i) Make up a reasonable answer and write the function \(x(t)\) that matches it, with no arbitrary constants.
(j) Describe the motion of the piano in words. Does it make sense with the physical situation? Does it correspond to your answer?
1.61 A car is rolling along a flat highway. Neither the gas pedal nor the brakes are being pushed; the only force on the car is air resistance. A simple model is to say that air resistance is proportional to velocity: that is, the faster you go, the more resistance you feel. We can write that as \(F = -bv\). Hint: if you have trouble with this problem you may want to work through Problem 1.60 as a guide.
(a) What is the sign of b? Explain why this equation would not make sense physically if b had the other sign.
(b) What are the SI units of b?
(c) Write a differential equation for v.
(d) Solve it in the most general terms possible. (This will introduce an arbitrary constant.)
(e) Now, use that answer to write a differential equation for x.
(f) Solve it in the most general terms possible. (This will introduce another constant.)
(g) Suppose the car started at position \(x = x_0\) with velocity \(v = v_0\). Find the values of the arbitrary constants and write the function \(x(t)\) that describes the motion of the car.
(h) Check that each term in your equation has correct units.
(i) Take \(m = 2000\), \(b = 1000\), \(x_0 = 0\), and \(v_0 = 20\) in SI units. Draw quick sketches of \(x(t)\) and \(v(t)\).
(j) Describe in words the motion your sketch shows. Does it make sense with the physical situation?
(k) Sketch the solution again with \(b = 2000\) (and all other numbers the same). How did that affect the motion?
1.62 For each SHO below find the period, amplitude, and phase, or say which ones cannot be determined from the given information.
(a) \(x''(t) = -9x\)
(b) \(x''(t) = -25x\), \(x(0) = 3\), \(x'(0) = 0\)
(c) \(x''(t) = qx\) where q is a negative constant
(d) \(x''(t) = -e^{2z}x\), \(x(0) = 0\), \(x'(0) = 1\) where z is a constant
1.63 The simple harmonic oscillator equation \(x''(t) = -\omega^2 x(t)\) (where \(\omega\) is a constant) has solution \(x(t) = C\cos(\omega t + \phi)\). Solve for C (the
amplitude of oscillation) and ϕ (the phase) in terms of the initial conditions x(0) = x0 and v(0) = v0 and the angular frequency ω.
1.64 For each SHO below find the period, amplitude, and phase, or say which ones cannot be determined from the given information.
(a) A 5 kg mass is on a spring with spring constant 10 N/m. The force exerted by a spring is −kx, where k is the spring constant and x is the displacement from equilibrium.
(b) The system in Part (a) is pulled out by a distance of 0.2 m and released from rest.
(c) A 2 m pendulum. A simple pendulum approximately obeys the equation θ′′(t) = −(g∕L)θ where g = 9.8 m/s² and L is the length of the pendulum.
(d) A 2 m pendulum starts at θ = 0 with initial angular velocity θ′(0) = 3 s⁻¹.
don’t.) The drawing also is not meant to suggest that the solution is linear, even for a short
while. It simply says one solution rises with a slope of two through that point, and therefore
momentarily looks something like the drawing.
2. On graph paper, draw lines similar to the one shown above for all integer points 0 ≤
x ≤ 4 and −4 ≤ y ≤ 0 (a total of 25 points in the fourth quadrant).
3. Consider now the solution that begins at (0, −4). The slope at that point is 4, so start
moving to the right with a slope of four. As the curve rises, what happens to the slope?
What happens to the function as a result? Draw the function that results.
You should have wound up with a drawing something like the one on the left.
[Figure: plot with x from 0 to 4 on the horizontal axis and y from −4 to 4 on the vertical axis.]
Keep in mind that this is not ultimately about drawing curves, but about predicting behavior. If y represents some meaningful quantity and x represents time we can say the following: if y starts at −4 it will increase rapidly at first and then gradually slow down, approaching but never reaching zero. We figured all this out visually, without ever solving the differential equation.
4. Consider now the solution that begins at (0, 0). Draw this function as you drew the previous one. Then write a brief paragraph describing this solution, just as we described the (0, −4) solution.
5. Draw in slope lines for all the points on 0 ≤ x ≤ 4 and 1 ≤ y ≤ 4 (first quadrant).
6. Consider now the solution that begins at (0, 1). Draw this function as you drew the previous one. Then write a brief paragraph describing this solution, just as we described the (0, −4) solution.
The Problem
The picture below shows what electrical engineers call an “RC circuit”: a simple circuit with
a battery, a resistor, and a capacitor. (If you have never studied circuits then the physics in
this problem won’t make much sense to you, but you should still be able to follow the math,
starting with Equation 1.4.2.) Over time, a charge Q will build up on the capacitor according
to the equation V = IR + Q ∕C, where the current I = dQ ∕dt.
[Figure: RC circuit containing a battery V, a resistor R, and a capacitor C.]
If the voltage of the battery is V = 9 V, the resistance of the resistor is R = (9∕2) Ω, and the capacitance of the capacitor is C = (1∕3) F, then the equation for this accumulated charge (in coulombs) becomes:
$$\frac{dQ}{dt} = 2 - \frac{2}{3}Q \tag{1.4.2}$$
What behavior can we expect from this system over time?
∙ If you look at Equation 1.4.2 and ask the question “What makes dQ ∕dt = 0?” you
quickly arrive at the answer Q = 3. More generally, whenever you find a Q -value for
which dQ ∕dt is always zero, you have found a “constant solution” or “equilibrium
solution.” If Q reaches this value, it will stay there forever.
∙ If you know circuit theory, you know that when a charge Q = 3 builds up on the capac-
itor, the voltage drop across the capacitor is Q ∕C = 9 V, perfectly opposing the voltage
of the battery, so no current flows. (If you have never studied circuits, you can still take
comfort in knowing that our conclusion makes perfect physical sense.)
[Figure: slope field for Equation 1.4.2, with t on the horizontal axis and Q on the vertical axis, each running from 0 to 6.]
Now, suppose Q(0) = 0. The slope field shows clearly what will happen. Q will rise quickly at first, but as it rises, the slope will flatten out, so Q will asymptotically approach 3 from the bottom. Similarly, if Q starts out higher than 3, it will move downward more and more slowly, approaching 3 from the top.
We can now succinctly describe all the possible behaviors of the circuit: if the charge starts at Q = 3 it will stay that way forever, and if it starts at any other value it will asymptotically approach Q = 3 from above or below.
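The behavior read off the slope field can also be checked by following the slopes numerically. The sketch below is only an illustration: it assumes a plain Euler step (repeatedly moving a short distance along the current slope), and the function names, step size, and end time are our own choices rather than anything from the text.

```python
def dQ_dt(Q):
    """Right-hand side of Equation 1.4.2: dQ/dt = 2 - (2/3) Q."""
    return 2.0 - (2.0 / 3.0) * Q

def euler(Q0, t_end=20.0, dt=0.001):
    """Follow the slope field from Q(0) = Q0 in small steps of size dt (assumed values)."""
    Q = Q0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        Q += dQ_dt(Q) * dt   # move along the local slope for a short time
    return Q

# Start below, at, and above the equilibrium value Q = 3.
for Q0 in (0.0, 3.0, 6.0):
    print(Q0, round(euler(Q0), 4))
```

Each starting value should end up close to 3, rising toward it from below or falling toward it from above, which is what the slope field predicts.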
Equilibrium
We can rephrase our conclusions about the RC circuit
in the following way: Equation 1.4.2 has a stable equilib-
rium at Q = 3. What makes it “equilibrium” is that, if Q = 3, then Q will stay 3 forever. What
makes it “stable” is that, if Q is close to but not equal to 3, it will approach 3. (In the opposite
situation, an “unstable equilibrium,” a system near equilibrium will tend to move away from
it: picture a pencil balanced perfectly on its point.)
Equilibrium states are classified by what happens when the system is near, but not in, the equilibrium
state. If the system is near a “stable equilibrium” it will tend to move toward equilibrium. If it’s near
an “unstable equilibrium” the system will tend to move away from equilibrium.
EXAMPLE Equilibrium
Problem:
Draw a slope field for the differential equation
$$\frac{dy}{dt} = y(y - 1)^2(y + 1)$$
Identify the equilibrium states and classify each one as stable or unstable.
Solution:
The easiest way to start drawing a slope field is often to find the points where
dy∕dt = 0. In this case that’s at y = 0, ±1, so we can start by filling in those points on a
slope field.
[Figure: slope field showing only the zero-slope segments at y = −1, 0, and 1, with t and y each running from −2 to 2.]
If y is any of the three values shown above, it will stay at that value forever; these are equilibrium states. To classify them and analyze the behavior, we consider other y-values.
∙ For y > 1 the derivative is positive. For values just above 1, the slope is positive but small, indicating a shallow incline; as y grows, the slope gets steeper.
∙ For 0 < y < 1 the derivative is also positive. It is shallow near y = 0 and y = 1 and steeper toward the middle.
∙ The derivative is negative for −1 < y < 0, once again most steeply somewhere in the middle.
∙ Finally, for y < −1 the derivative is positive, getting more so as you go lower.
Putting all of that together we get the following slope field.
[Figure: complete slope field for the equation, with t and y each running from −2 to 2.]
Look at the slope field and see if you can figure out how this system will behave if it doesn’t start in one of its three equilibrium states.
∙ If the system starts anywhere below zero, it will approach y = −1 (from above or below).
∙ If the system starts between 0 and 1, it will approach 1 from below.
∙ If the system starts above 1, it will blow up.
y = −1 is a stable equilibrium, and y = 0 is unstable. We could say that y = 1 is stable from below but unstable from above; such a state is sometimes called “semi-stable.”
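The classification above comes down to checking the sign of dy∕dt just below and just above each equilibrium. As a rough sketch of that reasoning (the helper `classify` and the offset `eps` are our own illustrative choices, and the test assumes the slope is nonzero at both nearby points):

```python
def f(y):
    """Right-hand side of the example equation dy/dt = y (y - 1)^2 (y + 1)."""
    return y * (y - 1) ** 2 * (y + 1)

def classify(rhs, y_eq, eps=1e-3):
    """Classify an equilibrium from the sign of dy/dt just below and just above it."""
    below = rhs(y_eq - eps)
    above = rhs(y_eq + eps)
    if below > 0 and above < 0:
        return "stable"        # nearby values are pushed back toward y_eq
    if below < 0 and above > 0:
        return "unstable"      # nearby values are pushed away from y_eq
    return "semi-stable"       # same sign on both sides: approached from one side only

for y_eq in (-1, 0, 1):
    print(y_eq, classify(f, y_eq))
```

This reproduces the conclusions of the example: y = −1 stable, y = 0 unstable, y = 1 semi-stable.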
You may now think that “equilibrium means the slope is zero,” but it isn’t that simple: equi-
librium occurs at a y-value for which the slope is always zero. The example below illustrates
this distinction, and shows how you can still use a slope field to analyze a more complicated
system.
Problem:
Draw a slope field for the differential equation
$$\frac{dy}{dx} = y - 2x$$
and describe the possible behaviors of the solutions.
Solution:
Stepping Back
A first-order differential equation gives you the slope y′ as a function of the point (x, y) where
you are; that is why slope fields represent such equations so well. A second-order differential
equation gives you the concavity y′′ as a function of both the point (x, y) and the slope y′ , so
such equations cannot be represented by slope fields. In Chapter 10 we will introduce “phase
portraits” to visually represent the solutions to higher order equations.
In the problems for this section you will draw many slope fields by hand. That’s a useful
skill because it forces you to think about the behavior of the system and about how the slope
field is related to the equation. Once you have practiced that skill enough to be comfortable
with the technique, you can learn a tremendous amount about a system by glancing over a
computer-drawn slope field.
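As one possible illustration of a computer-drawn slope field, the sketch below prints a crude text version for dy∕dx = y − 2x, the equation from the example above. The glyph thresholds, grid ranges, and function names are arbitrary choices of ours; a real plotting package would draw actual line segments.

```python
def slope(x, y):
    """Right-hand side of dy/dx = y - 2x."""
    return y - 2 * x

def glyph(m):
    """Pick a character that roughly matches a slope m (thresholds are arbitrary)."""
    if abs(m) < 0.5:
        return "-"
    if abs(m) > 2.5:
        return "|"
    return "/" if m > 0 else "\\"

def slope_field(f, xs, ys):
    """Return one text row per y-value, largest y first, one glyph per x-value."""
    return [" ".join(glyph(f(x, y)) for x in xs) for y in sorted(ys, reverse=True)]

for row in slope_field(slope, range(-3, 4), range(-4, 9)):
    print(row)
```

Along the line y = 2x + 2 every slope equals 2, so the "/" marks line up along that line; away from it the marks steepen in one direction or the other.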
1.77 Draw a slope field for the differential equation dy∕dx = k(y − a)(y + b) where k, a, and b are all positive constants. Since you don’t have numerical values for these constants you can’t know exactly what the actual slopes are, but you can still make a slope field that shows where the slopes are zero, positive or negative, and where they tend to be large or small. Label the equilibrium points on the y-axis.
1.78 If you’re not familiar with the wonderful function $e^{-x^2}$, there are two things you need to know for this problem. First, the equation $f'(x) = e^{-x^2}$ has no simple analytical solution. Second, the graph of $y = e^{-x^2}$ is shown below; it is always positive, reaches an absolute maximum at (0, 1), and approaches 0 as x → ±∞.
[Figure: graph of $y = e^{-x^2}$.]
(a) Based on this graph, draw a slope field for the equation $f'(x) = e^{-x^2}$.
(b) Let f(x) be the function that follows the differential equation $f'(x) = e^{-x^2}$ and contains the point (−3, 0). Use your slope field to sketch a graph of y = f(x).
(c) Use your sketch to answer the question: What is the x-value of the point of inflection of f(x)?
(d) Repeat Parts (a)–(b) for the differential equation $f'(x) = e^{-f^2}$. Then answer the question: what is the y-value of the point of inflection?
1.79 The equations dy∕dt = 2y − 1 and dy∕dt = 1 − 2y look a great deal alike, but lead to very different types of behavior. In this problem, you will draw slope fields and analyze the behavior of both equations; this will require two separate graphs.
Note: because dy∕dt has no t-dependence, you can ignore negative t-values without any loss of generality. So feel free to draw your slope fields only for positive t.
(a) For both of these graphs, dy∕dt = 0 when y = 1∕2. So begin both slope fields with horizontal lines at all points where y = 1∕2.
(b) For dy∕dt = 2y − 1, compute the slope when y = 1, and then draw in slope lines at all points where y = 1. Then repeat for y = 2, y = 3, y = 0, y = −1, and y = −2.
(c) Repeat Part (b) for dy∕dt = 1 − 2y.
(d) Draw curves representing functions that begin at (0, 0), (0, 1∕2), and (0, 1) for both equations (six curves in all).
(e) For the equation dy∕dt = 2y − 1, the function y = 1∕2 represents an equilibrium solution: if y is ever 1∕2, it will stay 1∕2 forever. If y is “bumped” slightly above 1∕2, how will it evolve over time? If y is “bumped” slightly below 1∕2, how will it evolve over time? Does y = 1∕2 represent a stable or unstable equilibrium?
(f) Repeat Part (e) for the equation dy∕dt = 1 − 2y.
1.80 The equations dy∕dx = y − x and dy∕dx = x − y + 2 both have the same linear solution y = x + 1, but behave quite differently otherwise. In this problem, you will draw slope fields and analyze the behavior of both equations; this will require two separate graphs.
(a) Show that y = x + 1 is a valid solution to both differential equations. (You will do this, not with a graph, but by plugging in.)
(b) If y = x + 1 is a solution to a differential equation, then dy∕dx must equal 1 everywhere along that line. So begin both slope fields by drawing lines with a slope of 1 at the points (−3, −2), (−2, −1), and so on through (3, 4).
(c) In the equation dy∕dx = y − x, what happens to dy∕dx if you increase y by one unit while leaving x unchanged? Based on your answer, draw in the slope lines at all points one unit higher than the points you drew in Part (b).
(d) Based on the same logic you used in Part (c), draw in the slope lines one unit higher. Then move down to one unit, and then two units, below the line y = x + 1.
(e) Repeat Parts (c) and (d) on your other graph for dy∕dx = x − y + 2.
(f) For the equation dy∕dx = y − x, if the function starts on the line y = x + 1, it will stay on that line forever. How will a function evolve if it starts just above that line? Just below that line?
(g) For the equation dy∕dx = x − y + 2, if the function starts on the line y = x + 1, it will stay on that line forever. How will a function evolve if it starts just above that line? Just below that line?
1.81 In the Explanation (Section 1.4.2), we used a slope field to determine that the equation dy∕dx = y − 2x has only one linear solution,
which is y = 2x + 2. In this problem, you will prove the same result analytically.
(a) If there is a linear solution, it can be written in the form y = mx + b. So plug y = mx + b into the differential equation dy∕dx = y − 2x.
(b) The resulting equation sets two linear functions of x equal to each other. Put each function (the one on the left side of the equal sign, and the one on the right side) into the standard form of a line.
(c) If the left side of the equation is the same function as the right side, then their slopes (coefficients of x) must be equal, and their y-intercepts (constant parts) must also be equal. Use these two facts to solve for m and b.
(Incidentally, an alternative approach is to remember that a line is the only function with zero concavity. So starting with dy∕dx = y − 2x you can set $d^2y/dx^2 = 0$ and reach the same conclusion.)
1.82 The population of the planet Foom is described by a function F(t). Every Foomian has, on average, 2 babies per year.
(a) Assuming no Foomians ever die, write a differential equation for F(t).
(b) Sketch a slope field for the equation you wrote down in Part (a). Since the number of Foomians can never be negative you only need to include values F ≥ 0. Identify any equilibrium points in this range.
(c) Using this slope field, describe all the possible long-term behaviors of F(t).
1.83 A vicious rumor is spreading through a campus with a total population P. The number of students who have heard the rumor obeys the “logistic” differential equation dS∕dt = kS(P − S) where k is a positive constant.
(a) Draw a slope field for this equation. Because S cannot be smaller than 0 or larger than P you only need to include values of S in that range. Since you don’t have numerical values for P and k you can’t know exactly what the actual slopes are, but you can still make a slope field that shows where the slopes are zero, positive or negative, and where they tend to be large or small.
(b) For what values of S will S(t) remain constant? (You can figure this out either from your slope field or directly from the differential equation.)
(c) Choose a value of S(0) on your plot. (You don’t need to pick a number; just mark a point on your plot.) Sketch the solution S(t) with that initial condition on your slope field. Do not use a solution where S(t) is constant.
(d) Based on your slope field, what is $\lim_{t\to\infty} S(t)$?
1.84 An object at −200°C is placed in a −1°C room. The object begins to warm, its temperature u following the differential equation $du/dt = u^4 - 1$.
(a) Explain why this equation, in this form, doesn’t make sense for u > 1.
(b) Using a slope field, sketch the temperature u(t). Be sure to show the correct concavity at all times and the correct limit $\lim_{t\to\infty} u(t)$.
1.85 Venusians, the plucky inhabitants of the planet Venus, reproduce by a process called “quatosis”: every year, every Venusian splits into four little Venusians. Left unchecked, the population of Venusians would soon overwhelm their planet. However, Martians kidnap 12,000 Venusians from the planet every year.
(a) Write the differential equation that governs the number of Venusians V(t). Your differential equation should take into account both the gain due to quatosis, and the loss due to kidnapping.
(b) Draw a slope field for that differential equation. You can safely ignore negative t and V-values, which confines your slope field to one quadrant. Beyond that, part of your job is to choose enough V-values to see the behavior of this differential equation.
(c) The equation has one equilibrium state; what is it?
(d) Is the equilibrium state stable or unstable? How can you tell?
(e) Explain in words to a non-mathematician (from Earth) what will happen if the Venusian population starts higher than the equilibrium value, and why it will happen.
1.86 The ancient Martian race is dying. The only way they keep their ancient culture going is by kidnapping; every year, they steal 12,000 Venusians. Through a high-tech process known as “Marsification” they transform their victims into Martians. Unfortunately, the process is unstable; the Martians cannot reproduce, and every year, one out of every five living Martians dies.
(a) Write the differential equation that governs the number of Martians M(t). Your differential equation should take into account both the loss due to death, and the gain due to Marsification.
(b) Draw a slope field for that differential equation. You can safely ignore negative t and M-values, which confines your slope field to one quadrant. Beyond that, part of your job is to choose enough M-values to see the behavior of this differential equation.
(c) The equation has one equilibrium state; what is it?
(d) Is the equilibrium state stable, or unstable? How can you tell?
(e) Explain in words to a non-mathematician (from Earth) why your answer to Part (d) makes sense. In other words, how could you have looked at our description of Mars and predicted without doing any math whether the equilibrium would be stable or unstable?
1.87 Newton’s law of heating and cooling says that the rate of change of the temperature Q of an object is proportional to the difference between the temperature of the object and the temperature QR of the room it’s in.
(a) Write this law as a differential equation for Q. Your equation will have a constant of proportionality in it, which you can call k. Assume k is positive and make sure the signs in your equation make sense physically. (Does it behave correctly when Q > QR and when Q < QR?)
(b) Let k = 1 s⁻¹ and QR = 20°C. Draw a slope field for Q. Choose a range of values for Q and t that clearly shows the possible behaviors of the system.
Now suppose the same object is in the same room, but now the room is steadily heating up: QR = 20 + t with temperature in degrees Celsius and time in seconds.
(c) Rewrite the differential equation. (Your answer should have t in it, not QR.)
(d) Draw the slope field for this new situation.
(e) Your slope field should show that there is an “attractor solution,” a linear Q = at + b that the temperature approaches no matter how it starts out. Using your slope field as a guide and some algebraic trial and error, find that linear solution and verify that it works in the differential equation.
(f) Describe what the temperature of the object does over time if Q(0) = 20°C.
$$\frac{dR}{dt} = 5R - 100 \tag{1.5.2}$$
The technique comprises three steps: separate the variables, integrate both sides, and then
solve for R.
1. Separate the Variables. Your goal here is to algebraically manipulate the equation into the form (function of R) dR = (function of t) dt. Equation 1.5.2 readily becomes:
$$\frac{1}{5R - 100}\,dR = dt$$
2. Integrate both sides. Make sure to put a +C on the right side!
$$\int \frac{1}{5R - 100}\,dR = \int dt$$
$$\frac{1}{5}\ln|5R - 100| = t + C$$
Why only the right side? You could certainly write +C1 on the left and +C2 on the
right. But then you could combine the two constants into C2 − C1 on the right. As
we have discussed before, C2 − C1 simply means “any constant” so we lose nothing by
calling it C.
On the other hand, many students make the mistake of leaving out the +C
entirely at this step, planning to toss it in later. This is an understandable habit left
over from introductory calculus, where adding +C to every answer is often the last
step before turning in the “integrals” test. That doesn’t work here. Add the C when
you integrate, and then it will appear in the correct spot in the final function.
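3. Solve for R.
$$\ln|5R - 100| = 5t + 5C$$
$$|5R - 100| = e^{5t + 5C} = e^{5C}e^{5t}$$
$$5R - 100 = \pm e^{5C}e^{5t}$$
$$R = \pm\frac{1}{5}e^{5C}e^{5t} + 20$$
Renaming the constant $\pm\frac{1}{5}e^{5C}$ as a new arbitrary constant C gives $R = Ce^{5t} + 20$.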
Another Look at C
The proper care and handling of your arbitrary constants is the subtlest part of this technique. In the example above we turned $C_2 - C_1$ into C and later turned $\pm(1/5)e^{5C}$ into C. As we discussed in Section 1.3, when you are finding a general solution, you can keep your constants as simple as possible.
What you cannot do, as we stressed above, is simply add +C at the end of the problem. Test $R = (1/5)e^{5t} + C$ in Equation 1.5.2; it doesn’t work. Adding the C when you integrate, and carrying it around the algebra from there, ensures that it ends up in the correct place in the final solution.
In addition, you cannot turn 5C or anything else into C when you are checking a solution. Remember that C in the equation means “any number works here”; for instance, $R = 7e^{5t} + 20$ is a valid solution to Equation 1.5.2. Checking should follow all the normal rules of algebra.
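The two tests described above can also be run numerically. The sketch below is illustrative only: it estimates dR∕dt with a central difference (our choice, with an assumed step h) and compares it with 5R − 100 for both candidate forms; the helper names are hypothetical.

```python
import math

def residual(R, t, h=1e-6):
    """dR/dt - (5R - 100) at time t, with dR/dt estimated by a central difference."""
    dR_dt = (R(t + h) - R(t - h)) / (2 * h)
    return dR_dt - (5 * R(t) - 100)

def general(C):
    """Candidate R(t) = C e^{5t} + 20, with C carried through the algebra."""
    return lambda t: C * math.exp(5 * t) + 20

def tacked_on(C):
    """Candidate R(t) = (1/5) e^{5t} + C, with +C simply added at the end."""
    return lambda t: 0.2 * math.exp(5 * t) + C

for C in (-3.0, 0.0, 7.0):
    print(C, residual(general(C), 0.1), residual(tacked_on(C), 0.1))
```

The first residual should be essentially zero for every C. The second works out to 100 − 5C, which vanishes only for the single value C = 20, so that form is not a general solution.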
Problem:
Find the general solution to the equation:
$$\frac{dy}{dx} = \frac{y}{e^x \ln y} \quad (y > 0) \tag{1.5.3}$$
Solution:
$$\frac{\ln y}{y}\,dy = \frac{1}{e^x}\,dx$$
$$\int \frac{\ln y}{y}\,dy = \int e^{-x}\,dx$$
$$\frac{1}{2}(\ln y)^2 = -e^{-x} + C$$
$$y = e^{\pm\sqrt{-2e^{-x} + C}}$$
You can readily verify that this solves Equation 1.5.3 for any value of C.
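One way to carry out that verification without doing the algebra by hand is a quick numerical spot check. This is only a sketch under our own assumptions: the derivative is approximated by a central difference, and C and x are chosen so that the quantity under the square root stays positive.

```python
import math

def y_solution(x, C, sign=1):
    """y = exp(± sqrt(C - 2 e^{-x})); requires C - 2 e^{-x} > 0."""
    return math.exp(sign * math.sqrt(C - 2 * math.exp(-x)))

def residual_153(x, C, sign=1, h=1e-6):
    """dy/dx - y / (e^x ln y) for Equation 1.5.3, using a central difference."""
    dy_dx = (y_solution(x + h, C, sign) - y_solution(x - h, C, sign)) / (2 * h)
    y = y_solution(x, C, sign)
    return dy_dx - y / (math.exp(x) * math.log(y))

for sign in (1, -1):
    for x in (0.0, 1.0, 2.0):
        print(sign, x, residual_153(x, C=5.0, sign=sign))
```

Every printed residual should be close to zero, for either sign in the exponent.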
Separation of variables fails in both cases. In the first example, there is no algebraic way to
get all the x-dependence on one side and the y-dependence on the other. In the second, the
variables are easy enough to separate, but you can’t integrate the function!
Such problems may simply require a different technique. There are many to choose from,
and you will see more as the book progresses. But many important differential equations turn
out to be literally unsolvable. In that case, approximation techniques are used—either to sim-
plify the differential equation itself, or to approach the answer numerically with a computer.
(a) Find the specific solutions for R(0) = 10, R(0) = 20, and R(0) = 30.
(b) Sketch the three solutions you found together on one plot.
(c) For each of these solutions, explain why the behavior you found (increasing, decreasing, or staying constant) makes sense. Don’t answer in terms of equations and variables; explain it in terms of rabbits and hunters!
(d) Identify the equilibrium solution for this equation. Is it stable or unstable?
1.104 According to NASA, the Arctic sea ice extent is melting at a rate of 10% per decade. Suppose that at just the moment when the ice extent reaches 25 million square kilometers of ice, a crack team of engineers begins replenishing the ice at a constant rate of 1 million km² per decade.
Let P equal the size of the Arctic sea ice extent, measured in millions of square kilometers. Let t equal the time, measured in decades.
(a) Write a differential equation for P(t).
(b) Find the equilibrium solution to your equation. Does it represent a stable or unstable equilibrium? How can you tell?
(c) Solve the differential equation, with initial condition, to find the function P(t).
(d) Verify that your solution solves the differential equation.
1.105 James has just jumped out of an airplane. After he opens his parachute he experiences two forces: the constant force of gravity, and a wind drag that is proportional to his velocity. His height may therefore follow the equation:
$$\frac{d^2h}{dt^2} = -9.8 - 2\frac{dh}{dt} \tag{1.5.4}$$
As a second-order differential equation, this is not technically solvable by separation of variables. However, because the variable h appears only in its derivatives, we can turn this into a first-order equation, and solve that by separation.
(a) Letting v = dh∕dt, rewrite Equation 1.5.4 as a first-order differential equation in v.
(b) Your equation in Part (a) suggests that there is one velocity for which dv∕dt = 0. What is this velocity?
(c) Solve your equation using separation of variables. Your solution should contain an arbitrary constant: call it C1.
(d) Calculate $\lim_{t\to\infty} v(t)$ and use it to describe what is physically happening to James after he’s been in the air for a long time.
(e) Now that you have a velocity function v(t), integrate it with respect to t to find a position function h(t). This will introduce a second constant C2.
(f) Supposing James begins at a height of 3000 m with no initial velocity, what is his height 3 s later?
1.106 A frozen turkey placed in a 35°F refrigerator begins to thaw according to the equation dQ∕dt = (1∕20)(35 − Q) where Q is the temperature of the turkey. Assume temperature is measured in degrees Fahrenheit and time is measured in hours.
(a) Draw a slope field for 0 ≤ t ≤ 5 and Q = −5, 15, 25, 35, 45, 55.
(b) Based on your slope field, what is the equilibrium solution to this equation? Is it a stable or unstable equilibrium?
(c) Solve the differential equation by separation of variables, assuming the turkey started (t = 0) in a 0°F freezer.
(d) Based on your solution, what is $\lim_{t\to\infty} Q(t)$?
(e) How long will it take the turkey to reach 32°F?
1.107 The Expanding Universe
The universe is expanding according to “Hubble’s Law”:
$$\frac{da}{dt} = \sqrt{\frac{8\pi G}{3c^2}\rho}\;a \tag{1.5.5}$$
where a is the distance between two reference objects and ρ is the energy density of the universe. (Note: it will be easier to work with this equation if you collect the constants into one letter: say, $\beta = \sqrt{8\pi G/3c^2}$.)
(a) If we consider the universe to be primarily composed of matter, then $\rho = k/a^3$. (This makes sense if you think about it.) Solve the resulting differential equation.
(b) If we consider the universe to be primarily composed of radiation, then $\rho = k/a^4$. Solve the resulting differential equation.
(c) If we consider the universe to be primarily composed of dark energy (Einstein’s “cosmological constant”) then ρ is a constant. Solve the resulting differential equation.
(d) Describe in words the difference between how a universe expands when it is filled with matter or radiation vs. how it expands when it is filled with dark energy.
opposes its motion: $F_{\text{damping}} = -cv$. Putting these together with Newton’s Second Law gives the equation of motion for this mass, ma = −cv − kx. For our example we take m = 1, c = 8, and k = 15.
$$\frac{d^2x}{dt^2} + 8\frac{dx}{dt} + 15x = 0 \tag{1.6.2}$$
To solve this equation and determine the behavior of the mass, our strategy will be to guess an exponential function $x(t) = e^{pt}$. It’s important to note that p is not an arbitrary constant! An arbitrary constant says: “This solution will work for all values of C.” We are asking the question: “Is there any constant p for which $x = e^{pt}$ is a valid solution to Equation 1.6.2?” If there is, its derivatives look like this.
$$x = e^{pt} \;\rightarrow\; \frac{dx}{dt} = pe^{pt} \;\rightarrow\; \frac{d^2x}{dt^2} = p^2e^{pt}$$
Now we plug all that into Equation 1.6.2 and see what we get.
$$p^2e^{pt} + 8pe^{pt} + 15e^{pt} = 0 \;\rightarrow\; \left(p^2 + 8p + 15\right)e^{pt} = 0 \;\rightarrow\; p = -3,\; p = -5$$
Our guess has now justified itself: it has given us two specific solutions, $x = e^{-3t}$ and $x = e^{-5t}$.
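The last step above is just solving a quadratic in p. As a small sketch of that step (the function names are ours, and the code assumes the quadratic has real roots, as it does here):

```python
import math

def real_roots(a, b, c):
    """Real roots of a p^2 + b p + c = 0 via the quadratic formula."""
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real roots, so no real exponent p works")
    root = math.sqrt(disc)
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

def residual(p, t):
    """Left side of Equation 1.6.2 evaluated for the guess x = e^{pt}."""
    x = math.exp(p * t)
    return p * p * x + 8 * p * x + 15 * x

print(real_roots(1, 8, 15))                    # the two exponents found above
print(residual(-3, 0.7), residual(-5, 0.7))    # both essentially zero
print(residual(2, 0.7))                        # a growing exponential fails
```

Only the two roots of $p^2 + 8p + 15$ make the left side vanish for all t; any other p, such as p = 2, leaves a nonzero remainder.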
Of course, you’re never done with a real-world problem until you interpret the answer. If we had ended up with $x = e^{2t}$ we would know we had made a mistake; there’s no way our underwater mass is going to go shooting off like a rocket. But negative exponential solutions make perfect sense. Our spring will ease down toward its relaxed position at x = 0.
Even assuming you followed all that, you may not feel that you could do the same thing on your own. The guess $x = e^{pt}$ came out of nowhere; how do you choose a good guess?
The short answer is “try stuff, but be smart.” There are only a few basic functions to choose from: constant, exponential, power, trig, logarithm. For Equation 1.6.2 most of these forms can be eliminated at a glance. If you try a power function such as $x = t^3$, the three different terms will all be different powers of t; they will add up to a polynomial. If you try x = sin t, you will find yourself adding two sines and a cosine; again, they cannot possibly cancel. An exponential function seems at least a plausible guess, since you can see before you try it that it will create three different exponential terms that may, possibly, add up to zero.²
Problem:
Use guess and check to find a solution to xy′ (x) + 3y(x) = 0.
Solution:
We’ll try the guess $y = e^{kx}$. Plug that and its derivative $y'(x) = ke^{kx}$ into the differential equation.
$$xke^{kx} + 3e^{kx} = 0 \;\rightarrow\; e^{kx}(xk + 3) = 0 \tag{1.6.3}$$
A common student mistake is to solve that equation for k, but k = −3∕x will not provide a valid solution because our guess was $y = e^{kx}$ for a constant k. The assumption of a constant k was built into the process when we wrote $y'(x) = ke^{kx}$. No constant k solves Equation 1.6.3, so we conclude that $e^{kx}$ was not a good guess.
2 Section 10.2 will give more specific guidelines for a good guess.
We will therefore try a different guess: $y = x^k$ (again with a constant k). Plug that and its derivative $y'(x) = kx^{k-1}$ into the differential equation.
$$xkx^{k-1} + 3x^k = 0 \;\rightarrow\; x^k(k + 3) = 0 \;\rightarrow\; k = -3$$
We now have a working solution, $y = x^{-3}$. You can verify this by testing it in the original equation. (In this case you can also arrive at the same solution by separating variables.)
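The difference between the two guesses shows up clearly if you evaluate the left side of the equation at several x-values. The following sketch does that with hand-coded derivatives; the function names are our own.

```python
import math

def power_residual(k, x):
    """x y' + 3y for the guess y = x^k, using y' = k x^(k-1)."""
    return x * (k * x ** (k - 1)) + 3 * x ** k

def exponential_residual(k, x):
    """x y' + 3y for the guess y = e^{kx}, using y' = k e^{kx}."""
    return x * (k * math.exp(k * x)) + 3 * math.exp(k * x)

for x in (0.5, 1.0, 2.0, 3.0):
    print(x, power_residual(-3, x), exponential_residual(-3, x))
```

With k = −3 the power guess gives zero at every x. The exponential guess with k = −3 gives zero only at x = 1, which is the numerical face of the “k = −3∕x” mistake: no single constant k works everywhere.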
$$3\frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + (\cos x)\,y = \ln x \tag{1.6.4}$$
Left side: Each term is a function of x or a constant, multiplied by y or one of its derivatives. That makes the equation “linear.”
Right side: If this is zero the equation is “linear and homogeneous.” If this is a constant or a function of x the equation is “linear and inhomogeneous.”
The point is that y and its derivatives do not appear in any form other than simply being multiplied
by functions of x. For instance cos(y′′), y², or y(dy∕dx) would render the equation non-linear. The
distinction between homogeneous and inhomogeneous only applies to linear equations.
These definitions, along with several other classifications of differential equations, are in
Appendix A.
∙ Linear homogeneous equations. These equations are generally the most straightforward,
and we will show you how to solve them below.
∙ Linear inhomogeneous equations. Linear inhomogeneous equations require a bit more
work but we can still usually handle them. We will discuss them after the homogeneous
case below.
∙ Non-linear equations. Many important equations in the real world fall into this cate-
gory, but these are generally difficult to solve. In many cases the best approach is
to solve them numerically (Section 1.8) or approximate them with linear equations
(Chapter 2). We will say nothing more about non-linear equations in this section.
Before you read further, see if you can properly categorize Equation 1.6.2. As we will see
below, that is a key step in deciding how to solve it.
Once you verify these results for this specific case, you may be able to see how they
generalize—and in particular how they rely on beginning with an equation that is both
linear and homogeneous. You have then arrived at the “principle of linear superposition.”
In Problem 1.130 you’ll show why all linear homogeneous differential equations obey lin-
ear superposition, but here we illustrate the point with an example.
Given the equation… $y'' - \frac{2}{x^2}\,y = 0$
A: Plug in $y = x^2$ → $2 - 2 = 0$
B: Plug in $y = \frac{1}{x}$ → $\frac{2}{x^3} - \frac{2}{x^3} = 0$
C: Plug in $y = x^2 + \frac{1}{x}$ → $\left(2 + \frac{2}{x^3}\right) - \left(2 + \frac{2}{x^3}\right) = 0$
The sum of the two individual solutions worked because each term in line C just gave the
sum of the values in lines A and B. Similarly, multiplying a solution by a constant would just
multiply each term by that constant, and the whole thing would still equal zero.
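Now consider a non-linear example: $y' - y^2 = 0$.
Given the equation… $y' - y^2 = 0$
A: Plug in $y = -\frac{1}{x}$ → $\frac{1}{x^2} - \frac{1}{x^2} = 0$
B: Plug in $y = \frac{1}{1-x}$ → $\frac{1}{(1-x)^2} - \frac{1}{(1-x)^2} = 0$
C: Plug in $y = -\frac{1}{x} + \frac{1}{1-x}$ → $\left(\frac{1}{x^2} + \frac{1}{(1-x)^2}\right) - \left(\frac{1}{x^2} + \frac{1}{(1-x)^2} - \frac{2}{x(1-x)}\right) = \frac{2}{x(1-x)}$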
This time the sum of the two individual solutions didn’t work because the $y^2$ in the differential
equation introduced a cross-term in C that wasn’t just the sum of what had been in
A and B. Similarly, if you plugged in 2 times one of the solutions, the first term would get
multiplied by 2 and the second term by 4, and they would no longer cancel.
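The two tables above can also be checked numerically. The sketch below is only an illustration: the helper names (`linear_residual`, `nonlinear_residual`, `add`) and the sample point `x0 = 0.5` are assumptions, and the derivatives are typed in by hand rather than computed. A residual of zero means the function solves the equation.

```python
# Residuals of the two example equations; a residual of 0 means "this y solves it".
# Derivatives are written out by hand, so only plain Python is needed.

def linear_residual(y, d2y, x):
    """Left side of y'' - (2/x^2) y = 0."""
    return d2y(x) - 2.0 * y(x) / x**2

def nonlinear_residual(y, dy, x):
    """Left side of y' - y^2 = 0."""
    return dy(x) - y(x) ** 2

# Linear example: y = x^2 and y = 1/x, with their second derivatives.
ya, d2ya = (lambda x: x**2), (lambda x: 2.0)
yb, d2yb = (lambda x: 1.0 / x), (lambda x: 2.0 / x**3)

# Non-linear example: y = -1/x and y = 1/(1-x), with their first derivatives.
na, dna = (lambda x: -1.0 / x), (lambda x: 1.0 / x**2)
nb, dnb = (lambda x: 1.0 / (1.0 - x)), (lambda x: 1.0 / (1.0 - x) ** 2)

def add(f, g):
    """Pointwise sum of two functions."""
    return lambda x: f(x) + g(x)

x0 = 0.5  # arbitrary sample point (avoid x = 0 and x = 1)
lin_sum = linear_residual(add(ya, yb), add(d2ya, d2yb), x0)
non_sum = nonlinear_residual(add(na, nb), add(dna, dnb), x0)
cross_term = 2.0 / (x0 * (1.0 - x0))

print("linear, sum of solutions:    ", lin_sum)
print("non-linear, sum of solutions:", non_sum)
print("cross-term 2/(x(1-x)):       ", cross_term)
```

The linear residual vanishes for the sum, while the non-linear residual equals the cross-term from line C.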
We now have a two-step method for solving linear homogeneous differential equations.
1. Use guess and check to find n individual solutions for an nth order equation.
2. Use linear superposition to combine these solutions into a solution with n arbitrary
constants, and you have the general solution.
Problem:
Find the general solution to the equation:
$$2\frac{d^2y}{dx^2} - \frac{dy}{dx} - 6y = 0 \qquad (1.6.6)$$
Solution:
When all the coefficients in your equation are constants, an exponential function is
usually a good guess.
$$y = e^{kx} \;\rightarrow\; \frac{dy}{dx} = ke^{kx} \;\rightarrow\; \frac{d^2y}{dx^2} = k^2e^{kx}$$
Plug all that into the differential equation.
We now have two individual solutions to our second-order equation, $e^{-3x/2}$ and $e^{2x}$. A
linear combination of solutions to a linear homogeneous equation is itself a solution.
You can confirm that Equation 1.6.7 is a valid solution to Equation 1.6.6 for any
values of the constants A and B, and it has the right number of arbitrary constants to
be the general solution.
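As a rough check of this example, the sketch below assumes the exponential guess reduces Equation 1.6.6 to the quadratic $2k^2 - k - 6 = 0$ (the factor $e^{kx}$ is never zero), solves it with the quadratic formula, and then tests a linear combination of the two exponentials. The function name `residual` and the sample values of A, B, and x are illustrative choices, not part of the text.

```python
import math

# Guessing y = e^{kx} in 2y'' - y' - 6y = 0 gives (2k^2 - k - 6) e^{kx} = 0,
# so k must be a root of the quadratic 2k^2 - k - 6 = 0.
a, b, c = 2.0, -1.0, -6.0
disc = b * b - 4 * a * c
k1 = (-b - math.sqrt(disc)) / (2 * a)
k2 = (-b + math.sqrt(disc)) / (2 * a)
print("k values:", k1, k2)

def residual(A, B, x):
    """Left side of Equation 1.6.6 for y = A e^{k1 x} + B e^{k2 x}."""
    e1, e2 = math.exp(k1 * x), math.exp(k2 * x)
    y = A * e1 + B * e2
    dy = A * k1 * e1 + B * k2 * e2
    d2y = A * k1**2 * e1 + B * k2**2 * e2
    return 2 * d2y - dy - 6 * y

# Arbitrary constants and an arbitrary x: the residual should be zero
# up to floating-point rounding.
print("residual:", residual(3.0, -1.5, 0.7))
```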
As a final note on that example, we never proved that we had the general solution to
Equation 1.6.6. It’s easy to show that our solution works for any values of A and B. It’s much
harder to prove that all possible solutions can be written in the form of Equation 1.6.7—and
we’re not going to. Here and elsewhere we will assume that when we have n independent
arbitrary constants in our solution to an nth order linear ODE, we have the general solution.
Exceptions (such as Problem 1.132) don’t come up often.
$$\frac{d^2x}{dt^2} + 8\frac{dx}{dt} + 15x = -9.8 \qquad (1.6.8)$$
Our first step is to start over with “guessing.” Can you think of a function x(t) (any function
at all) that solves Equation 1.6.8? Hint: there is a simple one!
Answer: x(t) = −9.8∕15. Almost looks like cheating, doesn’t it? Since this function is a
constant, $dx/dt$ and $d^2x/dt^2$ are both zero, so it clearly works. At the same time, this one-off
solution doesn’t seem to bring us any closer to finding the general solution that we need to
solve problems.
Inhomogeneous equations do not obey superposition as we described it earlier. If we
multiply our solution by three the equation will give us 3 × (−9.8); if we add our solution
to another solution the equation will give us (−9.8) + (−9.8). In neither case will we have
another working solution.
But because Equation 1.6.8 is still linear we can use superposition in a slightly different
form. We will add our one-off solution to the general solution we found to the homogeneous
Equation 1.6.2.
$$x(t) = Ae^{-3t} + Be^{-5t} - \frac{9.8}{15} \qquad \text{the solution to Equation 1.6.8}$$
Watch what happens when we plug that into the differential equation.
Plug $x = Ae^{-3t} + Be^{-5t}$ into $x'' + 8x' + 15x$ and you get $0$
Plug $x = -9.8/15$ into $x'' + 8x' + 15x$ and you get $-9.8$
Plug $x = Ae^{-3t} + Be^{-5t} - 9.8/15$ into $x'' + 8x' + 15x$ and you get $0 + (-9.8) = -9.8$
The linear homogeneous equation x ′′ + 8x ′ + 15x = 0 is called the “complementary
equation” to Equation 1.6.8. When we add the general solution of the complementary
equation to a particular solution to Equation 1.6.8 we find a solution to Equation 1.6.8 that
has the arbitrary constants it needs to meet any initial conditions. In other words, we find
the general solution.
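The three substitutions above can be reproduced numerically. This is a minimal sketch: the helper names `lhs` and `complementary` and the sample values of A, B, and t are assumptions made for illustration, and the derivatives of the exponentials are written out by hand.

```python
import math

def lhs(x, dx, d2x):
    """Left side of Equation 1.6.8: x'' + 8x' + 15x."""
    return d2x + 8 * dx + 15 * x

def complementary(A, B, t):
    """Return (x, x', x'') for x = A e^{-3t} + B e^{-5t}."""
    e3, e5 = math.exp(-3 * t), math.exp(-5 * t)
    return (A * e3 + B * e5,
            -3 * A * e3 - 5 * B * e5,
            9 * A * e3 + 25 * B * e5)

A, B, t = 2.0, -0.5, 0.3   # arbitrary illustrative values
xc, dxc, d2xc = complementary(A, B, t)
xp = -9.8 / 15             # constant particular solution; its derivatives are zero

only_complementary = lhs(xc, dxc, d2xc)
only_particular = lhs(xp, 0.0, 0.0)
combined = lhs(xc + xp, dxc, d2xc)

print("complementary alone:", only_complementary)
print("particular alone:   ", only_particular)
print("sum of the two:     ", combined)
```

Changing A, B, or t should leave the last two results at −9.8 (up to rounding), which is the point of adding a particular solution to the complementary one.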
As before we can make immediate physical sense of this solution. Like the horizontal
spring, this mass eases quickly toward equilibrium as the exponential terms decay. However,
this does not occur at the relaxed position of the spring (x = 0) but at a lower point, where
the upward pull of the spring perfectly balances the downward pull of gravity.
We now have a three-step method for solving linear inhomogeneous differential
equations.
1. Find a specific solution to your equation: any function that works, no matter how sim-
ple. In the example above (and the example below) we find a constant solution by just
looking, but sometimes this step is more complicated and involves a guess-and-check
form of its own.
2. Write the complementary homogeneous equation by replacing the inhomogeneous
term with zero. Find the general solution to this equation, as we discussed above.
3. The sum of these two solutions is the general solution to your inhomogeneous
equation.
Problem:
Find the general solution to the equation:
$$2\frac{d^2y}{dx^2} - \frac{dy}{dx} - 6y = 30 \qquad (1.6.9)$$
Solution:
We begin with the complementary homogeneous equation.
$$2\frac{d^2y}{dx^2} - \frac{dy}{dx} - 6y = 0 \qquad (1.6.10)$$
$$2k^2e^{kx} - ke^{kx} - 6e^{kx} = 0$$
The solution works, and it has the right number of arbitrary constants, but it is the
solution to the wrong equation. We want to solve Equation 1.6.9! Fortunately, it isn’t
hard to find a solution to that one just by looking:
$$y_p(x) = -5$$
Because $y_p$ is a constant, its first and second derivatives are zero, so you can readily
confirm that it does solve Equation 1.6.9. It is one particular solution. If we add it to our
complementary solution the resulting function will still be a solution.
$$y = -5 + Ae^{-(3/2)x} + Be^{2x}$$
That is the function we have been looking for. It solves Equation 1.6.9, and it has the
two arbitrary constants it needs to meet any initial conditions.
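To see what "meet any initial conditions" means in practice, the sketch below picks A and B so that the general solution matches a given y(0) and y′(0). The function names and the sample initial conditions (both zero) are hypothetical; the text itself does not specify any.

```python
import math

def fit_constants(y0, v0):
    """Solve y(0) = y0, y'(0) = v0 for A and B in y = -5 + A e^{-1.5x} + B e^{2x}.

    At x = 0 the two conditions read  A + B = y0 + 5  and  -1.5 A + 2 B = v0.
    """
    s = y0 + 5.0
    # Eliminate B = s - A:  -1.5 A + 2 (s - A) = v0  ->  A = (2 s - v0) / 3.5
    A = (2.0 * s - v0) / 3.5
    B = s - A
    return A, B

def y(x, A, B):
    """General solution of Equation 1.6.9."""
    return -5.0 + A * math.exp(-1.5 * x) + B * math.exp(2.0 * x)

def dy(x, A, B):
    """First derivative of the general solution."""
    return -1.5 * A * math.exp(-1.5 * x) + 2.0 * B * math.exp(2.0 * x)

A, B = fit_constants(y0=0.0, v0=0.0)   # hypothetical initial conditions
print("A =", A, " B =", B)
print("check: y(0) =", y(0.0, A, B), " y'(0) =", dy(0.0, A, B))
```

Two conditions determine two constants, which is why a second-order equation needs two arbitrary constants in its general solution.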
Stepping Back
You now have a set of tools that can be used to find general solutions to many—although
certainly not all—linear differential equations. But because the entire process is based on
inspired guesses, it may not seem rigorous. Shouldn’t we have to prove something?
In response, remember what we said about the algebra problem at the beginning of this
section: a working solution is its own proof. If you plug it in and it works, it is a solution. And if
it has the requisite number of independent arbitrary constants, it is the general solution. All
the guessing and adding of solutions are just tricks to lead us to a solution that, when found,
justifies itself.
different solutions to the complementary equation.
(d) By multiplying your answers to Part (c) by arbitrary constants, and adding the two
resulting solutions together, write the general solution to the complementary equation.
(e) Find a simple solution (a constant) to the original inhomogeneous equation.
(f) By adding your general solution from Part (d) to your particular solution from
Part (e), write the general solution to the problem we started with.
(g) Demonstrate that your function from Part (f) solves the differential equation.
1.109 Consider the problem $dy/dx = xy + y$. Suspecting an exponential solution,
you decide to try $y = e^{kx}$.
(a) Plug the trial solution $y = e^{kx}$ into the differential equation $dy/dx = xy + y$.
(b) The resulting equation can easily be solved to yield $k = x + 1$. Have you found
a solution? Test $y = e^{x(x+1)}$ in the differential equation $dy/dx = xy + y$.
(c) It didn’t work! Where did we go wrong?
(d) Now solve this equation correctly using separation of variables.
1.110 $t^2\,(d^2r/dt^2) - 9t\,(dr/dt) + 16r = 4$ is an
each case, try an exponential solution and find as many real solutions as you can. Then give the
general solution, or explain why you cannot find the general solution by this method.
1.111 $d^2y/dx^2 - 7\,(dy/dx) + 12y = 0$
1.112 $d^2s/dt^2 + 3\,(ds/dt) + 7s = 0$
1.113 $d^2s/dt^2 + 5\,(ds/dt) - 4s = 0$
1.114 $d^2\rho/dz^2 + 10\,(d\rho/dz) + 25\rho = 0$
For Problems 1.115–1.119 find the general solution using guess-and-check. You may need to use
trial and error to find the right kind of function to guess for the complementary and/or
particular solutions.
1.115 $d^2y/dx^2 + (k^2 + p^2)y = 0$
1.116 $t^2(d^2s/dt^2) + 10t(ds/dt) + 20s = 0$
1.117 $d^4y/dx^4 - y = 0$
1.118 $d^2y/dx^2 + \omega^2 y = e^{kx}$ Hint: To find a particular solution, guess $y_p = ae^{kx}$ and
solve for the constant $a$.
1.119 $x^2(d^2u/dx^2) + 4x(du/dx) - 4u = 4$
1.120 The equation $d^2x/dt^2 + 4(dx/dt) + 3x = \sin t$ can represent a harmonic oscillator that is
damped (by friction or air resistance) but also driven by an oscillatory force.
(a) Write the complementary homogeneous
(c) Write a solution to the original differential equation that has an arbitrary constant in it.
(d) Show that the function $y = xe^{3x}$ is a solution to the complementary homogeneous equation.
(e) Write the general solution (complete with two independent arbitrary constants) to
the original differential equation.
1.122 The equation $x''(t) - 7x'(t) + 10x = 50 + e^{3t}$ has two inhomogeneous terms. You can solve
it by approaching them separately.
(a) Find the general solution $x_c(t)$ to the complementary homogeneous equation.
(b) Find a particular solution $x_{p1}(t)$ to the equation $x''(t) - 7x'(t) + 10x = 50$. You
can do this quickly, by just looking.
(c) Find a particular solution $x_{p2}(t)$ to the equation $x''(t) - 7x'(t) + 10x = e^{3t}$. To do
this, plug in a solution of the form $x = Me^{3t}$ and find the value of $M$ that works.
(d) Write the function $x_c(t) + x_{p1}(t) + x_{p2}(t)$ and show that it solves the original
differential equation.
1.123 In the Explanation for slope fields (Section 1.4), we found that one specific
solution to the equation $dy/dx = y - 2x$ is $y = 2x + 2$. We discussed the behavior of other
solutions, but did not find them analytically.
(a) Begin by rewriting the differential equation in a more standard form, as
$a_1(x)(dy/dx) + a_0(x)y = f(x)$.
(b) Write and solve the complementary homogeneous equation.
(c) Find a particular solution to the original equation by guessing a solution
of the form $y_p = Ax + B$.
(d) Find the general solution to the original equation.
(e) Test your general solution.
(f) In the slope field explanation we concluded that if a function lies perfectly
on the line $y = 2x + 2$ then it will stay there forever, but if it does not lie on
that line it will move farther and farther away from that line. Explain how
your general solution correctly predicts both of these behaviors.
1.124 Consider the equation $y^2 + (dy/dx)^2 = 1$. Important: don’t confuse $(dy/dx)^2$ with
$d^2y/dx^2$, which is something completely different!
(a) Find the two constant solutions to this equation.
(b) By experimenting with basic functions, find another solution.
(c) With a little more trial and error, rewrite your answer to Part (b) with an arbitrary
constant in it. Make sure it still solves the differential equation.
(d) Is the solution you found in Part (c) the general solution? Why or why not?
1.125 Consider the equation $(dy/dx)^2 = y$. Important: don’t confuse $(dy/dx)^2$ with
$d^2y/dx^2$, which is something completely different!
(a) Plug in a solution of the form $y = kx^p$ where $k$ and $p$ are constants.
(b) You should now have an equation of the form $a_1 x^{b_1} = a_2 x^{b_2}$. The only way
these can be the same function is if $a_1 = a_2$ and $b_1 = b_2$. Use these two facts
to solve for $k$ and $p$ and write a solution to the differential equation. (You
may find the solution $y = 0$; that function does work, but don’t use it.)
(c) Show that your solution works. Then multiply your solution by an arbitrary
constant and show that it doesn’t work. Explain why this doesn’t violate the
principle of linear superposition.
1.126 The picture shows a standard diagram for a circuit with four elements: a battery that
maintains a constant voltage $V$, a resistor with resistance $R$, a capacitor with capacitance
$C$, and an inductor with inductance $L$.
[Figure: circuit diagram with elements labeled V, R, C, and L.]
A charge $Q$ will build up on the capacitor according to the equation
$V = R(dQ/dt) + Q/C + L(d^2Q/dt^2)$.
(a) Find the function $Q(t)$ for a circuit with $V = 9$, $R = 3$, $C = 1/5$, and $L = 1/4$.
(b) Your solution should consist of a “transient” part that dies out quickly and a
“steady-state” part that represents the long-term behavior of the circuit. Discuss
the steady-state solution: what is the circuit doing? How could you have
predicted this behavior (quantitatively, not just qualitatively) without solving
the differential equation?
1.127 Newton’s law of gravity says that an object in space falling directly towards Earth will
experience an acceleration $r''(t) = -GM_E/r^2$, where $G$ and $M_E$ are constants representing the
strength of gravity and the mass of the Earth, and $r$ is the object’s distance from the center
of the Earth. (We assume here that the object’s mass is much smaller than the Earth’s.)
(a) Try a power law solution of the form $r(t) = a(t - b)^p$ and plug it in. As always,
the result should be an algebraic equation.
(b) The equation you found in Part (a) should have been in the form
$\text{something}\,(t - b)^{\text{something}} = \text{something}\,(t - b)^{\text{something}}$. The
only way the two sides of that equation can represent the same function is if they both
have the same exponent and the same coefficient in front, so you can rewrite
this as two algebraic equations and find the values of two of the constants in your
original guess. Write down the solution $r(t)$ that you find and verify that it does
solve the original differential equation.
(c) Does your solution have any arbitrary constants in it? Is it the general solution?
1.128 In the Explanation (Section 1.6.2) we showed that a damped harmonic oscillator is
modeled by the equation $ma = -cv - kx$. We then solved this equation for particular values
of the constants, values that were chosen to work out nicely. In this problem you will
take up the challenge more generally.
(a) Explain why the nature of a spring dictates that $k$ must be positive, and
the nature of a damping force dictates that $c$ must be positive.
(b) Plug the guess $x = e^{pt}$ into the differential equation
$m(d^2x/dt^2) + c(dx/dt) + kx = 0$.
(c) Solve the resulting equation for $p$.
(d) If the resulting equation has only one solution for $p$, the system is said to be
“critically damped.” What relationship between the constants $c$, $k$, and $m$ leads
to a critically damped oscillator? Express this relationship by writing a formula
for $c$ as a function of $k$ and $m$.
(e) If $c$ is greater than the formula you found in Part (d), the system is said to be
“overdamped.” How many real $p$-values do you find in this case, and are they positive
or negative? What sort of behavior results from an overdamped system?
(f) If $c$ is less than the formula you found in Part (d), how many real $p$-values do you
find? What does that suggest about our guess? About the behavior of the system?
The rest of the problems in this section concern themselves with why and when the principle
of linear superposition works.
1.129 Use Equation 1.6.4 to determine if each of the differential equations below is linear,
and if so whether it is homogeneous. Then test the proposed solutions $y_1$ and
$y_2$, and also their sum $y_1 + y_2$.
(a) $y'' - y = 0$, $y_1 = e^x$, $y_2 = e^{-x}$.
(b) $y'' - y = 2\sin x$, $y_1 = e^x$, $y_2 = -\sin x$.
(c) $y'' - yy' = 0$, $y_1 = -2/x$, $y_2 = 2$.
1.130 In this problem you will prove the principle of linear superposition. Because
we want a general proof, we begin with the general form for every linear homogeneous
differential equation:
$$a_n(x)\frac{d^ny}{dx^n} + \ldots + a_3(x)\frac{d^3y}{dx^3} + a_2(x)\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y = 0 \qquad (1.6.11)$$
(a) Suppose $y = u_1(x)$ is a solution to Equation 1.6.11. Show that the
function $y = Au_1(x)$ also works as a solution. (Remember that if $A$ is a constant,
$(d/dx)Au_1(x) = A(du_1/dx)$.)
(b) Suppose $u_1(x)$ and $u_2(x)$ are both solutions to Equation 1.6.11. Show that
the function $u_1(x) + u_2(x)$ also works as a solution.
(c) Now consider a linear inhomogeneous differential equation:
$$a_n(x)\frac{d^ny}{dx^n} + \ldots + a_3(x)\frac{d^3y}{dx^3} + a_2(x)\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y = f(x) \quad (f(x) \neq 0) \qquad (1.6.12)$$
Suppose $u_h(x)$ is a solution to Equation 1.6.11 and $u_i(x)$ is a solution to
Equation 1.6.12. Show that the function $u_h(x) + u_i(x)$ solves
Equation 1.6.12 but not 1.6.11.
1.131 To see why the principle of linear superposition only applies to linear equations,
consider a non-linear equation of the form
$$a_2(x)\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y^2 = 0 \qquad (1.6.13)$$
(a) Suppose $u_1(x)$ and $u_2(x)$ are both solutions to Equation 1.6.13. Show that the function
$u_1(x) + u_2(x)$ does not work as a solution.
(b) Repeat Part (a) assuming that $dy/dx$ is squared but $y$ is not. Once again
assume that $u_1(x)$ and $u_2(x)$ are solutions to this new equation and show
that $u_1(x) + u_2(x)$ is not.
1.132 Consider the linear differential equation $x'(t) = x(t)$.
(a) Show that $x(t) = |C|e^t$ is a solution for any value of the constant $C$.
(b) Since this is a first-order linear differential equation and we have a solution
with an arbitrary constant you would normally expect it to be the general solution.
Prove that it isn’t by finding a solution that doesn’t match this for any value of $C$.
The moral of the story is that it isn’t always easy to tell if you have the “general
solution” or not. But the truth is, we really had to contrive that example. In practice
if you have n arbitrary constants for an nth order linear differential equation, you
almost certainly have the general solution. (Be much more careful making generalizations
about non-linear equations.)
1.133 In this problem you will show that the principle of linear superposition does not
apply to the non-linear differential equation $y'' - y'y = 0$.
(a) What makes this equation non-linear?
(b) Assume you have found two solutions $y_1$ and $y_2$. Plug the guess $y = y_1 + y_2$ into the
left side of this equation and simplify as much as possible using the fact that $y_1$
and $y_2$ are both solutions. What term are you left with that doesn’t go away?
(c) Show that $y_1 = 2\tan x$ is a solution.
(d) Show that $y_2 = 2$ is a solution.
(e) Plug in $y = 2\tan x + 2$. Show that it doesn’t work, and show that the term you are
left with is the same term you found in Part (b).
1.134 Consider the equation $dy/dx = 1/y$.
(a) Explain how you can tell this equation is non-linear.
(b) Show that $y = \sqrt{2x + C}$ is a solution.
We have a solution with an arbitrary constant for a first-order equation, so it must be
the general solution…right? That’s a good rule of thumb but it does not always work,
especially for non-linear equations.
(c) Show that $y = -\sqrt{2x + 1}$ is a solution, and that it cannot be written in the form given
by Part (b). (This shows that Part (b) was not in fact the general solution.)
CHAPTER 2
∙ find a linear approximation to a given function at a given point, and use that line to find
approximate solutions to problems involving that function near that point.
∙ find a “Maclaurin series”—a polynomial approximation to a given function near
x = 0—and use that polynomial to solve problems involving that function for small
values of x.
∙ find a “Taylor series”—a polynomial approximation to a given function near any given
x-value—and use that polynomial to solve problems involving that function for nearby
values of x.
∙ from one given Taylor series, find many other Taylor series.
∙ analyze a sum of infinitely many numbers, known as an “infinite series,” to see under what
conditions it “converges” (adds up to a finite number). In particular, determine what
x-values allow a particular Taylor series to converge.
∙ use “asymptotic expansions” to approximate functions with finite sums of divergent
series.
Almost any function—there are exceptions, but they are rare—can be rewritten as a poly-
nomial called a “Taylor series.” In most cases this series has infinitely many terms. But a few of
these terms often make a reasonable approximation to the function you started with, allow-
ing you to find approximate solutions to problems that may be difficult or impossible to solve
exactly.
Unfortunately, this technique does not work in every case: sometimes the Taylor series for
a function does not approach any finite value as you keep adding more terms. Many Taylor
series work for some values of x and not for others. In Sections 2.6–2.7 you will learn how
to check whether an infinite series has a finite sum, thus determining when you can use a
Taylor series.
The good news is, you now have the correct equation to describe the motion of a vibrating
nucleus in a crystal. The bad news is, the equation is difficult to solve. (Try it on a computer
if you don’t believe us!)
Now comes the key step. If we assume that the displacement x is small compared to the
distance between nuclei d, we can make the following approximation:
$$\frac{\kappa}{m}\left(\frac{1}{(d+x)^2} - \frac{1}{(d-x)^2}\right) \approx -\frac{4\kappa}{md^3}\,x \qquad (2.1.1)$$
We are not asking you to prove this approximation—we are waving our hands magically
and pulling it out of a hat—although we will ask you, in Part 7, to confirm that it works
pretty well.
7. Using the values $\kappa/m = 0.1$ N⋅m²/kg, $d = 3 \times 10^{-10}$ m, and $x = d/20$, compute both
the original function (on the left side of Equation 2.1.1) and the approximation
(right side).
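One way to carry out the comparison in Part 7 is sketched below. The variable names are illustrative assumptions; the numbers are the ones given in Part 7.

```python
kappa_over_m = 0.1   # N·m^2/kg, from Part 7
d = 3e-10            # distance between nuclei, in meters
x = d / 20           # displacement, small compared to d

# Left side of Equation 2.1.1 (the original function).
exact = kappa_over_m * (1 / (d + x) ** 2 - 1 / (d - x) ** 2)

# Right side of Equation 2.1.1 (the linear approximation).
approx = -4 * kappa_over_m * x / d**3

rel_diff = abs((exact - approx) / exact)

print(f"original function   = {exact:.4e}")
print(f"approximation       = {approx:.4e}")
print(f"relative difference = {rel_diff:.2%}")
```

For this displacement the two sides differ by roughly half a percent. Trying a larger x, such as d/2, shows how the approximation degrades once x is no longer small compared to d.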
2. At the point where x = 64, calculate the y-value of the curve, f (64). (The line will have
the same y-value at x = 64.)
3. At the point where x = 64, use a derivative to calculate the slope of the curve, f ′ (64).
(The line will have the same slope.)
4. Based on the point from Part 2 and the slope from Part 3, find the equation of the
tangent line.
5. Plug x = 69 into both functions: the original curve $y = \sqrt{x}$, and the tangent line. If
the tangent line value is used to approximate the real value, what is the percent error?
Recall that the formula for percent error is
$$\left|\frac{(\text{real value}) - (\text{approximation})}{(\text{real value})}\right| \times 100.$$
6. Use the tangent line to approximate $\sqrt{100}$. What is the percent error this time?
7. Use the tangent line to approximate $\sqrt{64.5}$. What is the percent error this time?
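Parts 2 through 7 can be checked with a few lines of code. This is a sketch under the assumption that the curve is $y = \sqrt{x}$ with the tangent taken at x = 64, as the steps above state; the helper names `tangent` and `percent_error` are made up for illustration.

```python
import math

x0 = 64.0
f0 = math.sqrt(x0)               # y-value of the curve at x0 (Part 2)
slope = 1 / (2 * math.sqrt(x0))  # derivative of sqrt(x) at x0 (Part 3)

def tangent(x):
    """Tangent line to y = sqrt(x) at x0 = 64 (Part 4)."""
    return f0 + slope * (x - x0)

def percent_error(real, approximation):
    """Percent error as defined in Part 5."""
    return abs((real - approximation) / real) * 100

for x in (64.5, 69.0, 100.0):
    real, est = math.sqrt(x), tangent(x)
    print(f"x = {x:6.1f}  sqrt = {real:.5f}  line = {est:.5f}  "
          f"error = {percent_error(real, est):.4f}%")
```

The error grows as x moves away from 64, which is the pattern Parts 5 through 7 are meant to reveal.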
Now we’re going to repeat the same exercise, but for a generalized function y = f (x) at a
point x = x0 .
8. At the point where x = x0 , the y-value of the curve—and therefore the line—is f (x0 ).
The slope of the curve—and therefore the line—is f ′ (x0 ). Find the equation of the
line. That is, find a general formula for the only function y = mx + b that has a slope
of f ′ (x0 ) and goes through the point (x0 , f (x0 )).
9. Plug the point x = (x0 + Δx) into the tangent line equation to calculate its y-value
on the line, and therefore approximate its y-value on the curve.
See Check Yourself #8 in Appendix L
[Figure 2.4: a curve y = f(x), marking f(x0), f(x0+Δx), and the tangent-line approximation to f(x0+Δx).]
10. When we approximated $\sqrt{69}$, the linear approximation was higher than the actual value.
For the curve in Figure 2.4, will
1. f (2) = 8 on the curve, and therefore the line must also go through the
point (2, 8).
2. f ′ (2) = 12 on the curve, and therefore the line must rise with a slope of 12.
This means that the line goes up 12 units every time it moves 1 unit to the right
of x = 2.
3. Therefore, at every point on the line, y = 8 + 12(x − 2). If you want, you can
rewrite this in the more familiar y = mx + b form: in this case, y = 12x − 16.
Finding a tangent line is not a pointless geometrical exercise. The function $y = 12x - 16$
can serve as a stand-in for the function $y = x^3$, as long as you are working with x-values
close to 2.
[Figure: (a) the curve y = x³ and the line y = 12x − 16 meeting at (2, 8); (b) close-up from x = 2 to x = 2.2, where the line rises from 8 to 10.4.]
FIGURE 2.6 A tangent line approximation to $y = x^3$ (a) and a close-up view of the tangent line (b).
The left side of Figure 2.6 shows two different curves. The function $y = x^3$ goes through
the point (2, 8) with a slope of 12, but gets steeper as you move to the right. Its tangent line
meets the curve at (2, 8), and rises with a constant slope of 12. As the tangent line moves 0.2
to the right, it moves up 0.2 × 12 = 2.4, ending at the point (2.2, 10.4).
Rising in this way, the black line goes through the point (2.2, 10.4), as shown in the zoomed
view on the right side of Figure 2.6. Rising in a similar way, the blue curve should come close
to this point, which is to say, $(2.2)^3$ should be roughly 10.4, so that was our approximation.
The same reasoning leads us to the equation for the line: instead of x = 2.2 we ask the same
question about an arbitrary x-value. Rising from y = 8, the line rises with a slope of 12 as it
moves to the right by a distance of (x − 2), so it reaches the value y = 8 + 12(x − 2). Remember
that this formula—not the specific number 10.4, but the linear function itself—is the useful
result we are looking for.
We are now ready to answer the question we started with: why is a tangent line a good
approximation to a curve? It’s because the tangent line starts at the same point as the curve,
and rises at the same rate. In this example, both the curve and the tangent line move away
from the point (2, 8) with a slope of 12, so as x moves from 2 to 2.2, they each go up by
roughly 2.4.
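The word "roughly" can be made concrete by measuring the gap between the curve and its tangent line at several distances from x = 2. The sketch below is illustrative only; the step sizes `h` are arbitrary choices.

```python
def f(x):
    """The curve y = x^3."""
    return x**3

def tangent(x):
    """Tangent line to y = x^3 at x = 2 (value 8, slope 12)."""
    return 8 + 12 * (x - 2)

for h in (0.4, 0.2, 0.1, 0.05):
    x = 2 + h
    gap = f(x) - tangent(x)
    print(f"h = {h:<5} curve = {f(x):.6f}  line = {tangent(x):.6f}  gap = {gap:.6f}")
```

Each time the distance from x = 2 is halved, the gap shrinks by about a factor of four. That behavior is typical of a tangent-line approximation: the leftover error is dominated by a term proportional to the square of the distance.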
(e) Use $g(23)$ to estimate $f(23)$.
(f) Your answers to Parts (c)iv and (e) should be exactly the same as each other. However,
they are not exactly $\sqrt{23}$. Are they higher or lower than the actual value? Why?
In Problems 2.8–2.15, use a linear approximation at the first given point to estimate the
value of the function at the second given point. (In each case, you may want to check the
answer on your calculator. Was your estimate close? Was your estimate a bit high for
concave down functions and a bit low for concave up functions?)
(b) Does your estimate fall above, or below, the actual $C(1/8)$? (Don’t just guess:
find the answer mathematically.)
2.21 Figure 2.8 depicts a familiar situation: a mass $m$ swings at the end of a massless rope of
length $L$, subject to the force of gravity and the tension force from the string. It can be
shown that this pendulum will obey the differential equation
$d^2\theta/dt^2 + (g/L)\sin\theta = 0$. (You can work that out yourself if you’ve taken
an introductory mechanics class.) Unfortunately, this equation is not easy to solve!
(b) Use a computer to solve the original (not approximate) equation
$d^2\theta/dt^2 + (g/L)\sin\theta = 0$ with initial conditions $d\theta/dt = 0$, $\theta = 5^\circ$
and plot the result. (Caution: Many computer programs expect angles to be in radians, so you
may have to convert before entering $\theta$.) Use trial and error to find a final time that
allows you to see at least two full oscillations.
(c) Have the computer find the period of the solution you found in Part (b). You
should be able to do this by asking it to find two successive roots of the solution
and subtracting them.
(d) Repeat Parts (b) and (c) for initial angles between $10^\circ$ and $45^\circ$ in steps of
$5^\circ$. Make a plot of oscillation period vs. amplitude in the range
$5^\circ \le A \le 45^\circ$. Be sure to choose a scale for the vertical axis that lets you see
how the period changes with amplitude.
(e) Describe in words what happens to the period as the amplitude decreases.
Does it keep increasing, keep decreasing, level off at some value, or do
something else entirely?
(f) Using your results from Parts (a) and (d) estimate the maximum amplitude
that a grandfather clock could swing at if you wanted its period of oscillation
to be within 1% of the one given by the linear approximation.
2.23 The “Coleman–Weinberg equation” $d^2\phi/dt^2 = -\lambda\phi^3 \ln(\phi/v)$ plays an
important role in some theories of particle physics.¹ The quantity $\phi(t)$
describes a “homogeneous scalar field.” (Don’t worry; you don’t need to have any idea what
that means to solve this problem.) The parameters $\lambda$ and $v$ are both positive constants.
(a) If you start with $\phi = v$, $d\phi/dt = 0$, will $d^2\phi/dt^2$ be positive, negative, or zero?
Based on that answer, how will $\phi(t)$ behave with these initial conditions?
(b) If you start with $\phi > v$, $d\phi/dt = 0$, will $\phi(t)$ start increasing, start
decreasing, or stay constant?
(c) If you start with $\phi < v$, $d\phi/dt = 0$, will $\phi(t)$ start increasing, start
decreasing, or stay constant?
(d) Based on your answers so far, explain why, for initial conditions near
$\phi = v$, you would expect $\phi(t)$ to oscillate about the point $\phi = v$.
Although $\phi(t)$ does oscillate, it is not a simple harmonic oscillator. To solve
for it you need to use an approximation. Since $\phi(t)$ oscillates about $\phi = v$ you
should expand about that point.
(e) Find a linear approximation to the right-hand side of the Coleman–Weinberg
equation valid near $\phi = v$.
(f) Solve the resulting, simplified differential equation. You can do this with a
computer or by hand. If you do it by hand you may find it useful to start by defining
a new variable $u = \phi - v$ and plugging this into the differential equation.
(g) Find the period for small oscillations of a Coleman–Weinberg field.
(h) Explain why this period is only correct for small oscillations.
2.24 Consider the equation $dy/dt = (1 + \sin y)^{-2}$.
(a) Replace the right-hand side with its linear approximation around $y = 0$ and
have a computer solve the resulting differential equation analytically with
initial condition $y(0) = 0$.
(b) Numerically solve the original equation with the same initial condition and
plot this together with your approximate solution out to time $t = 5$. You
should find that the two solutions do not match well at late times.
(c) Explain why the linear approximation that you used was not a good approximation
for this problem. (Hint: Look at how y behaves at late times.) Don’t just
say that the two solutions in Part (b) looked different; explain why the linear
approximation, which works so well for many problems, was not very accurate for
this one.
2.25 Consider the equation $dy/dt = 2 - e^y$.
(a) Replace the right-hand side with its linear approximation around $y = 0$ and
have a computer solve the resulting differential equation analytically with
initial condition $y(0) = 0$.
(b) Numerically solve the original equation with the same initial condition and
plot this together with your approximate solution out to time $t = 5$. You
should find that the two solutions do not match well at late times.
The asymptotic value of y at late times (also known as the “steady-state” value) is
both functions with their linear approximations, derive l’Hôpital’s rule, which
states that $\lim_{x\to a} \frac{f(x)}{g(x)} = \lim_{x\to a} \frac{f'(x)}{g'(x)}$.
¹ S. R. Coleman and E. Weinberg, “Radiative Corrections As the Origin of Spontaneous Symmetry Breaking,”
In Problem 2.29 you showed that you can get a more accurate approximation by using a
quadratic function. We could use the same technique to arrive at the formula:
$$f(x) \approx f(0) + f'(0)x + \frac{1}{2}f''(0)x^2$$
In general, the more powers of x you add, the more accurate your approximation will be
around x = 0. As you keep adding terms you will, in many circumstances, approach the exact
value of f (x). In the exercise below, you will find a polynomial approximation of arbitrary
order for a function near x = 0. In Section 2.4 we’ll generalize this procedure to find a poly-
nomial approximation about values of x other than x = 0.
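As a quick illustration of how much the quadratic term can help, the sketch below compares the linear and quadratic formulas for one sample function. The choice of cos x is an assumption made only for this example: its values at zero are f(0) = 1, f′(0) = 0, and f″(0) = −1.

```python
import math

# f(x) = cos x, chosen only for illustration: f(0) = 1, f'(0) = 0, f''(0) = -1.
f0, f1, f2 = 1.0, 0.0, -1.0

def linear(x):
    """f(0) + f'(0) x"""
    return f0 + f1 * x

def quadratic(x):
    """f(0) + f'(0) x + (1/2) f''(0) x^2"""
    return f0 + f1 * x + 0.5 * f2 * x**2

for x in (0.1, 0.3, 1.0):
    print(f"x = {x}: cos = {math.cos(x):.6f}  "
          f"linear = {linear(x):.6f}  quadratic = {quadratic(x):.6f}")
```

Because the slope of cos x vanishes at zero, the linear approximation is just the constant 1, and the quadratic term carries all of the useful information near x = 0.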
$$c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots$$
Our goal in this exercise is to write a power series that equals sin x for all values of x: that is,
$$\sin x = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots \qquad (2.3.1)$$
In writing Equation 2.3.1, we have not actually solved anything. Equation 2.3.1 is merely the
assertion that there exists a power series expansion of the function sin x, if we can only find
the right coefficients. In finding these coefficients for the sine function, we will develop a
general process for finding the power series coefficients for any given function for which
such a power series exists.
We start with a simple observation: if the equation is true for all x-values, then it must
certainly be true for x = 0.
1. Plug the value x = 0 into both sides of Equation 2.3.1. Note that all the terms on the
right except the first one vanish, enabling you to solve for the first coefficient, c0 . (As
one student wryly observed, “One down, infinity to go.”)
2. Now, take the derivative of both sides of Equation 2.3.1. (If two functions are equal
everywhere, their derivatives must also be equal.)
3. Plug the value x = 0 into both sides of the equation you wrote down in the previous
step. Once again, almost all the terms on the right disappear, enabling you in this case
to solve for the second coefficient, c1 .
4. Repeat steps 2 and 3: take the derivative of both sides, and then plug x = 0 into both
sides, until you have found all the coefficients through c8 .
If you’ve done everything correctly so far your first few terms should be:
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} + \dots$$
and you should have found at least one more term after that. The first term $f(x) = x$ is the
linear approximation to sin(x) at x = 0. When you include the first two terms, $f(x) = x - x^3/(3!)$
should be a better approximation of sin(x) near x = 0. And $f(x) = x - x^3/(3!) + x^5/(5!)$
should be better still, and so on. For the function sin x, and for many (though not all!)
functions, we can continue adding terms to estimate the function as accurately as we need.
We refer to these successive approximations as “partial sums” of the series, and write:
Let’s see how these partial sums work as approximations for the sine function.
5. Fill in all the numbers in the following table to at least five decimal places accuracy.
These values of x are in radians.
6. Answer in words: looking at the first row in your table (for x = 1), what do we learn
about successive sums of this series?
7. Answer in words: looking at all three rows of your table, what do we learn about this
series as x gets farther from zero?
8. Now, use graphing software, a spreadsheet, or a graphing calculator to make graphs
of all five functions: sin x, S1 (x), S2 (x), S3 (x), and S4 (x), on the interval −𝜋 ≤ x ≤ 𝜋.
9. Answer in words: what do the five graphs tell us about successive sums of this series?
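If you would rather let a computer fill in the table, the partial sums can be generated directly. The following is a minimal Python sketch, not part of the exercise itself. The function name `sin_partial_sum` and the sample x-values are illustrative assumptions, and `n_terms` counts the non-zero terms of the series.

```python
import math

def sin_partial_sum(x, n_terms):
    """Sum the first n_terms non-zero terms of x - x^3/3! + x^5/5! - ..."""
    total = 0.0
    for k in range(n_terms):
        power = 2 * k + 1                      # odd powers only: 1, 3, 5, ...
        total += (-1) ** k * x ** power / math.factorial(power)
    return total

# Sample x-values in radians (assumed here for illustration).
for x in (1.0, 2.0, 3.0):
    sums = [round(sin_partial_sum(x, n), 5) for n in range(1, 5)]
    print(x, sums, round(math.sin(x), 5))
```

Comparing each row of partial sums with the final column shows how quickly, or slowly, the approximations settle down as x moves away from zero.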
You can keep going, writing a third-order polynomial, then fourth order, and so on. If you
want the polynomial to be within 0.01 of the correct value at x = 1, you should be able to get
there by adding enough terms. If you want the polynomial to be within 0.001 of the correct
value, you need to add more terms. In the limit as you add infinitely many terms, the “power
series”
f (x) = c0 + c1 x + c2 x 2 + c3 x 3 + c4 x 4 + …
with appropriately chosen coefficients cn will usually approach the original function exactly.
Can you find such a polynomial series, and use it to approximate a function to any desired
degree of accuracy? The answer is…for some functions you can, and for some functions you
can’t. In this section, we discuss a method for answering the question: “If this function has
a polynomial representation, what is it?” We discuss a bit later the question of: “For what
functions does this work?”
Maclaurin Series
When we replace a function with a power series as described above, that infinite polynomial
is called a “Maclaurin series.” Every Maclaurin series has the form:
c0 + c1 x + c2 x 2 + c3 x 3 + c4 x 4 + …
What changes from one function to another are the coefficients c0 , c1 , and so on. In the
following example, we are looking for the Maclaurin series for the function y = 1∕(1 − x).
The strategy goes as follows.
1. In the first step, we write 1∕(1 − x) = c0 + c1 x + c2 x 2 + c3 x 3 + c4 x 4 + …. This step
reflects nothing more than the unproven assumption that there exists a power series
that equals 1∕(1 − x). As we go through the process, we will determine what the
coefficients in this series must be, if that assumption is correct.
2. At this point, we plug x = 0 into both sides of the equation. On the right side, all the
terms vanish except for the first (constant) term. We can therefore solve for that term.
3. To move to the next step, we take the derivative of both sides of the equation. (If two
functions are the same, their derivatives must be the same.) Then, plugging in zero to
both sides again, we find the next coefficient, and so on.
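Each equation is found by taking the derivative of the one above it.
$$\begin{array}{ll}
\text{Equation} & \text{Plug in } x = 0 \\
\dfrac{1}{1-x} = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + c_4 x^4 + \dots & c_0 = 1 \\
\dfrac{1}{(1-x)^2} = c_1 + 2c_2 x + 3c_3 x^2 + 4c_4 x^3 + \dots & c_1 = 1 \\
\dfrac{2}{(1-x)^3} = 2c_2 + (2 \times 3)c_3 x + (3 \times 4)c_4 x^2 + \dots & c_2 = 1 \\
\dfrac{2 \times 3}{(1-x)^4} = (2 \times 3)c_3 + (2 \times 3 \times 4)c_4 x + \dots & c_3 = 1 \\
\dfrac{2 \times 3 \times 4}{(1-x)^5} = (2 \times 3 \times 4)c_4 + \dots & c_4 = 1
\end{array}$$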
By this point, the pattern is clear: all the coefficients come out 1. Plugging these
coefficients into the original equation, we conclude that:
$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + x^4 + \dots$$
By determining the coefficients, we have found the Maclaurin series for the function
1∕(1 − x).
$$\frac{1}{1-x} = 1 + x + x^2 + x^3 + x^4 + \dots \tag{2.3.2}$$
We can understand the meaning of such an equation in several different ways.
We can interpret Equation 2.3.2 as a generalization about numbers because it represents in
one equation many different numerical statements. For instance, you can plug in x = 1∕3
to get:
$$1.5 = 1 + (1/3) + (1/3)^2 + (1/3)^3 + \dots \tag{2.3.3}$$
We will discuss the idea of infinite series more formally in Section 2.6 but one key point must
be made from the outset. Equation 2.3.3 does not mean “when we finally add the infiniti-eth
term, it will all add up to 1.5”; we never will, and it never will. Rather, this equation—and any
infinite series—must be viewed as a limit, a limit of “partial sums.” In this case:
and so on. The partial sums collect more and more terms as you go. Equation 2.3.3 is telling
us that as we keep adding terms, the partial sums approach 1.5.
Similarly, plugging x = −0.23 into Equation 2.3.2 gives us:
$$\frac{1}{1 + 0.23} = 1 - 0.23 + (0.23)^2 - (0.23)^3 + (0.23)^4 + \dots$$
Once again, we have a limit; if you add up the first hundred terms you will get S100 , which will
be very close to 1∕1.23. The original power series uses an x to indicate that many different
numbers can be substituted into the equation.
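The idea of a series as a limit of partial sums is easy to watch numerically. The following Python sketch is an illustration under our own naming (`geometric_partial_sum` is not from the text); it adds up the first few terms of 1 + x + x² + … for the two x-values used above and compares the results with 1∕(1 − x).

```python
def geometric_partial_sum(x, n_terms):
    """Sum of the first n_terms terms of 1 + x + x^2 + x^3 + ..."""
    return sum(x ** k for k in range(n_terms))

for x in (1 / 3, -0.23):
    target = 1 / (1 - x)
    for n_terms in (1, 2, 5, 10, 100):
        partial = geometric_partial_sum(x, n_terms)
        # The gap between the partial sum and 1/(1 - x) shrinks as terms are added.
        print(f"x={x:.4f}  terms={n_terms:3d}  sum={partial:.10f}  gap={abs(partial - target):.2e}")
```

No partial sum ever equals the target exactly; the gap simply becomes as small as you like.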
But we can also look at Equation 2.3.2 as describing a function. Remember that our interest
in series began by looking for a way to replace one function (such as the complicated function
in the vibrating crystal example) with another function that is easier to work with. We now
know that the function 1∕(1 − x) can be reasonably replaced with S1 , the linear function
1 + x. But S2 , the quadratic function 1 + x + x 2 , approximates the function more closely, and
S3 (1 + x + x 2 + x 3 ) works better still, and so on. This insight suggests a general strategy: we
can replace a difficult function with a polynomial, using as many terms as needed to obtain
the desired accuracy (Figure 2.10).
You may have already noticed one obvious limitation: we could not have used this pro-
cess at all if our original function, 1∕(1 − x) in this example, was not differentiable at x = 0.
[Figure 2.10: graphs of 1∕(1 − x) together with the partial sums 1 + x + x² and 1 + x + x² + x³]
(You can’t make a Maclaurin series for |x|, for example.) Below we address other limita-
tions of this process. It turns out, however, that most functions that arise in engineering and
physics problems can successfully be replaced with polynomials, making this one of the most
common and powerful approximation techniques.
We should note one peculiar feature of this particular example. If you closely followed our
algebra in “Example: Building a Maclaurin Series” above, you may have noticed factorials
such as 2 × 3 × 4 appearing on both sides of the equation. When you go through this process
you will always find factorials on the right side; the 4th derivative of x 4 is 4! for instance. But
in this unusual case the factorials also appeared on the left side, because the 4th derivative
of 1∕(1 − x) is 4!∕(1 − x)5 . The factorials therefore canceled on both sides, leaving very sim-
ple coefficients. For most functions this will not occur, so most Maclaurin series will have
factorials in their coefficients.
We don’t need any fancy limit-manipulation to tell us that this equation is absurd. The series
1 + 2 + 4 + 8 + … clearly does not approach −1 (or any other finite number). So we see that
our original equation works for some x-values, but not all of them.
More generally, a power series of the form c0 + c1 x + c2 x 2 + c3 x 3 + … will work best for
“small” x-values, x ≈ 0, because those values make the later terms diminish in significance
quickly. ($0.1^5$ is a lot less than 0.1.) In the next section, we will address this problem by
generalizing from Maclaurin series, which work best near zero, to “Taylor series,” which can
generally be created to work best near any chosen number.
Still later, we will take up the question of determining for which x-values a particular series
works at all. A series can “not work” in the sense of blowing up (as in the example above),
or—in rare cases—a power series can even add up to the wrong function. In Problem 2.52
you will see an example of the latter situation.
We will not take up the subject of proving that the power series you found is equivalent
to the function you started with. We will confine our discussion to the following simple rule.
In the domain you are interested in, if the original function is everywhere differentiable
(which is usually easy to check), and the power series you generate doesn’t blow up (we’ll
see how to check that in Section 2.7), then the series almost always adds up properly to the
original function.
It looks ugly, but it’s not particularly difficult to work with. Since we’re going to now treat
x∕d as our variable, let’s call that u, and use it to build a Maclaurin series of the function in
parentheses above. We’re only going as far as the first derivative—the linear approximation—
which turns out to be enough for most problems.
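$$\begin{array}{ll}
\text{Equation} & \text{Plug in } u = 0 \\
\dfrac{1}{(1+u)^2} - \dfrac{1}{(1-u)^2} = c_0 + c_1 u + \dots & c_0 = 0 \\
\dfrac{-2}{(1+u)^3} - \dfrac{2}{(1-u)^3} = c_1 + \dots & c_1 = -4
\end{array}$$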
We conclude that for small values of u we can replace $1/(1+u)^2 - 1/(1-u)^2$ with
$-4u$, which is $-4x/d$, making our differential equation:
$$\frac{d^2 x}{dt^2} = -\frac{4\kappa}{m d^3}\, x$$
You may recognize this as the equation that we presented without justification in the
motivating exercise.
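A quick numerical check shows how good this linear replacement is, and how it degrades as u grows. The sketch below is only an illustration: the function names and the sample values of u are our own choices, not part of the example.

```python
def exact_expression(u):
    """The function in parentheses: 1/(1+u)^2 - 1/(1-u)^2."""
    return 1 / (1 + u) ** 2 - 1 / (1 - u) ** 2

def linear_approximation(u):
    """First-order Maclaurin approximation, c0 + c1*u with c0 = 0 and c1 = -4."""
    return -4 * u

for u in (0.001, 0.01, 0.1, 0.3):
    exact = exact_expression(u)
    approx = linear_approximation(u)
    relative_error = abs(approx - exact) / abs(exact)
    print(f"u={u:<6} exact={exact:.6f}  linear={approx:.6f}  relative error={relative_error:.2%}")
```

The relative error stays tiny while u = x∕d is small and becomes noticeable as x grows toward a sizable fraction of d.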
The point here is that you never declare a physical quantity with units to be “small” in
some absolute sense: it is only meaningful to say that “quantity a is small compared to quantity
b,” where b has the same units as a. You can then describe the ratio a∕b as small, which is
often the first step in finding a useful approximation to a physical equation.
In many instances, though, we need something more specific than “x is much smaller
than d.” Suppose you need a given approximation to within 1%. Once you’ve estimated
a function with a partial sum of a Maclaurin series, how can you know how far off your
answer is? Appendix B gives a formula for the “Lagrange remainder,” an upper bound on
the difference between f (x) and a series approximation. Section 2.7.4 Problems 2.233–2.237
(see felderbooks.com) explore the use of this formula and other rules that bound the error
in series approximations.
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n = f(0) + f'(0)x + \frac{f''(0)}{2!} x^2 + \frac{f'''(0)}{3!} x^3 + \dots \tag{2.3.4}$$
For most functions, the first few terms of this series will provide a good approximation to
f (x) for values of x sufficiently near zero, and in the limit where you keep adding more terms
the sum will approach the exact value f (x). We discuss later in the chapter how to tell for
what values of x this statement will hold true.
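Equation 2.3.4 translates almost word for word into code. The following Python sketch assumes you already know the derivatives of your function at zero and supplies them as a list; the helper name `maclaurin_partial_sum` is hypothetical. As a test case it uses 1∕(1 − x), whose nth derivative at zero is n!, so every coefficient comes out 1.

```python
import math

def maclaurin_partial_sum(derivatives_at_zero, x):
    """Evaluate the sum of f^(n)(0)/n! * x^n over the derivatives supplied.

    derivatives_at_zero[n] must hold the nth derivative of f at x = 0,
    with derivatives_at_zero[0] = f(0).
    """
    return sum(d / math.factorial(n) * x ** n
               for n, d in enumerate(derivatives_at_zero))

# For f(x) = 1/(1 - x) the nth derivative at zero is n!.
derivatives = [math.factorial(n) for n in range(10)]
for x in (0.1, 0.5):
    print(x, maclaurin_partial_sum(derivatives, x), 1 / (1 - x))
```

Passing a longer list of derivatives gives a higher-order partial sum of the same series.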
2 Some people write the first term separately and start the series at n = 1, which avoids $0^0$ appearing in the first term
for x = 0.
of the famous “binomial series.” (Recall that you are writing the nth term in closed form, not the entire nth partial sum.)
2.48 (a) Fill in the following table to find a general formula for the Maclaurin series for any function f (x).
$$\begin{array}{ll}
\text{Equation} & \text{Plug in } x = 0 \\
f(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + c_4 x^4 + \dots & c_0 = f(0) \\
f'(x) = c_1 + 2c_2 x + 3c_3 x^2 + 4c_4 x^3 + \dots & c_1 = \\
 & c_2 = \\
 & c_3 = \\
 & c_4 =
\end{array}$$
(b) Express the formula from Part (a) in summation notation.
2.49 In Einstein’s theory of special relativity the total energy of a particle is given by $E = mc^2/\sqrt{1 - (v/c)^2}$ where m is the rest mass of the particle, v is the speed, and c, a constant, is the speed of light. (This problem does not require any knowledge of special relativity beyond that equation.)
For most everyday uses (such as racecars, supersonic jets, and rockets to the moon) the quantity v∕c is very small. We can therefore make this formula easier to work with by creating a Maclaurin series for this function, using v∕c as the variable and treating m and c as constants.
(a) Create a Maclaurin series for the function $1/\sqrt{1 - x^2}$ up to the second power of x.
(b) Use your answer to Part (a) to create a second-order approximation for the

We are going to simplify this electric field expression for two special cases. In the first case the particle is very close to the bar. This is the case where z ≪ L, so the unitless quantity z∕L is very small.
[Figure: labels z, L]
(a) We can rewrite the electric field as $E = \frac{2kQ}{zL}\left(\frac{1}{\sqrt{4x^2 + 1}}\right)$, where x = z∕L. Expand this function in a Maclaurin series up to the second power of x. Plug x = z∕L back in to get an approximate expression for E valid when the particle is near the bar.
(b) The electric field at a distance z from an infinite bar of charge is $E = 2k\lambda/z$, where $\lambda = Q/L$ is the charge per unit length on the rod. The first term that you found should be equal to this expression, while the second term makes the approximation more accurate since the bar is not infinite. Explain why the second term has the sign that it does. In other words, would you expect the field from the finite bar to be larger or smaller than the field from an infinite bar with the same charge density, and does your result confirm your expectation?
Our second special case is a particle very far from the bar. For this case L∕z will be very small.
[Figure: label z]
namely $E = kQ/z^2$. The second term makes the expression more accurate since the bar is not an infinitely small point. Explain why the field from a bar of charge Q should be larger or smaller than the field from a point of charge Q and check that the sign of your second term matches this expectation.
2.51 In introductory physics you learn that a projectile moves in a parabola, but that is only true if you assume that the acceleration due to gravity is constant. When you throw a projectile it actually briefly enters a very narrow elliptical orbit about the center of the Earth, which is then interrupted when the projectile hits the Earth’s surface. If the projectile is thrown horizontally and its distance from the center of the Earth is $R_E$ (the radius of the Earth), the equation for this ellipse is:
$$y(x) = \frac{1}{\rho(2 - \rho)} \left( V(1 - \rho) + \sqrt{V^2 - \rho(2 - \rho)x^2} \right)$$
The parameters are $V = v_0^2/g$ and $\rho = V/R_E$. These parameters are useful to keep the equations clean, but your answers should be expressed in terms of $v_0$, $g$, and $R_E$.
(a) Find the Maclaurin series for y(x) up to the zeroth order (the constant term). Simplify your result as much as possible and explain what it means.
(b) Find the Maclaurin series for y(x) up to the second order (the $x^2$ term).
(c) The equation for the ellipse is exact, but usually we use the simple approximation F = −mg for motion near Earth’s surface. Using this approximation find x(t) and y(t) and eliminate t to find y(x). (Remember that the projectile is thrown horizontally at an initial height of $R_E$.) Your result should match your answer to Part (b).
(d) Make a rough sketch (not to scale!) of the exact elliptical trajectory and the parabolic approximation you found to it. Include a circle for the Earth in your sketch.
2.52 Do Maclaurin series always work? Consider the function:
$$f(x) = \begin{cases} e^{-1/x^2} & x \neq 0 \\ 0 & x = 0 \end{cases}$$
(a) Is this function continuous at x = 0? Why or why not?
(b) Find a Maclaurin series for this function. (After a few terms, you should be able to see the pattern for all the terms.)
(c) As you add more and more terms, does your series approximate the original function better and better?
2.53 (a) Find the Maclaurin series for $y = \cos^2 x$ up to the eighth power of x. (This gets algebraically messy.)
(b) Identify the pattern and write the general formula for all terms in this series. (The first term will not follow the pattern, so write it separately.)
(c) Find the Maclaurin series for $y = \sin^2 x$. (This should be a trivial one-step derivation based on your answer to Part (b).)
2.54 Calculating 𝜋. If you want to know the value of 𝜋 to several decimal places you can of course just plug it into your calculator. But how were those decimals found, and how do computers nowadays calculate 𝜋 to billions of decimal places? Many of the methods used involve Taylor series.
(a) Find the Maclaurin series for $\tan^{-1} x$ out to seventh order. (You can get the derivatives you need from a computer program, a table of derivatives, or just by searching the Web.) From there you should be able to spot the pattern and be able to write it down to any order you want. Use this to write an equation of the form $\tan^{-1} x = \sum_{n=0}^{\infty}$ <something>.
(b) Plug x = 1 into the equation you just wrote. This should give you an expression involving 𝜋 on the left and a series involving simple fractions on the right. You can thus use this series to calculate 𝜋 to any desired accuracy. Use your series up to the seventh-order term to find an estimate of 𝜋.
(c) Now have a computer plug x = 1 into the 8th order polynomial, then the 9th, and so on up to the 1000th order polynomial. Generate a plot of the partial sum vs. n. The plot should converge toward the value of 𝜋. (In practice this series is not a great way to calculate digits of 𝜋 since it converges so slowly, as you saw in your plot. There are other, more complicated series that converge more quickly.)
$$f(x) = c_0 + c_1 (x - a) + c_2 (x - a)^2 + \dots = \sum_{n=0}^{\infty} c_n (x - a)^n$$
In the example below, we once again approximate the function 1∕(1 − x). In this case, how-
ever, instead of powers of x, we use powers of (x − 3). Successive terms of the resulting series
will decrease rapidly when x is close to 3. If the function is well behaved around x = 3 a few
terms of this series should therefore approximate the function well in that neighborhood.
The strategy is the same as for Maclaurin series, except that we plug in x = 3 at each step
to make all the terms of the series go to zero.
Each equation is found by taking the derivative of the one above it.
$$\begin{array}{ll}
\text{Equation} & \text{Plug in } x = 3 \\
\dfrac{1}{1-x} = c_0 + c_1 (x-3) + c_2 (x-3)^2 + c_3 (x-3)^3 + c_4 (x-3)^4 + \dots & c_0 = 1/(-2) \\
\dfrac{1}{(1-x)^2} = c_1 + 2c_2 (x-3) + 3c_3 (x-3)^2 + 4c_4 (x-3)^3 + \dots & c_1 = 1/(-2)^2 \\
\dfrac{2}{(1-x)^3} = 2c_2 + (2 \times 3)c_3 (x-3) + (3 \times 4)c_4 (x-3)^2 + \dots & c_2 = 1/(-2)^3 \\
\dfrac{2 \times 3}{(1-x)^4} = (2 \times 3)c_3 + (2 \times 3 \times 4)c_4 (x-3) + \dots & c_3 = 1/(-2)^4 \\
\dfrac{2 \times 3 \times 4}{(1-x)^5} = (2 \times 3 \times 4)c_4 + \dots & c_4 = 1/(-2)^5
\end{array}$$
Plugging these coefficients into the original equation, we conclude that:
$$\frac{1}{1-x} = -\frac{1}{2} + \frac{1}{4}(x-3) - \frac{1}{8}(x-3)^2 + \frac{1}{16}(x-3)^3 - \frac{1}{32}(x-3)^4 + \dots$$
This can be written more compactly in summation notation as
$$\frac{1}{1-x} = \sum_{n=0}^{\infty} \left(-\frac{1}{2}\right)^{n+1} (x-3)^n$$
[Figure 2.12: graphs of 1∕(1 − x) together with the partial sums −1∕2 + (x − 3)∕4 − (x − 3)²∕8 and −1∕2 + (x − 3)∕4 − (x − 3)²∕8 + (x − 3)³∕16]
Once again the more terms of this series we use, the closer it approximates the original
function, as we see in Figure 2.12. In this case however the series approximates 1∕(1 − x) best
for x-values close to 3, since higher order terms vanish quickly for those values.
Taylor series are the culmination of everything we have done so far in this chapter.
∙ A linear approximation is a special case of a Taylor series. If we take the linear approx-
imation to the function 1∕(1 − x) at x = 3, using the techniques developed earlier in
the chapter, we get y = −1∕2 + (1∕4)(x − 3). With the Taylor series, however, we can
add more terms to approximate the original function as accurately as we need.
∙ A Maclaurin series is also a special case of a Taylor series. A Taylor series can be built
to approximate a function around any specified x-value. If that value happens to be
x = 0, then the Taylor series is a Maclaurin series.
Polynomials are among the easiest functions to work with. We can add and subtract them,
compare them, take their derivatives, and integrate them in straightforward ways. The ability
to replace any function with a polynomial is therefore a tremendously powerful mathemati-
cal tool.
Once again, however, a cautionary note must be sounded. Just because you can build a
Taylor series does not mean that it works! The Taylor series we just built for 1∕(1 − x) works
wonderfully at x = 3.01; it approximates the function very well with only a few terms. At x = 4
the series still works, but requires far more terms to create a reasonable approximation. And
at x = 6 the series doesn’t work at all. (Try it and see what the terms are doing.) So there are
important questions of domain that are taken up in the final sections of this chapter.
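You can take up the invitation to “try it and see” with a few lines of code. This Python sketch sums the series built above, with coefficients (−1∕2)^(n+1) multiplying (x − 3)^n, at the three x-values just mentioned; the function name `taylor_partial_sum` and the chosen term counts are illustrative assumptions.

```python
def taylor_partial_sum(x, n_terms):
    """First n_terms terms of the Taylor series of 1/(1 - x) about x = 3."""
    return sum((-0.5) ** (n + 1) * (x - 3) ** n for n in range(n_terms))

for x in (3.01, 4.0, 6.0):
    exact = 1 / (1 - x)
    for n_terms in (3, 10, 20):
        partial = taylor_partial_sum(x, n_terms)
        print(f"x={x:<5} terms={n_terms:2d}  series={partial:14.6f}  exact={exact:.6f}")
```

At x = 3.01 three terms already agree with the function to many decimal places, at x = 4 the agreement improves slowly, and at x = 6 the partial sums swing ever more wildly because each new term is larger in size than the one before it.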
Applying this formula directly takes you through the same steps we used above (the same
derivatives and substitutions) in a way that you may find a bit quicker. On the other hand,
if you write out the process as we did above, you’re much more likely to remember how to
find a Taylor series a year from now.
3 Some people write the first term separately and start the series at n = 1, which avoids $0^0$ appearing in the first term
for x = a.
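$$\frac{d^2\theta}{dt^2} = h^2 \frac{\cos\theta}{\sin^3\theta} - \frac{g}{L}\sin\theta \tag{2.4.2}$$
Looks awful, doesn’t it? But most of the ugliness is just constants. If we define
$a = 4h^2\sqrt{3} - g/(2L)$ and $b = 40h^2 + (\sqrt{3}\,g)/(2L)$ then the differential equation is just
the following:
$$\frac{d^2\theta}{dt^2} = a - b\left(\theta - \frac{\pi}{6}\right) \tag{2.4.4}$$
with solution:
$$\theta = \frac{a}{b} + \frac{\pi}{6} + C_1 \sin\left(\sqrt{b}\,t\right) + C_2 \cos\left(\sqrt{b}\,t\right) \tag{2.4.5}$$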
We have shunted all the algebra in this problem to you in Problem 2.70, which may
leave you wondering why we bothered showcasing this example at all. But our point
here is not to demonstrate how to find a Taylor series, or how to solve a differential
equation, but how to use a Taylor series to turn an otherwise intractable problem into
something we can handle.
Students often get confused in such problems about the role the variables play. In
the differential equation and the solution, 𝜃 is the dependent variable and t the
independent. But our Taylor series takes a piece of the differential equation—a
function of 𝜃—and expands it around a given 𝜃-value. In that expression, 𝜃
momentarily acts as an independent variable.
And of course, we must not lose sight of the fact that our approximation is only
useful if 𝜃 begins, and stays, close to 𝜋∕6.
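To see how close “close to 𝜋∕6” needs to be, you can compare the right side of Equation 2.4.2 with its linear replacement a − b(𝜃 − 𝜋∕6) numerically. The sketch below is only an illustration: the values chosen for h, g, and L are arbitrary assumptions, and the function names are our own.

```python
import math

def exact_rhs(theta, h, g, L):
    """Right side of Equation 2.4.2: h^2 cos(theta)/sin^3(theta) - (g/L) sin(theta)."""
    return h ** 2 * math.cos(theta) / math.sin(theta) ** 3 - (g / L) * math.sin(theta)

def linear_rhs(theta, h, g, L):
    """Linear replacement a - b*(theta - pi/6), with a and b as defined in the text."""
    a = 4 * math.sqrt(3) * h ** 2 - g / (2 * L)
    b = 40 * h ** 2 + math.sqrt(3) * g / (2 * L)
    return a - b * (theta - math.pi / 6)

h, g, L = 1.0, 9.8, 1.0   # arbitrary sample values, assumed for illustration
for offset in (0.0, 0.001, 0.01, 0.1):
    theta = math.pi / 6 + offset
    print(f"offset={offset:<6} exact={exact_rhs(theta, h, g, L):10.5f}  "
          f"linear={linear_rhs(theta, h, g, L):10.5f}")
```

The two columns agree exactly at 𝜃 = 𝜋∕6 and drift apart as the offset grows, which is the practical meaning of the caution above.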
Stepping Back
We are now in a position to summarize everything we’ve said so far about Taylor series.
(You’ll find summaries of the basic formulas, as well as some commonly used Taylor
series, in Appendix B.)
One of the most common techniques used to simplify intractable problems is to replace a
complicated function with a polynomial that approximates the function within some range of
values. The higher the order of the polynomial, the better the approximation will generally
be, but scientists and engineers often use a first-order approximation because lines are so
easy to work with.
The process begins by writing this equation, which simply asserts that there exists a power
series replacement for f (x) around x = a:
$$f(x) = c_0 + c_1 (x - a) + c_2 (x - a)^2 + c_3 (x - a)^3 + \dots$$
By setting x = a on both sides you can find c0 . By taking the derivative of both sides and then
setting x = a again, you can find c1 , and so on. Equivalently, you can find the terms of the
series from Equation 2.4.1.
This process does not work in all cases, however. Specifically:
∙ A function f (x) can only have a Taylor series about x = a if all of its derivatives exist at
x = a.
∙ It is possible for the Taylor series for a function to diverge for some values of x, in which
case it is simply not usable for those values. In Section 2.7 we discuss how to determine
the values of x for which a given Taylor series converges.
∙ It is possible for a Taylor series to converge to the wrong function, but this is rare.
Strictly speaking, the only guarantee you get is that if a function f (x) has a power series
representation about x = a, the coefficients will be given by Equation 2.4.1. In practice, how-
ever, most Taylor series converge for values of x close to a, and the only cases where they
converge to the wrong value are contrived cases dreamt up by clever mathematicians. (See
for example Section 2.3, Problem 2.52.) Most scientists and engineers use Taylor series all
the time without worrying about them converging to the wrong value, and we have never
heard of a bridge collapsing as a result.
A more common problem is that, for many functions, the derivatives become complicated
and it becomes difficult to calculate new terms. In the next section we will show you a set of
methods that can often calculate Taylor series that would be tedious to find with the method
described above.
2.56 (a) Find the Taylor series for sin x around x = 𝜋. Express your answer in summation notation.
(b) Use the first three non-zero terms of your series to estimate sin(3).
(c) Calculate the percent error in your estimate from Part (b).
(d) Draw graphs of sin x, and of the first three partial sums of your series, for 𝜋∕2 ≤ x ≤ 5𝜋∕2. Does each partial sum approximate the original function better than the previous?
2.57 (a) Find the Taylor series for cos x around x = 𝜋∕3. (Hint: $\cos(\pi/3) = 1/2$ and $\sin(\pi/3) = \sqrt{3}/2$.)
(b) Use the first three non-zero terms of your series to estimate cos(1).
(c) Calculate the percent error in your estimate from Part (b).
(d) Draw graphs of cos x, and of the first three partial sums of your series, for −𝜋∕2 ≤ x ≤ 𝜋. Does each partial sum approximate the original function better than the previous?
2.58 (a) Find the Taylor series for ln x around x = 1. Express your answer in summation notation.
(b) Use the first three non-zero terms of your series to estimate ln(1∕2).

In Problems 2.59–2.67 find the specified Taylor series of the given function up to the third order. You may find it helpful to work through Problem 2.55 as a model.
2.59 $\sin(x^2)$ about $x = \sqrt{\pi}$
2.60 $1/\sqrt{x}$ about x = 1
2.61 $e^x + e^{-x}$ about x = ln 2
2.62 $\sin^2 x$ about x = 𝜋∕2
2.63 $\cos^2 x$ about x = 𝜋∕2. (Hint: If you’ve solved the previous problem there’s a quick way to do this one without taking any derivatives.)
2.64 $e^{-x^2}$ about x = 1
2.65 sec x about x = 𝜋. Hint: If you’re not sure how to take the derivative of sec x, rewrite it in terms of cos first.
2.66 $\sqrt{\tan x}$ about x = 𝜋∕4. Hint: If you’re not sure how to take the derivative of tan x, rewrite it in terms of sin and cos first.
2.67 sin x∕x about x = 2𝜋

2.68 (a) Fill in the following table to find the first five terms of the Taylor series for any function f (x) around any given value x = a.
$$\begin{array}{ll}
\text{Equation} & \text{Plug in } x = a \\
f(x) = c_0 + c_1 (x-a) + c_2 (x-a)^2 + c_3 (x-a)^3 + c_4 (x-a)^4 + \dots & c_0 = f(a) \\
f'(x) = c_1 + 2c_2 (x-a) + 3c_3 (x-a)^2 + 4c_4 (x-a)^3 + \dots & c_1 = \\
 & c_2 = \\
 & c_3 = \\
 & c_4 =
\end{array}$$
(b) Find the pattern in your answer to Part (a) and write a formula for the Taylor series of f (x) around the value x = a. Your answer should be an infinite series.
2.69 The function y = f (x) has a Taylor series around x = −2 given by:
$$f(x) = \frac{(x+2)^2}{4} + \frac{(x+2)^3}{9} + \frac{(x+2)^4}{16} + \frac{(x+2)^5}{25} + \dots = \sum_{n=2}^{\infty} \frac{(x+2)^n}{n^2}, \qquad -3 < x < -1$$
(a) Show that this function has a critical point at x = −2.
(b) Does this critical point represent a relative minimum, a relative maximum, or neither? How can you tell?
(c) The function g (x) is defined by two properties: g ′(x) = f (x), and g (−2) = 5. Find the Taylor series for g (x) around x = −2 up to the fourth power of x.
2.70 In the Explanation (Section 2.4.1) we work through the example of a spherical pendulum modeled by Equation 2.4.2. For a given pendulum g and L will be fixed, but different values of h will correspond to different possible motions with different angular momenta. For each value of h there will still be different motions possible with different initial conditions 𝜃 and d𝜃∕dt.
(a) We began our solution by expanding the right side of the differential equation in a Taylor series around 𝜃 = 𝜋∕6. Show the steps required to find the first two terms of the relevant Taylor series. In other words, derive Equation 2.4.3.
(b) To solve Equation 2.4.4 we begin by rewriting it as:
$$\frac{d^2\theta}{dt^2} + b\theta = a + b\frac{\pi}{6}$$
i. Write down and solve the complementary homogeneous equation.
ii. Find a particular solution.
iii. Put your solutions together to find the general solution.
(c) For any given value of h Equation 2.4.2 has one “constant solution”: one value where 𝜃 has no time-dependence. What must h be in order for 𝜃 = 𝜋∕6 to be a solution to Equation 2.4.2? What kind of motion does this solution describe?
(d) Plug in the value of h you found in Part (c) to express a and b in terms of g∕L. Plug these in to write your solution 𝜃(t) from Part (b) in terms of g∕L.
(e) Our final solution represents an oscillatory 𝜃. Express the period of this oscillation in terms of g and L.
2.71 A hydrogen molecule consists of two hydrogen atoms bound to each other by the interactions of their nuclei and electrons. While the details of the interactions are very complicated, it’s been found empirically that their behavior can be modeled by saying their potential energy V as a function of their separation r is given by the “Morse potential”: $V(r) = D\left(1 - e^{-b(r - r_0)}\right)^2$, where D, b and $r_0$ are positive constants.
(a) Sketch the Morse potential.
(b) Use a Taylor series to approximate V (r) near the minimum at $r = r_0$. Keep only the first non-zero term.
(c) You should have found a potential of the form $V(r) = (1/2)k(r - r_0)^2$, where k is a constant that depends on D and b. It can be shown that this potential function will lead to oscillation with frequency $\sqrt{k/(2m_p)}/\pi$ (which of course depends on the k you found). Given that $D = 7.24 \times 10^{-19}$ J, $b = 2 \times 10^{10}$ m$^{-1}$, and the mass of a hydrogen nucleus is $m_p = 1.67 \times 10^{-27}$ kg, find the oscillation frequency of a hydrogen molecule.
2.72 Exploration: The Gravitational Attraction of the Earth
Based on Newton’s Law of Universal Gravitation we can conclude that a body of mass m acting under the influence of the Earth’s gravitational attraction (such as a satellite) experiences a force given by $F = -GM_E m/r^2$ where r is the object’s distance from the center of the Earth. Since F = ma, the m terms on both sides cancel, leaving a differential equation:
$$\frac{d^2 r}{dt^2} = \frac{-GM_E}{r^2}$$
where G is a universal constant, and $M_E$ is also a constant, representing the mass of the Earth.
Unfortunately, this simple-looking differential equation cannot be solved in simple form. One approach is to replace the function $-GM_E/r^2$ with a Taylor series around $r = R_E$, the radius of the Earth.
(a) A “zeroth-order Taylor series” is the simplest approximation to any function: a horizontal line, corresponding to a constant function. Find the zero-order Taylor series approximation of the function $-GM_E/r^2$ at the point $r = R_E$.
(b) Where would you expect your answer in Part (a) to be most useful and accurate? Where would you expect it to be inaccurate and useless? (Your answer to this question should be expressed in non-technical and non-mathematical language.)
(c) Substituting that constant function for the original function, find a solution to Newton’s Law $d^2 r/dt^2 = -GM_E/r^2$.
(d) Now, using the values $G = 6.67 \times 10^{-11}$ m$^3$/(kg ⋅ s$^2$), $M_E = 5.97 \times 10^{24}$ kg, and $R_E = 6.37 \times 10^6$ m, estimate the acceleration due to gravity near the surface of the Earth. This should give you the familiar constant g from introductory mechanics.
(e) To increase our accuracy, create a first-order Taylor series (a linear approximation) for the function $-GM_E/r^2$ at the point $r = R_E$.
(f) The solution to Part (e) leads to a differential equation of the form $d^2 r/dt^2 = a + br$. What are the constants a and b in terms of our more fundamental constants G, $M_E$, and $R_E$?
(g) The general solution to the equation in Part (f) is:
$$r(t) = C_1 e^{\sqrt{b}\,t} + C_2 e^{-\sqrt{b}\,t} - a/b$$
Confirm that this is a valid solution of the differential equation. (You could find it yourself using guess-and-check.)
(h) Solve for the arbitrary constants in this solution using the initial conditions $r(0) = R_E$, $r'(0) = 0$: in other words, an object falling from rest near Earth’s surface. Use this to write an expression for the function r (t) that depends only on G, $M_E$, $R_E$, and t.
(i) Expand the solution in Part (g) in a Maclaurin series around t = 0, going out to 4th order in t. Write the resulting solution r (t) in terms of $R_E$ and g (which you
= found in Part (a)). The second-order
dt 2 r2
7in x 10in Felder c02.tex V3 - February 26, 2015 9:46 P.M. Page 76
version of this series is the familiar equation gravity is a constant. The fourth-order ver-
from introductory mechanics, based on sion of this series gives an equation that can
the approximation that the force of Earth’s be more accurate for long-distance falls.
In this section, we are going to find a number of Maclaurin series, all based on the four
basic series expansions given in Appendix B. In the problems you’ll show that these same
techniques can be applied to find Taylor series that aren’t Maclaurin series as well.
Here is our first example: suppose you want to find the Maclaurin series for the function \(\sin(x^2)\). The standard Taylor series process is tedious for this series: the first derivative is \(2x\cos(x^2)\), the second derivative is \(2\cos(x^2) - 4x^2\sin(x^2)\), and things only get worse from there.
We can avoid this process if we start with the Maclaurin series for \(\sin x\). According to Appendix B, the series
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \ldots \tag{2.5.1} \]
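is true for all x-values. For instance, if we replace every occurrence of x with 1/2, we get:
\[ \sin\frac{1}{2} = \frac{1}{2} - \frac{(1/2)^3}{3!} + \frac{(1/2)^5}{5!} - \frac{(1/2)^7}{7!} + \ldots \]
Replacing every occurrence of x with \(x^2\) instead gives the series we set out to find:
\[ \sin(x^2) = x^2 - \frac{x^6}{3!} + \frac{x^{10}}{5!} - \frac{x^{14}}{7!} + \ldots \]
It is equally valid to multiply both sides of Equation 2.5.1 by x:
\[ x \sin x = x^2 - \frac{x^4}{3!} + \frac{x^6}{5!} - \frac{x^8}{7!} \ldots = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+2}}{(2n+1)!} \]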
It is also valid to divide both sides of the original equation by x, which tells us that:
\[ \frac{\sin x}{x} = 1 - \frac{x^2}{3!} + \frac{x^4}{5!} - \frac{x^6}{7!} \ldots = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n+1)!} \tag{2.5.3} \]
We have to be careful here: dividing both sides by x is not valid when \(x = 0\), and the left side of Equation 2.5.3 is not defined in that case. However, it does work perfectly in the limit; you can show using l’Hôpital’s rule that \(\lim_{x \to 0} \sin x / x = 1\).
Now suppose we want the Maclaurin expansion of \(\cos x\) (and suppose we didn’t already have it). We can come back to Equation 2.5.1 and take the derivative of both sides, yielding:
\[ \cos x = 1 - \frac{3x^2}{3!} + \frac{5x^4}{5!} - \frac{7x^6}{7!} \ldots \]
A nice bit of simplification occurs when we note that \(5/(5!)\) is the same as \(1/(4!)\): the numerators in each term reduce the denominators, bringing us to the series for \(\cos x\) given in Appendix B.
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} \ldots \]
It may occur to you that there is another way to get from sine to cosine: by integrating! This
is often useful, and it is perfectly valid as long as you are careful about one thing: the first
(constant) term will be missing, since integrating adds one to each power of x. That first
term appears again as part of the +C that follows any indefinite integral; you solve for the C
by plugging in x = 0.
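\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \ldots \]
Integrating both sides gives:
\[ \int \sin x \, dx = \int \left( x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \ldots \right) dx \]
\[ -\cos x = \frac{x^2}{2} - \frac{x^4}{4!} + \frac{x^6}{6!} - \frac{x^8}{8!} \ldots + C \]
\[ \cos x = -\frac{x^2}{2} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} \ldots + C \]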
Plugging in x = 0 to both sides, we conclude that C = 1.
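The term-by-term differentiation and integration above can be checked mechanically. The following is a minimal sketch, not part of the original text: it stores each series as a list of exact coefficients, and the helper names (`sin_coeffs`, `cos_coeffs`, `differentiate`, `integrate`) are illustrative choices.

```python
from fractions import Fraction
from math import factorial

def sin_coeffs(order):
    """Coefficient of x**k in the Maclaurin series for sin x, for k = 0..order."""
    return [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 1 else Fraction(0)
            for k in range(order + 1)]

def cos_coeffs(order):
    """Coefficient of x**k in the Maclaurin series for cos x, for k = 0..order."""
    return [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 == 0 else Fraction(0)
            for k in range(order + 1)]

def differentiate(c):
    """Term-by-term derivative: c[k] x**k becomes k c[k] x**(k-1)."""
    return [k * c[k] for k in range(1, len(c))]

def integrate(c, constant):
    """Term-by-term antiderivative; the +C must be supplied explicitly."""
    return [Fraction(constant)] + [c[k] / (k + 1) for k in range(len(c))]

sin9 = sin_coeffs(9)

# Differentiating the sine series reproduces the cosine series.
assert differentiate(sin9) == cos_coeffs(8)

# Integrating the sine series gives -cos x, whose constant term is -cos(0) = -1.
# Negating every coefficient (so the constant becomes C = 1) recovers cos x.
minus_cos = integrate(sin9, constant=-1)
assert [-a for a in minus_cos] == cos_coeffs(10)
```

Exact fractions are used instead of floating-point numbers so that the comparison with the known cosine coefficients is an equality rather than an approximation.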
While it is valid to replace x on both sides of an equation with any function, the result is
not always a power series. For instance, suppose you wanted to find the Maclaurin series of
sin(sin x). You might be tempted to return to Equation 2.5.1 and replace every x with sin x,
yielding:
\[ \sin(\sin x) = \sin x - \frac{\sin^3 x}{3!} + \frac{\sin^5 x}{5!} - \frac{\sin^7 x}{7!} \ldots \]
This equation is valid for all x-values, but it is not the Maclaurin series for sin(sin x) or anything
else, because it is not a polynomial. It does not have the properties that make Taylor series
so useful to us: for instance, the first two terms do not define a line.
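To get a Taylor series, we can plug the Maclaurin series for the sine directly into itself.
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots \]
\[ \sin(\sin x) = \left( x - \frac{x^3}{3!} + \frac{x^5}{5!} \ldots \right) - \frac{1}{3!}\left( x - \frac{x^3}{3!} + \frac{x^5}{5!} \ldots \right)^3 + \frac{1}{5!}\left( x - \frac{x^3}{3!} + \frac{x^5}{5!} \ldots \right)^5 \ldots \]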
We can determine from this expression the coefficients of the actual Maclaurin series for \(\sin(\sin x)\). For example, the first expression in parentheses starts with x. The lowest power of x in the next part, \(\left(x - x^3/(3!) + x^5/(5!) \ldots\right)^3\), is \(x^3\). The lowest power in the next part is \(x^5\), and so on, so the only term with x raised to the power 1 is that first x, and the Maclaurin series thus begins
sin(sin x) = x + …
In Problem 2.85 you’ll expand out different parts of this expression to derive the next several
terms of the Maclaurin series.
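The bookkeeping of plugging one truncated series into another can also be handed to a short program. This is a sketch under stated assumptions: series are lists of exact coefficients, everything above `ORDER` is discarded, and the names `multiply` and `compose` are illustrative. Discarding high powers is safe here only because the inner series has no constant term, so high powers of it cannot feed back into low powers of x.

```python
from fractions import Fraction
from math import factorial

ORDER = 5  # keep terms through x**5

def multiply(a, b):
    """Product of two truncated power series, dropping powers above ORDER."""
    out = [Fraction(0)] * (ORDER + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= ORDER:
                out[i + j] += ai * bj
    return out

def compose(outer, inner):
    """Coefficients of outer(inner(x)), assuming inner has zero constant term."""
    result = [Fraction(0)] * (ORDER + 1)
    power = [Fraction(1)] + [Fraction(0)] * ORDER  # inner**0
    for c in outer:
        result = [r + c * p for r, p in zip(result, power)]
        power = multiply(power, inner)  # next power of the inner series
    return result

# Maclaurin coefficients of sin x through x**ORDER.
sin_series = [Fraction((-1) ** (k // 2), factorial(k)) if k % 2 else Fraction(0)
              for k in range(ORDER + 1)]

coeffs = compose(sin_series, sin_series)
print(coeffs[1])  # coefficient of x, which the argument above says must be 1
```

The remaining entries of `coeffs` are the higher-order coefficients, which can be used to check a hand calculation.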
This being a chapter on series, it’s appropriate to sum up. It is often easiest to generate a
Taylor series by starting with one that you already know. You can replace each occurrence of
x with another function, multiply both sides by a power of x, take the derivative, or integrate
(remembering the +C). These tricks can turn basic Taylor series into a wide variety of series
and avoid a lot of messy derivatives.
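But there is one other rule to bear in mind: algebra is always allowed! For instance, we have seen that:
\[ \frac{1}{1-x} = 1 + x + x^2 + x^3 \ldots \]
Suppose we want the Maclaurin series for \(1/(2 - x)\). We start by dividing the top and bottom by 2.
\[ \frac{1}{2-x} = \frac{(1/2)}{1 - (1/2)x} = \frac{1}{2}\left( \frac{1}{1 - (1/2)x} \right) \]
\[ \frac{1}{2-x} = \frac{1}{2}\left( 1 + \frac{1}{2}x + \left(\frac{1}{2}x\right)^2 + \left(\frac{1}{2}x\right)^3 \ldots \right) = \frac{1}{2} + \frac{1}{4}x + \frac{1}{8}x^2 + \ldots \]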
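A quick numerical check of this result is sketched below. The function name `partial_sum` and the test point \(x = 0.5\) are arbitrary choices for illustration. Because the series is geometric in \(x/2\), the partial sums should only be expected to settle down when \(|x| < 2\).

```python
def partial_sum(x, terms):
    """Sum of the first `terms` terms of 1/2 + x/4 + x**2/8 + ..."""
    return sum(x ** n / 2 ** (n + 1) for n in range(terms))

x = 0.5
exact = 1 / (2 - x)
for terms in (2, 5, 10, 20):
    # Each row shows the number of terms, the partial sum, and the exact value.
    print(terms, partial_sum(x, terms), exact)
```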
For most of the problems in this section, you should not have to start from first principles to find the given
series. Instead, find these series as variations of the ones given in Appendix B.
2.73 \(e^{\sqrt{x}} = 1 + \sqrt{x} + x/(2!) + x^{3/2}/(3!) + x^2/(4!) + \ldots\)
(a) Explain in your own words why this equation follows directly from one of the common Maclaurin series listed in Appendix B.
(b) Plug \(x = 9\) into both sides of this equation. If you use the fifth partial sum of the series to estimate the actual \(e^3\), how close do you come?
(c) If you find the Maclaurin series for \(e^{\sqrt{x}}\), you will not get this series. Why not?

In Problems 2.74–2.82, find the first five non-zero terms of the Maclaurin series. Then express the entire (infinite) series in summation notation.

2.74 \(2\cos(3x)\)
2.75 \(\sin x + \cos x\)
2.76 \(e^{3x}\)
2.77 \(x e^{x^2}\)
2.78 \(1/(1 + x)\)
2.79 \(1/(2 + x)\)
2.80 \(x^2\sqrt{1 + x}\) (For this problem it is not necessary to write the entire series in summation notation.)
2.81 \(\sqrt{4 + x^2}\) (For this problem it is not necessary to write the entire series in summation notation.)
2.82 \(\ln(1 - x)\)

2.83 Use one of the common Maclaurin series listed in Appendix B to find the Maclaurin series for \((1 + x)^2\). What happens to all the terms that are higher order than \(x^2\)?

2.84 Use one of the common Maclaurin series listed in Appendix B to find the Maclaurin series for \((1 + x)^3\).

2.85 Find the Maclaurin series for \(\sin(\sin x)\) through the \(x^5\) term.

2.86 Find the Maclaurin series for \(\cos(\sin x)\) through the \(x^5\) term.

2.87 Find the Maclaurin series for \(e^{\sin x}\) through the \(x^5\) term.

2.88 Find the Maclaurin series for \(\cos(\cos x)\) through the \(x^3\) term. Why is this harder than Problems 2.85–2.87?

2.89 In this problem, we’re going to find the Taylor series for \(\ln(x^2)\) around \(x = 1\).
(a) Begin by finding the Taylor series for \(\ln x\) around \(x = 1\). (This will not follow from our basic Maclaurin series: you will generate this Taylor series “from scratch.”) You should be able to take the first few derivatives and then spot the pattern so you can write the series in summation notation.
(b) Explain why it is not valid to plug \(x^2\) into your answer to Part (a) to find the Taylor series for \(\ln(x^2)\).
(c) Find and apply an alternative method to quickly and easily find the Taylor series for \(\ln(x^2)\) from the (already found) Taylor series for \(\ln x\).

2.90 In this problem, we’re going to find the Taylor series for \(\sqrt[3]{x^2}\) around \(x = 1\).
(a) Begin by finding the 3rd-order Taylor series for \(\sqrt[3]{x}\) around \(x = 1\).
(b) Explain why it is not valid to plug \(x^2\) into your answer to Part (a) to find the Taylor series for \(\sqrt[3]{x^2}\).
(c) Find and apply an alternative method to quickly and easily find the 3rd-order Taylor series for \(\sqrt[3]{x^2}\) from the (already found) Taylor series for \(\sqrt[3]{x}\).

2.91 In the Motivating Exercise (Section 2.1), we looked at an atomic nucleus in a crystal. Its motion is subject to the differential equation:
\[ \frac{d^2 x}{dt^2} = \frac{\kappa}{m}\left( \frac{1}{(d + x)^2} - \frac{1}{(d - x)^2} \right) \]
In the section on Maclaurin series, we replaced that function with a simpler one. In this problem we are going to repeat that exercise (and get the same answer!) with considerably less work.
(a) Begin by factoring out \(1/d^2\) from this force law. This re-expresses the force as a function of \(x/d\), a unitless quantity that we are assuming is small.
(b) Now, replace each of the two terms with a first-order Maclaurin series, with \(x/d\) as the variable. You can do this quickly and easily by using the \((1 + x)^p\) expansion.
(c) Solve the resulting differential equation to show that the nucleus will undergo simple harmonic oscillation. Find the period of this oscillation as a function of k, d, and the mass m of the nucleus.

2.92 The “Gaussian” function \(e^{-x^2}\) is defined for all x-values. Therefore, for any two given x-values, we can ask “What is the area under this curve between these two x-values?” and the answer will be finite and well-defined. Unfortunately, we cannot find the answer in the traditional way, because the function \(e^{-x^2}\) has no antiderivative that can be written in terms of elementary functions.
(a) Write the Maclaurin series for \(e^{-x^2}\) through the \(x^8\) term.
(b) Use your result from Part (a) to approximate \(\int_0^1 e^{-x^2}\,dx\).
(c) Check your answer by integrating on a calculator or computer.

2.93 (a) Find the Maclaurin series for \(1/(1 + x^2)\).
(b) Find the Maclaurin series for \(\tan^{-1} x\). (Hint: the derivative of \(\tan^{-1} x\) is \(1/(1 + x^2)\).)
(c) By plugging \(x = 1\) into your answer to Part (b), find a series that converges to \(\pi/4\).
(d) Add the first five terms of your answer to Part (c) and multiply the answer by 4 to approximate \(\pi\).

2.94 The term \(\mathcal{O}(x^n)\) means “terms including x raised to the n and higher powers.” For example, the Maclaurin series for \(\sin x\) can be written as \(\sin x = x - x^3/3! + \mathcal{O}(x^5)\), meaning the terms that weren’t written out explicitly include terms of order 5 and above. You can use this to help you keep track of what you’re neglecting in complicated calculations. As an example, consider finding the Maclaurin series for \(f(x) = (\sin x - x\cos x)/(3\sin x - x\cos x)\).
(a) Write out \(f(x)\) as a fraction, using the Maclaurin series for \(\sin x\) and \(\cos x\). Keep all terms through 6th order on top and through 4th order on bottom. Your numerator and denominator should end with \(\mathcal{O}(x^7)\) and \(\mathcal{O}(x^5)\) respectively.
(b) Divide top and bottom of the fraction by 2x. Dividing \(\mathcal{O}(x^5)\) by x turns it into \(\mathcal{O}(x^4)\). Explain why dividing \(\mathcal{O}(x^4)\) by 2 doesn’t do anything to it.
(c) Use the binomial expansion to turn the fraction into a product of the following form.
\[ f(x) = \left( \text{some stuff} + \mathcal{O}(x^6) \right)\left( \text{some other stuff} + \mathcal{O}(x^{\text{for you to figure out}}) \right) \]
(d) Multiply out the two series to get the Maclaurin series for \(f(x)\). Based on the \(\mathcal{O}\) symbols you’ve been carrying through the calculations, to what order is your answer accurate?
S = 5 + 15 + 45 + 135 + 405
1. What is 3S? (Write it out term by term; don’t add them up.)
2. Now, write the equation 3S = (what you just said), and below that, write our original
equation for S. Then subtract the two equations. You will find that, on the right side
of the equal sign, all but two of the terms cancel out.
3S =
S=
\[ S = a_1 + a_1 r + a_1 r^2 + a_1 r^3 + a_1 r^4 + \ldots + a_1 r^{n-1} \]
Multiply this whole equation (both sides) by r. Then, subtract that new equation from the original equation. You should once again find that all the terms on the right but two cancel, leaving a formula you can solve for S. Do that. When you are done, you should have a general formula for the sum of any geometric series that starts at \(a_1\) and goes on for n terms with common ratio r.
See Check Yourself #10 in Appendix L
\[ \frac{1}{1-x} = 1 + x + x^2 + x^3 + x^4 + \ldots \]
We have discussed the fact that this works very well for x = 1∕3, and does not work at all
for x = 2. This brings up the practical question: how can we determine when a Taylor series
works? But underlying that question is a more theoretical concern: what does it mean for an
infinite series to add up to anything at all?
As an example consider the series \(\sum_{n=1}^{\infty} (1/2)^n\), or \((1/2) + (1/4) + (1/8) + \ldots\). We begin our examination of this series visually, on the number line below.
[Figure: number line from 0 to 1]
1. The series begins at 1/2. Mark this point on the number line, and label it \(S_1\) for the first partial sum.
2. After the second term, the series reaches \((1/2) + (1/4)\). This is the second partial sum: label it \(S_2\) on the number line. (You are not labeling the number 1/4; you are labeling \((1/2) + (1/4)\), the series after two terms.)
3. Now label \(S_3\) which is \((1/2) + (1/4) + (1/8)\). (You can of course punch this into a calculator, but you will find the pattern more easily if you work with fractions instead of decimals.)
4. Find \(S_4\), \(S_5\), and \(S_6\). Write down their values and label them on the number line.
5. As we add more and more terms, where are we heading?
This brings us to the mathematical definition of an infinite series: the sum of an infinite series is defined as the limit as \(n \to \infty\) of the partial sums \(S_n\). Let’s see how that definition applies to a few different series.
6. Consider the series \(\sum_{n=1}^{\infty} (1/3)^n\).
(r )—that is, the ratio between any two adjacent numbers—is 3. Another example is 162, 54,
18, 6: in this case r = 1∕3.
A “recursive definition” of a sequence defines each term as a function of the previous one. So the recursive definition of a geometric sequence is \(a_n = r a_{n-1}\). (Read, “to generate any given term, multiply the previous term by the common ratio.”)
An “explicit definition” of a sequence defines the nth term without reference to the previ-
ous term. This type of definition is more useful, because it enables you to find (for instance)
the 20th term without finding 19 other terms first.
To find the explicit definition of a geometric sequence, we start listing the terms. Say the first term is \(a_1\). The second term is, of course, the first term multiplied by the common ratio, or \(r a_1\). The third term multiplies that by the common ratio, yielding \(r^2 a_1\). So we get a list like this:
\[ \begin{array}{ccccc} a_1 & a_2 & a_3 & a_4 & a_5 \\ a_1 & r a_1 & r^2 a_1 & r^3 a_1 & r^4 a_1 \end{array} \]
and so on. From this you can see the generalization that \(a_n = r^{n-1} a_1\), which is the explicit definition we were looking for. Read this as: “The nth term is obtained by starting with the first term and multiplying it by the common ratio (n − 1) times.”
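The difference between the two definitions shows up clearly in code. In this minimal sketch the function names are illustrative, and the test values come from the 162, 54, 18, 6 example above with \(r = 1/3\).

```python
from fractions import Fraction

def geometric_recursive(a1, r, n):
    """nth term via a_n = r * a_(n-1): needs n - 1 multiplications in sequence."""
    term = a1
    for _ in range(n - 1):
        term = r * term
    return term

def geometric_explicit(a1, r, n):
    """nth term via a_n = r**(n-1) * a1: no earlier terms are required."""
    return r ** (n - 1) * a1

r = Fraction(1, 3)
# Both definitions reproduce the fourth term of 162, 54, 18, 6.
print(geometric_recursive(162, r, 4), geometric_explicit(162, r, 4))
# The explicit form jumps straight to the 20th term.
print(geometric_explicit(162, r, 20))
```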
Subtracting the second equation from the first, we see that 2S = 486 − 2, so S = 242. We were thus able to add up the series without actually adding up all the terms. That may seem like a lot of trouble to add up five numbers, but remember that we could have added up a hundred numbers just as easily! (See, for instance, Problem 2.99 in this section.)
We can generalize this to an arbitrary geometric series. Recall that a generalized geometric series looks like \(a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \ldots + a_1 r^{n-1}\). So we can write
\[ \begin{aligned} S &= a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \ldots + a_1 r^{n-1} \\ rS &= a_1 r + a_1 r^2 + a_1 r^3 + a_1 r^4 + \ldots + a_1 r^{n-1} + a_1 r^n \quad \text{(confirm this!)} \\ S - rS &= a_1 - a_1 r^n \\ S(1 - r) &= a_1 (1 - r^n) \end{aligned} \]
\[ S = a_1 \frac{1 - r^n}{1 - r} \tag{2.6.1} \]
This is a general formula for the sum of any finite geometric series with the first term \(a_1\), common ratio r, and a total of n terms. This would be a good place to stop and check yourself: can you reproduce these quick derivations without looking? Can you apply them to a new sequence or series? Most importantly, do all the steps make sense to you?
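One way to check Equation 2.6.1 is to compare it against a term-by-term sum. The sketch below uses illustrative function names. The values \(a_1 = 2\), \(r = 3\), \(n = 5\) are an assumption inferred from the step 2S = 486 − 2 above, not quoted from the text.

```python
from fractions import Fraction

def geometric_sum(a1, r, n):
    """Equation 2.6.1: a1 * (1 - r**n) / (1 - r). Not valid when r == 1."""
    return a1 * (1 - r ** n) / (1 - r)

def brute_force(a1, r, n):
    """Add the n terms a1, a1*r, ..., a1*r**(n-1) one at a time."""
    return sum(a1 * r ** k for k in range(n))

# Fractions keep the arithmetic exact, so the two results can be compared with ==.
a1, r, n = 2, Fraction(3), 5
print(geometric_sum(a1, r, n), brute_force(a1, r, n))

# The formula is just as quick for a hundred terms as for five.
assert geometric_sum(a1, r, 100) == brute_force(a1, r, 100)
```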
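Some argued that the series must add up to 0, by grouping it like this:
\[ \underbrace{1 - 1}_{0} + \underbrace{1 - 1}_{0} + \underbrace{1 - 1}_{0} + \ldots \]
Others argued that it must add up to 1, by grouping it like this:
\[ 1 \underbrace{{} - 1 + 1}_{0} \underbrace{{} - 1 + 1}_{0} \underbrace{{} - 1 + 1}_{0} \ldots \]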
Still others with different definitions argued that the series added up to 1∕2! These mathe-
maticians were not just disagreeing about the “right answer” to this sum: they were disagree-
ing about what an infinite sum means. Once you settle on a definition, you can begin to look
for the answers.
The most commonly accepted definition today is based on partial sums.
(Equation 2.6.1 is not valid when r = 1, but it is easy to check that the infinite geometric
series diverges in this case.)
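\[ \text{Harmonic: } \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \frac{1}{9} + \frac{1}{10} + \frac{1}{11} + \frac{1}{12} + \frac{1}{13} + \frac{1}{14} + \frac{1}{15} + \frac{1}{16} + \ldots \]
\[ \text{Alternate: } \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \frac{1}{16} + \ldots \]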
The second series can easily be shown to diverge. It has two 1∕4 terms, which add up
to 1∕2. It has four 1∕8 terms, which also add up to 1∕2, and so on. The second series
is an infinite string of halves, which will therefore grow without bound.
But term for term, the elements in the harmonic series are at least as big as the
elements in this alternate series! So when the alternate series reaches 2, the harmonic
series will be higher than 2. When the alternate series reaches 100, the harmonic
series will be higher than 100, and so on. Since the alternate series grows without
bound, the harmonic series must also diverge.
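The comparison argument can be spot-checked numerically, although a finite computation illustrates divergence rather than proving it. In this sketch the helper names are illustrative, and both lists start at 1/2 to match the listing above.

```python
from fractions import Fraction

def harmonic_terms(count):
    """1/2, 1/3, 1/4, ... (the first `count` terms shown above)."""
    return [Fraction(1, n) for n in range(2, count + 2)]

def alternate_terms(count):
    """1/2, then 1/4 twice, 1/8 four times, 1/16 eight times, ..."""
    terms, denom = [], 2
    while len(terms) < count:
        terms.extend([Fraction(1, denom)] * (denom // 2))
        denom *= 2
    return terms[:count]

count = 15
h, a = harmonic_terms(count), alternate_terms(count)

# Term for term, the harmonic series is at least as big as the alternate one.
assert all(hi >= ai for hi, ai in zip(h, a))

# After these 15 terms the alternate series has collected four halves.
print(sum(a), float(sum(h)))
```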
Important notes:
1. This test refers to \(a_n\), the nth term of the series, not to \(S_n\), the nth partial sum.
2. This test can prove that a series diverges, but it can never prove that a series converges.
The nth-term test gives us a tool for examining a series without calculating the nth partial sum. As with all such tools, however, the validity of the test itself must ultimately be proven based on the \(S_n\) definition of convergence. (By analogy, remember that we rarely use the \(\lim_{h \to 0}\) definition to take a derivative, but the rules we do use, such as “derivative of sine is cosine” or the product rule, must be proven from that definition.)
So in order to understand why the nth-term test works, consider the fact that a series only
converges if the partial sums approach some finite value. That can’t possibly happen for the
series in our example above: no matter what value it seems to be approaching by the 100th
term, it will jump past that value in the 101st term, adding a number that is almost 1. The
series will continue to grow without bound.
If you follow that logic for this one example, you can probably see why it generalizes to
the statement that the partial sums can never approach a finite value unless the individual
terms are approaching zero.
The converse of the nth-term test would state that “For any series in which \(\lim_{n \to \infty} a_n = 0\), the series converges.” If that were true, proving that any series converges or diverges would be as easy as taking a limit! Unfortunately, no such guarantee can be given. For instance, in the “harmonic series” presented above, \(\lim_{n \to \infty} a_n = 0\) but the series diverges anyway. In Section 2.7 we will present other methods that can be used when the nth-term test fails.
a great service for the king, and the king asked the man to name his reward. The man pointed to a chess board. (Chess boards, then as now, had 64 squares.) “Your majesty, place one grain of wheat in the first square of this board. On the next square place two grains. On the next square four grains, and so on, each square getting twice as many grains as the square before it. When you have placed the last grains on the 64th square, I will take my reward and leave.” The king thought he was getting a great deal.
(a) How many grains of wheat would be placed on the final square?
(b) How many grains, total, would be placed on the chess board?

2.116 You buy a used car for $5000. You pay for it by borrowing $5000 on a credit card. The credit card company charges 15% interest: that is, every year, your debt multiplies by 1.15. (This assumes that you are not paying any of the money back.)
(a) How much do you owe after four years?
(b) How much do you owe after ten years?
(c) How much do you owe after n years?
(d) How long will it be before you owe $100,000?

2.117 In 1975 Intel founder Gordon Moore predicted that every two years, the number of transistors that can be inexpensively placed on an integrated circuit would double. This famous prediction is now known as “Moore’s Law.” At the time the Intel 8080 chip held 4500 transistors and Moore’s law has so far proven quite accurate.
(b) How many transistors are on the last circuit (the one from 2011)?
(c) How many transistors are there in the entire exhibit?
(d) Real life inventor and futurist Raymond Kurzweil has very publicly predicted that by the year 2020 a personal computer will have roughly the same computational capacity as a human brain. How many transistors will be on one circuit in that year?

2.118 On graduating from college, Johnny gets himself a good job and starts saving for retirement (good). His savings plan is that, each year, he stuffs some money under his mattress (bad). The first year, he puts aside $200. The second year, as his income goes up, he is able to put aside $220. The third year he puts aside $242. Each year, the amount he saves goes up by 10%.
(a) How much does Johnny put aside in the 30th year?
(b) In what year does Johnny first put aside over $10,000?
(c) After 40 years, as he is nearing retirement, Johnny needs to know how much money has piled up under his mattress over the years. Express this question in summation notation.
(d) Now answer the question.

2.119 One important use of series is in Riemann sums, which approximate an integral as a series of rectangles. In a left-hand Riemann sum the height of each rectangle is determined by the height of the function at the left-most point in the rectangle.
[Figure: graph of \(y = x^2\)]
(b) Use summation notation to represent the question in Part (a).
(c) Calculate the percent error if your answer in Part (a) is used to approximate \(\int_0^8 x^2\,dx\).
(d) Suppose the area under the function \(y = \ln x\) between \(x = 1\) and \(x = 4\) is calculated by using a left-hand Riemann sum with six rectangles of equal width. Use summation notation to write the resulting sum, and then calculate the answer.
(e) Suppose the area under the function \(y = (1.2)^x\) between \(x = 0\) and \(x = 50\) is calculated by using a left-hand Riemann sum with fifty rectangles of equal width. Use summation notation to write the resulting sum, and then calculate the answer.

2.120 One definition of “area under a curve” is the limit of a Riemann sum as the number of rectangles approaches infinity. In this problem you will approach the integral \(\int_0^{50} 1.2^x\,dx\) using this definition. (If you’re not familiar with Riemann sums they are defined in Problem 2.119.)
(a) Use summation notation to represent the area under this curve if you use a Riemann sum with n rectangles of equal width.
(b) Rewrite your answer in “closed form”: not a series, just a function of n, the number of rectangles.
(c) Use a calculator or computer to estimate the limit of your area as \(n \to \infty\).
(d) Evaluate \(\int_0^{50} 1.2^x\,dx\) and see how close your estimate was. Hint: To find the antiderivative of \((1.2)^x\) it is helpful to rewrite the integrand as \((1.2)^x = e^{x \ln(1.2)}\).

2.121 The series \(\sum (3^{n-1}/4^n)\) does not, on the face of it, appear to be a geometric series, but it is.
(a) Rewrite it in such a way that it is obviously geometric.
(b) Find the first term \(a_1\) and the common ratio r.
(c) Does the series converge or diverge? If it converges, what does it converge to?

2.122 An “arithmetic sequence” adds a constant to each term to generate the next term. An arithmetic sequence has a “common difference” d instead of a “common ratio” as a geometric sequence has. As an example, consider the arithmetic sequence 10, 13, 16, 19, 22.
(a) Find a recursive formula for generating this sequence: that is, a formula for the nth term \(a_n\) as a function of \(a_{n-1}\).
(b) What would be the 100th term in this sequence?
(c) Find an explicit formula for generating this sequence: that is, a formula for \(a_n\) as a function of n. (Make sure your formula correctly predicts that \(a_5 = 22\).)

2.123 Just as the Greek capital letter Sigma, \(\Sigma\), is used to indicate repeated addition, the Greek capital letter Pi, \(\Pi\), is used to indicate repeated multiplication.
(a) Compute \(\prod_{n=1}^{5} (n + 1)\).
(b) Write the product 2 × 6 × 18 × 54 in product notation.
(c) Rewrite the sum \(\sum_{n=1}^{50} \ln n\) using product notation.

2.124 You are saving money in an Individual Retirement Account (IRA). At the beginning of each year you deposit $1000 into your account. At the end of each year your IRA gains 6% interest; that is, the total amount of money in the IRA is multiplied by 1.06.
At the beginning of the first year you deposit $1000. At the end of the first year interest brings your total to $1060. At the beginning of the second year you deposit another $1000, bringing the total to $2060. At the end of the second year interest brings that total up to $2183.60.
(a) How much money do you have after ten years? (Hint: Work backward. Your last deposit was made after the ninth year. How much was it worth after the tenth year? Your next-to-last deposit was made after the eighth year. How much was it worth after the tenth year?)
(b) How much money do you have after y years?
(c) How much money do you have after thirty years?

2.125 If an insurance company is going to pay you $1000 in one year they need to set aside less than $1000 right now. Assuming they make 1.4% interest on their savings, they need to set aside approximately $1000 × 0.986 today to make a payment of $1000 next year. If they need to pay you $1000 two years from today they need to set aside roughly $1000 × \((0.986)^2\) today.
(a) How much money do they need to set aside today if they need to pay you $1000/year for the next thirty years? (This is called the “net present value” of the payout.)
(b) How much do they need to set aside if they need to pay you $1000/year forever?

2.126 The ancient Greek philosopher Zeno attempted to prove that all motion is impossible by means of the following paradox. Suppose you are walking to someone’s house. Before you can get there you must of course travel halfway there. Once that’s accomplished you just have to cover that same distance again, right? But not so fast: before you can do that you have to cover half the remaining distance. And after that step you have a fourth of the original distance left. Before you can cover that you have to travel half that distance, and so on. Because you have to cross an infinite number of distances, Zeno concluded that you can never get anywhere.
Translate Zeno’s paradox into an infinite series. Then, by showing that this series converges, show that it is indeed possible to take Zeno’s infinite number of steps in a finite time.

2.127 The series \(\sum_{n=1}^{\infty} 1/n^4\) converges, although we won’t be able to prove it until Section 2.7. Estimate the sum of this series by computing successive partial sums on your calculator or a computer and seeing what they tend to converge to.

2.128 Determine \(\sum_{n=0}^{99} \int_n^{n+1} x^3\,dx\).

2.129 The series \(1 + 2x + 4x^2 + 8x^3 + 16x^4 + \ldots\) is geometric. However, we cannot ask the question “does it converge or diverge?” because the answer depends on the value of x. Rather, we ask the question: “For what values of x does this series converge?” Make sure your answer takes into account both positive and negative values of x.

2.130 You flip a coin over and over. The probability that the first time you get heads is on the third flip is 1/8, because it requires starting with the sequence tails, tails, heads. More generally the odds that the first heads will occur on the nth flip is \(P(n) = 1/2^n\).
(a) Add up all the probabilities \(P(n)\) from n = 1 to \(\infty\). Explain why your answer makes sense for this scenario.
Your goal for the rest of this problem is going to be to find the “expectation value” of n. If you did this experiment many times and wrote down the value of n each time (i.e., how many flips it took you to get heads), the expectation value is what all of those results would tend to average to over time. The formula for it is \(\sum_{n=1}^{\infty} n/2^n\). This is not an easy sum to evaluate, but we’re going to show you a trick for making it easier.
(b) We define \(f(t) = \sum_{n=1}^{\infty} e^{nt}/2^n\). (The function \(f(t)\) is called a “moment generating function.”) Rewrite \(f(t)\) as a geometric series and find its value. Write your answer in the form of an equation: \(\sum_{n=1}^{\infty} e^{nt}/2^n = \) <something>.
(c) Take the derivative with respect to t of both sides of the equation you just wrote. The result should be a new equation that will still have an infinite sum on the left and a normal expression on the right, both functions of t.
(d) By evaluating both sides of this equation at t = 0, find the expectation value of n as defined above.

2.131 Exploration: Chaos Theory
Biologists model constrained population growth by the recursive sequence \(p_{n+1} = k p_n \left(1 - p_n\right)\), where \(p_n\) is the population after n generations and k is a positive constant called the “growth rate.” \(p_n\) is always between 0 (no population) and 1 (full capacity): p = 1/2 does not mean the population is half an animal, but that the population is at half its capacity.
(a) Letting k = 1.2 and \(p_1 = 0.1\), use a computer program to rapidly generate at least thirty generations. You will see the population approaching an equilibrium point. What is it?
(b) With k still set to 1.2, experiment with different values of \(p_1\) ranging from 0.1 to 0.9. Note that the resulting numbers differ each time but they approach the same limiting value.
(c) Repeat the experiment with k set to 1.5, 2.0, 2.3, and 2.7. Record in a table the k-values and the resulting equilibrium points. For each value of k be sure to try at least two values of \(p_1\).
(d) Repeat the experiment with k = 3.1. What happened?
(e) Repeat the experiment with k = 3.2. The results are similar, but not identical, to the results in Part (d). Describe them.
(f) Repeat the experiment with k = 3.5. What happened now?
(g) Finally, repeat the experiment with k = 3.7.
With k = 3.7 the sequence represents what mathematicians call “chaos.” There is no discernable pattern to the numbers. If you repeat the experiment with the exact same starting value, of course you will get the same results. However, if you change the starting value, even by a very small amount, the resulting numbers will be completely different. (This property goes by the imposing name “sensitive dependence on initial conditions.”)
(h) Chaotic behavior, including sensitive dependence on initial conditions, has been shown to apply to real-world systems including the weather and the stock market. What implications does this have for scientists attempting to predict the behavior of such systems?
Does the series $\frac{2}{1} + \frac{4}{1\times 2} + \frac{8}{1\times 2\times 3} + \frac{16}{1\times 2\times 3\times 4} + \dots$ converge or diverge?
This series can also be written as $\sum_{n=1}^{\infty} 2^n/(n!)$. Applying the ratio test…
\[ \left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{2^{n+1}/(n+1)!}{2^n/(n!)}\right| = \left|\left(\frac{2^{n+1}}{(n+1)!}\right)\left(\frac{n!}{2^n}\right)\right| = \left|\left(\frac{2^{n+1}}{2^n}\right)\left(\frac{n!}{(n+1)!}\right)\right| \]
Now, $2^{n+1}/2^n$ reduces to 2 by the rules of exponents. $n!/(n+1)!$ may be less familiar, but if you write out what each part means, it looks like $\frac{1\times 2\times 3\dots\times n}{1\times 2\times 3\dots\times n\times(n+1)}$ and everything cancels but the last term, leaving 1/(n + 1). So, returning to our ratio test, we have…
\[ \lim_{n\to\infty} |2/(n+1)| = 0. \]
Since the limit is less than 1, the series converges.
Pay close attention to those steps, because they are the same almost every time you apply
the ratio test. First, you write out the ratio. Then you rearrange the fractions so that similar
terms are grouped together: in this case, the two exponential terms, and the two factorial
terms. Then you simplify and take the limit.
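If you have a computer handy, a quick numerical check can make these steps concrete. The sketch below is our own illustration rather than part of the original example; the helper names `term` and `ratio` are made up for this purpose. It computes the ratio of successive terms of $\sum 2^n/(n!)$ directly and compares it with the simplified form 2/(n + 1).

```python
from math import factorial, e

def term(n):
    """n-th term a_n = 2**n / n! of the example series (n starts at 1)."""
    return 2**n / factorial(n)

def ratio(n):
    """The ratio |a_(n+1) / a_n|, which should simplify to 2 / (n + 1)."""
    return term(n + 1) / term(n)

# Compare the computed ratio with the simplified expression.
for n in (1, 5, 10, 50):
    print(n, ratio(n), 2 / (n + 1))

# Partial sum of the first 59 terms, next to e**2 - 1 for comparison.
partial_sum = sum(term(n) for n in range(1, 60))
print(partial_sum, e**2 - 1)
```

The ratios shrink toward zero, in line with the limit found above. The partial sum settles near $e^2 - 1$, which is what you get by setting x = 2 in the Maclaurin series for $e^x$ and dropping the n = 0 term.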
The three examples above illustrate the process for applying the ratio test to power series.
They also illustrate the types of series where the ratio test is most powerful—those involving
exponential functions and factorials—since the ratio an+1 ∕an simplifies naturally for these
functions.
Less obviously, these three examples illustrate the only three possible results of applying
the ratio test to a power series.
1. Some power series converge everywhere.
2. Some power series diverge everywhere except at one point: the x-value around which the power series was built. (At this value, a power series obviously converges to $c_0$, since all the other terms are zero!)
3. Some power series converge in a given interval, and diverge outside that interval. The center of the interval is always the x-value that the series was built around: for instance, a power series built around x = 5 (i.e., $\sum c_n (x - 5)^n$) may converge for all x-values between 3 and 7, or between −1 and 11. At the end points of this interval, the ratio test is inconclusive, and other techniques (discussed in the following section) must be used.
One final question must be raised. We proved earlier that the series $\sum x^n/(n!)$ converges for all x-values, but not that it converges to $e^x$. Is it possible that it converges to the wrong thing?
It turns out that $\sum x^n/(n!)$ does in fact converge to $e^x$ for all x-values. However, there is no such mathematical guarantee for all convergent Taylor series: for instance, we saw in Section 2.3 Problem 2.52 that the function $e^{-1/x^2}$ can be expanded into a Maclaurin series that does not converge back to $e^{-1/x^2}$. For full mathematical rigor, after you prove that a Taylor series converges, you must then prove that it converges to the function you want. You can find a more thorough treatment of this issue in many calculus texts. We ignore it here for a purely pragmatic reason: it is possible in principle, but extremely rare in practice, for a Taylor series to converge to the wrong function.
\[ \frac{|a_{n+1}|}{|a_n|} < 0.91 \;\rightarrow\; |a_{n+1}| < 0.91\,|a_n| \]
Now consider a series with $a_{n+1} = 0.91a_n$. Such a series would be geometric with r = 0.91,
so it would converge. And the individual terms of this series would always be higher than the
terms of our actual series. We therefore conclude that our original series cannot possibly
blow up, and must converge.
(This intuitive notion can be expressed more rigorously in terms of the “basic comparison
test” discussed in the next section. If you’re concerned about the absolute values in that
equation, that will be addressed in “absolute convergence,” also in the next section. But
even without all that analysis you should be able to convince yourself that since the geometric
series converges, our always-closer-to-zero series converges too.)
This example proves half of the ratio test: if $\lim_{n\to\infty} |a_{n+1}/a_n| < 1$ then the series must converge. Proving the other half is left for you to tackle in Problem 2.146.
As we have seen, the ratio test has an important limitation: if the limit comes out 1, then the test is inconclusive. Many power series converge on a finite interval, and the ratio test provides no hint of whether these series converge at the end points of this interval. We therefore need to apply other tests.
Below we list the most common tests used to find whether a given series converges or diverges and give pointers on how to choose an appropriate test for a given series. The reasons why these tests work are explored in the problems.
A few brief notes are in order before the tests are listed.
1. Convergence and divergence happen at the tail of a series, not at the beginning. For instance, if you ignore the first five hundred terms of a series, and prove that the remaining (still infinitely long) series converges, then the original series must still converge. (Of course, those first terms that you ignored, although they are not relevant to whether the series converges, are often the most important in estimating what the series converges to!)
2. Remember that we have already seen a few key ways of determining if a series converges: the nth-term test, the rule for geometric series, the trick for “telescoping” series, and of course the ratio test.
The tests for series convergence that we describe below, along with the ones we have
already covered, are summarized in Appendix C.
You can see why the integral test works in general by viewing the series as a Riemann Sum
for the integral: see Problem 2.181.
Note, however, that the improper integral and the series are not the same. For instance, you can easily determine that $\int_1^{\infty} (1/x^2)\,dx$ converges to 1. This does not mean that $\sum_{n=1}^{\infty} (1/n^2)$ converges to 1; it only proves that the series converges. (In fact it converges to $\pi^2/6$.)
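A short numerical experiment shows the difference between the two limits. This is only a sketch of our own; the function names `partial_sum` and `integral_to` are hypothetical helpers, and the integral is evaluated from its exact antiderivative rather than numerically.

```python
from math import pi

def partial_sum(N):
    """Sum of 1/n**2 for n = 1..N."""
    return sum(1 / n**2 for n in range(1, N + 1))

def integral_to(b):
    """Exact value of the integral of 1/x**2 from 1 to b, namely 1 - 1/b."""
    return 1 - 1 / b

# Both columns level off, but at different values.
for N in (10, 1000, 100000):
    print(N, partial_sum(N), integral_to(N))

print(pi**2 / 6)  # the value the series approaches
```

Both quantities converge, which is all the integral test promises; the values they converge to are different.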
This test works for almost any function you can integrate. The primary limitation is that
some functions—for instance, anything involving factorials—cannot be integrated.
∙ $\sum_{n=1}^{\infty} (\ln n/n)$ Be careful! The integral test requires a function that is everywhere decreasing. The function $\ln x/x$ increases until x = e and decreases thereafter. But we can prove convergence or divergence after a certain point, so we can start at n = 3 and prove convergence or divergence from there. The first two terms will not invalidate our result.
\[ \int_3^{\infty} \frac{\ln x}{x}\,dx = \left.\frac{(\ln x)^2}{2}\right]_3^{\infty} \]
which diverges. The integral test therefore promises us that the original series diverges.
∙ $\sum_{n=1}^{\infty} (n/e^n)$
\[ \int_1^{\infty} \frac{x}{e^x}\,dx = \left.\frac{-x-1}{e^x}\right]_1^{\infty} = \frac{2}{e} \]
The integral test therefore shows that the original series converges. It does not, however, tell us what the original series converges to.
As a final note, the lower limit of your integral need not always be 1. Remember that
(assuming no terms are undefined) you can prove convergence or divergence of a series
starting with any convenient term.
p-Series
A p-series is any series of the form $\sum (1/n^p)$. These series follow a simple convergence rule.
p-Series
If p ≤ 1, the series $\sum (1/n^p)$ diverges; if p > 1, then the series converges.
You can prove this rule with the integral test: see Problem 2.182.
EXAMPLES p-series
∙ $\sum_{n=1}^{\infty} (1/n^2)$ is a p-series with p > 1. It therefore converges.
∙ $\sum_{n=1}^{\infty} (1/\sqrt{n})$ is a p-series with p < 1. It therefore diverges.
This is one of the easiest tests to apply: if you recognize the form of a series as a p-series,
then you know immediately what the series does. In that sense, working with p-series is very
similar to working with geometric series (which, as you may recall, converge if and only if
|r| < 1). In fact, the two types are so similar that the biggest danger is confusing them! Be careful about this: $\sum (1/2^n)$ is a geometric series; $\sum (1/n^2)$ is a p-series.
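To see the two kinds of series side by side, you can tabulate a few partial sums. The following is a minimal sketch under our own naming (`p_series_partial_sum` and `geometric_partial_sum` are not from the text); partial sums can only suggest convergence or divergence, they do not prove it.

```python
def p_series_partial_sum(p, N):
    """Sum of 1/n**p for n = 1..N (n is the base, p the fixed exponent)."""
    return sum(1 / n**p for n in range(1, N + 1))

def geometric_partial_sum(r, N):
    """Sum of r**n for n = 1..N (r is the fixed base, n the exponent)."""
    return sum(r**n for n in range(1, N + 1))

for N in (100, 10000, 100000):
    print(N,
          p_series_partial_sum(2, N),      # p = 2: levels off
          p_series_partial_sum(0.5, N),    # p = 1/2: keeps growing
          geometric_partial_sum(0.5, N))   # r = 1/2: levels off
```

The p = 2 column and the geometric column settle down, while the p = 1/2 column keeps climbing, matching the rules stated above.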
Be careful to avoid using the basic comparison test backward. If your series is smaller than
some other series that diverges, that doesn’t prove anything about your series one way or the
other. If your series is larger than a convergent series, that doesn’t help either. In order to use
this test, you must be able to say “My series is smaller than a convergent series” or “My series
is larger than a divergent series.”
The basic comparison test is usually used to compare your series with a geometric series
or a p-series, since those have simple convergence rules.
∙ $\sum_{n=2}^{\infty} 1/(\sqrt{n} - 1)$. We can readily see that $1/(\sqrt{n} - 1) \ge 1/\sqrt{n}$ for all n values. Since $\sum 1/\sqrt{n}$ is a p-series with p = 1/2, it diverges. The basic comparison test therefore proves that the original series diverges as well.
∙ $\sum_{n=1}^{\infty} 1/(\sqrt{n} + 1)$. Now $1/(\sqrt{n} + 1) \le 1/\sqrt{n}$ for all n-values. However, being smaller than a known divergent series does not prove anything; therefore the basic comparison test does not help with this example. (Below we will see a different test that can be used on this series.)
∙ $\sum_{n=1}^{\infty} (\ln n/\sqrt{n})$. Here we are going to use the inequality ln n > 1. Of course that is not true for the first few n-values, but we can prove convergence or divergence after a certain point and those first couple of terms will not invalidate our result. With that in mind we write $(\ln n/\sqrt{n}) > (1/\sqrt{n})$. The latter is a divergent p-series (p = 1/2) which assures us that the original series diverges as well.
When using the basic comparison test, you may want to recall these inequalities:
−1 ≤ sin n ≤ 1
−1 ≤ cos n ≤ 1
0 ≤ cos−1 n ≤ 𝜋
In practice, you use the limit comparison test when you begin with an intuition that
“for very high values of n, my (complicated) series is really a lot like this other (simpler)
series.”
∙ $\sum_{n=1}^{\infty} 1/(\sqrt{n} + 1)$. We saw above that the basic comparison test doesn’t help us with this one, because being less than a divergent series does not prove anything. We intuitively note, however, that when n is very large, $\sqrt{n}$ is much bigger than 1, so this series should behave a lot like $\sum (1/\sqrt{n})$. We can make this intuition rigorous with the limit comparison test.
\[ \lim_{n\to\infty} \frac{1/\sqrt{n}}{1/(\sqrt{n} + 1)} = 1 \]
\[ \lim_{n\to\infty} \frac{(n + 1)/(2n^3 + 3n^2 - 5n)}{1/n^2} = \frac{1}{2} \]
Since $\sum 1/n^2$ is a convergent p-series, the original series must also converge.
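The two limits above can be checked numerically by evaluating each ratio at large n. This is an illustrative sketch only; the function names are ours, and plugging in large numbers supports a limit without proving it.

```python
from math import sqrt

def ratio_sqrt(n):
    """(1/sqrt(n)) divided by (1/(sqrt(n) + 1))."""
    return (1 / sqrt(n)) / (1 / (sqrt(n) + 1))

def ratio_rational(n):
    """((n + 1)/(2n^3 + 3n^2 - 5n)) divided by (1/n^2); needs n >= 2."""
    return ((n + 1) / (2 * n**3 + 3 * n**2 - 5 * n)) / (1 / n**2)

for n in (10, 1000, 10**6):
    print(n, ratio_sqrt(n), ratio_rational(n))
```

The first column of ratios drifts toward 1 and the second toward 1/2. Both limits are finite and non-zero, which is the situation the limit comparison test is used for here.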
Alternating Series
An “alternating series” alternates between positive and negative values. Such series can be
written by multiplying the terms of a positive-term series by $(-1)^n$. For instance,
\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n^2} = -1 + \frac{1}{4} - \frac{1}{9} + \frac{1}{16} \dots \]
Note what the $b_n$ terms represent in this formulation: not the terms in the original (alternating) series, but their absolute values. If these terms are everywhere decreasing and approaching zero, then the original (alternating) series converges.
for all n-values, and $\lim_{n\to\infty} (1/n) = 0$, we can conclude that this series converges by the alternating series test. Since the positive-term harmonic series has been shown divergent, we conclude that $\sum_{n=1}^{\infty} (-1)^n/n$ is conditionally convergent.
∙ $\sum_{n=1}^{\infty} (-1)^{n+1} n/(n^2 + 1)$ is more difficult to analyze. If we remove the alternator we find the series $\sum_{n=1}^{\infty} n/(n^2 + 1)$ which diverges. (Prove this using the limit comparison test with 1/n.) So we know that our series does not absolutely converge; we will apply the alternating series test to determine if it converges at all.
It is not difficult to prove (using l’Hôpital’s Rule, for instance) that $\lim_{n\to\infty} n/(n^2 + 1) = 0$. But what of the other part of the alternating series test: how do we show that the series is everywhere decreasing? By using the derivative! The derivative of $n/(n^2 + 1)$ is $(1 - n^2)/(n^2 + 1)^2$. Since the denominator is always positive and the numerator is negative for all n > 1, the derivative is negative, which means the original function is decreasing. We can therefore pronounce $\sum_{n=1}^{\infty} (-1)^{n+1} n/(n^2 + 1)$ “conditionally convergent” by the alternating series test.
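Conditional convergence is easy to watch on a computer. The sketch below (our own, with hypothetical helper names) compares partial sums of the alternating series $\sum (-1)^n/n$ with those of the positive-term harmonic series. The value −ln 2 used for comparison is a standard result that is not derived in this section.

```python
from math import log

def alternating_partial_sum(N):
    """Sum of (-1)**n / n for n = 1..N."""
    return sum((-1)**n / n for n in range(1, N + 1))

def harmonic_partial_sum(N):
    """Sum of 1/n for n = 1..N (the same terms without the alternator)."""
    return sum(1 / n for n in range(1, N + 1))

for N in (10, 1000, 100000):
    print(N, alternating_partial_sum(N), harmonic_partial_sum(N))

print(-log(2))  # value the alternating sums settle toward
```

The alternating sums level off while the harmonic sums keep growing slowly, which is exactly what “conditionally convergent” describes.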
The two end points of an interval of convergence are, in general, unrelated to each other: they may both converge, they may both diverge, or (as in the example above) one may converge and the other diverge.
2.165 $\sum_{n=1}^{\infty} (-1)^n/\sin n$
2.166 $\sum_{n=1}^{\infty} (-1)^n (\ln n)/n$
CHAPTER 3
Complex Numbers
∙ use Ordinary Differential Equations (ODEs) to model physical situations (see Chapter 1).
∙ solve ODEs by using separation of variables or “guess and check” and linear superposition
(see Chapter 1).
∙ graph functions in polar coordinates, and convert between polar and Cartesian
representations.
∙ in addition, you should have some basic familiarity with imaginary numbers.
After you read this chapter, you should be able to …
When René Descartes coined the term “imaginary” for the square root of a negative number, he intended the word as an insult. Like most mathematicians of the time he viewed any attempt to take the square root of a negative number as a beginner’s error, like dividing by zero or starting a land war in Asia.
Today these “imaginary numbers” are indispensable in electrical engineering, quantum
mechanics, and many other fields. In this chapter we will discuss the nature of such numbers,
how to work with them, and how to visualize them on a graph. But we will also see how these
numbers help solve important problems.
If you have never worked with complex numbers before, the introduction in Sections 3.2 and 3.3 will be too brief; we recommend you begin by consulting an algebra text to gain some experience with the basics. On the other hand, if it’s been a few years since you encountered i, those sections may be useful to polish some slightly rusted skills and get you ready for the rest of the chapter.
1. Explain, using words and equations, why the equation $x^2 = -1$ has no real answer, but $x^3 = -1$ does.
But suppose there were an answer to $x^2 = -1$? It couldn’t be a real number such as 5, −3/4, or π. So we give it a new name: i for the imaginary number.
If $i^2 = -1$ then what is
2. i + 5i?
3. $(i + 1)^2$? (Hint: it isn’t zero!)
4. (5 + 3i)(5 − 3i)?
See Check Yourself #14 in Appendix L
5. $(3i)^2$?
6. $(\sqrt{2}\,i)^2$?
7. $(\sqrt{2i})^2$?
8. $\sqrt{-25}$?
9. $\sqrt{-3}$?
10. i(−i)?
11. $(-i)^2$?
12. $(-3i)^2$?
IMPORTANT: For the purpose of these definitions, the variables a and b designate real numbers.
In Words | Symbolically
Since no real number can square to give a negative answer, we define a new number i that does. | $i \equiv \sqrt{-1}$ or $i^2 = -1$
A “real number” has no i in it. If you square a real number, you get a non-negative real number. | $a^2 \ge 0$
An “imaginary number” is a real number times i (or) an “imaginary number” is the square root of any negative real number. | $(bi)^2 \le 0$
A “complex number” is the sum of a real number and an imaginary number. | z = a + bi
“Re” designates the “real part” of a complex number, the part that is not multiplied by i. | Re(a + bi) = a
“Im” designates the “imaginary part” of a complex number, the real number that is multiplied by i. | Im(a + bi) = b
Algebra involving i is generally no different from algebra involving any other number. If
you multiply i by 5, you get 5i. If you add i to 5i, you get 6i. But you have to keep in mind
that $i^2 = -1$, and that your answer can always be written in the form a + bi.
Powers of i
The first two powers of i are easy: $i^1 = i$ (of course) and $i^2 = -1$ (by definition). To find $i^3$ we multiply $i^2$ by i. We get −1 × i = −i.
To find $i^4$, we multiply by i again: −i × i = −1 × i × i = −1 × −1 = 1.
Multiplying by i again, we find that $i^5 = i$; we’re back where we started! You don’t need to do any more hard thinking to figure out the next powers: $i^6$ will be −1 and so on.
$i^1$ | $i^2$ | $i^3$ | $i^4$ | $i^5$ | $i^6$ | $i^7$ | $i^8$ | $i^9$ | …
i | −1 | −i | 1 | i | −1 | −i | 1 | i | …
$i^4$ and $i^8$ are both 1, and you can probably see that $i^{12}$ and $i^{16}$ will come out the same way. Raising i to a multiple of 4 is easy: $i^{40}$ and $i^{68}$ and $i^{4352}$ are all 1.
And what about, say, $i^{31}$? The key is figuring out where you are in the cycle of four. We know that $i^{32} = 1$, so we are right before that in the cycle, which puts us at −i.
$i^{30}$ | $i^{31}$ | $i^{32}$ | $i^{33}$
−1 | −i | 1 | i
Whenever you are asked to find i to a power, find a nearby multiple of four, and use that to locate yourself in the cycle.
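The cycle-of-four rule translates directly into a remainder calculation. Here is a minimal sketch, assuming a helper we have named `power_of_i` (not something from the text); Python writes i as `1j`.

```python
# The four values in the cycle: i**0, i**1, i**2, i**3.
CYCLE = (1, 1j, -1, -1j)

def power_of_i(n):
    """Return i**n for an integer n by locating n in the cycle of four."""
    return CYCLE[n % 4]

for n in (6, 31, 32, 4352):
    print(n, power_of_i(n))
```

Taking the remainder on division by 4 plays the same role as “finding a nearby multiple of four”: 31 leaves a remainder of 3, which is the −i slot.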
So whenever we assert that two complex numbers equal each other we are actually making
two separate, independent statements.
Question: If 3x + 4yi + 7 = 4x + 8i, where x and y are real numbers, what are x and y?
Solution:
Normally it is impossible to solve one equation for two unknowns, but this is really
two separate equations.
Real part on the left = real part on the right: 3x + 7 = 4x
Imaginary part on the left = imaginary part on the right: 4y = 8
We can solve these equations to find x = 7, y = 2.
And what about inequalities? The answer may surprise you: there are no inequalities with
complex numbers.
The real numbers have a property that for any two real numbers a and b, exactly one of the
following three statements must be true: a = b, a > b, or a < b. This may seem too obvious
to glorify with the name “property,” but it becomes more interesting when you realize that
it is not true for complex numbers. i is not equal to 1, greater than 1, or less than 1; it is just
different from 1. (This corresponds to the fact that real numbers can be represented on a
number line. Choose any two distinct points, and one of them is to the left of the other. We
shall see in Section 3.3 that complex numbers can be represented on a plane, which offers
no such comparison.)
For the fraction 1∕i we accomplish the same thing by multiplying the top and bottom of
the fraction by −i to find that the real part is 0 and the imaginary part is −1.
How about 10/(3 − 4i)? Your first impulse might be to multiply the numerator and denominator by i, or by 3 − 4i. Give these a try: they don’t help! What does help is multiplying by 3 + 4i.
Problem:
Find the real and imaginary parts of the number $\frac{10}{3 - 4i}$.
Solution:
$z = \frac{10}{3 - 4i}$ | A complex number not currently in the form a + bi
$= \frac{10(3 + 4i)}{(3 - 4i)(3 + 4i)}$ | Multiply the top and bottom by the same (carefully chosen) number
$= \frac{30 + 40i}{3^2 - (4i)^2}$ | Remember that $(x + a)(x - a) = x^2 - a^2$
$= \frac{30 + 40i}{9 + 16}$ | $(4i)^2 = 4^2 i^2 = 16(-1) = -16$, which we are subtracting from 9
$= \frac{30 + 40i}{25}$ | The bottom no longer has i. The top does, but this is easy to deal with.
$= \frac{30}{25} + \frac{40i}{25}$ | Break up the fraction
$= \frac{6}{5} + \frac{8}{5}i$ | So we’re there! $\text{Re}(z) = \frac{6}{5}$ and $\text{Im}(z) = \frac{8}{5}$
Why did we choose 3 + 4i, and why did it work? This trick is a specific example of using a
complex conjugate.
When you multiply any complex number by its complex conjugate, the result is a non-
negative real number. The square root of that number is called the “modulus” of z, and is
written |z|. We’ll have much more to say about moduli in Section 3.3, but for now the result
we need is simply that zz ∗ is real.
In the example above we multiplied the top and bottom of a fraction by the complex
conjugate of the denominator. Because this gave us a real number in the denominator, it
was easy to separate the resulting fraction into real and imaginary parts.
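The same procedure can be written as a short routine. This is a sketch under our own naming (`divide` is a hypothetical helper); Python’s built-in complex type already supports division, so the point here is only to mirror the steps of the worked example.

```python
def divide(numerator, denominator):
    """Divide by multiplying top and bottom by the complex conjugate
    of the denominator, then splitting into real and imaginary parts."""
    conj = denominator.conjugate()
    real_bottom = (denominator * conj).real  # z z* is real: a**2 + b**2
    top = numerator * conj
    return complex(top.real / real_bottom, top.imag / real_bottom)

z = divide(10, 3 - 4j)
print(z.real, z.imag)        # real and imaginary parts of 10/(3 - 4i)
print(10 / (3 - 4j))         # built-in division, for comparison
print(divide(1, 1j))         # the 1/i example
```

For 10/(3 − 4i) the routine reproduces the real part 6/5 and imaginary part 8/5 found above, and for 1/i it gives −i.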
Our definition tells you how to find the complex conjugate of a number that is in a + bi form. If a number is expressed in a different form, you can of course put it in a + bi form and then switch the sign of b, but there is generally an easier way: replace i with −i everywhere in the expression. For example, the complex conjugate of $\sin^2(3/i)$ is $\sin^2(3/(-i))$. This is not guaranteed to work in all cases, but as a practical matter it works for virtually any expression you are likely to work with.
Problem:
Rewrite $\sqrt{i}$ in Cartesian form.
Solution:
Since we know that $\sqrt{i}$ can be written in the form a + bi, we start by writing this fact as an equation.
$a + bi = \sqrt{i}$ | This is the assertion that the number $\sqrt{i}$ can somehow be written in a + bi form. Our challenge is to find the real numbers a and b.
$(a + bi)^2 = i$ | You can think of this as “squaring both sides” but we prefer to think of it as the definition of a square root; we are looking for the number (in a + bi format) that squares to give i.
$a^2 + 2abi - b^2 = i$ | Expand.
You know that $x^2 = 9$ has two answers (3 and −3), and $x^2 = -25$ does too (5i and −5i). So it isn’t terribly surprising that we just found two answers for $x^2 = i$. These are specific examples of a very general result: the function $\sqrt{z}$ has exactly two answers¹ except at z = 0. Furthermore, $\sqrt[3]{z}$ gives three answers for all z ≠ 0, and so on. You’ll explore this idea further in the problems.
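One way to check the two answers for $x^2 = i$ is to square them. In the sketch below we assume the values a = b = ±1/√2, which come from matching real parts ($a^2 - b^2 = 0$) and imaginary parts (2ab = 1) in the expanded equation; the variable names are ours.

```python
import cmath
from math import sqrt

# Candidate answers: a = b = 1/sqrt(2) and a = b = -1/sqrt(2).
a = b = 1 / sqrt(2)
roots = (complex(a, b), complex(-a, -b))

for r in roots:
    print(r, r**2)           # each square should be (very nearly) i

print(cmath.sqrt(1j))        # the library returns only one of the two
```

Both candidates square to i up to rounding error. The library call returns a single value, which echoes the point that choosing one answer to “count” is a convention.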
You might think that we will next tackle $e^i$ and sin i and every other function you can think of. These functions are all meaningful, and can all be converted to a + bi form. But before we can find answers, we have to define the questions! There was no struggle to define $\sqrt{i}$; we were looking for a complex number a + bi that multiplied by itself to get the answer i. But $e^i$ is less clear (it certainly isn’t “multiply e by itself i times”) and we cannot set out to find its real and imaginary parts until we define what it means.
One very general definition comes from Taylor series. From the Maclaurin series for exponentials we know that for any real number x,
\[ e^x = 1 + x + \frac{1}{2}x^2 + \frac{1}{6}x^3 + \dots = \sum_{n=0}^{\infty} \frac{1}{n!}x^n \]
If we replace x with a complex number z, every term in this series can be expressed in the form a + bi, so we define the complex function $e^z$ by that series. Similar definitions can be applied to the sine, cosine, and many other functions. You will experiment with such definitions in Problems 3.44 and 3.45.
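The series definition can be tried out directly. The following sketch sums a finite number of terms for a complex input and compares the result with the library exponential; `exp_series` and its `terms` parameter are our own illustrative choices, and truncating at a fixed number of terms is only adequate for modest values of |z|.

```python
import cmath
from math import factorial

def exp_series(z, terms=30):
    """Partial sum of the Maclaurin series sum(z**n / n!) for n < terms."""
    return sum(z**n / factorial(n) for n in range(terms))

for z in (1j, 2 + 3j, -1 + 0.5j):
    print(z, exp_series(z), cmath.exp(z))
```

Each partial sum is itself a number of the form a + bi, and it agrees with the library value to many digits.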
Using Taylor series to define complex functions is powerful, general, and consistent. It allows us to write virtually any expression involving i in the form a + bi. It preserves the algebraic properties of these functions such as trig identities, rules for multiplying exponentials, and even rules for differentiating and integrating these functions.
We will explore the behavior of these complex functions and the meaning of their derivatives in
more depth in Chapter 13 but here it is important to know that all the usual rules of algebra and
calculus apply.
¹ When we work with real numbers we make the function $\sqrt{x}$ single-valued by fiat: we simply declare that “only the positive answer counts” so $\sqrt{9}$ is 3, not −3. With complex numbers there is no easy way to pick out one answer to “count” so the function is inherently multivalued. We will discuss this issue further in Chapter 13.
3.44 Here are the Maclaurin series for the sine and cosine.
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots \]
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots \]
(a) Write the first four non-zero terms of the Maclaurin series for sin(ix). Assuming x is a real number, write your final answer in a + bi form.
(b) Write the first four non-zero terms of the Maclaurin series for cos(ix). Assuming x is a real number, write your final answer in a + bi form.
(c) Use your answers to Parts (a) and (b) to write $\sin^2(ix) + \cos^2(ix)$. Expand it out and simplify keeping all terms up to fourth order in x, and show that (up to fourth order at least) $\sin^2(ix) + \cos^2(ix) = 1$.
(d) Use your answers to Parts (a) and (b) to show that $\frac{d}{dx}\sin(ix)$ is i cos(ix).
3.45 The Maclaurin series for the exponential function is:
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \dots \]
(a) Write the first six non-zero terms of the Maclaurin series for $e^{ix}$. Assuming x is a real number, write your final answer in a + bi form.
(b) Use your answer to show that $\frac{d}{dx}e^{ix}$ is $ie^{ix}$.
(c) Plug x + iy into the Maclaurin series above to find the first four non-zero terms of the Maclaurin series for $e^{x+iy}$. Assuming x and y are real numbers, write your final answer in a + bi form.
(d) Multiply the Maclaurin series for $e^x$ by the Maclaurin series for $e^{iy}$, keeping all terms up to the combined third order. (For example, keep a term with $xy^2$ but not one with $x^2y^2$.) Write your answer in Cartesian form and show, using this and your previous answer, that $e^{x+iy} = e^x e^{iy}$.
Depending on your background this may be the first unfamiliar fact you’ve encountered in this chapter, so it’s worth slowing down a bit to take it in. Let’s multiply the numbers $z_1 = \sqrt{3} + i$ and $z_2 = 2 + 2i$ twice: once using a Cartesian representation, and once polar.
Cartesian
\[ z_1 z_2 = \left(\sqrt{3} + i\right)(2 + 2i) = 2\sqrt{3} + 2i\sqrt{3} + 2i + 2i^2 = 2\sqrt{3} + 2i\sqrt{3} + 2i - 2 = \left(2\sqrt{3} - 2\right) + \left(2\sqrt{3} + 2\right)i \]
[Figure: the points $z_1 = \sqrt{3} + i$, $z_2 = 2 + 2i$, and $z_3 = z_1 z_2$ plotted on the complex plane, with axes Re and Im.]
Polar
\[ |z_1| = \sqrt{3 + 1} = 2, \qquad \phi_1 = \tan^{-1}(1/\sqrt{3}) = \pi/6 \]
\[ |z_2| = \sqrt{4 + 4} = 2\sqrt{2}, \qquad \phi_2 = \tan^{-1}(1) = \pi/4 \]
To find the product, multiply the moduli and add the phases.
Modulus: $|z_1 z_2| = 2(2\sqrt{2}) = 4\sqrt{2}$
Phase: $\phi = \pi/6 + \pi/4 = 5\pi/12$
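You can confirm that the two routes agree with a few lines of code. This sketch uses `cmath.polar` from Python’s standard library to read off the modulus and phase; the variable names are ours. Note that phases add only up to a multiple of 2π in general, although no wrap-around occurs for these two numbers.

```python
import cmath
from math import sqrt, pi

z1 = complex(sqrt(3), 1)
z2 = complex(2, 2)

r1, phi1 = cmath.polar(z1)   # modulus and phase of z1
r2, phi2 = cmath.polar(z2)   # modulus and phase of z2

product = z1 * z2            # Cartesian multiplication
r3, phi3 = cmath.polar(product)

print(product)               # real part 2*sqrt(3) - 2, imaginary 2*sqrt(3) + 2
print(r3, r1 * r2)           # moduli multiply
print(phi3, phi1 + phi2)     # phases add
```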
Problem:
Simplify (1 + i)10 .
Solution:
If you plot the point 1 + i on the complex plane you can quickly see that its phase is π/4 and its magnitude is $\sqrt{2}$.
If we multiply this number by itself ten times, the modulus will multiply ten times, so the new modulus will be $\left(\sqrt{2}\right)^{10} = 32$. The phase will add to itself ten times, so the new phase will be 10 × π/4 = 5π/2, which places us on the positive imaginary axis. We conclude that $(1 + i)^{10}$ is the imaginary number 32i.
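The same reasoning works for any integer power, so it is natural to wrap it in a function. The sketch below is our own (the name `power_polar` is hypothetical); it raises the modulus to the power, multiplies the phase, and converts back with `cmath.rect`.

```python
import cmath

def power_polar(z, n):
    """Compute z**n by raising the modulus to the n-th power and
    multiplying the phase by n, then converting back to a + bi form."""
    r, phi = cmath.polar(z)
    return cmath.rect(r**n, n * phi)

print(power_polar(1 + 1j, 10))   # very close to 32i, up to rounding
print((1 + 1j)**10)              # direct computation, for comparison
```

Because of floating-point rounding, the polar route may leave a tiny non-zero real part where the exact answer has none.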
in the first quadrant (upper-right-hand corner) of the complex plane? In the second quadrant? In the third? In the fourth?
3.56 The product of any number and its own complex conjugate is a non-negative real number that represents the square of the modulus: $zz^* = |z|^2$.
(a) To prove this result algebraically, let z = a + bi. Algebraically write the expression for $z^*$ and then multiply it by z. When you simplify your answer, it should have no imaginary part. This proves that it is real.
(b) We defined |z| as the distance from the origin to the point z = a + ib on the complex plane. What is that distance in terms of a and b? How does this answer compare to what you found for $zz^*$ in Part (a)?
(c) To prove the same result graphically on the complex plane, begin by drawing an arbitrary point z and its complex conjugate $z^*$.
(d) If z has a phase of φ, what is the phase of $z^*$? Since phases add, what is the phase of $zz^*$? What does that tell us about the number?
(e) Note on your drawing that z and $z^*$ have the same modulus |z|. What does that tell us about the modulus of $zz^*$?
3.57 For each number z below, is there a number c that, when multiplied by its own complex conjugate, gives z? If so, find such a number c. If not, explain why not.
(a) z = 25
(b) z = −13
(c) z = 3 + 4i
3.58 Consider two complex numbers r = a + bi and s = c + di, where a, b, c, and d are all real numbers.
(a) Find the phase and modulus of both r and s.
(b) Multiply a + bi times c + di to get the product p = rs.
(c) Use the expression you found for p to find its phase and modulus.
(d) Show that the phase you found for p is the sum of the phases that you found for r and s and that the modulus you found for p is the product of the moduli you found for r and s. This will require the use of some trig identities.
\[ \frac{d^2y}{dx^2} = -y \qquad (3.4.1) \]
1. Show that y = A cos x + B sin x is a valid solution to Equation 3.4.1 for any values of the
constants A and B.
2. Explain how we know that y = A cos x + B sin x is the general solution to Equation 3.4.1.
3. Show that $y = e^{ix}$ is a valid solution to Equation 3.4.1.
Since y = A cos x + B sin x is the general solution to Equation 3.4.1, any other solution can be rewritten in that form. So we must be able to write:
\[ e^{ix} = A\cos x + B\sin x \qquad (3.4.2) \]
It is vital to note that Equation 3.4.2 is not true for all values of A and B; $e^{ix}$ represents a particular solution to Equation 3.4.1, which means that Equation 3.4.2 must be true for some particular values of A and B. Our job is now to find them.
4. Plug x = 0 into both sides of Equation 3.4.2 to find one of the two constants.
5. Take the derivative with respect to x of both sides of Equation 3.4.2 and then plug in x = 0 to find the other constant.
See Check Yourself #15 in Appendix L
6. The function $e^{kt}$ is used to describe a growing quantity when k is a positive real number, and a decaying quantity when k is a negative real number. Based on the results you’ve found in this exercise, what kind of quantity does $e^{kt}$ describe when k is imaginary?
Euler’s Formula
∙ Question: What is $e^{i\pi}$?
Answer: cos(π) + i sin(π) = −1.
∙ Question: Express $2^i$ in a + bi form.
Answer: $2^i = \left(e^{\ln 2}\right)^i = e^{i \ln 2} = \cos(\ln 2) + i \sin(\ln 2)$.
∙ Question: Express ln(i) in a + bi form.
Answer: We can see from Euler’s Formula that $e^{(\pi/2)i} = i$. Since ln(i) asks the question “e to what power gives i?” we now have an answer: ln(i) = (π/2)i.
∙ Question: Simplify $e^{-ix}$.
Answer: From Euler’s formula $e^{-ix} = \cos(-x) + i \sin(-x)$. Recalling that cos(−x) = cos x and sin(−x) = −sin x, we get $e^{-ix} = \cos x - i \sin x$. In other words $e^{-ix}$ is the complex conjugate of $e^{ix}$, as it should be since we just replaced i with −i.
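Each of these answers can be checked numerically with Python’s standard `cmath` module. The sketch below is ours, not part of the original examples. One caveat: `cmath.log` returns a single “principal” value of the logarithm, which for i is the (π/2)i found above.

```python
import cmath
from math import cos, sin, log, pi

# e^(i pi) should be -1 (up to rounding in the imaginary part).
print(cmath.exp(1j * pi))

# 2^i compared with cos(ln 2) + i sin(ln 2).
print(2**1j, complex(cos(log(2)), sin(log(2))))

# ln(i) compared with (pi/2) i.
print(cmath.log(1j), 1j * pi / 2)

# e^(-ix) is the complex conjugate of e^(ix); x = 0.7 is an arbitrary test value.
x = 0.7
print(cmath.exp(-1j * x), cmath.exp(1j * x).conjugate())
```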
Sines and cosines are related by dozens of trig identities. Exponential functions, with three
simple rules of exponents, are generally much easier to work with. So one of the great
benefits of Euler’s formula is allowing us to substitute exponential functions for sines and
cosines. As our first example, we show below how easy Euler’s formula makes deriving the
double-angle formulas.
Problem:
Express the number z = 3 + 4i in complex exponential form.
Solution:
The modulus is 5 and the phase is $\tan^{-1}(4/3) \approx 0.927$. We conclude that $3 + 4i = 5e^{0.927i}$.
Remember that these are not two different numbers, but two different ways of expressing the same number. The first representation makes it clear that Re(z) = 3 and Im(z) = 4, while the second shows us that |z| = 5 and $\phi_z \approx 0.927$.
Problem:
Express the number $10e^{(2\pi/3)i}$ in a + bi form.
Solution:
Euler’s formula does the conversion for us.
\[ 10e^{(2\pi/3)i} = 10\cos\frac{2\pi}{3} + 10i\sin\frac{2\pi}{3} = -5 + 5\sqrt{3}\,i \]
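Both conversions are available in Python’s standard library, which makes them easy to check. A minimal sketch, with variable names of our own choosing: `cmath.polar` goes from a + bi form to modulus and phase, and `cmath.rect` goes back.

```python
import cmath
from math import pi, sqrt

# a + bi form to complex exponential form: 3 + 4i.
modulus, phase = cmath.polar(3 + 4j)
print(modulus, phase)            # modulus 5, phase about 0.927

# Complex exponential form to a + bi form: 10 e^{(2 pi / 3) i}.
z = cmath.rect(10, 2 * pi / 3)
print(z, complex(-5, 5 * sqrt(3)))
```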
If you follow the number 3 + ti as t goes from 0 to 1 you trace out a line segment on the complex plane. What does the number $5e^{it}$ trace out? Consider that question for a moment before proceeding to the next section.
When we started writing this chapter, we asked each other, “If the students only remember
one thing from this chapter, what do we want it to be?” Our answer: “We want them to know
how complex exponential functions represent oscillatory behavior.” We can summarize this
behavior with three rules, using a and b to represent real numbers:
Problem:
Sketch the real and imaginary parts of the function f (t) = e (2+5i)t .
Solution:
We begin by using the rules of exponentials to rewrite this as f (t) = e 2t e 5it . Next we
can use Euler’s formula to break f (t) into real and imaginary parts: f (t) = e 2t cos(5t) +
ie 2t sin(5t).
t t
Stop and ask yourself what those graphs tell us, about this function and about
complex exponentials in general. We started with $f(t) = e^{(2+5i)t}$. The imaginary part of
the exponent (5) gave us the angular frequency of oscillation, or equivalently gave us a period
of $2\pi/5$. The real part of the exponent (2) determined the rate of exponential
increase of the amplitude.
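The two roles of the exponent can be confirmed numerically. This is a small sketch using only the standard library; the helper name `f` mirrors the text, while the sample times are arbitrary choices.

```python
import cmath
import math

def f(t):
    """f(t) = e^{(2+5i)t}."""
    return cmath.exp((2 + 5j) * t)

period = 2 * math.pi / 5   # set by the imaginary part of the exponent

for t in (0.0, 0.3, 1.0):
    value = f(t)
    # Real and imaginary parts match Euler's formula
    assert math.isclose(value.real, math.exp(2 * t) * math.cos(5 * t), abs_tol=1e-9)
    assert math.isclose(value.imag, math.exp(2 * t) * math.sin(5 * t), abs_tol=1e-9)
    # The modulus grows as e^{2t}, set by the real part of the exponent
    assert math.isclose(abs(value), math.exp(2 * t), rel_tol=1e-12)
    # One period later the phase repeats and the amplitude is scaled by e^{2*period}
    ratio = f(t + period) / value
    assert cmath.isclose(ratio, math.exp(2 * period), rel_tol=1e-9)
```

If every assertion passes silently, the function oscillates with period $2\pi/5$ while its amplitude grows like $e^{2t}$.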
A Damped Oscillator
In the Motivating Exercise (Section 3.1) you wrote the equation for a damped harmonic
oscillator such as an underwater mass on a spring. Newton’s Second Law for such a system
becomes ma = −cv − kx, and with the constants m = 1/2 kg, c = 2 kg/s and k = 20 N/m the
differential equation becomes:
$\frac{1}{2}\frac{d^2x}{dt^2} + 2\frac{dx}{dt} + 20x = 0$ (3.4.4)
Guessing $x = e^{pt}$ leads to $\left(\frac{1}{2}p^2 + 2p + 20\right)e^{pt} = 0$, and the quadratic formula yields two
solutions. Because this is a linear homogeneous differential equation, we can combine them
to find the general solution:
$x = Ae^{(-2+6i)t} + Be^{(-2-6i)t}$ (3.4.5)
This solution satisfies Equation 3.4.4 (try it!), but where’s that soggy mass (at any given
time t)? We can approach that question two ways.
We have stressed that complex numbers act as a bridge from real-valued questions to real-
valued answers, so one approach is to turn this complex answer into a real one. We begin
with the laws of exponents and, of course, Euler’s Formula.
$x = Ae^{-2t}e^{6it} + Be^{-2t}e^{-6it} = e^{-2t}\left[Ae^{6it} + Be^{-6it}\right]$
$= e^{-2t}\left[A\cos(6t) + Ai\sin(6t) + B\cos(-6t) + Bi\sin(-6t)\right]$
Cosine is an “even function” ($\cos(-x) = \cos x$) and sine is an “odd function” ($\sin(-x) = -\sin x$), so:
$x = e^{-2t}\left[A\cos(6t) + Ai\sin(6t) + B\cos(6t) - Bi\sin(6t)\right]$
(We could have gotten there a bit more quickly by using $e^{-ix} = \cos x - i\sin x$ in the previous
step.) We can now group sine and cosine terms:
$x = e^{-2t}\left[(A + B)\cos(6t) + (Ai - Bi)\sin(6t)\right]$
Finally, recalling that A + B just means “any constant” and Ai − Bi also means “any constant”
we can call them a new A and B, and write a general (and real) answer:
$x = e^{-2t}\left[A\cos(6t) + B\sin(6t)\right]$
Are we really allowed to replace Ai − Bi with a new arbitrary constant with no i in it? As always,
the final test of any approach to differential equations is to check the answer. Take a few
moments to confirm that this solution satisfies Equation 3.4.4. No imaginary numbers will
appear in your calculation; they helped us find the answer, but the answer stands on its own.
Done checking? Then notice that this solution, with all real numbers, does tell us clearly
how the system will act. The spring oscillates, as it would with no damping force (although
with a slightly different frequency), but the amplitude decays due to the damping force.
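The check suggested above can also be sketched numerically. The code below assumes the real-valued solution has the form $x = e^{-2t}[A\cos(6t) + B\sin(6t)]$, which is what grouping the sine and cosine terms gives. The sample constants `A = 1.0` and `B = 0.5` and the finite-difference step `h` are arbitrary illustrative choices, not values from the text.

```python
import cmath
import math

m, c, k = 0.5, 2.0, 20.0   # kg, kg/s, N/m (the constants used in the text)

# Roots of the characteristic equation m p^2 + c p + k = 0
disc = cmath.sqrt(c * c - 4 * m * k)
p_plus = (-c + disc) / (2 * m)
p_minus = (-c - disc) / (2 * m)
print(p_plus, p_minus)      # -2 + 6i and -2 - 6i

def x(t, A=1.0, B=0.5):
    """Assumed real-valued solution x = e^{-2t}[A cos(6t) + B sin(6t)]."""
    return math.exp(-2 * t) * (A * math.cos(6 * t) + B * math.sin(6 * t))

def residual(t, h=1e-4):
    """m x'' + c x' + k x, with derivatives estimated by central differences."""
    x1 = (x(t + h) - x(t - h)) / (2 * h)
    x2 = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    return m * x2 + c * x1 + k * x(t)

# The residual should be zero up to finite-difference error
print(max(abs(residual(t)) for t in (0.0, 0.5, 1.0, 2.0)))
```

No imaginary numbers appear in `x` or `residual`; they are used only to find the roots.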
So we have a real and meaningful solution to our real problem. That’s wonderful, but it
took a fair bit of algebra and it isn’t always necessary: you can directly interpret the complex
exponential solution, Equation 3.4.5, based on the rules we developed above. $Ae^{(-2+6i)t}$ represents
an oscillatory function. The imaginary part of the exponent tells us that the angular
frequency of that oscillation is 6 (which is equivalent to saying the period is $2\pi/6$). The real
part of the exponent represents a decay in the amplitude of that oscillation: $e^{-2t}$ of course
decays faster than $e^{-t}$ would, but not as fast as $e^{-3t}$. The arbitrary A tells us that the initial
amplitude can be anything at all, depending on how far back we pulled the spring.
And what about the other solution, $Be^{(-2-6i)t}$? The frequency and decay rate are the same; only the phase, and
possibly the initial amplitude, are different. (You will see this in Problem 3.68, but you may want to also think about
a more familiar ODE solution, x = A cos t + B sin t. How do the two parts of that solution compare to each other?)
[Figure: The function $Ae^{(-2+6i)t}$ on the complex plane (axes Re and Im, starting point A, ω = 6).]
Our point (and it’s a big one) is that once you are comfortable with complex exponential functions, you can interpret
them without needing to isolate their real and imaginary parts. We will return to this point in the problems, and
again in the next section.
(a) Use Euler’s formula to rewrite this function in the form x(t) = a(t) + b(t)i, where a(t) and b(t) are real-valued functions.
(b) Plot the real and imaginary parts of this function on two separate graphs. You will need to consider how to choose your limited domain to convey the behavior of these functions.
(c) Plot the real and imaginary parts of the function $x(t) = e^{(-2+4i)t}$. How do they differ from the original function?
(d) Plot the real and imaginary parts of the function $x(t) = 3e^{(-1+4i)t}$. How do they differ from the original function?
(e) Plot the real and imaginary parts of the function $x(t) = 3e^{(-2+i)t}$. How do they differ from the original function?
(f) Plot the real and imaginary parts of the function $x(t) = 3e^{(-2-4i)t}$. How do they differ from the original function?
3.67 In the Explanation (Section 3.4.2) we visualize a complex function by plotting its real and imaginary parts separately. Another approach we demonstrated is to plot the trajectory the function follows in the complex plane. For example, you can plot the function $e^{it}$ by drawing a point for $e^{0}$, then a point for $e^{i}$, and so on, connecting the points with a smooth curve, and then drawing arrows on that curve to show the direction the trajectory is moving in. The result is shown below.
[Figure: the trajectory of $e^{it}$ in the complex plane (axes Re and Im).]
Draw trajectories in the complex plane for the following functions:
(a) $e^{-it}$
(b) $e^{(-1+i)t}$
(c) $e^{(-2+i)t}$
(d) $e^{(-2+4i)t}$
(e) $e^{(-2-4i)t}$
(f) $3e^{(-2-4i)t}$
(g) $e^{t}$
3.68 The solution to the simple harmonic equation $d^2x/dt^2 = -4x$ can be written $x(t) = Ae^{2it} + Be^{-2it}$. For this problem assume A and B are real.
(a) Plug this solution into the ODE and show that it works.
(b) Find the amplitude, period, and phase of Re(x(t)). Note that some of them will be functions of A and B and some will not. The phase of $\cos(\omega t + \phi)$ is defined to be $\phi$.
(c) Find the amplitude, period, and phase of Im(x(t)). (To find the phase of a sine function you can rewrite it as a cosine.)
(d) Choose constants A and B such that the real part oscillates with an amplitude of 22 and the imaginary part with an amplitude of 8.
(e) The solution to the damped oscillator equation $x''(t) + 4x'(t) + 8x(t) = 0$ is $x(t) = Ae^{(-2+2i)t} + Be^{(-2-2i)t}$. Describe qualitatively how the behavior of this system is like, and unlike, the behavior described by the previous function.
3.69 The picture on Page 115 shows what happens when you repeatedly multiply a complex number by i; it rotates 90° counterclockwise each time. The picture on Page 120 shows what happens when you repeatedly add $\pi/2$ to the argument of the function $e^{ix}$. Explain, using Euler’s formula, why these two pictures look the same.
3.70 Prove that the point $e^{i\theta}$ on the complex plane is exactly 1 unit away from the origin for any real $\theta$.
3.71 Find the distance from the point $Ae^{i\theta}$ to the origin (A and $\theta$ real and A > 0).
3.72 Complex numbers are often written in the form $z = \rho e^{i\phi}$ where $\rho$ and $\phi$ are real numbers and $\rho \geq 0$. In this problem you will prove that, for a number written in this way, $\rho$ represents the modulus and $\phi$ the phase.
(a) Use Euler’s formula to rewrite this number in the form z = a + bi. (Of course, a and b will be functions of $\rho$ and $\phi$.)
(b) Using the form you found in Part (a), find the modulus and phase of this number.
3.73 In this problem you will prove that any complex number can be written as $\rho e^{i\phi}$ where $\rho$ and $\phi$ are the modulus and phase of z, respectively.
(a) We begin with z = a + bi which represents all complex numbers by definition. Find the modulus $\rho$ and phase $\phi$ of this number in terms of a and b.
(b) Of course you can write a number $\rho e^{i\phi}$ with the two values you found in Part (a). But does it really represent the number we started with? To find out, use Euler’s formula to break this $\rho e^{i\phi}$ into its real and complex parts, and simplify as much as possible. You may need to look up some trig identities to do this.
3.74 In this problem you will prove Euler’s formula using Maclaurin series.
(a) Write the first three non-zero terms of the Maclaurin series for sin x and for cos x.
(b) Write the first six non-zero terms of the Maclaurin series for $e^{z}$.
(c) Plug z = ix into the Maclaurin series you just wrote for $e^{z}$. Show that the first six terms of the Maclaurin series for $e^{ix}$ equals the first three terms of the series for cos x plus i times the first three terms of the series for sin x.
(d) Now write the entire infinite Maclaurin series for sin x, cos x, and $e^{z}$ in summation notation. Plug z = ix into the infinite series for $e^{z}$ and show that the result is the series for cos x plus i times the series for sin x.
3.75 Triple-Angle Formulas In the Explanation (Section 3.4.2) we used Euler’s formula to derive the “double-angle formulas” for sin(2x) and cos(2x). Using a similar process, derive expressions for the sine and cosine of 3x.
3.76 Trig Identities for Sums In the Explanation (Section 3.4.2) we used Euler’s formula to derive the “double-angle formulas” for sin(2x) and cos(2x). Using a similar process, derive expressions for sin(a + b) and cos(a + b).
3.77 Walk-Through: Using Complex Exponentials to Solve a Homogeneous ODE. In this problem, you will solve the differential equation $5(d^2y/dx^2) - 6(dy/dx) + 9y = 0$. You will first find a complex solution, and then a real-valued solution.
(a) Substitute a guess of the form $y = e^{px}$ into the differential equation.
(b) Solve the resulting algebra equation for p. You should find two solutions.
(c) Write down two functions y(x) that solve the differential equation.
(d) Combine your solutions to write the general solution to this differential equation.
(e) Rewrite your solutions using the rule $e^{a+b} = e^{a}e^{b}$, and then factor out the common factor from both terms.
(f) Rewrite the complex exponential terms using Euler’s Formula.
(g) Using the fact that the cosine is an even function and the sine is an odd function, combine terms so you have one cosine and one sine.
(h) Finally, combine arbitrary constants. (Remember that i is a constant, not a variable!) You should now have the general solution to the differential equation in a form that has no imaginary numbers.
(i) Verify that your solution works.
(j) Describe in words the behavior of this function as x grows larger.
In Problems 3.78–3.82 find the general, real-valued solution to the given differential equation. (It may help to first work through Problem 3.77 as a model.)
3.78 $d^2x/dt^2 - 2(dx/dt) + 2x = 0$
3.79 $d^2x/dt^2 + 4(dx/dt) + 8x = 0$
3.80 $4(d^2x/dt^2) + 4(dx/dt) + 5x = 0$
3.81 $9(d^2x/dt^2) - 6(dx/dt) + 2x = 0$
3.82 $3(d^2x/dt^2) + 6(dx/dt) + 7x = 0$
3.83 For a system described by the given complex function, answer the following two questions.
∙ Does the system oscillate? If so, with what period?
∙ Does the system gradually increase (toward ±∞), decay (toward zero), or neither?
(a) $e^{7t}$
(b) $e^{-7t}$
(c) $e^{5it}$
(d) $e^{(7+5i)t}$
(e) $e^{(7-5i)t}$
(f) $e^{(-7+5i)t}$
3.84 For a real number x, what is $\lim_{x\to\infty} e^{kx}$ if
(a) k is a positive real number?
(b) k is a negative real number?
(c) k = 0?
(d) k = i?
(e) k = −1 + i?
3.85 An underwater mass on a spring experiences two forces: $F_{\text{spring}} = -kx$ and $F_{\text{water}} = -cv$. Newton’s Second Law for this system is ma = −cv − kx. For this problem, assume the mass m = 2 kg and the spring constant k = 5 N/m. Higher values of the constant c indicate a stronger damping force.
(a) If c = 6 kg/s the system is “underdamped.” Solve the differential equation in this case and show that the resulting motion is oscillatory.
(b) If c = 7 kg/s the system is “overdamped.” Solve the differential equation in this case and show that the resulting motion is not oscillatory.
(c) Somewhere between c = 6 kg/s and c = 7 kg/s the system is “critically damped.” For any c-value below this critical
2 Electrical engineers tend to write their equations in terms of the current I, which is easier to directly measure than the charge
3.91 [This problem depends on Problem 3.90.] A system is modeled by the differential equation $d^2x/dt^2 = 9x$.
(a) Show that x = A sin(3it) + B cos(3it) is a valid solution of this differential equation for any constants A and B.
(b) Break x(t) from Part (a) into its real and imaginary parts, using the formulas for sin(ix) and cos(ix) that you found in Problem 3.90.
(c) Rearrange your answer to collect the $e^{3t}$ terms and $e^{-3t}$ terms together.
(d) By creating two new arbitrary constants $A_2$ and $B_2$ that are functions of the old A and B, write a completely real solution to this differential equation.
(e) Explain why this is not the best way to find the solution to this equation.
conventionally requires integration by parts (twice) and an extra trick. In this exercise you
will solve it in an easier way. Begin by considering a different integral.
$\int e^{2ix}e^{x}\,dx$ (3.5.2)
1. Use the properties of exponentials to combine the two exponentials in Equation 3.5.2
into one exponential. Do not expand the complex exponential with Euler’s formula.
2. Evaluate the integral to find $\int e^{2ix}e^{x}\,dx$. Don’t forget to include an arbitrary
constant C.
See Check Yourself #16 in Appendix L
3. Rewrite your solution in the standard form a + bi with real a and b. This will require
using Euler’s formula as well as the trick from Section 3.2 for getting i out of the
denominator of a fraction. To split your arbitrary constant you can just write it as
C = A + iB. Your answer should be in the form
4. Use Euler’s formula to rewrite $\int e^{2ix}e^{x}\,dx$ as the integral of a real function plus i times
the integral of another real function $\left(\int a(x)\,dx + i\int b(x)\,dx\right)$. For this part you are not
using the results you’ve derived so far; just split the integrand into real and imaginary
parts.
5. Recall that every complex equation is two real equations in disguise: one that says the
real part of the left-hand side equals the real part of the right-hand side and one that
says the same for the imaginary parts. Using your results from Parts 3 and 4 rewrite
Equation 3.5.3 as two real equations.
It should now be clear why we set out to solve a complex integral when what we really
wanted was the solution to a real integral. One of the two real equations you just wrote down
should be in the form $\int \sin(2x)e^{x}\,dx = (\text{some real function of } x)$. In other words, by solving the
relatively simple integral $\int e^{2ix}e^{x}\,dx$ you were able to find the solution to the more difficult
integral $\int \sin(2x)e^{x}\,dx$.
6. Take the derivative of the solution you found for $\int \sin(2x)e^{x}\,dx$ and verify that it does
give you $\sin(2x)e^{x}$.
In this exercise you solved a purely mathematical problem with no physical context, but
in doing so you’ve learned a technique for solving many physical problems: define a sinu-
soidal quantity as the real or imaginary part of a complex quantity, and you replace a math
problem involving trigonometry with an easier math problem involving exponentials.
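The technique can be sanity-checked numerically without writing out the real antiderivative. The sketch below takes the imaginary part of a complex antiderivative of $e^{(1+2i)x}$ and compares it with a direct numerical integration of $\sin(2x)e^{x}$. The helper names `F`, `integrand`, and `simpson`, the interval [0, 1.5], and the use of Simpson's rule are illustrative assumptions, not part of the exercise.

```python
import cmath
import math

def F(x):
    """A complex antiderivative of e^{2ix} e^x = e^{(1+2i)x} (constant omitted)."""
    return cmath.exp((1 + 2j) * x) / (1 + 2j)

def integrand(x):
    """The real integrand sin(2x) e^x."""
    return math.sin(2 * x) * math.exp(x)

def simpson(g, a, b, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

a, b = 0.0, 1.5
from_complex = (F(b) - F(a)).imag          # imaginary part gives the sine integral
from_quadrature = simpson(integrand, a, b)
print(from_complex, from_quadrature)       # the two values should agree closely
```

Replacing `.imag` with `.real` and `math.sin` with `math.cos` checks the companion integral of $\cos(2x)e^{x}$ in the same way.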
Sinusoidal oscillations arise in applications ranging from circuit theory to string theory. In
all but the simplest cases working with sine and cosine functions leads to a confusing mass
of algebra with trig identities. The great power of Euler’s formula is that it allows us to do
that algebra using the much simpler rules for exponential functions. We illustrate this point
below with several physical examples.
A Driven Oscillator
Consider a damped oscillator that is being pushed (or “driven” in the lingo) by a sinusoidally
oscillating force:
$x''(t) + x'(t) + 5x = 10\cos(2t)$ (3.5.4)
This is an inhomogeneous differential equation so the general solution will be the sum of
a particular and a complementary solution. In the previous section we solved the comple-
mentary equation (an undriven damped oscillator) and found solutions that decay exponen-
tially. So if we are interested in late-time behavior, the complementary solution will become
negligible and we can focus on the particular solution. An oscillating function seems likely
on both mathematical and physical grounds, so we could try guessing sine and/or cosine
solutions. You’ll solve the problem this way in Problem 3.102, but it’s a pain in the neck.
To make things easier, consider a different equation:
$\mathbf{X}''(t) + \mathbf{X}'(t) + 5\mathbf{X} = 10e^{2it}$ (3.5.5)
Here 𝐗(t) is a complex function. (We will generally use boldface letters for complex-valued
functions and non-boldface letters for real-valued functions.) Convince yourself that
Equation 3.5.4 is the real part of Equation 3.5.5, provided we identify x(t) as the real part
of 𝐗(t).
Plugging in the guess $\mathbf{X}(t) = X_0 e^{2it}$ leads to $X_0 = 10/(-4 + 2i + 5)$, which simplifies to:
$X_0 = \frac{10}{1 + 2i} = 2 - 4i$, so $\mathbf{X}(t) = (2 - 4i)e^{2it}$
This looks a lot like the work we did in Section 3.4, but with some important differences.
There we solved a real-valued differential equation by guessing an exponential solution
x = e pt , and when we did the math, p wound up complex. We interpreted the answer by
converting it to a real solution with Euler’s formula, or by thinking about the properties of
complex exponentials.
By contrast, a guess-and-check approach to Equation 3.5.4 requires a messy combination
of sines and cosines. So we took a less direct approach, creating a complex-valued differ-
ential equation whose real part was Equation 3.5.4. That changes the last step: the function
we’re looking for is the real part of the solution. Expanding the complex exponential and
multiplying everything out, we find that Re(𝐗(t)) is:
$x(t) = 2\cos(2t) + 4\sin(2t)$ (3.5.7)
You can verify that Equation 3.5.7 is a valid solution to Equation 3.5.4, and imaginary num-
bers will never come up in the process. As always, we treat i as an indirect path from a
real-valued question to a real-valued answer.
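The whole calculation fits in a few lines of code. This sketch assumes the complex equation is $\mathbf{X}'' + \mathbf{X}' + 5\mathbf{X} = 10e^{2it}$, which is consistent with the expression $X_0 = 10/(-4 + 2i + 5)$ quoted above. The helper names `x` and `lhs`, the sample times, and the finite-difference step are illustrative choices.

```python
import cmath
import math

# Substituting X = X0 e^{2it} into the assumed complex equation
# gives (-4 + 2i + 5) X0 = 10.
X0 = 10 / (-4 + 2j + 5)
print(X0)                                  # approximately 2 - 4i

def x(t):
    """The real part of X0 e^{2it}."""
    return (X0 * cmath.exp(2j * t)).real

def lhs(t, h=1e-4):
    """x'' + x' + 5x, with derivatives estimated by central differences."""
    x1 = (x(t + h) - x(t - h)) / (2 * h)
    x2 = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    return x2 + x1 + 5 * x(t)

for t in (0.0, 0.7, 2.0):
    print(lhs(t), 10 * math.cos(2 * t))    # the two columns should agree closely
```

Only the single division that defines `X0` involves complex arithmetic; checking the real solution against Equation 3.5.4 does not.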
But just as in Section 3.4, we need to discuss an alternative ending to this story: one in
which we skip the process of isolating the real part, and directly interpret the complex func-
tion.
3 When we write x = A cos(𝜔t) we use 𝜔 for the “angular frequency” of the motion. When an object is moving around
a circle, we use 𝜔 for the “rotational velocity” d𝜙∕dt. This may seem like two different uses for the same letter—
which does happen sometimes of course—but if you look at the sinusoidal motion as the real part of a complex
oscillation, the two uses are the same.
Of course, we don’t want to lose sight of our original objective. The real part of this func-
tion is the x-value on that circle; it’s a sinusoidal wave whose amplitude, frequency, and phase
are determined by X0 and 𝜔. But you will save yourself a lot of algebra, and build your intu-
ition, if you work directly with the complex form as long as possible.
Question: Find a particular solution to the following equation, where y and t both
represent real numbers.
Solution:
The problem asks only for a particular solution, so the answer will have no arbitrary
constants. To find the general solution, we would begin by solving the complementary
homogeneous equation and add the two solutions.
To find the particular solution we begin instead by writing a new differential
equation involving complex numbers.
Equation 3.5.8 is the real part of Equation 3.5.9 (confirm this for yourself), so the y(t)
we want will be the real part of the 𝐘(t) we find.
We now guess a solution of the form $\mathbf{Y} = Y_0 e^{2it}$. Plugging this into Equation 3.5.9
gives
$-8iY_0 - 12Y_0 + 4Y_0 = 5$
$Y_0 = \frac{5}{-8 - 8i} = -\frac{5}{8}\,\frac{1}{1+i} = -\frac{5}{8}\,\frac{1-i}{1+1} = -\frac{5}{16}(1 - i)$
$\mathbf{Y} = -\frac{5}{16}(1 - i)e^{2it}$
You can check that this is a solution to Equation 3.5.9. The figure below shows the
initial value of 𝐘 on the complex plane. From there it will rotate with angular
frequency ω = 2, which means its real part y(t) will oscillate on the real
axis with that same angular frequency.
[Figure: 𝐘(0), 𝐘(1), and 𝐘(1.5) on a circle of radius $|\mathbf{Y}| = 5\sqrt{2}/16$ in the complex plane (axes Re and Im), rotating with ω = 2 from initial phase $\phi(0) = 3\pi/4$, with the real parts y(0), y(1), and y(1.5) marked on the real axis.]
The function $Y_0 e^{2it}$ describes a sinusoidal oscillation with amplitude $|Y_0| = 5\sqrt{2}/16$,
angular frequency 2, and initial phase $\phi(0) = 3\pi/4$. For most purposes we would
stop there.
If necessary you could take the real part of this function to get the real
solution $y(t) = -(5/16)\left(\cos(2t) + \sin(2t)\right) = \left(5\sqrt{2}/16\right)\cos(2t + 3\pi/4)$, which matches
what we learned from the complex solution. You can confirm that this solution
satisfies Equation 3.5.8.
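The amplitude and phase quoted in this example can be read straight off the complex coefficient. The sketch below starts from the algebraic equation $-8iY_0 - 12Y_0 + 4Y_0 = 5$ given in the solution; the names `amplitude`, `phase`, `y`, `direct`, and `shifted` are illustrative.

```python
import cmath
import math

# From the solution: -8i Y0 - 12 Y0 + 4 Y0 = 5
Y0 = 5 / (-8j - 12 + 4)
amplitude, phase = cmath.polar(Y0)
print(Y0)                    # approximately -0.3125 + 0.3125i, i.e. -(5/16)(1 - i)
print(amplitude, phase)      # compare with 5*sqrt(2)/16 and 3*pi/4

def y(t):
    """The real part of Y0 e^{2it}."""
    return (Y0 * cmath.exp(2j * t)).real

for t in (0.0, 1.0, 1.5):
    direct = -(5 / 16) * (math.cos(2 * t) + math.sin(2 * t))
    shifted = amplitude * math.cos(2 * t + phase)
    print(y(t), direct, shifted)   # three expressions for the same y(t)
```

The times 0, 1, and 1.5 are the ones marked in the figure for this example.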
Adding Waves
As a final example for this section, consider different waves with equal frequencies and
amplitudes but different phases. This could describe a diffraction grating, in which a wave
passes through many small slits and then travels to a receiver. The phase of the wave
from each slit is proportional to the distance from the slit to the receiver.
[Figure: diffraction grating and receiver, with labels x and y.]
In Problem 3.118 you’ll show that the signal at the
receiver is a sum of sinusoidal waves with equal fre-
quency and evenly spaced phases. In that problem you’ll use cosines, so just for variety we’ll
do it here with sines.
$S = \sum_{n=1}^{N} \sin(\omega t + n\delta)$
This sum can be calculated using trig identities, but the math is tedious. Instead we define the
following complex quantity. (Because we used a sine instead of a cosine, S is the imaginary
part of 𝐒.)
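$\mathbf{S} = \sum_{n=1}^{N} e^{i(\omega t + n\delta)} = e^{i\omega t}\sum_{n=1}^{N}\left(e^{i\delta}\right)^{n}$
The sum of a geometric series is $\sum_{n=1}^{N} r^{n} = r\left(1 - r^{N}\right)/(1 - r)$, so this sum becomes
$\mathbf{S} = e^{i(\omega t + \delta)}\,\frac{1 - e^{iN\delta}}{1 - e^{i\delta}}$ (3.5.10)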
We conclude that the signal S is an oscillation with angular frequency 𝜔, as we would expect,
and the amplitude of that oscillation is given by
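$|\mathbf{S}| = \sqrt{\mathbf{S}\mathbf{S}^{*}} = \sqrt{e^{i(\omega t + \delta)}\,\frac{1 - e^{iN\delta}}{1 - e^{i\delta}}\;e^{-i(\omega t + \delta)}\,\frac{1 - e^{-iN\delta}}{1 - e^{-i\delta}}} = \sqrt{\frac{1 - \cos(N\delta)}{1 - \cos\delta}}$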
That information is sufficient for most purposes. If we need to know the exact form of the
signal S and not just its magnitude, we can rewrite 𝐒 in the form a + bi and take its imaginary
part. You’ll do that in Problem 3.114.
You can also use the complex form of series like this to add up sine waves graphically in
the complex plane like vectors. See Problem 3.117.
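The closed form and the amplitude formula can both be checked against a brute-force sum. In this sketch the function names and the sample values of ω, t, δ, and N are arbitrary illustrative choices. The closed form requires $e^{i\delta} \neq 1$.

```python
import cmath
import math

def direct_sum(omega, t, delta, N):
    """S = sum of sin(omega t + n delta) for n = 1..N, added term by term."""
    return sum(math.sin(omega * t + n * delta) for n in range(1, N + 1))

def closed_form(omega, t, delta, N):
    """The complex sum from Equation 3.5.10 (assumes e^{i delta} != 1)."""
    r = cmath.exp(1j * delta)
    return cmath.exp(1j * (omega * t + delta)) * (1 - r**N) / (1 - r)

def amplitude(delta, N):
    """|S| = sqrt((1 - cos(N delta)) / (1 - cos(delta)))."""
    return math.sqrt((1 - math.cos(N * delta)) / (1 - math.cos(delta)))

omega, t, delta, N = 3.0, 0.4, 0.25, 12    # arbitrary sample values
S = closed_form(omega, t, delta, N)
print(direct_sum(omega, t, delta, N), S.imag)   # should agree
print(abs(S), amplitude(delta, N))              # should agree
```

Because sines were used, the real signal is the imaginary part of the complex sum; with cosines it would be the real part.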
$\mathbf{X}(t) = pe^{2it}$ and find the value of p. Express your answer in the form p = a + bi.
(c) What is the modulus of p? What does that tell you about the behavior of x(t)?
(d) What is the period of oscillation of x(t)?
(e) Plot p on the complex plane. Using that plot, and recalling that x = Im(𝐗), what is the initial value of the particular solution x(t) that you are finding?
(f) You have a particular solution to Equation 3.5.12 in the form $pe^{2it}$. Use Euler’s formula and a little bit of algebra to rewrite this solution in the form 𝐗(t) = y(t) + ix(t).
(g) Based on your answers to Parts (a) and (f), write the real-valued answer to Equation 3.5.11.
(h) Plug your answer for x(t) into Equation 3.5.11 and check that it works.
In Problems 3.93–3.96 find a particular solution to the given differential equation. You will use complex exponentials to find your solutions, but your final answer should have no complex numbers in it. It may help to first work through Problem 3.92 as a model.
3.93 $y''(x) + 2y'(x) + y(x) = \sin x$
3.94 $y''(t) + 3y'(t) + 2y(t) = 3\cos(2t)$
3.95 $f''''(x) + 6f''(x) + f(x) = 2\cos x$
3.96 $f''(x) + 4f'(x) + 2f(x) = \cos(x) + \sin(2x)$ Hint: One way to approach this is to write two separate functions whose sum solves this differential equation.
3.97 Figure 3.5 shows a complex oscillation 𝐗(t) and its real part x(t) at a few different times. The radius of the circle is 3.
[Figure 3.5: 𝐗(0), 𝐗(2), and 𝐗(4) on a circle in the complex plane (axes Re and Im), with their real parts x(0), x(2), and x(4) marked on the real axis.]
(c) Write a possible function 𝐗(t) that matches Figure 3.5.
3.98 Figure 3.5 shows a complex oscillation 𝐗(t) and its real part x(t) at a few different times.
(a) Copy the picture and on the same plot draw a function 𝐘(t) with the same angular frequency and initial phase as 𝐗 but twice the modulus. Include 𝐘(0) and 𝐘(4) and their real parts y(0) and y(4).
(b) Copy Figure 3.5 again and on the same plot draw a function 𝐙(t) with the same modulus and initial phase as 𝐗 but twice the angular frequency. Include 𝐙(0) and 𝐙(4) and their real parts z(0) and z(4).
3.99 For each scenario draw a picture like Figure 3.5 where the real function x(t) represents the position of the given oscillator. Your picture should show 𝐗(t) and x(t) at time t = 0 and at least one other time. Unlike Figure 3.5 your pictures should indicate the radius of the circle traced out by 𝐗.
(a) A mass on a spring is pulled out to a distance 0.2 m to the right and released from rest, after which it oscillates with period 3 s.
(b) A mass on a spring starts at the equilibrium position with an initial velocity to the right and then oscillates with amplitude 0.3 m and period 2 s.
(c) A 2 kg mass on a spring with spring constant 3 N/m is pulled out to a distance of 0.3 m to the left and released from rest.
3.100 Each plot below shows a pair of complex oscillations: $\mathbf{X}(t) = |X|e^{i\phi_{0x}}e^{i\omega_x t}$ and $\mathbf{Y}(t) = |Y|e^{i\phi_{0y}}e^{i\omega_y t}$. For each pair, answer the following questions. Is |X| greater than, less than, or equal to |Y|? Is $\phi_{0x}$ greater than, less than, or equal to $\phi_{0y}$? Is $\omega_x$ greater than, less than, or equal to $\omega_y$? Assume in all cases that $\phi$ and $\omega$ are between 0 and $2\pi$.
[Figure: three plots in the complex plane (axes Re and Im), each showing a pair of complex oscillations; recoverable labels include X(2) and Y(2).]
(c) Suppose the driving force had been F = 2 cos(3t). In the complex equation for 𝐗 you would have replaced this with $\mathbf{F} = 2e^{3it}$. Plot the initial values of the complex driving force 𝐅(t) and the complex position 𝐗(t) on the complex plane.
(d) The system winds up oscillating at the same frequency as the driving force, but out of phase with it. Is the position oscillating behind the driving force, or ahead of it? By how much?
3.102 In the Explanation (Section 3.5.2) we solved Equation 3.5.4 by using complex exponentials. You can also solve it directly with trig functions. Plug in a guess of the form x(t) = p cos(2t) + q sin(2t) and find the values of p and q. Write your answer in the form of a particular solution to Equation 3.5.4.
3.103 An oscillator obeys the equation $x''(t) + 2x'(t) + 2x(t) = 10\cos t$.
(a) Write a differential equation for a complex function 𝐗(t) whose real part is x(t).
(b) Guess a solution of the form $\mathbf{X}(t) = X_0 e^{it}$ and plug it in to find $X_0$. Write $X_0$ in the form a + bi where a and b are real.
(c) What is the amplitude of oscillations of x(t)?
(d) What is the period of oscillations of x(t)?
(e) Plot $X_0$ on the complex plane. Recalling that x = Re(𝐗), use your plot to find the initial value x(0).
3.104 Consider a more general damped, driven oscillator than we considered in the Explanation (Section 3.5.2): $Ax''(t) + Bx'(t) + Cx(t) = D\cos(\omega t)$. All constants and variables in this equation represent real quantities.
(a) Write a differential equation for a complex function 𝐗(t) whose real part is x(t). The right-hand side of the complex differential equation must be $De^{i\omega t}$, as the real part matches the right-hand side of the given differential equation. The left-hand side will be of the same form as the given equation, with x(t) replaced by 𝐗(t).
(b) Guess a solution of the form $\mathbf{X}(t) = X_0 e^{i\omega t}$ and plug it in to find $X_0$ in terms of A, B, C, and D.
(c) What is the amplitude of oscillations of x(t)?
(d) What is the period of oscillations of x(t)?
(e) Find the real solution x(t).
3.105 Calculate the integral $\int e^{x}\cos x\,dx$ using the method outlined in the Discovery Exercise (Section 3.5.2).
3.106 A metal ball of mass 2 kg is attached to a spring with spring constant 2 N/m. As it oscillates it experiences a drag force $F_{\text{drag}} = -cv$, where v is the ball’s velocity and c = 5 kg/s. It is also being pushed back and forth by an oscillating magnetic field that exerts a force $F_m = F_0\cos(\omega t)$, where $F_0 = 3$ N and $\omega = 2\ \text{s}^{-1}$.
(a) Write the equation of motion for this ball.
(b) Solve the complementary homogeneous equation. This will involve guessing an exponential solution, finding that solution, and then writing its real and imaginary parts. Your final answer to this part should be a completely real-valued function with two arbitrary constants.
(c) Find a particular solution. This will involve writing a complex differential equation of which the original differential equation is either the real or the imaginary part, solving, and isolating the appropriate part. Your final answer to this part should be a real-valued function.
(d) Write the general solution to the equation of motion, and verify it. No imaginary numbers should be involved in this part of the problem.
(e) What happens to the ball in the long run?
3.107 A driven, undamped oscillator obeys the differential equation $x''(t) + 4x(t) = \cos(3t)$.
(a) Find the complementary solution to this differential equation.
(b) Find the particular solution.
(c) Write the general solution.
(d) Find the solution if the object starts at rest at position x = 4. Verify that your answer is a solution to the differential equation.
3.108 Consider the differential equation $f'(x) + f(x) = \sin x$.
(a) Find the complementary solution.
(b) Write a differential equation for a function 𝐅(x) with the property that f(x) = Im(𝐅).
(c) Find a particular solution to the equation for 𝐅(x) and take its imaginary part to find a particular solution for f(x).
(d) Find the general solution for f(x) and plug it in to verify that it works.
(e) Find f(x) if f(0) = 0.
3.109 The technique you learned in this section— (c) Write a complex function 𝐗𝟑 (t) that fol-
replacing an equation for a real function lows a circle of radius 13 in the com-
f (x) with an easier-to-solve equation for a plex plane, moving clockwise, starting at
complex function 𝐅(x)—only works if the 5 − 12i and moving ever faster.
equation is linear. To see why that is, you 3.112 Consider the function 𝐗(t) = 𝐗𝟎 e (a+bi)t ,
will attempt to solve the real-valued equation where 𝐗𝟎 is a complex number and a
( ′ )2
f + 13f 2 = cos(6x) by first solving the and b are real numbers.
complex equation (𝐅′ )2 + 13𝐅2 = e 6ix . (a) If you change 𝐗𝟎 , how does the oscillation
(a) Find the values of k for which 𝐅 = ke 3ix is a around the complex plane change?
valid solution to the differential equation (b) If you change a, how does the oscillation
for 𝐅. You should find two answers.
(b) Take the real part of the resulting 𝐅(x) functions to find two guesses for f(x).
(c) Plug those guesses into the differential equation for f(x) and show that they do not work.
(d) Explain why this method doesn’t work for a non-linear equation. In other words, look at the equations for f and 𝐅 and explain why you would not necessarily expect that a solution to the f equation would be the real part of a solution to the 𝐅 equation.
3.110 Consider the complex function 𝐗𝟏(t) = 5e^(𝜋it).
(a) Draw a curve indicating the path this function traces out on the complex plane. Use arrows to indicate the direction of travel, and label the time values of a few key points.
(b) If Im(𝐗𝟏(t)) represents the behavior of a real-world quantity, describe the behavior of that quantity. (You should be able to provide a full description based on your plot, without ever writing the real-valued function explicitly.)
(c) Repeat Parts (a) and (b) for the function 𝐗𝟐(t) = (3 + 4i)e^(𝜋it). Your description should focus on how the behavior is similar to, and different from, 𝐗𝟏(t).
(d) Repeat Parts (a) and (b) for the function 𝐗𝟑(t) = (3 + 4i)e^(2𝜋it). Your description should focus on how the behavior is similar to, and different from, 𝐗𝟐(t).
3.111 (a) Write a complex function 𝐗𝟏(t) that follows a circle of radius 13 in the complex plane, moving counterclockwise, starting when t = 0 at 13i, and covering a full circle every 3 s.
(b) Write a complex function 𝐗𝟐(t) that follows a circle of radius 13 in the complex plane, moving counterclockwise, starting at 5 + 12i, and covering a full circle 30 times per second.
3.112 (continued)
around the complex plane change?
(c) If you change b, how does the oscillation around the complex plane change?
(d) One system is modeled by Re(𝐗(t)) and a different system is modeled by Im(𝐗(t)). How are these two systems alike, and how do they differ?
3.113 In the Explanation (Section 3.5.2) we analyzed the function 𝐗(t) = X₀e^(i𝜔t) by looking at its modulus and phase. In this problem you will look at the real and imaginary parts of the same function.
(a) Writing X₀ as a + bi, find the real and imaginary parts of X₀e^(i𝜔t). Your answer will of course be a function of a, b, and 𝜔 as well as of t.
(b) Find the modulus of your answer to Part (a), and show that |𝐗(t)| = |X₀|.
(c) All dependence on t dropped out of your calculations: although 𝐗(t) depends on t, its modulus |𝐗(t)| does not. What does that fact alone tell you about 𝐗(t) on the complex plane?
3.114 In the Explanation (Section 3.5.2) we calculated the sum S = ∑_{n=1}^{N} sin(𝜔t + n𝛿) by defining it to be the imaginary part of the sum 𝐒 = e^(i𝜔t) ∑_{n=1}^{N} e^(in𝛿), which we found was given by Equation 3.5.10. Write this solution in the form a + bi and take its imaginary part to find the real signal S.
3.115 In this problem you will examine the sum ∑_{n=1}^{N} cos(𝜔t + n𝛿 + 𝜙₀).
(a) Write this sum as the real part of a complex exponential sum.
(b) Using the laws of exponents, rewrite that sum in the form a ∑ (r^n).
(c) Using the fact that ∑_{n=1}^{N} (r^n) = r(1 − r^N)∕(1 − r), write a closed-form expression for the sum.
(d) How did 𝜙₀ affect the final answer? (Consider possible changes in
amplitude, frequency, and phase. You do not have to find the real and imaginary parts to answer this question.)
3.116 In this problem you will examine the sum ∑_{n=1}^{N} 2^n cos(𝜔t + n𝛿).
(a) Write this sum as the real part of a complex exponential sum.
(b) Using the laws of exponents, rewrite that sum in the form a ∑ (r^n).
(c) Using the fact that ∑_{n=1}^{N} (r^n) = r(1 − r^N)∕(1 − r), write a closed-form expression for the sum.
(d) In the Explanation (Section 3.5.2) we described ∑_{n=1}^{N} sin(𝜔t + n𝛿) as a receiver at a particular location receiving signals from several identical, evenly spaced sources of electromagnetic radiation. Changing from sine to cosine doesn’t make a substantial difference, but inserting 2^n does. How would you change that scenario to match the sum in this problem?
(e) Did the new factor of 2^n in this problem change the frequency of the resulting oscillations? If so, how?
3.117 Consider the sum ∑_{n=1}^{5} sin(n𝜋∕8), which we can consider as the imaginary part of ∑_{n=1}^{5} e^(in𝜋∕8). In the Explanation (Section 3.5.2) we calculated a similar sum using a geometric series, but you can also do it graphically by thinking of the terms in the complex sum as vectors in the complex plane. This problem will be easiest with a protractor and a piece of graph paper.
(a) What are the modulus and phase of the first term in the complex series?
(b) Draw the complex plane and draw an arrow from the origin to this first term. The modulus will tell you the length of this arrow and the phase will give the direction.
(c) Draw another arrow representing the second term, once again using its proper modulus and phase. To add the two numbers, place the beginning of this second arrow at the end of the first one.
(d) Continue in this way until you have included all five arrows. Then draw an arrow from the origin to the end of the arrow for the last term. This long arrow represents the total sum.
(e) Recalling that the real sum we want is the imaginary part of this complex sum, use your drawing to estimate ∑_{n=1}^{5} sin(n𝜋∕8).
(f) Add up those five numbers on your calculator and compare it to what you got graphically.
(g) Find N such that ∑_{n=1}^{N} cos(n𝜋∕8) = 0. Notice that we’re using cosines here, so you want the real part of the sum to be zero. Explain how you got your answer based on the graphical method, with no calculation of trig functions or exponentials needed.
3.118 Exploration: Diffraction Grating
In a diffraction experiment light passes through an array of N thin slits, lined up in a row with spacing 𝛿 between adjacent slits. The light has wavelength 𝜆. You may assume the light beams all have zero phase and equal amplitude A as they pass through the slits. The light is detected on a wall parallel to the wall with the slits, a distance y behind it. The detector is placed on the back wall at a distance x beyond the first slit.
[Figure: the wall of slits and the detector wall, with the distances x and y labeled.]
(a) How far does the light travel as it moves from the nth slit to the detector?
(b) Use a Maclaurin series to approximate your answer to Part (a) as a linear function in 𝛿. Under what circumstances is this a reasonable approximation to use?
(c) Using the approximate distance you found, what phase change does the light experience as it goes from the nth slit to the detector? (The phase difference can be defined as 2𝜋 times the number of wavelengths between the slit and the detector.)
(d) The signal received from a single slit is A cos(𝜔t + 𝜙), where 𝜔 is the angular frequency of the light beam and 𝜙 is the phase change from the slit to the detector. Write a sum representing the total signal received at the detector.
(e) The total signal can be rewritten in the form ∑ A cos(𝜔t + c + dn). Find c and d in terms of the constants given in the problem.
(f) Rewrite the signal as the real or imaginary part of a complex exponential function.
Find the sum of this complex series. Give your answer in terms of c and d, which will be much shorter and easier than writing it in terms of the original constants in the problem.
Your sum should have been in the form Ze^(i𝜔t), where Z is a complex constant. The total signal received at the detector is the real part of your answer, but without calculating that you can see that the signal will oscillate with an amplitude |Z|. Even finding that modulus is a fairly tedious calculation, however, so let’s turn to a computer. Let A = 1, y = 9 m, 𝛿 = 7 × 10⁻⁴ m, and 𝜆 = 10⁻⁶ m. Our goal is to see the pattern of light and dark that will emerge on the wall, so we need to leave x as a variable.
(g) Plug your solution from Part (f) into a computer using N = 2 for a double-slit experiment. Ask the computer to plot the amplitude of the signal as a function of x. What pattern emerges on the wall?
(h) Repeat Part (g) for 5 slits, 10 slits, and 100 slits. How does the pattern change as the number of slits increases?
CHAPTER 4
Partial Derivatives
∙ take derivatives using the product rule, quotient rule, and chain rule.
∙ graph single-variable functions in Cartesian (rectangular) and polar coordinates.
∙ work with vectors, including computing the dot product.
∙ graph and interpret “multivariate” functions (one dependent variable, multiple indepen-
dent variables).
After you read this chapter, you should be able to …
“How far will the meteoroid descend into the atmosphere before burning up entirely?”
An introductory calculus problem might claim that the distance d is a function of the mete-
oroid’s speed v, or of its mass m, or of the atmospheric temperature T . Any of these functions
might be valid under the right circumstances, but all are oversimplifications because the
meteoroid’s penetration actually depends on all these variables—among others.
Functions of one variable are the exception; most important quantities depend on many
different variables. In this chapter we assume you have some basic familiarity with multivari-
ate functions, but have not necessarily done much calculus with them. We look at derivatives
of multivariate functions, known as “partial derivatives,” and discuss some of their many
applications.
1. The slope of the string at the point marked in Figure 4.1 is “the derivative of y with
respect to x,” sometimes written 𝜕y∕𝜕x. Estimate 𝜕y∕𝜕x at the point shown by looking
at the drawing.
2. You cannot estimate “the derivative of y with respect to t” by looking at the drawing,
but suppose we told you that at this particular point on the string at this particular
moment, 𝜕y∕𝜕t = −3. Estimate where that piece of string would be at t = 10.5.
See Check Yourself #17 in Appendix L
An “equation of motion” relates the acceleration of an object to its position and/or velocity.
For example, a mass attached to a rusted spring might obey the equation
d²x∕dt² = −(b∕m)(dx∕dt) − (k∕m)x. If you know the position and velocity at any given moment, you can
use this equation to calculate the acceleration and thereby predict the motion over time.
The equation of motion for our string is the “wave equation”:
\frac{\partial^2 y}{\partial t^2} = c^2 \frac{\partial^2 y}{\partial x^2} \qquad (4.1.1)
On the left is the second derivative of y with respect to t: the vertical acceleration. On the
right is the second derivative of y with respect to x: the concavity. This equation asserts that
if you know the concavity at any given point—any particular x and t—you can determine the
acceleration at that point. If you know the shape of the entire string at a particular moment,
as well as the velocity at each point, you can use this equation to predict the motion of the
string over time.
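A quick numerical check can make Equation 4.1.1 concrete. The sketch below is an illustration only, not part of the text's derivation: it assumes a hypothetical traveling wave y = sin(x − ct) with an arbitrarily chosen wave speed c = 2, and estimates both second derivatives with central differences. The helper name second_partial is made up for this example.

```python
import math

C = 2.0  # wave speed; an arbitrary value assumed for illustration


def y(x, t):
    """Hypothetical string shape: a traveling wave y = sin(x - C t)."""
    return math.sin(x - C * t)


def second_partial(f, x, t, var, h=1e-4):
    """Central-difference estimate of a second partial derivative of f(x, t)."""
    if var == "x":
        return (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h**2
    return (f(x, t + h) - 2 * f(x, t) + f(x, t - h)) / h**2


x0, t0 = 0.7, 0.3
acceleration = second_partial(y, x0, t0, "t")  # left side of Equation 4.1.1
concavity = second_partial(y, x0, t0, "x")     # second x-derivative
print(acceleration, C**2 * concavity)          # the two sides should agree closely
```

If the two printed numbers agree to several digits, this particular y(x, t) is behaving the way the wave equation says it must: acceleration equals c² times concavity at the chosen point.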
3. Consider a string that begins at rest stretched into a horizontal line. (We can express
this mathematically by writing y(x, 0) = 0.) Based on Equation 4.1.1, what will happen
to this string over time? Your answer will not require calculations, but you shouldn’t
just say what you would expect physically. Explain what behavior Equation 4.1.1
predicts by thinking about concavity and acceleration.
4. Now consider a string that begins at rest stretched into a diagonal line from
the point (0, 0) to the point (10, 20), so y(x, 0) = 2x. Based on Equation 4.1.1,
what will happen to this string over time?
See Check Yourself #18 in Appendix L
5. Now consider the string whose initial position y(x, 0) is represented in
Figure 4.1. Based on Equation 4.1.1, which parts of the string will accelerate
downward? Which parts will accelerate upward?
FIGURE 4.1 A vibrating string. (Horizontal axis: x; vertical axis: y.)
At the point (x0 , y0 ), the slope of the black curve is 𝜕z∕𝜕x and the slope of the blue curve is 𝜕z∕𝜕y.
But another visualization, more useful in many circumstances, comes from the “height of
a string” function y(x, t) from Section 4.1.
[Figure: the height y of the string plotted against position x.]
This function has two different first derivatives with different meanings.
∙ 𝜕y∕𝜕x asks the question “how much does the height increase per unit distance as we
move to the right?” In a word, it gives the slope.
∙ 𝜕y∕𝜕t asks the question “how much does the height increase per unit time if we watch
one point on the string?” In a word, it gives the velocity.
Roughly speaking, 𝜕y∕𝜕x is how much higher the string is one unit further to the right, and
𝜕y∕𝜕t is how much higher the string will be at the point you’re looking at one unit of time
later.
Since we can take the derivative of either of these derivatives with respect to either variable,
there are four second derivatives.
∙ (𝜕∕𝜕x)(𝜕y∕𝜕x) asks the question “How much does the slope change per unit distance if
we move to the right?”
∙ (𝜕∕𝜕t)(𝜕y∕𝜕x) asks the question “How much does the slope change per unit time if we
watch one point and wait?”
∙ (𝜕∕𝜕x)(𝜕y∕𝜕t) asks the question “How much does the velocity change per unit distance
if we move to the right?”
∙ (𝜕∕𝜕t)(𝜕y∕𝜕t) asks the question “How much does the velocity change per unit time if
we watch one point and wait?”
The first item in this list, generally written as 𝜕²y∕𝜕x², represents the concavity at a particular
point on the rope and moment in time. The fourth item in the list, 𝜕²y∕𝜕t², represents the
acceleration of one point on the rope at one instant.
The middle two, 𝜕²y∕𝜕t𝜕x and 𝜕²y∕𝜕x𝜕t, are called “mixed partial derivatives,” or “mixed
partials” for short. Surprisingly, partial derivatives generally commute: that is, if you take the
same derivatives in a different order, you get the same result.
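As an example of Schwarz’ theorem, consider the function z(x, y) = x e^{xy^2}.
\frac{\partial z}{\partial x} = e^{xy^2} + x y^2 e^{xy^2}
\frac{\partial^2 z}{\partial y\,\partial x} = 2xy\,e^{xy^2} + 2xy\,e^{xy^2} + 2x^2 y^3 e^{xy^2} = 4xy\,e^{xy^2} + 2x^2 y^3 e^{xy^2}
\frac{\partial z}{\partial y} = 2x^2 y\,e^{xy^2}
\frac{\partial^2 z}{\partial x\,\partial y} = 4xy\,e^{xy^2} + 2x^2 y^3 e^{xy^2}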
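The equality of the two mixed partials for z = x e^(xy²) can also be checked numerically. The following is a minimal sketch, assuming nothing beyond the function above: it estimates each mixed partial by nesting central differences in the two possible orders and compares both with the closed form 4xy e^(xy²) + 2x²y³ e^(xy²). The function names and the sample point (0.5, 0.8) are choices made for this illustration.

```python
import math


def z(x, y):
    return x * math.exp(x * y**2)


def dz_dx(x, y, h=1e-5):
    """Central-difference estimate of the partial of z with respect to x."""
    return (z(x + h, y) - z(x - h, y)) / (2 * h)


def dz_dy(x, y, h=1e-5):
    """Central-difference estimate of the partial of z with respect to y."""
    return (z(x, y + h) - z(x, y - h)) / (2 * h)


def mixed_yx(x, y, h=1e-4):
    """Differentiate with respect to x first, then with respect to y."""
    return (dz_dx(x, y + h) - dz_dx(x, y - h)) / (2 * h)


def mixed_xy(x, y, h=1e-4):
    """Differentiate with respect to y first, then with respect to x."""
    return (dz_dy(x + h, y) - dz_dy(x - h, y)) / (2 * h)


def closed_form(x, y):
    """The mixed partial computed by hand in the example above."""
    return (4 * x * y + 2 * x**2 * y**3) * math.exp(x * y**2)


x0, y0 = 0.5, 0.8  # an arbitrary sample point
print(mixed_yx(x0, y0), mixed_xy(x0, y0), closed_form(x0, y0))
```

The three printed values should agree to within the accuracy of the finite differences, which is a useful habit for catching algebra slips when taking mixed partials by hand.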
Notation
The most common way to write a partial derivative is the one we’ve been using: 𝜕z∕𝜕x. An
alternative is a subscript, with or without a comma: z_{,x} = z_x = 𝜕z∕𝜕x. It’s also common in
physics to use a dot over a variable to mean a derivative with respect to time: ż = 𝜕z∕𝜕t.
In cases where there is potential ambiguity about which variable is being held constant,
you can indicate that with a subscript on the entire partial derivative:
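\left.\frac{\partial z}{\partial x}\right|_y \quad \text{or} \quad \left(\frac{\partial z}{\partial x}\right)_y \quad \text{means partial derivative of } z \text{ with respect to } x \text{, holding } y \text{ constant}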
You’ll work through an example with ambiguity like this in Problem 4.27, and we’ll show
some important applications of this idea in Section 4.10 (see felderbooks.com).
1 It’s a common mistake to refer to the rock as a “meteor,” which actually means the visible trail left by a meteoroid
as it burns up. If a meteoroid reaches the ground it becomes a “meteorite.” Now don’t you feel smart?
the same questions for the partial derivative with respect to that variable.
4.2 The amount you pay in taxes depends on your income I, your total number of dependents d, and the amount you give to charity c. In Parts (a)–(c), explain the meaning of the given partial derivative in terms a 12 year old could understand. Also give the units of the derivative and whether you would expect it to be positive or negative.
(a) 𝜕T∕𝜕I
(b) 𝜕T∕𝜕d
(c) 𝜕T∕𝜕c
(d) List one other factor that your taxes depend on. Then answer the same questions for the partial derivative with respect to that variable.
4.3 Your puppy’s weight W depends on its caloric intake c and how many hours per week you walk it, h. In Parts (a)–(b), explain the meaning of the given partial derivative in terms a 12 year old could understand. Also give the units of the derivative and whether you would expect it to be positive or negative.
(a) 𝜕W∕𝜕c
(b) 𝜕W∕𝜕h
(c) List one other factor that your puppy’s weight depends on. Then answer the same questions for the partial derivative with respect to that variable.
4.4 Consider a function u(x, y, z, t) that gives the temperature of the air in a room as a function of time.
(a) What would it mean physically if 𝜕u∕𝜕z > 0 at some point (x0, y0, z0, t0)?
(b) What would it mean physically if 𝜕u∕𝜕t = 0 at some point (x0, y0, z0, t0)?
(c) Is it possible for both of the above statements to be true at the same place and time?
4.5 The temperature u on Skullcrusher Mountain depends on height h and time t.
(a) Everywhere on the mountain 𝜕u∕𝜕h is negative. What does that tell you about the mountain?
(b) Throughout the morning 𝜕u∕𝜕t is positive. What does that tell you about the mountain?
(c) Suppose 𝜕²u∕𝜕h² is negative. What would that mean? (Explain in the clearest and least technical language you can, for the benefit of a mountain-climber who knows no calculus.)
(d) Suppose 𝜕²u∕𝜕t² is negative. What would that mean? (Same comment.)
(e) Suppose 𝜕²u∕(𝜕t𝜕h) is zero. What would that mean? (Same comment.) There are two equally valid answers you could give to this question.
For Problems 4.6–4.9 find the partial derivatives 𝜕f∕𝜕x, 𝜕f∕𝜕y, and 𝜕²f∕𝜕x𝜕y.
4.6 f = xy²
4.7 f = x∕y
4.8 f = a cos(bxy) + ce^(d∕y)
4.9 f = sin x∕cos y
4.10 If f(x, y, t) = ae^(bx²y) sin(𝜔t) find 𝜕f∕𝜕t and 𝜕²f∕𝜕x𝜕y.
4.11 For the function f(x, y) = x∕cos y calculate both mixed partial derivatives and verify that they are equal.
4.12 For the function f(x, y) = xe^(xy) calculate both mixed partial derivatives and verify that they are equal.
4.13 You are standing on a surface whose height is given by the function z(x, y). At the spot where you are standing, 𝜕z∕𝜕x = 5 and 𝜕z∕𝜕y = −5. Are you facing uphill or downhill if you face…
(a) East (positive x-direction)?
(b) North (positive y-direction)?
(c) West (negative x-direction)?
(d) South (negative y-direction)?
4.14 You are standing on a surface whose height is given by the function z = e^(x+y)∕y². Your position is (1, 1, e²), and you are facing in the positive x-direction.
(a) Do a calculation that proves that you are looking uphill.
(b) If you move in the positive x-direction, will the uphill slope in the x-direction increase or decrease?
(c) If instead you move in the positive y-direction (still facing in the positive x-direction), will the uphill slope in the x-direction increase or decrease?
4.15 If a battery with potential V is hooked across a resistor of resistance R, a current flows whose magnitude is I = V∕R. (V, I, and R are all positive quantities.) In Parts (a) and (b) your final answers should be in language that would make sense to someone who has never taken calculus.
(a) Calculate 𝜕I∕𝜕V. Is it positive or negative? What does this tell us about the current through such a circuit?
(b) Calculate 𝜕I∕𝜕R. Is it positive or negative? What does this tell us about the current through such a circuit?
(c) Demonstrate that 𝜕²I∕𝜕V𝜕R = 𝜕²I∕𝜕R𝜕V.
4.16 A light source of strength S shining on an object d meters away provides an illumination given by I = kS∕d² where k is a positive constant and S and d are positive variables.
(a) Without doing any calculations, would you expect 𝜕I∕𝜕S to be positive, negative, or zero? Explain your answer.
4.18 In special relativity, the length of an object is given by the formula L = L₀√(1 − v²∕c²) where L₀ is the “rest length” of the object, v is its speed, and c, the speed of light, is a constant. (Note that v ≥ 0.)
(a) Calculate 𝜕L∕𝜕L₀. Is it positive or negative? What does this tell you about the length of high-speed objects in special relativity?
(b) Calculate 𝜕L∕𝜕v. Is it positive or negative? What does this tell you about the length of high-speed objects in special relativity?
(c) Calculate lim_{v→c⁻} (𝜕L∕𝜕v). Based on the result, explain how you can guess the sign of 𝜕²L∕𝜕v² without taking another derivative.
To estimate 𝜕z∕𝜕x(3, 1), you can look at the average rate of change as x goes from 2 to 3 (holding y constant): (70 − 50)∕(3 − 2) = 20. Then repeat the calculation as x goes from 3 to 4: (80 − 70)∕(4 − 3) = 10. Averaging the two, we approximate 𝜕z∕𝜕x(3, 1) = 15.
(a) Estimate 𝜕z∕𝜕y(2, 3).
(b) Estimate 𝜕z∕𝜕y(3, 3).
(c) Estimate 𝜕z∕𝜕y(4, 3).
(d) Based on your three answers, estimate 𝜕²z∕𝜕x𝜕y(3, 3).
(e) Use a similar method to estimate 𝜕²z∕𝜕y𝜕x(3, 3). The equality of mixed partials tells us that your final answer should come out the same as Part (d), although the intermediate calculations will be completely different.
4.21 [This problem depends on Problem 4.20.] Another way to estimate a derivative from a table is to skip the value you are interested in, and find the average around it. For instance, to estimate 𝜕z∕𝜕x(3, 1) in Problem 4.20, you look at the two values around the 70: (80 − 50)∕(4 − 2) = 15.
(a) Estimate 𝜕z∕𝜕y(2, 3) using this quicker method.
To show that this method will always work, consider the more generic table of values below.
z(x, y)
y\x   1     2     3     4     5
1     z11   z21   z31   z41   z51
2     z12   z22   z32   z42   z52
3     z13   z23   z33   z43   z53
4     z14   z24   z34   z44   z54
5     z15   z25   z35   z45   z55
(b) Compute 𝜕z∕𝜕x(3, 2) in this table by using the “find two different rates of change and average them” method from Problem 4.20.
(c) Compute 𝜕z∕𝜕x(3, 2) in this table by using the quicker method demonstrated in this problem.
4.22 [This problem depends on Problems 4.20 and 4.21.] Calculate 𝜕²z∕𝜕x𝜕y(3, 3) and 𝜕²z∕𝜕y𝜕x(3, 3) for the table of values given in Problem 4.21, to demonstrate the equality of mixed partials more generally.
A differential equation is an equation involving derivatives. If it involves partial derivatives, it’s called a “partial differential equation.” In Problems 4.23–4.26 you should plug each of the given solutions into the partial differential equation given in the problem to see whether it’s a valid solution. In some cases more than one of the solutions may work. (Any letters that don’t appear in the derivatives are constants.)
4.23 𝜕f∕𝜕t = c 𝜕f∕𝜕x
(a) f(x, t) = cxt
(b) f(x, t) = c
(c) f(x, t) = (x + ct)²
(d) f(x, t) = (x − ct)²
(e) f(x, t) = x² + ct²
4.24 “The Wave Equation”: 𝜕²y∕𝜕t² = c² 𝜕²y∕𝜕x². This describes (among other things) waves on a stretched string.
(a) y(x, t) = cxt
(b) y(x, t) = c
(c) y(x, t) = (x + ct)²
(d) y(x, t) = (x − ct)²
(e) y(x, t) = x² + ct²
4.25 “The Heat Equation”: 𝜕u∕𝜕t = c² 𝜕²u∕𝜕x². This describes (among other things) temperature along a thin rod.
(a) u(x, t) = x² + 2c²t
(b) u(x, t) = cx
(c) u(x, t) = (x + ct)²
(d) u(x, t) = e^(x + c²t)
4.26 𝜕f∕𝜕t = (1 − x²)(𝜕²f∕𝜕x²) − 2x(𝜕f∕𝜕x). Note that several of the proposed solutions below involve functions that you may not have heard of, but you can still use a computer algebra program to take their derivatives and check them in the given equation.
(a) f(x, t) = sin x e^(−2t)
(b) f(x, t) = J₁(x)e^(−2t) (“Bessel function”)
(c) f(x, t) = P₁(x)e^(−2t) (“Legendre polynomial”)
(d) f(x, t) = Ai(x)e^(−2t) (“Airy function”)
(e) f(x, t) = H₁(x)e^(−2t) (“Hermite polynomial”)
4.27 In cases where a function depends on several variables that are related to each other, the notation 𝜕f∕𝜕x can be ambiguous without a specification of what other variable is being held constant. (We will take up this issue in more detail in Section 4.10 (see felderbooks.com).) To illustrate how that can happen, suppose you are designing a box with no top for your company to sell. The volume is given by V = wlh where w, l, and h are the width, length, and height respectively. You want to use a fixed amount of material, given by the surface area S = wl + 2wh + 2lh.
(a) Using the constraint that the surface area S must remain constant, express V as a function of w and l.
(b) Find (𝜕V∕𝜕w)_l, meaning the rate of change of the volume with respect to w, holding the length l constant.
(c) Now express V as a function of w and h and find (𝜕V∕𝜕w)_h.
(d) Compute (𝜕V∕𝜕w)_l and (𝜕V∕𝜕w)_h for the case l = 1 m, h = 1 m, and S = 5 m².
(e) You should have found in this case that one of the two derivatives is zero and the other is positive. Explain this result in terms that could easily be understood by a student who has never taken calculus.
To understand the chain rule more generally, consider the breakdown of functions in that
example. We can write the problem like this:
z(y) = sin y
y(x) = x²
Take a moment to calculate dz∕dy and dy∕dx and convince yourself that Equation 4.3.1
does represent the steps you take when you use the chain rule to find the derivative of
z = sin(x²). (We’re serious; you’ll get a lot more out of this section if you stop now and
work through this. We’re just waiting…done? Okay, let’s move on.)
We use the chain rule whenever we have a chain of dependence among several variables.
The above example demonstrates the simplest possible case:
[Figure: dependency tree in which z depends on y, and y depends on x.]
The chain rule tells us to move down the picture from z at the top to x at the bottom,
multiplying as we go.
A slightly more complicated case that might appear in an introductory calculus class is the
chain of functions: a(b) = e^b, b(c) = cos(c), c(d) = ln(d). The chain rule tells us that:
\frac{da}{dd} = \left(\frac{da}{db}\right)\left(\frac{db}{dc}\right)\left(\frac{dc}{dd}\right) \qquad (4.3.2)
Once again, we urge you to convince yourself of two things.
1. Equation 4.3.2 makes perfect sense if you think of each derivative as a fraction, with
numerators and denominators canceling.
2. This is really nothing new. If someone had asked you yesterday to take the derivative of
the function e^(cos(ln d)), you would have gone through precisely the steps represented by
Equation 4.3.2 (even though you might never have written the equation like that) in
order to end up with e^(cos(ln d)) (−sin(ln d))(1∕d).
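The three-link chain in Equation 4.3.2 is easy to check numerically. The sketch below is only an illustration: it multiplies the three link derivatives for a = e^b, b = cos(c), c = ln(d), and compares the product with a central-difference estimate of the slope of a as a function of d. The function names and the sample point d = 2.5 are assumptions made for this example.

```python
import math


def a_of_d(d):
    """The composite function a = exp(cos(ln d))."""
    return math.exp(math.cos(math.log(d)))


def chain_rule(d):
    """Multiply the link derivatives, moving down the chain from a to d."""
    c = math.log(d)
    b = math.cos(c)
    da_db = math.exp(b)    # derivative of e^b with respect to b
    db_dc = -math.sin(c)   # derivative of cos(c) with respect to c
    dc_dd = 1.0 / d        # derivative of ln(d) with respect to d
    return da_db * db_dc * dc_dd


def central_difference(f, x, h=1e-6):
    """Numerical estimate of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


d0 = 2.5  # arbitrary sample point with d > 0
print(chain_rule(d0), central_difference(a_of_d, d0))
```

The two printed values should agree closely, which is the numerical counterpart of the "fractions canceling" picture in item 1.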
[Figure: dependency tree in which S depends on c and g, and c and g each depend on t.]
You can calculate dS∕dt by following the chain rule down the tree from S to t, but in which
direction? The answer—and this is the key to applying the chain rule in any multivariate
situation—is that you go in all possible directions, and you add their contributions.
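\frac{dS}{dt} = \left(\frac{\partial S}{\partial c}\right)\left(\frac{dc}{dt}\right) + \left(\frac{\partial S}{\partial g}\right)\left(\frac{dg}{dt}\right) \qquad (4.3.3)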
You can convince yourself of Equation 4.3.3 with common sense. If the concentration of
cocoa is increasing by 3 units every minute, and every additional unit of cocoa decreases the
sweetness by 5, then the cocoa is decreasing the sweetness by 15 every minute. The effect
of (𝜕S∕𝜕g)(dg∕dt) similarly measures the increase (or decrease) based on the changing
sugar concentration. The new element, the “plus” between the two terms, indicates that you
experience both effects at once.
A different example makes this result more visual: consider graphing a function
z(x, y), so z is the height above the xy-plane. As you move along the surface, you change
your x- and y-coordinates simultaneously, and your z-coordinate changes in response
(since you are staying on the surface).
How fast does z change? We answer this question by replacing one diagonal step
across the xy-plane with two perpendicular steps. If you take a step in the positive
x-direction, the height increases by (𝜕z∕𝜕x)dx. A step in the positive y-direction
increases the height by (𝜕z∕𝜕y)dy. The two effects combine when you take a diagonal
step, so dz = (𝜕z∕𝜕x)dx + (𝜕z∕𝜕y)dy. (The Discovery Exercise, Section 4.3.1, was
designed to bring you to this conclusion on your own.)
[Figure: a diagonal step on the surface broken into steps dx and dy. The total change in z is the sum of the changes due to changing x and y.]
Question: Consider the surface represented by the function z = x²∕y. A point moves
along that surface with x = sin t and y = ln t. How fast is the point’s height changing
as a function of time?
Answer:
The chain rule tells us that:
\frac{dz}{dt} = \left(\frac{\partial z}{\partial x}\right)\left(\frac{dx}{dt}\right) + \left(\frac{\partial z}{\partial y}\right)\left(\frac{dy}{dt}\right) = \left(\frac{2x}{y}\right)\cos t + \left(-\frac{x^2}{y^2}\right)\left(\frac{1}{t}\right) = \frac{2\sin t\cos t}{\ln t} - \frac{\sin^2 t}{t(\ln t)^2}
If we approach the same problem without the multivariate chain rule, we begin by
writing z(t) = sin² t∕ln t and use the quotient rule:
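As a cross-check on the worked example, the short sketch below compares the two routes numerically. It is an illustration under simple assumptions, not the text's own calculation: one function assembles dz∕dt from the partial derivatives as the multivariate chain rule prescribes, and the other differentiates the single-variable function z(t) = sin² t∕ln t by a central difference. The function names and the sample time t = 2 are made up for this example.

```python
import math


def dz_dt_chain(t):
    """dz/dt assembled from partial derivatives, as in the chain rule above."""
    x, y = math.sin(t), math.log(t)
    dz_dx = 2 * x / y        # partial of x^2/y with respect to x
    dz_dy = -x**2 / y**2     # partial of x^2/y with respect to y
    dx_dt = math.cos(t)
    dy_dt = 1.0 / t
    return dz_dx * dx_dt + dz_dy * dy_dt


def z_of_t(t):
    """The same height written directly as a function of t."""
    return math.sin(t) ** 2 / math.log(t)


def central_difference(f, t, h=1e-6):
    return (f(t + h) - f(t - h)) / (2 * h)


t0 = 2.0  # arbitrary sample time (t must be positive and not equal to 1)
print(dz_dt_chain(t0), central_difference(z_of_t, t0))
```

Both numbers should match to several digits, confirming that adding the contributions along each branch gives the same rate of change as differentiating z(t) directly.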
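\frac{dS}{dt} = \frac{\partial S}{\partial c}\frac{\partial c}{\partial x}\frac{dx}{dt} + \frac{\partial S}{\partial c}\frac{\partial c}{\partial t} + \frac{\partial S}{\partial g}\frac{\partial g}{\partial x}\frac{dx}{dt} + \frac{\partial S}{\partial g}\frac{\partial g}{\partial t} \qquad (4.3.4)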
If there were any branches of the dependency tree that didn’t end in t, then dS∕dt wouldn’t
be a meaningful quantity. For example, we could define a quantity R(x, t) as the sweetness
of the river. (At any given place and time, the river has a certain sweetness, regardless of the
fish.) We could speak of 𝜕R∕𝜕t: at a given place and time, the river’s sweetness is increasing
by this much. But we could not speak of a dR∕dt because the river doesn’t have a single
sweetness at each time. Recalling that S is the sweetness experienced by the fish, dS∕dt exists
because the fish does experience a particular sweetness at each time.
In terms of dependency trees, R depends on x, y, z, and t, but none
of those are functions of the other ones.
With a bit of practice, you can easily write a chain rule for any
given dependency diagram. If we now gave you specific functions
that describe the fish’s position as a function of time, and the concentration
of cocoa as a function of time and position, and so on, you
could calculate change in sweetness as seen by the fish. Of course this
problem is deliberately silly, but this technique is important in many
areas of physics and engineering; you will see some serious examples
in the problems.
FIGURE 4.2 Dependency tree: S depends on c and g; c and g each depend on x and t; x depends on t.
Total Derivatives and Partial Derivatives
Focus for a moment on the left-hand branch of Figure 4.2. The tree
allows us to find the rate of change of c, the concentration of cocoa:
\frac{dc}{dt} = \frac{\partial c}{\partial x}\frac{dx}{dt} + \frac{\partial c}{\partial t} \qquad (4.3.5)
It isn’t hard to follow the diagram and write the correct equation. Here’s the hard part: how
do we make sense of an equation that contains two different “derivative of c with respect
to t” quantities? Discussing what these two quantities mean, and how they differ from each
other, offers us a window into the more general issue of total derivatives (dc∕dt) and partial
derivatives (𝜕c∕𝜕t).
Remember that the cocoa concentration is changing over time for two reasons. First, c is
different in different places, which matters because the fish is moving. Second, c is different
at different times. Both of these factors contribute to the overall change the fish experiences.
If both factors cause increases, the fish will experience an extra-large cocoa increase; if one
causes a decrease, they may cancel each other out.
The quantity 𝜕c∕𝜕t isolates one of these changes, the part that is directly due to the fact
that concentration depends on time; it ignores the indirect effect due to the fact that the
concentration depends on position, which in turn depends on time. To put it another way,
𝜕c∕𝜕t represents the rate of concentration change the fish would see if it stopped swimming.
(Note from Equation 4.3.5 that dc∕dt and 𝜕c∕𝜕t are the same if dx∕dt = 0.) On the other
hand, dc∕dt takes into account both the direct and indirect effects.
So whenever you see a total derivative, you know that “If this variable changes by this
much, that variable will change by that much.” If dc∕dt is a constant 5, wait 3 s and c will go
up by 15 for the fish. A partial derivative tells you much less: “If this variable changes while
all other variables remain constant” is, in many cases, an impossible scenario. (If the fish is
swimming, you cannot literally change t without also changing x.) The partial derivative is
still useful, but often only as a step toward finding the total derivative.
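To see the difference between dc∕dt and 𝜕c∕𝜕t in numbers, here is a small sketch. Everything specific in it is invented for illustration: the concentration field c(x, t) = x² + 3t and the fish’s path x(t) = 2t are hypothetical choices, not functions from the text. The partial derivative holds x fixed (a fish that has stopped swimming), while the total derivative follows the fish along its path, as in Equation 4.3.5.

```python
def c(x, t):
    """Hypothetical cocoa concentration at position x and time t."""
    return x**2 + 3 * t


def x_of_t(t):
    """Hypothetical position of the fish at time t."""
    return 2 * t


def partial_c_t(x, t, h=1e-6):
    """Rate of change of c with time at a fixed position (fish not moving)."""
    return (c(x, t + h) - c(x, t - h)) / (2 * h)


def partial_c_x(x, t, h=1e-6):
    """Rate of change of c with position at a fixed time."""
    return (c(x + h, t) - c(x - h, t)) / (2 * h)


def total_dc_dt(t, h=1e-6):
    """Rate of change of c as experienced by the moving fish."""
    dx_dt = (x_of_t(t + h) - x_of_t(t - h)) / (2 * h)
    x = x_of_t(t)
    return partial_c_x(x, t) * dx_dt + partial_c_t(x, t)


t0 = 1.5
print(partial_c_t(x_of_t(t0), t0))  # direct time dependence only
print(total_dc_dt(t0))              # direct plus indirect (via x) dependence
```

For these made-up functions the direct effect is 3 per unit time while the fish experiences 8t + 3, so at t = 1.5 the two derivatives differ by the indirect term (𝜕c∕𝜕x)(dx∕dt) = 12.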
This is not just a philosophical issue. Understanding dx in this way is vital to setting up
integrals, as we will discuss extensively in Chapter 5. Differentials are used extensively in
thermodynamics, where you will be expected to work with equations such as the “thermo-
dynamic identity” dU = T dS − P dV . (“If S changes by this much while V changes by that
much, then how much will U change?”)
Differentials can also be viewed as margins of error. For instance, the kinetic energy of a
Newtonian object is E = (1∕2)mv². Based on this equation, we can write:
dE = mv\,dv + \tfrac{1}{2}v^2\,dm \qquad (4.3.6)
Divide both sides by dt and you have an equation relating dE∕dt to dv∕dt and dm∕dt. But you
can also take Equation 4.3.6 on its own, to say that if the velocity is 10 ± .1 m/s and the mass
is 30 ± .01 kg, then the energy is (1500 ± 30.5) J.
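The margin-of-error arithmetic quoted above can be reproduced in a few lines. This is a minimal sketch using only the numbers given in the text (v = 10 ± 0.1 m/s, m = 30 ± 0.01 kg); the variable names are our own. It evaluates Equation 4.3.6 and, for comparison, the actual energy change when both measurements are off by their full margins.

```python
m, dm = 30.0, 0.01   # mass and its margin of error, in kg
v, dv = 10.0, 0.1    # speed and its margin of error, in m/s

energy = 0.5 * m * v**2                 # E = (1/2) m v^2
dE = m * v * dv + 0.5 * v**2 * dm       # Equation 4.3.6

# Actual change if both quantities sit at the top of their ranges.
exact_change = 0.5 * (m + dm) * (v + dv) ** 2 - energy

print(energy, dE, exact_change)
```

The differential reproduces the (1500 ± 30.5) J quoted in the text; the exact change comes out slightly larger because the differential keeps only first-order terms in dv and dm.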
But if it is vital to understand that dx can be treated as a variable, it is also important to
know that 𝜕x cannot! If you try to multiply both sides of an equation by 𝜕t or cancel 𝜕t out
of the top and bottom of a fraction you will at best get something meaningless and at worst
get an equation that is simply wrong. You will explore this issue in Problem 4.50.
v, ṁ, and v̇. (Remember that a dot indicates a derivative with respect to time.)
(b) Kinetic energy is given by the formula E = (1∕2)mv². If the snowball is gaining 0.2 kg of mass each second, it’s speeding up by 0.1 m/s each second, and its current mass and speed are 12.5 kg and 3.2 m/s, how fast is its kinetic energy increasing?
4.37 According to the Stefan-Boltzmann law, the power emitted by a star is given by P = 𝜎AT⁴, where A is the star’s surface area (4𝜋R²), T is its surface temperature, and 𝜎 = 5.67 × 10⁻⁸ J s⁻¹ m⁻² K⁻⁴. Near the end of its life our sun will expand to become a “red giant.” As it expands its radius R will increase at approximately 10⁻⁵ m/s while its surface temperature will decrease at roughly 10⁻¹³ K/s. Assuming it starts at its current state with R ≈ 7 × 10⁸ m and T ≈ 6000 K, how fast will its total power output change as it expands toward the red giant phase?
4.38 Consider a function F(x, y) with known derivatives 𝜕F∕𝜕x and 𝜕F∕𝜕y. Find formulas for the derivatives 𝜕F∕𝜕𝜌 and 𝜕F∕𝜕𝜙, where 𝜌 and 𝜙 are the polar coordinates.
4.39 Object A is moving with velocity v_AB relative to object B, and B is moving with velocity v_BC (in the same direction) relative to object C. According to special relativity, the velocity of A with respect to C is:
v_AC = (v_BC + v_AB)∕(1 + v_BC v_AB∕c²)
where c, the speed of light, is a constant. If v_AB = 0.7c, dv_AB∕dt = 0.1c∕s, v_BC = 0.8c, and dv_BC∕dt = −0.2c∕s, how fast is the velocity of A with respect to C changing? (If you’ve studied relativity you might wonder what reference frame t is measured in. Physically interpreting these derivatives can be tricky, but this has no effect on the calculations we’re asking for.)
The ball is moving away from the large charge at a rate dr∕dt while charge is being added to it. If you measure that the force on it is changing at a rate dF∕dt, how fast must the charge be getting added to the ball? (Assume the large charge Q stays constant.)
4.42 A can is manufactured as a right circular cylinder. The radius is r = 3′′ with a margin of error of dr = 0.01′′. The height is h = 6′′ with a margin of error of dh = 0.001′′.
[Figure: a right circular cylinder with radius r and height h labeled.]
(a) Calculate dV, the margin of error of the volume.
(b) Calculate dA, the margin of error of the surface area.
4.43 In order to measure the total force F on a particle you measure its mass and acceleration to be m ± dm and a ± da. What is the uncertainty in your measurement of F?
4.44 When an object is at a distance o from a thin lens with focal length f, it projects an image at a distance i from the lens according to the lens equation: (1∕o) + (1∕i) = (1∕f). You place an object at (3.2 ± 0.3) cm from a lens with focal length (1.7 ± 0.2) cm. Predict the image distance i, including the uncertainty in the prediction.
4.45 If you stand outside, the perceived temperature T you feel depends on the actual outside temperature u and the amount of clothing c you
4.40 The temperature at a point (x, y) is T (x, y), are wearing. However, the amount of clothing
measured in degrees Celsius. The temper- you wear depends on the outside temperature.
ature function is independent of time and (a) Draw the dependency diagram
satisfies 𝜕T ∕𝜕x(2, 3) = 4 and 𝜕T ∕𝜕y(2, 3) = 3, for this situation.
where x and y are measured in centimeters. A (b) Your dependency diagram should allow you
bug crawls so√ that its position after t seconds is to conclude that dT ∕du = (𝜕T ∕𝜕c)(dc∕du) +
given by x = 1 + t, y = 2 + t∕3. How fast is the (𝜕T ∕𝜕u). Explain the meanings of the quan-
temperature rising on the bug’s path after 3 s? tities dT ∕du and 𝜕T ∕𝜕u. Your explanation
4.41 A small metal ball with charge q at a dis- should explain how these two quantities
tance r from a large charge Q experiences are different from each other, and how
a force F = −kqQ ∕r 2 , where k is a constant. the former depends on the latter.
4.46 Explain why we wrote dx∕dt and not 𝜕x∕𝜕t on the right-hand side of Equation 4.3.4.
4.47 Consider a fly buzzing around a room. The temperature Tfly that the fly feels at any given moment depends on the fly’s position (because temperature varies throughout the room) and the time (because the temperature throughout the room is changing). The fly’s position is, of course, a function of time.
(a) Describe in terms accessible to an average 12 year old what 𝜕Tfly∕𝜕t and dTfly∕dt represent physically.
(b) Describe a situation in which dTfly∕dt is positive and 𝜕Tfly∕𝜕t is negative. Give enough detail to explain why the two derivatives have the signs they do. Your explanation should not contain any math terms like “with respect to” or “holding … constant,” but should just be a description of what’s happening in the room and what the fly is doing.
(c) Consider the function Troom(x, y, z, t). Explain why dTroom∕dt doesn’t mean anything.
(d) We use Troom and Tfly to represent two different variables. However, 𝜕Troom∕𝜕t and 𝜕Tfly∕𝜕t are equal. Why?
(e) We said in the Explanation (Section 4.3.2) that you can only calculate a total derivative df∕dx if all the branches of the dependency tree for f end with x. Draw the dependency trees for Tfly and Troom and use them to verify that dTfly∕dt is a meaningful quantity and dTroom∕dt isn’t.
4.48 In older physics textbooks it was common to talk about “relativistic mass,” which increases as an object goes faster. In such a formulation, the energy of a moving object depends on its velocity v and its relativistic mass m, while m in turn depends on v.
(a) Draw a dependency tree for the energy E.
(b) Write the chain rule for dE∕dv.
(c) For each of the derivatives in your answer to Part (b) (including dE∕dv itself) explain whether you would expect it to be positive or negative, and why.
4.49 A mountain climber feels a temperature u that depends on the climber’s height h and the time t. Of course the climber’s height also depends on the time.
(a) Draw a dependency tree for the temperature u.
(b) Write the chain rule for du∕dt.
(c) Assume the climber is ascending the mountain sometime around sunrise. For each of the derivatives in your answer to Part (b) (including du∕dt itself) explain whether you would expect it to be positive or negative, and why.
4.50 Exploration: The Meaning of Δx, dx and 𝜕x
The local cheese factory is growing. They are adding two new cheese presses every day, and each cheese press can make 300 lb of cheese per day. In this part of the problem you will be working with only three variables: Δt (time interval), Δp (change in number of cheese presses), and Δc (change in daily cheese output).
(a) How quickly is the daily cheese output increasing? (Your answer will be a number, with units.)
(b) Express the statement “they are adding two new cheese presses every day” in terms of our three variables. Hint: It isn’t just Δp = 2.
(c) Express the statement “each cheese press can make 300 lb of cheese per day” in terms of our three variables.
(d) Write an equation that shows how you got the answer to Part (a). Your equation should be based on our three variables, and should not contain any specific numbers.
Unfortunately, mice have gotten into the factory. Every day brings five more mice, and every mouse eats 1/10 lb of cheese every day. In addition to the three variables above, you will now also work with Δm.
(e) Now how quickly is the daily cheese output increasing?
(f) Write an equation that shows how you got the answer to Part (e). Your equation should be based on our four variables with no numbers. Your equation should contain Δc in three places, but it won’t mean the same thing in each place.
(g) In each place Δc appears in your equation in Part (f), what does it mean? (Answer separately for each of the three places.)
(h) In the limit as Δt → 0, all the Δ variables become d or 𝜕. Rewrite your answer to Part (f) in this case.
(i) Explain, based on this problem, why dc is a meaningful variable but 𝜕c is not.
$$\frac{an^2}{V} + PV = nRT \tag{4.4.1}$$
Here n is the number of moles of gas and a and R are constants. Suppose we change the
temperature and pressure at controlled rates dT ∕dt and dP ∕dt and we want to know how
quickly the volume will change in response.
2 You may recognize this as a variation of the more familiar “ideal gas law” PV = nRT . This equation takes into
account some surface effects that are neglected by the ideal gas law. This is a simplified version of the “van der
Waals” equation, which you will use in Problem 4.72.
One approach to this problem would be to first solve Equation 4.4.1 for the volume, and
then find dV ∕dt with the multivariate chain rule. However, that first step can be messy (as it
is in this case) or impossible. We therefore find dV ∕dt without first solving for V , a process
called “implicit differentiation”: we take the derivative of both sides of the equation with
respect to time.
The right-hand side of Equation 4.4.1 is simply a constant times the function T (t), so its
derivative is nR(dT ∕dt). To take the derivative of the left-hand side we have to use the chain
rule and the product rule.
$$-\frac{an^2}{V^2}\frac{dV}{dt} + P\frac{dV}{dt} + V\frac{dP}{dt} = nR\frac{dT}{dt}$$
Then we solve for dV∕dt to get
$$\frac{dV}{dt} = \frac{nR\,(dT/dt) - V\,(dP/dt)}{P - (an^2/V^2)}$$
If you’ve never seen this process before, those derivatives are likely to throw you. Why isn’t
the derivative of 1∕V just −1∕V²? The answer is that we are taking the derivative with respect
to time. To see why that matters, it may help to consider a few specific functions.
∙ The derivative of 1∕(ln t) is −1∕(ln t)² × (1∕t).
∙ The derivative of 1∕(sin t) is −1∕(sin² t) × (cos t).
∙ The derivative of 1∕(3t² + 5t + 12) is −1∕(3t² + 5t + 12)² × (6t + 5).
Do you see the pattern? For any function f(t), the derivative of 1∕f is −(1∕f²) × (df∕dt). We
applied the chain rule as always, but in this case we applied it to a function V (t) that we don’t
know, so when we did the step that says “multiply by the derivative of what’s inside” we wrote
dV ∕dt instead of some specific function of t.
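One way to build confidence in the implicit-differentiation result is to test it on a case where the answer is known in advance. The sketch below is only an illustration: the constants and the time histories V(t) and T(t) are made-up values, not from the text. We pick V(t) so that its true rate of change is 0.5, compute the pressure that Equation 4.4.1 then requires, and check that the formula for dV∕dt recovers 0.5.

```python
# Assumed illustrative constants; none of these numbers come from the text.
n, R, a = 1.0, 8.314, 0.1

def V(t):
    """Chosen volume history, so the true dV/dt is 0.5."""
    return 2.0 + 0.5 * t

def T(t):
    """Chosen temperature history, so dT/dt is 10."""
    return 300.0 + 10.0 * t

def P(t):
    """Pressure required by Equation 4.4.1: a n^2 / V + P V = n R T."""
    return (n * R * T(t) - a * n**2 / V(t)) / V(t)

def ddt(f, t, h=1e-6):
    """Central-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2 * h)

def dV_dt_formula(t):
    """dV/dt from the implicit-differentiation result."""
    numerator = n * R * ddt(T, t) - V(t) * ddt(P, t)
    denominator = P(t) - a * n**2 / V(t)**2
    return numerator / denominator

t0 = 1.0
print(dV_dt_formula(t0))   # should be close to the true rate, 0.5
```

Notice that the check never solves Equation 4.4.1 for V; it only uses the rates dT∕dt and dP∕dt, exactly as the derivation does.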
Problem:
The quantities x(t), y(t), and z(t) are related by e^(xy∕z²) = 2. Find ż in terms of ẋ and ẏ.
(Remember that a dot means “derivative with respect to time.”)
Solution:
Take the derivative of both sides with respect to t. First use the chain rule to take the
derivative of the exponential times the derivative of what’s inside. Then use the
quotient rule on the fraction. Finally, use the product rule for the derivative of the
numerator. That sounds like a lot of steps, but each one is straightforward.
$$e^{xy/z^2}\left(\frac{z^2\,(x\dot{y} + y\dot{x}) - 2xyz\,\dot{z}}{z^4}\right) = 0$$
This can only be satisfied if the numerator of the fraction is zero. That gives us a
simple equation we can solve for ż.
$$\dot{z} = \frac{z}{2xy}\,(x\dot{y} + y\dot{x})$$
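The result for ż can be sanity-checked numerically. In the sketch below the functions x(t) and y(t) are arbitrary choices made for illustration (they are not from the text); z(t) is then fixed by the constraint, which is equivalent to z² = xy∕ln 2. The formula for ż is compared against a direct finite-difference derivative of z(t).

```python
import math

def x(t):
    return 1.0 + t          # arbitrary illustrative choice

def y(t):
    return 2.0 + t**2       # arbitrary illustrative choice

def z(t):
    """Positive root of e^(xy/z^2) = 2, i.e. z^2 = xy / ln 2."""
    return math.sqrt(x(t) * y(t) / math.log(2))

def ddt(f, t, h=1e-6):
    """Central-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2 * h)

t0 = 0.5
x_dot, y_dot = ddt(x, t0), ddt(y, t0)

# Formula from implicit differentiation: z_dot = z/(2xy) * (x*y_dot + y*x_dot)
z_dot_formula = z(t0) / (2 * x(t0) * y(t0)) * (x(t0) * y_dot + y(t0) * x_dot)
z_dot_numeric = ddt(z, t0)

print(z_dot_formula, z_dot_numeric)   # the two values should agree closely
```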
4.61 A curve is defined by the relation y³ + y = e^x + x.
(a) Use implicit differentiation to find the slope of this curve. Call the resulting function m(x, y).
(b) Is m(x, y) positive or negative? Explain in words what this tells us about the shape of the curve.
4.68 [This problem depends on Problem 4.67.] Find the derivative of y = cos⁻¹ x by following the same steps you took in Problem 4.67.
4.69 [This problem depends on Problem 4.67.] Find the derivative of y = tan⁻¹ x by following the same steps you took in Problem 4.67. You will need to know that the derivative of tan x is sec² x, and the identity tan² x + 1 = sec² x.
(a) Assuming dP∕dt and dT∕dt are known, find a formula for dV∕dt.
(b) The Explanation (Section 4.4.2) solved this problem for the case b = 0. Check your answer by verifying that it reduces to our answer in this case.
4.73 When an object is placed near a thin magnifying glass it can project an image onto a piece of paper on the other side of the glass. The distance from the glass to the object o and the distance from the glass to the image i are related by the “thin lens formula.”
$$\frac{1}{f} = \frac{1}{o} + \frac{1}{i}$$
Here f is the “focal length” of the magnifying glass. Assume a magnifying glass has focal length 3 cm and you are holding a piece of paper at a distance of 8 cm from it. If the object is moving away from the glass at 2 cm/s how fast do you need to move the piece of paper away to keep the image in focus on the paper?
(a) First answer this question by using implicit differentiation to take the derivative of the thin lens formula with respect to time.
(b) Next answer by solving the thin lens formula for i and then taking the derivative directly. Make sure you get the same answer both ways.
4.74 An economic rule of thumb states that to maximize profit you should set the price P for a product so that it is related to the “marginal cost” M (cost to produce one more unit) and the “price elasticity of demand” E (a measure of how much demand drops as the price increases) according to the following equation.
$$\frac{P - M}{P} = -\frac{1}{E}$$
(Note that E is a negative number; otherwise this formula would make no sense.) Assume you are selling widgets that have a marginal cost of $2 and you have found the optimal price to be $4. If your marginal cost is going up 50 cents every month and your market research shows that the price elasticity of demand is decreasing (becoming more negative) at a rate of 0.2 per month, how fast should you change your price to keep it at the optimal level?
4.75 The Cobb–Douglas production function models the production of a commodity with the equation Y = AL^𝛼 K^𝛽 where Y is the total value of the commodity produced, L and K are the input of labor and capital respectively, and A, 𝛼, and 𝛽 are constants related to overall productivity (e.g., the technology level).
(a) If you want to increase production at a specified rate dY∕dt, but the available labor is decreasing at a rate dL∕dt, how fast must you increase the capital input?
(b) In the previous part you took A, 𝛼, and 𝛽 to be constants. Now suppose that increased technology is improving the productivity of labor at a rate d𝛼∕dt. Still treating A and 𝛽 as constants, how fast must you now increase capital input to keep productivity rising at the desired rate dY∕dt?
4.76 Exploration: A Formula for Implicit Differentiation. Any implicit differentiation problem with two variables begins with an equation relating those two variables. This equation can always be expressed in the form F(x, y) = k where k is a constant. By solving that generic equation for dy∕dx, you derive a formula for the derivative of any such function.
The dependency tree in this situation looks like this:
[Figure: dependency tree. F depends on x and y; y in turn depends on x.]
(a) Starting with the equation F(x, y) = k, take the derivative of both sides with respect to x. Use the dependency tree drawn above to expand the left side of your result.
(b) Solve the resulting equation for dy∕dx.
(c) Use the formula you found in Part (b) to find the slope of the circle x² + y² = 25.
(d) If you haven’t already solved Problem 4.58 use implicit differentiation directly on the formula x² + y² = 25 to find dy∕dx and check that it matches your answer to Part (c).
[Figure: a surface z(x, y) plotted over the xy-plane, with x and y each running from 0 to 3 and z from 0 to 5.]
In all the questions that follow, assume that you begin at the point (1, 1, 0). As you change your
x- and y-coordinates, your z-coordinate changes to keep you on the surface.
Parts 1–4 can be answered exactly using partial derivatives. Make sure, however, that your
answers make sense with the picture.
1. If you move in the positive x-direction, are you moving up or down? With what slope?
2. If you move in the negative x-direction, are you moving up or down? With what
slope?
3. If you move in the positive y-direction, are you moving up or down? With what slope?
4. If you move in the negative y-direction, are you moving up or down? With what slope?
See Check Yourself #22 in Appendix L
Parts 5–8 should be answered approximately, based on the picture.
5. Now suppose you move diagonally, following the vector î + ĵ along the xy-plane (but
allowing z to change as always so you stay on the surface). Are you moving up or down?
With approximately what slope?
6. Suppose you follow the vector î − ĵ along the xy-plane. Are you moving up or down?
With approximately what slope?
7. What direction would you move along the xy-plane if you wanted to go upward
as steeply as possible? (In Section 4.6 you’ll learn how to use something called a
“gradient” to easily calculate this.)
8. If you placed a ball on this surface at the point (1, 1, 0), which way would it roll?
As always, don’t let the smooth-sounding phrase “rate of change” pass you by too quickly.
We might say, somewhat loosely, that Du⃗ f gives the amount that f will change if you move by
precisely one unit in the u⃗-direction. It would be more accurate to say that if you take a small
step (magnitude ds) in the u⃗-direction, Du⃗ f represents the change in f per unit change in
position, df∕ds. In the limit as ds → 0, this ratio becomes the actual directional derivative.
This can be expressed in the following equation, where û represents a unit vector in the
direction of u⃗.
$$D_{\vec{u}} f(\vec{r}) = \lim_{h \to 0} \frac{f(\vec{r} + h\,\hat{u}) - f(\vec{r})}{h}$$
The partial derivatives we have seen are special cases of directional derivatives. For instance,
for the direction u⃗ = î, Du⃗ f = 𝜕f∕𝜕x. But a scalar field has an infinite number of directional
derivatives—not just two or three pointing along the coordinate axes. If you look in a partic-
ular direction u⃗ and see a constant derivative Du⃗ f = 3, that means the temperature is rising
at a rate of 3 degrees for each unit step you take in that direction.
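The limit definition translates almost directly into a numerical approximation: take a small step h along the unit vector and divide the change in f by h. The sketch below is a minimal illustration under our own assumptions; the field f and the point are hypothetical, chosen only to show that the x-direction reproduces 𝜕f∕𝜕x while other directions give different rates.

```python
import math

def directional_derivative(f, r, u, h=1e-6):
    """Approximate D_u f at point r using the limit definition with a small step h."""
    norm = math.sqrt(sum(c * c for c in u))
    u_hat = [c / norm for c in u]                        # unit vector in the direction of u
    stepped = [ri + h * ui for ri, ui in zip(r, u_hat)]  # the point r + h * u_hat
    return (f(*stepped) - f(*r)) / h

def f(x, y, z):
    """Hypothetical scalar field used only for illustration."""
    return x**2 * y + z

point = (1.0, 2.0, 3.0)
along_x = directional_derivative(f, point, (1, 0, 0))    # approximates df/dx = 2xy = 4
diagonal = directional_derivative(f, point, (1, 1, 0))   # a direction between the x- and y-axes
print(along_x, diagonal)
```

Because the function normalizes u before stepping, passing (2, 0, 0) instead of (1, 0, 0) gives the same answer: the directional derivative depends on the direction, not the length, of u⃗.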
Question: Consider the scalar field f with f(3, 2, 7) = −4. For the direction vector
u⃗ = 3î + 4k̂, Du⃗ f = 2. Estimate f(6, 2, 11) and explain why your answer is only
approximate.
Answer:
The journey from (3, 2, 7) to (6, 2, 11) is taken in the direction of u⃗, so we have the
derivative we need. A derivative of 2 means that f is increasing by 2 units per unit
traveled; since we travel 5 units, f will increase by a total of 10. We estimate that
f(6, 2, 11) ≈ 6.
This answer is approximate because it assumes that Du⃗ f = 2 throughout the
journey. For most functions, the directional derivative changes as you move from one
point to another. Approximations of this sort work well over short distances; for
longer distances we need a “line integral” (Chapter 5) to add up all the small
displacements.
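Written out as a single line, the estimate is a one-step linear approximation. Here Δs is our own shorthand for the distance traveled; it does not appear in the original example.

```latex
\Delta s = \sqrt{(6-3)^2 + (2-2)^2 + (11-7)^2} = 5
\qquad\Longrightarrow\qquad
f(6,2,11) \approx f(3,2,7) + (D_{\vec{u}} f)\,\Delta s = -4 + (2)(5) = 6
```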
You’ll show in Problem 4.80 why this formula only works when û is a unit vector. You’ll
show in Problem 4.81 that this formula is really another version of the chain rule.
Question: Find the derivative of the function f(x, y, z) = x⁴y∕z at the point (3, 2, 1) in
the direction u⃗ = 5î − 12k̂.
Answer:
We begin by “normalizing” u⃗: |u⃗| = 13, so û = (5∕13)î − (12∕13)k̂.
𝜕f∕𝜕x = 4x³y∕z, so 𝜕f∕𝜕x(3, 2, 1) = 216. Similarly, 𝜕f∕𝜕y(3, 2, 1) = 81 and
𝜕f∕𝜕z(3, 2, 1) = −162.
So Du⃗ f(3, 2, 1) = (216)(5∕13) + (81)(0) + (−162)(−12∕13) ≈ 232.6154.
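The arithmetic in this answer is easy to reproduce in code. The sketch below (helper names are our own) normalizes the direction vector and then sums each partial derivative times the matching component of the unit vector, just as the answer does by hand.

```python
import math

def partials(x, y, z):
    """Analytic partial derivatives of f(x, y, z) = x^4 y / z."""
    return (4 * x**3 * y / z, x**4 / z, -x**4 * y / z**2)

def directional_derivative(point, u):
    """Sum of each partial derivative times the matching component of the unit vector."""
    norm = math.sqrt(sum(c * c for c in u))
    u_hat = [c / norm for c in u]
    return sum(p * c for p, c in zip(partials(*point), u_hat))

result = directional_derivative((3, 2, 1), (5, 0, -12))
print(round(result, 4))  # 232.6154
```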
The point of this exercise is not simply that “you were taught to think of the derivative as
a slope and you still can,” and it is certainly not that “finding the slopes of surfaces in various
directions has tremendous real-world importance.” The point is to engage your visual mind
in understanding the math. It’s difficult to imagine looking in a certain direction and “seeing
the temperature go up”; it’s comparatively easy to imagine a mountain path rising up in front
of you.
If you visualize surfaces in this manner, you can solve some problems without doing any
calculations at all. For instance, suppose that for a particular surface S at a particular point
(x₀, y₀), the derivative Du⃗ z in a particular direction u⃗ is 3. What can you say about D−u⃗ z?
A picture here is worth a thousand calculations: if you are looking up a slope of 3 in one
direction, then turning around 180° you will see a slope of −3.
(a) If you set out in the positive x-direction, will you be hiking up or sliding down?
(b) If you set out in the positive y-direction, will you be hiking up or sliding down?
(c) If you set out at a 45° angle between the positive x- and y-directions, will you be hiking up or sliding down?
(d) Find an angle between the positive x- and y-directions such that, if you walk in that direction, you will be following a “level curve,” momentarily moving neither up nor down.
4.85 Standing at the origin on the plane 4x + 6y − 2z = 0, you begin to climb. The positive z-axis points straight up but you cannot move directly that way: you can travel in a direction along the x-axis, along the y-axis, or along the direction y = x, always staying on the plane 4x + 6y − 2z = 0. Order these three climbs from most to least steep.
4.86 You are in a space station that is being flooded with poison gas. The concentration of the gas is given by e^(−x²) y². You are currently at the point (1, 1) and there are hallways leading off in the directions of the lines y = x, y = −x, and the x- and y-axes. Of the eight possible ways you could go (along any of the four lines in either direction), which way will reduce the concentration of poison gas you are breathing the fastest?
4.87 Exploration: The Direction of Fastest Increase
Survey teams have determined that the density of gold deposits in your area roughly follows the function 𝜌(x, y) = sin(𝜋x∕2) sin(𝜋y²∕2) e^(−x² + y²). Your team is currently digging at the position (1, 1) and tomorrow you want to start hiking in the direction in which the density will increase the fastest.
(a) For a given direction 𝜙 (measured counterclockwise from the positive x-axis), find the unit vector û pointing in the direction of 𝜙. Hint: The answer is not related to your current position.
(b) Find the directional derivative Dû 𝜌 in the direction of the angle 𝜙. Your answer should be a function of x, y, and 𝜙.
(c) Evaluate Dû 𝜌 at the point (1, 1). The result should be a function of 𝜙.
(d) Using the result you found for Dû 𝜌, find the direction your team should hike in so as to increase the density of gold as fast as possible, and the directional derivative in that direction.
In the next section you will learn an easier way to solve problems like this.
5. Repeat the steps above. Find the directional derivative of f in the direction î + c ĵ
and use that to find the direction in which f increases the fastest. Once again,
express your final answer as a unit vector û that points in the direction of fastest
increase of f .
6. What is the rate of change of f in the direction you just found? In other words, what
is the fastest rate of increase that f can have in any direction?
7. Multiply the vector you found in Part 5 by the answer you found in Part 6.
Your final result was a vector that points in the direction in which f increases the fastest
and whose magnitude is the rate of change of f in that direction. That vector is called the
“gradient” of f . We explore its properties in this section.
$$D_{\hat{u}} f = \frac{\partial f}{\partial x}u_x + \frac{\partial f}{\partial y}u_y + \frac{\partial f}{\partial z}u_z$$
The left-hand vector in this dot product, made up of all the partial derivatives of f, is called
the gradient and is generally written ∇⃗f. (You’ll sometimes see grad(f), and you’ll also sometimes
see ∇f without the vector sign.)
$$\vec{\nabla} f = \frac{\partial f}{\partial x}\hat{i} + \frac{\partial f}{\partial y}\hat{j} + \frac{\partial f}{\partial z}\hat{k} \tag{4.6.1}$$
Using the gradient, we can define the directional derivative of f in the direction of a unit vector û
more concisely.
$$D_{\hat{u}} f = \vec{\nabla} f \cdot \hat{u} \tag{4.6.2}$$
Question: Find the gradient of the function f(x, y, z) = x⁴y∕z at the point (3, 2, 1), and
use it to find the derivative in the direction u⃗ = 5î − 12k̂.
Answer:
$$\vec{\nabla} f = \frac{\partial f}{\partial x}\hat{i} + \frac{\partial f}{\partial y}\hat{j} + \frac{\partial f}{\partial z}\hat{k} = \frac{4x^3 y}{z}\hat{i} + \frac{x^4}{z}\hat{j} - \frac{x^4 y}{z^2}\hat{k}$$
At the point (3, 2, 1), the gradient is the vector 216î + 81ĵ − 162k̂. Note that this
calculation has nothing to do with any particular direction u⃗; it is a property of f at
that point that can be used to find the derivative in any direction.
For this particular direction, û = (5∕13)î − (12∕13)k̂, so
Du⃗ f(3, 2, 1) = (216)(5∕13) + (81)(0) + (−162)(−12∕13) ≈ 232.6154.
This is a good place to stop and make sure you’ve followed everything. We began with a
mathematical definition of the gradient—a minor change in notation around our already-
existing idea of a directional derivative. From there, we have arrived at an interpretation of
what the gradient means. If you calculate that for a particular field at a particular point the
gradient has a magnitude of 7 and points in the direction î + 5ĵ − 3k̂, can you explain what
that means about the field at that point? Can you explain how that physical interpretation
comes from the math?
Question: For the surface described by z(x, y) = e^(x−y) at the point (1, 1), what is the
gradient and what does it mean?
Answer:
∇⃗z = e^(x−y) î − e^(x−y) ĵ, so ∇⃗z(1, 1) = î − ĵ. This indicates that if you want to move up as
quickly as possible you should move at a 45° angle South of East, and that in that
direction the surface will rise with a slope of √2.
The closer your angle is to that particular direction, the more steeply uphill you are
looking. If you let go of a ball at this point on the surface, it would roll along the
steepest decline, which is always −∇⃗z, or −î + ĵ in this case.
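The claim that the gradient picks out the steepest direction can be tested by brute force. The sketch below is an illustration with our own helper names: it computes the slope of z = e^(x−y) at (1, 1) for every whole-degree compass direction (angles measured counterclockwise from East) and reports the steepest one.

```python
import math

def grad_z(x, y):
    """Gradient of z(x, y) = exp(x - y)."""
    e = math.exp(x - y)
    return (e, -e)

def slope(x, y, angle_deg):
    """Directional derivative at (x, y) along an angle measured counterclockwise from East."""
    gx, gy = grad_z(x, y)
    a = math.radians(angle_deg)
    return gx * math.cos(a) + gy * math.sin(a)

# Try every whole-degree direction and keep the steepest one.
best_angle = max(range(360), key=lambda deg: slope(1.0, 1.0, deg))
best_slope = slope(1.0, 1.0, best_angle)
print(best_angle, best_slope)   # 315 degrees (45 degrees South of East), slope near sqrt(2)
```

The search lands on 315°, which is 45° South of East, and the slope there matches the magnitude of the gradient, √2. Turning around by 180° gives the steepest descent with slope −√2.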
The gradient
Question: On that same surface, what is Du⃗ z(1, 1) for u⃗ pointing 60° North of East,
and what does it mean?
Answer:
We begin with a quick drawing and a bit of trig to find that û = (1∕2)î + (√3∕2)ĵ.
[Figure: a unit vector of length 1 at 60° above the x-axis, with components ux and uy.]
$$D_{\hat{u}} z(1, 1) = \left(\hat{i} - \hat{j}\right) \cdot \left(\frac{1}{2}\hat{i} + \frac{\sqrt{3}}{2}\hat{j}\right) = \frac{1 - \sqrt{3}}{2}$$
If you take a small step of horizontal distance ds along that 60° angle, your height
(z-coordinate) will go down by roughly 0.366 ds. This makes sense if you look at the
relative contributions of dx and dy. As x increases, z increases. As y increases, z
decreases. The total dz is negative because, at a 60° angle, y is changing faster than x.
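The “relative contributions” argument can be made explicit with the differential of z at the point (1, 1), where 𝜕z∕𝜕x = 1 and 𝜕z∕𝜕y = −1. The worked line below is our own expansion of the reasoning in the answer.

```latex
dz \approx \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy
   = (1)(ds\cos 60^\circ) + (-1)(ds\sin 60^\circ)
   = \left(\frac{1}{2} - \frac{\sqrt{3}}{2}\right) ds \approx -0.366\,ds
```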
A final word of warning is in order. When we talk about the gradient as the slope of a
surface, we are talking about a function with two independent variables, x and y. The gradient
points in the direction of steepest increase, but the gradient does not point up the hill; the
gradient has no z-component at all! The gradient points in the direction in the xy-plane that
you would travel to obtain the steepest incline.
For a function f (x, y) the gradient points horizontally, not “up the hill.”
[Figure: contour plot in the xy-plane with labeled level curves.]
4.95 Consider the function f(x, y) = x² + y².
(a) Calculate the gradient of this function at the following five points: (0, 0), (0, 1), (0, −1), (1, 0), and (−1, 0).
(b) What might this function look like based on your five gradients?
4.96 Consider the function f(x, y) = x² − y².
(a) Calculate the gradient of this function at the following five points: (0, 0), (0, 1), (0, −1), (1, 0), and (−1, 0).
(b) What might this function look like based on your five gradients?
4.97 The function z(x, y) at the point (0, 0) has a gradient 4î + 6ĵ.
(a) What is the derivative in the direction pointing toward (2, 3)?
(b) What is the derivative in the direction pointing toward (−2, −3)?
(c) If you put a ball down on this surface at the point (0, 0), which way would it roll?
(d) Find a unit vector (magnitude of 1) for which Dû = 0.
(e) For how many other unit vectors does Dû = 0?
4.98 Each part below gives information about a function of x and y. Describe each function. Your answer may include a description in words, a 3-D drawing, or anything else that will help understand the function. Suggesting a possible function f(x, y) may be part of your answer, but it is not a complete answer.
(a) For function f(x, y), the gradient ∇⃗f = 2î everywhere.
(b) For function g(x, y), the gradient ∇⃗g = x î everywhere. (So at all points where x = 2, the gradient is 2î and so on.)
(c) For function h(x, y), the gradient ∇⃗h = −𝜌 𝜌̂ everywhere, where 𝜌 is the distance from the z-axis and 𝜌̂ points directly away from the z-axis at each point.
4.99 The figure below shows a hemisphere of radius 5. The point (0, 4, 3) on that hemisphere is indicated.
[Figure: a hemisphere drawn on x-, y-, and z-axes, with the point (0, 4, 3) marked.]
(a) One way to mathematically represent that hemisphere is z = f(x, y) where f(x, y) = √(25 − x² − y²). Calculate ∇⃗f at the indicated point and describe in words which way it points.
(b) Another way to mathematically represent that hemisphere is with the equation g(x, y, z) = 25 where g(x, y, z) = x² + y² + z². Calculate ∇⃗g at the indicated point and describe in words which way it points.
(c) Neither of your above answers seems to point “up the hill” in the drawing. Explain how both of them follow the rule that the gradient always points in the direction in which the function increases most steeply.
4.100 You stand in a valley described by the equation z = x² − xy − 2y at the point (2, 1, 0).
(a) If you put a ball down at your feet, in what direction will the ball roll? (Your answer will be a vector in three-space.)
(b) If you want to walk a level path, in what direction should you walk? (Your answer will be a vector with only x- and y-components.)
4.101 A surface fluctuates according to the equation z = cos(xt) + sin(yt). A ball is to be placed at the point (1, 1, cos t + sin t) at some time t. The descriptions below are about the direction the ball will roll in x and y, assuming its vertical motion will be whatever is needed to keep it on the surface. At what time t could the ball be placed…
(a) …if you want the ball to begin rolling straight toward the z-axis?
(b) …if you want the ball to roll in the +x-direction?
4.102 The magnitude of the gravitational force between two objects is given by F = Gm₁m₂∕r² where G is a constant and the other three letters are variables.
(a) Find the gradient ∇⃗F(m₁, m₂, r).
(b) Based on your answer to Part (a), which of the three variables should increase, and which should decrease, if you set out to increase the gravitational force?
4.103 One cause of air flow is the so-called “pressure gradient force” given by the formula F⃗ = −k∇⃗P where P is the air pressure and k is a positive constant. Explain why it makes sense to expect air to move in the direction of this vector.
4.104 Cecelia the depth-seeking crab is on a lake bed shaped like z = x²∕4 − y∕10 + 10. Assuming
Cecelia starts at the point (4, 10, 13) and always crawls to lower depths as quickly as possible, describe with words and pictures the path she will follow. (You do not need to give a mathematical equation that expresses her path.)

Electrostatic theory is based on a scalar field called “electric potential” V(x, y, z). Problems 4.105–4.108 require you to know the following facts about electric potential.
∙ The proximity of positive charges increases the electric potential; the proximity of negative charges decreases the potential.
∙ The electric field is computed as E⃗ = −∇⃗V.
∙ Positively charged particles tend to flow in the direction given by the electric field.

4.105 The electric potential in a region of space is given by V = x² + y² − z². If a positive charge were placed at the point (1, 2, 3) what direction would it move in? (Your answer should be a vector, but you don’t have to write it as a unit vector.)
4.106 In the presence of a single positively charged particle located at the origin, the potential field is given by V = k∕√(x² + y² + z²) where k is a positive constant.
(a) Calculate the electric field E⃗ = −∇⃗V.
(b) Use your answer to Part (a) to predict the motion of a positively charged particle that starts at rest in this field.
(c) In the presence of a single negatively charged particle at the origin, the potential field is given by V = −k∕√(x² + y² + z²). How does this change your answers?
4.107 An “electric dipole” consists of two equal and opposite charges. If a dipole in two dimensions consists of a positive charge at position (x₀, 0) and a negative charge at position (−x₀, 0) then the potential field produced by the dipole is
$$V = \frac{k}{\sqrt{(x - x_0)^2 + y^2}} - \frac{k}{\sqrt{(x + x_0)^2 + y^2}}$$
(a) Calculate the electric field E⃗ = −∇⃗V.
(b) A positive charge feels a force pushing it in the direction of the electric field. If you place a positive charge on the positive x-axis at x > x₀, in what direction will it be pushed?
(c) If you place a positive charge at position (5x₀, 2x₀, 0), in what direction will it be
4.108 A “conductor” is a material with free electrons that can easily move in response to electric fields. If you generate an electric field near a conductor the electrons will move in response, thus changing their own electric fields, until the electric field within the conductor is zero. This happens so quickly that for most purposes you can assume that the electric field is always zero inside a conductor.
(a) Explain why every conductor is an “equipotential,” meaning a region of constant potential.
(b) A thin metal sheet is in the shape of a spherical shell: x² + y² + z² = R². Knowing that the entire surface of the sphere has a constant potential, in what direction must the electric field point at (R, 0, 0)? (There are actually two possible directions.)
4.109 Exploration: An Alternative Basis for Directional Derivatives
For the function f(x, y) at the point (2, 3), the derivative in the direction of the vector 3î + 4ĵ is 2, and the derivative in the direction of the vector 5î + 12ĵ is −2.
(a) Find 𝜕f∕𝜕x and 𝜕f∕𝜕y at this point.
(b) Give one possibility for what the function f(x, y) might be.
(c) If this function were a hill, and you released a ball at the point (2, 3), which way would the ball initially roll?
We are used to starting from the derivatives in the x- and y-directions, and using those to figure out the derivatives in other directions. In this case we started with two other directional derivatives but got the same information. In general, the derivatives of f(x, y) in almost any two directions are enough to find them in all other directions.
(d) Let û = a î + b ĵ and v̂ = c î + d ĵ be two unit vectors and let the directional derivatives of f at a point be Dû f = 𝛼 and Dv̂ f = 𝛽. Find 𝜕f∕𝜕x and 𝜕f∕𝜕y at that point in terms of a, b, c, d, 𝛼, and 𝛽.
(e) We said that the derivatives in the directions of almost any two vectors are good enough to find all others. What would have to be true of the vectors û and v̂ for you to be unable to find 𝜕f∕𝜕x and 𝜕f∕𝜕y from their directional derivatives? Answer this twice. First, give an algebraic answer based on your calculations, and then say what that would imply geo-
pushed? (Give your answer as an angle.) metrically about the two vectors.
4.110 [This problem depends on Problem 4.109, and on some knowledge of linear algebra.] Generalizing your answer to Problem 4.109, what would have to be true of n vectors in an n-dimensional space if the derivatives of f along those directions weren’t sufficient for you to know the derivatives in all directions?
y = f (x)
∙ A “critical point” is defined as a point where the derivative is either zero or undefined.
∙ All local minima and maxima will occur at critical points.
∙ A continuous function on a closed interval will achieve an absolute maximum and
an absolute minimum. These will occur at critical points or at the endpoints of the
interval.
As an example, consider the picture above. On the closed interval [a, b] this function attains
an absolute maximum and an absolute minimum. On the open interval (a, b) it attains an
absolute maximum but does not attain an absolute minimum.
Before we extend these rules to higher dimensions, let’s define our terms. We present
these definitions in terms of a function of two variables f (x, y) but they extend naturally to
more (or fewer) variables.
This “constrained” volume function V (L, H ) will attain a maximum. We begin by taking its
gradient.
\[\vec{\nabla}V = \frac{H^2(40 - 3L^2 - 12LH)}{(L + 2H)^2}\,\hat{L} + \frac{2L^2(10 - 3H^2 - 3LH)}{(L + 2H)^2}\,\hat{H}\]
Recall that a critical point is one where the gradient is zero or undefined. We would therefore have to consider all the points with L + 2H = 0 as critical points, but we can ignore them because the lengths must be positive quantities. We ignore \(H^2 = 0\) and \(L^2 = 0\) for the same reason. We are therefore left finding critical points where \(40 - 3L^2 - 12LH = 0\) and \(10 - 3H^2 - 3LH = 0\). Solving these two equations simultaneously leads to \(H = \sqrt{10}/3\) and \(L = 2\sqrt{10}/3\). We can use these values to find \(W = \sqrt{10}\), which gives a volume of \(V = 20\sqrt{10}/9\).
So we have found a critical point: what does it tell us about our box? If we were just using
mathematical brute force, we could use the “second derivatives test” described below to check
if this is a minimum or a maximum. Then we would also have to check the boundaries L = 0,
W = 0, and H = 0. In this case those steps are unnecessary because we’ve already argued that
the volume reaches a maximum and no minimum, and clearly the volume would be zero if any of the dimensions were zero, so we conclude that \(V = 20\sqrt{10}/9\ \text{ft}^3\) is the largest possible box we can build for $20.
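The critical point can be checked numerically. The sketch below assumes that W was eliminated by solving the cost constraint LW + 2HW + 3LH = 20 for W; the function names are illustrative and not from the text.

```python
import math

def width(L, H):
    """W from the cost constraint LW + 2HW + 3LH = 20 (assumed elimination step)."""
    return (20 - 3 * L * H) / (L + 2 * H)

def volume(L, H):
    """Constrained volume V(L, H) = L * W(L, H) * H."""
    return L * width(L, H) * H

def gradient(L, H):
    """The two components of the gradient of V, as written in the text."""
    denom = (L + 2 * H) ** 2
    dV_dL = H**2 * (40 - 3 * L**2 - 12 * L * H) / denom
    dV_dH = 2 * L**2 * (10 - 3 * H**2 - 3 * L * H) / denom
    return dV_dL, dV_dH

H = math.sqrt(10) / 3
L = 2 * math.sqrt(10) / 3
W = width(L, H)

print("gradient:", gradient(L, H))   # both components should be ~0
print("W:", W, "V:", volume(L, H))   # compare with sqrt(10) and 20*sqrt(10)/9
```

Both gradient components vanish (up to rounding) at these dimensions, which is what makes the point critical.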
So within this region the function attains an absolute minimum of 5 at (1, 2) and an
absolute maximum of 22 at (0, 6).
∙ If D(a, b) > 0 and \(\partial^2 f/\partial x^2\,(a, b) > 0\) then f(x, y) attains a local minimum at (a, b).
∙ If D(a, b) > 0 and \(\partial^2 f/\partial x^2\,(a, b) < 0\) then f(x, y) attains a local maximum at (a, b).
∙ If D(a, b) < 0 then f(x, y) attains neither a minimum nor a maximum at (a, b).
The second derivatives test is far from obvious. A deep understanding of why it works
begins by recognizing that D is the determinant of a matrix called the “Hessian.”
Problem:
Find and classify the critical points of f (x, y) = x 3 + 2y3 − 6x 2 y − 60x.
Solution:
We begin by finding the critical points.
\[\vec{\nabla}f = (3x^2 - 12xy - 60)\,\hat{\imath} + (6y^2 - 6x^2)\,\hat{\jmath} = 3(x^2 - 4xy - 20)\,\hat{\imath} + 6(y^2 - x^2)\,\hat{\jmath}\]
This gradient is never undefined, so all critical points will occur at points where \(\vec{\nabla}f = \vec{0}\). Setting \(y^2 - x^2 = 0\) yields y = ±x. Plugging y = x into \(x^2 - 4xy - 20 = 0\) gives no real answers, but plugging y = −x into the same equation gives x = ±2.
Critical points: (2, −2) and (−2, 2)
To classify these points we begin by finding D.
Plugging in our two critical points we find that D(2, −2) < 0 and D(−2, 2) < 0. We
conclude that this function has no local maxima or minima anywhere!
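The classification can be reproduced in a few lines. This sketch assumes the standard definition D = f_xx f_yy − (f_xy)², the determinant of the Hessian; the second partial derivatives were computed by hand from f(x, y) = x³ + 2y³ − 6x²y − 60x, and the helper names are illustrative.

```python
def second_partials(x, y):
    """Second partial derivatives of f(x, y) = x^3 + 2y^3 - 6x^2*y - 60x."""
    f_xx = 6 * x - 12 * y
    f_yy = 12 * y
    f_xy = -12 * x
    return f_xx, f_yy, f_xy

def classify(x, y):
    """Apply the second derivatives test at a critical point (x, y)."""
    f_xx, f_yy, f_xy = second_partials(x, y)
    D = f_xx * f_yy - f_xy**2   # assumed definition: determinant of the Hessian
    if D > 0:
        kind = "local minimum" if f_xx > 0 else "local maximum"
    elif D < 0:
        kind = "neither (saddle point)"
    else:
        kind = "test inconclusive"
    return D, kind

for point in [(2, -2), (-2, 2)]:
    print(point, classify(*point))
```

Under that definition D is negative at both critical points, in agreement with the conclusion above.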
[Figure: surface plot of f over the x-y plane, with x and y each running from −4 to 4.]
The plot confirms that the two critical points are both saddle points.
Stepping Back
A typical optimization problem involves one objective function that you want to optimize
and any number of constraints on the independent variables. The constraints come in two
broad categories: inequalities and equations.
An inequality defines a boundary to the domain on which you are trying to optimize the
objective function. In the “optimization on a bounded region” problem above, the variables
have to satisfy x ≥ 0, y ≥ 0, and y ≤ 6 − 2x. These three inequalities restrict f (x, y) to the tri-
angular domain shown at the beginning of that problem. In the storage chest example, the
inequalities L > 0, W > 0, and H > 0 restrict the domain to the first octant of the 3D region
(L, W , H ).
4.133 [This problem depends on Problem 4.132.]
(a) Use the second derivatives test on the critical point you found in Problem 4.132. What does it tell you about that critical point? Does its result agree with what you found in that problem?
(b) Plot f(x, y) on the domain given in the problem. Mark the critical point and check that it is of the type you found in Part (a). Check that the absolute minimum and maximum occur at the points you found them at.

In Problems 4.134–4.137 find the absolute maximum and minimum values of the given function in the given closed, bounded region. You may find it helpful to work through Problem 4.132 as a model.
4.134 \(f(x, y) = x^2 - 6x - y^2\) on the region bounded by the curves \(x = \sqrt{36 - y^2}\) and x = 0.
4.135 \(f(x, y) = (2x - 2)\cos y - x\) on 0 ≤ x ≤ 2π, 0 ≤ y ≤ 2π.
4.136 \(f(x, y) = xe^{-y}\) on the region bounded by the curves \(y = x^2 - 3\) and y = 2x.
4.137 \(f(x_1, x_2, x_3, x_4) = 4x_1x_4 + 3x_1x_2 - 3x_2x_3\) subject to the constraints \(x_1 - 4x_2 - x_3 = 0\), \(x_1^2 + x_3 - x_4 = 0\), \(0 \le x_1 \le 2\), \(0 \le x_2 \le 2\). Hint: You can make this problem somewhat easier by choosing the right variables to eliminate.

4.138 In the Explanation (Section 4.8.2) we considered a storage chest with three independent dimensions. The volume of the storage chest is given by the formula V = LWH and the cost to build the chest is C = LW + 2HW + 3LH. In this problem we will ask the question: if your goal is to build a 24 ft³ chest, what dimensions minimize the cost? Note that the roles of the two equations in this problem are reversed from the Explanation: the cost is now the function we want to minimize, and the volume provides the constraint.
(a) Use the constraint to write an equation solving for one of the three variables as a function of the other two.
(b) Substitute your formula from Part (a) into C(L, W, H) to find the cost as a function of two variables.
(c) Find the gradient of your function from Part (b).
(d) A critical point can occur if the gradient is undefined. Explain why that does not apply in this case.
(e) A critical point can also occur where the gradient is zero, which means both partial derivatives must be zero. Set them equal to zero and solve to find the dimensions of the box.
(f) Make a simple argument, with no calculus required, that your dimensions cannot possibly maximize the cost, because you can make the cost arbitrarily high within the given constraints.

4.139 A box with no top is to be made all of the same material, so its cost is \(C = C_0(LW + 2HW + 2LH)\). The post office puts a limit of 130" on the linear dimension D = H + L + W that can be sent by first class mail. What is the most expensive such box that can be sent?

4.140 Prove that if you construct a rectangular solid of surface area A the largest volume it can have occurs when it’s a cube.

In Problems 4.141–4.144 your job is to find the point on the given surface S that is closest to the given point P. To put it another way, your job is to find the point with the minimum distance to P, subject to the constraint that your answer must lie on S. Hint: Rather than minimizing “distance to P” it may be easier to minimize “distance to P squared,” which will give you the same point.
4.141 Point P is the origin and surface S is the plane z = 2x + 5y − 10.
[Figure: the plane drawn on x, y, and z axes, with intercepts marked.]
4.142 Point P is (1, 0, 0) and surface S is the paraboloid \(z = x^2 + y^2\).
[Figure: the paraboloid drawn on x, y, and z axes.]
4.143 Point P is (1, 2, 0) and surface S is the paraboloid \(z = x^2 + y^2\).
4.144 Point P is the origin and surface S is the surface \(x^2 y^2 z = 1\).

4.145 A box is inscribed inside the paraboloid \(z = x^2 + 2y^2\) as follows. Start with a point (x, y, z) in the first octant with z < 1. Extend left to −y, back to −x, and up to z = 1. Find the starting point (x, y, z) of the largest possible box.

About what pivot point does this system have the smallest possible moment of inertia?

4.148 [This problem depends on Problem 4.147.] The “center of mass” of a collection of particles with masses \(m_1, m_2, \ldots\) located at positions \((x_1, y_1), (x_2, y_2), \ldots\) is a point with coordinates
\[x_{COM} = \frac{m_1 x_1 + m_2 x_2 + \ldots}{m_1 + m_2 + \ldots}, \qquad y_{COM} = \frac{m_1 y_1 + m_2 y_2 + \ldots}{m_1 + m_2 + \ldots}\]
A storage chest is to be built to the following specifications. The left side, right side, back, and
bottom are made of a material that costs $1∕ft2 . The front is made of a more decorated material
that costs $2∕ft2 . There is no top. Find the largest-volume chest that can be made for $20.
7in x 10in Felder c04.tex V3 - February 26, 2015 9:52 P.M. Page 182
Lagrange Multipliers
Given a multivariate function f to be optimized, and a constraint g = k, where g is another function and k is a constant, the extrema will be found at points where \(\vec{\nabla}f = \lambda\vec{\nabla}g\) where λ is a scalar constant.
In the case of multiple constraints \(g_i = k_i\), the extrema will occur where \(\vec{\nabla}f = \sum_i \lambda_i \vec{\nabla}g_i\).
Below we demonstrate how to use Lagrange multipliers to solve the storage chest problem.
Following that we offer a justification for the Lagrange multiplier formula.
LH = 𝜆(L + 2H ) (4.9.2)
We have only three equations to solve for four unknowns. The final equation is always the
constraint itself:
LW + 2HW + 3LH = 20 (4.9.4)
If these were all linear equations we could approach them systematically with matrices.
However, there is no general, systematic approach for solving n non-linear equations for n
unknowns. Our best advice, as is so often the case, is that the more you do it the better you
get at it.
One approach that is often useful is to divide two equations, so 𝜆 cancels. Dividing
Equation 4.9.1 by Equation 4.9.2 yields W ∕L = (W + 3H )∕(L + 2H ). Cross-multiplying and
simplifying turns this into 2HW = 3HL which means that either H = 0 or 2W = 3L. We
discard H = 0 on physical grounds.
Similarly, dividing Equation 4.9.2 by Equation 4.9.3 yields 3H = W. We can then substitute both of these results into Equation 4.9.4 to replace both L and H with W, producing:
\[(2/3)W^2 + (2/3)W^2 + (2/3)W^2 = 20\]
Solving, \(W = \sqrt{10}\). It is now a trivial matter to find L and H and then the maximum possible volume. Of course we get the same answer we got in Section 4.8.
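As a check on the algebra, the sketch below plugs the solution back into all four equations. It assumes the objective is V = LWH and writes the three gradient equations as WH = λ(W + 3H), LH = λ(L + 2H), and LW = λ(2W + 3L). The first and third are reconstructed here from the cost constraint and are consistent with the divisions described above.

```python
import math

W = math.sqrt(10)
L = 2 * W / 3                  # from 2W = 3L
H = W / 3                      # from 3H = W
lam = L * H / (L + 2 * H)      # Equation 4.9.2 solved for lambda

# Each residual is (left side) - (right side); all should be ~0.
residuals = {
    "dV/dL": W * H - lam * (W + 3 * H),
    "dV/dW": L * H - lam * (L + 2 * H),
    "dV/dH": L * W - lam * (2 * W + 3 * L),
    "cost":  L * W + 2 * H * W + 3 * L * H - 20,
}

for name, value in residuals.items():
    print(f"{name}: {value:.2e}")
print("lambda =", lam, " volume =", L * W * H)
```

The volume printed at the end can be compared with 20√10/9 from Section 4.8.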
Lagrange multipliers can be used for many different types of constrained optimization
problems, but one common use of them is for optimization in a bounded region. Recall
from Section 4.8 that if you want to optimize a function in a bounded region you first find the
critical points inside the region, and then you optimize the function along each of the bound-
aries. The example below shows how you can do this last step with Lagrange multipliers.
Problem:
In Section 4.8 we found the absolute maximum and minimum of the function \(f(x, y) = x^2 + y^2 - 2x - 4y + 10\) on a triangular region. As part of the problem, we had to find its extremum along the line y = −2x + 6. Find the same point using Lagrange multipliers.
Solution:
\(\vec{\nabla}f = \langle 2x - 2,\ 2y - 4\rangle\). The constraint can be written as g(x, y) = 6 where g(x, y) = y + 2x, so \(\vec{\nabla}g = \langle 2, 1\rangle\). So the Lagrange multiplier formula yields the following two equations, which are solved together with the constraint itself (Equation 4.9.7):
2x − 2 = 2𝜆 (4.9.5)
2y − 4 = 𝜆 (4.9.6)
y = −2x + 6 (4.9.7)
In this trivial case, Equations 4.9.5 and 4.9.6 immediately yield 2x − 2 = 2(2y − 4) or x = 2y − 3. Substituting into Equation 4.9.7 we find that y = 12/5, just as we found before. From there we can quickly get x = 9/5 and f(9/5, 12/5) = 29/5. We can also get λ = 4/5; we don’t need that number to optimize the function, but it can be useful, as we will discuss below.
Note that Lagrange multipliers in this problem are not a substitute for the entire
process of finding critical points and checking boundaries. They help with one part
of the process, finding critical points along a given boundary.
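The arithmetic in this example can be verified exactly with rational numbers. The sketch below is only a check of Equations 4.9.5–4.9.7, not a general solver.

```python
from fractions import Fraction

def f(x, y):
    """Objective function from the example."""
    return x**2 + y**2 - 2 * x - 4 * y + 10

# Equations 4.9.5 and 4.9.6 give x = 2y - 3.
# Substituting into 4.9.7: y = -2(2y - 3) + 6, so 5y = 12.
y = Fraction(12, 5)
x = 2 * y - 3
lam = 2 * y - 4

# Confirm that all three equations hold exactly.
assert 2 * x - 2 == 2 * lam      # (4.9.5)
assert 2 * y - 4 == lam          # (4.9.6)
assert y == -2 * x + 6           # (4.9.7)

print(x, y, lam, f(x, y))        # 9/5 12/5 4/5 29/5
```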
(d) The third equation relating the three unknown variables is the constraint itself. Solve these three equations to find the (x, y) coordinates of all critical points.

In Problems 4.156–4.165 you will redo the given problems from Section 4.8. If you already did a given problem in that section, much of the work (using the second derivatives test to classify critical points, or plugging in values to find maxima and minima) will not have to be redone here. The only step you need to redo is finding the critical points of the objective function subject to the constraint in the problem, which you will now do using Lagrange multipliers. You may find it helpful to work through Problem 4.155 as a model.
4.156 Problem 4.138.
4.157 Problem 4.139.
4.158 Problem 4.140.
4.159 Problem 4.141.
4.160 Problem 4.142.
4.161 Problem 4.143.
4.162 Problem 4.144.
4.163 Problem 4.145.
4.164 Problem 4.146.
4.165 Problem 4.153 Part (b).

In Problems 4.166–4.168 you will redo the given bounded-region problems from Section 4.8. If you already did a given problem in that section you can use your results from that for the critical points in the interior of the region. However, you will now find the critical points on the boundaries by using Lagrange multipliers with the boundary equations as constraints.
4.166 Problem 4.132.
4.167 Problem 4.134.
4.168 Problem 4.136.

4.169 Section 4.8 Problem 4.137 asked for the critical points of \(f(x_1, x_2, x_3, x_4) = 4x_1x_4 + 3x_1x_2 - 3x_2x_3\) subject to the constraints \(x_1 - 4x_2 - x_3 = 0\) and \(x_1^2 + x_3 - x_4 = 0\). (That problem also involved two other constraints that we will ignore here.) Find those critical points using Lagrange multipliers. Hint: The box on Page 182 gives the formula for Lagrange multipliers with multiple constraints.

4.170 This problem is based on Section 4.8 Problem 4.149. If you haven’t done that problem you should read it, but you do not need to have done that problem to solve this one.
(a) In Section 4.8 Problem 4.149 Part (a) the two negative charges were confined to the upper half of the unit circle. Now suppose those charges could be anywhere on the unit circle. Explain why loosening this restriction would make the problem considerably more difficult using the method of Section 4.8.
(b) Now allowing the two negative charges to be anywhere on the unit circle, you will solve this problem using Lagrange multipliers. Write, but do not yet solve, the six equations for the six unknowns (the x and y positions of the two particles, plus two Lagrange multipliers for the two constraints).
(c) Solve your equations to find the positions of the two negative charges.

4.171 The Explanation (Section 4.9.1) justified the formula \(\vec{\nabla}f = \lambda\vec{\nabla}g\) by arguing that at such a point the curve g(x, y) = k is parallel to the level curves of f. But this logic does not work if λ = 0. Does that special case also lead to a critical point? Why or why not?

4.172 There is a classic optimization problem known as the “milkmaid problem.” A milkmaid is at the location \((x_M, y_M)\) and she needs to milk a cow at location \((x_C, y_C)\). Before milking the cow, however, she needs to stop by the river to wash her pail.
[Figure: the milkmaid M, the cow C, and the river.]
The path of the river is described by the equation g(x, y) = 0 for some function g. Her goal is to find the point (x, y) on the river that will allow her to wash her pail and then get to the cow with as little travel distance as possible.
(a) Using Lagrange multipliers, write the equations that need to be solved to find the optimal point (x, y) where the milkmaid should wash her pail.
(b) Write and solve those equations assuming the river flows along the x-axis, the milkmaid’s initial position is (3, 2), and the cow’s position is (1, 2). Explain why your answer makes sense.
(c) Write and solve those equations assuming the river flows along the line y = x,
the milkmaid’s initial position is (1, 3), and the cow’s position is (3, 5). Explain why your answer makes sense.
(d) Assume the river is described by the curve \(y = x^3\), the milkmaid’s initial position is (0, 2), and the cow’s position is (1, 4). Have a computer numerically calculate where the milkmaid should wash her pail. Make a plot showing the river, the two points given here, and the point you just found.

4.173 Your spaceship needs to gather rock samples from a planet’s surface and deliver them to a nearby space station. You are currently at location (50, 10, 20) and the space station is at (30, 20, 30). (All coordinates are measured in thousands of miles in a coordinate system with the origin at the center of the planet.) You need to choose where to gather the rocks so that you can reach the space station with as little distance traveled as possible.

(b) Find the value of λ in the equations you used. Explain what this value represents.
(c) Repeat Parts (a)–(b) with the constraint \(y^2 - x^2 = 0\).
(d) You should have found that the point (x, y) (and thus the distance) was the same for the two constraints, but λ was different. Explain why you should expect both of those to be true.

4.175 A projectile launched from the ground with initial velocity components \(v_x\) and \(v_y\) will travel a distance equal to \(2v_x v_y/g\). The energy that you initially give to the projectile is \((1/2)m(v_x^2 + v_y^2)\).
(a) Use the method of Lagrange multipliers to find the maximum possible distance a projectile can go subject to the constraint that its initial energy is E.
(b) Find the value of λ from the equations you wrote.
(c) Using that value of λ, how much farther can the projectile go if you increase the energy you launch it with by dE?

of a worker, and the operating costs of a thneed-making machine, respectively.4 However, labor laws require that each worker may only oversee one machine: L = K.
(a) Use Lagrange multipliers to find the value of L that maximizes your profit subject to the constraint L = K.
(b) Solve for λ in the equations you wrote.
(c) A government worker offers to bend the regulations, allowing you to use fewer workers than the law requires, in exchange for a modest “compensation” (bribe). Based on the value of λ you found, what is the maximum bribe you should be willing to offer in order to have w fewer workers than machines?
(d) Just to get specific, suppose T = 1000, α = 0.8, β = 0.2, s = 10, and c = 1. How much should your company be willing to pay in order to be allowed to hire 1000 fewer workers?
4 Thanks to Martin Osborne, from whose idea this problem was adapted with permission.
7in x 10in Felder c05.tex V3 - February 26, 2015 9:56 P.M. Page 188
CHAPTER 5
∙ evaluate definite and indefinite integrals, including those that require u-substitution or
integration by parts.
∙ graph points and functions in polar coordinates.
∙ find the sum, dot product, and cross product of two vectors.
After you read this chapter, you should be able to …
This chapter is not about how to evaluate an integral. We assume you can find basic
antiderivatives, including those that require u-substitution or integration by parts. You may
also know more advanced techniques such as partial fractions and trig substitution—but if
you don’t, you won’t find them here.
Instead, we’re going to begin with a couple of sections that explain the connection
between real-world problems and integrals. (“Find the area under this curve” hardly counts
as an important engineering application.) We encourage you to pay special attention to
those first few sections. If you can adopt our perspective on what an integral is, then the
rest of the chapter, in which we extend the idea into multivariate functions, should follow
naturally.
V (x, y, z) = − GM ∕r (5.1.1)
where G is a constant and r is the distance from the particle to the point. If there is more
than one particle around, the gravitational potential at each point is the sum of the potentials
created by all the nearby particles. (Gravitational potential is used to figure out gravitational
force, but we won’t be doing that here.)
In all the problems below, the question is the same: Use Equation 5.1.1 to calculate the
gravitational potential at the point (0, d) on the xy-plane.
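[Figure: three images, one for each of the mass arrangements described in Questions 1–3 below, each labeled with the distance d.]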
1. Suppose the only object around is a particle of mass M at the point (d, 0) (the first
image above). Find the gravitational potential.
2. Now suppose there are two objects, each of mass M ∕2, located at (0, 0) and (d, 0). (See
the second image.) Calculate the gravitational potential.
See Check Yourself #25 in Appendix L
3. Now suppose there are n objects, each of mass M ∕n, spread evenly between (0, 0) and
(d, 0) (third image). Object number i is located at position xi = d(i − 1)∕(n − 1).
(a) Check that this gives the correct positions for x1 and xn .
(b) Calculate the gravitational potential created by object i.
(c) Write a sum representing the total potential at (0, d) created by this collection of
objects. (You do not need to evaluate the sum; just write it.)
In the limit as n → ∞ your final answer becomes the total gravitational potential generated
by a mass M spread evenly between x = 0 and x = d. But calculating such a quantity by adding
up a series and then taking a limit is generally not practical. Our biggest goal in this chapter
is to get you to think of integrals as the solution to that problem: a fast method for calculating
the limit of the sum.
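To see the limiting process in action, the sum from Question 3 can be evaluated for increasing n. This is a minimal sketch: the function name is illustrative, and G = M = d = 1 are arbitrary units chosen only to make the numbers easy to read.

```python
import math

def potential_sum(n, G=1.0, M=1.0, d=1.0):
    """Total potential at (0, d) from n objects of mass M/n spread along the x-axis.

    Object i sits at x_i = d (i - 1) / (n - 1), so n must be at least 2.
    """
    total = 0.0
    for i in range(1, n + 1):
        x_i = d * (i - 1) / (n - 1)
        r = math.hypot(x_i, d)          # distance from (x_i, 0) to (0, d)
        total += -G * (M / n) / r       # Equation 5.1.1 for one object
    return total

for n in (2, 10, 100, 1000, 10000):
    print(n, potential_sum(n))
```

The printed values settle toward a fixed number as n grows, which is the quantity the integral computes directly.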
In Problem 5.13 you will properly finish this problem using an integral. In the later
sections, you will see how to extend such an integral into two or three dimensions, and
in Section 5.7 Problem 5.188 you will sum up the gravitational potential due to a sphere.
Finally in Section 5.11 we will return to, and solve, Newton’s original problem.
you can ask a computer to approximate the integral numerically, which is essentially a
Riemann sum with many slices. Of course a numerical integral is only possible if all the
constants in the integral have been expressed as numbers.
1. An object travels with velocity v = 4 mi/h from 2:00 until 8:00. How far does the object
travel?
2. Another object travels with velocity:
\[v(t) = \begin{cases} 4\ \text{mph} & 2 \le t < 4 \\ 8\ \text{mph} & 4 \le t < 6 \\ 16\ \text{mph} & 6 \le t \le 8 \end{cases}\]
(All times are measured in hours.) How far does the object travel between t = 2
and t = 8?
See Check Yourself #26 in Appendix L
3. A thin metal bar has a linear density given by 𝜆 = 4 kg/m. How much mass lies within
a 6-meter length of this bar?
4. Another bar has a linear density given by:
\[\lambda(x) = \begin{cases} 4\ \text{kg/m} & 2 \le x < 4 \\ 8\ \text{kg/m} & 4 \le x < 6 \\ 16\ \text{kg/m} & 6 \le x \le 8 \end{cases}\]
(All distances are measured in meters.) What is the mass of the part of the object that
lies between x = 2 and x = 8?
What is an integral?
If you answered “the area under a curve” then you know how to find the area under a curve
(just in case anyone asks), but perhaps not much else. If you answered “an antiderivative”
then you know how to evaluate integrals, but you don’t necessarily know what you’re finding.
This section presents a different way of thinking about integrals. The test of your
understanding will be your ability to set up integrals that represent solutions to real-world
problems.
The Discovery Exercise (Section 5.2.1) presents two common applications for the integral:
finding distance traveled from velocity, and finding mass from density. In both cases, the
problem is a simple multiplication if the variable (velocity or density) is constant. If the
variable is a step function (sequential horizontal lines), you do the multiplication on separate
pieces, and then add up the results (distances or masses) to find the total.
If the function is not a step function, you can approximate it with a step function. As
the number of steps approaches infinity, the approximation becomes exact. That is what an
integral—every integral!—represents.
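The "multiply on each piece, then add" bookkeeping for a step function can be sketched as follows, using the piecewise velocity and density from the Discovery Exercise. The tuple layout and function name are illustrative choices.

```python
# Each tuple is (start, end, value) for one flat piece of the step function.
velocity_steps = [(2, 4, 4), (4, 6, 8), (6, 8, 16)]   # hours, hours, mph
density_steps = [(2, 4, 4), (4, 6, 8), (6, 8, 16)]    # meters, meters, kg/m

def total_from_steps(steps):
    """Multiply each piece's value by its width, then add up the results."""
    return sum((end - start) * value for start, end, value in steps)

distance = total_from_steps(velocity_steps)   # miles
mass = total_from_steps(density_steps)        # kilograms
print(distance, mass)                         # 56 56
```

The same function serves both problems because the calculation is identical; only the units of the pieces differ.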
We illustrate this process below. The first two examples show the basic process of setting
up an integral as a sum of infinitely many infinitesimal pieces. The remaining examples
apply this process to a variety of physical problems. Finally the section labeled “Stepping
Back” reviews the steps involved. We strongly encourage you to read the “Basic Examples”
and “Stepping Back” sections carefully. On the other hand, you might find it more helpful to
skim the advanced examples for the moment, and then come back to them as you encounter
those types of physical examples later in the chapter. For instance, it may be easier to follow
a center of mass calculation in 2D or 3D if you first review our 1D center of mass example.
Basic Examples
Mass and Density If you have a 2 meter bar with linear density 𝜆 = 3 kg∕m then the bar has
a total mass of 6 kg. That’s just the definition of density in 1D:
m = 𝜆L if 𝜆 is constant (5.2.1)
Now consider a different bar, of length L, whose density is proportional to the distance from
its left edge. We set our origin at the left edge, so 𝜆 = cx where c is a positive constant. How
do we find the mass of this bar? We cannot use Equation 5.2.1 directly and write m = cxL;
the mass of the entire bar cannot depend on x, the position along the bar! Do we need two
different equations, one for constant densities and an unrelated one for variable densities?
No, there is a way to apply Equation 5.2.1 to our new bar. Imagine breaking the bar into
tiny little pieces, like the shaded one shown below.
The variable x designates the location of one particular piece. dx is the change in x as you go from the left to the right side of that piece: like Δx, but smaller. How small? So small that we can reasonably treat λ as constant within that one little piece, and therefore we can use Equation 5.2.1: dm = λ dx. (Just as dx is a tiny change in position, dm is a tiny change in mass.) The next piece has a different λ, but also constant, leading to a different dm. To approximate the mass of the bar, we add up the masses of all the pieces.
FIGURE 5.1 The smaller dx gets, the closer the small blue slice comes to constant density.
How accurate is that approximation? The smaller each dx is,
the more reasonable our constant-𝜆 assumption becomes, and therefore the more accurate
our final answer will be. In the limit as dx → 0 (or as the number of pieces approaches ∞)
the mass becomes exact.
And that—this is the moment we have been building to—that is what an integral does.
When we write the integral below, we mean “multiply cx (the density of one piece) by dx (the length of one piece), and then add all of them up as x goes from 0 to L, in the limit as dx → 0.”
\[M = \int_0^L cx\,dx = c\left[\frac{x^2}{2}\right]_0^L = \frac{1}{2}cL^2\] (5.2.2)
As a final check, we can consider the units of this result. We first introduced c when we said
that the density was cx. Since density in 1D has units of mass per length, c must have units of
mass per length squared to make cx be a density. Looking at our final answer, the expression
we got for M thus has units of mass, which is reassuring.
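The limiting process behind Equation 5.2.2 can be watched numerically. In the sketch below each slice is treated as having the constant density found at its left edge; the values c = 3 and L = 2 are hypothetical, chosen only for illustration.

```python
def bar_mass(c, L, n):
    """Approximate the mass of a bar with density c*x using n equal slices.

    Each slice of width dx is treated as having the constant density
    found at its left edge, so dm = (c * x) * dx.
    """
    dx = L / n
    return sum(c * (k * dx) * dx for k in range(n))

c, L = 3.0, 2.0              # hypothetical values (kg/m^2 and m)
exact = 0.5 * c * L**2       # Equation 5.2.2

for n in (10, 100, 1000):
    approx = bar_mass(c, L, n)
    print(n, approx, exact - approx)
```

The gap between the sum and cL²/2 shrinks in proportion to 1/n, consistent with the claim that the approximation becomes exact as dx → 0.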
Distance Traveled You already know that velocity is the derivative of position, so position is the integral of velocity. But let us briefly walk through the argument in terms of what an integral is.
First we ask, what if velocity is constant? If you travel 60 mi/h for 3 h, then your total
distance traveled is 180 miles: Δx = v Δt.
We approach the more general case of a function v(t) by breaking the journey into small
“slices,” a series of trips of duration dt. For each such trip, we assume that the velocity is rela-
tively constant; we can therefore use the same product we wrote above, and write dx = v(t)dt.
The total distance traveled is the sum of all such trips, in the limit as dt → 0, which we write as:
\[\Delta x = \int v(t)\,dt\]
Notice once again how the dt appeared in this calculation. We did not put it there because
“you can’t have an integral without it”; we multiplied it by velocity to find distance.
Advanced Examples
Center of Mass You may recall from introductory physics that every object has a special
point called its “center of mass.” For instance, consider an object dangling partly over a
ledge: will it stay put or will it fall? The answer depends on which side of the ledge contains
the object’s center of mass.
The center of mass of a collection of n point masses in 1D is given by:
\[x_{COM} = \frac{x_1 m_1 + x_2 m_2 + x_3 m_3 + \ldots + x_n m_n}{M} = \frac{1}{M}\sum_{i=1}^{n} x_i m_i\] (5.2.3)
Here M = m1 + m2 + … + mn is the total mass of the collection. Note that this formula is
a weighted average of the positions, influenced more by heavier objects than lighter. For
example, if you have a mass m1 = 2 at x = 1 and a mass m2 = 3 at x = 4 then the center
of mass is (2 ⋅ 1 + 3 ⋅ 4)∕5 = 14∕5, which is somewhere between the two but closer to the
heavier one.
Now suppose we want the center of mass of the rod of length L and density 𝜆 = cx from
the example above. Our strategy is to break the rod into pieces so small that we can treat
each piece as a point mass (Figure 5.1). The position of our slice is just x, and its mass is
dm = cx dx. So what we are adding up for each slice, based on the xi mi in Equation 5.2.3, is
x(cx dx). Adding them all up with an integral, the center of mass formula becomes
\[x_{COM} = \frac{1}{M}\int x\,dm = \frac{1}{M}\int_0^L cx^2\,dx\]
Evaluating this integral and dividing by M = (1∕2)cL 2 (from the mass and density example
above) gives xCOM = 2L∕3. Clearly this has correct units. It also makes sense that the
answer is somewhere between L∕2 and L, since the right side of the bar is heavier than
the left.
Note that if we hadn’t already found the mass (Equation 5.2.2) we would have to do a
separate integral to calculate that. Center of mass problems often involve an integral for the
numerator (sum of xi mi ) and another one for the denominator (total mass).
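The two-integral structure (one sum for the numerator, one for the denominator) can be sketched numerically. The function name and the values of c and L below are assumptions chosen only for illustration; the density 𝜆 = cx is the one from the example.

```python
def rod_center_of_mass(density, length, n_slices=10_000):
    """Slice a rod on [0, length] and return (center of mass, total mass)."""
    dx = length / n_slices
    total_mass = 0.0  # denominator: sum of dm
    moment = 0.0      # numerator: sum of x dm
    for i in range(n_slices):
        x = (i + 0.5) * dx    # position of this slice
        dm = density(x) * dx  # mass of this slice, treated as a point mass
        total_mass += dm
        moment += x * dm
    return moment / total_mass, total_mass


c, L = 2.0, 3.0  # arbitrary illustrative values
x_com, mass = rod_center_of_mass(lambda x: c * x, L)
print(x_com, 2 * L / 3)        # numerical result next to 2L/3
print(mass, 0.5 * c * L ** 2)  # numerical result next to (1/2)cL^2
```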
Moment of Inertia The “moment of inertia” measures how hard it is to start an object rotating
about a given axis. The moment of inertia of a point particle is mr^2 where m is the
mass of the particle and r is the distance of the particle from the axis. If you have more
than one particle their combined moment of inertia is simply the sum of their individual
ones. For example, if you have a dumbbell with masses m = 2 at positions x = 3 and x = −3
then the moment of inertia of that dumbbell about a perpendicular axis through its center
is 2(3^2) + 2(3^2) = 36. As you’ve surely figured out by now, to find the moment of inertia
of an extended object you break it into slices so small you can consider each one to be
a point particle, find the moment of inertia of a single slice, and integrate to add them
all up.
$$I = \sum m r^2 \quad \text{or} \quad I = \int r^2\,dm$$
We want to calculate the moment of inertia of a uniform rod of length L and mass M about
a perpendicular axis through one of its ends, as shown below. We’ve also included, as usual,
a thin slice with thickness dx and position x.
Since the bar is uniform you might think we can just multiply
instead of integrating, but even though each slice has the same
mass, they do not all have the same moment of inertia. To find
the moment of inertia of a slice we need to know its mass dm and
its distance r from the axis. (Stop to make sure you understand
why we wrote the mass as a differential and the distance as a normal variable.) Normally we
would have labeled r on our picture, but in this case it’s just the same thing as x.
The mass of our slice is the total mass of the bar multiplied by the fraction of the bar
represented by our slice: dm = M (dx∕L). Pause to make sure you follow that piece of the
calculation, and that you see that it only works because the bar is uniform!1 For a variable
bar we would find dm from a density function that would have to be specified in the problem.
So the total moment of inertia is:
$$I = \int_0^L \frac{M}{L}\,x^2\,dx = \frac{1}{3}ML^2$$
The units are correct: moment of inertia is mass times distance squared. We can also easily
check in a physics book or online that this is the correct moment of inertia for a thin rod
about a perpendicular axis through one end.
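Both the point-particle sum and the slicing integral are easy to mimic in a few lines. This is a sketch under stated assumptions: the function names and the values of M and L are invented for the illustration, and the second function relies on the rod being uniform.

```python
def point_masses_moment(masses, distances):
    """Moment of inertia of point particles: the sum of m r^2."""
    return sum(m * r ** 2 for m, r in zip(masses, distances))


def uniform_rod_moment_about_end(total_mass, length, n_slices=10_000):
    """Add up r^2 dm over thin slices of a uniform rod, with the axis at x = 0."""
    dx = length / n_slices
    dm = total_mass * dx / length  # every slice has the same mass (uniform rod only)
    inertia = 0.0
    for i in range(n_slices):
        r = (i + 0.5) * dx         # distance of this slice from the axis
        inertia += r ** 2 * dm     # slices far from the axis contribute more
    return inertia


print(point_masses_moment([2, 2], [3, -3]))  # the dumbbell from the text: 36

M, L = 1.5, 2.0  # arbitrary illustrative values
print(uniform_rod_moment_about_end(M, L), M * L ** 2 / 3)
```

Every slice has the same dm, yet the r^2 factor changes from slice to slice, which is why a single multiplication cannot replace the sum.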
Electric Potential A set of charges creates at each point in space an “electric potential,”
which measures how much energy another charge would have if you put it at that point. If
you haven’t studied electricity you’re welcome to think deeply about that definition or ignore
it. But for the purposes of this chapter there are two things you need to know about electric
potential.
1. A charge q creates a potential at every point in space equal to kq∕r where k is a constant
of nature and r is the distance from the charge to the point where you’re calculating
the potential. (This is one form of a physics principle known as “Coulomb’s Law.”)
2. If there is more than one charge in some region the potential at each point is the sum
of the potentials created by each charge individually.
For example, if you have a charge q1 = 2 at x = 0 and a charge q2 = 3 at x = 2, then the
potential at x = 7 is (2k∕7) + (3k∕5). Simple enough, right?
Now suppose you have a bar of length L with charge Q uniformly distributed across the
length of the bar. We want to find the electric potential at a distance w from the left end of
the bar. We’ve shown the bar below, and as always we’ve also drawn a small slice and labeled
its position x and its thickness dx.
Because the charge density is uniform, the charge of our
tiny slice is the total charge of the bar multiplied by the fraction
of the bar represented by our slice: dq = Q (dx∕L). The
distance from the slice to the point where we are calculating
the potential is r = x + w. So the potential at that point is:
$$V = \int k\,\frac{dq}{r} = \int_0^L k\,\frac{Q}{L}\,\frac{1}{x+w}\,dx = \frac{kQ}{L}\Big[\ln(x+w)\Big]_0^L = \frac{kQ}{L}\big(\ln(L+w) - \ln(w)\big) = \frac{kQ}{L}\ln\left(\frac{L+w}{w}\right)$$
Checking units is especially easy for this example. The argument of the logarithm is unitless
and the coefficient in front is in the form k times a charge divided by a distance, which we
know is correct for potential. We can also check this answer by thinking about a limit where
we know what the answer must be. See Problem 5.12.
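Another check is to add up the contributions k dq∕r directly and compare with the logarithm formula. The sketch below assumes illustrative values for k, Q, L, and w, and the function names are invented for this example.

```python
import math


def rod_potential_by_slices(k, Q, L, w, n_slices=20_000):
    """Sum k dq / r over thin slices of a uniformly charged bar."""
    dx = L / n_slices
    dq = Q * dx / L          # charge of one slice (uniform bar only)
    potential = 0.0
    for i in range(n_slices):
        x = (i + 0.5) * dx   # position of the slice, measured from the left end
        r = x + w            # distance from the slice to the field point
        potential += k * dq / r
    return potential


def rod_potential_formula(k, Q, L, w):
    """The closed form derived in the text: (kQ/L) ln((L + w)/w)."""
    return (k * Q / L) * math.log((L + w) / w)


k, Q, L, w = 1.0, 2.0, 3.0, 0.5  # arbitrary illustrative values
print(rod_potential_by_slices(k, Q, L, w), rod_potential_formula(k, Q, L, w))
```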
1 You can also think of similar ratios: dm∕M = dx∕L. Or you can say that M ∕L is the density, and multiply that by the
length dx. All of these of course get you to the same place, but remember to use them only for uniform density!
(d) Write ∫_1^3 in front of your answer to Part (c) to indicate that you are adding up all the segments from x = 1 to x = 3 in the limit as dx → 0.
(e) Evaluate the integral to find the total mass of the bar between x = 1 and x = 3.
5.2 [This problem depends on Problem 5.1.] In Problem 5.1 you found the exact mass of the region of a bar between x = 1 and x = 3 with a linear density given by 𝜆 = 3x. In this problem you are going to approximate the same mass using Riemann sums.
(a) We begin with a Riemann sum with two intervals, so Δx = 1. We will use a left-hand Riemann sum, so 𝜆 for the first interval is 𝜆(1) = 3. What is the mass of the first interval?
(b) The second interval, stretching from x = 2 to x = 3, has the same width (Δx = 1) but a different density (𝜆(2) = 6). What is the mass of this interval?
(c) Add your two answers to estimate the mass of the bar.
(d) Now repeat this exercise, dividing the region from x = 1 to x = 3 into four equal intervals with Δx = 1∕2. Add the masses of the four individual segments to estimate the mass of the bar.
(e) In Problem 5.1 you found the exact mass of this region of the bar. Why are your estimates in this problem lower than the actual mass? Why is your second estimate more accurate than your first?
5.3 [This problem depends on Problems 5.1 and 5.2.] In Problem 5.2 you estimated the mass of a bar using a Riemann sum with two intervals, and then a more accurate sum with four intervals. This time you will use n intervals and take a limit, which is exactly the calculation that an integral performs for you. Your answers to all of the parts in this problem should depend on nothing but n (the number of intervals) and i (which goes from 1 to n over the different intervals).
(a) If the bar has n intervals, what is the width Δx for each interval?
(b) If the bar has n intervals, what is the x-value at the left of the ith interval? (To test your answer, make sure that your formula gives x = 1 for the first interval and x = 3 − Δx for the last.)
(c) Calculate the density of each interval (based on a left-hand Riemann sum).
(d) Calculate the mass of each interval.
(e) When you add up all the small masses to find the mass of the bar, the resulting sum is an arithmetic sequence. The sum of an arithmetic sequence is (n∕2)(m_first + m_last). Find the sum of all the small masses as a function of n, the number of intervals.
(f) Find lim_{n→∞} m(n). The result should match your answer to Problem 5.1.
5.4 A thin metal bar has linear density 𝜆 = 1∕(x + 1)^2 where x is the distance from the left side of the bar.
(a) What is the mass of the segment of the bar that stretches from x = 0 to x = 3?
(b) What is the mass of the segment of the bar that stretches from x = 3 to x = 5?
(c) What is the mass of the segment of the bar that stretches from x = 0 to x = 5?
(d) If the bar begins at x = 0 on the left and stretches infinitely far to the right, what is its total mass?
5.5 A car starts at the origin when t = 0 and moves thereafter with velocity v(t) = 1 − (t − 1)^2 (with time measured in seconds and distance in meters).
(a) How far has the car moved after one second?
(b) What is the largest x-value the car will reach? (Hint: The answer is not 5∕3.)
For Problems 5.6–5.9 you will be given information about a rod that stretches from x = 0 to x = L. For each problem calculate the total mass (unless it’s already given), the center of mass, and the moment of inertia about a perpendicular axis through x = 0. As always, you should use the information you were given to find the units of all constants in the problem and use that to check the units on your answers. All letters other than 𝜆 and x designate positive constants.
5.6 The rod is uniform with total mass M.
5.7 The mass density is 𝜆 = c sin(𝜋x∕L). Explain why your answer for center of mass makes sense physically.
5.8 The mass density is 𝜆 = c∕(x + w). Explain why your answer for center of mass makes sense physically in the limits w → 0 and w → ∞. (You can take these limits by hand, or you may plug them into a computer.) Hint: all of the integrals you’ll need can be evaluated with the substitution u = x + w.
5.9 The mass density is 𝜆 = ae^(−bx).
5.10 Find the moment of inertia of a uniform rod of length L and mass M about a perpendicular axis through its center.
5.11 A rod goes from x = 0 to x = L with charge density 𝜆 = cx^2. Find the electric potential at x = 0.
5.12 In the Explanation (Section 5.2.2) we derived the electric potential at a distance w from a uniform rod of charge Q and length L. Take the limit of that answer as L → 0 and explain why it’s the answer you would expect physically.
5.13 A uniform rod of charge Q goes from the origin to the point (d, 0). Find the electric potential produced by this rod at the point (0, h). (This completes the exercise you began in Section 5.1; although that was gravity and this is electricity, the math is the same. Note that you will need to use a picture to express the distance from a slice to the point in terms of x.)
5.14 In this problem you’re going to calculate the area under a curve using the same kind of logic we’ve been using to set up other integrals in this section.
(a) First consider the function h(x) = 4. Sketch the function and find the area under the curve between x = 2 and x = 5. This should be quick and easy and should not involve an integral.
(b) Part (a) was easy because h was constant. Now sketch a function f(x) that is not constant. (Just to keep things simple, make the function continuous and make f positive everywhere in your sketch.)
(c) Label a thin slice on the x-axis at position x with thickness dx.
(d) Because we are assuming dx is extremely small, we can treat f as constant across the entire slice. Write the area under the curve just in the thin region above your slice. You should do this exactly the same way you did Part (a).
(e) Take an integral of your result to get an expression for the total area under the curve of f(x) between x = 2 and x = 5.
5.15 A curve rises with a slope of m. (Remember, as always, that slope is defined as rise/run!)
(a) We begin with a curve of constant m, i.e. a line. If m = 10, how much does the y-value rise between x = 13 and x = 15? (No calculus required!)
(b) Now consider a curve with a variable function m(x). But we will assume that the slope does not change too much as you move a short distance, so m(10 + dx) ≈ m(10). Under that assumption, how much does the y-value rise between x = 10 and x = 10 + dx? (Your answer will not involve an integral.)
(c) How much does the y-value rise between x = 10 and x = 13?
5.16 The “profit” of a business is defined as “revenue” (money received from customers) minus “cost” (money spent). (Throughout this problem time will be measured in days.)
(a) If a business makes a steady profit of $20,000/day, how much money does it make in a (7-day) week?
(b) Negative profit is “loss.” If a business has a steady loss of $5000/day, how much money does it make in a week?
(c) If a business makes $1000/day for three days, then $3000/day for three days, and then −$500/day for three days, what is the total profit for this nine-day period?
(d) If a business makes p(t) dollars/day during a small time period dt, what is its total profit?
(e) If a business makes p(t) dollars/day from t = 32 to t = 59 (the entire month of February), what is its total profit?
(f) If the net worth of the business is the function m(t), explain why ∫_32^59 p(t)dt = m(59) − m(32).
(g) If the profit function is given by p(t) = t^3 + t^2 − 8t − 12, find the value of t such that ∫_0^t p(t)dt = 0.
(h) Explain to the president of the company, who understands business but doesn’t know much math, the significance of the t-value you found in Part (g).
5.17 A plant is growing. The variable g represents its rate of growth in inches per week.
(a) If g = 2 inches/week, how much does the plant grow during any given 3-week period?
(b) If g = g(t) inches/week, how much does the plant grow during a small time period of duration dt weeks? Assume dt is small enough that the growth can be considered constant for that interval.
(c) If g = g(t) inches/week, how much does the plant grow from Week 7 to Week 10?
(d) If the height of the plant is the function h(t), explain why ∫_7^10 g(t)dt = h(10) − h(7).
5.18 Water is flowing into a bucket. Let r represent the flow rate of the water, measured in gallons/minute.
(a) If r = 1∕2, how much water fills the bucket between t = 4 and t = 10?
(a) A rock at height h above the ground experiences a constant gravitational force −mg. The reference point is the ground.
(b) A block on a spring experiences a force −kx where k is a constant and x is the amount the spring is stretched. The reference point is the position where the spring isn’t stretched at all, and the block is currently at a distance L from that position.
(c) An asteroid is currently at a distance R from the center of the Earth. The reference point is a point infinitely far away; in other words it’s at the limit r → ∞.
5.24 A short, straight wire of length ds carrying a current I produces the following magnetic field at a nearby point P.
$$d\vec{B} = \frac{\mu_0 I}{4\pi}\,\frac{d\vec{s}\times\vec{r}}{r^3}$$
Here d s⃗ is a vector of length ds that points in the direction of the current flow and r⃗ goes from the wire to P. We use r for the magnitude of r⃗. A straight wire goes from (0, −H) to (0, H) with a current in the positive y-direction.
(a) What is the direction of d B⃗ at the point (L, 0)? Hint: it’s the same no matter where d s⃗ is on the wire.
(b) Set up an integral to find the magnetic field at (L, 0).
(c) Evaluate your integral. You can do this with a computer or using the trig substitution y = L tan 𝜃.
(d) The magnetic field produced by an infinite wire with a current I is 𝜇0 I∕(2𝜋L). Check that your answer reduces to this in the limit H → ∞.
The process we outlined above can also be used to solve some problems in more than one
dimension. The hardest part is choosing the correct shape for your “slices.” For most prob-
lems, there is only one correct slicing; if you try it a different way, you will not be able to set
up the integral. So as you work through the examples below, pay particular attention to why
we chose the slice we did in each case.
2𝜋𝜌, and its width is d𝜌. We therefore write that the area of our ring is dA = 2𝜋𝜌 d𝜌 and its
mass is dm = (2𝜋𝜌 d𝜌)c𝜌^2.
To find the mass of the entire disk, we sum up all the pieces by integrating:
$$M = \int_0^R 2\pi c\rho^3\,d\rho = \frac{1}{2}\pi cR^4$$
We would like to draw your attention to three key features of this process.
∙ We could have used the same strategy for any density 𝜎(𝜌), provided that the density
was solely a function of 𝜌.
∙ As we stressed in the previous section, d𝜌 entered the calculation as a part of a multi-
plication: it was not “tacked on” at the end.
∙ Finally, the ring that we used as a slice cannot actually be bent up into a rectangle, since
the outer edge is longer than the inner edge. This is another approximation—like the
constant density of the ring—that gets better the smaller we make d𝜌. If we computed
the mass of this disk by using five rings, we would get a reasonable approximation; if we
used ten rings, our approximation would be better. As the number of rings approaches
infinity, the approximation approaches the mass of the disk. And that is exactly what
the integral does for us: it adds up all the slices in the limit as d𝜌 → 0. If you’re not
comfortable with this argument, you will derive this approximation in a different way
in Problem 5.26.
Finally, we check our answer. To check units notice that the density c𝜌^2 has units of mass
per distance squared, so c has units of mass per distance to the fourth, and thus the final
answer has correct units for mass. You can also notice that the largest density is on the outer
rim with 𝜎 = cR^2. If the entire disk had that density the total mass would be (cR^2)(𝜋R^2) =
𝜋cR^4. The actual answer should be smaller than that, which it is.
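The claim that five rings give a reasonable estimate and ten rings a better one can be tested directly. The following is a sketch, with a function name and values of c and R that are assumptions made for this illustration; it uses the density 𝜎 = c𝜌^2 from the example and evaluates it at the middle of each ring.

```python
import math


def disk_mass(density, radius, n_rings):
    """Add up (density) x (ring area 2 pi rho d_rho) over n_rings concentric rings."""
    d_rho = radius / n_rings
    mass = 0.0
    for i in range(n_rings):
        rho = (i + 0.5) * d_rho            # radius at the middle of this ring
        dA = 2.0 * math.pi * rho * d_rho   # ring treated as a strip: length x width
        mass += density(rho) * dA
    return mass


c, R = 1.0, 2.0                      # arbitrary illustrative values
exact = 0.5 * math.pi * c * R ** 4   # the result of the integral in the text
for n in (5, 10, 1000):
    print(n, disk_mass(lambda rho: c * rho ** 2, R, n), exact)
```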
(You can easily look up the integral of sin^2 𝜙 or you can find it using trig identities.) Since
angles measured in radians are unitless, the formula for 𝜎 implies that c has units of density,
so c times R^2 has correct units for mass. Once again you can check that this is a fraction of
what the mass would be if the highest density on the disk, c, were the density throughout.
It may occur to you to wonder: how would we find the mass of such a disk if its density
were a function of both 𝜌 and 𝜙? Although the basic strategy is the same, the steps are a bit
more involved and lead to a “double integral,” a topic we will take up in the next section.
$$I = \int_0^R \frac{2M}{R^2}\,\rho^3\,d\rho = \frac{1}{2}MR^2$$
This has correct units for a moment of inertia, and once again it’s
easy to look up that this is the correct formula.
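A numerical version of this integral is sketched below. It assumes, as the integrand suggests, a uniform disk of mass M and radius R rotating about the perpendicular axis through its center, sliced into rings of mass dm = M (2𝜋𝜌 d𝜌)∕(𝜋R^2). The function name and the sample values are invented for the illustration.

```python
def uniform_disk_moment(total_mass, radius, n_rings=10_000):
    """Add up rho^2 dm over thin rings of a uniform disk (axis through the center)."""
    d_rho = radius / n_rings
    inertia = 0.0
    for i in range(n_rings):
        rho = (i + 0.5) * d_rho
        # Mass of the ring = total mass x (ring area) / (disk area); uniform disk only.
        dm = total_mass * (2.0 * rho * d_rho) / radius ** 2
        inertia += rho ** 2 * dm
    return inertia


M, R = 3.0, 2.0  # arbitrary illustrative values
print(uniform_disk_moment(M, R), 0.5 * M * R ** 2)
```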
(b) Make a similar prediction for yCOM.
(c) Find the center of mass of the trapezoid (both xCOM and yCOM). Make sure your answers match your predictions.
5.33 A uniform flat piece of metal of mass density 𝜎 in the xy-plane has boundaries y = 0, x = L, and y = cx^2 (where L and c are positive constants). Find its center of mass (both components).
5.34 Find the moment of inertia of a uniform square of mass M and edge length L about one of its edges.
5.35 A square with side length a has density 𝜎 = kx where x is the distance from the left side and k is a positive constant.
(a) Calculate the mass of the square.
(b) Calculate the center of mass of the square.
(c) Calculate the moment of inertia of the square about its left edge.
(d) Calculate the moment of inertia of the square about a vertical line drawn a distance b from the left edge.
(e) Show that the moment of inertia is smallest around an axis drawn at the center of mass.
5.36 Given that the surface area of a sphere is 4𝜋R^2, show that the volume of a sphere is (4∕3)𝜋R^3. Hint: build the sphere from concentric spherical “shells.”
5.37 Given that the surface area of a sphere is 4𝜋R^2, find the mass of a sphere of radius R whose density is given by 𝜌 = 𝜌0 e^(−(r∕R)^3) where r is the distance from the center of the sphere. Hint: It may be helpful to work Problem 5.36 first.
5.38 (a) Find the electric potential at the origin produced by a uniform hollow sphere of radius R and charge Q centered on the origin.
(b) Find the electric potential at the origin produced by a uniform solid sphere of radius R and charge Q centered on the origin. Hint: It may be helpful to work Problem 5.36 first.
(c) Which answer came out bigger? Why?
5.39 In the Explanation (Section 5.2.4) we asserted that the volume of a right circular cone of height H that makes a 45° angle from the horizontal is (1∕3)𝜋H^3. Set up and evaluate an integral to derive this formula.
5.40 A right circular cone of height H that makes a 45° angle with the horizontal has a density a∕(H^3 + z^3) where z is the distance from the ground.
(a) Find the mass of the cone.
5.41 Calculate the volume of a right circular cone of height H that makes a 30° angle with the horizontal.
5.42 Calculate the volume of a right circular cone of height H and upper radius R.
5.43 A very simple model of the atmosphere (ignoring differences in temperature and humidity) says that the density of air falls off exponentially with altitude h. The density at sea level (h = 0) is approximately 1.2 kg/m^3 and it falls to half that value at an altitude of roughly h = 6000 m.
(a) Write a function for the density of air as a function of altitude, using the values given in the problem.
(b) Using this simplified model, find the total amount of air in a one square meter column extending from sea level up infinitely high.
(c) The temperature of the air, like the density, falls off roughly exponentially with altitude. Explain why it does not make sense to integrate the temperature with respect to altitude.
5.44 [This problem depends on Problem 5.43.] Consider the Earth as a sphere, and the atmosphere to be a spherical shell around the Earth. In this problem you will assume that the atmospheric density follows the same model as in Problem 5.43, but you will have to rewrite that model as a function of r (distance from the center of the earth) instead of h (height above sea level). Use R0 to represent the radius of the Earth (so r = R0 when h = 0).
(a) Write h as a function of r, and then use this to write the density of the atmosphere as a function of r instead of h.
(b) Calculate the total amount of air, where the atmosphere starts at r = R0 and continues up to infinity.
5.45 A four-sided pyramid has height H and four bottom edges of length L.
(a) Draw a pyramid with a horizontal slice through it at a distance y from the base. What shape is your slice?
(b) What is the volume of your slice in terms of its width w and its thickness dy?
(c) If you look at the pyramid directly facing one of the sides, it looks like an isosceles triangle with base L and height H. Draw that triangle and show your slice. Use that drawing to find the width w in terms of y, L, and H.
(d) Use an integral to add up the volumes of all the slices and find the total volume of the pyramid.
5.46 The Great Pyramid of Cheops is a four-sided pyramid (see the picture in Problem 5.45) with bottom edges L = 230 m long and height H = 147 m. Its volume is (1∕3)L^2 H. Find its center of mass, assuming uniform density.
5.47 Find the moment of inertia of a uniform cylinder of mass M, height H and radius R about an axis drawn parallel to the height, through the center.
5.49 A disk in the xz-plane defined by x^2 + z^2 ≤ R^2, with mass density 𝜎 = kx^2
5.50 A uniform solid sphere of mass M and radius R centered on the origin.
5.51 A uniform disk of radius R centered on the origin in the xy-plane has total charge Q. In this problem you will find the electric potential produced by this disk at the point (0, 0, h).
(a) Draw the disk with a ring-shaped slice at a distance 𝜌 from the center.
(b) Find the distance of this ring from the point (0, 0, h). (The answer will be essentially the same for all points on the ring.)
(c) Find the potential produced by this ring at the point (0, 0, h).
(d) Integrate to find the total potential at (0, 0, h) produced by the disk.
(e) As a reality check, find lim_{R→0} V by applying l’Hôpital’s rule to your answer. Explain why the result you got makes sense.
5.52 In this problem you are going to calculate the volume of a sphere, starting only with the knowledge that the area of a circle is 𝜋r^2. The drawing shows a sphere of radius R and a small “slice”: a disk at height y from the center of the sphere, with width dy. The radius r of the disk is shown: don’t confuse it with the radius R of the sphere!
(a) A cylinder of radius r and height h has volume 𝜋r^2 h. Use this fact to find the volume dV of the disk shown, as a function of r and dy.
(b) Based on the drawing, rewrite your function dV so that it depends on the variables y and dy and the constant R, but not the variable r.
(c) Integrate to add up the volumes of all such disks. (Remember that y is the variable and R is a constant!) You should conclude at the end that the volume of a sphere is V = (4∕3)𝜋R^3.
5.53 Find the height of the center of mass of a hemisphere of radius R and constant density 𝜎. (If you haven’t done Problem 5.52, it may help to work through that one first.)
(b) This strip is at position x in the horizontal direction. Multiply the area of the strip
times its density to find the mass dm of this thin strip.
(c) Put an integral sign in front of the expression you just wrote for dm and fill in the
appropriate limits of integration. Evaluate this integral to find the mass M of the
plank. You should be able to find the units of k from the expression 𝜎 = kx and
use them to check that your answer has units of mass. (Remember that density for
a 2D object has units of mass per area.)
See Check Yourself #27 in Appendix L
(d) Explain why the procedure you just followed would not have worked if you had
started by drawing a thin horizontal strip instead of a vertical one.
Now you will redo this problem assuming the density is given by 𝜎 = qxy. (You should figure
out the units of the constant q so you will be able to check the units on your final answer.)
Once again you begin by drawing a thin strip as shown in the picture above. Its area dA is
the same as what you calculated above.
2. Explain why you cannot find the mass of the strip dm just by multiplying density times
area as you did above.
3. Supposing our thin strip were drawn at x = 1, set up and evaluate an integral with
respect to y for the mass of the strip. Your integral will involve dx and dy and q, but
after you integrate with respect to y your final answer will be a function of dx and q.
4. Now supposing our thin strip were drawn at x = 2, set up and evaluate an integral with
respect to y for the mass of the strip.
5. Now generalize: our thin strip is drawn at some fixed x-value. Set up and evaluate an
integral with respect to y for the mass of the strip. Your final answer will be a function
of x, dx, and q.
6. Put an integral sign in front of the expression you just wrote for dm and fill in the
appropriate limits of integration. Evaluate this integral to find the mass M of the plank
(and check that your answer has correct units).
See Check Yourself #28 in Appendix L
7. In Part 1, just as in many problems in the previous sections, you calculated the mass
of a two-dimensional object using only one integral. What was it about the second
problem that made it require two integrals?
Finally, the rat population on the farm is the integral of dRstrip as y goes from 0 to H .
$$R = \int_{y=0}^{y=H}\left(\int_{x=0}^{x=W} q(H-y)(W-x)\,dx\right)dy$$
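The nested structure of this integral (add up along one strip, then add up the strips) maps directly onto a pair of nested loops. The sketch below is an illustration only: the function name and the values of q, W, and H are assumptions. Working the two integrals by hand gives qW^2H^2∕4, which the code prints for comparison.

```python
def rat_population(q, W, H, nx=200, ny=200):
    """Double sum of the density q (H - y)(W - x) over a W-by-H rectangle."""
    dx = W / nx
    dy = H / ny
    total = 0.0
    for j in range(ny):                  # outer loop: one horizontal strip per y-value
        y = (j + 0.5) * dy
        strip = 0.0
        for i in range(nx):              # inner loop: add up along the strip in x
            x = (i + 0.5) * dx
            strip += q * (H - y) * (W - x) * dx
        total += strip * dy              # the strip has thickness dy
    return total


q, W, H = 0.5, 4.0, 6.0  # arbitrary illustrative values
print(rat_population(q, W, H), q * W ** 2 * H ** 2 / 4)
```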
Question: A plastic rectangle with corners at (0, 0) and (3, 5) has an electric charge
with the charge density (charge per unit area) 𝜎 = x 3 ∕(1 + x 2 y)4 . Calculate the total
charge on the rectangle.
Answer: We can begin this problem by slicing the rectangle into horizontal strips as
shown.
Beginning students often think such a diagram
means “integrate y first” (after all, the strip has a height
dy), but it means the opposite: first we integrate dx
to add up the charge along the slice, and then dy to add all
the slices.
$$Q = \int_0^5 \int_0^3 \frac{x^3}{(1+x^2y)^4}\,dx\,dy \quad \text{(correct, but hard)}$$
Start again with the inner integral, now treating x as a constant. This time, a simple
u-substitution does the trick:
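$$\int_0^5 \frac{x^3}{(1+x^2y)^4}\,dy = \left[\frac{-x}{3(1+x^2y)^3}\right]_0^5 = \frac{-x}{3(1+5x^2)^3} - \frac{-x}{3}$$
$$\int_0^3\left(\frac{-x}{3(1+5x^2)^3} + \frac{x}{3}\right)dx = \left[\frac{1}{60(1+5x^2)^2} + \frac{x^2}{6}\right]_0^3 = 1.483$$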
Notice that the units seem to have disappeared entirely, because many of the units
were implicit in the original problem. The upper limit for x was given as 3, which had
to be shorthand for 3 meters or inches or some other unit of distance, but that unit
was hidden in the calculations. The constant 𝜎 also appeared to have wrong units,
which means it must really be a y e^(bxy) where a = b = 1 in whatever units the problem
was using. If you start with unitless numbers, you can’t check units at the end.
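The value 1.483 for the charge on the rectangle can be cross-checked without any antiderivatives. The sketch below sums the density 𝜎 = x^3∕(1 + x^2 y)^4 over a fine grid, integrating y first at each fixed x. The function name and the grid sizes are assumptions made for this illustration; a finer grid is used in y because the density changes quickly near y = 0.

```python
def total_charge(nx=200, ny=2000):
    """Midpoint-rule estimate of the charge on the rectangle from (0, 0) to (3, 5)."""
    dx = 3.0 / nx
    dy = 5.0 / ny
    total = 0.0
    for i in range(nx):
        x = (i + 0.5) * dx
        strip = 0.0
        for j in range(ny):              # inner sum over y, with x held constant
            y = (j + 0.5) * dy
            strip += x ** 3 / (1.0 + x * x * y) ** 4 * dy
        total += strip * dx              # outer sum over the vertical strips
    return total


print(total_charge())  # should land close to the 1.483 found in the text
```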
Our strategy, as always, is to divide the problem into infinitesimal “slices.” The picture
below shows a strip at a fixed y-value, extended along the x-direction, with a small thickness
dy. (We could just as easily have chosen the opposite.)
The volume under this slice is the thickness (dy) times the two-dimensional area (parallel
to the xz-plane). And how do we find that area? We integrate! This is the sort of integral we
are used to: y is constant, so we just want the area under a curve as x goes from a to b. The
area is therefore A = ∫_a^b z(x, y)dx, and the volume dV = (∫_a^b z(x, y)dx) dy. To find the entire
volume, we add up the differential volumes of all such slices; that is, we integrate as y goes
from c to d.
$$V = \int_c^d \int_a^b z(x, y)\,dx\,dy$$
You should be able to see how this process maps onto the double integral examples we
worked above. For example, in the rat problem x and y represented distances and the func-
tion z(x, y) represented rats per unit area, so the “volume” under each slice was the number of
rats in that region. Adding all of those up gave us the total “volume,” which was the number
of rats on the farm.
Separable Integrals
A “separable” function of x and y is one that can be written as the product of two functions:
one function of x only, and one of y only. If you are taking a double integral of a separa-
ble function and the limits of integration are all constants, you can evaluate each integral
independently, and then multiply the answers. (The phrase “the limits of integration are
constants” may not seem restrictive right now, but in later sections you’ll see problems where
they aren’t. In such cases you cannot separate the integral even if the integrand is separable.)
Question: Find the volume under the function z = e^x cos y over the region 0 ≤ x ≤ 2,
𝜋∕2 ≤ y ≤ 𝜋.
Answer: Following the previous example, we could write ∫_0^2 ∫_{𝜋∕2}^𝜋 e^x cos y dy dx and
evaluate the y integral first; or, we could write ∫_{𝜋∕2}^𝜋 ∫_0^2 e^x cos y dx dy and evaluate the x
integral first.
Instead, we choose to separate the function:
$$\left(\int_0^2 e^x\,dx\right)\left(\int_{\pi/2}^{\pi}\cos y\,dy\right)$$
In Problem 5.79 you will prove that separating an integral in this way leads to the same
answer as the “inside-out” approach.
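For this particular example the agreement is easy to see numerically. The sketch below compares the product of two single integrals with a nested "inside-out" evaluation. The helper name integrate and the slice counts are assumptions made for the illustration. The exact value is (e^2 − 1)(sin 𝜋 − sin(𝜋∕2)) = −(e^2 − 1), which is negative because cos y is negative on this range of y.

```python
import math


def integrate(f, a, b, n=2000):
    """Midpoint-rule estimate of a single integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


# Separated form: two independent single integrals, multiplied together.
separated = integrate(math.exp, 0.0, 2.0) * integrate(math.cos, math.pi / 2, math.pi)

# Inside-out form: for each y, integrate over x first; then integrate the result over y.
nested = integrate(
    lambda y: integrate(lambda x: math.exp(x) * math.cos(y), 0.0, 2.0, 200),
    math.pi / 2,
    math.pi,
    200,
)

exact = (math.exp(2) - 1.0) * (math.sin(math.pi) - math.sin(math.pi / 2))
print(separated, nested, exact)
```

Keep in mind that this shortcut relies on both conditions stated above: a separable integrand and constant limits of integration.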
(c) Now approach the same problem as $\int_{-1}^{1} \int_{-2}^{2} e^{3x+y} \, dx \, dy$, evaluating the inside (x) integral first, and then the outside. The individual steps will look different, but the final result should be the same that you found in Part (b).
(d) Finally, approach the same problem by separating it into $\left(\int_{-2}^{2} e^{3x} \, dx\right)\left(\int_{-1}^{1} e^{y} \, dy\right)$. Evaluate both integrals and multiply the answers, and confirm once again that you get the same answer you got in Part (b).
5.63 A square with corners at (0, 0) and (2, 2) has a charge density given by 𝜎 = sin(𝜋x∕2 + 𝜋y∕4).
(a) Draw the square, and draw a thin vertical strip at x = 1 with thickness dx. Find the total charge on this strip. Note that your answer will be a function of dx, but not of x or y.
(b) Draw the square again, and draw a thin horizontal strip at y = 1 with thickness dy. Find the total charge on this strip.
(c) Find the total charge on the square.
5.64 Find the volume under the surface $z = e^{y/x}$ on x ∈ [1, 3], y ∈ [0, 4].
(a) Write (but do not evaluate) the double integral such that the integral with respect to x should be taken first.
(b) Write (but do not evaluate) the double integral such that the integral with respect to y should be taken first.
(c) One of the two integrals can easily be evaluated by hand; the other cannot. Based on this, choose whether to use the form from Part (a) or Part (b).
(d) Evaluate the first (inner) integral. The result should be a single integral that is impossible to evaluate by hand.
(e) Evaluate the second (outer) integral using a computer or calculator.
5.65 A rectangular region extending from (0, 0) to (7, 10) is filled with a charge density (charge per unit area) $\sigma = k y e^{xy}$ where k is a positive constant. Your job is to find the total charge.
(a) Write the double integral with the x integral first and evaluate it.
(b) Write the double integral with the y integral first. Evaluating this is considerably harder; you can do the inner integral using integration by parts, but will still need a computer for the outer integral. Use one and verify that you get the same answer you got in Part (a).
5.66 Evaluate $\int_1^{\infty} \int_1^{\infty} e^{-x-2y} \, dx \, dy$
5.67 The Discovery Exercise (Section 5.3.1) had you find the mass of a plank of height H and width W with density 𝜎 = kx by drawing a thin vertical strip and taking a single integral over those strips. Find the mass of that strip starting with a thin horizontal strip. Doing it this way will require a double integral.
5.68 A rectangular square of plastic that goes from (0, 0) to (L, L) has an electric charge on it with the charge density (charge per unit area) $\sigma = k y e^{qx}$. Find the total charge on the plastic.
5.69 Consider $\int_0^{2\pi} \int_0^5 y^2 \sin x \, dy \, dx$.
(a) Evaluate the double integral.
(b) What does the answer tell you about the surface $z = y^2 \sin x$ on the specified domain?
5.70 The integral $\int_c^d \int_a^b k \, dx \, dy$ (where k is a constant) gives the volume of a rectangular solid.
(a) What is the dimension of this solid in the x-, y-, and z-directions?
(b) Evaluate the integral to show that it gives the correct volume for this rectangular solid.
5.71 Consider $\int_{-3}^{3} \int_{-5}^{5} x^2 e^y \, dx \, dy$.
(a) Evaluate the integral as written.
(b) Reevaluate the integral as x goes from 0 to 5 (instead of −5 to 5), and then double the result. Do you get the same answer you got in Part (a)?
(c) Reevaluate the original integral as y goes from 0 to 3 (instead of −3 to 3), and then double the result. Do you get the same answer you got in Part (a)?
(d) In general, under what circumstances is it valid to consider half the region and then double the result?
5.72 If $\int_a^b \int_0^1 (3x + 2y) \, dx \, dy = 0$, find a function a(b).
5.73 Find the electric potential at the origin produced by a uniform square of charge Q with corners at (0, 0) and (L, L).
5.74 A square with corners at (0, 0) and (2, 2) has a charge density given by 𝜎 = sin(𝜋x + 2𝜋y), where all quantities are measured in SI units (in which $k = 9 \times 10^9$). Set up and numerically evaluate an integral for the electric potential at the origin.
5.75 A survey team has measured the density of gold (mass of gold per unit area of desert) in the Cartesian Desert. Defining an origin at one corner of the desert they found the gold density in the region 0 ≤ x ≤ L,
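[Figure: a half-disk spanning x = −R to x = R, with a small rectangle of width dx and height dy at the point (x, y)]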
2. In the next drawing this small rectangle has been extended into a vertical strip. Write
a function for the y-value at the top of this strip as a function of x.
[Figure: the same half-disk, with the small rectangle extended into a vertical strip of width dx]
3. To compute the mass of the entire vertical strip, integrate your dm from Part 1 as y
goes from the bottom (y = 0) to the top (the function you found in Part 2). The result
will be a function of x and dx.
4. Integrate your answer from Part 3 as x goes from −R to R to compute the mass of the
entire half-disk. Check that your answer has correct units.
See Check Yourself #29 in Appendix L
[Figure: the triangular region, with the labels y = 20 and x = 10]
Students often write $\int_0^{10} \int_0^{20} (kxy) \, dy \, dx$: an understandable mistake, since y clearly goes from 0 to 20 and x from 0 to 10. But those limits of integration would give the mass of an entire rectangle, not of our triangle. You can see this clearly if you draw several different “slices.”
[Figure: the same triangle with two vertical slices, one at x = 2 and one at x = 6]
The two slices both have the same width dx, but different heights. The slice on the left has a height of 4, and therefore a mass $dm = \int_0^4 (2ky) \, dy \, dx$; for the slice on the right, $dm = \int_0^{12} (6ky) \, dy \, dx$.
Generalizing, a slice at an arbitrary x-position in the middle has a mass given by $dm = \int_0^{2x} (kxy) \, dy \, dx$. In this generic form we can add up the masses of all the slices, from the slice on the far left (x = 0) to the slice on the far right (x = 10).
$$m = \int_0^{10} \int_0^{2x} (kxy) \, dy \, dx \qquad (5.4.1)$$
In general, if all your limits of integration are constants, you are describing a rectangular
region. For other shapes, the inner limits of integration are functions. The outer (last) limits
of integration are still constants, so the answer comes out as a constant.
Just as in the previous section, you can line up your strips along either axis. If we break this
triangle into horizontal strips, each is bounded on the left by the line y = 2x, so the x-value
there is y∕2. Each strip is bounded on the right by x = 10, so the mass is:
$$m = \int_0^{20} \int_{y/2}^{10} (kxy) \, dx \, dy \qquad (5.4.2)$$
For a finite problem, the two approaches will always produce the same answer as each
other. (You may want to try your hand at the two integrals above; both should give you
5000k.) However, there are now two different considerations when choosing which variable
to integrate first. The integrand still matters, as it may be difficult or impossible to find an
antiderivative with respect to one of the variables. But the shape of the region also plays into
your decision, as the example below demonstrates.
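If you would rather check the 5000k claim numerically than by hand, here is a minimal sketch that evaluates both orders of integration over the triangle (vertical strips with y from 0 to 2x, horizontal strips with x from y∕2 to 10). The value k = 1 and the `midpoint` helper are assumptions made only for this illustration.

```python
def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

k = 1.0  # hypothetical value; the mass simply scales with k

# Vertical strips: y runs from 0 to 2x, then x runs from 0 to 10.
m_vertical = midpoint(
    lambda x: midpoint(lambda y: k * x * y, 0.0, 2 * x),
    0.0, 10.0)

# Horizontal strips: x runs from y/2 to 10, then y runs from 0 to 20.
m_horizontal = midpoint(
    lambda y: midpoint(lambda x: k * x * y, y / 2, 10.0),
    0.0, 20.0)

print(m_vertical, m_horizontal)  # both should be close to 5000 * k
```

Notice that in each case only the inner limits depend on a variable; the outer limits are constants, which is why the result is a number.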
Question: Find the volume under the function z = x + y over the region bounded by the curves $x = 4y - y^2$ and $x = 8y - 2y^2$.
[Figure: two images of the region between the curves $x = 4y - y^2$ and $x = 8y - 2y^2$. The image on the left shows the volume in 3D. The image on the right shows the shadow of this region in the xy-plane.]
Solution:
The first step in any such problem is deciding which way to integrate first. We choose horizontal strips, as drawn above. (More on this decision below.) The next step is to integrate along each such strip, summing up the function as x goes from $4y - y^2$ on the left to $8y - 2y^2$ on the right. Then we add up all the strips to get the correct double integral.
$$\int_0^4 \int_{4y-y^2}^{8y-2y^2} (x + y) \, dx \, dy$$
Note the order of those limits; x is $4y - y^2$ on the left, $8y - 2y^2$ on the right. In general, the lower value should always be the first (lower) limit of integration, or the sign of your answer will be wrong.
To evaluate, we begin as always with the inner integral:
$$\int_{4y-y^2}^{8y-2y^2} (x + y) \, dx = \left[\frac{x^2}{2} + xy\right]_{4y-y^2}^{8y-2y^2} = \frac{(8y - 2y^2)^2}{2} + (8y - 2y^2)y - \left[\frac{(4y - y^2)^2}{2} + (4y - y^2)y\right]$$
The rest of the problem—simplifying all that, and integrating it as y goes from
0 to 4—we leave as an exercise for the reader. Of course you’re not going to actually
do it (who would?), but it is worth taking a moment to convince yourself that you
could if you had to.
In the meantime, we turn our attention to one final question: what would have
happened if we had approached this particular problem with vertical strips instead of
horizontal ones?
To find the limits of integration, you have to find the y-values at the top and bottom of each strip. This presents two problems. First, you have to solve the equations $x = 4y - y^2$ and $x = 8y - 2y^2$ for y, which is doable but unpleasant. Second, the three strips shown are all bounded by different curves, so you have to break the problem up into three different regions and evaluate the integrals separately. See Problems 5.98–5.99.
[Figure: the region between $x = 4y - y^2$ and $x = 8y - 2y^2$ in the xy-plane, with three vertical strips]
This example makes a strong case for
beginning each problem with the question: “Which will be easier, horizontal slices
(dx first) or vertical (dy first)?” A few seconds of thought up front can save you a lot of
agony on the back end.
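The integral left unfinished in the example above (horizontal strips, x from $4y - y^2$ to $8y - 2y^2$, then y from 0 to 4) is tedious by hand but quick to check numerically. The sketch below uses a hypothetical `midpoint` helper; working the algebra by hand gives 1088∕15, which is the value it is compared against.

```python
def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def strip(y):
    """Integral of (x + y) dx along one horizontal strip at height y."""
    left = 4 * y - y * y        # x = 4y - y^2
    right = 8 * y - 2 * y * y   # x = 8y - 2y^2
    return midpoint(lambda x: x + y, left, right)

volume = midpoint(strip, 0.0, 4.0)
print(volume, 1088 / 15)  # the two values should nearly coincide
```

Setting this up with vertical strips instead would require three separate regions, each with inverse functions for its limits, which is the point the example makes.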
As a final word of caution: if you are doing an integral over a region in the xy-plane and your answer has x or y in it, you did something wrong. Most likely you put a function as the limit of integration for your last integral.
We’re going to find the volume of the shape shown below. The bottom and sides are the
xy-, xz-, and yz-planes, and the top is the plane x + y + z = L. You could approach this with
the techniques of Section 5.2: break the volume into thin slices, use geometry to find the
volume of each slice, and take a single integral to find the volume. Or, using the methods
of Section 5.4, you could find the “volume under a surface” using a double integral over the
triangular region at the bottom. We will approach it, however, by integrating the volume of
a little box across all three dimensions. We’ve drawn a little blue box dV = dx dy dz below.
We’ve also drawn a strip for the first integral, which we’ve arbitrarily chosen to take in the
z-direction.
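[Figure: the region, with the small box dV = dx dy dz and a vertical strip drawn in the z-direction]
The bottom of the strip is at z = 0 and the top of the strip is on the plane x + y + z = L, so the upper limit is at z = L − x − y.
$$dV_{\text{strip}} = \int_0^{L-x-y} dz \, dx \, dy$$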
(Note that we’ve put dz on the left since this is the first integral we’re taking.)
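To see where this setup is heading, here is a rough numerical sketch of the full triple integral. It assumes the remaining limits are y from 0 to L − x and x from 0 to L, which is what the triangle x + y ≤ L in the first quadrant of the xy-plane requires; the value L = 2 and the `midpoint` helper are hypothetical choices for illustration.

```python
def midpoint(f, a, b, n=60):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

L = 2.0  # hypothetical size of the region

def strip(x, y):
    """Integrate dz up one vertical strip, from z = 0 to the plane."""
    return midpoint(lambda z: 1.0, 0.0, L - x - y)

def slice_at(x):
    """Extend the strip in the y-direction, from y = 0 to y = L - x."""
    return midpoint(lambda y: strip(x, y), 0.0, L - x)

volume = midpoint(slice_at, 0.0, L)
print(volume, L**3 / 6)  # compare with the tetrahedron volume L^3 / 6
```

The comparison value L³∕6 is the standard volume of a tetrahedron cut off by the plane x + y + z = L.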
Next we choose, again arbitrarily, to extend this strip in the y-direction. Figure 5.4 shows the original strip and the extended slice.
FIGURE 5.4 The small rectangular block in the middle is dx dy dz. It is in a vertical column extending in the z-direction, which is in a larger triangular region extending in the y-direction.
It isn’t easy to look at that picture and see the limits of integration for y, and it becomes more difficult as the regions become more complicated. So here comes a valuable trick that we will return to many times. Since only the x and y integrals remain, let’s see what that last picture looks like when viewed directly from overhead: its “shadow” on the xy-plane. (See Figure 5.5.)
Make sure you see the relationship between these two representations of the same region. The little box in Figure 5.5 represents the vertical strip in Figure 5.4. If you integrate over x and y to add up all the little boxes in the 2D triangle, you’re adding up all the strips in the 3D region, which will perfectly fill in the volume we’re
Question: A right circular cone has height H and edges defined by the equation $z^2 = x^2 + y^2$. Its charge density (charge per unit volume) is $c x^2 y z$. Find the total charge in the cone.
Solution:
On Page 200 we broke a similar cone into small circular “slices” with height dz, and
it’s worth taking a moment to note why we can’t use that strategy here. The charge
density in this cone is a function of x and y as well as z, so such a slice—no matter how
thin—would not have a uniform density. We need to integrate over every variable.
We choose to begin with z, which means drawing a tall thin strip. This strip exists at a particular spot (x, y), covering a small area dx dy, and extending from $z = \sqrt{x^2 + y^2}$ at the bottom to z = H at the top. Our first integral is therefore:
$$\int_{\sqrt{x^2+y^2}}^{H} c x^2 y z \, dz$$
[Figure: the cone of height H with a tall thin strip drawn from the cone’s surface up to z = H]
To find the limits on the other two variables, ask the question: where do we need to draw all those strips? We need to draw them extending down from every point along the top circle.
[Figure: the top circle of radius H, seen from above in the xy-plane]
You can evaluate that by hand or with a computer. You can also take a sneaky shortcut
by evaluating the z integral by hand and then converting the remaining integral to
polar coordinates (Section 5.6). However you do it, you end up with the answer 0.
Well, of course you do! The charge density is odd in y, and the region of integration is
symmetric in y. Every positive charge at y > 0 is perfectly balanced by a negative
charge at y < 0, so the total charge of zero was predictable at the outset.
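The symmetry argument can be checked numerically. The sketch below assumes the Cartesian limits implied by the discussion above: z from $\sqrt{x^2 + y^2}$ to H, y across the top circle of radius H, and x from −H to H. The values c = 1 and H = 1, the `midpoint` helper, and the function name `cone_charge` are all hypothetical.

```python
import math

def midpoint(f, a, b, n=40):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def cone_charge(c, H, include_negative_y=True):
    """Integrate c x^2 y z over the cone z^2 = x^2 + y^2 below z = H."""
    def strip(x, y):
        # Tall thin strip: z runs from the cone's surface up to H.
        return midpoint(lambda z: c * x * x * y * z, math.hypot(x, y), H)

    def slab(x):
        # y runs across the top circle x^2 + y^2 = H^2 at this x.
        half_width = math.sqrt(H * H - x * x)
        y_low = -half_width if include_negative_y else 0.0
        return midpoint(lambda y: strip(x, y), y_low, half_width)

    return midpoint(slab, -H, H)

total = cone_charge(1.0, 1.0)
upper_half = cone_charge(1.0, 1.0, include_negative_y=False)
print(total, upper_half)
```

The full-cone total should vanish up to rounding error, while the y > 0 half alone carries a clearly nonzero positive charge that the y < 0 half cancels.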
In Problems 5.106–5.110 integrate the function f (x, y, z) in the given region.
5.115 Find the volume of the region bounded by $z = x^2 + y^2$ and z = 2x + 8.
The drawing below shows a small region within some small d𝜌 at a distance 𝜌 from the origin,
and some small d𝜙 at an angle 𝜙 off the positive x-axis. Your job is to find the area dA of this
region.
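[Figure: a small region at distance 𝜌 from the origin and angle 𝜙 off the positive x-axis, with radial thickness d𝜌 and angular width d𝜙, lying in a ring between two dotted lines]
1. We begin by ignoring 𝜙 and looking at the entire ring between the two dotted lines. If this ring were cut at the bottom and stretched up into a rectangle, what would be the width and length of this rectangle? What would be its area?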
2. What fraction of the ring is represented by our region?
3. Based on those two answers, what is the area of the region dA?
Suppose you set out to use a double integral to find the area of a circle of radius R. The area of a tiny box is dx dy, and the lower and upper limits for y are given by the formula for a circle, $x^2 + y^2 = R^2$.
$$A = \int_{-R}^{R} \int_{-\sqrt{R^2-x^2}}^{\sqrt{R^2-x^2}} dy \, dx = \int_{-R}^{R} 2\sqrt{R^2 - x^2} \, dx$$
You may remember how to use a trig substitution to turn this into an integral of $\cos^2\theta$ which you can then simplify with a double-angle formula, but you’re probably not looking forward to it. When you’re working with a
circle, it’s often easier to use polar coordinates: instead of identifying a point by its x- and
y-coordinates you use 𝜌, the distance to the origin, and 𝜙, the angle off the positive x-axis. (If
you prefer the letters r and 𝜃 to represent those quantities, we’ll explain in Section 5.7 why
we don’t.)
It’s easy to see the limits of integration for this particular problem if you work in polar coordinates: 𝜌 goes from 0 (at the center) to R (at the edge), and 𝜙 goes from 0 to 2𝜋 to trace all around the circle. So you might think we’re simply going to evaluate $\int_0^{2\pi} \int_0^R d\rho \, d\phi$. But that integral can’t be right; it has units of length, not area, because d𝜌 is a length but d𝜙 is an angle (and thus unitless).
So where did we go wrong? We got the limits of integration correct, but to find the total
area we have to add up the areas of all the little boxes that make up the circle, and d𝜌 d𝜙
is not that area. Below we will show you exactly how to define a differential box in polar
coordinates and find its area. For the moment, we’ll jump to the answer: dA = 𝜌 d𝜌 d𝜙. That
extra factor of 𝜌 is so important that mathematicians have given it a special name, and we
are giving it its own special box.
For this equation to work you also have to find the correct limits of integration for 𝜌 and 𝜙 so you
are covering the same 2D region you were in the Cartesian integral.
This equation assumes 𝜌 is positive.
With that little trivia fact in hand, we’re ready to return to the area of our circle.
In Cartesian coordinates the integrand when you’re performing a double integral to find
area is just 1, so in polar coordinates we get the following.
$$\int_0^{2\pi} \int_0^R \rho \, d\rho \, d\phi = \int_0^{2\pi} \frac{R^2}{2} \, d\phi = \pi R^2$$
That was easy. But where did that extra 𝜌 come from?
We were hoping you’d ask.
We begin as always with a little box. Our traditional dA has been defined by a small
dx and a small dy; this one, of course, is defined by a small d𝜌 and a small d𝜙. Unlike
our usual box, this one is not literally a rectangle—the lines on the left and right are not
quite parallel to each other, and the top and bottom are not strictly lines at all—but it does
become more rectangular as d𝜙 → 0, so we will make that approximation. The “height” of the
rectangle is d𝜌.
But in no approximation is d𝜙 the width; d𝜙 is the angle subtended by the box. What we
need is the length of an arc: a part of a circle of radius 𝜌, subtending an angle d𝜙. Since the
whole circle has circumference 2𝜋𝜌 and this arc takes up a fraction d𝜙∕(2𝜋) of that circle,
the length of the arc is 𝜌 d𝜙.
So the area of the box is 𝜌 d𝜌 d𝜙. (The Discovery Exercise, Section 5.6.1, was designed to
guide you to this conclusion.) This box does not only apply to the “area of a circle” problem
we worked above; it is the fundamental area unit in polar coordinates. That is the logic
underlying the rule in the box above.
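A short numerical experiment makes the role of the extra 𝜌 concrete. The sketch below integrates over 0 ≤ 𝜌 ≤ R and 0 ≤ 𝜙 ≤ 2𝜋 twice, once with the area element 𝜌 d𝜌 d𝜙 and once with the incorrect d𝜌 d𝜙. The radius R = 3 and the `midpoint` helper are hypothetical.

```python
import math

def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

R = 3.0  # hypothetical radius

# Correct area element: dA = rho d(rho) d(phi).
with_jacobian = midpoint(
    lambda phi: midpoint(lambda rho: rho, 0.0, R),
    0.0, 2 * math.pi)

# Forgetting the factor of rho: the integrand is just 1.
without_jacobian = midpoint(
    lambda phi: midpoint(lambda rho: 1.0, 0.0, R),
    0.0, 2 * math.pi)

print(with_jacobian, math.pi * R**2)     # area of the circle
print(without_jacobian, 2 * math.pi * R) # a length, not an area
```

The second result equals 2𝜋R, which has units of length, matching the units argument made in the text.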
Solution:
The two things we need to know about our differential box in order to find the potential it produces are its charge and its distance from the point (0, 0, h). The charge is charge density times area: $dq = (c \cos^2\phi)(\rho \, d\rho \, d\phi) = c\rho \cos^2\phi \, d\rho \, d\phi$. From the picture we can see that the distance r is given by the Pythagorean theorem as $r = \sqrt{\rho^2 + h^2}$. Recalling that potential due to a point charge is kq∕r, the total potential is thus
$$V = kc \int_0^R \int_0^{2\pi} \frac{\rho \cos^2\phi}{\sqrt{\rho^2 + h^2}} \, d\phi \, d\rho = \pi kc \int_0^R \frac{\rho}{\sqrt{\rho^2 + h^2}} \, d\rho = \pi kc \left(\sqrt{R^2 + h^2} - \sqrt{h^2}\right)$$
(In this example, integrating with respect to 𝜌 before 𝜙 would have turned out just as
easy. Remember, however, that sometimes one order will be easier than the other.)
From the formula for charge density c must have units of charge per distance
squared, so this has units of k times charge per distance, as it should.
In Problem 5.139 you’ll set this same problem up in Cartesian coordinates, which
should dispel any doubts about how useful other coordinate systems can be.
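The closed form in this example can be compared against a direct numerical evaluation of the polar double integral. In the sketch below the values k = c = 1, R = 2, and h = 1 are hypothetical unit-free numbers, and `midpoint` is an illustrative helper rather than anything from the text.

```python
import math

def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

k, c, R, height = 1.0, 1.0, 2.0, 1.0  # hypothetical values

def ring(rho):
    """Integrate over phi at fixed rho; the extra rho is the Jacobian."""
    return midpoint(
        lambda phi: rho * math.cos(phi) ** 2 / math.sqrt(rho**2 + height**2),
        0.0, 2 * math.pi)

numeric = k * c * midpoint(ring, 0.0, R)
closed_form = math.pi * k * c * (math.sqrt(R**2 + height**2) - height)
print(numeric, closed_form)
```

Swapping the order of the two `midpoint` calls gives the same number, consistent with the remark that either order works here.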
two lines. This tells us that the line y = c always sets the boundary for 𝜌 in this region.
[Figure: the region bounded by the lines y = x, y = 2x, and y = c]
But in polar coordinates we need to express that boundary in the form 𝜌(𝜙), so instead of y = c we will write 𝜌 sin 𝜙 = c. We see then that if 𝜌 goes from 0 to c∕(sin 𝜙), it will cover the entire region. Meanwhile, 𝜙 goes from 𝜋∕4 on the right to $\tan^{-1} 2$ on the left. So the charge of our region is given by:
$$\int_{\pi/4}^{\tan^{-1} 2} \int_0^{c/\sin\phi} (k\rho\phi) \, \rho \, d\rho \, d\phi$$
It’s easy enough to evaluate the first integral by hand, and the second with a
computer or calculator (or just do both on a computer). Putting in c = 2 and k = 0.5
it comes out to 0.78 C. The “needs-a-person” math is all in the setup.
Suppose you had forgotten the Jacobian and written this.
$$\int_{\pi/4}^{\tan^{-1} 2} \int_0^{c/\sin\phi} (k\rho\phi) \, d\rho \, d\phi \qquad \text{mistake!}$$
You know from the problem that k𝜌𝜙 is a charge per unit area (that’s what charge
density is in 2D), so to get the total charge it must be multiplied by an area. Since d𝜌
has units of length and d𝜙 is unitless, you can see that this must be wrong. You couldn’t
do this units checking if you just replaced c and k with numbers at the beginning.
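For readers who want to reproduce the 0.78 C figure, here is a minimal sketch that evaluates the charge integral with the Jacobian included: the density k𝜌𝜙 times the area element 𝜌 d𝜌 d𝜙, with 𝜌 from 0 to c∕sin 𝜙 and 𝜙 from 𝜋∕4 to tan⁻¹ 2. The `midpoint` helper is a hypothetical stand-in for the "computer or calculator" mentioned in the example.

```python
import math

def midpoint(f, a, b, n=200):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

c, k = 2.0, 0.5  # the values used in the example

def wedge(phi):
    """Charge along one thin wedge at angle phi, out to the line y = c."""
    rho_max = c / math.sin(phi)
    # Density k*rho*phi times the Jacobian rho.
    return midpoint(lambda rho: k * rho * phi * rho, 0.0, rho_max)

charge = midpoint(wedge, math.pi / 4, math.atan(2.0))
print(round(charge, 2))
```

Dropping the second factor of `rho` in `wedge` reproduces the "mistake" integral, which gives a different number with the wrong units.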
density on the object is 𝜎 = 𝜌∕ sin 𝜙. Find the electric potential at the origin.
5.152 Exploration: Jacobians in Alternative Coordinate Systems. If you move from (𝜌, 𝜙) to (𝜌 + d𝜌, 𝜙 + d𝜙) you sweep out an area of 𝜌 d𝜌 d𝜙, which is why 𝜌 is the polar Jacobian. To generalize this result to arbitrary coordinate systems we need to find the area of the parallelogram in Figure 5.8 on Page 225.
(a) Let dy equal the change in the y-coordinate between the top and bottom of the vector labeled $\vec{a}$ in Figure 5.8. Explain why dy = (𝜕y∕𝜕u)du.
(b) Write the $\hat{i}$- and $\hat{j}$-components of $\vec{a}$.
(c) Write the $\hat{i}$- and $\hat{j}$-components of $\vec{b}$.
(d) The area of a parallelogram with sides $\vec{a}$ and $\vec{b}$ is the magnitude of the cross product, $|\vec{a} \times \vec{b}|$. (See Section 5.10 Problem 5.271.) Find the area of the du dv parallelogram. (In order to take this cross product you will have to treat both sides as three-dimensional vectors with $\hat{k}$-components of zero.)
(e) The Jacobian is a function f (u, v) such that the area of that parallelogram is f (u, v) du dv. Write the Jacobian for the uv-coordinate system.
Equation 5.6.1 gives a general formula for the Jacobian in two dimensions. (If you worked Problem 5.152 you derived that formula, but these problems do not depend on the derivation.) In Problems 5.153–5.156 use that formula to evaluate the Jacobian for the given coordinate system. You will need to start by writing x and y in terms of the new coordinates if those formulas aren’t given in the problem.
5.153 Use the formula to verify the Jacobian for polar coordinates.
5.154 A coordinate system rotated 45° relative to the x- and y-axes can be defined by $x = (u - v)/\sqrt{2}$, $y = (u + v)/\sqrt{2}$. In addition to finding the Jacobian, explain why the result makes sense. In other words, how could you have predicted this Jacobian without calculating it?
5.155 $x = 2\sqrt{u}$, $y = (u + v)/\sqrt{2}$
5.156 It’s possible to define a set of coordinates R and 𝜙 that show the infinite 2D plane in a finite circle. The angle 𝜙 is the usual polar angle and the distance R is related to the polar coordinate 𝜌 by 𝜌 = R∕(1 − R). You can see from this definition that the domain 0 ≤ R < 1 is equivalent to the domain 0 ≤ 𝜌 < ∞.
5.157 Region R is bounded by the curves $y^2 = 4 + 4x$ and $y^2 = 4 - 4x$ and the constraint y ≥ 0.
(a) Draw the region.
(b) Integrate the function f (x, y) = 3y in region R.
Now you will reframe the region by using the coordinate system $x = u^2 - v^2$, y = 2uv, subject to the restriction u ≥ 0.
(c) Explain why we can assume v ≥ 0 even though this restriction was not specified.
(d) One of the bounding curves of R is $y^2 = 4 + 4x$. Rewrite this curve as an equation in terms of u and v. (It may help to remember that u and v must be real numbers.)
(e) Similarly rewrite $y^2 = 4 - 4x$ in terms of u and v.
(f) What is the final bounding curve? Write the equation for this curve in xy space. Then write the corresponding equations (two of them!) in uv space.
(g) Draw the resulting region in uv space.
(h) Calculate the Jacobian of the uv coordinate system.
(i) Convert the function f (x, y) = 3y into the uv system and then integrate it through the region in uv space. You should get the same answer you got in Part (b).
(j) Integrate the function $f(x, y) = 1/\sqrt{x^2 + y^2}$ on region R. (Unlike our first example, this is really tough to do in Cartesian coordinates; use the uv transformation instead!)
5.158 In this problem you will integrate the function $f(x, y) = x^2 + y^2$ in the diamond-shaped region with corners at (0, 1), (1, 0), (0, −1), and (−1, 0).
(a) Draw the region.
(b) Evaluate the integral in Cartesian coordinates. You will have to be careful about the limits of integration!
(c) Evaluate the integral with the substitution x = u + v, y = u − v. Remember that any coordinate transformation requires three changes: the integrand, the region (or the limits of integration), and the Jacobian.
5.159 Integrate $(y - x)e^{(2x+y)(y-x)}$ within the region bounded by the lines y = −2x, y = x, y = 1 − 2x, and y = x + 1. Hint: The integrand suggests a coordinate system that might make this problem easier.
[FIGURE 5.10: a point (r, 𝜃, 𝜙) drawn with the line r from the origin, the angle 𝜃 from the z-axis, the height z, and the cylindrical 𝜌 and 𝜙; the point (r, 𝜋∕2, 𝜙) is also marked]
5. In Figure 5.10, the angle between the blue line labeled r and the z-axis is labeled 𝜃. The angle between the blue line labeled r and the black line labeled z is not labeled. What is that angle in terms of spherical coordinates?
6. What are the Cartesian (x, y, z) coordinates of the point whose spherical coordinates are r = 2, 𝜃 = 𝜋∕6, 𝜙 = 𝜋∕3? Hint: begin by calculating the cylindrical 𝜌 even though it is not your final goal.
7. What are the spherical (r, 𝜃, 𝜙) coordinates of the point whose Cartesian coordinates are (3, 4, 10)?
See Check Yourself #31 in Appendix L
8. What shape is described by the equation r = 2?
9. What shape is described by the equation 𝜃 = 𝜋∕3?
Question:
The picture to the left shows three points A, B, and C. Their Cartesian coordinates are (1, 1, 0), (1, 1, 2), and (0, −1, −1) respectively. What are their cylindrical coordinates?
[Figure: the points A, B, and C plotted on x-, y-, and z-axes]
Solution:
Points A and B are at 𝜙 = 𝜋∕4 and $\rho = \sqrt{2}$. The fact that they are at different heights has no effect on their 𝜌 and 𝜙 coordinates. Point C is at 𝜙 = 3𝜋∕2 and 𝜌 = 1.
In all cases the cylindrical z coordinate is the same as the Cartesian one, so you have
A: $(\sqrt{2}, \pi/4, 0)$
B: $(\sqrt{2}, \pi/4, 2)$
C: $(1, 3\pi/2, -1)$
We deliberately presented that example without conversion equations because it’s impor-
tant to think about coordinate systems visually. When you do need the equations, they’re
not hard: the conversions between (x, y) and (𝜌, 𝜙) are the same as they are for polar, and z
remains unchanged. See Appendix D.
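When you do want the equations, the Cartesian-to-cylindrical conversion takes only a few lines. The sketch below applies the usual polar formulas to the three points of the example above; the function name `to_cylindrical` is hypothetical, and the angle is wrapped into the range from 0 to 2𝜋 to match the convention used for point C.

```python
import math

def to_cylindrical(x, y, z):
    """Convert Cartesian (x, y, z) to cylindrical (rho, phi, z)."""
    rho = math.hypot(x, y)                   # distance from the z-axis
    phi = math.atan2(y, x) % (2 * math.pi)   # angle off the positive x-axis
    return rho, phi, z

point_a = to_cylindrical(1, 1, 0)
point_b = to_cylindrical(1, 1, 2)
point_c = to_cylindrical(0, -1, -1)
print(point_a, point_b, point_c)
```

Using `atan2` rather than a plain arctangent of y∕x picks the correct quadrant, which matters for a point like C that sits on the negative y-axis.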
The final piece for taking integrals with cylindrical coordinates is the Jacobian. We must find the volume of the differential box shown in the picture below.
[Figure: a differential box in cylindrical coordinates, with sides d𝜌, 𝜌 d𝜙, and dz]
The vertical side has height dz. The other two sides can be seen from the shadow of the box on the xy-plane, which is the same as the differential box in polar coordinates. The volume of the 3D box is thus:
$$dV_{\text{cylindrical}} = \rho \, d\rho \, d\phi \, dz$$
Equivalently we can say the Jacobian in cylindrical coordinates is 𝜌, just as it was in polar coordinates.
We are now ready to return to our cylinder of gas. The verbal description of the concentration leads readily to a cylindrical function: $c = ke^{-\omega z}/\rho$. The limits of integration are also easy to write in this system. (You may want to figure them out before reading our answer below.) So the total amount of gas in the tank is:
$$n = \int_0^H \int_0^R \int_0^{2\pi} k \left(e^{-\omega z}/\rho\right) \rho \, d\phi \, d\rho \, dz = \frac{2\pi k R}{\omega}\left(1 - e^{-\omega H}\right)$$
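As a sanity check on the result above, the following sketch evaluates the triple integral numerically, with the concentration $ke^{-\omega z}/\rho$ multiplied by the cylindrical Jacobian 𝜌, and compares it with the closed form. The parameter values and the `midpoint` helper are hypothetical choices for illustration only.

```python
import math

def midpoint(f, a, b, n=40):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

k, omega, R, H = 1.0, 0.5, 2.0, 3.0  # hypothetical values

def concentration(rho, z):
    return k * math.exp(-omega * z) / rho

def at_height(z):
    """Integrate over phi, then rho, at a fixed height z."""
    return midpoint(
        lambda rho: midpoint(
            lambda phi: concentration(rho, z) * rho,  # rho is the Jacobian
            0.0, 2 * math.pi),
        0.0, R)

amount = midpoint(at_height, 0.0, H)
closed_form = 2 * math.pi * k * R / omega * (1 - math.exp(-omega * H))
print(amount, closed_form)
```

The Jacobian cancels the 1∕𝜌 in the concentration, so the integrand stays finite on the axis even though the concentration itself does not.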
Those limits of integration were constants in cylindrical coordinates because the region
was, conveniently, a cylinder. For more difficult regions we use the same trick we empha-
sized in Cartesian 3-dimensional integrals: find the limits for one variable, thus reducing the
problem to a two-dimensional region to find the remaining limits.
Which variable first? It depends on the region, but most often integrals that you do in cylindrical or spherical coordinates are over regions that you can make by taking a shape and rotating it around the z-axis, either partway or entirely. In that case the 𝜙 integral will have constant limits of integration and the 2D shape you are left with in 𝜌 and z is simply the shape that was rotated around the z-axis, as shown to the right.
[Figure: the rectangle 0 ≤ 𝜌 ≤ R, 0 ≤ z ≤ H in the 𝜌z-plane, rotated through 𝜙 around the z-axis]
Each tiny box in this 2D shape represents a ring around the z-axis in the 3D region. Adding up all of those boxes in 2D means you are adding up all of those rings in 3D, which exactly fills the cylinder. With this 2D image it’s easy to see that the limits are 0 ≤ 𝜌 ≤ R, 0 ≤ z ≤ H.
From the density expression k has units of mass per distance to the fourth, and from
the formula for the cone c is unitless, so the units are mass times distance squared, as
they should be.
Spherical Coordinates
Our last system is “spherical coordinates” (sometimes called “spherical polar coordinates”).
The first coordinate r is distance from the origin. The other two coordinates are angles. The
“azimuthal angle” 𝜙 is the same variable we used in cylindrical coordinates, the angle around
the z-axis going counterclockwise from the positive x-axis. The “polar angle” 𝜃 is the angle
down from the positive z-axis. Spherical coordinates are written in the order (r , 𝜃, 𝜙).
The drawing to the left shows both the cylindrical and the spherical coordinates for a point. Note that the domains r ≥ 0, 𝜙 ∈ [0, 2𝜋], and 𝜃 ∈ [0, 𝜋] are enough to cover all possible points.
[Figure: a point (r, 𝜃, 𝜙) with r, 𝜃, 𝜙, 𝜌, and z marked, along with the point (r, 𝜋∕2, 𝜙)]
On the Earth’s surface 𝜃 and 𝜙 represent latitude and longitude respectively, with one difference. Latitude is zero at the equator and ±𝜋∕2 at the poles. (It’s usually given in degrees, but spherical coordinates are always given in radians so we’ve converted latitude to radians as well.) The spherical coordinate 𝜃, however, is 0 at the North pole, 𝜋∕2 at the equator, and 𝜋 at the South pole.
See Appendix D for the equations for converting between Cartesian and spherical
coordinates. (They are not hard to derive once you realize that the cylindrical 𝜌 equals
r sin 𝜃.)
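The conversions follow directly from that observation. The sketch below uses the standard relations x = r sin 𝜃 cos 𝜙, y = r sin 𝜃 sin 𝜙, z = r cos 𝜃 with the (r, 𝜃, 𝜙) ordering of this section; the function names and the sample point (1, 2, 2) are hypothetical, so check the formulas against Appendix D before relying on them.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """theta is the polar angle from +z; phi is the azimuthal angle."""
    rho = r * math.sin(theta)  # cylindrical rho = r sin(theta)
    return rho * math.cos(phi), rho * math.sin(phi), r * math.cos(theta)

def cartesian_to_spherical(x, y, z):
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0  # angle is arbitrary at the origin
    phi = math.atan2(y, x) % (2 * math.pi)
    return r, theta, phi

r, theta, phi = cartesian_to_spherical(1.0, 2.0, 2.0)
x, y, z = spherical_to_cartesian(r, theta, phi)
print((r, theta, phi), (x, y, z))
```

Converting to spherical coordinates and back should return the original point up to rounding.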
Figure 5.11 shows a differential box in spherical coordinates. Its volume is $r^2 \sin\theta \, dr \, d\theta \, d\phi$, so the Jacobian in spherical coordinates is $r^2 \sin\theta$. You’ll derive this formula in Problem 5.181.
[FIGURE 5.11: a differential box in spherical coordinates, with dr, d𝜃, and d𝜙 marked]
Question: A well-known swindler named Quark attempts to sell you a sphere of pure
dilithium, but your lab analysis indicates that only the outer edge is pure dilithium,
and the concentration (fraction of dilithium by volume) rises linearly with radius
from zero at the center to one at the edge. What fraction of this sphere is dilithium?
Solution:
Both the region of integration and the integrand are perfect candidates for spherical coordinates. The concentration is c = r∕R, where R is the radius of the sphere. The volume of dilithium in a differential box is the volume of the box times the fraction of it that is dilithium, so $dV_d = (r/R) \, r^2 \sin\theta \, dr \, d\theta \, d\phi$. You can integrate $dV_d$ to find the total volume of dilithium in the sphere, and divide that by the volume of the sphere to find the fraction of dilithium. To cover all directions, 𝜙 should go from 0 to 2𝜋 and 𝜃 should go from 0 to 𝜋. To cover the entire sphere r should go from 0 to R.
$$f = \frac{1}{V_{\text{sphere}}} \int_0^{2\pi} \int_0^{\pi} \int_0^{R} \frac{1}{R} r^3 \sin\theta \, dr \, d\theta \, d\phi = \frac{3}{4}$$
The answer comes out unitless, which makes sense for volume of dilithium per
volume of the sphere.
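The 3∕4 result is easy to confirm numerically. The sketch below integrates the concentration r∕R times the spherical Jacobian $r^2 \sin\theta$ over the whole sphere and divides by the sphere’s volume; the radius R = 1 and the `midpoint` helper are hypothetical, and the answer should not depend on R.

```python
import math

def midpoint(f, a, b, n=40):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

R = 1.0  # hypothetical radius

def shell_piece(theta):
    """Integrate (r/R) * r^2 * sin(theta) over r at fixed theta."""
    return midpoint(lambda r: (r / R) * r * r * math.sin(theta), 0.0, R)

dilithium_volume = midpoint(
    lambda phi: midpoint(shell_piece, 0.0, math.pi),
    0.0, 2 * math.pi)

fraction = dilithium_volume / (4.0 / 3.0 * math.pi * R**3)
print(fraction)  # should be close to 0.75
```

Leaving out the `math.sin(theta)` factor or one power of `r` changes the answer, which is a useful way to convince yourself the Jacobian matters.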
To make matters worse, both mathematicians and physicists write spherical coordinates
with 𝜃 before 𝜙. Since the meanings of 𝜃 and 𝜙 are reversed, the coordinates (1, 𝜋∕2, 𝜋∕6)
represent two completely different points in the two different systems. And just to add one
more bit of confusion, the angle around the z-axis is called the “azimuthal” angle and the
angle down from the z-axis the “polar” angle, even though the azimuthal angle is the one
that shows up in polar coordinates.
We are writing a book for physics and engineering students, and we will consistently follow
the conventions presented in this section. If you are taking another math class while going
through this book, you may have to change letters on a daily basis. In response to the con-
fusion this may engender, we offer the following sentiment from the famous philosopher
Han Solo:
It’s not our fault.
5.166 This problem requires that you know how to calculate the determinant of a matrix. See Chapter 6. Use Equation 5.6.2 to find the Jacobians for cylindrical and spherical coordinates. Make sure your answers match the ones we gave.
5.167 The Earth has a radius of 4000 miles, and rotates about its axis once every 24 h. We take the axis of rotation as the z-axis. A dead girl named Lucy starts out at time t = 0 at the (Cartesian) position (0, 3464, 2000) and lies in the ground, rotating with the Earth.
(a) Write spherical functions r (t), 𝜃(t), 𝜙(t) for her position as a function of time.
(b) Convert your functions to x(t), y(t), z(t).
5.168 The Earth has a radius of approximately 6000 km and rotates around its axis every 24 h. James begins at the South Pole and flies, drives, and boats so that he is always going due North at 50 km/h. (Note that he maintains a distance of 6000 km from the center of the Earth, which is the origin, and that while he travels North, he is also spinning with the Earth.) We take the axis of rotation as our z-axis.
(a) How long does it take James to travel from the South Pole to the North Pole?
(b) Give the spherical functions r (t), 𝜃(t), 𝜙(t) that represent James’ motion.
(c) Give the cylindrical functions 𝜌(t), 𝜙(t), z(t) that represent his motion. (Your answers will be easier to interpret if you simplify them with a few trig identities.)
(d) At what time does James reach the equator? Explain how you can tell that your 𝜌 and z functions make sense at this time.
(e) At what time does James reach the North Pole? Explain how you can tell that your 𝜌 and z functions make sense at this time.
(f) Give the Cartesian functions x(t), y(t), z(t) that represent his motion.
5.169 Walk-Through: Integral in Cylindrical or Spherical Coordinates. An object enclosed by the paraboloid $z = c(x^2 + y^2)$ from z = 0 to z = H has density $D = k\sqrt{x^2 + y^2}/z$.
(a) What are the units of c and k?
(b) Sketch the region. Your sketch doesn’t have to be a work of art, but it should be clear enough to give you a clear sense of what region you’re integrating over.
(c) Explain why the limits of integration for this region will be simpler in cylindrical
you are integrating. Since the function f is also simplest in cylindrical coordinates, that’s the system you’ll use.
(d) Rewrite the formulas for the paraboloid and the density in terms of cylindrical coordinates.
(e) Write the mass dm of a differential volume. Be sure to include the Jacobian in cylindrical coordinates.
(f) What are the limits of integration for 𝜙?
(g) Draw the 2D shape that remains after you’ve written the 𝜙 integral. The drawing should have a 𝜌-axis and a z-axis. Include a differential box in your drawing. (This small box in 𝜌z space represents a ring in xyz space; take a moment to visualize that.)
(h) Using your 2D drawing to guide you, find the limits of integration for 𝜌 and z. You can take these two integrals in either order, but for this problem the d𝜌 dz integration turns out easier than dz d𝜌. (We know because we tried both ways; sometimes that’s what you have to do.)
(i) Write and evaluate the triple integral to find the mass of the object.
(j) Check the units of your answer.
For Problems 5.170–5.175 integrate the given function over the indicated region.
5.170 $f(\rho, \phi, z) = \rho^3 \cos\phi / z$. The region is the cylinder with boundaries $z = H_1$, $z = H_2$, and 𝜌 = R.
5.171 $f(\rho, \phi, z) = e^{\rho^2} z^2$. The region is defined by taking the cylinder bounded by z = 0, z = H, and 𝜌 = R and only considering the half of it with x ≥ 0.
5.172 f (𝜌, 𝜙, z) = z cos(𝜙∕4 + 𝜌∕R). The region is a right circular cone of height H and radius R pointing upward with its vertex at the origin.
5.173 $f(r, \theta, \phi) = r^2 \cos^2\theta$. The region is a ball of radius R centered on the origin.
5.174 $f(r, \theta, \phi) = e^{r/R} \cos\phi$. The region is the x ≥ 0 half of a ball of radius R centered on the origin.
5.175 $f(r, \theta, \phi) = r^2$. The region is a right circular cone of height H and radius R pointing upward with its vertex at the origin.
In Problems 5.176–5.179 you will redo problems from Section 5.5. You did them the first time in
coordinates than in Cartesian or spherical, Cartesian coordinates; now you will use cylindrical or
having nothing to do with what function spherical. You should be able to set up and
evaluate all necessary integrals without a computer.

5.176 Problem 5.107.
5.177 Problem 5.109.
5.178 Problem 5.116.
5.179 Problem 5.118.

5.180 In Cartesian coordinates all three coordinates must go from −∞ to ∞ in order to cover all space. In cylindrical coordinates, you can cover all possible points with 0 ≤ 𝜌 < ∞, 0 ≤ 𝜙 < 2𝜋, and −∞ < z < ∞. In spherical coordinates, we traditionally cover all possible points with r ≥ 0, 0 ≤ 𝜃 ≤ 𝜋, and 0 ≤ 𝜙 < 2𝜋, but there are other possibilities.
(a) Describe the region covered with the more restrictive 0 ≤ 𝜙 ≤ 𝜋 (using the same range of values given above for the other coordinates).
(b) Describe the region covered in spherical coordinates if −∞ < r < ∞, 0 ≤ 𝜃 ≤ 𝜋, and 0 ≤ 𝜙 ≤ 𝜋.
(c) Describe the region covered in spherical coordinates if 0 ≤ r < ∞, 0 ≤ 𝜃 ≤ 2𝜋, and 0 ≤ 𝜙 ≤ 𝜋.

5.181 Figure 5.11 shows a differential box in spherical coordinates. We assume that in the limit where the box is small enough we can treat this box as a rectangular solid, so finding its volume is reduced to finding the lengths of its three sides. We’ll even give you one for free; the one labeled dr in the picture has length dr.
(a) One of the sides of the box is an arc (a segment of a circle) that subtends an angle d𝜃. What is the radius of the circle that this arc is part of? Give your answer in terms of r, 𝜃, and/or 𝜙.
(b) What is the length of the arc in Part (a)?
(c) Repeat Parts (a) and (b) for the side that subtends an angle d𝜙. Give the radius of the circle it is part of (this is not entirely obvious), and then the length of the arc.
(d) Multiply the lengths of the three sides to find the volume of the differential box.

5.182 In the Explanation (Section 5.7.2) we took an integral of a cylinder of radius R and height H in cylindrical coordinates. After integrating with respect to 𝜙 we claimed that we had reduced the problem to a 2D rectangle of height H and width R. Why did that rectangle only have width R even though the cylinder has a width 2R?

5.183 A right circular cone of height H and upper radius R with its vertex at the origin has density k𝜌 where 𝜌 is distance from the z-axis. Find its total mass.

5.184 An Hourglass Figure. The figure below shows an object shaped like an hourglass, with the z-axis going through its center. The top and bottom are at z = ±H and the sides are described by the function $x^2 + y^2 = z^2 + a^2$. The density of the object is $c\rho^2$, where 𝜌 is distance from the z-axis.
(a) Find the object’s mass.
(b) Write, but do not evaluate, a triple integral that represents the moment of inertia of this shape about the z-axis.
(c) Write, but do not evaluate, a triple integral that represents the moment of inertia of this shape about the x-axis.

5.185 You’re making a giant sculpture of an ice cream cone. The side of the sculpture is the right circular cone $z^2 = 2(x^2 + y^2)$ from z = 0 to z = H. The top of the cone is a portion of the sphere $x^2 + y^2 + z^2 = R^2$. The density of the sculpture is kr, where r is distance from the vertex of the cone. Find the total mass of the sculpture.
(a) The constants R (radius of the sphere) and H (height of the cone) are not independent, because the two shapes have to meet at the boundary. Find the relationship between these two constants.
(b) Find the total mass of the sculpture. Your answer should contain either H or R but not both.

5.186 Find the moment of inertia of a uniform sphere of radius R and mass M about an axis through its center.
(a) Write, but do not evaluate, the triple integral that represents this problem in Cartesian coordinates.
(b) Write, but do not evaluate, the triple integral that represents this problem in cylindrical coordinates.
(c) Write, but do not evaluate, the triple integral that represents this problem in spherical coordinates.
(d) Evaluate one of those triple integrals (whichever one looks easiest!) to find the answer.

5.187 A spherical shell of mass M has inner radius $R_1$ and outer radius $R_2$. (In other words it’s a sphere of radius $R_2$ with a spherical hole in the middle of radius $R_1$.)
(a) Find the moment of inertia of the shell about an axis through its center.
(b) Evaluate your answer for the case $R_1 = 0$. What does that answer represent?
(c) Evaluate your answer in the limit $R_1 \to R_2$. What does that answer represent?

5.188 In Section 5.1 we presented Isaac Newton’s problem: show that a planet can be mathematically treated as if it were a point particle at its center. We will approach this problem for gravitational force in Section 5.11 (see felderbooks.com), but here you will tackle it with gravitational potential. Every particle of mass m creates a potential V = −Gm∕r at a distance r away from itself. Find the total potential created by a uniform sphere of mass M and radius R at a point L away from the center, where L > R. Hint: At one point in taking the integral you may come across the expression $\sqrt{(a - b)^2}$. Figure out whether a or b in that expression is larger so that you can be sure to take the positive square root.

5.189 Find the moment of inertia of a right circular cone of uniform density D, with height H and upper radius R, about its central axis.

5.190 A sphere of radius R has density proportional to distance from its center. Find the moment of inertia of the sphere about an axis through its center. (Your answer will contain a constant of proportionality. Make sure you evaluate the units of that constant and use them to check the units of your answer.)

5.191 A cylindrical container with radius R and height H is rotating about its central axis. The air inside the cylinder is effectively pushed to the outside, so the density grows exponentially with distance from the axis of rotation. Meanwhile the cylinder is tall enough that gravity causes the density to fall off exponentially with height. (In Problem 5.196 you will analyze a more realistic model of density in a spinning cylinder.)
(a) Write the density function. The function should depend on the coordinates in whatever coordinate system you are using and should also include some unknown constants.
(b) Identify the units of all of the constants you introduced.
(c) Set up a triple integral to find the moment of inertia of the cylinder about its central axis.
(d) Evaluate your integral.
(e) Check that your answer has correct units.

5.192 A cylinder of radius 1 m and height 2 m has a uniformly distributed charge $10^{-8}$ C. Find the electric potential produced by the cylinder at a point on the axis of the cylinder a distance 4 m from the center of the cylinder.

5.193 Imagine two right circular cylinders of radius 1, one centered on the x-axis and the other on the z-axis. A uniform object of mass 1 fills the space where the two cylinders intersect. You’re going to find the moment of inertia of that object about the z-axis.
(a) Explain why this problem is easiest to solve in cylindrical coordinates.
(b) Explain why it would be hard to set up the limits of integration for either the 𝜙 or 𝜌 integrals first. You will therefore start with the z integral.
(c) We won’t ask you to draw this shape with a box and a strip, but even without that you can see from the picture that a vertical strip will go from the bottom of the x-axis cylinder to the top. To
(f) A collision with a heavy freighter has slowed the rotation of Babylon 5 to a period of 47 s, so the engines must get it rotating at its usual speed again. Using the approximation that the moment of inertia remains at the value you found in Part (e), calculate how much torque the engines must exert to get Babylon 5 back to its proper rotation speed in one hour.
We begin with a fact from physics: if a force F⃗ acts upon an object that moves through a
linear displacement d s⃗, the work done by the force on the object is given by the dot product
W = F⃗ ⋅ d s⃗.
1. Calculate the work done by a force F⃗ = 2î + 3ĵ on an object as it moves along a line
from the point (1, 1) to (5, 25).
See Check Yourself #32 in Appendix L
Part 1 does not require an integral, because a constant force is acting along a straight line.
If the force or the path varies, you employ the usual strategy: break the journey into small
parts, compute each part with a simple dot product like Part 1, and use an integral to add
up all the contributions.
Consider, for instance, a bead moving along a wire in the shape of the curve $y = x^2$. A
force F⃗ = (x + y)î + (xy)ĵ acts on the bead as it moves from (1, 1) to (5, 25). Your job in this
exercise is to calculate the work done by the force on the bead.
We begin with a small part of the journey—small enough to treat the force as a constant
and the journey as a line.
2. The force is expressed as a function of x and y. Re-express the force—both its x- and
y-components—as a function of x only. Remember what curve we are moving along!
[Figure: a small displacement ds with horizontal component dx and vertical component dy]
3. The figure to the left represents a small displacement d s⃗ = dx î + dy ĵ at some point on the curve $y = x^2$. Express the relationship between dy and dx, based once again on the curve. (Hint: It’s not $dy = dx^2$.)
4. Calculate the work done along this small displacement. Your answer should be expressed as a function of x and dx only.
5. Integrate your answer as x goes from 1 to 5.
See Check Yourself #33 in Appendix L
For a constant vector field along a straight line, you don’t actually have to integrate to find
a line integral. So the examples above should help you understand what a line integral tells
you about a vector field and a curve, but they won’t help you evaluate line integrals in less
trivial situations.
Consider, therefore, a charged particle in a non-uniform electric field. The field exerts
a force F⃗ = kx î + (qy∕x)ĵ while the particle moves along the path $y = cx^2$ from (0, 0) to
$(x_f, c x_f^2)$. (The particle must be experiencing other forces to make it move along this curve,
but you don’t need to know them to solve the problem.) What work would that force do on
the particle?
To answer that question, we have to clearly define the differential displacement. In two
dimensions we can always write d s⃗ = dx î + dy ĵ. But dx and dy are not independent in that
expression; they are related by the equation for the curve, so in this case dy = 2cx dx. The
equation for the curve also allows us to replace every occurrence of y in the formula for F⃗
with $cx^2$, so the work done during one differential displacement can be written:
\[ dW = \vec{F} \cdot d\vec{s} = kx\,dx + \left(\frac{q\,cx^2}{x}\right)(2cx\,dx) = \left(kx + 2qc^2x^2\right)dx \]
The total work is thus
\[ W = \int_0^{x_f} \left(kx + 2qc^2x^2\right)dx = \frac{k}{2}x_f^2 + \frac{2qc^2}{3}x_f^3 \]
From the equations for F⃗ and the equation of the curve $y = cx^2$ we see that k has units
of force over distance, q has units of force, and c has units of one over distance, so both
terms have units of force times distance. Notice that the work came out finite even though
the force was divergent at the starting point of the path. (Think about the dot product at
that point.)
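As a rough numerical cross-check of the worked example above, the sketch below adds up F⃗ ⋅ d s⃗ over many small steps along y = cx² and compares the sum with the closed form (k/2)x_f² + (2qc²/3)x_f³. The function names and the values chosen for k, q, c, and x_f are illustrative assumptions, not part of the text.

```python
# Midpoint-rule check of the work done by F = k x i + (q y / x) j along y = c x^2.
# All names and numeric values here are illustrative assumptions.

def work_along_parabola(k, q, c, x_f, n=100_000):
    """Sum F . ds over n small steps from x = 0 to x = x_f."""
    dx = x_f / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx        # midpoint of the step, so x is never exactly 0
        y = c * x**2              # stay on the curve
        Fx = k * x
        Fy = q * y / x
        dy = 2 * c * x * dx       # dy and dx are tied together by the curve
        total += Fx * dx + Fy * dy
    return total

def work_closed_form(k, q, c, x_f):
    """The result derived in the text: (k/2) x_f^2 + (2 q c^2 / 3) x_f^3."""
    return k * x_f**2 / 2 + 2 * q * c**2 * x_f**3 / 3

k, q, c, x_f = 3.0, 2.0, 0.5, 4.0
numeric = work_along_parabola(k, q, c, x_f)
exact = work_closed_form(k, q, c, x_f)
print(numeric, exact)
```

The two printed numbers should agree to many digits. Sampling at the midpoint of each step sidesteps the point x = 0, where the y-component of the force is undefined even though its contribution to the work is finite.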
We can summarize what we’ve said so far in the following definition.
\[ \int_C \vec{V} \cdot d\vec{s} \]
In the example we worked above the curve was expressed as y(x), allowing us to rewrite
the line integral (both F⃗ and d s⃗) as functions of x. It is often more convenient to express a
curve parametrically. For example, the two functions x = 5 cos 𝜃, y = 5 sin 𝜃 for 0 ≤ 𝜃 ≤ 2𝜋
describe a circle, which cannot be written as a function y(x). Equivalently, those two functions
can be combined into a position vector r⃗ = (5 cos 𝜃)î + (5 sin 𝜃)ĵ. You evaluate line integrals
along such a curve by rewriting everything in terms of the parameter, as illustrated in the
example below.
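\[ d\vec{s} = -R\sin\phi\,d\phi\,\hat{i} + R\cos\phi\,d\phi\,\hat{j} \]
\[ \vec{V} = \frac{\cos\phi}{\sin\phi}\,\hat{i} + k\,\hat{j} \]
\[ \int_{C_1} \vec{V} \cdot d\vec{s} = \int_0^{\pi/2} \left(-\frac{\cos\phi}{\sin\phi}\,R\sin\phi + kR\cos\phi\right)d\phi = \int_0^{\pi/2} R(-1 + k)\cos\phi\,d\phi = R(-1 + k) \]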
The horizontal line C2 is defined by the equation y = R. Along such a line only the
x-component of the vector matters. (We hope you can see that fact visually, but for a
more formal approach, write d s⃗ = dx î and see what happens with the dot product.)
\[ \int_{C_2} \vec{V} \cdot d\vec{s} = \int_0^R (x/R)\,dx = R/2 \]
The final stretch C3 is defined by x = R. Here we only care about the y-component of
the vector, but be careful: if k is positive then the line integral along this path is
negative because we are going down. (Formally, d s⃗ here is −dy ĵ.) But no integral is
required to find $\int_{C_3} \vec{V} \cdot d\vec{s} = -kR$. So the total line integral along the path is:
\[ \int_C \vec{V} \cdot d\vec{s} = R(-1 + k) + R/2 - kR = -R/2 \]
Since V⃗ is unitless, V⃗ ⋅ d s⃗ should have units of distance, which our answer does. It’s
also not surprising to see k cancel out at the end; consider that the work done by the
force k ĵ in moving up from 0 to R perfectly cancels the negative work done in moving
back down again.
As in the vector case, the math plays out a bit differently depending on whether the curve is
specified as y(x) or parametrically. In Problems 5.232 and 5.233 you will derive the following
formulas.
The line integral $\int_C f(x, y, z)\,ds$ where C is given as x(t), y(t), z(t) is:
\[ \int_{t_1}^{t_2} f(x(t), y(t), z(t)) \sqrt{(x'(t))^2 + (y'(t))^2 + (z'(t))^2}\,dt \qquad (5.8.2) \]
Question: Find the line integral of the function f = xy along the curve $y = x^4$ from
x = 0 to x = 1.
Solution:
\[ \int f(x, y)\,ds = \int_0^1 x\left(x^4\right)\sqrt{1 + \left(4x^3\right)^2}\,dx = \int_0^1 x^5\sqrt{1 + 16x^6}\,dx \]
This can easily be evaluated to give $\left(17^{3/2} - 1\right)/144 \approx 0.48$.
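The scalar example above can be checked numerically in a few lines. The sketch below approximates ∫ f ds for a curve given as y(x) by summing f(x, y(x)) √(1 + (dy/dx)²) dx over small steps; the helper name `scalar_line_integral` and its arguments are assumptions made for illustration.

```python
import math

def scalar_line_integral(f, y, dydx, a, b, n=100_000):
    """Midpoint-rule estimate of the line integral of f along y(x), a <= x <= b."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        ds = math.sqrt(1 + dydx(x) ** 2) * dx   # arc length of one small step
        total += f(x, y(x)) * ds
    return total

# The example from the text: f = x y along y = x^4 from x = 0 to x = 1.
numeric = scalar_line_integral(
    f=lambda x, y: x * y,
    y=lambda x: x ** 4,
    dydx=lambda x: 4 * x ** 3,
    a=0.0,
    b=1.0,
)
exact = (17 ** 1.5 - 1) / 144
print(round(numeric, 4), round(exact, 4))
```

Both values round to 0.48, matching the answer quoted in the solution.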
Stepping Back
There are some topics in math (partial derivatives are a vivid example) where the “mechan-
ics” are easy to learn, but the concept takes a lot of hard thought. For other topics, the
difficulty may lie more with the process. But line integrals, of vector fields in particular, have
the unfortunate property that the concept and the mechanics both take hard work to master.
Worse still, they don’t seem to have much to do with each other.
For understanding the concept, begin with the question “How much does this vector field
point along this curve?” Imagine sailing a boat and asking, not “how hard is the wind blow-
ing?”, but “how much is the wind helping me go?” That question doesn’t seem to have a
multiplication in it, but it does. (If you double the strength of the wind, or if you double the
length of the journey, the answer will generally double.) So for each differential d s⃗ along
the curve, you are multiplying the length of d s⃗—not by the magnitude of the vector, but
by the component of the vector in the direction of d s⃗.
That should make you think of a dot product, which brings us to the mechanics.
When you first meet a line integral, it is in the intimidating-looking form ∫C v⃗ ⋅ d s⃗ with
two or three independent variables. But those variables are related by the fact that they are
confined to lie on the curve C. So the curve—whether it is expressed as y(x) or as x(t), y(t),
z(t)—acts as a conversion formula, allowing you to rewrite both v⃗ and d s⃗ in terms of one
independent variable. At that point, you have a single integral ∫ f (t) dt to evaluate. A line
integral is always a single integral in the end, and you know how to handle that.
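The "conversion formula" idea in this summary can be sketched in code: once the curve is written as x(t), y(t), the line integral becomes a single sum over t. The sketch below is a minimal illustration under stated assumptions. The function `line_integral` and the two test fields are hypothetical, and the curve is the circle x = 5 cos 𝜃, y = 5 sin 𝜃 used earlier in this section.

```python
import math

def line_integral(V, r, t0, t1, n=100_000):
    """Approximate the integral of V . ds along the 2D curve r(t), t0 <= t <= t1.

    V(x, y) returns the pair (Vx, Vy); r(t) returns the point (x, y).
    """
    dt = (t1 - t0) / n
    total = 0.0
    x_prev, y_prev = r(t0)
    for i in range(1, n + 1):
        x_next, y_next = r(t0 + i * dt)
        # Evaluate the field halfway along the small step ds = (dx, dy).
        Vx, Vy = V((x_prev + x_next) / 2, (y_prev + y_next) / 2)
        total += Vx * (x_next - x_prev) + Vy * (y_next - y_prev)
        x_prev, y_prev = x_next, y_next
    return total

circle = lambda theta: (5 * math.cos(theta), 5 * math.sin(theta))
swirl = lambda x, y: (-y, x)        # hypothetical field that points along the circle
uniform = lambda x, y: (2.0, 3.0)   # hypothetical constant field

around_swirl = line_integral(swirl, circle, 0.0, 2 * math.pi)
around_uniform = line_integral(uniform, circle, 0.0, 2 * math.pi)
print(around_swirl, 50 * math.pi)
print(around_uniform)
```

The swirling field points along the circle everywhere, so its integral is the field magnitude (5) times the circumference (10𝜋), or 50𝜋. The constant field helps on one side of the loop exactly as much as it hurts on the other, so its integral around the closed circle comes out essentially zero.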
5.216 A particle moves from the origin to the point (1, 1) while being acted on by the force F⃗ = x î + (x + y)ĵ.
(a) Find the work done by this force if the particle moves along the x-axis from the origin to (1, 0) and then in the y-direction to (1, 1).
(b) Find the work if instead the particle moves in a straight line from the origin to (1, 1).
(c) Find the work if the particle moves along the curve $y = x^2$ from the origin to (1, 1).

5.217 A force F⃗ = 5î − 12ĵ acts upon an object that moves in a straight line from the point (3, 9) to (6, 13).
(a) To calculate the work done by this force on this object, W = ∫C F⃗ ⋅ d s⃗, it is not necessary to perform an integral. Why not?
(b) Calculate the work done by this force on this object.
(c) Of all the constant forces with magnitude 13, which one would have the greatest possible dot product along this line? (Hint: this is not a calculus-based optimization problem. Just think about how line integrals work.) Calculate the work done in this case.
(d) Of all the constant forces with magnitude 13, which one would have the lowest possible dot product along this line? Calculate the work done in this case.
(e) Find a constant force with magnitude 13 that would do no work at all (W = 0) along this line.

5.218 A charged particle experiences an electric force F⃗ = 2x î − ĵ as it moves along the path $y = x^2$ from x = 0 to x = 1. Find the work done by the electric force on the particle.

5.219 An asteroid falls in a straight line from a distance $R_1$ to a distance $R_2$. (Begin by defining your axes.)
5.220 A satellite goes a fourth of the way around Earth in a circle of radius R.
5.221 A particle moves along the path x = R cos(𝜔t), y = R sin(𝜔t), z = H − pt from t = 0 to t = 4𝜋∕𝜔.
5.222 A rocket goes a fourth of the way around Earth in an elliptical path $x^2/a^2 + z^2/b^2 = 1$, y = 0, starting at the point (0, 0, b). (This is an ellipse centered on the Earth, which is not what an orbiting object would do, but a rocket could be made to for reasons that we can’t begin to imagine.) Hint: If you end up with an integral that looks impossible you should be able to evaluate it with some simple trig identities and a u-substitution.
5.223 A rocket goes all the way around Earth in the same ellipse as Problem 5.222.
5.224 A rocket can go wherever its engines take it, but for an orbiting satellite the center of the Earth is not at the center of a satellite’s orbit, but at one focus. Do Problem 5.222 with the x coordinate of the orbit shifted by $\sqrt{a^2 - b^2}$ (a > b). For this problem you just need to set up the integral, not evaluate it.

A wire carrying a current I causes a magnetic field B⃗ to point along a circle around the wire. Problems 5.225–5.230 are about “Ampère’s Law,” which states that around any closed curve C, $\oint_C \vec{B} \cdot d\vec{s} = \mu_0 I_{\text{enclosed}}$ where $\mu_0$ is a positive constant.
(b) Use your answer and Ampère’s Law to find $|\vec{B}|$ as a function of I.

5.226 [This problem depends on Problem 5.225.] In Problem 5.225 you found the field from a single wire carrying current I.
(a) Express your answer as a vector B⃗ in unit vector notation.
(b) Consider a circle of radius 𝜌 surrounding the wire. Write functions x(𝜙) and y(𝜙) that parametrize that circle in terms of an angle and then use those functions to evaluate ∮C B⃗ ⋅ d s⃗ around the circle. Your answer should confirm Ampère’s law.
(c) Still using the magnetic field around a single wire, set up the integrals you need to evaluate ∮C B⃗ ⋅ d s⃗ around a rectangle that surrounds the wire. You’ll have to choose where to put your axes and choose a specific rectangle to integrate around.
(d) Evaluate the integral you found in Part (c) and show that it obeys Ampère’s law.

5.227 [This problem depends on Problem 5.225.] A long, thin sheet of metal covers the region −L ≤ x ≤ L, −∞ < y < ∞. It is carrying a total current I in the positive y-direction, with the current uniformly spread out across the width of the sheet.
(a) Using the result you found in Problem 5.225 for the field from a single wire, set up and evaluate a single integral (not a line integral) for the magnetic field produced at a point (x, y, z) by this sheet of metal.
(b) Take the line integral of the magnetic field you found in Part (a) around the rectangle with corners at (−2L, 0, −L) and (2L, 0, L) and verify that it obeys Ampère’s law.
(c) Take the line integral of the magnetic field you found in Part (a) around a circle of radius 2L surrounding the sheet and verify that it also obeys Ampère’s law.

5.228 A thick wire of radius R carries a current I uniformly spread out throughout the cross-sectional area of the wire. The magnetic field at any point (x, y) inside the wire is $\vec{B} = \left(\mu_0 I / \left(2\pi R^2\right)\right)(-y\,\hat{i} + x\,\hat{j})$.
(a) How much current is enclosed by a circle of radius 𝜌 centered on the central axis of the wire (with 𝜌 < R)?
(b) Evaluate ∮C B⃗ ⋅ d s⃗ around the circle from Part (a) and verify that it obeys Ampère’s law.
(c) Repeat Parts (a)–(b) for a square of side length R∕2 centered on the central axis of the wire.

5.229 The xy-plane is covered by an infinite sheet of metal that carries a current in the y-direction with current per unit length in the x-direction equal to J. The magnetic field is in the positive x-direction for points above the plane and the negative x-direction for points below it. You are going to calculate the magnetic field from the sheet of metal by considering a rectangle in the xz-plane that extends through the sheet.
(a) By symmetry the field must be independent of x and y. By considering more than one such rectangle, use Ampère’s law to explain how you can know that the magnitude of the field is independent of z (aside from the direction of the field being different for z > 0 and z < 0).
(b) Evaluate ∮C B⃗ ⋅ d s⃗ around one of these rectangles. Your answer should include the as-yet-unknown magnitude $|\vec{B}|$.
(c) Use your answer and Ampère’s law to find $|\vec{B}|$.

5.230 A thick wire is carrying a current that is not uniformly distributed throughout the wire. Instead the current density is a function of 𝜌, distance from the central axis of the wire. You measure the magnetic field at various points throughout the wire and find it to be $\vec{B} = k\left(x^2 + y^2\right)(-y\,\hat{i} + x\,\hat{j})$. Evaluate ∫C B⃗ ⋅ d s⃗ around a circle of radius 𝜌 around the central axis of the wire and use your result and Ampère’s law to find J(𝜌), the current per unit area as a function of radius.

5.231 A wire bent into the shape $z = 2x = y^2$ from (2, 2, 4) to (8, 4, 16) has linear density 𝜆 = y∕z. Find the total mass of the wire.

5.232 In this problem you will derive Equation 5.8.2, the line integral of a scalar function along a parametrically expressed curve. Our curve is expressed as two functions, x = x(t) and y = y(t). (We will work in two dimensions for the sake of the drawing, but the idea doesn’t change in three.) We begin as always with a differential piece of the
Once you think about parametrically expressed curves in this more abstract “mathe-
matical” way, you can step up to parametrically expressed surfaces. You express x, y, and z
as functions of two independent variables u and v. This maps a region on the uv-plane to a
surface in three dimensions.
[Figure: a region of the u–v-plane creates a surface in x–y–z space.]
Expressing surfaces in this way takes some getting used to, but it’s important. The simplest
alternative, z(x, y), is constrained to obey a vertical line test—any given (x, y) pair can lead to
only one z-value—so even simple shapes such as a sphere cannot be expressed that way.
The three functions x(u, v), y(u, v), and z(u, v) can be combined together into
the position vector r⃗ = x î + y ĵ + z k̂. For instance, the example above could be written
$\vec{r}(u, v) = v\cos u\,\hat{i} + v\sin u\,\hat{j} + v^2\,\hat{k}$. This is of course the same thing written in a slightly different
way. We will see in Section 5.10 that the r⃗ form is sometimes more useful for plugging into
formulas.
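To make the mapping from the uv-plane to a surface concrete, the short sketch below evaluates r⃗(u, v) = v cos u î + v sin u ĵ + v² k̂ on a small grid and checks that every point satisfies z = x² + y², which follows from x² + y² = v². The grid ranges for u and v are assumptions chosen for illustration.

```python
import math

def r(u, v):
    """Position vector of the example surface, returned as (x, y, z)."""
    return (v * math.cos(u), v * math.sin(u), v ** 2)

# Assumed sample region of the uv-plane: 0 <= u <= 2*pi, 0 <= v <= 2.5.
points = []
for i in range(13):
    u = 2 * math.pi * i / 12
    for j in range(6):
        v = 0.5 * j
        points.append(r(u, v))

# Every sampled point should lie on the paraboloid z = x^2 + y^2.
max_gap = max(abs(z - (x ** 2 + y ** 2)) for x, y, z in points)
print(len(points), max_gap)
```

Each (u, v) pair produces one point in space, so a grid on the uv-plane becomes a net of points on the surface. Holding v fixed and varying u traces a horizontal circle, while holding u fixed and varying v traces a parabola rising from the origin.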
(d) Write the parametric equations x(x, 𝜃), y(x, 𝜃) and z(x, 𝜃) for this surface.

5.251 [This problem depends on Problem 5.250.]
(a) Write parametric equations for the surface that results when the curve z = sin x on the xz-plane is rotated about the x-axis.
(b) Use a computer to graph your equations from Part (a) and confirm that the resulting surface looks right.

5.252 [This problem depends on Problem 5.250.] Write parametric equations for the surface that results when the function y = f(x) on the xy-plane is rotated about the x-axis.

5.253 If x(u, v), y(u, v), and z(u, v) are all linear functions, the resulting surface is generally a plane.
(a) To demonstrate this result, let x = au + bv + c and similarly for y and z (but with different constants). Solve the first two equations for u and v and plug them in to find an equation z(x, y), which should be a plane.
(b) Give an example of three linear parametric functions that do not describe a plane. What do they describe?

5.254 A parametric surface can be expressed as the vector-valued function r⃗(u, v). So the partial derivative 𝜕r⃗∕𝜕u is a vector that points in a certain direction for any given values of u and v.
(a) In general, which way would you expect this vector to point? (Think about what partial derivatives mean!)
i. Along a gridline of constant u.
ii. Along a gridline of constant v.
iii. Neither of the above.
Briefly explain your answer.
Now you’re going to test it on the surface we examined in the Explanation (Section 5.9.2): $\vec{r}(u, v) = v\cos u\,\hat{i} + v\sin u\,\hat{j} + v^2\,\hat{k}$.
(b) Draw the surface. Find the point represented by (u, v) = (0, 1) and mark it on your drawing.
(c) From that point, which way do the gridlines of constant u and v go out? Explain your answer visually, in terms of the way this surface draws itself.
(d) Calculate 𝜕r⃗∕𝜕u at that point. Does it point along the gridline of constant u, of constant v, or neither?
(e) Calculate 𝜕r⃗∕𝜕v at that point. Does it point along the gridline of constant u, of constant v, or neither?
Make sure all your answers agree. By the end you should have a simple explanation of which way 𝜕r⃗∕𝜕u points (not just for this surface, but in general) and why. This result will be important to our work in Section 5.10.
Rainfall is often measured in inches per hour. If that seems strange, imagine placing these
three buckets side by side in a rainstorm. The rising waterline in each bucket, measured in
inches per hour, will be exactly the same.
[Figure: three buckets, labeled 1 in.², 2 in.², and 3 in.²]
1. Although (as stated above) the three buckets will see the same rising waterline, their
total accumulation of water (measured in gallons/hour for instance, or in³/h) will
be very different. Which of the following is primarily responsible for this difference?
(Assume none of the buckets fills up.)
(a) The tops, through which the rain enters, have different surface areas.
(b) The bottoms, on which the rain lands, have different surface areas.
(c) The buckets are different heights.
(d) The buckets have different volumes.
In the three buckets above, the accumulation will again be different. We are interested in
measuring the total accumulation of water in in³/h.
2. A rainfall of 0.2 inches per hour is considered “moderate.” After one hour of such a
rain, the middle bucket shown above will have a waterline at 0.2 inches. How much
water (measured in in³) will it have accumulated?
3. After 1 h of the same rain, how much water (measured in in³) will each of the other
two buckets have accumulated? Hint: “accumulated water” is the same as “total amount of
water that has passed through the top.”
4. A rainfall of 0.4 in/h is considered “heavy.” After 1 h of such a rain, how much water
has accumulated in each bucket?
5. Write a formula for the total amount of water that accumulates in a bucket after 1 h.
Your formula should have two independent variables.
See Check Yourself #37 in Appendix L
Now we’re going to change the rate of accumulation without changing either the rate of
the rainfall, or the area of the top of the bucket. How? Tilt the bucket!
[Figure: a tilted bucket, labeled 2 in.²]
6. How fast does the water accumulate if we tilt the bucket 90° from its starting position?
7. Let r equal the rate of rainfall, A the area of the top of the bucket, and 𝜃 the angle of tilt, going from 0 in the original vertical position to 𝜋∕2 in a completely horizontal position. Which of the following is a reasonable formula for the rate of water accumulation W?
(a) W = rA
(b) W = rA𝜃
(c) W = rA cos 𝜃
(d) W = rA sin 𝜃
(e) W = rA tan 𝜃
(f) none of the above is reasonable
If you understood Section 5.8 on line integrals you have a head start on this section because
the concepts are analogous.
A line integral says “Add up this function all along this curve.”
A surface integral says “Add up this function all along this surface.”

The line integral of a vector function is a scalar that measures how much this function points along the curve.
The surface integral of a vector function is a scalar that measures how much this function points through the surface.

You can often compute a line integral without actually integrating anything; you find the right component (dot product) and multiply it by the length of the curve.
You can often compute a surface integral without actually integrating anything; you find the right component (dot product) and multiply it by the area of the surface.

If you do have to integrate, use the formula for the curve to translate the entire problem to a single integral.
If you do have to integrate, use the formula for the surface to translate the entire problem to a double integral.

You can also take the line integral of a scalar function, for instance to add up density along a curve.
You can also take the surface integral of a scalar function, for instance to add up density along a surface.
The Discovery Exercise (Section 5.10.1) is entirely about a surface integral, although it
never uses the word. The vector field is the rainfall, presumed in this case to point straight
down, with a magnitude measured in in/h. The surface is the top of the bucket. The goal is
to measure the total extent to which the rain vector flows through the surface.
And what did we find? The accumulation is proportional to both the magnitude of the
rainfall and the area of the surface, but it also depends on the angle between them. The
final answer to the Discovery Exercise is W = rA cos 𝜃. This can be expressed more concisely
by using a convention, common in such cases, of expressing area as a vector: the magnitude
of this vector is the area itself, and its direction is perpendicular to the surface. If we let A⃗
represent the top of the bucket in this way, we can write W = r⃗ ⋅ A⃗. As with any dot product, it
depends on the angle 𝜃 between the rain vector and the area vector: an angle of 0 maximizes
the result, and an angle of 𝜋∕2 makes the result zero.
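To make the dot product concrete, here is a minimal Python sketch of W = r⃗ ⋅ A⃗ for a flat bucket opening. The function names, the rainfall rate of 0.5 in/h, and the opening area of 2 in² are assumptions chosen for illustration. The area vector is taken to point down into the bucket, so that rain falling straight down gives a positive accumulation.

```python
import math

def flux_through_flat_surface(field, area, normal):
    """Flux of a constant field through a flat surface: field . (area * unit normal)."""
    length = math.sqrt(sum(c * c for c in normal))
    unit_normal = [c / length for c in normal]
    return area * sum(f * n for f, n in zip(field, unit_normal))

def bucket_accumulation(rate, area, theta):
    """Rain falls straight down; the bucket is tilted by theta from upright."""
    rain = (0.0, 0.0, -rate)                            # rain vector points straight down
    normal = (math.sin(theta), 0.0, -math.cos(theta))   # area vector points into the bucket
    return flux_through_flat_surface(rain, area, normal)

# Assumed illustrative values: r = 0.5 in/h and A = 2 in^2.
for theta in (0.0, math.pi / 3, math.pi / 2):
    print(theta, bucket_accumulation(0.5, 2.0, theta))
```

The result falls from rA at 𝜃 = 0 to zero at 𝜃 = 𝜋∕2, matching W = rA cos 𝜃.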
By this point in the chapter we hope you can recite the next paragraph in your sleep.
If the vector field (rainfall) varies, or if the surface twists, we write dW = r⃗ ⋅ d A⃗ to find the
rainfall through a small area over which we consider everything constant. Then we use an
integral—a double integral in this case—to add up the rainfall passing through the entire
surface.
[Figure callouts: The rainfall through this slice is |dA⃗| times the component of this vector field in the direction of dA⃗. To find the total rainfall, integrate the result across the entire surface.]

\[ \iint_S \vec{V} \cdot d\vec{A} \qquad (5.10.1) \]
The vector dA⃗ is a differential area vector. Its magnitude is the differential area dA and its direction
is normal to the surface at this point. The integrand can therefore be written in the following forms.

\[ \vec{V} \cdot d\vec{A} = \vec{V} \cdot (dA\,\hat{n}) \qquad \text{or} \qquad \vec{V} \cdot d\vec{A} = |\vec{V}|\,dA\,\cos\theta \]

Here n̂ represents a unit vector normal to the surface at a given point, and 𝜃 represents the angle
between V⃗ and n̂.
In the special case where S is a closed surface (a surface that completely surrounds a 3D region)
we write ∯S V⃗ ⋅ dA⃗.
As we did with line integrals, we will begin with a few examples where you do not have to
integrate before diving into the algebra. In each case think mathematically about ∬S f⃗ ⋅ dA⃗,
but also think about the physical scenario: the vector field represents the momentum density
of a flowing fluid, and our goal is to measure the rate at which this fluid passes through the
surface.
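For a surface expressed as z(x, y) the surface integral is evaluated as a double integral over x and y.

\[ d\vec{A} = \left[ -\left(\frac{\partial z}{\partial x}\right)\hat{i} - \left(\frac{\partial z}{\partial y}\right)\hat{j} + \hat{k} \right] dx\,dy \qquad (5.10.3) \]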
These formulas aren’t obvious. We’ll work an example to show you how to use the parametric
one, and then outline an argument as to why it works. You will derive the non-parametric
formula in Problem 5.268.
Question: Section 5.9 discussed the surface x(u, v) = v cos u, y(u, v) = v sin u,
z(u, v) = v² on 0 ≤ u ≤ 2𝜋, 0 ≤ v ≤ 3. Find the integral of f⃗ = (z + 3)k̂ through this
surface.
[Figure: the bowl-shaped surface plotted on x, y, and z axes.]
You can easily evaluate that integral to find the answer −135𝜋∕2.
Why did it come out negative? If you look at our d A⃗ you can see that it pointed
out-and-down through the bowl. Since f⃗ points upward, the surface integral came out
negative. This does not tell us anything about the problem; remember that direction
is arbitrary, and d A⃗ could just as easily have pointed in-and-up if we had parametrized
the surface differently.
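If you would like to check the value −135𝜋∕2 without doing the algebra, the following sketch approximates the surface integral numerically. It uses dA⃗ = (𝜕r⃗∕𝜕u × 𝜕r⃗∕𝜕v) du dv with the partial derivatives worked out by hand, and a simple midpoint sum. The function names and the grid size are our own choices.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def field(x, y, z):
    return (0.0, 0.0, z + 3.0)          # f = (z + 3) k-hat

def bowl_flux(n=200):
    """Midpoint-rule estimate of the integral of f . dA over 0<=u<=2pi, 0<=v<=3."""
    du = 2.0 * math.pi / n
    dv = 3.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = (j + 0.5) * dv
            position = (v * math.cos(u), v * math.sin(u), v * v)
            r_u = (-v * math.sin(u), v * math.cos(u), 0.0)   # partial r / partial u
            r_v = (math.cos(u), math.sin(u), 2.0 * v)        # partial r / partial v
            total += dot(field(*position), cross(r_u, r_v)) * du * dv
    return total

print(bowl_flux(), -135.0 * math.pi / 2.0)
```

Swapping the order of the cross product flips the sign of every dA⃗, and therefore the sign of the result, which is the arbitrariness of direction described above.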
(That formula should make sense if you followed our explanation of the vector formula
above.) If the surface is expressed as z(x, y), then the surface integral of the scalar function
f (x, y, z) is:
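\[ \iint_S f(x, y, z)\,dA = \iint f\big(x, y, z(x, y)\big) \sqrt{1 + \left(\frac{\partial z}{\partial x}\right)^2 + \left(\frac{\partial z}{\partial y}\right)^2}\;dx\,dy \qquad (5.10.5) \]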
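As a short illustration of Equation 5.10.5, take the scalar function f = 1, which turns the integral into a surface area. The sketch below applies it to the bowl z = x² + y² over the disk x² + y² ≤ 9; the switch to polar coordinates 𝜌, 𝜙 is our own choice for convenience.

```latex
\begin{align*}
\iint_S dA
  &= \iint \sqrt{1 + (2x)^2 + (2y)^2}\;dx\,dy
   = \int_0^{2\pi}\!\!\int_0^3 \sqrt{1 + 4\rho^2}\;\rho\,d\rho\,d\phi \\
  &= 2\pi \left[ \tfrac{1}{12}\left(1 + 4\rho^2\right)^{3/2} \right]_0^3
   = \frac{\pi}{6}\left(37^{3/2} - 1\right) \approx 117.3
\end{align*}
```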
5.261 The equations x = sinh u cos v, y = sinh u sin v, z = cosh u where 0 ≤ u ≤ 2 and 𝜋∕6 ≤ v ≤ 𝜋∕2 describe part of a hyperboloid. The vector field is f⃗ = (1∕y)î + z k̂.

5.262 Rain is falling straight down. Will you get wetter (measured in number of raindrops per minute that hit your body) if you are standing still, or zooming forward on a bicycle? Why?

5.263 In this problem, S represents a closed, permeable surface, and V⃗ represents the momentum density of air. At time t = 0 the surface surrounds 1 kg of air. An hour later, you come back and measure the amount of air surrounded by the surface. What might you expect to find if…
(a) ∯S V⃗ ⋅ dA⃗ = 0?
(b) ∯S V⃗ ⋅ dA⃗ > 0?
(c) ∯S V⃗ ⋅ dA⃗ < 0?

5.264 The equation z = √(1 − x² − y²) represents the top half of a sphere. Your job is to find the integral of yî + x ĵ + z k̂ upward through this surface.
(a) Find the answer by parameterizing the hemisphere and using Equation 5.10.2. Hint: Think about spherical coordinates.
(b) Find the answer directly by using Equation 5.10.3.

5.265 The equation z = √(1 − x² − y²) represents the top half of a sphere. The density of this hemisphere is given by 𝜎 = (x² + y²) kg/m². Your job is to find the total mass of this hemisphere.
(a) Find the answer by parameterizing the hemisphere and using Equation 5.10.4.
(b) Find the answer directly by using Equation 5.10.5.

5.266 In the example “Surface Integral that Does Require Integration” on Page 258 we calculated the surface integral of a vector field through the bowl-shaped surface given by r⃗ = v cos u î + v sin u ĵ + v² k̂ on 0 ≤ u ≤ 2𝜋, 0 ≤ v ≤ 3. In this problem you’re going to consider the same surface with a simpler vector field, the constant V⃗ = 3k̂.
(a) Calculate the integral of V⃗ through the surface S.
(b) Write a parametric expression for the disk that would cover the top of that bowl.
(c) Calculate the integral of V⃗ through the top of the bowl. Use normal vectors pointing upward from the disk. This part should not require any calculus.
(d) Using your answers to Parts (a) and (c), find the integral through the closed surface consisting of the bowl and disk together. Once again, this should not require actually integrating anything.
(e) In Parts (a), (c), and (d) you calculated the integral of V⃗ through three different surfaces. If you interpret V⃗ as the momentum density of a fluid, what does each of those three integrals tell you about the flow of the fluid? Explain why your answer to Part (d) makes sense physically.

5.267 In the example “Surface Integral that Does Require Integration” on Page 258 we calculated the integral of a vector field f⃗ = (z + 3)k̂ through the bowl-shaped surface given by r⃗ = v cos u î + v sin u ĵ + v² k̂ on 0 ≤ u ≤ 2𝜋, 0 ≤ v ≤ 3.
(a) Calculate the integral of f⃗ through a disk of radius 3 in the xy-plane, centered on the z-axis.
(b) Calculate the integral of f⃗ through a disk of radius 3 in the z = 9 plane, centered on the z-axis.
(c) Rank the three surface integrals by magnitude: through the bowl, through the lower disk, and through the upper disk. Looking at the field f⃗ and the shape and location of the surfaces, explain why it makes sense which is largest and which is smallest.

5.268 How do we find the formulas for dA⃗ or dA if the surface is expressed as z(x, y)? We could derive them from scratch, essentially replicating the process we used for a parametric representation, but it’s easier to use the formulas we already derived. Any z(x, y) function is a special case of a parametric representation, which we can write as r⃗ = x î + yĵ + z(x, y)k̂.
(a) Use this form and Equation 5.10.2 to derive Equation 5.10.3.
(b) Use this form and Equation 5.10.4 to derive Equation 5.10.5.

5.269 In Equation 5.10.2 we take the cross product of two partial derivatives. Putting 𝜕r⃗∕𝜕u first is arbitrary, and we could just as easily have put 𝜕r⃗∕𝜕v first. But the cross product is not commutative; we would get a different answer. How do the two answers differ, and why do
we get two different answers for the same question?

5.270 On page 250 we considered the surface r⃗ = v cos u î + v sin u ĵ + v² k̂ with 0 ≤ u ≤ 2𝜋, 0 ≤ v ≤ 3.
(a) 𝜕r⃗∕𝜕u points along the gridline of constant v. Which way does it point? (i.e. which way do you move along the surface as u increases?)
(b) 𝜕r⃗∕𝜕v points along the gridline of constant u. Which way does it point?
(c) If we use ((𝜕r⃗∕𝜕u) × (𝜕r⃗∕𝜕v)) du dv to define dA⃗, which way does dA⃗ point? (Hint: don’t do any math. Use your previous answers and the right-hand rule.)
(d) Based on your previous answer, and again without doing any math, if we integrate the function k̂ through this surface, will the answer come out positive or negative?
(e) How do your answers change if we instead define dA⃗ as ((𝜕r⃗∕𝜕v) × (𝜕r⃗∕𝜕u)) du dv?

5.271 The picture shows the parallelogram defined by vectors a⃗ and b⃗ with angle 𝜃 between them. We want the area of this parallelogram.
[Figure: a parallelogram with sides labeled a and b, height h, and angle 𝜃.]
(a) If the triangle on the left were cut out and put on the right, the resulting figure would be a rectangle. What are the width and height of this rectangle, in terms of letters given in the drawing?
(b) If your answer included the letter h, rewrite h so that you can express the area of the rectangle, and therefore the parallelogram, in terms of only three quantities: |a⃗|, |b⃗|, and 𝜃.
(c) Recalling that the magnitude of the cross product of two vectors is |a⃗||b⃗| sin 𝜃, you should have shown that the area of the parallelogram is |a⃗ × b⃗|. When we used that fact in this section, what were a⃗ and b⃗ and what did the parallelogram represent?

5.272 A positively charged particle produces an electric field E⃗ around itself. The magnitude of the electric field is kq∕r² where k is a constant, q is the particle’s charge (also constant), and r is distance from the particle. The direction of the field is directly away from the particle. (This can all be expressed by writing E⃗ = kq r⃗∕|r⃗|³ where r⃗ = x î + yĵ + z k̂.) For this problem we will consider a particle of charge q at the origin.
(a) Write parametric equations for a disk of radius w centered on the z-axis at height H. For the rest of this problem we’ll call this disk D.
(b) Find the electric flux: that is, the surface integral of E⃗ through D.
(c) The surface x = R sin 𝜃 cos 𝜙, y = R sin 𝜃 sin 𝜙, z = R cos 𝜃, 0 ≤ 𝜙 ≤ 2𝜋, 0 ≤ 𝜃 ≤ 𝜃max describes a portion of a sphere. We will call this surface P. Find R and 𝜃max so that the boundary of P is the same circle that forms the boundary of D.
(d) Take a scalar surface integral to find the area of P. Express your answer in terms of w and H.
(e) Using your answer to Part (d) find the flux of E⃗ through P. This part should not require an integral.
(f) What relationship did you find between the flux through D and the flux through P? If you pretend for a moment that E⃗ is the mass flow rate of a fluid, what would this result tell you about that fluid?

In electrostatics, “Gauss’s Law” promises that if you draw a closed surface around any collection of charges, the electric flux (that is, the integral of the electric field through that surface) is proportional to the charge enclosed.

\[ \oiint_S \vec{E} \cdot d\vec{A} = \frac{Q_{\text{enclosed}}}{\varepsilon_0} \qquad (\varepsilon_0 \text{ is a constant}) \]

In Problems 5.273–5.277 you will use Gauss’s Law to find the magnitude of the electric field in various charge configurations. You should be able to do these problems without integrating anything.

5.273 A sphere of radius R has a uniformly distributed charge Q.
(a) Draw a Gaussian sphere at a radius r > R around this sphere. Based on Gauss’s Law, what is the total flux through this sphere?
(b) We know from symmetry that the electric field through your sphere always points directly outward, perpendicular to the sphere, and that its
magnitude |E⃗| is the same everywhere on the sphere. Based on those facts, find the flux of E⃗ through the sphere as a function of |E⃗|. (In this part you will ignore Part (a).)
(c) Putting your two answers together, calculate |E⃗| outside the sphere of charge as a function of r.

5.274 [This problem depends on Problem 5.273.] Repeat Problem 5.273 for a Gaussian sphere of radius r < R to find the electric field inside the sphere of charge. Hint: the difference comes in the very first step, when you have to calculate Q enclosed.

5.275 [This problem depends on Problem 5.273.] The novel “Flatland” by Edwin Abbott describes a 2 dimensional world inhabited by creatures shaped as squares, triangles, and other polygons for whom “up” and “down” are synonymous with “North” and “South.” Suppose a Flatland physicist figured out that the flux of the electric field through any closed loop in Flatland was proportional to the charge enclosed. He writes flux as ∮ E⃗ ⋅ d𝜆⃗ where d𝜆⃗ is an outward-pointing vector normal to the curve with magnitude equal to the arc length d𝜆. (Note that d𝜆⃗ is perpendicular to our usual ds⃗.)
(a) Using a similar argument to the one you used in Problem 5.273, find the flux through a circle of radius r around a point charge Q and use the 2D version of Gauss’s law to find the electric field in Flatland as a function of r.
(b) Some versions of string theory predict that space is 10 dimensional. The theory also predicts that gravity should obey Gauss’s law: the flux of the gravitational field through any closed surface is proportional to the mass/energy enclosed. Using a similar argument to the one in Part (a), find how the gravitational field from a point mass depends on r in a 10 dimensional space. You don’t need to find the constant coefficient, just the r dependence. Hint: You can figure out how the surface area of an n dimensional sphere depends on its radius just by thinking about units.
You should have concluded that gravity in a 10 dimensional universe does not obey an inverse square law. There are explanations in string theory for why the world appears 3D to us and why gravity appears to obey an inverse square law, but they are too complicated for us to get into here. Go forth and study.

5.276 An infinitely long straight wire carries a uniform charge density 𝜆.
(a) Begin by drawing the wire. Around it you will draw a Gaussian cylinder, consisting of three parts: a tube at a distance 𝜌 from the wire with length L, and circular caps on each end of the tube. (Remember that Gauss’s Law applies only to a closed surface, so we need those caps!)
(b) Find the charge enclosed by the surface. Based on this and Gauss’s Law, calculate the flux of the electric field through the surface.
(c) Based on symmetry considerations, which way does the electric field point?
(d) Calculate the flux of the electric field through the caps at the top and bottom of the Gaussian surface.
(e) Calculate the flux of the electric field through the cylindrical part of the surface. Your answer will be a function of 𝜌, L, and |E⃗|.
(f) Putting your answers together, calculate |E⃗| as a function of distance to the wire.

5.277 An infinite plane carries a uniform charge density 𝜎.
(a) Begin by drawing the plane. Around it you will draw a Gaussian “pillbox”: a circle of radius R at a distance L∕2 above the plane, an identical circle at distance L∕2 below the plane, and a cylinder connecting them.
(b) Find the charge enclosed by the surface. Based on this and Gauss’s Law, calculate the flux of the electric field through the surface.
(c) Based on symmetry considerations, which way does the electric field point?
(d) Calculate the flux of the electric field through the circles at the top and bottom of the Gaussian surface.
(e) Calculate the flux of the electric field through the cylindrical part of the surface.
(f) Putting your answers together, calculate |E⃗| as a function of distance to the plane.
\[ \oint_{\text{loop}} \vec{E} \cdot d\vec{s} = \frac{d}{dt} \iint_{\text{area}} \vec{B} \cdot d\vec{A} \]

circle centered on the wire. For this problem we will consider a wire along the z-axis with a current I moving upward.
[Figure: a straight wire carrying current I, with the distance 𝜌 from the wire marked.]
(a) Find the flux of B⃗ through a square with corners at (L, 0, 0) and (2L, 0, L).
(b) If you increase the current I in the straight wire at a constant rate, rising from 0 to If in time 𝜏, find the emf around the square loop in Part (a).
(c) Take L = .04 m. In what time 𝜏 would you have to raise the current in the wire from 0 to If = (1∕10) A in order to produce an emf equivalent to 9 V in the square loop?
CHAPTER 6
Linear Algebra I
∙ find the components of a vector and use them to represent the vector in terms of the basis
vectors î and ĵ.
∙ solve two linear equations for two unknowns.
∙ find the general solution of a linear differential equation by the method of “guess and
check” (see Chapter 1).
After you read this chapter, you should be able to . . .
∙ multiply a matrix by a column matrix to transform information from one form to another,
or to visually transform a vector.
∙ express any vector as a linear combination of any set of basis vectors.
∙ multiply matrices to create a single matrix that performs multiple transformations.
∙ find and use an inverse matrix to reverse a transformation.
∙ find the eigenvectors and eigenvalues of a matrix, and use them to define a natural basis
for a transformation.
∙ use matrices to rewrite and solve coupled differential equations.
Not knowing your personal history with matrices, we’re going to guess that it looks something
like this. Your high school teacher told you that you multiply matrices by taking the rows of
this matrix, turning them sideways, and multiplying them by the columns of that matrix,
“because I said so.” It didn’t make any sense and it certainly didn’t seem to have any point,
but you dutifully mastered the algorithm, passed the test, and promptly forgot all about it.
We request that you set aside what you learned previously. (If you never have learned about
matrices, you’re that much ahead of the game.) This small step back to the very beginning
of matrices will pay big dividends as we make our way through linear algebra.
6.1 | The Motivating Example on which We’re Going to Base the Whole Chapter
If you skip this section, the rest of the chapter should still make sense, and you can skip
the parts that say “Application to the Three-Spring Problem.” But if you do take the time up
front to work through this solution then the individual pieces will be more coherent, because
you will see how they fit together as you’re moving through them.
At the end of the chapter we’ll return to this same problem and walk you back through it
with all the steps filled in.
FIGURE 6.1 The three-spring problem with Ball 1 at equilibrium and Ball 2 displaced to the right. (Figure labels: springs k1, k2, k3; balls m1, m2; positions x1 = 0 and x2.)
With a bit of introductory physics (Problem 6.10) you can derive the appropriate differ-
ential equations. For m1 = 2 kg, m2 = 9 kg, k1 = 2 N/m, k2 = 6 N/m, and k3 = 21 N/m, the
equations become:
\[ \frac{d^2 x_1}{dt^2} = -4x_1 + 3x_2 \qquad (6.1.1) \]
\[ \frac{d^2 x_2}{dt^2} = \frac{2}{3}x_1 - 3x_2 \qquad (6.1.2) \]
We begin by noticing that Equation 6.1.1 writes ẍ1 as a linear combination of x1 and x2 , and
Equation 6.1.2 does the same for ẍ2 . (If you’re not familiar with that notation, ẋ means dx∕dt
and ẍ means d 2 x∕dt 2 and so on.) In general, a linear combination of x and y is any function
ax + by where a and b are constants. Linear algebra is all about working with that particular
kind of function, using a mathematical tool called a “matrix.”
Section 6.3 will discuss rewriting linear equations with matrices. In Section 6.3 Problem 6.40
you will rewrite Equations 6.1.1 and 6.1.2 as a single matrix equation. This simple change in
notation will allow us to bring all the tools of matrix mechanics to bear on this problem.
Throughout this section, you will see other arrows like the one above. Each one will point to a section
of this chapter where you will learn how to use matrices to solve part of this problem. In some cases your
first reaction may be “I don’t need matrices to solve this.” But this example has only two equations with
two unknowns: engineers frequently work with hundreds of equations and unknowns, and matrices give
us the approach that scales up to that size.
As we will see below, most possible behaviors of this system do not qualify as normal modes.
For most initial conditions the ratio x2 ∕x1 will not remain constant and the balls will not
oscillate with the same frequency (or with any constant frequency). But there is one other
normal mode, described by the following solution.
\[
\begin{aligned}
x_1(t) &= B\cos\left(\sqrt{5}\,t\right) \\
x_2(t) &= -\frac{1}{3}B\cos\left(\sqrt{5}\,t\right)
\end{aligned}
\qquad (6.1.4)
\]
Once again this solution represents a very specific set of initial conditions: you pull Ball 1 a
certain distance from equilibrium, pull Ball 2 a third of that distance in the opposite direction,
and release them from rest. The balls move inward at the same time, and then outward. But
once again the frequencies are the same, the motion is simple and periodic, and the ratio
x2 = −(1∕3)x1 persists.
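A quick check that Equations 6.1.4 satisfy Equations 6.1.1 and 6.1.2, written out as a sketch. Each line compares the second derivative on the left with the right-hand side of the corresponding differential equation.

```latex
\begin{align*}
\ddot{x}_1 &= -5B\cos\left(\sqrt{5}\,t\right),
  & -4x_1 + 3x_2 &= \left(-4B - B\right)\cos\left(\sqrt{5}\,t\right) = -5B\cos\left(\sqrt{5}\,t\right) \\
\ddot{x}_2 &= \tfrac{5}{3}B\cos\left(\sqrt{5}\,t\right),
  & \tfrac{2}{3}x_1 - 3x_2 &= \left(\tfrac{2}{3}B + B\right)\cos\left(\sqrt{5}\,t\right) = \tfrac{5}{3}B\cos\left(\sqrt{5}\,t\right)
\end{align*}
```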
1 If the initial velocities were non-zero we would need to add sine solutions as well as cosines: see Problem 6.12.
You might now expect that for every initial x2 ∕x1 there is a certain normal mode with a
certain frequency associated with it, but in fact Equations 6.1.3 and 6.1.4 represent the only
two normal modes of this particular system. There are no more.
So we’ve learned a lot about our system, but you’re probably not ready to sound the victory
bells. First of all, what if x2 doesn’t happen to start at (2∕3)x1 or −(1∕3)x1 ? And second of all,
how did we find those solutions in the first place? Below we will outline the answers to both
of those questions, but we will have to leave a number of holes that will be filled in with
matrices.
The balls will oscillate with a frequency of √5 and with amplitudes 12 and 4. Their
displacements will always be in opposite directions (out of phase by 𝜋).
There is no way to get either of our two solutions to fit this initial condition, and there
is no normal mode with x2 ∕x1 = 5∕12. It seems that our solutions so far are only
useful in a few very carefully selected cases.
But Equations 6.1.1 and 6.1.2 are linear, homogeneous differential equations, so
any combination of solutions is a solution. If we choose A = 3 in the first normal
mode and B = 1 in the second normal mode, and then add the solutions, we get a new
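solution.

\[
\begin{aligned}
x_1(t) &= 3\cos\left(\sqrt{2}\,t\right) + \cos\left(\sqrt{5}\,t\right) \\
x_2(t) &= 2\cos\left(\sqrt{2}\,t\right) - \frac{1}{3}\cos\left(\sqrt{5}\,t\right)
\end{aligned}
\qquad (6.1.5)
\]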
In Problem 6.13 you will plug these in to verify that they solve the differential
equations and our initial condition. But it’s more important to see that they work
without plugging them in, by following this logic.
∙ The solution x1 = 3 cos(√2 t), x2 = 2 cos(√2 t) solves the differential equations;
we know this because it fits our first normal mode. It meets the initial conditions
x1 (0) = 3, x2 (0) = 2.
∙ The solution x1 = cos(√5 t), x2 = −(1∕3) cos(√5 t) solves the differential equations
because it is our second normal mode. It meets the initial conditions x1 (0) = 1,
x2 (0) = −1∕3.
∙ When we add these two solutions the result must still satisfy the differential
equations, and must meet initial conditions x1 (0) = 4, x2 (0) = 5∕3.
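The logic above can also be spot-checked numerically. The sketch below evaluates Equations 6.1.5 and compares a finite-difference estimate of each second derivative with the right-hand sides of Equations 6.1.1 and 6.1.2. The helper names and the step size h are our own choices, and a small residual at a few sample times is a sanity check rather than a proof.

```python
import math

def x1(t):
    return 3.0 * math.cos(math.sqrt(2.0) * t) + math.cos(math.sqrt(5.0) * t)

def x2(t):
    return 2.0 * math.cos(math.sqrt(2.0) * t) - math.cos(math.sqrt(5.0) * t) / 3.0

def second_derivative(f, t, h=1e-3):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def residuals(t):
    """How far the combined solution is from satisfying each differential equation."""
    r1 = second_derivative(x1, t) - (-4.0 * x1(t) + 3.0 * x2(t))
    r2 = second_derivative(x2, t) - ((2.0 / 3.0) * x1(t) - 3.0 * x2(t))
    return r1, r2

print(x1(0.0), x2(0.0))          # initial positions
for t in (0.0, 1.7, 12.3):
    print(t, residuals(t))       # both residuals should be close to zero
```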
The “Second Example” above is the most important part of this section, capturing an
entire solution in the two numbers A = 3 and B = 1. In other words the system starts out
in a state that can be represented as 3 of the first normal mode plus 1 of the second normal
mode. We know how each of these normal modes evolves in time, so we know how
the entire system will evolve. Hence, the question “how do I find the behavior of this system?”
is replaced with “how do I express the initial state as a linear combination of the two
normal modes?”
Section 6.3 will use matrix multiplication to convert from normal modes to positions.
Section 6.4 will reframe that process as a “change of basis” and give the conditions
under which such a conversion can reliably be done.
We need to make one more point about the above solution: the resulting motion is not
simple. If you were to watch the two balls the motion would look almost random, and it would
not be periodic no matter how long you looked. But hidden under the randomness would
be two simple, periodic systems being added together.
The moral of Figure 6.2 is that if you look at the system and ask “Where is Ball 1 and
where is Ball 2 over time?” the answer looks chaotic. You see the underlying order if you ask
instead “How much of the first normal mode and how much of the second normal mode do
I have?”
FIGURE 6.2 Plots (a) and (b) show normal modes, which are periodic. Plot (c) shows the motion of x1 (t) in the
second example above. It’s a sum of periodic motions, but it is not periodic. (Each plot shows position against time t from 0 to 40.)
Finding the numbers 26/3 and −11/3 in the last example required us to find just the right
linear combination of solutions to match the desired initial conditions. In other words,
it required us to translate from “positions of the balls” to “amplitudes of the normal modes.”
Section 6.6 will find the inverse of the normal-modes-to-positions matrix to accomplish this.
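Until then, here is a minimal sketch of that translation for balls released from rest. At t = 0 the two normal modes give x1(0) = A + B and x2(0) = (2∕3)A − (1∕3)B, and the hypothetical helper `mode_amplitudes` solves those two equations by Cramer's rule. The function name is ours, and the method is a stand-in for the inverse matrix of Section 6.6.

```python
def mode_amplitudes(x1_0, x2_0):
    """Solve A + B = x1(0) and (2/3)A - (1/3)B = x2(0) for the mode amplitudes."""
    # Columns of the coefficient matrix are the two normal modes: (1, 2/3) and (1, -1/3).
    m11, m12 = 1.0, 1.0
    m21, m22 = 2.0 / 3.0, -1.0 / 3.0
    det = m11 * m22 - m12 * m21          # equals -1 for these modes
    A = (x1_0 * m22 - m12 * x2_0) / det
    B = (m11 * x2_0 - m21 * x1_0) / det
    return A, B

print(mode_amplitudes(4.0, 5.0 / 3.0))   # the second example: A = 3, B = 1
print(mode_amplitudes(5.0, 7.0))         # A = 26/3, B = -11/3
```

Working backward from the mode shapes, A = 26∕3 and B = −11∕3 correspond to initial positions x1(0) = 5 and x2(0) = 7.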
\[
\begin{aligned}
3C_1 - 3C_2 &= 0 \\
\frac{2}{3}C_1 - 2C_2 &= 0
\end{aligned}
\qquad (6.1.6)
\]
Section 7.4 (see felderbooks.com) will discuss how to solve simultaneous linear equations. (Remember
again that we need to be able to do this with 100 equations, not just with two as in this example.)
You can probably solve Equations 6.1.6 in a minute or two, but we’ll save you the trouble
and tell you that C1 = C2 = 0. (Looks obvious now, doesn’t it?) That leads us to x1 (t) =
x2 (t) = 0, sometimes called the “trivial solution”: it does satisfy Equations 6.1.1 and 6.1.2, but
it isn’t very helpful. We want to know how the balls move, not how they stand still.
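For reference, here is a sketch of where Equations 6.1.6 come from and why they force the trivial solution. It assumes the guess x1 = C1 cos t, x2 = C2 cos t substituted into Equations 6.1.1 and 6.1.2. The determinant test in the last line is a standard fact about two linear equations in two unknowns, not something stated in this section.

```latex
\begin{align*}
-C_1\cos t &= -4C_1\cos t + 3C_2\cos t
  &&\Rightarrow\quad 3C_1 - 3C_2 = 0 \\
-C_2\cos t &= \tfrac{2}{3}C_1\cos t - 3C_2\cos t
  &&\Rightarrow\quad \tfrac{2}{3}C_1 - 2C_2 = 0 \\
\begin{vmatrix} 3 & -3 \\ \tfrac{2}{3} & -2 \end{vmatrix}
  &= (3)(-2) - (-3)\left(\tfrac{2}{3}\right) = -4 \neq 0
  &&\Rightarrow\quad C_1 = C_2 = 0 \text{ is the only solution}
\end{align*}
```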
So “cos t” didn’t work out too well. What if we change the frequency? In Problem 6.16 you
will guess oscillatory functions with many different frequencies, but they will lead back to the
same place.
So here comes a particularly lucky guess: what if we just happen to try x1 = C1 cos(√2 t)
and x2 = C2 cos(√2 t)? Then we end up here.
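The equations this guess leads to can be sketched by direct substitution into Equations 6.1.1 and 6.1.2. What follows is our own reconstruction, with the common factor cos(√2 t) cancelled, so the exact form may differ from Equations 6.1.7 and 6.1.8 as they are numbered in the text.

```latex
\begin{align*}
-2C_1 &= -4C_1 + 3C_2
  &&\Rightarrow\quad 2C_1 - 3C_2 = 0 \\
-2C_2 &= \tfrac{2}{3}C_1 - 3C_2
  &&\Rightarrow\quad \tfrac{2}{3}C_1 - C_2 = 0
\end{align*}
```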
Once again C1 = C2 = 0 fits, but this time it isn’t the only solution. Equation 6.1.7 can be
rewritten as C2 = (2∕3)C1 . Equation 6.1.8 can be written the same way. That means any numbers
C1 and C2 with that particular ratio will solve both equations. So this time we have a
useful solution—an infinite number of them, in fact—which are the first normal mode we
saw above.
\[
\begin{aligned}
x_1(t) &= A\cos\left(\sqrt{2}\,t\right) \\
x_2(t) &= (2/3)A\cos\left(\sqrt{2}\,t\right)
\end{aligned}
\]
Why was it that Equations 6.1.6 got us nowhere, but Equations 6.1.7 and 6.1.8 worked so
well? Because the first set of equations had only one solution; the second set of equations
was “linearly dependent” (or “redundant”), and had infinitely many solutions.
Section 6.7 will explain how to determine if a set of simultaneous equations has only one solution,
as opposed to being “linearly dependent” (infinitely many solutions) or “inconsistent” (no solutions).
It will show how you can find the frequencies (√2 and √5 in this example) if you assume cosine solutions.
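As a preview, here is a small sketch of that idea. Guessing x1 = C1 cos(𝜔t), x2 = C2 cos(𝜔t) in Equations 6.1.1 and 6.1.2 gives (4 − 𝜔²)C1 − 3C2 = 0 and −(2∕3)C1 + (3 − 𝜔²)C2 = 0. The code assumes the standard result that such a pair has non-trivial solutions only when the determinant of its coefficients is zero; the function names are ours.

```python
import math

def determinant(omega_squared):
    """Determinant of the coefficients of C1 and C2 for a guessed frequency."""
    a, b = 4.0 - omega_squared, -3.0
    c, d = -2.0 / 3.0, 3.0 - omega_squared
    return a * d - b * c

def normal_mode_frequencies():
    """Roots of the determinant, which is s^2 - 7s + 10 with s = omega^2."""
    disc = math.sqrt(7.0 ** 2 - 4.0 * 10.0)
    return sorted(math.sqrt((7.0 + sign * disc) / 2.0) for sign in (-1.0, 1.0))

print(determinant(1.0))            # nonzero: the guess cos t gives only C1 = C2 = 0
print(determinant(2.0))            # zero: infinitely many solutions
print(normal_mode_frequencies())   # sqrt(2) and sqrt(5)
```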
Finally, what led us to expect that we could find a normal mode solution in the first place?
It turns out that in almost every case linear differential equations for coupled oscillators have
normal mode solutions, and the general solutions can be built up from linear combinations
of these solutions.
If you’ve seen linear algebra before, you may remember finding “eigenvectors” and “eigenvalues.”
In Section 6.8 you’ll see that the eigenvectors of a matrix tell you the normal modes of a system,
and the eigenvalues tell you the frequencies of those normal modes.
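For a taste of what that looks like, the sketch below finds the eigenvalues and eigenvectors of the coefficient matrix read off from Equations 6.1.1 and 6.1.2. The helper `eigen_2x2` is a hypothetical hand-rolled routine for a 2×2 matrix with real eigenvalues, not a library call. In this setup each eigenvalue comes out as −𝜔², so the frequency is the square root of its negative.

```python
import math

def eigen_2x2(a, b, c, d):
    """Eigenvalues and (x1, x2) eigenvectors of [[a, b], [c, d]].

    Assumes real eigenvalues and b != 0.
    """
    trace, det = a + d, a * d - b * c
    root = math.sqrt(trace * trace - 4.0 * det)
    pairs = []
    for lam in ((trace + root) / 2.0, (trace - root) / 2.0):
        # First row of the eigenvector equation: a*x1 + b*x2 = lam*x1, with x1 = 1.
        pairs.append((lam, (1.0, (lam - a) / b)))
    return pairs

# Coefficients of x1 and x2 on the right-hand sides of Equations 6.1.1 and 6.1.2.
for lam, (x1, x2) in eigen_2x2(-4.0, 3.0, 2.0 / 3.0, -3.0):
    print("frequency", math.sqrt(-lam), "ratio x2/x1", x2 / x1)
```

The two ratios, 2∕3 and −1∕3, are the same ones that appear in the two normal modes.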
Stepping Back
One of our goals in this section is to have you understand normal modes as a way of thinking
about the behavior of a coupled oscillator. Equations 6.1.3 represent a simple, periodic, and
self-perpetuating solution to the differential equation: that is, if the system is ever in that state,
it will stay in that state forever. Equations 6.1.4 represent another self-perpetuating solution.
Any combination of these solutions is also a solution (because the differential equations are
linear and homogeneous). Most remarkably, any state of the system can be represented as
a linear combination of these solutions—so once you know the combination, you know just
how it will evolve.
But we also want to set up the two key places where matrices play into the process.
∙ We can use a matrix to convert from A and B to x1 (t) and x2 (t). (“If the first coefficient
is 7 and the second is −4, what are the balls actually doing?”) As you will see this is a
job ideally suited for matrices, because it is a linear transformation from one multivariable
representation to a different multivariable representation of the same state. An
inverse matrix performs the same conversion in the other direction, finding the right
combination of normal modes to match any given initial conditions.
∙ We can also use a matrix to find the normal modes in the first place. The “eigenvalues”
of a matrix give the frequency of each normal mode, and the “eigenvectors” give the
ratio of x1 to x2 .
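The first of these conversions can be sketched in a few lines. The hypothetical function `positions` takes the two coefficients and a time, and returns what the balls are actually doing, assuming both balls start at rest so that only cosines appear.

```python
import math

def positions(A, B, t):
    """Ball positions at time t for A of the first normal mode plus B of the second."""
    mode1 = math.cos(math.sqrt(2.0) * t)    # first normal mode, frequency sqrt(2)
    mode2 = math.cos(math.sqrt(5.0) * t)    # second normal mode, frequency sqrt(5)
    x1 = A * mode1 + B * mode2
    x2 = (2.0 / 3.0) * A * mode1 - (1.0 / 3.0) * B * mode2
    return x1, x2

# "If the first coefficient is 7 and the second is -4, what are the balls actually doing?"
for t in (0.0, 1.0, 2.0):
    print(t, positions(7.0, -4.0, t))
```

At t = 0 this gives x1 = 3 and x2 = 6, and the same two coefficients determine the positions at every later time.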
These two uses of matrices are not unrelated, but that won’t be clear until you have learned
about eigenvalues. Focus now on how two numbers such as “A = 26∕3 and B = −11∕3” can
describe the entire behavior of the system—the initial conditions and how they will evolve
through time—and you will begin the chapter on a solid foundation.
(a) The position x1 = x2 = 0 represents the equilibrium position. Now imagine that Ball 1 is at this position precisely, but Ball 2 is slightly to the right of this position, as shown in Figure 6.1. Which springs now push on Ball 1, and in which directions? Which springs now push on Ball 2, and in which directions?
(b) Now imagine that Ball 1 is displaced to the right by a distance x1, and Ball 2 is displaced to the right by x2. This time the answers will be quantitative, and they will include the three spring constants k1, k2, and k3. We begin with Ball 1.
i. The force of Spring 1 on Ball 1 depends only on the position x1 (the other position is irrelevant). Write a formula for this force.
ii. How much is Spring 2 stretched? Your answer should be a function of x1 and x2. For example, if both balls are displaced the same amount to the right then Spring 2 isn’t stretched at all. As a check on your answer, make sure it gives a positive answer when Spring 2 is longer than its equilibrium length, a negative answer when it is shorter, and 0 when Spring 2 is at its equilibrium length (neither stretched nor compressed.)
iii. Using your answer to Part (ii), write a formula for the force of Spring 2 on Ball 1.
iv. Putting the two forces together and using F = ma, write a differential equation for x1 (t). Be sure you have the sign of each force correct. Hint: One way to check yourself is to plug in numbers and make sure you get Equation 6.1.1!
(c) Repeat Part (b) for x2.

6.11 Plug Equations 6.1.3 into Equations 6.1.1 and 6.1.2 to confirm that they are valid solutions for any constant A.

6.12 Our treatment of the three-spring problem was incomplete because we looked only at the cosine parts of the solutions, ignoring the sines.
(a) Show that the following equations are valid solutions to Equations 6.1.1 and 6.1.2 for any constants A1 and A2.

\[
\begin{aligned}
x_1(t) &= A_1\cos\left(\sqrt{2}\,t\right) + A_2\sin\left(\sqrt{2}\,t\right) \\
x_2(t) &= \frac{2}{3}A_1\cos\left(\sqrt{2}\,t\right) + \frac{2}{3}A_2\sin\left(\sqrt{2}\,t\right)
\end{aligned}
\qquad (6.1.9)
\]

(b) Show that the initial conditions ẋ1(0) = ẋ2(0) = 0 lead to A2 = 0 and therefore to the solution we used in the Explanation (Section 6.1.1).
(c) Do Equations 6.1.9 represent a normal mode of the system?
(d) The three-spring problem has four initial conditions: the initial position and velocity of Ball 1 and the initial position and velocity of Ball 2. What must be true of all the initial conditions for the balls to follow Equations 6.1.9?

6.13 In our second example we created a new solution by combining the two normal modes, along with carefully choosing values of the arbitrary constants.
(a) Show that Equations 6.1.5 are a valid solution to Equations 6.1.1 and 6.1.2.
(b) Use Equations 6.1.5 to find x1 (0) and x2 (0) and confirm that they meet the intended initial conditions.

6.14 Arguably the most important point in our treatment of the three-spring problem, both for understanding normal modes and for setting up linear algebra, is creating new solutions as linear combinations of the two solutions we found. (This works for any linear homogeneous differential equation.)
(a) Show that if you add Equations 6.1.3 to Equations 6.1.4 the resulting functions are valid solutions to Equations 6.1.1 and 6.1.2 for any values of A and B.
(b) What are the initial positions x1 (0) and x2 (0) for these generalized solutions? Your answers will be functions of A and B.
(c) Now turn it around; starting with your answers to Part (b), solve for A and B as functions of x1 (0) and x2 (0). In doing so, you find a direct way to meet any given initial positions, as we did in our third example. (Note that we still do not have a fully general solution to the three-spring problem. We left off all the sine terms based on our initial assumption that both balls start at rest, so we don’t have to worry about v1 (0) and v2 (0).)
(d) Write the solution if you pull Ball 1 to the left by a distance of 1 and Ball 2 to the right by the same amount and release them.

6.15 A key point in our explanation is that a normal mode represents simple oscillation with one frequency, but that combinations of normal modes are generally not simple or periodic.
Often they do not appear to be built from sines and cosines (even though they are). Graph each function below from t = 0 to t = 30.
(a) sin(√2 t).
(b) sin(√5 t).
(c) sin(√5 t) + sin(√2 t).
(d) 3 sin(√5 t) + 2 sin(√2 t).

6.16 In this problem you will solve Equations 6.1.1 and 6.1.2 by guess and check.
(a) Plug in the solution x₁ = C₁ cos(2t), x₂ = C₂ cos(2t). Write down and solve the resulting two linear equations for C₁ and C₂. What motion does the resulting solution describe?
(b) Plug in the solution x₁ = C₁ cos(√3 t), x₂ = C₂ cos(√3 t). Write down and solve the resulting two linear equations for C₁ and C₂. What motion does the resulting solution describe?
(c) Plug in the solution x₁ = C₁ cos(𝜔t), x₂ = C₂ cos(𝜔t) where 𝜔 is a constant. Show that for any value of 𝜔, you can solve the resulting linear equations with C₁ = C₂ = 0.
(d) Now plug in the solution x₁ = C₁ cos(√5 t), x₂ = C₂ cos(√5 t). Show that the resulting linear equations both reduce to the same equation, and are therefore solved as long as C₁ and C₂ are in the correct ratio. This ratio should lead you to our second normal mode.

6.17 In this problem you will walk through the entire three-spring problem, from the beginning, with different masses and spring constants. The differential equations you will solve are:
\[
\frac{d^2x_1}{dt^2} = -5x_1 + 8x_2 \tag{6.1.10}
\]
\[
\frac{d^2x_2}{dt^2} = x_1 - 3x_2 \tag{6.1.11}
\]
(a) The functions x₁ = A cos t, x₂ = kA cos t provide a valid solution to Equations 6.1.10 and 6.1.11 for any value of A, provided you choose the right value of k. Show that this works by finding the right value of k.
(b) Show that the functions x₁ = C cos(2t), x₂ = kC cos(2t) do not provide a valid solution unless C = 0.
(c) Show that the functions x₁ = D cos(√2 t), x₂ = kD cos(√2 t) do not provide a valid solution unless D = 0.
(d) The functions x₁ = B cos(√7 t), x₂ = kB cos(√7 t) provide a valid solution to Equations 6.1.10 and 6.1.11 for any value of B, provided you choose the right value of k. Show that this works by finding the right value of k.
(e) Parts (a) and (d) represent solutions, but are they normal modes? How can you tell?
(f) What initial conditions are represented by the solutions to Parts (a) and (d)?
(g) Find a solution to match the initial conditions x₁ = −2, x₂ = 2.
(h) Find a solution to match the initial conditions x₁ = 10, x₂ = 2.

6.18 Each of the scenarios below refers to a solution to Equations 6.1.1–6.1.2. For each one have a computer create an animation of two moving balls. Draw them as balls oscillating about their equilibrium positions, which you should separate by a distance of 3.
(a) Show the two balls oscillating in the first normal mode, with A = 2.
(b) Show the two balls oscillating in the second normal mode, with B = 2.
(c) Show the two balls oscillating in a sum of the two normal modes with A = B = 1.
(d) Describe the motion in each of the three cases.
When we say this is a “matrix” we’re referring to the grid of numbers. The labels (such as
“Granger, H” or “Poison”) explain what this particular matrix represents, but they are not
part of the matrix itself.
The matrix can be viewed as a list of horizontal rows or as a list of vertical columns.
1. Is “all Longbottom’s grades” a row or a column?
2. Is “all grades on Invulnerability” a row or a column?
The “dimensions” of a matrix are the number of rows and columns . . . in that order. So a
10×20 matrix means 10 rows and 20 columns.
3. What are the dimensions of the gradebook matrix?
For two matrices to be “equal” they must be exactly the same in every way: same dimen-
sions, and every element the same. If everything is not precisely the same, the two matrices
are not equal.
4. What must x and y be, in order to make the following matrix equal to Professor Snape’s
gradebook matrix?
\[
\begin{pmatrix}
100 & 105 & 99 & 100 \\
80 & x+y & 85 & 85 \\
95 & 90 & 10 & 85 \\
70 & 75 & x-y & 75 \\
85 & 90 & 95 & 90
\end{pmatrix}
\]
If two matrices do not have exactly the same dimensions, you cannot add or subtract them.
If they do have the same dimensions, you add and subtract them just by adding or subtracting
each individual element.
As an example: Professor Snape has decided that his grades are too high, so he decides
to subtract the following grade-curving matrix from his original grade matrix.
\[
\begin{pmatrix}
5 & 0 & 10 & 0 \\
5 & 0 & 10 & 0 \\
5 & 0 & 10 & 0 \\
5 & 0 & 10 & 0 \\
5 & 0 & 10 & 0
\end{pmatrix}
\]
Multiplying something by 3 always means adding it to itself three times, so once you know
how to add matrices, you can figure out how to multiply a matrix by a number.
7. If we call the gradebook matrix 𝐆, what is matrix 3𝐆? In other words, what is
𝐆 + 𝐆 + 𝐆?
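The rules above for adding, subtracting, and scaling matrices can be sketched in a few lines of Python. This is only an illustration: the function names and the small 2×2 matrices `G` and `curve` are made up for the sketch and are not the gradebook from this exercise.

```python
def same_shape(a, b):
    """True when two matrices (lists of rows) have identical dimensions."""
    return len(a) == len(b) and all(len(ra) == len(rb) for ra, rb in zip(a, b))

def add(a, b):
    """Element-by-element sum; only defined for equal dimensions."""
    if not same_shape(a, b):
        raise ValueError("cannot add matrices with different dimensions")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def subtract(a, b):
    """Element-by-element difference; only defined for equal dimensions."""
    if not same_shape(a, b):
        raise ValueError("cannot subtract matrices with different dimensions")
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(number, a):
    """Multiply every element of the matrix by a number."""
    return [[number * x for x in row] for row in a]

G = [[90, 80], [70, 60]]     # hypothetical 2x2 grade matrix
curve = [[5, 0], [5, 0]]     # hypothetical curving matrix

print(subtract(G, curve))                  # [[85, 80], [65, 60]]
print(scale(3, G) == add(add(G, G), G))    # True
```

The last line checks, for this one matrix, that multiplying by 3 gives the same result as adding the matrix to itself.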
The plural of matrix is “matrices” (not “matrixes.”) The singular of matrices is “matrix” (not
“matrice”).2
You can see that the matrix at the beginning of this section fits none of these categories,
and that a 1×1 matrix fits all of them; it’s a column matrix, row matrix, square matrix, and
diagonal matrix.
All of these properties and definitions, as well as others we will introduce throughout the
chapter, are collected in Appendix E. If we refer to the “main diagonal” (introduced here)
or to an “orthogonal matrix” (introduced later) and you don’t remember what that is, be
sure to have that appendix conveniently bookmarked.
(d) What is M23 + M14 ? 6.22 For each equation below, solve for the values of
all the letters or explain why it’s not possible.
6.20 For each of the operations indicated below ( ) ( ) ( )
2x x+y 6
give the answer or explain why the oper- (a) + =
5y −6x 2
ation cannot be carried out.
( ) ( ) ( )
( ) ( ) x+y 4x + y 10 6
3 −3 0 2 6 4 (b) + =
𝐀= 𝐁= 3x − 2y x + 5y 8 9
2 4 −1 9 n 8
( ) ( ) ( )
5 7 2 4
𝐂= (c) x =
9 −n 3 6
6.23 For each equation below, solve for the values of
(a) 𝐀+𝐁 all the letters or explain why it’s not possible.
( ) ( ) ( )
(b) 2𝐂 1 2 1 w x 1 1
(a) + =
(c) 𝐁+𝐂 3 4 2 y z 0 0
(d) 2𝐀 − 𝐁 ( ) ( ) ( )
x+y 3 1 2
(e) A12 + C21 (b) − =
2 x−y 3 4
6.21 For each of the operations indicated below ( ) ( )
give the answer or explain why the oper- 2 4
(c) x =
ation cannot be carried out. 3 8
6.24 Write the 3 × 3 matrix defined by Mᵢⱼ = i² − j².

6.25 Write a formula Mᵢⱼ for the elements of the matrix
\[
\begin{pmatrix} 3 & 5 \\ 4 & 6 \end{pmatrix}.
\]
A formula like that is how you write a matrix in “index notation.” (See e.g. Problem 6.24.)

6.26 In order to stock a small hotel bathroom, you need one bottle of shampoo, two bars of soap, and four towels. A large hotel bathroom requires two bottles of shampoo, three bars of soap, and five towels. We represent this information with column matrices as follows. (Remember that the labels are not part of the matrices per se, but they tell us what the elements represent.)
\[
\mathbf{S} =
\begin{array}{l} \text{shampoo bottles} \\ \text{soap bars} \\ \text{towels} \end{array}
\begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}
\qquad
\mathbf{L} =
\begin{array}{l} \text{shampoo bottles} \\ \text{soap bars} \\ \text{towels} \end{array}
\begin{pmatrix} 2 \\ 3 \\ 5 \end{pmatrix}
\]
Answer the following questions in matrix form. For instance, for the question “what supplies do you need to stock three small bathrooms?” you would not answer “three shampoo bottles, six . . .” but rather:
\[
3\mathbf{S} =
\begin{array}{l} \text{shampoo bottles} \\ \text{soap bars} \\ \text{towels} \end{array}
\begin{pmatrix} 3 \\ 6 \\ 12 \end{pmatrix}
\]
(a) What supplies would you need to stock four large bathrooms?
(b) What supplies would you need to stock three small bathrooms and four large bathrooms?
(c) What supplies would you need to stock n small bathrooms and m large bathrooms?
Think of the 200 as a conversion from “number of books” (on the right) to “number of pages”
(on the left). The last part of this problem indicates that the conversion is reversible, a very
useful property of some—but not all—conversions.
We can express problem 2 the same way, but we want to convert a single-fact (number of
books) to a double-fact (number of words and number of pictures). The conversion there-
fore requires two pieces of information: words per book, and pictures per book. We use a 2×1
column matrix to represent each double-fact and write, analogously to our equation above:
We have left question marks where the answers go, although they aren’t hard to figure out.
But focus on the transformation process this equation represents.
∙ On the far right is a column matrix representing the information we start with. Its
labels go on its left.
∙ The matrix in the middle acts as a conversion from its labels on top, to its labels on the
left—in this case, from kiddy and chapter books to pictures and words.
∙ On the far left, on the other side of the equal sign, is a column matrix representing
the information we end up with, also labeled on the left.
Once you understand the setup of the problem, it becomes obvious how to solve it. We can
represent the contents of 4 kiddy books and 6 chapter books as follows.
\[
4\begin{pmatrix} 30 \\ 500 \end{pmatrix} + 6\begin{pmatrix} 10 \\ 20{,}000 \end{pmatrix}
\qquad \text{(read “4 kiddy books plus six chapter books”)}
\]
This gives us the numbers we need to fill in the question marks in our previous equation.
There are 180 pictures and 122,000 words in total. More importantly, this example gives
us the general algorithm for multiplying a matrix (on the left) by a column matrix
(on the right).
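If the number of columns in the left-hand matrix does not equal the number of elements in the right-hand matrix, then the operation is illegal. Otherwise, writing the columns of the left-hand matrix as 𝐜₁, 𝐜₂, …, 𝐜ₙ and the elements of the right-hand column matrix as E₁, E₂, …, Eₙ, the product is given by the matrix:
\[
E_1\mathbf{c}_1 + E_2\mathbf{c}_2 + E_3\mathbf{c}_3 + \dots + E_n\mathbf{c}_n
\]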
In other words, the column matrix on the right supplies the coefficients for a linear com-
bination of the vectors represented by the left-hand matrix. As usual, an example is worth a
thousand explanations.
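As one such example, here is a minimal Python sketch of the column-combination rule, applied to the books problem. The function name `times_column` and the list-of-rows layout are choices made for this sketch; the numbers are the ones used in this section.

```python
def times_column(matrix, column):
    """Multiply a matrix (list of rows) by a column matrix (list of numbers)
    by building a linear combination of the matrix's columns."""
    n_cols = len(matrix[0])
    if n_cols != len(column):
        raise ValueError("illegal: number of columns must equal number of elements")
    result = [0] * len(matrix)
    for j, coeff in enumerate(column):
        col_j = [row[j] for row in matrix]          # j-th column of the matrix
        result = [r + coeff * c for r, c in zip(result, col_j)]
    return result

# Columns: kiddy book, chapter book. Rows: pictures, words.
books = [[30, 10],
         [500, 20000]]

print(times_column(books, [4, 6]))   # [180, 122000]
```

The loop adds "4 of the first column" and then "6 of the second column," which is exactly the reading of the product given above.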
Whenever you multiply matrices you are creating a linear combination of somethings. Each
column in the left-hand matrix represents one “something”; the right-hand matrix represents
the coefficients. In our example above we multiplied matrices to combine 4 kiddy books with
6 chapter books. Section 6.4 will discuss another important example, spatial vectors. You will
work through many other examples in the problems.
“But I was taught to think of the left-hand matrix as rows, not columns,” we hear some of you
cry. If you know that method you should be able to convince yourself that it is equivalent to
the one we’re presenting (see Section 6.5 Problem 6.91), and that either one takes about as
much work as the other. But we hope to convince you that thinking of the column on the
right as a series of coefficients for the columns of the matrix on the left gives you insight into
the meaning of matrix multiplication that the other method doesn’t.
For instance, in our third example (p. 271) we saw that A = (26∕3) and B = −(11∕3) allowed
us to match the initial conditions x1 (0) = 5, x2 (0) = 7. We can now write that relationship as
a matrix equation.
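For reference, that relationship can be written out as follows. This is a reconstruction assembled from the mode shapes and coefficients quoted in this section, so the exact layout is an assumption; the arithmetic itself can be checked directly.

```latex
\[
\begin{pmatrix} 1 & 1 \\ 2/3 & -1/3 \end{pmatrix}
\begin{pmatrix} 26/3 \\ -11/3 \end{pmatrix}
= \frac{26}{3}\begin{pmatrix} 1 \\ 2/3 \end{pmatrix}
- \frac{11}{3}\begin{pmatrix} 1 \\ -1/3 \end{pmatrix}
= \begin{pmatrix} 5 \\ 7 \end{pmatrix}
\]
```

The top row gives 26∕3 − 11∕3 = 5 and the bottom row gives 52∕9 + 11∕9 = 7.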
Stop and make sure that you can clearly see all the following information in that one
equation.
∙ The first normal mode represents initial conditions x1 = 1 and x2 = 2∕3; the second
normal mode, x1 = 1 and x2 = −1∕3.
∙ The matrix multiplication says “I am multiplying the entire first normal mode (the first
column) by 26∕3, the second normal mode by −11∕3, and adding the results.”
∙ The end result converts the given information from “this much of the first normal
mode and that much of the second” to “this initial position for the first ball and that
for the second.”
So the matrix multiplication above allows us to answer the question: given any values of A
and B (the arbitrary constants in the solution), what are the corresponding initial conditions?
In Section 6.6 we will use the “inverse matrix” to answer the opposite question: given a set
of initial conditions to meet, what should we choose for A and B? (That’s how we got the
numbers 26∕3 and −11∕3 in the first place.)
Stepping Back
We began this section with p(b) = 200b because you can think of a matrix as a kind of func-
tion, transforming each input into an output. For a regular function the input and output
are numbers. For a matrix the input and output are lists of numbers, not necessarily the
same length, which we write in vertical columns. In that sense matrices are more general
than functions, but they can only represent transformations where each output is a linear
combination of the inputs. If x = a² − e^b and y = ln(ab), you can’t use matrices to convert
from a and b to x and y. Hence the name “linear algebra.”
The way you apply a matrix to a list of numbers is through matrix multiplication:
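\[
\begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}
\begin{pmatrix} 5 \\ 6 \end{pmatrix}
= 5\begin{pmatrix} 1 \\ 2 \end{pmatrix} + 6\begin{pmatrix} 3 \\ 4 \end{pmatrix}
= \begin{pmatrix} 23 \\ 34 \end{pmatrix}
\]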
The 2×2 matrix in the above example might represent the following information: “Each
small structure contains 1 ton of steel and 2 tons of concrete, while each large structure
contains 3 tons of steel and 4 tons of concrete.” We could then interpret the above matrix
multiplication as a conversion from “5 small and 6 large structures” to “23 tons of steel and
34 tons of concrete.”
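\[
\underbrace{\begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}}_{\substack{\text{columns: small, large} \\ \text{rows: steel, concrete}}}
\underbrace{\begin{pmatrix} 5 \\ 6 \end{pmatrix}}_{\text{small, large}}
=
\underbrace{\begin{pmatrix} 23 \\ 34 \end{pmatrix}}_{\text{steel, concrete}}
\]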
Watch how the matrix multiplication, when you view the process as we presented it, explic-
itly says “5 of these and 6 of those.” Watch how this problem, the “books” example above,
the normal modes of the three-spring problem, and the many different examples you will
encounter in the problems all use linear combinations to convert information from one
form to another. This single idea is powerful enough to make its presence felt in all areas of
physics and engineering.
A transformation matrix always converts from the labels on its top to the labels on its left.
That rule will help you keep your variables straight even when the variables are not explicitly
written, which they usually aren’t. You know that the statement “every book has 200 pages”
would generally be represented by an equation like p = 200b. Of course that equation doesn’t
mean anything unless you know that p represents the number of pages and b the number of
books, but we generally write the equation without the labels and let the interpretation come
later. Similarly with matrices: we are labeling them all explicitly right now to emphasize how
the information in the top labels is transformed to the information in the left-hand labels.
(Small and large structures to steel and concrete.) But it is more common to do the math
with unlabeled numbers and let the interpretation come at the end.
For this exercise your job is entirely visual: no calculations are involved, so your answers will be approximate.
1. Draw the vector −A⃗ + B⃗.
2. Draw the vector (1∕2)A⃗ + 2B⃗.
3. Draw the vector 0A⃗ + 0B⃗.
4. The graph on the left shows three vectors. For each vector, estimate how many A⃗s and B⃗s would add up to that vector. In other words, find a and b to express each vector as aA⃗ + bB⃗.
[Figure: three vectors v1, v2, and v3 drawn on xy-axes.]
To approach such a system you break the force of gravity FG into two components. The “parallel” (to the ramp) component F∥ causes acceleration; the “perpendicular” component F⟂ causes a normal force that in turn causes friction. In effect you are creating a new set of axes that is more natural to this problem than the traditional horizontal and vertical axes. In the drawing below we label these new axes x′ and y′.
[Figure: a block of mass m on a ramp inclined at angle θ, with the forces FN, Ffr, and FG labeled.]
In the Discovery Exercise (Section 6.4.1) you combined two vectors A⃗ and B⃗ to build a
wide variety of vectors. But can they create any vector, or are some vectors beyond their
reach? We know that we can span the xy-plane with the two vectors î and ĵ, or we can use î′
and ĵ′ . Is it important that our two basis vectors are unit vectors? Is it crucial that they are
perpendicular?
You may be surprised at how generally we can answer that question.
Any two non-parallel, non-zero vectors in a plane form a basis for that plane.
You will prove that in Problem 6.55 by finding a general formula for combining any two vectors to create any third vector. But right now, let’s play with an example. We’re going to span the xy-plane with these two vectors.
V⃗₁ = î + 3ĵ, V⃗₂ = 2î
We can represent A⃗ as 4î + 9ĵ, or as 3V⃗₁ + (1∕2)V⃗₂.
[Figure: the vector A⃗ drawn both as 4î plus 9ĵ and as 3V⃗₁ plus (1∕2)V⃗₂.]
So when we describe a vector we won’t give its î- and ĵ-components. Instead we will describe one vector as 12V⃗₁ + V⃗₂, another as 𝜋V⃗₁ − 3V⃗₂, and so on. For any vector you can draw in the xy-plane, there is a way (and there is only one way) to write it as a combination of V⃗₁ and V⃗₂.
Don’t think of the îĵ representation as the “real vector” and the V⃗₁V⃗₂ one as simply a shorthand for expressing so many îs and ĵs: these are two perfectly equivalent representations of the same space. We can do all our graphing on V⃗ paper.
FIGURE 6.3 The point 3V⃗₁ + (1∕2)V⃗₂ represented on V⃗₁V⃗₂ graph paper.
Since we have two different ways of expressing the same vector, there must be a way to convert between them. Converting from the V⃗₁V⃗₂ to the îĵ representation is easy; the other way takes a bit of algebra.
Can you see how this topic lends itself to matrices? Any vector can be expressed as a linear
combination of basis vectors, and “linear combination” is the phrase that always points us to
matrices. Using a column matrix to represent each vector, the first calculation in the example
above can be written:
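\[
\underbrace{\begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix}}_{\substack{\text{columns: } \vec{V}_1,\ \vec{V}_2 \\ \text{rows: } \hat{\imath},\ \hat{\jmath}}}
\underbrace{\begin{pmatrix} 10 \\ -7 \end{pmatrix}}_{\vec{V}_1,\ \vec{V}_2}
=
\underbrace{\begin{pmatrix} -4 \\ 30 \end{pmatrix}}_{\hat{\imath},\ \hat{\jmath}}
\]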
The first column of the transformation matrix is V⃗1 and the second is V⃗2 . So multiplying that
matrix by the vector in the middle says “I want 10 of V⃗1 and −7 of V⃗2 .”
And what about the second calculation in the example? We can of course express the
question with matrices:
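\[
\underbrace{\begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix}}_{\substack{\text{columns: } \vec{V}_1,\ \vec{V}_2 \\ \text{rows: } \hat{\imath},\ \hat{\jmath}}}
\underbrace{\begin{pmatrix} a \\ b \end{pmatrix}}_{\vec{V}_1,\ \vec{V}_2}
=
\underbrace{\begin{pmatrix} 4 \\ 5 \end{pmatrix}}_{\hat{\imath},\ \hat{\jmath}}
\]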
But that doesn’t seem to get us any closer to a solution. We will readily answer such ques-
tions when we introduce the “inverse matrix”—which is exactly what it sounds like, a matrix
designed to go the other way—in Section 6.6.
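Until the inverse matrix is available, both directions of this conversion can be sketched numerically. The Python below is an illustration only: it uses Cramer's rule as a stand-in for the inverse-matrix method of Section 6.6, and the function names `to_ij` and `to_v1v2` are invented for the sketch.

```python
from fractions import Fraction

V1 = (1, 3)   # V1 = i + 3j
V2 = (2, 0)   # V2 = 2i

def to_ij(a, b):
    """Convert a*V1 + b*V2 into (x, y) components in the i-j basis."""
    return (a * V1[0] + b * V2[0], a * V1[1] + b * V2[1])

def to_v1v2(x, y):
    """Solve a*V1 + b*V2 = (x, y) for (a, b) with Cramer's rule."""
    det = V1[0] * V2[1] - V2[0] * V1[1]
    if det == 0:
        raise ValueError("V1 and V2 are parallel, so they do not form a basis")
    a = Fraction(x * V2[1] - V2[0] * y, det)
    b = Fraction(V1[0] * y - x * V1[1], det)
    return (a, b)

print(to_ij(10, -7))    # (-4, 30)
print(to_v1v2(4, 5))    # (Fraction(5, 3), Fraction(7, 6))
```

The zero-determinant check is the numerical form of the requirement that the two basis vectors be non-parallel.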
[Figure: three xy-plots, each running from −15 to 15 in x, showing the points A, B, and C together with images such as TB and TC under transformation 𝐓.]
You might now expect us to give you “the matrix” that performs transformation 𝐓, but be
careful. The matrix that performs this transformation depends on the coordinate system—
the basis—you use to span the space.
∙ If all points are represented by giving their x- and y-components (in other words in the îĵ basis), then the matrix $\begin{pmatrix} -3 & -10 \\ 3 & 8 \end{pmatrix}$ represents the transformation 𝐓, as shown in all three pictures above.
∙ Above we looked at the basis vectors V⃗₁ = î + 3ĵ and V⃗₂ = 2î. If you describe any point in this basis, then transformation 𝐓 is performed by the matrix $\begin{pmatrix} 9 & 2 \\ -21 & -4 \end{pmatrix}$. The pictures don’t change, although the algebra does.
∙ If you express all points as combinations of B⃗₁ = −5î + 3ĵ and B⃗₂ = 2î − ĵ, then the matrix that performs transformation 𝐓 is $\begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}$.
These are obviously three different matrices, but they represent only one transformation,
meaning that the “before-and-after” picture above looks exactly the same in all three cases.
It is the numbers used to describe the drawings, and therefore the numbers that effect the
transformation, that differ. We describe these as three “similar matrices.”
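One way to check that these three matrices describe a single transformation is to convert the îĵ matrix into each of the other bases. The sketch below uses the standard change-of-basis formula P⁻¹𝐓P, where the columns of P are the new basis vectors; the text has not introduced that formula at this point (Section 7.2 takes it up), so treat it as an assumption of the sketch. The helper names are invented here.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix, using exact fractions."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[Fraction(m[1][1], det), Fraction(-m[0][1], det)],
            [Fraction(-m[1][0], det), Fraction(m[0][0], det)]]

def in_basis(t, basis):
    """Represent transformation t (given in the i-j basis) in a new basis.
    The columns of `basis` are the new basis vectors in i-j components."""
    return matmul(inverse(basis), matmul(t, basis))

T = [[-3, -10], [3, 8]]
V = [[1, 2], [3, 0]]       # columns: V1 = i + 3j, V2 = 2i
B = [[-5, 2], [3, -1]]     # columns: B1 = -5i + 3j, B2 = 2i - j

print(in_basis(T, V) == [[9, 2], [-21, -4]])   # True
print(in_basis(T, B) == [[3, 0], [0, 2]])      # True
```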
If this definition seems confusing, think back to the inclined plane example at the begin-
ning of the section. The force of gravity on a particular block is a vector that has a well
defined magnitude (mg) and direction (down). The representation of that vector, however, is $\begin{pmatrix} 0 \\ -mg \end{pmatrix}$ in the îĵ basis and $\begin{pmatrix} -mg\sin\theta \\ -mg\cos\theta \end{pmatrix}$ in the î′ĵ′ basis.⁴ Likewise, when we talk about similar matrices we are talking about two different sets of numbers that represent the same physical thing.
The third matrix above is “diagonal,” meaning that all the elements off the main diagonal
are zero. This makes its effect particularly easy to describe: “multiply the B⃗1 -component by 3
and the B⃗2 -component by 2.” If we had to work with transformation 𝐓 a lot, it would help to
do all our work in the B⃗1 B⃗2 basis, which is referred to as the “natural basis” for this particular
transformation. A different transformation would have a different natural basis.
Our main point here is the difference between a transformation and a matrix: one trans-
formation is represented by different matrices in different bases. Section 6.8 will discuss how
to find the natural basis where a given transformation is represented by a diagonal matrix.
Section 6.9 will represent the equations for the three-spring problem using a matrix and
solve them by expressing everything in the natural basis for that matrix. Section 7.2 will take
up very generally how to convert any transformation matrix from one basis to another.
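\[
\mathbf{k} =
\begin{array}{l} \text{kiddy books} \\ \text{chapter books} \end{array}
\begin{pmatrix} 1 \\ 0 \end{pmatrix}
\qquad
\mathbf{c} =
\begin{array}{l} \text{kiddy books} \\ \text{chapter books} \end{array}
\begin{pmatrix} 0 \\ 1 \end{pmatrix}
\]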
This 𝐤 is of course not a vector in the sense of “a quantity with magnitude and direction”
but it behaves in a similar way mathematically, so we look on it as a generalized version of
a vector.5 The contents of any library can be expressed as a𝐤 + b𝐜 for the correct choices of
the numbers a and b. So instead of saying “I have four kiddy books and six chapter books” I
could describe my library as 4𝐤 + 6𝐜.
Alternatively, we can use a vector (a different 2 × 1 matrix) to represent “number of pic-
tures” and “number of words.” The vector 𝐩 represents 1 picture and no words, and 𝐰 the
opposite. Any library can be expressed as a linear combination p𝐩 + w𝐰. This is a different
way of expressing the contents of the same library. With that mental shift, we can look on
the following matrix multiplication as a change from the 𝐤𝐜 basis to the 𝐩𝐰 basis.
4 To paraphrase the Buddha, do not confuse a pair of numbers representing a force with the actual force.
5 We refer to “magnitude-and-direction” quantities as “spatial vectors” and designate them with arrow variables (V⃗); we designate generalized vectors with boldfaced lowercase variables (𝐤).
6.43 Consider the vectors A⃗ and B⃗ defined by the drawing. The magnitude of A⃗ is 1; B⃗ is obviously somewhat larger. You can express any vector as aA⃗ + bB⃗ where a and b are carefully chosen numbers.
coefficients that would be used to express this vector in the form aA⃗ + bB⃗.
In Problems 6.44–6.48, A⃗ = î + 2ĵ and B⃗ = −4î + ĵ.
6 In 2D the requirement that the two vectors be non-parallel amounts to saying they can’t be on the same line. In
3D the requirement is that they can’t be on the same plane. See Problem 6.57.
ii. Draw S⃗₁S⃗₂ graph paper, as we did for a different basis in Figure 6.3. On that paper show U⃗ and 𝐅𝐬U⃗.
iii. Now, use the matrix $\begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$ to convert vector U⃗ to the îĵ basis.
iv. Act on your new vector with 𝐅𝐢. Draw the îĵ vectors before and after. (Your picture should look exactly like your previous picture, showing that the same vector has been acted on by the same transformation.)
(b) Repeat Part (a) on a different vector of your own choosing. Remember to start by choosing a vector in the S⃗₁S⃗₂ basis.

6.53 A transformation matrix is represented in the îĵ basis by the matrix $\mathbf{T} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$.
(a) Draw î and 𝐓î.
(b) Draw ĵ and 𝐓ĵ.
(c) Draw (î + ĵ) and 𝐓(î + ĵ).
(d) Draw (î − ĵ) and 𝐓(î − ĵ).
(e) Your first answers may look confusing, but Parts (c) and (d) should make it clear what the natural basis for this transformation is. (The matrix performs a stretch along one of them, and leaves the other alone.) What is that basis?
(f) Write the matrix that represents this transformation in the basis you described in Part (e). This should not require any calculations.

6.54 A vector V⃗ is expressed in the form pP⃗ + qQ⃗ where P⃗ is the vector 2î − 4ĵ, and Q⃗ is the vector 8î + 4ĵ. In other words p and q are the components of V⃗ in the P⃗Q⃗ basis.
(a) Write a transformation matrix that will stretch your vector by a factor of 6 in the P⃗-direction and a factor of 7 in the Q⃗-direction. (This is difficult if the vector is specified in the îĵ basis, but easy in the P⃗Q⃗ basis, so you should write it in the P⃗Q⃗ basis.)
(b) Write a transformation matrix that will reverse the P⃗-component of your vector, leaving the Q⃗-component unchanged. (Same note.)
(c) Write a matrix to convert the vector V⃗ to the îĵ basis; in other words the matrix should convert $\begin{pmatrix} p \\ q \end{pmatrix}$ to $\begin{pmatrix} x \\ y \end{pmatrix}$.

6.55 The Explanation (Section 6.4.2) claimed that “Any two non-parallel, non-zero vectors form a basis for a plane.” In this problem you will prove it. Let A⃗ = Aₓî + A_yĵ and B⃗ = Bₓî + B_yĵ represent two vectors that we want to use as a basis; your goal is to express vector V⃗ = Vₓî + V_yĵ in the form V⃗ = aA⃗ + bB⃗.
(a) Solve for the variables a and b in terms of all the constants given in the problem.
(b) Under what circumstances are your answers invalid? Answer by looking at your formulas, and then show that the answers match the criterion we described.

6.56 The three vectors P⃗ = 3î − 2ĵ + k̂, Q⃗ = 7î + 4ĵ − k̂, and R⃗ = (1∕2)k̂ form a basis for three-space.
(a) Use a matrix multiplication to translate the vector 2P⃗ + 10Q⃗ − 6R⃗ to the îĵk̂ basis.
(b) Translate the vector 2î − 10ĵ + 6k̂ into the P⃗Q⃗R⃗ basis.

6.57 The three vectors P⃗ = 3î − 2ĵ + k̂, Q⃗ = 7î + 4ĵ − k̂, and R⃗ = î + 8ĵ − 3k̂ do not form a basis for three-space. Show that you cannot use these three vectors in any combination to create the vector î + ĵ. (These three vectors do not form a basis, even though they are not zero and not parallel, because they are “linearly dependent,” a topic we will discuss further in Section 6.7.)

6.58 The two vectors A⃗ = î + ĵ + 2k̂ and B⃗ = î + k̂ form a basis for the plane z = x + y.
(a) Express the vector 3A⃗ + 2B⃗ in the îĵk̂ representation.
(b) Express the vector 3î + ĵ + 4k̂ in the A⃗B⃗ representation. (You’ll just have to play around with this one.)
(c) Prove that the vector î cannot be expressed in the A⃗B⃗ representation.
(d) Prove that any vector aA⃗ + bB⃗ lies on the plane z = x + y.

6.59 The vectors î and î + k̂ span a space.
(a) What space do they span, and what is the dimensionality of that space?
(b) Give one example of a third vector you could add to make a basis for a higher dimensional space.
(c) Give one example of a third vector you could add that would not make a basis for a higher dimensional space.

6.60 For each set of vectors below say whether or not it is a basis. If so, what is the dimensionality of the space that it’s a basis for?
(a) A⃗ = î, B⃗ = ĵ + k̂
(b) A⃗ = î, B⃗ = ĵ, C⃗ = 3ĵ
(c) A⃗ = î, B⃗ = ĵ, C⃗ = 3k̂
(d) 𝐚 represents 1 house and 2 apartments, 𝐛 represents 3 houses and 5 apartments.

6.61 An object is free to move about in the xy-plane with no forces on it.
(a) What physical information would you have to give about the object to fully specify its initial conditions?
(b) What is the dimensionality of the space of initial conditions for this object?
(c) Write one possible set of basis vectors for the initial conditions for the object. Hint: Every basis vector must have the same dimensionality as the space itself. That is, your answer to Part (b) tells you how many components each basis vector must have.

6.62 An object is confined to move along the curve y = x² in the xy-plane. The space S is defined as the set of all possible positions for the object along that curve. What is the dimensionality of S?

Section 6.3 showed how a matrix multiplication can be used to find the initial positions associated with a combination of normal modes in the three-spring problem.
\[
\underbrace{\begin{pmatrix} 1 & 1 \\ 2/3 & -1/3 \end{pmatrix}}_{\substack{\text{columns: 1st normal mode, 2nd normal mode} \\ \text{rows: position of Ball 1, position of Ball 2}}}
\underbrace{\begin{pmatrix} 26/3 \\ -11/3 \end{pmatrix}}_{\text{1st normal mode, 2nd normal mode}}
=
\underbrace{\begin{pmatrix} 5 \\ 7 \end{pmatrix}}_{\text{position of Ball 1, position of Ball 2}}
\]
Problems 6.63–6.66 refer to this conversion.

6.63 (a) What basis vectors do we use to express the state of the system as the positions of the two balls?
(b) What basis vectors do we use to express the state of the system as a combination of normal modes?

6.64 Represent the initial condition “3 of the first normal mode and 6 of the second one” in the basis of initial ball positions.

6.65 Represent the initial condition “a of the first normal mode and b of the second one” in the basis of initial ball positions.

6.66 Represent the initial condition “Ball 1 at position 2 and Ball 2 at position −1” in the basis of normal modes.
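\[
\underbrace{\begin{pmatrix} 120 \\ 50 \end{pmatrix}}_{\text{hydrogen, oxygen}}
=
\underbrace{\begin{pmatrix} 2 & 2 & 2 \\ 1 & 2 & 0 \end{pmatrix}}_{\substack{\text{columns: } \mathrm{H_2O},\ \mathrm{H_2O_2},\ \mathrm{H_2} \\ \text{rows: hydrogen, oxygen}}}
\underbrace{\begin{pmatrix} 10 \\ 20 \\ 30 \end{pmatrix}}_{\mathrm{H_2O},\ \mathrm{H_2O_2},\ \mathrm{H_2}}
\qquad\text{that is,}\qquad \mathbf{a} = \mathbf{X}\,\mathbf{m}
\]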
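\[
\underbrace{\begin{pmatrix} 520 \\ 400 \end{pmatrix}}_{\text{protons, neutrons}}
=
\underbrace{\begin{pmatrix} 1 & 8 \\ 0 & 8 \end{pmatrix}}_{\substack{\text{columns: hydrogen, oxygen} \\ \text{rows: protons, neutrons}}}
\underbrace{\begin{pmatrix} 120 \\ 50 \end{pmatrix}}_{\text{hydrogen, oxygen}}
\qquad\text{that is,}\qquad \mathbf{p} = \mathbf{Y}\,\mathbf{a}
\]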
More concisely, we can combine the equations 𝐩 = 𝐘𝐚 and 𝐚 = 𝐗𝐦 into the equation
𝐩 = 𝐘(𝐗𝐦).
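\[
\underbrace{\begin{pmatrix} 520 \\ 400 \end{pmatrix}}_{\text{protons, neutrons}}
=
\underbrace{\begin{pmatrix} 1 & 8 \\ 0 & 8 \end{pmatrix}}_{\substack{\text{columns: hydrogen, oxygen} \\ \text{rows: protons, neutrons}}}
\left[
\underbrace{\begin{pmatrix} 2 & 2 & 2 \\ 1 & 2 & 0 \end{pmatrix}}_{\substack{\text{columns: } \mathrm{H_2O},\ \mathrm{H_2O_2},\ \mathrm{H_2} \\ \text{rows: hydrogen, oxygen}}}
\underbrace{\begin{pmatrix} 10 \\ 20 \\ 30 \end{pmatrix}}_{\mathrm{H_2O},\ \mathrm{H_2O_2},\ \mathrm{H_2}}
\right]
\qquad\text{that is,}\qquad \mathbf{p} = \mathbf{Y}\,[\,\mathbf{X}\,\mathbf{m}\,]
\]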
You must read this equation from right to left: 𝐦 gets acted on by 𝐗, and then the result gets
acted on by 𝐘, to produce 𝐩.
Now we want to define one matrix that performs the molecules-to-particles transforma-
tion. We will call this new matrix the product 𝐘𝐗. Reading from right to left again the matrix
𝐘𝐗 will represent “Do transformation 𝐗 and then transformation 𝐘.”
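\[
\underbrace{\begin{pmatrix} 520 \\ 400 \end{pmatrix}}_{\text{protons, neutrons}}
=
\underbrace{\begin{pmatrix} ? & ? & ? \\ ? & ? & ? \end{pmatrix}}_{\substack{\text{columns: } \mathrm{H_2O},\ \mathrm{H_2O_2},\ \mathrm{H_2} \\ \text{rows: protons, neutrons}}}
\underbrace{\begin{pmatrix} 10 \\ 20 \\ 30 \end{pmatrix}}_{\mathrm{H_2O},\ \mathrm{H_2O_2},\ \mathrm{H_2}}
\qquad\text{that is,}\qquad \mathbf{p} = \mathbf{Y}\mathbf{X}\;\mathbf{m}
\]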
If you’ve followed our train of thought to this point, you can reason out what goes into
those question marks. For instance, the first column represents the number of protons and
neutrons in a water molecule. A water molecule contains 2 hydrogen atoms (each with 1
proton and no neutrons), and 1 oxygen atom (with 8 protons and 8 neutrons), so it contains
a total of 10 protons and 8 neutrons.
But you should also see that this is exactly the operation represented by multiplying matrix
𝐘 by the first column (the “water molecule” column) of matrix 𝐗. You compute
\[
2\begin{pmatrix} 1 \\ 0 \end{pmatrix} + 1\begin{pmatrix} 8 \\ 8 \end{pmatrix}
= \begin{pmatrix} 10 \\ 8 \end{pmatrix}
\]
to find the first column of matrix 𝐘𝐗. You then repeat the process for
hydrogen peroxide and hydrogen molecules, putting each in its own column of matrix 𝐘𝐗.
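The column-by-column reasoning above can be checked numerically. What follows is a minimal
Python sketch, not part of the original text: the helper names `mat_vec` and `compose` are
hypothetical, and matrices are assumed to be stored as plain lists of rows.

```python
# Matrices are lists of rows; column vectors are plain lists.
Y = [[1, 8],      # protons per hydrogen atom, per oxygen atom
     [0, 8]]      # neutrons per hydrogen atom, per oxygen atom
X = [[2, 2, 2],   # hydrogen atoms in H2O, H2O2, H2
     [1, 2, 0]]   # oxygen atoms in H2O, H2O2, H2
m = [10, 20, 30]  # molecules of H2O, H2O2, H2


def mat_vec(A, v):
    """Multiply matrix A by column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def compose(A, B):
    """Build AB one column at a time: column j of AB is A times column j of B."""
    columns = [mat_vec(A, [row[j] for row in B]) for j in range(len(B[0]))]
    # Turn the list of columns back into a list of rows.
    return [list(row) for row in zip(*columns)]


YX = compose(Y, X)
print(YX)                          # [[10, 18, 2], [8, 16, 0]]
print(mat_vec(Y, mat_vec(X, m)))   # two steps: [520, 400]
print(mat_vec(YX, m))              # one step:  [520, 400]
```

The last two lines agree, which is the whole point of defining 𝐘𝐗 this way: acting with 𝐗 and
then 𝐘 gives the same result as acting once with the product.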
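\[
\mathbf{AB} = \begin{pmatrix} \mathbf{A}\mathbf{b}_1 & \mathbf{A}\mathbf{b}_2 & \mathbf{A}\mathbf{b}_3 & \cdots & \mathbf{A}\mathbf{b}_n \end{pmatrix}
\]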
Each individual term in this sequence is the product of matrix 𝐀 times a column of matrix 𝐁, as
defined in Section 6.3. These individual products are not added together, but instead form different
columns of the result.
In words: “You already know how to multiply a matrix times a single column. When mul-
tiplying matrix 𝐀 by multicolumn matrix 𝐁, multiply 𝐀 by each column of 𝐁 to produce the
corresponding column of the product 𝐀𝐁.” As usual, your best strategy is probably to follow
the example below, and then go back and try to make sense of the definition above.
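Question: Multiply
\[
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}
\begin{pmatrix} 7 & 10 \\ 8 & 11 \\ 9 & 12 \end{pmatrix}
\]
Answer: We find the answer by moving column-by-column through the right-hand
matrix.
\[
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}
\begin{pmatrix} 7 \\ 8 \\ 9 \end{pmatrix}
\text{ gives us the first column of the answer, }
\begin{pmatrix} 50 \\ 122 \end{pmatrix}.
\]
\[
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}
\begin{pmatrix} 10 \\ 11 \\ 12 \end{pmatrix}
\text{ gives us the second column of the answer, }
\begin{pmatrix} 68 \\ 167 \end{pmatrix}.
\]
We conclude that
\[
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}
\begin{pmatrix} 7 & 10 \\ 8 & 11 \\ 9 & 12 \end{pmatrix}
= \begin{pmatrix} 50 & 68 \\ 122 & 167 \end{pmatrix}.
\]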
Notes on this process:
∙ The right-hand matrix can have as few or as many columns as you like; each one
becomes a column in the answer.
∙ The left-hand matrix can have as few or as many rows as you like; that determines
the size of the columns in the answer.
∙ However, the number of rows in the right-hand matrix must perfectly match
the number of columns in the left-hand matrix, or the multiplication is
illegal.
∙ Reversing the order will in many cases turn a legal multiplication into an illegal
one. In this case the multiplication remains legal, but the answer is completely
different:
\[
\begin{pmatrix} 7 & 10 \\ 8 & 11 \\ 9 & 12 \end{pmatrix}
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}
= \begin{pmatrix} 47 & 64 & 81 \\ 52 & 71 & 90 \\ 57 & 78 & 99 \end{pmatrix}.
\]
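The notes above can be tried out with a short Python sketch. This is an illustration under
stated assumptions rather than anything from the original text: the function name `multiply`
is hypothetical, matrices are lists of rows, and each entry is computed directly instead of
column-by-column (the result is the same).

```python
def multiply(A, B):
    """Return the product AB, or raise ValueError if the multiplication is illegal."""
    if len(A[0]) != len(B):
        raise ValueError("columns of the left matrix must match rows of the right matrix")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 10],
     [8, 11],
     [9, 12]]

print(multiply(A, B))   # [[50, 68], [122, 167]]
print(multiply(B, A))   # [[47, 64, 81], [52, 71, 90], [57, 78, 99]]

try:
    multiply(A, A)      # 2x3 times 2x3: three columns on the left, two rows on the right
except ValueError as err:
    print("illegal:", err)
```

The 2×3 times 3×2 product is 2×2, the reversed product is 3×3, and a 2×3 matrix cannot be
multiplied by itself at all.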
Given that goal, how do you know that this is the right algorithm? One way is to go back to
the molecules-to-particles example above and consider again what the elements of the new
matrix must be, and why. A more formal proof is presented in Problem 6.90.
If your matrices are labeled, the new matrix picks up the top labels of the right-hand
matrix, and the left labels of the left-hand matrix. The example below illustrates this, as well
as giving another example of a composite transformation.
We have used this example before: “Each ‘kiddy book’ has 30 pictures and 500 words.
Each ‘chapter book’ has 10 pictures and 20,000 words.” Now we add to that scenario:
every college apartment has 5 kiddy books and 20 chapter books. Every family
household has 40 kiddy books and 35 chapter books. Multiplying these two matrices
gives us a single matrix that converts from “apartments and houses” to “pictures and
words.”
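\[
\begin{pmatrix} 30 & 10 \\ 500 & 20{,}000 \end{pmatrix}
\begin{pmatrix} 5 & 40 \\ 20 & 35 \end{pmatrix}
=
\begin{pmatrix} 350 & 1550 \\ 402{,}500 & 720{,}000 \end{pmatrix}
\]
The left-hand matrix has rows labeled pictures and words, and columns labeled kiddy book and
chapter book. The right-hand matrix has rows labeled kiddy book and chapter book, and columns
labeled apartment and house. The product therefore has rows labeled pictures and words, and
columns labeled apartment and house.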
Make sure you understand why we multiplied those two matrices in that order, why the
labels came out the way they did, and what the final matrix accomplishes. This
example summarizes everything we have said in this section, if you follow it carefully.
Point Matrices
Computer graphics software provides the underlying engine for modern computer games
and animated movies. A drawing is represented in such systems by a “point matrix.”7 Other
matrices effect transformations such as rotating, scaling, or moving such a drawing.
For instance, the matrix $\begin{pmatrix} 0 & 5 & 5 & 0 & 0 \\ 0 & 0 & 3 & 3 & 0 \end{pmatrix}$
represents a line from (0, 0) to (5, 0), and then a line from there to (5, 3), and so on. It
therefore creates the drawing on the left.
[Figure: the drawing produced by this point matrix, on axes running from 0 to 12 horizontally
and from 0 to 5 vertically.]
If we multiply matrix $\begin{pmatrix} 2 & 0 \\ 0 & 2/3 \end{pmatrix}$ by this matrix (try
it!), the resulting matrix renders as Figure 6.4.
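For readers who want to check their “try it!” answer, here is a small Python sketch of that
multiplication. It is an illustration only: the names `rectangle`, `scale`, and `multiply`
are hypothetical, and exact fractions are used so that 2∕3 is not rounded.

```python
from fractions import Fraction

# Each column of the point matrix is one (x, y) point of the drawing.
rectangle = [[0, 5, 5, 0, 0],    # x-coordinates of successive points
             [0, 0, 3, 3, 0]]    # y-coordinates of successive points
scale = [[2, 0],
         [0, Fraction(2, 3)]]


def multiply(A, B):
    """Return the matrix product AB (matrices are lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


scaled = multiply(scale, rectangle)
for row in scaled:
    print([str(value) for value in row])
# ['0', '10', '10', '0', '0']
# ['0', '0', '2', '2', '0']
```

Every x-coordinate has doubled and every y-coordinate has shrunk to two-thirds of its value,
so the 5-by-3 rectangle becomes a 10-by-2 rectangle.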
7 The term “point matrix” is not in common use, but as far as we can tell no other term for such a matrix is in
common use either. This term was coined by Henry Rich, to whom we are in debt not only for “point matrix” but
for a great deal of the way we present linear algebra.
[Figure: four panels labeled 𝐒𝐏, 𝐑𝐒𝐏, 𝐑𝐏, and 𝐒𝐑𝐏.]
Alternative Interpretations
Our motto for matrix multiplication is “it’s all about the columns.” We think of both the left-
hand and the right-hand matrices as a series of columns that might represent spatial vectors
or other collections of related information.
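\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} e & f \\ g & h \end{pmatrix}
\qquad
\text{First step: } e\begin{pmatrix} a \\ c \end{pmatrix} + g\begin{pmatrix} b \\ d \end{pmatrix}
\tag{6.5.1}
\]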
That step becomes the first column of the solution, and you proceed similarly along all the
columns of the right-hand matrix. But two other interpretations are available.
1. “It’s all about the rows.” Think of both matrices as collections of rows. For
Equation 6.5.1 you would start here.
\[
\text{First step: } a\begin{pmatrix} e & f \end{pmatrix} + b\begin{pmatrix} g & h \end{pmatrix}
\]
That step becomes the first row of the solution, and you proceed similarly down all
the rows of the left-hand matrix.
2. “Rows on the left, columns on the right.” This method is based on the fact that a
row matrix times a column matrix gives a 1 × 1 answer. For Equation 6.5.1 you would
start here.
\[
\text{First step: } \begin{pmatrix} a & b \end{pmatrix}\begin{pmatrix} e \\ g \end{pmatrix}
= \begin{pmatrix} ae + bg \end{pmatrix}
\]
This number becomes one element of the solution. You will experiment with this
method in Problem 6.91.
8 When you use matrices to transform shapes that don’t have vertices, such as circles, you approximate the shapes
We call these different “interpretations” (you could also use the word “methods”) because
they are all perfectly equivalent. The rule of when you are allowed to multiply matrices, and
the matrix you get after you perform the multiplication, are the same in all three cases. The
last of these methods is perhaps the most convenient way to multiply matrices in your head.
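The claim that all three interpretations are equivalent can be tested on a small example. The
following Python sketch is illustrative only: the function names `by_columns`, `by_rows`, and
`by_entries` are hypothetical, and the two sample matrices are arbitrary.

```python
def by_columns(A, B):
    """Column j of AB is a combination of A's columns, weighted by column j of B."""
    columns = []
    for j in range(len(B[0])):
        column = [0] * len(A)
        for k in range(len(B)):
            for i in range(len(A)):
                column[i] += B[k][j] * A[i][k]
        columns.append(column)
    return [list(row) for row in zip(*columns)]   # columns back to rows


def by_rows(A, B):
    """Row i of AB is a combination of B's rows, weighted by row i of A."""
    rows = []
    for i in range(len(A)):
        row = [0] * len(B[0])
        for k in range(len(B)):
            for j in range(len(B[0])):
                row[j] += A[i][k] * B[k][j]
        rows.append(row)
    return rows


def by_entries(A, B):
    """Entry (i, j) of AB is row i of A times column j of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(by_columns(A, B))                                         # [[19, 22], [43, 50]]
print(by_columns(A, B) == by_rows(A, B) == by_entries(A, B))    # True
```

The three functions perform the same multiplications and additions; they differ only in the
order in which the partial sums are grouped.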
But we will continue to emphasize the column interpretation. From toy examples such
as chapter books and kiddy books, to point matrices and geometric transformations, to
the three-spring problem, we use matrices to find linear combinations of vectors that are
generally represented as columns. This makes it particularly natural to view every matrix mul-
tiplication as a transformation from one basis to another, a subject we discussed in Section 6.4
and to which we will return in Section 7.3.
There is also a purely computational argument for a column-based approach. Some widely used
numerical environments, such as Fortran and MATLAB, store matrices by column, so in those
settings processing a matrix column-by-column can save expensive memory-access time.
Stepping Back
The algorithm for multiplying two matrices may appear both complicated and arbitrary. The
solution for “appears complicated” is practice, and in this case we do recommend what may
feel like mindless drill; keep multiplying matrices until the process is comfortable. It comes
up often enough to be worth your time.
The solution for “appears arbitrary” is to understand that while matrix multiplication is
not generally commutative (𝐀𝐁 ≠ 𝐁𝐀), it is always associative:
𝐀(𝐁𝐂) = (𝐀𝐁)𝐂
The left side of that equation says “act on 𝐂 with 𝐁, and then act on the result with 𝐀.” The
right side describes a new matrix 𝐀𝐁 that, when acting on 𝐂, produces the same result in one
step. Our goal in this section has been to convince you that the product 𝐀𝐁 as we defined it
here is the correct and only matrix that accomplishes this.
2. We know that matrix multiplication is not generally “commutative”; that is, 𝐀𝐁 ≠ 𝐁𝐀.
Show that the multiplication in Part 1 is commutative, meaning 𝐀𝐈 = 𝐈𝐀.
3. Your goal is now to find a matrix 𝐁 such that 𝐀𝐁 = 𝐈 (where 𝐈 is the matrix you found
in Part 1).
(a) Explain why matrix 𝐁 must have dimensions 2×2.
Now that we know the dimensions, we will let
$\mathbf{B} = \begin{pmatrix} w & y \\ x & z \end{pmatrix}$; in other words, use letters
to represent all the unknown numbers. The defining equation 𝐀𝐁 = 𝐈 therefore becomes:
\[
\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}
\begin{pmatrix} w & y \\ x & z \end{pmatrix} = \mathbf{I}
\tag{6.6.1}
\]
(b) Perform the matrix multiplication on the left side of Equation 6.6.1. The result
should be another 2×2 matrix that you are setting equal to 𝐈.
(c) Recall that if two matrices equal each other, all the corresponding elements must
be equal. Use this fact to write four equations to solve for the unknown w, x, y,
and z.
(d) Solve your equations to find matrix 𝐁.
See Check Yourself #41 in Appendix L
4. Show that the multiplication in Part 3 is commutative, meaning 𝐀𝐁 = 𝐁𝐀.
The numbers 3 and 8 make a simple demonstration, but the point is that $2^x$ and $\log_2 x$ will
always reverse each other in this way. One turns −5 into 1∕32, the other turns 1∕32 into −5.
One turns 0 into 1, the other turns 1 into 0.
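Equations 6.6.3 offer an analogous matrix equation.
\[
\begin{aligned}
&\text{When } \begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix} \text{ multiplies (or transforms) }
\begin{pmatrix} 4 \\ -1 \end{pmatrix} \text{ it gives you } \begin{pmatrix} 7 \\ 17 \end{pmatrix}. \\
&\text{When } \begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix} \text{ multiplies (or transforms) }
\begin{pmatrix} 7 \\ 17 \end{pmatrix} \text{ it gives you } \begin{pmatrix} 4 \\ -1 \end{pmatrix}.
\end{aligned}
\tag{6.6.3}
\]
Once again, $\begin{pmatrix} 4 \\ -1 \end{pmatrix}$ and $\begin{pmatrix} 7 \\ 17 \end{pmatrix}$
are just one simple example. The more general point is that
$\begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}$ and $\begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix}$
always reverse each other’s effects in this way: they are “inverse matrices.”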
Below we present two formal definitions that may not seem to have anything to do with
the reversing effect described above. Then we will wind our way back to the idea of inverting
a transformation.
The Discovery Exercise (Section 6.6.1) presents a straightforward, but time-consuming, process
for finding the inverse of a matrix: you write the equation 𝐀𝐀⁻¹ = 𝐈, substitute letters
for the unknown elements of 𝐀⁻¹, and solve. In Problem 6.94 you will practice this method
on a specific matrix. In Problem 6.123 you will apply this method to the generic 2×2 matrix,
and show that:
\[
\text{The inverse of } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \text{ is }
\frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
\tag{6.6.4}
\]
For instance, in the Discovery Exercise, you found the inverse of the matrix
$\mathbf{A} = \begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix}$. Using Equation 6.6.4 we can see
readily that
\[
\mathbf{A}^{-1} = \frac{1}{6 - 5}\begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix}
= \begin{pmatrix} 3 & -1 \\ -5 & 2 \end{pmatrix}.
\]
You may want to confirm for yourself that this matrix multiplies by 𝐀 (in either order!) to
produce the identity matrix 𝐈.
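Equation 6.6.4 translates directly into a few lines of code. The sketch below is a minimal
Python illustration, assuming integer entries and using the hypothetical names `inverse_2x2`
and `multiply`; exact fractions are used so that the confirmation suggested above involves no
rounding.

```python
from fractions import Fraction


def inverse_2x2(M):
    """Invert a 2x2 matrix of integers using Equation 6.6.4."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0, so this matrix has no inverse")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def multiply(A, B):
    """Return the matrix product AB (matrices are lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[2, 1],
     [5, 3]]
A_inv = inverse_2x2(A)
identity = [[1, 0], [0, 1]]

print(A_inv == [[3, -1], [-5, 2]])       # True
print(multiply(A, A_inv) == identity)    # True
print(multiply(A_inv, A) == identity)    # True
```

The `ValueError` branch anticipates the matrices with no inverse that are discussed later in
this section.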
In practice, you never find an inverse matrix by using the definition directly; you use a
formula or, better yet, a computer. The important thing is to understand what an inverse
matrix is, and when to use it.
Question: If 𝐗 is a matrix, and 𝐓 is a transformation, then what is 𝐓⁻¹𝐓𝐗? (Try to answer this
question for yourself before you read on.)
Answer: 𝐓⁻¹𝐓 = 𝐈 (by the definition of 𝐓⁻¹), and 𝐈𝐗 = 𝐗 (by the definition of 𝐈), so
𝐓⁻¹𝐓𝐗 = 𝐗.
Question: So what?
Answer: The equation 𝐓⁻¹𝐓𝐗 = 𝐗 tells us that 𝐓⁻¹, acting on the matrix 𝐓𝐗, produces the
matrix 𝐗. In other words, 𝐓⁻¹ must undo whatever 𝐓 did.9
That little bit of algebra connects the definition 𝐓⁻¹𝐓 = 𝐈 with our original goal of
creating an “inverse function,” reversing the inputs and the outputs. Consider the following
examples.
∙ 𝐗 is a point matrix that describes a shape, and 𝐓 is a transformation that rotates the
shape 30° clockwise and then stretches it by a factor of 2 in the x-direction. So what
does 𝐓⁻¹ do? Since it must turn 𝐓𝐗 back into 𝐗, it must shrink it by a factor of 2 in the
x-direction and then rotate it 30° counterclockwise.
[Figure: the shape 𝐗, the transformed shape 𝐓𝐗, and the action of 𝐓⁻¹.]
∙ 𝐱 contains the number of offices and conference rooms, and 𝐓 converts this into the
number of chairs and tables contained therein. Since 𝐓⁻¹ must turn 𝐓𝐱 back into 𝐱, it
can be used to answer the question: “If you have c chairs and t tables, then how many
offices and conference rooms must you have?” (This calculation will sometimes lead
to fractional offices or conference rooms, which you should interpret as “no possible
combination would lead to exactly that many chairs and tables.”)
Question: A molecule of methane (CH4 ) has 1 carbon atom and 4 hydrogen atoms. A
molecule of butane (C4 H10 ) has 4 carbon atoms and 10 hydrogen atoms. Each
carbon atom has 6 protons and 6 neutrons, while each hydrogen atom has 1 proton
and no neutrons. If a sample contains 3192 protons and 2214 neutrons, how many
molecules each of methane and butane does it contain?
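Answer: We start by writing a matrix 𝐗 to answer the easier question: given a certain
number of molecules of methane and butane, how many protons and neutrons are
there? We can get that by multiplying the atoms-to-particles matrix (on the left) by the
molecules-to-atoms matrix (on the right).
\[
\mathbf{X} = \begin{pmatrix} 6 & 1 \\ 6 & 0 \end{pmatrix}
\begin{pmatrix} 1 & 4 \\ 4 & 10 \end{pmatrix}
= \begin{pmatrix} 10 & 34 \\ 6 & 24 \end{pmatrix}
\]
The rows of 𝐗 are labeled protons and neutrons; its columns are labeled CH₄ and C₄H₁₀.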
9 We are implicitly using a fact that we emphasized in Section 6.5, the associativity of matrix
multiplication: that is, we are assuming (𝐓⁻¹𝐓)𝐗 is the same as 𝐓⁻¹(𝐓𝐗).
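Next we can use Equation 6.6.4 to find 𝐗⁻¹, which will convert from protons and
neutrons to molecules of methane and butane.
\[
\mathbf{X}^{-1} = \frac{1}{36}\begin{pmatrix} 24 & -34 \\ -6 & 10 \end{pmatrix}
= \begin{pmatrix} 2/3 & -17/18 \\ -1/6 & 5/18 \end{pmatrix}
\]
The rows of 𝐗⁻¹ are labeled CH₄ and C₄H₁₀; its columns are labeled protons and neutrons.
\[
\begin{pmatrix} 2/3 & -17/18 \\ -1/6 & 5/18 \end{pmatrix}
\begin{pmatrix} 3192 \\ 2214 \end{pmatrix}
= \begin{pmatrix} 37 \\ 83 \end{pmatrix}
\]
The sample contains 37 molecules of methane and 83 molecules of butane.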
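Section 6.4 gave the example of using V⃗1 = î + 3ĵ and V⃗2 = 2î as a basis. If you understand
matrix multiplication, it’s easy to see how the following matrix multiplication converts from
a V⃗1V⃗2 representation to an îĵ representation of a vector.
\[
\begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix}
\begin{pmatrix} 10 \\ -7 \end{pmatrix}
= \begin{pmatrix} -4 \\ 30 \end{pmatrix}
\]
Here the rows of the matrix and of the result are labeled î and ĵ; the columns of the matrix
and the rows of the middle vector are labeled V⃗1 and V⃗2.
The first column of the transformation matrix is V⃗1 and the second is V⃗2. So multiplying that
matrix by the vector in the middle says “I want 10 of V⃗1 and −7 of V⃗2.” The answer tells you
that you end up with −4î + 30ĵ.
Inverse matrices reverse this operation: given any vector represented in the îĵ basis, you
can find its V⃗1- and V⃗2-components.
\[
\begin{pmatrix} 0 & 1/3 \\ 1/2 & -1/6 \end{pmatrix}
\begin{pmatrix} -4 \\ 30 \end{pmatrix}
= \begin{pmatrix} 10 \\ -7 \end{pmatrix}
\]
Here the rows of the matrix and of the result are labeled V⃗1 and V⃗2; the columns of the matrix
and the rows of the middle vector are labeled î and ĵ.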
You will work with this use of inverse matrices in Problems 6.116 and 6.117.
You could write this out as two regular algebra equations and then solve for x and y using
substitution or elimination, but we want to use this example to show you how you can do
algebra directly with matrices. In this case, we’re going to solve this equation for 𝐱.
If matrices were numbers we would solve 𝐀𝐱 = 𝐛 by dividing both sides by 𝐀, but there
is no such thing as matrix division. How would you solve the numerical equation 5x = 12 if
we told you that you were allowed to multiply but not divide? Of course you would multiply
both sides by 1∕5. In an analogous way—and for analogous reasons—you can solve matrix
equations by multiplying by the inverse matrix.
But there is one limitation to this analogy, because matrix multiplication is not commu-
tative. To do the same thing to both sides of the equation we must “left-multiply” both sides
by 𝐀⁻¹, or “right-multiply” both sides. If we left-multiply one side while right-multiplying the
other side, we have not done the same thing to both sides.
In the case of Equation 6.6.5, left-multiplying is the operation we want.
\[
\mathbf{A}^{-1}\mathbf{A}\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}
\]
Note that we have legally done the same thing to both sides. But now look what we get! 𝐀⁻¹𝐀
becomes 𝐈 and then 𝐈𝐱 becomes 𝐱.
\[
\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}
\]
Find the inverse of 𝐀 and multiply it by 𝐛 and you find the matrix we were looking for.
\[
\mathbf{x} = \begin{pmatrix} -3 \\ 1/2 \end{pmatrix}
\]
As a quick check, our original matrix equation was equivalent to the two equations
3x + 2y = −8 and 7x − 10y = −26, and you can easily verify that x = −3, y = 1∕2 is the correct
solution. The value of this technique is not for solving systems of linear equations; row
reduction (Section 7.4 at felderbooks.com) is more efficient for that. However, there are
situations where it is important to do algebra directly with matrices, often without knowing
numerical values for them, and this provides a simple illustration of the technique.
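The steps 𝐀⁻¹𝐀𝐱 = 𝐀⁻¹𝐛 and 𝐱 = 𝐀⁻¹𝐛 can be followed numerically. The Python sketch below is an
illustration, not the authors’ code: it assumes the coefficient matrix and right-hand side
read off from the two equations 3x + 2y = −8 and 7x − 10y = −26 quoted above, and the helper
names `inverse_2x2` and `mat_vec` are hypothetical.

```python
from fractions import Fraction

A = [[3, 2],
     [7, -10]]    # coefficients of x and y in the two equations
b = [-8, -26]     # right-hand sides


def inverse_2x2(M):
    """Invert a 2x2 matrix with the formula (1/(ad - bc)) [[d, -b], [-c, a]]."""
    (a, b_, ), (c, d) = M
    det = Fraction(a * d - b_ * c)
    if det == 0:
        raise ValueError("ad - bc = 0, so this matrix has no inverse")
    return [[d / det, -b_ / det],
            [-c / det, a / det]]


def mat_vec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(entry * x for entry, x in zip(row, v)) for row in M]


x = mat_vec(inverse_2x2(A), b)     # left-multiply b by the inverse of A
print([str(value) for value in x])  # ['-3', '1/2']
print(mat_vec(A, x) == b)           # True: the solution reproduces b
```

As the text notes, row reduction is more efficient for actually solving systems; the sketch is
only meant to make the algebra 𝐱 = 𝐀⁻¹𝐛 concrete.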
Non-invertible Transformations
We noted before that only square matrices can be inverted. You’ll show in Problem 6.235 why
this makes sense in terms of what matrix transformations do; roughly speaking the output
of a 2 × 3 matrix doesn’t have enough information to be inverted and the output of a 3 × 2
matrix has too much. Square matrices have the same amount of information in their input
and output, so they generally can be inverted.
Generally, but not always. If we tell you a terrarium contains 300 ants (6 legs each and no
hands) and 20 spiders (8 legs each and no hands), you can easily tell us that the terrarium
contains 1960 legs and zero hands. The matrix for doing that conversion is
$\begin{pmatrix} 6 & 8 \\ 0 & 0 \end{pmatrix}$. If we
just told you that the terrarium contained 1960 legs and no hands, though, you wouldn’t
know if it had 300 ants and 20 spiders, or 220 ants and 80 spiders, or any of countless other
possible combinations. This matrix simply has no inverse. In this case it’s true for a trivial
reason (neither animal has hands, so saying “zero hands” doesn’t give you any useful
information). There are much less obvious non-invertible transformations, however. We will
return to this issue in Section 6.7.
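The terrarium example is easy to reproduce. In the Python sketch below (an illustration; the
names `T` and `mat_vec` are hypothetical), two different populations produce exactly the same
output, and the quantity ad − bc from Equation 6.6.4 turns out to be zero, so the formula for
the inverse would require dividing by zero.

```python
T = [[6, 8],    # legs per ant, legs per spider
     [0, 0]]    # hands per ant, hands per spider


def mat_vec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(entry * x for entry, x in zip(row, v)) for row in M]


print(mat_vec(T, [300, 20]))   # [1960, 0]
print(mat_vec(T, [220, 80]))   # [1960, 0]

ad_minus_bc = T[0][0] * T[1][1] - T[0][1] * T[1][0]
print(ad_minus_bc)             # 0
```

Because two different inputs give the same output, no transformation can recover the input
from the output.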
6.109 2m + 3n + 17 = 5, (1∕3)(m + 6n) = −3
6.110 9p − r² = 6, 6p + r²∕3 + 1 = 0
6.111 9a + 3b − 5c + 7d = 8, 3a − 4b + c = 10, 10a + 3b + c + 6d = 3, a − b + 2c − d = 69.
      (You can have a computer solve these equations to check your answer, but to
      solve the problem you need to show the steps of using an inverse matrix to
      solve them, using the computer to do the matrix algebra for you.)
6.112 Attempt to solve the equations 4x + 6y = 12, 6x + 9y = 14 using the method described in
      the Explanation (Section 6.6.2) and Problem 6.107. Explain what went wrong in
      terms of the matrix operations; then explain what went wrong in terms of the original
      equations, with no reference to matrices.
6.113 Begin with the matrix equation 𝐀𝐁𝐂 = 𝐃. Assume 𝐀, 𝐁, 𝐂, and 𝐃 are all square matrices
      of equal size. In each part below solve for the indicated matrix in terms of the
      other matrices and their inverses.
      (a) 𝐂
      (b) 𝐀
      (c) 𝐁
6.114 In each part below solve for the matrix 𝐌 in terms of the other matrices and their
      inverses. In each case assume that all the matrices are square matrices of equal size.
      (a) 𝐀𝐌 + 𝐁 = 𝐂
      (b) 𝐌𝐀 + 𝐁 = 𝐂𝐃
      (c) 𝐀⁻¹𝐌𝐀 = 𝐁
6.115 Section 6.1 presented the three-spring scenario; Section 6.3 re-expressed the scenario
      as a matrix problem, and presented the following multiplication.
      \[
      \begin{pmatrix} 1 & 1 \\ 2/3 & -1/3 \end{pmatrix}
      \begin{pmatrix} 26/3 \\ -11/3 \end{pmatrix}
      = \begin{pmatrix} 5 \\ 7 \end{pmatrix}
      \]
      Here the rows of the 2×2 matrix and of the result are labeled “position of Ball 1” and
      “position of Ball 2”; the columns of the 2×2 matrix and the rows of the middle vector
      are labeled “first normal mode” and “second normal mode.”
      This shows how you can use a 2×2 matrix to convert from “A of the first normal mode
      and B of the second” to the initial positions x1(0) and x2(0). But when we solved the
      three-spring problem, we had to go the other way; we wanted x1(0) = 5 and x2(0) = 7, and
      we had to find A and B accordingly.
      (a) Find the inverse of the above 2×2 matrix.
      (b) Use your inverse matrix to show that x1(0) = 5 and x2(0) = 7 lead to
          A = 26∕3 and B = −11∕3.
      (c) Find the solution to the three-spring problem that meets the initial
          conditions x1(0) = 8 and x2(0) = −3.
6.116 Section 6.4 used a matrix multiplication to convert a vector from V⃗1V⃗2 representation
      to îĵ representation, where V⃗1 = î + 3ĵ and V⃗2 = 2î.
      (a) Write a matrix to convert from îĵ representation to V⃗1V⃗2.
      (b) Use your matrix to convert the vector 4î + 5ĵ to V⃗1V⃗2 representation.
6.117 Basis P represents all vectors as a linear combination of the two vectors P⃗1 = 3î − 4ĵ
      and P⃗2 = 2î + ĵ. Basis Q represents all vectors as a linear combination of the two
      vectors Q⃗1 = î + 5ĵ and Q⃗2 = 4ĵ. In this problem you are going to create a matrix that
      converts vectors directly from basis P to basis Q.
      (a) Write a matrix that converts from basis P to an îĵ representation. (No
          calculations are needed here; you can just write down the matrix.)
      (b) Write a matrix that converts from basis Q to an îĵ representation.
      (c) Write a matrix that converts from an îĵ representation to basis Q.
      (d) Write a single matrix that converts from basis P to basis Q.
      (e) Use your matrix to convert P⃗1 − P⃗2 to basis Q. Draw the vector and show
          its P⃗1- and P⃗2-components and its Q⃗1- and Q⃗2-components.
6.118 Matrix $\mathbf{S} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ is called a “shear matrix.”
      (a) Apply matrix 𝐒 to the vertices of polygon 𝐏, whose vertices are (0, 0), (1, 0), (1, 1),
          and (0, 1) in that order. Draw 𝐏 on one set of coordinate axes, and the transformed
          polygon 𝐏′ on a different graph.
      (b) Describe in words what 𝐒 did to 𝐏.
      (c) The inverse transformation matrix 𝐒⁻¹ should undo the effect of 𝐒; in other
          words, it should transform 𝐏′ back to 𝐏. Describe in words what it would need
          to do to make that happen.
      (d) Think about what 𝐒⁻¹ would do if it were applied to 𝐏 (not 𝐏′; we know what would
          happen then!). Draw a picture of how you expect 𝐏 will be transformed by 𝐒⁻¹.
      (e) Calculate 𝐒⁻¹ and apply it to 𝐏. Draw the resulting figure.
6.119 Matrix 𝐗 interchanges the axes; in other words, it turns every point (x, y)
      into (y, x).
      (a) An easy way to find matrix 𝐗 is to note the effect it will have on the point matrix
          $\mathbf{P} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Determine the point
          matrix that should result when 𝐗 acts on 𝐏. Then write the resulting equation,
          and you can write down matrix 𝐗 immediately.
      (b) Explain, based on our verbal description of the effect of matrix 𝐗, how we
          know that matrix 𝐗 should be its own inverse.
      (c) Now prove mathematically that it is, by showing that 𝐗𝐗 = 𝐈.
6.120 [This problem depends on Problem 6.119.] In Problem 6.119 you found a matrix 𝐗 that is
      its own inverse. You found it by considering a verbal description of a matrix that will
      undo itself; you proved it by showing that it satisfied the equation 𝐗𝐗 = 𝐈.
      (a) Show mathematically that the 2 × 2 identity matrix 𝐈 is its own inverse.
      (b) Find 4 other 2 × 2 matrices (for a total of 6) that are their own inverses. For
          each one say in words what the matrix does and demonstrate algebraically
          that it is its own inverse.
6.121 Matrix 𝐀𝐁 means “First do matrix 𝐁, and then do matrix 𝐀.” Is matrix (𝐀𝐁)⁻¹ the same as
      𝐀⁻¹𝐁⁻¹, or 𝐁⁻¹𝐀⁻¹, or neither? Explain the reasoning that leads you to your conclusion.
6.122 Let matrix $\mathbf{C} = \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$ where
      $i \equiv \sqrt{-1}$. Find 𝐂⁻¹ and demonstrate that 𝐂𝐂⁻¹ = 𝐈.
6.123 In this problem you will derive the formula for the inverse of the generic
      2×2 matrix $\mathbf{A} = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$.
      (a) Using $\mathbf{A}^{-1} = \begin{pmatrix} w & y \\ x & z \end{pmatrix}$ to represent
          the inverse matrix, write out the equation 𝐀𝐀⁻¹ = 𝐈.
      (b) Perform the requisite multiplication and set the two resulting matrices equal
          to each other. You should now have four separate linear equations.
      (c) Solve for the unknown variables w, x, y, and z. Remember that each one
          will end up being a function of a, b, c, and d.
      (d) In the end, with possibly a bit of rewriting, you should be able to factor
          1∕(ad − bc) out of all four answers, resulting in Equation 6.6.4.
      (e) A “singular matrix” has no inverse. Based on your result, what must be true for
          a 2×2 matrix to be “singular?”
6.124 The inverse of a diagonal matrix is a diagonal matrix with each element inverted.
      In other words, if
      \[
      \mathbf{A} = \begin{pmatrix} c_1 & & & 0 \\ & c_2 & & \\ & & \ddots & \\ 0 & & & c_n \end{pmatrix}
      \quad\text{then}\quad
      \mathbf{A}^{-1} = \begin{pmatrix} 1/c_1 & & & 0 \\ & 1/c_2 & & \\ & & \ddots & \\ 0 & & & 1/c_n \end{pmatrix}.
      \]
      (a) Prove this fact mathematically.
      (b) Now explain it verbally by saying what effect 𝐀 and 𝐀⁻¹ have as
          transformations of an n-dimensional vector.
2x + 3y = 7
4x + 2y = 21
2. Try to solve the following equations for x and y. Explain what went wrong.
2x + 3y = 7
4x + 6y = 21
3. Find at least two pairs of numbers (x, y) that solve the following equations.
2x + 3y = 7
4x + 6y = 14
You may have been taught that n linear equations with n unknowns have exactly one solution,
but this isn’t always the case. In some cases they have one solution, in some cases they have
none, and in some cases they have infinitely many solutions. With two equations and two
unknowns you can figure out which case you’re dealing with by playing with the equations
for a bit, but larger sets of equations demand a systematic approach.
[Figure 6.5: three plots in the xy-plane, each showing two lines. (a) x + 2y = 4 and
6x − 4y = 9; (b) 9x − 6y = 5 and 6x − 4y = 9; (c) 12x − 8y = 18 and 6x − 4y = 9.]
FIGURE 6.5 The three possibilities for two linear equations: intersect at a point, never intersect, or intersect
everywhere on both lines.
But if it is typical to have one unique solution, what makes the lines in the middle and
right plots atypical? In both cases, the left side of one equation is a multiple of the left side of
the other: 9x − 6y = (3∕2)(6x − 4y), and 12x − 8y = 2(6x − 4y). With that insight we can look
at such equations in their generic form.
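\[
\begin{aligned}
ax + by &= r_1 \\
cx + dy &= r_2
\end{aligned}
\tag{6.7.1}
\]
These equations have a unique solution unless the left side of the second equation is a
multiple of the left side of the first equation. That’s equivalent to saying c∕a = d∕b, which
we can rearrange as ad − bc = 0.
For two simultaneous linear equations in the form given by Equations 6.7.1:
∙ If ad − bc ≠ 0, the equations have exactly one solution.
∙ If ad − bc = 0, the equations have either no solutions or infinitely many solutions.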
As you can see, the quantity ad − bc determines if a set of equations has a unique solution,
but it does not distinguish “inconsistent” from “linearly dependent” sets of equations. That
difference depends on the constant terms r1 and r2, a subject to which we will return more
generally in Section 7.4 (see felderbooks.com). Here we need address only one special case:
what if r1 and r2 are both zero?
An equation of the form ax + by = 0 is called “homogeneous.” A set of homogeneous
equations always has the solution x = y = 0, known as the “trivial solution.” (Graphically,
all homogeneous lines go through the origin.) Therefore, n homogeneous equations with n
unknowns can never be inconsistent. Either they have only one unique solution (the origin)
or they are linearly dependent and have infinitely many solutions.
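The three possibilities discussed above can be sketched in code. The Python function below is
a minimal illustration, not part of the original text: the name `classify` and its argument
order are hypothetical, it assumes that the four coefficients a, b, c, d are not all zero, and
it uses the pairs of equations from Figure 6.5 as test cases.

```python
from fractions import Fraction


def classify(a, b, r1, c, d, r2):
    """Classify the system ax + by = r1, cx + dy = r2.

    Assumes a, b, c, d are not all zero.
    """
    det = a * d - b * c
    if det != 0:
        # Unique solution: solve the two equations by elimination.
        x = Fraction(r1 * d - b * r2) / det
        y = Fraction(a * r2 - c * r1) / det
        return ("unique", x, y)
    # ad - bc = 0: the left sides are proportional, so compare the constants.
    if a * r2 - c * r1 == 0 and b * r2 - d * r1 == 0:
        return ("infinitely many",)   # linearly dependent
    return ("none",)                  # inconsistent


print(classify(1, 2, 4, 6, -4, 9))     # ('unique', Fraction(17, 8), Fraction(15, 16))
print(classify(9, -6, 5, 6, -4, 9))    # ('none',)
print(classify(12, -8, 18, 6, -4, 9))  # ('infinitely many',)
print(classify(3, 2, 0, 6, 4, 0))      # ('infinitely many',)
```

The last call is a homogeneous pair with ad − bc = 0. Because both constants are zero, the
“none” branch can never be reached for such equations, in line with the statement that
homogeneous equations are never inconsistent.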
x + 𝜆y = 0
2x + 3y = 0
Answer: You can easily verify that x = y = 0 works no matter what 𝜆 is. If ad − bc is
anything other than zero, then that trivial solution will be the unique solution. For
there to be any other solutions, the equations must be linearly dependent, so
3 − 2𝜆 = 0, meaning 𝜆 = 3∕2.
Conclusion: For 𝜆 = 3∕2 the two equations both reduce to the same equation; any
pair of numbers (x, y) that satisfies x = −3y∕2 will solve both equations. For any other
𝜆, the only solution is the trivial one: x = y = 0.
We have restricted our discussion so far to 2 equations with 2 unknowns. The first and
most important result—that every set will have 0 solutions, 1 solution, or infinitely many
solutions—applies to any set of linear equations. (You will prove this for 3 equations with 3
unknowns in Problem 6.148.) But spotting linear dependence becomes more complicated
in a case like this.
A ∶ 2x − 3y + 4z = 3
B ∶ x + 4y − z = 1
C ∶ 4x − 17y + 14z = 7
You can easily check that none of these equations is a multiple of any of the others. What
you can’t tell so easily is that equation C is 3 times equation A minus 2 times equation B.
That means equation C gives us no information that was not already implicit in the first
two equations, so these equations are linearly dependent (infinitely many solutions). The
following definition applies to any set of 2 or more equations.
For the equations
\[
\begin{aligned}
3x + 7y + 2z &= 10 \\
4x - 8y + 6z &= 11 \\
5x - z &= 13
\end{aligned}
\]
the matrix of coefficients is
$\begin{pmatrix} 3 & 7 & 2 \\ 4 & -8 & 6 \\ 5 & 0 & -1 \end{pmatrix}$.
We saw above that these coefficients determine whether the equations have a unique solu-
tion. So we define a new function of these numbers, called the “determinant”; it determines
the types of results we expect to find. We have seen what that function looks like for a 2×2
matrix.
Above we stated a rule for determining if two simultaneous equations have a unique
solution. We can now restate that rule in terms of matrices. This new rule will generalize
to more than two equations, provided we define the determinants of higher-level matrices
correctly.
We’ve seen that these rules work for a 2×2 matrix if we define the determinant as ad − bc,
and we have also seen that finding the right relationship of coefficients is harder for more
equations. Fortunately, that work has been done once and for all, and the formula for the
determinant of larger matrices has been defined to make those rules work. We will present
that definition below, but in practice you will rarely if ever need to use it. You need to know
how to find the determinant of a 2×2 matrix (which we told you above), and sometimes
even a 3×3 matrix (which we tell you below), by hand. For any higher-level matrix, just get
the determinant from a computer. In all cases, though, you need to know what to do with
the determinant once you have it!
\[
\begin{aligned}
(1 - \lambda)a + b - 3c + 2d &= 0 \\
2a + (3 - \lambda)b + c - 4d &= 0 \\
5a + 2b + (-1 - \lambda)c + 3d &= 0 \\
2a - 2b + 2c + (3 - \lambda)d &= 0
\end{aligned}
\]
Answer: The only way they could have any other solutions in addition to the trivial
one is if they are linearly dependent. We asked a computer to find the determinant of
the matrix
\[
\begin{pmatrix}
(1 - \lambda) & 1 & -3 & 2 \\
2 & (3 - \lambda) & 1 & -4 \\
5 & 2 & (-1 - \lambda) & 3 \\
2 & -2 & 2 & (3 - \lambda)
\end{pmatrix}
\]
and got the result 𝜆⁴ − 6𝜆³ + 𝜆² + 3𝜆 − 39. We then asked the computer to set
that equal to zero and solve for 𝜆, with the results 𝜆 ≈ −1.75, 5.93, or 0.906 ± 1.72i.
The equations will have non-trivial solutions if and only if 𝜆 has one of these
four values.
If this example seems pointless, hold on. This is the type of calculation you need to
do to find “eigenvalues” of a matrix. We’ll define what that means in Section 6.8 and
by the end of the chapter you’ll have used eigenvalues in solving the three-spring
problem among others.
Calculating Determinants
We present here the algorithm for finding determinants with no justification at all. We’re
not even going to say, as we usually do, “you will prove in thus-and-such a problem that this is
the right process.” We’re just going to show you how to find the determinant of a 3×3 matrix,
and how the process generalizes up from there.
Note that the determinant is only defined for a square matrix: one with the same number
of rows as columns. Ask a computer for the determinant of a 2×3 matrix and you should get
an error.
The process is called “expansion by minors.” A “minor” of a matrix is a smaller
matrix, obtained by crossing out the row and column that contain a particular
element. The picture shows the 3×3 matrix
$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$ with the row and column
containing the “2” crossed out. What’s left is the 2×2 matrix
$\begin{pmatrix} 4 & 6 \\ 7 & 9 \end{pmatrix}$, the “minor” of that
element. The determinant of this particular minor is (4 × 9) − (6 × 7) = −6.
The process works like this.
1. Move across the top row of the matrix. For each element, multiply that
number times the determinant of its minor.
2. Take all the numbers you get by this process and alternately add and subtract them.
Got all that? Of course you don’t, but it will be clearer with an example.
Question: If matrix $\mathbf{A} = \begin{pmatrix} 3 & 7 & 2 \\ 4 & -8 & 6 \\ 5 & 0 & -1 \end{pmatrix}$,
find its determinant |𝐀|.
Answer: Begin in the upper-left-hand corner. The number there is a 3, and its minor
is $\begin{pmatrix} -8 & 6 \\ 0 & -1 \end{pmatrix}$. You multiply the number 3 times the
determinant of that 2×2 matrix, and you get a number.
Then you repeat this process all along the top row, multiplying the 7 by the
determinant of its minor, and the 2 by the determinant of its minor. This gives you
three numbers, which you alternately add and subtract.
\[
\begin{aligned}
&3\begin{vmatrix} -8 & 6 \\ 0 & -1 \end{vmatrix}
- 7\begin{vmatrix} 4 & 6 \\ 5 & -1 \end{vmatrix}
+ 2\begin{vmatrix} 4 & -8 \\ 5 & 0 \end{vmatrix} \\
&\quad = 3(-8 \times (-1) - 6 \times 0) - 7(4 \times (-1) - 6 \times 5) + 2(4 \times 0 - (-8) \times 5)
\end{aligned}
\]
The determinant is 342. And so what? Because |𝐀| ≠ 0 we can say that any set of
equations with these coefficients would have one unique solution. You will see other
valuable interpretations of that number as we go.
The determinant of any square matrix can be defined inductively in this way.
∙ The determinant of a 1×1 matrix is simply the number.
∙ The determinant of any higher square matrix is defined by “expansion by minors” as
explained above.
This process gives you the correct determinant of a 2×2 matrix, ad − bc. It also tells you
how to find the determinant of a 4×4 matrix, but the process involves four separate 3×3
minors, and each of those involves three 2×2 minors! For very large matrices—and yes, they
do come up all the time—expansion by minors is too slow even for a computer, so many
shortcuts have been developed. We will pass over that topic, content that our computers can
find determinants for us.
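The inductive definition above translates almost line for line into a recursive function. The following is a minimal sketch in plain Python, not an efficient method; the names `minor` and `determinant` are our own, chosen for illustration. It reproduces the two determinants worked out above.

```python
def minor(matrix, row, col):
    """Return the smaller matrix left after crossing out one row and one column."""
    return [
        [value for j, value in enumerate(r) if j != col]
        for i, r in enumerate(matrix)
        if i != row
    ]


def determinant(matrix):
    """Determinant by expansion by minors along the top row."""
    n = len(matrix)
    if any(len(r) != n for r in matrix):
        raise ValueError("the determinant is only defined for a square matrix")
    if n == 1:
        return matrix[0][0]  # the determinant of a 1x1 matrix is simply the number
    total = 0
    for col, value in enumerate(matrix[0]):
        sign = 1 if col % 2 == 0 else -1  # alternately add and subtract
        total += sign * value * determinant(minor(matrix, 0, col))
    return total


A = [[3, 7, 2], [4, -8, 6], [5, 0, -1]]
det_A = determinant(A)                          # the worked example: 342
det_small = determinant([[4, 6], [7, 9]])       # the minor of the "2": -6
```

Because each n×n determinant calls n smaller ones, the work grows factorially with n, which is why this approach is only practical for small matrices.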
It is sometimes convenient to be able to quickly calculate 3 × 3 determinants by hand,
however. Problem 6.145 illustrates a shortcut for doing so.
∙ If |𝐂| = 0 then the vectors are linearly dependent. In that case, at least one of those vectors
can be expressed as a linear combination of some of the others.
∙ If |𝐂| is anything other than zero then the vectors are linearly independent. In that case, the
vectors form a basis for an n-dimensional space. You can create any vector in n-dimensional
space by choosing the right combination of these vectors.
In two dimensions the only way for two non-zero vectors V⃗1 and V⃗2 to be linearly depen-
dent is if V⃗1 = k V⃗2 , which is to say they are parallel.
But consider these three vectors:
Any two of these vectors are non-parallel, which is to say, linearly independent. But the set
of three is not linearly independent: you can build any one of these vectors from the other
two. For instance, V⃗3 = 3V⃗1 − 2V⃗2 .
So imagine trying to use these three vectors as a basis, which is to say, trying to build all
other vectors as a V⃗1 + b V⃗2 + c V⃗3 . If we replace V⃗3 with 3V⃗1 − 2V⃗2 then we have expressed all
vectors using only V⃗1 and V⃗2 ! It is impossible for two vectors to span 3-space, so it must be
impossible for three linearly dependent vectors to act as a basis.
We can express this same result visually by saying that V⃗1 , V⃗2 , and V⃗3 all lie in a plane; they
can be used only to build other vectors that lie in that same plane. But remember that the
same result applies in higher dimensions that we cannot visualize.
This result may seem unrelated to the sets of equations and vectors for which we’ve used
the determinant before. Once again, though, the key is linear dependence. Recall how we
used matrix multiplication to convert one vector basis to another:
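$$\begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix}\begin{pmatrix} 10 \\ -7 \end{pmatrix} = \begin{pmatrix} -4 \\ 30 \end{pmatrix}$$
(The columns of the matrix are labeled V⃗1 and V⃗2 and its rows î and ĵ. The entries 10 and −7 are
V⃗1 and V⃗2 components; the entries −4 and 30 are î and ĵ components.)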
We interpret this equation as defining V⃗1 = î + 3ĵ and V⃗2 = 2î, and then multiplying to find
that 10V⃗1 − 7V⃗2 is the same as −4î + 30ĵ. The inverse matrix converts back from îĵ
representation to V⃗1 V⃗2.
But what happens if V⃗1 and V⃗2 are linearly dependent, such as V⃗1 = î + 3ĵ and V⃗2 = 2î + 6ĵ?
Both lie on the line y = 3x, so any linear combination of them lies on that same line. Of
course you can still use a matrix to convert any linear combination of those two vectors to
an îĵ representation, but going the other way is not possible. So the matrix
$\begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}$ doesn’t
have an inverse. (Take a moment to check that its determinant is 0.)
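The contrast between the two matrices can be checked with a short sketch. It uses the standard formula for the inverse of a 2×2 matrix (swap the diagonal entries, negate the off-diagonal ones, divide by the determinant); the helper names `inverse_2x2` and `apply` are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2x2 matrix, or ValueError if the determinant is zero."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant is 0, so the matrix has no inverse")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def apply(m, v):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


# Columns are V1 = i + 3j and V2 = 2i, as in the equation above.
B = [[1, 2], [3, 0]]
ij_components = apply(B, [10, -7])                    # V1V2 components -> ij components
round_trip = apply(inverse_2x2(B), ij_components)     # and back again

# Columns are V1 = i + 3j and V2 = 2i + 6j, which are linearly dependent.
singular = [[1, 2], [3, 6]]
try:
    inverse_2x2(singular)
    has_inverse = True
except ValueError:
    has_inverse = False
```

Exact fractions are used instead of floating point so that the round trip returns exactly the components we started with.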
The same conclusion holds in higher dimensions. If three vectors are linearly dependent
they all lie on a plane, so you can’t convert from a general 3D vector to a combination of
those three vectors.
With that visual image in mind, you can generalize this to non-geometric transforma-
tions. If a small book has 100 pages and 50,000 words and a big book has 200 pages and
100,000 words, and we know that we have 1000 pages and 500,000 words, there’s no way
to tell how many of each kind of book we have. It could be 10 small books and no big
books, 5 big books and no small books, or many other combinations. Because <100, 50,000>
and <200, 100,000> are linearly dependent (that is, because one big book is exactly two
small books), the transformation “small books and big books” to “pages and words” can’t
be inverted.
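A few lines of code make the book example concrete. This is only an illustrative sketch; the function name `books_to_totals` and the third shelf (4 small books and 3 big ones) are our own additions.

```python
def books_to_totals(small_books, big_books):
    """Convert counts of each kind of book to (pages, words)."""
    pages = 100 * small_books + 200 * big_books
    words = 50_000 * small_books + 100_000 * big_books
    return pages, words


# Determinant of the conversion matrix whose columns are <100, 50,000> and <200, 100,000>.
det = 100 * 100_000 - 200 * 50_000

# Three different shelves that all produce the same totals.
shelves = [(10, 0), (0, 5), (4, 3)]
totals = [books_to_totals(small, big) for small, big in shelves]
```

Since different inputs give identical outputs, no function can recover the inputs from the outputs, which is exactly what a zero determinant signals.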
$$\begin{aligned} \frac{d^2 x_1}{dt^2} &= -4x_1 + 3x_2 \\ \frac{d^2 x_2}{dt^2} &= \frac{2}{3}x_1 - 3x_2 \end{aligned} \tag{6.7.2}$$
As we explained in that section, we want to find “normal modes”—solutions in which the
two balls oscillate at the same frequency, so x1 = C1 cos(𝜔t) and x2 = C2 cos(𝜔t). But these
functions don’t work unless you pick just the right frequency. For instance, we began with
the simple case 𝜔 = 1. Plugging x1 = C1 cos t and x2 = C2 cos t into Equations 6.7.2 gives
Stepping Back
In this section we have talked about the idea of linear dependence in relation to equations,
spatial vectors, and more general “vectors.” In its most general form, any set of things is
linearly dependent if one of those things can be written as a linear combination of others
of those things. (You can even have a linearly dependent, or linearly independent, set of
matrices!)
The determinant of a square matrix is a number you can calculate from the elements
of that matrix. The determinant is zero if and only if the rows of the matrix are linearly
dependent, or equivalently if the columns are linearly dependent. (It’s possible to prove
that either of those implies the other.) All of the applications of the determinant we’ve seen
in this section are different uses for knowing whether a set of things is linearly dependent.
In this section the only thing we’ve done with the determinant is to ask whether or not it
is zero. The determinant tells you more than that. For instance, when you consider a square
matrix to be a transformation acting on a shape, the determinant of that square matrix is
the factor by which the area or volume of that shape is multiplied. We will discuss this issue
in Section 7.1.
If the determinant is zero, however, it doesn’t tell you whether a set of equations is incon-
sistent or linearly dependent, how many of the variables are linearly independent if they all
aren’t, or what the solution to the equations is if there is a unique solution. In Section 7.4
(see felderbooks.com) we’ll teach you a method called “row reduction” that can answer these
questions and is more efficient than calculating large determinants.
∙ If the equations are all homogeneous, classify them as having a unique solution, or as being
linearly dependent (infinitely many solutions).
∙ If the equations are not all homogeneous, classify them as having a unique solution, or as being
“inconsistent or linearly dependent.” (Section 7.4 at felderbooks.com discusses distinguishing
these two cases.)
6.131 3x − 4y = 7, 2x − 5y = 3
6.132 5x − 2y = 2, 15x − 6y = −3
6.133 x + 3y = 4, −2x − 6y = −8
6.134 x − y = 0, 3x + 3y = 0
6.135 x − 2y + 3z = 5, 2x − y + 3z = 3, −x + 4y − 2z = 1
6.136 x + 2y − z = 0, −2x + 3y + 2z = 0, x + y − z = 0
6.137 −6x + 3y + 4z − 2q + 5r = 2, −x + 2y − z + q + 2r = −3, −4x + 2y + 3z − 2q − r = −1,
2x − 2y − z + 2q + r = 0, 5x − 5y − 3q = 2
6.138 x + 2y + 3z + 4q + r = 0, 5x + 6y + 7z + 8r = 0, 9x + 10y + 11q + 12r = 0,
13x + 14z + 15q + 16r = 0, 17y + 18z + 19q + 20r = 0
being a useful mnemonic for finding cross products.)
6.145 A shortcut for finding the determinant of a 3 × 3 matrix is illustrated below.
You consider six diagonals, including ones that wrap around from one side to the other.
For each one you multiply the three numbers. Then you add the three “upper left to lower
right” diagonals and subtract the “upper-right to lower left.” So the determinant of the
sample matrix shown above is
2⋅3⋅3 + 1⋅1⋅1 + 2⋅0⋅0 − 2⋅3⋅1 − 1⋅0⋅3 − 2⋅1⋅0 = 13
(a) Use this trick to find the determinant of the matrix
$\begin{pmatrix} 1 & 0 & -1 \\ 2 & 1 & 3 \\ 0 & -2 & 2 \end{pmatrix}$
6.147 If two linear equations have different slopes, then they intersect in a unique
point. If two linear equations have the same slope, then they represent either
the same line (linearly dependent) or two parallel lines (inconsistent). With that in
mind, consider the equations:
ax + by = r1
cx + dy = r2
Find the slopes of these two lines. Use the result to prove that if the determinant of
the matrix of coefficients is zero, then the equations must be dependent or inconsistent.
6.148 In the Explanation (Section 6.7.2) we noted that two linear equations with two unknowns
can be graphed as two lines on the xy-plane. The only three possibilities for the
intersection of those two lines are that they are parallel (no intersection, inconsistent
equations), they are the same line (infinitely many intersection points, linearly
dependent equations), or they intersect at a single point. Put succinctly, their regions of
intersection can be either nothing, a point, or a line. In this problem you will extend this
logic to three equations with three unknowns, which can be graphed as three planes.
(a) First consider two planes. What are the three possibilities for their
regions of intersection?
(b) Now consider the intersection of a third plane with each of the regions you
described above. For example, one possibility for two planes is that they intersect
on a line. So describe all the possible regions of intersection of the third plane
with that line. Do the same with the other possible regions you found in Part (a). You
should find four possibilities in all for the region of intersection of three planes.
(c) Using your answers to Part (b), explain why three linear equations with three
unknowns must either have zero solutions (inconsistent), infinitely many solutions
(linearly dependent), or one solution.
6.149 For each set of vectors shown below, say whether it can form a basis (either for
a plane or a 3D space), and how you know. (The answer in each case only
needs to be a few words long.)
(a) V⃗1 = 3î + 4ĵ, V⃗2 = 3î − 4ĵ
(b) V⃗1 = 3î + 4ĵ, V⃗2 = −6î − 8ĵ
(c) V⃗1 = 3î + 4ĵ − k̂, V⃗2 = î − ĵ + 7k̂, V⃗3 = 5î + 9ĵ − 9k̂
6.150 For each set of vectors shown below, say whether it can form a basis, and how
you know. (The answer in each case only needs to be a few words long.)
(a) V⃗1 = <1, 2, 3>, V⃗2 = <4, 5, 6>, V⃗3 = <1, 0, 1>
(b) V⃗1 = <1, 2, 3, 4>, V⃗2 = <−1, 3, −2, 0>, V⃗3 = <2, 9, 7, 12>, V⃗4 = <−2, 3, −1, 2>
(c) V⃗1 = <1, 2, 0, −1, 3>, V⃗2 = <1, 3, 0, 0, 2>, V⃗3 = <2, −1, 3, 4, 1>, V⃗4 = <6, −3, 0, 2, 1>
6.151 Matrix $\mathbf{M} = \begin{pmatrix} 3 & -2 & 6 & 5 \\ 1 & 4 & 0 & 7 \\ 8 & 1 & 2 & -4 \\ -1 & -1 & 10 & 21 \end{pmatrix}$.
(a) Find |𝐌|.
(b) Find 𝐌−1.
6.152 𝐌 is a square matrix whose top row is all zeros. You should be able to answer
the following questions without knowing anything else about 𝐌.
(a) What does having all zeros in the top row tell you about the transformation
that 𝐌 performs?
(b) Explain, based on your answer to Part (a), why 𝐌 cannot have an inverse.
(c) Using the algebraic rule for finding determinants, prove that |𝐌| = 0.
6.153 Matrix $\mathbf{Z} = \begin{pmatrix} 2 & 6 & 1 \\ 5 & 3 & -4 \\ a & b & c \end{pmatrix}$. Find non-zero numbers
a, b, and c so that 𝐙 has no inverse.
6.154 A carbon nucleus has 6 protons and 6 neutrons. An oxygen nucleus has 8
protons and 8 neutrons.
(a) Write a matrix 𝐗 that converts from “Number of carbon and oxygen nuclei” to
“Number of protons and neutrons.”
(b) A nuclear reaction involves 48 protons and 48 neutrons. Write at least
two possible combinations of carbon and oxygen nuclei that could
emerge from this reaction.
(c) Based on your answers to Parts (a) and (b), how can you predict without
doing any calculations whether or not 𝐗 is invertible?
(d) Calculate |𝐗| and (hopefully) verify the prediction you made in Part (c).
In Problems 6.155–6.159 you will be shown a diagram illustrating a chemical reaction. For example,
the reaction
[Diagram: inputs 100 mol/min CO2 and 100 mol/min C2O2; outputs ṅ1 of C2O6 and ṅ2 of CO]
represents 100 mol/min each of CO2 and C2O2 coming in and unknown rates of C2O6 and CO
coming out. For each reaction . . .
∙ Write down a set of equations setting the total input rate of each type of atom equal to its output
rate. In our example above, setting the input and output rates of carbon equal gives
100 + 2(100) = 2ṅ1 + ṅ2 and setting the input and output rates of oxygen equal gives
2(100) + 2(100) = 6ṅ1 + ṅ2.
∙ Rewrite the equations in matrix form and use a determinant to figure out if they have a unique
solution.
∙ If they do have a unique solution find it. Otherwise simply state that the equations are either
inconsistent (meaning the reaction is impossible) or linearly dependent (meaning the output rates
can’t be determined from the input rates). You don’t need to say which of those two is the case
(we’ll teach you how to distinguish inconsistent from linearly dependent equations in Section 7.4;
see felderbooks.com), but you should note that the equations are “either inconsistent or linearly
dependent.” Our example above produces 25 mol/min of C2O6 and 250 mol/min of CO.
These are not necessarily realistic reactions, which would require uglier math than we want to
get into in these problems.
6.155 [Diagram: inputs 40 mol/min CO2 and 30 mol/min C2O6; outputs ṅ1 of CO and ṅ2 of C2O2]
6.156 [Diagram: input 100 mol/min C2H6; outputs ṅ1 of C2H4 and ṅ2 of H2]
6.157 [Diagram: inputs 40 mol/min CH4 and 30 mol/min C2H2; outputs ṅ1 of C2H4 and ṅ2 of CH4]
6.158 [Diagram: inputs 40 mol/min CH2 and 30 mol/min C2H2; outputs ṅ1 of C2H4 and ṅ2 of C4H8]
6.159 [Diagram: inputs 40 mol/min CH4 and 30 mol/min C2H2; outputs ṅ1 of C2H4 and ṅ2 of C4H8]
6.160 The diagram below shows a map of one way streets. The numbers and
letters represent known and unknown rates of traffic flow, measured in cars
per hour.
[Diagram: street map with known flow rates 20, 250, 120, 50, 170, 200, 20, 70, 80 and unknown
flow rates f1, f2, f3, f4, f5]
(a) For each intersection, write an equation that says “The total rate of cars coming in
equals the total rate of cars going out.”
(b) Is there a unique solution for all of the unknown flow rates? Explain how
you got your answer, including what you asked a computer or calculator
to calculate for you and what answer it gave.
6.161 Following the method outlined in the Explanation (Section 6.7.2), find the
normal modes of the system:
$$\begin{aligned} \frac{d^2 x_1}{dt^2} &= -4x_1 - 4x_2 \\ \frac{d^2 x_2}{dt^2} &= -x_1 - 4x_2 \end{aligned}$$
You may assume that the normal modes are sinusoidal. (You’ll verify that assumption
by finding them.) You may also assume the initial velocities are zero so you can
use only cosines and no sines.
6.162 [This problem depends on Problem 6.161.] For the system in Problem 6.161 find the solution
that matches the initial conditions x1(0) = 7.5, x2(0) = 6.25. Hint: you can do this with
simple algebra, but it’s faster with an inverse matrix.
We are now ready to put in place the last piece we need for a proper analysis of the three-
spring problem from Section 6.1. In our initial treatment of that problem we found that the
right initial conditions led to simple behavior called normal modes. Other initial conditions
led to more complicated behavior, but we could describe that behavior by combining the
normal modes.
When we write the problem with matrices, we can describe those normal modes as the
“eigenvectors”—the natural basis—of a transformation matrix. We will begin with a visual
description of eigenvectors, and end by applying them to the three-spring problem.
Throughout this section we will focus on 2 × 2 matrices but the concepts scale up to higher
dimensional square matrices.
[Figure: three panels showing the vectors A, B, and C together with their images TA, TB, and TC.]
It apparently makes things bigger, but beyond that it’s hard to say much. 𝐓 seems to stretch
things out and turn them around randomly. But let’s look at the effect that 𝐓 has on two other
(carefully chosen) vectors.
FIGURE 6.7 The effect of matrix 𝐓 on the vectors V⃗1 = î + ĵ (d) and V⃗2 = −3î + 4ĵ (e).
Those two pictures are more important than you might suppose, so make sure you see
what they tell us.
∙ When 𝐓 acts on V⃗1 = î + ĵ it stretches it by a factor of 5, and does not change its direc-
tion. (You don’t have to take our word for that. We gave you 𝐓 above, so you can
multiply it by î + ĵ and see what you get.) Mathematically we write 𝐓V⃗1 = 5V⃗1 .
∙ When 𝐓 acts on V⃗2 = −3î + 4ĵ it stretches it by a factor of 2 and reverses its direction.
Mathematically we write 𝐓V⃗2 = −2V⃗2 .
And so what? Well, remember how we approached the three-spring problem in Section 6.1.
We found two special initial conditions for which we could easily predict the behavior of
the springs. That gave us a complete understanding of the system because we could write
any other initial condition as a combination of those two. The situation here is perfectly
analogous to that one. We have two special vectors for which we can easily describe the effect
of matrix 𝐓. That gives us a complete understanding of the transformation because we can
describe any other vector as a combination of those two.
The two special initial conditions for the three-spring problem were called the “normal
modes” of that system. The two special vectors for 𝐓 are called the “eigenvectors” of that
transformation.
𝐌𝐯 = 𝜆𝐯 (6.8.1)
where 𝐯 is a vector and 𝜆 is a scalar. For any non-zero solution to this equation, 𝐯 is an “eigenvector”
of 𝐌 and 𝜆 is an “eigenvalue.”
(The phrase “non-zero” is important here. Equation 6.8.1 will always be satisfied if 𝐯 is all zeros,
but that doesn’t count.)
Equation 6.8.1 does not imply that there is anything particularly special about the matrix
𝐌. It says that for almost any square matrix, there are a few vectors that the matrix will act
on in this simple way. (The prefix “eigen” comes from the German for “one’s own”; every
matrix has a few vectors of its own.)
We said that the eigenvectors give you a complete understanding of a particular transfor-
mation. Let’s see how that works for the transformation 𝐓 we described above.
In these examples the matrix 𝐓 and the vectors V⃗1 and V⃗2 are as defined above.
Problem:
Multiply 𝐓 times the vector V⃗1 .
Solution:
𝐓V⃗1 = 5V⃗1 . This mathematical equation expresses the same idea that we expressed in
words above: 𝐓 stretches V⃗1 by 5 while leaving its direction unchanged. This defines
V⃗1 as an eigenvector of 𝐓 with eigenvalue 5.
Problem:
Multiply 𝐓 times the vector V⃗2 .
Solution:
𝐓V⃗2 = −2V⃗2 . The negative sign indicates the reversal of direction. So we see that V⃗2 is
also an eigenvector of 𝐓, with an eigenvalue of −2.
Problem:
Multiply 𝐓 times the vector D⃗ = 10V⃗1 .
Solution:
𝐓(10V⃗1) = 10(𝐓V⃗1) = 10(5V⃗1) = 50V⃗1. Notice that the answer is 5D⃗. This tells us
that D⃗ is an eigenvector of 𝐓 with an eigenvalue of 5.
Problem:
Multiply 𝐓 times the vector E⃗ = 10V⃗1 + 7V⃗2 .
Solution:
𝐓(10V⃗1 + 7V⃗2) = 10(𝐓V⃗1) + 7(𝐓V⃗2) = 10(5V⃗1) + 7(−2V⃗2) = 50V⃗1 − 14V⃗2. The
product is not a scalar multiple of E⃗, so E⃗ is not an eigenvector of 𝐓: the
transformation changed the direction as well as the magnitude of E⃗. But we were still
able to perform the multiplication easily because E⃗ was expressed as a combination of
our eigenvectors.
Problem:
Multiply 𝐓 times the vector F⃗ = −11î + 24ĵ.
Solution:
This problem doesn’t seem to have anything to do with our eigenvectors. But V⃗1 and
V⃗2 form a basis; any vector can be expressed in terms of them, if we want to. In this
case, a bit of work (using an inverse matrix for instance) tells us that F⃗ = 4V⃗1 + 5V⃗2 .
In that form we can do the multiplication trivially and find that 𝐓F⃗ = 20V⃗1 − 10V⃗2 .
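The “bit of work” in this last example can be sketched in code. The snippet below takes 𝐓 to be the matrix with rows (2, 3) and (4, 1), as given in the table later in this section, and finds the V⃗1 and V⃗2 components of F⃗ with Cramer’s rule rather than an inverse matrix (an equivalent choice made here for brevity). All function names are hypothetical.

```python
from fractions import Fraction

T = [[2, 3], [4, 1]]
V1, V2 = (1, 1), (-3, 4)      # the eigenvectors i + j and -3i + 4j
LAMBDA1, LAMBDA2 = 5, -2      # their eigenvalues


def components_in_basis(f, v1, v2):
    """Solve a*v1 + b*v2 = f for (a, b) using Cramer's rule."""
    det = v1[0] * v2[1] - v2[0] * v1[1]
    if det == 0:
        raise ValueError("v1 and v2 are linearly dependent, so they are not a basis")
    a = Fraction(f[0] * v2[1] - v2[0] * f[1], det)
    b = Fraction(v1[0] * f[1] - f[0] * v1[1], det)
    return a, b


def transform_via_eigenvectors(f):
    """Apply T by scaling each eigenvector component by its eigenvalue."""
    a, b = components_in_basis(f, V1, V2)
    a, b = LAMBDA1 * a, LAMBDA2 * b
    return (a * V1[0] + b * V2[0], a * V1[1] + b * V2[1])


def transform_directly(f):
    """Apply T by ordinary matrix multiplication in the ij basis."""
    return (T[0][0] * f[0] + T[0][1] * f[1], T[1][0] * f[0] + T[1][1] * f[1])


F = (-11, 24)
coefficients = components_in_basis(F, V1, V2)   # F = 4 V1 + 5 V2
via_eigenvectors = transform_via_eigenvectors(F)
direct = transform_directly(F)
```

Both routes give the same vector, 20V⃗1 − 10V⃗2, which is 50î − 20ĵ in the original basis.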
Vector D⃗ above imparts a lesson that you don’t want to miss. We saw that because V⃗1 is an
eigenvector with eigenvalue 5, 10V⃗1 is also an eigenvector with eigenvalue 5. More generally,
any scalar multiple of an eigenvector is itself an eigenvector, with the same eigenvalue. You
will often hear that a 2 × 2 matrix has two eigenvectors, but this is not strictly accurate: it
generally has two linearly independent eigenvectors, or two special directions, each with its
own eigenvalue.
The last example above (vector F⃗) shows why eigenvectors are so important. You can
express any vector as a linear combination of the eigenvectors of a particular transformation.
When a vector is so expressed, you can easily determine the effect of that transformation on
that vector. This is called the “natural basis” for the transformation, and we shall return to it
below.
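Question: Find the eigenvectors and eigenvalues of the matrix $\begin{pmatrix} 9 & 4 \\ 3 & 8 \end{pmatrix}$.
Answer: By definition the eigenvectors and eigenvalues satisfy the eigenvalue
equation
$$\begin{pmatrix} 9 & 4 \\ 3 & 8 \end{pmatrix}\begin{pmatrix} A \\ B \end{pmatrix} = \lambda \begin{pmatrix} A \\ B \end{pmatrix}$$
Multiplying out and collecting terms gives
$$\begin{aligned} (9 - \lambda) A + 4B &= 0 \\ 3A + (8 - \lambda) B &= 0 \end{aligned} \tag{6.8.2}$$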
One solution is obviously A = B = 0. These linear equations can only have more
solutions if they are linearly dependent, which is to say, the determinant of the matrix
of coefficients must be zero.
$$\begin{vmatrix} 9 - \lambda & 4 \\ 3 & 8 - \lambda \end{vmatrix} = (9 - \lambda)(8 - \lambda) - (4)(3) = 0$$
This is a quadratic equation with solutions 𝜆 = 12 and 𝜆 = 5, our two eigenvalues.
Now, plug 𝜆 = 12 into our equations for A and B and both equations reduce to the
same relationship: 4B = 3A. That tells us that $\begin{pmatrix} 4 \\ 3 \end{pmatrix}$, or any multiple thereof, is the
eigenvector that corresponds to this eigenvalue. (Remember that any multiple of an
eigenvector is another eigenvector.)
We’ll leave it to you to find the eigenvector that corresponds to 𝜆 = 5.
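For a 2×2 matrix with rows (a, b) and (c, d), expanding the determinant as in the example always gives the quadratic 𝜆² − (a + d)𝜆 + (ad − bc) = 0. The sketch below solves that quadratic with the quadratic formula; the function name `eigenvalues_2x2` is our own, and the sketch deliberately handles only real eigenvalues.

```python
import math


def eigenvalues_2x2(m):
    """Real eigenvalues of a 2x2 matrix from its characteristic quadratic."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    discriminant = trace ** 2 - 4 * det
    if discriminant < 0:
        raise ValueError("the eigenvalues are complex; this sketch handles only real ones")
    root = math.sqrt(discriminant)
    return (trace + root) / 2, (trace - root) / 2


M = [[9, 4], [3, 8]]
large, small = eigenvalues_2x2(M)

# Check the eigenvector found above: M times (4, 3) should equal 12 times (4, 3).
v = (4, 3)
Mv = (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])
```

Here `large` and `small` come out as 12 and 5, matching the two roots found by hand, and `Mv` is (48, 36).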
For almost any 2×2 matrix the process looks like that example. It leads to a quadratic
equation, meaning that we are setting a second-order polynomial—the “characteristic poly-
nomial” of the matrix—equal to zero. A 3×3 matrix has a third-order characteristic polyno-
mial, and so on. Since we know a great deal about the roots of polynomials, we can say a great
deal about the eigenvalues of a matrix.
∙ Most n × n matrices will have n eigenvalues, corresponding to n linearly independent
eigenvectors. This is what we would hope for, since we want to use the resulting vectors
as the basis for an n-dimensional space!
∙ It is possible in some cases to have fewer eigenvalues than n, but never more.
∙ It is also possible for some eigenvalues to be complex. When a real matrix has complex
eigenvalues, they will come in complex conjugate pairs.
Diagonalizing a Matrix
Section 6.4 discussed using different basis vectors to represent a space. When you switch from
the îĵ basis to the V⃗1 V⃗2 basis, all your vectors look the same (the same little arrows on a graph),
but the numbers used to represent those vectors are different. All your transformations are
the same but the matrices you use to represent those transformations are different.
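This object in the îĵ basis . . . | . . . is represented in the V⃗1 V⃗2 basis as
The vector 11î − 3ĵ or $\begin{pmatrix} 11 \\ -3 \end{pmatrix}$ | 5V⃗1 − 2V⃗2 or $\begin{pmatrix} 5 \\ -2 \end{pmatrix}$
The eigenvector −3î + 4ĵ or $\begin{pmatrix} -3 \\ 4 \end{pmatrix}$, with eigenvalue −2 | V⃗2 or $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, still with eigenvalue −2
The transformation matrix $\mathbf{T} = \begin{pmatrix} 2 & 3 \\ 4 & 1 \end{pmatrix}$ | $\mathbf{D} = \begin{pmatrix} 5 & 0 \\ 0 & -2 \end{pmatrix}$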
We could make a table like that for any two bases, and include any vectors and transfor-
mations we like. The last row of the table shows that 𝐓 and 𝐃 are “similar matrices”: they
represent the same transformation in two different bases.
But we didn’t choose that particular basis arbitrarily. We chose it because V⃗1 and V⃗2 are
the eigenvectors of the transformation we called 𝐓. The eigenvectors of a matrix form the
“natural basis” for the transformation effected by that matrix, because in that basis the trans-
formation is easy to work with.
Why does working in the natural basis give you a diagonal matrix? And why is it such a big
deal to have a diagonal matrix anyway?
Well, remember what eigenvectors mean. When a transformation acts on an eigenvec-
tor, it multiplies it by a scalar (the eigenvalue). When a transformation acts on a different
vector that is represented as a linear combination of the eigenvectors—that is, in the basis
of the eigenvectors—it multiplies each component by the associated eigenvalue. And that’s
exactly what happens when you multiply a diagonal matrix (below we use 𝜆1 and 𝜆2 for the
eigenvalues) by an arbitrary point (x, y).
( )( ) ( )
𝜆1 0 x 𝜆1 x
=
0 𝜆2 y 𝜆2 y
In general, when you multiply a matrix by a vector, the new x-component depends on
both the old x-component and the old y-component. But when the matrix is diagonal, the
new x-component depends only on the old x-component and the new y depends only on
the old y. That is why diagonal matrices are so important: they “decouple” the variable
dependencies.
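One way to check that 𝐓 and 𝐃 really describe the same transformation is the standard change-of-basis relation 𝐃 = 𝐁⁻¹𝐓𝐁, where the columns of 𝐁 are the eigenvectors V⃗1 = î + ĵ and V⃗2 = −3î + 4ĵ. The name 𝐁 and the helper functions below are assumptions made for this sketch; the relation itself is not derived in this section.

```python
from fractions import Fraction


def matmul(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant is 0, so the matrix has no inverse")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


T = [[2, 3], [4, 1]]
B = [[1, -3], [1, 4]]     # columns are the eigenvectors V1 and V2

# Convert V1V2 components to ij, apply T, then convert back to V1V2 components.
D = matmul(inverse_2x2(B), matmul(T, B))
```

The result is the diagonal matrix with 5 and −2 on the diagonal: in the eigenvector basis the transformation acts on each component separately.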
In Section 6.9 we will see that diagonalization is ultimately the right way to look at
the three-spring problem. The original problem is two coupled differential equations,
meaning that x1′′ depends on both x1 and x2 , and so does x2′′ . When you rewrite the entire
problem in the natural basis of the transformation matrix the equations decouple. We’ll
define new variables f and g that are the amplitudes of the normal modes and write
differential equations for them: f ′′ will depend only on f and g ′′ will depend only on g , so
we will have two ordinary differential equations that we know how to solve.
Below we take a less ambitious approach. But as you will see, the eigenvectors are still the
key to finding the normal modes.
$$\begin{aligned} \frac{d^2 x_1}{dt^2} &= -4x_1 + 3x_2 \\ \frac{d^2 x_2}{dt^2} &= \frac{2}{3}x_1 - 3x_2 \end{aligned}$$
The positions of the balls form a vector: not a vector in the sense of “a quantity with mag-
nitude and direction” but a vector in the more general sense of “a variable that must be
represented with two numbers.” And the equations are based on linear combinations of
these positions. All this suggests rewriting these two equations as one matrix equation:
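$$\begin{pmatrix} \ddot{x}_1 \\ \ddot{x}_2 \end{pmatrix} = \begin{pmatrix} -4 & 3 \\ 2/3 & -3 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \tag{6.8.3}$$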
The right-hand matrix in that equation is a 2×1 vector that represents the state of the system:
a description of what the system looks like at any given time and, ultimately, the function
we hope to solve for. The middle matrix represents a transformation that acts upon
this state.
Recall from Section 6.1 that our entire solution was based on normal modes, and that each
normal mode was defined by two properties: a particular ratio of x2 to x1 , and a frequency
of oscillation. Now that you know how to work with matrices, we can tell you in four words
how to find the normal modes. Just find the eigenvectors!
For instance, one eigenvector of the matrix $\begin{pmatrix} -4 & 3 \\ 2/3 & -3 \end{pmatrix}$ is $\begin{pmatrix} 1 \\ 2/3 \end{pmatrix}$ with eigenvalue −2.
(You can find this and the other eigenvector by hand, as discussed above, or by computer.)
This particular state means x1 = 1 and x2 = 2∕3 but remember that any multiple of an
Equation 6.8.4 provides the vital link between eigenvectors and normal modes, so be sure to
follow it in three ways.
1. First follow the algebra. Look what substitutions we made in Equation 6.8.3, and why.
Do the matrix multiplication to make sure you get what we got.
2. Second, notice that when we multiplied the transformation matrix by the vector, we
got the same vector multiplied by −2. That’s what eigenvectors do, and it’s why they
are the key to this whole process.
3. Finally, notice that we have decoupled the two variables. That is, we now have the ordi-
nary differential equation ẍ1 = −2x1 . (In fact, we have it twice; as you will see in Prob-
lem 6.190, if you start with the wrong ratio, you end up with contradictory equations.)
And of course we have the promise that x2 = (2∕3)x1 , so one solution gives us both
variables.
We can write down the solutions by inspection. They are of course a mix of sines and cosines,
but as we did in Section 6.1 we’re going to leave out the sine terms so both balls start at zero
velocity. With that restriction, the solution to ẍ1 = −2x1 is x1 = A cos(√2 t). Remember this
equation is only valid when x2 = (2∕3)x1 , so it can be written
$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} A\cos(\sqrt{2}\,t) \\ (2/3)A\cos(\sqrt{2}\,t) \end{pmatrix} \qquad \text{The first normal mode solution}$$
In a similar way we can plug in the eigenvector x2 = −(1∕3)x1 to get the equation ẍ1 = −5x1 ,
which leads to
$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} B\cos(\sqrt{5}\,t) \\ -(1/3)B\cos(\sqrt{5}\,t) \end{pmatrix} \qquad \text{The second normal mode solution}$$
We discussed in Section 6.1 how to combine these normal modes to match initial conditions.
There is no need to repeat that here.
Summing up, if we write a set of coupled, linear differential equations as a matrix
equation, the eigenvectors of the matrix tell us the normal modes of the system, and
the eigenvalues tell us the frequencies of those normal modes. Once we have those
normal modes, we can write arbitrary initial conditions as a sum of normal modes and
thus find their behavior. All of this is parallel to the use of eigenvectors in geometric
transformations.
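As a numerical sanity check, the sketch below plugs each normal mode into the original coupled equations, estimating the second derivative with a central finite difference. The step size, the sample time t = 0.7, and the function names are arbitrary choices made for this illustration.

```python
import math

# The matrix from Equation 6.8.3.
M = [[-4.0, 3.0], [2.0 / 3.0, -3.0]]


def normal_mode(amplitude, ratio, eigenvalue, t):
    """Positions (x1, x2) at time t for a mode with x2 = ratio * x1."""
    omega = math.sqrt(-eigenvalue)       # frequency from the eigenvalue
    x1 = amplitude * math.cos(omega * t)
    return x1, ratio * x1


def residual(amplitude, ratio, eigenvalue, t, h=1e-3):
    """Largest mismatch between the numerical acceleration and M times the positions."""
    before = normal_mode(amplitude, ratio, eigenvalue, t - h)
    now = normal_mode(amplitude, ratio, eigenvalue, t)
    after = normal_mode(amplitude, ratio, eigenvalue, t + h)
    worst = 0.0
    for k in range(2):
        second_derivative = (after[k] - 2 * now[k] + before[k]) / h ** 2
        right_hand_side = M[k][0] * now[0] + M[k][1] * now[1]
        worst = max(worst, abs(second_derivative - right_hand_side))
    return worst


first_mode_error = residual(1.0, 2.0 / 3.0, -2.0, t=0.7)
second_mode_error = residual(1.0, -1.0 / 3.0, -5.0, t=0.7)
wrong_ratio_error = residual(1.0, 1.0, -2.0, t=0.7)   # not an eigenvector
```

The first two residuals are tiny (limited only by the finite-difference approximation), while the third is not, because a state with x2 = x1 is not an eigenvector of the matrix.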
For each transformation, indicate whether V⃗ is an eigenvector of the transformation.
If so, estimate its eigenvalue.
6.169 Matrix $\mathbf{S} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ represents a transformation.
(a) Matrix $\mathbf{P}_1 = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \end{pmatrix}$. Draw
the figures represented by 𝐏1 and 𝐒𝐏1.
(b) Matrix $\mathbf{P}_2 = \begin{pmatrix} 0 & 2 & 1 & -1 & 0 \\ 0 & 2 & 3 & 1 & 0 \end{pmatrix}$. Draw
the figures represented by 𝐏2 and 𝐒𝐏2.
(c) Describe in words the effect of matrix 𝐒: it stretches any figure how
far, in what direction?
(d) Which shape made it easier to see the effect of matrix 𝐒? Why?
6.170 The matrix $\begin{pmatrix} 5 & 1 \\ 6 & 4 \end{pmatrix}$ has two eigenvalues.
(a) One eigenvector is $\vec{v}_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$. Find
the associated eigenvalue 𝜆1.
(b) The other eigenvalue is 𝜆2 = 2. Find the associated eigenvector v⃗2.
6.171 The matrix $\begin{pmatrix} 7 & -2 \\ 5 & 1 \end{pmatrix}$ consists entirely
of real numbers, but its eigenvectors and eigenvalues are complex.
(a) One eigenvector is $\vec{v}_1 = \begin{pmatrix} 3 + i \\ 5 \end{pmatrix}$. Find
the associated eigenvalue 𝜆1.
(b) The other eigenvalue is 𝜆2 = 4 − i. Find the associated eigenvector v⃗2.
6.172 What are the eigenvectors of the identity matrix 𝐈? Why?
6.173 We’ve said that if v⃗ is an eigenvector of 𝐌 with eigenvalue 𝜆 then any multiple of v⃗ is also an
eigenvector with the same eigenvalue.
(a) Show that if two vectors v⃗1 and v⃗2 are both eigenvectors of 𝐌 with the same
eigenvalue 𝜆, then any linear combination of v⃗1 and v⃗2 is as well.
(b) As a trivial example, what are all of the eigenvectors of the 2 × 2 identity matrix
and what are their eigenvalues?
(c) As a less trivial example, the matrix
$\mathbf{T} = \begin{pmatrix} 2 & 0 & 1 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ has eigenvectors î, ĵ, and
−î + k̂. Find the corresponding eigenvalues. You should find that two of them
are equal and the third is different.
(d) What are all the eigenvectors of 𝐓 and what are their eigenvalues?
6.174 Walk-Through: Finding Eigenvectors and Eigenvalues. In this problem you’re going
to find the eigenvectors and eigenvalues of the matrix $\mathbf{M} = \begin{pmatrix} 9 & 6 \\ 8 & 1 \end{pmatrix}$.
(a) Write the eigenvalue equation 𝐌𝐯 = 𝜆𝐯 for this matrix, using the unknowns x
and y for the elements of the eigenvector, and multiply it out to two separate
algebra equations.
(b) Collect like terms until both equations have the form ax + by = 0.
Note that x = y = 0 is a valid solution to these equations. This represents the
“trivial” solution that doesn’t count as an eigenvector.
(c) For these two linear equations to have more solutions, they must be linearly
dependent: that is, the determinant of their matrix of coefficients must
be zero. Write the resulting equation for 𝜆.
(d) Solve for 𝜆. The result should give you two eigenvalues.
(e) For each eigenvalue, find the eigenvector:
i. Plug the eigenvalue into one of the equations of the form ax + by = 0. Solve to find a
relationship between x and y.
ii. Check that the other equation gives you the same relationship. That tells
you that you correctly found an eigenvalue, and that any vector (x, y) that
satisfies that relationship is an eigenvector with this eigenvalue.
iii. Write an eigenvector $\begin{pmatrix} x \\ y \end{pmatrix}$ that satisfies the relationship you found.
iv. Confirm that your eigenvalue and eigenvector satisfy the eigenvalue
equation for this matrix.
In Problems 6.175–6.178, find the eigenvectors and eigenvalues of the given matrix. It may help
to first work through Problem 6.174 as a model.
6.175 $\begin{pmatrix} 2 & 6 \\ 4 & 7 \end{pmatrix}$
6.176 $\begin{pmatrix} 3 & 3 \\ 2 & 8 \end{pmatrix}$
6.177 $\begin{pmatrix} 4 & 2 \\ 6 & 5 \end{pmatrix}$
6.178 $\begin{pmatrix} 6 & 2 \\ 4 & 5 \end{pmatrix}$
6.179 Explain why we cannot use the techniques in this section to find a natural basis for the
matrix $\begin{pmatrix} 6 & -1 \\ 4 & 2 \end{pmatrix}$. Hint: try it!
6.180 The matrix 𝐓 = $\begin{pmatrix} 4 & 2 \\ 9 & 1 \end{pmatrix}$ has eigenvectors
V⃗1 = $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$ with eigenvalue 𝜆1 = 7, and
V⃗2 = $\begin{pmatrix} 1 \\ -3 \end{pmatrix}$ with eigenvalue 𝜆2 = −2.
(a) Find the î- and ĵ-components of the vector 5V⃗1 + 4V⃗2.
(b) Write a matrix 𝐁 that will convert any vector from a V⃗1V⃗2 representation to an îĵ
representation.
(c) Find the V⃗1- and V⃗2-components of the vector 5î + 4ĵ.
(d) Evaluate 𝐓(5V⃗1 + 4V⃗2). Hint: work entirely in the V⃗1V⃗2 basis without converting anything to
the îĵ basis.
(e) Matrix 𝐓 represents a particular transformation on vectors in the îĵ basis. Write a different
matrix that performs the same transformation on vectors in the V⃗1V⃗2 basis. (Hint: think about
what happened in Part (d)!)
6.181 The matrix 𝐓 = $\begin{pmatrix} 5 & 1 \\ 6 & 6 \end{pmatrix}$ has eigenvector
V⃗1 = $\begin{pmatrix} 1 \\ 3 \end{pmatrix}$ with eigenvalue 𝜆1 = 8 and eigenvector
V⃗2 = $\begin{pmatrix} -1 \\ 2 \end{pmatrix}$ with 𝜆2 = 3.
(a) Matrix 𝐓 performs a certain transformation on any vector expressed in the form x î + yĵ. Write
a different matrix that performs the same transformation on any vector expressed in the form
a V⃗1 + b V⃗2.
(b) Draw the vectors î and 𝐓î.
(c) Represent the vector î in the V⃗1V⃗2 basis, and then act on the resulting vector with the matrix
you wrote in Part (a).
(d) Draw both your beginning and ending vectors from Part (c). Your axes will still be x and y, so
you will want to express vectors in the îĵ basis to draw them.
6.182 The matrix 𝐍 = $\begin{pmatrix} 7 & 2 \\ 3 & 2 \end{pmatrix}$ has eigenvector
V⃗1 = $\begin{pmatrix} 2 \\ 1 \end{pmatrix}$ with eigenvalue 𝜆1 = 8 and eigenvector
V⃗2 = $\begin{pmatrix} -1 \\ 3 \end{pmatrix}$ with 𝜆2 = 1.
(a) Matrix 𝐍 performs a certain transformation on any vector expressed in the form x î + yĵ. Write
a different matrix that performs the same transformation on any vector expressed in the form
a V⃗1 + b V⃗2.
(b) Vector F⃗ = 2î + 3ĵ. Draw the vectors F⃗ and 𝐍F⃗.
(c) Represent the vector F⃗ in the V⃗1V⃗2 basis, and then act on the resulting vector with the matrix
you wrote in Part (a).
(d) Draw both your beginning and ending vectors from Part (c). Your axes will still be x and y, so
you will want to express vectors in the îĵ basis to draw them.
6.183 Construct a matrix with the following two eigenvectors:
V⃗1 = $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$ has 𝜆1 = 7, and V⃗2 = $\begin{pmatrix} 5 \\ 4 \end{pmatrix}$ has
𝜆2 = −6. (A computer is not essential for this problem, but you may want to use one for some of
the algebra.)
6.184 In this problem you will find the eigenvectors and eigenvalues of the matrix
$$\mathbf{M} = \begin{pmatrix} -10 & -10 & 1 \\ 2 & 1 & 3 \\ 4 & 10 & 2 \end{pmatrix}.$$
(a) Write the statement “x î + yĵ + z k̂ is an eigenvector of 𝐌 with eigenvalue 𝜆” as a matrix
equation.
(b) Multiply the matrices to turn this into three homogeneous linear equations. Put all of the
non-zero terms on the left side.
(c) The only way 3 homogeneous linear equations for 3 variables can have non-trivial solutions is
for the determinant of the matrix of coefficients to be zero. Use that condition to write the
“characteristic polynomial” for this matrix: the polynomial that must be zero for 𝜆 to be an
eigenvalue.
(d) Set the characteristic polynomial equal to zero and solve for 𝜆.
(e) Plug each of the values of 𝜆 you found into the equations for x, y, and z. For each value of 𝜆
you should find, not a specific set of values, but a set of ratios between them that must be
satisfied for the equations to work.
(f) What are the eigenvectors and eigenvalues of 𝐌?
6.185 Walk-Through: Solving Coupled Differential Equations with Eigenvectors and Eigenvalues.
In this problem
you are going to solve the differential equations:
$$\frac{d^2 x_1}{dt^2} = -8x_1 - 8x_2$$
$$\frac{d^2 x_2}{dt^2} = -x_1 - 6x_2$$
(a) Begin by rewriting these two coupled differential equations as one differential equation
involving matrices.
(b) Find the eigenvectors and eigenvalues of the transformation matrix in your equation.
(c) Plug in the first eigenvector and solve the resulting differential equations for x1 and x2 to
find one normal mode of the system.
(d) Rewrite and solve the differential equation for the second eigenvector.
(e) Combine your two solutions. The result should be a solution with two arbitrary constants.
(f) Find the solution to this system if both objects start at rest with x1 = 1 and x2 = 1.
In Problems 6.186–6.189 use eigenvectors and eigenvalues to solve the given differential equations
with the given initial and/or boundary conditions. You may find it helpful to first work through
Problem 6.185 as a model.
6.186 d²x1∕dt² = −11x1 + 6x2, d²x2∕dt² = 6x1 − 6x2, x1(0) = 1, x2(0) = 8, x1′(0) = x2′(0) = 0
6.187 dx1∕dt = 8x1 + 2x2, dx2∕dt = 9x1 + 5x2, x1 = 1, x2 = 2. Notice that these are first-order
differential equations.
6.188 d²x1∕dt² = 7x1 + 2x2, d²x2∕dt² = 6x1 + 8x2, x1(0) = 2, x2(0) = 1. Assume
$\lim_{t\to\infty} x_1 = \lim_{t\to\infty} x_2 = 0$. (You’ll apply that condition during the same part of
the process where you apply the initial conditions.)
6.189 d²x1∕dt² = 4x1 + 9x2, d²x2∕dt² = 2x1 + x2, x2′(0) = √2, x1(0) = x2(0) = x1′(0) = 0.
6.190 In the Explanation (Section 6.8.2) we found that plugging x2 = (2∕3)x1 into Equation 6.8.3
gave us a differential equation that we could solve for x1. In fact, it gave us the same
differential equation twice. What happens if we try an initial state that is not an eigenvector?
Plug x2 = 2x1 into Equation 6.8.3 and show that you end up with two contradictory differential
equations.
In this explanation we are going to solve two “coupled differential equations” problems. The
first solution will closely parallel the process we used in Section 6.1; it makes use of matrices
and eigenvalues, but fundamentally works in the basis of the two positions x1 and x2 . At
the very end it uses the normal-mode basis to find the solution for a particular set of initial
conditions. The second process translates the entire problem into the basis defined by the
eigenvectors—the normal modes—right from the start. In this natural basis we find that the
problem almost solves itself.
In both cases we have changed the original differential equations so the answers will come
out different. Following both solutions carefully is a great way to make sure you understand
the issues we have discussed throughout this chapter.
The first approach, illustrated in the example on Page 337, uses matrices in three key
steps.
The next step is to find the two eigenvectors of the 2 × 2 matrix in Equation 6.9.1.
The first, which you can find by hand (see Section 6.8) or on a computer, is
$\begin{pmatrix} 1 \\ 1/2 \end{pmatrix}$
with eigenvalue −4. Any multiple of an eigenvector is also an eigenvector, so that first
eigenvector tells us that if you plug in any state in which x2 = (1∕2)x1 , the matrix will
simply multiply that state by −4. So if we replace x2 with (1∕2)x1 the equation
becomes:
$$\begin{pmatrix} \ddot{x}_1 \\ (1/2)\ddot{x}_1 \end{pmatrix} = -4 \begin{pmatrix} x_1 \\ (1/2)x_1 \end{pmatrix}$$
So we now have the equation ẍ1 = −4x1 . (In fact we have this same equation twice.)
The solution to this equation becomes the first normal mode.
$$x_1(t) = A_1 \cos(2t) + A_2 \sin(2t), \qquad x_2(t) = \tfrac{1}{2} x_1(t) \tag{6.9.2}$$
The other eigenvector is $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ with eigenvalue −1, which becomes the other
normal mode.
$$x_1(t) = B_1 \cos t + B_2 \sin t, \qquad x_2(t) = -x_1(t) \tag{6.9.3}$$
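The two eigenvector claims in this example are easy to check by direct multiplication. The sketch below assumes that the matrix in Equation 6.9.1 is the one quoted in Problem 6.205, with rows (−3, −2) and (−1, −2); the helper names `matvec` and `eigenvalue_of` are ours, chosen only for illustration.

```python
from fractions import Fraction as F

# Assumed to be the 2x2 matrix of Equation 6.9.1 (as quoted in Problem 6.205).
M = [[F(-3), F(-2)],
     [F(-1), F(-2)]]

def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def eigenvalue_of(m, v):
    """Return lam if m v == lam v exactly, otherwise None.

    Assumes v[0] != 0, which holds for every vector tested here.
    """
    w = matvec(m, v)
    lam = w[0] / v[0]
    return lam if w == [lam * v[0], lam * v[1]] else None

lam1 = eigenvalue_of(M, [F(1), F(1, 2)])   # first normal mode:  x2 = (1/2) x1
lam2 = eigenvalue_of(M, [F(1), F(-1)])     # second normal mode: x2 = -x1
print(lam1, lam2)                          # prints: -4 -1
```

Exact fractions are used so that the test "is this a multiple of the original vector?" does not depend on floating-point rounding.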
The general solution to our initial problem is a sum of these two normal modes. Next
we apply our initial conditions. Knowing that the initial velocities are zero we discard
the sine terms. We can easily write a matrix that converts normal mode amplitudes to
initial positions; inverting that matrix allows us to find the normal mode amplitudes
that match our given initial positions. As with the eigenvectors above, we’re not
showing you here the process of finding an inverse matrix, which is in Section 6.6;
we’re showing you what to do with it.
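The matrix that converts the amplitudes (A1, B1) to the initial positions (x1(0), x2(0)) is
$\begin{pmatrix} 1 & 1 \\ 1/2 & -1 \end{pmatrix}$. Its inverse, which converts (x1(0), x2(0)) to (A1, B1), is
$\begin{pmatrix} 2/3 & 2/3 \\ 1/3 & -2/3 \end{pmatrix}$.
Applying this inverse matrix to the initial conditions x1 (0) = 3, x2 (0) = −9 gives A1 = −4,
B1 = 7, so the solution is
$$x_1(t) = -4\cos(2t) + 7\cos t, \qquad x_2(t) = -2\cos(2t) - 7\cos t$$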
If there are any steps in that solution that you didn’t follow, we hope you can find the
appropriate part of the chapter to go back and review. Pay particular attention to the way
each eigenvector of the matrix became a normal mode of the system; because the matrix
multiplication became a scalar multiplication, the solution was two different oscillations with
the same frequency.
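The last step of the example, turning initial positions into normal mode amplitudes, can be sketched as follows. We assume zero initial velocities (so only cosine terms survive) and use the two eigenvectors as the columns of a matrix `P`; the names `P`, `inverse_2x2`, and `positions` are illustrative, not the book's notation.

```python
import math
from fractions import Fraction as F

# Columns are the two eigenvectors; P maps amplitudes (A1, B1) to (x1(0), x2(0)).
P = [[F(1), F(1)],
     [F(1, 2), F(-1)]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[ m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det,  m[0][0] / det]]

P_inv = inverse_2x2(P)

# Initial positions from the example.
x0 = [F(3), F(-9)]
A1 = P_inv[0][0] * x0[0] + P_inv[0][1] * x0[1]
B1 = P_inv[1][0] * x0[0] + P_inv[1][1] * x0[1]

def positions(t):
    """x1(t), x2(t) for zero initial velocity (cosine terms only)."""
    x1 = A1 * math.cos(2 * t) + B1 * math.cos(t)
    x2 = F(1, 2) * A1 * math.cos(2 * t) - B1 * math.cos(t)
    return x1, x2

print(A1, B1, positions(0.0))
```

Evaluating `positions(0.0)` is a quick sanity check: it should reproduce the initial positions that were fed in.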
The process we just modeled is all you need to solve this type of coupled oscillator prob-
lem. In the rest of this section we are going to show you a slightly different way of writing
the steps in that solution. This second approach illustrates the meaning of basis states and
diagonalization more directly than the method we just used.
Above we approached Equation 6.9.1 by plugging in the eigenvectors one at a time, and
seeing what they did to the equation. Instead, let’s rewrite the entire equation in the basis
defined by the eigenvectors. So the state of the system—the positions of the balls, which we
have previously expressed as x1 and x2 —we will now express like this.
$$\begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix} = f(t) \begin{pmatrix} 1 \\ 1/2 \end{pmatrix} + g(t) \begin{pmatrix} 1 \\ -1 \end{pmatrix} \tag{6.9.4}$$
You can interpret Equation 6.9.4 physically. As we have said all along, instead of asking “where
is each ball?” we ask “how much of each normal mode do we have?” But right now let’s focus
on two mathematical questions.
Can we do that? Yes we can, for absolutely any functions x1 and x2 , and here’s why. At
any given time t, the values x1 and x2 represent a vector. Equation 6.9.4 tells us that we can
express that vector as a linear combination of the two eigenvectors. We know we can do that,
not because they are eigenvectors, but just because they are two non-parallel vectors and
therefore form a basis. The numbers f and g represent the coefficients in this new basis. At
a later time t the values x1 and x2 are different; we are converting them to the same basis with
different coefficients. That is why f and g are functions of t.
What happens when we do that? When we express all our states in a different basis, the entire
equation—including the transformation matrix itself—changes. When you use the eigen-
vectors as a basis, the transformation is represented by a diagonal matrix of the eigenvalues.
(This fact comes very directly from the definition of an eigenvector.) And a diagonal matrix
always decouples the equations. Equation 6.9.1 becomes:
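$$\begin{pmatrix} \ddot{f} \\ \ddot{g} \end{pmatrix} = \begin{pmatrix} -4 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} f \\ g \end{pmatrix} \tag{6.9.5}$$
Our hope is that you understand why this equation must be true given that
$\begin{pmatrix} f \\ g \end{pmatrix}$ is the vector $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
expressed in the natural basis of the transformation matrix, but you can derive it more
directly by plugging Equation 6.9.4 into Equation 6.9.1. See Problem 6.206.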
This gives us the two equations f̈ = −4f and g̈ = −g . Solving those, and plugging the solu-
tions into Equation 6.9.4, leads us back to the solutions we found before.
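The claim that the transformation becomes diagonal in the eigenvector basis can be checked directly. In the sketch below `P` holds the two eigenvectors as columns, and `M` is assumed to be the matrix of Equation 6.9.1 (taken from Problem 6.205); the product P⁻¹MP is then the same transformation written in the fg basis.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[ m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det,  m[0][0] / det]]

# Assumed matrix of Equation 6.9.1 and the eigenvector (normal mode) basis.
M = [[F(-3), F(-2)], [F(-1), F(-2)]]
P = [[F(1), F(1)], [F(1, 2), F(-1)]]

# The same transformation expressed in the eigenvector basis.
D = matmul(inverse_2x2(P), matmul(M, P))
for row in D:
    print([str(entry) for entry in row])
```

The off-diagonal entries come out exactly zero, which is what "decoupled" means here: the equation for f never mentions g, and vice versa.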
When we used this approach to solve the three-springs problem the normal modes ended
up being sine waves, but once you reduce the problem to a set of decoupled ODEs you can
solve them to find the normal modes no matter what kind of function they are. Also, instead
of solving for a particular set of initial conditions as we have you can always choose to add
the normal modes with their arbitrary constants to get the general solution. Both of these
points are illustrated in the example below.
(Notice that these are first derivatives, so this is different from our previous example,
but the same techniques apply.)
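Answer: The eigenvectors of this transformation matrix are $\begin{pmatrix} 1 \\ 1/2 \end{pmatrix}$
and $\begin{pmatrix} 1 \\ -3 \end{pmatrix}$, with corresponding eigenvalues −5 and 2 respectively. We can
write x and y in this basis as
$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = f(t) \begin{pmatrix} 1 \\ 1/2 \end{pmatrix} + g(t) \begin{pmatrix} 1 \\ -3 \end{pmatrix} \tag{6.9.7}$$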
Equation 6.9.7 says “you can express vector (x, y) at any given time as a linear
combination of these two basis vectors, with f and g as the coefficients.” You can see
what this conversion does to Equation 6.9.6 by directly plugging in these expressions
for x(t) and y(t) (Problem 6.207) or by “converting the transformation” (Section 7.2)
but you don’t have to: we already know that in the basis defined by the eigenvectors, a
transformation becomes a diagonal matrix of the eigenvalues. (Such a matrix says
“multiply the V⃗1 -component by 𝜆1 ” and so on.) So Equation 6.9.6 becomes:
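$$\begin{pmatrix} \dot{f} \\ \dot{g} \end{pmatrix} = \begin{pmatrix} -5 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} f \\ g \end{pmatrix}$$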
The “natural basis” for this transformation has done its job, giving us the decoupled
differential equations ḟ = −5f and ġ = 2g with solutions $f = Ae^{-5t}$ and $g = Be^{2t}$.
Finally, we can go back to the xy representation to write the general solution.
$$x = Ae^{-5t} + Be^{2t}, \qquad y = \tfrac{1}{2}Ae^{-5t} - 3Be^{2t}$$
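One way to gain confidence in a general solution like this is to rebuild the transformation matrix from its stated eigenvectors and eigenvalues and check that the solution satisfies the resulting equations. The sketch below does that; the matrix `M` it reconstructs is our inference from the eigen-decomposition, not a quotation of Equation 6.9.6, and the values of `A`, `B`, and `t` are arbitrary test numbers.

```python
import math
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[ m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det,  m[0][0] / det]]

# Eigenvectors as columns, eigenvalues on the diagonal (from the Answer above).
P = [[F(1), F(1)], [F(1, 2), F(-3)]]
D = [[F(-5), F(0)], [F(0), F(2)]]
M = matmul(matmul(P, D), inverse_2x2(P))   # reconstructed matrix (assumption)

def solution(A, B, t):
    """General solution x(t), y(t) from the example."""
    x = A * math.exp(-5 * t) + B * math.exp(2 * t)
    y = 0.5 * A * math.exp(-5 * t) - 3 * B * math.exp(2 * t)
    return x, y

def derivative(A, B, t):
    """Term-by-term time derivative of the general solution."""
    dx = -5 * A * math.exp(-5 * t) + 2 * B * math.exp(2 * t)
    dy = -2.5 * A * math.exp(-5 * t) - 6 * B * math.exp(2 * t)
    return dx, dy

A, B, t = 1.5, -0.7, 0.3                   # arbitrary test values
x, y = solution(A, B, t)
dx, dy = derivative(A, B, t)
ok = (math.isclose(dx, M[0][0] * x + M[0][1] * y)
      and math.isclose(dy, M[1][0] * x + M[1][1] * y))
print(M, ok)
```

The check passes for any choice of `A` and `B`, which is the numerical counterpart of saying that the two constants are arbitrary.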
Stepping Back
As you may have guessed, we want you to be able to solve the three-spring problem. Coupled
oscillators come up a lot. If you boil the whole thing down to “the resonant frequencies of
coupled oscillators are the square roots of the absolute values of the eigenvalues,” you will
have a tidbit that may prove useful.
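As a small illustration of that tidbit, the sketch below finds the eigenvalues of a 2 × 2 matrix from its characteristic polynomial and converts them to angular frequencies. The matrix is the one we assume for the example on Page 337 (quoted in Problem 6.205); the function name is ours.

```python
import math

def eigenvalues_2x2(m):
    """Real eigenvalues of a 2x2 matrix from lam**2 - tr*lam + det = 0.

    Assumes the discriminant is non-negative (real eigenvalues).
    """
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(tr * tr - 4 * det)
    return (tr - root) / 2, (tr + root) / 2

M = [[-3, -2], [-1, -2]]
frequencies = [math.sqrt(abs(lam)) for lam in eigenvalues_2x2(M)]
print(frequencies)   # prints: [2.0, 1.0]
```

These are the angular frequencies that appear as cos(2t) and cos t in the two normal modes of that example.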
But beyond that, of course, our strategy in this chapter has been to use the three-spring
problem as both introduction and capstone to your understanding of linear algebra. It is for
that reason that we just solved the same problem twice. In both cases, the takeaway is “the
eigenvectors of the matrix define the normal modes of the system.”
In the first approach, this fact came very directly from the eigenvalue equation. When you
replace the state of the system with an eigenvector, the two variables get separate differential
equations but with the same frequency, which defines a normal mode. If you try replacing
the state of the system with something that isn’t a normal mode (say, x2 = 3x1 ) you will end
up with two contradictory differential equations.
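The contradiction is easy to see in numbers. Substituting x2 = r·x1 into the matrix equation gives one equation for ẍ1 from each row; the sketch below computes the coefficient each row implies. It assumes the matrix from the example on Page 337 (as quoted in Problem 6.205), and `implied_coefficients` is a hypothetical helper name.

```python
from fractions import Fraction as F

# Assumed matrix of the example on Page 337 (quoted in Problem 6.205).
M = [[F(-3), F(-2)], [F(-1), F(-2)]]

def implied_coefficients(m, ratio):
    """Substitute x2 = ratio * x1 into x'' = M x.

    Row 1 gives x1'' = k1 * x1. Row 2 gives ratio * x1'' = (...) * x1,
    which rearranges to x1'' = k2 * x1. Returns (k1, k2).
    """
    k1 = m[0][0] + m[0][1] * ratio
    k2 = (m[1][0] + m[1][1] * ratio) / ratio
    return k1, k2

print(implied_coefficients(M, F(1, 2)))   # eigenvector: both rows agree
print(implied_coefficients(M, F(-1)))     # eigenvector: both rows agree
print(implied_coefficients(M, F(3)))      # not an eigenvector: rows disagree
```

For the two eigenvector ratios the rows agree (on −4 and −1 respectively). For x2 = 3x1 the first row demands ẍ1 = −9x1 while the second demands ẍ1 = −(7∕3)x1, and no function satisfies both.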
In the second approach, we rewrote the entire problem in the basis defined by the normal
modes. Instead of solving directly for the positions of the balls, we solved for the amplitudes
of the normal modes. This is less direct, but it gives you perhaps the clearest view of why
these normal modes work and no others would; only in the basis of the normal modes does
the transformation matrix become diagonal, thus decoupling the two equations.
If you find the second approach confusing, start by getting comfortable with the first
approach. You should be able to understand it and solve problems with it. Once you
have worked with it for a while, come back to the second approach and see if you can see how it
is doing the same thing as the first one, but in a different basis.
(c) Explain why the normal mode in Part (b) has a higher frequency than the one in Part (a). Your
answer shouldn’t be about eigenvalues or normal mode frequencies, but about springs and forces.
6.192 The image below shows three coupled oscillators.
[Figure: three masses m joined in a row by four springs, each with spring constant k]
You should be able to write down their equations of motion and solve them to find the normal
modes, but in this problem we want to focus on the physical interpretation so we’re going to give
you the eigenvectors of the transformation matrix.
$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ \sqrt{2} \\ 1 \end{pmatrix}, \quad \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ -\sqrt{2} \\ 1 \end{pmatrix}$$
(These come out the same regardless of the values of k and m.) The first of these eigenvectors
describes a normal mode in which you pull all three balls to the right (or left), pulling the
middle one slightly farther than the outer two, and then let go. They will then oscillate that way
forever, always moving to the right and left together, with the middle one oscillating with a
larger amplitude than the other two. Using that description as a guide, write similar descriptions
for what the other two eigenvectors physically represent.
6.193 Walk-Through: Coupled Differential Equations, First Method. In this problem you will find
the solution to the equations
$$\ddot{x} = -6x + y$$
$$\ddot{y} = -4x - y$$
using the approach outlined in the example on Page 337.
(a) Rewrite these differential equations as a single matrix equation. One side will be the column
matrix $\begin{pmatrix} \ddot{x} \\ \ddot{y} \end{pmatrix}$. The other side will be a square matrix of all numbers
times a column matrix that represents the state of the system.
(b) Find the eigenvectors and eigenvalues of the square matrix in your equation. You may do this
with a computer or by using the method described in Section 6.8.
(c) Choose one of the eigenvectors you found and use it to write y as a function of x. Plug this
function into your matrix equation from Part (a) and multiply the matrices. The result should give
you two equivalent differential equations for x.
(d) Solve your equation for x. The solution should contain two arbitrary constants. Then use the
relationship between x and y in this eigenvector to write the solution for y, which will contain
the same two constants.
(e) Repeat Parts (c)–(d) for the other eigenvector.
(f) Combine your answers to find the general solution for x and y.
(g) Plug your general solution into the original differential equations and verify that it works.
6.194 [This problem depends on Problem 6.193.]
(a) Assume that ẋ(0) = ẏ(0) = 0. Use this constraint to simplify your solution. (Several terms
should drop out.) Assume these initial velocities are zero throughout the rest of this problem.
(b) Write a set of initial conditions for x and y for which the system will be in one of its
normal modes. Write the solution x(t), y(t) corresponding to those initial conditions. This should
require no calculations.
(c) Suppose your system is in a mix of the first normal mode with amplitude A plus the second
normal mode with amplitude B. Write the matrix that converts the vector
$\begin{pmatrix} A \\ B \end{pmatrix}$ to the initial conditions x(0) and y(0).
(d) Using the matrix you found in Part (c), write the initial conditions that would give you
A = B = 2.
(e) Write the matrix that converts from the initial conditions
$\begin{pmatrix} x(0) \\ y(0) \end{pmatrix}$ to the normal mode amplitudes A and B.
(f) The system starts with x(0) = 3, y(0) = 4. Use the matrix you wrote in Part (e) to find A and B
and use that to write the solution x(t), y(t) for these initial conditions.
6.195 Walk-Through: Coupled Differential Equations, Second Method. In this problem you will find
the solution to the equations
$$\ddot{x} = -3x - 4y$$
$$\ddot{y} = -3x + y$$
using the approach illustrated in the example on Page 339.
(a) Rewrite these differential equations as a single matrix equation. One side will be the column
matrix $\begin{pmatrix} \ddot{x} \\ \ddot{y} \end{pmatrix}$. The other side will be a square matrix of all numbers
times a column matrix that represents the state of the system.
(b) Find the eigenvectors and eigenvalues of the square matrix in your equation. You may do this
with a computer or by using the method described in Section 6.8.
(c) Write the vector $\begin{pmatrix} x(t) \\ y(t) \end{pmatrix}$ as a sum of the eigenvectors you wrote in Part (b).
The coefficients in this sum will be functions of time, which you can call f (t) and g (t).
(d) The equation you wrote in Part (a) should have had a column matrix on the left and a square
matrix and a column matrix on the right. All of these matrices were written in the xy basis.
Rewrite the equation in the fg basis. The result should be two decoupled differential equations.
(e) Solve the equations to find f (t) and g (t). Include all arbitrary constants.
(f) Use your solutions for f and g and your answer to Part (c) to write the general solution
x(t), y(t).
6.196 Solve the differential equations in Problem 6.193 using the approach illustrated in the
example on Page 339.
6.197 Solve the differential equations in Problem 6.195 using the approach outlined in the example
on Page 337.
In Problems 6.198–6.203 find the normal mode solutions for the given sets of equations. Then find
the amplitude of each normal mode if the system starts with the given initial conditions. When
writing the solutions you should include all arbitrary constants, but when you plug in initial
conditions for second-order equations assume the initial first derivatives are zero. In all cases
you may use a computer to find eigenvectors and eigenvalues or do it by hand, but in the case of
larger matrices we’ve marked the problems with a computer icon to indicate that doing it by hand
would be tedious.
6.198 ẍ = −7x + 2y, ÿ = −4x − y, x(0) = 1, y(0) = 3
6.199 ẍ = −3x − y, ÿ = −4x − 3y, x(0) = 1, y(0) = −5
6.200 ẍ = −4x + y, ÿ = x − 4y, x(0) = 1, y(0) = −1
6.201 ẋ = 2x + y, ẏ = 4x + 5y, x(0) = 2, y(0) = 1. Notice that these are first derivatives.
6.202 ẍ = 3x + y + z, ÿ = x + y + 3z, z̈ = −x + 3y + z, x(0) = 1, y(0) = 2, z(0) = 3. Hint: For
positive eigenvalues you can solve the ODEs using exponentials, but it’s easier to apply the
condition that the initial derivatives are zero if you write the solution in terms of sinh and
cosh. If you have no idea what those are just go ahead and use exponentials and it won’t be too
bad, and you can learn about sinh and cosh in Section 12.3.
6.203 ẋ = −2x − y + z, ẏ = −2x + y − 3z, ż = x + y − 2z, x(0) = 1, y(0) = 1, z(0) = 1. Notice that
these are first derivatives.
6.204 In the example on Page 337 we solved a coupled oscillator problem on the assumption that the
balls started with zero velocity. In this problem you will relax that assumption. Your starting
point will be Equations 6.9.2–6.9.3.
(a) The four initial conditions are x1(0), x2(0), ẋ1(0), and ẋ2(0). If A1 = 1 and the other three
arbitrary constants are all zero, what are the values of these four initial conditions? Answer the
same question for each of the other three arbitrary constants.
(b) Using your answers to Part (a), write a 4 × 4 matrix to convert from the four arbitrary
constants to the four initial conditions.
(c) Invert that matrix to find a matrix that converts from initial conditions to arbitrary
constants. (If you think about this there’s a relatively simple way to do it by hand, but you’re
welcome to use a computer if you prefer.)
(d) Write down the solution for the initial conditions x1(0) = 3, x2(0) = 1, ẋ1(0) = 1,
ẋ2(0) = −2.
6.205 In the example on Page 337 our solution required finding the eigenvectors and eigenvalues of
$\begin{pmatrix} -3 & -2 \\ -1 & -2 \end{pmatrix}$. Find those eigenvectors and eigenvalues (without a computer)
using the method described in Section 6.8.
6.206 In the Explanation (Section 6.9.1) we said that when you rewrote x1 and x2 as a sum
6.212 Leave k and m as letters in your solution.
[Figure: springs k, k, k and masses m, m]
6.213 k1 = 16 N/m, k2 = 4 N/m, k3 = 10 N/m, m1 = 2 kg, m2 = 2 kg
[Figure: springs k1, k2, k3 and masses m1, m2]
6.214 k1 = 3 N/m, k2 = 2 N/m, m1 = 1 kg, m2 = 1 kg
[Figure: springs k1, k2 and masses m1, m2]
6.215 k1 = 24 N/m, k2 = 6 N/m, m1 = 3 kg, m2 = 2 kg. The only thing you should need a computer
for is the eigenvectors and eigenvalues of a 3 × 3 matrix.
[Figure: springs k1, k2, k2, k1 and masses m1, m2, m1]
6.216 k1 = 24 N/m, k2 = 6 N/m, m1 = 3 kg, m2 = 1 kg, m3 = 6 kg. The only thing you should need a
computer for is the eigenvectors and eigenvalues of a 3 × 3 matrix.
[Figure: springs k1, k2, k2, k1 and masses m1, m2, m3]
6.217 A double pendulum is shown below.
[Figure: double pendulum with angles θ1 and θ2]
the gravitational acceleration g , and the length L of the strings.
$$2\ddot{\theta}_1 + \ddot{\theta}_2 \cos(\theta_1 - \theta_2) + \dot{\theta}_2^2 \sin(\theta_1 - \theta_2) + 2\frac{g}{L} \sin\theta_1 = 0$$
$$\ddot{\theta}_2 + \ddot{\theta}_1 \cos(\theta_1 - \theta_2) - \dot{\theta}_1^2 \sin(\theta_1 - \theta_2) + \frac{g}{L} \sin\theta_2 = 0$$
These equations are non-linear and cannot be solved using the methods in this section. For small
oscillations, however, you can assume 𝜃1, 𝜃2, and all of their derivatives remain small, and thus
approximate this with a set of linear equations.
(a) Replace all the trig functions with the linear terms of their Maclaurin series expansions.
Then eliminate any remaining non-linear terms. The result should be two coupled, linear
differential equations.
(b) Solve the equations algebraically for 𝜃̈1 and 𝜃̈2 so you can write them in the form we used in
this section.
(c) Find the normal mode frequencies of this system for small oscillations.
6.218 [This problem depends on Problem 6.217.] For this problem take g = 9.8 m/s² and L = 1.0 m
and assume in all cases that the two pendulums start at rest.
(a) Choose one of the two normal modes of the system and numerically solve the original equations
(not the linearized ones) for 𝜃1(0) = 0.1, with 𝜃2(0) chosen to be whatever it needs to be for
that normal mode. Plot 𝜃1(t) and 𝜃2(t) together on one plot. Include enough time on your plot to
see at least 3 full oscillations of the pendulums.
(b) Repeat Part (a) for values of 𝜃1(0) = 0.2, 0.3, and so on up to 1. In each case adjust 𝜃2(0)
to whatever the normal mode solution says it should be.
(c) Describe what the system’s behavior looks like for small and large amplitudes using these
initial conditions. Explain why the different behavior in the different cases makes sense.
6.219 The figure below shows a circuit with capacitors and inductors.
[Figure: circuit with two inductors L, capacitors C1, C2, C1, currents I1, I3, I2, and point A]
Conservation of charge requires that I3 = I1 + I2, so there are only two variables needed to
describe the current in the system. You can get equations for those currents by setting the
voltage drop from point A to point B equal along all three paths and differentiating:
$$I_1/C_1 + L\ddot{I}_1 = I_2/C_1 + L\ddot{I}_2 = -(I_1 + I_2)/C_2.$$
(Don’t worry if you didn’t follow how we got that; your job is going to be to solve it.) Remember
that Ï means d²I∕dt².
(a) Write these as a pair of equations for Ï1 and Ï2.
(b) Find the normal modes of the system and use them to write the general solution.
(c) In one of the normal modes the current on both sides of the circuit is equal, so at some point
in time a current I is flowing up on the left, an equal current I is flowing up on the right, and
a current 2I is flowing down the middle. Then they reverse (down on both sides and up in the
middle) and go back and forth like that forever. Using this description as a guide, describe the
physical state represented by the other normal mode.
CHAPTER 7
Linear Algebra II
∙ find the matrix that performs a given geometric transformation, or find the transformation
performed by a given matrix.
∙ identify scalars, vectors, and tensors as defined by their transformation properties.
∙ convert tensors, including transformation matrices, from one basis to another.
∙ identify vector spaces and find their dimensions.
∙ express abstract vectors in terms of different bases.
∙ use “row reduction” to find the solution to a set of linear equations, or to determine if they
are inconsistent or linearly dependent.
∙ use the “simplex method” to solve problems in linear programming.
[Figure: the ants’ positions, shown in two panels labeled Before and After]
You now have a matrix that can be multiplied by any vector x0 î + y0 ĵ to determine
where that vector will be after the rotation.
7. In the picture above the initial positions are î, î + ĵ, and −î + ĵ and the rotation angle
is 30°. Use your matrix to determine where the three ants end up after rotation. Make
sure your answers seem to roughly match the final positions shown in the figure.
8. Without doing any calculations, predict where each of the three ants would have
ended up if the turntable had been rotated counterclockwise by 90°. Check your
answer by replacing Δ𝜙 with 90° in your matrix, and then acting with that matrix on
the ants’ initial positions.
What you’ve just derived is a “transformation matrix,” a matrix that can be used to geo-
metrically transform vectors. In this section you’ll encounter transformation matrices that
stretch and reflect vectors as well as rotating them.
FIGURE 7.1 A rectangle being stretched by a factor of 2.5 in the x-direction (a), reflected across the line
y = −x∕3 (b), and rotated 60° counterclockwise about the origin (c).
But to describe a transformation we must describe its effects on all vectors and shapes,
not just on one. For instance, Figure 7.2 shows one vector being transformed into another.
Commutative Transformations
Recall that the matrix 𝐂 = 𝐀𝐁 performs transformation 𝐁 and then transformation 𝐀. (You
read right-to-left because 𝐀𝐁V⃗ means “𝐀 acting on 𝐁V⃗ .”) Order generally matters. For
example, Figure 7.3 shows that if you start with the vector ĵ, stretch it by 2 in the y-direction,
and then reflect it about the line y = x, you end up with 2î. If you do the reflection and then
the stretch you end up with î.
[Figure: the vector ĵ becomes 2î under stretch-then-reflect and î under reflect-then-stretch]
FIGURE 7.3 Order matters.
In the last paragraph we made a visual statement: “the order in which you apply these two
transformations matters.” This corresponds to a mathematical statement: “the matrices that
represent these two transformations do not commute.”
$$\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$$
When two matrices do commute mathematically, that means their transformations can be
taken in either order. For instance, any two 2D rotation matrices commute. This is not obvi-
ous looking at the matrices, but it is obvious if you think about it physically: rotating by 30°
clockwise and 50° counterclockwise results in a 20° counterclockwise rotation, no matter
which order you do them in.
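Both statements can be checked by multiplying the matrices in each order. The sketch below uses the stretch and reflection matrices from the inequality above and builds rotation matrices with a small helper; `rotation` and `close` are illustrative names of ours, and angles are counted counterclockwise.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices (b acts first, then a)."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation(degrees):
    """Counterclockwise 2D rotation matrix."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, -s], [s, c]]

def close(a, b, tol=1e-12):
    """Entry-by-entry comparison that tolerates floating-point rounding."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

stretch = [[1, 0], [0, 2]]   # stretch by 2 in the y-direction
reflect = [[0, 1], [1, 0]]   # reflect about the line y = x

stretch_then_reflect = matmul(reflect, stretch)   # read right to left
reflect_then_stretch = matmul(stretch, reflect)
print(stretch_then_reflect, reflect_then_stretch)

cw30, ccw50 = rotation(-30), rotation(50)
print(close(matmul(cw30, ccw50), matmul(ccw50, cw30)))   # rotations commute
```

The second column of each product is what the matrix does to ĵ: (2, 0) for stretch-then-reflect and (1, 0) for reflect-then-stretch, matching Figure 7.3.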
Degenerate Eigenvalues
If a matrix 𝐌 has two or more linearly independent eigenvectors with the same eigenvalue, that
eigenvalue is said to be “degenerate.” In that case any linear combination of those eigenvectors is
also an eigenvector with the same eigenvalue.
Geometrically, two linearly independent eigenvectors with a degenerate eigenvalue define a
plane on which all of the vectors are eigenvectors.
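A concrete case may help. The sketch below uses the 3 × 3 matrix 𝐓 from Problem 6.173, whose eigenvectors î and ĵ share the eigenvalue 2, and checks that a combination of them is again an eigenvector while a combination that mixes in the third eigenvector is not. The helper names and the particular coefficients are ours.

```python
# Matrix T from Problem 6.173(c); i_hat and j_hat both have eigenvalue 2.
T = [[2, 0, 1],
     [0, 2, 0],
     [0, 0, 1]]

def matvec3(m, v):
    """Multiply a 3x3 matrix by a 3-component vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def is_eigenvector(m, v, lam):
    """True if m v equals lam v exactly (integer arithmetic)."""
    return matvec3(m, v) == [lam * c for c in v]

i_hat, j_hat = [1, 0, 0], [0, 1, 0]
third = [-1, 0, 1]                                      # eigenvalue 1

combo = [3 * a - 5 * b for a, b in zip(i_hat, j_hat)]   # 3 i_hat - 5 j_hat
mixed = [a + b for a, b in zip(i_hat, third)]           # i_hat + (-i_hat + k_hat)

print(is_eigenvector(T, combo, 2))    # prints: True
print(is_eigenvector(T, mixed, 2), is_eigenvector(T, mixed, 1))
```

Every vector in the xy-plane is an eigenvector of 𝐓 with eigenvalue 2, but adding eigenvectors with different eigenvalues generally does not produce an eigenvector.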
This is an important addition to our understanding of determinants, and can lead to many
important insights. We offer three below.
∙ Given two matrices with |𝐀| = 3 and |𝐁| = 5, what is the determinant of matrix 𝐀𝐁? We
can answer this question with no algebra if we remember that 𝐀𝐁 means “first multiply
by 𝐁 and then by 𝐀.” So the area will multiply first by 5 and then by 3, resulting in a total
increase by a factor of 15. This train of logic leads us to the following generalization
about determinants:
|𝐀𝐁| = (|𝐀|) (|𝐁|)
∙ Our other uses for the determinant were based only on whether the determinant was
zero. This is our first reason to care what non-zero number the determinant is! But the
special case of zero is still worth noting, because a matrix with a determinant of zero
will transform any shape into a shape with zero area. For instance, the matrix
$\begin{pmatrix} 2 & 3 \\ 0 & 0 \end{pmatrix}$
drops all points onto the x-axis. The resulting figure (a line segment) does not contain
enough information to recreate the original figure (a 2D shape). This connects our
new interpretation to one of our old ones: a matrix with a determinant of zero cannot
be inverted.
∙ When you apply a transformation to a shape, it multiplies each eigenvector by the
associated eigenvalue, thus multiplying the entire area by the determinant. Those
eigenvectors, eigenvalues, and determinant are properties of the transformation itself,
not just of the matrix that represents it. In a different coordinate system that same
transformation will be represented by a different matrix (“similar” to the original). We
conclude that any two similar matrices—that is, any two matrices that represent the
same transformation in different bases—will have the same eigenvalues and the same
determinant. The eigenvectors will also be the same, although their representations in
the new basis will be different.
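The first and third observations are easy to check numerically. The sketch below is a minimal illustration in plain Python; the helper names and the particular matrices (chosen so that |𝐀| = 3 and |𝐁| = 5, plus an arbitrary invertible change of basis 𝐂) are our own assumptions, not taken from the text.

```python
# Hypothetical 2x2 helpers, written out so the example needs no libraries.
def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(P):
    """Determinant of a 2x2 matrix."""
    return P[0][0] * P[1][1] - P[0][1] * P[1][0]

def inverse(P):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    d = det(P)
    return [[P[1][1] / d, -P[0][1] / d],
            [-P[1][0] / d, P[0][0] / d]]

A = [[3, 0], [1, 1]]   # |A| = 3
B = [[1, 2], [0, 5]]   # |B| = 5
det_AB = det(matmul(A, B))          # should equal |A| * |B|

Z = [[2, 3], [0, 0]]   # the zero-determinant matrix from the second bullet

# A matrix similar to A: the same transformation in another basis.
C = [[1, 1], [0, 1]]   # arbitrary invertible change of basis (assumption)
A_similar = matmul(matmul(C, A), inverse(C))

print(det_AB, det(Z), det(A_similar))
```

The three printed numbers should be the product of the two determinants, zero, and the determinant of 𝐀 again (up to floating-point rounding).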
If we tell you that a given 2 × 2 matrix has a determinant of 10, you can conclude that it
represents two linearly independent equations and/or two linearly independent vectors. You
can conclude that it has an inverse matrix. You can now also conclude that it will multiply
the area of any point matrix by 10. But if we tell you that the determinant is −10 you can
come to all those same conclusions. So what does the negative determinant mean?
Remember that the determinant is the product of the eigenvalues. Consider a matrix with
eigenvalues 5 and 2, and a different matrix with eigenvalues 5 and −2. The first will stretch
any object by a factor of 5 in one direction and 2 in another direction. The second will do
the same, but then it will reflect shapes in the second direction, so the shapes come out
backward. The sign of the determinant therefore tells us about “orientation.” A matrix with
a positive determinant may stretch and rotate an object shaped like a lowercase ‘b’, but it
will still be a lowercase ‘b’ in the end. A matrix with a negative determinant will turn it into
a lowercase ‘d’. See Problems 7.31–7.32.
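To see the sign of the determinant at work, you can track the signed area of a shape. The sketch below uses the shoelace formula, which returns a positive area when the vertices are listed counterclockwise and a negative one when they are listed clockwise. The triangle and the two diagonal matrices (eigenvalues 5 and 2, and 5 and −2) are illustrative choices of ours.

```python
def apply(M, p):
    """Apply a 2x2 matrix to a point (x, y)."""
    return (M[0][0] * p[0] + M[0][1] * p[1],
            M[1][0] * p[0] + M[1][1] * p[1])

def signed_area(points):
    """Shoelace formula: positive for counterclockwise vertex order."""
    total = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return total / 2

triangle = [(0, 0), (2, 0), (2, 1)]   # counterclockwise, area 1

stretch = [[5, 0], [0, 2]]     # eigenvalues 5 and 2, determinant 10
flip = [[5, 0], [0, -2]]       # eigenvalues 5 and -2, determinant -10

area_stretch = signed_area([apply(stretch, p) for p in triangle])
area_flip = signed_area([apply(flip, p) for p in triangle])

print(signed_area(triangle), area_stretch, area_flip)   # 1.0 10.0 -10.0
```

Both matrices multiply the size of the area by 10, but the second one reverses the order in which you meet the vertices, which is exactly what turns a ‘b’ into a ‘d’.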
Question: Describe the effects of the following two transformation matrices. You may
use a computer to help analyze 𝐁.
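$$\mathbf{A} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & \sqrt{3}/2 & -1/2 \\ 0 & 1/2 & \sqrt{3}/2 \end{pmatrix} \qquad \mathbf{B} = \begin{pmatrix} 3/5 & -4/5 \\ -4/5 & -3/5 \end{pmatrix}$$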
Answer A:
You can figure out what this matrix does with no calculations. The new x-component
only depends on the old x-component, and the new y and z only depend on the old y
and z (not the old x). If there are two groups of variables that are not mixed like that
then you say the matrix is “block diagonal.”a What it does to x is trivial: it doubles it.
What it does to y and z is clear if you write √3∕2 as cos 30° and 1∕2 as sin 30°. The
lower right block is a 2D rotation matrix that rotates vectors 30° about the x-axis.
Using that description, try to predict what this matrix will do to the unit vectors î, ĵ,
and k̂, and then test them to see if your predictions bear out.
a Technically a matrix is only block diagonal if the variables in each group are next to each other, so a
matrix that mixes x and z but doesn’t mix either one with y isn’t block diagonal, although in most ways it
acts like one. See Appendix E for a definition of “block diagonal.”
Answer B:
You can’t tell what this one does by inspection. (At least we can’t.) When in doubt,
the best way to tell what a matrix does is often to find its eigenvectors and eigen-
values. You can do this by hand or by a computer, and either way you find that î + 2ĵ
has eigenvalue −1 and −2î + ĵ has eigenvalue 1. Since all vectors along the line
y = −x∕2 stay the same and all vectors perpendicular to that line are reversed, this is a
reflection about that line.
(Bonus questions: What are the eigenvectors and eigenvalues of matrix 𝐀? What
are the determinants of both matrices? Hint: matrix 𝐀 has only one real eigenvector.)
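If you would rather let a computer do the testing, a short script is enough. The following is a sketch in plain Python (the function name `apply` and the variable names are ours); it applies 𝐁 to one vector along the line y = −x∕2 and one perpendicular to it, and applies 𝐀 to the three unit vectors.

```python
import math

def apply(M, v):
    """Multiply matrix M (nested lists) by the column vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

B = [[3/5, -4/5],
     [-4/5, -3/5]]
along_line = apply(B, [-2, 1])    # a vector on the line y = -x/2
across_line = apply(B, [1, 2])    # a vector perpendicular to that line

c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
A = [[2, 0, 0],
     [0, c, -s],
     [0, s, c]]
# Images of the unit vectors i, j, k under A, in that order.
images = [apply(A, e) for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]

print(along_line, across_line)
for image in images:
    print(image)
```

The first vector should come back unchanged and the second reversed (up to rounding), as expected for a reflection. Compare the three images under 𝐀 with your predictions.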
Orthogonal Matrices
Matrices that only rotate and reflect vectors, but leave all vectors with the same length, are
called “orthogonal matrices.” All the eigenvalues of an orthogonal matrix must have magni-
tude 1. Being orthogonal implies having a determinant of ±1, but the converse is not true.
A matrix that stretches by a factor of 2 in the x-direction and 1/2 in the y-direction would
still have determinant 1.
To check if a matrix is orthogonal you could find all of its eigenvalues, but there’s a faster
way. It requires us to introduce a new term: the “transpose” of a matrix is obtained by turning
all its rows into columns. For instance, the transpose of $\mathbf{M} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ is $\mathbf{M}^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}$.
For a square matrix the transpose is a mirror image across the main diagonal:
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^T = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$. A matrix is orthogonal if $\mathbf{M}^T = \mathbf{M}^{-1}$. In other words, to see
if a square matrix 𝐌 is orthogonal multiply it by its transpose; it’s orthogonal if and only if
you get the identity matrix.
The name “orthogonal” comes from the fact that all of the columns (or equivalently
all the rows) of an orthogonal matrix represent orthogonal (aka perpendicular) vectors of
length 1, just in case you were wondering.
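The transpose test translates directly into code. Here is a minimal sketch in plain Python; the function names and the tolerance are our own choices, and the three test matrices are a 30° rotation, the reflection 𝐁 from the example above, and the stretch-by-2, shrink-by-1/2 matrix that has determinant 1 without being orthogonal.

```python
import math

def transpose(M):
    """Turn the rows of M into columns."""
    return [list(column) for column in zip(*M)]

def matmul(P, Q):
    """Matrix product of nested-list matrices with compatible shapes."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def is_orthogonal(M, tol=1e-12):
    """True if the transpose of M times M is the identity, within tol."""
    n = len(M)
    product = matmul(transpose(M), M)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
rotation = [[c, -s], [s, c]]
reflection = [[3/5, -4/5], [-4/5, -3/5]]
squeeze = [[2, 0], [0, 0.5]]   # determinant 1, but lengths change

print(transpose([[1, 2, 3], [4, 5, 6]]))   # [[1, 4], [2, 5], [3, 6]]
print(is_orthogonal(rotation), is_orthogonal(reflection), is_orthogonal(squeeze))
```

A small tolerance is needed because entries such as cos 30° are stored only approximately.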
Stepping Back
When Charles Darwin studied the finches of the Galápagos and Cocos islands in 1835, he was
impressed by the diverse shapes and sizes of their beaks, which were remarkably well adapted
to their food gathering needs. 175 years later a group of evolutionary biologists and applied
mathematicians used a set of transformation matrices to determine that nearly all the beaks
can be classified into two groups.1 Within each group, every beak can be transformed to every
other beak by a stretch (diagonal matrix). Furthermore, each group can be transformed to
the other group by a shear matrix.2 This result suggests that the wide variation in beaks can be
accounted for by only two or three genetic parameters, which one biologist observed “makes
it much more feasible that you can get that much change in a relatively short time.”
1 Campàs et al., “Scaling and shear transformations capture beak shape variation in Darwin’s finches,” Proceedings
of the National Academy of Sciences 2010; published ahead of print February 16, 2010, doi:10.1073/
pnas.0911575107.
2 In 2D a shear matrix is either $\begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$ or $\begin{pmatrix} 1 & 0 \\ k & 1 \end{pmatrix}$ where k is a non-zero constant. Visually it tilts shapes, e.g.,
turning a rectangle into a parallelogram.
(b) Write two matrix equations that express mathematically what you concluded in Part (a).
(c) Write out those matrix equations as four algebraic equations, using a, b, c, and d for the unknown elements of 𝐌, and solve the equations to find 𝐌.

7.21 In this problem you’ll find the matrix 𝐌 that stretches vectors by a factor of 3 along the axis x = y and reflects vectors across that same axis.
(a) Just from this description, what are the eigenvectors and eigenvalues of 𝐌?
(b) Write two matrix equations that express mathematically what you concluded in Part (a).
(c) Write out those matrix equations as four algebraic equations, using a, b, c, and d for the unknown elements of 𝐌 and solve the equations to find 𝐌.

7.22 If you want to stretch a vector by a factor of 3 in direction d⃗ then your matrix’s first eigenvector should be d⃗ with eigenvalue 3, and its second eigenvector should be perpendicular to d⃗ with eigenvalue 1. (Problems 7.20 and 7.21 illustrated this point.) The least obvious part of that rule is that the second eigenvector must be perpendicular to d⃗.
To investigate that claim, we begin by considering matrix 𝐓1 whose eigenvectors are ĵ (with eigenvalue 3) and î (with eigenvalue 1). Note: this entire problem can be solved with almost no calculations if you understand at each step what the question is asking.
(a) Let v⃗ = î + ĵ. Calculate the vector 𝐓1v⃗.
(b) Draw v⃗ and 𝐓1v⃗ on the same graph.
Now consider matrix 𝐓2 whose eigenvectors are A⃗ = ĵ (with eigenvalue 3) and B⃗ = î + ĵ (with eigenvalue 1).
(c) What are the components of v⃗ in the A⃗B⃗ basis?
(d) What is the effect of matrix 𝐓2 on v⃗?
(e) Draw v⃗ and 𝐓2v⃗ on the same graph.
(f) Did 𝐓1 stretch v⃗ by 3 in the ĵ-direction? Did 𝐓2?

7.23 Matrix 𝐓 has eigenvectors î and ĵ, both with eigenvalue 3. It also has eigenvector k̂ with eigenvalue 1.
(a) For v⃗1 = 2î − 5ĵ find 𝐓v⃗1.
(b) Is v⃗1 an eigenvector of 𝐓? If so, find its eigenvalue.
(c) For v⃗2 = 2î − 5k̂ find 𝐓v⃗2.
(d) Is v⃗2 an eigenvector of 𝐓? If so, find its eigenvalue.
(e) In general, what vectors are eigenvectors of 𝐓?

7.24 You are looking for a transformation 𝐌, and initially all you know about it is that it turns the vector î − ĵ into the vector −î − ĵ.
(a) Write a rotation matrix that could be 𝐌, given this information.
(b) Write a reflection matrix that could be 𝐌, given this information.
(c) What would each of the matrices you just wrote do to the vector î + ĵ?

7.25 Prove that any transformation matrix moves the origin to itself.

7.26 Can you write a matrix that converts the Cartesian coordinates of a point into the spherical coordinates of that point? If so, write the matrix. If not, explain why not.

7.27 Point matrices are a bit more complicated in 3D than in 2D, but the basic idea is the same. Let 𝐏 be the following big-but-not-so-bad-when-you-look-at-it-carefully matrix.
$$\mathbf{P} = \begin{pmatrix} 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \end{pmatrix}$$
(a) Describe the shape represented by 𝐏. What kind of shape is it, how big is it, and where is it?
(b) Write a matrix that reflects vectors across the yz-plane and stretches them by 2 in the z-direction.
(c) Act with the transformation matrix you wrote on 𝐏. Describe the resulting shape.

7.28 In this problem you will show both physically and mathematically that 3D rotation matrices do not commute. Stand facing your desk. Define your right to be the positive x-direction, in front of you to be the positive y-direction, and up to be the positive z-direction. Your upper body is the vector you’re transforming, pointing through the top of your head, and initially it points in the positive z-direction.
(a) Bend forward 90°. What axis did you rotate about and in what direction did your head end up pointing (x, y, or z, and positive or negative)?
(b) Still bent over, turn left 90°. What axis did you rotate about and in what direction did your head end up pointing?
(c) Stand up straight facing your desk again to reset back to your original vector. Now do the same rotation you did in Part (b). In what direction did your head end up pointing?
(d) Now do the rotation from Part (a). Be sure to rotate about the same axis you did before. This time it will not be a forward bend. (If you’re not flexible enough to bend sideways 90°, and who is really, just go as far as you can and pretend.) In what direction did your head end up pointing?
(e) Write the matrices for the rotations in Parts (a) and (b). Multiply them in both orders to see if they commute. How could you have predicted the answer without doing any calculations?

7.29 Matrix 𝐒 stretches all vectors by a factor of 5 in direction D⃗ (and that’s all it does). When this matrix acts on most vectors, it changes both their magnitudes and their directions.
(a) When 𝐒 acts on the vector A⃗, it changes its magnitude but not its direction. What can you conclude about A⃗?
(b) When 𝐒 acts on the vector B⃗, it does not change its magnitude or direction. What can you conclude about B⃗?

7.30 In this problem you will prove that the determinant of a diagonal matrix is the product of the numbers along the main diagonal. (For instance, $\begin{vmatrix} 3 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & -2 \end{vmatrix} = 3 \times 5 \times -2 = -30$.)
(a) Prove that this is true for any diagonal 2 × 2 matrix.
(b) Prove that if this result holds for an n × n matrix, then it also holds for an (n + 1) × (n + 1) matrix.

7.31 The figure below shows a right triangle with vertices at (0, 0), (2, 0), and (2, 1). We’ve labeled the sides A, B, and C, and we’ll refer to the triangle as R. You can define the “orientation” of R by the fact that if you start at the origin (the vertex AC) and move along A, you then have to turn left to turn onto B. In this problem you’ll explore how different transformations affect this orientation.
[Figure: right triangle R with vertices (0, 0), (2, 0), and (2, 1), and sides labeled A, B, and C.]
(a) Transform each of the vectors of R with the matrix $\mathbf{T}_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and draw the resulting shape. Side A of the old triangle becomes one of the sides of your new triangle; mark that side A1, and similarly for the other two sides. Does the transformed triangle have the same orientation as R (turn left from A to B) or the opposite orientation (turn right from A to B)?
(b) Repeat Part (a) for each of the following matrices.
i. $\mathbf{T}_2 = \begin{pmatrix} -2 & 0 \\ 0 & -2 \end{pmatrix}$
ii. $\mathbf{T}_3 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
iii. $\mathbf{T}_4 = \begin{pmatrix} 1 & 0 \\ 0 & -3 \end{pmatrix}$
(c) For each of the matrices 𝐓1 to 𝐓4, calculate the determinant.
(d) Based on your results, what does the determinant of a matrix tell you about the orientation of shapes?

7.32 You perform a transformation on the shape [original shape: figure not reproduced]. For each of the resulting shapes shown below, is the determinant positive or negative, and is its absolute value less than or greater than 1?
(a) (b) (c) (d) [resulting shapes: figures not reproduced]

7.33 The matrices $\begin{pmatrix} 1 & 4 \\ 2 & 7 \end{pmatrix}$ and $\begin{pmatrix} 3 & 2 \\ 6 & 1 \end{pmatrix}$ are not similar; that is, they do not represent the same transformation as each other in any possible coordinate systems. How can we tell that for sure?

7.34 For each matrix below, determine if it is orthogonal. If it is not, find at least one vector whose magnitude is changed by the matrix.
(a) $\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$
(b) $\begin{pmatrix} 1/2 & \sqrt{3}/2 \\ -\sqrt{3}/2 & 1/2 \end{pmatrix}$
(c) $\begin{pmatrix} 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \\ -1/\sqrt{3} & 1/\sqrt{3} & -1/\sqrt{3} \\ 1/\sqrt{3} & -1/\sqrt{6} & 1/\sqrt{2} \end{pmatrix}$

7.35 Prove that any 2D rotation matrix is orthogonal. Explain why this result makes sense. Explain without a mathematical proof why you would or wouldn’t expect the same result to hold in 3D.
7.36 For each transformation matrix described below, say whether the determinant is positive, negative, or zero, or if there isn’t enough information to tell.
(a) The identity matrix
(b) It has three eigenvalues: 5, −5, and −2.
(c) It transforms the vector î into −î.
(d) It transforms the unit square with corners at (0, 0) and (1, 1) into a unit square with corners at (0, 0) and (−1, 1).
(e) The vectors î + ĵ and î − ĵ get transformed into each other.

7.37 3D objects are often represented on computers by lists of polygons that can be drawn to approximate the surface of the shape. In this problem you’re going to represent and manipulate a sphere.
(a) Define a set of polygons representing a sphere with 8 lines of longitude and 8 lines of latitude. The first set should be a set of triangles going from the point (0, 0, 1) to points with 𝜃 = 𝜋∕8 and 𝜙 going from 0 to 2𝜋 in 8 steps. Each triangle will use two different values of 𝜙. The next polygons should be quadrilaterals connecting points with 𝜃 = 𝜋∕8 to points with 𝜃 = 𝜋∕4, again with two values of 𝜙 per polygon. Continue until you get a set of triangles that reach the south pole. Have the computer draw all these polygons.
(b) Redefine the list and redraw the sphere with 𝜙 and 𝜃 taking on 64 values each instead of 8.
(c) Write a matrix 𝐒 that stretches vectors by a factor of 2 in the y-direction. Apply 𝐒 to each polygon vertex and draw the result. What does it look like?
(d) Define a matrix 𝐑 that rotates vectors 45° around the z-axis. Apply 𝐑 to the vertices of the original polygons and draw the result. What does it look like?
(e) Apply the matrix 𝐑𝐒 to the original vertices and redraw the shape. Explain why it looks the way it does.
7.2 Tensors
You know that a physical quantity such as velocity or electric field can be represented by a
vector. That vector in turn can be specified by giving a sequence of numbers (components),
but those components change if you represent the same vector in a different basis. In this
section we introduce the “tensor,” a physical quantity that can be represented by a matrix.
Like a vector, a tensor is specified using different numbers in different bases, where these
representations all describe the same thing in different ways.
We first introduced this idea in Chapter 6, where we defined “similar matrices” as ones
that represent the same transformation in different bases. In this section we convert a trans-
formation matrix from one basis to another, and we also generalize the idea to tensors other
than transformations.
2. Write a matrix that converts vectors from the x ′ y′ basis to the xy basis. How is this matrix
related to the matrix 𝐂 that you just wrote?
3. Write a matrix 𝐌′ that takes a vector in the x′y′ basis and stretches it by a factor of 2
in the x′-direction. In other words $\mathbf{M}' \begin{pmatrix} V_{x'} \\ V_{y'} \end{pmatrix} = \begin{pmatrix} 2V_{x'} \\ V_{y'} \end{pmatrix}$.
4. Using your answers so far, write a matrix 𝐌 that takes a vector in the xy basis, converts
it to the x ′ y′ basis, stretches it by two in the x ′ -direction, and then converts back to the
xy basis.
See Check Yourself #46 in Appendix L
5. Describe in words what the matrix 𝐌 does to a vector V⃗ in the xy basis. Your answer
should make no reference to the x ′ y′ basis, but should just say what kind of stretching
or reflection it performs. Hint: The answer does not involve a rotation.
6. Choose a vector on which the transformation you described should look very simple,
and check that the matrix 𝐌 you wrote down does what you would expect to that
vector.
[Figure: a block of mass m on a ramp at angle θ, shown with the x, y axes and the x′, y′ axes.]
How did we determine those conversions? The mass was easy since a scalar is the same in any
coordinate system. The force can be converted by drawing right triangles, but we tackled it by
using a rotation matrix—the same matrix that converts any point from one system to another.
(In Problem 7.51 you’ll show that this gives the same results as drawing the triangles.)
If this all sounds too obvious (or too familiar) to discuss, here comes something new: those
two conversions are the definitions of the terms “scalar” and “vector.”
∙ In the first coordinate system, a certain value is specified by the number s. If this is a “scalar”
quantity then its value in the second coordinate system is given by:
s′ = s (7.2.1)
∙ In the first coordinate system, a different value is specified by the sequence of numbers v⃗.
If this is a “vector” quantity then its value in the second coordinate system is given by:
v⃗′ = 𝐂v⃗ (7.2.2)
Be cautious about the distinction between active and passive transformations when convert-
ing vectors. If your coordinate system rotates 30◦ counterclockwise, you must apply a 30◦
clockwise rotation to every object!
The definition tells us that a scalar is not just “a quantity that can be expressed by a
number”—it is a specific type of quantity that retains its value across different coordinate
systems. Less obviously, the definition tells us the same thing about a vector. You might stand
on our ramp and declare that “the block experiences a gravitational force of 98 N in this
direction (point your finger), and it accelerates at 4.9 m/s² in that direction (point a different
finger).” Changing your basis will change the numbers you use to specify the components
of those vectors, but it will not change the vectors themselves.
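A short calculation makes the distinction concrete. The sketch below converts the 98 N gravitational force into a coordinate system whose axes are rotated counterclockwise, using the passive (clockwise) rotation matrix described above. The 30° angle and all of the names are assumptions we have made for illustration; the text does not specify the ramp angle.

```python
import math

def conversion_matrix(theta):
    """Passive conversion: components in axes rotated counterclockwise by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s],
            [-s, c]]

def convert(C, v):
    """Apply Equation 7.2.2, v' = C v, to a 2D vector."""
    return [C[0][0] * v[0] + C[0][1] * v[1],
            C[1][0] * v[0] + C[1][1] * v[1]]

theta = math.radians(30)            # assumed rotation of the axes
C = conversion_matrix(theta)

force = [0.0, -98.0]                # gravity in the original xy axes (N)
force_new = convert(C, force)       # the same force in the rotated axes

magnitude_old = math.hypot(*force)
magnitude_new = math.hypot(*force_new)

print(force_new)
print(magnitude_old, magnitude_new)
```

The two components change, as they must for a vector, while the magnitude comes out the same in both systems (up to rounding), as it must for a scalar.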
Air is moving around a room in complicated ways, so at any given point there is an air
velocity. In a particular coordinate system this is represented by the numbers
v⃗ = v1 x̂1 + v2 x̂2 + v3 x̂3 .
Question: Is the air velocity a vector?
Answer:
Yes. The vector itself is independent of the coordinate system used: at a particular
point the air is moving this fast in this direction. But the coordinates v1 , v2 , and v3
depend on the coordinate system. If you switch to a different coordinate system, the
new components will be given by v⃗ ′ = 𝐂v⃗ where 𝐂 is the matrix that converts
positions from the first coordinate system to the second.
Question: Is the magnitude of the air velocity a scalar?
Answer:
Yes. Imagine that at one particular point the air is moving at 5 m/s. In any coordinate
system, the air at that point will still be moving at 5 m/s.
Question: Is the x-component of the air velocity a scalar?
Answer:
No, and this is a good example of why “a quantity that can be represented by a
number” is not an adequate definition of scalar. In different coordinate systems the
x-component will be different. To put it another way, if I tell you that the
x-component at a particular spot is 4 m/s that doesn’t tell you much if you don’t
know what coordinate system I’m using.
Question: The color of an object can be specified as a list of how much red, green,
and blue make it up. Is that list a vector?
Answer:
No. If you change coordinates each of those colors stays the same, so that is a list of
three scalars rather than a vector.
Here is a subtler example: you probably think of the cross product of two vectors as a
vector, but you’ll show in Problem 7.50 that it doesn’t obey Equation 7.2.2 under a reflection.
Given two vectors A⃗ and B,⃗ the quantity A⃗ ⋅ B⃗ is a scalar, but A⃗ × B⃗ is not actually a vector. (It’s
not a scalar either, of course. Technically it’s a “pseudovector.”)
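You can preview that behavior with one numerical case. The sketch below reflects two vectors across the yz-plane and compares “convert, then take the cross product” with “take the cross product, then convert.” The vectors, the reflection, and the helper names are arbitrary choices of ours, and a single example is of course not a proof.

```python
def apply(M, v):
    """Multiply a 3x3 matrix by a 3-component vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

C = [[-1, 0, 0],     # reflection across the yz-plane
     [0, 1, 0],
     [0, 0, 1]]
A = [1, 2, 3]
B = [4, 5, 6]

cross_of_converted = cross(apply(C, A), apply(C, B))
converted_cross = apply(C, cross(A, B))   # what Equation 7.2.2 would predict

print(cross_of_converted, converted_cross)
print(dot(A, B), dot(apply(C, A), apply(C, B)))
```

For this reflection the two cross-product results differ by an overall sign, while the dot product is unchanged.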
Converting a Transformation
Jack is playing a video game and his character has just encountered “Rubber Woman,” who
has the power to stretch her body vertically. When Rubber Woman uses her power Jack sees
the following transformation occur on the screen.
[Figure: the stretch 𝐒 applied to Rubber Woman.]
Suppose you are the programmer writing this video game. The
position of each point on Rubber Woman’s body is stored as a 2D
vector. You have the computer multiply each of those vectors by
the stretching matrix $\mathbf{S} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$ and then redraw the screen.3
Suppose, however, that before Rubber Woman performs this
stretch Jack rotates his character’s head 30° to the left. On his
screen he sees everything in the scene rotate 30° to the right. Once again, you the programmer
know what to do: the matrix $\mathbf{C} = \begin{pmatrix} \cos 30^\circ & \sin 30^\circ \\ -\sin 30^\circ & \cos 30^\circ \end{pmatrix}$ converts each position
vector from the old coordinates to the new rotated ones. (“Conversion” in this context is
synonymous with “passive transformation.”)
Now comes the interesting part. When Rubber Woman uses her power you have to stretch
her body exactly as before but from a different perspective. How do you implement that
transformation?
[Figure: the stretch 𝐒′ as seen from the rotated perspective.]
You can answer that question with a three-step process. First
convert each point’s position back to the original basis. (This is
accomplished by the matrix 𝐂⁻¹.) Then use matrix 𝐒 to find the
coordinates of the stretched Rubber Woman, still in the original
basis. Finally convert those coordinates into the new basis
using matrix 𝐂. Remembering the right-to-left order of matrix
multiplication, we write:
𝐒′ = 𝐂𝐒𝐂⁻¹ (7.2.3)
3 Here and below we are assuming for simplicity that the origin of your coordinate system is at Rubber Woman’s
feet. We are also ignoring the fact that many video games define and transform objects in a three-dimensional space
and then use complicated algorithms to render them properly on a two-dimensional screen.
We are not suggesting that you need to go through a computationally inefficient three-step
process for each point on Rubber Woman’s body. The entire process of converting to the
original basis, performing the stretch in that basis, and then converting back to the new
basis can be effected by one matrix, and Equation 7.2.3 gives the formula for finding that
matrix. In short, 𝐒′ performs the stretch in the head-rotated basis just as 𝐒 does in the original
basis.
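As a sketch of how a programmer might build that single matrix, the plain-Python fragment below assembles 𝐒′ from Equation 7.2.3 for the 30° head rotation. The helper names are ours. Rather than quoting the entries of 𝐒′, the script checks what the matrix does: the direction that was “up” before the head rotation should be doubled, and the direction that was “right” should be left alone.

```python
import math

def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(M, v):
    """Multiply a 2x2 matrix by a 2-component vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
C = [[c, s], [-s, c]]          # old coordinates -> head-rotated coordinates
C_inv = [[c, -s], [s, c]]      # inverse of a rotation is its transpose
S = [[1, 0], [0, 2]]           # vertical stretch in the original basis

S_prime = matmul(matmul(C, S), C_inv)   # Equation 7.2.3

up_new = apply(C, [0, 1])      # the old "up" direction, in the new basis
right_new = apply(C, [1, 0])   # the old "right" direction, in the new basis

print(apply(S_prime, up_new), [2 * x for x in up_new])   # should agree
print(apply(S_prime, right_new), right_new)              # should agree
```

Once 𝐒′ has been computed, each point on the screen needs only one matrix multiplication.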
Answer:
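First let’s figure out what matrix performs transformation 𝐓 in îĵ space. The stretch is
$\begin{pmatrix} 2 & 0 \\ 0 & 6 \end{pmatrix}$ and the rotation is $\begin{pmatrix} \sqrt{3}/2 & -1/2 \\ 1/2 & \sqrt{3}/2 \end{pmatrix}$. So the matrix that performs both
operations (remember right-to-left order!) is:
$$\mathbf{T}_{\hat{i}\hat{j}} = \begin{pmatrix} \sqrt{3}/2 & -1/2 \\ 1/2 & \sqrt{3}/2 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 6 \end{pmatrix} = \begin{pmatrix} \sqrt{3} & -3 \\ 1 & 3\sqrt{3} \end{pmatrix}$$
The matrix that converts from a V⃗1V⃗2 representation to îĵ is $\begin{pmatrix} 3 & 10 \\ 1 & 4 \end{pmatrix}$. (This is
𝐂⁻¹ in the formula above.) The inverse of that matrix, which goes back to V⃗1V⃗2,
is $\frac{1}{2}\begin{pmatrix} 4 & -10 \\ -1 & 3 \end{pmatrix}$. So the matrix that performs transformation 𝐓 in V⃗1V⃗2
space is:
$$\mathbf{T}_{\vec{V}_1\vec{V}_2} = \frac{1}{2}\begin{pmatrix} 4 & -10 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} \sqrt{3} & -3 \\ 1 & 3\sqrt{3} \end{pmatrix} \begin{pmatrix} 3 & 10 \\ 1 & 4 \end{pmatrix} = \begin{pmatrix} -21 - 9\sqrt{3} & -74 - 40\sqrt{3} \\ 6 + 3\sqrt{3} & 21 + 13\sqrt{3} \end{pmatrix}$$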
You could of course have a computer do that last multiplication for you. But what a com-
puter can’t do—what you need to be able to do—is see that last step as saying “first convert
from V⃗ 1 V⃗ 2 to îĵ representation, then perform the transformation in that space, and then
convert back.” The final matrix therefore performs transformation 𝐓 in V⃗ 1 V⃗ 2 space.
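If you do hand that multiplication to a computer, it might look like the sketch below (plain Python, with variable names of our own choosing). It multiplies the three matrices in the order described and compares the result with the closed form given in the answer.

```python
import math

def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r3 = math.sqrt(3)

T_ij = [[r3, -3],
        [1, 3 * r3]]            # the transformation in the i-j basis
to_ij = [[3, 10],
         [1, 4]]                # converts V1-V2 components to i-j components
to_V = [[2.0, -5.0],
        [-0.5, 1.5]]            # inverse of to_ij: (1/2) [[4, -10], [-1, 3]]

# Right to left: convert to i-j, transform there, convert back to V1-V2.
T_V = matmul(matmul(to_V, T_ij), to_ij)

closed_form = [[-21 - 9 * r3, -74 - 40 * r3],
               [6 + 3 * r3, 21 + 13 * r3]]

print(T_V)
print(closed_form)
```

The two printed matrices should agree to within floating-point rounding.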
Tensors
The transformation in our computer game example has some important features in common
with a vector. Like a vector, it has an essential nature that is independent of a coordinate
system (it stretches objects by a certain amount in a certain direction). Also like a vector, it
is specified by a set of numbers that must be converted as we switch from one coordinate
system to another.
Remember that Equation 7.2.2 tells us how to convert a vector between bases, and that
conversion is the definition of a vector. Because 𝐒 follows a different conversion rule, it is a
different type of quantity. We call such a quantity a “rank two tensor.”
Definition: Tensor
Consider two different coordinate systems and an orthogonal conversion matrix 𝐂 that converts
any single point from the first system to the second. In the first coordinate system, a certain value is
specified by the sequence of numbers 𝐓. If this is a “rank two tensor” (often simply called a “tensor”)
then its value in the second coordinate system is given by:
𝐓′ = 𝐂𝐓𝐂⁻¹ (7.2.4)
A scalar is a “rank 0 tensor” and a vector is a “rank 1 tensor.” Ranks higher than two are
important in the theory of general relativity, but are less common in most other fields. Our
explanation above is designed to convince you that a transformation is a rank 2 tensor. But
the concept is broader than that—that is, there are quantities other than transformation
matrices that convert according to Equation 7.2.4.
As an example, suppose you want to know how the air velocity changes from one place in
the room to another. Since v⃗ is given by three components vx , vy , and vz and we can measure
the rates of change of each component independently in three different directions, there are
nine independent spatial derivatives we can calculate. For instance, if you held a pinwheel4
facing the x-direction, 𝜕vx ∕𝜕z would tell you how much it would speed up or slow down as you
moved it slowly upward. The nine components of this “velocity gradient” can be arranged
in a matrix, and you’ll show in Problem 7.56 that this matrix transforms like a tensor under
coordinate transformations. That means the velocity gradient is a tensor.
The transformation rule for tensors tells you how its components change from one basis
to another, but the physical thing that the tensor represents is the same regardless of basis.
Imagine standing at a given point in the room, with your left hand pointing at the ceiling
and your right hand at the door. You could ask “How much is this (left hand) component of
the velocity changing as you move in that (right hand) direction?” The velocity gradient gives
you a way of answering that question, and the answer is the same no matter what coordinate
system you use.
Index Notation
The components of a matrix 𝐌 can be expressed in “index notation” as $M_{ij}$, which refers
to the element in the ith row and jth column. For example, a 2 × 2 matrix can be written
as $\mathbf{M} = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}$. Expressions involving matrix multiplication can often be expressed
compactly with this notation. For example, the equation $A_{ij} = \sum_k B_{ik} C_{kj}$ completely describes
the matrix product 𝐀 = 𝐁𝐂. You’ll show in Problem 7.52 that for bases related by orthogonal
transformations the rules for transforming vectors and tensors can be expressed in index
notation as:
$$v_i' = \sum_j C_{ij} v_j \qquad (7.2.5)$$
$$T_{ij}' = \sum_k \sum_l C_{ik} C_{jl} T_{kl} \qquad (7.2.6)$$
Index notation makes it easier to generalize the conversion rule to higher-rank tensors.
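The index form maps naturally onto nested sums in code. The sketch below (plain Python; the function names, the 30° rotation, and the sample tensor are assumptions made for illustration) implements Equations 7.2.5 and 7.2.6 literally and checks the tensor rule against the matrix form 𝐂𝐓𝐂⁻¹, using the fact that the inverse of an orthogonal matrix is its transpose.

```python
import math

def transform_vector(C, v):
    """Equation 7.2.5: v'_i = sum over j of C_ij v_j."""
    n = len(v)
    return [sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]

def transform_tensor(C, T):
    """Equation 7.2.6: T'_ij = sum over k and l of C_ik C_jl T_kl."""
    n = len(T)
    return [[sum(C[i][k] * C[j][l] * T[k][l]
                 for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

def matmul(P, Q):
    """Product of two square matrices stored as nested lists."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(M):
    """Turn the rows of M into columns."""
    return [list(column) for column in zip(*M)]

c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
C = [[c, s], [-s, c]]          # an orthogonal conversion matrix
T = [[1, 2], [3, 4]]           # arbitrary sample tensor components

index_form = transform_tensor(C, T)
matrix_form = matmul(matmul(C, T), transpose(C))   # C T C^-1 with C^-1 = C^T

print(index_form)
print(matrix_form)
```

A rank 3 tensor would simply need one more factor of 𝐂 and one more summed index in `transform_tensor`.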
In fields such as relativity that use a lot of tensors, expressions such as these can be
expressed more easily with “Einstein summation notation.” Whenever a single index
appears twice in the same term, it is automatically summed over all its possible values. For
instance the rule for transforming a rank 2 tensor would be written Tij′ = Cik Cjl Tkl , with an
implied sum over k and l since they each appear twice in the term on the right-hand side. It
should be clear from context what the limits of those sums are, e.g., from 1 to 3 if these are
3 × 3 matrices. We will not use Einstein notation in this text, but you should be aware of it if
you read articles or books with tensor equations.
Stepping Back
The terms scalar, vector, and tensor are defined by their behavior under coordinate transfor-
mation. All three terms therefore have meaning only with respect to specific transformations.
To illustrate that point, let’s return to a couple of our examples from earlier in this section.
∙ The color of an object is the same in any coordinate system, so color is not a vector
but three scalars—with respect to spatial transformations. On the other hand color
transforms like a generalized vector when we convert from “red, green, and blue” to
“hue, saturation and lightness.”
∙ The speed of an object is the same in any coordinate system, so speed is a scalar—
provided those coordinate systems are not moving with respect to each other. But an
observer on the ground and an observer on a passing train will not agree on the speeds
of various objects they can both see. Under that transformation, speed is not a scalar.
For orthogonal transformations of the spatial variables in Cartesian coordinates the def-
initions we’ve given here are adequate, but in more general cases it’s necessary to define
different types of vectors and tensors that obey different transformation laws. Nonetheless,
the concepts you’ve learned here will provide a strong basis if you go on to study more
advanced tensor analysis.
We should note that Equation 7.2.3 for converting a transformation is very general
because it doesn’t depend on what type of conversion the matrix 𝐂 represents. The index
notation form, Equation 7.2.6, does not generalize to all situations, however.
stretches it by 4 in the y′-direction, and then converts it back to the xy basis.
(f) Suppose the matrix you wrote in Part (e) acts upon the vector V⃗1 = î + 2ĵ. Predict the result of this operation without doing any calculations, and explain your reasoning.
(g) Act with your matrix on V⃗1 and verify that it did what you predicted. If not, figure out whether your calculations or your matrix were wrong.
(h) Suppose the matrix you wrote in Part (e) acts upon the vector V⃗2 = 2î − ĵ. Predict the result of this operation without doing any calculations, and explain your reasoning.
(i) Act with your matrix on V⃗2 and verify that it did what you predicted. If not, figure out whether your calculations or your matrix were wrong.

7.39 In an example in Section 6.9 we converted the transformation matrix $\begin{pmatrix} -4 & -2 \\ -3 & 1 \end{pmatrix}$ from an “x is this and y is that” representation to “this much of the first normal mode and that much of the second.” Do this conversion using the method described in this section. Hint: Equation 6.9.7 gives you the conversion between the two coordinate systems.

7.40 Using the methods described in this section, write a matrix that stretches all vectors by a factor of 2 along the line y = −x. Verify that your matrix works by testing it on at least two vectors. It may help to first work through Problem 7.38 as a model.

7.41 The Explanation (Section 7.2.2) introduced Rubber Woman, the spectacular heroine who stretches her body in a video game. It also introduced Jack, the player whose head is rotated so that the entire game is seen at a 30° tilt.
(a) Find the matrix 𝐒′ that stretches Rubber Woman in Jack’s head-rotated basis.
(b) What are the eigenvalues and eigenvectors of the matrix you calculated in Part (a)? Explain how you know the answer, not based on the methods described in Chapter 6, but based on what the matrix does.

7.42 (a) Using the methods described in this section, write a matrix that reflects all vectors about the line y = x∕√3. (Hint: In what coordinate system is this an easy transformation?) Verify that your matrix works by testing it on at least two vectors.
(b) What is the determinant of the matrix you calculated in Part (a)? Explain how you know the answer, not based on the formula in Chapter 6, but based on what the matrix does. (You can of course use the formula to make sure you’re right!)

7.43 The matrix $\mathbf{T} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ performs a transformation on any vector that is represented as a linear combination of î and ĵ. Write a matrix that performs the same transformation on a vector that is represented as a linear combination of V⃗1 = (1∕2)î + (√3∕2)ĵ and V⃗2 = −(√3∕2)î + (1∕2)ĵ.

7.44 (a) The matrix $\mathbf{T} = \begin{pmatrix} -19 & 12 \\ 12 & -12 \end{pmatrix}$ performs a transformation on any vector that is represented as a linear combination of î and ĵ. Write a matrix that performs the same transformation on a vector that is represented as a linear combination of V⃗1 = (3∕5)î + (4∕5)ĵ and V⃗2 = −(4∕5)î + (3∕5)ĵ.
(b) In this case you get a very interesting matrix. What does that tell you about this particular transformation in that particular basis?

7.45 Let T be a transformation that stretches any vector by a factor of 3 horizontally and by a factor of 4 vertically.
(a) The xy-coordinate system is the natural basis for this transformation because the horizontal stretch lies directly along the x-axis and the vertical stretch directly along the y-axis. Write a matrix 𝐌 that represents the transformation T in the xy-coordinate system. (This should be very quick and easy.)
(b) Write a matrix 𝐌′ that expresses the same transformation in a coordinate system rotated 45° with respect to the x- and y-axes.
(c) To test your matrix, convert the vector î + ĵ into the new space, and then use your matrix. The result should be translatable back to 3î + 4ĵ.

7.46 The matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ reverses the x- and y-coordinates of a point. What does this transformation look like in a coordinate system that is rotated 45° from the regular axes? Why?

7.47 Prove using a tensor transformation that a 2D rotation matrix is unaffected by a rotation of the axes. Explain why this makes sense.

7.48 In this problem you’ll find a matrix that rotates 3D vectors about the line L, defined as z = x in the xz-plane.
systems by Equation 7.2.4. For instance, if we define a tensor by saying “it effects this transformation” then as we convert it from one coordinate system to another it should continue to effect the same transformation. In this problem we have defined a matrix by giving all its elements. For this matrix to be a tensor it must retain this definition in different coordinate systems. Show that the tensor transformation law leaves the components of the Kronecker delta unchanged. Hint: it will be easier to use the transformation law in matrix form rather than index form.
7.54 The Levi-Civita Symbol The “Levi-Civita symbol” 𝜖ijk is often used to simplify complicated expressions. The indices all run from 1 to 3, and 𝜖ijk is equal to 0 if any two of them are equal, 1 if they are in the order 123, 231, or 312, and −1 if they are in the order 132, 321, or 213.
(a) As an example of the use of the Levi-Civita symbol, the determinant of a 3 × 3 matrix can be written |𝐌| = ∑i ∑j ∑k 𝜖ijk M1i M2j M3k. Write out the terms of this sum and show that it reproduces the usual formula for a determinant. (The definition of the Levi-Civita symbol and this formula for determinants can be generalized to more than three dimensions.)
(b) Assume that in some basis B, 𝜖ijk is defined as we have described. Let 𝐂 be the matrix that transforms vectors from B to another basis B′. Assuming 𝜖ijk transforms according to Equation 7.2.7 write 𝜖′abc in terms of the components of 𝐂 and 𝜖ijk. Your answer should contain three sums.
(c) Use the transformation you wrote in Part (b) to write all the non-zero terms in the sums for 𝜖′112. You should find that all the terms cancel.
(d) Similarly, write out the terms for 𝜖′123. Simplify your answer. You should assume that 𝐂 is a rotation matrix, meaning |𝐂| = 1.
(e) Still assuming |𝐂| = 1, find 𝜖′321.
(f) Based on the three examples that you have gone through by hand, write a general rule for what happens to any element of the Levi-Civita matrix under a rotation.
(g) The Levi-Civita symbol is not defined by the tensor transformation laws, but rather by the rule that in any basis 𝜖ijk has the values described above. You should have found so far that this is consistent with the tensor transformation laws for a rotation. Show that it is not consistent with those laws for a reflection. (This should require no new calculations.) An object that obeys tensor transformation laws under rotation but not reflection is called a “pseudotensor.” See Problem 7.50.
7.55 Velocity Gradient A gas is flowing throughout a region with velocity v⃗ = ay² î.
(a) Calculate the components of the velocity gradient tensor. Write your answer as a matrix.
(b) Define a new set of axes rotated 30° counterclockwise from the original set. Express the velocity gradient tensor in this new basis by using a tensor transformation. Your answer will be a function of y.
(c) Express x and y in terms of x′ and y′ and use that to write the transformed velocity gradient as a function of x′ and y′.
(d) Express the velocity field in this new basis. Give your answer in terms of x′ and y′.
(e) Recalculate the velocity gradient from the new velocity field and verify that it matches your answer to Part (b).
7.56 In this problem you will prove that the velocity gradient Vij = 𝜕vi∕𝜕xj is a tensor. Consider an orthogonal matrix 𝐂 that converts coordinates from one basis to another. Since v⃗ and x⃗ are both vectors, v⃗′ = 𝐂v⃗ and x⃗′ = 𝐂x⃗. Your goal here is to find the components Vij′ = 𝜕vi′∕𝜕xj′ and show that they are related to the components of 𝐕 by Equation 7.2.7.
(a) Write the equation v⃗′ = 𝐂v⃗ in index notation. Use that equation to rewrite Vij′ as a sum over the derivatives 𝜕va∕𝜕xj′. (Notice that x is primed and v is not. We’re getting the primes off one step at a time.)
(b) Use the chain rule to write 𝜕va∕𝜕xj′ as a sum over the derivatives 𝜕va∕𝜕xb. Your answer will involve derivatives of unprimed x-components with respect to primed x-components.
(c) Write a matrix equation giving x⃗ in terms of x⃗′. Using the fact that 𝐂 is orthogonal write your matrix equation in terms of 𝐂T instead of 𝐂−1. Then write this equation in index notation.
(d) Because xi′ = ∑a Cia xa, the derivative 𝜕xi′∕𝜕xj just equals Cij. In a similar way, use your answer to Part (c) to find 𝜕xb∕𝜕xj′.
(e) Putting it all together, show that Vij′ is related to Vab in the way you would expect for a tensor.
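The determinant formula in Problem 7.54(a) can be spot-checked numerically before you expand it by hand. The sketch below is only an illustration under our own assumptions: the function names, the closed-form expression used for 𝜖ijk, and the test matrix are not from the text, and the indices run from 0 to 2 rather than 1 to 3 because that is how Python counts.

```python
from itertools import product

def levi_civita(i, j, k):
    """Levi-Civita symbol for 0-based indices 0, 1, 2.

    The product below is +2 for cyclic orders, -2 for anticyclic orders,
    and 0 whenever two indices are equal, so halving it gives +1, -1, 0.
    """
    return (i - j) * (j - k) * (k - i) // 2

def det_levi_civita(m):
    """Triple sum over i, j, k of eps_ijk * M[0][i] * M[1][j] * M[2][k]."""
    return sum(
        levi_civita(i, j, k) * m[0][i] * m[1][j] * m[2][k]
        for i, j, k in product(range(3), repeat=3)
    )

def det_cofactor(m):
    """Usual 3 x 3 determinant, expanded along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

# Hypothetical test matrix, chosen only for illustration.
M = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]

print(det_levi_civita(M), det_cofactor(M))
```

Agreement on one matrix is not a proof, of course; the problem still asks you to write out the six non-zero terms and match them to the usual formula.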
7.61 [This problem depends on Problem 7.57.] Length Contraction and Time Dilation
(a) In frame O, object A is at rest at the origin and object B is at rest a distance L away in the x-direction. Write four-vectors for the time and position of A and B, leaving t as a variable in the vectors.
(b) Convert your position vectors to a frame O′ moving at speed v with respect to O in the x-direction. What is the distance between objects A and B in frame O′?
(c) A clock is at rest at the origin in frame O. Event C occurs when the clock shows time 0 and event D occurs when it shows a later time t = 𝜏. Write four-vectors for the time and position of events C and D in frame O. Then convert the vectors to frame O′. What is the time difference between the two events in frame O′? What is the spatial distance?
7.62 [This problem depends on Problem 7.57.] The Stress-Energy Tensor The “stress-energy tensor” 𝐓 describes the density and pressure of a fluid. The upper-left-hand element (the “time-time” component T00) represents the density of the fluid and the other diagonal elements equal the “normal stress.” If the normal stress is the same in all three directions then it is simply the pressure. For a “perfect fluid,” meaning an ideal gas in equilibrium, the stress-energy tensor is diagonal in its rest frame: T00 = 𝜌, T11 = T22 = T33 = P and all other components are zero. Find the stress-energy tensor for a perfect fluid in a frame moving at speed v relative to the fluid’s rest frame and use that to find the density of the fluid in that moving frame.
7.63 The trace of a matrix is the sum of the elements along the main diagonal: Tr(𝐌) = ∑i Mii. Prove that the trace is invariant to an orthogonal transformation. Hint: remember that if 𝐂 effects an orthogonal transformation then 𝐂T = 𝐂−1.
Representation 1
Rule: The numbers a, b, c, and d represent coefficients in the polynomial ax³ + bx² + cx + d.
Example: <5, −2, 𝜋, 3∕4> represents 5x³ − 2x² + 𝜋x + 3∕4.
Here is a much less obvious coding. We define four special polynomials as follows.
\[
P_0(x) = 1, \qquad P_1(x) = x, \qquad P_2(x) = \frac{1}{2}\left(3x^2 - 1\right), \qquad P_3(x) = \frac{1}{2}\left(5x^3 - 3x\right)
\]
These are not the polynomial we want to tell you about; these are the code we will use to
communicate our polynomial, as follows.
Representation 2
Rule: A0, A1, A2, and A3 represent coefficients in the polynomial A0P0 + A1P1 + A2P2 + A3P3.
Example: <5, −2, 𝜋, 3∕4> represents 5 − 2x + (𝜋∕2)(3x² − 1) + (3∕8)(5x³ − 3x).
Representation 2 is based on the “Legendre polynomials.” You will encounter these func-
tions in another form in Problem 7.82, and we will discuss them in more detail in Chapter 12.
But here we are less interested in the particular properties of Representation 2 than in point-
ing out that both representations accomplish the same goal: they give us a way of describing
any third-order polynomial using four numbers. They also give us an excuse for introducing
some important terms.
∙ The set of all third-order polynomials is an example of a “vector space.” Any given third-order polynomial is a “vector” in this space.
∙ The functions x³, x², x, and 1 are themselves third-order polynomials; that is, they are vectors in this space. Any other vector in this space can be expressed as a linear combination of those four vectors (that’s what we did in Representation 1). Those vectors are therefore said to “span” this vector space.
∙ None of the functions x³, x², x, or 1 can be expressed as a linear combination of the other three. Those four vectors are therefore said to be “linearly independent.”
∙ If you remove any vector (x³, x², x, or 1) from that list, the resulting list of three vectors no longer spans the space; we need all four. These four vectors are therefore said to provide a “basis” for this vector space. Each of these four vectors is a “basis vector” in Representation 1.
∙ The four vectors P0–P3 provide an alternative basis for the space of third-order polynomials. You could come up with many such other bases, but each would require four basis vectors. The space of third-order polynomials is therefore said to have “dimension” four.
We must emphasize that “vector” is not a fancy word for “list of numbers.” A list of num-
bers can represent a vector, but the representation depends on the basis. Above we saw one
list <5, −2, 𝜋, 3∕4> represent two completely different vectors in different bases. Similarly,
<5, 3, −1, 1> in Representation 1 and <2, 2, 2, 2> in Representation 2 are two different lists of
numbers, but they represent the same vector.
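If you want to check a claim like that last one without doing the algebra, a few lines of code will do it. The sketch below is our own illustration, not part of the definitions above: the names `LEGENDRE` and `rep2_to_rep1` are made up for this example, and exact fractions are used so that the factors of 1∕2 introduce no rounding error.

```python
from fractions import Fraction

# The four code polynomials P0..P3, each stored as monomial coefficients
# in the order [constant, x, x^2, x^3].
LEGENDRE = [
    [Fraction(1), Fraction(0), Fraction(0), Fraction(0)],         # P0 = 1
    [Fraction(0), Fraction(1), Fraction(0), Fraction(0)],         # P1 = x
    [Fraction(-1, 2), Fraction(0), Fraction(3, 2), Fraction(0)],  # P2 = (3x^2 - 1)/2
    [Fraction(0), Fraction(-3, 2), Fraction(0), Fraction(5, 2)],  # P3 = (5x^3 - 3x)/2
]

def rep2_to_rep1(coeffs):
    """Convert <A0, A1, A2, A3> (Representation 2) to <a, b, c, d> (Representation 1)."""
    monomial = [Fraction(0)] * 4
    for a_n, p_n in zip(coeffs, LEGENDRE):
        for power, c in enumerate(p_n):
            monomial[power] += Fraction(a_n) * c
    # Representation 1 lists the x^3 coefficient first, so reverse the order.
    return monomial[::-1]

print(rep2_to_rep1([2, 2, 2, 2]))
```

The same function applied to any other list shows how differently the two codes read the same four numbers.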
Third-order polynomials are just one example of the very broad topic of vector spaces.
Below we provide a general definition of that term. As you read each requirement in the
definition we encourage you to see how it applies to our example of third-order polynomials,
and also to the more familiar example of spatial vectors.
1. Closure under addition: The sum 𝐚 + 𝐛 of any two vectors in the space is itself a vector in the space.
2. Commutativity of addition: 𝐚 + 𝐛 = 𝐛 + 𝐚.
3. Associativity of addition: (𝐚 + 𝐛) + 𝐜 = 𝐚 + (𝐛 + 𝐜).
4. Additive identity: There exists a vector 𝟎 such that 𝟎 + 𝐚 = 𝐚 for all 𝐚 in the space.
5. Additive inverse: For each vector 𝐚 in the space there exists a vector −𝐚 such that 𝐚 + (−𝐚) = 𝟎.
6. Closure under scalar multiplication: For any vector 𝐚 in the space and any scalar k, k𝐚 is a vector in the space.
7. Distributivity of scalar multiplication with respect to vector addition: k(𝐚 + 𝐛) = k𝐚 + k𝐛.
8. Distributivity of scalar multiplication with respect to scalar addition: (k + p)𝐚 = k𝐚 + p𝐚.
9. Associativity of scalar multiplication: k(p𝐚) = (kp)𝐚.
10. Scalar identity: There exists a scalar 1 such that 1𝐚 = 𝐚 for all 𝐚 in the space.
If the set of scalars is taken to be the real numbers then the space is a “real vector space.” If it’s the complex numbers then the space is a “complex vector space.”
The reason for defining “vector space” in the abstract is that you can prove important results from those ten properties. Such results are then guaranteed to be true for all vector spaces.
For instance, suppose you find a set of n linearly independent vectors 𝐯1, 𝐯2 … 𝐯n that span a given vector space. This is enough to prove that the space is n-dimensional, which in turn guarantees two facts.
∙ It is impossible to span the space with fewer than n vectors.
∙ Any set of n linearly independent non-zero vectors in the space defines a basis for it.
These results can be proven from the existence of any set of n linearly independent vectors that span the space. (See Problem 7.69.) And while these results are important in themselves, they also provide a more general example of the use of a concept such as “vector space.” We do not have to prove these results for one particular vector space or one particular kind of vector space; we prove them once for all of them.
Orthonormal Bases
Two vectors are “orthogonal” if their inner product is zero. (For spatial vectors the dot
product is proportional to the cosine of the angle between the vectors, so “orthogonal” is
essentially synonymous with “perpendicular.”) A set of basis vectors that are all mutually
orthogonal is called an “orthogonal basis.” A vector with norm 1 is “normalized.” An orthogonal basis with all vectors normalized is an “orthonormal basis.” (Think of î, ĵ, and k̂.)
In an orthonormal basis, the inner product looks like the usual dot product. That is, consider two vectors 𝐚 and 𝐛 represented in the basis of vectors 𝐞1, 𝐞2 … 𝐞n.
\[
\begin{aligned}
\mathbf{a} &= \langle a_1, a_2, \ldots \rangle = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + \ldots \\
\mathbf{b} &= \langle b_1, b_2, \ldots \rangle = b_1\mathbf{e}_1 + b_2\mathbf{e}_2 + \ldots
\end{aligned}
\tag{7.3.1}
\]
If that basis is orthonormal, then (𝐚, 𝐛) = a1b1 + a2b2 + … + anbn. You will prove this result in Problem 7.72.
Problems 7.81–7.82 outline the “Gram-Schmidt” process for deriving an orthonormal
basis for a vector space.
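As a preview of what those problems ask you to do by hand, here is a minimal sketch of the process for spatial vectors with the ordinary dot product. It is an illustration under our own assumptions rather than the text’s procedure verbatim: the function names and the three starting vectors are hypothetical (they are deliberately not the ones used in the problems), and each projection is subtracted from the partly reduced vector rather than from the original, a common variant that gives the same result in exact arithmetic.

```python
import math

def dot(a, b):
    """Ordinary dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors):
    """Return an orthonormal basis spanning the same space as `vectors`."""
    basis = []
    for u in vectors:
        v = list(u)
        # Subtract the component of v along each basis vector found so far.
        for e in basis:
            component = dot(v, e)
            v = [vi - component * ei for vi, ei in zip(v, e)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:
            raise ValueError("the input vectors are not linearly independent")
        # Normalize what is left.
        basis.append([vi / norm for vi in v])
    return basis

# Hypothetical starting basis for 3D space.
start = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
e = gram_schmidt(start)
for row in e:
    print(row)
```

Taking the dot product of every pair of output vectors is a quick way to confirm that the result really is orthonormal.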
The “transpose” of matrix 𝐌 (usually written 𝐌T) comes from exchanging its rows for columns.
If a matrix equals its own transpose it is “symmetric” which implies that its eigenvectors are orthogonal to each other.
A matrix is orthogonal if and only if its transpose is its inverse: 𝐌𝐌T = 𝐈.
Every eigenvalue of an orthogonal matrix has modulus 1. So an orthogonal matrix must have a determinant of ±1 but the converse is not true: a non-orthogonal matrix may also have a determinant of ±1.

The “adjoint” of matrix 𝐌 (usually written 𝐌† which is pronounced “𝐌 dagger”) comes from taking its transpose and taking the complex conjugate of every element.
If a matrix equals its own adjoint it is “Hermitian” which implies that its eigenvectors are orthogonal to each other.
A matrix is unitary if and only if its adjoint is its inverse: 𝐌𝐌† = 𝐈.
Every eigenvalue of a unitary matrix has modulus 1. So a unitary matrix must have a determinant with modulus 1 but the converse is not true: a non-unitary matrix may also have a determinant of modulus 1.
Question: Which of the following matrices are Hermitian? Which are unitary?
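\[
\mathbf{A} = \begin{pmatrix} 3 & 2-i \\ 2+i & -2 \end{pmatrix} \qquad
\mathbf{B} = \begin{pmatrix} i & 0 \\ 0 & -1 \end{pmatrix} \qquad
\mathbf{C} = \begin{pmatrix} -(1/2)+(i/2) & (1/2)+(i/2) \\ (1/2)+(i/2) & -(1/2)+(i/2) \end{pmatrix}
\]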
Answer:
We can see if they are Hermitian by inspection. The elements on the main diagonal
must all be real, and each other element must be the complex conjugate of the one
across the main diagonal from it. Thus 𝐀 is Hermitian and 𝐁 and 𝐂 are not.
To see if a matrix is unitary is usually harder, but in the case of 𝐁 it can be done by inspection. Since the matrix is diagonal we can read off that the eigenvalues are i and −1. Both have modulus 1, and for a diagonal matrix that is enough to make 𝐁 unitary. For the other two it’s easiest to directly check if their adjoints are their inverses. To take the adjoint switch each element with the one across the main diagonal and then take its complex conjugate. Since 𝐀 is Hermitian (also known as “self-adjoint”) 𝐀† = 𝐀.
\[
\mathbf{A}\mathbf{A}^{\dagger} =
\begin{pmatrix} 3 & 2-i \\ 2+i & -2 \end{pmatrix}
\begin{pmatrix} 3 & 2-i \\ 2+i & -2 \end{pmatrix}
= \begin{pmatrix} 14 & 2-i \\ 2+i & 9 \end{pmatrix}
\]
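Checks like these are mechanical enough to hand to a computer. The sketch below is one way to do it, assuming matrices are stored as lists of rows and using Python’s built-in complex numbers; the helper names (`adjoint`, `matmul`, `is_hermitian`, `is_unitary`) are our own and are not taken from any library.

```python
def adjoint(m):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[complex(m[r][c]).conjugate() for r in range(len(m))]
            for c in range(len(m[0]))]

def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def close(a, b, tol=1e-12):
    """True if every pair of corresponding elements agrees to within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def is_hermitian(m):
    # Hermitian means the matrix equals its own adjoint.
    return close(m, adjoint(m))

def is_unitary(m):
    # Unitary means the matrix times its adjoint is the identity.
    n = len(m)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return close(matmul(m, adjoint(m)), identity)

# The three matrices from the question above.
A = [[3, 2 - 1j], [2 + 1j, -2]]
B = [[1j, 0], [0, -1]]
C = [[-0.5 + 0.5j, 0.5 + 0.5j], [0.5 + 0.5j, -0.5 + 0.5j]]

for name, m in (("A", A), ("B", B), ("C", C)):
    print(name, "Hermitian:", is_hermitian(m), "unitary:", is_unitary(m))
```

The tolerance in `close` is there because products of floating-point numbers rarely come out exactly equal to 0 or 1; for the matrices here the arithmetic happens to be exact.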
(d) The set of all possible box sizes (height, width, length)
7.65 Explain why each of the following sets does not qualify as a vector space.
(a) The set of all integers from 0 through 10, where the vector space operation of “addition” is regular numerical addition.
(b) The set of all real numbers, where the vector space operation of “addition” is actually multiplication.
(c) The set of all resistors, where the vector space operation of “addition” creates a new resistor which sums the two resistances.
(d) The set of all spatial vectors in a plane that have magnitudes greater than or equal to one, where the vector space operation of “addition” is regular vector addition.
7.66 Set E is the set of all even integers (positive, negative, and zero). Set D is the set of all odd integers (positive and negative). Set Z is set D plus the number 0. Prove that only one of these three sets is a vector space, using the usual definitions of addition and multiplication.
7.67 (a) The space of third-order polynomials, with addition defined in the usual way for polynomials, is a vector space. For example, it obeys closure under addition because
(a1x³ + b1x² + c1x + d1) + (a2x³ + b2x² + c2x + d2) = (a1 + a2)x³ + (b1 + b2)x² + (c1 + c2)x + (d1 + d2)
is still a third-order polynomial. In a similar way, show that the set of third-order polynomials obeys all the other requirements for a vector space.
(b) What is the dimension of this space?
(c) Write two bases for this space.
(d) Express the polynomial 1 − x + x³ in both of the bases you wrote.
7.68 The space of all real 2D matrices is a real vector space with addition and multiplication by a scalar defined in the usual way for matrices.
(a) What is the dimension of this space?
(b) Write two separate bases for this space.
(c) Write the matrix that rotates all 2D vectors 90° clockwise and express it as an ordered list in each of the two bases you wrote.
7.69 In the Explanation (Section 7.3.1) we claimed that a vector space has a unique dimension n. For example, suppose the vectors 𝐱 and 𝐲 form a basis for the vector space V. You will prove in this problem that no single vector can be a basis for V, and any three vectors in V must be linearly dependent. Together, these two statements imply that any basis for V must consist of exactly two vectors. You’ll start with the second one.
(a) Consider three vectors in space V: 𝐟, 𝐠, and 𝐡. By the definition of a basis it must be possible to write 𝐟, 𝐠, and 𝐡 as linear combinations of 𝐱 and 𝐲. Write these out explicitly, using the letters 𝛼, 𝛽, 𝛾, 𝛿, 𝜖, and 𝜁 for the six unknown coefficients of these linear combinations. Prove that 𝐡 can be written as a linear combination r𝐟 + s𝐠 by solving for r and s in terms of 𝛼–𝜁.
(b) Prove that if a single vector 𝐳 were a basis for V then 𝐱 and 𝐲 couldn’t be linearly independent.
You’ve just shown that if a vector space has a basis with two vectors, then any possible basis for the space must have exactly two vectors. The argument can be generalized in a similar way to higher dimensions.
7.70 Prove that a real vector space cannot have exactly two vectors. Can it have just one?
7.71 In relativity the position and time of each event are assigned a four-vector (t, x, y, z). The “spacetime interval” between two events is −(t2 − t1)² + (x2 − x1)² + (y2 − y1)² + (z2 − z1)². Is the spacetime interval an inner product? Explain.
7.72 The Explanation (Section 7.3.1) claims that the inner product for any real-valued orthonormal basis looks like a dot product. In this problem you will prove this result.
(a) Equations 7.3.1 show two vectors 𝐚 and 𝐛 written in components in a particular basis. Use the property of linearity to express the inner product (𝐚, 𝐛) in terms of the given components and inner products of the basis vectors.
(b) Assuming that the given basis is orthogonal, simplify the resulting formula.
(c) Now assuming that the given basis is orthonormal, further simplify the formula. The end result should look like the familiar formula for a dot product.
7.73 In this problem you will show that for complex inner product spaces, the requirement of positive-definiteness demands that we include a complex conjugate in the requirement of symmetry. Consider a complex inner product space with some vector 𝐚, with norm ||𝐚|| = k. Assuming 𝐚 isn’t the zero vector k must be a positive real number.
(a) Take the symmetry condition in its real form: (𝐚, 𝐛) = (𝐛, 𝐚). Use linearity and symmetry to prove that ||i𝐚|| = −k, which violates the condition of positive-definiteness.
(b) Now use the correct symmetry condition (𝐚, 𝐛) = (𝐛, 𝐚)∗ (along with linearity) to show that ||i𝐚|| = k.
7.74 (a) The linearity condition for inner products only specifies what happens when the first vector in the inner product is a linear combination: (k𝐚 + p𝐛, 𝐜) = k(𝐚, 𝐜) + p(𝐛, 𝐜). Use this condition and the symmetry condition to prove that for a real inner product space the same rule applies to linear combinations in the second position: (𝐚, k𝐛 + p𝐜) = k(𝐚, 𝐛) + p(𝐚, 𝐜).
(b) Use linearity and symmetry to prove that for a complex vector space linear combinations in the second position follow a slightly modified rule that involves complex conjugates: (𝐚, k𝐛 + p𝐜) = k∗(𝐚, 𝐛) + p∗(𝐚, 𝐜).
(c) Use your result from Part (b) to prove that if 𝐚 and 𝐛 are expressed in terms of an orthonormal basis 𝐞1, 𝐞2, …, their inner product is given by a1b1∗ + a2b2∗ + ….
7.75 Prove that the inner product of any vector with the zero vector must be zero.
7.76 Which of the following matrices are symmetric? Which ones are orthogonal?
(a) $\mathbf{A} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$
(b) $\mathbf{B} = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}$
(c) $\mathbf{C} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & -2 & 1 \\ 2 & 0 & -1/2 \end{pmatrix}$
7.77 Which of the following matrices are Hermitian? Which ones are unitary?
(a) $\mathbf{D} = \begin{pmatrix} 1 & 2i \\ -2i & -3 \end{pmatrix}$
(b) $\mathbf{E} = \begin{pmatrix} i/\sqrt{2} & 1/\sqrt{2} \\ -i/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}$
(c) $\mathbf{F} = \begin{pmatrix} i & 0 & 0 \\ -1 & -2 & 1 \\ 2i & 0 & i/2 \end{pmatrix}$
7.78 Consider the space of all real, square-integrable functions f(x). (“Square-integrable” means $\int_{-\infty}^{\infty} [f(x)]^2\,dx$ exists.)
(a) Show that this space meets all the criteria for a vector space.
(b) Define the inner product of two functions f and g to be $\int_{-\infty}^{\infty} f(x)g(x)\,dx$. Show that with this definition the space is an inner product space. Remember that f and g are real-valued functions when you apply the test for symmetry.
7.79 Jones Calculus A light beam traveling in the z-direction generally contains an oscillating electric field in both the x- and y-directions. These oscillating waves are typically represented by complex exponentials, where the actual field is understood to be either the real or imaginary part of the complex function: $E_x = C_x e^{i(kz-\omega t)}$, $E_y = C_y e^{i(kz-\omega t)}$. Since the complex exponential piece at the end is the same for both x and y it is often not written, and the light beam is represented by a “Jones vector” <Cx, Cy>. The constants Cx and Cy are generally complex. The intensity of the light beam is proportional to |Cx|² + |Cy|². The phases of Cx and Cy contain the information about the phase lag between the x and y oscillations.
(a) When light passes through a polarizing filter oriented in the x-direction, it blocks any y-component of the electric field while allowing the x-component through unchanged. Write a matrix that represents the transformation a Jones vector undergoes when passing through such a filter.
(b) In general, a matrix that represents the transformation experienced by a Jones vector when it passes through a medium is called a “Jones matrix.” If the Jones matrix for a particular medium is unitary, what does that tell you about the physical effect it has on the light beam?
(c) Which of the following transformations are unitary?
i. $\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}$
ii. $\begin{pmatrix} i & 1 \\ i & -1 \end{pmatrix}$
iii. A brick wall that blocks all light.
iv. A polarizing filter oriented in the x-direction.
v. A “multiplier” that doubles the x- and y-components of the electric field.
vi. A crystal that rotates the electric field vector 45° in the xy-plane without changing its magnitude.
7.80 Quantum States In classical physics a particle has a definite energy at each moment. In quantum mechanics a particle can be in a “superposition” in which it has energy E1 and energy E2 simultaneously. That doesn’t mean its energy is the sum of the two. In fact if you measure its energy you will find one of those answers, E1
or E2, but never both. Before the measurement its state can be written c1𝐄1 + c2𝐄2, where 𝐄1 represents the state of definitely having energy E1. When you measure the energy |cn|² is the probability that you will find it with energy En.
(a) A particle is in a superposition of two energy states, as described above. What must |c1|² + |c2|² equal and why?
(b) The particle passes through a device that changes its state to some other superposition of the two possible energies. The matrix M represents the transformation that the state undergoes in this device. In other words $\begin{pmatrix} c_{1,\text{final}} \\ c_{2,\text{final}} \end{pmatrix} = M \begin{pmatrix} c_{1,\text{initial}} \\ c_{2,\text{initial}} \end{pmatrix}$. Explain why M must be a unitary matrix.
7.81 Exploration: The Gram-Schmidt Process and Spatial Vectors
The “Gram-Schmidt” process starts from any set of basis vectors 𝐮1, 𝐮2 … 𝐮n, and uses them to derive an orthonormal basis 𝐞1, 𝐞2 … 𝐞n for that space. In this problem we’ll illustrate how the process works for spatial vectors. In Problem 7.82 you’ll generalize it to a non-spatial vector space. As our example, we’ll consider the space of 3D vectors, with the initial basis u⃗1 = î + ĵ, u⃗2 = 2î − 3k̂, u⃗3 = ĵ − 5k̂.
(a) Show that these three vectors are neither orthogonal nor normalized. Show that they are, however, linearly independent (and thus form a basis).
(b) To define the first vector of our orthonormal basis, e⃗1, simply take u⃗1 and divide it by its own magnitude to get a normalized vector.
(c) The second basis vector e⃗2 has to meet a stricter criterion; in addition to being normalized it must be orthogonal to e⃗1. The formula u⃗2 ⋅ e⃗1 gives us the component of u⃗2 in the direction of e⃗1. You can define v⃗2 as u⃗2 − (u⃗2 ⋅ e⃗1)e⃗1 and it’s guaranteed to be orthogonal to e⃗1 because you’ve subtracted off the part of u⃗2 pointing along e⃗1. Then you can define e⃗2 by dividing v⃗2 by its own norm.
(d) Derive e⃗3 in a similar way. Starting with u⃗3, subtract its components along e⃗1 and e⃗2 and then normalize it.
(e) Verify that your resulting set of basis vectors is orthonormal.
(f) The vectors q⃗1 = î + ĵ − 3k̂ and q⃗2 = ĵ + 5k̂ form a basis for a 2D vector space. Use the Gram-Schmidt process to derive an orthonormal basis for that space.
7.82 Exploration: The Gram-Schmidt Process and Legendre Polynomials
Problem 7.81 illustrated the “Gram-Schmidt Process” for deriving an orthonormal basis set for a vector space. In that problem you used the process on spatial vectors. In this one you will use it on a non-spatial vector space. You do not need to have done Problem 7.81 to do this one.
An “analytic” function is one that can be represented by a Maclaurin series. For instance, the series 1 + x + x² + x³ + … converges to 1∕(1 − x) on the domain −1 < x < 1 but not outside that interval. We therefore describe 1∕(1 − x) as being analytic on −1 < x < 1.
Consider the space of all functions that are analytic on the domain [−1, 1]. (So our example 1∕(1 − x) doesn’t quite make it, since it is not analytic on the boundaries of that interval.) Since these functions can all be written as Maclaurin series the set of functions u0(x) = 1, u1(x) = x, u2(x) = x² … form a basis for this space. You can define an inner product for this space as $(f, g) = \int_{-1}^{1} f(x)g(x)\,dx$. In this problem you will derive an orthonormal basis for this space. The resulting set of basis functions are called “Legendre polynomials.”
Warning: Each Legendre polynomial is multiplied by a constant. In this problem you will be choosing the constants required to create an orthonormal basis; for other applications other constants are preferable. Therefore the polynomials you find here will not match the Legendre polynomials listed in Chapter 12 and Appendix J.
(a) Show that the basis un(x) is not orthonormal.
(b) Define the first Legendre polynomial P0(x) by dividing u0(x) by its own norm.
(c) For the second Legendre polynomial you need a function that is orthogonal to P0(x). To get that, define v1 = u1 − (u1, P0)P0. Then divide v1 by its own norm to get P1(x), a function that is normalized and orthogonal to P0(x).
(d) Define v2 = u2 − (u2, P0)P0 − (u2, P1)P1, and then find P2 = v2∕||v2||.
(e) Find P3.
This problem brings us back to where we started the section: you can span the analytic functions with a basis of 1, x, x², x³ …, or you can span the same space with a basis of the Legendre polynomials. In Problem 7.83 you will see a significant advantage of the latter basis, an advantage which is entirely due to the orthonormal basis vectors.
7.83 Exploration: Polynomial Approximation. In Problem 7.82 you derived an orthonormal form of the Legendre polynomials. You do not need to have done that problem in order to do this one. You just need to know that these polynomials are orthonormal: that is, $\int_{-1}^{1} P_n(x)P_m(x)\,dx$ is 0 if n ≠ m and 1 if n = m.
Consider a function f(x). Your goal is to find the third-order polynomial f̂(x) that best approximates f(x) between x = −1 and x = 1. Our definition of “best approximation” will be the function that minimizes $\int_{-1}^{1} \left[f(x) - \hat{f}(x)\right]^2 dx$. A Taylor series is the best polynomial approximation at and around a given point, but this technique can give you a better approximation on a given interval.
For reasons that will soon be clear, we will look for an answer in the form f̂(x) = c0P0(x) + c1P1(x) + c2P2(x) + c3P3(x). Your goal is to find the constants c0–c3.
(a) Write the integral that represents the quantity you are trying to minimize.
(b) Your integrand is the square of a 5-term series. We’re not asking you to write all 25 resulting terms, but write the term that involves [P2(x)]².
(c) Now write the term that involves P2(x)P3(x).
(d) Evaluate those two integrals. Remember that the Legendre polynomials are orthonormal!
(e) Now that you see the pattern, expand the square. All the integrals that involve only Legendre polynomials can be replaced with 0 or 1, as appropriate, leaving only integrals that involve f(x).
(f) Two of the terms in your answer involve c0. Find the value of c0 that minimizes those two terms. (Your answer will have the functions f(x) and P0(x) in it, and you can just leave them like that.)
7.84 [This problem depends on Problem 7.83.] The first four normalized Legendre polynomials are $P_0 = 1/\sqrt{2}$, $P_1 = \sqrt{3/2}\,x$, $P_2 = (1/2)\sqrt{5/2}\,(3x^2 - 1)$, and $P_3 = (1/2)\sqrt{7/2}\,(5x^3 - 3x)$.
(a) Find the best third-order polynomial approximation to $e^x$ between x = 1 and x = −1. (This will involve either computer integration or some tedious integration by parts. The exact coefficients will be complicated expressions, but you can just give numerical values for them.)
(b) Show on one plot the original function $e^x$ between x = −1 and x = 1 and each of the first four partial sums of its polynomial expansion. (The first partial sum is the zero-order approximation and the fourth partial sum is the third-order approximation.)
(c) Find the fourth-order Maclaurin series for $e^x$.
(d) Show on one plot the original function $e^x$ between x = −1 and x = 1, the second-order approximation you derived using Legendre polynomials, and the second-order Maclaurin series. Which approximation looks better?
(e) Show on one plot the original function $e^x$ between x = −0.1 and x = 0.1, the second-order approximation you derived using Legendre polynomials, and the second-order Maclaurin series. Which approximation looks better?
(f) In what kinds of situations would each of these approximations be most useful?
CHAPTER 8
Vector Calculus
A single-variable function starts with one independent variable that might, for instance, rep-
resent position on a number line. It tells you how another variable, such as a height or density,
depends on that one number.
A “field” is a function of position in any number of dimensions, so it can depend on 2 or
3 variables. The field tells you how another variable, such as a density or a force, depends on
that position.
All the mathematics of functions, from basic algebra through derivatives and integrals,
can be generalized to these higher dimensional spaces. This chapter will begin with a brief
introduction to fields, and will then focus on derivatives of fields and fundamental theorems
in more than one dimension.
7in x 10in Felder c08.tex V3 - February 26, 2015 10:38 P.M. Page 379
miles and v⃗ is in miles per hour.1 For instance, at the point (1, 1) the wind velocity is î + 2ĵ.
The Glogrene plant is at the origin.
1. What is the wind velocity at the origin?
2. What is the wind velocity at the point (1, 2)?
See Check Yourself #48 in Appendix L
3. What is the wind velocity at the point (3, 0)?
The president has called you in to calculate how quickly radioactive waste is leaving the
Glogrene grounds.
4. Draw a diagram of the Glogrene facility. (Just draw a circle with a radius of 3.) Label
at the point (3, 0) a small segment of the circumference with height dy.
5. Which component of the wind velocity is causing waste to flow out through this
segment?
See Check Yourself #49 in Appendix L
6. Multiply that component of the velocity times the density of waste, times the height
dy, to get the rate at which waste is flowing out through that infinitesimal segment.
To find the total rate at which waste is leaving, you get ready to repeat the above calculation
for a differential segment at an arbitrary point on the circle, which will involve finding the
component of v⃗ perpendicular to the segment. Then you will
express the amount of waste flowing through that segment per
unit time in terms of just one variable (probably the polar angle
𝜙) so that you can integrate it to find the total. This is not an
approach you want to take if you can avoid it.
As you may have guessed, you can. We assert with no justifica-
tion that the total rate at which waste is flowing out of the grounds
is given by the density of waste times the double integral over
the disk of (𝜕vx∕𝜕x) + (𝜕vy∕𝜕y) where vx is the x-component of v⃗.
7. Calculate (𝜕vx ∕𝜕x) + (𝜕vy ∕𝜕y) for the wind velocity given above.
8. This integral will be easiest in polar coordinates. Rewrite your answer to Part 7 in polar
coordinates.
9. Set up and evaluate the double integral of the function you found in Part 7 over
the disk. This integral should be straightforward in polar coordinates. Multiply your
answer by the density to find the rate at which waste is leaving the Glogrene grounds.
The good news is that you’ve calculated the rate at which waste is flowing into the community,
and using your result the government has created a sensible evacuation plan. The bad news is
that you calculated something that we threw at you with no justification. How would this trick
apply to other situations? When are you allowed to use it? Why does it work? What is it called?
The last question (which you probably weren’t wondering anyway) is easy. We used some-
thing called “The Divergence Theorem” to get that formula. For the other answers, you’ll
have to read this chapter.
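The claim in the exercise can be checked numerically. The sketch below is only an illustration: the field v⃗ = x² î + y ĵ is a made-up example (an assumption, not the wind field from the exercise), and the function names are ours. It adds up the outward flow across small arcs of the circle of radius 3 and compares that with the double integral of (𝜕vx∕𝜕x) + (𝜕vy∕𝜕y) over the disk.

```python
import math

RADIUS = 3.0  # the grounds are a disk of radius 3, as in the exercise


def v(x, y):
    """Hypothetical velocity field (an assumption, not the exercise's field)."""
    return (x**2, y)


def div_v(x, y):
    """d(vx)/dx + d(vy)/dy for the field above, worked out by hand."""
    return 2 * x + 1


def outward_flux(n=2000):
    """Add up (v . n_hat) ds over small arcs of the circle (midpoint rule)."""
    dphi = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * dphi
        nx, ny = math.cos(phi), math.sin(phi)  # outward unit normal
        vx, vy = v(RADIUS * nx, RADIUS * ny)
        total += (vx * nx + vy * ny) * RADIUS * dphi  # arc length ds = R dphi
    return total


def divergence_integral(nr=200, nphi=200):
    """Double integral of div v over the disk, in polar coordinates."""
    dr, dphi = RADIUS / nr, 2 * math.pi / nphi
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for k in range(nphi):
            phi = (k + 0.5) * dphi
            total += div_v(r * math.cos(phi), r * math.sin(phi)) * r * dr * dphi
    return total


flux = outward_flux()
div_total = divergence_integral()
print(flux, div_total)  # for this field both should be close to 9*pi
```

Multiplying either number by the density of waste gives the rate at which waste leaves the grounds. The boundary sum is the "hard way" described in the exercise; the double integral is the shortcut.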
1 For simplicity we will assume there is negligible vertical flow, so you can treat this as a two-dimensional problem.
“slope” or a “minimum” of the function in visual terms, but the meaning of such phenomena
might be very abstract.
Contour Plots
Another way to visualize a scalar field is with a contour plot. For a function of two variables,
a contour plot shows the “level curves” of the function. For instance, Figure 8.2 shows
the function f(x, y) = x² sin² y. You can see the lines where f = 0, and the curves where f = 1,
and the curves where f = 2, and so on.
Contour plots can represent any function of two variables, but one particularly common use, as well as a great metaphor, is a topographical (or “topo”) map, where the contour lines represent curves of constant altitude. Following a curve along such a map keeps you at a constant height, while moving from the “1000” to the “1100” curve is climbing. A common convention in such maps is that the curves represent evenly spaced altitudes, such as one curve every 100 ft or every 250 ft. With
[Figure 8.2: contour plot with level curves labeled 0.1, 0.5, 1, and 2.]
Vector Fields
If a scalar field is a number at every point in space, a vector field is a vector at every
point. Once again, the field is not the coordinates of the point, but it is a function of
those coordinates; for instance, at the point (3, 2, 8) the value of the electric field may be
(12 V/m)î − (7.9 V/m)k̂. As you look at the room around you, every point has an air velocity;
the collection of all these velocities is a vector field. Other examples include magnetic and
gravitational fields.
Mathematically, we represent a vector field as one function for each component. For
instance, we could write v⃗(x, y, z) = (2x + y)î + xyz² ĵ + 4k̂. At the point (1, 1, 1) this field would
have the value 3î + ĵ + 4k̂. But just as with scalar fields, we will focus on two-dimensional
examples so we can draw them. The concepts, once mastered, will scale up.
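As a small illustration of “one function for each component,” the field above can be written as a single Python function that returns the three components as a tuple. The function name is ours; the formula is the one in the text.

```python
def v(x, y, z):
    """The example field v = (2x + y) i + (x y z^2) j + 4 k, as a 3-tuple."""
    return (2 * x + y, x * y * z**2, 4)


print(v(1, 1, 1))  # (3, 1, 4), matching the value quoted in the text
```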
Two techniques are commonly used to visualize vector fields. The first, sometimes called
a “vector plot,” places one arrow at every point on the field.2 The direction of the arrow
2 Of course you can’t literally draw one arrow at every point, so you draw a whole lot of them and hope. This is part
Field Derivatives
The rest of this chapter is primarily about derivatives of fields.
For a scalar field you can take partial derivatives along any of the coordinate axes. One
particularly useful combination of those partial derivatives is called the “gradient,” and we’ll
introduce it in Section 8.4. There is a useful combination of second derivatives called the
Laplacian, which we will introduce in Section 8.6.
A vector field has nine different partial derivatives. For instance, there is the partial deriva-
tive of the y-component of the vector as you move in the x-direction. This is not a second
derivative; it is the first derivative of one component of a vector. The partial derivative of
the x-component with respect to y would generally be completely different. Two especially
useful combinations of these nine partial derivatives, called the “divergence” and “curl,” will
be introduced in Section 8.6.
8.6 Contour lines are often used on geological maps. For instance, the drawing below indicates one line where the ground is 17’ above sea level, two curves where the ground is 13’ above sea level, and so on. Call this function h(x, y) and assume that you are standing in the middle of the curve labeled 5 (right around the number “5” in the drawing).
[Figure: contour map in the xy-plane with curves labeled 1, 5, 9, 13, and 17.]
(a) If you were at the top of Taylor Mountain (near the bottom of the map) and wanted to descend as quickly as possible, would you go north or south? Explain how you can tell from the map.
(b) If you were driving east on highway 10, which is the road at the top of the drawing, briefly describe what the landscape would look like on your right and left.
8.8 The plot below shows the density of Starbucks coffeehouses in a part of downtown Seattle. (As a reasonable approximation we can treat this as a continuous function.) The arrow indicates north and the dot indicates your current position.
8.22 Figure 8.4 shows the electric field lines emanating from a positively charged particle. The direction of the arrows indicates that the electric field, and therefore the force on a positive charge, points outward. The density of the lines indicates the magnitude of the field. You don’t get to choose their density, but you can calculate it. Assume you draw n evenly spaced field lines emanating from the charge. So if you draw a sphere centered on the charge, at any radius, n evenly spaced field lines will penetrate that sphere.
(a) What is the surface area of a sphere at a distance r from the center?
(b) So what is the density, in lines/m², of the lines penetrating such a sphere?
(c) What does this tell us about the electric field around a point particle?
8.23 The figure shows the electric field lines emanating from an infinite line of charge with a uniform charge density 𝜆 Coulombs/meter. Assume you draw evenly spaced field lines emanating from the line of charge, as shown below. If you draw a cylinder of height H centered on the line, some number n of field lines will penetrate that cylinder, regardless of its radius.
(b) If you start at the center of the dipole and move straight up, the lines always point directly to the left, but become sparser. What does this tell us about the electric field directly above a dipole?
(c) Draw a 2D vector plot of the same electric field. Your plot should show the field in a plane that includes the two particles.
8.25 Figure 8.5 shows the electric field lines around a “dipole,” a positive and a negative charge kept a fixed distance apart. Draw a similar plot for the electric field lines around two positive charges a small distance apart. Your plot can be in 2D, on a plane that includes both particles. Remember that field lines generally begin at positively charged particles or at infinity and end at negatively charged particles or at infinity. (If you’re careful you can find a few exceptions in this case.)
8.26 The figure below shows three positively charged particles set at the vertices of an equilateral triangle.
(a) Which of the following fields could be represented by those field lines? (Only one of these answers is correct.)
i. 3î
ii. x î
iii. yî
(b) Of the two incorrect choices given above, one of them could not be represented by field lines. The other could, but the picture would not look like the one shown above. Say which one of those could have field lines and draw its field lines. Explain why the other one couldn’t.
But if the gravitational field isn’t all that familiar then we will just be relating one confusing
idea to another. So we begin here by relating the gravitational field to yet another idea that
we hope is familiar, the gravitational force.
We can express Newton’s law of gravitation by saying that a mass M exerts a force F⃗ on a
mass m. The direction of this force is toward M, and its magnitude is |F⃗| = GMm∕x² where
G is a constant and x is the distance between the two objects. In this formulation we imagine
every object in the world exerting a direct influence on every other object.
But it is mathematically more convenient—and, as it turns out, physically more correct—
to break the process into two steps. We say that mass M creates a “gravitational field” g⃗, an
invisible vector field in the space around the object. The direction of this field is toward
M and its magnitude is |g⃗| = GM∕x². Then we say that mass m feels a force from that field
given by F⃗ = m g⃗. So the calculations lead back to |F⃗| = GMm∕x², but with the gravitational
field acting as a middleman between the object that creates the force and the object that
experiences it. The gravitational field is “the force per unit mass that an object would feel if
it were at that point.”
The “gravitational field” formulation introduces concepts that are more abstract than
the “direct force” approach, but it pays us back by making calculations easier. Gravitational
field to gravitational potential is another step in the same direction. In this section we will
show how potential gives us yet another mathematical formulation for the same physics; in
Section 8.4 we will see how it simplifies calculations.
[Figure: a one-dimensional axis with the Earth at x = 0 and the moon at x = 400.]
Our goal is to calculate the motion of a third object placed into this system, neglecting any
motion of the Earth and moon themselves. Because we are working in this section in one
dimension, we will assume that the third object is on the axis between the Earth and moon;
we can therefore express both vectors and scalars as numbers. Newton’s laws of gravitation
can then be expressed in three rules.
1. Each body creates a “gravitational field” g (x) that points toward itself. (That is, the
field is positive-valued to the left of the body, and negative to the right.)
2. At any given point, the total gravitational field is the sum of the fields created by both
bodies. (Sign is important here; positive and negative values can cancel.)
3. The mass at a point x will feel a force proportional to the magnitude of g and with a
direction given by the sign of g (e.g. leftward if g is negative).
We can turn these general ideas into precise predictions with the right equations. The magnitude of the gravitational field from body i is $|g_i| = GM_i/x_i^2$ where G is a positive constant, $M_i$ is the mass of the body, and $x_i$ is the distance to the body. (You have to put in the appropriate sign for g by hand.) The total $g = \sum_i g_i$. A mass m inside a gravitational field g feels a force F = mg.
Take a moment to convince yourself that these rules express the familiar idea that any
mass placed into the system feels pulled toward both the Earth and the moon—or, more
generally, that all masses are attracted to each other. But this same idea can be expressed in
a very different way.
1. Each body creates a “potential field” V (x) that is low at points near itself, and high at
points far away.
2. At any given point, the total potential is the sum of the potentials created by both
bodies.
3. A mass is pulled from regions of high potential to low.
The first rule can be mathematically expressed in the equation $V_i = -GM_i/|x_i|$ (negative
everywhere). The second rule is of course $V = \sum_i V_i$. We will return below to the mathematical
expression of the third rule, and to the connection between the potential V and the gravi-
tational field g . First, however, we encourage you to consider again how these rules express
the idea that all objects are attracted to each other. In fact, the ultimate test of the “poten-
tial” formulation of gravity is that it yields the same predictions as the “gravitational field”
formulation; it is not a different physical theory, just a different mathematical expression of
the same theory.
“Why go to all this trouble to create a new, harder-to-understand expression of a theory that I already
understood the first way?” The main advantage is that the gravitational field is a vector, potential
a scalar. That makes potential much easier to compute, graph, and interpret than the gravi-
tational field. Physicists trying to understand a complicated system such as an atomic nucleus
or star cluster often begin by drawing a potential function for it. In this section we can treat
the gravitational field as a signed scalar number because we are working in only one dimen-
sion; this helps you understand what potential means, but obscures some of its real power,
which will be more apparent in the two- and three-dimensional examples in Section 8.4.
Students unused to potential often try to predict behavior based on whether the potential
is high or low, negative or positive. But as you can see from the discussion above, what matters
is whether the potential is increasing or decreasing. In regions where the potential graph slopes
up, the body is pulled to the left, and vice versa:
Equation 8.3.2 provides the link between the potential V and the gravitational field g . You
can translate a potential field to a gravitational field, and then into a force, to create an
equation of motion. But you can also predict motion directly from potential, without ever
mentioning the gravitational field.
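The equivalence of the two formulations can be checked numerically. The sketch below is an illustration under stated assumptions: it uses units in which G = 1, places the Earth at x = 0 and the moon at x = 400, and borrows the masses 6 and 1 from the Earth-moon scenario of Problem 8.64. It builds g from the three “field” rules, builds V from the “potential” rules, and compares g with a finite-difference estimate of −dV∕dx.

```python
G = 1.0  # assumed units in which G = 1
BODIES = [(0.0, 6.0), (400.0, 1.0)]  # (position, mass): Earth, moon (assumed values)


def field(x):
    """Total g(x): each body contributes G*M/d^2, signed to point toward the body."""
    total = 0.0
    for x_body, mass in BODIES:
        magnitude = G * mass / (x - x_body) ** 2
        direction = 1.0 if x_body > x else -1.0  # positive means "to the right"
        total += direction * magnitude
    return total


def potential(x):
    """Total V(x): the sum of -G*M/|distance| over both bodies."""
    return sum(-G * mass / abs(x - x_body) for x_body, mass in BODIES)


def field_from_potential(x, h=1e-3):
    """Estimate g = -dV/dx with a central difference."""
    return -(potential(x + h) - potential(x - h)) / (2 * h)


for x in (100.0, 300.0, 350.0):
    print(x, field(x), field_from_potential(x))
```

At each point the two columns agree to many digits. The sign of g also shows which body wins: negative (pulled left, toward the Earth) at x = 100, and positive (pulled right, toward the moon) at x = 300.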
Stepping Back
Newtonian gravitation can be described in two equivalent ways. We can say that every mass
creates a “gravitational field” that points toward itself, and every other mass is pulled in the
direction of the gravitational field. Or, we can say that every mass creates a “potential field”
that is low around itself and high far away, and every other mass is pulled from regions of high
potential to low. These two formulations lead to the same physical predictions, so choosing
one over the other is a matter of mathematical convenience.
The rules of electrostatics can similarly be framed in terms of electric fields or of electric
potential. In this case, however, the sign of a charge (positive or negative) determines the
type of potential field it causes, and the sign of another charge determines how it responds
to a potential field; positive charges seek low potential, and negative charges high. In Prob-
lems 8.46–8.48 you will develop the rules of electric potential that lead to the familiar behav-
ior of charged particles.
Although potential does not represent a new physical theory, it is an indispensable mathe-
matical tool. Circuit analysis is done entirely in terms of potential, with the electric field rarely
mentioned at all. In quantum mechanics, the behavior of a system is described in terms of
the potential field around it, rather than the forces acting upon it. Newtonian mechanics is
approached the same way in the “Lagrangian” and “Hamiltonian” formulations, which are
often preferred to F = ma.
Remember that an object placed into a gravitational field will be pulled toward the lowest
potential it can find. This insight is the key to understanding potential in one dimension,
and also forms the bridge to higher dimensions, as we will see in the next section.
(d) How can you use a potential plot to predict the behavior of a negative charge?
8.47 [This problem depends on Problem 8.46.] Consider an electric field that can be represented by the following potential graph.
[Figure: graph of V(x) with a point labeled P.]
(a) A positively charged particle begins at the point labeled P. Describe its motion over both the short and long term.
(b) A negatively charged particle begins at the point labeled P. Describe its motion over both the short and long term.
8.48 [This problem depends on Problem 8.46.] Consider an “electric dipole” comprising a positive charge Q = 1 fixed at x = 0 and a negative charge Q = −1 fixed at x = 5.
(a) Calculate and draw the field of electric potential around these charges. Remember that the potential at any given point is the sum of the potentials created by the two charges.
(b) Using your potential plot, predict the motion of a positive charge placed at x = −1, a second positive charge placed at x = 2.5, a third positive charge placed at x = 3, and a final positive charge placed at x = 10 in this field.
(c) Using your potential plot, predict the motion of a negative charge placed at each of the same four spots.
8.49 In the Explanation (Section 8.3.2) we showed that the gravitational potential near Earth’s surface is gy, where y is height off the ground and g is the constant 9.8 m/s². The diagram below shows the profile of some hills.
8.50 What about that vertical asymptote? In this problem you will consider the potential due to the Earth, which we take to be at x = 0. There are no other masses in the problem (no moon, for example). You will only consider the right side of the axis (x ≥ 0). The equations are a bit different on the left, but the graphs are perfectly symmetrical.
(a) Consider a simple system in which the Earth is a point particle with mass M located at x = 0. Calculate and sketch the potential field V(x).
You should have found a vertical asymptote at x = 0. This is not a physical reality; it is due to the simplified assumption of the Earth as a point particle. To create a more realistic picture, consider the Earth’s mass to be spread between x = −R and x = R. For x > R, the gravitational field is still g = −GM∕x². But inside the Earth, the gravitational field has magnitude g = −Kx where K is a positive constant.
(b) Explain why it makes sense physically in this scenario that g = 0 at x = 0 (the center of the Earth).
(c) Calculate the potential field for 0 ≤ x ≤ R such that −Kx = −dV∕dx. Your answer should have an arbitrary constant.
(d) Calculate V(R) based on your formula from Part (a), and then choose the arbitrary constant in Part (c) to make V continuous at x = R. (Your answer for the arbitrary constant should depend on K, G, M, and R.)
(e) Write a piecewise function that represents V(x) for all x ≥ 0, and sketch it.
8.51 A solar cell works by creating a region of fixed negative charge next to a region of fixed positive charge. The resulting electric field is shown in the plot below. It is zero on either side of the cell (where wires lead to the other parts of the circuit) and negative in the middle. We are taking “right” to be the positive direction as usual.
[Figure: graph of E(x) versus x.]
charge is pushed toward low potential and the negative charge is pushed toward high potential. Which way do they each go?
Once you’ve gotten positive and negative charges to go in different directions you can use them to store energy in a battery, so the mechanism you’ve analyzed above allows solar cells to generate electricity from sunlight.
8.52 An electron in a crystal experiences a periodic potential that moves up and down depending on how close the electron is to one of the positively charged nuclei. In the simplest case the potential looks roughly sinusoidal. (In this problem you will treat the electron as a small negatively charged Newtonian object. Real electrons obey the laws of quantum mechanics, which leads to very different behavior.) Remember to take into account the fact that the electron’s charge is negative.
[Figure: graph of a roughly sinusoidal potential V(x) versus x.]
(a) If an electron starts at rest at a local minimum of the potential describe its long-term behavior.
(b) If an electron starts at rest somewhere between a minimum and a maximum of the potential, describe its long-term behavior.
(c) If an electron starts at a maximum of the potential moving rapidly to the right, describe its long-term behavior.
(d) Sum up your answers by describing all of the possible behaviors of an electron in this potential.
Many crystals have more than one kind of nucleus, which attract the electrons with different strengths. In a simple alternating lattice, the potential for the electron might look like the following:
[Figure: graph of the alternating-lattice potential V(x) versus x.]
(e) For the alternating lattice shown above, describe all of the possible behaviors of the electron.
(f) Sketch the electric fields corresponding to the two potentials shown above.
tan⁻¹(1∕6), and the x-component of the Earth’s field is $g_{\text{Earth},x} = -(6GM/92{,}500)\cos\theta$.
Next we would use 𝜃 to find $g_{\text{Earth},y}$, and make another drawing with a different angle
to break g⃗moon into components, and so on…
You’ll finish that calculation in Problem 8.64. It’s tedious. Now imagine if we added
a third object off the plane of the page, thus requiring breaking all the 3D vectors
into components, and you can see why we want to avoid these vector sums if at all
possible.
$$V = -\frac{6GM}{\sqrt{92{,}500}} - \frac{GM}{\sqrt{12{,}500}}$$
The above example should decisively convince you that the gravitational potential is easier
to find than the gravitational field. When we calculate gravitational effects from extended
objects instead of point sources (so we’re integrating instead of just adding) the math gets
messier still, and the advantages of potential are even greater. But what do we do with the
potential once we have found it?
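The contrast between the two calculations shows up clearly in code. The following sketch assumes units in which G = 1 and M = 1, with the Earth (mass 6M) at (0, 0), the moon (mass M) at (400, 0), and the point P at (300, 50), as in the scenario above. The potential is a single scalar sum; the field, computed from g⃗ = −(GM∕|r⃗|²) r̂ for each body, needs every contribution broken into components before adding. The function names are ours.

```python
import math

G = 1.0  # assumed units in which G = 1 and M = 1
BODIES = [((0.0, 0.0), 6.0), ((400.0, 0.0), 1.0)]  # (position, mass): Earth, moon
P = (300.0, 50.0)


def potential(x, y):
    """Scalar sum of -G*M/|r| over the bodies: no components needed."""
    return sum(-G * m / math.hypot(x - bx, y - by) for (bx, by), m in BODIES)


def field(x, y):
    """Vector sum of -(G*M/|r|^2) r_hat over the bodies, one component at a time."""
    gx = gy = 0.0
    for (bx, by), m in BODIES:
        rx, ry = x - bx, y - by  # vector from the body to the point
        r = math.hypot(rx, ry)
        gx += -G * m / r**2 * (rx / r)
        gy += -G * m / r**2 * (ry / r)
    return gx, gy


V_at_P = potential(*P)
g_at_P = field(*P)
print(V_at_P)  # equals -6/sqrt(92500) - 1/sqrt(12500) in these units
print(g_at_P)
```

With only two bodies the difference in effort is modest, but the potential stays a one-line sum no matter how many bodies are added or how they are arranged in space.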
Getting from V to g⃗
In one dimension, the rule g = −dV ∕dx tells us two things about a potential field. First: an
object feels a force toward a region of lower potential. Second: the faster the potential is
changing, the greater the force.
In higher dimensions, g⃗ is a vector that follows those same two rules. It points in the direc-
tion in which the potential is decreasing the fastest, and its magnitude indicates how quickly
the potential is decreasing in that direction. Consider the V (x, y) function in Figure 8.12.
Remember that this surface represents a dependent variable (V) of two independent variables (x and y): a potential field in two dimensions. A ball placed at point P will roll to the right and toward us. The slope is shallow in this direction, indicating a relatively weak force.
In three dimensions we lose our ability to draw a graph, but the principle of moving in the direction of steepest descent is the same. (It is still possible to graph level surfaces of the potential in 3D. See Problem 8.13.)
FIGURE 8.12 [Surface plot of V over the xy-plane, with the point P marked.]
To express all this mathematically we define an operator called the “gradient,” a multidi-
mensional generalization of the derivative. If you have worked through Section 4.6 you have
already seen one introduction to the gradient, starting with the idea of a “directional deriva-
tive.” Here we are approaching the same topic starting with potential functions. You can work
through either section without the other, but if you do both, take the time to convince your-
self that they are both talking about the same topic. This will deepen your understanding of
this very important operator.
And how do you do the calculations? Given a scalar field, you have to find three values: the
x-component of the gradient, the y-component, and the z-component. But if you focus for a
moment on the first of these, you might realize that we solved precisely this problem in the
previous section: we found the x-component of the gravitational field by writing g = −dV ∕dx.
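Similar considerations apply to the y- and z-directions, which leads to the formula:
$$\vec{\nabla} f = \frac{\partial f}{\partial x}\,\hat{i} + \frac{\partial f}{\partial y}\,\hat{j} + \frac{\partial f}{\partial z}\,\hat{k} \qquad (8.4.1)$$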
We’ve said that the gradient tells you the direction of steepest increase and the rate of change
in that direction, but more generally each component of the gradient in any direction tells
you the rate of change of a function in that direction. The direction opposite the gradient
represents the steepest possible decrease of the function. If you move in any direction perpendicular to the gradient the function doesn’t change. Mathematically this can be expressed by saying the rate of change of a scalar field f in the direction of any unit vector û is (∇⃗f) ⋅ û.
(The rate of change of a function in an arbitrary direction is called a “directional derivative.”
We discuss them in Chapter 4.)
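These statements about the gradient can be tested numerically. In the sketch below the scalar field f(x, y, z) = x²y + z, the point (1, 2, 3), and the direction û are all made-up illustrations, not taken from the text. The code estimates each component of the gradient with a central difference, then compares the rate of change of f along û with the dot product (∇⃗f) ⋅ û.

```python
import math


def f(x, y, z):
    """Hypothetical scalar field chosen for illustration."""
    return x**2 * y + z


def gradient(func, point, h=1e-5):
    """Estimate (df/dx, df/dy, df/dz) at a point with central differences."""
    grad = []
    for axis in range(3):
        forward = list(point)
        backward = list(point)
        forward[axis] += h
        backward[axis] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return tuple(grad)


def directional_derivative(func, point, u, h=1e-5):
    """Estimate the rate of change of func along the unit vector u."""
    forward = [p + h * c for p, c in zip(point, u)]
    backward = [p - h * c for p, c in zip(point, u)]
    return (func(*forward) - func(*backward)) / (2 * h)


point = (1.0, 2.0, 3.0)
u = (1 / math.sqrt(2), 1 / math.sqrt(2), 0.0)  # a unit vector in the xy-plane

grad = gradient(f, point)  # the exact gradient here is (2xy, x^2, 1) = (4, 1, 1)
dot = sum(g * c for g, c in zip(grad, u))
print(grad)
print(dot, directional_derivative(f, point, u))  # the two values should agree
```

Choosing û along the gradient itself gives the largest possible rate of change, and choosing û perpendicular to the gradient gives zero, in line with the description above.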
We have introduced the gradient as a way of finding the gravitational or electric field
based on potential, and this is indeed one of its main purposes. But given any scalar field, it
is possible—and often useful—to evaluate its gradient.
(a) A fly at position (0, 0, 0) is attracted to warmth, so it flies in the direction of steepest temperature increase. Give a vector that tells the direction of its flight.
(b) What is the magnitude of the gradient vector at position (0, 0, 0)?
(c) Explain in non-technical language what your answer to Part (b) means to the fly. Your answer should include the units of the gradient of T.
8.59 The height of the mountains on one side of the land of Mordor can be modeled by $h(x, y) = e^{-(3x-y+1)^2}$.
(a) A rock is dropped at the point (0, 0, 1∕e). Which way will it roll?
(b) Find the curve in the xy-plane that satisfies the equation ∇⃗h = 0⃗.
(c) An orc is walking along the path you found in Part (b), in the direction of increasing x, and a hobbit is sneaking along a path parallel to him, but lower on the hill. Are the orc and hobbit moving uphill, downhill, or neither? Hint: the answer is the same for both.
(d) The orc’s path has ∇⃗h = 0 and the hobbit’s doesn’t. What does that tell you about what each of their paths looks like? Hint: it can’t be whether they are moving up, down, or neither since we already told you the answer is the same for both of them.
8.60 The electric field E⃗(x, y, z) inside a conductor is generally zero. (If it isn’t, the electrons redistribute until a field of zero is reestablished.) What does that tell us about the potential field V(x, y, z) inside a conductor?
8.61 The deeper you descend into the ocean, the greater the water pressure. Pressure tells you the force per unit area that a fluid exerts on any object, so if there is greater pressure on one side of the object than the other that will create a net force on the object.
(a) If P(x, y, z) represents the water pressure, give a possible function that could represent ∇⃗P(x, y, z). (Based on the information we have given you, there are many possible right answers to this question. But there are also plenty of wrong ones!)
(b) Suppose at one particular spot in the ocean, ∇⃗P has a positive component in the x-direction. What would happen to an object in the water? What would happen to the water itself?
8.62 A supermarket is shaped like a large rectangular box. Somewhere in the middle of that rectangular box is a smaller rectangular box with the frozen foods. The closer you get to the frozen foods, the colder the air gets.
(a) Draw a simple two-dimensional schematic diagram of the supermarket with the frozen food shelf somewhere in the middle. Draw a vector plot that could represent the gradient of the temperature field, ∇⃗T, everywhere in the supermarket. Remember to follow both rules: the direction of each arrow should indicate the direction of the vector field (the direction of steepest increase of T), and the length of each arrow should represent the magnitude of the field (the rate of increase of T).
(b) We’re not going to ask you to draw a three-dimensional representation of the supermarket, but say a few words about what your vector field would look like in the third dimension. Assume the freezer shelf extends all the way to the ceiling.
8.63 The brightness of the light on a table is given by a function B(x, y). Consider a light source placed at (0, 0) and a brighter light source placed at (0, 5). Draw a 2D vector plot that could represent the gradient ∇⃗B throughout the room.
8.64 The Explanation (Section 8.4.2) presents the following scenario: the Earth (mass 6M) is at position (0, 0) and the moon (mass M) is at position (400, 0). In this problem you will calculate the gravitational field at point P at (300, 50), using the formula g⃗ = −(GM∕|r⃗|²) r̂.
(a) The vector r⃗ from the Earth to P is 300î + 50ĵ. Calculate the magnitude and direction of the gravitational field at P due to the Earth, and then calculate the x- and y-components of this field. (If you don’t know where to start, we outline the first steps in the Explanation.)
(b) Write the vector r⃗ that points from the moon to point P.
(c) Calculate the magnitude and direction of the gravitational field at P due to the moon, and then calculate the x- and y-components of this field.
(d) Sum the contributions of the Earth and the moon to find g⃗ at this point.
Now, let’s approach the problem the other way. The gravitational potential of a body is V = −GM∕|r⃗|.
(e) The vector r⃗ from the Earth to an arbitrary point is x î + yĵ. Calculate the gravitational potential at an arbitrary point (x, y) due to the Earth.
(f) Write the vector r⃗ from the moon to an arbitrary point (x, y).
(g) Calculate the gravitational potential at an arbitrary point (x, y) due to the moon.
(h) Calculate the gravitational potential V(x, y, z) caused by the Earth and moon combined.
(i) Now use the formula g⃗ = −∇⃗V to find the gravitational field due to the Earth and moon.
(j) Plug the point (300, 50) into your formula for g⃗(x, y). Make sure your answer matches your
answer to Part (d)!
8.65 Consider an “electric dipole” comprising a positive charge Q = 1 fixed at (0, 0, 0) and a
negative charge Q = −1 fixed at (5, 0, 0).
(a) This problem takes place in three dimensions. However, we can calculate, draw, and work
with the potential and electric fields in two dimensions without any loss of generality.
Explain why.
(b) Calculate the electric potential field V(x, y).
(c) Calculate the electric field E⃗(x, y).
(d) Sketch a contour plot of V(x, y). On that same plot draw arrows in at least three places
representing E⃗.
8.66 The following plot shows equipotential lines due to a configuration of positive charges.
(b) What would look the same and what would look different in the plot if the charges were
negative?
8.67 For each of the functions below, make a 3D plot and a contour plot in the region
−1 ≤ x ≤ 1, −1 ≤ y ≤ 1. On the contour plot show at least three arrows representing the
gradient at different points.
(a) $f(x, y) = e^{-(x-1)^2 - y^2} + e^{-(x+1)^2 - y^2}$
(b) $f(x, y) = \sin(xy)$
(c) $f(x, y) = \sin(x^2 y)$
8.68 You are running an experiment in a large, flat Petri dish, which you can treat as
effectively 2D. A heater is heating the center of the dish, so the temperature throughout the
dish falls off exponentially with distance from the heater: $T = e^{-x^2 - y^2}$, as shown below.
[Figure: 3D plot of T over −2 ≤ x ≤ 2, −2 ≤ y ≤ 2]
$$\vec{v} = \vec{\nabla} F \qquad (8.5.1)$$
Conservative fields will be discussed in more detail in Section 8.11. There we will give some
equivalent definitions of the term (which is why we called that our “first definition”). We will
discuss important physical properties of conservative fields, and give shortcuts for determin-
ing if a given field is conservative.
In this section, however, we are introducing the term only so that we can present an impor-
tant relationship between gradients and line integrals, known as the “gradient theorem.”3
The gradient theorem gives you a fast way to evaluate line integrals. You don’t bother with
dot products, integrate anything, or even worry about the path; just plug two points into one
scalar function and you’re done. But of course, it only works if that scalar function exists!
For non-conservative fields, you still have to integrate along the curve.
The gradient theorem is also useful in the other direction: you can evaluate the line
integral in order to find the potential function. But that raises two immediate questions.
∙ Along what path do you evaluate the line integral? The surprising answer, implied by the
gradient theorem, is that it doesn’t matter: you choose the path that will make your
work easiest. Does that mean that the result of a line integral depends only on the
endpoints, regardless of the path you take between them? Yes, it does—but only for
conservative vector fields! For non-conservative fields, integrals along different paths
will give different answers.
∙ The gradient theorem doesn’t tell me the potential anywhere, just the potential difference between
two points. That’s true, but remember that potential is only defined up to an arbitrary
constant. So we pick any reference point where we define F = 0 and then calculate
− ∫C v⃗ ⋅ d s⃗ from the reference point to an arbitrary point (x, y) to get the potential
F (x, y).
In the problems you will use the gradient theorem in both directions: going from the poten-
tial function to the line integral, and going from the line integral to the potential func-
tion. Below we demonstrate the second use, and then discuss why the gradient theorem
works.
3 The gradient theorem goes by many other names, such as “The fundamental theorem for line integrals” and “The
fundamental theorem for gradients.” While we prefer more concise verbiage, there is something to be said for
names that remind us that all such theorems are variations of the fundamental theorem of calculus.
Question: Find the potential F(x, y) of the field v⃗ = e^y î + xe^y ĵ, setting F = 0 at the
origin.
Answer:
To find the potential at (x, y) we evaluate a line integral from the origin to (x, y). Of course
we want to choose a path that will make our calculations easy; we will go straight across to
(x, 0) and from there to (x, y).
[Figure: the path from the origin across to (x, 0) and from there to (x, y)]
Along the first segment, y is the constant zero. But x is a variable, starting at zero and
ending at the x-value toward which we are integrating. We cannot use the letter x to mean both
“the variable that keeps changing along this line” and “the constant value at the end of this
line.” There is no universal convention for how to handle this situation, but we have chosen
to use x̃ for the variable and x for the constant.
$$\int_0^x v_x \, d\tilde{x} = \int_0^x 1 \, d\tilde{x} = x$$
Along the second segment, x̃ is the constant x and ỹ varies from 0 to y. We are
therefore integrating vy instead of vx.
$$\int_0^y x e^{\tilde{y}} \, d\tilde{y} = x e^{y} - x$$
Adding the two segments gives ∫C v⃗ ⋅ ds⃗ = x + (xe^y − x) = xe^y, so F(x, y) = −∫C v⃗ ⋅ ds⃗ = −xe^y.
We can check this result by taking its gradient.
$$-\vec{\nabla} F = -\frac{\partial F}{\partial x}\,\hat{i} - \frac{\partial F}{\partial y}\,\hat{j} = e^{y}\,\hat{i} + x e^{y}\,\hat{j}$$
Since −∇⃗F = v⃗, our answer is correct.
This would be a good time to stop and confirm the gradient theorem (or your understanding
of it). The example above gave us a vector field and its potential function. So pick any two
points, and any path between them, and verify that the two sides of the gradient theorem
give you the same answer. If something doesn’t match, you’ve made a mistake somewhere!
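One way to carry out that check numerically is sketched below. It uses the field and potential from the example above; the endpoints `P1` and `P2`, the two paths, and the helper name `line_integral` are our own illustrative choices, not anything from the text. Because the example uses the convention v⃗ = −∇⃗F, the sketch assumes the line integral from P1 to P2 should come out to F(P1) − F(P2).

```python
import math

def v(x, y):
    """The example field v = e^y i + x e^y j."""
    return math.exp(y), x * math.exp(y)

def F(x, y):
    """The potential found in the example (F = 0 at the origin)."""
    return -x * math.exp(y)

def line_integral(path, n=20000):
    """Midpoint-rule estimate of the integral of v . ds along path(t), 0 <= t <= 1."""
    total = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        vx, vy = v((x0 + x1) / 2, (y0 + y1) / 2)
        total += vx * (x1 - x0) + vy * (y1 - y0)
    return total

# Hypothetical endpoints, chosen only for illustration.
P1, P2 = (1.0, 0.0), (2.0, 1.0)

def straight(t):
    """Straight line from P1 to P2."""
    return 1.0 + t, t

def curved(t):
    """A different path with the same endpoints."""
    return 1.0 + t, t * t + 0.5 * math.sin(math.pi * t)

# With v = -grad F, the line integral from P1 to P2 should equal F(P1) - F(P2).
expected = F(*P1) - F(*P2)
print(expected, line_integral(straight), line_integral(curved))
```

The three printed numbers should agree to several decimal places. Swapping in a non-conservative field for `v` would make the two path results differ.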
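As with any integral, we begin our analysis with a small slice. ds⃗ is a step along
the path, small enough to be considered a line. At that particular spot the vector
∇⃗F points at some angle (which could be anything) off the path.
[Figure: a small step ds⃗ along the path, with components dx and dy, and the vector ∇⃗F at that point]
What does the dot product of those two vectors represent?
$$\vec{\nabla} F \cdot d\vec{s} = \left( \frac{\partial F}{\partial x}\,\hat{i} + \frac{\partial F}{\partial y}\,\hat{j} \right) \cdot \left( dx\,\hat{i} + dy\,\hat{j} \right) = \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy$$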
(𝜕F ∕𝜕x)dx is the amount the function would change if you took a small step dx in the x-
direction. Add this to the change along dy (and also dz in three dimensions) and you get the
total change in F as you move through the vector d s⃗. (If you’ve studied directional deriva-
tives you should recognize that what we are calculating here is the directional derivative in
the direction of d s⃗, multiplied by the length ds, to give the amount that F changes along
d s⃗. But directional derivatives aren’t necessary to follow this logic if you understand partial
derivatives.)
This is the key point in the derivation: ∇⃗F ⋅ ds⃗ tells you how much F increases (or
decreases) as you move along a small displacement d s⃗. When you integrate, you add up all
the incremental changes in F as you move along the entire curve. So you end up with the
total change in F as you move from the beginning of the curve (P1 ) to the end (P2 )—and
there’s the gradient theorem.
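The argument above can be compressed into one chain of equalities. This is a two-dimensional sketch in our own notation: C is the curve, P1 and P2 are its endpoints as in the text, and the second line assumes the sign convention v⃗ = −∇⃗F used in the worked example.

```latex
\begin{align*}
\int_C \vec{\nabla} F \cdot d\vec{s}
  &= \int_C \left( \frac{\partial F}{\partial x}\,dx + \frac{\partial F}{\partial y}\,dy \right)
   = \int_C dF
   = F(P_2) - F(P_1) \\
\int_C \vec{v} \cdot d\vec{s}
  &= F(P_1) - F(P_2)
  \qquad \text{if } \vec{v} = -\vec{\nabla} F
\end{align*}
```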
Stepping Back
The past two sections have been all about the relationship between a scalar field and its gradi-
ent, or equivalently between a vector field and its potential. You know that in one dimension
the following two statements are mathematically equivalent.
∙ If f(x) = x² then its derivative is f′(x) = 2x.
∙ If g(x) = 2x then its integral is ∫ g(x)dx = x² + C.
The gradient and potential give us a higher dimensional analog, with a negative sign
thrown in.
∙ If F(x, y) = x² + 3y then its gradient is ∇⃗F = 2x î + 3ĵ.
∙ If v⃗ = 2x î + 3ĵ then its potential function is −∫C v⃗ ⋅ ds⃗ = −x² − 3y + C.
In many math texts, and in hydrodynamics, no negative sign is used so that “gradient” and
“potential function” are perfect opposites. We will retain the negative sign that is used in
electricity and gravity, which simplifies the relationship to potential energy.
z without changing 𝜕F∕𝜕x. So add an arbitrary function G(y, z) to your function from Part (a).
You now have the general solution to the partial differential equation 𝜕F∕𝜕x = 2x cos(x² + z).
(c) Take the partial derivative of your answer to Part (b) with respect to y, and set it
equal to 2yz. You can now solve for the function G(y, z), but your answer will still have an
arbitrary function in it.
(d) Finish finding the function F. In this part your answer should have an arbitrary constant,
but not an arbitrary function.
(e) Take the gradient of your answer to Part (d) and verify that it works.
(f) Try to use the same technique to find the potential function for v⃗ = x î + xyĵ. What went
wrong?
The vector field V⃗ (x, y) represents how much mass passes by per unit cross-sectional area per
unit time. The “divergence” of V⃗ represents how much the gas is flowing away from a given
point. To approach the divergence, we consider a square region. As the gas flows through
this region, we ask the question: Is gas flowing out of this region faster than it is flowing in?
(Generally losing gas, positive divergence.) Or is gas flowing into this region faster than it
is flowing out? (Generally gaining gas, negative divergence.) Or is gas flowing in and out at
the same rate? (Neither gaining nor losing gas, zero divergence.)
In all our examples, the gas is flowing to the right. Assume
the constant k in the equations below is positive. In our first
example, the flow of the gas is constant. This field could be
represented as V⃗₁ = k î.
1. Compare the rates at which gas is entering our region
(from the left) and exiting (to the right). Which rate is
faster, or are they the same?
2. Is this region generally losing gas, gaining gas, or neither?
3. Is the divergence of k î positive, negative, or zero?
In our second example, the flow of the gas increases as you
move up. This field could be represented as V⃗₂ = ky î.
4. Compare the rates at which gas is entering our region
(from the left) and exiting (to the right). Which rate is
faster, or are they the same?
5. Is this region generally losing gas, gaining gas, or neither?
See Check Yourself #54 in Appendix L
6. Is the divergence of kyî positive, negative, or zero?
In our third example, the flow of the gas increases as you move
to the right. This field could be represented as V⃗₃ = kx î.
7. Compare the rates at which gas is entering our region
(from the left) and exiting (to the right). Which rate is
faster, or are they the same?
8. Is this region generally losing gas, gaining gas, or neither?
Our customary way of describing a vector field asks three questions at each point: “How
much does this field point in the x-direction,” “How much does it point in the y-direction,”
and “How much in the z-direction?” The answers offer a complete description of the field,
but not always in a form that is easy to work with.
Here are two very different questions: “How much does this vector field point away from
this point,” and “How much does this vector field swirl around this point?” These two, the
“divergence” and “curl” respectively, offer a very different (and often more useful) break-
down of how a vector field is flowing and changing.
[Figure: three vector field plots, each with axes running from −4 to 4]
The first drawing shows a constant vector field. At the origin (or at any other point), the field
is flowing into the point and out of the point with the same strength, so the divergence is
zero. And the field is not flowing around the point, so the curl is zero.
The second field points ever away from the origin, so the divergence at the origin is pos-
itive. (If it were moving toward the origin, the divergence would be negative.) Once again,
there is no rotation around the origin, so the curl is zero. The third drawing is the opposite:
there is a clockwise curl at the origin but a zero divergence.
4 Momentum density and flow rate were discussed more carefully in Chapter 5.
The drawings below are less obvious; you might think both represent zero-divergence
zero-curl fields, but in fact neither one does. To help guide you we’ve drawn a small box
and paddle wheel centered on the origin. Once again, think about your own answers before
reading ours below.
[Figure: two vector field plots, each with axes running from −1.0 to 1.0, with a small box and paddle wheel centered on the origin]
Consider the boxed region in the first field. You see the field moving into this region from
the left, and back out again on the right. But the outward flow is greater than the inward flow,
so the net divergence is positive; over time, the total amount of fluid in that region would
decrease. You should be able to convince yourself that this would be true for any region you
could draw in this flow, so the divergence is positive everywhere. The curl of this field is zero,
by the way.
Now consider the boxed region in the second field. The inward and outward flows are
equal, so there is no divergence. There does not seem to be any rotation either, but look
what will happen to the paddle wheel: the push on top is stronger than the one on bottom,
so the paddle wheel will rotate clockwise. Once again you should be able to convince yourself
that the same would be true about a paddle wheel placed anywhere in this field, so the curl
is clockwise everywhere. Since you can have a non-zero curl at a point even when the field
vectors don’t rotate around that point, we use the term “circulation” for the property being
measured by curl. We’ll define that term more rigorously later, but for now you can just think
of it in terms of the paddle wheel.
In three dimensions we can’t describe the curl by simply saying “clockwise” or “counter-
clockwise”; the field might curl around the x-axis, or around the y-axis, or around any other
line you can draw. So we specify the curl with a direction vector, following a right-hand rule:
if you wrap the fingers of your right hand in the direction the paddle wheel would rotate,
your thumb points in the direction of the curl. So the curl for the vector field above points
into the page. Of course it wouldn’t matter if we used a left-hand rule instead, but it does
matter that we all use the same rule, and the right-hand rule is universally agreed upon. We
can thus use a single vector to communicate an arbitrary direction of circulation in three
dimensions.
The following facts are critical for understanding the divergence and curl. The divergence
is a scalar; the curl is a vector. One number tells you everything you need to know about the
divergence, noting that the number can be positive (net outward flow), negative (net inward
flow), or zero (balance). But curl has a magnitude (how much circulation) and a direction
(what line the field is circulating about, and which way it flows around that line).
Remember that divergence is a scalar. Those three partial derivatives are not components of a vec-
tor; they are simply being added. Curl is a vector.
You will sometimes see these written as div f⃗ and curl f⃗, but the notation ∇⃗ ⋅ f⃗ and ∇⃗ × f⃗ has
mnemonic value, as we will discuss below.
The cross product and the curl are often written as the determinant of a matrix. It’s not a
real matrix, but if you’ve already memorized the determinant of a 3 × 3 matrix then you can
leverage that brain space to remember the formula for the curl.
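$$\vec{\nabla} \times \vec{f} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ f_x & f_y & f_z \end{vmatrix}$$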
If you’re not comfortable calculating determinants you can ignore this matrix and use
Equation 8.6.2 directly. Below we illustrate the use of this trick, but if you skip over the
matrix you should still be able to follow the example.
Question: Calculate the divergence and curl of the field f⃗ = e^{6x−4y} î + (x²∕2 − 2xy)ĵ +
(sin(z)∕x)k̂ at the point (2, 3, 𝜋∕2). What do they tell us about the field at that point?
Answer:
$$\vec{\nabla} \cdot \vec{f} = 6e^{6x-4y} - 2x + \cos(z)/x$$
$$\vec{\nabla} \cdot \vec{f}(2, 3, \pi/2) = 6 - 4 = 2$$
$$\vec{\nabla} \times \vec{f} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ e^{6x-4y} & x^2/2 - 2xy & (\sin z)/x \end{vmatrix} = (0 - 0)\,\hat{i} + \left( 0 + \sin(z)/x^2 \right)\hat{j} + \left( x - 2y + 4e^{6x-4y} \right)\hat{k}$$
$$\vec{\nabla} \times \vec{f}(2, 3, \pi/2) = (1/4)\,\hat{j}$$
The positive divergence tells us that the field is generally moving away from the point
(2, 3, 𝜋∕2) near that point. The curl in the ĵ-direction indicates a circulation around
this point. An observer looking at this point from the positive y-direction would
measure that circulation as counterclockwise.
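The hand calculation above can be cross-checked numerically. The sketch below estimates each partial derivative with a central difference and assembles the divergence and curl from them; the helper names `partial`, `divergence`, and `curl` and the step size `h` are our own choices, not part of the text.

```python
import math

def f(x, y, z):
    """The example field: (e^(6x-4y), x^2/2 - 2xy, sin(z)/x)."""
    return (math.exp(6 * x - 4 * y), x * x / 2 - 2 * x * y, math.sin(z) / x)

def partial(component, axis, p, h=1e-5):
    """Central-difference estimate of d f[component] / d (coordinate axis) at point p."""
    hi, lo = list(p), list(p)
    hi[axis] += h
    lo[axis] -= h
    return (f(*hi)[component] - f(*lo)[component]) / (2 * h)

def divergence(p):
    """Sum of d f_x/dx, d f_y/dy, d f_z/dz."""
    return sum(partial(i, i, p) for i in range(3))

def curl(p):
    """Components of the curl, in the same order as the determinant expansion."""
    return (
        partial(2, 1, p) - partial(1, 2, p),  # d f_z/dy - d f_y/dz
        partial(0, 2, p) - partial(2, 0, p),  # d f_x/dz - d f_z/dx
        partial(1, 0, p) - partial(0, 1, p),  # d f_y/dx - d f_x/dy
    )

point = (2.0, 3.0, math.pi / 2)
print(divergence(point))  # should be close to 2
print(curl(point))        # should be close to (0, 1/4, 0)
```

Finite differences only approximate the derivatives, so expect agreement to several decimal places rather than exact equality.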
sign of the divergence and the direction of the curl at the point (0, 1). Then calculate
the divergence and curl to check your answers.
[Figure: vector plots of f⃗ (left) and g⃗ (right), each over −2 ≤ x ≤ 2, −2 ≤ y ≤ 2]
Answer:
Looking at the pictures it’s clear that f⃗ (on the left) has zero curl and g⃗ on the right
has zero divergence. Since all the vectors are pointing outward on the left and
clockwise on the right, it seems like f⃗ has positive divergence and g⃗ has a curl pointing
into the page. If you calculate them, though, you find just the opposite. At the point
(0, 1), ∇⃗ ⋅ f⃗ = −2 and ∇⃗ × g⃗ = 2k̂, which points out of the page. What’s going on?
As always, imagine a small box around the point (0, 1) and a small paddle wheel at
that point. The vectors of f⃗ around that point are pointing straight up, but they are
rapidly getting smaller, so there’s more flow into the box than out of it. Similarly, the
vectors of g⃗ are pointing to the right there, and the ones above are much smaller
than the ones below, so the curl is counterclockwise. What makes these examples
unintuitive is that we naturally tend to focus on the overall flow, which looks outward
for f⃗ and clockwise for g⃗. At any particular point other than the origin, the flow is not
doing that, however.
The point of this example is not that you shouldn’t trust your intuition about divergence
and curl. On the contrary, you can build up a good intuition for them that works even in
unusual circumstances. You need to get in the habit of thinking about what the vectors are
doing in the vicinity of a particular point.
5 “Del” is actually the name of the mathematical object we define here. The typographical symbol ∇ is called “nabla.”
We write the î to the left of the 𝜕∕𝜕x so it won’t look like we’re taking the derivative of î, which
would of course be zero. With that cleared up, you’re probably still thinking “What is that?”
It’s kind of like a vector (it has î-, ĵ-, and k̂-components), but it isn’t a real vector because
those components aren’t numbers. The symbol ∇⃗ by itself doesn’t mean anything. It’s a set
of instructions that takes on meaning when you use it to act on a function.
A set of instructions for turning one function into another one is called an operator. For
example, we could define an operator O = d∕dx. Then if f(x) = x³ we could write Of = 3x².
When it looks like O is multiplying f , it’s really acting on it.
But ∇⃗ is more complicated because it’s a vector operator. We can multiply a vector three
ways: multiply it by a scalar, find its dot product with another vector, or find its cross product
with another vector. These three operations with ∇⃗ represent the gradient, divergence, and
curl, respectively.
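$$\vec{\nabla} f = \frac{\partial f}{\partial x}\,\hat{i} + \frac{\partial f}{\partial y}\,\hat{j} + \frac{\partial f}{\partial z}\,\hat{k}$$
$$\vec{\nabla} \cdot \vec{f} = \frac{\partial f_x}{\partial x} + \frac{\partial f_y}{\partial y} + \frac{\partial f_z}{\partial z}$$
$$\vec{\nabla} \times \vec{f} = \left( \frac{\partial f_z}{\partial y} - \frac{\partial f_y}{\partial z} \right)\hat{i} + \left( \frac{\partial f_x}{\partial z} - \frac{\partial f_z}{\partial x} \right)\hat{j} + \left( \frac{\partial f_y}{\partial x} - \frac{\partial f_x}{\partial y} \right)\hat{k}$$
Nothing about ∇⃗ introduces any new math. It’s just a shorthand notation for writing and
remembering the formulas for these field derivatives.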
The Laplacian
If the gradient, divergence, and curl represent first derivatives of fields, what are the second
derivatives? You can think of many possible answers to this question, but one of them turns
out to be particularly useful: the divergence of the gradient of a scalar field S is called the
“Laplacian” of S.
The Laplacian serves roughly the same role in higher dimensions that the second derivative
fills in one. For instance, if f′′(x) = 0 everywhere, then f(x) has no minima or maxima; if
∇²f(x, y, z) = 0 everywhere, then f(x, y, z) has no minima or maxima.
More importantly, the Laplacian comes up in a variety of important equations. For
instance, in electrostatics, a charge distribution with density D(x, y, z) creates an electric
potential V that obeys Poisson’s equation ∇²V = −D∕𝜖₀.
The Laplacian is usually written ∇²S because it’s the expression you would get if you used
the rules above to find the “squared magnitude” ∇² and then acted with that on S. Some
texts write ΔS instead.
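Written out in Cartesian coordinates, "the divergence of the gradient" expands as follows. This is a sketch that applies the gradient and divergence formulas given earlier in this section; the sample function in the last line is our own illustration and does not come from the text.

```latex
\begin{align*}
\nabla^2 S = \vec{\nabla} \cdot \left( \vec{\nabla} S \right)
  &= \frac{\partial}{\partial x}\left( \frac{\partial S}{\partial x} \right)
   + \frac{\partial}{\partial y}\left( \frac{\partial S}{\partial y} \right)
   + \frac{\partial}{\partial z}\left( \frac{\partial S}{\partial z} \right) \\
  &= \frac{\partial^2 S}{\partial x^2}
   + \frac{\partial^2 S}{\partial y^2}
   + \frac{\partial^2 S}{\partial z^2} \\
\text{Example: } S = x^2 + y^2 + z^2
  &\;\Rightarrow\; \vec{\nabla} S = 2x\,\hat{i} + 2y\,\hat{j} + 2z\,\hat{k}
  \;\Rightarrow\; \nabla^2 S = 2 + 2 + 2 = 6
\end{align*}
```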
Here again the analogy of ∇⃗ to a vector holds up. You can take the gradient of a scalar
function of n variables, and end up with an n-dimensional vector. You can take the divergence
of a vector field in n dimensions, and end up with a scalar function of n variables. The curl,
however, is an inherently three-dimensional entity. You can take the curl of a two-dimensional
vector field by treating the third component as zero, but the curl itself will point into the third
dimension. And while there are ways to generalize the idea into four or more dimensions,
you’re not talking about the curl any more. Three-dimensional vector fields define a special
case where the curl is meaningful, but they are after all a pretty important special case in our
universe.
8.84 [Figure: plot with x- and y-axes running from 1 to 4]
8.87 [Figure: plot with x- and y-axes running from 1 to 4]
(a) Write the field E⃗ in the form Ex î + Ey ĵ + Ez k̂. (One approach is to begin by writing E⃗
in terms of r⃗ and then converting to Cartesian.)
(b) Make a rough sketch of E⃗ in two dimensions. You may use a vector plot or field lines.
(c) Calculate ∇⃗ ⋅ E⃗. Is the result positive, negative, or zero?
(d) Your answer to the last part may have surprised you. Draw a small box around some point
other than the origin and explain why it makes sense from your plot that the divergence at
that point behaves the way you found in Part (c).
(e) Calculate ∇⃗ × E⃗. Explain why the result makes sense based on your drawing.
8.102 A wire carrying a current I in the positive z-direction creates a magnetic field whose
magnitude is |B⃗| = 𝜇₀I∕(2𝜋𝜌) where 𝜇₀ is a constant and 𝜌 is the distance from the wire. The
direction of the field points around the wire, counterclockwise as viewed from above.
(a) Write the field B⃗ in the form Bx î + By ĵ + Bz k̂. (Just play around with this until you
get something that works at key obvious points and then check that it works more generally.)
(b) Make a rough sketch of B⃗ in two dimensions for some constant z.
(c) Calculate ∇⃗ ⋅ B⃗. Is the result positive, negative, or zero? Explain why this result makes
sense based on your drawing.
(d) Calculate ∇⃗ × B⃗. Your result may surprise you: you will examine why it makes sense in
Section 8.7 Problem 8.116.
8.103 Harry and Ron are hunched over a long homework scroll. Harry says “The Laplacian here
obviously points straight up!” Ron replies “You stupid git, it points down!” Without glancing
at the problem, Hermione cheerily calls over her shoulder “You’re both wrong.” How does she
know?
8.104 Prove that for any scalar field S(x, y, z), ∇⃗ × ∇⃗S (“the curl of the gradient”) is 0⃗.
8.105 Prove that for any vector field v⃗(x, y, z), ∇⃗ ⋅ (∇⃗ × v⃗) (“the divergence of the curl”) is 0.
8.106 The Laplacian is the divergence of the gradient. Problems 8.104 and 8.105 asked you
about the curl of the gradient and the divergence of the curl. Why can’t we ask you about
the curl of the divergence? Why can’t we ask you about the gradient of the curl?
8.107 (a) Prove that any linear function f(x, y, z) = ax + by + cz + d is a solution to
Laplace’s equation ∇²f = 0.
(b) Find another (non-constant, non-linear) solution to Laplace’s equation.
8.108 $f(x, y) = e^{-(x-1)^2 - y^2} - e^{-(x+1)^2 - y^2}$
(a) Have a computer make a 3D plot of f(x, y). Choose a domain for your plot that lets you
clearly see how the function behaves.
(b) Based on your plot, predict the sign of the Laplacian at the points (−1, 0), (0, 0), and
(1, 0). For each one explain why you would expect the answer you gave based on the plot.
(c) Calculate ∇²f (by hand or on a computer) and check your predictions.
8.109 The electric potential V is related to the density of charge 𝜌 in a region by Poisson’s
equation: ∇²V = −𝜌∕𝜖₀. You measure the potential in a room to be k(x⁴ − y⁴) where k is a
positive constant. In what parts of the room is there positive charge? Negative charge? No net
charge?
8.110 The steady-state temperature T in a region of space with no external sources of heating
or cooling approximately obeys Laplace’s equation: ∇²T = 0.
(a) Assume a 1D rod has fixed temperatures at the ends: T(0) = T₁, T(L) = T₂. Solve Laplace’s
equation in 1D to find T(x) along the rod.
(b) Which of the following steady-state temperature distributions is possible for a 2D region
without sources of heating or cooling? (There may be more than one correct answer.)
i. T(x, y) = x − y
ii. T(x, y) = sin(x − y)
iii. T(x, y) = e^{−x} sin y
iv. T(x, y) = e^{−x−y}
8.111 What does the formula ∇⃗ ⋅ v⃗ have to do with the question “how much does the vector
field flow away from this point?” We will confine ourselves mostly to the simplest case: a
vector that is defined only on the positive x-axis, and has only an x-component.
(a) Begin with a constant vector, f⃗ = 2î. For any small segment of the x-axis, the vector flow
into that segment is the same as the vector flow out of that
point. What does this imply for the divergence? What is dfx∕dx?
(b) Now consider a growing vector, g⃗ = 2x î. For any given segment, which is greater, the
vector flow into that segment or the flow out? What does this imply for the divergence? What
is dgx∕dx?
(c) Now consider a shrinking vector, h⃗ = (1∕x)î. For any given segment, which is greater, the
vector flow into that segment or the flow out? What does this imply for the divergence? What
is dhx∕dx?
(d) Repeat this exercise for vectors pointing to the left (still on the positive x-axis).
Create simple functions that are growing and shrinking, predict their divergences, and then
calculate their derivatives. Does the generalization “the formula for divergence tells how
much the vector field flows away from a small region around this point” always hold up?
(e) How would we extend this argument to two or three dimensions?
8.7 | Divergence and Curl II—The Math Behind the Pictures 417
Are we there yet? Not quite. A surface integral is field strength times area, so as our region
shrinks to zero, the integral will do the same. So instead of looking at the integral itself, we
look at the integral per unit volume. This gives us a useable definition:
The divergence of a vector field f⃗ at a point P is the surface integral per unit volume of f⃗ through a
surface surrounding P in the limit where the volume of that surface shrinks to zero.
$$\vec{\nabla} \cdot \vec{f} = \lim_{\Delta V \to 0} \left( \frac{\oiint \vec{f}(x, y, z) \cdot d\vec{A}}{\Delta V} \right) \qquad (8.7.1)$$
This somewhat hand-waving definition can be made mathematically rigorous, but like the
definition of the derivative itself, Equation 8.7.1 is not something people actually use for
calculations. Our goal is to translate it into Equation 8.6.1 by going back to Figure 8.14 and
calculating the surface integral.
First we consider the left and right faces of the box. Since surface integrals depend only
on the component of a vector perpendicular to a surface, we are only interested in the value
of fx on those faces.
Now, suppose fx were equal on both faces, and suppose it pointed to the right. That would
create a positive (outward) flow on the right face, and an equivalent negative (inward) flow
on the left face, adding up to zero.6 The only way these two faces can contribute to the
divergence is if fx changes as we move to the right. This should be sounding like a buildup
to a derivative, but let’s keep going and see how that happens.
At the right face of the box, the perpendicular component of the field is fx (x + Δx, y, z).
The surface integral is that component multiplied by the area (ΔyΔz). The left side is
similar, but with two differences. First, the perpendicular component is fx (x, y, z). Second,
if fx is positive (rightward) then the flux through this face is negative (inward), and
vice versa, so we need a negative sign. So the total integral between these two faces is
(fx(x + Δx, y, z) − fx(x, y, z)) Δy Δz. Equation 8.7.1 then instructs us to divide by the volume
(ΔxΔyΔz) and take the limit:
$$\lim_{\Delta x \to 0} \frac{f_x(x + \Delta x, y, z) - f_x(x, y, z)}{\Delta x} \qquad \text{The contribution of the left and right faces of the cube to the divergence}$$
Remember when it sounded like we were describing a derivative above? This is the definition
of a partial derivative, so we can write more succinctly that the left and right faces of the cube
make a net contribution to the divergence of 𝜕fx ∕𝜕x. Similar arguments show that the front
and back faces contribute 𝜕fy ∕𝜕y and the top and bottom ones contribute 𝜕fz ∕𝜕z. Put it all
together and you get Equation 8.6.1.
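The limit in Equation 8.7.1 can be watched numerically. The sketch below reuses the field from the worked example in Section 8.6, whose divergence at (2, 3, 𝜋∕2) was found to be 2, and computes the net outward flow through a shrinking cube divided by its volume. As in the derivation above, each face integral is approximated by the perpendicular component at the center of the face times the face area; the function name `flux_per_volume` and the cube sizes are our own choices.

```python
import math

def f(x, y, z):
    """Field from the Section 8.6 example: (e^(6x-4y), x^2/2 - 2xy, sin(z)/x)."""
    return (math.exp(6 * x - 4 * y), x * x / 2 - 2 * x * y, math.sin(z) / x)

def flux_per_volume(p, d):
    """Net outward flow through a cube of side d with one corner at p, divided by d**3.

    Each face integral is approximated by the perpendicular component of f
    at the face center multiplied by the face area d*d.
    """
    x, y, z = p
    cx, cy, cz = x + d / 2, y + d / 2, z + d / 2
    area = d * d
    flow = 0.0
    flow += (f(x + d, cy, cz)[0] - f(x, cy, cz)[0]) * area  # faces perpendicular to x
    flow += (f(cx, y + d, cz)[1] - f(cx, y, cz)[1]) * area  # faces perpendicular to y
    flow += (f(cx, cy, z + d)[2] - f(cx, cy, z)[2]) * area  # faces perpendicular to z
    return flow / d**3

point = (2.0, 3.0, math.pi / 2)
for d in (0.1, 0.01, 0.001):
    print(d, flux_per_volume(point, d))  # values should approach the divergence, 2
```

The point sits at a corner of the cube rather than inside it, matching the simplification made in the derivation, so the estimates approach 2 from one side as the cube shrinks.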
Curl
The curl can be defined similarly. Instead of a surface we draw a closed
loop around the point, and take the line integral of the vector field
around that loop. (We’ve been bandying the term “circulation” about
when talking about the curl; the circulation of a vector field around a
closed loop means the line integral of that field around that loop.)
The line integral matches the curl just as well as the surface integral
matches the divergence, because it measures how much the field is
circling around the point, ignoring the components that radiate in or
out. But remember that curl is a vector! If our loop lies flat in the xy
(horizontal) plane, then it will measure the z-component of the curl. (Remember the
right-hand rule.) So we need three different loops to get the three different components of the
curl.
6 We are following the standard convention in fluid flow where “flux” refers to the component of the vector field
pointing through the surface and “flow” refers to that component multiplied by the area, i.e. the value of the surface
integral.
Other than that, the outline is similar to what we did for the divergence. Just as before, we
shrink our loop down to focus on the region immediately around the point. Just as before,
a smaller loop will lead to a smaller line integral, so we look at the circulation per unit area
enclosed to reach a definition:
The z-component of the curl of a vector field f⃗ at a point P is the circulation per unit area of f⃗ around
a loop surrounding P, in a plane perpendicular to k̂, in the limit where the area of that loop shrinks
to zero.
$$\lim_{\Delta A \to 0} \left( \frac{\oint \vec{f}(x, y, z) \cdot d\vec{s}}{\Delta A} \right) = \text{the component of the curl of } \vec{f} \text{ perpendicular to the plane of the loop} \qquad (8.7.2)$$
The steps in turning this into Equation 8.6.2 are similar to those outlined above for the
divergence. We will start here with the basic ideas and leave the details for you to go through
in Problem 8.113.
We begin by drawing a rectangle in a horizontal plane with the point (x, y, z) at one cor-
ner. (Once again, our point should technically lie inside the loop, but once you follow our
derivation it isn’t hard to clean up that detail.) The area of the rectangle is ΔA = ΔxΔy.
Our goal is to find the line integral of a vector field f⃗(x, y) around this rectangle in a
counterclockwise direction. First we consider the line integrals on the left and right sides of
the rectangle. Since a line integral depends on the component of a vector parallel to the line,
only fy will contribute on those sides. If fy were the same on both sides the two sides would
cancel each other out, since we are measuring the −fy-component on the left and the
+fy-component on the right. So the curl depends on how fy changes as you go from x to x + Δx.
You can once again spot a derivative on the horizon, but what matters this time is how fy is
changing with respect to x.
[Figure: a rectangle of width Δx and height Δy, with bottom corners at (x, y, z) and (x + Δx, y, z)]
To finish the calculation you write an expression for the sum of the line integrals on the
right and left, divide by the area of the rectangle, and take the limit as Δx → 0, which gives
you a partial derivative. Then you do the same for the top and bottom of the rectangle and
add the two to get the z-component of the curl. The calculations of the x- and y-components
are almost identical, finally leading to Equation 8.6.2.
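The recipe above (integrate around a small rectangle, divide by its area, shrink the rectangle) can be tried numerically. The following is a minimal sketch, not part of the original derivation: the sample field and the function names are arbitrary choices made for illustration.

```python
def f(x, y):
    """Sample 2D field chosen only for illustration: f = (-x*y, x**2 + y)."""
    return (-x * y, x**2 + y)

def circulation_per_area(x, y, dx, dy, n=200):
    """Counterclockwise line integral around the rectangle whose lower-left
    corner is (x, y), divided by the rectangle's area dx*dy."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n  # midpoint of the i-th piece of each side
        total += f(x + t * dx, y)[0] * (dx / n)        # bottom, moving in +x
        total += f(x + dx, y + t * dy)[1] * (dy / n)   # right, moving in +y (+fy)
        total -= f(x + t * dx, y + dy)[0] * (dx / n)   # top, moving in -x
        total -= f(x, y + t * dy)[1] * (dy / n)        # left, moving in -y (-fy)
    return total / (dx * dy)

def curl_z_exact(x, y):
    """d(fy)/dx - d(fx)/dy for the sample field: 2x - (-x) = 3x."""
    return 3 * x

# Shrinking the rectangle brings the circulation per unit area toward the curl.
for size in (0.1, 0.01, 0.001):
    print(size, circulation_per_area(1.0, 2.0, size, size), curl_z_exact(1.0, 2.0))
```

Note how the left and right sides enter with opposite signs of fy, which is exactly why only the change in fy from x to x + Δx survives.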
(a) Find the electric flux through the inner hemisphere. (In electromagnetism “flux” means the
surface integral of the electric or magnetic field.)
(b) Find the flux of E⃗ through the outer hemisphere.
(c) Find the flux of E⃗ through the flat ring connecting the two.
(d) Find the total flux through the surface.
(e) What does your answer tell you about the divergence of E⃗?
(f) In Cartesian coordinates,
\[ \vec{E} = \frac{kq}{\left( x^2 + y^2 + z^2 \right)^{3/2}} \left( x\hat{i} + y\hat{j} + z\hat{k} \right). \]
Take the divergence of this field and check that it verifies your prediction.
The prediction and calculation you just did are valid for all points except the location
of the point charge. You will consider what’s happening at that point in Problem 8.115.
8.115 The electric field E⃗ around a positive point charge q has magnitude kq∕r², where r is
distance from the point and k is a constant. The field points directly away from the charge.
(a) Calculate the flux of E⃗ through a sphere of radius R centered on the point charge.
(In electromagnetism “flux” means the surface integral of the electric or magnetic field.)
(b) Divide your answer to Part (a) by the volume of the sphere.
(c) Take the limit of your answer to Part (b) as R → 0 to find the divergence of E⃗ at
the location of the point charge. Explain what this answer is telling you about the field.
8.116 A wire carrying a current I in the positive z-direction creates a magnetic field whose
magnitude is |B⃗| = 𝜇0 I∕(2𝜋𝜌) where 𝜇0 is a constant and 𝜌 is the distance from the
wire. The direction of the field points around the wire, counterclockwise as viewed from
above. In Section 8.6 Problem 8.102 you computed that the curl of this magnetic field is
zero, possibly a surprising result, given that the field looks like it is all curl! In this
problem you will consider why that result makes sense.
In the drawings below the wire is not shown, coming out of the page at (x, y) = (0, 0).
We arbitrarily choose to consider the magnetic field at the point (1, 0).
(a) As we saw in this section, the curl is defined as the line integral of the field
around a loop. The picture below shows a simple circular loop around the point (1, 0).
Explain why calculating ∮ B⃗ ⋅ ds⃗ around this loop would be difficult; in other words,
why this is the wrong loop to choose.
[Figure: circular loop around the point (1, 0), drawn on x and y axes.]
We choose therefore a more complicated loop that is more tailored to this particular situation.
[Figure: the more complicated loop, drawn on x and y axes.]
(b) Two pieces of this loop are lines that radiate out from the origin. Explain how we can
tell with no calculation that ∫ B⃗ ⋅ ds⃗ along these pieces is zero.
The remaining two pieces are arcs cut from circles around the origin. The B⃗ field therefore
points directly along the outer arc (at radius 𝜌2 from the origin), and backward along the
inner arc (radius 𝜌1).
(c) The outer arc is longer than the inner arc. By what factor?
(d) The magnetic field is stronger at the inner arc than at the outer arc. By what factor?
(e) Using all your results above, explain why ∇⃗ × B⃗ = 0⃗.
7 Some texts use ê with subscripts to indicate unit basis vectors, such as ê𝜌 where we use 𝜌̂.
By contrast, f⃗(𝜌, 𝜙) = 3𝜌̂ + 4𝜙̂ does not point from the origin to 𝜌 = 3, 𝜙 = 4. It isn’t a con-
stant either; it defines a vector field whose value at each point depends on the directions of
𝜌̂ and 𝜙̂ at that particular point. This can make interpreting non-Cartesian fields a confusing
business, but it can also make life much simpler in some cases.
As an example, consider the field f⃗(𝜌, 𝜙) = −(1∕𝜌²)𝜌̂ (𝜌 ≠ 0). One approach is to consider
specific points. At (1, 0) this vector points to the left with a magnitude of 1. At (1, 1)
the vector points left-and-down with magnitude 1/2. At (0, 2) the vector points straight down
with magnitude 1/4, and so on.
But we can see the whole field more generally by noting that −𝜌̂ gives the direction (always
points straight toward the origin), and 1∕𝜌² gives the magnitude (decreases as you move away
from the origin). With that insight you can do the complete sketch quickly.
[Figure: vector plot of this field, with both axes running from −1.0 to 1.0.]
You may have figured out that this is not an arbitrary
example; it can represent the gravitational field exerted by a point mass, or the electric field
around an electron.
Of course we could have approached the entire problem in Cartesian coordinates;
𝜌⃗ = 𝜌𝜌̂ = x î + yĵ, so we would end up here.
\[ \vec{f}(x, y) = -\frac{1}{\left( x^2 + y^2 \right)^{3/2}} \left( x\hat{i} + y\hat{j} \right) \]
That’s true, and it can even be useful, but the polar representation is easier for most purposes.
It is because 𝜌̂ points in different directions at different points that it lends itself so naturally
to this physical situation.
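To see the two descriptions agree, the sketch below evaluates the polar form at a few Cartesian points and compares it with the Cartesian formula. It is only an illustration: the function names are made up here, and the unit-vector relations 𝜌̂ = cos 𝜙 î + sin 𝜙 ĵ and 𝜙̂ = −sin 𝜙 î + cos 𝜙 ĵ are the standard ones.

```python
import math

def field_polar(rho, phi):
    """Polar components (f_rho, f_phi) of f = -(1/rho**2) rho_hat."""
    return (-1.0 / rho**2, 0.0)

def to_cartesian(x, y):
    """Evaluate the polar field at the Cartesian point (x, y); return (fx, fy)."""
    rho = math.hypot(x, y)
    phi = math.atan2(y, x)
    f_rho, f_phi = field_polar(rho, phi)
    # rho_hat = (cos phi, sin phi) and phi_hat = (-sin phi, cos phi)
    fx = f_rho * math.cos(phi) - f_phi * math.sin(phi)
    fy = f_rho * math.sin(phi) + f_phi * math.cos(phi)
    return fx, fy

def cartesian_formula(x, y):
    """The same field written directly as -(x i + y j) / (x^2 + y^2)^(3/2)."""
    r3 = (x**2 + y**2) ** 1.5
    return (-x / r3, -y / r3)

for point in [(1, 0), (1, 1), (0, 2)]:
    fx, fy = to_cartesian(*point)
    print(point, "components:", (fx, fy), "magnitude:", math.hypot(fx, fy))
```

The magnitudes printed for the three points can be compared with the values 1, 1/2, and 1/4 quoted in the example.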
Answer:
One approach is to consider specific points. At (1, 0) this vector points to the right with
magnitude 1. At (0, 1) this vector points to the left with magnitude 1. Going through this
process slowly and manually can be a valuable way to test and build your facility with polar
vectors, but make sure to choose points that make this particular function easy to compute!
But after you’ve gone through that, look at the big picture. You should recognize the
𝜌̂-component as 𝜌 cos 𝜙 = x. So the radial component is −1 (pointing toward the origin)
where x = −1, it’s zero where x = 0, it’s 1 (pointing away from the origin) where x = 1 and
so on. The 𝜙̂-component is zero on the x-axis and is positive (counterclockwise) everywhere
else, reaching a maximum on the y-axis. The resulting field is shown in Figure 8.15.
FIGURE 8.15
A field like this is easiest to understand in polar coordinates, but if you want a
computer to plot it then it’s sometimes easier to put it in Cartesian coordinates. See
Appendix D for conversions between unit vectors in different coordinate systems.
Question: Find the gradient of the function f (𝜌, 𝜙) = k𝜙 and explain why it came out
the way it did.
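Answer:
This isn’t a trick question; you turn to Appendix F, look up the formula for the gradient in
polar coordinates (or look up the cylindrical and ignore the z), and plug in.
\[ \vec{\nabla} f = \frac{\partial f}{\partial \rho}\,\hat{\rho} + \frac{1}{\rho}\frac{\partial f}{\partial \phi}\,\hat{\phi} = \frac{k}{\rho}\,\hat{\phi} \]
[Figure: lines of constant f, labeled f = 0, f = kπ/4, and f = kπ/2.]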
But what is all that telling us? Start by thinking about what f
looks like. It is zero all along the positive x-axis. It increases as you rotate around:
f = k𝜋∕4 along the line y = x in the first quadrant, then f = k𝜋∕2 along the positive
y-axis, and so on.
Clearly, if you choose any point and move in the +𝜌-direction, f does not change;
the direction of steepest increase is around the circle in the +𝜙-direction. That’s what
the direction of the gradient is telling us.
But you can also easily convince yourself that the rate of change is not always the
same. Consider that if you start at the point (1, 0) and go around a circle, f moves
from 0 to 2k𝜋. If you start at the point (3, 0), f moves from 0 to 2k𝜋 as you traverse a
much bigger circle. In fact, f always goes from 0 to 2k𝜋 as you cover a distance of 2𝜋𝜌,
which makes the rate of change k∕𝜌, exactly as the formula predicts. (We’ll leave it to
you to think about why it makes sense that this formula is undefined at 𝜌 = 0.)
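The result (k∕𝜌)𝜙̂ can also be checked without the polar formula, by differentiating f = k𝜙 numerically in Cartesian coordinates and then projecting onto 𝜌̂ and 𝜙̂. This is a rough sketch under stated assumptions: k = 2 is an arbitrary value, 𝜙 is computed with `atan2` (so test points should stay away from the negative x-axis, where `atan2` jumps), and the function names are illustrative.

```python
import math

K = 2.0  # arbitrary value for the constant k

def f(x, y):
    """f = k * phi, with phi measured by atan2 in the range (-pi, pi]."""
    return K * math.atan2(y, x)

def grad_cartesian(x, y, h=1e-6):
    """Central-difference estimate of (df/dx, df/dy)."""
    return ((f(x + h, y) - f(x - h, y)) / (2 * h),
            (f(x, y + h) - f(x, y - h)) / (2 * h))

def grad_polar_components(x, y):
    """Project the Cartesian gradient onto rho_hat and phi_hat."""
    gx, gy = grad_cartesian(x, y)
    phi = math.atan2(y, x)
    g_rho = gx * math.cos(phi) + gy * math.sin(phi)
    g_phi = -gx * math.sin(phi) + gy * math.cos(phi)
    return g_rho, g_phi

for rho in (1.0, 3.0):
    x, y = rho * math.cos(1.0), rho * math.sin(1.0)
    print(rho, grad_polar_components(x, y), "expected phi-component:", K / rho)
```

The radial component should come out near zero, and the 𝜙̂-component should shrink as 1∕𝜌, matching the bigger-circle argument above.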
The rest of this explanation, and many of the problems, are devoted to deriving these formu-
las. If you skip all that, the formulas will work anyway, and you’ll be no worse off than many
students. But if you take the time to work through the derivations, you will be rewarded
with a deeper understanding of both curvilinear coordinate systems and the vector calculus
operators. (And of course, that deeper understanding may last long after the details of the
derivations are forgotten.)
After we combine these two integrals, Equation 8.8.1 instructs us to divide by ΔV and take
the limit:
\[ \frac{1}{\rho} \lim_{\Delta\rho \to 0} \frac{(\rho + \Delta\rho)\, f_\rho(\rho + \Delta\rho, \phi, z) - \rho\, f_\rho(\rho, \phi, z)}{\Delta\rho} \]
The “aha!” moment comes when you recognize that as the definition of a partial derivative,
not for the function f𝜌, but for the function 𝜌f𝜌. And so we arrive at the first part of
the divergence in cylindrical coordinates, (1∕𝜌)(𝜕∕𝜕𝜌)(𝜌f𝜌). You will look at the remaining
four sides, and so finish the cylindrical divergence, in Problem 8.143; the spherical is
Problem 8.144.
This is not an easy derivation, but it carries enough insights to be worth your time. Refresh-
ing yourself on the definition of the divergence and where it comes from is a valuable part
of understanding that operator. The volume and the surface areas associated with the cylin-
drical box are vital to working in that coordinate system.
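A quick numerical experiment shows why the extra factors of 𝜌 matter. The sketch below adds the flux through the inner and outer faces of a small cylindrical box and divides by ΔV = 𝜌Δ𝜌Δ𝜙Δz. The sample component f𝜌 = 𝜌² and the function names are assumptions made for illustration only.

```python
def f_rho(rho, phi, z):
    """Radial component of a sample field (arbitrary choice): f_rho = rho**2."""
    return rho**2

def radial_flux_per_volume(rho, phi, z, d_rho, d_phi=1e-3, d_z=1e-3):
    """Flux out through the inner and outer faces of the box, divided by its volume."""
    # Outer face at rho + d_rho has area (rho + d_rho) * d_phi * d_z; flux is outward.
    outer = f_rho(rho + d_rho, phi, z) * (rho + d_rho) * d_phi * d_z
    # Inner face at rho has area rho * d_phi * d_z; positive f_rho flows INTO the box.
    inner = -f_rho(rho, phi, z) * rho * d_phi * d_z
    volume = rho * d_rho * d_phi * d_z
    return (outer + inner) / volume

# For f_rho = rho**2: (1/rho) d/d rho (rho * rho**2) = 3*rho, not d(f_rho)/d rho = 2*rho.
for d_rho in (1e-1, 1e-3, 1e-5):
    print(d_rho, radial_flux_per_volume(2.0, 0.0, 0.0, d_rho))
```

At 𝜌 = 2 the values approach 3𝜌 = 6 rather than 2𝜌 = 4, because the outer face is larger than the inner face.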
8.121 f⃗(𝜌, 𝜙) = 𝜌𝜌̂ + (cos 𝜙)𝜙̂
8.122 f⃗(𝜌, 𝜙) = (cos 𝜙)𝜌̂ + 𝜌𝜙̂
In Problems 8.123–8.129, find the gradient and Laplacian of the given scalar field.
8.123 f(𝜌, 𝜙) = 2𝜌 + 3𝜌𝜙
8.124 f(𝜌, 𝜙) = 𝜌 sin 𝜙
8.125 f(𝜌, 𝜙, z) = 𝜌² cos² 𝜙 + 𝜌² sin² 𝜙 + z²
8.126 f(𝜌, 𝜙, z) = 𝜙e^(𝜌+2z)
8.127 f(r, 𝜃, 𝜙) = r(𝜃 + 𝜙)
8.128 f(r, 𝜃, 𝜙) = r sin 𝜃 sin 𝜙
8.129 f(r, 𝜃, 𝜙) = ln(sin 𝜙∕cos 𝜃) + 2
In Problems 8.130–8.138, find the divergence and curl of the given vector field.
8.130 f⃗(𝜌, 𝜙, z) = 𝜌̂
8.131 f⃗(𝜌, 𝜙, z) = 𝜌̂ + 𝜌𝜙̂ + k̂
8.132 f⃗(𝜌, 𝜙, z) = (1∕𝜌)𝜌̂ + cos 𝜙 𝜙̂ − z k̂
8.133 f⃗(𝜌, 𝜙, z) = z² 𝜌̂ + 𝜌² sin² 𝜙 𝜙̂ + 𝜌² k̂
8.134 f⃗(r, 𝜃, 𝜙) = 𝜃̂
8.135 f⃗(r, 𝜃, 𝜙) = 𝜙̂
8.136 f⃗(r, 𝜃, 𝜙) = r̂ + 𝜃̂ + 𝜙̂
8.137 f⃗(r, 𝜃, 𝜙) = r cos 𝜃 r̂ + r cos 𝜃 𝜃̂ + r cos 𝜃 𝜙̂
8.138 f⃗(r, 𝜃, 𝜙) = 𝜙r̂ + r𝜃̂ + 𝜃𝜙̂
8.139 Make up a field, expressed in polar coordinates, with non-constant 𝜌 and 𝜙 coordinates,
but zero divergence everywhere.
8.140 A positive charge Q creates a potential field V = kQ∕r in spherical coordinates.
(a) Calculate the electric field E⃗ = −∇⃗V.
(b) Calculate the divergence and curl of the electric field and explain why these results
make sense for this physical situation.
8.141 A wire carrying a current I in the positive z-direction creates a magnetic field
B⃗ = 𝜇0 I∕(2𝜋𝜌)𝜙̂ in cylindrical coordinates. Calculate the divergence and curl of the
magnetic field and explain what they tell us about this magnetic field. (The answer to
this particular problem is explored in more depth in Section 8.7 Problem 8.116.)
8.142 The gradient in spherical coordinates looks like:
∇⃗f(r, 𝜃, 𝜙) = ⟨something⟩r̂ + ⟨something⟩𝜃̂ + ⟨something⟩𝜙̂
Your job is to find those coefficients. You can of course check your answers against the
spherical gradient formula in Appendix F.
(a) Draw the x, y, and z coordinate axes. Draw a point. Draw a small arrow from that point
in the r̂-direction.
(b) If you take a small step in the direction of that arrow, what is its distance ds in terms
of r, 𝜃, 𝜙, and dr? (The answer is very simple, and should come straight from your drawing.)
(c) The r-component of the gradient is the change in f (which is df) divided by the distance
traveled (which you just wrote). Write this component. A partial derivative will be involved.
(d) Returning to your drawing, draw another small arrow from the same point in the 𝜃̂-direction.
(e) If you take a small step in the direction of that arrow, what is its distance ds in terms
of r, 𝜃, 𝜙, and d𝜃? (The answer is a bit more complicated than the first one.)
(f) The 𝜃-component of the gradient is the change in f (which is df) divided by the distance
traveled (which you just wrote). Write this component. A partial derivative will be involved.
(g) Find the 𝜙-component of the gradient. The process is the same, but the answer will be a
bit more complicated than the other two.
8.143 Figure 8.16 shows a cylindrical box with volume ΔV = 𝜌Δ𝜌Δ𝜙Δz. In the Explanation
(Section 8.8.2) we found the integral per unit volume through the inner and outer surfaces,
ending up with (1∕𝜌)(𝜕∕𝜕𝜌)(𝜌f𝜌). In this problem you will work through the other four sides.
(a) Beginning at the top surface (at z + Δz), ask yourself three questions. First: which
component of f⃗ points through this surface? Second: when this component is positive, is the
resulting flux positive or negative? Third: what is the area of this surface? Based on these
three answers, write “surface integral = component × area” for this surface.
(b) Repeat Part (a) at the bottom surface.
(c) Add your two answers, and divide the result by ΔV. When you cancel and simplify, you
should be able to rewrite your answer as a partial derivative, as we did in the explanation.
(d) Repeat the entire process for the left and right sides to create the final partial
derivative. You’ll know if you got it right by checking against the formula in Appendix F
for the divergence in cylindrical coordinates.
8.144 The figure below shows a small box in which each side is defined by holding one of the
spherical coordinates constant. The volume of this box is ΔV = r² sin 𝜃 Δr Δ𝜃 Δ𝜙.
[Figure: small spherical-coordinate box drawn on x, y, z axes, with edges labeled Δr, Δ𝜃, and Δ𝜙.]
You will use this drawing, along with Equation 8.8.1, to find the formula for the divergence
in spherical coordinates.
(a) Beginning at the right-hand surface (at 𝜙), ask yourself three questions. First: which
component points through this surface, fr, f𝜃, or f𝜙? Second: when this component is
positive, is the resulting flux positive or negative? Third: what is the area of this
surface? (That last question will take some thought about the lengths of the sides.) Based
on these three answers, write “surface integral = component × area” for this surface.
(b) Repeat Part (a) at the left-hand surface (at 𝜙 + Δ𝜙).
(c) Add your two answers, and divide the result by ΔV. When you cancel and simplify, you
should be able to rewrite your answer as a partial derivative.
(d) Repeat for the inside and outside surfaces (at 𝜃 and 𝜃 + Δ𝜃).
(e) Repeat for the top and bottom surfaces (at r + Δr and r). You’ll know if you got it
right by checking against the formula in Appendix F for the divergence in spherical coordinates.
8.145 Show that the divergence of the gradient in cylindrical coordinates produces the
formula in Appendix F for the Laplacian in cylindrical coordinates.
(b) Will vector (2) contribute a positive number to the total, a negative number, or
zero?
(c) Will vector (3) contribute a positive number to the total, a negative number, or
zero?
2. Now, in general, what kinds of vectors will contribute to a positive total divergence
within the region?
Vector field f⃗ is defined everywhere in region V . You can evaluate the surface integral of f⃗
through the outer surface of V . You can also find the divergence of f⃗, and then use a triple
integral to add that up throughout the region. The “divergence theorem” promises that
these two calculations will give the same answer.
That opening paragraph raises two obvious questions. First, those two different calculations
seem to have nothing to do with each other; why would they always give the same answer?
Second, those two calculations don’t look particularly useful; who cares if they give the same
answer?
The short answer to both questions is this: the integral of a field through its bounding
surface, and the total divergence of a field, are two different ways of asking “How much is
this field moving out of this region?” In this section we will expand on that answer, clarifying
what the divergence theorem says, why it works, and how it is used.
In words: if you add up the divergence everywhere within the region, you get the integral through
the bounding surface.
This theorem goes by many names including Gauss’s theorem, Green’s theorem⁸, Ostrogradsky’s
theorem, and the fundamental theorem for divergences. The theorem tells us that
both the total divergence within the region, and the integral through the outer surface,
measure how much the vector field is flowing out of the region, so we will get the same
answer whichever way we calculate it. In the example above the divergence theorem was use-
ful because the surface integral was hard to calculate directly, but the divergence was simple.
In the boxed example below the divergence is hard to integrate but the surface integral is
easy to find. The divergence theorem says that when you want to find the flow of a vector out
of a region, you are free to choose whichever of these two approaches is easier.
Question: Find the flow rate of the field f⃗ = (𝜌∕√(a² + 𝜌²)) 𝜌̂ out of the region bounded
by the cylinder 𝜌 = R and the planes z = 0 and z = H.
Answer:
For the left side of the divergence theorem, we calculate the divergence of the vector field
using the formula for divergence in cylindrical coordinates: ∇⃗ ⋅ f⃗ = (1∕𝜌)(𝜕∕𝜕𝜌)(𝜌f𝜌).
After multiplying by 𝜌, taking the derivative with the quotient rule, dividing by 𝜌,
simplifying the result, and taking the 𝜙 and z integrals we eventually end up with the
expression
\[ 2\pi H \int_0^R \frac{\rho \left( 2a^2 + \rho^2 \right)}{\left( a^2 + \rho^2 \right)^{3/2}} \, d\rho \]
Handing this mess to a computer we get the answer 2𝜋HR²∕√(a² + R²).
If we don’t relish long algebra calculations, though, we could just consider the
integral through the outer surface. Since f⃗ is all in the 𝜌-direction there is no flux
through the top and bottom of the cylinder. Along the sides f⃗ points directly outward
with magnitude R∕√(a² + R²), so we can just multiply that constant by the surface area
of the cylinder, 2𝜋RH, to get 2𝜋HR²∕√(a² + R²).
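“Handing this mess to a computer” might look something like the sketch below, which integrates the divergence with a simple midpoint rule and compares it with the one-line surface calculation. The function names and the sample values a = 1, R = 2, H = 3 are assumptions made for illustration; any numerical integrator would serve.

```python
import math

def divergence(rho, a):
    """(1/rho) d/d rho (rho * f_rho) for f_rho = rho / sqrt(a^2 + rho^2)."""
    return (2 * a**2 + rho**2) / (a**2 + rho**2) ** 1.5

def volume_integral(a, R, H, n=100_000):
    """Integrate the divergence over the cylinder; dV = rho d rho d phi dz."""
    d_rho = R / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho  # midpoint rule in rho
        total += divergence(rho, a) * rho * d_rho
    # The integrand does not depend on phi or z, so those integrals give 2*pi*H.
    return 2 * math.pi * H * total

def surface_integral(a, R, H):
    """Constant outward component on the curved side times the side area 2*pi*R*H."""
    return (R / math.sqrt(a**2 + R**2)) * 2 * math.pi * R * H

print(volume_integral(1.0, 2.0, 3.0), surface_integral(1.0, 2.0, 3.0))
```

The two printed numbers should agree to many digits, which is the divergence theorem at work.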
8 The name “Green’s theorem” is more commonly used for a different but related theorem that we will discuss in
So when we add up the entire divergence, the net effect of any vector is zero, unless that vec-
tor is on the surface, and points inward or outward. A vector on the surface pointing outward
creates positive divergence inside the region (where we count it) and negative divergence
outside (where we don’t count it). That’s how adding up all the divergence inside the region
(left side of Equation 8.9.1) leads us to the flow through the surface (right side).
Below we present three different uses for the divergence theorem: we will derive the
equation of continuity, we will discuss field lines, and we will convert Gauss’s law from inte-
gral to differential form. Each of these is important in its own right, but they also serve as
examples of the power and scope of this theorem.
For a fluid the first two terms on the right represent “sources” and “sinks,” meaning places
where fluid can be created or destroyed. If the fluid is conserved then these terms equal zero,
and the only change in the fluid’s mass comes from the flow in and out of the region. Using
V⃗ for the momentum density of the fluid, “input – output” becomes the surface integral of
fluid through the boundary of the region, but with a negative sign (do you see why?), and
Equation 8.9.2 becomes ∯ V⃗ ⋅ d A⃗ = −dm∕dt.
The divergence theorem allows us to rewrite that equation. To avoid confusion with the
momentum density V⃗ we’ll use 𝜏 for volume and write: ∫∫∫𝜏 (∇⃗ ⋅ V⃗) d𝜏 = −dm∕dt. In the
limit where the region becomes infinitesimally small, we can consider the divergence to be
constant throughout, and our integral becomes a product: (∇⃗ ⋅ V⃗)𝜏 = −dm∕dt. Dividing
both sides by the volume turns the mass into a density, and the result is called the “equation
of continuity.” Because we have switched from talking about mass in a region to talking about
density, which is a field, the derivative becomes partial.
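Written out as a chain, the steps just described look like this. The symbol \(\rho_m\) for the mass density \(m/\tau\) is a label chosen here for illustration, so treat this as a sketch of the argument rather than the text's own statement of the result.

```latex
\begin{align*}
\iiint_{\tau} \left( \vec{\nabla} \cdot \vec{V} \right) d\tau &= -\frac{dm}{dt}
  && \text{(divergence theorem applied to the surface integral)} \\
\left( \vec{\nabla} \cdot \vec{V} \right) \tau &= -\frac{dm}{dt}
  && \text{(region small enough that the divergence is constant)} \\
\vec{\nabla} \cdot \vec{V} &= -\frac{\partial \rho_m}{\partial t}
  && \text{(divide by } \tau \text{, with } \rho_m = m/\tau \text{)}
\end{align*}
```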
The equation of continuity is central to fluid mechanics and it also comes up in fields ranging
from electricity and magnetism to relativity.
Field Lines
Section 8.2 presented two different tools for visualizing a vector field: vector plots, and field
lines (aka streamlines). The latter method is easier to draw and interpret, but cannot be used
for all vector fields. We are now at last ready to say exactly when you can use it, and why.
Remember that field lines follow the direction of a vector field, and their density is pro-
portional to the magnitude of the field. For example, looking at Figure 8.18 you can see that
the vector field points always to the right, and also points upward when y > 0 and downward
when y < 0. But you can also tell that the magnitude of the field decreases as you move to
the right, because the lines become less dense.
Now suppose you wanted to draw field lines for (1∕x)î. Because the field always points in the
+x-direction, the field lines would all be horizontal, so their density would not decrease as
you move to the right. That doesn’t mean there is anything wrong with (1∕x)î as a vector
field, but it does mean we cannot represent it with field lines; we have to use a vector plot.
(Vector plots can be drawn for any vector field.)
FIGURE 8.18 Field lines.
So what kind of field can have field lines? If you draw a closed surface in a region with no
sinks or sources then each field line will enter the region somewhere and leave it somewhere
else (Figure 8.19). Thus the surface integral out of any closed region will be zero. That in
turn implies (this is where the divergence theorem comes in) that the divergence must be zero
everywhere. Fields with zero divergence are called “incompressible vector fields,” which makes
perfect sense if you look at the equation of continuity. (They are also sometimes called
“solenoidal vector fields,” which makes less sense.)
FIGURE 8.19 Field lines go in, and then they go out!
We conclude that any vector field that is incompressible in some region can be represented
with field lines in that region. This mathematical fact has a physical corollary: the
velocity field for an incompressible fluid has a divergence of zero.
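The zero-divergence test is easy to apply numerically. The sketch below estimates the divergence of (1∕x)î, and, for contrast, of a second field that does have zero divergence away from the origin. That second field, (x î + y ĵ)∕(x² + y²), is an example chosen here and is not one discussed in the text; the function names are likewise illustrative.

```python
def div_2d(field, x, y, h=1e-5):
    """Central-difference estimate of d(fx)/dx + d(fy)/dy at the point (x, y)."""
    fx_plus, _ = field(x + h, y)
    fx_minus, _ = field(x - h, y)
    _, fy_plus = field(x, y + h)
    _, fy_minus = field(x, y - h)
    return (fx_plus - fx_minus) / (2 * h) + (fy_plus - fy_minus) / (2 * h)

def one_over_x(x, y):
    """The field (1/x) i from the discussion above."""
    return (1.0 / x, 0.0)

def radial_source(x, y):
    """Assumed contrast example: (x i + y j) / (x^2 + y^2), defined away from the origin."""
    r2 = x * x + y * y
    return (x / r2, y / r2)

print("div of (1/x) i at (2, 0.5):", div_2d(one_over_x, 2.0, 0.5))
print("div of radial example at (1, 0.5):", div_2d(radial_source, 1.0, 0.5))
```

The first field has divergence −1∕x², which is never zero, so it fails the test; the second passes everywhere except at the origin.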
\[ \oiint \vec{E} \cdot d\vec{A} = \frac{Q}{\varepsilon_0} \qquad \text{Gauss's law in integral form} \quad (8.9.4) \]
In general, any rule that can be expressed as an integral can also be expressed as a derivative.
For instance, in one dimension, the relationship between position and velocity can be written
in two ways:
\[ x = \int v \, dt \quad \text{or, equivalently} \quad v = \frac{dx}{dt} \]
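How do you find the total charge inside a region? If you know the charge density 𝜌(x, y, z),
you integrate it through the region: Q = ∫∫∫ 𝜌 dV. Putting that on the right side of Gauss’s
law, and using the divergence theorem to rewrite the left side as a volume integral, gives:
\[ \iiint \left( \vec{\nabla} \cdot \vec{E} \right) dV = \frac{\iiint \rho \, dV}{\varepsilon_0} \]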
Some care is required in this next step. Just because two quantities have the same integral
over some region does not mean that the two quantities are equal. But if they have the same
integral over any region—anywhere, any size—then the two quantities must be equal every-
where. This leads us to Gauss’s law in differential form:
\[ \vec{\nabla} \cdot \vec{E} = \frac{\rho}{\varepsilon_0} \qquad \text{Gauss's law in differential form} \quad (8.9.5) \]
Equations 8.9.4 and 8.9.5 are not two of Maxwell’s four equations; they are one of the four
equations, written in two equivalent ways. Each is more useful than the other in some
situations.
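One way to see the two forms agree is to test both on a field whose answer is known. The sketch below assumes a ball of uniform charge density, inside which the field is radial with magnitude 𝜌r∕(3𝜀0); that setup, the density value, and the function names are assumptions introduced here for illustration and are not taken from the text.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m
RHO = 1.0e-6             # assumed uniform charge density in C/m^3

def E_r(r):
    """Radial field inside a uniformly charged ball (assumed example)."""
    return RHO * r / (3 * EPS0)

def flux_through_sphere(r):
    """Integral form, left side: the field is radial and constant on the sphere."""
    return E_r(r) * 4 * math.pi * r**2

def enclosed_charge(r):
    """Charge density times the volume of the sphere of radius r."""
    return RHO * (4.0 / 3.0) * math.pi * r**3

def divergence(r, h=1e-6):
    """Spherical divergence of a purely radial field: (1/r^2) d/dr (r^2 E_r)."""
    g = lambda s: s**2 * E_r(s)
    return (g(r + h) - g(r - h)) / (2 * h) / r**2

r = 0.5
print("integral form:    ", flux_through_sphere(r), enclosed_charge(r) / EPS0)
print("differential form:", divergence(r), RHO / EPS0)
```

Each printed pair should match: the first pair is Equation 8.9.4 evaluated on a sphere, and the second is Equation 8.9.5 evaluated at a point.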
(a) Solve for that rate by taking a surface integral of the momentum density through the
dome and confirm that you get the same answer. The information you need is in the
Explanation (Section 8.9.2).
(b) The divergence theorem only works for closed surfaces, which means you have to consider
the ground beneath the dome to be part of the surface. Why didn’t you need to include that
in your calculation of flow rate in Part (a)?
Stokes’ Theorem
Suppose surface S is bounded by curve C. For a vector field f⃗ defined throughout that surface:
\[ \iint_S \left( \vec{\nabla} \times \vec{f} \right) \cdot d\vec{A} = \oint_C \vec{f} \cdot d\vec{s} \qquad (8.10.1) \]
In words: if you add up the curl everywhere within the surface, you get the line integral around the
bounding curve.
This theorem goes by many names including Kelvin-Stokes’ theorem, the curl theorem,
and the fundamental theorem for curls. To make matters more complicated, mathematicians
apply the name “Stokes’ theorem” to a more general theorem of which both this and the
divergence theorem are special cases.
As similar as Stokes’ theorem is to the divergence theorem, there is
an important difference in the way you apply them. In the divergence
theorem, the region bounded by a closed surface is unambiguous;
for instance, there is exactly one region bounded by a given sphere.
When you apply Stokes’ theorem, however, one curve can be the bound-
ary of many different surfaces. Look at just a few of the surfaces that
are bounded by a horizontal circle. Stokes’ theorem leads to the non-obvious
conclusion that the surface integral of the curl will be the same
on all these surfaces, and on any other surface bounded by the same
horizontal circle.
For the right side of Stokes’ theorem, we want a line integral around the curve
x = 3 cos 𝜙, y = 3 sin 𝜙, z = 0 as 𝜙 goes from 0 to 2𝜋.
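\[ \vec{f} \cdot d\vec{s} = \left( -3\sin\phi\,\hat{i} + 3\cos\phi\,\hat{j} + 10\hat{k} \right) \cdot \left( -3\sin\phi \, d\phi\,\hat{i} + 3\cos\phi \, d\phi\,\hat{j} \right) = 9 \, d\phi \]
And \(\int_0^{2\pi} 9 \, d\phi = 18\pi\), validating the theorem.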
This problem asked us to calculate the line integral and surface integral shown
above. But if you wanted to know the circulation of this field around this loop,
neither of these calculations would be the easiest way. Remember that Stokes’
theorem promises us the same answer if we integrate the curl on any surface bounded
by this curve. The hemisphere works, but a much simpler surface bounded by the
same curve is the disk x² + y² ≤ 9, z = 0. Integrating ∇⃗ × f⃗ = 2k̂ through that surface is
just a multiplication by the area: 2 × 9𝜋 gives us 18𝜋 with much less work.
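Both numbers can be reproduced in a few lines. The sketch below assumes the field is f⃗ = −y î + x ĵ + 10k̂, which is consistent with the values quoted in this example (it gives the integrand shown on the circle and has curl 2k̂) but is an inference, since the field's definition is not restated here. The function names are illustrative.

```python
import math

def f(x, y, z):
    """Assumed field, consistent with the values quoted in the example."""
    return (-y, x, 10.0)

def line_integral_circle(radius=3.0, n=10_000):
    """Counterclockwise line integral of f around the circle of given radius at z = 0."""
    total = 0.0
    d_phi = 2 * math.pi / n
    for i in range(n):
        phi = (i + 0.5) * d_phi
        x, y, z = radius * math.cos(phi), radius * math.sin(phi), 0.0
        fx, fy, fz = f(x, y, z)
        dx = -radius * math.sin(phi) * d_phi
        dy = radius * math.cos(phi) * d_phi
        total += fx * dx + fy * dy  # dz = 0 on this curve, so fz never contributes
    return total

def curl_flux_through_disk(radius=3.0):
    """The curl is the constant 2 k_hat, so its upward flux is 2 times the disk area."""
    return 2.0 * math.pi * radius**2

print(line_integral_circle(), curl_flux_through_disk(), 18 * math.pi)
```

All three printed values should agree, with the disk calculation needing no integration at all.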
The vector field itself could be anything. But imagine, for a moment, that f⃗ happens to
point directly upward when you are between the two loops labeled I and II. As we integrate
around loop I we will integrate upward through this point, so we will get a positive result.
Around loop II we will integrate downward through this point, so we will get a negative result.
Hence, such a vector will not contribute anything after we sum (integrate) over the whole
region.
Does that mean the total curl must be zero? Not exactly. Imagine now that f⃗ points upward
when you are on the right side of loop II. This vector will contribute positively to the line
integral around loop II, but this contribution will not be canceled because there is no loop
on the other side.
By this argument, you can see that the only vectors that contribute to ∫∫S (∇⃗ × f⃗) ⋅ dA⃗ are
vectors at the very edge of the surface, pointing parallel to that edge. When we add up all
those vectors we get a line integral around that edge. Hence, the left side of Equation 8.10.1
has become the right side.
Sign Issues
The surface integral over a region that isn’t closed has an ambiguous sign; we can calculate
the flow through the surface in either direction. Similarly, a line integral along a curve can
be evaluated in either direction along the curve. So how should we evaluate the integrals
in Stokes’ theorem? The answer is we can do them either way, but once we’ve chosen
the direction of one of the integrals the direction of the other one is determined by the
right-hand rule.
Consider the simplest case: a flat, horizontal disk bounded by a circle. If we evaluate the
line integral around the circle clockwise, then the right-hand rule tells us to evaluate the
surface integral downward. If we choose to evaluate the line integral counterclockwise then
we must do the surface integral upward.
If the surface is more complicated than a flat disk, you can still figure out the direction of
integration from the right-hand rule. For instance, in the example in the gray box above we
calculated the flow going out of the hemisphere. Now, picture breaking the hemisphere up
into many tiny loops, as we did to a square region in Figure 8.20. Pointing your thumb out-
ward from each little loop, the curl of your fingers dictates that we go around the bounding
circle counterclockwise. We did that by having 𝜙 increase as we integrated.
Ampere’s Law
Section 8.9 discussed how Gauss’s law for electric fields can be written in either differential
or integral form, with the divergence theorem converting between the two. “Ampere’s law” is
the magnetic version of Gauss’s law; it relates the magnetic field to the enclosed current. Like
Gauss’s law, Ampere’s law can be expressed in integral or differential form. In the following
equations B⃗ is the magnetic field, I is the enclosed current, J⃗ is the current density, and 𝜇0
is a positive constant.
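As a sketch of how Stokes' theorem converts between the two forms, the chain below assumes the standard steady-current statement of Ampere's law, with the enclosed current written as the flux of \(\vec{J}\) through a surface \(S\) bounded by the loop \(C\). It is offered as an illustration of the conversion, in parallel with the Gauss's law argument in Section 8.9.

```latex
\begin{align*}
\oint_C \vec{B} \cdot d\vec{s} &= \mu_0 I
  && \text{(integral form)} \\
\iint_S \left( \vec{\nabla} \times \vec{B} \right) \cdot d\vec{A} &= \mu_0 \iint_S \vec{J} \cdot d\vec{A}
  && \text{(Stokes' theorem on the left, } I = \textstyle\iint_S \vec{J} \cdot d\vec{A} \text{ on the right)} \\
\vec{\nabla} \times \vec{B} &= \mu_0 \vec{J}
  && \text{(true for every surface } S \text{, so the integrands are equal)}
\end{align*}
```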
Green’s Theorem
The phrase “positively oriented” in the following definition indicates a counterclockwise
traversal of a curve.
Green’s Theorem
Let C be a positively oriented, piecewise-smooth, simple closed curve in the plane and let D be the
region bounded by C. If the scalar fields P (x, y) and Q (x, y) have continuous partial derivatives on
an open region that contains D, then
\[ \oint_C \left( P \, dx + Q \, dy \right) = \iint_D \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dA \]
This is, on the face of it, one of the more inscrutable theorems in this book. The right side
is the double integral over some bounded region in the plane of a scalar function of x and
y—a peculiar looking function perhaps, but at least you can see what to do with it. But what
is the left side? What does ∫ P (x, y)dx mean?
As always, such an integral should properly be thought of as a limit. Imagine breaking a
curve into a large (but finite) number of small line segments. At each segment you evaluate
the function P(x, y). Now, for a “regular” scalar line integral you would multiply P(x, y)
times the arc length ds of the small line segment. (That in turn is given by the Pythagorean
theorem ds² = dx² + dy².) In this case you multiply P(x, y) by the segment’s horizontal
length dx. (See Figure 8.21.) Add up those products all around the curve. In the limit of
this process as the number of pieces approaches infinity, you have ∫ P dx.
FIGURE 8.21 P(x, y)ds and P(x, y)dx.
So now you have some idea of what it means, but it’s still not clear what it’s doing. It is
perhaps more useful to begin by considering a vector f⃗ = P(x, y)î + Q(x, y)ĵ. Then the left
side of Green’s theorem becomes something familiar: it is ∮C f⃗ ⋅ ds⃗, where ds⃗ = dx î + dy ĵ as
always. If you take the curl of that same vector f⃗, its z-component is the integrand on the
right side of Green’s theorem. Viewed this way Green’s theorem is just Stokes’ theorem,
written for the special case of a surface that lies entirely in the xy-plane.
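The “break the curve into small segments” description translates directly into a sum. The sketch below checks Green's theorem on the unit circle for one pair of functions; P = −x²y and Q = xy² are arbitrary choices made here for illustration, as are the function names.

```python
import math

def P(x, y):
    return -x**2 * y

def Q(x, y):
    return x * y**2

def line_integral(n=20_000):
    """Sum of P dx + Q dy over small segments, counterclockwise around the unit circle."""
    total = 0.0
    dt = 2 * math.pi / n
    for i in range(n):
        t = (i + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        dx, dy = -math.sin(t) * dt, math.cos(t) * dt  # horizontal and vertical lengths
        total += P(x, y) * dx + Q(x, y) * dy
    return total

def double_integral(n=2_000):
    """Integral of dQ/dx - dP/dy = y^2 + x^2 = r^2 over the unit disk.
    In polar coordinates dA = r dr d(angle); the integrand does not depend on
    the angle, so the angular integral just contributes a factor of 2*pi."""
    total = 0.0
    dr = 1.0 / n
    for i in range(n):
        r = (i + 0.5) * dr
        total += r**2 * r * dr
    return 2 * math.pi * total

print(line_integral(), double_integral(), math.pi / 2)
```

Both sides come out to 𝜋∕2 for this choice, and the line-integral loop is a literal transcription of multiplying P by dx (and Q by dy) on each small segment.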
We are not emphasizing Green’s theorem; in practice it can be treated as a special case of
Stokes’ theorem. We have that luxury because we are focusing on intuitive demonstrations of
all these theorems, rather than formal proofs. A more rigorous approach works the opposite
way: first you prove Green’s theorem, and then you use Green’s theorem to prove both the
divergence theorem and Stokes’ theorem.
[Figure: surface drawn on x, y, z axes, with the values 4 and 3 marked.]
(a) Calculate the curl ∇⃗ × f⃗.
(b) Calculate ∫∫S (∇⃗ × f⃗) ⋅ dA⃗ pointing into the can…
i. …on the cylinder
ii. …on the back
iii. …on the bottom
(c) Add together these three contributions to find the total integral of the curl along
surface S.
(d) Now, ignoring the work you have already done, calculate the line integral ∮ f⃗ ⋅ ds⃗
counterclockwise…
i. …along the line segment on the top
ii. …along the semicircle on the top
(e) Add together these two contributions to find the total line integral along the
bounding curve.
(f) If your answers to Parts (c) and (e) match, pat yourself on the back for verifying
Stokes’ theorem in this case. If they don’t match, go back and find the problem!
In Problems 8.159–8.162, confirm Stokes’ theorem for the given vector field f⃗. If a surface S
is specified take the surface integral on that surface and the line integral on its bounding
curve. If a bounding curve C is specified, you may choose any surface bounded by C that makes
the surface integral as easy as possible.
In Problems 8.163–8.167, the vector V⃗ represents the momentum density of a fluid. Calculate
the absolute value of the circulation of the fluid around the indicated curve. In each case
you should start by thinking about which side of Stokes’ theorem will be easier to use. If
you evaluate a surface integral you should also choose the easiest possible surface.
8.163 V⃗ = ln(𝜌)𝜙̂ around the circle x² + y² = R² in the xy-plane.
8.164 V⃗ = x²(x² + (z − 1)²) k̂ around a circle of radius 1 in the xz-plane centered on the
point (0, 0, 1)
8.165 V⃗ = x î + yĵ around a circle of radius 1 centered on the origin in the x = z plane.
8.166 V⃗ = z î − x k̂ around a circle of radius 1 centered on the origin in the x = z plane.
8.167 V⃗ = yz î − x ĵ around a curve made up of the following segments: the quarter-circle
(z − 1)² + (x − 1)² = 1 on the xz-plane from (1, 0, 0) to (0, 0, 1), the quarter-circle
(z − 1)² + (y − 1)² = 1 on the yz-plane from (0, 0, 1) to (0, 1, 0), and the line
x + y = 1 on the xy-plane from (0, 1, 0) to (1, 0, 0).
8.168 In this problem you will confirm Stokes’ theorem for the vector field
f⃗ = (x + y² + z³)î on the circle y² + z² = 9 on the yz-plane.
(a) Calculate ∮ f⃗ ⋅ ds⃗ along this circle.
(b) Calculate ∫∫S (∇⃗ × f⃗) ⋅ dA⃗ on the x ≥ 0 part of the paraboloid x = 9 − y² − z² and
confirm that it equals what you got in Part (a).
(c) Choose an easier surface that is also bounded by the same circle and confirm that
Stokes’ theorem works for that surface as well.
8.169 Why does Stokes’ theorem work? To answer this question, consider a vector field defined
throughout a bounded two-dimensional surface. Your object is to add up the curl of the
vector field throughout the surface. The drawing shows three sample vectors within a
bounded surface S.
[Figure: three sample vectors, labeled 1, 2, and 3, within a bounded surface S.]
… component that points directly out of the surface, and a component tangent to the
bounding curve. Which component is relevant to the total curl within the surface?
(g) Explain how this reasoning leads to Stokes’ theorem.
8.170 In this problem you will find the differential form of Ampere’s law, starting with
the integral form:
∮ B⃗ ⋅ d s⃗ = 𝜇0 I_enclosed
Conservative Fields
1. A conservative vector field is the gradient of some scalar field: v⃗ = ∇⃗F. To put the same thing
a different way, a conservative vector field has a potential function. (In math texts F would
generally be the potential of v⃗, but here we follow most physics texts and say F is minus the
potential of v⃗.)
2. The line integrals of a conservative vector field are “path independent.” That is, ∫C v⃗ ⋅ d s⃗
depends only on the starting and ending points, not on the specific curve drawn between
them.
3. The line integral of a conservative vector field around a closed path (one where the starting
and ending points are the same) is always zero: ∮ v⃗ ⋅ d s⃗ = 0.
4. A conservative vector field is “irrotational,” meaning it has no curl: ∇⃗ × v⃗ = 0⃗. (This is not a
perfect definition, as discussed below.)
We are not saying “If you want to show that a particular field is conservative, you have to show
that it meets all of these criteria.” Rather, we are saying that any one of these statements is
a sufficient definition—with an exception for Definition 4—because each one implies the
others.
For instance, we saw in Section 8.5 how to evaluate the line integral of a gradient based on
only the endpoints. The gradient theorem says that for any vector field that can be written
as the gradient of some scalar field, the line integrals are path independent. That is, it says
that Definition 1 above implies Definition 2.
Our job in this Explanation is to show how all four of these seemingly unrelated definitions
connect with each other and with conservation of mechanical energy.
Equation 8.11.1 holds for all forces. In the case of non-conservative forces, the result of the
calculation depends on the path taken. In the case of conservative forces the calculation
depends only on the starting and ending locations. By Definition 2, this means F⃗ is a con-
servative vector field if and only if it represents a conservative force. That in turn implies
our second bullet point above, namely that for a conservative force we can uniquely define
a scalar field PE by the equation F⃗ = −∇⃗PE.
Question: A mass m is subject to Earth gravity F⃗ = −mg ĵ, using y for vertical position
and x for horizontal. Find the potential energy at the point (w, h).
Answer:
The question is ambiguous until we choose a point to represent PE = 0; we choose
(purely for convenience) the origin. The work done in moving our object from (0, 0)
to (w, h) is its gain in kinetic energy, and therefore its loss in potential energy. We will
calculate the work three times, using the different paths in the picture below.
Since the force is vertical, F⃗ ⋅ d s⃗ is zero on the horizontal paths.
∙ Path 1: W = 0 − mgh = −mgh
∙ Path 2: W = −mgh + 0 = −mgh
∙ Path 3: The line is y = (h∕w)x, so dy = (h∕w)dx and the work is
$\int_0^w (-mgh/w)\,dx = -mgh$
No matter what path we use to calculate it, we get the same answer. The work done by
gravity as the mass m moves from the origin to (w, h) is −mgh, so the object’s potential
energy at (w, h) is mgh.
[Figure: three panels labeled Path 1, Path 2, and Path 3, each drawn on x and y axes with w
marked on the x-axis and h marked on the y-axis.]
Question: A mass is subject to a different force, F⃗ = −kx² ĵ. Before you read any
further, we encourage you to try calculating PE at the point (w, h) for this new force
the same way we did above for gravity.
Answer:
Once again the horizontal segments of the path contribute nothing to the work since
the force is entirely vertical.
∙ Path 1: The force is −kw² ĵ, so the work is −kw²h
∙ Path 2: x = 0 so F = 0 and the work is zero
∙ Path 3: the work is $W = \int_0^w (-kx^2 h/w)\,dx = -kw^2 h/3$
What happened? The second force is not conservative: the amount of energy it
imparts to an object depends on the path along which the object moves. It is
therefore impossible to associate a potential energy with that force.
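The two worked answers above can be checked numerically. The sketch below is only an illustration: the helper `work`, the sample values of m, g, k, w, and h, and the path parametrizations are all assumptions (the path shapes are inferred from the work calculations in the text: Path 1 runs horizontally and then up, Path 2 runs up and then horizontally, Path 3 is the straight line).

```python
def work(force, path, n_pieces=10000):
    """Approximate W = integral of F . ds along path(t) for t in [0, 1]."""
    total = 0.0
    for i in range(n_pieces):
        x0, y0 = path(i / n_pieces)
        x1, y1 = path((i + 1) / n_pieces)
        fx, fy = force((x0 + x1) / 2, (y0 + y1) / 2)   # force at segment midpoint
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

# Hypothetical sample values, chosen only for illustration
m, g, k = 2.0, 9.8, 3.0
w, h = 4.0, 5.0

def gravity(x, y):
    return (0.0, -m * g)

def second_force(x, y):
    return (0.0, -k * x**2)

def path1(t):
    """Along the x-axis to (w, 0), then straight up to (w, h)."""
    return (2 * t * w, 0.0) if t < 0.5 else (w, (2 * t - 1) * h)

def path2(t):
    """Up the y-axis to (0, h), then across to (w, h)."""
    return (0.0, 2 * t * h) if t < 0.5 else ((2 * t - 1) * w, h)

def path3(t):
    """Straight line from the origin to (w, h)."""
    return (t * w, t * h)

for name, force in (("gravity", gravity), ("-k x^2 j", second_force)):
    print(name, [round(work(force, p), 4) for p in (path1, path2, path3)])
```

For gravity the three numbers should agree with each other; for the second force they should not, which is the path dependence described above.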
In introductory physics classes the rule for figuring out if a force is conservative is very sim-
ple: if the teacher gave you a formula for the potential energy associated with that force,
then it must be a conservative force. But to mathematically determine if a particular force is
conservative, you need vector calculus.
it we would need to calculate the line integrals between every possible pair of points in the
universe along every possible path. You might be forgiven if you checked a few and assumed
that the rest would work, but you might also get the wrong answer.
So here is an alternate definition: a vector field is conservative if its line integral around
any closed loop (i.e., a path that ends where it started) is zero. The notation ∮ is used for such
an integral, so ∮ f⃗ ⋅ d s⃗ = 0 is Definition 3 of a conservative field. You’ll show in Problem 8.184
that this definition is equivalent to the previous one.
That may not sound like a big improvement—you still cannot test every possible closed
loop—but Stokes’ theorem turns it into a practical test. The circulation of a vector around a
closed loop equals the integral of the curl on a surface bounded by that loop. If the circula-
tion is zero around every closed loop, then the curl must be zero everywhere. The converse
is also true; if the curl of a field is zero everywhere then the circulation around any closed
loop must be zero and the field is conservative. Starting from Definition 3, we’ve thus derived
Definition 4. This one gives us a practical test, however. A vector field is conservative if and
only if its curl is zero everywhere.
Except it doesn’t quite work. In Problem 8.188 you will explore a field that violates Defi-
nition 4 and seems to also violate Definition 1. It is the gradient of a scalar function, its curl
is zero, and it is manifestly not conservative. The problem is not a contrived mathematical
curiosity, but a common physical situation, the magnetic field around a straight wire. So
exactly how equivalent are these definitions?
Before we can answer that question, we have to raise one technical issue: “conservative” is
a property of a field in a given region of space. To say that a field is conservative (or
non-conservative) at a particular point is meaningless, since both line integrals and
gradients require some elbow room.
In the case of the magnetic field around a wire, we cannot consider “the whole universe” as
our region because B⃗ blows up at 𝜌 = 0, which is to say, on the wire itself. So we ask the
question: is the magnetic field conservative on a disk-shaped region around the wire
(Figure 8.22)?
FIGURE 8.22 A region around a current-carrying wire. (Labels in the figure: “The wire” and
“The region.”)
In Problem 8.188 you’ll find a potential function for B⃗ that seems to work everywhere
in this region, but look at it carefully: even though the magnetic field is defined through-
out the region, its potential function is not. You cannot write B⃗ = ∇⃗V everywhere in the
region, so Definition 1, along with Definitions 2 and 3 (as you’ll also show), concludes that
the magnetic field is not conservative.
Definition 4 is more problematic, as the curl is definitely zero everywhere within this
region. This definition only applies when the region in question is “simply connected,” how-
ever. In 2D that roughly means that the region doesn’t have any holes in it. Because this
region clearly has a hole in it, Definition 4 does not give us a guarantee.
Why does that hole matter so much? Remember that Definition 4 is directly related to
Definition 3 by Stokes’ theorem: the line integral around a closed path equals the total curl
integrated along any surface bounded by that path. To apply Stokes’ theorem in this case we
draw a closed loop inside our region (a circle around the wire) and then we draw a surface
that is bounded by that loop. But here we run into trouble. Whether we draw a simple disk
or an elaborate parachute, the wire—the place where B⃗ is undefined—is going to cut right
through our surface. That means we can’t apply Stokes’ theorem in our region, which is why
Definition 4 gets it wrong.
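A small numerical experiment makes the role of the hole concrete. The sketch below uses the Cartesian form of the wire’s field given in Problem 8.189; the constant `MU0_I` (a stand-in for the product 𝜇0 I), the helper names, and the sample loops are assumptions made for illustration.

```python
import math

MU0_I = 1.0  # stand-in value for the product mu_0 * I (an assumption)

def B(x, y):
    """Field of a straight wire along the z-axis, in Cartesian components."""
    r2 = x * x + y * y
    c = MU0_I / (2 * math.pi)
    return (-c * y / r2, c * x / r2)

def circulation(cx, cy, radius, n=20000):
    """Line integral of B . ds counterclockwise around a circle centered at (cx, cy)."""
    total = 0.0
    for i in range(n):
        t0 = 2 * math.pi * i / n
        t1 = 2 * math.pi * (i + 1) / n
        x0, y0 = cx + radius * math.cos(t0), cy + radius * math.sin(t0)
        x1, y1 = cx + radius * math.cos(t1), cy + radius * math.sin(t1)
        bx, by = B((x0 + x1) / 2, (y0 + y1) / 2)
        total += bx * (x1 - x0) + by * (y1 - y0)
    return total

def curl_z(x, y, eps=1e-5):
    """Central-difference estimate of dBy/dx - dBx/dy at a point off the wire."""
    dBy_dx = (B(x + eps, y)[1] - B(x - eps, y)[1]) / (2 * eps)
    dBx_dy = (B(x, y + eps)[0] - B(x, y - eps)[0]) / (2 * eps)
    return dBy_dx - dBx_dy

print("curl at (0.7, -0.4):       ", curl_z(0.7, -0.4))
print("loop around the wire:      ", circulation(0.0, 0.0, 1.0))
print("loop that misses the wire: ", circulation(3.0, 0.0, 1.0))
```

The curl estimate is essentially zero wherever the field is defined, yet the loop that encircles the wire gives a non-zero circulation, while a loop that can be shrunk to a point without crossing the wire gives zero.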
That leads us to a more precise definition of “simply connected”: any closed loop in a
simply connected region bounds at least one surface entirely within the region, which means
you can apply Stokes’ theorem in that region.9 See Problem 8.175. In 3D this definition is
a bit more complicated than “doesn’t have any holes.” For example, the region in between
two concentric spherical shells is simply connected, while a doughnut is not. Make sure you
understand why in both cases.
Bottom line? If you want to determine if a field is conservative within a given region, here
is our recommendation.
1. If the curl is not zero everywhere in the region, then the field is not conservative—
guaranteed.
2. If the curl is zero everywhere in the region, and the region is simply connected, then the
field is conservative—guaranteed.
3. If the curl is zero everywhere in the region, and the region is not simply connected, then
you’re not sure at this stage. In that case take one closed line integral around each of the
holes in the region. If they all come out zero the field is conservative. If any of them comes
out non-zero the field is not conservative by Definition 3.
A charge Q creates an electric field (vector, (kQ∕r²) r̂) and a potential field (scalar, kQ∕r)
around itself; these are only functions of position, not of what you put at that position. If
you place a charge q into this region it will experience a force (vector, (kQq∕r²) r̂) and have a
certain potential energy (scalar, kQq∕r); these will depend on both the field and the charge
q of the particle.
Everything we’ve said here about electricity works exactly the same with gravity if we
replace k with G, write mass instead of charge, and put a minus sign in front of each of
the formulas. (The minus sign is because two positive charges repel but two positive masses,
which are the only kind there are, attract.)
9 The rigorous definition of “simply connected” also says it can’t consist of two disconnected regions, such as two sep-
arated disks. This part of the definition is irrelevant for conservative fields since you can still apply Stokes’ theorem
in such regions.
Stepping Back
If you were to arbitrarily choose three functions of x, y, and z, and call them the three
components of a vector field v⃗, the odds are good that you would not end up with a conservative
field. For instance, suppose v_x = x²y². If we want v⃗ = ∇⃗F for some scalar field, that scalar
field must have the form F = (1∕3)x³y² plus some function of y. That, in turn, means that
v_y = (2∕3)x³y plus some function of y. But you don’t get to pick v_y, so you just have to hope it
works out! The point is that v_x and v_y (and v_z in three dimensions) have to carefully match
each other to give a conservative field.
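To see the matching requirement in numbers, the sketch below compares ∂v_y/∂x with ∂v_x/∂y (their difference is the z-component of the curl) for the example v_x = x²y². The helper `mixed_partial_gap`, the added sin(y) term, the “arbitrary” v_y = xy, and the sample point are all hypothetical choices for illustration.

```python
import math

def mixed_partial_gap(vx, vy, x, y, eps=1e-5):
    """Return d(vy)/dx - d(vx)/dy at (x, y), estimated by central differences."""
    dvy_dx = (vy(x + eps, y) - vy(x - eps, y)) / (2 * eps)
    dvx_dy = (vx(x, y + eps) - vx(x, y - eps)) / (2 * eps)
    return dvy_dx - dvx_dy

def vx(x, y):
    return x**2 * y**2

def vy_matching(x, y):
    # (2/3) x^3 y plus some function of y, as the text requires
    return (2 / 3) * x**3 * y + math.sin(y)

def vy_arbitrary(x, y):
    # a v_y picked without regard to v_x
    return x * y

point = (1.5, 2.0)
print("matching v_y: ", mixed_partial_gap(vx, vy_matching, *point))
print("arbitrary v_y:", mixed_partial_gap(vx, vy_arbitrary, *point))
```

A gap near zero is what a conservative field requires; the arbitrary choice leaves a clearly non-zero gap.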
But if conservative fields are a very restrictive special case, they are also a very impor-
tant one. They include important real-world examples such as gravitational and electrostatic
fields. And once you know that a field is conservative, you can use its potential function and
the gradient theorem to simplify calculations tremendously. If the field represents a force,
you also know that it will conserve mechanical energy.
8.181 v⃗ = 𝜙̂ over the entire xy-plane
8.182 v⃗ = 𝜙𝜌̂ + 𝜙̂ over a unit disk in the xy-plane centered on the origin. Hint: Where is
this field undefined and why?
8.183 v⃗ = [z cos(xz) − y∕x] î − (ln x)ĵ + x cos(xz)k̂ for x > 0

8.184 Page 438 contains a number of different definitions of conservative. Show that
Definitions 2 and 3 are equivalent. The picture below may help. Consider the fact that the
line integral from B to A along path 2 is minus the line integral from A to B along that path.
[Figure: two paths, labeled 1 and 2, connecting the points A and B.]
8.185 If a vector field has the property that ∇⃗ × v⃗ = 0⃗, the field is said to be
“irrotational.” Page 438 presents this as a definition of a conservative vector field, which
is mostly, but not entirely, accurate. Show that if Definition 1 is true for some vector
field, then Definition 4 must also be true.
8.186 Prove that any constant vector field is conservative.
8.187 A charge Q creates an electric field around it with magnitude |E⃗| = kQ∕r² where k is
a positive constant and r is the distance to the charge. The direction of this field is
radially outward from the charge. Hint: each part of this problem should involve less
algebra than the part before.
(a) Write a vector in Cartesian coordinates that represents the electric field from a
point charge Q at the origin. Show that this vector field is conservative.
(b) Write a vector in spherical coordinates that represents the electric field from a
point charge Q at the origin. Show that this vector field is conservative.
(c) Does this imply that the total electric field from any collection of point charges is
conservative? Why or why not?
8.188 A straight wire carrying a constant current I creates a magnetic field around itself
given by B⃗ = 𝜇0 I∕(2𝜋𝜌) 𝜙̂ where 𝜇0 is a positive constant.
(a) Show that B⃗ is not a conservative field by showing that ∮ B⃗ ⋅ d s⃗ around the loop
𝜌 = 1 is not zero.
(b) Show that B⃗ is irrotational. Why, in this case, does this not imply a conservative
field?
(c) Show that B⃗ = ∇⃗S where S = 𝜇0 I𝜙∕(2𝜋). Why, in this case, does this not imply a
conservative field?
8.189 In this problem you will replicate the work from Problem 8.188 in Cartesian
coordinates, but you do not need to have done Problem 8.188 to do this one. A straight wire
carrying a constant current I creates a magnetic field around itself given by:
$$\vec{B} = \frac{\mu_0 I}{2\pi}\left(\frac{-y}{x^2 + y^2}\,\hat{i} + \frac{x}{x^2 + y^2}\,\hat{j}\right)$$
where 𝜇0 is a positive constant.
(a) Show that B⃗ is not a conservative field by showing that ∮ B⃗ ⋅ d s⃗ around the loop
x = cos t, y = sin t is not zero.
(b) Show that B⃗ is irrotational. Why, in this case, does this not imply a conservative
field?
(c) Show that B⃗ = ∇⃗S where S = 𝜇0 I∕(2𝜋) tan⁻¹(y∕x). Why, in this case, does this not
imply a conservative field?
CHAPTER 9
∙ identify the amplitude, frequency, period, and phase of the function A cos(px + 𝜙).
∙ represent oscillatory functions with complex exponentials.
∙ represent series, finite or infinite, with ∑ notation.
(You don’t need to know about Taylor series to read this chapter, but we compare and contrast
them to Fourier series in several places. If you’re not familiar with Taylor series you’ll have to
ignore those bits.)
∙ use a Fourier series to represent a periodic function as a sum of sines and cosines, or equiv-
alently as a sum of complex exponentials.
∙ determine if a set of functions is “orthogonal” and if so find the formula for the coefficients
of a series built from this set.
∙ create a Fourier series for a function defined on a limited domain by using an “even exten-
sion” to create a “Fourier cosine series,” an “odd extension” to create a “Fourier sine series,”
or an extension that is neither even nor odd to create a series with sines and cosines.
∙ use a Fourier transform to represent a non-periodic function as an integral of complex
exponentials with different amplitudes.
∙ determine the basic properties of a function from a plot of its Fourier transform, or of a
Fourier transform from the plot of the function it represents.
∙ use a discrete Fourier transform to model a function defined by a set of data points.
∙ create a multivariate Fourier series for a function of more than one variable.
Complicated behavior can often be modeled as the sum of different sine waves of different
frequencies. Think of a car going up and down hills. A rock stuck on one of the tires goes
up and down quickly due to the rotation of the tire and also goes up and down more slowly
due to the hills.
A “Fourier series” mathematically represents a function such as the height of the rock over
time as a superposition of individual sine waves. Such a device can reveal the mathematical
order underlying seemingly chaotic functions, especially in the case of periodic behavior.
Less obviously, Fourier series are also a powerful tool for solving differential equations by
adding together different sinusoidal solutions.
together around their mutual center of mass. The period of this rotation is of course 365
days, so its angular frequency is 𝜔 = 2𝜋∕365 day⁻¹.
Astronomers therefore look for small, regular oscillations in other stars as a way of deter-
mining if they have their own planets. Using this technique, Mayor and Queloz in 1995
announced what is now considered the first definitive detection of a planet orbiting a star
other than our sun.1
In a simpler world we would see a sine wave as the star moves back and forth, and the
period of that sine wave would be the orbital period of the planet. But real data is messier
because stars can have more than one planet, and because effects such as atmospheric dis-
tortion create “noise” in the data.
Suppose you had been measuring a star carefully for 14 years and collected the data shown
in Figure 9.1. Does this star have planets going around it? How many and with what periods?
Or did your five-year-old just scribble random dots on your page of data? It’s virtually
impossible to tell. You need to know if there are regular sinusoidal variations hidden
within the random ups-and-downs of those data points.
FIGURE 9.1 Velocity of a star vs. time. (Axes: v versus t, with t ticks labeled from 1988 to
2000 and v ticks from −4 to 4.)
“Fourier analysis” is designed to answer that question. It allows you to take a function of
time and figure out how much it varies with different frequencies. Figure 9.2 shows the
“Fourier transform” v̂(𝜔) of the data points v(t) shown in Figure 9.1.
It is vital to understand that Figure 9.2 does not represent any new data; it is calculated
by analyzing the data in Figure 9.1. You will learn in Section 9.7 (see felderbooks.com) how
to perform this analysis using a “discrete Fourier transform.” Right now we are handing you
Figure 9.2, but you can still interpret it.
FIGURE 9.2 The “Fourier transform” shows how much variation v(t) has at different
frequencies. (Axes: v̂(𝜔) versus 𝜔 in rad/year, with 𝜔 ticks from 5 to 25.)
1. The Fourier transform shows a sharp spike at approximately 𝜔 = 19. This indicates that
your star’s velocity has a high-amplitude oscillation with angular frequency 19. The only
likely explanation is a planet orbiting the star. What is the period of that planet’s
rotation? (That is, how long is a “year” on that planet?)
See Check Yourself #57 in Appendix L
2. The Fourier transform also shows a spike at roughly 𝜔 = 6, indicating a second planet.
How long is a year on that planet?
3. Which planet do you think has the largest mass (and hence the biggest effect on the
star’s velocity)? Explain how you got your answer from Figure 9.2.
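For readers who want to see a plot like Figure 9.2 emerge from noisy data, here is a minimal sketch using a direct-sum discrete Fourier transform. It is not the data behind Figure 9.1 and not necessarily the procedure of Section 9.7: the synthetic velocities, the two hidden frequencies, the noise level, and the helper `dft_amplitudes` are all assumptions made for illustration.

```python
import cmath
import math
import random

def dft_amplitudes(samples, max_k):
    """Amplitude at each frequency bin k = 1..max_k, computed by a direct-sum DFT."""
    n = len(samples)
    amps = {}
    for k in range(1, max_k + 1):
        s = sum(v * cmath.exp(-2j * math.pi * k * j / n) for j, v in enumerate(samples))
        amps[k] = 2 * abs(s) / n
    return amps

# Synthetic "star velocity" data: every number here is made up for illustration.
T = 14.0                  # years of observation
N = 512                   # number of evenly spaced measurements
rng = random.Random(0)    # fixed seed so the run is repeatable
k_a, k_b = 7, 31          # hidden oscillations complete 7 and 31 cycles during T
times = [T * j / N for j in range(N)]
velocities = [
    3.0 * math.sin(2 * math.pi * k_a * t / T)
    + 1.5 * math.sin(2 * math.pi * k_b * t / T)
    + rng.gauss(0.0, 1.0)             # measurement noise
    for t in times
]

amps = dft_amplitudes(velocities, max_k=60)
top_two = sorted(amps, key=amps.get, reverse=True)[:2]
for k in sorted(top_two):
    omega = 2 * math.pi * k / T       # angular frequency in rad/year
    print(f"spike at omega = {omega:.2f} rad/year, "
          f"period = {2 * math.pi / omega:.2f} years, amplitude = {amps[k]:.2f}")
```

The raw `velocities` list looks like scatter, but the two largest amplitudes pick out the hidden frequencies, and the taller spike belongs to the larger oscillation.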
1 M. Mayor, D. Queloz (1995). “A Jupiter-mass companion to a solar-type star”. Nature 378 (6555): 355–359. Other
detections of “exoplanets” had been claimed earlier, but were considered more controversial. In 1992 Wolszczan and
Frail announced the detection of two planets orbiting a pulsar, but Mayor and Queloz’s discovery was the first clear
detection of a planet orbiting a regular star.
7in x 10in Felder c09.tex V3 - February 26, 2015 10:45 P.M. Page 447
FIGURE 9.3 Each plot is a sum of two cosines. Use these in answering Parts 3 and 4.
The examples we’ve used so far looked fairly simple because one wave had a much higher
amplitude than the other. When you add waves with similar amplitudes, the results tend to
look messier.
5. Make a plot of cos(5x) + cos(6x) + cos(7x) and copy it onto your paper. It’s a good
way to start to see that even simple combinations of cosines can lead to pretty compli-
cated behavior.
Question: If P (t) represents the power consumption from your home, a graph of P (t)
would be an almost unreadable jumble. But if you represented P (t) as a Fourier
series, which terms would have high amplitudes? (Assume t is measured in days.)
Answer:
One of the terms in your Fourier series represents oscillation with a period of
1 day. The coefficient of that term—that is, the amplitude associated with that
frequency—would be very high. Of course the use of heat, electricity, and hot water
is not perfectly periodic (it is not identical from one day to the next) but there are
strong predictable daily trends (more usage at 8:00 AM than at 3:00 AM). The high
amplitude on this term tells us that P (t) has a predictable 1-day cycle.
There would also be a spike representing a period of one year: you use more heat
in the winter than in the spring. So you might reach a reasonable approximation by
writing:
$$P(t) = A\sin(2\pi t) + B\sin\left(\frac{2\pi}{365}\,t\right) \qquad (9.2.2)$$
Equation 9.2.2 could never give you a perfectly accurate representation of your power
usage. There are undoubtedly many shorter and longer cycles involved, as well as a
fair bit of randomness. But if those terms with the right coefficients give you a fairly
good approximation, that might enable the power company to create the models
they need for prediction and planning purposes.
Give yourself extra credit if you noticed a problem with this example. Your power
usage involves highs and lows that never repeat, but a Fourier series is always perfectly
periodic. If you built your series from data collected over a ten-year period then the
resulting Fourier series would repeat that data for decade after decade. You would
therefore see a ten-year period (that does not reflect your actual power usage) as well
as the annual and daily cycles (that do). We will return to the issue of Fourier series
based on data from a finite domain in Section 9.4.
A Fourier series with sines only (such as the one we wrote in that example), or cosines only,
is fairly easy to interpret. But many functions require sine and cosine terms together. (You’ll
see why in “Even and Odd Functions” below.) This makes interpreting the coefficients a bit
trickier.
For instance, suppose you evaluate the Fourier series for a function f (x) and find the
following four coefficients. (You also find many other coefficients, but let’s just talk about
these four.)
$$a_8 = 6 \qquad b_8 = 0 \qquad a_9 = 3 \qquad b_9 = -4$$
The first two numbers tell us about the part of f (x) that oscillates with angular frequency
p = 8. That part looks like 6 cos(8x). Its amplitude of 6 represents the importance of this
particular frequency in the overall makeup of f (x).
The other two numbers tell us that the p = 9 part of f (x) looks like 3 cos(9x) − 4 sin(9x).
You will show in Problem 9.24 that the nth frequency terms, $a_n\cos(nx) + b_n\sin(nx)$, represent
oscillation with amplitude $\sqrt{a_n^2 + b_n^2}$. So this particular frequency has an amplitude of 5:
slightly smaller than the first one.
This result means that 3 cos(nx) + 4 sin(nx), 4 cos(nx) + 3 sin(nx), and 5 cos(nx) all repre-
sent oscillations with the same frequency and amplitude. The only difference is in the phase.
You can compute that phase difference if you need it, but usually it’s the amplitudes that
matter.
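A quick numerical check of that amplitude claim (not a proof, which is left to Problem 9.24) is sketched below. The helper `peak_value` and its brute-force sampling approach are assumptions for illustration; it simply looks for the largest value the combined wave reaches.

```python
import math

def peak_value(a, b, n, samples=100000):
    """Largest value of a*cos(n x) + b*sin(n x) on [0, 2*pi), by brute-force sampling."""
    return max(
        a * math.cos(n * x) + b * math.sin(n * x)
        for x in (2 * math.pi * i / samples for i in range(samples))
    )

# The p = 9 coefficients from the text, plus the combinations mentioned afterward
for a, b in ((3, -4), (3, 4), (4, 3), (5, 0)):
    print(a, b, peak_value(a, b, 9), math.hypot(a, b))
```

Each pair should report a peak very close to `math.hypot(a, b)`; the combinations differ only in where along the x-axis that peak occurs.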
The a0 term is unique. The sine and cosine terms average out to zero, so when you write a
function as a Fourier series the average value of the series is a0 . Turning that around, when
you write a Fourier series for a function f (x), a0 is the average value of f (x). If f is symmetric
above and below the x-axis then a0 = 0.
$$f(x) = a_0 + \sum_{n=1}^{\infty} a_n\cos(nx) + \sum_{n=1}^{\infty} b_n\sin(nx)$$
The integrals are written here from −𝜋 to 𝜋, but because the function is periodic you can take these
integrals over any interval of length 2𝜋 and get the same result.
2… by which we mean angular frequency, but this is the last time we’ll say so.
Virtually every periodic function that would arise in a physical problem satisfies these
conditions. Under those circumstances “Dirichlet’s theorem” tells us that wherever f (x) is
continuous, the Fourier series will converge to f (x). Perhaps more remarkably, if f (x) has a
jump discontinuity at some value of x, the Fourier series will converge to the average of the
values of f on either side of the jump.
When you graph an even function, the left side is a mirror image of the right. For an odd
function the left is an upside-down mirror of the right. By thinking about the “plug in both
5 and −5” test and/or the graphical symmetry, you can convince yourself of the following
rules.
∙ The sum of even functions is an even function; the sum of odd functions is an odd
function. When you add an even function to an odd function, the result generally has
neither symmetry.
∙ The product or quotient of two even functions is an even function; the product or
quotient of two odd functions is an even function. When you multiply or divide an
even and an odd function, the result is an odd function.
∙ If f (x) is an odd function, then $\int_{-a}^{a} f(x)\,dx = 0$.
∙ If f (x) is an even function, then $\int_{-a}^{a} f(x)\,dx = 2\int_{0}^{a} f(x)\,dx$.
The above rules apply to any even and odd functions, but they have specific implications for
Fourier series.
∙ If f (x) is even then its Fourier series will have no sine terms. (bn = 0 for all n, as we
found for x 2 + 3 above.) To find its cosine terms you can integrate over a full period
as we did above, or you can save a little time by integrating over half the period and
doubling the result. (Why? Because if f (x) is even, then f (x) cos(nx) is also even.)
∙ If f (x) is odd then its Fourier series will have no cosine or constant terms. (an = 0 for all
n, including a0 .) To find its sine terms you can integrate over a full period or integrate
over half the period and then double the result.
∙ If f (x) is neither even nor odd, then its Fourier series will generally have sines and
cosines.
You will prove the most important of these conclusions in Problems 9.28 and 9.29.
Stepping Back
As promised, we conclude our introduction by returning to the analogy with Taylor series.
When we write a Maclaurin series, we write a given function as an infinite series of $x^n$ terms:
$$f(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \dots$$
Starting with the function f (x), you figure out the coefficients. But you can’t list them all, so
you have to specify the coefficients as a function of n. For instance:
$$e^{-x} = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\,x^n$$
Starting with the function $e^{-x}$, you can determine that the coefficient of $x^n$ in the Maclaurin
series is $(-1)^n/(n!)$. Given those coefficients, you can go the other way to figure out the
function.
What we have just established, although you may never have thought about it this way,
is a correspondence between the functions $f(x) = e^{-x}$ and $g(x) = (-1)^x/(x!)$. They carry the
same information in different forms, and you can convert from either one to the other. But
you can only make use of the information if you understand the subtle relationship between
the two representations. The fact that g (3) = −1∕6 does not tell us anything directly about
f (3); instead, it tells us how much “cubic function” is in the makeup of f (x).
Similarly, we built a Fourier series above to conclude that:
$$x^2 + 3 = \frac{\pi^2}{3} + 3 + \sum_{n=1}^{\infty} \frac{(-1)^n\,4}{n^2}\cos(nx) \qquad -\pi \le x \le \pi$$
Again, this establishes an important but subtle relationship between the functions
$f(x) = x^2 + 3$ and $g(x) = (-1)^x\,4/x^2$. Again, we can convert from one to the other. And again,
it’s important to understand what the relationship conveys: the fact that g (3) = −4∕9 tells us
how much “cosine with frequency 3” is in $x^2 + 3$.
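The correspondence between f and g can be checked numerically. The sketch below assumes the standard coefficient integrals for a series of period 2𝜋 (the constant term is the average of f over one period, and the cosine and sine coefficients are 1∕𝜋 times the integrals of f against cos(nx) and sin(nx)); the midpoint-rule helper `integrate` and the function names are illustrative choices.

```python
import math

def integrate(func, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of func from lo to hi."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))

def f(x):
    return x**2 + 3

def a(n):
    """Cosine coefficient of f; a(0) is the constant term (the average of f)."""
    if n == 0:
        return integrate(f, -math.pi, math.pi) / (2 * math.pi)
    return integrate(lambda x: f(x) * math.cos(n * x), -math.pi, math.pi) / math.pi

def b(n):
    """Sine coefficient of f."""
    return integrate(lambda x: f(x) * math.sin(n * x), -math.pi, math.pi) / math.pi

def g(n):
    """The coefficient function quoted in the text: (-1)^n * 4 / n^2."""
    return (-1) ** n * 4 / n**2

print("a(0):", a(0), " expected:", math.pi**2 / 3 + 3)
for n in range(1, 5):
    print(n, "a_n:", a(n), " g(n):", g(n), " b_n:", b(n))
```

The computed cosine coefficients should track g(n), and the sine coefficients should vanish because x² + 3 is even.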
For this reason we often speak about translating a function from “regular space” to
“Fourier space” and back, a topic to which we will return in Section 9.6. In quantum
mechanics, the wavefunction in Fourier space represents the probability distribution of
momentum just as the wavefunction in regular space represents the probability distribution
of position.
Appendix G lists all the formulas for finding Fourier series coefficients. It also contains a brief table of some
of the integrals that you will need most often in this chapter. Any problem that cannot be solved with u
substitution, integration by parts, or with that table will be marked with a computer icon: .
After you integrate remember that n (the summation index) is always an integer, and for any integer n:
9.1 The drawing below is the sum of two sine waves. Estimate their periods and amplitudes
and use those estimates to write the two sine functions.
[Figure: plot of the summed wave; x-axis ticks from 2 to 10, vertical ticks from –6 to 6.]
9.2 Sketch a graph showing the sum of two sine waves: one with a small amplitude and low
frequency, the other with a much higher amplitude and frequency.
9.3 The function D(x) is odd, and the function E(x) is even. Furthermore, D(5) = 2 and
E(5) = 6. Finally, Q (x) = D(x)∕E(x) and S(x) = D(x) + E(x).
[Figure: graph of the function; the horizontal axis runs from −4 to 4.]
(a) Does this function appear to be even, odd, or neither?
(b) Sketch the derivative of this function. Does $f'(x)$ appear to be even, odd, or neither?
(c) Based on your answers, write a general statement about even and odd functions.
(d) Now prove it, using the definitions of even and odd functions.

In Problems 9.7–9.10 we give you a scenario that can be represented as a function. If this function is
written as a Fourier series, there will be one spike (one high coefficient) at the period we
indicate. In each problem …
∙ Explain why you would see a spike at the period we indicate.
∙ Suggest one more period that would also have a high coefficient, and explain why.

\[ f(x) = \begin{cases} -1 & -\pi < x < 0 \\ 1 & 0 < x < \pi \\ -1 & \pi < x < 2\pi \\ \ldots & \text{and so on} \end{cases} \]
(a) Use Equation 9.2.3 to write an integral for $a_0$, the constant term in the Fourier series
expansion. You should be able to evaluate this integral by just looking at the graph.
(b) Returning to Equation 9.2.3, the coefficients of the cosine terms lead us to a piecewise integral:
\[ a_n = \frac{1}{\pi}\left( -\int_{-\pi}^{0} \cos(nx)\,dx + \int_{0}^{\pi} \cos(nx)\,dx \right) \]
Evaluate this integral.
(c) Use Equation 9.2.3 to write an integral for $b_n$. Evaluate this integral to find the coefficients of
the sine terms in the Fourier series expansion.
(d) Some of the coefficients you calculated in Parts (a)–(c) should have come out
equal to 0. Explain how you could have predicted some of those without doing any calculations,
just by looking at the plot of the square wave.
(e) Write the Fourier expansion for this square wave.

9.13 [This problem depends on Problem 9.12.] For the series you found in Problem 9.12, have
a computer draw the 1st, 5th, 20th, and 100th partial sums. Sketch the results and verify
that they are approaching the original square wave. What happens at the x-values for which
the original function is discontinuous?

In Problems 9.14–9.20 assume f(x) equals the given function from x = −𝜋 to 𝜋, and is extended
periodically beyond that. First, state whether f(x) is even, odd, or neither. Based on that, which
(if any) of the coefficients in the Fourier series do you know must be zero? Then calculate the
remaining coefficients, simplifying your solution by replacing all occurrences of $\sin(\pi n)$ with 0
and $\cos(\pi n)$ with $(-1)^n$. Write the resulting Fourier series in the simplest possible form.
For problems not marked with a computer icon you should be able to do the integral by hand or with
the table in Appendix G. In all cases we urge you to check your final answers by using computer
graphs.

9.14 f(x) = k (a constant)
9.15 \[ f(x) = \begin{cases} 0 & -\pi \le x < 0 \\ 1 & 0 \le x < \pi \end{cases} \]
9.16 $f(x) = |x|$
9.17 $f(x) = 3x^2 - x$
9.18 $f(x) = x^3$
9.19 $f(x) = e^x$
9.20 The figure below shows a “triangle wave.”
[Figure: plot of the triangle wave f(x); the axis labels show a height of 1 and x from −𝜋 to 𝜋.]

9.21 Find the Fourier series for $f(x) = \sin^2 x$.

9.22 In the example on Page 451 we found the Fourier series for the function f(x)
that equals $x^2 + 3$ from x = −𝜋 to 𝜋 and is extended periodically from there.
(a) Find the Fourier series for the function g(x) that equals $x^2 + 3$ from 0 to 2𝜋 and
is extended periodically from there.
(b) Use a computer to draw the 100th partial sum of your series. It should not look like
the drawing we found for f(x). Why not?

9.23 The function below is the first four terms of a Fourier sine series with half-integer terms.
\[ f(x) = b_1 \sin\left(\frac{1}{2}x\right) + b_2 \sin(x) + b_3 \sin\left(\frac{3}{2}x\right) + b_4 \sin(2x) \]
(a) What are the periods of the first, second, third, and fourth terms of this series?
(b) What is the period of the entire series?

9.24 A Fourier series represents each frequency with a sine and a cosine term, but what
you typically want to know about each frequency is the amplitude of the wave at that
frequency. To see how to get from one to the other, you’re going to rewrite the terms
$a_n \cos(nx) + b_n \sin(nx)$ as one term of the form $A \cos(nx + \phi)$, where A is the amplitude
and $\phi$ is the phase.
(a) Use a trig identity to rewrite $A \cos(nx + \phi)$ as a sum
<something> cos(nx) + <something> sin(nx).
Setting this equal to $a_n \cos(nx) + b_n \sin(nx)$, find $a_n$ and $b_n$ in terms of A and $\phi$.
(b) Solve for A and $\phi$ in terms of $a_n$ and $b_n$.
(c) If the Fourier series of a function f(x) has $a_4 = 10$ and $b_4 = 5$, what are the
amplitude and phase of the oscillation with frequency 4?

9.25 Plot each function g(x) given below on the same plot with the function f(x) = 13 cos(2x).
For each one describe how the plot of g(x) is similar to or different from the plot of f(x).
(a) g(x) = 5 cos(2x) + 12 sin(2x)
(b) g(x) = 12 cos(2x) + 5 sin(2x)
(c) g(x) = 10 cos(2x) + 24 sin(2x)
(d) g(x) = 5 cos(4x) + 12 sin(4x)

9.26 In this problem you will analyze the Fourier series for the function f(x), where f(x) = 1 for
−𝜋 ≤ x ≤ 𝜋∕2 and f(x) = −3 for 𝜋∕2 < x ≤ 𝜋.
(a) Find the Fourier series for f(x).
(b) What is the amplitude of the oscillation with frequency 1? You will need to consider
both the sine and cosine terms, but your answer should be a single number.
(c) What is the amplitude of the oscillation with frequency n?
(d) How many frequencies do you need to include if you want a partial sum with
all oscillations whose amplitude is at least 1/10 of the amplitude of the n = 1
oscillation?
9.27 Dirichlet’s theorem guarantees that when a periodic function f(x) has a jump discontinuity,
the Fourier series for f evaluated at that point will converge to the average of
the values on either side of the jump. This fact can be used to calculate certain infinite
series.
(a) Find the Fourier series for
\[ f(x) = \begin{cases} 1 & -\pi/4 < x < \pi/4 \\ 0 & -\pi < x < -\pi/4,\ \pi/4 < x < \pi \end{cases} \]
Write all the terms up to and including the cos(8x) term.

(b) If E(x) is an even function, and you create a Fourier series for E(x), prove that all the
$b_n$ coefficients will come out zero.
(c) Prove in a similar way that for an odd function O(x) all of the $a_n$ coefficients come out to zero.

9.30 The diagram below shows an LC circuit.
[Figure: LC circuit diagram with labels C, V, and L.]
The property that allows us to find the coefficients of a Fourier series is that the functions
sin(nx) and cos(nx) for all integer n-values form an “orthogonal set,” as defined below.
A set of functions is “orthogonal” if every function in the set is orthogonal to every other function
in the set.
\[ f(x) = a_0 + \sum_{n=1}^{\infty} a_n \cos(nx) + \sum_{n=1}^{\infty} b_n \sin(nx) \]
We have to find infinitely many coefficients, but just to demonstrate the method let’s find b4 ,
the coefficient of sin(4x) in the series. The trick is to multiply both sides by sin(4x) and then
integrate as x goes from −𝜋 to 𝜋.
\[ \int_{-\pi}^{\pi} f(x)\sin(4x)\,dx = a_0 \int_{-\pi}^{\pi} \sin(4x)\,dx + \sum_{n=1}^{\infty} a_n \int_{-\pi}^{\pi} \cos(nx)\sin(4x)\,dx + \sum_{n=1}^{\infty} b_n \int_{-\pi}^{\pi} \sin(nx)\sin(4x)\,dx \tag{9.3.1} \]
The first integral on the right, $\int_{-\pi}^{\pi} \sin(4x)\,dx$, you can easily confirm for yourself goes to
zero. If you’re not very comfortable with series notation, take a moment to write out those
series explicitly to see what all the other integrals look like. You will see terms such as
$\int_{-\pi}^{\pi} \cos(x)\sin(4x)\,dx$ and $\int_{-\pi}^{\pi} \sin(3x)\sin(4x)\,dx$; there are infinitely many such terms, but
orthogonality promises that all but one of these integrals evaluate to zero. The only non-zero
integral on the right side of this equation is $\int_{-\pi}^{\pi} \sin^2(4x)\,dx$, which is 𝜋, as we said above.
So Equation 9.3.1 reduces to:
\[ \int_{-\pi}^{\pi} f(x)\sin(4x)\,dx = \pi b_4 \]
This reasoning leads us to Equation 9.2.3, at least for that one coefficient; you’ll use the same
trick to find an arbitrary coefficient an in Problem 9.35 (and you could just as easily find bn ).
Note that the use of this trick did not depend on any details of the sine and cosine func-
tions, but on the orthogonality of the set. We can therefore use the same trick to represent
a function as a series of almost any orthogonal set of functions. You will find the coefficients
for a series of “Legendre polynomials” in Problem 9.36, and we will build series from com-
plex exponentials in Section 9.5. In Chapter 12 we will introduce “Sturm-Liouville theory”
which proves that many other important sets of functions are orthogonal, and can therefore
be used to build series.
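The coefficient trick can also be checked numerically. The sketch below is only an illustration: the test function `f`, the helper names, and the midpoint-rule integrator are assumptions made for this example and do not come from the text. It confirms that the sine integrals behave as orthogonality promises, then recovers a known coefficient by multiplying by sin(4x), integrating, and dividing by 𝜋.

```python
import math

def integrate(g, a, b, steps=20000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def sine_overlap(n, m):
    """Integral of sin(nx) * sin(mx) from -pi to pi."""
    return integrate(lambda x: math.sin(n * x) * math.sin(m * x), -math.pi, math.pi)

def b_coefficient(f, m):
    """Multiply f by sin(mx), integrate over one period, and divide by pi."""
    return integrate(lambda x: f(x) * math.sin(m * x), -math.pi, math.pi) / math.pi

def f(x):
    # Hypothetical test function: its sin(4x) coefficient is 2.5 by construction.
    return 1.0 + 0.7 * math.cos(3 * x) + 2.5 * math.sin(4 * x) - 1.2 * math.sin(9 * x)

print(sine_overlap(4, 4))   # close to pi
print(sine_overlap(3, 4))   # close to 0
print(b_coefficient(f, 4))  # close to 2.5
```

Every other term of `f` drops out of the integral, which is the whole point of the trick.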
These two frequencies are related by 𝜔 = vp, but the constant v is based on the specific
properties of the string. If the six strings of a guitar all vibrate with the same spatial frequency,
they will produce six different notes.
Most of our math will be done on the spatial frequency p. We will be taking Fourier series
with respect to x, not with respect to t. But the goal of our calculations will be to determine
what notes we expect to hear.
Plucking a D String
If you pluck the D string on a guitar, it produces the note D, but it also produces many other
notes more quietly. The particular combination of notes gives the string its distinct sound.
Our goal in this section is to mathematically determine what notes will be produced, and at
what volumes.
So consider a guitar string 65 cm long, fixed at both ends. If you could somehow pull it into a
perfect sine wave and let go, the string would produce only one note. The pitch of that note would
depend on the wavelength of the sine wave (as discussed above), and the volume would depend on
the amplitude (how high up you pulled the string). If the initial shape of the string were the sum
of two sine waves, as shown in Figure 9.6, the string would produce two notes.
FIGURE 9.6 If a guitar string were pulled into this shape and released it would produce exactly two pitches.
But in real life, when you pluck the string you pull it into a triangle shape. What notes will
that produce?
By this point in the chapter you can probably guess that to approach that question, we have to
rewrite the shape in Figure 9.7 as a Fourier series. Every wave in that series has a frequency p that
determines the pitch of its note and an amplitude A that determines its volume. So we are not building
a Fourier series of musical notes, but we are building a Fourier series of guitar-string-shapes that
correspond to musical notes.
FIGURE 9.7 A real plucked guitar string starts in a triangle shape, and produces many different pitches.
But there are two problems with this plan. First, the shape of the guitar string doesn’t
repeat periodically. We’re going to solve that problem by making a “periodic extension” of
the string’s shape and then finding the Fourier series of that periodic function. But that leads
us to the second problem, which is that the period of the resulting function won’t be 2𝜋.
Below we address these problems in reverse order. First we discuss Fourier series for peri-
odic functions with periods other than 2𝜋. Then we talk about different ways to do a periodic
extension of a function defined on a finite domain. At the end we will come back and find
the mixture of pitches produced by the plucked guitar string.
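\[ f(x) = a_0 + \sum_{n=1}^{\infty} a_n \cos\left(\frac{n\pi}{L}x\right) + \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L}x\right) \tag{9.4.1} \]
\[ a_0 = \frac{1}{2L}\int_{-L}^{L} f(x)\,dx \qquad a_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\left(\frac{n\pi}{L}x\right)dx \qquad b_n = \frac{1}{L}\int_{-L}^{L} f(x)\sin\left(\frac{n\pi}{L}x\right)dx \tag{9.4.2} \]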
Once again, because the function is periodic, you can take these integrals over any interval of length
2L and get the same result.
Equations 9.4.1–9.4.2 give you all you need to calculate the Fourier series for a periodic
function with any period. In what follows, we explain why the formulas look the way they do.
In particular, where did the n𝜋∕L inside the sines and cosines come from?
Consider, for example, the Fourier series for a function with period 6. Recall that L is
defined as half the period, so we plug L = 3 into the formula above. For simplicity we’ll
assume f (x) is odd so its Fourier series only contains sines, but nothing about the following
explanation would change if there were cosines as well.
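\[ f(x) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{3}x\right) = b_1 \sin\left(\frac{\pi}{3}x\right) + b_2 \sin\left(\frac{2\pi}{3}x\right) + b_3 \sin(\pi x) + b_4 \sin\left(\frac{4\pi}{3}x\right) + \ldots \]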
Figure 9.8 shows the first several waves of this series. The first term has a period of 6, the
second has a period of 3, then 2, then 3/2. Each period divides evenly into 6—each function
repeats itself perfectly every six units—so a sum of all these functions has a period of 6. In
general, then, if you wish to express a function f (x) with period 2L as a sum of sine waves,
that sum can only include sines and cosines that repeat every 2L, which means all the terms
have frequency n𝜋∕L for some integer n.
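One way to check this claim, written here as a short worked step rather than anything quoted from the text, is to shift x by one full period 2L and use the fact that sine repeats every 2𝜋:

\[ \sin\left(\frac{n\pi}{L}(x + 2L)\right) = \sin\left(\frac{n\pi}{L}x + 2n\pi\right) = \sin\left(\frac{n\pi}{L}x\right) \quad \text{for any integer } n \]

The same step works for cosine. Each term has period 2L∕n, so with L = 3 the n = 4 term has period 6∕4 = 3∕2, matching the list above.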
FIGURE 9.8 The sine waves that repeat every 6 are the ones with period 6, 6∕2, 6∕3, 6∕4, and so on, so those are
the periods included in a Fourier series for a function with period 6.
You can think of n as a counter, telling us how many “humps” (local minima and maxima)
the wave fits between x = 0 and x = 3 (half a period). The n = 1 term has one hump, the
n = 2 has two, and so on. All of these functions return to zero at x = 3, so they could all fit
into a guitar string that was tacked down at x = 0 and x = 3.
Now that we have the equation for Fourier series of functions with periods other than 2𝜋,
the remaining piece we need is Fourier series for functions that are only defined on a finite
domain. Then we’ll be ready to pluck our D string.
FIGURE 9.9 Three possible periodic extensions of the function y(x) = x + 3 with domain [0, 5].
∙ We can create a periodic extension of y(x) and create a Fourier series for it (involving
both sines and cosines) with a period of 5.
∙ We can create an “even extension” of the function—that is, mirror its behavior on
the left—and then create a periodic extension of that function. Then we can create a
“Fourier cosine series” with a period of 10.
∙ We can create an “odd extension” of the function—that is, mirror its behavior upside-
down on the left—and then create a periodic extension of that. Then we can create a
“Fourier sine series” with a period of 10.
Of course these three functions are not the same, but they are the same as each other—
and as the original function—between x = 0 and x = 5, which is the only place we care about.
All three are useful in different circumstances. We’ll show below, for example, why one of
these works best for our guitar string problem.
Section 9.2 discussed the Fourier series of odd and even functions. For an even function
you can assume bn = 0 for all n, and you can calculate an by integrating from 0 to L and
doubling the result. For an odd function, the roles are reversed. These shortcuts are never
essential, but they are helpful.
In the particular case we are discussing here—building an odd or even extension from
a finite domain—someone has built those shortcuts into formulas, and you can find those
formulas in Appendix G. The formulas are almost too easy to use; you can plug in a function
and integrate without being aware that you are creating an odd or even extension, and then
you may be surprised by what you get. But if you understand where they come from, those
formulas are the fastest way to find a Fourier series for a function on a finite domain.
∙ A wave with spatial frequency p = 𝜋∕65 radians/cm produces a note with time frequency
f = 294 vibrations per second.
∙ More generally, a D string in the shape A sin(n𝜋x∕65) produces a note with frequency
f = (294 × n) Hz.
The box above tells us what we will hear from a D string bent into a perfect sine wave. Our
goal is to decompose the original shape (Figure 9.7) into a series of sine waves to determine
what notes are produced at what volumes.
Since the height is only defined in 0 ≤ x ≤ 65 we need a periodic extension of the string.
We could define an odd extension, an even extension, or neither. Mathematically, any of
these three approaches would give us the right function between 0 and 65. But physically, a
series with cosines is useless to us. Why? Because a single cosine function cannot represent the state
of our string! (No cosine function goes to zero at x = 0.)
You can think of this in the following way: we can write down the note produced if the
string starts in any particular sine wave. Thus if we write the initial condition as a sum of
sine waves, we know the combination of notes produced by that initial condition. We could
also write the initial condition as a sum of cosines. That series would be mathematically valid
and it would converge to the right function for the initial condition, but since those don’t
represent possible states of the string, we wouldn’t know what to do with them if we had them.
Because we want only sines in our Fourier series, we begin with an odd extension of the
original function.
[FIGURE 9.10: the odd extension of the plucked-string shape, plotted from x = −130 to 130 with heights between −3 and 3.]
The Fourier coefficients are given by Equation 9.4.2. Because we are considering an odd
function we know an = 0. The period is 130, so we use L = 65:
\[ b_n = \frac{1}{65}\int_{-65}^{65} f(x)\sin\left(\frac{n\pi}{65}x\right)dx \]
Looking at Figure 9.10 we can see that this integral has to be done piecewise in three seg-
ments, −65 < x < −32.5, −32.5 < x < 32.5, and 32.5 < x < 65. We can save some effort, how-
ever, by noticing that f (x) is odd and sin(n𝜋x∕65) is odd, so the entire integrand is even. That
means the integral from −65 to 0 will give the same result as the integral from 0 to 65, so we
can go from 0 to 65 and double the result.
\[ b_n = \frac{2}{65}\int_{0}^{65} f(x)\sin\left(\frac{n\pi}{65}x\right)dx \]
This integral still has to be broken into two pieces. Looking at Figure 9.10, f (x) = (3∕32.5)x
from x = 0 to 32.5, and f (x) = 6 − (3∕32.5)x from x = 32.5 to 65.
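\[ b_n = \frac{2}{65}\left[ \int_{0}^{32.5} \frac{3}{32.5}\,x \sin\left(\frac{n\pi}{65}x\right)dx + \int_{32.5}^{65} \left(6 - \frac{3}{32.5}\,x\right)\sin\left(\frac{n\pi}{65}x\right)dx \right] \]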
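We can evaluate this using the table in Appendix G. The answer is $b_n = \frac{24}{n^2\pi^2}\sin(n\pi/2)$.
This goes to zero for all even n, and decreases rapidly with increasing n. We can get more
intuition for that by writing out the first few non-zero terms.
\[ f(x) = \frac{24}{\pi^2}\left[ \sin\left(\frac{\pi}{65}x\right) - \frac{1}{9}\sin\left(\frac{3\pi}{65}x\right) + \frac{1}{25}\sin\left(\frac{5\pi}{65}x\right) - \ldots \right] \]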
We can see that the sound is dominated by the n = 1 note with frequency 294 Hz, which
is the note called D4 (or D above middle C). The next note is the n = 3 term, with frequency
3 × 294 = 882 Hz, which is slightly higher than A5, but with an amplitude almost 10 times smaller
than the dominant note. The remaining notes contribute very little to the overall sound. We can
thus see why, when we pluck a guitar string, we think of it as making just one note. Nonetheless,
the combination of higher notes (called “overtones”) makes D4 on a guitar sound different from
D4 on a piano or a flute.
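The result for the plucked string can be checked numerically. The following is a minimal sketch under stated assumptions: the function names and the midpoint-rule integration are choices made for this illustration, while the string length, pluck height, 294 Hz base frequency, and closed-form coefficient are taken from the text. It lists each harmonic's frequency next to its amplitude computed both ways.

```python
import math

LENGTH_CM = 65.0   # string length from the text
HEIGHT_CM = 3.0    # pluck height at the midpoint
BASE_HZ = 294.0    # frequency of the n = 1 shape on the D string

def pluck_shape(x):
    """Triangle from Figure 9.7: rises to HEIGHT_CM at the midpoint, then falls."""
    half = LENGTH_CM / 2
    slope = HEIGHT_CM / half
    return slope * x if x <= half else 2 * HEIGHT_CM - slope * x

def b_numeric(n, steps=20000):
    """Midpoint-rule estimate of (2/65) * integral of f(x) sin(n pi x / 65) on [0, 65]."""
    h = LENGTH_CM / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += pluck_shape(x) * math.sin(n * math.pi * x / LENGTH_CM)
    return 2 / LENGTH_CM * total * h

def b_closed(n):
    """Closed form quoted in the text: 24 / (n^2 pi^2) * sin(n pi / 2)."""
    return 24 / (n ** 2 * math.pi ** 2) * math.sin(n * math.pi / 2)

for n in range(1, 8):
    print(f"n={n}  {BASE_HZ * n:6.0f} Hz  numeric={b_numeric(n):+.4f}  closed={b_closed(n):+.4f}")
```

The even-n rows come out essentially zero, and the n = 3 amplitude is one ninth of the n = 1 amplitude, which is the "almost 10 times smaller" overtone described above.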
Question: Find a Fourier cosine series for the function y = x + 3 between x = 0 and
x = 5.
Solution:
The even extension, drawn in Figure 9.9, has the formula:
\[ f(x) = \begin{cases} x + 3 & 0 \le x \le 5 \\ -x + 3 & -5 \le x < 0 \end{cases} \]
We could find all the coefficients with piecewise integrals, but knowing that the
function is even offers us several shortcuts. First, we know that all the bn coefficients
are zero. Second, we know that the an integrals from −5 to 0 will contribute the same
as the ones from 0 to 5, so we can just calculate them for positive values of x and then
double the answer. (Alternatively we could have just gone straight to the formula for a
Fourier cosine series in Appendix G.)
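\[ a_0 = \frac{1}{5}\int_{0}^{5} (x + 3)\,dx = \frac{11}{2} \]
\[ a_n = \frac{2}{5}\int_{0}^{5} (x + 3)\cos\left(\frac{n\pi}{5}x\right)dx \]
\[ f(x) = \frac{11}{2} - \sum_{n=1}^{\infty} \frac{20}{\pi^2 n^2}\cos\left(\frac{n\pi}{5}x\right) \quad \text{odd } n \text{ only} \]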
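A quick way to test a series like this is to add up many terms and compare against the original function. The sketch below is an illustration only; the function name and the choice of 2001 as the cutoff are assumptions made here, not part of the example.

```python
import math

def cosine_series(x, max_n=2001):
    """Partial sum of 11/2 - sum over odd n of 20 / (pi^2 n^2) * cos(n pi x / 5)."""
    total = 11 / 2
    for n in range(1, max_n + 1, 2):
        total -= 20 / (math.pi ** 2 * n ** 2) * math.cos(n * math.pi * x / 5)
    return total

# Inside the original domain the partial sum should track y = x + 3.
for x in (0, 1, 2.5, 4, 5):
    print(x, x + 3, round(cosine_series(x), 3))

# Outside [0, 5] the series follows the even, period-10 extension instead.
print(-2, round(cosine_series(-2), 3))
```

At x = −2 the sum approaches 5 rather than 1, because the series represents the even extension −x + 3 there and not the line y = x + 3.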
Appendix G lists all the formulas for finding Fourier series coefficients. It also contains a brief table of some
of the integrals that you will need most often in this chapter. Any problem that cannot be solved with
u-substitution, integration by parts, or that table will be marked with a computer icon.
After you integrate, remember that n (the summation index) is always an integer, and for any integer n:
$\sin(\pi n) = 0$ and $\cos(\pi n) = (-1)^n$.
9.37 Find the period of each function.
(a) sin x
(b) sin(2x)
(c) sin(x∕5)
(d) sin(3x) + sin(6x) + sin(21x)
(e) tan x

9.38 (a) Find the period of the function sin(2x) + sin(10x).
(b) Find the period of the function sin(2x) + sin(5x).
(c) Choose a constant a such that the function sin(2x) + sin(ax) is not periodic at all.

9.39 The function g(x) = cos(px) has the property that g(x) = g(x + 6) for all x-values.
(a) The function g(x) might have a period of 6. What value of p would lead to this period?
(b) The function g(x) might also have a period of 3. What value of p would lead to this period?
(c) What are all the possible periods of g(x), and their corresponding p-values?
(d) If you sum a set of functions with all the periods in Part (c), what is the period
of the resulting function?

9.40 Figure 9.11 shows the n = 1, n = 3, and n = 6 terms of a Fourier sine series. Draw the n = 2,
n = 4, and n = 5 terms. You don’t need to put numbers on the vertical axis since there’s
no way for you to know the amplitudes without knowing what function is being modeled.
Do include numbers on the horizontal axis, though, because you should be able to show
each plot with the correct periods. No math is required here, just a quick sketch!

9.41 The function f(x) is defined as $x^2$ on the domain 0 ≤ x ≤ 3.
(a) Draw an even extension of f(x). If your even extension is repeated forever, what is the
period of the resulting function?
(b) Draw an odd extension of f(x). If your odd extension is repeated forever, what is the
period of the resulting function?
(c) Draw a simple periodic extension of f(x), neither even nor odd. What is the period of
the resulting function?
(d) Repeat Parts (a)–(c) for the function $g(x) = e^x$ on the same domain.

9.42 The function f(x) equals x from x = −4 to x = 4 and is extended periodically
from there, with a period of 8.
(a) How can you know without doing any calculations that the Fourier series for
f(x) contains only sine terms?
(b) Write a sine function with period 8.
(c) Write a sine function with period 4. This function will also repeat when x increases by 8.
(d) What is the next largest period that a sine function can have and still repeat
when x increases by 8? Write a sine function with that period.
(e) Write a Fourier series for f(x) without the coefficients filled in:
$f(x) = \sum b_n \sin(\text{something}\; x)$ where $b_n$ is not-yet-determined, but where you do fill
in what that “something” is.
(f) Find the coefficients $b_n$ using Equation 9.4.2 and write the Fourier
series for f(x). Simplify your answer as much as possible.

9.43 [This problem depends on Problem 9.42.] For the series you found in Problem 9.42, have a
computer draw the 1st, 5th, 20th, and 100th partial sums on the domain [−12, 12]. Sketch the
results and explain why they make sense.

9.44 Walk-Through: Building a Fourier Series (Finite Domain). In this problem you’ll
find a Fourier series of the function $f(x) = e^x$ on the domain 0 ≤ x ≤ 5.
(a) We define a new function g(x) that is an odd extension of f(x). In other words, in
the domain 0 ≤ x ≤ 5, $g(x) = e^x$. In the domain −5 < x ≤ 0, $g(x) = -e^{-x}$. Beyond
that g(x) repeats with period 10. Sketch a graph of g(x) from x = −30 to x = 30.
(b) If you create a Fourier series to represent g(x), you know without doing any work that
either the $a_n$ or the $b_n$ coefficients will be zero. Which one, and how do you know?
(c) Use Equation 9.4.2 to write down integrals for the non-zero coefficients. Note
that you will have to split your integrals into two parts: one going from −5 to
0, and the other from 0 to 5.
(d) Explain how you can know without doing any calculations that both parts of your
integral will give the same result. This means you can simply evaluate one of
them and multiply the result by 2.
(e) Evaluate your integrals and write the resulting Fourier series. Simplify as much as possible.

9.45 [This problem depends on Problem 9.44.] For the series you found in Problem 9.44, have a
computer draw the 1st, 5th, 20th, and 100th partial sums on the domain [−10, 10]. Sketch the
results and explain why they make sense.

9.46 [This problem depends on Problem 9.44.] Repeat Problem 9.44 with an even extension
of the function.

9.47 [This problem depends on Problem 9.44.] Repeat Problem 9.44 doing neither an odd nor an
even extension, creating a Fourier series with a period of 5. Hint: Remember that you
do not have to integrate from −L to L, but can integrate along any full period.

For Problems 9.48–9.50, the function f(x) is defined on the given interval and repeats periodically
outside of it. Find its Fourier series. You should be able to evaluate the integrals by hand or with
the table in Appendix G.

9.48 \[ f(x) = \begin{cases} -1 & 0 < x < 3 \\ 1 & 3 < x < 6 \end{cases} \]
9.49 \[ f(x) = \begin{cases} 1 & 0 < x < 3 \\ 2 & 3 < x < 6 \end{cases} \]
9.50 \[ f(x) = \begin{cases} -1 & -2 < x < -1 \\ 1 & -1 < x < 1 \\ -1 & 1 < x < 2 \end{cases} \]
9.51 f(x) = x on −2 ≤ x ≤ 2

9.52 Appendix G gives the formula for a “Fourier Sine Series.” Derive this formula based
on Equations 9.4.1–9.4.2. Hint: Start by creating an odd extension.

For Problems 9.53–9.57, find a Fourier sine series and a Fourier cosine series for the given function
on the given domain. For problems not marked with a computer icon you should be able to evaluate
the integrals by hand or with the table in Appendix G.

9.53 f(x) = 3 on 0 ≤ x ≤ 1
9.54 f(x) = 1 − x on 0 ≤ x ≤ 1
9.55 $f(x) = x^2$ on 0 ≤ x ≤ 2
9.56 $f(x) = x^4$ on 0 ≤ x ≤ 2
9.57 f(x) = sin x on 0 ≤ x ≤ 𝜋

9.58 A square wave is represented by the following function for all t ≥ 0.
\[ f(t) = \begin{cases} 10 & 0 \le t < 1 \\ -10 & 1 \le t < 2 \\ 10 & 2 \le t < 3 \\ \ldots & \text{and so on} \end{cases} \]
Write a Fourier series that represents this function. Since the function is only defined
for positive t-values, it doesn’t matter what your function does for t < 0.

9.59 Rework the guitar string problem that we solved in the Explanation (Section 9.4.2),
but this time assume the string is plucked up to a height of 3 cm at a point 2/3 of the
way along the string instead of directly in the middle. How does the amplitude of the
first overtone (the second non-zero term in the Fourier series) compare to what we got
for the plucked-in-the-middle string?
[Figure: square wave plotted from x = −2𝜋 to 2𝜋.]
\[ f(x) = \frac{4}{\pi}\sin(x) + \frac{4}{3\pi}\sin(3x) + \frac{4}{5\pi}\sin(5x) + \frac{4}{7\pi}\sin(7x) + \ldots \]
Anything that can be expressed with sines and cosines can also be expressed with complex
exponential functions. In this exercise you will find the complex exponential series that
represents this square wave.
1. Starting with Euler’s equations $e^{ix} = \cos x + i \sin x$ and $e^{-ix} = \cos x - i \sin x$, derive a
formula for $\sin x$ in terms of $e^{ix}$ and $e^{-ix}$.
2. Using that formula, replace all the sines in the above expression with complex exponentials.
3. Regroup your answer to find the coefficients in the following series for our square wave.
\[ f(x) = c_0 + c_1 e^{ix} + c_{-1} e^{-ix} + c_2 e^{2ix} + c_{-2} e^{-2ix} + c_3 e^{3ix} + c_{-3} e^{-3ix} + \ldots \]
(a) What is $c_0$?
(b) What is $c_1$?
(c) What is $c_{-3}$?
See Check Yourself #60 in Appendix L
(d) What is $c_n$ for any even integer n?
(e) What is $c_n$ for any odd integer n?
\[ f(x) = \sum_{n=-\infty}^{\infty} c_n e^{i(n\pi/L)x} \tag{9.5.1} \]
\[ c_n = \frac{1}{2L}\int_{-L}^{L} f(x) e^{-i(n\pi/L)x}\,dx \tag{9.5.2} \]
Just as with the sine-and-cosine version, you can evaluate this integral over any interval of
length 2L.
∙ In a trig-based Fourier series, the constant term is divided by 2L while all the other
terms are divided by L. In this version, the constant term c0 uses the same formula as
all the other terms.
∙ The original function f (x) is not just the real part of the series, the imaginary part of
the series, or the modulus of the series; it is the entire series. If f (x) is a real function
then the series will add up to a real value. (This means that c−n will be the complex
conjugate of cn .)
In Section 9.2 we found the Fourier series for the function $x^2 + 3$ on the domain
[−𝜋, 𝜋]. Here we will find the complex Fourier series for the same function on the
same domain.
With that we have all the terms that make up our series.
\[ x^2 + 3 = \frac{\pi^2}{3} + 3 + \sum_{n=-\infty}^{\infty} \frac{2(-1)^n}{n^2}\,e^{inx} \quad (n \neq 0) \]
Interpreting cn
Finding a Fourier series with complex exponentials is a bit easier than it is with sines and
cosines because you only have one set of coefficients cn to find, and because integrals with
exponentials are typically easier than ones with sines and cosines. It’s a bit less obvious how
to interpret the results, though. What does it mean to have an amplitude at p = 3 and at
p = −3, and what does it tell you if one of those amplitudes is 3 + 2i?
You should think of $c_n e^{i(n\pi/L)x} + c_{-n} e^{-i(n\pi/L)x}$ as one term representing an oscillation of
frequency $n\pi/L$, just as you would with $a_n \cos((n\pi/L)x) + b_n \sin((n\pi/L)x)$. When f(x) is a
real function, the interpretation of its Fourier coefficients is aided by the following useful
facts.
You’ll prove all of these in Problem 9.72. One important consequence of them is that
for a real function f(x) you know $c_{-n}$ once you know $c_n$, so all of the information is in the
coefficients for n ≥ 0. In particular, you’ll show in Problem 9.73 that for a real function the
amplitude of oscillations at frequency $(\pi/L)n$ is simply $2|c_n|$.
We have seen that when we build a Fourier series for a function defined on a finite domain,
we have three ways of building a periodic extension of the function. You can build an “odd”
or “even” copy of the function and then build that out periodically, which gives you the
advantage of working with only sines or only cosines. But with complex exponentials this
advantage disappears, so if you are going to use a complex exponential series you usually
just extend the function itself without doubling its period.
Parseval’s Theorem
When we express a function in the traditional f (x) format, we say “for this value of x the value
of the function is that” for every possible x-value. When we express the same function with
a Fourier series, we convey the same information in a different way: “for this frequency the
two coefficients of the function are those” for every possible frequency. “Parseval’s theorem”
provides an important connection between these two different representations.
Parseval’s Theorem
Suppose $c_n$ are the Fourier coefficients of a periodic function f(x). The average value of $|f(x)|^2$ over
one full period equals $\sum_{n=-\infty}^{\infty} |c_n|^2$.
One way to interpret Parseval’s theorem is in terms of the “intensity” of a light wave, which
is a measure of how much energy it carries. For the simplest possible light wave—one fre-
quency only—the intensity is proportional to the square of the amplitude. Parseval’s theorem
gives us mathematical assurance that for a more complicated light wave made up of many
different frequencies, the total intensity is the sum of the intensities of the individual fre-
quencies.
When applying Parseval’s theorem it is useful to remember that the average value of any
function g (x) on the interval a ≤ x ≤ b is:
\[ \frac{1}{b - a}\int_{a}^{b} g(x)\,dx \tag{9.5.3} \]
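Parseval's theorem can be checked numerically for the function $x^2 + 3$ on [−𝜋, 𝜋], whose complex coefficients are quoted earlier in this section. This is a minimal sketch under stated assumptions: the helper names, the midpoint-rule integration, and the cutoff of 200 terms are choices made for this illustration. One side is the average of $|f(x)|^2$ from Equation 9.5.3; the other is the sum of $|c_n|^2$.

```python
import cmath
import math

def f(x):
    return x * x + 3

def midpoints(steps=20000):
    """Midpoints of equal subintervals of [-pi, pi], plus the step size."""
    h = 2 * math.pi / steps
    return [-math.pi + (k + 0.5) * h for k in range(steps)], h

def c_numeric(n):
    """Equation 9.5.2 with L = pi, approximated by the midpoint rule."""
    xs, h = midpoints()
    return sum(f(x) * cmath.exp(-1j * n * x) for x in xs) * h / (2 * math.pi)

def c_closed(n):
    """Coefficients quoted in the text for x^2 + 3 on [-pi, pi]."""
    return math.pi ** 2 / 3 + 3 if n == 0 else 2 * (-1) ** n / n ** 2

def average_of_square():
    """Equation 9.5.3 applied to |f(x)|^2 over one full period."""
    xs, h = midpoints()
    return sum(abs(f(x)) ** 2 for x in xs) * h / (2 * math.pi)

coefficient_sum = sum(abs(c_closed(n)) ** 2 for n in range(-200, 201))
print(c_numeric(3), c_closed(3))              # numeric and closed-form c_3
print(average_of_square(), coefficient_sum)   # the two sides of Parseval's theorem
```

The two printed values in the last line should agree to several decimal places, and the agreement improves as more coefficients are included.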
\[ \int_{a}^{b} f^{*}(x)\,g(x)\,dx = 0 \]
A real function is its own complex conjugate, so our earlier definition can be seen as a special case
of this one.
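You’ll show in Problem 9.74 that the functions $e^{in\pi x/L}$ for all integers n are orthogonal on
the interval −L < x < L. Specifically:
\[ \int_{-L}^{L} e^{i\left(\frac{n\pi}{L}\right)x}\,e^{-i\left(\frac{m\pi}{L}\right)x}\,dx = \begin{cases} 0 & n \neq m \\ 2L & n = m \end{cases} \]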
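Now consider the complex Fourier series of Equation 9.5.1:
\[ f(x) = \sum_{n=-\infty}^{\infty} c_n e^{i(n\pi/L)x} \]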
The orthogonality of the set allows us to use the same trick we used for sines and cosines.
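To find the $m$th term we multiply both sides of the equation by $e^{-i(m\pi/L)x}$ and then integrate
from −L to L.
\[ \int_{-L}^{L} f(x)\,e^{-i\left(\frac{m\pi}{L}\right)x}\,dx = \sum_{n=-\infty}^{\infty} c_n \int_{-L}^{L} e^{i\left(\frac{n\pi}{L}\right)x}\,e^{-i\left(\frac{m\pi}{L}\right)x}\,dx \]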
Remember how this trick works? The series on the right has an infinite number of terms,
but orthogonality guarantees us that all but one of them—the one where m = n—are zero.
When m does equal n, the integral is 2L, as we saw above. So the entire infinite series just
becomes:
\[ \int_{-L}^{L} f(x)\,e^{-i\left(\frac{m\pi}{L}\right)x}\,dx = c_m (2L) \]
Dividing both sides by 2L gives Equation 9.5.2.
Appendix G lists all the formulas for finding Fourier series coefficients. It also contains a brief table of some
of the integrals that you will need most often in this chapter. Any problem that cannot be solved with
u-substitution, integration by parts, or that table will be marked with a computer icon.
After you integrate, remember that n (the summation index) is always an integer and that for any integer
n, positive or negative,
\[ e^{in\pi} = (-1)^n \]
9.60 This problem is not directly about Fourier series per se, but it gives you a few formulas
that will be useful in simplifying complex Fourier series.
(a) Show that for any integer n you can simplify $e^{in\pi}$ to $(-1)^n$. (You can do this by
drawing values on the complex plane, or by using Euler’s formula.)
(b) If n is an integer then what is $e^{in\pi} - e^{-in\pi}$?
(c) If n is an integer then what is $e^{2in\pi}$?

In Problems 9.61–9.68 find the Fourier series, in complex exponential form, for the given functions.
If the function is defined on a finite domain make a periodic extension of it (not odd or even).
Simplify your answers as much as possible. For problems not marked with a computer icon you should
be able to evaluate the integrals by hand, or with the table in Appendix G.

9.61 The square wave whose function is given below
9.62 The “sawtooth function” shown below
[Figure: sawtooth function V plotted against t from −2𝜋 to 2𝜋, with values between −1 and 1.]
9.63 The “triangle wave” shown below
[Figure: triangle wave f(x) plotted from x = −𝜋 to 𝜋, with a height of 1.]
9.64 y = sin(3x)
9.65 f(x) = x on the domain [0, 5]
9.66 $f(x) = e^x$ on the domain [0, 2]
9.67 $f(x) = x + ix^2$ on [−1, 1]
9.68 $f(x) = xe^{ix}$ on [0, 𝜋]
function on the same domain. Show that the two expressions are equivalent.
9.70 Show that if $c_{-n}$ is the complex conjugate of $c_n$ for all n, then the resulting Fourier series represents a real-valued function.
9.71 (a) Show that if $c_{-n}$ and $c_n$ are real and equal to each other for all n, then the resulting Fourier series represents an even real-valued function.
(b) What relationship between $c_{-n}$ and $c_n$ would lead f(x) to be real and odd?
9.72 Prove the “fun facts about complex Fourier series of real functions” on Page 469, using Equation 9.5.2. Hint: expand out the complex exponential with Euler’s formula. You’ll find that $c_n$ comes out imaginary and that $c_{-n} = -c_n$.
9.73 If f(x) is real then its Fourier coefficients obey $c_{-n} = c_n^*$, so the oscillation with frequency n𝜋∕L can be written as $c_n e^{i(n\pi/L)x} + c_n^* e^{-i(n\pi/L)x}$. Let $c_n = A + iB$, where A and B are real.
(a) Use Euler’s formula to rewrite the oscillation without i in it. Your answer should depend on A, B, n𝜋∕L, and x.
(b) Using the fact that a cos(px) + b sin(px) is an oscillation with amplitude $\sqrt{a^2 + b^2}$, write the amplitude of this oscillation as a function of A and B.
(c) Show that the amplitude equals $2|c_n|$.
9.74 In this problem you will evaluate $\int_{-L}^{L} e^{i(n\pi/L)x}\,e^{-i(m\pi/L)x}\,dx$. (This result was used in deriving Equation 9.5.2.)
(a) Combine the exponentials into one and evaluate the integral. Using the properties of complex exponentials and the fact that n and m are integers, show that your answer reduces to 0.
(b) Explain why the argument you made in Part (a) doesn’t apply when n = m. (You’ll redo the calculation for this case in Part (c). Here you just need to explain why the calculation you did in Part (a) fails in this case.)
(c) Simplify the integrand in the case n = m and evaluate the integral.
9.75 You can use Parseval’s theorem to sum series that would be hard to calculate otherwise.
(a) Find the coefficients of the Fourier series (in complex exponential form) for f(x) = x from x = −𝜋 to 𝜋, repeated periodically thereafter.
(b) Using your answer to Part (a) write $\sum_{n=-\infty}^{\infty} |c_n|^2$ in the form <a constant> $\sum_{n=1}^{\infty} (1/n^2)$. Hint: Combine $|c_n|^2$ and $|c_{-n}|^2$ into one term.
(c) Find the average value of $f(x)^2$ between x = −𝜋 and x = 𝜋. (Use Equation 9.5.3.)
(d) Use Parseval’s theorem and your answers to Parts (b)–(c) to find $\sum_{n=1}^{\infty} (1/n^2)$.
9.76 [This problem depends on Problem 9.75.] Use the same method you used in Problem 9.75, starting with the Fourier series for f(x) = −1 from x = −1 to 0, f(x) = 1 from x = 0 to 1 (repeated periodically afterwards). Your final answer should tell you the sum $\sum_{n=1}^{\infty} (1/n^2)$ for odd n only.
9.77 [This problem depends on Problem 9.75.] Use the same method you used in Problem 9.75, starting with the Fourier series for $f(x) = x^2$ from x = −𝜋 to 𝜋 (repeated periodically afterwards). Your final answer should tell you the sum $\sum_{n=1}^{\infty} (1/n^4)$.
9.78 In quantum mechanics, the state of a particle is described by a complex “wavefunction” 𝜓(x). If $\psi(x) = Ce^{ikx}$ then the particle has momentum k. If $\psi(x) = Ae^{ikx} + Be^{ipx}$ then the particle might have momentum k or it might have momentum p; the probability of each possibility is proportional to the amplitude A or B squared. For example, if $\psi(x) = 2e^{7ix} + 3e^{-5ix}$ then the particle either has momentum 7 or −5, and −5 is 9∕4 times as likely as 7. Since the two probabilities must add to 1, they are 4∕13 and 9∕13 respectively.
(a) If 𝜓(x) = cos(2x) what are its possible momenta and what are the probabilities for each one? (You can rewrite this function as a sum of complex exponentials without using Equation 9.5.2.)
(b) If 𝜓(x) = −1 for −𝜋 < x < 0 and 1 for 0 < x < 𝜋 (and is periodic for other values), how many times more likely is it to have momentum 3 than momentum 7?
5. Calculate the coefficients $c_0$–$c_3$ of the complex exponential Fourier series for
each of the three functions you considered above, with periods 2𝜋, 4𝜋, and 1000.
(If you don’t have access to a computer while you’re doing this exercise, skip this
part and the next one and at the end make a prediction as to what you would have
seen.)
6. In the limit where you consider the entire function $e^{-x^2}$ from −∞ to ∞, what happens
to the amplitudes $c_n$?
7. Given your answer to Part 3, explain why your answer to Part 6 makes sense.
Suppose you wanted to write a series to represent a complete non-periodic function like
$f(x) = e^{-x^2}$. What frequencies would you include? All of them! As the period of f increases
the spacing between frequencies in the Fourier series decreases. In the limit where f never
repeats, you use frequencies at every real number. The Fourier series becomes a “Fourier
integral.”
Fourier Transforms
A non-periodic function f (x) can be represented by a Fourier integral:
$$f(x) = \int_{-\infty}^{\infty} \hat{f}(p)\,e^{ipx}\,dp \qquad (9.6.1)$$
The function f̂(p) is the “Fourier transform” of f (x), and can be calculated from:
$$\hat{f}(p) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\,e^{-ipx}\,dx \qquad (9.6.2)$$
The original function f (x) is called the “inverse Fourier transform” of f̂(p).
The Fourier transform f̂(p) plays essentially the same role in a Fourier integral that the
coefficients cn play in a Fourier series. If f̂(7.2) = 3 that doesn’t mean anything in particular
about f (7.2). Instead it means that if you express f (x) as a sum of oscillations, the one with
frequency 7.2 will have an amplitude of 3. (Well, it means roughly that. We’ll get more precise
about it a little later, but if you understand this point qualitatively you’ve got a good start on
understanding Fourier transforms.)
A quick note about terminology may be helpful here. For a periodic function we find the
Fourier coefficients and use them to write the original function as a Fourier series. For a non-
periodic function we find the Fourier transform and use it to write the original function as a
Fourier integral. But in an annoying accident of history, people tend to refer to the first process
as “finding the Fourier series” (not the coefficients) and the second as “finding the Fourier
transform” (not the integral). Neither phrase is a problem by itself, but it would be nice if
they were parallel to each other. Sorry about that.
We can combine the exponentials within each integral and we’re just left with the
integral of an exponential. After a bit of algebra we get the following.
$$\hat{f}(p) = \frac{1}{\pi\left(1 + p^2\right)}$$
The facts about Fourier series of real functions on Page 469 also apply to Fourier trans-
forms. If f (x) is real that means f̂(−p) is the complex conjugate of f̂(p); if f (x) is real and even
that means f̂(p) is real and even; if f (x) is real and odd that means f̂(p) is imaginary and odd.
In some physical contexts the Fourier transform of a function is called its “spectrum.” For
example, if A(t) represents the amplitude of a light wave hitting a detector (such as your eye)
then $\hat{A}(\omega)$ tells you how much of that wave is made up of each possible frequency. Since each
frequency of light represents a color, the spectrum tells you how much of each color is in a
given light wave. Although this is the most familiar use of “spectrum,” scientists sometimes
extend the term to Fourier transforms of other types of waves.
Parseval’s theorem can also be extended to Fourier transforms. With the convention of
Equation 9.6.2 it reads $\int_{-\infty}^{\infty} |f(x)|^2\,dx = 2\pi\int_{-\infty}^{\infty} |\hat{f}(p)|^2\,dp$. One important application comes from
quantum mechanics, where the wavefunction 𝜓 represents position probabilities and its Fourier
transform represents momentum probabilities. A proper wavefunction should be “normalized”
(that is, the sum of all probabilities should be one), and Parseval’s theorem assures us that
normalizing one representation also normalizes the other, up to that constant factor.
[FIGURE 9.13: plot of a function versus x, with the x-axis marked from −4 to 4]
Answer:
This function sums two waves: one with a large amplitude and a period of 2, and one
with a smaller amplitude and a period a tenth of the first. (You can count the number
of small peaks in each period of the large wave.) So the Fourier transform has a sharp
peak at p = 𝜋 and a smaller peak at p = 10𝜋.
[Figure: plot of f̂(p) versus p, with axis ticks at −10π, π, and 10π]
The examples above represent the core skill that we want to develop in this section, which
is graphically going from a function to its Fourier transform and vice versa. Please make sure
you follow all of them before proceeding.
But now we would like to make a subtler point, which is that the formulas for finding f (x)
from f̂(p) (Equation 9.6.1) and finding f̂(p) from f (x) (Equation 9.6.2) are nearly identical.
The numerical factors are different and the exponent sign is reversed, but qualitatively both
equations describe the same kind of transformation. The symmetry of this transformation
gives you another level of insight.
For instance, Figure 9.15 shows a function f(x) that spikes sharply at x = ±3. What is its
Fourier transform? If you try to answer this question directly, by figuring out what oscillatory
periods make up such a spike, you probably won’t get far. So instead let’s ask the opposite
question: “f(x) is the Fourier transform of what function?” That’s like the first example
above, and the answer is a sinusoidal wave with frequency 3. We conclude, because of the
symmetry of these transformations, that the Fourier transform of Figure 9.15 would look like cos(3p).
[FIGURE 9.15: f(x) versus x, with the x-axis marked from −4 to 4]
If f̂(p) spikes at p = 3 that means f (x) oscillates with frequency 3, like cos(3x).
If f (x) spikes at x = 3 that means f̂(p) oscillates with frequency 3, like cos(3p).
Test yourself: If we move the spikes in Figure 9.15 farther away from x = 0, what will happen
to the Fourier transform? What if we move the spike to x = 0? Think about both of these ques-
tions backward—pretend the spikes represent the Fourier transform and you want to find
the function instead of the other way around—and formulate your answers before reading
further.
OK, here are the answers. If you move the spike to higher |x|-values then the Fourier trans-
form (which was a frequency-3 oscillation) increases in frequency. If the spike peaks around
zero then the Fourier transform is a constant. (This is the reverse of the third example
above.) We are presenting these puzzles to develop your general understanding of Fourier
transforms, but these two specific results—“moving a function away from x = 0 increases the
frequency of its transform” and “the transform of a spike at zero is a constant”—are also
worth holding onto as you move forward.
How “sharp” must these peaks be? Strictly speaking, the integral in Equation 9.6.2 doesn’t
exist for either an infinitely thin spike or a perfect sine or cosine wave, so all the examples
above are limiting cases. Figure 9.16 shows what happens to f̂(p) when the peaks that make
up f (x) get wider. The transform still looks like a cosine wave near p = 0, but the wider the
peaks in f (x) the more rapidly f̂(p) falls off with increasing |p|. In the limit where the peaks
in f (x) are so wide that f is approaching a constant, f̂(p) becomes a thin spike at p = 0. In
the opposite limit where the peaks are getting infinitely thin, the transform is approaching
a perfect sine wave.
[Figure: five columns of plots, each with f(x) versus x on top and f̂(p) versus p below, all axes marked from −4 to 4]
FIGURE 9.16 Each column shows f (x) on top and its Fourier transform f̂(p) below it. When f (x) is sharply peaked
at ±3, f̂(p) is roughly a sine or cosine wave with period 2𝜋∕3. When the peaks in f (x) get wider, f̂(p) falls off as it
moves away from p = 0.
You might protest that we have been flagrantly violating that restriction. Our examples
above included the Fourier transform of a superposition of cosines and the Fourier transform
of a constant, neither of which drops to zero. Both of those cases should properly be viewed
as limits. For instance, if you have a function that is constant for a very long time, and then
drops to zero at the distant edges, its Fourier transform will approach a spike at p = 0. As
those edges get farther away, the spike gets thinner, approaching the drawing we showed
above.3
One of the most important functions that does have a Fourier transform is a “Gaussian,”
such as $f(x) = e^{-x^2}$. We discuss Gaussian functions below.
Simplest case: $$f(x) = e^{-x^2} \;\rightarrow\; \hat{f}(p) = \frac{1}{2\sqrt{\pi}}\,e^{-p^2/4}$$
General case: $$f(x) = Ae^{-[(x-x_0)/w]^2} \;\rightarrow\; \hat{f}(p) = \frac{Aw}{2\sqrt{\pi}}\,e^{-ipx_0}\,e^{-w^2p^2/4}$$
So if you start with a Gaussian f (x) you end up with a Gaussian f̂(p). If you change f (x) that
causes changes to f̂(p). For instance, if you double A that doubles the height of f (x). This in
turn doubles f̂(p) because, as we already said, the Fourier transform is linear. We’ll discuss x0
below, but for now take it to be zero.
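With $x_0 = 0$ the general formula predicts $\hat{f}(p) = \frac{Aw}{2\sqrt{\pi}}e^{-w^2p^2/4}$, and that prediction can be compared with a direct numerical evaluation of Equation 9.6.2. The sketch below is ours rather than the book's: the function names, the integration window, and the sample count are illustrative assumptions, and the infinite integral is simply cut off where the Gaussian is negligible.

```python
import cmath
import math

def fourier_transform(f, p, half_width=12.0, samples=4000):
    """Approximate f_hat(p) = (1/(2 pi)) * integral f(x) exp(-i p x) dx (Equation 9.6.2)
    with the midpoint rule on [-half_width, half_width].
    Assumes f is negligible outside that interval."""
    h = 2.0 * half_width / samples
    total = 0j
    for j in range(samples):
        x = -half_width + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * p * x)
    return total * h / (2.0 * math.pi)

def gaussian(A, w):
    # Gaussian centered on zero (x0 = 0): A * exp(-(x/w)^2)
    return lambda x: A * math.exp(-((x / w) ** 2))

def predicted(A, w, p):
    # Closed form with x0 = 0: (A w / (2 sqrt(pi))) * exp(-w^2 p^2 / 4)
    return A * w / (2.0 * math.sqrt(math.pi)) * math.exp(-((w * p) ** 2) / 4.0)

for A, w in [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0)]:
    for p in (0.0, 0.5, 1.5):
        numeric = fourier_transform(gaussian(A, w), p)
        # The numerical real part and the closed form should agree closely
        print(A, w, p, numeric.real, predicted(A, w, p))
```

Comparing the rows for A = 1 and A = 2 shows the doubling described above. Comparing w = 1 with w = 2 shows the transform getting taller at p = 0 and falling off faster as |p| grows.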
3 In the limit where the edges get infinitely far away the spike gets infinitely tall and thin. That limit—an infinitely
tall, thin spike—is called a “Dirac delta function” (although it’s not technically a function). We discuss Dirac delta
functions further in Problem 9.93 of this section, and much more carefully in Chapter 10.
Question: Sketch the Fourier transforms of $f(x) = 1/(1 + e^{x^2})$ and
$g(x) = 1/(1 + e^{(x/4)^2})$.
Answer:
We cannot find the Fourier transforms of these functions analytically. (Try it!)
However, a quick sketch shows that f (x) and g (x) are both bumps at x = 0, with g (x)
four times wider than f (x). So their Fourier transforms will look like bumps at p = 0,
but with f̂(p) four times wider than ĝ (p).
[Figure: two plots, the left versus x (axis marked from −20 to 20) and the right versus p (axis marked from −10 to 10)]
The plot on the left shows the functions f (x) (dashed) and g (x) (solid). The plot on the right shows their
Fourier transforms.
The (solid) g (x) is the wider of the two functions, so it has the narrower of the two
transforms. We’ll leave it up to you for the moment to think about why the f̂(p) plot is
shorter than the ĝ (p) plot. See Problem 9.95.
The fact that wide peaks have narrow Fourier transforms and vice versa is critical in quantum
mechanics. The “wavefunction” 𝜓(x) describes the position of the particle, and when it
is distributed over a wide range of x-values that means the particle’s position is very uncertain.
The Fourier transform $\hat{\psi}(p)$ similarly describes the particle’s momentum. So if the position
wavefunction is sharply peaked (small uncertainty in position) the uncertainty in momen-
tum will be large, and vice versa. This result is called the “Heisenberg Uncertainty Principle”
and it is used to calculate quantities ranging from the properties of atoms to the sizes of
neutron stars.
We begin with a purely mathematical observation: every time you move a function to the
left by 1 unit, its Fourier transform multiplies by $e^{ip}$. (You will prove this in Problem 9.94.) As
an example, we have seen that the Fourier transform of the Gaussian function $e^{-x^2}$ is another
Gaussian, so we now know that the Fourier transform for $e^{-(x+x_0)^2}$ is a Gaussian multiplied by
$e^{ix_0p}$. OK, but what does that look like?
Remember that $e^{ix_0p}$ is an oscillation around the complex plane, completing a full rotation
every time p advances by $2\pi/x_0$. When p = 0 the multiplier $e^{ix_0p}$ is just 1, which is to
say, irrelevant: if our function was real, it’s still real. When $p = \pi/(2x_0)$ the multiplier is i,
so the function is pure imaginary. At $p = \pi/x_0$ the function is on the negative real axis,
and so on.
In summary: if we move f (x) to the left, this does not change the modulus of f̂(p) at all. But
it does change its complex phase. If you start with a real even function such as a Gaussian,
and move it to the left by x0 units, the transform looks the same as it did before except that
as you move along the p-axis it rotates around the complex plane with a frequency of $x_0$. We
can’t easily plot that complex Fourier transform, but we can plot the real and imaginary parts of
it. As f̂(p) rotates about the complex plane, the real and imaginary parts both oscillate.
Moving the plot of a function f(x) away from x = 0 causes its transform to oscillate. The farther
away from zero you go, the higher the frequency of oscillation (Figure 9.18). When the argument
is time instead of space, this result is called the “time-delay theorem.”
[Figure: f(x) versus x on the left and f̂(p) versus p on the right, both axes marked from −10 to 10]
FIGURE 9.18 For the Gaussian on the left centered on zero (dashed curve) the transform is real, so the plot on
the right shows f̂(p). For the Gaussian on the left centered on x = −2 (solid curve) the transform is complex, so the
plot on the right shows the real part of f̂(p). The imaginary part also oscillates with an amplitude that decreases with p,
but 90° out of phase with the real part.
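The shift property can be checked numerically for the case in Figure 9.18: a Gaussian moved to the left by $x_0 = 2$ should have the same modulus as the centered one, with an extra phase factor $e^{2ip}$. The sketch below makes its own assumptions (function names, integration window, sample count) and applies Equation 9.6.2 directly by quadrature.

```python
import cmath
import math

def fourier_transform(f, p, half_width=14.0, samples=4000):
    """Approximate (1/(2 pi)) * integral f(x) exp(-i p x) dx with the midpoint rule.
    Assumes f is negligible outside [-half_width, half_width]."""
    h = 2.0 * half_width / samples
    total = 0j
    for j in range(samples):
        x = -half_width + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * p * x)
    return total * h / (2.0 * math.pi)

def centered(x):
    # Gaussian centered on x = 0
    return math.exp(-x * x)

def shifted(x, x0=2.0):
    # The same Gaussian moved to the left by x0, so it is centered on x = -x0
    return math.exp(-((x + x0) ** 2))

for p in (0.0, 0.5, 1.0, 2.0):
    g_hat = fourier_transform(centered, p)
    f_hat = fourier_transform(shifted, p)
    # Moduli should match; f_hat should equal exp(i * 2 * p) * g_hat
    print(p, abs(g_hat), abs(f_hat), f_hat, cmath.exp(2j * p) * g_hat)
```

Printing the real part of `f_hat` over a finer grid of p values would trace out the oscillating curve described in the caption of Figure 9.18.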
frequencies from p = 6.9 to p = 7.1 contribute to this integral?” That’s the right question,
and we answer it by integrating.
This issue points to a fundamental difference between a sum and an integral, or between a
discrete function and a continuous one. We can summarize by saying that c7 is the amplitude
of an oscillation with frequency 7, while f̂(7) is the amplitude per unit frequency of oscilla-
tions with frequency close to 7. (Notice that the units work out correctly because of this; the
coefficient of $e^{7ix}$ in a Fourier integral is not f̂(7), but f̂(7) dp.)
Stepping Back
In this section we’ve introduced the Fourier transform, which serves essentially the same role
for non-periodic functions that Fourier coefficients do for periodic ones. If f̂(7) is large then
you know that f (x) varies in some way on the length scale 2𝜋∕7. If you know the entire func-
tion f̂(p) then you can reconstruct f (x) and vice versa. They contain the same information in
two different forms.
If f̂(p) is sharply peaked at one or more frequencies then f (x) varies more or less sinu-
soidally at those frequencies. A sharp peak at p = 0 corresponds to a roughly constant f (x).
If f (x) has a bump in its plot, then f̂(p) will also have a bump. The narrower the bump in
f (x), the wider the bump in f̂(p).
By now you may be impatiently waiting for us to get past all these plots and intuition-stuff
and get into the real business of calculating Fourier transforms for a bunch of functions.
We’re not going to do that. There are functions you can analytically Fourier transform, and
you’ll work some examples in the problems. The number of useful ones is not that large,
though, and you can look them up or calculate them on a computer when you need them.
The hard skill, which you need to focus on, is interpreting the Fourier transform once you
have it. Hopefully this section has given you a good start on that.
CHAPTER 10
∙ model real-world situations with ordinary differential equations and interpret solutions of
these equations to predict physical behavior (see Chapter 1).
∙ given a differential equation and a proposed solution, determine if that proposed solution
works (see Chapter 1).
∙ solve differential equations by “separation of variables” and by “guess and check” (see
Chapter 1).
∙ use initial conditions to determine the arbitrary constants in a general solution (see
Chapter 1).
∙ evaluate integrals with u-substitution and integration by parts.
∙ evaluate and interpret partial derivatives, and set up equations with differentials (see
Chapter 4).
∙ use determinants to see if a set of vectors is linearly independent (for Section 10.6 only, see
Chapter 6).
After you read this chapter, you should be able to …
The following integrals come up in a number of problems throughout this chapter. We’ve
placed them here for easy reference and will refer back to this spot when they are needed.
$$\int e^{kx}\sin(bx)\,dx = \frac{e^{kx}}{k^2 + b^2}\left[k\sin(bx) - b\cos(bx)\right] + C \qquad (k^2 + b^2 \neq 0)$$
$$\int e^{kx}\cos(bx)\,dx = \frac{e^{kx}}{k^2 + b^2}\left[k\cos(bx) + b\sin(bx)\right] + C \qquad (k^2 + b^2 \neq 0)$$
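A table entry like these can always be checked by differentiating the right side and comparing with the integrand. The following sketch does that numerically. It is an illustration under our own assumptions: the helper names, the step size, and the sample values k = 0.7 and b = 3 are arbitrary choices.

```python
import math

def antiderivative_sin(x, k, b):
    # Table entry for the integral of exp(kx) sin(bx), with C = 0
    return math.exp(k * x) / (k * k + b * b) * (k * math.sin(b * x) - b * math.cos(b * x))

def antiderivative_cos(x, k, b):
    # Table entry for the integral of exp(kx) cos(bx), with C = 0
    return math.exp(k * x) / (k * k + b * b) * (k * math.cos(b * x) + b * math.sin(b * x))

def derivative(F, x, h=1e-5):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2.0 * h)

k, b = 0.7, 3.0
for x in (-1.0, 0.0, 0.4, 2.0):
    error_sin = derivative(lambda t: antiderivative_sin(t, k, b), x) - math.exp(k * x) * math.sin(b * x)
    error_cos = derivative(lambda t: antiderivative_cos(t, k, b), x) - math.exp(k * x) * math.cos(b * x)
    # Both differences should be tiny (limited only by the finite-difference step)
    print(x, error_sin, error_cos)
```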
$$a_2(x)\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)\,y \qquad (10.2.1)$$
Suppose the three different functions c1 (x), c2 (x), and p(x) have the following properties: if
you plug y = c1(x) into Equation 10.2.1 you get 0. If you plug in y = c2(x), you again get 0.
And if you plug in y = p(x), you get out the function A(x).
1. What do you get if you plug y = 3c1 (x) into Equation 10.2.1?
2. What do you get if you plug y = 5c2 (x) into Equation 10.2.1?
3. What do you get if you plug y = 10p(x) into Equation 10.2.1?
4. What do you get if you plug y = 13c1 (x) − 12c2 (x) + p(x) into Equation 10.2.1?
See Check Yourself #63 in Appendix L
5. Would your answers change if Equation 10.2.1 began with a3 (x)y′′′ (x)? (Assume that
the functions c1 (x), c2 (x), and p(x) still gave the same answers as before.)
6. Would your answers change if Equation 10.2.1 contained $a_0(x)y^2$ instead of $a_0(x)y$?
(same comment)
Ly = f (x) (10.2.2)
Suppose the function $y_p$ is a solution to Equation 10.2.2: in other words, suppose $Ly_p = f(x)$.
And suppose the linearly independent functions $y_{c1}, y_{c2}, \ldots, y_{cn}$ are all solutions to the
complementary homogeneous equation: in other words, $Ly_{ci} = 0$ for all i. The fact that L is
a linear operator means that:
$$L\left(y_p + A_1 y_{c1} + A_2 y_{c2} + \ldots + A_n y_{cn}\right) = f(x) \qquad (10.2.3)$$
Make sure you follow how Equation 10.2.3 follows from the definition of a linear operator.
The function $y_p$ is called the “particular solution” because it is just one solution (typically with
no arbitrary constants) to the equation we want to solve. The function $A_1 y_{c1} + A_2 y_{c2} + \ldots + A_n y_{cn}$
is called the “complementary solution” because it solves the complementary homogeneous
equation. The sum of the particular and complementary solutions is itself a solution
to Ly = f(x).
If L is an nth-order differential operator, and if all the $y_{ci}$ functions are linearly independent
functions, then we don’t just have “a solution”: we have the general solution,
because we have the requisite number of independent arbitrary constants. Section 10.6 (see
felderbooks.com) discusses the Wronskian, a tool for determining if a given set of solutions
is linearly independent.
The fact that you can make a solution to any linear equation by combining solutions in
the form $y_p + A_1 y_{c1} + A_2 y_{c2} + \ldots + A_n y_{cn}$ is called the “principle of linear superposition.”
Problem:
Find the general solution to the following differential equation.
$$\frac{d^2y}{dx^2} + 3\frac{dy}{dx} - 4y = 50\cos(2x) \quad\text{or}\quad Ly = 50\cos(2x) \ \text{ where } \ L = \frac{d^2}{dx^2} + 3\frac{d}{dx} - 4 \qquad (10.2.4)$$
Solution:
We begin by solving the complementary homogeneous equation:
$$\frac{d^2y}{dx^2} + 3\frac{dy}{dx} - 4y = 0 \quad\text{or}\quad Ly = 0 \qquad (10.2.5)$$
What kind of guess would make sense here? If we tried a sine, then y and y′′ would
both be sines (and might happily cancel) but y′ would be a cosine (and nothing
would cancel it). You can similarly convince yourself without going through the algebra
for $y = x^2$ and $y = \ln x$ and most other basic functions that the terms couldn’t possibly
cancel. But an exponential function is a reasonable guess, since y and y′ and y′′ would
all look similar. Based on this, we try $y_c = e^{kx}$ in Equation 10.2.5.
$$k^2 e^{kx} + 3k e^{kx} - 4 e^{kx} = 0$$
Dividing through by $e^{kx}$ leaves $k^2 + 3k - 4 = 0$, so k = 1 or k = −4, and the two
complementary solutions are $e^{x}$ and $e^{-4x}$.
For the particular solution, the cosine on the right side of Equation 10.2.4 suggests a guess
that combines sines and cosines:
$$y_p = A\sin(2x) + B\cos(2x) \qquad (10.2.6)$$
These A and B are not arbitrary constants; we’re hoping to find just the right values
that will make a solution. Substitute Equation 10.2.6 into Equation 10.2.4.
1 The method that we call “Guess and Check” is, under certain circumstances, called “The Method of Undetermined
Coefficients.” The function that you plug in, which we call a “guess,” is sometimes called an “ansatz.” An unhealthy
obsession with impressive-sounding words is sometimes called “pedantry.”
For this equation to be true for all values of x, the coefficients of sin(2x) on the left
side must add up to zero, and the coefficients of cos(2x) to 50.
−4A − 6B − 4A = 0 and − 4B + 6A − 4B = 50
Solving, A = 3 and B = −4 are the values that make Equation 10.2.6 a solution.
We have found one “particular” solution yp (x) to Equation 10.2.4, and two
“complementary” solutions yc (x) to Equation 10.2.5. Linearity tells us how to put it all
together. Because Lyp = 50 cos(2x), we know that L(2yp ) would give us 100 cos(2x),
which is not a solution: you cannot multiply your particular solution by an arbitrary
constant. But because Lyc = 0 for both yc functions, we know that any linear
combination of these functions will still give us zero, and adding such a function to yp
will still give us 50 cos(2x). So the general solution, the solution with a full suite of
arbitrary constants, is:
$$y = 3\sin(2x) - 4\cos(2x) + C_1 e^{x} + C_2 e^{-4x}$$
We leave it to you to verify that this solves Equation 10.2.4 for any values of the
constants C1 and C2 .
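One way to carry out that verification without any algebra is to plug the solution into Equation 10.2.4 numerically. The sketch below is only an illustration under our own assumptions. It takes the particular solution 3 sin(2x) − 4 cos(2x) found above and the complementary solutions $e^{x}$ and $e^{-4x}$ (the roots of $k^2 + 3k - 4 = 0$ are 1 and −4), and approximates the derivatives with central differences. The function names and step size are hypothetical choices.

```python
import math

def general_solution(x, C1, C2):
    # y = 3 sin(2x) - 4 cos(2x) + C1 e^x + C2 e^(-4x)
    return (3.0 * math.sin(2.0 * x) - 4.0 * math.cos(2.0 * x)
            + C1 * math.exp(x) + C2 * math.exp(-4.0 * x))

def residual(y, x, h=1e-4):
    """y'' + 3y' - 4y - 50 cos(2x), with y' and y'' from central differences.
    A value near zero means y satisfies Equation 10.2.4 at x."""
    first = (y(x + h) - y(x - h)) / (2.0 * h)
    second = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return second + 3.0 * first - 4.0 * y(x) - 50.0 * math.cos(2.0 * x)

for C1, C2 in [(0.0, 0.0), (1.5, -2.0), (-3.0, 0.25)]:
    worst = max(abs(residual(lambda t: general_solution(t, C1, C2), x))
                for x in (-0.5, 0.0, 0.3, 1.0))
    # Should be small for every choice of constants (finite-difference error only)
    print(C1, C2, worst)
```

Replacing the particular solution by twice itself leaves a residual of 50 cos(2x) rather than zero, which is the numerical counterpart of the remark that you cannot multiply a particular solution by an arbitrary constant.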
In that example we stressed not only how to plug in and work with a guess, but also how
to begin with a reasonable guess. Below we present some guidelines for specific situations.
While the list is by no means exhaustive, it conveys the kinds of things you will look for.
Whenever all the coefficients are constants this guess will lead to a polynomial to solve
for k. That polynomial will generally have n independent solutions; in “Some Things
That Can Go Wrong” below we discuss cases where it doesn’t.
∙ If the coefficients are descending powers of x (a “Cauchy-Euler equation”), try $y = x^k$.
Step yourself through the algebra on that one. (We skipped several steps.) Pay attention
to how the descending powers of x in the differential equation ($x^5$, $x^4$, $x^3$) led to the
common power ($x^{k+3}$) that then dropped out.
Example: $\frac{d^2y}{dx^2} + \frac{3n^2 + 5m^4}{36}\,y = 0$
Solution: $y = C_1\sin\left(\frac{\sqrt{3n^2 + 5m^4}}{6}\,x\right) + C_2\cos\left(\frac{\sqrt{3n^2 + 5m^4}}{6}\,x\right)$
Example: $(\sin x)\frac{d^2y}{dx^2} - (\ln x)\frac{dy}{dx} + 3y = 12$
Solution: y = 4
That solution is easy to verify: if y is a constant then all its derivatives are zero, so it
works. That gives you one simple solution, and that’s all you need for this phase of the
process.
∙ For an exponential function, guess an exponential function with the same power.
Example: $\frac{d^2y}{dx^2} + 3\frac{dy}{dx} + 2y = e^{3x}$
Guess: $y = Ae^{3x} \;\rightarrow\; (9A + 9A + 2A)e^{3x} = e^{3x} \;\rightarrow\; A = \frac{1}{20}$
Solution: $y = \frac{1}{20}e^{3x}$
Example: $3\frac{d^2y}{dx^2} + 2\frac{dy}{dx} + 7y = 14x^3$
Guess: $y = A_3x^3 + A_2x^2 + A_1x + A_0$
→ $7A_3x^3 + (7A_2 + 6A_3)x^2 + (7A_1 + 4A_2 + 18A_3)x + (7A_0 + 2A_1 + 6A_2) = 14x^3$
→ $7A_3 = 14$, $7A_2 + 6A_3 = 0$, $7A_1 + 4A_2 + 18A_3 = 0$, $7A_0 + 2A_1 + 6A_2 = 0$
Solution: $y = 2x^3 - \frac{12}{7}x^2 - \frac{204}{49}x + \frac{912}{343}$
Your first guess might have been a solution of the form $y = Ax^3$. Do you see why that
will generally not be successful without the lower order terms?
Also pay particular attention to the last step of that “guess” where we turned one
equation into four. If two different third-order polynomials equal each other for all
values of x then they must have the same coefficient of $x^3$, and the same coefficient of
$x^2$, and so on: four equations in all.
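The four coefficient equations form a chain: the first gives A₃, the second then gives A₂, and so on down to A₀. The sketch below works through that chain with exact fractions so the arithmetic can be compared with the solution quoted above. It is an illustration only, and the variable and function names are our own.

```python
from fractions import Fraction

# Matching coefficients for 3y'' + 2y' + 7y = 14x^3
# with the guess y = A3 x^3 + A2 x^2 + A1 x + A0
A3 = Fraction(14, 7)              # from 7 A3 = 14
A2 = -Fraction(6) * A3 / 7        # from 7 A2 + 6 A3 = 0
A1 = -(4 * A2 + 18 * A3) / 7      # from 7 A1 + 4 A2 + 18 A3 = 0
A0 = -(2 * A1 + 6 * A2) / 7       # from 7 A0 + 2 A1 + 6 A2 = 0

print(A3, A2, A1, A0)  # prints: 2 -12/7 -204/49 912/343

def left_side_coefficients(a3, a2, a1, a0):
    """Coefficients of x^3, x^2, x and 1 in 3y'' + 2y' + 7y for the cubic guess."""
    return (7 * a3, 7 * a2 + 6 * a3, 7 * a1 + 4 * a2 + 18 * a3, 7 * a0 + 2 * a1 + 6 * a2)

# Should reproduce the right side 14x^3, i.e. coefficients (14, 0, 0, 0)
print(left_side_coefficients(A3, A2, A1, A0))
```

Trying `left_side_coefficients(2, 0, 0, 0)` shows what goes wrong with the guess $y = Ax^3$ alone: the x² and x coefficients come out nonzero, and there is nothing left to cancel them.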
∙ When you have sines or cosines, guess both sines and cosines.
Example: $\frac{d^2y}{dx^2} - 4\frac{dy}{dx} + 5y = 130\cos(2x)$
Guess: y = A sin(2x) + B cos(2x)
→ (−4A + 8B + 5A) sin(2x) + (−4B − 8A + 5B) cos(2x) = 130 cos(2x)
→ A + 8B = 0, − 8A + B = 130
Solution: y = −16 sin(2x) + 2 cos(2x)
Make sure you see why a cosine alone will generally not work, but sines and cosines
together will. Also, as in our previous example, pay attention to the step where one
equation becomes two: the coefficient of sin(2x) on the left equals the coefficient of
sin(2x) on the right, and the same for cos(2x).
We cannot possibly list every case, so it’s better to see the list above as a way of thinking
rather than a checklist. For instance, suppose the inhomogeneous term were $4e^{2x}\cos(3x)$.
What kind of functions would be worth trying? (See Problem 10.22.) Also be aware that
the guessing is harder (but sometimes possible) when the coefficients on the left are not
constant.
As a final (and very important) note: if the inhomogeneous term is a sum of terms, lin-
earity saves the day once again. For instance, to solve Ly = 10 + 3 sin x you can solve Ly = 10
and Ly = 3 sin x separately and then add the two solutions.
If this happens: Your guess leads to a polynomial in k that has no real roots.
…try this: Learn to love complex numbers!
If you know how to read complex functions, this solution tells you clearly that the system
oscillates with a period of 𝜋 and a decaying amplitude. If you need a real solution, you can
convert to $y = e^{-3x}\left[C_1\cos(2x) + C_2\sin(2x)\right]$.
This process is covered in more detail in Chapter 3.
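The step from complex roots to a decay rate and a period can be sketched in a few lines. The equation for this example is not reproduced here, so the code below assumes y′′ + 6y′ + 13y = 0, the simplest constant-coefficient equation whose exponential guess gives k = −3 ± 2i and is therefore consistent with the real solution quoted above. That equation and all of the names in the code are our assumptions.

```python
import cmath
import math

def characteristic_roots(a, b, c):
    """Roots of a k^2 + b k + c = 0, allowing complex values."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

# Assumed example: y'' + 6y' + 13y = 0, so the guess e^{kx} gives k^2 + 6k + 13 = 0
k_plus, k_minus = characteristic_roots(1, 6, 13)
print(k_plus, k_minus)

decay_rate = k_plus.real                 # real part: exponential growth or decay rate
angular_frequency = abs(k_plus.imag)     # imaginary part: angular frequency of oscillation
period = 2.0 * math.pi / angular_frequency
print(decay_rate, angular_frequency, period)

def real_solution(x, C1, C2):
    # Real form e^{(decay rate) x} [C1 cos(omega x) + C2 sin(omega x)]
    return math.exp(decay_rate * x) * (C1 * math.cos(angular_frequency * x)
                                       + C2 * math.sin(angular_frequency * x))

def residual(y, x, h=1e-4):
    """y'' + 6y' + 13y at x, using central differences; near zero for a solution."""
    first = (y(x + h) - y(x - h)) / (2.0 * h)
    second = (y(x + h) - 2.0 * y(x) + y(x - h)) / (h * h)
    return second + 6.0 * first + 13.0 * y(x)

print(residual(lambda t: real_solution(t, 1.0, -2.0), 0.5))
```

The real part of the root sets the decay and the imaginary part sets the oscillation, which is what "reading" the complex solution amounts to.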
If this happens: Your equation has constant coefficients, so you guess an exponential
function. When you solve the quadratic equation for k you get only one
root (a “double root”). So now you have one solution but you need two.
…try this: Multiply your solution by x to get a second solution.
In Problem 10.23 you will show that this always works in this situation. In Section 10.9 (see
felderbooks.com) you will learn the method of “Reduction of Order” which can be used to
derive this result.
In Problem 10.24 you will show that this always works in this situation. In Section 10.9 (see
felderbooks.com) you will learn the method of “Reduction of Order” which can be used to
derive this result.
If this happens: You’re solving for a particular (inhomogeneous) solution, and you
stumble upon a solution to the homogeneous equation.
…try this: Multiply your guess by x. (Yes, we gave you the same advice above, but
for a totally different problem.)
If multiplying by x gives you another solution to the homogeneous equation, try multiplying
by x 2 instead.
(a) Suppose y_1 and y_2 are both solutions of this equation. What is L(y_1 + y_2)?
(b) In general, y_1 + y_2 is not a solution to Equation 10.2.2. Under what
circumstances is it a solution?
10.5 Consider the equation:
$$Ly = 10 \quad \text{where} \quad L = x^3\frac{d^2}{dx^2} + x^2\frac{d}{dx} - x \tag{10.2.8}$$
(a) Show that y_c = x and y_c = 1∕x are both solutions to the complementary
homogeneous equation.
(b) Show that y = (ln x)∕x is not a solution to either the original equation or the
complementary homogeneous equation.
(c) Find a solution to Equation 10.2.8. Hint: The function that did not work in Part (b),
and the fact that L is a linear operator, will enable you to just write down this solution.
(d) Write the general solution to Equation 10.2.8.
10.6 Prove that if L is a linear operator then L^2 is also.

In Problems 10.7–10.22 find the general solution y(x) to the given linear differential
equation. Assume any letter other than x or y is a constant.
10.7 2y′′ + 7y′ − 4y = 20
10.8 2y′′ + 7y′ − 4y = e^x
10.9 2y′′ + 7y′ − 4y = 3e^{−4x}
10.10 y′′ = −k^2 y + a sin(px) (k ≠ p)
10.11 y′′ = −9y + 26e^{2x}
10.12 y′ = 2y + 3x + 5
10.13 y′ + ky = px^2
10.14 y′′ − 4y′ + 4y + 169 cos(3x) = 0
10.15 y′′ + 2y′ + 5y = 25x
10.16 y′′ + ay′ + by + c = 0
10.17 y′′ + ay′ + by + c + dx = 0
10.18 x^2 y′′ − 6xy′ − 18y + 6 = 0
10.19 3x^3 y′′ + x^2 y′ − 8xy = 24x
10.20 ax^3 y′ + bx^2 y = c Hint: for the particular solution start with the guess
y_P = Ax^k and solve for A and k.
10.21 y′′ = −9y′ − 18y + 27e^{−x} − 34 cos(4x)
10.22 y′′ − 2y′ + y = 8e^{3x} cos(2x)

10.23 The Explanation (Section 10.2.2) discusses the unusual case where you plug in the
guess e^{kx} and the equation for k has only one solution, a “double root.”
(a) We begin with the specific case of y′′ − 10y′ + 25y = 0. Guess a function of the
form e^{kx} and show that you find one value of k, and therefore one solution
with one arbitrary constant.
(b) Multiply your answer from Part (a) by x and show that the resulting function is
also a valid solution of the differential equation. Based on this, write the complete
solution with two arbitrary constants.
(c) The more general case is y′′ + 2Ry′ + R^2 y = 0 where R is any constant.
Repeat Parts (a) and (b) for this differential equation to show that the trick always works.
10.24 The Explanation (Section 10.2.2) discusses the unusual case where you plug in the
guess x^k and the equation for k has only one solution, a “double root.”
(a) We begin with the specific case of x^2 y′′ + 3xy′ + y = 0. Guess a function
of the form x^k and show that you find one value of k, and therefore one solution
with one arbitrary constant.
(b) Multiply your answer from Part (a) by ln x and show that the resulting function is
also a valid solution of the differential equation. Based on this, write the complete
solution with two arbitrary constants.
(c) The more general case is x^2 y′′ + (2R + 1)xy′ + R^2 y = 0 where R is any constant.
Repeat Parts (a) and (b) for this differential equation to show that the trick always works.
10.25 The “Fourier transform” of the function f(x) is defined by the formula:
$$\hat{f}(p) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\,e^{-ipx}\,dx$$
Is the operator that turns f(x) into f̂(p) a linear operator or not? Prove your answer.

In Problems 10.26–10.29 a block of mass m is attached to a spring with spring constant k. In
addition to the spring force the block experiences a damping force F_D = −bv where b is a
constant and v is the block’s velocity. For each set of constants given, write the differential
equation for the block’s motion and find the general solution to it. If initial conditions are
given, plug those in and solve for the values of the arbitrary constants. Finally, give a brief
qualitative description of the resulting motion.
10.26 m = 2 kg, b = 6 N⋅s/m, k = 4 N/m.
[Figure: circuit diagram labeled V, C = 10^{−6} F, L = 10 H]
This is a big and important topic, so we have broken it into two sections. This section dis-
cusses variable substitution in general. Section 10.8 focuses on three special cases: Bernoulli’s
equation, a particular form of equation that is unfortunately called “homogeneous,” and
equations where the dependent variable appears only in its derivatives.
Consider the following differential equation for x(t).
$$\frac{dx}{dt} = e^{x-2t} + 2 \tag{10.7.1}$$
You should be able to quickly convince yourself that none of the techniques we’ve discussed
so far will work on this equation. It’s not separable, it’s not linear, it’s not exact, and there’s
no obvious guess to try. Since the problem is the x − 2t term in the exponential, we’ll see if
the substitution u = x − 2t simplifies things. (It will.)
When you evaluate an integral with u-substitution you have to find a relationship between
du and dx, and use that relationship to replace dx with du in your integral. Similarly, we
have to find the relationship to replace dx∕dt with du∕dt in our equation. To do that we
differentiate both sides of u = x − 2t with respect to t. That gives us
$$\frac{du}{dt} = \frac{dx}{dt} - 2$$
We could solve this for dx∕dt and plug that into Equation 10.7.1, but it’s a bit easier to plug
Equation 10.7.1 into this relationship.
$$\frac{du}{dt} = \frac{dx}{dt} - 2 = e^{x-2t} = e^{u}$$
Now we have a separable equation, which gives us ∫ e^{−u} du = ∫ dt, which becomes −e^{−u} =
t + C, and thus u = −ln(C − t). (We replaced C with −C but since C just means “any number”
we kept the same letter.) Finally, we plug in u = x − 2t to get the solution x(t).
x = 2t − ln(C − t)
You can easily check that this solves Equation 10.7.1. As always, it’s worth taking a moment to
consider what this solution is telling us. It says that at some finite time t = C (that depends
on the initial condition) x will approach infinity. That’s not surprising for a function whose
rate of growth goes like the exponent of itself.
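If you would rather let a computer do the checking, here is a minimal sketch. It assumes Equation 10.7.1 is dx∕dt = e^{x−2t} + 2, which is what the substitution step above implies. The value C = 3 and the helper names are arbitrary illustrative choices, not anything from the text.

```python
import math

def x_solution(t, C):
    """Candidate solution x(t) = 2t - ln(C - t), valid only for t < C."""
    return 2 * t - math.log(C - t)

def residual(t, C, h=1e-5):
    """dx/dt - (e^(x - 2t) + 2), with dx/dt estimated by a central difference."""
    dxdt = (x_solution(t + h, C) - x_solution(t - h, C)) / (2 * h)
    return dxdt - (math.exp(x_solution(t, C) - 2 * t) + 2)

C = 3.0  # arbitrary illustrative constant (set by the initial condition)
sample_times = (0.0, 1.0, 2.0, 2.9)
max_residual = max(abs(residual(t, C)) for t in sample_times)
print("largest residual:", max_residual)

# x grows without bound as t approaches C from below
print("x just before t = C:", x_solution(C - 1e-9, C))
```

A residual near zero at every sample time is consistent with the solution being correct, and the last line illustrates the blow-up at t = C.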
More important for our current purpose, though, is how we found that solution. We will
show many examples of variable substitution throughout this section, but they will all follow
the same pattern. Starting with a differential equation for x(t)…
1. Define a new variable u as a function of x and/or t. Use the rules of derivatives to
rewrite dx∕dt (and any higher order derivatives in your equation) in terms of u.
2. Rewrite the differential equation for that new variable. If you chose the right substitu-
tion, the payoff will be an easier differential equation than the one you started with.
Which brings us to…
3. Solve it!
4. Finally, convert back to the original variables.
You have used this process every time you evaluated an indefinite integral with u-substitution.
First, define u(x). Then rewrite the integrand in terms of u, but be careful to also define du
in terms of dx. Integrate the resulting function of u, and finally convert back to x. (In fact
u-substitution is a special case of this method since taking an integral is a special case of
solving a differential equation; evaluating ∫ x sin(x^2) dx means solving the equation df∕dx =
x sin(x^2).)
$$\frac{dx}{dt} = \frac{1}{t + 3x - 4} \tag{10.7.2}$$
Solution:
Looking at the question, it seems at least reasonable to try this substitution.
u = t + 3x − 4 (10.7.3)
So the right side of Equation 10.7.2 becomes 1∕u. To convert the left side, we need to
take the derivative of both sides of Equation 10.7.3 with respect to t:
$$\frac{du}{dt} = 1 + 3\frac{dx}{dt}$$
With that substitution in place, Equation 10.7.2 becomes:
$$\frac{1}{3}\left(\frac{du}{dt} - 1\right) = \frac{1}{u}$$
We can put this in a separable form and solve it.
$$\frac{du}{dt} = \frac{3 + u}{u}$$
$$\int \frac{u}{3 + u}\,du = \int dt$$
$$u - 3\ln|3 + u| = t + C$$
$$t + 3x - 4 - 3\ln|3 + t + 3x - 4| = t + C$$
$$x - \ln|t + 3x - 1| = C$$
That is the answer, in the simplest form we’re going to find it. We can’t solve it for x.
But we can check it by taking the derivative of both sides with respect to t:
$$\frac{dx}{dt} - \frac{1}{t + 3x - 1}\left(1 + 3\frac{dx}{dt}\right) = 0$$
It takes a bit of algebra, but you can solve this for dx∕dt and it comes back around to
Equation 10.7.2, proving that the solution works.
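Because the solution is implicit, another way to test it is numerically: step Equation 10.7.2 forward and watch whether x − ln|t + 3x − 1| stays constant along the way. The sketch below does that with a hand-written fourth-order Runge-Kutta stepper. The starting point (t, x) = (0, 2), the end time, and the step count are arbitrary choices for illustration.

```python
import math

def f(t, x):
    """Right-hand side of Equation 10.7.2."""
    return 1.0 / (t + 3 * x - 4)

def invariant(t, x):
    """Left side of the implicit solution x - ln|t + 3x - 1| = C."""
    return x - math.log(abs(t + 3 * x - 1))

def rk4(f, t, x, t_end, steps):
    """Advance x(t) from t to t_end with the classical Runge-Kutta method."""
    h = (t_end - t) / steps
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

t0, x0 = 0.0, 2.0            # arbitrary initial condition with t + 3x - 4 > 0
x1 = rk4(f, t0, x0, 5.0, 1000)
drift = abs(invariant(5.0, x1) - invariant(t0, x0))
print("change in x - ln|t + 3x - 1|:", drift)
```

If the implicit solution is right, the printed drift should be tiny compared with the value of the invariant itself.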
How did we know to define u = t + 3x − 4 in the example above? This is not an exact
science, but it looked like a reasonable place to start. As with the method of guess and check,
our best advice is “try stuff, but be smart.”
We assume that you know how to use the chain rule to find the derivative of a composite
function. For instance, if x = sin(t^2) then dx∕dt = cos(t^2)(2t).
To understand the chain rule more generally, consider the breakdown of functions in that
example. We can write the problem like this:
x(u) = sin(u)
u(t) = t^2
The chain rule then says:
$$\frac{dx}{dt} = \frac{dx}{du}\,\frac{du}{dt} \tag{10.7.4}$$
Take a moment to calculate dx∕du and du∕dt and convince yourself that Equation 10.7.4
does represent the steps you take when you use the chain rule to find the derivative of
x = sin(t^2). If you’re not completely convinced (“I’ll take their word for it” doesn’t count),
keep playing with it until you see it.
So what about second derivatives? Students often write d^2x∕dt^2 = (d^2x∕du^2)(d^2u∕dt^2)
which has the advantage of being a simple, reasonable extension of Equation 10.7.4 and the
disadvantage of being wrong. Instead, you have to remember two key facts.
∙ The “second derivative” is shorthand for “the derivative of the derivative,” so we can
express it in terms of first derivatives:
$$\frac{d^2x}{dt^2} \equiv \frac{d\left(\frac{dx}{dt}\right)}{dt}$$
∙ Equation 10.7.4 applies to any function. If you replace x with q or p or r it still works.
If you replace x with dx∕dt that works too—and it gives us a correct formula for the
second derivative.
We have now given you enough information to write the correct chain rule for the second
derivative. Try it on your own, and then check to make sure you got the same thing we got:
$$\frac{d\left(\frac{dx}{dt}\right)}{dt} = \left(\frac{d\left(\frac{dx}{dt}\right)}{du}\right)\frac{du}{dt} \qquad \text{using the chain rule to find a second derivative}$$
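To see the difference between the correct rule and the tempting wrong one, here is a small numerical sketch using the earlier example x = sin(t^2) with u = t^2. The function names and the test point t = 1.3 are illustrative choices only. The code rewrites dx∕dt as a function of u (assuming t > 0), differentiates that with respect to u, and multiplies by du∕dt, exactly as the formula above says.

```python
import math

def dx_dt_in_terms_of_u(u):
    """dx/dt = cos(t^2) * 2t, rewritten with u = t^2 (assumes t > 0)."""
    return math.cos(u) * 2 * math.sqrt(u)

def second_derivative_chain_rule(t, h=1e-6):
    """[d(dx/dt)/du] * (du/dt), with the u-derivative taken numerically."""
    u = t ** 2
    d_du = (dx_dt_in_terms_of_u(u + h) - dx_dt_in_terms_of_u(u - h)) / (2 * h)
    return d_du * (2 * t)

def second_derivative_direct(t):
    """Differentiate 2t cos(t^2) by hand: 2 cos(t^2) - 4 t^2 sin(t^2)."""
    return 2 * math.cos(t ** 2) - 4 * t ** 2 * math.sin(t ** 2)

def second_derivative_tempting_guess(t):
    """The incorrect (d2x/du2)(d2u/dt2) = (-sin u)(2)."""
    return -math.sin(t ** 2) * 2.0

t = 1.3
print("chain rule:     ", second_derivative_chain_rule(t))
print("direct:         ", second_derivative_direct(t))
print("tempting guess: ", second_derivative_tempting_guess(t))
```

The first two numbers should agree closely, while the third is visibly different.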
If you got all that, you got the hard part. Now it’s time to put it into action.
What you start with          What you want                What you have to work with
dx∕dt and d^2x∕dt^2          dx∕du and d^2x∕du^2          du∕dt
The tool for making these conversions is the chain rule, which is why we discussed it above.
Based on that, take a moment to write the relationship between dx∕dt and du∕dt, using the
function u(t) = √t.
Done? Hopefully you wrote something like this. The first step is the chain rule—you know
we got it right because it looks like canceling fractions! In the second step we plug in our
known du∕dt.
$$\frac{dx}{dt} = \left(\frac{dx}{du}\right)\left(\frac{du}{dt}\right)$$
$$\frac{dx}{dt} = \frac{1}{2\sqrt{t}}\left(\frac{dx}{du}\right) \tag{10.7.6}$$
That gives us the relationship between the two first derivatives. For the second derivatives we
take the derivative of both sides of Equation 10.7.6 with respect to t.
$$\frac{d^2x}{dt^2} = -\frac{1}{4t^{3/2}}\frac{dx}{du} + \frac{1}{2\sqrt{t}}\,\frac{d\left(\frac{dx}{du}\right)}{dt} \tag{10.7.7}$$
And what do we do with that awkward last term? This is where the chain rule comes in again.
Pay careful attention to how the equation below is exactly the same as Equation 10.7.4 (the
generic chain rule), but with different functions.
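$$\frac{d\left(\frac{dx}{du}\right)}{dt} = \left(\frac{d\left(\frac{dx}{du}\right)}{du}\right)\frac{du}{dt}$$
$$\frac{d\left(\frac{dx}{du}\right)}{dt} = \frac{1}{2\sqrt{t}}\left(\frac{d^2x}{du^2}\right)$$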
Now that we have the relationships between the derivatives we started with and the derivatives
we want, we are ready to plug everything back into Equation 10.7.5.
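$$\left(\frac{1}{2}\frac{d^2x}{du^2} - \frac{1}{2\sqrt{t}}\frac{dx}{du}\right) + (1 + \sqrt{t})\,\frac{1}{2\sqrt{t}}\left(\frac{dx}{du}\right) - 3x = 0$$
$$\frac{d^2x}{du^2} + \frac{dx}{du} - 6x = 0$$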
The guess x = e^{ku} quickly leads to the solution x(u) = Ae^{−3u} + Be^{2u}. Finally, plug back in
u = √t to get the general solution.
$$x(t) = Ae^{-3\sqrt{t}} + Be^{2\sqrt{t}}$$
You can (and should) plug this into Equation 10.7.5 and check that it works.
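Here is one hedged way to run that check numerically. Equation 10.7.5 is not reproduced above, so the sketch assumes it has the form 2t x′′(t) + (1 + √t) x′(t) − 3x = 0, which is the form implied by the substituted equation. The constants A and B, the sample times, and the step size are arbitrary illustrative values.

```python
import math

def x(t, A, B):
    """General solution x(t) = A e^(-3 sqrt(t)) + B e^(2 sqrt(t))."""
    s = math.sqrt(t)
    return A * math.exp(-3 * s) + B * math.exp(2 * s)

def residual(t, A, B, h=1e-3):
    """2t x'' + (1 + sqrt(t)) x' - 3x, using finite-difference derivatives.

    The left-hand side used here is an assumed form of Equation 10.7.5.
    """
    dx = (x(t + h, A, B) - x(t - h, A, B)) / (2 * h)
    d2x = (x(t + h, A, B) - 2 * x(t, A, B) + x(t - h, A, B)) / h ** 2
    return 2 * t * d2x + (1 + math.sqrt(t)) * dx - 3 * x(t, A, B)

A, B = 1.5, -0.5  # arbitrary constants
max_residual = max(abs(residual(t, A, B)) for t in (0.5, 1.0, 2.0, 4.0))
print("largest residual:", max_residual)
```

The residual is not exactly zero because the derivatives are approximated, but it should be very small compared with the individual terms.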
The intimidating part of this process is finding the relationships between the x(t) deriva-
tives and the x(u) derivatives. The good news is that those steps will be essentially identical
in every such problem.
1. Evaluate the first derivative using the chain rule.
2. Take the derivative of your result to get the second derivative. This will introduce a
mixed partial.
3. Use the chain rule again to relate that mixed partial to the second derivative you want.
Note in the example below how we take all the same steps, in the same order, as in the
previous example.
Question: Use the substitution u = sin t to solve the following differential equation.
$$\frac{d^2x}{dt^2} + \tan t\,\frac{dx}{dt} + (\cos^2 t)\,x = 0 \tag{10.7.8}$$
Solution:
To perform the substitution we have to rewrite dx∕dt and d^2x∕dt^2 in terms of u. We
can find dx∕dt directly from the chain rule.
$$\frac{dx}{dt} = \frac{dx}{du}\,\frac{du}{dt} = \cos t\,\frac{dx}{du} \tag{10.7.9}$$
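To find d^2x∕dt^2 we begin by taking the derivative of both sides of Equation 10.7.9
with respect to t.
$$\frac{d^2x}{dt^2} = -\sin t\,\frac{dx}{du} + \cos t\left(\frac{d\left(\frac{dx}{du}\right)}{dt}\right)$$
Applying the chain rule to the last term gives d(dx∕du)∕dt = cos t (d^2x∕du^2). Plugging
these results into Equation 10.7.8:
$$\cos^2 t\,\frac{d^2x}{du^2} - \sin t\,\frac{dx}{du} + \sin t\,\frac{dx}{du} + (\cos^2 t)\,x = 0$$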
Cancelling the two middle terms and dividing by cos^2 t gives us the simple harmonic
oscillator equation x′′(u) + x(u) = 0, with solution x(u) = A cos u + B sin u. Finally,
putting this back in terms of t:
$$x(t) = A\cos(\sin t) + B\sin(\sin t)$$
Hopefully you were able to follow each step of that example, and could do the same thing
yourself as long as we told you what substitution to do. Unlike our earlier examples, though,
this is probably not a substitution that would have occurred to you. It certainly seems rea-
sonable to try a trig function since there are a couple of them sitting in the equation, but it
might have taken some trial and error to settle on u = sin t.
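One way to gain confidence in the example above is to solve Equation 10.7.8 numerically and compare with the closed form x(t) = A cos(sin t) + B sin(sin t) that the substitution produces. The sketch below rewrites the second-order equation as a first-order system and steps it with a hand-written Runge-Kutta integrator. The constants A and B, the end time t = 1 (chosen to stay away from t = π∕2, where tan t blows up), and the step count are all illustrative assumptions.

```python
import math

def closed_form(t, A, B):
    """x(t) obtained from x(u) = A cos u + B sin u with u = sin t."""
    return A * math.cos(math.sin(t)) + B * math.sin(math.sin(t))

def rhs(t, x, v):
    """Equation 10.7.8 as a system: x' = v, v' = -tan(t) v - cos^2(t) x."""
    return v, -math.tan(t) * v - math.cos(t) ** 2 * x

def integrate(A, B, t_end, steps=2000):
    """Classical RK4 starting from t = 0, where x(0) = A and x'(0) = B."""
    t, x, v = 0.0, A, B
    h = t_end / steps
    for _ in range(steps):
        k1x, k1v = rhs(t, x, v)
        k2x, k2v = rhs(t + h / 2, x + h * k1x / 2, v + h * k1v / 2)
        k3x, k3v = rhs(t + h / 2, x + h * k2x / 2, v + h * k2v / 2)
        k4x, k4v = rhs(t + h, x + h * k3x, v + h * k3v)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += h
    return x

A, B = 1.0, 0.5
error = abs(integrate(A, B, 1.0) - closed_form(1.0, A, B))
print("difference between numerical and closed-form x(1):", error)
```

The initial conditions come from the closed form itself: sin 0 = 0 gives x(0) = A, and differentiating once gives x′(0) = B.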
Sometimes the best strategy, just as with the guess and check approach, is to leave a k in
your substitution and see what the math does. For example if you have an equation with a
bunch of exponentials try u = e kt and see if there’s any value of k for which the equation
simplifies a lot. (See Problem 10.141.) There’s no guaranteed formula for finding the right
substitution, but as with u-substitution for integrals, the more you practice the better your
guesses get.
Stepping Back
It’s not entirely unfair to think of variable substitution as two completely different techniques.
(In the following description we are using x to mean “the dependent variable of your differ-
ential equation,” t to mean “the independent variable of your differential equation,” and u
to mean “the substitution variable.” Of course different letters can be used!)
The overall outline is the same in both cases: choose a substitution, rewrite the differen-
tial equation, solve, and then substitute back. But the “rewrite the differential equation”
step—especially replacing the derivatives—is where beginners get lost. For that step all u(t)
problems follow our solutions to Equations 10.7.1 and 10.7.2, and all x(u) problems follow
our solutions to Equations 10.7.5 and 10.7.8. Of course, you still have to solve what you get—
so the more equations you can solve without variable substitution, the more use you can make
of this technique.
10.127 2x d^2x∕dt^2 + 2(dx∕dt)^2 + 49x^2 = 0, u = x^2
10.128 d^2x∕dt^2 + (3 + 3e^{−3t})dx∕dt − 18e^{−6t}x = 0, u = e^{−3t}
10.129 d^2f∕dx^2 + (−cot x + tan x)df∕dx + (tan^2 x)f = 0, u = cos x
10.130 y′′(x) − (2∕x)y′(x) − 9x^4 y(x) = 0, u = x^3

In Problems 10.131–10.139, solve the given differential equation by substitution. (If you try
something that doesn’t work, try something else! You may need to try a generic form like
u = t^k or u = e^{kt}.)
10.131 dy∕dx = (3x − 3y + 1)^2 (Your substitution will lead you to a separable equation
that can be integrated using partial fractions.)
10.132 t(d^2x∕dt^2) − dx∕dt + 4t^3 x = 0
10.133 dy∕dx = e^{6y−2x} + (1∕3)
10.134 dy∕dx = e^{6y−2x} Hint: this problem ends up involving a somewhat tricky integral,
but it can be done with the right u-substitution.
10.135 d^2x∕dt^2 + (3e^{2t} − 2)(dx∕dt) + 2e^{4t}x = 0
10.136 dx∕dt = [(x − 2t)^2 + 2(x − 2t) + 1]∕(x − 2t)
10.137 d^2x∕dt^2 + [8 sin(2t) − 2 cot(2t)] dx∕dt + 12 sin^2(2t) x = 0
10.138 2y(dy∕dt) − 1 = 1∕(y^2 − t − 3) + y^2 − t − 3
10.139 dy∕dx = sec(x^2 − y) + 2x

g so that the first derivative term x′(u) vanishes in this equation?
10.141 In this problem you will solve the differential equation
x′′(t) − 3x′(t) + 36e^{6t}x(t) = 0.
(a) Substitute u = e^{kt} into this equation, where k is an as-yet-unknown constant.
Write the differential equation for x(u), with no t in it.
(b) What must k equal so that the differential equation for x(u) has no first-order
derivative term in it?
(c) Write the differential equation for x(u) using the value of k you found in
Part (b) and solve it for x(u).
(d) Rewrite your solution in terms of t to find the solution x(t) to the original equation.
(e) Plug in your solution x(t) and verify that it works.
10.142 The “Bessel functions” J_0(y) and Y_0(y) are the solutions f(y) to the differential equation
$$y f''(y) + f'(y) + y f(y) = 0 \tag{10.7.10}$$
Bessel explored them in detail, but they were originally discovered by Daniel Bernoulli
while studying standing waves in a hanging chain. If the chain is suspended from
one end and the other end is unattached, the standing waves obey the equation
$$y f''(y) + f'(y) + \frac{\omega^2}{g} f(y) = 0 \tag{10.7.11}$$
u. Your final answer will involve Bessel functions. It should depend on y, 𝜔,
g, and two arbitrary constants.
10.143 The “Legendre polynomials” P_n(x) and Q_n(x) are the solutions f(x) to
the differential equation
$$\left(1 - x^2\right) f''(x) - 2x f'(x) + n(n + 1) f(x) = 0 \tag{10.7.12}$$
where n is an integer. These functions arise in many problems in spherical coordinates.
For example, suppose the potential V on a spherical shell is a known function of the
polar angle 𝜃. (In other words, we’re assuming it doesn’t depend on the azimuthal
angle 𝜙.) In Chapter 11 we show that the potential inside the shell equals
r^n (for some integer n) times a function f(𝜃), which is given by the equation:
$$f''(\theta) + \cot(\theta) f'(\theta) + n(n + 1) f(\theta) = 0 \tag{10.7.13}$$
Use the substitution x = cos 𝜃 to transform Equation 10.7.13 into Equation 10.7.12, and
then write the general solution f(𝜃). Your final answer will include Legendre polynomials,
and it should only depend on 𝜃 and n.

Problems 10.144–10.147 are about the theory of inflation, which describes the universe a
fraction of a second after the big bang. According to this theory the universe at that time was
dominated by an exotic form of energy known as the “inflaton field.” For these problems all
you need to know about the inflaton field is that it’s generally designated by the letter 𝜙, and
that under certain simplifying assumptions it obeys the differential equation
$$\frac{d^2\phi}{dt^2} + \frac{3}{a}\frac{da}{dt}\frac{d\phi}{dt} + \frac{dV}{d\phi} = 0 \tag{10.7.14}$$
The function V(𝜙), known as the “potential function” for 𝜙, is different in different models
of the theory. The function a(t), known as the “scale factor,” describes the expansion of the
universe and has to be calculated from other equations. Depending on the functions V and a,
Equation 10.7.14 can be solved either analytically or numerically, but either way it is easier
to solve if you use a variable substitution to eliminate the d𝜙∕dt term in the middle.
10.144 In one simple model a(t) = 𝛼t^{2∕3} and V(𝜙) = (1∕2)m^2𝜙^2, where 𝛼 and m are constants.
(a) Write Equation 10.7.14 using these functions a and V.
(b) Plug in the substitution u = t^k 𝜙 and simplify the resulting equation for u(t).
(c) Find the value of k that eliminates the du∕dt term from the differential equation.
(d) Using this value of k, simplify the differential equation for u(t) and
find its general solution.
(e) Using your solution for u(t) and the substitution you made, find the general
solution for 𝜙(t) in this model.
(f) Describe how 𝜙(t) evolves over time.
10.145 In one simple model a(t) = 𝛼t^{1∕2} and V(𝜙) = (1∕4)𝜆𝜙^4, where 𝛼 and 𝜆 are constants.
In this problem you will handle this model the same way you attacked Problem 10.144;
in Problem 10.146 you will see a better way.
(a) Write Equation 10.7.14 using these functions a and V.
(b) Plug in a substitution u = t^k 𝜙 and simplify the resulting equation for u(t).
(c) Find the value of k that eliminates the du∕dt term from the differential equation.
(d) Using this value of k, simplify the differential equation for u(t). This will
not be an equation you can solve by hand, but in this form it would be easier
for a computer to solve numerically than the original equation.
10.146 In one simple model a(t) = 𝛼t^{1∕2} and V(𝜙) = (1∕4)𝜆𝜙^4, where 𝛼 and 𝜆 are constants.
You approached this problem with a dependent variable substitution in
Problem 10.145; here you will substitute twice, once for the dependent and once
for the independent variable, and make the equation even simpler. It still won’t
be solvable as such, but we don’t really expect that in the real world. (You don’t
need to have done Problem 10.145 to do this one.)
(a) Write Equation 10.7.14 using these functions a and V.
(b) Plug in a substitution u = t^{1∕2} 𝜙 and simplify the resulting equation for u(t).
(c) Plug the substitution 𝜏 = t^k into your equation for u(t) to get an
equation for u(𝜏).
(d) Choose the value of k that simplifies your equation for u(𝜏) as much as possible.
(You can’t use k = 0. Do you see why?)
(e) The equation you ended up with is still not simple to solve, but you
should be able to look at it and describe how it will behave. Based on
your differential equation, sketch a plot of u(𝜏). Assume 𝜆 > 0.
(f) Based on the sketch of u(𝜏) you just made, sketch a plot of 𝜙(t).
10.147 In many cases the function a(t) is not known before solving the equation, but a
substitution can still simplify the equation. Plug the substitution u = a^s 𝜙 into
Equation 10.7.14 and find the value of the constant s that eliminates the first derivative
term in the resulting equation for u(t). Hint: you will need to use the product rule and
the chain rule to find d𝜙∕dt in terms of du∕dt, a, and da∕dt.
10.148 Make your own: dependent variable substitution. Write a differential equation
that can be solved with the substitution u = t^2 − x, and solve it. (One way to do this
is to start with an equation that has that term in it, make the substitution, and then
if it doesn’t simplify enough modify the equation you started with as needed.)
10.149 Make your own: independent variable substitution. Write a differential equation that
can be solved with the substitution u = t^3, and solve it. (One way to do this is to start
with a differential equation for x(u) that you know you can solve, and then use
the substitution t = u^{1∕3} to see what x(t) equation needs this substitution.)
the piled-up parts, so effectively gravity acts only on the dangling part with mass m.
Because our goal is to write an equation for v(x) we want to get rid of those time derivatives.
The first one can be replaced by using the chain rule.6
$$\frac{dv}{dt} = \left(\frac{dv}{dx}\right)\left(\frac{dx}{dt}\right) = v\frac{dv}{dx}$$
For the second one we note that m = 𝜆x and 𝜆 is a constant.
$$\frac{dm}{dt} = \lambda\frac{dx}{dt} = \lambda v$$
Since 𝜆 is constant we could just leave it in the equation, but then the other two terms would
include the function m(t). If we replace 𝜆 with m∕x we can cancel m from all of the terms.
Our equation now looks like this.
$$g = v\frac{dv}{dx} + \frac{v^2}{x}$$
That’s the equation we’re looking for. It is an example of a “Bernoulli equation” for v(x).
More generally, a Bernoulli equation for a function y(x) is one that can be written in the
following form.
For n = 0 or n = 1 this is a linear equation, and we can use the techniques of Section 10.4
(see felderbooks.com). For any other value of n we can turn this into a linear equation in
two steps.
1. Divide both sides of the differential equation by y^n.
2. Use the substitution u = y^{1−n}.
In Problem 10.171 you will use this trick to solve for the motion of our falling chain. In
Problem 10.172 you will show more generally that this process turns any Bernoulli equation
into a linear equation. (Note in particular that if n = 0 then both steps do nothing at all!)
This technique only works if the equation is written in the form of Equation 10.8.1. If there is a factor
in front of dy∕dx, start by dividing it out.
$$\frac{dy}{dx} + \frac{2}{x}y = x^6 y^3 \tag{10.8.2}$$
6 The chain rule! Get it? We’re using the chain rule! Oh, never mind.
Next we apply the substitution u = y^{−2}. This means that du∕dx = −2y^{−3}(dy∕dx), so we
will replace y^{−3}(dy∕dx) in the equation with (−1∕2)(du∕dx).
$$-\frac{1}{2}\left(\frac{du}{dx}\right) + \frac{2}{x}u = x^6$$
Now we are working with a first-order linear equation, and we can approach it with
the technique discussed in Section 10.4 (see felderbooks.com). First multiply both
sides by −2x^{−4}.
$$x^{-4}\frac{du}{dx} - 4x^{-5}u = -2x^2$$
The left side is now the derivative of x^{−4}u (confirm that!) so integrating both sides
gives us x^{−4}u = −(2∕3)x^3 + C, or u = (−2∕3)x^7 + Cx^4. Finally we go back; replace u
with y^{−2} and solve the resulting equation for y.
$$y = \frac{1}{\sqrt{-(2/3)x^7 + Cx^4}}$$
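As a check on that result, the sketch below plugs y(x) back into Equation 10.8.2 using a derivative worked out by hand rather than a numerical one. The shorthand w(x) = −(2∕3)x^7 + Cx^4, the value C = 10, and the sample points are illustrative choices; the points are picked so that w(x) stays positive and the square root is real.

```python
import math

C = 10.0  # arbitrary constant of integration

def w(x):
    """Shorthand for the expression under the square root."""
    return -(2.0 / 3.0) * x ** 7 + C * x ** 4

def y(x):
    """Solution y = w^(-1/2)."""
    return w(x) ** -0.5

def dy_dx(x):
    """Hand-computed derivative: -(1/2) w^(-3/2) w'."""
    dw = -(14.0 / 3.0) * x ** 6 + 4 * C * x ** 3
    return -0.5 * w(x) ** -1.5 * dw

def residual(x):
    """Left side minus right side of Equation 10.8.2."""
    return dy_dx(x) + (2.0 / x) * y(x) - x ** 6 * y(x) ** 3

max_residual = max(abs(residual(x)) for x in (0.5, 1.0, 1.5, 2.0))
print("largest residual:", max_residual)
```

With an exact derivative the residual should vanish up to floating-point rounding.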
(Be careful not to confuse this with the more common use of “homogeneous differential equation”
described on Page 487.)
You will show in Problem 10.173 that the substitution u = y∕x turns any this-kind-of-
homogeneous differential equation into a separable equation. In Problem 10.169 you’ll
show that the moth trajectory equation above can be written in this form and you’ll use this
substitution to solve that equation.
Question: Solve
$$\frac{dy}{dx} = \frac{y^2 + xy}{x^2 - xy} \tag{10.8.4}$$
Solution:
We can rewrite this in the form dy∕dx = f(y∕x) by dividing the top and bottom of the
fraction by x^2.
$$\frac{dy}{dx} = \frac{(y/x)^2 + (y/x)}{1 - (y/x)}$$
Since this equation is homogeneous we make the substitution u = y∕x. Writing this as
y = xu we can rewrite dy∕dx using the product rule
$$\frac{dy}{dx} = u + x\frac{du}{dx}$$
Plugging this in, and replacing y∕x with u, we get:
$$x\frac{du}{dx} = \frac{u^2 + u}{1 - u} - u = \frac{2u^2}{1 - u}$$
Separating variables:
$$\int \frac{1}{x}\,dx = \int \frac{1 - u}{2u^2}\,du = \int \left(\frac{1}{2u^2} - \frac{1}{2u}\right) du$$
$$\ln|x| = -\frac{1}{2u} - \frac{1}{2}\ln|u| + C$$
Finally, replace u with y∕x. Remembering that ln(ab) = ln a + ln b, we can do a little
algebra to get:
$$\ln|y| + \ln|x| + \frac{x}{y} = C$$
(We replaced 2C with C since they both mean “any constant.”) In some problems of
this type we will be able to solve the final answer for y. In this case we cannot, but we
can still check it by taking the derivative of both sides with respect to x.
$$\frac{1}{y}\frac{dy}{dx} + \frac{1}{x} + \frac{1}{y} - \frac{x}{y^2}\frac{dy}{dx} = 0$$
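Solving that last equation for dy∕dx should bring you back to Equation 10.8.4. If you want to see that without doing the algebra, the short sketch below compares the two slopes at a few points. The sample points are arbitrary, chosen only to avoid x = 0, y = 0, and x = y, where one of the expressions is undefined.

```python
def slope_from_ode(x, y):
    """Right-hand side of Equation 10.8.4."""
    return (y ** 2 + x * y) / (x ** 2 - x * y)

def slope_from_implicit_solution(x, y):
    """dy/dx solved from (1/y - x/y^2) dy/dx + (1/x + 1/y) = 0."""
    return -(1 / x + 1 / y) / (1 / y - x / y ** 2)

points = [(1.0, 2.0), (2.0, 0.5), (-1.5, 3.0), (0.7, -1.2)]
max_gap = max(
    abs(slope_from_ode(x, y) - slope_from_implicit_solution(x, y))
    for x, y in points
)
print("largest disagreement between the two slopes:", max_gap)
```

Agreement at a handful of points is not a proof, but it is a quick way to catch a sign error before you commit to the algebra.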
In Section 10.5 (see felderbooks.com) we showed you how to solve equations of the form
P (x, y) dx + Q (x, y) dy = 0 if the equation happened to be exact. Another approach to such
equations is to write them in the form dy∕dx = −P ∕Q and see if the resulting equation is
homogeneous. This method, like several others in this chapter, comes with the following
warning: even if an equation is homogeneous and you can therefore make it separable, that
doesn’t guarantee that you’ll be able to integrate what you get.
$$\frac{dv}{dt} + \frac{b}{m}v = -g \tag{10.8.6}$$
With a bit of algebra we can separate and solve that.
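$$\frac{dv}{dt} = -\frac{b}{m}\left(v + \frac{mg}{b}\right)$$
$$\int \frac{dv}{v + \frac{mg}{b}} = -\int \frac{b}{m}\,dt$$
$$\ln\left|v + \frac{mg}{b}\right| = -\frac{b}{m}t + C$$
$$v = Ce^{-\frac{b}{m}t} - \frac{mg}{b}$$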
To find the position function we were originally looking for, we just integrate the velocity.
This adds another arbitrary constant, independent from the first one:
$$x = \int v(t)\,dt = C_1 e^{-\frac{b}{m}t} - \frac{mg}{b}t + C_2$$
You can verify that this is a valid solution to Equation 10.8.5. You can also make sense of
these results intuitively. The v(t) function shows that whether Ben starts going fast or slow,
his velocity will decay quickly toward −mg ∕b. If you plug that into Equation 10.8.6 you will
see that it corresponds to dv∕dt = 0, the “terminal velocity” where the upward force of the
wind perfectly balances the downward force of gravity.
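To make the terminal-velocity claim concrete, here is a small sketch that evaluates v(t) for two very different starting velocities. The numbers m = 80 kg, b = 16 kg/s, and g = 9.8 m/s^2 are hypothetical values chosen for illustration, not values from the text. The constant C is fixed by the initial condition v(0) = v0.

```python
import math

m, b, g = 80.0, 16.0, 9.8  # hypothetical mass (kg), drag constant (kg/s), gravity (m/s^2)

def velocity(t, v0):
    """v(t) = C e^(-bt/m) - mg/b with C chosen so that v(0) = v0."""
    C = v0 + m * g / b
    return C * math.exp(-b * t / m) - m * g / b

def acceleration(t, v0):
    """dv/dt, differentiated by hand from velocity()."""
    C = v0 + m * g / b
    return -(b / m) * C * math.exp(-b * t / m)

terminal = -m * g / b
for v0 in (0.0, -100.0):
    print(f"v0 = {v0:7.1f} m/s -> v(60 s) = {velocity(60.0, v0):.4f} m/s")
print("terminal velocity -mg/b =", terminal)

# Residual of Equation 10.8.6 at an arbitrary time
check = acceleration(3.0, 0.0) + (b / m) * velocity(3.0, 0.0) + g
print("residual of Equation 10.8.6:", check)
```

Both starting velocities end up at essentially the same value, which is the point of the paragraph above.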
Why that particular substitution? Equation 10.8.5 represents a recognizable special case
because it has derivatives of x, but it does not have any term based on x itself. Whenever you
see such an equation, you can make your new dependent variable u = x ′ (t). This gives you a
differential equation for u(t) that is one order lower than your original equation for x(t).
Much less obviously this same substitution turns out to be useful for “autonomous
equations,” meaning ones in which the independent variable doesn’t appear. The substitution
v = dx∕dt will convert a second-order autonomous equation for x(t) into a first-order
equation for v(x). See Problem 10.176.
Bernoulli equation. Solve it both ways and verify that you get the same result.
10.169 In the Explanation (Section 10.8.1) we described a moth flying at speed v across
the street toward a streetlight, while a wind blows it along the street at speed s. We put
the streetlight at the origin, the moth’s original position at (L, 0), and the wind direction
as the positive y-direction. To set up the equation for the moth’s trajectory y(x), start
with the simplest case where s = 0, so there is no wind.
(a) The moth’s horizontal speed is the x-component of its velocity. Write an equation
for dx∕dt in terms of its speed v and its position (x, y).
(b) The moth’s vertical speed is the y-component of its velocity. Write an equation
for dy∕dt in terms of its speed v and its position (x, y).
(c) The chain rule says that dy∕dx = (dy∕dt)∕(dx∕dt). Based on everything
you have written above, write a differential equation for the moth’s
trajectory y(x). Your answer should only depend on x and y.
(d) Solve that differential equation by separation of variables and show that
the resulting motion is a straight line toward the origin.
Now you’ll repeat the process with a non-zero wind speed. This has no
effect on the moth’s x-velocity and simply adds s to its y-velocity.
(e) Redo the calculations until you have a new equation for dy∕dx in
terms of x, y, v, and s.
(f) Rewrite that equation in the form of Equation 10.8.3.
(g) Solve to find y(x), using the initial condition that the moth starts at position (L, 0).
You may find it helpful to know that
∫ (1 + x^2)^{−1∕2} dx = ln|x + √(1 + x^2)| + C.
(h) Check that your answer to Part (g) includes your answer to Part (d)
as a special case.
10.170 [This problem depends on Problem 10.169.]
Draw graphs of the moth’s trajectory for v >> s, v just a little bigger than s, and
v << s. Explain why the graphs make sense physically in each of those cases.
10.171 In the Explanation (Section 10.8.1) we set up the equation of motion for
a chain falling off a table.
(a) Find the general solution v(x) to this equation.
(b) Find the solution v(x) with initial condition v(0) = 0.
(c) The acceleration is dv∕dt. Find it using the chain rule, so the right-hand side
of your answer will have both x and v in it.
(d) Since you already know v(x), plug this in to get an expression for a
that doesn’t have v in it.
(e) If you chopped off a segment of the chain, it would fall with acceleration g
(since we ignored friction and air resistance). From your solution, is this chain
falling at that rate, falling faster, or falling slower? Explain physically why.
10.172 Starting with the general form for any Bernoulli equation, Equation 10.8.1,
divide both sides by y^n and then use the substitution u = y^{1−n}. Show that the resulting
equation for u(x) can be put into the form of Equation 10.4.3 (see felderbooks.com),
the generic first-order linear equation.
10.173 In this problem you will prove that the substitution u = y∕x always makes
a homogeneous differential equation dy∕dx = f(y∕x) separable.
(a) Writing the substitution as y = xu, find dy∕dx in terms of u and du∕dx.
(b) Plug this in to rewrite dy∕dx = f(y∕x) in terms of u and x, with no y.
(c) Separate the resulting equation so there is only u on one side and x on the other.
(d) Integrate both sides. You should be able to evaluate the x integral. You
will not be able to evaluate the u integral because it will contain the
unknown function f(u). Your final answer will be an equation relating a
known function of x to an integral with respect to u.
10.174 Show that the differential equation
x^a y^b dx + x^c y^d dy = 0
is “homogeneous” by the definition given in this section, which means it can
be written in the form dy∕dx = f(y∕x), if and only if a + b = c + d.
10.175 Make Your Own
(a) Write and solve a Bernoulli equation that isn’t in this section (including the
problems) and solve it by dividing by y^n and using the substitution u = y^{1−n}.
(b) Write and solve a homogeneous equation that isn’t in this section (including
the problems) and solve it by using the substitution u = y∕x.
10.176 Exploration: An Equation With No x
A differential equation in which the independent variable doesn’t appear explicitly is
called “autonomous.” For example dy∕dx = y^2 is autonomous but dy∕dx = y^2 + x is not. It
turns out that a second-order autonomous differential equation for y(x) can be reduced
to a first-order equation with the substitution u = dy∕dx. This is the same substitution we
used for equations in which y doesn’t appear explicitly, but it works for a very different
reason in this case. In that case this substitution turned a second-order equation for
y(x) into a first-order equation for u(x). In this case it’s going to turn a second-order
equation for y(x) into a first-order equation for u(y).
To begin with, consider the generic second-order autonomous equation
d^2y∕dx^2 + f(y)dy∕dx + g(y) = 0.
(a) Starting with u = dy∕dx, use the chain rule to calculate d^2y∕dx^2 in
terms of u and du∕dy.
(b) Rewrite the generic equation above as a first-order differential equation for u(y).
Now you’re going to consider the equation for an object falling straight
toward Earth, d^2r∕dt^2 = −GM_E∕r^2. Here of course the dependent variable is r and
the independent variable is t. Since the new variable is going to be dr∕dt, which
is the object’s velocity, call it v.
(c) Using the substitution v = dr∕dt, rewrite the equation above as a first-order
equation for v(r).
(d) Solve using separation of variables to find an algebraic equation relating v
and r. Don’t bother solving for v.
(e) Multiply both sides of your equation by m, the mass of the object, and rewrite
it in the form (1∕2)mv^2 + <some function of r> = C, where C is the arbitrary
constant from your integration. (Since it’s an arbitrary constant you can just
absorb the factor of m into it.)
(f) What does this equation represent physically? What does the <some function
of r> term represent?
At this point you might want to finish the problem by finding r(t). You can plug in
v = dr∕dt to get a first-order equation for r(t), but in this particular case it has no simple
solution. (For some autonomous equations it would.) Nonetheless, knowing the velocity
at every position is a useful result by itself.
In Parts 1–6 graph each given function. You should be able to draw all 6 graphs in under five
minutes with no electronic aid.
1. H (x)
2. H (x − 3)
3. H (x) − H (x − 3)
See Check Yourself #68 in Appendix L
4. $x^2 H(x)$
5. $x^2 H(x - 3)$
6. $(x - 3)^2 H(x - 3)$
We now define another function:
\[ D(x) = \begin{cases} 0 & x < 0 \\ 1/k & 0 \le x \le k \\ 0 & x > k \end{cases} \]
7. For k = 4 draw a graph of D(x) and calculate $\int_{-\infty}^{\infty} D(x)\,dx$.
8. For k = 1 draw a graph of D(x) and calculate $\int_{-\infty}^{\infty} D(x)\,dx$.
9. For k = 1/3 draw a graph of D(x) and calculate $\int_{-\infty}^{\infty} D(x)\,dx$.
\[ m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx = F_e(t) \]
If Fe is an exponential, a trig function, or a power law, this is easy to solve by guess and check.
Suppose, however, that Fe is zero until t reaches some constant value t1 and grows linearly
from that point on. Or suppose that Fe represents a blow from a hammer that imparts an
impulse Δp to the mass in essentially zero time. How would you even write a function for
such a force, never mind solve the differential equation?
To take another example, consider an RLC circuit connected to a power source that sup-
plies a voltage v(t). Assuming the capacitor is initially uncharged, the current i in the circuit
will follow the following equation.8
\[ iR + \frac{1}{C}\int_0^t i\,d\tau + L\frac{di}{dt} = v(t) \tag{10.10.1} \]
No matter what v(t) is, that integral is going to make this equation hard to solve. Furthermore
the voltage source, like the external force on our spring above, may be discontinuous: for
instance, the power source supplies a constant voltage for t1 seconds and is then turned off.
Section 10.11 will present the method of Laplace transforms, which will allow us to
solve all the differential equations described above. In this section we introduce the three
mathematical tools we will need for this method. First we discuss Heaviside functions and
Dirac delta functions, which give us a concise notation for some discontinuous functions.
Then we introduce Laplace transforms, which turn these differential equations into algebra
equations.
8 Usually we use capital letters for current and voltage, but here we need I and V for their Laplace transforms. Also,
The Heaviside function really is as simple as it looks, but it gives us a notation that applies
to a wide variety of discontinuous functions. As with any important function, you want to
work with this one enough to be able to “see” graphs in your head with little or no work. The
Discovery Exercise (Section 10.10.1) gave you some practice with this skill; the examples
below will give you more if you sketch each graph quickly on your own before looking at our
answer.
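If you would like to check your sketches against numbers, here is a minimal Python sketch of the Heaviside function and the "window" built from two of them. The function names `H` and `window` are our own choices for illustration, and the value assigned at exactly x = 0 is an assumed convention that does not affect any of the graphs.

```python
def H(x):
    """Heaviside step: 0 for x < 0, 1 for x > 0.
    The value at exactly x = 0 is a convention; 1 is assumed here."""
    return 1.0 if x >= 0 else 0.0


def window(x, a, b):
    """H(x - a) - H(x - b): 1 on a <= x < b and 0 elsewhere (assuming a < b)."""
    return H(x - a) - H(x - b)


for x in (-1, 0, 1, 2.9, 3, 4):
    # Columns: x, H(x) - H(x - 3), and (x - 3)^2 H(x - 3)
    print(x, window(x, 0, 3), (x - 3) ** 2 * H(x - 3))
```

Multiplying any function by `window(x, a, b)` switches it on only between a and b, which is the pattern used for the discontinuous driving terms later in this section.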
FIGURE 10.7
More properly, we define the Dirac delta function as the limit of a sequence of functions:
\[ \delta(x) = \lim_{k \to 0} D(x) \quad \text{where} \quad D(x) = \begin{cases} 0 & x < 0 \\ 1/k & 0 \le x \le k \\ 0 & x > k \end{cases} \]
Since 𝛿(x) is defined by the fact that its integral equals 1 (a unitless number), 𝛿(x) has units of 1∕x.
So 𝛿(x) is designed to pack an area of exactly 1 into an infinitesimal width. Strictly speaking
𝛿(x) is not a function at all, but it is often useful to treat it as one. Mathematicians call it a
“generalized function.”
[Figure: four graphs plotted against x, each shown for −2 ≤ x ≤ 2]
You can see that $\int_{-\infty}^{\infty} \delta(x)\,dx$ and $\int_{-10}^{10} \delta(x)\,dx$ and $\int_{-2}^{0.3} \delta(x)\,dx$ all give the result 1. The integral
over any region across x = 0 is exactly 1. On the other hand, $\int_{2}^{15} \delta(x)\,dx$ or any such region
that does not cross x = 0 is zero.
Just as with the Heaviside function, you have to play with variations to see how we use
the Dirac delta function to represent different situations. For instance, the function 3𝛿(x)
looks visually exactly like 𝛿(x)—it is infinitely high at x = 0 and zero everywhere else—but
the integral under the curve is now 3. The function 𝛿(x − a) is an infinite spike at x = a.
You can put those two variations together to get a very important property: for any
function f (x),
\[ \int_{-\infty}^{\infty} f(x)\,\delta(x - a)\,dx = f(a) \tag{10.10.2} \]
So just as the Heaviside function can be used to “turn off” a function everywhere except
along some interval, the Dirac delta can turn off a function everywhere except at one point.
As we will see, Equation 10.10.2 is essential when working with differential equations that
involve Dirac delta functions.
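Equation 10.10.2 can be made plausible numerically by replacing 𝛿(x − a) with the box function D(x − a) of width k and height 1∕k and letting k shrink. The following is only a sketch under that assumption; the name `box_delta_integral` and the test function cos x are our own illustrative choices.

```python
import math


def box_delta_integral(f, a, k, n=1000):
    """Approximate the integral of f(x) * D(x - a), where D is the box of
    width k and height 1/k that starts at x = a (midpoint rule, n slices)."""
    dx = k / n
    total = 0.0
    for j in range(n):
        x = a + (j + 0.5) * dx
        total += f(x) * (1.0 / k) * dx
    return total


a = 2.0
for k in (1.0, 0.1, 0.001):
    # As k shrinks, the result should approach f(a)
    print(k, box_delta_integral(math.cos, a, k))
print("f(a) =", math.cos(a))
```

The integral is just the average of f over the box, so as the box narrows the only value that survives is f(a).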
\[ m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx = A\,\delta(t - t_1) \tag{10.10.4} \]
Case 3: The voltage in an RLC circuit is turned on for a time t1 and then turned off again.
The capacitor is initially uncharged.
\[ iR + \frac{1}{C}\int_0^t i\,d\tau + L\frac{di}{dt} = V_0\left[H(t) - H(t - t_1)\right] \tag{10.10.5} \]
Below we introduce a tool called “Laplace transforms.” In Section 10.11 we will use Laplace
transforms to solve Equation 10.10.5, and you will solve the other two in the problems in this
section.
Laplace Transforms
Our final topic in this section is an operator, a formula for transforming one function into
a different-but-related function. (If you are familiar with Fourier transforms, Laplace trans-
forms are in many ways analogous.)
\[ \mathcal{L}[f(t)] = \int_0^{\infty} f(t)e^{-st}\,dt \tag{10.10.6} \]
We often use a lowercase letter for a function and uppercase for its Laplace transform, so
$\mathcal{L}[f(t)]$ is designated as F(s).
This of course does not mean that $3t$ and $3/s^2$ are the same function, but they are
related to each other: the Laplace transform of $3t$ is $3/s^2$.
\[ \mathcal{L}^{-1}[3/s^2] = 3t \]
If you want to follow the math in that example, take it slowly. We evaluated the antiderivative
using integration by parts; you can replicate that on your own, or confirm our work by
taking the derivative of the answer. (In either case, remember that t is the variable and s is a
constant!) Then we applied the limits of integration, but even that is not entirely straightforward;
an integral to ∞ is called an “improper” integral and must be evaluated with a limit,
and $\lim_{t \to \infty} (-3t/s)e^{-st}$ requires l’Hôpital’s rule. And all that was just for f(t) = 3t! You can see
why people don’t want to go through this very often for complicated functions.
10 If $\mathcal{L}[f_1(t)] = \mathcal{L}[f_2(t)]$ then $f_1$ and $f_2$ can differ at individual points, but cannot differ over any interval of finite
On the other hand, for functions based on the Heaviside or Dirac delta functions, the “by
hand” approach is often the way to go.
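The transform pair quoted above, that the Laplace transform of 3t is 3∕s², can also be checked numerically straight from Equation 10.10.6. This is a rough sketch, not a general-purpose routine: the name `laplace_numeric` is hypothetical, and the upper limit `t_max` is an assumed stand-in for ∞ that only works when the integrand has decayed to nearly zero by then.

```python
import math


def laplace_numeric(f, s, t_max=60.0, n=200_000):
    """Approximate the integral of f(t) e^{-st} from 0 to t_max (midpoint rule).
    t_max stands in for infinity, so s must be large enough that the
    integrand is negligible beyond t_max."""
    dt = t_max / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * dt
        total += f(t) * math.exp(-s * t) * dt
    return total


for s in (0.5, 1.0, 2.0):
    # Columns: s, numerical transform of 3t, and the exact value 3/s^2
    print(s, laplace_numeric(lambda t: 3 * t, s), 3 / s**2)
```

Note that s is held fixed inside each integral, just as the text says: t is the variable and s is a constant.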
This follows from the linearity of the integral operator itself, but its importance cannot be
overstated. When we take the Laplace transform of both sides of a differential equation, we
will take Laplace transforms one term at a time—an easy step to miss, but it is only allowed
because the transform is linear. (If you squared both sides, you couldn’t square one term at
a time!)
2. The Laplace transform turns derivatives into multiplication.
Those constants are f (0) and f ′ (0) with lowercase “f ”s; they are the initial conditions for the
original function, not for its transform. You will prove these two rules in Problem 10.228,
and extend them to higher order derivatives in Problem 10.229. Because of these rules, the
Laplace transform turns a differential equation into an algebra equation.
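For reference, the two derivative rules take the following standard form, which is also the form used in the worked examples of Section 10.11. We are assuming these are the rules the problems cite as Equations 10.10.7 and 10.10.8; the second line follows by applying the first rule to f′ in place of f.

```latex
\mathcal{L}[f'(t)] = \int_0^\infty f'(t)e^{-st}\,dt
  = \Big[f(t)e^{-st}\Big]_0^\infty + s\int_0^\infty f(t)e^{-st}\,dt
  = sF(s) - f(0)

\mathcal{L}[f''(t)] = s\,\mathcal{L}[f'(t)] - f'(0)
  = s^2F(s) - sf(0) - f'(0)
```

The boundary term at infinity vanishes because f grows no faster than an exponential and s is taken large enough for the integral to converge.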
3. The Laplace transform turns integrals into division.
\[ \mathcal{L}\left[\int_0^t f(\tau)\,d\tau\right] = \frac{F(s)}{s} \tag{10.10.9} \]
We would be remiss if we did not mention one other important property: a function f (t)
will have a Laplace transform if it meets the following conditions.
1. The function must be “piecewise continuous.” This means that on any finite inter-
val, you must be able to describe f as a finite number of continuous functions. So a
function that is 0 in [0, 1), then 1 in [1, 2), then 0 in [2, 3), then 1 in [3, 4), and so on
forever is piecewise continuous. A function that is 0 for all rational numbers and 1 for
irrational numbers is not.
2. There must be some constants M and k such that $|f(t)| \le Me^{kt}$ for all t-values.
Informally, we can say that a function can only have a Laplace transform if it grows no faster
than an exponential function. This means you cannot take the Laplace transform of $e^{t^2}$, since it
obviously causes the improper integral in Equation 10.10.6 to diverge.11 But compared to a
Taylor series (which requires the function to be continuous and differentiable) or a Fourier
transform (which requires the function to approach zero as x → ±∞), Laplace transforms
are remarkably unrestricted.
Appendix H summarizes much of the information given here on Laplace transforms. It
also contains a brief table of Laplace transforms, and discusses the use of “partial fractions”
and “convolutions” in the use of such a table.
Throughout these problems, H (x) refers to the Heaviside function and 𝛿(x) to the Dirac delta
function.
11 We asked Mathematica for $\mathcal{L}[e^{t^2}]$ and were surprised to get an answer rather than an error. It turns out there
is a generalization of the definition that allows for Laplace transforms of such “superexponential” functions. The
idea was first proposed in Vignaux, JC. 1939. Sugli integrali di Laplace asintotici. Atti. Acad. Naz. Lincei, Rend. Cl. Sci.
Fis. Mat., 29: 396-402 and worked out in greater detail in Deakin, MAB. 1994. Laplace transforms for superexponential
functions. J. Aust. Math. Soc. Ser. B, 36: 17–25.
10.204 $\sin(x - \pi/2)\,H(x - \pi/2)$
10.205 $e^x\left[H(x) - H(x - 1)\right] + (10 - x)\left[H(x - 3) - H(x - 5)\right]$
10.206 $\delta(x + 2)$
10.207 $\delta(x + 2) + \delta(x - 2)$
10.208 $2\delta(x + 1)$
In Problems 10.209–10.212 write a function that matches the given graph.
10.209 [Graph: horizontal axis t from –1 to 5, vertical tick at 1]
10.210 [Graph: vertical tick at 1]
Equation 10.10.6. (You may of course want to use a computer to check yourself.)
10.215 k (a constant function)
10.216 $e^{kt}$
10.217 $t$
10.218 $t^2$
10.219 $H(t - 2) - H(t - 5)$
10.220 $\left[H(t) - H(t - 10)\right]e^{kt}$
10.221 $\left[H(t) - H(t - 1)\right]t$
10.222 $\delta(t - 10)$
10.223 $\delta(t - 3) + \delta(t - 4)$
10.224 $(\ln t)\,\delta(t - 4)$
10.225 Problem 10.209.
10.226 Problem 10.211. Hint: you may find the integrals on Page 485 useful.
The Laplace transform on the right is easy to evaluate by hand. It may help if you draw
the function; it is 9 on the domain 0 ≤ t ≤ 3 and zero everywhere else.
\[ \int_0^{\infty} V(t)e^{-st}\,dt = 9\int_0^3 e^{-st}\,dt = \left.-\frac{9}{s}e^{-st}\right|_0^3 = \frac{9}{s}\left(1 - e^{-3s}\right) \]
The Laplace transform on the left requires the properties of Laplace transforms. In
the first line below, we use the property of linearity to separate the terms and pull out
constants. In the second line we use the properties of derivatives and integrals to turn
the differential equation into an algebra equation. Note our use of I(s) for $\mathcal{L}[i(t)]$.
\[ 3I + \frac{I}{s} + 2sI = \frac{9}{s}\left(1 - e^{-3s}\right) \]
2. Solve the algebra equation.
\[ I = \frac{9\left(1 - e^{-3s}\right)}{3s + 1 + 2s^2} \]
We’re done, right? We wanted to find I and we have!
Actually, a lot of electrical engineers would agree with that statement. They often
work with the Laplace transforms directly, without bothering to convert back. But for
our purposes let’s finish the process and find the actual current i instead of its trans-
form I .
3. Take the inverse Laplace transform. We plugged that function into a computer, simplified
its answer a bit, and got:
\[ i(t) = 9\left[e^{-t/2} - e^{-t} + \left(e^{3-t} - e^{(3-t)/2}\right)H(t - 3)\right] \tag{10.11.2} \]
We always urge you to stop and look at your final solution and see what it’s telling you
(and if it makes sense), but as the methods become more complicated and more computer-
dependent this step becomes even more important. Here are a few things we can see from
Equation 10.11.2.
∙ The Heaviside function in the middle means that the current follows one curve while
the power source is on (before t = 3) and a different curve after the source is switched
off. Given the scenario, it would be quite surprising if it were otherwise!
∙ Because the term in parentheses (multiplied by the Heaviside function) goes to zero
at t = 3, the current is continuous even when the voltage is not. (The inductor resists
sudden changes in current.)
∙ Over time, the current decays to zero. (The resistor gradually drains off all the energy.)
You can of course check the solution by plugging it back into Equation 10.11.1, but a bit of
care is required with the integral. You’ll go through that in Problem 10.269.
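Before working through that check by hand, you can get a quick numerical sanity check. The sketch below evaluates 3i + ∫i d𝜏 + 2 di∕dt for the current in Equation 10.11.2 and compares it with the source voltage (9 until t = 3, then 0), using the coefficients that appear in the transformed equation above. The function names are hypothetical, and the derivative and integral are crude finite-difference and midpoint approximations, so the residual is only expected to be small, not exactly zero.

```python
import math


def current(t):
    """The current i(t) from Equation 10.11.2."""
    i = math.exp(-t / 2) - math.exp(-t)
    if t >= 3:  # the Heaviside term switches on at t = 3
        i += math.exp(3 - t) - math.exp((3 - t) / 2)
    return 9 * i


def residual(t, n=20_000, h=1e-5):
    """Left side minus right side of 3i + (integral of i from 0 to t) + 2 di/dt = V(t),
    with V = 9 for 0 <= t <= 3 and V = 0 afterward."""
    dt = t / n
    charge = sum(current((j + 0.5) * dt) for j in range(n)) * dt
    didt = (current(t + h) - current(t - h)) / (2 * h)
    voltage = 9.0 if t <= 3 else 0.0
    return 3 * current(t) + charge + 2 * didt - voltage


for t in (1.0, 2.5, 5.0):
    print(t, residual(t))
```

The integral always runs from 0 to t, so for t > 3 it automatically includes the charge built up while the source was on.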
\[ 2\frac{d^2q}{dt^2} + 3\frac{dq}{dt} + q = \sin^2 t \quad \text{with initial conditions } q(0) = 4,\ q'(0) = 5 \]
12 Eagle-eyed observers may recognize that this models the same circuit as Equation 10.11.1 with a different driving
function. We rewrote it as a second-order differential equation for q instead of a first-order integro-differential
equation for i, not for any compelling physical reason, but to demonstrate a range of Laplace transform problems.
Solution:
We will follow the same three-step recipe as above.
1. Take the Laplace transform of both sides of the equation.
\[ \mathcal{L}\left[2\frac{d^2q}{dt^2} + 3\frac{dq}{dt} + q\right] = \mathcal{L}\left[\sin^2 t\right] \]
On the left we will use the properties of Laplace transforms to write everything
in terms of Q, the as-yet-unknown Laplace transform of our solution.
On the right the computer tells us that $\mathcal{L}[\sin^2 t] = 2/[s(s^2 + 4)]$.
\[ 2\mathcal{L}\left[\frac{d^2q}{dt^2}\right] + 3\mathcal{L}\left[\frac{dq}{dt}\right] + \mathcal{L}[q] = \frac{2}{s(s^2 + 4)} \]
\[ 2\left[s^2Q - sq(0) - q'(0)\right] + 3\left[sQ - q(0)\right] + Q = \frac{2}{s(s^2 + 4)} \]
\[ 2\left[s^2Q - 4s - 5\right] + 3\left[sQ - 4\right] + Q = \frac{2}{s(s^2 + 4)} \]
Notice that the properties of Laplace transforms brought in the initial conditions.
2. Solve the algebra equation.
\[ Q(2s^2 + 3s + 1) - 8s - 10 - 12 = \frac{2}{s(s^2 + 4)} \]
\[ Q(s + 1)(2s + 1) = \frac{2}{s(s^2 + 4)} + 8s + 22 \]
\[ Q = \frac{8s^4 + 22s^3 + 32s^2 + 88s + 2}{s(s + 1)(2s + 1)(s^2 + 4)} \]
3. Take the inverse Laplace transform. Once again we hit the computer.
\[ q(t) = \frac{1}{2} + \frac{1}{170}\left[7\cos(2t) - 6\sin(2t)\right] - \frac{68}{5}e^{-t} + \frac{290}{17}e^{-t/2} \]
If you plug that solution into the original differential equation you end up with
$[1 - \cos(2t)]/2$, which is the same as $\sin^2 t$.
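That plug-in check is easy to automate. The sketch below evaluates the solution q(t) given above, estimates its derivatives with finite differences, and reports how far 2q″ + 3q′ + q is from sin² t; it also confirms the two initial conditions. The function names and the step size h are our own assumptions, and finite differences only approximate the derivatives, so expect a small but nonzero residual.

```python
import math


def q(t):
    """The solution q(t) found by the inverse Laplace transform."""
    return (0.5 + (7 * math.cos(2 * t) - 6 * math.sin(2 * t)) / 170
            - (68 / 5) * math.exp(-t) + (290 / 17) * math.exp(-t / 2))


def dq(t, h=1e-4):
    """Central-difference estimate of q'(t)."""
    return (q(t + h) - q(t - h)) / (2 * h)


def ode_residual(t, h=1e-4):
    """2 q'' + 3 q' + q - sin^2(t), with derivatives from finite differences."""
    d2q = (q(t + h) - 2 * q(t) + q(t - h)) / h**2
    return 2 * d2q + 3 * dq(t, h) + q(t) - math.sin(t) ** 2


print("q(0)  =", q(0.0))    # initial condition: should be close to 4
print("q'(0) =", dq(0.0))   # initial condition: should be close to 5
for t in (0.5, 1.0, 3.0):
    print(t, ode_residual(t))
```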
Remember that a Laplace transform assumes a domain from 0 to ∞. If you need to model a
current that turns on at t = −2 consider shifting your time variable. A tricky case arises when
the right side of a differential equation contains 𝛿(t) because this causes a discontinuous
change at t = 0. The initial conditions at that point are ambiguous—do they occur before or
after the instantaneous delta function?—and the Laplace transform method can give wrong
answers. One way to solve this is by putting your delta function at a positive time and then
taking the limit as that time approaches zero. See Problem 10.265.
A Laplace transform, by contrast, is more a trick for calculation than a tool for insight.
This section is therefore about solving equations rather than interpreting transforms. But
consider the following situation, which is unfortunately common. You start with a differential
equation. You take the Laplace transform and solve the algebra. Then you ask a computer
for the inverse Laplace transform…and there isn’t one. “This is the Laplace transform of
the solution” is as far as you’re going to get. At that point, anything you can learn about the
original function is going to come directly from its Laplace transform.
So we’re going to give you a quick rule for interpreting Laplace transforms.
That rule isn’t obvious from anything we’ve said here, although you can get some idea
of why it’s true by verifying that the Laplace transform of $e^{s_0 t}$ is $1/(s - s_0)$. (Section 10.10
Problem 10.216.) You will see more generally why that rule works when you learn how to
find an inverse Laplace transform in Chapter 13. Here we want to give a few examples to
clarify the deliberately vague phrase “goes like.”
As you can see, vertical asymptotes—real numbers that are out of domain in the Laplace
transform—indicate exponential growth and decay. Imaginary numbers give frequencies of
oscillation, just as in a Fourier series. This rule of thumb doesn’t tell you whether f (t) is a
sine or a cosine or exactly what its amplitude is, but it does tell you the general behavior.
The examples above are all simple cases designed to illustrate how to interpret a Laplace
transform, but in those cases it’s easy to find the inverse Laplace transform—by hand, with
a table, or with a computer—so there’s no real need for this interpretation. The example
below illustrates the more typical case of a function that cannot be easily inverse Laplace
transformed.
Problem:
Find the functions $x_1(t)$ and $x_2(t)$ that solve the following equations subject to the
initial conditions $x_1(0) = 2$, $x_2(0) = 1/3$, and $\dot{x}_1(0) = \dot{x}_2(0) = 0$.
\[ \frac{d^2x_1}{dt^2} = -4x_1 + 3x_2 \]
\[ \frac{d^2x_2}{dt^2} = \frac{2}{3}x_1 - 3x_2 \]
Solution:
We begin by taking the Laplace transforms of both equations, using $X_1$ and $X_2$ for
$\mathcal{L}[x_1]$ and $\mathcal{L}[x_2]$ as usual.
\[ (s^2 + 4)X_1 - 3X_2 = 2s \]
\[ -\frac{2}{3}X_1 + (s^2 + 3)X_2 = \frac{1}{3}s \]
The next step is to solve two simultaneous linear equations for X1 and X2 . This
involves a fair bit of algebra that we’re going to skip. (You can use substitution,
elimination, or matrices.) After simplification, you end up here.
\[ X_1 = \frac{s(2s^2 + 7)}{(s^2 + 2)(s^2 + 5)} \]
\[ X_2 = \frac{s(s^2 + 8)}{3(s^2 + 2)(s^2 + 5)} \]
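If you want to confirm the skipped algebra without redoing it, one option is to solve the two transformed equations at a few specific values of s and compare with the simplified formulas for X1 and X2. The sketch below does this with Cramer's rule in exact rational arithmetic; the function names are hypothetical and the sample s-values are arbitrary.

```python
from fractions import Fraction


def solve_transformed_system(s):
    """Solve (s^2 + 4) X1 - 3 X2 = 2s and -(2/3) X1 + (s^2 + 3) X2 = s/3
    for X1 and X2 at one value of s, using Cramer's rule."""
    s = Fraction(s)
    a11, a12, b1 = s**2 + 4, Fraction(-3), 2 * s
    a21, a22, b2 = Fraction(-2, 3), s**2 + 3, s / 3
    det = a11 * a22 - a12 * a21
    x1 = (b1 * a22 - a12 * b2) / det
    x2 = (a11 * b2 - a21 * b1) / det
    return x1, x2


def stated_answer(s):
    """The simplified expressions for X1 and X2 quoted in the text."""
    s = Fraction(s)
    den = (s**2 + 2) * (s**2 + 5)
    return s * (2 * s**2 + 7) / den, s * (s**2 + 8) / (3 * den)


for s in (1, 2, Fraction(1, 3), 7):
    print(s, solve_transformed_system(s) == stated_answer(s))
```

Agreement at a handful of points is not a proof, but because both sides are rational functions of low degree it is strong evidence that the simplification is right.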
Stepping Back
After “Guess and Check,” this section may be the most important in the chapter. Laplace
transforms are a powerful approach to solving differential equations, and their influence is
also felt far beyond those applications. (Laplace himself used them in probability theory.) As
one electrical engineer put it:13 “EEs regularly toss around rational polynomials as descrip-
tions of system behavior, and use them to tweak the systems by adding feedback loops, filters,
whatever. All these polynomials are Laplace transforms, where the transform is so taken for
granted that we don’t even mention it.”
The method we’ve presented here is most useful for differential equations with constant
coefficients, for which you can apply linearity and the properties of Laplace transforms very
directly. In some special cases there are tricks you can use to solve other ODEs. (See Prob-
lem 10.277.) The real power of Laplace transforms will become evident, however, when we
use them to solve partial differential equations in Chapter 11.
ii. On the left side of the equation, apply the properties of Laplace transforms: first
linearity, and then the derivative properties (Equations 10.10.7 and 10.10.8).
Use X to represent $\mathcal{L}[x]$, and use the values given for $x(0)$ and $x'(0)$.
(b) Solve your resulting algebra equation to find the function X(s).
(c) Use a computer to find the inverse Laplace transform of X(s). Write the function
x(t) that solves the original equation.

In Problems 10.244–10.255 solve the given differential equation subject to the given initial
conditions. Use the three-step Laplace transform method modeled in Problem 10.243 even if
other methods would be easier (which they will be in some cases). Computers should be used
only to find Laplace and inverse Laplace transforms as necessary.
10.244 $x''(t) + 9x(t) = 0$; $x(0) = 2$, $x'(0) = 5$
10.245 $x''(t) - 9x(t) = 0$; $x(0) = 2$, $x'(0) = 5$
10.246 $x''(t) - 6x'(t) + 5x(t) = e^{5t}$; $x(0) = x'(0) = 0$
10.247 $y'(t) + \int_0^t y(\tau)\,d\tau = e^{-t}$; $y(0) = 0$
10.248 $2y''(t) - 7y'(t) - 4y(t) = e^t \sin t$; $y(0) = 2$, $y'(0) = 3$
10.249 $y''(t) + \int_0^t y(\tau)\,d\tau = 1$; $y(0) = y'(0) = 0$
10.250 $3f''(t) - f'(t) - 4f(t) = H(t - 3) - H(t - 4)$; $f(0) = 1$, $f'(0) = -1$
10.251 $g''(t) - 25g(t) = \delta(t - 1) + 3\delta(t - 2)$; $g(0) = 7$, $g'(0) = 5$
10.252 $h''(t) - 6h'(t) + 8h(t) = \left[H(t - 1) - H(t - 2)\right]e^{-t}$; $h(0) = 0$, $h'(0) = 0$
10.253 $h''(t) - 4h(t) = \left[H(t) - H(t - \pi)\right]\sin t$; $h(0) = 0$, $h'(0) = 0$
10.254 Equation 10.10.3 with $m = 1$, $b = 4$, $k = 3$, $F_1 = 2$, $t_1 = 5$, $x(0) = 6$, $\dot{x}(0) = 7$
10.255 Equation 10.10.4 with $m = 2$, $b = 7$, $k = 3$, $A = 4$, $t_1 = 3$, $x(0) = 6$, $\dot{x}(0) = 1$

In Problems 10.256–10.259 use the three-step Laplace transform method to solve the given
coupled differential equations and initial conditions, using a computer only to find inverse
Laplace transforms as necessary.
10.256 $dx/dt = 3x + 5y$, $dy/dt = x - y$; $x(0) = 4$, $y(0) = 2$
10.257 $dx/dt = 2x + y$, $dy/dt = 3x + 4y$; $x(0) = 1$, $y(0) = -1$
10.258 $d^2x/dt^2 = 5x - 3y$, $d^2y/dt^2 = 4x - 2y$; $x(0) = 6$, $y(0) = 2$, $x'(0) = y'(0) = 0$
10.259 $d^2x/dt^2 = -10x + 3y$, $d^2y/dt^2 = 3x - 2y$; $x(0) = 10$, $y(0) = 40$, $x'(0) = 30$, $y'(0) = -20$

In Problems 10.260–10.264 you will be given a Laplace transform F(s). Based on the
Explanation (Section 10.11.1) write a real-valued function f(t) that matches the qualitative
behavior of its inverse Laplace transform. (To find the exact function requires a complex
integral, but your function should have the exponential functions and cosines in all the right
places.) Computers should not be used for these problems.
10.260 $3/(s + 4)$
10.261 $2s/(s - 7)$
10.262 $4e^{-s}/(s^2 + 3s - 10)$
10.263 $1/(s^2 + 25)$
10.264 $1/(8s^2 + 4s + 1)$

10.265 In this problem you’re going to use Laplace transforms to solve the equation
$x''(t) + 2x(t) = \delta(t)$. The trick is taking the Laplace transform of that delta function.
(a) Find the Laplace transform of $\delta(t - 1)$. (This should be quick and easy.)
(b) Show that the Laplace transform of $\delta(t + 1)$ is 0.
(c) Explain why the Laplace transform of $\delta(t)$ is ambiguous given the definition
of the Laplace transform and the properties of Dirac delta functions.
Since the value of $\mathcal{L}[\delta(t)]$ is ambiguous, you can approach this problem as a limit.
(d) Use Laplace transforms to solve the equation $x''(t) + 2x(t) = \delta(t - t_1)$
with initial conditions $x(0) = x'(0) = 0$, where $t_1$ is a positive constant. Use
a computer only for the inverse Laplace transform.
(e) Take the limit of your solution as $t_1 \to 0^+$ to find the solution to
the original equation.
(f) Sketch your solution. (This should not require a computer!)
(g) Suppose the differential equation in this problem represents a mass on a spring.
First describe the forces acting on this mass based on the differential equation.
Then describe the resulting motion based on your solution, and explain
why it makes sense for this scenario.
10.266 In the method of Laplace transforms, unlike other methods, the initial conditions are
plugged in as part of the process. But you can still find the general solution by putting
in arbitrary values for the initial conditions.
(a) Solve the equation $x''(t) + \omega^2 x(t) = k\left[H(t - t_1) - H(t - t_2)\right]$ (where $t_1$ and
$t_2$ are positive constants) with initial conditions $x(0) = x_0$, $x'(0) = v_0$.
(b) What are the arbitrary constants in your solution? Does it have the
correct number to be the general solution to this equation?

10.267 In the example on Page 526 we claimed that the inverse Laplace transform of
$\sqrt{1 + s}/(s^2 + 2s + 2)$ is a mess, but that it has the same qualitative behavior as the inverse
Laplace transform of $1/(s^2 + 2s + 2)$. Check this by asking a computer for the inverse
Laplace transform of $\sqrt{1 + s}/(s^2 + 2s + 2)$. Then have the computer plot the inverse
Laplace transforms of these two functions on the same plot. In what ways are they
similar, and in what ways are they different? (Note: your inverse transform may
appear to be complex, but it is real in the domain t ≥ 0 which is all that matters.)

10.268 In this problem you’ll solve the differential equation $\ddot{y} + \dot{y} + y = (\sin t)/t$. Feel
free to start by trying to solve it without Laplace transforms if you really want to
appreciate why Laplace transforms are sometimes the best method to use.
(a) Solve this equation with initial conditions $y(0) = \dot{y}(0) = 1$ to find the Laplace
transform Y(s). You can use a computer to find $\mathcal{L}[(\sin t)/t]$ but you should
be able to do the rest by hand.
(b) Looking at Y(s), explain how you can know that y(t) should oscillate with
a decaying amplitude. What is the period of that oscillation?
(c) Have a computer take the inverse Laplace transform of your solution
Y(s) and plot the resulting function y(t). Check that it oscillates with a
decaying amplitude and check if its period matches your prediction.

10.269 In the Explanation (Section 10.11.1) we solved Equation 10.11.1 and ended
up with Equation 10.11.2. In this problem you will confirm that our answer works.
(a) Confirm that the solution works for 0 < t < 3.
(b) Now show that the solution works for t > 3, but be careful with the integral:
$\int_0^3 i(t)\,dt$ represents the charge buildup on the capacitor during the first three
seconds and $\int_3^t i(t)\,dt$ represents the charge buildup after that time.

10.270 Equation 10.10.5 represents an RLC circuit with a voltage source that is
turned off after $t_1$ seconds.
(a) Solve the equation, leaving R, L, C, $V_0$ and $t_1$ as unknown constants, subject to
the initial condition i(0) = 0. Your answer should be the Laplace transform I(s).
It can be inverse Laplace transformed to give i(t), but the result is too messy
to bother with here, so don’t.
(b) Here’s a different approach to the same problem: start by taking the time
derivative of both sides. This turns a first-order integro-differential equation into
a second-order differential equation. Solve that. Here are two hints.
∙ The derivative of a Heaviside function is a Dirac delta function.
∙ A second-order equation needs two initial conditions, and we only gave you
one. You can find the other one by plugging t = 0 into both sides of the
original integro-differential equation. (Since you are looking for the initial
condition, assume H(t) is still zero.)
(c) Show that for C = 1, L = 2, R = 3, $V_0 = 9$, $t_1 = 3$ your answer reduces
to the one we found in the Explanation (Section 10.11.1).

10.271 An RLC circuit has L = 2, R = 7, C = 1/6. It starts with zero current and no initial
charge on the capacitor. A voltage of $9\sin(t)$ is turned on from t = 0 to t = 2π and then
turned off. Use Equation 10.10.1 to find the Laplace transform of the current. Hint:
you may find the integrals on Page 485 useful.

10.272 A 1 kg block is attached to a spring with spring constant k. It experiences
a damping force F = −bv. Take k = 8 N/m and b = 6 N⋅s/m.
(a) Write a differential equation for the position x(t) of the block.
and $\theta_2(t)$. (You can do the algebra by hand but it’s tedious, so we recommend
letting the computer do it for you.)

10.275 The Transfer Function An RLC circuit, a mass on a damped simple harmonic
oscillator, and a variety of other systems obey the equation $x''(t) + a_1 x'(t) + a_0 x(t) = f(t)$.
The constants $a_1$ and $a_0$ define the properties of the system (e.g. resistors, capacitors,
and inductors, or damping force and spring constant). The driving term f(t) represents
the external input to the system (e.g. voltage source or external force). You can solve this
equation for a particular $a_0$, $a_1$, and f(t) with some set of initial conditions. For the
homogeneous initial conditions $x(0) = x'(0) = 0$, however, you can also derive a general
relation between the input f(t) and the output x(t).
(a) Solve this differential equation using Laplace transforms. Your answer will
give X(s) as a function of $a_0$, $a_1$, and F(s), the Laplace transform of the
driving term.
(b) The transfer function G(s) is defined as X(s)/F(s), or (roughly speaking)
output/input. Calculate G(s) as a function of $a_0$ and $a_1$. (The transfer function is
more often designated H(s) but we’re already using that for the Heaviside
function. There just aren’t enough letters.)
The transfer function only depends on the properties of the system, not on the
driving term f(t). The rest of this problem will focus on a 1 kg block attached to a
spring with spring constant 3 N/m, with a damping term $F_d = -bv$, b = 2 N⋅s/m.
(c) Write the differential equation for this system and find its transfer function.
We have not told you whether there is an external force or not, because
G(s) doesn’t depend on it.
(d) For each of the following driving forces $f_e$, calculate X(s) from the
relation X(s) = G(s)F(s). Then use a computer to calculate the inverse
Laplace transform x(t).
i. $f_e = \delta(t - 1)$
ii. $f_e = e^{-2t}$
iii. $f_e = \sin t$

10.276 [This problem depends on Problem 10.275.] The transfer function is only defined
for the homogeneous initial conditions $x(0) = x'(0) = 0$. To see why, start with the
initial conditions $x(0) = x_0$, $x'(0) = v_0$ and show that you can’t find a transfer
function that is independent of the driving function.

10.277 Exploration: Laguerre Polynomials
The equation $ty''(t) + (1 - t)y'(t) + ny(t) = 0$ is called “Laguerre’s equation” and its
solutions for integer values of n are called “Laguerre polynomials.” You can solve
this equation using Laplace transforms, but first you need to figure out what to
do with the Laplace transform of $ty'(t)$ and $ty''(t)$.
(a) The definition of a Laplace transform is $F(s) = \int_0^{\infty} f(t)e^{-st}\,dt$. Differentiate
both sides of this equation with respect to s. Use your result to write a
formula relating $\mathcal{L}[tf(t)]$ to F(s).
(b) You know that $\mathcal{L}[y'(t)] = sY(s) - y(0)$. Use the formula you just derived to write
a formula for $\mathcal{L}[ty'(t)]$. Hint: Remember that y(0) is a constant.
(c) Derive a similar formula for $\mathcal{L}[ty''(t)]$.
(d) Take the Laplace transform of both sides of Laguerre’s equation. The result
should be a first-order ODE for Y(s).
(e) Solve that ODE for Y(s) using separation of variables. The integration
step can be done with partial fractions, or you can have a computer
evaluate the integral for you.
(f) The first two Laguerre polynomials are $L_0(t) = 1$ and $L_1(t) = 1 - t$. Take
the Laplace transform of these two functions and show that they match
the solution you found for n = 0 and n = 1.
\[ \frac{dx}{dt} + x = \delta(t - 3) \quad \text{with } x(0) = 0 \]
1. For all t ≠ 3 this equation is just ẋ + x = 0. Write the general solution to this equation.
2. For t < 3 you have the initial condition. Plug it in to find the arbitrary constant and
write the solution.
3. At t = 3 the function undergoes a discontinuous but finite jump upward. After t = 3
your solution from Part 1 applies again, although with a different value of the arbitrary
constant. Based on all that information, sketch the solution for t ≥ 0. Your sketch does
not have to be accurate about the size of the t = 3 jump, but should be qualitatively
correct in all other respects.
You can check yourself by solving the same equation with a Laplace transform!
Because the differential equation is linear, the sum (y1 + y2 + y3 )(x) will solve your original
problem.
Solving smaller problems and then summing the solutions is a common approach to linear
equations. To take a more sophisticated example, suppose you wanted to solve y′′ (x) + 6y(x) =
f (x) for some terribly complicated f (x). You might write f (x) as a sum of sin(nx) terms (a
Fourier sine series). The solution to y′′ (x) + 6y(x) = sin(nx) is y = sin(nx)∕(6 − n 2 ) for any
value of n, so you can sum all those individual functions to find the solution—in the form of
an infinite series—of the equation you started with.
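If you would like to see the term-by-term claim confirmed, the following is a minimal sketch using the SymPy library. It checks the single-term solution for a symbolic integer n, and then checks a three-term partial sum. The coefficients in `b` are hypothetical values chosen only for illustration; they do not come from any particular f (x).

```python
import sympy as sp

x = sp.symbols("x", real=True)
n = sp.symbols("n", positive=True, integer=True)

# Single term: y = sin(nx)/(6 - n^2) should solve y'' + 6y = sin(nx).
y_n = sp.sin(n * x) / (6 - n**2)
single_residual = sp.simplify(sp.diff(y_n, x, 2) + 6 * y_n - sp.sin(n * x))
print(single_residual)

# Partial sum with hypothetical coefficients b_1, b_2, b_3.
b = {1: sp.Integer(1), 2: sp.Rational(-1, 2), 3: sp.Rational(1, 4)}
f_partial = sum(bk * sp.sin(k * x) for k, bk in b.items())
y_partial = sum(bk * sp.sin(k * x) / (6 - k**2) for k, bk in b.items())
sum_residual = sp.simplify(sp.diff(y_partial, x, 2) + 6 * y_partial - f_partial)
print(sum_residual)
```

Both residuals vanish, which is linearity at work: each term is solved separately and the solutions are simply added.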
With “Green’s functions” you write the inhomogeneous term, not as a series of sines, but
as a series—actually an integral—of Dirac delta functions. But the idea is very similar to the
Fourier approach. If you can solve the differential equation for an arbitrary delta function,
you can integrate over all your solutions to solve the equation you started with.
We begin, then, with a discussion of differential equations with Dirac delta functions.
$$\frac{d^2y}{dx^2} + y = 2\delta\left(x - \frac{\pi}{6}\right) \quad \text{with } y(0) = y\left(\frac{\pi}{2}\right) = 0 \tag{10.12.1}$$
The differential equation is an ideal candidate for a Laplace transform, but the boundary
conditions are not. A Laplace transform would need y(0) and y′ (0), but we have been given
y(0) and y(𝜋∕2). (This is one of the reasons we tend to use Laplace transforms on the time
variable rather than on spatial variables.)
So we’re going to use a different approach. We begin with the observation that for all
x in our domain except x = 𝜋∕6, Equation 10.12.1 is just the simple harmonic oscillator
equation:
$$\frac{d^2y}{dx^2} + y = 0 \quad \left(x \neq \frac{\pi}{6}\right) \tag{10.12.2}$$
So the delta function divides our domain into three regions.
We’ll tell you what happens at that particular x-value: y(x) is continuous, but dy∕dx discon-
tinuously jumps up by 2. We’ll justify those two claims below, but first let’s see how they allow
us to fill in our solution.
$$y \text{ is continuous:} \quad A\sin\frac{\pi}{6} = D\cos\frac{\pi}{6}$$
$$\frac{dy}{dx} \text{ goes up by 2:} \quad A\cos\frac{\pi}{6} + 2 = -D\sin\frac{\pi}{6}$$
Solving simultaneously we find A = −√3 and D = −1, which gives us the complete solution.
$$y = \begin{cases} -\sqrt{3}\,\sin x & x < \pi/6 \\ -\cos x & x > \pi/6 \end{cases}$$
[Figure: graph of the solution y(x) for x between 0 and 𝜋∕2.]
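A quick numerical check of this solution can be reassuring. The sketch below uses only Python's standard library; the function names `y` and `dy` and the small offset `eps` are our own choices for illustration. It evaluates the piecewise solution and its derivative just to the left and right of x = 𝜋∕6, and at the two boundaries.

```python
import math

X_DELTA = math.pi / 6  # location of the delta function

def y(x):
    """Piecewise solution quoted above."""
    return -math.sqrt(3) * math.sin(x) if x < X_DELTA else -math.cos(x)

def dy(x):
    """Derivative of each piece, taken by hand."""
    return -math.sqrt(3) * math.cos(x) if x < X_DELTA else math.sin(x)

eps = 1e-9  # small step to either side of the delta function
jump_in_y = y(X_DELTA + eps) - y(X_DELTA - eps)
jump_in_dy = dy(X_DELTA + eps) - dy(X_DELTA - eps)

print("y(0)       =", y(0.0))
print("y(pi/2)    =", y(math.pi / 2))
print("jump in y  =", jump_in_y)    # should be essentially zero
print("jump in y' =", jump_in_dy)   # should be essentially 2
```

The function is continuous, its slope rises by 2 at the delta function, and both boundary conditions are met to within rounding error.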
The result above was stated in terms of second-order equations, but it generalizes both
up and down. In a first-order equation a delta function causes a discontinuity in the func-
tion itself (as in the Discovery Exercise, Section 10.12.1). In a third-order equation the delta
function does not cause a discontinuity in the function or its first derivative, but the sec-
ond derivative jumps by k. In general, a delta function in an nth-order equation causes a
discontinuous jump in the (n − 1)th derivative.
That makes sense if you think about position functions. A first-order differential equation
tells you the velocity of an object as a function of its position and time: v = f (x, t). If you
introduce a delta function then your velocity spikes momentarily to infinity, causing a dis-
continuous leap in position. A second-order equation tells you the acceleration: a = f (v, x, t).
In this case a delta function causes your acceleration to momentarily spike, so your velocity
changes abruptly. (Think of a baseball hitting a bat.)
That last paragraph is a hand-wave designed to convince you that for a second-order dif-
ferential equation, a delta function causes a discontinuity in the derivative but not in the
function itself. You will take up that question more rigorously in Problem 10.291. But it leaves
open the question, how much does the first derivative jump?
Remember that a delta function is defined by its integral. So we’re going to return to
Equation 10.12.1 and integrate both sides around the delta function. Defining 𝜖 as a small
Δx we will integrate from (𝜋∕6 − 𝜖) to (𝜋∕6 + 𝜖). Then we will let 𝜖 → 0 to see what happens
in the infinitesimal space around the delta function.
$$f(x) = \sum_{n=1}^{\infty} b_n \sin(nx) = b_1 \sin(x) + b_2 \sin(2x) + b_3 \sin(3x) + \ldots$$
In the example above the frequencies are the positive integers 1, 2, 3 … You can cover a
wider period if you use half-integer frequencies 0.5, 1, 1.5, 2 … But to represent an entire
non-periodic function you have to use all real numbers as frequencies. Your coefficients now
become a Fourier sine transform and your sum becomes an integral.
$$f(x) = \int_0^{\infty} \hat{f}_s(p) \sin(px)\,dp \tag{10.12.4}$$
Think of f̂s (p) as the coefficient of the sin(px) term, and the integral as adding these terms
up across all p-values.
In this section our goal is to represent an arbitrary function f (x) as an integral, not of sines
or cosines, but of delta functions. How do we do that? The key is Equation 10.10.2. Below we
have copied that equation, but with different letters. Where before we used the letter x we
now use p, and where we used a we now use x.
$$f(x) = \int_{-\infty}^{\infty} f(p)\,\delta(p - x)\,dp \tag{10.12.5}$$
Please confirm for yourself that this is just Equation 10.10.2 relabeled; that’s how you
will know it’s true. But then look at it by analogy to Equation 10.12.4. At any given point
x = p we write a delta function;14 its coefficient is the value of f (x) at that particular point.
When we integrate to add up these functions across all p-values, we end up with the original
function f (x).
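You can watch Equation 10.12.5 at work numerically. A computer cannot represent a true delta function, so the sketch below stands in a very narrow Gaussian of unit area for it; that substitution, the test function `f`, and the values of `x0` and `sigma` are all assumptions made for illustration. NumPy is used for the arithmetic.

```python
import numpy as np

def narrow_gaussian(u, sigma):
    """Unit-area Gaussian; approaches a delta function as sigma shrinks."""
    return np.exp(-u**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))

def f(p):
    """An arbitrary smooth test function (hypothetical choice)."""
    return np.cos(p) + p**2

x0 = 0.7        # the point at which we want to recover f
sigma = 1e-3    # width of the stand-in delta function

# Integrate f(p) * delta(p - x0) over a window that contains the whole spike.
p = np.linspace(x0 - 10.0 * sigma, x0 + 10.0 * sigma, 20001)
dp = p[1] - p[0]
recovered = np.sum(f(p) * narrow_gaussian(p - x0, sigma)) * dp

print("integral:", recovered)
print("f(x0):   ", f(x0))
```

The two printed numbers agree closely, and the agreement improves as `sigma` is made smaller.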
Finally, remember why we’re doing all that! Imagine a differential equation with an inho-
mogeneous term f (x) that is difficult to work with. Equation 10.12.5 tells you how to replace
f (x) with an integral over delta functions. You solve the differential equation for the delta
functions at each point p, and integrate the solutions over all p-values to solve the original
equation.
$$\frac{d^2y}{dx^2} + y = f(x) \quad \text{with } y(0) = y\left(\frac{\pi}{2}\right) = 0 \tag{10.12.6}$$
$$f(x) = \int_0^{\pi/2} f(p)\,\delta(p - x)\,dp$$
14 If 𝛿(p − x) in Equation 10.12.5 is bothering you, feel free to replace it with 𝛿(x − p). These two are the same because the delta function is even: 𝛿(−u) = 𝛿(u).
2. Write the differential equation with the inhomogeneous term replaced by one arbitrary
delta function. The solutions to this equation are called “Green’s Functions.” Above
we solved Equation 10.12.6 with f (x) replaced by a delta function at x = 𝜋∕6. Now we
have to repeat that exercise at every x-value in our domain; in other words, we have to
solve the following equation for an arbitrary p.
$$\frac{d^2G}{dx^2} + G = \delta(x - p) \quad \text{with } G(0) = G\left(\frac{\pi}{2}\right) = 0 \tag{10.12.7}$$
In Problem 10.285 you will solve that equation and end up here.
$$G(x, p) = \begin{cases} -(\cos p)\sin x & x < p \\ -(\sin p)\cos x & x > p \end{cases}$$
For any given p-value (that is, any given spot where you draw the delta function) there
is a particular G(x) function that solves this equation. These functions are collectively
referred to as the “Green’s functions” for this equation.
3. Integrate over all the Green’s functions to solve the original problem. We have seen
that the original f (x) is an integral across all the delta functions, and that each Green’s
function is the solution for one particular delta function. Because the differential
equation is linear, summing (or in this case integrating) all those individual solutions
will solve the original equation.
$$y(x) = \int_{-\infty}^{\infty} f(p)\,G(x, p)\,dp \tag{10.12.8}$$
In Problem 10.290 you will show that this integral does give you a solution to the differ-
ential equation you started with. Our current problem involves only x-values between
0 and 𝜋∕2 so those form the limits of integration.
$$y(x) = \int_0^{\pi/2} f(p)\,G(x, p)\,dp = \int_0^{\pi/2} f(p) \left( \begin{cases} -(\cos p)\sin x & x < p \\ -(\sin p)\cos x & x > p \end{cases} \right) dp$$
To integrate a piecewise function you break it up. Because p is the variable over which
we are integrating, we look at the region 0 < p < x (the bottom part of the piecewise
function above) and then at the region x < p < 𝜋∕2 (top part).
$$y(x) = -\cos x \int_0^x f(p) \sin p\,dp - \sin x \int_x^{\pi/2} f(p) \cos p\,dp \tag{10.12.9}$$
That is the solution to Equation 10.12.6. Of course the problem had an unspecified
inhomogeneous term f (x) so that function appears as part of the solution. In Prob-
lems 10.286 and 10.287 you will find solutions for a few specific f (x) functions.
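The three steps above can be checked symbolically. The sketch below uses SymPy and makes two assumptions of its own: the helper name `green_solution` is ours, and the driving term f (x) = 1 is a hypothetical choice picked only because its integrals are easy. The first half confirms that the quoted G(x, p) has the right properties; the second half feeds f (x) = 1 through Equation 10.12.9 and tests the result against Equation 10.12.6.

```python
import sympy as sp

x, p = sp.symbols("x p", real=True)

# The two branches of the Green's function quoted in the text.
G_left = -sp.cos(p) * sp.sin(x)    # valid for x < p
G_right = -sp.sin(p) * sp.cos(x)   # valid for x > p

# Away from x = p each branch should satisfy G'' + G = 0,
# and each should meet the boundary condition on its own side.
ode_left = sp.simplify(sp.diff(G_left, x, 2) + G_left)
ode_right = sp.simplify(sp.diff(G_right, x, 2) + G_right)
bc_left = G_left.subs(x, 0)
bc_right = G_right.subs(x, sp.pi / 2)

# At x = p the function is continuous and its slope jumps by 1.
gap = sp.simplify((G_right - G_left).subs(x, p))
jump = sp.simplify((sp.diff(G_right, x) - sp.diff(G_left, x)).subs(x, p))
print(ode_left, ode_right, bc_left, bc_right, gap, jump)

def green_solution(f):
    """Equation 10.12.9 for a driving term supplied as a Python function."""
    return (-sp.cos(x) * sp.integrate(f(p) * sp.sin(p), (p, 0, x))
            - sp.sin(x) * sp.integrate(f(p) * sp.cos(p), (p, x, sp.pi / 2)))

# Hypothetical driving term f(x) = 1.
y = sp.simplify(green_solution(lambda s: sp.Integer(1)))
residual = sp.simplify(sp.diff(y, x, 2) + y - 1)
print(y, residual)
```

For this driving term the result is equivalent to y = 1 − sin x − cos x, which you can confirm by hand satisfies both the differential equation and the two boundary conditions.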
Problem:
Solve the following equation on the domain t ≥ 0. Your answer will be a formula
involving the unknown function f (t).
$$2\frac{d^2x}{dt^2} - 7\frac{dx}{dt} - 4x = f(t) \quad \text{with } x(0) = x'(0) = 0 \tag{10.12.10}$$
Solution:
We will employ the three-step Green’s function method discussed above.
1. Rewrite the inhomogeneous term as an integral over delta functions. This
is just a matter of writing down Equation 10.12.5 on the proper domain.
$$f(t) = \int_0^{\infty} f(p)\,\delta(p - t)\,dp$$
2. Solve the differential equation with the inhomogeneous term replaced by one arbi-
trary delta function. The “Green’s functions” for this problem are the solutions to
the following differential equation.
$$2\frac{d^2G}{dt^2} - 7\frac{dG}{dt} - 4G = \delta(t - p) \quad \text{with } G(0) = G'(0) = 0 \tag{10.12.11}$$
At all points except t = p we have an equation we can easily solve with guess and
check.
$$2\frac{d^2G}{dt^2} - 7\frac{dG}{dt} - 4G = 0 \quad \rightarrow \quad G = Ae^{4t} + Be^{-t/2}$$
On the left side (t < p) the initial conditions G(0) = G ′ (0) = 0 lead to A = B = 0,
or G = 0. So the function stays flat until the delta function “kicks” it. To
see what happens there, divide both sides of Equation 10.12.11 by two:
G ′′ − (7∕2)G ′ − 2G = (1∕2)𝛿(t − p). This is now in the form of Equation 10.12.3
with k = 1∕2, so the first derivative jumps by 1∕2 at t = p. This gives us the initial
conditions for the second half of the journey: G(p) = 0 and G ′ (p) = 1∕2.
$$Ae^{4p} + Be^{-p/2} = 0$$
$$4Ae^{4p} - \frac{1}{2}Be^{-p/2} = \frac{1}{2}$$
Solving, $A = (1/9)e^{-4p}$ and $B = -(1/9)e^{p/2}$, so the solution to Equation 10.12.11 is
this.
$$G(t, p) = \begin{cases} 0 & t < p \\ \frac{1}{9}\left[e^{4(t-p)} - e^{-(1/2)(t-p)}\right] & t > p \end{cases}$$
3. Integrate over all the Green’s functions to solve the original problem.
Because G vanishes for all p > t we need only integrate it for p < t.
$$x(t) = \int_0^t f(p)\,G(t, p)\,dp = \frac{1}{9}\int_0^t f(p)\left[e^{4(t-p)} - e^{-(1/2)(t-p)}\right] dp \tag{10.12.12}$$
For any given f (t), Equation 10.12.10 represents a specific differential equation and
Equation 10.12.12 is the solution.
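As a sanity check on this example, the sketch below uses SymPy to confirm that G(t, p) starts the "second half of the journey" with the right values, and then evaluates Equation 10.12.12 for one driving term. The choice f (t) = 1 is hypothetical, made only to keep the integral simple, and the function name `green` is ours.

```python
import sympy as sp

t, p = sp.symbols("t p", real=True)

def green(t_val, p_val):
    """The t > p branch of G(t, p) found in the example."""
    return sp.Rational(1, 9) * (sp.exp(4 * (t_val - p_val))
                                - sp.exp(-(t_val - p_val) / 2))

# Just after the kick: G(p) = 0 and G'(p) = 1/2.
G_at_kick = sp.simplify(green(t, p).subs(t, p))
dG_at_kick = sp.simplify(sp.diff(green(t, p), t).subs(t, p))
print(G_at_kick, dG_at_kick)

# Equation 10.12.12 with the hypothetical driving term f(t) = 1.
f = sp.Integer(1)
x_sol = sp.simplify(sp.integrate(f * green(t, p), (p, 0, t)))

# Plug back into Equation 10.12.10 and test the initial conditions.
residual = sp.simplify(2 * sp.diff(x_sol, t, 2) - 7 * sp.diff(x_sol, t)
                       - 4 * x_sol - f)
x_at_0 = sp.simplify(x_sol.subs(t, 0))
v_at_0 = sp.simplify(sp.diff(x_sol, t).subs(t, 0))
print(x_sol, residual, x_at_0, v_at_0)
```

The residual and both initial values come out to zero, so for this f (t) the integral over Green's functions does solve the original problem.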
Stepping Back
If L is a linear operator, Green’s functions are one approach to solving the inhomogeneous
equation L(y) = f (x). The Green’s function G(x, p) is the solution the equation would have
if the inhomogeneous term were a delta function 𝛿(x − p). Physically the Green’s function
represents the response the system would have to a brief stimulus at one moment of time or
point in space.
The actual inhomogeneous term f (x) can be written as an integral of delta functions,
f (x) = ∫ f (p)𝛿(x − p)dp, so by superposition the actual solution can be written as an integral
over the Green’s functions: y(x) = ∫ G(x, p)f (p)dp. Here are a few important things to keep
in mind about this method.
∙ The Green’s function G(x, p) only depends on the complementary equation, not on
the inhomogeneous term. That term comes in when you integrate to find the solution.
∙ The Green’s function does depend on the initial or boundary conditions. The same
differential equation with two different sets of initial/boundary conditions will have
different Green’s functions.
∙ The method is based on linear superposition, so it only works if the differential oper-
ator is linear.
∙ For the same reason, this method only works if the boundary conditions are homo-
geneous. That means that if x1 (t) and x2 (t) both obey the boundary conditions then
x1 + x2 does also. (See Problem 10.292 for a trick that can be used to extend the
method to inhomogeneous boundary conditions.)
Most of the differential equations we are solving in this section, both in this explanation
and in the problems, could be solved by other methods. To find the Green’s function you
need to be able to solve the complementary differential equation, and once you can do
that variation of parameters will generally give you the particular solution if it can be found
analytically. Our goal in this section is to familiarize you with Green’s functions—what they
are, how to use them, and what they represent physically—so that you will be prepared when
you encounter them in more complicated contexts.
Equation 10.12.6. Simplify your solution as much as possible, and then verify that it does indeed solve the differential equation.
10.288 Show that for a solution y(x) to Equation 10.12.3, the first derivative dy∕dx will experience a jump of k at x = p. Hint: We showed this in the Explanation
(a) First check that definition. Make a sketch of the right-hand side (for some non-zero 𝜖). You should be able to see that as 𝜖 → 0 it’s becoming infinite at x = p and zero everywhere else. Evaluate $\int_{-\infty}^{\infty}$ of the function on the right and show that it equals 1 for any value of 𝜖.
(b) To solve the differential equation with this function on the right-hand side, you need to write three solutions. In the region 0 < x < 𝜋∕6 − 𝜖 the differential equation is y′′ + y = 0. Find the solution to that equation with the boundary condition y1 (0) = 0. Call your one remaining arbitrary constant A.
(c) Similarly, solve for y3 (x) in the region 𝜋∕6 + 𝜖 < x < 𝜋∕2 subject to y(𝜋∕2) = 0. Call your one remaining arbitrary constant D.
(d) In the region 𝜋∕6 − 𝜖 < x < 𝜋∕6 + 𝜖 the differential equation is y′′ + y = 1∕(2𝜖). Find the solution y2 (x) in this region, with arbitrary constants B and C.
(e) For any non-zero 𝜖 the driving term is finite, so both y(x) and y′ (x) must be continuous and differentiable. Set y1 = y2 and y1′ = y2′ at the boundary 𝜋∕6 − 𝜖, and likewise for y2 and y3 at their boundary. The result should be four equations for the four unknown arbitrary constants.
(f) Solve for A and D, eliminating B and C. You can do this by hand, but you are welcome to use a computer to do the tedious algebra for you if you’d prefer. Either way, simplify as much as possible by using the trig addition and subtraction identities.
(g) Find Δy, the change in y from 𝜋∕6 − 𝜖 to 𝜋∕6 + 𝜖, and find Δy′ across the same interval.
(h) Take the limits of Δy and Δy′ as 𝜖 → 0. You should find that in this limit the function is continuous across the jump but its derivative is not.
10.292 In this problem you’ll solve the differential equation y′′ − 2y∕x² = 1 with boundary conditions y(0) = 0, y(1) = 1.
(a) Use the Green’s function method to solve the differential equation with the homogeneous boundary conditions y(0) = y(1) = 0.
(b) Solve the homogeneous differential equation y′′ − 2y∕x² = 0 with boundary conditions y(0) = 0, y(1) = 1.
(c) Add your two solutions to find the complete solution y(x). Plug this solution in and verify that it solves the inhomogeneous differential equation with the inhomogeneous boundary conditions. (You will need to take a limit to verify one of the boundary conditions.)
10.293 Consider an object acted on by two forces, a drag force Fd = −bv and an external force Fext (t). At t = 0 the object is at rest at the origin.
(a) Write a differential equation for the position x(t) of the object.
(b) Find the Green’s function for the ODE you wrote.
(c) Assume the external force is zero until t = t1 and grows linearly as Fext = k(t − t1 ) until it abruptly stops at t = t2 . Use the Green’s function you found to find x(t) for this force.
CHAPTER 11
∙ solve ordinary differential equations with “initial” or “boundary” conditions (see Chapters 1
and 10).
∙ evaluate and interpret partial derivatives (see Chapter 4).
∙ find and use Fourier series—sine series, cosine series, both-sines-and-cosines series, and
complex exponential forms, for one or more independent variables (see Chapter 9).
Fourier transforms are required for Section 11.10 only.
∙ find and use Laplace transforms (for Section 11.11 only, see Chapter 10).
∙ model real-world situations with partial differential equations and interpret solutions of
these equations to predict physical behavior.
∙ use the technique “separation of variables” to find the normal modes for a partial differen-
tial equation. We focus particularly on trig functions, Bessel functions, Legendre polynomi-
als, and spherical harmonics. You will learn about each of these in turn, but you will also
learn to work more generally with any unfamiliar functions you may encounter.
∙ use those normal modes to create a general solution for the equation in the form of a series,
and match that general solution to boundary and initial conditions.
∙ solve problems that have multiple inhomogeneous conditions by breaking them into sub-
problems with only one inhomogeneous condition each.
∙ solve partial differential equations by using the technique of “eigenfunction expansions.”
∙ solve partial differential equations by using Fourier or Laplace transforms.
The differential equations we have used so far have been “ordinary differential equations”
(ODEs) meaning they model systems with one independent variable. In this chapter we will
use “partial differential equations” (PDEs) to model systems involving more than one inde-
pendent variable.
We will begin by discussing the idea of a partial differential equation. How do you set up
a partial differential equation to model a physical situation? Once you have solved such an
equation, how can you interpret the results to understand and predict the behavior of a
system? Just as with ordinary differential equations, you may find this part as challenging as,
or more challenging than, the mechanics of finding a solution, but you will also find that understanding
the equations is more important than solving them. The good news is, the work you have
already done in understanding ordinary differential equations and partial derivatives will
provide a strong foundation in understanding these new types of equations.
In the first three sections we will discuss a few key concepts including arbitrary functions,
boundary and initial conditions, and normal modes. These discussions should be seen as
extensions of work you have already done in earlier chapters. For instance, you know that
the general solution to an ordinary differential equation (ODE) involves arbitrary constants.
A partial differential equation (PDE) may have an infinite number of arbitrary constants, or
(equivalently) an arbitrary function, in the general solution. These constants or functions are
determined from the initial and/or boundary conditions.
In the middle of the chapter—the largest part—we will deal with the engineer’s and physi-
cist’s favorite tool for solving partial differential equations: “separation of variables.” This
technique replaces one partial differential equation with two or more ordinary differential
equations that can be solved by hand or with a computer. The solutions to those equations
will lead us to work with a wide variety of functions, some familiar and some new.
We will then discuss techniques that can be used when separation of variables fails. Some
partial differential equations that cannot be solved by separation of variables can be solved
with “eigenfunction expansion,” which involves expanding your function into a series before
you solve. If one of your variables has an infinite domain you can still use this trick, but it’s
called the “method of transforms” because you take a Fourier or Laplace transform instead
of using a series expansion. Appendix I guides you through the process of looking at a new
PDE and deciding which technique to use.
As you explore these techniques you will encounter many of the most important equations
in engineering and physics: equations governing light and sound, heat, electric potential,
and more. After this chapter you will be prepared to understand these equations, to solve
them, and to interpret the solutions.
1 Throughout this chapter we will use the letter u for temperature. Both u and T are commonly used in thermody-
3. First, suppose we start with the temperature of the bar uniform. We use u(x, 0) to indi-
cate the initial temperature of the bar—that is, the temperature at time t = 0—so we can
express the condition “the initial temperature is a constant” by writing u(x, 0) = u0 .
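[Figure: plot with axis labels u0 and x.]
FIGURE 11.1 [Three plots against position x, each with the points PL, P, and PR marked on the x-axis.]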
(c) Will the rate of heat transfer between P and PL be faster, slower, or the same as
the rate of heat transfer between P and PR ?
(d) Will the temperature of P go up, go down, or stay constant?
6. For each of the three cases you just examined, what were the signs (negative, positive,
or zero) of each of the following quantities at point P : u, 𝜕u∕𝜕x, 𝜕²u∕𝜕x², and 𝜕u∕𝜕t?
Just to be clear, you’re giving 12 answers in all to this question. You’ll get the signs of u,
𝜕u∕𝜕x, and 𝜕²u∕𝜕x² from our pictures, and 𝜕u∕𝜕t from what was happening at point
P in each case.
7. Which of the following differential equations would be consistent with the answers you
gave to Part 6? In each one, k is a (real) constant, so k² is a positive number and −k²
is a negative number.
(a) 𝜕u∕𝜕t = k²u
(b) 𝜕u∕𝜕t = −k²u
(c) 𝜕u∕𝜕t = k²(𝜕u∕𝜕x)
(d) 𝜕u∕𝜕t = −k²(𝜕u∕𝜕x)
(e) 𝜕u∕𝜕t = k²(𝜕²u∕𝜕x²)
(f) 𝜕u∕𝜕t = −k²(𝜕²u∕𝜕x²)
The “heat equation” you just found is an example of a “partial differential equation,”
which involves partial derivatives of a function of more than one variable. In this case, it
involves derivatives of u(x, t) with respect to both x and t. In general, partial differential
equations are harder to solve than ordinary differential equations, but there are systematic
approaches that enable you to solve many linear partial differential equations such as the
heat equation analytically. For non-linear partial differential equations, the best approach is
often numerical.
Things become more complicated when differential equations involve functions of more
than one variable, and therefore partial derivatives. For the following questions, suppose
that z is a function of two independent variables x and y.
2. Consider the differential equation (𝜕∕𝜕x) z(x, y) = z(x, y). In words, “if you start with the
function z(x, y) and take its partial derivative with respect to x, you get the same func-
tion you started with.” Note that 𝜕z∕𝜕y is not specified by this differential equation,
and may therefore be anything at all.
(a) Which of the following functions are valid solutions to this differential equation?
Check all that apply.
i. z = 5
ii. z = e^x
iii. z = e^y
iv. z = e^x e^y
v. z = y e^x
vi. z = x e^y
vii. z = e^x sin y
(b) Write a general solution to the differential equation 𝜕z∕𝜕x = z. Your solution will
have an arbitrary function in it.
(c) Find the only specific solution that meets the condition z(0, y) = sin(y).
See Check Yourself #69 in Appendix L
3. Consider the differential equation (𝜕∕𝜕x) z(x, y) = (𝜕∕𝜕y) z(x, y).
(a) Express this differential equation in words.
(b) Which of the following functions are valid solutions to this differential equation?
Check all that apply.
i. z = 5
ii. z = e^x
iii. z = e^y
iv. z = e^(x+y)
v. z = sin(x + y)
vi. z = sin(x − y)
vii. z = ln(x + y)
(c) Parts i, iv, v, and vii above are all specific examples of the general form
z = f (x + y). By plugging z = f (x + y) into the original differential equation
𝜕z∕𝜕x = 𝜕z∕𝜕y, show that any function of this form provides a valid solution.
(d) Find the only specific solution that meets the condition z(0, y) = cos y. Hint: It’s not
in the list above.
Multivariate functions
Consider a guitar string, pinned to the x-axis at x = 0 and x = 4𝜋, but free to move up and
down between the two ends.
[Figure: a string drawn against axes labeled x and y, with x-axis tick marks at 𝜋, 2𝜋, 3𝜋, and 4𝜋.]
We can describe the motion by writing the height y as a function of the horizontal position
x and the time t. We can look at such a y(x, t) function in three different ways:
∙ At any given moment t there is a particular y(x) function that describes the entire string.
The string’s motion is an evolution over time from one y(x) to the next.
∙ Any given point x on the string oscillates according to a particular y(t) function. The
next point over (at x + Δx) oscillates according to a slightly different y(t), and all the
different y(t) functions together describe the motion of the entire string.
∙ Finally, we can treat t as a spatial variable and plot y on the xt-plane.
The function y(x, t) has two derivatives at any given point: 𝜕y∕𝜕x gives the slope of the string
at a given point, and 𝜕y∕𝜕t gives the velocity of the string at that point. (A dot is often used for
a time derivative, so ẏ means 𝜕y∕𝜕t.) There are therefore four second derivatives. 𝜕²y∕𝜕x² is
concavity, and 𝜕²y∕𝜕t² is acceleration. The “mixed partials” 𝜕²y∕𝜕x𝜕t and 𝜕²y∕𝜕t𝜕x generally
come out the same.
All this is a quick reminder of how to think about multivariate functions and derivatives.
It does not, however, address the question of what function a guitar string would actually
follow. The function y(x, t) = (1 − cos x) cos t looks fine, doesn’t it? But in fact, no free guitar
string would actually do that. In order to show that, and to find what it would do, we have to
start with the equation that governs its motion.
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2} \quad \text{The wave equation} \tag{11.2.1}$$
where v is a constant. You will explore where this equation comes from in Problems 11.32 and
11.45, but here we want to focus on what it tells us. Let’s begin by considering the function
we proposed earlier.
$$y(x, t) = (1 - \cos x)\cos t \quad \rightarrow \quad \frac{\partial^2 y}{\partial x^2} = \cos x \cos t \quad \text{and} \quad \frac{\partial^2 y}{\partial t^2} = -(1 - \cos x)\cos t$$
We see that 𝜕²y∕𝜕x² is not the same as (1∕v²)(𝜕²y∕𝜕t²) (no matter what the constant v
happens to be), so this function does not satisfy the wave equation. Left to its own
devices, a guitar string will not follow that function.
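This kind of plug-in test is easy to hand to a computer. The sketch below uses SymPy; the helper name `wave_residual` is ours, and the comparison function sin x cos(vt) is a hypothetical example we chose because it does satisfy the wave equation and is zero at both x = 0 and x = 4𝜋. It is not offered as the motion of the string in this scenario.

```python
import sympy as sp

x, t = sp.symbols("x t", real=True)
v = sp.symbols("v", positive=True)

def wave_residual(y):
    """Left side minus right side of the wave equation for a candidate y(x, t)."""
    return sp.simplify(sp.diff(y, x, 2) - sp.diff(y, t, 2) / v**2)

# The function proposed in the text.
proposed = (1 - sp.cos(x)) * sp.cos(t)
res_proposed = wave_residual(proposed)
print(res_proposed)

# At x = pi/2 the residual reduces to cos(t)/v^2, which no choice of v removes.
res_at_quarter = sp.simplify(res_proposed.subs(x, sp.pi / 2))
print(res_at_quarter)

# A hypothetical comparison function that does satisfy the equation.
standing = sp.sin(x) * sp.cos(v * t)
res_standing = wave_residual(standing)
print(res_standing)
```

A residual of zero means the candidate satisfies the equation; anything else means it does not.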
We see with PDEs—just as we saw with ODEs in previous chapters—that it may be difficult
to find a solution, but it is easy to verify that a solution does (or in this case does not) work.
We also saw with ODEs that we can often predict the behavior of a system directly from
the differential equation, without ever finding a solution. This is also an important skill to
develop with PDEs. Let’s see what we can learn by looking at the wave equation.
The second derivative with respect to position, 𝜕²y∕𝜕x², gives the concavity: it is related to
the shape of the string at one frozen moment in time. The second derivative of the height
with respect to time, 𝜕²y∕𝜕t², is vertical acceleration: it describes how one point on the string
is moving up and down, independent of the rest of the string. So Equation 11.2.1 tells us that
wherever the string is concave up, it will accelerate upward; wherever the string is concave
down, it will accelerate downward.
As an example, suppose our guitar string starts at rest in the shape y(x, 0) = 1 − cos x.
(Don’t ask how it got there.) This function has points of inflection at 𝜋∕2, 3𝜋∕2, 5𝜋∕2, and
7𝜋∕2. So the two peaks, which are concave down, will accelerate downward; the middle, which
is concave up, will accelerate upward. On the left and right are other concave up regions that
will move upward, but remember that the guitar string is pinned down at x = 0 and x = 4𝜋,
so the very ends cannot move regardless of the shape. Based on all these considerations, we
predict something like Figure 11.2.
FIGURE 11.2 The blue plot shows the string at time t = 0 and the black plot shows the string a short time later.
The drawing shows the curve moving up where it was concave up, and down where
it was concave down, while remaining fixed at the ends. We can state with confidence
that the string will move from the blue curve to something kind of like the black curve,
more or less. For exact solutions we have to actually find the function that matches the
wave equation and all the conditions of this scenario. Within the next few sections you’ll
know how to do all that. But if you followed how we made that drawing, then you’ll also know
how to see if your answers make sense.
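The qualitative prediction above can also be tested with a rough numerical experiment. The sketch below is not the analytical method developed later in the chapter; it is a simple finite-difference approximation of the wave equation, written with NumPy. The wave speed v = 1, the grid size, the time step, and the number of steps are all assumed values chosen for illustration.

```python
import numpy as np

v = 1.0                                   # wave speed (assumed value)
x = np.linspace(0.0, 4.0 * np.pi, 401)    # grid spacing is pi/100
dx = x[1] - x[0]
dt = 0.5 * dx / v                         # small time step, chosen for stability
r2 = (v * dt / dx) ** 2

def second_difference(y):
    """Discrete stand-in for dx^2 times the concavity; zero at the pinned ends."""
    d2 = np.zeros_like(y)
    d2[1:-1] = y[2:] - 2.0 * y[1:-1] + y[:-2]
    return d2

# Initial shape from the text, with the ends pinned at y = 0.
y_start = 1.0 - np.cos(x)
y_start[0] = y_start[-1] = 0.0

# The string starts at rest, so the first step uses zero initial velocity.
y_prev = y_start.copy()
y_curr = y_start + 0.5 * r2 * second_difference(y_start)

# March forward: acceleration at each point is proportional to concavity.
for _ in range(19):
    y_next = 2.0 * y_curr - y_prev + r2 * second_difference(y_curr)
    y_prev, y_curr = y_curr, y_next

i_peak, i_middle = 100, 200               # grid points at x = pi and x = 2*pi
print("peak:  ", y_start[i_peak], "->", y_curr[i_peak])
print("middle:", y_start[i_middle], "->", y_curr[i_middle])
```

After this short stretch of simulated time the peak at x = 𝜋 has dropped and the middle at x = 2𝜋 has risen, while the two ends have not moved, in line with the drawing.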
∙ You need the “initial conditions”: that is, you need both the position and the velocity
of every point on the string when it starts (at t = 0). In the example above we gave you
the initial position as y = 1 − cos x, and told you that the string started at rest.
∙ You also need the “boundary conditions.” In the example above the guitar string was
fixed at y = 0 for all time at the left and right sides. A different boundary condition
(such as a moving end) would lead to different behavior over time, even if the initial
conditions were unchanged.
When you solve a linear ODE you need one condition—one fact—for each arbitrary con-
stant. For instance a second-order linear ODE has two arbitrary constants in the general
solution, so you need two extra facts to find a specific solution. These might be the values of
the function at two different points, or the value and derivative of the function at one point.
A PDE, on the other hand, requires an infinite number of facts. In our example the ini-
tial state occurs at an infinite number of x-positions and the boundary conditions occur at
an infinite number of times. To match such conditions the solution must have an infinite
number of arbitrary constants: in other words, an entire arbitrary function.
Several students have objected to our use of the notation f ′ in this example: it
obviously means a derivative, but with respect to what? (This is the kind of question
that only excellent students ask.) In this case the derivative is being taken with respect
to (y − x²∕2) but more generally it means “the derivative of the f function.” For
instance in the previous example f was a sine so f ′ was a cosine.
The question of exactly what information you need in order to specify the arbitrary function
and find a specific solution to a PDE turns out to be surprisingly complicated. We will return
to that question when we discuss Sturm-Liouville theory in Chapter 12. Here we will offer
just one guideline, which is to look at the order of the equation. For instance, the heat
equation 𝜕u∕𝜕t = 𝛼(𝜕²u∕𝜕x²) is second order in space, and therefore requires two boundary
conditions (such as the function u(t) at each end). It is first order in time, and therefore
requires only one initial condition (usually the function u(x) at t = 0). The wave equation
𝜕²y∕𝜕x² = (1∕v²)(𝜕²y∕𝜕t²), on the other hand, requires two boundary conditions and two
initial conditions (usually position and velocity).
Remember that a linear ODE is referred to as “homogeneous” if every term in the
equation includes the dependent variable or one of its derivatives. If a linear equation
is homogeneous then a linear combination of solutions is itself a solution: a very help-
ful property, when you have it! The same rule applies to PDEs: Laplace’s equation
𝜕²V∕𝜕x² + 𝜕²V∕𝜕y² + 𝜕²V∕𝜕z² = 0 is linear and homogeneous, so any linear combination
of solutions is itself a solution. Poisson’s equation 𝜕²V∕𝜕x² + 𝜕²V∕𝜕y² + 𝜕²V∕𝜕z² = f (x, y, z)
is inhomogeneous as long as f ≠ 0.
However, unlike with ODEs, we will now be making the same distinction with our boundary
and initial conditions.
Definition: Homogeneous
A linear differential equation, boundary condition, or initial condition is referred to as “homoge-
neous” if it has the following property:
If the functions f and g are both valid solutions to the equation or condition, then the function
Af + Bg is also a valid solution for any constants A and B.
If the end of our waving string is fixed at y(0, t) = 0, we have a homogeneous boundary
condition. In other words, if f (0, t) = 0 and g (0, t) = 0, then (f + g )(0, t) = 0 too.
However, y(0, t) = 1 represents an inhomogeneous condition. If f (0, t) = 1 and
g (0, t) = 1, then ( f + g )(0, t) = 2, so the function f + g does not meet the boundary
condition.
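The same test applies to equations as well as conditions. The sketch below uses SymPy to try it on Laplace's equation and on one instance of Poisson's equation. The particular functions `V1`, `V2`, `W1`, and `W2`, and the choice f = 6, are hypothetical examples picked because their derivatives are easy to check by hand.

```python
import sympy as sp

x, y, z, A, B = sp.symbols("x y z A B", real=True)

def laplacian(V):
    """Sum of the three second partial derivatives in Cartesian coordinates."""
    return sp.diff(V, x, 2) + sp.diff(V, y, 2) + sp.diff(V, z, 2)

# Two hypothetical solutions of Laplace's equation.
V1 = x**2 - y**2
V2 = sp.exp(x) * sp.sin(y)

# Homogeneous equation: any combination A*V1 + B*V2 is still a solution.
laplace_residual = sp.simplify(laplacian(A * V1 + B * V2))
print(laplace_residual)

# Two hypothetical solutions of Poisson's equation with f = 6.
f = 6
W1 = x**2 + y**2 + z**2
W2 = 3 * x**2
print(laplacian(W1) - f, laplacian(W2) - f)

# Inhomogeneous equation: the sum W1 + W2 misses the target by f.
poisson_residual = sp.simplify(laplacian(W1 + W2) - f)
print(poisson_residual)
```

The combination of Laplace solutions leaves a residual of zero for every A and B, while the sum of the two Poisson solutions does not solve the equation they each solved individually.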
As a final note on conditions, you may be surprised that we distinguish between “initial”
(time) and “boundary” (space) conditions. You can graph a y(x, t) function on the xt-plane,
so doesn’t time act mathematically like just another spatial dimension? Sometimes it does,
but initial conditions often look different from boundary conditions, and must be treated
differently. In our waving string example the differential equation is second order in both
space and time, so we need two spatial conditions and two temporal—but look at the ones
we got! The boundary conditions specify y on both the left and right ends of the string. The
initial conditions, on the other hand, say nothing about the “end time”: instead, they specify
both y and ẏ at the beginning. This common (though not universal) pattern—boundary
conditions on both ends, initial conditions on only one end—leads to significant differences
in the ways initial and boundary conditions affect our solutions.
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2} \qquad \text{the wave equation} \tag{11.2.2}$$
We discussed the one-dimensional wave equation above. In two dimensions we could write
$\partial^2 \Psi/\partial x^2 + \partial^2 \Psi/\partial y^2 = (1/v^2)(\partial^2 \Psi/\partial t^2)$ and in three dimensions we would add a third spatial
derivative term. We can write this equation very generally using the “Laplacian” operator as
$\nabla^2 \Psi = (1/v^2)(\partial^2 \Psi/\partial t^2)$, which is represented by different differential equations as we change
dimensions and coordinate systems. You may recall the Laplacian from vector calculus, but
we will supply the necessary equations as we go through this chapter.
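For reference, in Cartesian coordinates the Laplacian takes the following standard forms in one, two, and three dimensions (other coordinate systems give different expressions):

```latex
\begin{align*}
\text{1D:}\quad \nabla^2 \Psi &= \frac{\partial^2 \Psi}{\partial x^2} \\
\text{2D:}\quad \nabla^2 \Psi &= \frac{\partial^2 \Psi}{\partial x^2} + \frac{\partial^2 \Psi}{\partial y^2} \\
\text{3D:}\quad \nabla^2 \Psi &= \frac{\partial^2 \Psi}{\partial x^2} + \frac{\partial^2 \Psi}{\partial y^2} + \frac{\partial^2 \Psi}{\partial z^2}
\end{align*}
```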
The wave equation describes the propagation of light waves through space, of sound waves
through a medium, and many other physical phenomena. You will explore where it comes
from in Problems 11.32 and 11.45.
$$\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2} \quad (\alpha > 0) \qquad \text{the heat equation} \tag{11.2.3}$$
Like Equation 11.2.2, Equation 11.2.3 (which you derived in the motivating exercise) is a
one-dimensional version of a more general equation where the second derivative is replaced
by a Laplacian. The equation is used to model both conduction of heat and diffusion of
chemical species. (In the latter case it’s called the “diffusion equation.”)
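To get a feel for the behavior Equation 11.2.3 describes, here is a minimal finite-difference sketch. The setup is an assumption made for illustration, not something specified in the text: a bar on 0 ≤ x ≤ 1 with both ends held at u = 0, initial profile u(x, 0) = sin(πx), and an explicit (forward-time, centered-space) update. For that particular setup the exact solution is e^(−απ²t) sin(πx), which the sketch uses as a check.

```python
import math

def solve_heat_ftcs(alpha, n_cells, n_steps, dt):
    """Explicit finite-difference solution of u_t = alpha * u_xx on 0 <= x <= 1.

    Assumed setup: u = 0 at both ends and u(x, 0) = sin(pi * x).
    Returns the grid and the temperature profile after n_steps time steps.
    """
    dx = 1.0 / n_cells
    r = alpha * dt / dx**2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable when alpha*dt/dx^2 > 1/2")
    x = [i * dx for i in range(n_cells + 1)]
    u = [math.sin(math.pi * xi) for xi in x]
    u[0] = u[-1] = 0.0  # boundary values stay fixed at zero
    for _ in range(n_steps):
        new = u[:]
        for i in range(1, n_cells):
            new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        u = new
    return x, u

alpha, n_cells, dt, n_steps = 1.0, 50, 1.6e-4, 625
x, u = solve_heat_ftcs(alpha, n_cells, n_steps, dt)
t_final = n_steps * dt
exact = [math.exp(-alpha * math.pi**2 * t_final) * math.sin(math.pi * xi) for xi in x]
max_error = max(abs(a - b) for a, b in zip(u, exact))
print(t_final, max_error)
```

The profile keeps its sine shape while its amplitude decays, and a larger 𝛼 makes the decay faster.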
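$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = f(x, y, z) \qquad \text{Poisson's equation} \tag{11.2.4}$$
Poisson’s equation is used to model spatial variation of electric potential, gravitational potential,
and temperature. In general, f is a function of position. For instance, it may represent
electrical charge distribution. Poisson’s equation therefore represents many different PDEs
with different solutions, depending on the function f (x, y, z). In the special case f = 0 it
reduces to Laplace’s equation:
$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0 \qquad \text{Laplace's equation} \tag{11.2.5}$$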
We have chosen to express Equations 11.2.4 and 11.2.5 in their general multidimensional
form: the one-dimensional versions are not very interesting, and in fact are not partial dif-
ferential equations at all. These equations relate different spatial derivatives, but no time
derivative. Problems involving these equations therefore have boundary conditions but no
initial conditions.
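To see why the one-dimensional versions are ordinary differential equations, one can write them out for a function V(x) of a single variable. This is a short worked sketch; a and b are arbitrary constants fixed by the two boundary conditions.

```latex
\begin{align*}
\frac{d^2 V}{dx^2} &= 0
  &&\Longrightarrow\quad V(x) = ax + b \\
\frac{d^2 V}{dx^2} &= f(x)
  &&\Longrightarrow\quad V(x) = \int^{x}\!\!\int^{x'} f(x'')\,dx''\,dx' + ax + b
\end{align*}
```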
$$-\frac{\hbar^2}{2m}\nabla^2 \Psi + V(\vec{x})\,\Psi = i\hbar\frac{\partial \Psi}{\partial t} \qquad \text{Schrödinger's equation} \tag{11.2.6}$$
theory. It’s similar to the wave equation, but with an added term. Repeat Problem 11.20 for
the Klein-Gordon equation. In what ways is the behavior described by these two equations
similar and in what ways is it different?
11.22 State whether the given condition is homogeneous or inhomogeneous. Assume
that the function f (x, t) is defined on the domain $0 \le x \le x_f$, $0 \le t < \infty$.
(a) $f(0, t) = f(x_f, t) = 0$
(b) $f(0, t) = 3$
(c) $f(x, 0) = 0$
(d) $f(x, 0) = \sin(\pi x/x_f)$
(e) $\lim_{t\to\infty} f(x, t) = 0$
(f) $\lim_{t\to\infty} f(x, t) = 1$
(g) $\lim_{t\to\infty} f(x, t) = \infty$
11.23 The function z(x, y) is defined on x ∈ [3, 10], y ∈ [2, 5]. The boundary
conditions are: z(3, y) = 0, z(x, 2) = 1, z(10, y) = 2, and $z(x, 5) = x^2$.
Problems 11.29–11.35 depend on the Motivating Exercise (Section 11.1).
11.29 In the Motivating Exercise you used physical arguments to write the equation for the
evolution of the temperature distribution in a thin bar. In that problem we ignored the
surrounding air, assuming that heat transfer within the bar takes place much faster
than transfer with the environment. Now write a different PDE for the temperature
u(x, t) in a thin bar with an external heat source that provides heat proportional to
distance from the left end of the bar.
11.30 The “specific heat” of a material measures how much the temperature changes in
response to heat flowing in or out. If the same amount of heat is supplied to two bricks
of the same size, one of which has twice as high a specific heat as the other, the one with
the higher specific heat will increase its temperature by half as much as the other one.
(Be careful about that difference; higher specific heat means smaller change in
temperature.)
The motivating exercise tacitly assumed that the specific heat of the bar was constant.
Now consider instead a bar whose specific heat is proportional to distance from the left
edge of the bar: h = kx. Derive the PDE for the temperature u(x, t) across the bar.
11.31 The Motivating Exercise was based on a practically one-dimensional object. Now
consider heat flowing through a three-dimensional object. The outer surface of that
object is held at a fixed temperature distribution. (One simple example would be a
cube, where the top is held at temperature u = 100° and the other five sides are all
held at u = 0°. But this problem refers to the general three-dimensional heat equation
with fixed boundary conditions, not to any specific example.)
(a) Write the heat equation: the partial differential equation that governs the
temperature u(x, y, z, t) inside the object.
(b) Under these circumstances, the heat equation will approach a “steady state”
that, if reached, will never change. Write a partial differential equation for the
steady-state solution $u_0(x, y, z)$.
11.32 The wave equation: In the Motivating Exercise you used physical arguments to write
down the heat equation. In this problem you’ll use similar arguments to explain the
form of the wave equation. Consider a string with a tension T. As you did in the
motivating exercise, you should focus on a small piece of string (P) at some position x
and how it interacts with the pieces to the left and right of it ($P_L$ and $P_R$). You should
justify your answers to all the parts of this problem based on the physical description,
not based on the wave equation, since that equation is what we’re trying to derive.
(a) If the string is initially flat, y(x, 0) = <constant>, will $P_L$ exert an upward,
downward, or zero force on P? What about the force of $P_R$ on P? Will the net
force on P be upward, downward, or zero?
(b) Repeat Part (a) if the initial shape of the string is linear, y(x, 0) = mx + b.
(Assume m > 0.)
(c) Repeat Part (a) if the initial shape of the string is parabolic,
$y(x, 0) = ax^2 + bx + c$ as shown in Figure 11.1 (the increasing part of a
concave up curve).
(d) Does the net force on P depend on y, $\partial y/\partial x$, or $\partial^2 y/\partial x^2$?
(e) Does the force on P determine y, $\partial y/\partial t$, or $\partial^2 y/\partial t^2$?
(f) Explain in words why Equation 11.2.2 is the correct description for the
motion of a string.
11.33 [This problem depends on Problem 11.32.] Write the PDE for a string with a drag
force, e.g. a string vibrating underwater. The drag force on each piece of the string is
proportional to the velocity of the string at that point, but opposite in direction. Hint:
Go through the derivation in Problem 11.32 and see at what point the drag force would
be added to what you did there.
11.34 [This problem depends on Problem 11.32.] Write the PDE for a string whose density
is some function 𝜌(x). (Hint: Think about what this implies for the mass of each small
segment, and what that means for the acceleration of that segment.)
11.35 The chemical gas Wonderflonium has accumulated in a pipe. When the density of
Wonderflonium is even throughout the pipe it stays constant, but if there’s more
Wonderflonium in one part of the pipe than another, it will tend to flow from the
region of high Wonderflonium density to the region of low Wonderflonium density.
(a) Write a PDE that could describe the concentration of Wonderflonium in the pipe.
(b) Give the sign and units of any constants in your equation.
For Problems 11.36–11.44 use a computer to numerically solve the given PDEs and graph
the results. You can make plots of f (x) at different times, make a 3D plot of f (x, t), or do an
animation of f (x) evolving over time. For each one describe the late term behavior (steady
state, oscillating, growing without bound), based on your computer results. Then explain,
based on the given equations, why you would expect that behavior even if you didn’t have a
computer.
11.36 $\partial^2 f/\partial t^2 = \partial^2 f/\partial x^2$, 0 < x < 1, 0 < t < 10, f (0, t) = 0, f (1, t) = e − 1,
$f(x, 0) = e^x - 1$, $\dot{f}(x, 0) = 0$
11.37 $\partial^2 f/\partial t^2 = \partial^2 f/\partial x^2$, 0 < x < 1, 0 < t < 10, f (0, t) = 0, f (1, t) = 0,
f (x, 0) = 0, $\dot{f}(x, 0) = \sin(\pi x)$
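One possible way to set up the computer work these problems ask for is a leapfrog finite-difference scheme, sketched below for the conditions of Problem 11.37. The function name `solve_wave_leapfrog`, the grid size, and the time step are illustrative assumptions; a built-in PDE solver in a computer algebra system would serve equally well. The comparison function sin(πx) sin(πt)/π is the known exact solution for these particular conditions and is used here only as a sanity check.

```python
import math

def solve_wave_leapfrog(n_cells, courant, n_steps, shape, velocity):
    """Leapfrog solution of f_tt = f_xx on 0 <= x <= 1 with f(0,t) = f(1,t) = 0.

    shape(x) is f(x, 0) and velocity(x) is the initial time derivative.
    Returns the grid, the profile after n_steps steps, and the final time.
    """
    dx = 1.0 / n_cells
    dt = courant * dx  # stable for courant <= 1
    r2 = courant**2
    x = [i * dx for i in range(n_cells + 1)]
    prev = [shape(xi) for xi in x]
    prev[0] = prev[-1] = 0.0
    # First step uses a Taylor expansion that includes the initial velocity.
    curr = prev[:]
    for i in range(1, n_cells):
        curr[i] = (prev[i] + dt * velocity(x[i])
                   + 0.5 * r2 * (prev[i + 1] - 2.0 * prev[i] + prev[i - 1]))
    # Remaining steps use the standard three-level leapfrog update.
    for _ in range(n_steps - 1):
        nxt = curr[:]
        for i in range(1, n_cells):
            nxt[i] = (2.0 * curr[i] - prev[i]
                      + r2 * (curr[i + 1] - 2.0 * curr[i] + curr[i - 1]))
        prev, curr = curr, nxt
    return x, curr, n_steps * dt

x, f, t_final = solve_wave_leapfrog(
    n_cells=100, courant=0.5, n_steps=100,
    shape=lambda xi: 0.0,
    velocity=lambda xi: math.sin(math.pi * xi),
)
exact = [math.sin(math.pi * xi) * math.sin(math.pi * t_final) / math.pi for xi in x]
max_error = max(abs(a - b) for a, b in zip(f, exact))
print(t_final, max_error)
```

Storing the profile at several values of `n_steps` gives the snapshots needed for plots or an animation.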
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{9}\frac{\partial^2 y}{\partial t^2} \tag{11.3.1}$$
The phrase “fixed at both ends” in the problem statement gives us the boundary conditions:
$$y(0, t) = y(\pi, t) = 0 \tag{11.3.2}$$
1. One solution to this problem is y = 5 sin(2x) cos(6t). (Later in the chapter you will find
such solutions for yourself; right now we are focusing on understanding the equation
and its solutions.)
(a) Confirm that this solution solves the differential equation by plugging y(x, t) into
both sides of Equation 11.3.1 and showing that you get the same answer.
(b) Confirm that this solution meets the boundary conditions, Equation 11.3.2.
(c) Draw graphs of the shape of the string between x = 0 and x = 𝜋 at times t = 0,
t = 𝜋∕12, t = 𝜋∕6, t = 𝜋∕4, and t = 𝜋∕3. Then describe the resulting motion in
words.
(d) Which of the three numbers in this solution—the 5, the 2, and the 6—is an
arbitrary constant? That is, if you change that number to any other constant,
the function will still solve the differential equation and meet the boundary
conditions.
2. Another solution to this equation is y = −(1∕2) sin(10x) cos(kt), if you choose the cor-
rect value of k.
(a) Plug this function into both sides of Equation 11.3.1 and solve for k.
(b) Does the resulting function also meet the boundary conditions?
(c) What is the period of this function in space (i.e. the distance between adjacent
peaks, or the wavelength)?
(d) What is the period of this function in time (i.e. how long do you have to wait before
the string returns to its initial position)?
See Check Yourself #70 in Appendix L
3. For the value of k that you calculated in 2, is the function y = 5 sin(2x) cos(6t) −
(1∕2) sin(10x) cos(kt) a valid solution? (Make sure to check whether it meets both the
differential equation and the boundary conditions!)
4. Next consider the solution y = A sin(px) cos(kt).
(a) For what values of k will this function solve Equation 11.3.1? Your answer will
depend on p.
(b) For what values of p will this solution match the boundary conditions?
5. Write the solution to this differential equation in the most general form you can.
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2} \tag{11.3.3}$$
The dependent variable y represents the displacement of the string from its relaxed height.
(Note that y can be positive, negative, or zero.) The constant v is related to the tension and
linear density of the string.
When you are solving an Ordinary Differential Equation (ODE) you often try to find the
“general solution,” a closed-form function that represents all possible solutions, with a few
arbitrary constants to be filled in based on initial conditions. The same can sometimes be
done for a Partial Differential Equation (PDE), and below we present the general solution
to the wave equation, valid for all possible initial and boundary conditions. However, it is not
always practical to solve problems using this general solution. Instead we will find a special set
of particular solutions known as “normal modes” and build up solutions for different initial
and boundary conditions using these normal modes.
In this section we’ll simply present the normal modes for the wave equation and use com-
binations of them to build other solutions. In later sections we’ll show you how to find the
normal modes for other PDEs.
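$$y = f(x + vt) \;\rightarrow\; \frac{\partial^2 y}{\partial x^2} = f''(x + vt),\quad \frac{\partial^2 y}{\partial t^2} = v^2 f''(x + vt) \;\rightarrow\; \frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2}$$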
For similar reasons, any function g (x − vt) will also be a solution. Any sum of such functions
will be a solution as well, so we can write the general solution of this equation as:
$$y(x, t) = f(x + vt) + g(x - vt)$$
(We don’t need arbitrary constants in front because f and g can include any constants.) This
form of the general solution to the wave equation is known as “d’Alembert’s Solution.” The
constant v in this solution is not arbitrary; it was part of the original differential equation.
However, f and g are arbitrary functions: replace them with any functions at all and you have a
solution. For instance, the function y = 1∕(x + vt)2 − 3 ln(x − vt) solves the equation (as you
will demonstrate in Problem 11.47).
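A quick numerical spot check of this claim is sketched below. It estimates both second derivatives with central differences at a single point and compares the two sides of the wave equation. The helper names, the value v = 2, the sample point, and the deliberately wrong comparison function are all assumptions made for illustration; a check at one point is evidence, not a proof.

```python
import math

def second_diff(func, a, h=1e-3):
    """Central-difference estimate of the second derivative of func at a."""
    return (func(a + h) - 2.0 * func(a) + func(a - h)) / h**2

def wave_residual(y, x, t, v):
    """Estimate of y_xx - (1/v^2) y_tt at (x, t); near zero for a solution."""
    y_xx = second_diff(lambda s: y(s, t), x)
    y_tt = second_diff(lambda s: y(x, s), t)
    return y_xx - y_tt / v**2

v = 2.0  # assumed wave speed

def y_example(x, t):
    # f(u) = 1/u^2 evaluated at u = x + v t, plus g(w) = -3 ln(w) at w = x - v t
    return 1.0 / (x + v * t) ** 2 - 3.0 * math.log(x - v * t)

def not_a_solution(x, t):
    # Depends on x + 2 v t, which is not of the form f(x + v t) + g(x - v t)
    return math.sin(x + 2.0 * v * t)

residual = wave_residual(y_example, 5.0, 1.0, v)
bad_residual = wave_residual(not_a_solution, 5.0, 1.0, v)
print(residual, bad_residual)
```

The sample point is chosen so that x − vt is positive, since the logarithm is otherwise undefined.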
What does all that tell us about vibrating strings? g (x − vt) represents a function that does
not change its shape over time: it only moves to the right with speed v. For instance, the
function $y(x, t) = e^{-(x - vt)^2}$ describes the behavior you might see if you grab one end of a rope
and give it a quick jerk.
[Figure: three panels showing the curve at t = 0, t = 2, and t = 4, each plotted for x from −6 to 6.]
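The statement that the pulse keeps its shape and simply slides to the right can be checked by locating its peak at several times. In the sketch below the speed v = 1, the sampling grid, and the helper names are assumptions for illustration.

```python
import math

def pulse(x, t, v):
    """The traveling pulse exp(-(x - v t)^2)."""
    return math.exp(-(x - v * t) ** 2)

def peak_position(t, v, x_min=-6.0, x_max=6.0, n=1200):
    """Location of the largest sampled value of the pulse at time t."""
    dx = (x_max - x_min) / n
    xs = [x_min + i * dx for i in range(n + 1)]
    return max(xs, key=lambda x: pulse(x, t, v))

v = 1.0  # assumed speed
times = (0.0, 2.0, 4.0)
peaks = [peak_position(t, v) for t in times]
heights = [pulse(p, t, v) for p, t in zip(peaks, times)]
print(peaks, heights)
```

The peak sits near x = vt at each time while its height stays at 1, which is what "moving to the right with speed v without changing shape" means.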
Similarly, f (x + vt) represents an arbitrary curve moving steadily to the left. But the general
solution f (x + vt) + g (x − vt) does not describe a static curve that moves: it can change its
shape over time in ways that might surprise you. For instance, we show here the function
$\cos\left[(x + t)^2\right] + e^{-(x - t - 1)^2}$.
[Figure: four panels showing this function at t = 0, t = 1/3, t = 2/3, and t = 1.]
The correct combination of functions f and g describes any possible solution to the wave
equation. But while f retains its shape while moving to the left, and g retains its shape while
moving to the right, a combination of the two may evolve in complicated ways.
Every point on the string oscillates with a period of 2𝜋∕v and an amplitude given by its
original height (Figure 11.4).
[Figure 11.4: five snapshots of the string on 0 ≤ x ≤ π.]
Is it really that simple? The “rubber band” argument may not be sufficiently rigorous
for you; in fact this solution may remind you of a similar function that didn’t work in the
previous section. But the initial condition is different now, and our new solution works per-
fectly. We leave it to you to confirm that the function y = sin(x) cos(vt) satisfies the wave
equation (11.3.3), our boundary conditions y(0, t) = y(𝜋, t) = 0, and our initial conditions
y(x, 0) = sin x and ẏ(x, 0) = 0. Those confirmations are the acid test of a solution, however we
arrived at it.
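Those confirmations can also be spot-checked numerically. The sketch below tests the wave equation, the boundary conditions, and both initial conditions for y = sin(x) cos(vt) using finite differences. The value v = 3, the sample points, and the helper names are assumptions for illustration, and the check does not replace doing the derivatives by hand.

```python
import math

v = 3.0  # assumed wave speed, chosen only for this check

def y(x, t):
    return math.sin(x) * math.cos(v * t)

def second_diff(func, a, h=1e-3):
    """Central-difference estimate of the second derivative of func at a."""
    return (func(a + h) - 2.0 * func(a) + func(a - h)) / h**2

def first_diff(func, a, h=1e-5):
    """Central-difference estimate of the first derivative of func at a."""
    return (func(a + h) - func(a - h)) / (2.0 * h)

xs = [i * math.pi / 100 for i in range(101)]
ts = [0.0, 0.4, 1.3]

# Wave equation y_xx = (1/v^2) y_tt at an interior sample point.
x0, t0 = 0.9, 0.4
wave_residual = (second_diff(lambda s: y(s, t0), x0)
                 - second_diff(lambda s: y(x0, s), t0) / v**2)

# Boundary conditions y(0, t) = y(pi, t) = 0.
boundary_error = max(max(abs(y(0.0, t)), abs(y(math.pi, t))) for t in ts)

# Initial conditions y(x, 0) = sin x and zero initial velocity.
shape_error = max(abs(y(x, 0.0) - math.sin(x)) for x in xs)
velocity_error = max(abs(first_diff(lambda s: y(x, s), 0.0)) for x in xs)
print(wave_residual, boundary_error, shape_error, velocity_error)
```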
And what about our general solution? Can this function be rewritten in the form y(x, t) =
f (x + vt) + g (x − vt)? It must be possible, because all solutions to the wave equation must have
this form. In this particular case it can be done using trig identities, but it isn’t necessary.
It was easier to start with a physically motivated guess than to find the right functions
f and g .
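For the record, the trig identity in question is the product-to-sum formula, which splits the standing wave into a left-moving and a right-moving sine wave of half the amplitude:

```latex
\begin{align*}
\sin(x)\cos(vt) &= \tfrac{1}{2}\sin(x + vt) + \tfrac{1}{2}\sin(x - vt), \\
\text{so}\quad f(u) &= g(u) = \tfrac{1}{2}\sin u .
\end{align*}
```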
Making an educated guess works great if the initial condition happens to be y = sin x, but
can we do it for other cases? You probably won’t be surprised to hear that guessing a solution
is just as easy if y(x, 0) = sin(2x), or sin(10x), or any other sine wave that fits comfortably into
the total length: the string just oscillates, retaining its original shape but changing amplitude.
You’ll show in the problems that if the initial conditions are zero velocity and y(x, 0) = sin(nx)
(where n is an integer), then the solution is y(x, t) = sin(nx) cos(nvt).
Solutions of this form are called the “normal modes” of the string.
To see more clearly what this definition means, consider the example y(x, t) =
sin(x) cos(vt) discussed above. At x = 𝜋∕6 this becomes y(𝜋∕6, t) = 0.5 cos(vt), which is
an oscillation with frequency v∕(2𝜋) and amplitude 0.5. (Remember that frequency just means
one over the period.) At the point x = 𝜋∕3 the string oscillates with frequency v∕(2𝜋)
and amplitude $\sqrt{3}/2$. Since every point on the string oscillates at the same frequency, this
solution is a normal mode of the system. This may remind you of the “normal modes” of
coupled oscillators in Chapter 6. In that case the system consisted of a finite number of
discrete oscillators instead of an infinite number of oscillating points, but the definition
of normal mode is the same in both cases.
If the string starts out in the curve y(x, 0) = sin(nx), we know exactly how it will evolve over
time. Surprisingly, that insight turns out to be the key to the motion of our string under any
initial conditions.
Problem:
Consider a string subject to the same boundary condition we used above,
y(0, t) = y(𝜋, t) = 0, but starting in the initial form y(x, 0) = 7 sin x + 2 sin(8x).
How will such a string evolve over time?
Solution:
The wave equation is linear and homogeneous, which means that any linear
combination of solutions is also a solution. Our boundary conditions are
also homogeneous, which means that any sum of solutions will match the
boundary conditions as well. Since we know that y(x, t) = sin(x) cos(vt) and
y(x, t) = sin(8x) cos(8vt) are both solutions, it must also be true that:
y(x, t) = 7 sin(x) cos(vt) + 2 sin(8x) cos(8vt)
is a solution. Since it matches the boundary and initial conditions, it is the solution for
this case.2
[Figure: plot of y against x for x from −15 to 15. The larger waves make one full cycle up and down in the time that the smaller waves make eight full cycles.]
Once again, this should look familiar if you’ve studied coupled oscillators. There too you
can solve for any initial condition by writing it as a sum of normal modes.
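The superposition step in the example above can be written as a small helper that turns a list of mode amplitudes into the full solution. This is a sketch under the same assumptions as the example (string from x = 0 to x = π, fixed ends, zero initial velocity); the function name `string_solution` and the value v = 1.5 are illustrative choices.

```python
import math

def string_solution(coefficients, v):
    """Build y(x, t) = sum of a_n sin(n x) cos(n v t) from a dict {n: a_n}."""
    def y(x, t):
        return sum(a * math.sin(n * x) * math.cos(n * v * t)
                   for n, a in coefficients.items())
    return y

v = 1.5  # assumed wave speed
y = string_solution({1: 7.0, 8: 2.0}, v)

xs = [i * math.pi / 200 for i in range(201)]
ts = [0.0, 0.3, 1.1]

# The solution reproduces the initial shape 7 sin x + 2 sin 8x at t = 0 ...
initial_error = max(
    abs(y(x, 0.0) - (7.0 * math.sin(x) + 2.0 * math.sin(8.0 * x))) for x in xs
)
# ... vanishes at both ends for all times ...
end_error = max(max(abs(y(0.0, t)), abs(y(math.pi, t))) for t in ts)
# ... and returns to its starting shape after one period of the n = 1 mode.
period = 2.0 * math.pi / v
return_error = max(abs(y(x, period) - y(x, 0.0)) for x in xs)
print(initial_error, end_error, return_error)
```

In the time 2π/v the n = 1 term completes one cycle and the n = 8 term completes eight, so the whole shape repeats.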
You may object that our example was too easy: the initial condition wasn’t exactly a normal
mode, but it was the next best thing. How do we find the solution for an initial condition
that doesn’t just “happen” to be the sum of a few sine waves?
The answer is that practically any function on a finite domain “happens” to be the sum
of sine waves—or at least, we can write it as the sum of sine waves if we want to. That’s
what Fourier series are all about! And that insight leads us to a general approach. First you
decompose your initial condition into a sum of sine waves (or, more generally, into a sum
of normal modes). Then you write your solution as a sum of individual solutions for each of
these normal modes.
Problem:
Solve the wave equation with boundary conditions y(0, t) = 0 and y(𝜋, t) = 0, and
initial conditions ẏ(x, 0) = 0,
$$y(x, 0) = \begin{cases} x/2, & 0 < x < \pi/2 \\ (\pi - x)/2, & \pi/2 < x < \pi \end{cases}$$
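[Figure: the initial shape y(x, 0), rising linearly from 0 at x = 0 to 𝜋∕4 at x = 𝜋∕2 and falling linearly back to 0 at x = 𝜋.]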
2 As we said earlier, we are not going to formally discuss exactly what boundary and initial conditions are sufficient
to conclude that a solution is unique. In this case, however, we can make the argument on purely physical grounds:
we know everything there is to know about this particular string, and it can only do one thing!
Solution:
We begin by writing the initial condition as a sum of normal modes. In other
words, we write the Fourier sine series for the function. The answer (after some
calculations) is:
$$y(x, 0) = \sum_{n=1}^{\infty} \frac{2}{n^2 \pi}(-1)^{(n-1)/2} \sin(nx),\ n \text{ odd} \;=\; \frac{2}{\pi}\sin(x) - \frac{2}{9\pi}\sin(3x) + \frac{2}{25\pi}\sin(5x) + \dots$$
That infinite sum may look intimidating, but if you look term by term, you can see
that for each value of n this is just a constant times a sine function. We can do the
same thing we did above with the sum of just two sine functions; we multiply each
sine function of x by the corresponding cosine function of t. So the solution is:
$$y(x, t) = \sum_{n=1}^{\infty} \frac{2}{n^2 \pi}(-1)^{(n-1)/2} \sin(nx)\cos(nvt),\ n \text{ odd} \;=\; \frac{2}{\pi}\sin(x)\cos(vt) - \frac{2}{9\pi}\sin(3x)\cos(3vt) + \dots \tag{11.3.5}$$
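The "some calculations" behind the sine series can be cross-checked numerically. The sketch below approximates b_n = (2/π) ∫ y(x, 0) sin(nx) dx over 0 ≤ x ≤ π with a midpoint rule and compares it with the closed form 2(−1)^((n−1)/2)/(n²π) for odd n and 0 for even n. The function names and the number of subintervals are assumptions for illustration.

```python
import math

def initial_shape(x):
    """The triangular initial condition from the example."""
    return x / 2.0 if x < math.pi / 2.0 else (math.pi - x) / 2.0

def sine_coefficient(f, n, pieces=4000):
    """Midpoint-rule estimate of (2/pi) * integral of f(x) sin(n x) over [0, pi]."""
    h = math.pi / pieces
    total = sum(f((i + 0.5) * h) * math.sin(n * (i + 0.5) * h)
                for i in range(pieces))
    return 2.0 / math.pi * total * h

def closed_form(n):
    """Coefficient of sin(n x) in the series: zero for even n."""
    if n % 2 == 0:
        return 0.0
    return 2.0 * (-1) ** ((n - 1) // 2) / (n**2 * math.pi)

numeric = {n: sine_coefficient(initial_shape, n) for n in range(1, 6)}
for n in range(1, 6):
    print(n, numeric[n], closed_form(n))
```

For n = 3 both columns should be close to −2/(9π), and the even-n coefficients should be close to zero.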
If you had trouble with the step where we took the Fourier sine series of y(x, 0) you should
review Fourier series now; we’re going to use them a lot in this chapter. All the relevant
formulas are in Appendix G.
Even if you didn’t have trouble with the calculation, you may still find an infinite series to
be an unsatisfying answer. However, that 1∕n 2 will make the series converge pretty quickly, so
for most purposes the first few terms will give you a pretty good approximate answer. As we
go through the chapter, you’ll see that most analytical solutions for PDEs come in the form
of series; expressions in closed form are the exceptions.
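To make "converges pretty quickly" concrete, the sketch below measures how far partial sums of the series are from the triangular initial shape at t = 0. The helper names, the grid, and the choice of 1, 3, and 10 terms are assumptions for illustration.

```python
import math

def initial_shape(x):
    """The triangular initial condition from the example."""
    return x / 2.0 if x < math.pi / 2.0 else (math.pi - x) / 2.0

def partial_sum(x, t, v, n_terms):
    """First n_terms nonzero terms (n = 1, 3, 5, ...) of the series solution."""
    total = 0.0
    for k in range(n_terms):
        n = 2 * k + 1  # odd n, for which (-1)^((n-1)/2) = (-1)^k
        total += (2.0 * (-1) ** k / (n**2 * math.pi)
                  * math.sin(n * x) * math.cos(n * v * t))
    return total

def max_error(n_terms, v=1.0, samples=400):
    """Largest gap between the partial sum at t = 0 and the initial shape."""
    xs = [i * math.pi / samples for i in range(samples + 1)]
    return max(abs(partial_sum(x, 0.0, v, n_terms) - initial_shape(x)) for x in xs)

errors = {n_terms: max_error(n_terms) for n_terms in (1, 3, 10)}
print(errors)
```

The worst error occurs at the corner of the triangle, x = π/2, and shrinks steadily as terms are added.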
Now you know how to solve for the motion of a vibrating string, at least if:
∙ The ends of the string are at x = 0 and x = 𝜋.
∙ The ends of the string are held fixed at y = 0.
∙ The initial velocity of the string is zero everywhere.
In the problems you will use the same technique with these three conditions remaining
constant—only the initial position, and therefore the Fourier series, will vary. Then you
will do other problems that change all three of these conditions, and you will find that
although the forms of the solutions vary, the basic idea carries through. The example below,
for instance, is the same in length and initial velocity, but different in boundary conditions.
The air inside a flute obeys the wave equation $\partial^2 s/\partial x^2 = (1/c_s^2)(\partial^2 s/\partial t^2)$, where s(x, t)
is displacement of the air and the constant $c_s$ is the speed of sound. For reasons we’re
not going to get into here, s is not constrained to go to zero at the edges, but $\partial s/\partial x$ is.
Hence, we are solving the same differential equation with different boundary
conditions.
For consistency, let’s consider a flute that extends from x = 0 to x = 𝜋, and let’s
take as a simple initial condition s(x, 0) = cos x, 𝜕s∕𝜕t(x, 0) = 0. (Why can’t we start
with the same initial condition we used for our string above? Because it doesn’t meet
our new boundary conditions!)
The solution to our new system is $s(x, t) = \cos(x)\cos(c_s t)$. You can confirm this
solution by verifying the following requirements:
1. It satisfies the wave equation $\partial^2 s/\partial x^2 = (1/c_s^2)(\partial^2 s/\partial t^2)$.
2. It satisfies the boundary condition $\partial s/\partial x = 0$ at x = 0 and x = 𝜋.
3. It satisfies the initial condition s(x, 0) = cos x. (Did you expect our solution to have
$\sin(c_s t)$ instead of the cosine? This would meet the first two requirements, but not
the initial condition. Try it!)
This solution is a normal mode since every point in the flute vibrates with the same
frequency. More generally, $s(x, t) = \cos(nx)\cos(n c_s t)$ is the normal mode of this system
for the initial condition s(x, 0) = cos(nx), $\partial s/\partial t\,(x, 0) = 0$.
Therefore, if the system happened to start in the state s(x, 0) = 10 cos(3x) +
2 cos(8x), $\partial s/\partial t\,(x, 0) = 0$, its motion would be described by the function $s(x, t) =
10\cos(3x)\cos(3c_s t) + 2\cos(8x)\cos(8c_s t)$. More generally, we can decompose any initial
condition into a Fourier cosine series, and simply write the solution from there. (For
a function defined on a finite interval you create an “odd extension” of that function
to write a Fourier sine series or an “even extension” to write a Fourier cosine series:
see Chapter 9.)
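The requirements listed in this example can be spot-checked numerically, including the remark that a sin(c_s t) time dependence fails the initial condition. In the sketch below the value c_s = 3, the sample times, and the helper names are assumptions for illustration.

```python
import math

c_s = 3.0  # assumed speed of sound in these units

def flute_solution(coefficients):
    """Build s(x, t) = sum of a_n cos(n x) cos(n c_s t) from a dict {n: a_n}."""
    def s(x, t):
        return sum(a * math.cos(n * x) * math.cos(n * c_s * t)
                   for n, a in coefficients.items())
    return s

def sine_in_time(x, t):
    """The alternative guess cos(x) sin(c_s t) mentioned in requirement 3."""
    return math.cos(x) * math.sin(c_s * t)

def d_dx(func, x, t, h=1e-6):
    """Central-difference estimate of the x-derivative of func at (x, t)."""
    return (func(x + h, t) - func(x - h, t)) / (2.0 * h)

mode = flute_solution({1: 1.0})
combined = flute_solution({3: 10.0, 8: 2.0})

xs = [i * math.pi / 100 for i in range(101)]
ts = [0.0, 0.2, 0.7]

# Boundary condition: the slope vanishes at both ends for every sampled time.
edge_slope = max(abs(d_dx(f, xe, t))
                 for f in (mode, combined) for xe in (0.0, math.pi) for t in ts)

# Initial condition: the cosine-in-time mode matches cos x, the sine version does not.
cosine_mismatch = max(abs(mode(x, 0.0) - math.cos(x)) for x in xs)
sine_mismatch = max(abs(sine_in_time(x, 0.0) - math.cos(x)) for x in xs)
print(edge_slope, cosine_mismatch, sine_mismatch)
```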
Normal modes provide a very general approach that can be used to solve many problems
in partial differential equations, provided you can take two key steps.
1. Figure out what the normal modes are. In this case we figured them out using physical
intuition—or, as it may seem to you, improbably lucky guesswork.
2. Rewrite any initial condition as a sum of normal modes. In this case we used Fourier
series.
In the next section we will see a more general approach to the first step, finding the normal
modes. In the sections that follow, we will see that the second step is often possible even when
the normal modes do not involve sines and cosines.
does not simply mean that these functions are valid solutions of the differential equation; it
means something much stronger than that. Explain in your own words what a “normal
mode” means, and why it is important.
11.62 A string of length 1 is fixed at both ends and obeys the wave equation (11.3.3)
with v = 2. For each of the initial conditions given below assume the initial
velocity of the string is zero.
(a) Have a computer numerically solve the wave equation for this string with initial
condition y(x, 0) = sin(2𝜋x) and animate the resulting motion of the string. Solve
to a late enough time to see the string oscillate at least twice, using trial and
error if necessary. Describe the resulting motion.
(b) Have a computer numerically solve for and animate the motion of the string
for initial condition y(x, 0) = sin(20𝜋x), using the same final time you used in
Part (a). How is this motion different from what you found in Part (a)?
(c) Consider the initial condition y(x, 0) = sin(2𝜋x) + (0.1) sin(20𝜋x). What would
you expect the motion of the string to look like in this case? Solve the wave
equation with this initial condition numerically and animate the results. Did the
results match your prediction?
(d) Finally, make an animation of the solution to the wave equation for the case
$y(x, 0) = 0.2\sin(2\pi x)\left[7 + 6x - 100x^2 + 100x^3 + \cos(36x) - e^{-36x^2}\right]$. (We chose this
simply because it looks like a crazy, random mess.) Describe the resulting motion.
(e) How is the evolution of a string that starts in a normal mode different
from the evolution of a string that starts in a different shape?
11.63 In the Explanation (Section 11.3.2), we discussed a string of length 𝜋 fixed at both
ends with initial shape y(x, 0) = sin x and no initial velocity. Based on physical
arguments we guessed that the initial shape of the string would oscillate sinusoidally in time.
We then jumped to the exact correct function with very little justification. (Did you
notice?) In this problem, your job is to fill in the missing steps. Start with a “guess” that
represents the initial function oscillating: y(x, t) = sin(x) (A sin(𝛼t) + B cos(𝛽t)). Plug this
guess into the wave equation (11.3.3) along with the initial conditions y(x, 0) = sin x and
ẏ(x, 0) = 0, and solve for A, B, 𝛼, and 𝛽.
In Problems 11.64–11.68 you will solve for the displacement of air inside a flute of length 𝜋.
The displacement s(x, t) obeys the wave equation $\partial^2 s/\partial x^2 = (1/c_s^2)(\partial^2 s/\partial t^2)$, just like a
vibrating string, but the boundary conditions for the flute are
$\partial s/\partial x\,(0, t) = \partial s/\partial x\,(\pi, t) = 0$. This leads to a different set of normal modes, which we
found in the example on Page 561. The initial condition for s(x, 0) is given below, and you
should assume in each case that $\partial s/\partial t\,(x, 0) = 0$. For each of these initial conditions find
the solution s(x, t) to the wave equation, taking $c_s = 3$.
11.64 s(x, 0) = cos(5x)
11.65 s(x, 0) = cos(x) + (1∕10) cos(10x)
11.66 s(x, 0) = (1∕10) cos(x) + cos(10x)
11.67 $s(x, 0) = \left(x^2 - \pi^2\right)^2$. Your answer will be an infinite series. Make plots of the 20th
partial sum of the series solution at three or more different times. Describe how the
function s(x) is evolving over time.
11.68 Make up an initial position s(x, 0). You may use any function that obeys the
boundary conditions $\partial s/\partial x\,(0, t) = \partial s/\partial x\,(\pi, t) = 0$ except the trivial case s(x, 0) = 0, or any
function we have already used in the Explanation (Section 11.3.2) or the problems
above.
11.69 In the Explanation (Section 11.3.2), we considered a string of length 𝜋 with fixed ends
and zero initial velocity. We wrote an expression for the normal modes of this system and
showed how to solve for the motion if the string started in one of the normal modes.
More importantly, we showed how to write any other given initial condition as a sum of
normal modes using Fourier series and thus solve the wave equation. In this problem you
will perform a similar analysis for a string of length L.
(a) If a string starts in the initial position y(x, 0) = sin(nx) where n is any integer,
it is guaranteed to meet the boundary conditions y(0, 0) = y(𝜋, 0) = 0. What
must be true of the constant k if the initial position meets the boundary
conditions y(0, 0) = y(L, 0) = 0? (Your answer will once again end in the phrase
“where n is any integer.”)
(b) For a string of length 𝜋 with fixed ends and zero initial velocity, the normal
modes can be written as y(x, t) = sin(nx) cos(nvt), n = 0, 1, 2, … . Based
on the initial position you wrote in Part (a), write a similar expression for
the normal modes y(x, t) for a string of length L. Confirm that your solution
solves the wave equation (11.3.3).
(c) Find the solution y(x, t) if the string starts at rest in position y(x, 0) = sin(3𝜋x∕L).
Make sure your solution satisfies the wave equation and matches the initial and
boundary conditions.
(d) Find the solution y(x, t) if the string starts at rest in position
y(x, 0) = 2 sin(3𝜋x∕L) + 5 sin(8𝜋x∕L).
(e) Now find the solution if the string starts at rest in position
$y(x, 0) = \begin{cases} 1, & L/3 < x < 2L/3 \\ 0, & \text{elsewhere} \end{cases}$.
(Hint: You will need to start by expanding this initial function in a Fourier
sine series.) Your answer will be in the form of an infinite series.
11.70 In the Explanation (Section 11.3.2), we considered a string with zero initial velocity and
non-zero initial displacement. We wrote an expression for the normal modes of this
system and showed how to solve for the motion if the string started in one of the normal
modes. More importantly, we showed how to write a more complicated initial position
as a sum of normal modes using Fourier series and thus solve the wave equation.
In this problem you will perform a similar analysis for a string with the same
boundary conditions, y(0, t) = y(𝜋, t) = 0, but with different initial conditions: your string has
zero initial displacement and non-zero initial velocity.
(a) Regardless of its initial velocity, this string has zero initial acceleration.
How do we know that?
(b) Suppose that the string has initial velocity $\partial y/\partial t\,(x, 0) = \sin x$. Sketch
the shape of the string a short time after the initial time.
(c) Describe in words how you would expect the string to evolve over time. (To answer
this, you will need to take into account not only the initial velocity, but also
the wave equation (11.3.3) that dictates its acceleration over time.)
(d) Express your guess as a mathematical function y(x, t) and verify that it solves
the wave equation and matches the initial and boundary conditions. (If at first
your guess doesn’t succeed: try, try again.)
(e) Next consider the initial velocity $\partial y/\partial t\,(x, 0) = \sin(3x)$. Find the solution
y(x, t) for this case and make sure your solution satisfies the wave equation and
the initial and boundary conditions.
(f) In the Explanation (Section 11.3.2), we found that the normal modes for a
string of length 𝜋 with fixed ends and zero initial velocity can all be written as
y(x, t) = sin(nx) cos(nvt), n = 0, 1, 2, … . Write a similar expression for the normal
modes of a string of length 𝜋 with fixed ends and zero initial displacement.
(g) Find the solution y(x, t) if the string starts at y(x, 0) = 0 with initial velocity
$\partial y/\partial t\,(x, 0) = 2\sin(3x) + 4\sin(5x)$.
(h) Find the solution if the string starts at y(x, 0) = 0 with initial velocity
$\partial y/\partial t\,(x, 0) = \begin{cases} 1, & \pi/3 < x < 2\pi/3 \\ 0, & \text{elsewhere} \end{cases}$.
This might occur if the middle of the string were suddenly struck with a hammer.
(Hint: You will need to start by expanding this initial function in a
Fourier sine series.) Your answer will be in the form of an infinite series.
11.71 In the Explanation (Section 11.3.2), we found the normal modes for a vibrating string that
obeys the wave equation (11.3.3) with fixed boundaries and zero initial velocity. If the
string is infinitely long we no longer have those boundary conditions. What are all the
possible normal modes for a system obeying the wave equation (11.3.3) with zero initial
velocity on the real line: −∞ < x < ∞? To answer this you should look at the Explanation
and see what restriction the boundary conditions imposed on our normal modes,
and then remove that restriction.
11.72 A function y(x, t) may be said to be a “normal mode” if the initial shape y(x, 0) evolves in
time by changing amplitude (that is, by stretching or compressing vertically) but does not
change in any other way. We have seen that the normal modes for the wave equation are
sines and cosines, but different equations may have very different normal modes. Express
with no words (just a simple equation) the statement “y(x, t) is a normal mode” as defined
above.
$$ \frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2} \tag{11.4.1} $$
where u(x, t) is the temperature and 𝛼 is a positive constant called “thermal diffusivity”
that reflects how efficiently the bar conducts heat. Both ends are immersed in ice water, so
u(0, t) = u(L, t) = 0, but the initial temperature may be non-zero at other interior points.
You are going to solve for the temperature u(x, t). Your strategy will be to guess a solution of
the form u(x, t) = X (x)T (t) where X is a function of the variable x (but not of t), and T is a
function of the variable t (but not of x).
1. Plug the function u(x, t) = X (x)T (t) into Equation 11.4.1. Note that if u(x, t) =
X (x)T (t) then 𝜕u∕𝜕x = X ′ (x)T (t).
2. “Separate the variables” in your answer to Part 1. In other words, algebraically rear-
range the equation so that the left side contains all the functions that depend on t,
and the right side contains all the functions that depend on x. In this problem—in
fact in many problems—you can separate the variables by dividing both sides of the
equation by X (x)T (t). You should also divide both sides of the equation by 𝛼, which
will bring 𝛼 to the side with t; this step is not necessary but it will make the equations
a bit simpler later on.
3. The next step relies on this key result: if the left side of the equation depends on t
(but not on x), and the right side depends on x (but not on t), then both sides of the
equation must equal a constant. Explain why this result must be true.
4. Based on the result from Part 3, you can now turn your partial differential equation
from Part 1 into two ordinary differential equations: “the left side of the equation
equals a constant” and “the right side of the equation equals the same constant.” Write
both equations. Call the constant P .
See Check Yourself #71 in Appendix L
The next step, finding real solutions to these two ordinary differential equations, depends
on the sign of the constant P . We shall therefore handle the three cases separately. We’ll start
by solving the equation for X (x). Remember that the constant 𝛼 is necessarily positive. Avoid
complex answers.
5. Assuming P > 0, we can replace P with k² where k is any real number. Solve the
equation for X (x) for this case. Your solution will have two arbitrary constants in it:
call them A and B.
6. Assuming P = 0, solve the equation for X (x), once again using A and B for the arbi-
trary constants.
7. Assuming P < 0, we can replace P with −k² where k is any real number. Solve the
equation for X (x) for this case, once again using A and B for the arbitrary constants.
8. For two of the three solutions you just found the only way to match the boundary
conditions is with the “trivial” solution X (x) = 0. Only one of the three allows for
non-trivial solutions that match the boundary conditions. Based on that fact, what
must the sign of P be?
See Check Yourself #72 in Appendix L
9. The boundary condition u(0, t) = 0 implies that X (0) = 0. Plug this into your solution
for X (x) to find the value of one of the two arbitrary constants.
10. Find the values of k that match the second boundary condition u(L, t) = 0. Hint: there
are infinitely many such values. We will return to these values when we match initial
conditions. For the rest of this exercise we will continue to just write k.
Having found X (x), we now turn our attention to the ODE you wrote for T (t) way back in
Part 4.
11. Replace P with −k² in the T (t) differential equation, as you did with the X (x) one.
(Why? Because both differential equations were set equal to the same constant P .)
12. Having done this replacement, solve the equation for T (t). Your solution will intro-
duce a new arbitrary constant: call it C.
13. Write the solution u(x, t) = X (x)T (t) based on your answers. This solution should
depend on k.
14. Explain why, when you combine your X (x) and T (t) functions into one u(x, t) func-
tion, you can combine the arbitrary constants from the two functions into one arbi-
trary constant.
The solution you just found is a normal mode of this system. In the Explanation that follows
we will write the general solution to such an equation as a linear superposition of all the
normal modes, and use initial conditions to solve for the arbitrary constants.
Before we dive into the details, let’s start with an analogy to a familiar problem. To solve
the ordinary differential equation y′′ − 6y′ + 5y = 0, we might take the following steps:
∙ Guess a solution with an unknown constant. In Chapter 1 we saw that the correct guess for such an equation would be y = e^{px}.
∙ Plug your guess into the original equation, to solve for the unknown constant. Plugging y = e^{px} into the original differential equation gives p² − 6p + 5 = 0, so we find two solutions y = e^x and y = e^{5x}. (Try it.)
∙ Sum the solutions. If a differential equation is linear and homogeneous, then any linear combination of solutions is also a solution. So we write y = Ae^x + Be^{5x}. Since we are solving a second-order linear equation and we have a solution with two arbitrary constants, this is the general solution.
∙ Plug in initial conditions. “Initial conditions” in this context means two additional pieces of information besides the original differential equation. For instance, if we know that y(2) = 3 then we can write 3 = Ae^2 + Be^{10}. If we know one other piece of information (such as another point, or the derivative at that point) we can solve for the arbitrary constants, finding the specific solution we want.
Plugging in “guesses” in this way reduces ordinary differential equations to algebra
equations, which are generally easier to solve.
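The four steps above can be sketched numerically. This is only an illustration: the second condition, y′(2) = 0, is an assumption chosen so that the constants can be pinned down, and the names `residual`, `x0`, `y0`, and `dy0` are hypothetical.

```python
import math

# Step 1-2: the guess y = e^(px) turns y'' - 6y' + 5y = 0 into p^2 - 6p + 5 = 0.
a, b, c = 1.0, -6.0, 5.0
disc = math.sqrt(b * b - 4 * a * c)
p1, p2 = (-b - disc) / (2 * a), (-b + disc) / (2 * a)   # the two exponents

# Step 4: illustrative conditions (assumed): y(2) = 3 and y'(2) = 0.
x0, y0, dy0 = 2.0, 3.0, 0.0

# Solve  A e^(p1 x0)    + B e^(p2 x0)    = y0
#        A p1 e^(p1 x0) + B p2 e^(p2 x0) = dy0   with Cramer's rule.
e1, e2 = math.exp(p1 * x0), math.exp(p2 * x0)
det = e1 * p2 * e2 - e2 * p1 * e1
A = (y0 * p2 * e2 - e2 * dy0) / det
B = (e1 * dy0 - p1 * e1 * y0) / det

def y(x):
    """Step 3: the general solution with the constants now fixed."""
    return A * math.exp(p1 * x) + B * math.exp(p2 * x)

def residual(x):
    """y'' - 6y' + 5y, using the exact derivatives of each exponential term."""
    t1, t2 = A * math.exp(p1 * x), B * math.exp(p2 * x)
    return (p1**2 * t1 + p2**2 * t2) - 6 * (p1 * t1 + p2 * t2) + 5 * (t1 + t2)

print(p1, p2)            # exponents from the characteristic equation
print(y(x0))             # reproduces the imposed value at x = 2
print(residual(0.5))     # zero up to rounding
```

The differential equation has been reduced to a quadratic plus a 2×2 linear system, which is the sense in which guessing turns calculus into algebra.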
Separation of variables allows you to solve partial differential equations in much the same
way. We will show below that by “guessing” a solution that changes in amplitude only, you
turn your partial differential equation into several ordinary differential equations. Solving
these equations gives you your normal modes with an infinite number of arbitrary constants
to match based on “initial” (time) and “boundary” (space) conditions. With the variables
separated, you can approach these two kinds of conditions separately.
The Problem
We’re going to demonstrate this new technique with a familiar problem. When we arrive at
the solution, we will recognize it from the previous sections. But this time we will derive the
solution in a way that can be applied to different problems.
Here, then, is our familiar problem: a string of length L obeys the wave equation:
$$ \frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2} \frac{\partial^2 y}{\partial t^2} \tag{11.4.2} $$
The string is fixed at both ends, which gives us our boundary conditions:
$$ y(0, t) = y(L, t) = 0 \tag{11.4.3} $$
To fully solve for the motion of the string, we need to know its initial position, and its initial
velocity. For the moment we will keep both of those generic:
$$ y(x, 0) = f(x) \quad \text{and} \quad \frac{\partial y}{\partial t}(x, 0) = g(x) \tag{11.4.4} $$
Solve for the motion of the string.
We begin by guessing a solution of the form y(x, t) = X(x)T(t), where X is a function of the variable x (but not of t), and T is a function of the variable t (but not of x).
It’s worth taking a moment to consider what sort of functions we are looking at. X (x)
represents the shape of the string at a given moment t. Since we have placed no restrictions
on the form of that function, it could turn out to be simple or complicated.
But how does that function evolve over time? T (t) is simply a number, positive or negative,
at any given moment. When you multiply X (x) by a number, you stretch it vertically. You may
also turn it upside down. But beyond those simple transformations, the function X (x) will not alter
in any way.
[Figure: a curve f(x) plotted together with 2f(x) and −f(x) on x and y axes.]
Wherever X (x) = 0, it will stay zero forever: that is to say, the places where the string crosses
the x-axis will never change. (The only exception is that, whenever T (t) happens to be zero,
the entire string is uniformly flat.) The x-values of the critical points, where the string reaches
a local maximum or minimum, will likewise never change.
A function of that form—changing in amplitude, but not in shape—is called a “normal
mode.” In guessing a solution of the form y(x, t) = X (x)T (t), we are asserting “There exists
a set of normal modes for this differential equation.” We will then plug in this function to
find the normal modes. If the initial state does not happen to correspond perfectly to a
normal mode (which it usually doesn’t) we will build it up as a sum of normal modes; since
we understand how each of them evolves in time, we can compute how the entire assemblage
evolves.³
Now comes the step that gives the technique of “separation of variables” its name. We rear-
range the equation so that all the t-dependency is on the right, and all the x-dependency is
on the left. We accomplish this (in this case and in fact in many cases) by dividing both sides
by X (x)T (t).
$$ \frac{X''(x)}{X(x)} = \frac{1}{v^2} \frac{T''(t)}{T(t)} \tag{11.4.5} $$
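The algebra behind Equation 11.4.5 can be written out explicitly. The following is a reconstruction from the wave equation 11.4.2 and the guess y(x, t) = X(x)T(t), offered as a sketch and not as a quotation of the intermediate step in the text:

```latex
\begin{align*}
y(x,t) = X(x)T(t)
  &\;\Longrightarrow\; \frac{\partial^2 y}{\partial x^2} = X''(x)\,T(t),
  \qquad \frac{\partial^2 y}{\partial t^2} = X(x)\,T''(t) \\
\text{substitute into (11.4.2):}\quad
  & X''(x)\,T(t) = \frac{1}{v^2}\,X(x)\,T''(t) \\
\text{divide by } X(x)T(t):\quad
  & \frac{X''(x)}{X(x)} = \frac{1}{v^2}\,\frac{T''(t)}{T(t)}
\end{align*}
```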
If x changes, does the right side of this equation change? The answer must be “no” since T (t)
has no x-dependency. That means—since the two sides of the equation must stay equal for
all values of x and t—that changing x cannot change the left side of the equation either.
3 When can you build up your initial conditions as a sum of normal modes? The answer, to make a long story short, is “almost always.” We’ll make that story long again when we discuss Sturm-Liouville theory in Chapter 12.
Similarly, changing t has no effect on the left side of the equation, and must therefore have no effect on the right side. If both sides of the equation are equal, and neither one depends on x or t, then they must be…(drum roll please)…a constant! Calling this constant P, we write:
$$ \frac{X''(x)}{X(x)} = P \quad \text{and} \quad \frac{1}{v^2} \frac{T''(t)}{T(t)} = P \tag{11.4.6} $$
Step 3: Solve the Spatial Equation and Match the Boundary Conditions
Instead of one partial differential equation, we now have two ordinary differential
equations—and pretty easy ones at that. We start with the spatial equation:
X ′′ (x) = PX (x)
The solution to this equation depends on the sign of P. In the following analysis we define a new real number k. Because k is real we use k² to mean “any positive P” and −k² to mean “any negative P.”
P > 0 (P = k²)     X′′(x) = k² X(x)     X(x) = Ae^{kx} + Be^{−kx}
P = 0              X′′(x) = 0           X(x) = mx + b
P < 0 (P = −k²)    X′′(x) = −k² X(x)    X(x) = A sin(kx) + B cos(kx)
The exponential⁴ and linear solutions can only satisfy our boundary condition (y = 0 at both
ends) with the “trivial” solution X (x) = 0. Unless our initial conditions place the string in an
unmoving horizontal line, this solution is inadequate.
On the other hand, sines and cosines are flexible enough to meet all our demands. We
can tailor them to meet our boundary conditions, and then we can sum the result to meet
whatever initial condition the string throws at us. We therefore declare that P is negative and
replace it with −k².
Our first boundary condition is y(0, t) = 0. Plugging x = 0 into A sin(kx) + B cos(kx) and
setting it equal to zero gives B = 0, so we have a sine without a cosine.
The next condition is y(L, t) = 0.
A sin(kL) = 0
One way to match this condition would be to set A = 0, which would bring us back to the
trivial y(x, t) = 0. The alternative is sin(kL) = 0, which can only be satisfied if kL = n𝜋 where
n is an integer. The case n = 0 gives us the trivial solution again, and negative values of n give
us the same solutions as positive values (just with different values of the arbitrary constant
A), so the complete set of non-trivial functions X (x) is:
$$ X(x) = A \sin(kx) \quad \left(k = \frac{n\pi}{L} \text{ for all positive integers } n\right) \tag{11.4.7} $$
We need to stress that we are not saying that our string must, at any given time, take the shape
y = A sin(n𝜋x∕L). We are saying, instead, that Equation 11.4.7 defines the normal modes of
the string. If the string is described by that equation then it will evolve simply in time—we’ll
figure out exactly how in a moment. If the string is not in such a shape then we will write it
as a sum of such functions and then evolve each one independently.
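As a quick numerical sanity check on Equation 11.4.7, the sketch below confirms that sin(n𝜋x∕L) vanishes at both ends and satisfies X′′ = −k²X. The length `L = 2.0`, the step size, and the helper names are assumptions made for illustration only.

```python
import math

L = 2.0  # assumed string length (illustrative value)

def X(n, x, A=1.0):
    """Spatial normal mode A sin(k x) with k = n*pi/L (Equation 11.4.7)."""
    return A * math.sin(n * math.pi * x / L)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def max_ode_error(n, samples=50):
    """Largest |X'' + k^2 X| over interior sample points."""
    k = n * math.pi / L
    worst = 0.0
    for i in range(1, samples):
        x = L * i / samples
        err = abs(second_derivative(lambda s: X(n, s), x) + k**2 * X(n, x))
        worst = max(worst, err)
    return worst

for n in (1, 2, 3):
    # Columns: n, value at x = 0, value at x = L, worst ODE error.
    print(n, X(n, 0.0), X(n, L), max_ode_error(n))
```

The end values are zero up to rounding, and the ODE error is limited only by the finite-difference step.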
4 The exponential solution can also be written with hyperbolic trig functions as A sinh(kx) + B cosh(kx). This is often
more convenient for matching boundary conditions, but it does not change anything fundamental such as the
inability to meet these particular conditions.
$$ \frac{1}{v^2} \frac{T''(t)}{T(t)} = -k^2 $$
We can quickly rewrite this as T′′(t) = −k²v²T(t), which has a similar form, and therefore a similar solution, to our spatial equation:
$$ T(t) = C \sin(kvt) + D \cos(kvt) $$
We found that to match the boundary conditions for X (x) we needed k = n𝜋∕L, and (again)
this is the same k. The solution we are looking for is a product of the X (x) and T (t) functions.
$$ X(x)T(t) = \sin\left(\frac{n\pi}{L}x\right)\left[C \sin\left(\frac{nv\pi}{L}t\right) + D \cos\left(\frac{nv\pi}{L}t\right)\right] \quad \text{where } n = 1, 2, 3, \ldots \tag{11.4.8} $$
We have absorbed the arbitrary constant A into the arbitrary constants C and D. Why can we
do that? Remember that A simply means “you can put any constant here” and C also means
“you can put any constant here,” so AC simply stands for “any constant.” We can call this new
constant C with no loss of generality.
For any positive integer n and any values of C and D the function 11.4.8 is a valid solution
to the differential equation 11.4.2 and matches the boundary conditions 11.4.3.
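That claim can be spot-checked with finite differences. In the minimal sketch below the values of `L`, `v`, `C`, and `D`, the sample point, and the helper names are all arbitrary assumptions; any other choice should behave the same way.

```python
import math

L, v = 1.0, 2.0   # assumed length and wave speed (illustrative)

def mode(n, C, D):
    """Return y(x, t) for the normal mode of Equation 11.4.8."""
    k = n * math.pi / L
    def y(x, t):
        return math.sin(k * x) * (C * math.sin(k * v * t) + D * math.cos(k * v * t))
    return y

def wave_residual(y, x, t, h=1e-4):
    """Finite-difference estimate of y_xx - (1/v^2) y_tt at the point (x, t)."""
    y_xx = (y(x + h, t) - 2 * y(x, t) + y(x - h, t)) / h**2
    y_tt = (y(x, t + h) - 2 * y(x, t) + y(x, t - h)) / h**2
    return y_xx - y_tt / v**2

y3 = mode(3, C=0.5, D=-1.2)
print(wave_residual(y3, 0.37, 0.81))   # close to zero
print(y3(0.0, 0.81), y3(L, 0.81))      # both ends stay at zero
```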
Equation 11.4.9 has two different frequencies⁵ and it’s important not to get them confused.
3𝜋∕L is a quantity in space, not time—it means that the wavelength of the string is 2L∕3.
The wavelengths of our normal modes were determined by the boundary conditions y(0, t) =
y(L, t) = 0. 3v𝜋∕L is a quantity in time, not space—it means that the string will return to its
starting position every 2L∕(3v) seconds. The fact that this frequency is v times the spatial one
is a result of the PDE we are solving.
Equation 11.4.9 tells us that if the string starts at rest in a perfect sine wave with spatial
frequency 3𝜋∕L then it is in a normal mode and will retain its basic shape while its amplitude
oscillates with temporal frequency 3v𝜋∕L.
That was a normal mode with n = 3. Every positive integer n corresponds to a different
wavelength. Every such wavelength fits our boundary conditions, and every such mode will
oscillate with a different period in time. The amplitude of each normal mode can be any-
thing, represented by the arbitrary constants C and D.
5 Our word “frequency,” whether in space or time, is shorthand for the quantity that should more properly be termed “angular frequency.”
Because we are solving a linear homogeneous PDE with homogeneous boundary conditions, any linear combination of these solutions is itself a solution. So sin(𝜋x∕L)[3 sin(v𝜋t∕L) + 4 cos(v𝜋t∕L)] is a solution, and sin(2𝜋x∕L)[5 sin(2v𝜋t∕L) − 8 cos(2v𝜋t∕L)] is another solution, and if you sum those two functions you get yet another solution. To find the general solution we sum all possible solutions.
$$ y(x, t) = \sum_{n=1}^{\infty} \sin\left(\frac{n\pi}{L}x\right)\left[C_n \sin\left(\frac{nv\pi}{L}t\right) + D_n \cos\left(\frac{nv\pi}{L}t\right)\right] \tag{11.4.10} $$
We’ve added a subscript to the constants C and D because they can take on different values
for each value of n.
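$$ \frac{\partial y}{\partial t} = \sum_{n=1}^{\infty} \sin\left(\frac{n\pi}{L}x\right)\left[C_n \left(\frac{nv\pi}{L}\right) \cos\left(\frac{nv\pi}{L}t\right) - D_n \left(\frac{nv\pi}{L}\right) \sin\left(\frac{nv\pi}{L}t\right)\right] \tag{11.4.11} $$
$$ y(x, 0) = \sum_{n=1}^{\infty} D_n \sin\left(\frac{n\pi}{L}x\right) = f(x) \tag{11.4.12} $$
$$ \frac{\partial y}{\partial t}(x, 0) = \sum_{n=1}^{\infty} C_n \left(\frac{nv\pi}{L}\right) \sin\left(\frac{n\pi}{L}x\right) = g(x) \tag{11.4.13} $$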
It is now time to relate these equations to the results we saw in the previous section. We said
that if the string is in a normal mode, its motion will be very simple. If it is not in a normal
mode, we can understand its motion by building the initial conditions as a sum of normal
modes.
Let’s start with a very simple example: the initial position f(x) = 3 sin(5𝜋x∕L), and the initial velocity g(x) = 0. Can you see what this does to our coefficients in Equations 11.4.12 and 11.4.13? It means that D_5 = 3, that D_n = 0 for all n other than 5, and that C_n = 0 for all n. In other words, y(x, t) = 3 cos(5v𝜋t∕L) sin(5𝜋x∕L).
On the other hand, what if f(x) is not quite so convenient? We are left with using Equation 11.4.12 to solve for all the D_n coefficients to match the initial position. But that is exactly what we do when we create a Fourier sine series: we find the coefficients to build an arbitrary function as a series of sine waves. You may want to quickly review that process in Chapter 9. Here we are going to jump straight to the answer: D_n = (2∕L) ∫_0^L f(x) sin(n𝜋x∕L) dx.
Similar arguments apply to the initial velocity and C_n, with the important caveat that the Fourier coefficients in Equation 11.4.13 are C_n(nv𝜋∕L). So we can write C_n(nv𝜋∕L) = (2∕L) ∫_0^L g(x) sin(n𝜋x∕L) dx.
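When the integrals are not convenient to do by hand, the coefficients can be estimated numerically. The sketch below uses composite Simpson's rule, which is our choice of quadrature and not something the text prescribes. The function names and the values of `L` and `v` are assumptions. It is checked against the simple example f(x) = 3 sin(5𝜋x∕L), g(x) = 0.

```python
import math

def sine_coefficient(func, n, L, panels=2000):
    """(2/L) * integral_0^L func(x) sin(n pi x / L) dx via composite Simpson's rule."""
    if panels % 2:
        panels += 1                      # Simpson's rule needs an even panel count
    h = L / panels
    total = 0.0
    for i in range(panels + 1):
        x = i * h
        weight = 1 if i in (0, panels) else (4 if i % 2 else 2)
        total += weight * func(x) * math.sin(n * math.pi * x / L)
    return (2.0 / L) * total * h / 3.0

def coefficients(f, g, n, L, v):
    """Return (C_n, D_n) for initial position f and initial velocity g."""
    D_n = sine_coefficient(f, n, L)
    # The Fourier sine coefficient of g equals C_n * (n v pi / L), so divide it out.
    C_n = sine_coefficient(g, n, L) * L / (n * v * math.pi)
    return C_n, D_n

L, v = 1.0, 2.0                                   # assumed values
f = lambda x: 3 * math.sin(5 * math.pi * x / L)   # the simple example from the text
g = lambda x: 0.0
print(coefficients(f, g, 5, L, v))   # D_5 close to 3, C_5 equal to 0
print(coefficients(f, g, 2, L, v))   # both close to 0
```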
We’re done! The complete solution to Equation 11.4.2 subject to the boundary conditions
11.4.3 and the initial conditions 11.4.4 is:
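$$ y(x, t) = \sum_{n=1}^{\infty} \sin\left(\frac{n\pi}{L}x\right)\left[C_n \sin\left(\frac{nv\pi}{L}t\right) + D_n \cos\left(\frac{nv\pi}{L}t\right)\right] \tag{11.4.14} $$
$$ C_n = \frac{2}{nv\pi} \int_0^L g(x) \sin\left(\frac{n\pi}{L}x\right) dx, \qquad D_n = \frac{2}{L} \int_0^L f(x) \sin\left(\frac{n\pi}{L}x\right) dx $$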
In the problems you’ll evaluate this solution analytically and numerically for various initial
conditions f (x) and g (x). For some simple initial conditions you may be able to evaluate this
sum explicitly, but for others you can use partial sums to get numerical answers to whatever
accuracy you need.
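A partial sum of Equation 11.4.14 is straightforward to evaluate on a computer. The sketch below is one possible arrangement, not a prescribed method: the triangular “plucked” shape, the midpoint-rule integration, the choice of 40 terms, and all names are assumptions made for illustration.

```python
import math

def sine_integral(func, n, L, panels=4000):
    """Midpoint-rule estimate of integral_0^L func(x) sin(n pi x / L) dx."""
    h = L / panels
    return h * sum(func((i + 0.5) * h) * math.sin(n * math.pi * (i + 0.5) * h / L)
                   for i in range(panels))

def make_solution(f, g, L, v, terms):
    """Build the partial sum of Equation 11.4.14 with the given number of terms."""
    coeffs = []
    for n in range(1, terms + 1):
        C_n = 2.0 / (n * v * math.pi) * sine_integral(g, n, L)
        D_n = 2.0 / L * sine_integral(f, n, L)
        coeffs.append((n, C_n, D_n))

    def y(x, t):
        total = 0.0
        for n, C_n, D_n in coeffs:
            w = n * v * math.pi / L      # temporal angular frequency of mode n
            total += math.sin(n * math.pi * x / L) * (
                C_n * math.sin(w * t) + D_n * math.cos(w * t))
        return total
    return y

L, v = 1.0, 1.0                                  # assumed values
pluck = lambda x: x if x <= L / 2 else L - x     # hypothetical triangular initial shape
at_rest = lambda x: 0.0
y = make_solution(pluck, at_rest, L, v, terms=40)
print(y(0.3, 0.0), pluck(0.3))   # the partial sum approximates the initial shape
print(y(0.3, 2 * L / v))         # after a time 2L/v every mode has completed whole cycles
```

Increasing `terms` tightens the agreement with the initial shape, which is the sense in which partial sums give answers to whatever accuracy is needed.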
$$ 4\frac{\partial z}{\partial t} - 9\frac{\partial^2 z}{\partial x^2} - 5z = 0 $$
on the domain 0 ≤ x ≤ 6, t ≥ 0 subject to the boundary conditions z(0, t) = z(6, t) = 0
and the initial condition z(x, 0) = sin²(𝜋x∕6).
1. Assume a solution of the form z(x, t) = X (x)T (t).
2. Plug this solution into the original differential equation and obtain
$$ \frac{4T'(t)}{T(t)} - 5 = \frac{9X''(x)}{X(x)} $$
$$ \frac{4T'(t)}{T(t)} - 5 = P \quad \text{and} \quad \frac{9X''(x)}{X(x)} = P $$
5. Because both our equation and our boundary conditions are homogeneous, any
linear combination of solutions is a solution:
$$ z(x, t) = \sum_{n=1}^{\infty} D_n \sin\left(\frac{\pi n}{6}x\right) e^{(-\pi^2 n^2 + 20)t/16} $$
$$ \sin^2\left(\frac{\pi x}{6}\right) = \sum_{n=1}^{\infty} D_n \sin\left(\frac{\pi n}{6}x\right) $$
This is a Fourier sine series. The formula for the coefficients is in Appendix G.
The resulting integral looks messy but it can be solved by using Euler’s formula
and some trig identities or by just plugging it into a computer program. The
result is
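$$ D_n = \frac{2}{6} \int_0^6 \sin^2\left(\frac{\pi x}{6}\right) \sin\left(\frac{n\pi x}{6}\right) dx = \begin{cases} 8/\left(4n\pi - n^3\pi\right) & n \text{ odd} \\ 0 & n \text{ even} \end{cases} $$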
So the solution is
$$ z(x, t) = \sum_{n=1}^{\infty} \frac{8}{4n\pi - n^3\pi} \sin\left(\frac{\pi n}{6}x\right) e^{(-\pi^2 n^2 + 20)t/16}, \quad n \text{ odd} $$
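The coefficient formula and the final series can be cross-checked numerically. This is a sketch only: the midpoint-rule integration, the truncation at n = 41, and the function names are our assumptions, not part of the example.

```python
import math

def D_formula(n):
    """Closed-form coefficient from the example: 8/(4 n pi - n^3 pi) for odd n, else 0."""
    return 8.0 / (4 * n * math.pi - n**3 * math.pi) if n % 2 else 0.0

def D_numeric(n, panels=6000):
    """(2/6) * integral_0^6 sin^2(pi x/6) sin(n pi x/6) dx by the midpoint rule."""
    h = 6.0 / panels
    total = 0.0
    for i in range(panels):
        x = (i + 0.5) * h
        total += math.sin(math.pi * x / 6) ** 2 * math.sin(n * math.pi * x / 6)
    return (2.0 / 6.0) * total * h

def z(x, t, terms=41):
    """Partial sum of the series solution over odd n up to `terms`."""
    return sum(D_formula(n) * math.sin(math.pi * n * x / 6)
               * math.exp((-math.pi**2 * n**2 + 20) * t / 16)
               for n in range(1, terms + 1, 2))

for n in range(1, 6):
    print(n, D_formula(n), D_numeric(n))               # the two columns should agree
print(z(3.0, 0.0), math.sin(math.pi * 3.0 / 6) ** 2)   # initial condition at x = 3
```

The exponent (20 − 𝜋²n²)∕16 is positive for n = 1 and negative for every larger n, so at late times the n = 1 term grows while the higher modes decay.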
Stepping Back Part I: Solving Problems that are Just Like this One
It may seem like the process above involved a number of tricks that might be hard to apply
to other problems. But the method of separation of variables is actually very general, and the
steps you take are pretty consistent.
1. Begin by assuming a solution which is a simple product of single-variable functions: in
other words, a normal mode.
2. Substitute this assumed solution into the original PDE. Then try to “separate the vari-
ables,” algebraically rearranging so that each side of the equation depends on only
one variable. (This often involves dividing both sides of the equation by X (x)T (t).)
Set both sides of the equation equal to a constant, thus turning one partial differen-
tial equation into two ordinary differential equations. Depending on the boundary
conditions, you may be able to quickly identify the sign of the separation constant.
3. Solve the spatial equation and substitute in your boundary conditions to restrict your
solutions. This step may eliminate some solutions (such as the cosines in the example
above) and/or restrict the possible values of the separation constant.
4. Solve the time equation and multiply it by your spatial solution. Since both solutions
will have arbitrary constants in front, you can often absorb the arbitrary constants from
the spatial solution into those from the time solution. The resulting functions are the
normal modes of your system.
5. Since your equation is linear and homogeneous, the linear combination of all these
normal modes provides the most general solution that satisfies the differential
equation and the boundary conditions. This solution will be an infinite series.
6. Finally, match your solution to your initial conditions to find the coefficients in that
series. For instance, if your normal modes were sines and cosines, then you are build-
ing the initial conditions as Fourier series.
In the examples above the boundary conditions were homogeneous while the initial condi-
tions were not. If two solutions y1 and y2 satisfy the boundary conditions y(0, t) = y(L, t) = 0
then the sum y1 + y2 also satisfies these boundary conditions. That’s why we can apply these
boundary conditions to each normal mode individually, confident that the sum of all normal
modes will meet the same conditions.
By contrast if two solutions y1 and y2 individually satisfy the initial condition y(x, 0) = f (x),
their sum (y1 + y2 )(x, 0) will add up to 2f (x). So we have to apply the condition y(x, 0) = f (x)
to the entire series after summing, rather than to the individual parts.
It’s fairly common for the boundary conditions to be homogeneous and the initial condi-
tions inhomogeneous. In such cases, you will follow the pattern we gave here: first apply the
boundary conditions to X(x), then sum all solutions into a series, and then apply the initial
conditions to ∑X(x)T(t).
Stepping Back Part II: Solving Problems that are a Little Bit Different
Many differential equations cannot be solved by the exact method described above. We list
here the primary variations you may encounter, many of which are discussed later in this
chapter.
∙ You can’t separate the variables. If you cannot algebraically separate your variables—
for instance, if the differential equation is inhomogeneous—this particular technique
will obviously not work. You may still be able to solve the differential equation using
other techniques, some of which are discussed later in this chapter and some of which
are discussed in textbooks on partial differential equations.
∙ Your boundary conditions are inhomogeneous. Just as with ODEs, you can add a
“particular” solution to a “complementary” solution. We discuss this situation in
Section 11.8.
∙ Your initial conditions are homogeneous. You’ll show in Problem 11.97 why separation
of variables doesn’t generally work for homogeneous initial conditions. Such a problem
forces you to find an alternative method; some alternatives are discussed in this chapter.
∙ You have more than two independent variables. Do separation of variables multiple
times until you have one ordinary differential equation for each independent variable.
We’ll solve such an example in Section 11.5.
∙ You have more than one spatial variable and no time variable. If you have inhomoge-
neous boundary conditions along one boundary, and homogeneous boundary condi-
tions along the others, you can treat the inhomogeneous boundary the way we treated
the initial conditions above. (Apply the homogeneous boundary conditions, then sum,
and then apply the inhomogeneous boundary condition.) You already know everything
you need for such a problem, so you’ll try your hand at it in Problem 11.96. If you have
more than one inhomogeneous boundary condition, you need to apply the method
described in Section 11.8.
∙ Your normal modes are not sines and cosines. Separation of variables gives you an ordi-
nary differential equation that you can solve to find the normal modes of the partial
differential equation you started with. Besides trig functions these normal modes may
take the form of Bessel functions, Legendre polynomials, spherical harmonics, and
many other categories of functions. If you can find the normal modes, and if you can
build your initial conditions as a sum of normal modes, the overall process remains
exactly the same. See Sections 11.6 and 11.7.
∙ Your differential equation is non-linear. You’re pretty much out of luck. Separation of
variables will not work for a non-linear equation because it relies on writing the solution
as a linear combination of normal modes. In fact, there are very few non-linear partial
differential equations that can be analytically solved by any method. When scientists
and engineers need to solve a non-linear partial differential equation, which they do
quite often, they generally have to find a way to approximate it with a linear equation
or solve it numerically.
solution as a series and write expressions for the coefficients in the series, as we did for the wave equation. When specific initial conditions are given, solve for the coefficients. The solution may still be in the form of a series. It may help to first work through Problem 11.82 as a model.
11.84 𝜕y∕𝜕t = c²(𝜕²y∕𝜕x²), y(x, 0) = f(x)
11.85 𝜕y∕𝜕t = c²(𝜕²y∕𝜕x²), y(x, 0) = sin(𝜋x∕L)
11.86 𝜕y∕𝜕t + y = c²(𝜕²y∕𝜕x²), y(x, 0) = f(x)
11.87 𝜕²y∕𝜕t² + y = c²(𝜕²y∕𝜕x²), y(x, 0) = f(x), 𝜕y∕𝜕t(x, 0) = 0
11.88 𝜕²y∕𝜕t² + y = c²(𝜕²y∕𝜕x²),
$$ y(x, 0) = \begin{cases} x & 0 \le x \le L/2 \\ L - x & L/2 < x \le L \end{cases}, \qquad \frac{\partial y}{\partial t}(x, 0) = \begin{cases} -x & 0 \le x \le L/2 \\ x - L & L/2 < x \le L \end{cases} $$
11.89 𝜕⁴y∕𝜕t⁴ = −c²(𝜕²y∕𝜕x²), y(x, 0) = ẏ(x, 0) = ÿ(x, 0) = 0, 𝜕³y∕𝜕t³(x, 0) = sin(3𝜋x∕L). Hint: the algebra in this problem will be a little easier if you use hyperbolic trig functions. If you aren’t familiar with them you can still do the problem without them.
11.90 𝜕y∕𝜕t = c²t(𝜕²y∕𝜕x²), y(x, 0) = y_0 sin(𝜋x∕L)
Problems 11.91–11.94 refer to a rod with temperature held fixed at the ends: u(0, t) = u(L, t) = 0. For each set of initial conditions write the complete solution u(x, t). You will need to begin by using separation of variables to solve the heat equation (11.2.3), but if you do more than one of these you should just find the general solution once.
11.91 u(x, 0) = u_0 sin(2𝜋x∕L)
11.92
$$ u(x, 0) = \begin{cases} cx & 0 < x < L/2 \\ c(L - x) & L/2 < x < L \end{cases} $$
11.93
$$ u(x, 0) = \begin{cases} 0 & 0 < x < L/3 \\ u_0 & L/3 < x < 2L/3 \\ 0 & 2L/3 < x < L \end{cases} $$
11.94 u(x, 0) = 0. Show how you can find the solution to this the same way you would for Problems 11.91–11.93, and also explain how you could have predicted this solution without doing any calculations.
11.95 As we solved the “string” problem in the Explanation (Section 11.4.2) we determined that a positive separation constant P leads to the solution X(x) = Ae^{kx} + Be^{−kx}. We then discarded this solution based on the argument that it cannot meet the boundary conditions X(0) = X(L) = 0. Show that there is no possible way for that solution to meet those boundary conditions unless A = B = 0. Then explain why we discard that solution too.
11.96 Given enough time, any isolated region of space will tend to approach a steady state where the temperature at each point is unchanging. In this case the time derivative in the heat equation becomes zero and the temperature obeys Laplace’s equation 11.2.5. In two dimensions this can be written as 𝜕²u∕𝜕x² + 𝜕²u∕𝜕y² = 0. In this problem you will use separation of variables to solve for the steady-state temperature u(x, y) on a rectangular slab subject to the boundary conditions u(0, y) = u(x_f, y) = u(x, 0) = 0, u(x, y_f) = u_0.
[Figure: rectangular slab spanning 0 ≤ x ≤ x_f and 0 ≤ y ≤ y_f, with u = 0 on the left, right, and bottom edges and u = u_0 on the top edge.]
(a) Separate variables to get ordinary differential equations for the functions X(x) and Y(y).
(b) Solve the equation for X(x) subject to the boundary conditions X(0) = X(x_f) = 0.
(c) Solve the equation for Y(y) subject to the homogeneous boundary condition Y(0) = 0. Your solution should have one undetermined constant in it corresponding to the one boundary condition you have not yet imposed.
(d) Write the solution u(x, y) as a sum of normal modes.
(e) Use the final boundary condition u(x, y_f) = u_0 to solve for the remaining coefficients and find the general solution.
(f) Check your solution by verifying that it solves Laplace’s equation and meets each of the homogeneous boundary conditions given above.
(g) Use a computer to plot the 40th partial sum of your solution. (You will have to choose some values for the constants in the problem.) Looking at your plot, describe how the temperature depends on y for a fixed value of x (other than x = 0 or x = x_f). You should see that it goes from u = 0 at y = 0 to u = u_0 at y = y_f. Does it increase linearly? If not describe what it does.
11.97 We said in the Explanation (Section 11.4.2) that separation of variables generally doesn’t work for an initial value problem when the
initial conditions are homogeneous. To illustrate why, consider once again the wave equation 11.4.2 for a string stretched from x = 0 to x = L. Assume the initial position and velocity of the string are both zero.
(a) If the boundary conditions are homogeneous, y(0, t) = y(L, t) = 0, what is the solution? (You can figure this one out without doing any math: just think about it.)
Note that this case is easy to solve but not very interesting or useful. For the rest of the problem we will therefore assume that the boundary conditions are not entirely homogeneous, and for simplicity we’ll take the boundary conditions y(0, t) = 0, y(L, t) = H.
(b) Explain why you need to apply the initial conditions to each T(t) function separately, and then apply the boundary conditions to the entire sum.
(c) Separate variables and write the resulting ODEs for X(x) and T(t).
(d) Solve the equation for T(t) with the initial conditions given above. (Choose the sign of the separation constant to lead to sinusoidal solutions, not exponential or linear.)
Using your solution, explain why separation of variables cannot be used to find any non-trivial solutions to this problem.
11.98 Solve Problem 11.97 using the heat equation (11.2.3) instead of the wave equation (11.4.2).
11.99 In quantum mechanics a particle is described by a “wavefunction” Ψ(x, t) which obeys the Schrödinger equation:
$$ -\frac{\hbar^2}{2m} \frac{\partial^2 \Psi}{\partial x^2} + V(x)\Psi = i\hbar \frac{\partial \Psi}{\partial t} $$
(We are considering only one spatial dimension for simplicity.) This is really a whole family of PDEs, one for each possible potential function V(x). Nonetheless we can make good progress without knowing anything about the potential function.
(a) Plug in a trial solution of the form Ψ(x, t) = 𝜓(x)T(t) and separate variables. Call the separation constant E. (It turns out that each normal mode, called an “energy eigenstate” in quantum mechanics, represents the state of the particle with energy E.)
(b) Solve the ODE for T(t). Describe in words the time dependence of an energy eigenstate.
Your work above applies to any one-dimensional Schrödinger equation problem; further work depends on the particular potential function. For the rest of the problem you’ll consider a “particle in a box” that experiences no force inside the box but cannot move out of the box. In one dimension that means V(x) = 0 on 0 ≤ x ≤ L with the boundary conditions 𝜓(0) = 𝜓(L) = 0.
(c) Solve your separated equation using this V(x) and boundary conditions to find all the allowable values of E; in other words, find the energy levels of this system.
(d) Write the general solution for 𝜓(x) as a sum over all individual solutions. Your answer will not explicitly involve E.
(e) Write the general solution for Ψ(x, t). Use the values of E you found so that your answer has n in it but not E.
(f) Find Ψ(x, t) for the initial condition Ψ(x, 0) = 𝜓_0 (a constant) for L∕4 ≤ x ≤ 3L∕4 and 0 elsewhere.
11.100 Exploration: Laplace’s Equation on a Disk
The steady-state temperature in an isolated region of space obeys Laplace’s equation (11.2.5). In this problem you are going to solve for the steady-state temperature on a disk of radius a, where the temperature on the boundary of the disk is given by u(a, 𝜙) = u_0 sin(k𝜙). Laplace’s equation will be easiest to solve in polar coordinates, so you are looking for a function u(𝜌, 𝜙). (Feel free to try this problem in Cartesian coordinates. You’ll feel great about it until you try to apply the boundary conditions.) In polar coordinates Laplace’s equation can be written as:
$$ \frac{\partial^2 u}{\partial \rho^2} + \frac{1}{\rho^2} \frac{\partial^2 u}{\partial \phi^2} + \frac{1}{\rho} \frac{\partial u}{\partial \rho} = 0 $$
(a) This problem has an “implicit,” or unstated, boundary condition that u(𝜙 + 2𝜋) = u(𝜙). Explain how we know this boundary condition must be followed even though it was never stated in the problem. Because of that implicit boundary condition the constant k in the boundary conditions cannot be just any real number. What values of k are allowed, and why?
(b) Plug in the initial guess R(𝜌)Φ(𝜙) and separate variables in Laplace’s equation.
(c) Setting both sides of the separated equation equal to a constant P, find the general real solution to the
7in x 10in Felder c11.tex V3 - February 27, 2015 3:43 P.M. Page 580
ODE for Φ(𝜙) for the three cases (f) There is another implicit condition: R(0)
P > 0, P = 0, and P < 0. has to be finite. Use that condition to
(d) Use the implicit boundary condition show that one of the two arbitrary con-
from Part (a) to determine which of the stants in your solution for R(𝜌) must be
three solutions you found is the correct zero. (You can assume that p > 0.)
one. Now that you know what sign the (g) Multiply your solutions for Φ(𝜙) and
separation constant must have, rename R(𝜌), combining arbitrary constants as
it either p 2 or −p 2 . Use the period of 2𝜋 much as possible, and write the general
to constrain the possible values of p. solution u(𝜌, 𝜙) as an infinite series.
(e) Write the ODE for R(𝜌). Solve it by (h) Plug in the boundary condition u(a, 𝜙) =
plugging in a guess of the form R(𝜌) = u0 sin(k𝜙) and use it to find the values
𝜌c and solve for the two possible val- of the arbitrary constants. (You could
ues of c. Since the ODE is linear you use the formula for the coefficients of
can then combine these two solutions a Fourier sine series but you can do
with arbitrary constants in front. Your it more simply by inspection.)
answer should have p in it. (i) Write the solution u(𝜌, 𝜙).
$$\frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2} = \frac{1}{v^2}\frac{\partial^2 z}{\partial t^2} \qquad \text{the wave equation in two dimensions} \tag{11.5.1}$$
The boundary condition is that z = 0 on all four edges of the rectangle. The initial conditions are z(x, y, 0) = z₀(x, y), ż(x, y, 0) = v₀(x, y).
1. Begin with a “guess” of the form z(x, y, t) = X (x)Y (y)T (t). Substitute this expression
into Equation 11.5.1.
2. Separate the variables so that the left side of the equation depends on x (not on y or
t), and the right side depends on y and t (not on x). (This will require two steps.)
3. Explain why both sides of this equation must now equal a constant.
4. The function X (x) must satisfy the conditions X (0) = 0 and X (w) = 0. Based on this
restriction, is the separation constant positive or negative? Explain your reasoning.
If positive, call it k 2 ; if negative, −k 2 .
5. Solve the resulting differential equation for X (x).
6. Plug in both boundary conditions on X (x). The result should allow you to solve for
one of the arbitrary constants and express the real number k in terms of an integer-
valued n.
See Check Yourself #73 in Appendix L
7. You have another equation involving Y (y) and T (t) as well as k. Separate the variables
in this equation so that the Y (y) terms are on one side, and the T (t) and k terms on
the other. Both sides of this equation must equal a constant, but that constant is not
the same as our previous constant.
8. The function Y (y) must satisfy the conditions Y (0) = 0 and Y (h) = 0. Based on this
restriction, is the new separation constant positive or negative? If positive, call it p 2 ;
if negative, −p 2 .
9. Solve the resulting differential equation for Y (y).
10. Plug in both boundary conditions on Y (y). The result should allow you to solve for
one of the arbitrary constants and express the real number p in terms of an integer-
valued m.
Important: p is not necessarily the same as k, and m is not necessarily the same as n. These new
constants are unrelated to the old ones.
11. Solve the differential equation for T (t). (The result will be expressed in terms of both
k and p, or in terms of both n and m.)
12. Write the solution X (x)Y (y)T (t). Combine arbitrary constants as much as possible. If
your answer has any real-valued k or p constants, replace them with the integer-valued
n and m constants by using the formulas you found from boundary conditions.
13. The general solution is a sum over all possible n- and m-values. For instance, there
is one solution where n = 3 and m = 5, and another solution where n = 12 and
m = 5, and so on. Write the double sum that represents the general solution to this
equation.
See Check Yourself #74 in Appendix L
14. Write equations relating the initial conditions z0 (x, y) and v0 (x, y) to your arbitrary
constants. These equations will involve sums. Solving them requires finding the coef-
ficients of a double Fourier series, but for now it’s enough to just write the equations.
The Problem
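The steady-state electric potential in a region with no charged particles follows Laplace’s equation ∇²V = 0. In three-dimensional Cartesian coordinates, this equation can be written:
$$\frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} = 0 \tag{11.5.2}$$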
We plug that guess into Laplace’s equation (11.5.2) and divide both sides by X (x)Y (y)Z (z) to
separate variables.
We now bring one of these terms to the other side. We’ve chosen the y-term below, but we
could have chosen x just as easily. It would be a bit tougher if we chose z (the variable with
an inhomogeneous boundary condition); we’ll explain why in a moment.
You may now object that the variables are not entirely separated. (How could they be, with
three variables and only two sides of the equation?) But the essential relationship still holds:
the left side of the equation depends only on y, and the right side of the equation depends
only on x and z, so the only function both sides can equal is a constant.
When you set the left side of Equation 11.5.3 equal to a constant you get the same problem
we solved for X (x) in Section 11.4, so let’s just briefly review where it goes.
∙ We are solving Y ′′ (y)∕Y (y) = <a constant>, with the boundary conditions Y (0) =
Y (L) = 0.
∙ A positive constant would lead to an exponential solution and a zero constant to a
linear solution, neither of which could meet those boundary conditions.
∙ We therefore call the constant −k 2 and after a bit of algebra arrive at the solution.
$$Y(y) = A\sin(ky) \quad\text{where } k = \frac{n\pi}{L} \text{ where } n \text{ can be any positive integer}$$
Now you can see why we start by isolating a variable with homogeneous boundary conditions.
The inhomogeneous condition for Z (z) cannot be applied until after we build our series, so
it would not have determined the sign of our separation constant.
Both sides of this equation must equal a constant, and this constant must be negative. (Can
you explain why?) However, this new constant does not have to equal our existing −k 2 ; the
two are entirely independent. We will call the new constant −p 2 .
$$\frac{X''(x)}{X(x)} = -p^2 \quad\text{and}\quad k^2 - \frac{Z''(z)}{Z(z)} = -p^2$$
The X (x) differential equation and boundary conditions are the same as the Y (y) problem
that we solved above. The solution is therefore also the same, with one twist: p cannot also
equal n𝜋∕L because that would make the two constants the same. Instead we introduce a
new variable m that must be a positive integer, but not necessarily the same integer as n.
$$X(x) = B\sin(px) \quad\text{where } p = \frac{m\pi}{L} \text{ where } m \text{ can be any positive integer}$$
$$k^2 - \frac{Z''(z)}{Z(z)} = -p^2$$
Rewriting this as $Z''(z) = \left(k^2 + p^2\right)Z(z)$, we see that the positive constant requires an exponential solution.
$$Z(z) = Ce^{\sqrt{k^2+p^2}\,z} + De^{-\sqrt{k^2+p^2}\,z}$$
(You can write $\sinh\left(\sqrt{k^2+p^2}\,z\right)$ instead of $e^{\sqrt{k^2+p^2}\,z} - e^{-\sqrt{k^2+p^2}\,z}$. It amounts to the same thing with a minor change in the arbitrary constant.)
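As a short worked step (a sketch, assuming the homogeneous condition Z(0) = 0 that follows from V = 0 on the face z = 0, and writing 𝜅 as our own shorthand for √(k² + p²)), here is how the difference of exponentials arises and why it matches sinh:

$$Z(0) = C + D = 0 \;\Rightarrow\; D = -C, \qquad Z(z) = C\left(e^{\kappa z} - e^{-\kappa z}\right) = 2C\sinh(\kappa z)$$

The factor of 2 is absorbed into the arbitrary constant, which is the "minor change" mentioned above.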
When we solved for X (x) and Y (y), we applied their relevant boundary conditions as soon
as we had solved the equations. Remember, however, that the boundary condition at z = L
is inhomogeneous; we therefore cannot apply it until we combine the three solutions and
write a series.
coefficient A. There is another solution with n = 23 and m = 2, which could have a different
coefficient A. In order to sum all possible solutions, we therefore need a double sum.
$$V(x, y, z) = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} A_{mn}\sin\left(\frac{m\pi}{L}x\right)\sin\left(\frac{n\pi}{L}y\right)\left(e^{(\pi/L)\sqrt{n^2+m^2}\,z} - e^{-(\pi/L)\sqrt{n^2+m^2}\,z}\right) \tag{11.5.4}$$
We have introduced the subscript Amn to indicate that for each choice of m and n there is a
different free choice of A.
$$\sum_{n=1}^{\infty}\sum_{m=1}^{\infty} A_{mn}\sin\left(\frac{m\pi}{L}x\right)\sin\left(\frac{n\pi}{L}y\right)\left(e^{\pi\sqrt{n^2+m^2}} - e^{-\pi\sqrt{n^2+m^2}}\right) = f(x, y)$$
At this point in our previous section, building up the (one-variable) function on the right as
a series of sines involved writing a Fourier Series. In this case we are building a two-variable
function as a double Fourier Series. Once again the formula is in Appendix G.
$$A_{mn}\left(e^{\pi\sqrt{n^2+m^2}} - e^{-\pi\sqrt{n^2+m^2}}\right) = \frac{4}{L^2}\int_0^L\!\!\int_0^L f(x, y)\sin\left(\frac{m\pi}{L}x\right)\sin\left(\frac{n\pi}{L}y\right)dx\,dy \tag{11.5.5}$$
As before, if f (x, y) is particularly simple you can sometimes find the coefficients Amn explic-
itly. In other cases you can use numerical approaches. You’ll work examples of each type for
different boundary conditions in the problems.
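As an illustration of the numerical route, here is a minimal Python sketch that approximates the coefficients in Equation 11.5.5 with a midpoint-rule double integral and then evaluates a partial sum of Equation 11.5.4. The function names, the grid size, and the sample boundary function f are our own assumptions, chosen so that the exact answer is a single mode and the result is easy to check.

```python
import math

def laplace_coefficient(f, m, n, L, steps=200):
    """Approximate A_mn from Equation 11.5.5 with a midpoint-rule double integral."""
    h = L / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        sx = math.sin(m * math.pi * x / L)
        for j in range(steps):
            y = (j + 0.5) * h
            total += f(x, y) * sx * math.sin(n * math.pi * y / L)
    integral = total * h * h
    root = math.pi * math.sqrt(n * n + m * m)
    # e^{root} - e^{-root} equals 2*sinh(root); divide it out to isolate A_mn.
    return (4.0 / L**2) * integral / (2.0 * math.sinh(root))

def potential(x, y, z, coeffs, L):
    """Partial sum of Equation 11.5.4 for a dict {(m, n): A_mn}."""
    total = 0.0
    for (m, n), a_mn in coeffs.items():
        kappa = (math.pi / L) * math.sqrt(n * n + m * m)
        total += (a_mn * math.sin(m * math.pi * x / L)
                  * math.sin(n * math.pi * y / L) * 2.0 * math.sinh(kappa * z))
    return total

# Hypothetical potential on the face z = L (not taken from the text).
L = 2.0
f = lambda x, y: math.sin(math.pi * x / L) * math.sin(2 * math.pi * y / L)
coeffs = {(m, n): laplace_coefficient(f, m, n, L)
          for m in range(1, 4) for n in range(1, 4)}
print(potential(0.7, 0.4, L, coeffs, L), f(0.7, 0.4))
```

Evaluating the partial sum on the face z = L should reproduce f(x, y) there, which is a quick sanity check on both the coefficients and the series.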
Problem:
Solve the two-dimensional heat equation:
$$\frac{\partial u}{\partial t} = \alpha\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right)$$
Solution:
We begin by assuming a solution of the form:
It might seem natural at this point to work with T (t) first: it is easier by virtue of being
first order, and it is already separated. But you always have to start with the variables
that have homogeneous boundary conditions. As usual, we will handle the boundary
conditions before building a series, and tackle the initial condition last. We must
separate one of the spatial variables: we choose X quite arbitrarily.
The separation constant must be negative. (Do you see why?) Calling it −k², we solve the equation and boundary conditions for X to find: $X(x) = A\sin\left(kx/\sqrt{\alpha}\right)$ where $k = n\pi\sqrt{\alpha}$ for n = 1, 2, 3 …
Meanwhile we separate the other equation.
Once again the separation constant must be negative: we will call it −p². The equation for Y looks just like the previous equation for X, yielding $Y(y) = B\sin\left(py/\sqrt{\alpha}\right)$ where $p = m\pi\sqrt{\alpha}$ for m = 1, 2, 3 …
Finally, we have $T'(t) = -\left(k^2 + p^2\right)T(t)$, so
$$T(t) = Ce^{-(k^2+p^2)t}$$
$$u(x, y, t) = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} A_{mn}\sin(n\pi x)\sin(m\pi y)\,e^{-\pi^2\alpha(n^2+m^2)t}$$
Finally we are ready to plug in our inhomogeneous initial condition, which tells
us that
$$\sum_{n=1}^{\infty}\sum_{m=1}^{\infty} A_{mn}\sin(n\pi x)\sin(m\pi y) = u_0(x, y)$$
If you plot the first few terms you can see that they represent a bump in the middle of
the domain that flattens out over time due to the decaying exponentials. With
computers, however, you can go much farther. The plots below show this series (with
𝛼 = 1) with all the terms up to n, m = 100 (10,000 terms in all, although many of
them equal zero).
At t = 0 you can see that the double Fourier series we constructed accurately models
the initial conditions. A short time later the temperature is starting to even out more,
and at much later times it relaxes towards zero everywhere. (Do you see why we call
t = 0.1 “much later”? See Problem 11.105.)
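To experiment with a series of this form, the following minimal Python sketch evaluates a partial sum of the double series at a given point and time. The coefficients below are hypothetical placeholders (they are not the coefficients of this example), and the function name is ours.

```python
import math

def heat_series(x, y, t, coeffs, alpha=1.0):
    """Partial sum of u(x, y, t) on the unit square for a dict {(n, m): A_mn}."""
    total = 0.0
    for (n, m), a_mn in coeffs.items():
        # Each mode carries its own exponential decay rate pi^2 * alpha * (n^2 + m^2).
        decay = math.exp(-math.pi**2 * alpha * (n * n + m * m) * t)
        total += a_mn * math.sin(n * math.pi * x) * math.sin(m * math.pi * y) * decay
    return total

# Hypothetical coefficients, chosen only for illustration.
coeffs = {(1, 1): 1.0, (3, 1): 0.5, (3, 3): 0.25}
for t in (0.0, 0.002, 0.1):
    print(t, heat_series(0.5, 0.5, t, coeffs))
```

Printing the value at the center of the square for a few times shows how quickly each term fades relative to the others.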
have changed if you had used z = L/2 instead? (You should not need a computer to answer that last question.)
11.104 Solve Laplace’s equation for V(x, y, z) in the region 0 ≤ x, y, z ≤ L with V(x, y, 0) = f(x, y) and V = 0 on the other five sides. (This is the same problem that was solved in the Explanation (Section 11.5.2) except that the side with non-zero potential is at z = 0 instead of z = L.)
11.105 The example on Page 584 ended with a series, which we plotted at several times. From those plots it’s clear that for 𝛼 = 1 the initial temperature profile hadn’t changed much by t = 0.002 but had mostly relaxed towards zero by t = 0.1. How could you have predicted this by looking at the series solution?
11.106 Walk-Through: Separation of Variables (More Than Two Variables). In this problem you will use separation of variables to solve the equation 𝜕²u/𝜕t² − 𝜕²u/𝜕x² − 𝜕²u/𝜕y² + u = 0 subject to the boundary conditions u(0, y, t) = u(1, y, t) = u(x, 0, t) = u(x, 1, t) = 0.
[Figure: the unit square 0 ≤ x, y ≤ 1 with u = 0 on all four sides.]
(a) Begin by guessing a separable solution u = X(x)Y(y)T(t). Plug this guess into the differential equation. Then divide both sides by X(x)Y(y)T(t) and separate variables so that all the y and t dependence is on the left and the x dependence is on the right. Put the constant term on the left side. (You could put it on the right side, but the math would get a bit messier later on.)
(b) Set both sides of that equation equal to a constant. Given the boundary conditions above for u, what are the boundary conditions for X(x)? Use these boundary conditions to explain why the separation constant must be negative. You can therefore call it −k².
(c) Separate variables in the remaining ODE, putting the t-dependent terms and the constant terms on the left, and the y-dependent term on the right. Set both sides equal to a constant and explain why this separation constant also must be negative. Call it −p².
(d) Solve the equation for X(x). Use the boundary condition at x = 0 to show that one of the arbitrary constants must be 0 and apply the boundary condition at x = 1 to find all the possible values for k. There will be an infinite number of them, but you should be able to write them in terms of a new constant m, which can be any positive integer.
(e) Solve the equation for Y(y) and apply the boundary conditions to eliminate one arbitrary constant and to write the constant p in terms of a new integer n.
(f) Solve the equation for T(t). Your answer should depend on m and n and should have two arbitrary constants.
(g) Multiply X(x), Y(y), and T(t) to find the normal modes of this system. You should be able to combine your four arbitrary constants into two. Write the general solution u(x, y, t) as a sum over these normal modes. Your arbitrary constants should include a subscript mn to indicate that they can take different values for each combination of m and n.
11.107 [This problem depends on Problem 11.106.] In this problem you will plug the initial conditions
$$u(x, y, 0) = \begin{cases} 1 & 1/3 < x < 2/3,\ 1/3 < y < 2/3 \\ 0 & \text{elsewhere} \end{cases}, \qquad \frac{\partial u}{\partial t}(x, y, 0) = 0$$
into the solution you found to Problem 11.106.
[Figure: the unit square with u = 0 on all four sides, showing the initial condition: 1 in the central region and 0 elsewhere.]
(a) Use the condition 𝜕u/𝜕t(x, y, 0) = 0 to show that one of your arbitrary constants must equal zero.
(b) The remaining condition should give you an equation that looks like $u(x, y, 0) = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} F_{mn}\sin(m\pi x)\sin(n\pi y)$ (with a different letter if you didn’t use Fmn for the same arbitrary constant we did). This is a double Fourier sine
(c) Have a computer calculate a partial sum of your solution including all terms up to m, n = 20 (400 terms in all) and plot it at a variety of times. Describe how the function u throughout the region is evolving over time.

Solve Problems 11.108–11.112 using separation of variables. For each problem use the boundary conditions u(0, y, t) = u(L, y, t) = u(x, 0, t) = u(x, H, t) = 0. When the initial conditions are given as arbitrary functions, write the solution as a series and write expressions for the coefficients in the series. When specific initial conditions are given, solve for the coefficients. It may help to first work through Problem 11.106 as a model.
[Figure: the rectangle 0 ≤ x ≤ L, 0 ≤ y ≤ H with u = 0 on all four sides.]
11.108 𝜕u/𝜕t = a²(𝜕²u/𝜕x²) + b²(𝜕²u/𝜕y²), u(x, y, 0) = f(x, y)
11.109 𝜕u/𝜕t = a²(𝜕²u/𝜕x²) + b²(𝜕²u/𝜕y²), u(x, y, 0) = sin(2𝜋x/L) sin(3𝜋y/H)
11.110 𝜕u/𝜕t + u = a²(𝜕²u/𝜕x²) + b²(𝜕²u/𝜕y²), u(x, y, 0) = f(x, y)
11.111 𝜕²u/𝜕t² + u = 𝜕²u/𝜕x² + 𝜕²u/𝜕y², u(x, y, 0) = 0, 𝜕u/𝜕t(x, y, 0) = g(x, y)
11.112 𝜕²u/𝜕t² + 𝜕u/𝜕t − 𝜕²u/𝜕x² − 𝜕²u/𝜕y² + u = 0, u(x, y, 0) = 0, 𝜕u/𝜕t(x, y, 0) = sin(𝜋x/L) sin(2𝜋y/H)

For Problems 11.113–11.114 solve the 2D wave equation 𝜕²z/𝜕x² + 𝜕²z/𝜕y² = (1/v²)(𝜕²z/𝜕t²) on the rectangle x, y ∈ [0, L] with z = 0 on all four sides and initial conditions given below. If you have not done the Discovery Exercise (Section 11.5.1) you will need to begin by using separation of variables to solve the wave equation.
11.113 z(x, y, 0) = sin(𝜋x/L) sin(𝜋y/L), ż(x, y, 0) = 0
11.115 [This problem depends on Problem 11.114.] The equation you solved in Problem 11.114 might represent an oscillating square plate that was given a sudden blow in a region in the middle. This might represent a square drumhead hit by a square drumstick.6 The solution you got, however, was a fairly complicated looking double sum.
(a) Have a computer plot the initial function ż(x, y, 0) for the partial sum that goes up through m = n = 5, then again up through m = n = 11, and finally up through m = n = 21. As you add terms you should see the partial sums converging towards the shape of the initial conditions that you were solving for.
(b) Take several of the non-zero terms in the series (the individual terms, not the partial sums) and for each one use a computer to make an animation of the shape of the drumhead (z, not ż) evolving over time. You should see periodic behavior. You should include (m, n) = (1, 1), (m, n) = (1, 3), (m, n) = (3, 3), and at least one other term. Describe how the behaviors of these normal modes are different from each other.
(c) Now make an animation of the partial sum that goes through m = n = 21. Describe the behavior. How is it similar to or different from the behavior of the individual terms you saw in the previous part?
(d) How would your answers to Parts (b) and (c) have looked different if we had used a different set of initial conditions?

Solve Problems 11.116–11.118 using separation of variables. Write the solution as a series and write expressions for the coefficients in the series, as we did for Laplace’s equation in the Explanation (Section 11.5.2), Equations 11.5.4 and 11.5.5.
11.116 Solve the wave equation 𝜕²u/𝜕x² + 𝜕²u/𝜕y² + 𝜕²u/𝜕z² = (1/v²)(𝜕²u/𝜕t²) in a 3D cube of side L with u = 0 on all six sides and initial conditions u(x, y, z, 0) = f(x, y, z), u̇(x, y, z, 0) = 0.
6 If “square drumhead” sounds a bit artificial, don’t worry. In Section 11.6 we’ll solve the wave equation on a more
11.117 In the wave equation the parameter v is the sound speed, meaning the speed at which waves propagate in that medium. “Anisotropic” crystals have a different sound speed in different directions. Solve the anisotropic wave equation $\partial^2 u/\partial t^2 = v_x^2(\partial^2 u/\partial x^2) + v_y^2(\partial^2 u/\partial y^2)$ in a square box of side length L, with u = 0 on all four sides and initial conditions u(x, y, 0) = 0, u̇(x, y, 0) = g(x, y).
11.118 Solve the heat equation 𝜕u/𝜕t = 𝛼(𝜕²u/𝜕x² + 𝜕²u/𝜕y² + 𝜕²u/𝜕z²) in a 3D cube of side L with u = 0 on all six sides and initial condition u(x, y, z, 0) = f(x, y, z).
Here’s our point: you can look up a few key facts about a function you’ve never even
heard of and then use that function to solve a PDE. Later you can do more research to
better understand the solutions you have found.
The ordinary differential equation
$$x^2 y'' + xy' + \left(k^2x^2 - p^2\right)y = 0 \tag{11.6.1}$$
has two real, linearly independent solutions, which are called “Bessel functions” and designated
Jp (kx) and Yp (kx). Because the equation is linear and homogeneous, its general solution is
AJp (kx) + BYp (kx).
It’s important to note that A and B are arbitrary constants (determined by initial con-
ditions), while p and k are specific numbers that appear in the differential equation. p
determines what functions solve the equation, and k stretches or compresses those functions.
For instance:
∙ The ordinary differential equation x²y′′ + xy′ + (x² − 9)y = 0 has two real, linearly independent solutions called J3(x) and Y3(x). Because the equation is linear and homogeneous, its general solution is AJ3(x) + BY3(x).
∙ The equation x²y′′ + xy′ + (x² − 1/4)y = 0 has general solution AJ1/2(x) + BY1/2(x). The J1/2(x) that solves this equation is a completely different function from the J3(x) that solved the previous.
∙ The equation x²y′′ + xy′ + (25x² − 9)y = 0 has solutions J3(5x) and Y3(5x). Of course J3(5x) is the same function as J3(x) but compressed horizontally.
We are not concerned here with negative values of p, although we may encounter fractional
values.
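The three examples above can be spot-checked numerically. The following is a minimal Python sketch, assuming the standard power series for Jp(x) (which this section does not derive). The helper names `bessel_j` and `bessel_residual` are ours, and the derivatives are estimated with central differences, so the residual is only approximately zero.

```python
import math

def bessel_j(p, x, terms=60):
    """J_p(x) from its power series; a sketch for p >= 0 and moderate x > 0."""
    term = (x / 2.0) ** p / math.gamma(p + 1.0)
    total = term
    for j in range(terms):
        # Ratio of consecutive series terms: -(x/2)^2 / ((j+1)(j+1+p)).
        term *= -(x / 2.0) ** 2 / ((j + 1) * (j + 1 + p))
        total += term
    return total

def bessel_residual(p, k, x, h=1e-3):
    """Residual of x^2 y'' + x y' + (k^2 x^2 - p^2) y for y(x) = J_p(kx)."""
    y = lambda s: bessel_j(p, k * s)
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / h**2
    return x**2 * d2 + x * d1 + (k**2 * x**2 - p**2) * y(x)

# (p, k) pairs matching the three bulleted equations.
for p, k in [(3, 1), (0.5, 1), (3, 5)]:
    print(p, k, bessel_residual(p, k, 1.3))
```

Each printed residual should be close to zero, limited by the finite-difference step h rather than by the series itself.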
[Figure: plots of J0(x), J1/2(x), J2(x), and J6(x) for x from 0 to 20.]
[Figure: plots of Y0(x), Y1/2(x), Y2(x), and Y6(x) for x from 0 to 20.]
∙ For all p-values, $\lim_{x \to 0^+} Y_p(x) = -\infty$. Therefore Yp(x) solutions are discarded whenever boundary conditions require a finite answer at x = 0.
∙ All the Yp (x) functions have an infinite number of zeros.
∙ Some textbooks refer to these as “Neumann functions,” and still others as “Weber func-
tions.” Some use the notation Np (x) instead of Yp (x). (This may not technically qualify
as a “key property” but we had to mention it somewhere.)
The terms in that series are all the function J7 (kx) where the restriction k = 𝛼7,n ∕a comes
from the condition f (a) = 0 (since J7 (𝛼7,n ) = 0 for all n by definition). The only other thing
you need is the formula for the coefficients.
Of course, there’s nothing special about the number 7. You could build the same function
as a series of J5∕3 (kx) functions if you wanted to. The allowable values of k would be different
and the coefficients would be different; it would be a completely different series that adds
up to the same function.
Fourier-Bessel Series
On the interval 0 ≤ x ≤ a, we can expand a function into a series of Bessel functions by writing
$$f(x) = \sum_{n=1}^{\infty} A_n J_p\left(\frac{\alpha_{p,n}}{a}x\right)$$
∙ Jp is one of the Bessel functions. For instance, if p = 3, you are expanding f (x) in terms of
the Bessel function J3 . This creates a “Fourier-Bessel series expansion of order 3.” We could
expand the same function f (x) into terms of J5 , which would give us a different series with
different coefficients.
∙ 𝛼p,n is defined in the section on “zeros” above.7
∙ The formula for the coefficients An is given in Appendix J (in the section on Bessel
functions).
Do you see the importance of all this? Separation of variables previously led us to d²y/dx² + k²y = 0, which gave us the normal modes sin(kx) and cos(kx). We used the known zeros of
those functions to match boundary conditions. And because we know the formula for the
coefficients of a Fourier series, we were able to build arbitrary initial conditions as series of
sines and cosines.
Now suppose that separation of variables on some new PDE gives us J5 (kx) for our normal
modes. We will need to match boundary conditions, which requires knowing the zeros: we
now know them, or at least we have names for them (𝛼5,1 , 𝛼5,2 and so on) and can look them
up. Then we will need to write the initial condition as a sum of normal modes: we now know
the formula for the coefficients of a Fourier-Bessel expansion of order 5. Later in this section
you’ll see examples where we solve PDEs in this way.
Appendix J lists just the key facts given above—the ODE, the function that solves the ODE,
the zeros of that function, and the coefficients of its series expansion—for Bessel functions
and other functions that are normal modes of common PDEs. With those facts in hand, you
can solve PDEs with a wide variety of normal modes. But, as we warned in the beginning,
nothing in that appendix justifies those mathematical facts in any way. For the mathematical
background and derivations, see Chapter 12.
Not-Quite-Bessel Functions
Many equations look a lot like Equation 11.6.1 but don’t match it perfectly. There are two approaches you might try in such a case. The first is to let u = x² or u = √x or u = cos x or some such and see if you end up with a perfect match. (Such substitutions are discussed in Chapter 10.) The second is to peek into Appendix J and see if you have a known variation of Bessel functions.
7 Some textbooks define 𝜆p,n = 𝛼p,n ∕a to make these equations look simpler.
This is just like Equation 11.6.1 except for the sign of the k 2 term. You can turn this into
Equation 11.6.1 with a change of variables, but you have to introduce complex numbers to
do so. The end result, however, is a pair of functions that give you real answers for any real
input x. The two linearly independent solutions to this equation are called “modified Bessel functions,” generally written Ip(kx) and Kp(kx). The important facts you need to know about these functions when you’re solving PDEs are that Kp(x) diverges at x = 0 (just as Yp(x) does), and that none of the modified Bessel functions have any zeros at any point except x = 0.
Those facts are all listed in Appendix J.
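$$\frac{\partial^2 V}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial V}{\partial \rho} + \frac{\partial^2 V}{\partial z^2} = 0 \tag{11.6.3}$$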
with the boundary conditions:
1. To solve this, assume a solution of the form V (𝜌, z) = R(𝜌)Z (z). Plug this guess into
Equation 11.6.3.
2. Separate the variables in the resulting equation so that all the 𝜌-dependence is on the
left and the z-dependence on the right.
See Check Yourself #75 in Appendix L
3. Let the separation constant equal zero.
(a) Verify that the function R(𝜌) = A ln(𝜌) + B is the general solution for the resulting
equation for R(𝜌). (You can verify this by simply plugging our solution into the
ODE, of course. Alternatively, you can find the solution for yourself by letting
S(𝜌) = R ′ (𝜌) and solving the resulting first-order separable ODE for S(𝜌), and then
integrating to find R(𝜌).)
(b) We now note that within the cylinder, the potential V must always be finite. (This
type of restriction is sometimes called an “implicit” boundary condition, and will
be discussed further in the Explanation, Section 11.6.3.) What does this restriction
say about our constants A and B?
(c) Match your solution to the boundary condition R(a) = 0, and show that this
reduces your solution to the trivial case R(𝜌) = 0 everywhere. We conclude that,
for any boundary condition at the top other than f (𝜌) = 0, the separation constant
is not zero.
The Problem
Consider a simple drum: a thin rubber sheet stretched along a circular frame of radius a. If the sheet is struck with some object (let’s call it a “drumstick”) it will vibrate according to the wave equation.
[Figure: a circular “Drum” of radius a being struck by a “Drumstick.”]
The wave equation can be written as ∇2 z = (1∕v 2 )(𝜕 2 z∕𝜕t 2 ),
where z is vertical displacement. In rectangular coordinates
this equation becomes 𝜕 2 z∕𝜕x 2 + 𝜕 2 z∕𝜕y2 = (1∕v 2 )(𝜕 2 z∕𝜕t 2 ),
but using x and y to specify boundary conditions on a circle
is an ugly business. Instead we will use the Laplacian in polar coordinates (Appendix F),
which makes our wave equation:
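$$\frac{\partial^2 z}{\partial t^2} = v^2\left(\frac{\partial^2 z}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial z}{\partial \rho} + \frac{1}{\rho^2}\frac{\partial^2 z}{\partial \phi^2}\right) \tag{11.6.4}$$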
You can probably figure out for yourself that the first boundary condition, the one that
led us to polar coordinates, is z(a, 𝜙, t) = 0. This is an “explicit” boundary condition, mean-
ing it comes from a restriction that is explicitly stated in the problem. (The drumhead is
attached to the frame.) There are also two “implicit” boundary conditions, restrictions that
are imposed by the geometry of the situation.
∙ In many cases, certainly including this one, z has to be finite throughout the region.
This includes the center of the circle, so z cannot blow up at 𝜌 = 0. (This is going to
be more important than it may sound.)
∙ The point 𝜌 = 2, 𝜙 = 𝜋∕3 is the same physical place as 𝜌 = 2, 𝜙 = 7𝜋∕3, so it must give rise
to the same z-value. More generally, the function must be periodic such that z(𝜌, 𝜙, t)
always equals z(𝜌, 𝜙 + 2𝜋, t).
These conditions apply because we are using polar coordinates to solve a problem on a disk
around the origin, not because of the details of this particular scenario. If you find that your
initial and boundary conditions aren’t sufficient to determine your arbitrary constants, you
should always check whether you missed some implicit boundary conditions.
The problem also needs to specify as initial conditions both z(𝜌, 𝜙, 0) and 𝜕z∕𝜕t(𝜌, 𝜙, 0).
After we solve the problem with its boundary conditions, we’ll look at a couple of possible
initial conditions.
Separation of Variables
We begin by guessing a solution of the form
$$z(\rho, \phi, t) = R(\rho)\Phi(\phi)T(t)$$
Substituting this expression directly into the polar wave equation (11.6.4) yields:
$$R(\rho)\Phi(\phi)T''(t) = v^2\left(R''(\rho)\Phi(\phi)T(t) + \frac{1}{\rho}R'(\rho)\Phi(\phi)T(t) + \frac{1}{\rho^2}R(\rho)\Phi''(\phi)T(t)\right)$$
“something is zero,” but that isn’t quite true: a homogeneous equation or condition is one
for which any linear combination of solutions is itself a solution. A sum of functions with the same
period is itself periodic, so z(𝜌, 𝜙, t) = z(𝜌, 𝜙 + 2𝜋, t) is a homogeneous boundary condition.)
We’ll isolate Φ(𝜙) because its differential equation is simpler.
The condition that Φ(𝜙) must repeat itself every 2𝜋 requires that p be an integer. Negative
values of p would be redundant, so we need only consider p = 0, 1, 2, ….
Moving on to R(𝜌), Equation 11.6.7 can be written as:
$$\rho^2 R''(\rho) + \rho R'(\rho) + \left(k^2\rho^2 - p^2\right)R(\rho) = 0 \tag{11.6.9}$$
You can plug that equation into a computer or look it up in Appendix J. (The latter is a
wonderful resource if we do say so ourselves, and we hope you will familiarize yourself with
it.) Instead we will note that it matches Equation 11.6.1 in Section 11.6.1 and write down the
answer we gave there.
R(𝜌) = AJp (k𝜌) + BYp (k𝜌) (11.6.10)
Next we apply the boundary conditions on R(𝜌). We begin by recalling that all functions Yp (x)
blow up as x → 0. Because our function must be finite at the center of the drum we discard
such solutions, leaving only the Jp functions.
Our other boundary condition on 𝜌 was that R(a) = 0. Using 𝛼p,n to represent the nth zero
of Jp we see that this condition is satisfied if ka = 𝛼p,n .
$$R(\rho) = AJ_p\left(\frac{\alpha_{p,n}}{a}\rho\right)$$
Remember that J1 (x) is one specific function—you could look it up, or graph it with a com-
puter. J2 (x) is a different function, and so on. Each of these functions has an infinite number
of zeros. When we write 𝛼3,7 we mean “the seventh zero of the function J3 (x).” The index n,
which identifies one of these zeros by count, can be any positive integer.
When we introduced p and k they could each have been, so far as we knew, any real number.
The boundary conditions have left us with possibilities that are still infinite, but discrete: p
must be an integer, and ka must be a zero of Jp . So we can write the solution as a series over
all possible n- and p-values. As we do so we add the subscript pn to our arbitrary constants,
indicating that they may be different for each term. We also absorb the constant A into our
other arbitrary constants.
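$$z = \sum_{p=0}^{\infty}\sum_{n=1}^{\infty} J_p\left(\frac{\alpha_{p,n}}{a}\rho\right)\left[C_{pn}\sin\left(\frac{\alpha_{p,n}}{a}vt\right) + D_{pn}\cos\left(\frac{\alpha_{p,n}}{a}vt\right)\right]\left[E_{pn}\sin(p\phi) + F_{pn}\cos(p\phi)\right] \tag{11.6.11}$$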
If you aren’t paying extremely close attention, those subscripts are going to throw you. 𝛼p,n
is the nth zero of the function Jp : you can look up 𝛼3,5 in a table of Bessel function zeros.
(It’s roughly 19.4.) On the other hand, Cpn is one of our arbitrary constants, which we will choose to meet the initial conditions of our specific problem. C35 is the constant that we will be multiplying specifically by sin(𝛼3,5 vt/a).
Can you see why p starts at 0 and n starts at 1? Plug p = 0 into Equation 11.6.8 and you get a
valid solution: Φ(𝜙) equals a constant. But the first zero of a Bessel function is conventionally
labeled n = 1, so only positive values of n make sense.
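If no table is at hand, the zeros can be estimated numerically. The sketch below is one possible approach for integer p, assuming Bessel's integral representation Jp(x) = (1/𝜋)∫ cos(p𝜏 − x sin 𝜏) d𝜏 over 0 ≤ 𝜏 ≤ 𝜋 (a standard fact not derived in this chapter). The function names and the scan step are our own choices.

```python
import math

def bessel_j_int(p, x, steps=400):
    """J_p(x) for integer p via Bessel's integral, using the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(p * math.pi))  # half-weighted endpoint values
    for i in range(1, steps):
        tau = i * h
        total += math.cos(p * tau - x * math.sin(tau))
    return total * h / math.pi

def bessel_zeros(p, count, step=0.1):
    """First `count` positive zeros of J_p: scan for sign changes, then bisect."""
    zeros = []
    left = step
    f_left = bessel_j_int(p, left)
    while len(zeros) < count:
        right = left + step
        f_right = bessel_j_int(p, right)
        if f_left * f_right < 0:
            lo, hi = left, right
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                if bessel_j_int(p, lo) * bessel_j_int(p, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            zeros.append(0.5 * (lo + hi))
        left, f_left = right, f_right
    return zeros

alpha_3 = bessel_zeros(3, 5)
print(alpha_3[4])  # the fifth zero of J_3, i.e. alpha_{3,5}
```

The list index is zero-based, so `alpha_3[4]` is the value the text calls 𝛼3,5.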
Initial Conditions
We haven’t said much about specific initial conditions up to this point, because our goal is
to show how you can match any reasonable initial conditions in such a situation.
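The initial conditions would be some given functions z0(𝜌, 𝜙) and ż0(𝜌, 𝜙). From the solution above:
\[ z_0 = \sum_{p=0}^{\infty}\sum_{n=1}^{\infty} D_{pn}\, J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right)\left[E_{pn}\sin(p\phi) + F_{pn}\cos(p\phi)\right] \]
\[ \dot{z}_0 = \sum_{p=0}^{\infty}\sum_{n=1}^{\infty} \frac{\alpha_{p,n}}{a}\, v\, C_{pn}\, J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right)\left[E_{pn}\sin(p\phi) + F_{pn}\cos(p\phi)\right] \]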
As always, we now build up our initial condition from the series by choosing the appropriate
coefficients. If our normal modes were sines and cosines, we would be building a Fourier
series. In this case we are building a Fourier-Bessel series, but the principle is the same, and
the formula in Appendix J allows us to simply plug in and find the answer:
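\[ \frac{\alpha_{0,n}}{a}\, v\, C_n = \frac{2}{a^2 J_1^2(\alpha_{0,n})}\int_0^a \dot{z}_0(\rho)\, J_0\!\left(\frac{\alpha_{0,n}}{a}\rho\right)\rho\, d\rho = \frac{2s}{a^2 J_1^2(\alpha_{0,n})}\int_0^{\rho_0} J_0\!\left(\frac{\alpha_{0,n}}{a}\rho\right)\rho\, d\rho \]
\[ C_n = \frac{2s}{a v \alpha_{0,n} J_1^2(\alpha_{0,n})}\int_0^{\rho_0} J_0\!\left(\frac{\alpha_{0,n}}{a}\rho\right)\rho\, d\rho \]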
This integral can be looked up in a table or plugged into a computer, with the result
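\[ C_n = \frac{2s}{a v \alpha_{0,n} J_1^2(\alpha_{0,n})}\cdot\frac{a\rho_0}{\alpha_{0,n}}\, J_1\!\left(\frac{\alpha_{0,n}}{a}\rho_0\right) = \frac{2 s \rho_0\, J_1\!\left(\frac{\alpha_{0,n}}{a}\rho_0\right)}{v\, \alpha_{0,n}^2\, J_1^2(\alpha_{0,n})} \]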
Plugging this into the series expansion for z gives the final answer we’ve been looking for:
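\[ z = \frac{2 s \rho_0}{v}\sum_{n=1}^{\infty} \frac{J_1\!\left(\frac{\alpha_{0,n}}{a}\rho_0\right)}{\alpha_{0,n}^2\, J_1^2(\alpha_{0,n})}\, J_0\!\left(\frac{\alpha_{0,n}}{a}\rho\right)\sin\!\left(\frac{\alpha_{0,n}}{a}vt\right) \tag{11.6.12} \]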
That hideous-looking answer actually tells us a great deal about the behavior of the drum-
head. The constants outside the sum set the overall scale of the vibrations. The big fraction
inside the sum is a number for each value of n that tells us the relative importance of each
normal mode. For 𝜌0 = a∕3 the first few coefficients are roughly 0.23, 0.16, and 0.069, and
they only get smaller from there. So only the first two modes contribute significantly. The
first and most important term represents the shape on the left of Figure 11.5 oscillating up
and down. The second term is a slightly more complicated shape. Higher order terms will
have more wiggles (i.e. more critical points between 𝜌 = 0 and 𝜌 = a).
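The relative weights of the normal modes in Equation 11.6.12 are easy to check numerically. The sketch below is only an illustration under stated assumptions: it evaluates the fraction inside the sum for 𝜌0 = a∕3, using a hand-rolled integer-order Bessel routine (Bessel's integral plus bisection for the zeros) in place of whatever mathematical software you prefer. The names `bessel_j`, `bessel_zero`, and `mode_weight` are hypothetical.

```python
import math


def bessel_j(p, x, steps=200):
    """Integer-order J_p(x) via Bessel's integral and the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(p * math.pi))
    for i in range(1, steps):
        tau = i * h
        total += math.cos(p * tau - x * math.sin(tau))
    return total * h / math.pi


def bessel_zero(p, n, step=0.25):
    """Approximate the nth positive zero of J_p (scan for sign changes, then bisect)."""
    x = max(float(p), 0.5)
    f = bessel_j(p, x)
    found = 0
    while True:
        x_next = x + step
        f_next = bessel_j(p, x_next)
        if f * f_next < 0.0:
            found += 1
            if found == n:
                lo, hi, f_lo = x, x_next, f
                for _ in range(60):
                    mid = 0.5 * (lo + hi)
                    f_mid = bessel_j(p, mid)
                    if f_lo * f_mid <= 0.0:
                        hi = mid
                    else:
                        lo, f_lo = mid, f_mid
                return 0.5 * (lo + hi)
        x, f = x_next, f_next


def mode_weight(n, rho0_over_a):
    """The fraction inside the sum of Equation 11.6.12:
    J_1(alpha_{0,n} rho0 / a) / (alpha_{0,n}^2 J_1(alpha_{0,n})^2)."""
    alpha = bessel_zero(0, n)
    return bessel_j(1, alpha * rho0_over_a) / (alpha ** 2 * bessel_j(1, alpha) ** 2)


weights = [mode_weight(n, 1.0 / 3.0) for n in range(1, 6)]
for n, w in enumerate(weights, start=1):
    print(f"n = {n}: weight = {w:.3f}")
```

Running it for the first few n reproduces values close to the ones quoted above; the overall factor 2s𝜌0∕v is left out because it scales every mode equally.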
[Figure 11.5: the normal modes J0(α0,1 r/a) (left) and J0(α0,2 r/a) (right).]
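Problem:
Solve the equation:
\[ 4x\frac{\partial^2 z}{\partial x^2} + 4\frac{\partial z}{\partial x} + \frac{\partial^2 z}{\partial y^2} - \frac{9}{x}z = 0 \]
on the domain 0 ≤ x ≤ 2, 0 ≤ y ≤ 2 subject
to the boundary conditions z(2, y) = z(x, 0) = 0 and
z(x, 2) = 1 and the requirement that z(0, y) is finite.
[Figure: the square domain 0 ≤ x ≤ 2, 0 ≤ y ≤ 2, labeled with z = 1 on the top edge, z = 0 on the bottom and right edges, and "finite" on the left edge.]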
Solution:
Writing z(x, y) = X (x)Y (y) leads us to:
\[ 4x X''(x) Y(y) + 4 X'(x) Y(y) + X(x) Y''(y) - \frac{9}{x} X(x) Y(y) = 0 \]
Divide both sides by X(x)Y(y) and separate the variables:
\[ 4x\frac{X''(x)}{X(x)} + 4\frac{X'(x)}{X(x)} - \frac{9}{x} = -\frac{Y''(y)}{Y(y)} \]
In the problems you will show that we cannot match the boundary conditions with a
positive or zero separation constant, so we call the constant −k 2 and we get:
\[ 4x X''(x) + 4 X'(x) + \left(k^2 - \frac{9}{x}\right) X(x) = 0 \tag{11.6.13} \]
The analytic solution to that integral is no more enlightening to look at than the
integral itself, so we simply leave it as is:
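\[ z(x, y) = \sum_{n=1}^{\infty} \left(\frac{1}{J_4^2(\alpha_{3,n})}\int_0^{\sqrt{2}} J_3\!\left(\frac{\alpha_{3,n}}{\sqrt{2}}u\right) u\, du\right) \frac{e^{\alpha_{3,n} y/\sqrt{2}} - e^{-\alpha_{3,n} y/\sqrt{2}}}{e^{\alpha_{3,n}\sqrt{2}} - e^{-\alpha_{3,n}\sqrt{2}}}\, J_3\!\left(\frac{\alpha_{3,n}}{\sqrt{2}}\sqrt{x}\right) \]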
You can plug that whole mess into a computer, as is, and plot the 20th partial sum of
the series. It matches the homogeneous boundary conditions perfectly, matches the
inhomogeneous one well, and smoothly interpolates between them everywhere else.
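As a rough illustration of what "plugging that whole mess into a computer" can look like, here is a minimal sketch that evaluates the 20th partial sum at a few points. It assumes the series has the form of a Fourier-Bessel coefficient (an integral from 0 to √2 divided by J4 squared) times a ratio of exponentials in y times J3 in √x, with k = 𝛼3,n∕√2. All helper names (`bessel_j`, `bessel_zeros`, `simpson`, `z_partial`) are our own, and the numerical methods are simple stand-ins for a real math package.

```python
import math

SQRT2 = math.sqrt(2.0)


def bessel_j(p, x, steps=200):
    """Integer-order J_p(x) via Bessel's integral and the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(p * math.pi))
    for i in range(1, steps):
        tau = i * h
        total += math.cos(p * tau - x * math.sin(tau))
    return total * h / math.pi


def bessel_zeros(p, count, step=0.25):
    """First `count` positive zeros of J_p (scan for sign changes, then bisect)."""
    zeros = []
    x = max(float(p), 0.5)
    f = bessel_j(p, x)
    while len(zeros) < count:
        x_next = x + step
        f_next = bessel_j(p, x_next)
        if f * f_next < 0.0:
            lo, hi, f_lo = x, x_next, f
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                f_mid = bessel_j(p, mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            zeros.append(0.5 * (lo + hi))
        x, f = x_next, f_next
    return zeros


def simpson(func, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


ZEROS = bessel_zeros(3, 20)  # alpha_{3,1} ... alpha_{3,20}
COEFFS = [
    simpson(lambda u, al=al: bessel_j(3, al * u / SQRT2) * u, 0.0, SQRT2)
    / bessel_j(4, al) ** 2
    for al in ZEROS
]


def z_partial(x, y):
    """20th partial sum of the series solution at the point (x, y)."""
    total = 0.0
    for al, c in zip(ZEROS, COEFFS):
        k = al / SQRT2
        # (e^{ky} - e^{-ky}) / (e^{2k} - e^{-2k}), rearranged to avoid overflow
        ratio = math.exp(k * (y - 2.0)) * (1.0 - math.exp(-2.0 * k * y)) / (1.0 - math.exp(-4.0 * k))
        total += c * ratio * bessel_j(3, k * math.sqrt(x))
    return total


print("z(1, 0) =", z_partial(1.0, 0.0))  # homogeneous boundary y = 0
print("z(2, 1) =", z_partial(2.0, 1.0))  # homogeneous boundary x = 2
print("z(1, 2) =", z_partial(1.0, 2.0))  # inhomogeneous boundary, target value 1
```

The two homogeneous boundaries come out at zero to within rounding, while the value on y = 2 only approaches 1 as more terms are included. To plot the surface, evaluate `z_partial` on a grid and hand the values to your plotting tool of choice.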
[Figure: surface plot of the 20th partial sum of z(x, y) over 0 ≤ x ≤ 2, 0 ≤ y ≤ 2, with z ranging from 0 to 1.]
Once again the condition z0 = 0 leads to Dpn = 0. This time, however, there is 𝜙 dependence,
so we can’t throw out as much of our equation.
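\[ \dot{z}_0(\rho, \phi) = \sum_{p=0}^{\infty}\sum_{n=1}^{\infty} \frac{\alpha_{p,n}}{a}\, v\, J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right)\left[E_{pn}\sin(p\phi) + F_{pn}\cos(p\phi)\right] \]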
(We’ve absorbed Cpn into the other constants of integration.) In Chapter 9 we found double
series expansions based entirely on sines and cosines. The principle is the same here, but
we are now expanding over both trig and Bessel functions. First note that you can rewrite
this as:
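\[ \dot{z}_0(\rho, \phi) = \sum_{n=1}^{\infty} \frac{\alpha_{0,n}}{a}\, v\, J_0\!\left(\frac{\alpha_{0,n}}{a}\rho\right) F_{0n} + \sum_{p=1}^{\infty}\left\{\sum_{n=1}^{\infty} \frac{\alpha_{p,n}}{a}\, v\, J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right) F_{pn}\right\}\cos(p\phi) + \sum_{p=1}^{\infty}\left\{\sum_{n=1}^{\infty} \frac{\alpha_{p,n}}{a}\, v\, J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right) E_{pn}\right\}\sin(p\phi) \]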
Now we can turn this around, and view the left side of each of these equations as a Fourier-
Bessel series expansion of the function on the right side. We see that Fpn = 0 for all p > 0 and
Epn = 0 for even values of p. To find the other coefficients we return to Appendix J for the
coefficients of a Fourier-Bessel series expansion.
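\[ \frac{\alpha_{0,n}}{a}\, v\, F_{0n} = \frac{2}{a^2 J_1^2(\alpha_{0,n})}\int_0^a \left\{\begin{array}{ll} s/2 & \rho < \rho_0 \\ 0 & \rho > \rho_0 \end{array}\right\} J_0\!\left(\frac{\alpha_{0,n}}{a}\rho\right)\rho\, d\rho = \frac{\rho_0 s}{a\, \alpha_{0,n} J_1^2(\alpha_{0,n})}\, J_1\!\left(\frac{\alpha_{0,n}}{a}\rho_0\right) \]
\[ \rightarrow\; F_{0n} = \frac{\rho_0 s}{v\, \alpha_{0,n}^2 J_1^2(\alpha_{0,n})}\, J_1\!\left(\frac{\alpha_{0,n}}{a}\rho_0\right) \]
\[ \frac{\alpha_{p,n}}{a}\, v\, E_{pn} = \frac{2}{a^2 J_{p+1}^2(\alpha_{p,n})}\int_0^a \left\{\begin{array}{ll} 2s/p & \rho < \rho_0 \\ 0 & \rho > \rho_0 \end{array}\right\} J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right)\rho\, d\rho = \frac{4s}{p\, a^2 J_{p+1}^2(\alpha_{p,n})}\int_0^{\rho_0} J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right)\rho\, d\rho \]
\[ \rightarrow\; E_{pn} = \frac{4s}{p\, a\, v\, \alpha_{p,n} J_{p+1}^2(\alpha_{p,n})}\int_0^{\rho_0} J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right)\rho\, d\rho \qquad \text{odd } p \text{ only} \]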
The integral in the last equation can be evaluated for a general p, but the result is an ugly
expression involving hypergeometric functions, so it’s better to leave it as is and evaluate it
for specific values of p when you are calculating partial sums. Putting all of this together, the
solution is:
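\[ z = \sum_{n=1}^{\infty}\left\{ \frac{\rho_0 s}{v\, \alpha_{0,n}^2 J_1^2(\alpha_{0,n})}\, J_1\!\left(\frac{\alpha_{0,n}}{a}\rho_0\right) J_0\!\left(\frac{\alpha_{0,n}}{a}\rho\right)\sin\!\left(\frac{\alpha_{0,n}}{a}vt\right) + \sum_{p=1}^{\infty}\left[ \frac{4s}{p\, a\, v\, \alpha_{p,n} J_{p+1}^2(\alpha_{p,n})}\left(\int_0^{\rho_0} J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right)\rho\, d\rho\right) J_p\!\left(\frac{\alpha_{p,n}}{a}\rho\right)\sin\!\left(\frac{\alpha_{p,n}}{a}vt\right)\sin(p\phi)\right]\right\} \tag{11.6.15} \]
(The sum over p includes odd p only.)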
Even if you’re still reading at this point, you’re probably skimming past the equations think-
ing “I’ll take their word for it.” But we’d like to draw your attention back to that last equation,
because it’s not as bad as it looks at first blush. The first term in curly braces is just the solution
we found in the previous example. The dominant mode is shown on the left in Figure 11.5,
the next mode is on the right in that figure, and higher order terms with more wiggles have
much smaller amplitude. The new feature in this solution is the sum over p. Each of these
terms is again a Bessel function that oscillates in time, but now multiplied by sin(p𝜙), so it
oscillates as you move around the disk. One such mode is shown in Figure 11.6.
[Figure 11.6: the normal mode J4(α4,2 r/a) sin(4ϕ).]
The higher the values of n and p the more wiggles a mode will have, but the smaller its
amplitude will be. It’s precisely because it can support so many different oscillatory modes,
each with its own frequency, that a drum can make such a rich sound. At the same time,
the frequency of the dominant mode determines the pitch you associate with a particular
drum.
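To see why these modes do not blend into a single musical tone, compare their frequencies. The time dependence sin(𝛼p,n vt∕a) in Equation 11.6.11 means each mode's frequency is proportional to 𝛼p,n, so frequency ratios are just ratios of Bessel zeros and do not depend on v or a. The sketch below computes a few of those ratios with simple stand-in routines for integer-order Bessel functions; the names `bessel_j`, `bessel_zero`, and `frequency_ratio` are ours.

```python
import math


def bessel_j(p, x, steps=200):
    """Integer-order J_p(x) via Bessel's integral and the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(p * math.pi))
    for i in range(1, steps):
        tau = i * h
        total += math.cos(p * tau - x * math.sin(tau))
    return total * h / math.pi


def bessel_zero(p, n, step=0.25):
    """Approximate the nth positive zero of J_p (scan for sign changes, then bisect)."""
    x = max(float(p), 0.5)
    f = bessel_j(p, x)
    found = 0
    while True:
        x_next = x + step
        f_next = bessel_j(p, x_next)
        if f * f_next < 0.0:
            found += 1
            if found == n:
                lo, hi, f_lo = x, x_next, f
                for _ in range(60):
                    mid = 0.5 * (lo + hi)
                    f_mid = bessel_j(p, mid)
                    if f_lo * f_mid <= 0.0:
                        hi = mid
                    else:
                        lo, f_lo = mid, f_mid
                return 0.5 * (lo + hi)
        x, f = x_next, f_next


def frequency_ratio(p, n):
    """Frequency of mode (p, n) relative to the lowest mode (p = 0, n = 1)."""
    return bessel_zero(p, n) / bessel_zero(0, 1)


ratios = {(p, n): frequency_ratio(p, n) for (p, n) in [(0, 1), (1, 1), (2, 1), (0, 2)]}
for (p, n), r in ratios.items():
    print(f"mode p = {p}, n = {n}: frequency ratio = {r:.3f}")
```

Unlike the harmonics of a vibrating string, these ratios are not whole numbers, which is part of what gives a drum its distinctive sound.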
Stepping Back
When you read through a long, ugly derivation—and in this book, they don’t get much
longer and uglier than the “circular drum” above—it’s easy to get lost in the details and miss
the big picture. Here is everything we went through.
1. We were given the differential equation (the wave equation), the boundary condi-
tions (the tacked-down edge of the drum), and a couple of sample initial conditions
(first the symmetric blow, and then the asymmetric). In addition to the explicit bound-
ary conditions, we recognized that the geometry of the problem led to some implicit
boundary conditions. Without them, we would have found in step 3 that we didn’t
have enough conditions to determine our arbitrary constants.
2. Recognizing the circular nature of the boundary conditions, we wrote the wave
equation in polar coordinates.
3. We separated variables, writing the solution as z(𝜌, 𝜙, t) = R(𝜌)Φ(𝜙)T (t), but otherwise
proceeding exactly as before: plug in, separate (twice, because there are three vari-
ables), solve the ODEs, apply homogeneous boundary conditions, build a series. Not
all of this is easy, but it should mostly be familiar by now; the only new part was that
one of the ODEs led us to Bessel functions.
4. To match initial conditions, we used a Fourier-Bessel series expansion. This step takes
advantage of the fact that Bessel functions, like sines and cosines, form a “complete
basis”: you can sum them to build up almost any function.
5. Finally—possibly the most intimidating step, but also possibly the most important!—
we interpreted our answer. We looked at the coefficients to see how quickly they were
dropping, so we could focus on only the first few terms of the series. We used com-
puters to draw the normal modes: in the problems you will go farther with computers,
and actually view the motion. (Remember that it is almost never possible to explic-
itly sum an infinite series, but—assuming it converges—you can always approximate
it with partial sums.)
It’s important to get used to working with a wide variety of normal modes; trig and Bessel
functions are just the beginning, but the process always remains the same. It’s also impor-
tant to get used to big, ugly equations that you can’t solve or visualize without the aid of a
computer. Mathematical software is as indispensable to the modern scientist or engineer as
slide rules were to a previous generation.
11.122 Plot the first five Bessel functions J0(x)–J4(x) from x = 0 to x = 10.

11.123 The Bessel functions are defined as
\[ J_p(x) = \sum_{m=0}^{\infty} \frac{(-1)^m}{m!\,\Gamma(p+m+1)}\left(\frac{x}{2}\right)^{2m+p} \]
Plot J1(x) and the first 10 partial sums of this series expansion from x = 0 to x = 10 and show that the partial sums are converging to the Bessel function. (You may need to look up the gamma function Γ(x) in the online help for your mathematical software to see how to enter it.)

11.124 Repeat Problem 11.123 for J1∕2(x).

For Problems 11.125–11.127 expand the given function in a Fourier-Bessel series expansion of the given order. You should use a computer to evaluate the integrals, either analytically or numerically. Then plot the first ten partial sums of the expansion on the same plot as the original function so you can see the partial sums converging to the function. Be sure to clearly mark the plot of the function in some way (a different color, a thicker line) so it's clear on the plot which one it is.

11.125 f(x) = 1, 0 < x < 1, p = 1

11.126 f(x) = x, 0 < x < 𝜋, p = 0

11.127 f(x) = sin x, 0 < x < 𝜋, p = 1∕2

Problems 11.128–11.131 refer to a vibrating drumhead (a circle of radius a) with a fixed edge: z(a, 𝜙, t) = 0. In the Explanation (Section 11.6.3) we showed that the solution to the wave equation for this system is Equation 11.6.11. The arbitrary constants in that general solution are determined by the initial conditions. For each set of initial conditions write the complete solution z(𝜌, 𝜙, t). In some cases your answers will contain integrals that cannot be evaluated analytically.

11.128 z(𝜌, 𝜙, 0) = a − 𝜌, ż(𝜌, 𝜙, 0) = 0. This represents pulling the drumhead up by a string attached to its center and then letting go.

11.129 z(𝜌, 𝜙, 0) = 0, ż(𝜌, 𝜙, 0) = c sin(3𝜋𝜌∕a)

11.130 z(𝜌, 𝜙, 0) = 0,
\[ \dot{z}(\rho, \phi, 0) = \begin{cases} c & \rho < a/2 \\ 0 & a/2 < \rho < a \end{cases} \]

11.131 z(𝜌, 𝜙, 0) = (a − 𝜌) sin(𝜙), ż(𝜌, 𝜙, 0) = 0

11.132 In the Explanation (Section 11.6.3), we found the Solution 11.6.12 for an oscillating drumhead that was given a sudden blow in a region in the middle. This might represent a drumhead hit by a drumstick. The solution we got, however, was a fairly complicated looking sum. In this problem you will make plots of this solution; you will need to choose some values for all of the constants in the problem.
(a) Have a computer plot the initial function ż(r, 0) for the first 20 partial sums of this series. As you add terms you should see the partial sums converging towards the shape of the initial conditions that you were solving for.
(b) Take the first three terms in the series (the individual terms, not the partial sums) and for each one use a computer to make an animation of the shape of the drumhead (z, not ż) evolving over time. You should see periodic behavior. Describe how the behavior of these three normal modes is different from each other.
(c) Now make an animation (or just a sequence of plots at different times) of the 20th partial sum of the solutions. Describe the behavior. How is it similar to or different from the behavior of the individual terms you saw in the previous part? If you've done Problem 11.115 from Section 11.5 compare these results to the animation you got for the “square drumhead.” Can you see any qualitative differences?

11.133 Equation 11.6.15 gives the solution for the vibrations of a drumhead struck with an asymmetric blow. Make an animation of this solution showing that motion using the partial sum that goes up to n = 10, p = 11.

11.134 In the Explanation (Section 11.6.3), we found the Solution 11.6.12 for an oscillating drumhead that was given a sudden blow in a region in the middle and we discussed how the different modes oscillating at different frequencies correspond to pitches produced by the drum. To find the frequency of a mode you have to look at the time dependence and recall that frequency is one over period.
(a) For a rubber drum with a sound speed of v = 100 m/s and a radius of a = 0.3 m find the frequency of the dominant mode. Look up what note this corresponds to. Then find the notes corresponding to the next two modes. (The modes other than the dominant one are called “overtones.”)
(b) Find the dominant pitch for a rubber drum with sound speed v = 100 m/s and a radius of 1.0 m.
(c) By stretching the drum more tightly you can increase the sound speed. Find the dominant pitch for a more tightly stretched rubber drum with v = 200 m/s and a radius of a = 0.3 m.
(d) Explain in your own words why these different drums sound so different. How would you design a drum if you wanted it to make a low, booming sound?

11.135 Walk-Through: Differential Equation with Bessel Normal Modes. In this problem you will solve the partial differential equation 𝜕y∕𝜕t − x^(3∕2) (𝜕²y∕𝜕x²) − x^(1∕2) (𝜕y∕𝜕x) = 0 subject to the boundary condition y(1, t) = 0 and the requirement that y(0, t) be finite.
(a) Begin by guessing a separable solution y = X(x)T(t). Plug this guess into the differential equation. Then divide both sides by X(x)T(t) and separate variables.
(b) Find the general solution to the resulting ODE for X(x) three times: with a positive separation constant k², a negative separation constant −k², and a zero separation constant. For each case you can solve the ODE by hand using the variable substitution u = x^(1∕4) or you can use a computer. Show which one of your solutions can match the boundary conditions without requiring X(x) = 0.
(c) Apply the condition that y(0, t) is finite to show that one of the arbitrary constants in your solution from Part (b) must be 0.
(d) Apply the boundary condition y(1, t) = 0 to find all the possible values for k. There will be an infinite number of them, but you should be able to write them in terms of a new constant n that can be any positive integer. Writing k in terms of n will involve 𝛼p,n, the zeros of the Bessel functions.
(e) Solve the ODE for T(t), expressing your answer in terms of n.
(f) Multiply X(x) times T(t) to find the normal modes of this system. You should be able to combine your two arbitrary constants into one. Write the general solution y(x, t) as a sum over these normal modes. Your arbitrary constant should include a subscript n to indicate that they can take different values for each value of n.
(g) Use the initial condition y(x, 0) = sin(𝜋x) to find the arbitrary constants in your solution, using the equations for a Fourier-Bessel series in Appendix J. Your answer will be in the form of an integral that you will not be able to evaluate. Hint: You may have to define a new variable to get the resulting equation to look like the usual form of a Fourier-Bessel series.

11.136 [This problem depends on Problem 11.135.]
(a) Have a computer calculate the 10th partial sum of your solution to Problem 11.135 Part (g). Describe how the function y(x) is evolving over time.
(b) You should have found that the function at x = 0 starts at zero, then rises slightly, and then asymptotically approaches zero again. Explain why it does that. Hint: think about why all the normal modes cancel out at that point initially, and what is happening to each of them over time.

In Problems 11.137–11.140 you will be given a PDE and a set of boundary and initial conditions.
(a) Solve the PDE with the given boundary conditions using separation of variables. You may solve the ODEs you get by hand or with a computer. The solution to the PDE should be an infinite series with undetermined coefficients.
(b) Plug in the given initial condition. The result should be a Fourier-Bessel series. Write an equation for the coefficients in your series using the equations in Appendix J. This equation will involve an integral that you may not be able to evaluate.
(c) Have a computer evaluate the integrals in Part (b) either analytically or numerically to calculate the 20th partial sum of your series solution and either plot the result at several times or make a 3D plot of y(x, t). Describe how the function is evolving over time.
It may help to first work through Problem 11.135 as a model.

11.137
\[ \frac{\partial y}{\partial t} = \frac{\partial^2 y}{\partial x^2} + \frac{1}{x}\frac{\partial y}{\partial x} - \frac{y}{x^2}, \qquad y(3, t) = 0, \qquad y(x, 0) = \begin{cases} 1 & 1 \le x \le 2 \\ 0 & \text{elsewhere} \end{cases} \]
Assume y is finite for 0 ≤ x ≤ 3.
this solution comes from. We will address that important question in Chapter 12 for Bessel
functions, Legendre polynomials, and some other important examples. Our focus in this
chapter is using those ODE solutions to solve PDEs.
The Problem
A spherical shell of radius a surrounds a region of space with no charge; therefore, inside
the sphere, the potential obeys Laplace’s equation 11.2.5. On the surface of the shell, the
given boundary condition is a potential function V (a, 𝜃) = V0 (𝜃). Find the potential inside
the sphere.
Laplace’s equation in spherical coordinates is derived in Chapter 8:
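\[ \frac{\partial^2 V}{\partial r^2} + \frac{2}{r}\frac{\partial V}{\partial r} + \frac{1}{r^2}\left(\frac{\partial^2 V}{\partial \theta^2} + \frac{\cos\theta}{\sin\theta}\frac{\partial V}{\partial \theta} + \frac{1}{\sin^2\theta}\frac{\partial^2 V}{\partial \phi^2}\right) = 0 \]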
Many problems, including this one, have “azimuthal symmetry”: the answer will not depend
on the angle 𝜙. If V has no 𝜙-dependency then 𝜕V ∕𝜕𝜙 = 0, reducing the equation to:
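\[ \frac{\partial^2 V}{\partial r^2} + \frac{2}{r}\frac{\partial V}{\partial r} + \frac{1}{r^2}\left(\frac{\partial^2 V}{\partial \theta^2} + \frac{\cos\theta}{\sin\theta}\frac{\partial V}{\partial \theta}\right) = 0 \]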
where 0 ≤ r ≤ a and 0 ≤ 𝜃 ≤ 𝜋. (We will solve this problem without azimuthal symmetry in
the next section.)
As we have done before we notice a key “implicit” boundary condition, which is that the
potential function must not blow up anywhere inside the sphere.
In[1]:= DSolve[f''[θ] + (Cos[θ]/Sin[θ]) f'[θ] + k f[θ] == 0, f[θ], θ]
Oh no, it’s a whole new kind of function that we haven’t encountered yet in this chapter!
Don’t panic: the main point of this section, and one of the main points of this whole chapter,
is that you can attack this kind of problem the same way no matter what function you find
for the normal modes.
So you look up “LegendreP” and “LegendreQ” in the Mathematica online help, and you
find that these are “Legendre polynomials of the first and second kind,” respectively. If Math-
ematica writes “LegendreP(3,x),” the online help tells you, standard notation would be P3 (x).
So the above solution can be written as:
\[ \Theta(\theta) = A P_l(\cos\theta) + B Q_l(\cos\theta) \quad \text{where} \quad l = \frac{1}{2}\left(-1 + \sqrt{1 + 4k}\right) \]
What else do you need? You need to know a few properties of Legendre polynomials in
order to match the boundary conditions (implicit and explicit), and later you’ll need to
use a Legendre polynomial expansion to match initial conditions, just as we did earlier with
trig functions and Bessel functions. So take a moment to look up Legendre polynomials in
Appendix J: all the information we need for this problem is right there.
In this case, the only boundary condition on 𝜃 is that the function should be bounded
everywhere. Since the argument of Pl and Ql in this solution is cos 𝜃, the Legendre polyno-
mials need to be bounded in the domain [−1, 1]. That’s not true for any of the Ql functions,
and it’s only true for Pl when l is a non-negative integer, so we can write the complete set of
solutions as
Θ(𝜃) = APl (cos 𝜃), l = 0, 1, 2, 3, …
From the definition of l above we can solve for k to find k = l(l + 1). In other words, the only
values of k for which this equation has a bounded solution are 0, 2, 6, 12, 20, etc.
You’ll solve this equation (sometimes called a “Cauchy–Euler Equation”) in the problems,
both by hand and with a computer. The result is
\[ R(r) = B r^l + C r^{-l-1} \]
Since r^(−l−1) blows up at r = 0 for all non-negative integers l, we discard this solution based
on our implicit boundary condition. Combining our two arbitrary constants, we find that:
\[ V(r, \theta) = A r^l P_l(\cos\theta), \qquad l = 0, 1, 2, 3, \ldots \]
As always, since our original equation was linear and homogeneous, we write a general solu-
tion as a sum of all the particular solutions, each with its own arbitrary constant:
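\[ V(r, \theta) = \sum_{l=0}^{\infty} A_l r^l P_l(\cos\theta) \tag{11.7.3} \]
Setting r = a and applying the boundary condition V(a, 𝜃) = V0(𝜃) gives:
\[ V_0(\theta) = \sum_{l=0}^{\infty} A_l a^l P_l(\cos\theta) \]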
We are guaranteed that we can meet this condition, because Legendre polynomials—like
the trig functions and Bessel functions that we have seen before—constitute a complete set of
where
\[ A_l = \frac{2l+1}{2a^l}\int_0^{\pi} V_0(\theta)\, P_l(\cos\theta)\sin\theta\, d\theta \tag{11.7.5} \]
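To make Equation 11.7.5 concrete, here is a minimal numerical sketch. The boundary potential V0(𝜃) = cos²𝜃 and the radius a = 2 are assumptions chosen purely for illustration (they are not part of the problem above); this choice is convenient because cos²𝜃 = (1∕3)P0 + (2∕3)P2, so only A0 and A2 should come out non-zero. The helper names `legendre`, `simpson`, `legendre_coefficients`, and `potential` are ours.

```python
import math


def legendre(l, x):
    """Legendre polynomial P_l(x) from the recurrence
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def simpson(func, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def legendre_coefficients(v0, a, l_max):
    """A_l from Equation 11.7.5 for l = 0 ... l_max."""
    coeffs = []
    for l in range(l_max + 1):
        integral = simpson(
            lambda th, l=l: v0(th) * legendre(l, math.cos(th)) * math.sin(th),
            0.0, math.pi)
        coeffs.append((2 * l + 1) / (2.0 * a ** l) * integral)
    return coeffs


def potential(r, theta, coeffs):
    """Partial sum of V(r, theta) = sum_l A_l r^l P_l(cos theta)."""
    return sum(A_l * r ** l * legendre(l, math.cos(theta))
               for l, A_l in enumerate(coeffs))


def v0_example(theta):
    """Hypothetical boundary potential, chosen only for this illustration."""
    return math.cos(theta) ** 2


RADIUS = 2.0  # assumed shell radius a
A = legendre_coefficients(v0_example, RADIUS, 4)
print("A_l for l = 0..4:", [round(c, 6) for c in A])
print("V at the center:", potential(0.0, 1.0, A))
print("V on the shell at theta = 0.7:", potential(RADIUS, 0.7, A), "vs", v0_example(0.7))
```

With these assumed inputs the series terminates after l = 2, and the potential at the center equals A0, the average of V0 over the sphere.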
We plug this into the original differential equation, multiply through by r²∕[R(r)Ω(𝜃, 𝜙)],
separate out the r-dependent terms, and use the separation constant k to arrive at
r²R′′(r) + 2rR′(r) − kR(r) = 0 and
\[ \frac{\partial^2 \Omega}{\partial \theta^2} + \frac{\cos\theta}{\sin\theta}\frac{\partial \Omega}{\partial \theta} + \frac{1}{\sin^2\theta}\frac{\partial^2 \Omega}{\partial \phi^2} + k\Omega = 0 \tag{11.7.7} \]
In Problem 11.158 you’ll solve Equation 11.7.7 using separation of variables. The solutions
will of course be products of functions of 𝜙 with functions of 𝜃. This PDE comes up so often
in spherical coordinates, however, that its solution has a name. Looking up Equation 11.7.7
in Appendix J we find that the solutions are the “spherical harmonics” Ylm (𝜃, 𝜙), where l is a
non-negative integer such that k = l(l + 1) and m is an integer satisfying |m| ≤ l. (The require-
ment that l and m be integers comes from the implicit boundary conditions on 𝜃 and 𝜙.)
The equation for R(r ) is a Cauchy–Euler equation, just like in the last section. If you plug
in k = l(l + 1) and use the requirement that it be bounded at r = 0 the solution is R(r ) = Cr l .
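As a quick check of that radial solution, here is a short worked calculation. It assumes the usual power-law trial function for a Cauchy–Euler equation; the exponent q is a placeholder symbol of ours and is unrelated to the index m of the spherical harmonics.

```latex
% Trial solution R(r) = r^q in  r^2 R'' + 2 r R' - l(l+1) R = 0:
\begin{aligned}
r^2\, q(q-1)\, r^{q-2} + 2r\, q\, r^{q-1} - l(l+1)\, r^{q} &= 0 \\
\left[\, q(q+1) - l(l+1) \,\right] r^{q} &= 0 \\
(q - l)(q + l + 1) &= 0
\quad\Longrightarrow\quad q = l \ \text{ or } \ q = -l-1 .
\end{aligned}
```

The root q = −l−1 gives a solution that blows up at r = 0 for every non-negative integer l, which is why only the r^l solution survives.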
The solution to Equation 11.7.6 is thus
\[ V(r, \theta, \phi) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} C_{lm}\, r^l\, Y_l^m(\theta, \phi) \]
\[ V_0(\theta, \phi) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} C_{lm}\, a^l\, Y_l^m(\theta, \phi) \]
From the formula in Appendix J for the coefficients of a spherical harmonic expansion we
can write
\[ C_{lm} = \frac{1}{a^l}\int_0^{2\pi}\!\!\int_0^{\pi} V_0(\theta, \phi)\left[Y_l^m(\theta, \phi)\right]^{*}\sin\theta\, d\theta\, d\phi \]
where [Ylm(𝜃, 𝜙)]* designates the complex conjugate of the spherical harmonic function
Ylm(𝜃, 𝜙). This completes our solution for the potential in a hollow sphere.
Spherical Harmonics
Spherical harmonics are a computational convenience, not a new method of solving differ-
ential equations. We could have solved the problem above by separating all three variables:
we would have found a complex exponential function in 𝜙 and an associated Legendre
polynomial in 𝜃 and then multiplied the two. (“Associated Legendre polynomials” are yet
another special function, different from “Legendre polynomials.” As you might guess, they
are in Appendix J.) All we did here was skip a step by defining a spherical harmonic as
precisely that product: a complex exponential in 𝜙, multiplied by an associated Legendre
polynomial in 𝜃.
Complex exponentials form a complete basis (Fourier series), and so do associated
Legendre polynomials. By multiplying them we create a complete basis for functions of 𝜃
and 𝜙. Essentially any function that depends only on direction, and not on distance, can
be expanded as a sum of spherical harmonics. That makes them useful for solving partial
differential equations in spherical coordinates, but also for a wide variety of other problems
ranging from analyzing radiation coming to us from space to characterizing lesions in
multiple sclerosis patients.8
The last four sections of this chapter have all been on solving partial differential equations
by separation of variables. The topic deserves that much space: partial differential equations
come up in almost every aspect of physics and engineering, and separation of variables is the
most common way of handling them.
8 Goldberg-Zimring, Daniel, et al., “Application of spherical harmonics derived space rotation invariant indices to
the analysis of multiple sclerosis lesions’ geometry by MRI,” Magnetic Resonance Imaging, Volume 22, Issue 6, July
2004, Pages 815-825.
On the other hand there’s a real danger that, after going through four sections on one
technique, you will feel like there is a mountain of trivia to master. We started this section
by listing the four steps of every separation of variables problem. Let’s return to that out-
line, but fill in all the “gotchas” and forks in the road. You may be surprised at how few
there are.
1. Separate variables to turn one partial differential equation into several ordinary
differential equations.
In this step you “guess” a solution that is separated into functions of one variable
each. (Recall that such a solution is a normal mode.) Then you do a bit of algebra to
isolate the dependency: for instance, one side depends only on 𝜃, the other does not
depend at all on 𝜃. Finally you set both sides of the equation equal to a constant—the
same constant for both sides!—often using k 2 if you can determine up front that the
constant is positive, or −k 2 if negative.
If separating the variables turns out to be impossible (as it will for almost any
inhomogeneous equation for instance), you can’t use this technique. The last few
sections of this chapter will present alternative techniques you can use in such cases.
If there are three variables you go through this process twice, introducing two
constants…and so on for higher numbers of variables. (Spherical harmonics represent
an exception to this rule in one specific but important special case.)
2. Solve the ordinary differential equations. The product of these solutions is one solu-
tion to the original equation.
Sometimes the “solving” step can be done by inspection. When the solution is
less obvious you may use a variable substitution and table lookup, or you may use a
computer.
The number of possible functions you might get is the most daunting part of
the process. Trig functions (regular and hyperbolic), exponential functions (real and
complex), Bessel functions, Legendre polynomials…no matter how many we show
you in this chapter, you may encounter a new one next week. But as long as you can
look up the function’s general behavior, its zeros, and how to use it in series to build
up other functions, you can work with it to find a solution. That being said, we have
not chosen arbitrarily which functions to showcase: trig functions, Bessel functions,
Legendre polynomials, and spherical harmonics are the most important examples.
3. Match any homogeneous boundary conditions.
Matching the homogeneous boundary conditions sometimes tells you the value
of one or more arbitrary constants, and sometimes limits the possible values of the con-
stants of separation. If you have homogeneous initial conditions, this method generally
won’t work.
4. Sum up all solutions into a series, and then match the inhomogeneous boundary and
initial conditions.
First you write a series solution by summing up all the solutions you have pre-
viously found. (The original differential equation must be linear and homogeneous,
so a linear combination of solutions—a series—is itself a solution.) Then you set that
series equal to your inhomogeneous boundary or initial condition. (Your normal
modes must form a “complete basis,” so you can add them up to meet any arbitrary
conditions.)
If you had three independent variables—and therefore two separation
constants—the result is a double series. This trend continues upward as the number
of variables climbs.
If you have more than one inhomogeneous condition, you create subproblems
with one inhomogeneous condition each: this will be the subject of the next section.
The rest of the chapter discusses additional techniques that can be used instead of, or some-
times alongside, separation of variables for different types of problems. Appendix I gives a
flow chart for deciding which techniques to use for which problems. If you go through that
process and decide that separation of variables is the right technique to use, you might find
it helpful to refer to the list above as a reminder of the key steps in the process.
(g) Demonstrate that your solution satisfies the original differential equation and the initial condition.

In Problems 11.150–11.153 you will be given a PDE, a domain, and a set of initial conditions. You should assume in each case that the function is finite in the

polynomials that you can recombine into spherical harmonics.)

11.154 𝜕y∕𝜕t = 𝜕²y∕𝜕𝜃² + cot 𝜃 (𝜕y∕𝜕𝜃) + csc²𝜃 (𝜕²y∕𝜕𝜙²), 0 ≤ 𝜃 ≤ 𝜋, 0 ≤ 𝜙 ≤ 2𝜋, y(𝜃, 𝜙, 0) = 𝜃(𝜋 − 𝜃) cos(𝜙)

𝜕y ( ) 𝜕2 y 𝜕y 1 𝜕2 y
(a) Use the formula we derived to find the gravitational potential inside a spherical shell that is held at a constant potential V0. Simplify your answer as much as possible. Hint: the integral
\[ \int_0^{2\pi}\!\!\int_0^{\pi} \left[Y_l^m(\theta, \phi)\right]^{*}\sin\theta\, d\theta\, d\phi \]
equals 2√𝜋 for l = m = 0 and 0 for all other l and m.
(b) The gravitational field is given by g⃗ = −∇⃗V. Use your result from Part (a) to find the gravitational field in the interior of the sphere.

11.160 Solve the equation
\[ \frac{\partial^2 y}{\partial t^2} = \left(4 - x^2\right)\frac{\partial^2 y}{\partial x^2} - 2x\frac{\partial y}{\partial x} - \frac{4}{4 - x^2}\, y \]
on the domain −2 ≤ x ≤ 2 with initial conditions y(x, 0) = 4 − x², 𝜕y∕𝜕t(x, 0) = 0.

11.161 The equation ∇²u = −𝜆²u is known as the “Helmholtz equation.” For this problem you will solve it in a sphere of radius a with boundary condition u(a, 𝜃, 𝜙) = 0. You should also use the implicit boundary condition that the function u is finite everywhere inside the sphere. Note that you will need to use the formula for the Laplacian in spherical coordinates.
(a) Separate variables, putting R(r) on one side and Ω(𝜃, 𝜙) on the other. Solve the equation for Ω(𝜃, 𝜙) without separating a second time. You should find that this equation only has non-trivial solutions for some values of the separation constant.
(b) Using the values of the separation constant you found in the last part, solve the equation for R(r), applying the boundary condition. You should find that only some values of 𝜆 allow you to satisfy the boundary condition. Clearly indicate the allowed values of 𝜆.
(c) Combine your answers to write the general solution to the Helmholtz equation in spherical coordinates. You should find that the boundary conditions given in the problem were not sufficient to specify the solution. Instead the general solution will be a double sum over two indices with arbitrary coefficients in front of each term in the sum.

11.162 A quantum mechanical particle moving freely inside a spherical container of radius a can be described by a wavefunction Ψ that obeys Schrödinger’s equation (11.2.6) with V(x⃗) = 0 and boundary condition Ψ(a, 𝜃, 𝜙, t) = 0. In this problem you will find the possible energies for such a particle.
(a) Write Schrödinger’s equation with V(x⃗) = 0 in spherical coordinates. Plug in a trial solution Ψ(r, 𝜃, 𝜙, t) = T(t)𝜓(r, 𝜃, 𝜙) and separate variables to get an equation for T(t). Verify that T(t) = Ae^(−iEt∕ℏ) is the solution to your equation for T(t), where E is the separation constant. That constant represents the energy of the atom.
(b) Plug 𝜓(r, 𝜃, 𝜙) = R(r)Ω(𝜃, 𝜙) into the remaining equation, separate, and solve for Ω(𝜃, 𝜙). Using the implicit boundary conditions that Ψ must be finite everywhere inside the sphere and periodic in 𝜙, show that the separation constant must be l(l + 1) where l is an integer. (Hint: If you get a different set of allowed values for the separation constant you should think about how you can simplify the equation and which side the constant E should go on to get the answer to come out this way.)
(c) Solve the remaining equation for R(r). If you write the separation constant in the form l(l + 1) you should be able to recognize the equation as being similar to one of the ones in Appendix J. You can get it in the right form with a variable substitution or solve it on a computer. You should find that the implicit boundary condition that Ψ is finite allows you to set one arbitrary constant to zero. The explicit boundary condition Ψ(a, 𝜃, 𝜙, t) = 0 should allow you to specify the allowed values of E. These are the possible energies for a quantum particle confined to a sphere.
When a large star collapses at the end of its life it becomes a dense sphere of particles known as a neutron star. Knowing the possible energies of a particle in a sphere allows astronomers to predict the behavior of these objects.

11.163 Exploration: Potential Inside a Hemisphere
In this problem you will find the electric potential inside a hollow hemisphere of radius a with a constant potential V = c on the curved upper surface of the hemisphere and V = 0 on the flat lower surface.
Every example we have worked so far has either been an initial value problem with
homogeneous boundary conditions or a boundary value problem with one inhomogeneous
boundary condition. We applied the homogeneous conditions before we summed; after we
summed we could find the coefficients of our series solution to match the inhomogeneous
condition at the one remaining boundary.
If we have multiple inhomogeneous conditions, we bring back an old friend from
Chapter 1. To solve inhomogeneous ordinary differential equations we found two different
solutions—a “particular” solution and a “complementary” solution—that summed to the
general solution we were looking for. The same technique applies to the world of partial
differential equations. In Problem 11.176 you’ll apply this to solve an inhomogeneous PDE
just as you did with ODEs. In this section, however, we’ll show you how to use the same
method to solve linear, homogeneous differential equations with multiple inhomogeneous boundary
conditions.
The two examples below are different in several ways. In both cases, however, our
approach will involve finding two different functions that we can add to solve the original
problem we were given.
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$
The Approach
We’re going to solve two different problems, neither of which is exactly the problem we were
given. The first problem is to find a function uL (x, y) that is L(y) on the left side and zero
on all three other sides. The second problem is to find a function uT (x, y) that is T (x)
on the top and zero on all three other sides.
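[Figure: two copies of the rectangle 0 ≤ x ≤ w, 0 ≤ y ≤ h, each with ∇²u = 0 inside. Left panel, uL (x, y): u = L(y) on the left side and u = 0 on the other three sides. Right panel, uT (x, y): u = T (x) on the top and u = 0 on the other three sides.]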
Each subproblem has only one inhomogeneous condition, so we can take our familiar
approach: separate variables and apply the homogeneous conditions, then create a series,
and finally match the inhomogeneous condition. Of course, neither uL nor uT is a solution
to our original problem! However, their sum (uL + uT )(x, y) will satisfy Laplace’s equation
and fit the boundary conditions on all four sides.
$$\frac{X''(x)}{X(x)} = -\frac{Y''(y)}{Y(y)}$$
Since we wish uL (x, y) to go to zero at y = 0 and y = h, the separation constant must be positive,
so we shall call it k 2 . Solving the second equation and applying the boundary conditions:
Y (y) = A sin(ky)
where k = n𝜋∕h. The first equation comes out quite differently. The positive separation con-
stant leads to an exponential solution:
X (x) = Ce kx + De −kx
Having plugged in our three homogeneous conditions, we now combine the two functions,
absorb C into A, and write a sum:
$$u_L(x, y) = \sum_{n=1}^{\infty} A_n \sin(ky)\left(e^{kx} - e^{k(2w - x)}\right)$$
where k = n𝜋∕h.
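The factor in parentheses comes from the one homogeneous condition on X, namely uL (w, y) = 0. A short sketch of that step, using the constants C and D from the expression for X (x) above:

```latex
X(w) = C e^{kw} + D e^{-kw} = 0
\quad\Longrightarrow\quad
D = -C e^{2kw}

X(x) = C\left(e^{kx} - e^{2kw} e^{-kx}\right)
     = C\left(e^{kx} - e^{k(2w - x)}\right)
```

The constant C then multiplies A, which is why a single coefficient An survives in each term of the sum.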
$$\sum_{n=1}^{\infty} A_n \sin\left(\frac{n\pi}{h}y\right)\left(1 - e^{2kw}\right) = L(y)$$
This is a Fourier sine series. You can find the formula for the coefficients in Appendix G.
$$A_n\left(1 - e^{2kw}\right) = \frac{2}{h}\int_0^h L(y)\sin\left(\frac{n\pi}{h}y\right)dy$$
For the second function uT (x, y) we start the process over, separating the variables and
using a negative separation constant −p 2 this time. You will go through this process in
Problem 11.164 and arrive at:
$$u_T(x, y) = \sum_{m=1}^{\infty} B_m \sin(px)\left(e^{py} - e^{-py}\right) \tag{11.8.1}$$
$$\sum_{m=1}^{\infty} B_m \sin\left(\frac{m\pi}{w}x\right)\left(e^{ph} - e^{-ph}\right) = T(x)$$
$$B_m\left(e^{ph} - e^{-ph}\right) = \frac{2}{w}\int_0^w T(x)\sin\left(\frac{m\pi}{w}x\right)dx$$
$$u(x, y) = \sum_{n=1}^{\infty} A_n \sin(ky)\left(e^{kx} - e^{k(2w - x)}\right) + \sum_{m=1}^{\infty} B_m \sin(px)\left(e^{py} - e^{-py}\right)$$
where $k = n\pi/h$, $p = m\pi/w$, $A_n\left(1 - e^{2kw}\right) = (2/h)\int_0^h L(y)\sin(n\pi y/h)\,dy$, and $B_m\left(e^{ph} - e^{-ph}\right) = (2/w)\int_0^w T(x)\sin(m\pi x/w)\,dx$.
That is the step you need to think about! Assuming that uL and uT solve the individual
problems they were designed to solve…
∙ Can you convince yourself that uL + uT is still a solution to the differential equation
𝜕 2 u∕𝜕x 2 + 𝜕 2 u∕𝜕y2 = 0? (This would not work for all differential equations: why must
it for this one?)
∙ Can you convince yourself that uL + uT meets all the proper boundary condi-
tions? (These are different from the boundary conditions that either function
meets alone!)
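Both questions can also be checked numerically. The following is a minimal sketch, not part of the original derivation: the boundary functions, the rectangle size, and all function names are hypothetical choices made for illustration, and the exponentials are rewritten with sinh (an algebraically equivalent form) so that large values of k do not overflow.

```python
import math


def sine_coeff(func, length, n, samples=2000):
    """Approximate (2/length) * integral_0^length func(s) sin(n pi s / length) ds (midpoint rule)."""
    ds = length / samples
    total = 0.0
    for i in range(samples):
        s = (i + 0.5) * ds
        total += func(s) * math.sin(n * math.pi * s / length)
    return 2.0 / length * total * ds


def make_solutions(left, top, w, h, terms=40):
    """Return (u_left, u_top) for the rectangle 0 <= x <= w, 0 <= y <= h."""
    # a[n-1] stands for A_n (1 - e^{2kw}); b[m-1] stands for B_m (e^{ph} - e^{-ph}).
    a = [sine_coeff(left, h, n) for n in range(1, terms + 1)]
    b = [sine_coeff(top, w, m) for m in range(1, terms + 1)]

    def u_left(x, y):
        total = 0.0
        for n, coeff in enumerate(a, start=1):
            k = n * math.pi / h
            # (e^{kx} - e^{k(2w-x)}) / (1 - e^{2kw}) equals sinh(k(w-x)) / sinh(kw)
            total += coeff * math.sin(k * y) * math.sinh(k * (w - x)) / math.sinh(k * w)
        return total

    def u_top(x, y):
        total = 0.0
        for m, coeff in enumerate(b, start=1):
            p = m * math.pi / w
            # (e^{py} - e^{-py}) / (e^{ph} - e^{-ph}) equals sinh(py) / sinh(ph)
            total += coeff * math.sin(p * x) * math.sinh(p * y) / math.sinh(p * h)
        return total

    return u_left, u_top


# Hypothetical boundary data, chosen only for illustration.
W, H = 2.0, 1.0
left_data = lambda y: math.sin(math.pi * y / H)
top_data = lambda x: math.sin(2.0 * math.pi * x / W)
u_left, u_top = make_solutions(left_data, top_data, W, H)


def u(x, y):
    return u_left(x, y) + u_top(x, y)


def laplacian(func, x, y, step=1e-3):
    """Five-point finite-difference estimate of the Laplacian."""
    return (func(x + step, y) + func(x - step, y) + func(x, y + step)
            + func(x, y - step) - 4.0 * func(x, y)) / step**2


print(u(0.0, 0.3), left_data(0.3))   # left edge: the sum reproduces L(y)
print(u(0.5, H), top_data(0.5))      # top edge: the sum reproduces T(x)
print(u(W, 0.3), u(0.5, 0.0))        # right and bottom edges: close to zero
print(laplacian(u, 0.7, 0.4))        # interior: close to zero
```

Each edge works because one of the two functions supplies the required values there while the other contributes zero, and the interior check works because Laplace's equation is linear and homogeneous.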
As a final thought, before we move on to the next example: suppose all four boundary condi-
tions had been inhomogeneous instead of only two. Can you outline the approach we would
use to find a general solution? What individual problems would we have to solve first?
The Approach
Once again we are going to break the initial problem into two subproblems that will sum to
our solution. In this case we will find a “particular” solution and a “complementary” solution,
just as we did for ordinary differential equations.
The “particular” solution uP (x, t) will solve our differential equation with the original
boundary conditions. It will be a very simple, very specific solution with no arbitrary con-
stants.
The “complementary” solution will solve our differential equation with homogeneous
boundary conditions uC (0, t) = uC (L, t) = 0. It will have the arbitrary constants of a general
solution: therefore, after we add it to our particular solution, we will be able to match initial
conditions.
zero at both ends with non-zero values in the middle, so we choose a negative separation
constant −k 2 . In Problem 11.170 you’ll solve for X (x) and apply the boundary conditions
X (0) = X (L) = 0, then solve for T (t), and end up here.
$$u_C(x, t) = Ae^{-k^2 t}\sin\left(\frac{k}{\sqrt{\alpha}}x\right) \quad \text{where } k = (n\pi/L)\sqrt{\alpha},\ n = 1, 2, 3, \ldots \tag{11.8.2}$$
Since uC solves a homogeneous differential equation with homogeneous boundary condi-
tions, any sum of solutions is itself a solution, so the general solution is:
$$u_C(x, t) = \sum_{n=1}^{\infty} A_n e^{-\alpha(n^2\pi^2/L^2)t}\sin\left(\frac{n\pi}{L}x\right)$$
$$u(x, t) = \frac{T_R - T_L}{L}x + T_L + \sum_{n=1}^{\infty} A_n e^{-\alpha(n^2\pi^2/L^2)t}\sin\left(\frac{n\pi}{L}x\right)$$
As a final step, of course, we must make this entire function match the initial condition u(x, 0) =
T0 (x). Plugging t = 0 into our function gives us:
$$u(x, 0) = \frac{T_R - T_L}{L}x + T_L + \sum_{n=1}^{\infty} A_n \sin\left(\frac{n\pi}{L}x\right) = T_0(x)$$
As usual we appeal to the power of a Fourier sine series to represent any function on a finite
domain. In this case we have to choose our coefficients An so that:
$$\sum_{n=1}^{\infty} A_n \sin\left(\frac{n\pi}{L}x\right) = T_0(x) - \frac{T_R - T_L}{L}x - T_L$$
The solution is straightforward to write down in general, even though it may be intimidating
to calculate for a given function T0 :
$$A_n = \frac{2}{L}\int_0^L \left(T_0(x) - \frac{T_R - T_L}{L}x - T_L\right)\sin\left(\frac{n\pi}{L}x\right)dx$$
And what of the complementary solution? Because we were forced by the second-order
equation for X (x) to choose a negative separation constant, the first-order equation for
T (t) led to a decaying exponential function $e^{-k^2 t}$. No matter what the initial conditions, the
complementary solution will gradually die down toward zero; the total solution uP + uC will
approach the solution represented by uP . In the language of physics, uP represents a stable
solution: the system will always trend toward that state.
In some situations you will find a steady-state solution that the system does not approach
over time, analogous to an unstable equilibrium point in mechanics. In many cases such as
this one, though, the steady-state solution represents the late-time behavior of the system
and the complementary solution represents the transient response to the initial conditions.
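The split into a steady-state part and a transient part is easy to see numerically. The sketch below is an illustration under stated assumptions: the rod length, diffusivity, end temperatures, and initial profile are hypothetical values, the function names are invented here, and the coefficient integral for An is approximated with a midpoint rule.

```python
import math


def heat_solution(t0, t_left, t_right, length, alpha, terms=50, samples=2000):
    """Return (u, u_p): the series solution u(x, t) and the particular solution u_p(x)."""

    def u_p(x):
        # Linear profile that meets both end temperatures.
        return (t_right - t_left) / length * x + t_left

    # A_n = (2/L) * integral_0^L (T0(x) - u_p(x)) sin(n pi x / L) dx, midpoint rule.
    dx = length / samples
    coeffs = []
    for n in range(1, terms + 1):
        total = 0.0
        for i in range(samples):
            x = (i + 0.5) * dx
            total += (t0(x) - u_p(x)) * math.sin(n * math.pi * x / length)
        coeffs.append(2.0 / length * total * dx)

    def u(x, t):
        total = u_p(x)
        for n, a_n in enumerate(coeffs, start=1):
            decay = alpha * (n * math.pi / length) ** 2
            total += a_n * math.exp(-decay * t) * math.sin(n * math.pi * x / length)
        return total

    return u, u_p


# Hypothetical rod: ends held at 10 and 40, uniform initial temperature 20.
u, u_p = heat_solution(lambda x: 20.0, 10.0, 40.0, length=2.0, alpha=0.5)
for t in (0.1, 1.0, 10.0):
    print(t, u(1.0, t), u_p(1.0))   # u(1, t) drifts toward u_p(1) = 25 as t grows
```

Every term of the complementary sum carries its own decaying exponential, so the midpoint temperature moves from near its initial value toward the linear profile regardless of the initial condition chosen.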
The fact that this physical system approaches the steady-state solution is not surprising if
you think about it. If the ends of the rod are held at constant temperature for a long time,
the rod will move toward the simplest possible temperature distribution, which is a linear
change from the left temperature to the right. We mention this to remind you of where we
began the very first chapter of the book. Solving equations is a valuable skill, but computer
solutions are becoming faster and more accurate all the time. Understanding the solutions, on
the other hand, will require human intervention for the foreseeable future. To put it bluntly,
interpreting solutions is the part that someone might pay you to do.
has no arbitrary constants it cannot be made to satisfy the initial condition.
(b) Find a complementary solution to the equation 𝜕uC∕𝜕t = 𝜕²uC∕𝜕x² − uC subject to the homogeneous boundary conditions u(0, t) = u(1, t) = 0. (You can solve this PDE by separation of variables.) This complementary solution should have an arbitrary constant, but it will not satisfy the original boundary conditions.
(c) Add these two solutions to get the general solution to the original PDE with the original boundary conditions.
(d) Apply the initial conditions to this general solution and solve for the arbitrary constants to find the complete solution u(x, t) to this problem.
(e) What function is your solution approaching as t → ∞?

For Problems 11.172–11.174 solve the given PDEs subject to the given boundary and initial conditions. For each one you will need to find a steady-state solution and a complementary solution, add the two, and then apply the initial conditions. For each problem, will the general solution approach the steady-state solution or not? Explain. It may help to first work through Problem 11.171 as a model.

11.172 𝜕²u∕𝜕t² = v²(𝜕²u∕𝜕x²), u(0, t) = 0, u(1, t) = 2, u(x, 0) = 𝜕u∕𝜕t(x, 0) = 0
11.173 𝜕u∕𝜕t = 𝜕²u∕𝜕𝜃² + u, u(0, t) = 0, u(𝜋∕2, t) = 𝛼, u(𝜃, 0) = 𝛼 [sin(𝜃) + sin(4𝜃)].
11.174 𝜕u∕𝜕t = 𝜕²u∕𝜕x² + (1∕x)(𝜕u∕𝜕x) − (1∕x²)u, u(0, t) = 0, u(1, t) = 1, u(x, 0) = 0

11.175 The left end of a rod at x = 0 is immersed in ice water that holds the end at 0°C and the right edge at x = 1 is immersed in boiling water that keeps it at 100°C. (You may assume throughout the problem that all temperatures are measured in degrees Celsius and all distances are measured in meters.)
(a) What is the steady-state temperature uSS (x) of the rod at late times?
(b) Solve the heat equation (11.2.3) with these boundary conditions and with initial condition u(x, 0) = 100x(2 − x) to find the temperature u(x, t). You can use your steady-state solution as a “particular” solution.
(c) The value of 𝛼 for a copper rod is about 𝛼 = 10⁻⁴ m²/s. If a 1 meter copper rod started with the initial temperature given in this problem, how long would it take for the point x = 1∕2 to get within 1% of its steady-state temperature? Although your solution for u(x, t) is an infinite series, you should answer this question by neglecting all of the terms except uSS and the first non-zero term of uC.

11.176 Exploration: An Inhomogeneous PDE
We have used the technique of finding a particular and a complementary solution in two different contexts: inhomogeneous ODEs, and homogeneous PDEs with multiple inhomogeneous conditions. We can use the same technique for inhomogeneous PDEs. Consider as an example the equation 𝜕u∕𝜕t − 𝜕²u∕𝜕x² = 𝜅 subject to the boundary conditions u(0, t) = 0, u(1, t) = 1 and the initial condition u(x, 0) = (1 + 𝜅∕2) x − (𝜅∕2)x² + sin(3𝜋x).
(a) Find a steady-state solution uSS that solves −𝜕²uSS∕𝜕x² = 𝜅 subject to the boundary conditions given above. Because this will be a particular solution you do not need any arbitrary constants.
(b) Find the general complementary solution that solves 𝜕uC∕𝜕t − 𝜕²uC∕𝜕x² = 0 subject to the boundary conditions u(0, t) = u(1, t) = 0. You don’t need to apply the initial conditions yet.
(c) Add the two to find the general solution to the original PDE subject to the original boundary conditions. Apply the initial condition and solve for the arbitrary constants to get the complete solution.
(d) Verify by direct substitution that this solution satisfies the PDE, the boundary conditions, and the initial condition.
All three techniques start by writing the given differential equation in a different form.
The specific transformation is chosen to turn a derivative into a multiplication, thereby turn-
ing a PDE into an ODE.
As you read through this section, watch how a derivative turns into a multiplication, and
how the resulting differential equation can be solved to find the function we were looking
for—in the form of a Fourier series. After you follow the details, step back and consider what
it was that made a Fourier series helpful for this particular differential equation, and you will
be well set for the other transformations we will discuss.
$$\frac{\partial u}{\partial t} = \alpha\frac{\partial^2 u}{\partial x^2} \tag{11.9.1}$$
u(0, t) = u(L, t) = 0
We previously found the temperature u(x, t) of such a bar by using separation of variables;
you are now going to solve the same problem (and hopefully get the same answer!) using a
different technique, “eigenfunction expansion.” Later we will see that eigenfunction expan-
sion can be used in some situations where separation of variables cannot—most notably in
solving some inhomogeneous equations.
To begin with, replace the unknown function u(x, t) with its unknown Fourier sine
expansion in x.
$$u(x, t) = \sum_{n=1}^{\infty} b_n(t)\sin\left(\frac{n\pi}{L}x\right) \tag{11.9.2}$$
1. Does this particular choice (a Fourier series with no cosines) guarantee that you meet
your boundary conditions? If so, explain why. If not, explain what further steps will be
taken later to meet them.
2. What is the second derivative with respect to x of bn (t) sin (n𝜋x∕L)?
3. What is the first derivative with respect to t of bn (t) sin (n𝜋x∕L)?
4. Replacing u(x, t) with its Fourier sine series as shown in Equation 11.9.2, rewrite
Equation 11.9.1.
See Check Yourself #77 in Appendix L
Now we use one of the key mathematical facts that makes this technique work: if two
Fourier sine series with the same frequencies are equal to each other, then the coefficients
must equal each other. For instance, the coefficient of sin(3x) in the first series must equal
the coefficient of sin(3x) in the second series, and so on.
5. Set the nth coefficient on the left side of your answer to Part 4 equal to the nth coef-
ficient on the right. The result should be an ordinary differential equation for the
function bn (t).
6. Solve your equation to find the function bn (t).
7. Write the function u(x, t) as a Fourier sine series, with the coefficients properly
filled in.
8. How will your temperature function behave after a long time? Answer this question
based on your answer to Part 7; then explain why this answer makes sense in light of
the physical situation.
The Problem
Let us return to the first problem we solved with separation of variables: a string fixed at the
points (0, 0) and (L, 0) but free to vibrate between those points. The initial position y(x, 0) =
f (x) and initial velocity (dy∕dt)(x, 0) = g (x) are specified as before.
In this case, however, there is a “driving function”—which is to say, the string is subjected
to an external force that pushes it up or down. This could be a very simple force such as
gravity (pulling down equally at all times and places), or it could be something much more
complicated such as shifting air pressure around the string. Our differential equation is now
inhomogeneous:
$$\frac{\partial^2 y}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2} = q(x, t) \tag{11.9.3}$$
where q(x, t) is proportional to the external force on the string. In Problem 11.177 you will
show that separation of variables doesn’t work on this problem. Below we demonstrate a
method that does.
then a1 must equal b1 , and so on. So after we turn both sides of our equation into
Fourier series, we will confidently assert that the corresponding coefficients must be
equal, and solve for them.
∙ How does this make the problem easier? Suppose some function f (x) is expressed as a
Fourier sine series:
$$f(x) = b_1\sin(x) + b_2\sin(2x) + b_3\sin(3x) + \cdots$$
What happens, term by term, when you take the second derivative? b1 sin(x) is just
multiplied by −1. b2 sin(2x) is multiplied by −4, the next term by −9, and so on. In
general, for any function bn sin(nx), taking a second derivative is the same as multiplying
by −n 2 .
So, what happens to a differential equation when you replace a derivative with a
multiplication? If you started with an ordinary differential equation, you are left with
an algebra equation. In our example, starting with a two-variable partial differential
equation, we will be left with an ordinary differential equation. If we had started with
a three-variable PDE, we would be down to two…and so on.
This will all (hopefully) become clear as you look through the example, but now
you know what you’re looking for.
∙ Do you always use a Fourier sine series? No. Below we discuss why that type of series is
the best choice for this particular problem.
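The claim that a second derivative acts on bn sin(nx) as multiplication by −n² can be checked with a finite-difference estimate. This is a small sketch for illustration only; the coefficient 0.7, the sample point 0.4, and the helper name are arbitrary choices, not values from the text.

```python
import math


def second_derivative(f, x, step=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / step**2


ratios = {}
for n in (1, 2, 3):
    # One term of a sine series, with an arbitrary coefficient 0.7.
    term = lambda x, n=n: 0.7 * math.sin(n * x)
    x0 = 0.4
    ratios[n] = second_derivative(term, x0) / term(x0)
    print(n, ratios[n])   # each ratio is close to -n**2
```

The ratio does not depend on the coefficient or on the point chosen, which is what lets the derivative be traded for a multiplication term by term.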
A note about notation: There is no standard notation for representing different Fourier
series in the same problem, but we can’t just use b3 to mean “the coefficient of sin(3x)”
when we have four different series with different coefficients. So we’re going to use by3 for
“the coefficient of sin(3x) in the Fourier series for y(x, t),” bq3 for the third coefficient in the
expansion of q(x, t), and similarly for f (x) and g (x).
The Expansion
The first step is to replace y(x, t) with its Fourier expansion in x. Recalling that y(x, t) is defined
on the interval 0 ≤ x ≤ L, we have three options: a sine-and-cosine series with period L, a
sine-only expansion with period 2L based on an odd extension on the negative side, or
a cosine-only expansion with period 2L based on an even extension on the negative side.
Chapter 9 discussed these alternatives and why the boundary conditions y(0, t) = y(L, t) = 0
lend themselves to a sine expansion. We therefore choose to create an odd extension of our
function, and write:
$$y(x, t) = \sum_{n=1}^{\infty} b_{yn}(t)\sin\left(\frac{n\pi}{L}x\right) \tag{11.9.4}$$
with the frequency n𝜋∕L chosen to provide the necessary period 2L.
This differs from Chapter 9 because we are expanding a multivariate function (x and t)
in one variable only (x). You can think about it this way: when t = 2, Equation 11.9.4 rep-
resents a specific function y(x) being expanded into a Fourier series with certain (constant)
coefficients. When t = 3 a different y(x) is being expanded, with different coefficients…and
so on, for all relevant t-values. So the coefficients byn are constants with respect to x, but vary
with respect to time.
We will similarly expand all the other functions in the problem:
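$$q(x, t) = \sum_{n=1}^{\infty} b_{qn}(t)\sin\left(\frac{n\pi}{L}x\right), \quad f(x) = \sum_{n=1}^{\infty} b_{fn}\sin\left(\frac{n\pi}{L}x\right), \quad g(x) = \sum_{n=1}^{\infty} b_{gn}\sin\left(\frac{n\pi}{L}x\right)$$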
Since the initial functions f (x) and g (x) have no time dependence, bfn and bgn are constants.
The next step—rearranging the terms—may look suspicious if you know the properties of
infinite series, and we will discuss later why it is valid in this case. But if you put aside your
doubts, the algebra is straightforward.
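$$\sum_{n=1}^{\infty}\left[\frac{\partial^2}{\partial x^2}\left(b_{yn}(t)\sin\left(\frac{n\pi}{L}x\right)\right) - \frac{1}{v^2}\frac{\partial^2}{\partial t^2}\left(b_{yn}(t)\sin\left(\frac{n\pi}{L}x\right)\right)\right] = \sum_{n=1}^{\infty} b_{qn}(t)\sin\left(\frac{n\pi}{L}x\right)$$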
Now we take those derivatives. Pay particular attention, because this is the step where the
equation becomes easier to work with.
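$$\sum_{n=1}^{\infty}\left[-\frac{n^2\pi^2}{L^2}b_{yn}(t)\sin\left(\frac{n\pi}{L}x\right) - \frac{1}{v^2}\frac{d^2 b_{yn}(t)}{dt^2}\sin\left(\frac{n\pi}{L}x\right)\right] = \sum_{n=1}^{\infty} b_{qn}(t)\sin\left(\frac{n\pi}{L}x\right)$$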
As we discussed earlier, the 𝜕 2 ∕𝜕x 2 operator has been replaced by a multiplied constant. At
the same time, the partial derivative with respect to time has been replaced by an ordinary
derivative, since byn has no x-dependence.
We can now rewrite our original differential equation 11.9.3 with both sides Fourier
expanded:
$$\sum_{n=1}^{\infty}\left[-\frac{n^2\pi^2}{L^2}b_{yn}(t) - \frac{1}{v^2}\frac{d^2 b_{yn}(t)}{dt^2}\right]\sin\left(\frac{n\pi}{L}x\right) = \sum_{n=1}^{\infty} b_{qn}(t)\sin\left(\frac{n\pi}{L}x\right)$$
This is where the uniqueness of Fourier series comes into play: if the Fourier sine series on
the left equals the Fourier sine series on the right, then their corresponding coefficients
must be the same.
$$-\frac{n^2\pi^2}{L^2}b_{yn}(t) - \frac{1}{v^2}\frac{d^2 b_{yn}(t)}{dt^2} = b_{qn}(t) \tag{11.9.5}$$
It doesn’t look pretty, but consider what we are now being asked to do. For any given q(x, t)
function, we find its Fourier sine series: that is, we find the coefficients bqn (t). That leaves
us with an ordinary second-order differential equation for the coefficients byn (t). We may be
able to solve that equation by hand, or we may hand it over to a computer.9 Either way we
will have the original function y(x, t) we were looking for—but we will have it in the form of
a Fourier series.
In Problem 11.178 you will address the simplest possible case, that of q(x, t) = 0, to show
that the solution matches the result we found using separation of variables. Below we show
the result of a constant force such as gravity. In Problem 11.198 you will tackle the problem
more generally using the technique of variation of parameters.
Of course the solution will contain two arbitrary constants, which bring us to our initial
conditions. Our first condition is y(x, 0) = f (x). Taking the Fourier expansion of both sides
of that equation—and remembering once again the uniqueness of Fourier series—we see
that byn (0) = bfn . Our other condition, dy∕dt(x, 0) = g (x), tells us that ḃ yn (0) = bgn . So we can
use our initial conditions on y to find the initial conditions for b: or to put it another way, we
translate our initial conditions into Fourier space.
Once we have solved our ordinary differential equation with the proper initial conditions,
we have the functions byn (t). These are the coefficients of the Fourier series for y, so we now
have our final solution in the form of a Fourier sine series.
9 Fortunately, computers—which are still poor at solving partial differential equations—do a great job of finding
For even values of n, bqn is zero. A common mistake is to ignore those values entirely, but
they are not irrelevant; instead, they turn Equation 11.9.5 into
$$\frac{d^2 b_{yn}(t)}{dt^2} = -\frac{v^2 n^2\pi^2}{L^2}b_{yn}(t), \quad \text{even } n \tag{11.9.6}$$
We can solve this by inspection:
$$b_{yn} = A_n\sin\left(\frac{vn\pi}{L}t\right) + B_n\cos\left(\frac{vn\pi}{L}t\right), \quad \text{even } n \tag{11.9.7}$$
For odd values of n Equation 11.9.5 becomes:
$$\frac{d^2 b_{yn}}{dt^2} + Cb_{yn} = D \quad \text{where } C = \frac{v^2 n^2\pi^2}{L^2} \text{ and } D = -\frac{4Qv^2}{\pi n}, \quad \text{odd } n$$
This is an inhomogeneous second-order ODE and we’ve already solved its complementary
equation: the question was Equation 11.9.6 and our answer was Equation 11.9.7. That leaves
us only to find a particular solution, and byn,particular = D∕C readily presents itself. So we can
write:
$$b_{yn} = A_n\sin\left(\frac{vn\pi}{L}t\right) + B_n\cos\left(\frac{vn\pi}{L}t\right) - \frac{4QL^2}{\pi^3 n^3}, \quad n \text{ odd} \tag{11.9.8}$$
Next we solve for An and Bn using the initial conditions byn (0) = bfn and dbyn ∕dt(0) = bgn .
You’ll do this for different sets of initial conditions in the problems. The solution to our
original PDE is
$$y(x, t) = \sum_{n=1}^{\infty} b_{yn}(t)\sin\left(\frac{n\pi}{L}x\right)$$
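To make the result concrete, here is a minimal sketch for one particular set of initial conditions. It assumes a string that starts flat and at rest, f (x) = g (x) = 0, so that An = 0 for every n, Bn = 0 for even n, and Bn = 4QL²∕(𝜋³n³) for odd n. The numerical values and the function name are hypothetical.

```python
import math


def driven_string(x, t, force, length, speed, max_n=199):
    """Series for y(x, t) with constant q = force and a string that starts flat and at rest."""
    total = 0.0
    for n in range(1, max_n + 1, 2):        # even n: b_yn(t) = 0 for these initial conditions
        shift = 4.0 * force * length**2 / (math.pi**3 * n**3)   # equals -D/C
        omega = speed * n * math.pi / length
        # b_yn(0) = 0 gives B_n = shift; db_yn/dt(0) = 0 gives A_n = 0.
        b_yn = shift * math.cos(omega * t) - shift
        total += b_yn * math.sin(n * math.pi * x / length)
    return total


# Hypothetical values, for illustration only.
Q, L, V = 0.5, 2.0, 3.0
print(driven_string(L / 2, 0.0, Q, L, V))      # zero: the string starts flat
print(driven_string(L / 2, L / V, Q, L, V))    # half a period later
print(-Q * L**2 / 4)                           # closed-form value the previous line should be near
```

With these assumptions the string oscillates about the static shape Qx(x − L)∕2, so half a period after release the midpoint sits at twice its static displacement, which is the value −QL²∕4 printed last.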
$$-9\frac{\partial^2 y}{\partial x^2} + 4\frac{\partial y}{\partial t} + 5y = x$$
on the domain 0 ≤ x ≤ 1 for all t ≥ 0 subject to the boundary conditions
𝜕y∕𝜕x(0, t) = 𝜕y∕𝜕x(1, t) = 0 and the initial condition y(x, 0) = f (x) = 2x 3 − 3x 2 .
Before we start solving this problem, let’s think about what kind of
solution we expect. The PDE is simplest to interpret if we write it as 4(𝜕y∕𝜕t) =
9(𝜕 2 y∕𝜕x 2 ) − 5y + x, which we can read as “the vertical velocity depends on . . . .” The
first term says that y tends to increase when the concavity is positive and decrease
when it is negative. The second two terms taken together say that y tends to increase if
−5y + x > 0: that is, it will tend to move up if it is below the line y = x∕5 and move
down if it is above it. Both of these effects are always at work. For instance, if the curve
is concave up and below the line y = x∕5 both effects will push y upward; if it is
concave down below the line the two effects will push in opposite directions.
The initial condition is shown below:
[Figure: plot of the initial condition y(x, 0) = 2x³ − 3x²; horizontal axis x from 0 to 1.0, vertical axis y from 0 down to −1.0.]
Since the initial function is negative over the whole domain, the −5y + x terms will
push it upward. Meanwhile, the concavity will push it down on the left and up on the
right. You can easily confirm that the concavity will “win” on the left, so y should
initially decrease at small values of x and increase at larger values. However, as the
values on the left become more negative, they will eventually be pushed back up
again.
To actually solve the equation, we note that the boundary conditions can be most
easily met with a Fourier cosine series. So we replace y with its Fourier cosine
expansion:
$$y(x, t) = \frac{a_{y0}(t)}{2} + \sum_{n=1}^{\infty} a_{yn}(t)\cos(n\pi x)$$
On the right side of our differential equation is the function x. Remember that we
only care about this function from x = 0 to x = 1, but we need to create an even
extension to find a cosine expansion. While you can do this entirely by hand (using
integration by parts) or entirely with a computer, you may find that the easiest path is
a hybrid: you first determine by hand that $a_n = 2\int_0^1 x\cos(n\pi x)\,dx$ and then hand that
integral to a computer, or use one of the integrals in Appendix G. (For the case n = 0
in particular, it’s easiest to do the integral yourself.) You find that:
$$x = \frac{1}{2} + \sum_{n=1}^{\infty}\frac{2\left(-1 + (-1)^n\right)}{\pi^2 n^2}\cos(n\pi x)$$
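The hybrid approach described above can be imitated in a few lines. This sketch is for illustration only: it approximates the integral for an with a midpoint rule, compares it with the closed form in the series, and evaluates a partial sum. The function names and the sample point are assumptions made here.

```python
import math


def cosine_coeff(n, samples=4000):
    """Midpoint-rule estimate of a_n = 2 * integral_0^1 x cos(n pi x) dx."""
    dx = 1.0 / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx
        total += x * math.cos(n * math.pi * x)
    return 2.0 * total * dx


def closed_form(n):
    """Coefficient of cos(n pi x) in the expansion of x, for n >= 1."""
    return 2.0 * (-1 + (-1) ** n) / (math.pi**2 * n**2)


def partial_sum(x, terms):
    """Constant term 1/2 plus the first `terms` cosine terms."""
    return 0.5 + sum(closed_form(n) * math.cos(n * math.pi * x) for n in range(1, terms + 1))


for n in range(1, 6):
    print(n, cosine_coeff(n), closed_form(n))   # the two columns should agree closely
print(partial_sum(0.3, 199))                    # should be near 0.3
```

The even-n coefficients vanish, so only odd n contribute, which matches the remark about the numerator made later in the example.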
Plugging our expansions for y and x into the original problem yields:
$$2a_{y0}'(t) + \frac{5a_{y0}(t)}{2} + \sum_{n=1}^{\infty}\left[9n^2\pi^2 a_{yn}(t)\cos(n\pi x) + 4a_{yn}'(t)\cos(n\pi x) + 5a_{yn}(t)\cos(n\pi x)\right] = \frac{1}{2} + \sum_{n=1}^{\infty}\frac{2\left[-1 + (-1)^n\right]}{\pi^2 n^2}\cos(n\pi x)$$
Setting the Fourier coefficients on the left equal to the coefficients on the right and
rearranging leads to the differential equations:
$$a_{y0}'(t) = -\frac{5}{4}a_{y0}(t) + \frac{1}{4}$$
$$a_{yn}'(t) = \left(\frac{-9n^2\pi^2 - 5}{4}\right)a_{yn}(t) + \frac{\left[-1 + (-1)^n\right]}{2\pi^2 n^2}, \quad n > 0$$
$$a_{y0}(t) = C_0 e^{-(5/4)t} + \frac{1}{5}$$
$$a_{yn}(t) = C_n e^{-\frac{9n^2\pi^2 + 5}{4}t} + \frac{2\left[-1 + (-1)^n\right]}{\pi^2 n^2\left(9n^2\pi^2 + 5\right)}, \quad n > 0$$
Note that the numerator of the last fraction simplifies to 0 for even n and −4 for
odd n.
We now turn our attention to the initial condition y(x, 0) = f (x) = 2x 3 − 3x 2 . To
match this to our Fourier-expanded solution, we need the Fourier expansion of this
function. Remember that we need a Fourier cosine expansion that is valid on the
domain 0 ≤ x ≤ 1. We leave it to you to confirm the result (which we obtained once
again by figuring out by hand what integral to take and then handing that integral
over to a computer):
$$a_{f0} = -1, \qquad a_{fn} = \frac{48}{\pi^4 n^4} \quad (\text{odd } n \text{ only})$$
Setting ay0 (0) = af 0 leads to C0 = −6∕5 and doing the same for higher values of n
gives:
$$C_n = \frac{48}{\pi^4 n^4} + \frac{4}{\pi^2 n^2\left(9n^2\pi^2 + 5\right)} \quad (\text{odd } n \text{ only})$$
We have now solved the entire problem—initial conditions and all—in Fourier space.
That is, we have found the an -values which are the coefficients of the Fourier series
for y(x, t). So the solution to our original problem is:
$$y(x, t) = \frac{1}{10} - \frac{3}{5}e^{-(5/4)t} + \sum_{\text{odd } n}\left[C_n e^{-(9n^2\pi^2 + 5)t/4} - \frac{4}{\pi^2 n^2\left(9n^2\pi^2 + 5\right)}\right]\cos(n\pi x)$$
The figures above show the solution (using the 10th partial sum) at three times. As
predicted the function initially moves down on the left and up on the right, but
eventually moves up everywhere until the terms all cancel out to give
𝜕y∕𝜕t = 0.
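A series solution like this one is easy to sanity-check on a computer. The sketch below is an illustration, not the authors' code: it evaluates a partial sum of the series for y(x, t), using the Cn values found above, and compares it with the initial condition 2x³ − 3x². The cutoff of n = 199 and the function name are arbitrary choices made here.

```python
import math


def example_solution(x, t, max_n=199):
    """Partial sum of the cosine-series solution of -9 y_xx + 4 y_t + 5 y = x."""
    total = 0.1 - 0.6 * math.exp(-1.25 * t)
    for n in range(1, max_n + 1, 2):                      # odd n only
        rate = (9.0 * n**2 * math.pi**2 + 5.0) / 4.0
        steady = 4.0 / (math.pi**2 * n**2 * (9.0 * n**2 * math.pi**2 + 5.0))
        c_n = 48.0 / (math.pi**4 * n**4) + steady
        total += (c_n * math.exp(-rate * t) - steady) * math.cos(n * math.pi * x)
    return total


for x in (0.0, 0.25, 0.5, 1.0):
    print(x, example_solution(x, 0.0), 2 * x**3 - 3 * x**2)   # the two columns should agree closely
print(example_solution(0.5, 50.0))                            # late-time value at the midpoint
```

At x = 1∕2 every cosine term vanishes, so the late-time value there is 1∕10, which lies on the line y = x∕5 discussed at the start of the example.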
Stepping Back
We can break the process above into the following steps.
1. Replace all the functions—for instance y(x, t), q(x, t), f (x), and g (x) in our first example
above—with their respective Fourier series.
2. Plug the Fourier series back into the original equation and simplify.
3. You now have an equation of the form <this Fourier series>=<that Fourier series>. Setting
the corresponding coefficients equal to each other, you get an equation to solve for
the coefficients—one that should be easier than the original equation, because some
of the derivatives have been replaced with multiplications.
4. Solve the resulting equation to find the Fourier coefficients of the function you are
looking for—with arbitrary constants, of course. You find these arbitrary constants
based on the Fourier coefficients of the initial conditions. You now have the function
you were originally looking for, expressed in the form of a Fourier series.
5. In some cases, you may be able to explicitly sum the resulting Fourier series to find
your original function in closed form. More often, you will use computers to plot or
analyze the partial sums and/or the dominant terms of the Fourier series.
As always, it is important to understand some of the subtleties.
∙ Why did we expand in x instead of in t? There are two reasons, based on two very
general limitations on this technique.
First, you can only do an eigenfunction expansion for a variable with homogeneous boundary
conditions. To understand why, think about how we dealt with the boundary conditions
and the initial conditions in the problems above. For the initial condition—the condi-
tion on t, which we did not expand in—we expanded the condition itself into a Fourier
series in x and matched that to our equation, coefficient by coefficient. But for the
boundary condition—the condition on x, which we did expand in—we chose our sine
or cosine series so that each individual term would match the boundary condition,
knowing that this would make the entire series match the homogeneous condition.
Second, as you may remember from Chapter 9, you can only take a Fourier series for
a function that is periodic, or defined on a finite domain. In many problems, including the
ones we worked above, the domain of x is finite but the domain of t is unlimited. For
non-periodic functions you need to do a transform instead of a series expansion. Solving
partial differential equations by transforms is the subject of Sections 11.10–11.11.
∙ The “suspicious step”: moving a derivative in and out of a series. In general,
$\frac{d}{dx}\left(\sum f_n(x)\right)$ is not the same as $\sum \left(df_n/dx\right)$. For instance, you will show in Problem 11.199
that if you take the Fourier series for x, and take the derivative term by
term, you do not end up with the Fourier series for 1. However, moving a derivative
inside a Fourier series, which is vital to this method, is valid when the periodic
extension of the function is continuous.10 Since our function was equal on both ends
its extension is everywhere continuous, so the step is valid.
∙ Which Fourier series? The choice of a Fourier sine or cosine series is dictated by the
boundary conditions. In the “wave equation” problem above the terms in the expan-
sion had to be sines in order to match the boundary conditions y(0, t) = y(L, t) = 0.
In the example that started on Page 628 we needed cosines to match the condition
𝜕y∕𝜕x(0, t) = 𝜕y∕𝜕x(1, t) = 0.
∙ Why a Fourier series at all? We used a Fourier series because our differential equations
were based on second derivatives with respect to x.
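The warning about the “suspicious step” can be seen numerically. This sketch assumes the standard Fourier series of x on the interval −𝜋 < x < 𝜋, whose periodic extension jumps at the ends; the setup of Problem 11.199 itself may differ, and the function names are invented for illustration.

```python
import math


def x_series(x, terms):
    """Partial sum of the Fourier series of x on (-pi, pi): sum of 2 (-1)^(n+1) sin(n x) / n."""
    return sum(2.0 * (-1) ** (n + 1) * math.sin(n * x) / n for n in range(1, terms + 1))


def differentiated_series(x, terms):
    """The same partial sum differentiated term by term: sum of 2 (-1)^(n+1) cos(n x)."""
    return sum(2.0 * (-1) ** (n + 1) * math.cos(n * x) for n in range(1, terms + 1))


print(x_series(1.0, 2000))                 # close to 1.0, the value of x
for terms in (10, 11, 12, 13):
    print(terms, differentiated_series(0.0, terms))   # alternates between 0.0 and 2.0
```

The differentiated terms do not shrink as n grows, so their partial sums never settle down, let alone settle on the constant 1.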
10 See A. E. Taylor’s “Differentiation of Fourier Series and Integrals” in The American Mathematical Monthly, Vol. 51,
No. 1.
11 Strictly speaking your function must also be “piecewise smooth,” meaning the derivative exists and is continuous
at all but a finite number of points per period. Most functions you will encounter pass this test with no problem,
but being continuous is a more serious issue, as the example in this problem illustrates.
(b) Expanding s(x, t) and q(x) into Fourier cosine series, write a differential equation for $a_{sn}(t)$.
(c) Find the general solution to this differential equation. You should find that the solution for general n doesn’t work for n = 0 and you’ll have to treat that case separately. Remember that q(x) has no time dependence, so each $a_{qn}$ is a constant.
(d) Now plug in the initial conditions to find the arbitrary constants and write the series solution s(x, t).
(e) As a specific example, consider $q(x) = ke^{-(x-L/2)^2/x_0^2}$. This driving term represents a constant force that is strongest at the middle of the tube and rapidly drops off as you move towards the ends. (This is not realistic in several ways, the most obvious of which is that the hole in a concert flute is not at the middle, but it nonetheless gives a qualitative idea of some of the behavior of flutes and other open tube instruments.) Use a computer to find the Fourier cosine expansion of this function and plot the 20th partial sum of s(x, t) as a function of x at several times t (choosing values for the constants). Describe how the function behaves over time. Based on the results you find, explain why this equation could not be a good model of the air in a flute for more than a short time.

11.202 A thin pipe of length L being uniformly heated along its length obeys the inhomogeneous heat equation $\partial u/\partial t = \alpha(\partial^2 u/\partial x^2) + Q$ where Q is a constant. The ends of the pipe are held at zero degrees and the pipe is initially at zero degrees everywhere. (Assume temperatures are in Celsius.)
(a) Solve the PDE with these boundary and initial conditions.
(b) Take the limit as t → ∞ of your answer to get the steady-state solution. If you neglect all terms whose amplitude is less than 1% of the amplitude of the first term, how many terms are left in your series? Sketch the shape of the steady-state solution using only those non-negligible terms.

11.203 Exploration: Poisson’s Equation, Part I
The electric potential in a region with charges obeys Poisson’s equation, which in Cartesian coordinates can be written as $\partial^2 V/\partial x^2 + \partial^2 V/\partial y^2 + \partial^2 V/\partial z^2 = -(1/\varepsilon_0)\rho(x, y, z)$. In this problem you will solve Poisson’s equation in a cube with boundary conditions V(0, y, z) = V(L, y, z) = V(x, 0, z) = V(x, L, z) = V(x, y, 0) = V(x, y, L) = 0. The charge distribution is given by $\rho(x, y, z) = \sin(\pi x/L)\sin(2\pi y/L)\sin(3\pi z/L)$.
(a) Write Poisson’s equation with this charge distribution. Next, expand V in a Fourier sine series in x. When you plug this into Poisson’s equation you should get a PDE for the Fourier coefficients $b_{vn}(y, z)$. Explain why the solution to this PDE will be $b_{vn} = 0$ for all but one value of n. Write the PDE for $b_{vn}(y, z)$ for that one value.
(b) Do a Fourier sine expansion of your b-variable in y. The notation becomes a bit strained at this point, but you can call the coefficients of this new expansion $b_{bvn}$. The result of this expansion should be to turn the equation from Part (a) into an ODE for $b_{bvn}(z)$. Once again you should find that the solution is $b_{bvn} = 0$ for all but one value of n. Write the ODE for that one value.
(c) Finally, do a Fourier sine expansion in z and solve the problem to find V(x, y, z). Your answer should be in closed form, not a series. Plug this answer back in to Poisson’s equation and show that it is a solution to the PDE and to the boundary conditions.

11.204 Exploration: Poisson’s Equation, Part II
[This problem depends on Problem 11.203.] Solve Poisson’s equation for the charge distribution $\rho(x, y, z) = \sin(\pi x/L)\sin(2\pi y/L)\,z$. The process will be the same as in the last problem, except that for the last sine series you will have to expand both the right and left-hand sides of the equation, and your final answer will be in the form of a series.

11.205 Exploration: A driven drum
In Section 11.6 we solved the wave equation on a circular drum of radius a in polar coordinates, and we found that the normal modes were Bessel functions. If the drum is being excited by an external source (imagine such a thing!) then it obeys an inhomogeneous wave equation
$$\frac{\partial^2 z}{\partial t^2} - v^2\left(\frac{\partial^2 z}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial z}{\partial \rho}\right) = \begin{cases} \kappa\cos(\omega t) & 0 \le \rho \le a/2 \\ 0 & \rho > a/2 \end{cases}$$
Since everything in the problem depends only on ρ we eliminated the φ-dependence from the Laplacian. The drum is clamped down so z = 0 at the outer edge, and the initial conditions are $z(\rho, 0) = \partial z/\partial t(\rho, 0) = 0$.
(a) To use eigenfunction expansion, we need the right eigenfunctions: a set of normal modes $R_n(\rho)$ with the property that $d^2R_n/d\rho^2 + (1/\rho)(dR_n/d\rho) = qR_n(\rho)$ for some proportionality constant q. For positive q-values, this leads to modified (or “hyperbolic”) Bessel functions. Explain why these functions cannot be valid solutions for our drum.
(b) We therefore assume a negative proportionality constant: replace q with $-s^2$ and solve the resulting ODE for the functions $R_n(\rho)$. Use the implicit boundary condition to eliminate one arbitrary constant, and the explicit boundary condition to constrain the values of s. You do not need to apply the initial conditions at this stage. The Bessel functions you are left with as solutions for $R_n(\rho)$ are the eigenfunctions you will use for the method of eigenfunction expansion.
(c) Now return to the inhomogeneous wave equation. Expand both sides in a Fourier-Bessel expansion using the eigenfunctions you found in Part (b). The result should be an ODE for the coefficients $A_{zn}(t)$.
(d) The resulting differential equation looks much less intimidating if you rewrite the right side in terms of a new constant:
$$\gamma_n = \frac{2}{J_1^2(\alpha_{0,n})}\int_0^{1/2} J_0(\alpha_{0,n}u)\,u\,du$$
There is no simple analytical answer for this integral, but you can find a numerical value for $\gamma_n$ for any particular n. Rewrite your answer to Part (c) using $\gamma_n$.
(e) Solve this ODE with the initial conditions $A_{zn}(0) = \partial A_{zn}/\partial t(0) = 0$.
(f) Plug your answer for $A_{zn}(t)$ into your Fourier-Bessel expansion for $z(\rho, t)$ to get the solution.
(g) Use the 20th partial sum of your series solution to make a 3D plot of the shape of the drumhead at various times. Describe how it evolves in time.
$$\frac{\partial u}{\partial t} = \alpha\frac{\partial^2 u}{\partial x^2} \tag{11.10.1}$$
Now consider the temperature u(x, t) of an infinitely long bar.
1. When you used the method of eigenfunction expansions to solve this problem for a
finite bar (Exercise 11.9.1), you began by expanding the unknown solution u(x, t) in
a Fourier series. Explain why you cannot do the same thing in this case.
You can take an approach that is similar to the method of eigenfunction expansion, but
in this case you will use a Fourier transform instead of a Fourier series. You begin by taking
a Fourier transform of both sides of Equation 11.10.1. Using $\mathcal{F}$ to designate a Fourier
transform with respect to x, this gives:
$$\mathcal{F}\left[\frac{\partial u}{\partial t}\right] = \alpha\,\mathcal{F}\left[\frac{\partial^2 u}{\partial x^2}\right] \tag{11.10.2}$$
2. The u you are looking for is a function of x and t. When you solve Equation 11.10.2
you will find a new function $\mathcal{F}[u]$. What will that be a function of? (It’s not a function
of u. That’s the original function you’re taking the Fourier transform of.)
3. One property of Fourier transforms is “linearity” which tells us that, in general,
$\mathcal{F}[af + bg] = a\mathcal{F}[f] + b\mathcal{F}[g]$. Another property of Fourier transforms is that
$\mathcal{F}[\partial^2 f/\partial x^2] = -p^2\mathcal{F}[f]$. Apply these properties (in order) to the right side of Equation 11.10.2.
4. Another property of Fourier transforms is that, if the Fourier transform is with respect
to x and the derivative is with respect to t, you can move the derivative in and out of the
transform: $\mathcal{F}[\partial f/\partial t] = \frac{\partial}{\partial t}\mathcal{F}[f]$. Apply this property to the left side of Equation 11.10.2.
See Check Yourself #78 in Appendix L
5. Solve this first-order differential equation to find $\mathcal{F}[u]$ as a function of p and t. Your
solution will involve an arbitrary function g(p).
6. Use the formula for an inverse Fourier transform to write the general solution u(x, t) as
an integral. (Do not evaluate the integral.) See Appendix G for the Fourier transform
and inverse transform formulas. Your answer should depend on x and t. (Even though
p appears in the answer, it only appears inside a definite integral, so the answer is not
a function of p.)
7. What additional information would you need to solve for the arbitrary function g (p)
and thus get a particular solution to this PDE?
Most importantly, our work here requires a few properties of Fourier transforms. Given that
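a Fourier transform represents a function as an integral over terms of the form $\hat{f}(p, t)e^{ipx}$, the
following two properties are not too surprising:
$$\mathcal{F}\left[\frac{\partial^{(n)} f}{\partial t^{(n)}}\right] = \frac{\partial^{(n)}}{\partial t^{(n)}}\mathcal{F}[f] \tag{11.10.3}$$
$$\mathcal{F}\left[\frac{\partial^{(n)} f}{\partial x^{(n)}}\right] = (ip)^n\,\mathcal{F}[f] \tag{11.10.4}$$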
The second formula turns a derivative with respect to x into a multiplication, and therefore
turns a partial differential equation into an ordinary differential equation, just as Fourier
series did in the previous section.
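As a numerical sanity check of Equation 11.10.4 with n = 2, the sketch below transforms a Gaussian and its second derivative and compares the two sides. It assumes the normalization f̂(p) = (1/2π) ∫ f(x) e^(−ipx) dx, which is consistent with the transform of e^(−x²) quoted later in this section; the function names and the simple Riemann-sum integrator are our own illustrative choices.

```python
import cmath
import math

def fourier_transform(f, p, half_width=10.0, samples=4001):
    """Riemann-sum estimate of (1/2pi) * integral of f(x) exp(-i p x) dx."""
    dx = 2 * half_width / (samples - 1)
    total = 0j
    for k in range(samples):
        x = -half_width + k * dx
        total += f(x) * cmath.exp(-1j * p * x)
    return total * dx / (2 * math.pi)

def gaussian(x):
    return math.exp(-x * x)

def gaussian_second_derivative(x):
    # d^2/dx^2 of exp(-x^2), worked out by hand
    return (4 * x * x - 2) * math.exp(-x * x)

def check_second_derivative_rule(p):
    """Return both sides of F[f''] = (ip)^2 F[f] for the Gaussian."""
    lhs = fourier_transform(gaussian_second_derivative, p)
    rhs = (1j * p) ** 2 * fourier_transform(gaussian, p)
    return lhs, rhs

if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0):
        print(p, *check_second_derivative_rule(p))
```

The truncation to |x| ≤ 10 is harmless here only because the Gaussian decays so quickly; a slowly decaying function would need a wider window.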
Finally, we will need the “linearity” property of Fourier transforms:
$$\mathcal{F}[af + bg] = a\mathcal{F}[f] + b\mathcal{F}[g]$$
The Problem
In order to highlight the similarity between this method and the previous one, we’re going to
solve essentially the same problem: a wave on a one-dimensional string driven by an arbitrary
force function.
$$\frac{\partial^2 y}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2} = q(x, t) \tag{11.10.5}$$
$$y(x, 0) = f(x), \qquad \frac{\partial y}{\partial t}(x, 0) = g(x)$$
However, our new string is infinitely long. This change pushes us from a Fourier series to a
Fourier transform.
We are going to Fourier transform all three of these equations before we’re through. Of
course, we can only use this method in this way if it is possible to Fourier transform all the
relevant functions! At the end of this section we will talk about some of the limitations that
restriction imposes.
The Solution
We begin by taking the Fourier transform of both sides of Equation 11.10.5:
$$\mathcal{F}\left[\frac{\partial^2 y}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2}\right] = \mathcal{F}[q]$$
Now our derivative properties come into play. Equations 11.10.3 and 11.10.4 turn this differential
equation into $-p^2\hat{y} - (1/v^2)(\partial^2\hat{y}/\partial t^2) = \hat{q}$. This is the key step: a second derivative with
respect to x has become a multiplication by $-p^2$. This equation can be written more simply as:
$$\frac{\partial^2 \hat{y}}{\partial t^2} + v^2p^2\hat{y} = -v^2\hat{q} \tag{11.10.7}$$
It may look like we have just traded our old y(x, t) PDE for a ŷ (p, t) PDE. But our new equation
has derivatives only with respect to t. The variable p in this equation acts like n in the eigen-
function expansion: for any given value of p, we have an ODE in t that we can solve by hand
or by computer. The result will be the function ŷ (p, t), the Fourier transform of the function
we are looking for.
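To see how the initial conditions enter, consider the special case q = 0 (our choice for illustration, not the example worked below), and write f̂ and ĝ for the Fourier transforms of the initial conditions. For each p, Equation 11.10.7 is then a simple harmonic oscillator equation in t:

```latex
\frac{\partial^2 \hat{y}}{\partial t^2} + v^2 p^2 \hat{y} = 0, \qquad
\hat{y}(p,0) = \hat{f}(p), \qquad
\frac{\partial \hat{y}}{\partial t}(p,0) = \hat{g}(p)
\quad\Longrightarrow\quad
\hat{y}(p,t) = \hat{f}(p)\cos(vpt) + \frac{\hat{g}(p)}{vp}\sin(vpt)
```

The two arbitrary “constants” of the ODE are functions of p, fixed by the two transformed initial conditions.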
Just as you can use a series solution by evaluating as many partial sums as needed, you can
often use a Fourier transform solution by finding numerical approximations to the inverse
Fourier transform. In some cases you will be able to evaluate the integral explicitly to find a
closed-form solution for y(x, t).
The driving force in this case is independent of time. Notice that we are neglecting any forces
other than this small weight, so the rest of the string will not fall until it is pulled down by the
weighted part. Before starting the problem take a moment to think about what you would
expect the solution to look like in the simplest case, where the string starts off perfectly still
and horizontal.
Throughout this calculation there will be a number of steps such as taking a Fourier trans-
form or solving an ODE that you could solve either by hand or on a computer. We’ll just show
the results as needed and in Problem 11.206 you’ll fill in the missing calculations.
We take the Fourier transform of the right hand side of Equation 11.10.8 and plug it into
Equation 11.10.7.
$$\frac{\partial^2 \hat{y}}{\partial t^2} + v^2p^2\hat{y} = -\frac{Qv^2}{\pi p}\sin(Lp) \tag{11.10.9}$$
Since the inhomogeneous part of this equation has no t dependence, it’s really just the
equation $\hat{y}''(t) + a\hat{y}(t) = b$ where a and b act as constants. (They depend on p but not on
t.) You can solve it with guess and check and end up here.
$$\hat{y}(p, t) = A(p)\sin(pvt) + B(p)\cos(pvt) - \frac{Q}{\pi p^3}\sin(Lp) \tag{11.10.10}$$
The arbitrary “constants” A and B are constants with respect to t, but they are functions of p
and we have labeled them as such. They will be determined by the initial conditions, just as
$A_n$ and $B_n$ were in the series expansions of the previous section. In Problem 11.206 you will
solve this for the simplest case $y(x, 0) = \partial y/\partial t(x, 0) = 0$ and show that $B(p) = (Q/\pi p^3)\sin(Lp)$
and A(p) = 0. So $\hat{y}(p, t) = (Q/\pi p^3)\sin(Lp)\left(\cos(pvt) - 1\right)$. This can be simplified with a trig
identity to become $\hat{y}(p, t) = -(2Q/\pi p^3)\sin(Lp)\sin^2(pvt/2)$. The solution y(x, t) is the inverse
Fourier transform of $\hat{y}(p, t)$.
$$y(x, t) = -\frac{2Q}{\pi}\int_{-\infty}^{\infty}\frac{\sin(Lp)}{p^3}\sin^2\left(\frac{pvt}{2}\right)e^{ipx}\,dp \tag{11.10.11}$$
This inverse Fourier transform can be calculated analytically, but the result is messy because
you get different functions in different domains. With the aid of a computer, however, we
can get a clear—and physically unsurprising—picture of the result. At early times the weight
pulls the region around it down into a parabola. At later times the weighted part makes a
triangle, with the straight lines at the top and edges smoothly connected by small parabolas.
In the regions x > L + vt and x < −L − vt, the function y(x, t) is zero because the effect of the
weight hasn’t yet reached that part of the string.
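The inverse transform in Equation 11.10.11 can also be estimated numerically. The sketch below is one rough way to do it, not the method used to produce the book's figure: because the integrand is even in p, the complex exponential reduces to cos(px) integrated over p > 0, and we apply a midpoint rule with a cutoff. The cutoff `p_max`, the step count, and the function name are our assumptions.

```python
import math

def string_displacement(x, t, Q=1.0, L=1.0, v=1.0, p_max=200.0, steps=200000):
    """Midpoint-rule estimate of Equation 11.10.11 for real x and t.

    Uses y = -(4Q/pi) * integral from 0 to infinity of
    sin(Lp) sin^2(pvt/2) cos(px) / p^3 dp, truncated at p_max.
    """
    dp = p_max / steps
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) * dp  # midpoints avoid evaluating at p = 0
        total += (math.sin(L * p) * math.sin(p * v * t / 2) ** 2
                  * math.cos(p * x) / p ** 3)
    return -(4 * Q / math.pi) * total * dp

if __name__ == "__main__":
    for x in (0.0, 1.0, 1.4, 3.0):
        print(x, string_displacement(x, 0.5))
```

With Q = L = v = 1 and t = 0.5, the value at x = 3 should be close to zero (the disturbance has not arrived), and the value at x = 0 should be close to −Qv²t²/2, since no signal from the edges of the weight has reached the center yet.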
[Figure 11.7: four panels plotting y against x at t = 0, t = 3, t = 5, and t = 300.]
FIGURE 11.7 The solution for an infinite string continually pulled down by a weight near the origin. These
figures represent the solution given above with Q = L = v = 1.
then we can see the function will tend to decrease when it is concave down and/or
positive. In addition, it will always have a tendency to increase in the range
−1 ≤ x ≤ 1.
The initial position is shown below.
[Figure: the initial function $y(x, 0) = f(x) = e^{-x^2}$, plotted as y against x.]
Near x = 0, the negative concavity and positive values will push downward more
strongly than the driving term pushes upward, so the peak in the middle will decrease
until these forces cancel. At larger values of x the upward push from the positive
concavity is larger than the downward push from the positive values of y (you can
check this), so the function will initially increase there. At very large values of x there
is no driving force, and the concavity and y-values are near zero, so there will be little
initial movement.
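Now let’s find the actual solution. We begin by taking the Fourier transform of
both sides.
$$\mathcal{F}\left[-9\frac{\partial^2 y}{\partial x^2} + 4\frac{\partial y}{\partial t} + 5y\right] = \mathcal{F}\left[\begin{cases} 1 & -1 \le x \le 1 \\ 0 & |x| > 1 \end{cases}\right]$$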
The right side of this equation is an easy enough Fourier transform to evaluate, and
we’re skipping the integration steps here.
$$4\frac{\partial \hat{y}}{\partial t} + (5 + 9p^2)\hat{y} = \frac{\sin p}{\pi p}$$
Although the constants are ugly, this is just a separable first-order ODE. Once again
we can solve it by hand or by software.
$$\hat{y} = C(p)e^{-(9p^2+5)t/4} + \frac{\sin p}{\pi p(9p^2 + 5)} \tag{11.10.12}$$
Next we apply the initial condition to find C(p). You can find the Fourier transform of
the initial condition in Appendix G.
$$\hat{f}(p) = \frac{1}{2\sqrt{\pi}}e^{-p^2/4}$$
$$C(p) = \frac{1}{2\sqrt{\pi}}e^{-p^2/4} - \frac{\sin p}{\pi p(9p^2 + 5)} \tag{11.10.13}$$
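Equations 11.10.12 and 11.10.13 are easy to spot-check numerically. The sketch below (with function names of our own choosing) builds ŷ(p, t) from those two equations, estimates the time derivative with a central difference, and evaluates how far the transformed ODE is from being satisfied. It assumes p ≠ 0, where the formulas are written with p in the denominator.

```python
import math

def f_hat(p):
    """Transform of the initial condition exp(-x^2), as quoted in the text."""
    return math.exp(-p * p / 4) / (2 * math.sqrt(math.pi))

def steady_term(p):
    """Time-independent part of Equation 11.10.12 (requires p != 0)."""
    return math.sin(p) / (math.pi * p * (9 * p * p + 5))

def y_hat(p, t):
    """Equation 11.10.12 with C(p) taken from Equation 11.10.13."""
    c = f_hat(p) - steady_term(p)
    return c * math.exp(-(9 * p * p + 5) * t / 4) + steady_term(p)

def ode_residual(p, t, h=1e-5):
    """Left side minus right side of 4 y_hat' + (5 + 9p^2) y_hat = sin(p)/(pi p)."""
    dydt = (y_hat(p, t + h) - y_hat(p, t - h)) / (2 * h)
    return 4 * dydt + (5 + 9 * p * p) * y_hat(p, t) - math.sin(p) / (math.pi * p)

if __name__ == "__main__":
    print(ode_residual(0.7, 0.3))        # should be tiny
    print(y_hat(1.3, 0.0) - f_hat(1.3))  # initial condition recovered
```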
Finally, we can get the solution y(x, t) by plugging Equation 11.10.13 into 11.10.12
and taking the inverse Fourier transform.
$$y(x, t) = \int_{-\infty}^{\infty}\left[\left(\frac{1}{2\sqrt{\pi}}e^{-p^2/4} - \frac{\sin p}{\pi p(9p^2 + 5)}\right)e^{-(9p^2+5)t/4} + \frac{\sin p}{\pi p(9p^2 + 5)}\right]e^{ipx}\,dp$$
There’s no simple way to evaluate this integral in the general case, but we can
understand its behavior by looking at the time dependence. At t = 0 the last two
terms cancel and we are left with the inverse Fourier transform of $e^{-p^2/4}/(2\sqrt{\pi})$,
which reproduces the initial condition $y(x, 0) = e^{-x^2}$. (If the solution didn’t reproduce
the initial conditions when t = 0 we would know we had made a mistake.) At late
times the term $e^{-\frac{9p^2+5}{4}t}$ goes to zero and we are left with the inverse Fourier transform
of the last term, which can be analytically evaluated on a computer to give
$$\lim_{t\to\infty} y(x, t) = \begin{cases}
\frac{1}{10}\left(e^{\sqrt{5}/3} - e^{-\sqrt{5}/3}\right)e^{(\sqrt{5}/3)x} & x < -1 \\
\frac{1}{10}\left[2 - e^{-\sqrt{5}/3}\left(e^{(\sqrt{5}/3)x} + e^{-(\sqrt{5}/3)x}\right)\right] & -1 \le x \le 1 \\
\frac{1}{10}\left(e^{\sqrt{5}/3} - e^{-\sqrt{5}/3}\right)e^{-(\sqrt{5}/3)x} & x > 1
\end{cases} \tag{11.10.14}$$
[Figure: y plotted against x for −4 ≤ x ≤ 4.]
The solution 11.10.14 at t = 0 (blue), t = .1 (black), and in the limit t → ∞ (gray).
This picture generally confirms the predictions we made earlier. The peak at x = 0
shrinks and the tail at large |x| initially grows. We were not able to predict ahead of
time that in some places the function would grow for a while and then come back
down some. (Look for example at x = 2.) Moreover, we now have an exact function
with numerical values for the late time limit of the function.
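The late-time limit in Equation 11.10.14 can be cross-checked against a direct numerical inverse transform of the last term of Equation 11.10.12. The sketch below is a rough check under our own choices of cutoff and step size; it uses the evenness of the integrand in p to integrate over p > 0 only.

```python
import math

K = math.sqrt(5) / 3  # the decay constant appearing in Equation 11.10.14

def late_time_closed_form(x):
    """Piecewise formula of Equation 11.10.14 (symmetric in x)."""
    if abs(x) <= 1:
        return (2 - math.exp(-K) * (math.exp(K * x) + math.exp(-K * x))) / 10
    return (math.exp(K) - math.exp(-K)) * math.exp(-K * abs(x)) / 10

def late_time_numeric(x, p_max=200.0, steps=200000):
    """Midpoint-rule inverse transform of sin(p) / (pi p (9p^2 + 5))."""
    dp = p_max / steps
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) * dp
        total += math.sin(p) * math.cos(p * x) / (p * (9 * p * p + 5))
    return (2 / math.pi) * total * dp

if __name__ == "__main__":
    for x in (0.0, 0.5, 2.0):
        print(x, late_time_closed_form(x), late_time_numeric(x))
```

Agreement at a few sample points does not prove the formula, but it is a quick way to catch a sign or coefficient error.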
Stepping Back
The method of transforms boils down to a five-step process.
1. Take the Fourier transform of both sides of the differential equation.
2. Use the rules of Fourier transforms to simplify the resulting equation, which should
turn one derivative operation into a multiplication. If you started with a two-variable
partial differential equation, you are now effectively left with a one-variable, or ordi-
nary, differential equation.
must be defined; that restriction, in turn, means that f (x) must approach zero as x approaches
±∞. This means that some problems cannot be approached with a Fourier transform. It also
means that we almost always do our Fourier transforms in x rather than t, since we are rarely
guaranteed that a function will approach zero as t → ∞. To turn time derivatives into multi-
plications you can often use a “Laplace transform,” which is the subject of the next section.
Finally, we should note that a Fourier transform is particularly useful for simplifying
equations like the wave equation because the only spatial derivative is second order. When
you take the second derivative of $e^{ipx}$ you get $-p^2e^{ipx}$ which gives you a simple, real ODE for
$\hat{f}$. For a first-order spatial derivative, the Fourier transform would bring down an imaginary
coefficient. You can still use this technique for such equations, but you have to work harder
to physically interpret the results: see Problem 11.219.
For some of the Fourier transforms in this section, you should be able to evaluate them by hand. (See
Appendix G for the formula.) For some you will need a computer. (Such problems are marked with a com-
puter icon.) And for some, the following formulas will be useful. (If you’re not familiar with the Dirac delta
function, see Appendix K.)
It will also help to keep in mind that a “constant” depends on what variable you’re working with. If you are
taking a Fourier transform with respect to x, then t 2 sin t acts as a constant. If you are taking a derivative with
respect to t, then 𝛿(x) acts as a constant.
Unless otherwise specified, your final answer will be the Fourier transform of the PDE solution. Remember
that if your answer has a delta function in it you can simplify it by replacing anything of the form f (p)𝛿(p)
with f (0)𝛿(p).
11.206 In this problem you’ll fill in some of the calculations from the Explanation (Section 11.10.2).
(a) To derive Equation 11.10.9 we needed the Fourier transform of the right hand side of Equation 11.10.8. Evaluate this Fourier transform directly using the formula for a Fourier transform in Appendix G.
(b) Verify that Equation 11.10.10 is a solution to Equation 11.10.9.
(c) The initial conditions $y(x, 0) = \partial y/\partial t(x, 0) = 0$ can trivially be Fourier transformed into $\hat{y}(p, 0) = \partial\hat{y}/\partial t(p, 0) = 0$. Plug those conditions into Equation 11.10.10 and derive the formulas for A(p) and B(p) given in the Explanation.

11.207 [This problem depends on Problem 11.206.] Use a computer to evaluate the inverse Fourier transform 11.10.11. (You can do it by hand if you prefer, but it’s a bit of a mess.) Assume t > 2L/v and simplify the expression for y(x, t) in each of the following regions: 0 < x < L, L < x < vt − L, vt − L < x < vt + L, x > vt + L. In each case you should find a polynomial of degree 2 or less in x. As a check on your answers, reproduce the last frame of Figure 11.7.

11.208 Walk-Through: The Method of Fourier Transforms. In this problem you will solve the following partial differential equation on the domain −∞ < x < ∞, 0 ≤ t < ∞.
$$\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} + u = e^{-x^2}$$
$$u(x, 0) = \begin{cases} 1 + x & -1 \le x \le 0 \\ 1 - x & 0 < x \le 1 \\ 0 & \text{elsewhere} \end{cases}$$
(a) Take the Fourier transform of both sides of this PDE. On the left side you will use Equations 11.10.3–11.10.4 to get an expression that depends on $\hat{u}(p, t)$ and $\partial\hat{u}/\partial t$. On the right you should get a function of p and/or t. Equations 11.10.15 may be helpful.
(b) Take the Fourier transform of the initial condition to find the initial condition for $\hat{u}(p, t)$.
(c) Solve the differential equation you wrote in Part (a) with the initial condition you found in Part (b) to get the solution $\hat{u}(p, t)$. (Warning: the answer will be long and messy.)
(d) Write the solution u(x, t) as an integral over p.

Solve Problems 11.209–11.213 on the domain −∞ < x < ∞ using the method of transforms. When the initial conditions are given as arbitrary functions the Fourier transforms of those functions will appear as part of your solution. It may help to first work through Problem 11.208 as a model.

11.209 $\partial y/\partial t - c^2(\partial^2 y/\partial x^2) = e^{-t}$, y(x, 0) = f(x)
11.210 $\partial y/\partial t - c^2(\partial^2 y/\partial x^2) = e^{-t}$, $y(x, 0) = e^{-x^2}$
11.211 $\partial y/\partial t - c^2(\partial^2 y/\partial x^2) + y = \kappa$, y(x, 0) = 0. (You should be able to inverse Fourier transform your solution and give your answer as a function y(x, t).)
11.212 $\frac{\partial y}{\partial t} - c^2\frac{\partial^2 y}{\partial x^2} + y = \begin{cases} Q & -1 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$, y(x, 0) = 0
11.213 $\frac{\partial^2 y}{\partial t^2} - c^2\frac{\partial^2 y}{\partial x^2} + y = e^{-x^2}\cos(\omega t)$, y(x, 0) = 0, $\frac{\partial y}{\partial t}(x, 0) = \begin{cases} x & 0 \le x \le 1/2 \\ 1 - x & 1/2 < x \le 1 \\ 0 & \text{elsewhere} \end{cases}$

For Problems 11.214–11.217 solve the equation $\partial^2 y/\partial x^2 - (1/v^2)(\partial^2 y/\partial t^2) = q(x, t)$ with initial conditions y(x, 0) = f(x), $\dot{y}(x, 0) = g(x)$. Equations 11.10.15 may be needed.

11.214 q(x, t) = 0, $f(x) = \begin{cases} F & -1 < x < 1 \\ 0 & \text{otherwise} \end{cases}$, g(x) = 0
11.215 $q(x, t) = \begin{cases} -Q & -1 \le x < 0 \\ Q & 0 \le x \le 1 \\ 0 & \text{elsewhere} \end{cases}$, f(x) = g(x) = 0
11.216 $q(x, t) = \kappa e^{-(x/d)^2}\sin(\omega t)$, f(x) = 0, g(x) = 0
11.217 q(x, t) = −Q, f(x) = 0, $g(x) = Ge^{-(x/d)^2}$. The letters Q and G stand for constants. In general any answer with f(p)δ(p) in it can be simplified by replacing f(p) with the value f(0), since δ(p) = 0 for all p ≠ 0. In this case, however, you should have terms with $\delta(p)/p^2$, which is undefined at p = 0. Expand cos(vpt) in your answer in a Maclaurin series in p and simplify the result. You
should get something where the coefficient in front of δ(p) is non-singular, and you can replace that coefficient with its value at p = 0.

11.218 (a) Solve the equation $\partial^2 y/\partial x^2 - (1/v^2)(\partial^2 y/\partial t^2) = 0$ with initial conditions $y(x, 0) = Fe^{-(x/d)^2}$, $\partial y/\partial t(x, 0) = 0$. Your answer should be an equation for $\hat{y}(p, t)$ with no arbitrary constants.
(b) You’re going to take the inverse Fourier transform of your solution, but to do so it helps to start with the following trick. Rewrite the second half of Equation 11.10.15 in the form $e^{-(x/k)^2}$ = <an integral>.
(c) Now take the inverse Fourier transform of your answer from Part (a) to find y(x, t). Start by writing the integral for the Fourier transform. Then do a variable substitution to make it look like the integral you wrote in Part (b). Use that to evaluate the integral and then reverse your substitution to get a final answer in terms of x and t.
(d) Verify that your solution y(x, t) satisfies the differential equation and initial conditions.

11.219 The complex exponential function $e^{ipx}$ is an eigenfunction of $\partial y/\partial x$, but with an imaginary eigenvalue. This makes it harder to interpret the results of the method of transforms when single derivatives are involved. To illustrate this, use the method of transforms to solve the equation
$$\frac{\partial y}{\partial t} - \frac{\partial^2 y}{\partial x^2} - \frac{\partial y}{\partial x} = \begin{cases} 1 & -1 < x < 1 \\ 0 & \text{elsewhere} \end{cases}$$
with initial condition y(x, 0) = 0. Your final result should be the Fourier transform $\hat{y}(p, t)$, which will be a complex function that cannot easily be inverse Fourier transformed and admits no obvious physical interpretation.

11.220 An infinite rod being continually heated by a localized source at the origin obeys the differential equation $\partial u/\partial t - \alpha(\partial^2 u/\partial x^2) = ce^{-(x/d)^2}$ with initial condition u(x, 0) = 0.
(a) Solve for the temperature u(x, t). Your answer will be in the form of a Fourier transform $\hat{u}(p, t)$.
(b) Take the inverse Fourier transform of your answer to get the function u(x, t). Plot the temperature distribution at several different times and describe how it is evolving over time.

11.221 The electric potential in a region is given by Poisson’s equation $\partial^2 V/\partial x^2 + \partial^2 V/\partial y^2 + \partial^2 V/\partial z^2 = -(1/\varepsilon_0)\rho(x, y, z)$. An infinitely long bar, 0 ≤ x ≤ 1, 0 ≤ y ≤ 1, −∞ < z < ∞ has charge density $\rho(x, y, z) = \sin(\pi x)\sin(\pi y)e^{-z^2}$. Assume the edges of the bar are grounded so V(0, y, z) = V(1, y, z) = V(x, 0, z) = V(x, 1, z) = 0 and assume the potential goes to zero in the limits z → ∞ and z → −∞.
(a) Write Poisson’s equation for this charge distribution and take the Fourier transform of both sides to turn it into a PDE for $\hat{V}(x, y, p)$.
(b) Plug in a guess of the form $\hat{V} = C(p)\sin(\pi x)\sin(\pi y)$ and solve for C(p).

11.222 Exploration: Using Fourier Sine Transforms
For a variable with an infinite domain −∞ < x < ∞ you can take a Fourier transform as described in this section. For a variable with a semi-infinite domain 0 < x < ∞, however, it is often more useful to take a Fourier sine transform. The formulas for a Fourier sine transform and its inverse are in Appendix G. As long as the boundary condition on f is that f(0, t) = 0 the rules for Fourier sine transforms of second derivatives are the same as the ones for regular Fourier transforms.
$$\mathcal{F}_s\left[\frac{\partial^2 f}{\partial x^2}\right] = -p^2\mathcal{F}_s[f], \qquad \mathcal{F}_s\left[\frac{\partial^2 f}{\partial t^2}\right] = \frac{\partial^2 \mathcal{F}_s[f]}{\partial t^2}$$
In this problem you will use a Fourier sine transform to solve the wave equation $\partial^2 y/\partial x^2 - (1/v^2)(\partial^2 y/\partial t^2) = 0$ on the interval 0 < x < ∞ with initial conditions
$$y(x, 0) = \begin{cases} x & 0 < x < d \\ 2d - x & d < x < 2d \\ 0 & x > 2d \end{cases}, \qquad \dot{y}(x, 0) = 0$$
and boundary condition y(0, t) = 0.
(a) Take the Fourier sine transform of the wave equation to rewrite it as an ordinary differential equation for $\hat{y}_s(p, t)$.
(b) Find the Fourier sine transform of the initial conditions. These will be the initial conditions for the ODE you derived in Part (a).
(c) Solve the ODE using these initial conditions and use the result to write the solution $\hat{y}_s(t)$.
Laplace Transforms
Chapter 10 introduces Laplace transforms and their use in solving ODEs; Appendix H
contains a table of Laplace transforms and some table-looking-up techniques. We’re not
going to review all that information here, so you may want to refresh yourself.
Solving a PDE by Laplace transform is very similar to solving a PDE by Fourier transform,
except that we tend to use Laplace transforms on the time variable instead of spatial variables.
Based on our work in the previous sections, you can probably see how we are going to put
the following rules to use.
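$$\mathcal{L}\left[\frac{\partial^{(n)} f(x, t)}{\partial x^{(n)}}\right] = \frac{\partial^{(n)}}{\partial x^{(n)}}\left(F(x, s)\right) \tag{11.11.1}$$
$$\mathcal{L}\left[\frac{\partial^{(n)} f(x, t)}{\partial t^{(n)}}\right] = s^nF(x, s) - s^{n-1}f(x, 0) - s^{n-2}\frac{\partial f}{\partial t}(x, 0) - s^{n-3}\frac{\partial^2 f}{\partial t^2}(x, 0) - \ldots - \frac{\partial^{(n-1)} f}{\partial t^{(n-1)}}(x, 0) \tag{11.11.2}$$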
The first rule is not surprising: since we are using $\mathcal{L}[\,]$ to indicate a Laplace transform in the
time variable, a derivative with respect to x can move in and out of the transform. (When
we took Fourier transforms with respect to x, derivatives with respect to t followed a similar
rule.)
The second rule is analogous to an eigenfunction relationship. It shows that inside a
Laplace transform, taking the nth time derivative is equivalent to multiplying by s n . So the
Laplace transform serves the same purpose as the Fourier transform in the previous section,
and the Fourier series in the section before that: by turning a derivative into a multiplica-
tion, it turns a PDE into an ODE. In this case, however, there are correction terms on the
boundary that bring initial conditions into the calculations. (Similar corrective terms come
into play with Fourier transforms on semi-infinite intervals.)
As with the Fourier transform, the final property we need is linearity: $\mathcal{L}[af + bg] = a\mathcal{L}[f] + b\mathcal{L}[g]$
where a and b are constants, f and g functions of time.
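The n = 1 case of Equation 11.11.2 can be checked numerically for a function of t alone (x simply rides along as a parameter in the PDE setting). The sketch below picks f(t) = cos(3t) as an arbitrary test function and approximates the Laplace integral with a truncated midpoint rule; the test function, the cutoff `t_max`, and the helper names are our assumptions.

```python
import math

def laplace_transform(f, s, t_max=40.0, steps=100000):
    """Midpoint-rule estimate of the integral of f(t) exp(-s t) from 0 to t_max."""
    dt = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += f(t) * math.exp(-s * t)
    return total * dt

def f(t):
    return math.cos(3 * t)

def f_prime(t):
    return -3 * math.sin(3 * t)

def derivative_rule_sides(s):
    """Return L[f'] and s L[f] - f(0), which the rule says are equal."""
    lhs = laplace_transform(f_prime, s)
    rhs = s * laplace_transform(f, s) - f(0.0)
    return lhs, rhs

if __name__ == "__main__":
    print(derivative_rule_sides(1.0))
```

For s = 1 both sides should come out near −9/(s² + 9) = −0.9, with the −f(0) term supplying the “correction” that carries the initial condition.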
$$\frac{\partial u}{\partial t} = \alpha\frac{\partial^2 u}{\partial x^2}$$
As often occurs in problems with infinite domains, there is an “implicit” boundary condition
that is never stated in the problem. The entire bar starts at 0◦ , and the heat applied to the
left side must propagate at a finite speed through the bar; therefore, it is safe to assume that
$\lim_{x\to\infty} u(x, t) = 0$.
We begin by taking the Laplace transform of both sides of the equation, using the linearity
property to leave the constant 𝛼 outside.
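$$\mathcal{L}\left[\frac{\partial u}{\partial t}\right] = \alpha\,\mathcal{L}\left[\frac{\partial^2 u}{\partial x^2}\right]$$
$$sU(x, s) - u(x, 0) = \alpha\frac{\partial^2}{\partial x^2}U(x, s)$$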
(Note that the second term is the original function u, not the Laplace transform U .) Plugging
in our initial condition u(x, 0) = 0, we can rewrite the problem as:
$$\frac{\partial^2 U}{\partial x^2} = \frac{s}{\alpha}U$$
Just as we saw with Fourier transforms in the previous section, a partial differential equation
has turned into an effectively ordinary differential equation, since the derivative with respect
to time has been replaced with a multiplication. We can solve this by inspection.
$$U(x, s) = A(s)e^{\sqrt{s/\alpha}\,x} + B(s)e^{-\sqrt{s/\alpha}\,x}$$
Equation 11.11.3 is the Laplace transform of the function u(x, t) that we’re looking for. What
can you do with that?
In some cases you stop there. Engineers who are used to working with Laplace trans-
forms can read a lot into a solution in that form, and we provide some tips for interpreting
Laplace transforms in Chapter 10. In some cases you can apply an inverse Laplace transform
to find the actual function you’re looking for. This process requires integration on the com-
plex plane, so we defer it to Chapter 13. Our approach in this section will be electronic: we
asked our computer for the inverse Laplace transform of Equation 11.11.3, and thus got the
solution to this problem.
$$u(x, t) = u_L\,\mathrm{erfc}\left(\frac{x}{2\sqrt{\alpha t}}\right)$$
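A quick way to gain confidence in a computer-generated inverse transform is to plug it back into the PDE. The sketch below evaluates the erfc solution with Python's standard `math.erfc` and uses finite differences to estimate how far it is from satisfying the heat equation; the default values `u_left = 1`, `alpha = 1`, the step `h`, and the function names are our assumptions for illustration.

```python
import math

def temperature(x, t, u_left=1.0, alpha=1.0):
    """The solution u(x, t) = u_L * erfc(x / (2 sqrt(alpha t))) for t > 0."""
    return u_left * math.erfc(x / (2 * math.sqrt(alpha * t)))

def heat_residual(x, t, alpha=1.0, h=1e-3):
    """Finite-difference estimate of u_t - alpha * u_xx (should be near zero)."""
    u_t = (temperature(x, t + h, alpha=alpha)
           - temperature(x, t - h, alpha=alpha)) / (2 * h)
    u_xx = (temperature(x + h, t, alpha=alpha)
            - 2 * temperature(x, t, alpha=alpha)
            + temperature(x - h, t, alpha=alpha)) / h ** 2
    return u_t - alpha * u_xx

if __name__ == "__main__":
    print(heat_residual(0.8, 0.5))   # close to zero
    print(temperature(0.0, 1.0))     # equals u_left at the left end
    print(temperature(50.0, 1.0))    # essentially zero far down the bar
```

Since erfc(0) = 1 and erfc(z) → 0 as z → ∞, the formula equals u_L at x = 0 and satisfies the implicit boundary condition at large x.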
Problem:
Solve
$$\frac{\partial v}{\partial t} + \frac{\partial v}{\partial x} = 1 \tag{11.11.4}$$
on the domain 0 ≤ x < ∞, 0 ≤ t < ∞ with initial condition v(x, 0) = 0 and boundary
condition v(0, t) = 0.
Solution:
We begin by taking the Laplace transform of both sides, applying the linearity and
derivative properties on the left. The Laplace transform of 1 is 1∕s, so
$$sV(x, s) + \frac{\partial V(x, s)}{\partial x} = \frac{1}{s}$$
You can solve this by finding a complementary and a particular solution, with the
result
$$V(x, s) = \frac{1}{s^2} + C(s)e^{-sx}$$
The boundary condition v(0, t) = 0 becomes V(0, s) = 0, which means $C(s) = -1/s^2$ and therefore
$$V(x, s) = \frac{1 - e^{-sx}}{s^2} \tag{11.11.5}$$
You can easily find the inverse Laplace transform of Equation 11.11.5 on a computer:
$$v(x, t) = \begin{cases} t & t < x \\ x & t \ge x \end{cases} \tag{11.11.6}$$
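Equation 11.11.6 is simple enough to verify directly. The sketch below (function names are ours) encodes the piecewise solution and uses central differences to check that ∂v/∂t + ∂v/∂x = 1 at points away from the line t = x, where the solution has a kink and the derivatives are not defined.

```python
def v(x, t):
    """Piecewise solution of Equation 11.11.6 on x >= 0, t >= 0."""
    return t if t < x else x

def pde_residual(x, t, h=1e-6):
    """Central-difference estimate of v_t + v_x - 1 (valid away from t = x)."""
    v_t = (v(x, t + h) - v(x, t - h)) / (2 * h)
    v_x = (v(x + h, t) - v(x - h, t)) / (2 * h)
    return v_t + v_x - 1

if __name__ == "__main__":
    print(pde_residual(4.0, 5.0))  # region t >= x, where v = x
    print(pde_residual(7.0, 5.0))  # region t < x, where v = t
    print(v(3.0, 0.0), v(0.0, 3.0))  # initial and boundary conditions
```

In the region t ≥ x the whole right side comes from ∂v/∂x = 1, and in the region t < x it comes from ∂v/∂t = 1.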
12 One that we got a lot out of is Asmar, Nakhle, Partial Differential Equations with Fourier Series and Boundary Value
There are a lot of rules to guide you in choosing the right method for a given differential
equation. Our goal has been to present these rules in a common-sense way: you will work with
these rules better, and remember them longer, if you understand where they come from. In
the summary flow chart in Appendix I we forget all the “why” questions and just list the rules.
11.227 $\partial y/\partial t - \partial^2 y/\partial x^2 = 0$, y(x, 0) = 0, y(0, t) = t, y(1, t) = 0
11.228 $\partial^2 y/\partial t^2 - \partial y/\partial t - \partial^2 y/\partial x^2 = \sin(t)\sin(\pi x)$, $y(0, t) = te^{-t}$, y(1, t) = 0, $y(x, 0) = \sin(\pi x)$, $\partial y/\partial t\,(x, 0) = 0$
11.229 $\partial y/\partial t - c^2\,\partial^2 y/\partial x^2 = xt$, y(x, 0) = 0, y(0, t) = y(1, t) = 0
11.230 In the Explanation (Section 11.11.1) we solved a differential equation and ended up with Equation 11.11.6. In this problem you’ll consider what that solution looks like.
(a) Draw sketches of v(x) on the domain 0 ≤ x ≤ 10 at t = 5 and t = 6. (This should be pretty quick and easy.)
(b) What is $\partial v/\partial x$ at the point x = 4 in both sketches? (You can see the answer by looking.)
(c) How does v at the point x = 4 move or change between the two sketches? Based on that, what is $\partial v/\partial t$ at that point?
(d) What is $\partial v/\partial x$ at the point x = 7 in both sketches?
(e) How does v at the point x = 7 move or change between the two sketches? Based on that, what is $\partial v/\partial t$ at that point?
(f) Explain why this function solves Equation 11.11.4.
(g) Describe the life story of the point $x = x_0$ (where $x_0 > 0$) as t goes from 0 to ∞.
11.231 A string that starts at x = 0 and extends infinitely far to the right obeys the wave equation 11.2.2. The string starts at rest with no vertical displacement but the end of it is being shaken so $y(0, t) = y_0 \sin(\omega t)$.
(a) Using words and a few sketches of how the string will look at different times, predict how the string will behave, not by solving any equations (yet), but by physically thinking about the situation.
(b) Convert the wave equation and the initial conditions into an ODE for the Laplace transform Y(x, s).
(c) Find the general solution to this ODE.
(d) It takes time for waves to physically propagate along a string. Since the string starts with zero displacement and velocity and is only being excited at x = 0, it should obey the implicit boundary condition $\lim_{x\to\infty} y(x, t) = 0$. Use this boundary condition to solve for one of the arbitrary constants in your solution.
(e) Use the explicit boundary condition at x = 0 to solve for the other arbitrary constant and thus find the Laplace transform Y(x, s).
(f) Take the inverse Laplace transform to find the solution y(x, t). Your answer will involve the “Heaviside step function” H(x) (see Appendix K).
(g) Explain in words what the string is doing. Does the solution you found behave like the sketches you made in Part (a)? If not, explain what is different.
CHAPTER 12
If you have gone through Chapter 11 then you are familiar with the process of solving a par-
tial differential equation (PDE) by separation of variables. This process leads to an ordinary
differential equation (ODE) whose solution may be sines and cosines, but it may also be
less familiar functions: Bessel functions, Legendre polynomials, and others. Solving a PDE
often requires you to recognize an ODE and its solutions, work with the properties of those
solutions, and build a series from those solutions.
In this chapter you will see a lot of the mathematical background that we glossed over
in Chapter 11. You will see how ordinary differential equations can lead to those Bessel
functions, Legendre polynomials, and others. Along the way you will learn an important
technique for solving ordinary differential equations by assuming a solution in the form
of a power series (or a “generalized” power series). And you will encounter a number of
important functions with which you may have been previously unfamiliar.
the steps we skip—the times we just look up an answer and accept it instead of solving for it.
Many of those are the same steps we skipped in Chapter 11, and the rest of this chapter is
meant to fill in those gaps.
A circular drumhead of radius a is allowed to vibrate. If the initial state of the drum has
“azimuthal symmetry” (no 𝜙 dependence) then the drum will continue to have azimuthal
symmetry over time. Under that circumstance the height z is a function of the polar distance
𝜌 and the time t, and is governed by the following PDE (where v is a constant).
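$$\frac{\partial^2 z}{\partial t^2} = v^2\left(\frac{\partial^2 z}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial z}{\partial \rho}\right) \qquad (12.1.1)$$
(the wave equation in polar coordinates with azimuthal symmetry)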
The technique called “Separation of Variables” tells us to write the solution as the product
of two different functions.
z(𝜌, t) = R(𝜌)T (t)
The next few steps—covered in Chapter 11 but not relevant for our purpose here—take us
to the two equations below. Note the introduction of a new constant k.
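For reference, here is one way the separation can go. It is a sketch under our own assumptions: the separation constant is written as $-k^2$, and the equation labels are inferred from how Parts 1 and 2 of this exercise refer to them. Substituting $z = R(\rho)T(t)$ into Equation 12.1.1 and dividing by $v^2 R\,T$ puts all the t dependence on one side and all the $\rho$ dependence on the other, so both sides must equal a constant.

```latex
\frac{1}{v^2}\frac{T''(t)}{T(t)}
  = \frac{R''(\rho) + \frac{1}{\rho}R'(\rho)}{R(\rho)}
  = -k^2
\quad\Longrightarrow\quad
\begin{aligned}
  T''(t) &= -k^2 v^2\, T(t) && \text{(assumed to be 12.1.2)}\\
  \rho^2 R''(\rho) + \rho R'(\rho) + k^2\rho^2 R(\rho) &= 0 && \text{(assumed to be 12.1.3)}
\end{aligned}
```

The exact form printed in the text may differ by rearrangement, but any equivalent pair leads to the same steps.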
1. Find the general solution to Equation 12.1.2. You should be able to do this by
inspection.
2. Write the general solution to Equation 12.1.3 by matching it to one of the ODEs in
Appendix J. Hint: some of the ODEs in the appendix are actually classes of ODEs
identified by different values of a parameter p. Equation 12.1.3 is the p = 0 version of
one of the listed ODEs.
See Check Yourself #79 in Appendix L
3. An “implicit boundary condition” that R(𝜌) must obey is that it must be finite at 𝜌 = 0.
Look up in Appendix J the basic properties of the functions you found in Part 2, and
use this implicit boundary condition to set one of your two arbitrary constants to zero.
4. The other boundary condition is R(a) = 0 because the outer edge of the drum can’t
move up and down. Use that boundary condition to limit the possible values of k, once
again using information from Appendix J. There are infinitely many possible values for
k so your answer will contain a new parameter n, where n can be any positive integer.
See Check Yourself #80 in Appendix L
This should all make sense if you’re willing to simply accept the information in Appendix J,
but where did that information come from? Sections 12.4 and 12.6 will show you two (closely
related) techniques for solving ODEs such as Equation 12.1.3, and Sections 12.5 and 12.7 will
apply those techniques to two common ODEs to derive the properties of two of the functions
in Appendix J.
There is another gap, maybe harder to see. At the 𝜌 = 0 boundary we did not specify a
value of R; we merely asserted that it must be finite. At the 𝜌 = a boundary we needed a
specific value. In Section 12.8 we will see a rule for determining what kind of condition is
needed at a boundary.
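If you would like to check your answer to Part 4 numerically, the sketch below may help. It assumes that Part 2 led you to the order-zero Bessel function $J_0$, evaluates it from its Maclaurin series, and finds its first three zeros by bisection. The drum radius a = 1 and the bracketing intervals are hypothetical choices, and the helper names are ours.

```python
def bessel_j0(x, terms=60):
    """J0(x) from its Maclaurin series: sum over m of (-1)^m (x/2)^(2m) / (m!)^2."""
    total = 0.0
    term = 1.0  # the m = 0 term
    for m in range(terms):
        total += term
        # Ratio of consecutive terms is -(x/2)^2 / (m+1)^2.
        term *= -((x / 2.0) ** 2) / (m + 1) ** 2
    return total

def bisect_root(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi], assuming f changes sign on that interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

a = 1.0  # hypothetical drum radius
brackets = [(2.0, 3.0), (5.0, 6.0), (8.0, 9.0)]  # intervals where J0 changes sign
zeros = [bisect_root(bessel_j0, lo, hi) for lo, hi in brackets]
k_values = [alpha / a for alpha in zeros]  # R(a) = 0 requires k * a to be a zero of J0

for n, k in enumerate(k_values, start=1):
    print(n, k)
```

The series is adequate for the modest arguments used here; for large arguments a library routine would be a better choice.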
You’re not done yet, though. You still need to combine these functions into a solution
z(𝜌, t) and match initial conditions.
5. Multiply your solutions R(𝜌) and T (t) to find a solution z(𝜌, t). You should be able to
absorb the remaining arbitrary constants in R(𝜌) into the arbitrary constants in T (t)
to give you a solution z(𝜌, t) with two arbitrary constants. Don’t write k in your answer;
use the value of k that you found above.
See Check Yourself #81 in Appendix L
6. The solution you just found is called a “normal mode” of the PDE. There are infinitely
many such normal modes, one for each value of n. Write the general solution z(𝜌, t)
as a sum of all of these normal modes. Since the arbitrary constants can take different
values for each n, write a subscript n on them.
7. For simplicity we’ll use as one initial condition ż(𝜌, 0) = 0. Plug this into your solution
and show that one of the two arbitrary constants must be zero. This should leave you
with an infinite series with just one undetermined set of constants Bn . (Of course you
may have used a different letter, but we’ll refer to it as Bn from here on.)
8. Our other initial condition will be an unspecified function z(𝜌, 0) = f (𝜌). Plugging
t = 0 into your solution for z, write f (𝜌) as an infinite series.
See Check Yourself #82 in Appendix L
9. Use Appendix J to find how to determine the coefficients of a series of the type you
just wrote. Express Bn as an integral involving the unknown function f (𝜌).
Once again it all worked because we just invoked Appendix J. Can any initial condition f (𝜌)
be written in a series of this form? It turns out the functions that you found in the
appendix have a property called “completeness,” which essentially means yes, any f (𝜌) can
be written as a sum of these functions. Section 12.8 will explain under what circumstances
the solutions of a given ODE are complete in this sense, and will also explain how to derive
the formula you used for finding the coefficients Bn of such a series.
At the end of Section 12.8 we will return to our circular drum and see how the material
in the chapter has filled in all the gaps we left in this section.
If we wanted the first term to be positive we would replace $(-1)^n$ with $(-1)^{n+1}$.
When you are generating a Fourier series you may end up with the term
cos(𝜋n). This is another alternator because cos(0) = cos(2𝜋) = cos(4𝜋) = 1, while
cos(𝜋) = cos(3𝜋) = cos(5𝜋) = −1. So for integer n you can always replace cos(𝜋n) with
$(-1)^n$ (and you should).
$$\sum_{\text{odd } n} x^n \quad\text{means}\quad \sum_{n=1}^{\infty} x^n \ (\text{odd } n \text{ only}) \quad\text{means}\quad x + x^3 + x^5 + \dots$$
Sometimes it is useful, however, to write such series without the extra note. Replace n
with 2n + 1 and let n go over all integers from 0 to ∞. The 2n + 1 will then pick out
only the odd integer values. For example, the series above can be written as:
$$\sum_{n=0}^{\infty} x^{2n+1} = x + x^3 + x^5 + \dots$$
Similarly, replacing n with 2n picks out the even terms. In unusual cases where we want
every third term we could replace n with 3n (or 3n + 1 or 3n + 2).
3. Product notation
Just as a capital Sigma, $\sum$, is used for a sum, a capital Pi, $\prod$, is used for a product. For
example:
$$\prod_{n=1}^{3} (x + n)^2 = (x + 1)^2 (x + 2)^2 (x + 3)^2$$
$$\sum_{n=1}^{6} n^2 \quad\text{is the same series as}\quad \sum_{n=-1}^{4} (n + 2)^2$$
The first series is 1 + 4 + 9 + 16 + 25 + 36, and the second series is term by term the
same. We can say that we got the second series by replacing n with n + 2 everywhere in
the first series, so $n^2$ became $(n + 2)^2$, the starting condition n = 1 became (n + 2) = 1,
and similarly for the ending condition n = 6.
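A term-by-term comparison like this is easy to automate. The short sketch below lists both series and compares them; the variable names are ours.

```python
# Terms of the first series: n^2 for n = 1, ..., 6
first = [n**2 for n in range(1, 7)]

# Terms of the second series: (n + 2)^2 for n = -1, ..., 4
second = [(n + 2) ** 2 for n in range(-1, 5)]

print(first)
print(second)
print(first == second, sum(first), sum(second))
```

Listing the terms rather than only comparing the totals mirrors the advice to check the first few terms whenever you change the index of a sum.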
1 The first form should only be used if the starting value of n is unambiguous from context.
Rewriting a series in this way does not seem to have an official name that we
could find, so we call it “reindexing” the series. This trick is useful when we want to
combine sums. For example, suppose we wanted to add these two infinite series.2
$$f(x) = \sum_{n=0}^{\infty} a_n x^n + \sum_{n=0}^{\infty} b_n x^{n+2} \qquad (12.2.1)$$
Rewriting this as $\sum_{n=0}^{\infty} \left(a_n x^n + b_n x^{n+2}\right)$ is perfectly valid but not very helpful. It’s more
useful to reindex the second sum so that both series are based on the same power
of x.
$$f(x) = \sum_{n=0}^{\infty} a_n x^n + \sum_{n=2}^{\infty} b_{n-2}\, x^n \qquad (12.2.2)$$
The most important thing here is to convince yourself that we haven’t changed anything.
The second series in Equation 12.2.1 is $b_0 x^2 + b_1 x^3 + \dots$ and the second series
in Equation 12.2.2 is also $b_0 x^2 + b_1 x^3 + \dots$ Whenever you are unsure of a reindexing
step, check the first few terms to make sure they haven’t changed.
Now we can put together Equation 12.2.2 in a more useful form.
∙ The n = 0 term (corresponding to $x^0$) and the n = 1 term (corresponding to $x^1$)
exist only in the first series, so we will write them separately.
∙ All subsequent terms exist in both series, so we will combine them into one series.
$$f(x) = a_0 + a_1 x + \sum_{n=2}^{\infty} (a_n + b_{n-2})\, x^n \qquad (12.2.3)$$
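To see that reindexing and combining really leave the series unchanged, you can compare the two forms on a truncated example. The sketch below uses made-up coefficients, a made-up value of x, and exact fractions so that the comparison is not blurred by rounding. Both forms keep every term up to $x^N$.

```python
from fractions import Fraction

N = 8  # highest power of x kept (hypothetical truncation)
a = [Fraction(1, n + 1) for n in range(N + 1)]  # made-up coefficients a_n
b = [Fraction(n + 1, 3) for n in range(N + 1)]  # made-up coefficients b_n
x = Fraction(1, 2)

# Equation 12.2.1: the second sum runs over n = 0..N-2 so its powers stop at x^N.
original = (sum(a[n] * x**n for n in range(N + 1))
            + sum(b[n] * x ** (n + 2) for n in range(N - 1)))

# Equation 12.2.3: two lone terms, then one combined sum over n = 2..N.
combined = a[0] + a[1] * x + sum((a[n] + b[n - 2]) * x**n for n in range(2, N + 1))

print(original, combined, original == combined)
```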
Those are relatively simple function definitions, but it may not be obvious what they have
to do with sines and cosines (or with hyperbolas for that matter). Below we compare the “nor-
mal” trig functions to their hyperbolic cousins. You will prove a number of these properties
in the problems.
3 Strictly speaking we could have left off trig functions since you can get them by plugging complex arguments into
exponentials. Less obviously you can construct the inverse trig functions from logarithms with complex arguments,
so they are also elementary.
Problem:
Solve the equation $x''(t) = k^2 x(t)$ subject to the initial conditions x(0) = 3 and
$x'(0) = 5$.
One approach: The general solution to this differential equation can be written
$x(t) = Ae^{kt} + Be^{-kt}$. The initial conditions in this case become A + B = 3 and
Ak − Bk = 5. To find the constants you solve these two equations simultaneously.
For instance, you might rewrite the first equation as B = 3 − A and substitute into the
second equation to get Ak − (3 − A)k = 5, which solves to A = (3k + 5)∕(2k). A similar
method leads to B = (3k − 5)∕(2k).
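The sketch below checks these constants numerically for one hypothetical value of k. It also compares the exponential form with the equivalent hyperbolic form $x(t) = 3\cosh(kt) + (5/k)\sinh(kt)$, which we state here as our own addition; it follows from the definitions of sinh and cosh. The function names and the finite-difference step are illustrative choices.

```python
import math

k = 2.0  # hypothetical value of k
A = (3 * k + 5) / (2 * k)
B = (3 * k - 5) / (2 * k)

def x_exp(t):
    """Solution written with exponentials."""
    return A * math.exp(k * t) + B * math.exp(-k * t)

def x_hyp(t):
    """The same solution written with hyperbolic functions."""
    return 3 * math.cosh(k * t) + (5 / k) * math.sinh(k * t)

def derivative(f, t, h=1e-5):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

print(x_exp(0.0), derivative(x_exp, 0.0))  # should be close to 3 and 5
print(x_exp(0.7), x_hyp(0.7))              # the two forms should agree
```

In the hyperbolic form the constants are simply x(0) and x′(0)/k, so no simultaneous equations are needed.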
As a final note, replacing x with ix turns the definitions of sinh x and cosh x into formulas
for sin x and cos x. Bear in mind, however, that all these functions are real-valued for any real
input.
Take a moment to convince yourself that Equation 12.3.2 does in fact define a function of x,
not a function of t. If you choose x = 1 you can evaluate that integral (using for instance a
Riemann sum with a lot of slices) and you will get a number. If you choose x = 2 you will get
a different number. The letter t is a “dummy variable” that disappears before the calculation
is done. For any given x-value the error function gives you a number, which represents (apart
from a few constants thrown in for convenience) the area under the “Gaussian” curve $e^{-t^2}$
between 0 and x.
4 The name “error function” has the unfortunate side effect that when you ask a computer to solve a differential
equation and it responds with an error function, it looks like the computer is reporting a mistake.
As an example, take a moment (without a computer) to find $\int_0^1 e^{-t^2}\,dt$. Go on, try it… we’ll wait.
Got stuck, didn’t you? You can see the area we want on the drawing. You can see that the integral
is finite and positive. But you can’t put together sines, logs, and exponents into a function whose
derivative comes out $e^{-t^2}$. That’s why we define a
new function, called the error function, whose only definition is that it is that antiderivative.
FIGURE 12.2 A Gaussian Function.
Here are a few things to know about that integral and that function.
∙ The total area under $e^{-t^2}$ between 0 and ∞ is $\sqrt{\pi}/2$, as you will prove in Problem 12.49.
The error function is therefore defined with $2/\sqrt{\pi}$ in front, so $\lim_{x\to\infty} \operatorname{erf} x = 1$.
∙ Because $e^{-t^2}$ approaches zero very quickly, the vast majority of the area under the curve
occurs very near the y-axis. For instance, erf 1 is over 0.8 and erf 2 is over 0.995. That
means that less than 0.5% of the total area under this curve is found to the right of
x = 2.
∙ And by the way, there’s a name for that. Just as the error function erf x measures the
area under the curve to the left of x, the “complementary error function” erfc x measures
the area under the curve to the right of x. You can see that erfc x = 1 − erf x.
∙ You will show in Problem 12.26 that erf is an odd function, meaning erf(−x) = −erf x.
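The properties above are easy to explore with the Riemann-sum idea mentioned earlier. The sketch below approximates erf x with a midpoint Riemann sum and compares it with the built-in `math.erf` from Python's standard library. The function name and the number of slices are our own choices.

```python
import math

def erf_riemann(x, slices=100_000):
    """Midpoint Riemann sum for (2/sqrt(pi)) times the integral of exp(-t^2) from 0 to x."""
    dt = x / slices
    area = sum(math.exp(-(((i + 0.5) * dt) ** 2)) for i in range(slices)) * dt
    return 2.0 / math.sqrt(math.pi) * area

for x in (1.0, 2.0, -1.0):
    approx = erf_riemann(x)
    # Columns: x, Riemann-sum estimate, library value, complementary value 1 - erf x
    print(x, approx, math.erf(x), 1.0 - approx)
```

Calling the function with a negative x makes every slice width negative, so the same code also illustrates that erf is odd.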
The error function has many uses but the one that gives it its name comes from measurement
theory, where it tells you the probability of getting a particular “error” when you make a
measurement. This is because so many probability functions can be described by “Gaussian”
functions, which means variations of $e^{-x^2}$.
For example, the height of adult U.S. men in 2003–2006 falls along a Gaussian curve with
its center (the average height) at 176.3 cm.5 If you pick an adult man at random in the U.S.
and measure his height, he will almost certainly not be exactly 176.3 cm tall. In fact, he’s
probably not within 5 cm of that height. On the other hand, he probably is within 10 cm of
the average, and almost certainly within 30. The function P (x) below gives the probability
that a randomly selected American man falls within x cm of 176.3.
$$P(x) = \frac{1}{8\sqrt{\pi}} \int_0^x e^{-(t/16)^2}\,dt \qquad (12.3.3)$$
For example, P (3) represents the probability that a randomly chosen man will have a height
between 173.3 cm and 179.3 cm. In Problem 12.30 you’ll rewrite this probability function in
terms of erf x.
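You can evaluate Equation 12.3.3 numerically without knowing anything about the error function. The sketch below uses a midpoint Riemann sum; the slice count and the sample values of x are our own choices.

```python
import math

def P(x, slices=20_000):
    """Midpoint-rule estimate of Equation 12.3.3."""
    dt = x / slices
    total = sum(math.exp(-((((i + 0.5) * dt) / 16.0) ** 2)) for i in range(slices)) * dt
    return total / (8.0 * math.sqrt(math.pi))

# Probability of being within x cm of the average height, for a few values of x
for x in (3, 5, 10, 30):
    print(x, round(P(x), 3))
```

The printed values can be compared with the statements above about 5 cm, 10 cm, and 30 cm.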
9! = 1 × 2 × 3 × 4 × 5 × 6 × 7 × 8 × 9 = 362,880
The value of 0! is defined to be 1. The factorial function is not defined for negative numbers
or non-integers. For many purposes, including some Taylor series and probability calcula-
tions, a factorial function that is restricted to non-negative integers is enough.
5 “Anthropometric Reference Data for Children and Adults: United States, 2003–2006,” Margaret McDowell et al.,
National Health Statistics Reports, Number 10, October 22, 2008. All height and weight statistics cited in this chapter
come from that report.
This definition has x − 1 instead of x, so the relation between the gamma function and the
factorial function is off by one. That is, Γ(7) is not the same as 7! but instead is 6!.
Γ(x) = (x − 1)!
We can think of factorials in terms of the recurrence relation they follow as they move up.
For instance, we said above that 9! = 362,880. If we now asked you for 10! you could immediately
answer 3,628,800. The equation n! = n × (n − 1)! expresses this relationship in general
form. The recurrence relation for the gamma function looks the same except that it is offset
by one.
Γ(x) = (x − 1) × Γ(x − 1) (12.3.5)
In Problem 12.39 you will show that this recurrence relation follows from Equation 12.3.4.
(This amounts to another proof that the gamma function tracks the factorial function at the
integers.) In other problems you will use this recurrence relation to investigate the behavior
of the gamma function, such as how it acts on negative numbers.
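Python's standard library provides `math.gamma`, so both claims are easy to test. The sketch below first compares Γ(n) with (n − 1)! at the positive integers, then uses the recurrence relation (Equation 12.3.5) to step down from a positive argument to negative ones. The starting value 0.3 and the helper name are arbitrary choices for illustration.

```python
import math

# Gamma(n) should match (n - 1)! at the positive integers.
for n in range(1, 8):
    print(n, math.gamma(n), math.factorial(n - 1))

def gamma_step_down(x, gamma_x):
    """Given Gamma(x), rearrange Gamma(x) = (x - 1) * Gamma(x - 1) to get Gamma(x - 1)."""
    return gamma_x / (x - 1)

x, g = 0.3, math.gamma(0.3)  # arbitrary positive starting point
below = []
for _ in range(3):
    g = gamma_step_down(x, g)
    x -= 1
    below.append((x, g))

# Compare the recurrence results with the library's values at negative arguments.
for x_val, g_val in below:
    print(x_val, g_val, math.gamma(x_val))
```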
We shall see the gamma function appearing in other special functions in this chapter such
as Bessel functions.
As a final note, for very large x the factorial function can be approximated by “Stirling’s
formula” $x^x e^{-x}\sqrt{2\pi x}$. Because this approximates x! it also approximates Γ(x + 1). This
approximation can be useful when factorials or gamma functions appear in functions that need to
be differentiated or integrated.
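A quick numerical comparison shows how good the approximation is. The sketch below prints the ratio of Stirling's formula to the exact factorial for a few values of x of our own choosing.

```python
import math

def stirling(x):
    """Stirling's formula: x^x * e^(-x) * sqrt(2*pi*x)."""
    return x**x * math.exp(-x) * math.sqrt(2 * math.pi * x)

for x in (5, 20, 100):
    exact = math.factorial(x)
    print(x, stirling(x) / exact)  # the ratio approaches 1 as x grows
```

For much larger x the intermediate values overflow ordinary floating point, so in practice you would compare logarithms instead.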
Stepping Back
You might expect us to say at this point “hyperbolic trig functions, the error function, and
the gamma function are all very important, and you should be intimately familiar with
them.” These functions are important—and you may encounter any or all of them in various
applications—but there are a lot of other important functions out there. We can’t possibly
list them all, and it wouldn’t necessarily do you a lot of good if we did.
So we’d rather say this. You may frequently have the experience of typing an equation
into a computer and getting back a function you’ve never heard of. Don’t panic! Look up
the function you found—in the online help for your math software, or on the Web, or in the
appendices to this book—and find the properties you need to know. Does this function have
zeros? What does its graph look like? How does it behave as x → ∞? Whatever question you
started with, you can probably answer it based on the properties of your function.
you will show that $\lim_{x\to\infty} \operatorname{erf} x = 1$. Based on those facts and your conclusions above, draw a sketch of y = erf x. You don’t need to have done those problems to answer this one.
(e) Use a computer to graph y = erf x as x goes from −6 to 6. If the result does not match your drawing, figure out what went wrong.
12.28 Find $\frac{d}{dx}(\operatorname{erf} x)$.
12.29 Find the Maclaurin series expansion of erf x up to third order.
12.30 Equation 12.3.3 represents the probability of a randomly chosen American man being within x cm of the average height of 176.3.
(a) With a computer you can determine that P(5) ≈ 0.34. Explain what this result means in terms that a typical 7th grade student could understand.
(b) Use a u-substitution to express P(x) in terms of an error function.
12.31 Standard Deviation When you measure a quantity Q that varies among many different individuals, such as height, weight, IQ, hair length, etc., you find an average value <Q> and some range of variation around that value. Many such properties are “normally distributed,” which means that the probability of an individual being within x of the average value equals
$$P(x) = \frac{\sqrt{2}}{s\sqrt{\pi}} \int_0^x e^{-(1/2)(Q/s)^2}\,dQ$$
for some constant s, which is known as the “standard deviation” of Q.
(a) Rewrite P(x) in terms of an error function.
(b) What is $\lim_{x\to\infty} P(x)$? Explain why this result makes sense in terms of what P(x) means.
(c) Roughly what percentage of individuals will be within s of the mean value?
(d) Using data from the Explanation (Section 12.3.1), what is the standard deviation of adult American men’s heights? Hint: it’s not 16 cm.
12.32 [This problem depends on Problem 12.31.]
(a) The average height of U.S. adult women is 162 cm with a standard deviation of 11 cm. What fraction of U.S. adult women are between 151 cm and 173 cm?
(b) What fraction of U.S. adult women are taller than 184 cm? Write your answer in terms of an error function. Hint: start by finding the probability that a woman is outside the range 140 cm–184 cm and then halve that result.
(c) Express the answer to Part (b) as a percentage by asking a computer for the value of the erf function you found.
(d) Your goal now is to ask the question “What fraction of U.S. adult men are taller than ___ cm?” and have the answer come out the same as it did in Part (b). What height should you use?
12.33 Calculate Γ(3) by starting with Equation 12.3.4 and using integration by parts.
12.34 Simplify $\Gamma\left(3\tfrac{1}{2}\right)/\Gamma\left(2\tfrac{1}{2}\right)$. Hint: you should not have to find $\Gamma\left(3\tfrac{1}{2}\right)$ to find this ratio.
12.35 Simplify Γ(13.3)∕Γ(11.3). Hint: you should not have to find Γ(13.3) to find this ratio.
12.36 Simplify Γ(x + 17)∕Γ(x + 13).
12.37 Write (x + 4)(x + 5)(x + 6) … (x + 20) as the ratio of two gamma functions.
12.38 In this problem you’re going to show that $n! = \int_0^\infty t^n e^{-t}\,dt$.
(a) Show that this formula works for n = 0.
(b) Using integration by parts, show that $\int_0^\infty t^{n+1} e^{-t}\,dt = (n + 1)\int_0^\infty t^n e^{-t}\,dt$.
(c) Using your results from Parts (a) and (b), prove that $n! = \int_0^\infty t^n e^{-t}\,dt$ for all positive integers n.
12.39 Using integration by parts and the definition of the gamma function, prove that Γ(x + 1) = xΓ(x).
12.40 In this problem you will examine the behavior of the gamma function for negative x-values based on the recurrence relation (Equation 12.3.5). No computer should be required until Part (e).
(a) We begin with the fact that $\Gamma(1/2) = \sqrt{\pi}$. (You will prove that in Problem 12.42; for this problem, take it as a given.) Given that fact and the recurrence relation, calculate Γ(−1∕2), Γ(−3∕2), Γ(−5∕2), and Γ(−7∕2).
(b) It is also true (and much more obvious) that Γ(1) = 1. Given that and the recurrence relation, what can you conclude about Γ(0)? From that, what can you conclude about Γ(−1), Γ(−2), and other negative integers?
(c) Γ(x) is positive for any x > 0. Given that, for what negative x-values is Γ(x) positive, and for what values is it negative?
(d) Based on all your conclusions above, draw a graph of the gamma function for x < 0.
6 Strictly speaking N in this formula is the number of ways the atoms can oscillate, which is typically three times
true graph of the beta function is a surface, but we can get a sense for its behavior from a sequence of curves.
(a) Draw graphs of y = B(x, 1), y = B(x, 3), and y = B(x, 10). Your graphs should be confined to the region x > 0.
(b) What is $\lim_{x\to\infty} B(x, y)$ in each case?
(c) The three functions are similar in many ways. How are they different?
12.46 You are using a differential equation to model an important real-valued quantity starting at time t = 0 and a computer has just told you that your quantity follows an “Airy function” (sometimes called an “Airy function of the first kind”) Ai(t). You may never have heard of an Airy function. Look it up and answer the following questions about your quantity.
(a) What limit will it approach as t → ∞?
(b) Will it oscillate, grow monotonically, decay monotonically, or behave in some other way?
(c) Will it ever equal zero? If so, will it do it a finite number of times or an infinite number?
(d) Does this seem like a plausible model for the height of a bouncing rubber ball? The mass of a melting ball of ice? The speed of an orbiting planet? In each case say why or why not.
12.47 You are using a differential equation to model an important real-valued quantity starting at time t = 0 and a computer has just told you that your quantity follows an “Airy function” (sometimes called an “Airy function of the first kind”) Ai(−t). (If you did Problem 12.46 note how this one is different: t is positive but the argument of Ai is negative.) You may never have heard of an Airy function. Look it up and answer the following questions about your quantity.
(a) What limit will it approach as t → ∞?
(b) Will it oscillate, grow monotonically, decay monotonically, or behave in some other way?
(c) Will it ever equal zero? If so, will it do it a finite number of times or an infinite number?
(d) Does this seem like a plausible model for the height of a bouncing rubber ball? The mass of a melting ball of ice? The speed of an orbiting planet? In each case say why or why not.
12.48 Exploration: Fresnel Integrals. Two special functions that we did not discuss are the “Fresnel integrals.” (These are two different functions, often denoted C(x) and S(x) in the literature.) Add them to this section. You should write an explanation, including the definition of the functions and what they are used for. You should also write a problem or two designed to help a student graph these functions by hand (and solve your own problems).
12.49 Exploration: The Error Function at Infinity
$$\lim_{x\to\infty} \operatorname{erf} x = \frac{2}{\sqrt{\pi}} \int_0^\infty e^{-t^2}\,dt$$
It’s not obvious that that integral will even give us a finite answer, much less what that answer is. And we can’t find it by evaluating the antiderivative, because the error function is the only antiderivative we’re going to get. But Gauss discovered a clever trick that can be used to find this integral.
We start by defining a constant that’s equal to the integral we are looking for: $S = \int_0^\infty e^{-t^2}\,dt$. (You’ll multiply it by $2/\sqrt{\pi}$ later.) Therefore $S^2 = \left(\int_0^\infty e^{-t^2}\,dt\right)\left(\int_0^\infty e^{-t^2}\,dt\right)$, the product of two integrals.
(a) The next step is to replace every t in the first integral with x, and every t in the second integral with y. Explain briefly how we know that this does not change the answer.
(b) Combine the two integrals in your expression for $S^2$ into one double integral. (The resulting integral is “separable” which would turn it back into two separate integrals. We don’t want to do that, but it gives you a quick check on your answer.)
(c) Use the laws of exponents to combine the two exponential functions into one.
(d) Looking at your limits of integration, what 2D region are you integrating over?
(e) Rewrite your double integral in polar coordinates. This involves rewriting the integrand in terms of 𝜌 and 𝜙 and also replacing dx dy with 𝜌 d𝜌 d𝜙. Use your answer to Part (d) to figure out the limits of integration in polar coordinates.
(f) You started with an integral that was impossible to evaluate. In polar coordinates you should now have one that is easy to evaluate. Do so and find the value of $S^2$. Then take the square root to get S, the integral you were originally looking for.
(g) So what is $\lim_{x\to\infty} \operatorname{erf} x$?
$$y = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + c_4 x^4 + c_5 x^5 + \dots \qquad (12.4.2)$$
So instead of saying “we need to find the mystery function” we can reframe the question as
“we need to find the mystery coefficients” (which, in turn, define the function itself).
1. Based on Equation 12.4.2 write a formula for dy∕dx.
See Check Yourself #83 in Appendix L
Just to save you a bit of time, we will tell you (although we hope you could rapidly figure this
out yourself, or possibly even have it memorized) that:
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$$
2. Plug Equation 12.4.2, and your answer to Part 1, and the Maclaurin expansion of sin x,
into Equation 12.4.1.
3. Combine like powers of x on the right side of your answer to Part 2.
4. Your equation in Part 3 has a constant term on the left and a constant term on the
right. Write an equation asserting that these two must be equal.
5. Your equation in Part 3 has a coefficient of x on the left and a coefficient of x on the
right. Write an equation asserting that these two must be equal.
See Check Yourself #84 in Appendix L
6. In like manner, write equations setting the coefficients of $x^2$, $x^3$, and $x^4$ equal to each
other. You now have five equations relating the coefficients.
7. Choose c0 = 0. Based on the equations you have written, find all the other coefficients
up to c5 .
8. Based on your coefficients, write down a solution to Equation 12.4.1. It will not be an
exact answer, but it will be the power series of the solution up to the fifth term. Hint:
several of the terms will be zero, and some won’t.
Such a series gives us one way of identifying a function. So instead of saying “I’m thinking
of the function that raises e to a given power,” you can instead say “I’m thinking of the func-
tion whose nth Maclaurin coefficient is 1∕(n!).” Equation 12.4.3 tells us that these are two
equivalent statements.
So, one approach we can use to solve differential equations is to look for the solution in
power series form. Instead of saying “I’m looking for this function in closed form” we say
“I’m looking for the coefficients of the power series for this function.”
Population Growth
We begin with one of the simplest and most important differential equations, the equation
for unconstrained population growth: “The more rabbits (or trees or fish or amoebas) you
have in this generation, the more you add in the next.”
$$\frac{dP}{dt} = kP \qquad (12.4.4)$$
We hope you know how to solve that equation by separating variables or even by just looking
at it. But we’re going to use a guess-and-check approach where our guess is the generic
Maclaurin series.
$$P(t) = c_0 + c_1 t + c_2 t^2 + c_3 t^3 + c_4 t^4 + \dots \qquad (12.4.5)$$
As always, we will plug the guess into the differential equation and solve for the constants; if
we find a working solution, that retroactively proves we made a good guess. This differs from
our “guess and check” examples in earlier chapters only because we now have an infinite
number of constants to solve for.
From Equation 12.4.5 we conclude that
Now comes the key step. If these two polynomials equal each other for all values of t, their
coefficients must all be the same. (To put it another way, one function cannot have two
different Maclaurin series.) So Equation 12.4.7 requires that the constant term on the left
equal the constant term on the right, and the coefficient of t on the left equal the coefficient
of t on the right, and so on. This gives us the following equations.
… and so on. There are infinitely many such equations, but we can summarize them all in
two rules.
1. The constant term, c0 , can be anything.
2. For any n > 0, the coefficients must follow the relationship $c_n = (k/n)\,c_{n-1}$.
The second rule is a recurrence relation: it defines each coefficient in terms of its predeces-
sor. As an example, if we happen to start with c0 = 1, Equations 12.4.8 supply all the other
coefficients one at a time.
Don’t forget what all those coefficients mean! They go into Equation 12.4.5 to create one
valid solution to the differential equation.
$$P(t) = 1 + kt + \frac{k^2}{2}t^2 + \frac{k^3}{2 \times 3}t^3 + \ldots$$
Our recurrence relation tells us that the next coefficient will be $k^4/(4!)$ and so on. If you
know your Maclaurin series, you may recognize this function as $e^{kt}$. But that last step is not
essential, and in fact is generally only possible with toy examples like this one. More often
you will leave your final answer in the form of a Maclaurin series with a recurrence relation.
You can then approximate the function with arbitrary accuracy by evaluating partial sums.
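As a rough illustration of evaluating partial sums from a recurrence relation, the sketch below builds each coefficient from its predecessor using $c_n = (k/n)c_{n-1}$ and compares the result with $e^{kt}$. The function name `population_series`, the choice of 20 terms, and the sample values of k and t are our own assumptions, not part of the text.

```python
import math

def population_series(k, t, c0=1.0, terms=20):
    """Partial sum of P(t) = sum of c_n t^n, with c_n = (k/n) c_{n-1}."""
    coeff = c0      # c_0, the arbitrary constant P(0)
    power = 1.0     # t^n, starting from t^0
    total = coeff   # running partial sum, starting with the constant term
    for n in range(1, terms):
        coeff *= k / n      # recurrence relation: c_n = (k/n) c_{n-1}
        power *= t          # advance t^(n-1) to t^n
        total += coeff * power
    return total

# Compare the 20-term partial sum with the closed form c0 * e^(kt).
approx = population_series(k=0.5, t=2.0)
exact = math.exp(0.5 * 2.0)
print(approx, exact)
```

Nothing in the loop uses the closed form, so the same pattern works when the recurrence relation is all you have.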
As a final note before we move on, c0 is the arbitrary constant in this solution. In the
example above we chose c0 = 1, but any other number would lead to a valid solution. In
applications the arbitrary constant comes from initial conditions, and in fact you can see by
plugging t = 0 into both sides of Equation 12.4.5 that c0 = P (0).
$$P(t) = \sum_{n=0}^{\infty} c_n t^n$$
Now that we have P , we need P ′ (t). We can take the derivative inside the sum, which amounts
to taking the derivative of each individual term in the series.
$$\frac{dP}{dt} = \sum_{n=0}^{\infty} n c_n t^{n-1}$$
This time if you plug in n = 0 you get 0. (This corresponds to the fact that the first term of
our Maclaurin series, c0 , drops out when you take the derivative.) Plug in n = 1 and then
n = 2 and you will see Equation 12.4.6 begin to emerge.
The next step, just as before, is to plug the series representations of P and P ′ into the
equation we want to solve (Equation 12.4.4). We bring both series to one side of the equation
because in a moment we’re going to combine them.
$$\sum_{n=0}^{\infty} n c_n t^{n-1} - \sum_{n=0}^{\infty} k c_n t^n = 0$$
We want the powers of t to match, so we reindex one of the two series. (If this trick is unfa-
miliar to you, please see Section 12.2; reindexing will be a vital skill in this chapter.) We
arbitrarily chose the first one.
$$\sum_{n=-1}^{\infty} (n+1) c_{n+1} t^n - \sum_{n=0}^{\infty} k c_n t^n = 0$$
The first series has a $t^{-1}$ term that the second series does not, but that term turns out to be
zero so we drop it. All n ≥ 0 terms appear in both series, so we write them under one sum.
$$\sum_{n=0}^{\infty} \left[(n+1) c_{n+1} - k c_n\right] t^n = 0$$
For this Maclaurin series to equal zero, every individual term—that is, every coefficient—
must equal zero. So we set the coefficient of $t^n$ equal to zero and get $(n + 1)c_{n+1} = kc_n$, which
is the recurrence relation we found before. This relation holds for all n ≥ 0. Plug in n = 0
and it gives us c1 in terms of c0 . Plug in n = 1 and it gives us c2 in terms of c1 … and so on. But
nothing we have found constrains c0 in any way; it is the arbitrary constant in this problem.
We have now solved this problem twice with the same technique but different notation.
As you go through the example below you may want to write out each step term by term:
that will help you understand both series notation and the power series technique. (Or your
professor may require you to do so by assigning Problem 12.53.)
Problem:
Find the particular solution to $\dfrac{d^2y}{dx^2} - x^2\dfrac{dy}{dx} + y = \dfrac{1}{x+2}$ with y(0) = 0 and y′(0) = 1.
Solution:
We begin with our guess, which is the generic power series, and its derivatives. This
step is the same in all power series problems.
$$y = \sum_{n=0}^{\infty} c_n x^n \quad\rightarrow\quad y' = \sum_{n=0}^{\infty} n c_n x^{n-1} \quad\rightarrow\quad y'' = \sum_{n=0}^{\infty} n(n-1) c_n x^{n-2}$$
To represent the entire equation in power series form we need the Maclaurin series
for 1∕(x + 2). If you don’t know how to find that on your own, please review Taylor
series; it’s an important skill. (You may find the list of common Maclaurin series in
Appendix B a useful starting point for finding some series.)
$$\frac{1}{x+2} = \frac{1}{2} - \frac{x}{4} + \frac{x^2}{8} - \frac{x^3}{16} + \ldots = \sum_{n=0}^{\infty} \frac{(-1)^n}{2^{1+n}} x^n$$
With that in place we can rewrite our entire differential equation in terms of
Maclaurin series.
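$$\sum_{n=0}^{\infty} n(n-1) c_n x^{n-2} - \sum_{n=0}^{\infty} n c_n x^{n+1} + \sum_{n=0}^{\infty} c_n x^n = \sum_{n=0}^{\infty} \frac{(-1)^n}{2^{1+n}} x^n$$
$$\sum_{n=-2}^{\infty} (n+2)(n+1) c_{n+2} x^n - \sum_{n=1}^{\infty} (n-1) c_{n-1} x^n + \sum_{n=0}^{\infty} c_n x^n - \sum_{n=0}^{\infty} \frac{(-1)^n}{2^{1+n}} x^n = 0$$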
The n = −2 and n = −1 terms appear only in the first series, and both equal zero.
The n = 0 term appears in three out of four of the series, so we write it separately.
All n ≥ 1 terms appear in all four series, so we group them under one sum.
$$\left(2c_2 + c_0 - \frac{1}{2}\right) x^0 + \sum_{n=1}^{\infty} \left[(n+2)(n+1) c_{n+2} - (n-1) c_{n-1} + c_n - \frac{(-1)^n}{2^{1+n}}\right] x^n = 0$$
Of course $x^0$ is just 1, but we wrote it explicitly to make it clear that these represent all
the coefficients of $x^0$ in the sums that included that term. Because this entire sum
must equal zero, we can write two relationships. First, the total coefficient of $x^0$ must
be zero. Second, inside the sum, the total coefficient of $x^n$ for all n ≥ 1 must be zero.
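$$2c_2 + c_0 - \frac{1}{2} = 0 \quad\rightarrow\quad c_2 = \frac{1}{4} - \frac{c_0}{2} \tag{12.4.9}$$
$$(n+2)(n+1) c_{n+2} - (n-1) c_{n-1} + c_n - \frac{(-1)^n}{2^{1+n}} = 0 \quad\rightarrow\quad c_{n+2} = \frac{(n-1) c_{n-1} - c_n + (-1)^n/2^{1+n}}{(n+2)(n+1)} \quad (n \ge 1) \tag{12.4.10}$$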
Those two equations correspond to the general solution that we normally look for.
The first two constants are determined by initial conditions: c0 is y(0) and c1 is y′ (0).
Given those two values, Equation 12.4.9 gives us c2 and Equation 12.4.10 gives us c3
and beyond.
In this particular problem, c0 = 0 and c1 = 1. Equation 12.4.9 tells us that
c2 = 1∕4 − 0∕2 = 1∕4. Plugging n = 1 into Equation 12.4.10 tells us how to find c3
from c0 and c1 .
$$c_3 = \frac{0c_0 - c_1 + (-1)^1/2^2}{(3)(2)} = \frac{-1 - 1/4}{6} = -\frac{5}{24}$$
We can then plug n = 2 into the recurrence relation to find c4 from c2 and c3 , and so
on up. Here is the solution we find, up to the fifth power.
$$y = x + \frac{1}{4}x^2 - \frac{5}{24}x^3 + \frac{7}{96}x^4 + \frac{31}{960}x^5 + \ldots$$
Plugging that solution into the left-hand side of the differential equation gives us
$(1/2) - (x/4) + (x^2/8) + \ldots$. It would take infinitely many terms to reproduce
1∕(x + 2) exactly, but recreating the first few terms is a good sign.
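The recurrence in this example can also be run mechanically. The sketch below applies Equations 12.4.9 and 12.4.10 with exact fractions; the function name `series_coefficients` and its arguments are our own choices for illustration.

```python
from fractions import Fraction

def series_coefficients(c0, c1, n_max):
    """Coefficients c_0..c_n_max for y'' - x^2 y' + y = 1/(x + 2)."""
    c = [Fraction(c0), Fraction(c1)]
    # Equation 12.4.9: c_2 = 1/4 - c_0/2
    c.append(Fraction(1, 4) - c[0] / 2)
    # Equation 12.4.10 (valid for n >= 1) gives c_{n+2} from c_{n-1} and c_n.
    n = 1
    while len(c) <= n_max:
        forcing = Fraction((-1) ** n, 2 ** (n + 1))   # (-1)^n / 2^(1+n)
        c.append(((n - 1) * c[n - 1] - c[n] + forcing) / ((n + 2) * (n + 1)))
        n += 1
    return c[: n_max + 1]

# Initial conditions y(0) = 0 and y'(0) = 1 set c_0 and c_1.
coeffs = series_coefficients(0, 1, 5)
print(coeffs)
```

Changing the first two arguments gives the series for other initial conditions without redoing any algebra.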
$$\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y = f(x) \tag{12.4.11}$$
But there is a further restriction: the functions a1 (x), a0 (x) and f (x) must be “analytic,” which
means they can be represented in power series form. If that is the case then the solution y(x)
will also be analytic, and this method will find the coefficients of its power series.
As an example, consider a differential equation that includes $\sqrt{x}$. That function is not
differentiable at x = 0 so it cannot be represented by a Maclaurin series. You could still use
this method, but you would have to use a Taylor series centered on some other value such
as x = 1.
Note also that some Taylor series converge only within a given interval. The function
1∕(1 − x) can be represented by $1 + x + x^2 + x^3 + \ldots$ but only within the interval −1 < x < 1.
So a differential equation containing this function could be solved by the method of power
series, but the solution would only be valid on that domain.
Section 12.6 will present a generalization of the power series method that avoids some of
these limitations.
Stepping Back
It takes time to get used to working with series, especially expressed in $\sum$ notation, but
beyond that the process we’ve described here is mechanical.
1. Begin with $y = \sum c_n x^n$ and find y′ and y′′ and so on as needed.
2. Convert any functions of x into power series.
3. Set like powers of x equal to each other and the result will be a recurrence relation.
Of course the answer must involve arbitrary constants! For a first-order linear differential
equation you should find that c0 is unconstrained, and the recurrence relation will define
each coefficient in terms of the previous one. For a second-order equation both c0 and c1
should be free variables, and the recurrence relation will define each coefficient in terms
of its two predecessors. As always, the arbitrary constants will be determined by initial or
boundary conditions, not by the differential equation itself.
In some cases you will find that all the coefficients past a certain point are zero. Then the
solution you were looking for is a finite polynomial, and you have it all. (You will encounter
two particularly important examples of this, “Legendre polynomials” and “Hermite polyno-
mials,” in Section 12.5.) It is also possible that you will find an infinite Taylor series that you
can convert back to a function in closed form, as in our $e^{kt}$ example above.
But in most cases you will have nothing but a recurrence relation—and that’s OK. The
great virtue of power series is that you can use them to make arbitrarily accurate approxi-
mations. The 5th partial sum or the 10th partial sum may give you all the accuracy you need.
And if they don’t, you can get the 100th partial sum just as easily from a computer.
Now consider the particular solution for the constants 𝜇 = 20, c0 = −3, and c1 = 0.
2. Show that all the odd coefficients will be zero.
3. Find c2 , c4 , and so on until you reach a point where all subsequent even coefficients
will be zero.
4. Write down the resulting polynomial and verify that it solves Equation 12.5.1 for
𝜇 = 20.
$$\frac{d^2\Theta}{d\theta^2} + \frac{\cos\theta}{\sin\theta}\frac{d\Theta}{d\theta} + k\Theta = 0 \tag{12.5.2}$$
In Problem 12.81 you will apply the substitution u = cos 𝜃 to this equation. The result is a
particularly important equation, in part because partial differential equations in spherical
coordinates come up fairly often. So this equation and its solutions have been named and
studied.
The Legendre Differential Equation: $(1 - x^2)y'' - 2xy' + l(l+1)y = 0, \quad -1 < x < 1$ (12.5.3)
The General Solution: $y = AP_l(x) + BQ_l(x)$ (12.5.4)
The Legendre Differential Equation for l = 3: $(1 - x^2)y'' - 2xy' + 12y = 0, \quad -1 < x < 1$
The General Solution: $y = AP_3(x) + BQ_3(x)$
You can think of this section as a study of how much we can learn about a function from its
recurrence relation. We begin with the following observations.
∙ The constant c0 is arbitrary. (It may be determined by initial conditions, but it is not
determined by the differential equation.) All the other even-numbered coefficients
are based on that constant.
∙ In particular, if c0 = 0 then all the even-numbered coefficients are zero.
∙ The constant c1 is also arbitrary, and the odd-numbered coefficients are based on it. If
c1 = 0 then all odd-numbered coefficients are zero.
So we have two series, one based on c0 and one based on c1 , each of which solves the dif-
ferential equation independently. The total solution is the sum of these two solutions, as we
expect from a second-order equation.
If 𝜇 = l(l + 1) for some non-negative integer l then the numerator of our recurrence rela-
tion becomes factorable.
$$c_{n+2} = \frac{n^2 + n - \mu}{(n+1)(n+2)}c_n = \frac{n^2 + n - l(l+1)}{(n+1)(n+2)}c_n = \frac{(n+l+1)(n-l)}{(n+1)(n+2)}c_n \tag{12.5.6}$$
Starting from c0 and c1 you use Equation 12.5.6 to generate coefficients. But when you reach
n = l one of your two series (the odd or even-numbered coefficients) comes to an end, as we
see in the following example.
Problem:
Here is the Legendre equation for l = 5.
$$y_{odd} = x - \frac{14}{3}x^3 + \frac{21}{5}x^5 \tag{12.5.8}$$
This is, all by itself, a valid solution to Equation 12.5.7. But it does not meet the
condition c0 = 1 that was stipulated in the problem. That condition leads to the
following.
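$$c_0 = 1 \;\rightarrow\; c_2 = \frac{6 \times -5}{1 \times 2}(1) = -15 \;\rightarrow\; c_4 = \frac{8 \times -3}{3 \times 4}(-15) = 30 \;\rightarrow\; c_6 = \frac{10 \times -1}{5 \times 6}(30) = -10$$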
This series does not terminate, but we have found the first few terms.
That infinite series is also a solution to Equation 12.5.7 (when the series converges at
all, as we discuss below). The solution that meets both given conditions is the sum of
these two solutions.
You can confirm for yourself that Equation 12.5.8 is a valid solution to Equation 12.5.7. But
it’s more important to see what happened, and how it generalizes. The recurrence relation
Equation 12.5.6 has a factor of (n − l) in the numerator, or in this case (n − 5). When we were
counting off odd-numbered coefficients, that factor killed the c7 coefficient and all the other
coefficients based on it, leaving us with a fifth-order polynomial.
On the other hand, when we were counting up even-numbered coefficients, we never
reached n = 5 exactly. Consequently, none of those coefficients came out zero, so the poly-
nomial kept going forever.
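The termination described above is easy to watch numerically. This sketch steps through Equation 12.5.6 for l = 5, once starting from $c_1 = 1$ and once from $c_0 = 1$; the helper name `legendre_series` and its parameters are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def legendre_series(l, start_index, start_value, count):
    """Return {n: c_n} for `count` coefficients of one parity, using
    c_{n+2} = (n + l + 1)(n - l) / ((n + 1)(n + 2)) * c_n  (Equation 12.5.6)."""
    coeffs = {start_index: Fraction(start_value)}
    n = start_index
    for _ in range(count - 1):
        factor = Fraction((n + l + 1) * (n - l), (n + 1) * (n + 2))
        coeffs[n + 2] = factor * coeffs[n]
        n += 2
    return coeffs

# Odd series: the (n - l) factor vanishes at n = 5, so c_7 and beyond are zero.
odd = legendre_series(l=5, start_index=1, start_value=1, count=5)
# Even series: n never equals 5, so no coefficient is forced to zero.
even = legendre_series(l=5, start_index=0, start_value=1, count=4)
print(odd)
print(even)
```

Trying an even value of l swaps the roles: the even series terminates and the odd one does not.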
If you followed all that, you are ready for a few more general observations about the solu-
tions to the Legendre equation.
∙ If l is an odd integer then the solution based on c1 will be a polynomial of order l, but
the solution based on c0 will not terminate. Therefore, if l is an odd integer and c0 = 0
the entire solution will be a polynomial of order l.
This result is not surprising if you remember where we started. We said in Section 12.4
that the method of power series supplies a solution to Equation 12.4.11 if a1 (x), a0 (x) and
f (x) are all analytic functions. The Legendre equation in this form looks like the following.
$$\frac{d^2y}{dx^2} - \frac{2x}{1 - x^2}\frac{dy}{dx} + \frac{l(l+1)}{1 - x^2}y = 0$$
The coefficients of y′ and y are only analytic for −1 < x < 1, so we are only guaranteed to
find power series solutions in that domain.
The divergence of the Ql (x) functions is one of the most important facts about these func-
tions. In many applications we are looking for a function that is finite at x = ±1, and on this
basis we restrict our solution to the Legendre functions of the first kind.
Non-Recursive Definitions
Equation 12.5.6 gives us a “recursive definition” of a series; it defines each coefficient in
terms of the coefficients before it. Such a definition can tell us a great deal about a function
(and sometimes it’s the best we can find), but it’s a slow way to figure out $c_{326}$. In many cases,
fortunately including Legendre polynomials, we can find an “explicit definition” for each
coefficient based only on the initial ones.
From Equation 12.5.6 we get:
$$c_2 = \frac{(l+1)(-l)}{(1)(2)}c_0 \tag{12.5.11}$$
$$c_4 = \frac{(l+3)(2-l)}{(3)(4)}c_2 \tag{12.5.12}$$
$$c_6 = \frac{(l+5)(4-l)}{(5)(6)}c_4 \tag{12.5.13}$$
… and so on. Plug Equation 12.5.11 into Equation 12.5.12.
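As a sketch of that substitution (our own working of the step, using only Equations 12.5.11 through 12.5.13), each coefficient picks up one more pair of factors in the numerator and the denominator builds up a factorial:

$$c_4 = \frac{(l+3)(2-l)}{(3)(4)} \cdot \frac{(l+1)(-l)}{(1)(2)}c_0 = \frac{(l+3)(l+1)(-l)(2-l)}{4!}c_0$$

$$c_6 = \frac{(l+5)(4-l)}{(5)(6)}c_4 = \frac{(l+5)(l+3)(l+1)(-l)(2-l)(4-l)}{6!}c_0$$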
Fourier-Legendre Series
In the Motivating Exercise (Section 12.1) you reviewed the process of solving a PDE. In
the last step you have to rewrite the initial state of your system—which could in principle
be almost any function—as a series based on the normal modes you have found. In many
cases involving spherical coordinates, those normal modes are Legendre polynomials. So it
is important to be able to write any given function as a “Fourier-Legendre series.”
$$f(x) = \sum_{n=0}^{\infty} A_n P_n(x) = A_0 P_0(x) + A_1 P_1(x) + A_2 P_2(x) + \ldots$$
Appendix J gives the formula for finding the coefficients An , and Section 12.8 explains how
this formula was derived. Here we simply illustrate the process.
$$A_0 = \frac{2(0) + 1}{2}\int_{-1}^{1} \sqrt[3]{x}\,P_0(x)\,dx$$
We can similarly find as many of the other coefficients as we want. The integrands will
only involve powers of x so they are easy, if tedious, to integrate. For this function all
the even coefficients are zero. (Do you see why?) The next non-zero coefficient is
A3 = −6∕13, so we can write:
$$\sqrt[3]{x} = \frac{9}{7}P_1(x) - \frac{6}{13}P_3(x) + \ldots$$
Using only those two terms we get a passable approximation of $\sqrt[3]{x}$ between x = −1
and x = 1. Adding one more term makes it significantly more accurate, and by the
time we get through P19 the approximation almost perfectly tracks the function
within this domain. For |x| > 1 the Fourier-Legendre series looks nothing like $\sqrt[3]{x}$.
[Figure: four panels plotting partial sums of the Fourier-Legendre series against x from −2 to 2. Panel titles: A1P1; A1P1 + A3P3; A1P1 + A3P3 + A5P5; A1P1 + A3P3 + A5P5 + ··· + A19P19.]
It is common for normal modes to be in the form Pl (cos 𝜃), and Appendix J also lists the
formulas for expanding a function f (𝜃) as a sum of such terms.
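To see where coefficients such as 9/7 and −6/13 come from, here is a minimal sketch that computes $A_n = \frac{2n+1}{2}\int_{-1}^{1} \sqrt[3]{x}\,P_n(x)\,dx$ exactly. Two things in it are our own additions rather than anything from the text: the Legendre polynomials are generated with Bonnet's recurrence $(n+1)P_{n+1} = (2n+1)xP_n - nP_{n-1}$, a standard identity, and the integrals are done term by term using the fact that $\int_{-1}^{1} x^{1/3}x^k\,dx$ equals $6/(3k+4)$ for odd k and 0 for even k. All function names are hypothetical.

```python
from fractions import Fraction

def legendre_polys(n_max):
    """Coefficient lists (lowest power first) for P_0..P_n_max, built with
    Bonnet's recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]            # multiply P_n by x
        prev = polys[n - 1] + [Fraction(0)] * 2    # pad P_{n-1} to the same length
        polys.append([((2 * n + 1) * a - n * b) / (n + 1)
                      for a, b in zip(x_pn, prev)])
    return polys[: n_max + 1]

def cube_root_moment(k):
    """Exact integral of x^(1/3) * x^k over [-1, 1]."""
    # Odd k: the integrand is even, giving 2 / (k + 4/3) = 6 / (3k + 4).
    # Even k: the integrand is odd, so the integral vanishes.
    return Fraction(6, 3 * k + 4) if k % 2 else Fraction(0)

def fourier_legendre_coeff(n, polys):
    """A_n = (2n + 1)/2 times the integral of x^(1/3) P_n(x) over [-1, 1]."""
    integral = sum(a * cube_root_moment(k) for k, a in enumerate(polys[n]))
    return Fraction(2 * n + 1, 2) * integral

polys = legendre_polys(5)
coeffs = [fourier_legendre_coeff(n, polys) for n in range(6)]
print(coeffs)
```

The even-numbered coefficients vanish because each even-order Legendre polynomial contains only even powers of x, and the cube root is an odd function.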
Stepping Back
A great deal more can be said—and has been said—about the Legendre polynomials. There
are recurrence relations that give Pl (x) in terms of Pl−1 (x) and Pl−2 (x), formulas for Pl′ (x), and
so on. And there are surprising applications of Legendre polynomials; for instance, they can
be used in finding polynomial approximations to curves that work better than Taylor series
under some circumstances.
But we have focused on using Legendre polynomials to solve partial differential equations.
∙ You need to know that the Legendre polynomials solve Legendre’s differential
equation. We stated that fact in Chapter 11 and now we have proven it.
∙ You need to know that Legendre functions of the first kind are finite series (the Legendre
polynomials) when l is an integer, while Legendre functions of the second kind are always infinite series
and generally divergent at x = ±1, and are therefore discarded in many problems.
∙ Finally, you need to know how to write the initial conditions as a sum of Legendre
polynomials. We have shown you how to use the formula; you will derive that formula
in Section 12.8.
Section 12.7 will go through a similar treatment of Bessel functions.
already found for positive integers. Hint: try plugging in a few negative integer values and see what the equation looks like.

12.68 Walk-Through: Legendre Series. In this problem you will represent the function $x^2 + 3$ as a series of Legendre polynomials, $\sum A_l P_l(x)$.
(a) To find the first coefficient, plug l = 0 into the equation in Appendix J for the coefficients of a Legendre series. This equation includes an integral that involves $P_0(x)$. You can find $P_0(x)$ and the other Legendre polynomials you need in Equation 12.5.10. Evaluate the resulting integral and find $A_0$.
(b) Similarly find coefficients $A_1$–$A_5$.
(c) You should find that all but two of the coefficients came out zero. (If you keep going you will find nothing but zeros from here on up.) Write the function $x^2 + 3$ as a sum of two Legendre polynomials.
(d) Confirm that your answer gives $x^2 + 3$ exactly.

In Problems 12.69–12.76 find the Legendre series expansion for the given function through the $P_3(x)$ term. Your answer should be in the form $A_0 P_0(x) + A_1 P_1(x) + A_2 P_2(x) + A_3 P_3(x)$, with specific values for the $A_l$. Unless the problem is marked with a computer icon all of the necessary integrals can be done with u-substitution and/or integration by parts.

12.69 $x^3$
12.70 $\sqrt[5]{x}$
12.71 $x^{-1/3}$
12.72 $f(x) = \begin{cases} -1 & x < 0 \\ 1 & x \ge 0 \end{cases}$
12.73 $f(x) = \begin{cases} 1 + x & x < 0 \\ 1 - x & x \ge 0 \end{cases}$
12.74 $e^x$
12.75 $\cos x$
12.76 $e^{-x^2}$

12.77 Have a computer calculate the Legendre series for $f(x) = x^{-1/3}$ up to the eleventh order. Plot that partial sum and the function f(x) on the same plot from x = −2 to x = 2. Describe how the Legendre series approximates the function near x = 0, in the region 0 < |x| < 1, and in the region |x| > 1.

12.78 In this problem you’re going to compare the Legendre series and the Maclaurin series for the function f(x) = cos(2x).
(a) Calculate the second-order Maclaurin series for cos(2x).
(b) Calculate the second-order Legendre series for cos(2x). Expand the Legendre polynomials in your answer and gather like powers of x so it’s in the form of a quadratic function of x, just like the Maclaurin series.
(c) On one plot, show cos(2x) and the two quadratic approximations you just calculated from x = −0.1 to x = 0.1. Which quadratic better approximates the function in this interval?
(d) On one plot, show cos(2x) and the two quadratic approximations you just calculated from x = −1 to x = 1. Which quadratic better approximates the function in this interval?
(e) Taylor series and Legendre series are two different ways of approximating a function with a polynomial. Based on your results, what are Taylor series more useful for and what are Legendre series more useful for?

12.79 Starting from Equation 12.5.6 explain why Legendre’s differential equation will have one finite polynomial solution and one infinite series solution for integer l, and two infinite series solutions for non-integer l.

12.80 In the Explanation (Section 12.5.2) we derive Equation 12.5.14, an explicit formula for the Legendre polynomials based on even powers of x. Use a similar process, starting from Equation 12.5.6, to find a formula for the polynomials based on odd powers of x.

12.81 Equation 12.5.2 comes up in the solution of important partial differential equations in spherical coordinates. Apply the substitution u = cos 𝜃 to rewrite this differential equation. Remember that this is a substitution for the independent variable, so you will have to write $d\Theta/d\theta$ and $d^2\Theta/d\theta^2$ in terms of $d\Theta/du$ and $d^2\Theta/du^2$.
(a) Show that the final result can be written in the form of Equation 12.5.3.
(b) Write the solution to Equation 12.5.2. (You have already done all the work.)

12.82 In the Explanation (Section 12.5.2) we use the ratio test to show that the Legendre functions of the second kind converge for −1 < x < 1 and diverge when x < −1 or x > 1. That leaves open the question of convergence at x = ±1.
In this problem you will examine the convergence of $Q_1(x)$ at x = 1. ($P_l(x)$ converges for all finite x since it’s a finite polynomial.)
(a) Write the first five terms of the series $Q_1(1)$. (One way to do this is to plug l = 1 and x = 1 into Equation 12.5.14, although you will need to add the $x^8$ term to the ones we gave.) Simplify each term in fractional form as much as possible, but do not add the terms.
(b) Write a simple closed form expression for the nth term of this series.
(c) Prove that the resulting series converges or diverges. (The one thing you already know, before you start, is that the ratio test will be no help here. That leaves all the other tests.)

12.83 In Chapter 11 we solved for the electric potential inside a hollow spherical shell of radius R with potential $V_0(\theta)$ on the shell. (We assumed the potential was independent of the azimuthal angle 𝜙.) We found that inside the shell the potential could be written
$$V(r, \theta) = \sum_{n=0}^{\infty} A_n r^n P_n(\cos\theta)$$
If $V_0 = \cos^2\theta$, find the coefficients $A_n$.
(a) Use Appendix J to help you write a general formula (in terms of an integral) for the coefficient $A_n$.

(b) It will be convenient to examine the terms in the sum in a different order. Define j = l∕2 − i and rewrite the sum as a sum over j instead of i.
(c) Write out the terms for j = 0, 1, and 2, simplifying as much as possible. Your answer will still contain factorials.
(d) Take the l th derivative of the terms you just wrote, once again expressing your answer in terms of factorials.
(e) The first of your terms should be a constant. Factor that constant out of all three terms and simplify as much as possible. You should get Equation 12.5.14.
Of course you just proved this for the first three terms in the sum. It is possible to go through a similar process to find the j th term and verify that it matches the j th term in $P_n(x)$, but the algebra is a bit messy.

12.86 Showing that a set of functions is “orthogonal” is a key step in finding the formula for the coefficients of a series expansion, as we will see in Section 12.8. In this problem you will use Rodrigues’ formula (Equation 12.5.15) to prove the orthogonality of the Legendre polynomials: that is, you will show that $\int_{-1}^{1} P_l(x)P_m(x)\,dx = 0$ for all l ≠ m.
(a) First, prove that $\int_{-1}^{1} f(x)P_l(x)\,dx = \dfrac{(-1)^l}{2^l\, l!}\int_{-1}^{1} f^{(l)}(x)(x^2 - 1)^l\,dx$, where $f^{(l)}(x)$ means the l th derivative of f(x).
7 We know (l − 1)th looks awkward, but (l − 1)st looked even worse. You see the things you have to worry about
problem you will use Rodrigues’ formula (Equation 12.5.15) to derive the norm of the Legendre polynomials: $\int_{-1}^{1} [P_l(x)]^2\,dx$.
(a) Substitute for $P_l(x)$ using Rodrigues’ formula. Pull the constant out of the integral.
(b) Integrate by parts and argue that the term outside the integral must equal zero. Hint: you may find it helpful to look at Problem 12.86.
(c) Find the integral you’re left with after l integrations by parts.
(d) That integral involves the (2l)th derivative of $(x^2 - 1)^l$. Evaluate that derivative. Hint: if you expand $(x^2 - 1)^l$ all but one of the terms will vanish when you take the derivative.
(e) Evaluate the integral and compute the norm of the Legendre polynomials.
(f) The norm of the Legendre polynomials is most simply expressed as $\int_{-1}^{1} [P_l(x)]^2\,dx = 2/(2l + 1)$. Your computer may have given you an answer that looks different from this, but you can check it by calculating some of the terms. Calculate $\int_{-1}^{1} [P_l(x)]^2\,dx$ for l = 1, 2, and 3 using the answer you got and, if it looks different, the formula we just gave you. Make sure they match.
$$y = x^r(c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots) = c_0 x^r + c_1 x^{r+1} + c_2 x^{r+2} + c_3 x^{r+3} + \ldots$$
$$y = x^r\sum_{n=0}^{\infty} c_n x^n = \sum_{n=0}^{\infty} c_n x^{r+n} = c_0 x^r + c_1 x^{r+1} + c_2 x^{r+2} + c_3 x^{r+3} + \ldots \qquad (c_0 \ne 0) \tag{12.6.1}$$
The restriction c0 ≠ 0 is not a limitation: it just means r is the exponent of the lowest-order term
with a non-zero coefficient.
If r = 0, the “generalized” power series is just a “plain old” power series. If r is a positive
integer this is still a power series, in which the first few coefficients happen to be zero. But if r
is negative, fractional, or complex then this can express functions that cannot be represented
by a Maclaurin series. For instance, $e^x/\sqrt{x}$ has no Maclaurin series because it is undefined
at x = 0, but it can be represented as $x^{-1/2}\left[1 + x + x^2/(2!) + \ldots\right]$.
The “method of Frobenius” solves a differential equation by assuming a solution in the
form of Equation 12.6.1. From there, the process goes like this.
1. Begin by making both sides of the equation look more or less like Equation 12.6.1.
2. Choose the lowest possible power of x and set its coefficient on the left equal to its coeffi-
cient on the right. The resulting equation (the “indicial equation”) can be solved for r .
For an nth-order equation you will usually find n values of r.
3. For each value of r , find the recurrence relations that define the coefficients of its
series. You should in general find that only c0 is undetermined, so each value of r
corresponds to one series with one arbitrary constant.
To illustrate this method, we will solve the following equation.
$$2\frac{d^2y}{dx^2} + \left(\frac{1}{x}\right)\frac{dy}{dx} - \left(2 + \frac{3}{x}\right)y = 0 \tag{12.6.2}$$
We will work almost entirely in series notation, but as before we encourage you to work in
parallel without it: for each step we take, write out the first few terms of each series.
$$y = \sum_{n=0}^{\infty} c_n x^{r+n} \quad\rightarrow\quad y' = \sum_{n=0}^{\infty} c_n (r+n) x^{r+n-1} \quad\rightarrow\quad y'' = \sum_{n=0}^{\infty} c_n (r+n-1)(r+n) x^{r+n-2}$$
When we plug these into the differential equation the term (2 + 3∕x)y becomes two different
series, one for 2y and one for (3∕x)y, so we have a total of four series.
$$\sum_{n=0}^{\infty} 2c_n (r+n-1)(r+n) x^{r+n-2} + \sum_{n=0}^{\infty} c_n (r+n) x^{r+n-2} - \sum_{n=0}^{\infty} 2c_n x^{r+n} - \sum_{n=0}^{\infty} 3c_n x^{r+n-1} = 0$$
As in Section 12.4 we reindex those series so they all reflect the same power of x. We could,
for instance, rewrite them all with the exponent r + n. Instead we will rewrite them in terms
of x r +n−2 , just because two of them already look like that. As long as they are all the same we
will be able to compare.
$$\sum_{n=0}^{\infty} 2c_n (r+n-1)(r+n) x^{r+n-2} + \sum_{n=0}^{\infty} c_n (r+n) x^{r+n-2} - \sum_{n=2}^{\infty} 2c_{n-2} x^{r+n-2} - \sum_{n=1}^{\infty} 3c_{n-1} x^{r+n-2} = 0$$
As always, check the first few terms of each series to convince yourself that we haven’t changed
anything!
The n = 0 and n = 1 terms appear in only some of the series, so we write them separately.
All n ≥ 2 terms appear in all four series so we combine them into one sum. After a bit of
simplification we arrive here.
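Collecting the four reindexed series power by power gives the following. This is our own reconstruction from the sums above, so the grouping and layout are an assumption rather than the original display.

$$c_0 r(2r-1)\,x^{r-2} + \left[c_1 (r+1)(2r+1) - 3c_0\right]x^{r-1} + \sum_{n=2}^{\infty} \left[c_n (r+n)(2r+2n-1) - 2c_{n-2} - 3c_{n-1}\right]x^{r+n-2} = 0$$

The $x^{r-1}$ bracket fixes $c_1$ in terms of $c_0$, namely $c_1 = 3c_0/[(r+1)(2r+1)]$, and the bracket inside the sum leads to the general recurrence relation.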
We always begin with the lowest power of x that is represented in the equation. Setting the
coefficient of x r −2 equal to zero gives us c0 r (2r − 1) = 0. Since c0 ≠ 0 by the definition of our
series, we can divide by c0 and get an equation for r . This is called the “indicial equation” for
this differential equation.
This can trivially be solved to give r = 0 and r = 1∕2. We therefore expect that the two solu-
tions to Equation 12.6.2 will be in the form of Equation 12.6.1, one with r = 0 and the other
with r = 1∕2. The general solution will be any linear combination of these two solutions.
The general recurrence relation comes from setting the coefficient of x r +n−2 inside the sum
equal to zero.
$$c_n = \frac{2c_{n-2} + 3c_{n-1}}{(2r + 2n - 1)(r + n)} \quad (n \ge 2) \tag{12.6.6}$$
Given any particular value of c0 (based on initial conditions) we use Equation 12.6.5 to find
c1 and Equation 12.6.6 to find all the rest.
$$y_0 = \sum_{n=0}^{\infty} c_n x^n \quad\text{where}\quad c_1 = 3c_0 \quad\text{and}\quad c_n = \frac{2c_{n-2} + 3c_{n-1}}{n(2n-1)} \quad (n \ge 2)$$
For the r = 1∕2 solution we will use d rather than c to make it clear that the coefficients of
the series below are not the same as the coefficients of the series above.
$$y_{1/2} = \sqrt{x}\sum_{n=0}^{\infty} d_n x^n \quad\text{where}\quad d_1 = d_0 \quad\text{and}\quad d_n = \frac{2d_{n-2} + 3d_{n-1}}{n(2n+1)} \quad (n \ge 2)$$
The general solution is a sum of these two series, with arbitrary constants supplied by c0 (the
constant term in the r = 0 series) and d0 (the constant term in the r = 1∕2 series). You’ll find
solutions for specific initial and boundary conditions in Problem 12.96.
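The two series can be generated and checked with a few lines of code. In this sketch `frobenius_coeffs` applies Equation 12.6.6, with the first step taken from the $x^{r-1}$ coefficient of the reindexed series, and `residual` plugs the coefficients back into those four series to confirm that every power of x cancels. Both function names, and the choice to stop at n = 6, are our own assumptions.

```python
from fractions import Fraction

def frobenius_coeffs(r, c0, n_max):
    """Coefficients c_0..c_n_max of y = x^r * sum c_n x^n for
    2y'' + (1/x)y' - (2 + 3/x)y = 0 (Equation 12.6.2)."""
    r = Fraction(r)
    c = [Fraction(c0)]
    # x^(r-1) coefficient: c_1 (r + 1)(2r + 1) = 3 c_0
    c.append(3 * c[0] / ((r + 1) * (2 * r + 1)))
    for n in range(2, n_max + 1):
        # Equation 12.6.6
        c.append((2 * c[n - 2] + 3 * c[n - 1]) / ((2 * r + 2 * n - 1) * (r + n)))
    return c[: n_max + 1]

def residual(r, c, n):
    """Total coefficient of x^(r+n-2) after substituting the series."""
    r = Fraction(r)
    total = 2 * c[n] * (r + n - 1) * (r + n) + c[n] * (r + n)
    if n >= 1:
        total -= 3 * c[n - 1]   # from the (3/x)y series
    if n >= 2:
        total -= 2 * c[n - 2]   # from the 2y series
    return total

c = frobenius_coeffs(0, 1, 6)                # the r = 0 series
d = frobenius_coeffs(Fraction(1, 2), 1, 6)   # the r = 1/2 series
print(c[:3], d[:3])
print([residual(0, c, n) for n in range(7)])
```

Passing any other r would leave a nonzero residual at n = 0, which is another way of seeing what the indicial equation is for.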
The most important use of the Frobenius method is to solve “Bessel’s equation” $x^2 y'' + xy' + (x^2 - p^2)y = 0$.
We shall devote much more space to this equation and its important
solutions in Section 12.7, but here we’re just going to use it as an example of the method of
Frobenius. Bessel’s equation is actually many different equations because of the constant p;
in the example below p = 1∕3.
Problem:
Solve the differential equation $x^2 y'' + xy' + \left(x^2 - 1/9\right)y = 0$.
Solution:
The method of Frobenius always begins with the same guess.
$$y = \sum_{n=0}^{\infty} c_n x^{r+n} \quad\rightarrow\quad y' = \sum_{n=0}^{\infty} c_n (r+n) x^{r+n-1} \quad\rightarrow\quad y'' = \sum_{n=0}^{\infty} c_n (r+n-1)(r+n) x^{r+n-2}$$
Plug this into the differential equation. The first series comes from multiplying the y′′
expansion above by x 2 (so the exponent goes up by two). Similarly, the second series
is xy′ , followed by x 2 y and finally −y∕9.
$$\sum_{n=0}^{\infty} c_n (r+n-1)(r+n) x^{r+n} + \sum_{n=0}^{\infty} c_n (r+n) x^{r+n} + \sum_{n=0}^{\infty} c_n x^{r+n+2} - \sum_{n=0}^{\infty} \frac{1}{9} c_n x^{r+n} = 0$$
Reindex the series so that they all have the same power of x. Since three of them
already look like x r +n we’ll change the other series to match.
$$\sum_{n=0}^{\infty} c_n (r+n-1)(r+n) x^{r+n} + \sum_{n=0}^{\infty} c_n (r+n) x^{r+n} + \sum_{n=2}^{\infty} c_{n-2} x^{r+n} - \sum_{n=0}^{\infty} \frac{1}{9} c_n x^{r+n} = 0$$
The x r and x r +1 terms appear in some series; subsequent terms appear in all series.
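$$\left[c_0 (r-1)r + c_0 r - \frac{1}{9}c_0\right]x^r + \left[c_1 r(r+1) + c_1 (r+1) - \frac{1}{9}c_1\right]x^{r+1} + \sum_{n=2}^{\infty} \left[c_n (r+n-1)(r+n) x^{r+n} + c_n (r+n) x^{r+n} + c_{n-2} x^{r+n} - \frac{1}{9}c_n x^{r+n}\right] = 0$$
$$c_0 (r-1)r + c_0 r - \frac{1}{9}c_0 = 0 \quad\rightarrow\quad r^2 - \frac{1}{9} = 0 \quad\rightarrow\quad r = \pm\frac{1}{3} \tag{12.6.7}$$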
Because we know that c1 = 0 this relation promises us that cn = 0 for all odd n. So we
have two different series, one with r = 1∕3 and one with r = −1∕3. Each has its own
arbitrary c0 (based on the two initial conditions) and a recurrence relation for all
subsequent even-numbered n-values. In Section 12.7 we will explore the properties of
these solutions.
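For readers who want to see numbers, the sketch below generates the first few coefficients of both series. It assumes the recurrence obtained by setting the bracket inside the sum to zero, $c_n\left[(r+n)^2 - 1/9\right] + c_{n-2} = 0$; that is our own simplification of the grouped equation, and the function name `bessel_third_coeffs` is hypothetical.

```python
from fractions import Fraction

def bessel_third_coeffs(r, c0, n_max):
    """Coefficients c_0..c_n_max of y = x^r * sum c_n x^n for
    x^2 y'' + x y' + (x^2 - 1/9) y = 0."""
    r = Fraction(r)
    c = [Fraction(c0), Fraction(0)]   # c_1 = 0 from the x^(r+1) term
    for n in range(2, n_max + 1):
        # Summed bracket set to zero: c_n [(r + n)^2 - 1/9] + c_{n-2} = 0
        c.append(-c[n - 2] / ((r + n) ** 2 - Fraction(1, 9)))
    return c[: n_max + 1]

plus = bessel_third_coeffs(Fraction(1, 3), 1, 6)    # series with r = 1/3
minus = bessel_third_coeffs(Fraction(-1, 3), 1, 6)  # series with r = -1/3
print(plus)
print(minus)
```

Every odd-numbered entry comes out zero, and the even-numbered entries alternate in sign while shrinking quickly.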
The functions $b_1(x)$ and $b_0(x)$ must be “analytic functions,” meaning that they can be represented
by Maclaurin series.8 But in some cases $b_1(x)/x$ and $b_0(x)/x^2$ may not be analytic
functions, so there are cases where you can use the method of Frobenius but you cannot use
the method of power series.
On the other hand, Equation 12.6.9 is necessarily homogeneous, so there are also cases
where you can use the method of power series but not the method of Frobenius.
Even if your problem fits Equation 12.6.9 perfectly, you can run into trouble. One potential
difficulty is an indicial equation with a “double root.” (For instance, if $r^2 - 6r + 9 = 0$
then r = 3.) From this you get one generalized power series, but a second-order linear dif-
ferential equation requires two linearly independent solutions. You can find the second one
by using “reduction of order,” a technique discussed in Chapter 10. When you apply this tech-
nique to a Frobenius problem you end up with another infinite series added to a logarithmic
function.
A less obvious difficulty is an indicial equation with two roots that differ by an integer
(such as r = 1∕3 and r = 7∕3). Sometimes that works out fine, and you get two linearly
independent solutions (see Problem 12.97). But sometimes the recurrence relation for the
series with lower r blows up, so that solution has to be discarded. If your first solution can
be written in closed form, this problem has the same resolution as the double root problem
we discussed above: reduction of order can find you a second solution involving a log
(see Problem 12.98). In Section 12.7 we’ll see a completely different solution designed
specifically for the important case of Bessel functions.
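To make the double-root case concrete, here is a short worked sketch. The equation below is an assumption chosen for illustration (it is not taken from the text); it is picked so that its indicial equation is the $r^2 - 6r + 9 = 0$ mentioned above. Because it is an Euler equation its Frobenius series stops after a single term, so the logarithm shows up without an accompanying infinite series.

```latex
% Illustrative equation (not from the text): x^2 y'' - 5x y' + 9y = 0
\begin{align*}
y = x^r:&\quad r(r-1) - 5r + 9 = r^2 - 6r + 9 = 0 \;\Rightarrow\; r = 3, \quad y_1 = x^3 \\
y_2 = u(x)\,x^3:&\quad x^2\left(x^3 u'' + 6x^2 u' + 6x u\right) - 5x\left(x^3 u' + 3x^2 u\right) + 9x^3 u = x^5 u'' + x^4 u' = 0 \\
&\quad x u'' + u' = 0 \;\Rightarrow\; u' = \frac{1}{x} \;\Rightarrow\; u = \ln x
  \quad \text{(taking the constants of integration to be 1 and 0)} \\
&\quad y = A\,x^3 + B\,x^3 \ln x
\end{align*}
```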
8 You can also use the method of Frobenius on an equation of the form $y'' + y'\, b_1/(x-c) + y\, b_2/(x-c)^2 = 0$ if b1 and b0
can be represented in Taylor series about $x = c$. This doesn’t seem to come up very often.
(e) Write an equation setting the coefficient of the lowest power of $x$ equal to zero. (This will come from a term outside the sum.)
(f) Because we assume for a generalized power series that $c_0 \neq 0$, you can divide both sides of your equation from Part (e) by $c_0$. The result is the “indicial equation.” Solve it for $r$.
(g) For this particular problem there should be one other power of $x$ outside the sum. Set the coefficient of that power equal to zero to get an equation for $c_1$ in terms of $c_0$.
(h) Setting the coefficient inside the sum equal to zero, write a recurrence relation for $c_n$ in terms of $c_{n-1}$ and $c_{n-2}$ valid for all $n \geq 2$.
(i) Now plug in the values of $r$ you found. For each one:
i. Write the formula for $c_1$ in terms of $c_0$.
ii. Write the recurrence relation for $c_n$ for $n \geq 2$.
iii. Write the third partial sum of this series. Include the powers of $x$.

12.89 [This problem depends on Problem 12.88.] In Problem 12.88 you found two series solutions to the differential equation $2xy'' + y' + 2(1+x)y = 0$, each one with an arbitrary constant.
(a) If you are told that $y'(0)$ is finite, you can eliminate one of your series solutions. Which one, and why?
(b) If you are also told that $y(0) = 3$ then you can solve for the arbitrary constant in the remaining solution. Write the third partial sum of the resulting series.

In Problems 12.90–12.93 use the method of Frobenius to solve the given differential equation. Your final answer should be in the form of two third-order series in the form $y(x) = c_0\left[x^r + ax^{r+1} + bx^{r+2}\right]$. You’ll fill in the constants $r$, $a$, and $b$ for each series, but $c_0$ will be left as an arbitrary constant. (Call it $d_0$ in the second solution.) It may help to first work through Problem 12.88 as a model.

12.90 $2x^2y'' + xy' + (x - 3)y = 0$
12.91 $4xy'' + (x - 2)y' = 0$
12.92 $x^2y'' + xy' + \left(x^2 - 1/25\right)y = 0$ (Bessel’s equation with $p = 1/5$)
12.93 $x^2y'' + xy' + \left(4x^2 - 1/25\right)y = 0$ (Slight variation on Bessel’s equation with $p = 1/5$)

12.94 Solve the equation $2x^2y'' + xy' + (\sin x)y = 0$ using the method of Frobenius, calculating each series through the $x^{r+3}$ term. Start by expanding $\sin x$ in a Maclaurin series through the $x^5$ term. From there it might be easier to just write out the lowest several terms of each series rather than using series notation.

12.95 In this problem you’ll solve the equation $3x^2y'' + 7xy' - 4y = 0$.
(a) Go through the method of Frobenius far enough to get the indicial equation and find the two values of $r$.
(b) Set the coefficient of the next power of $x$ equal to zero. Given that you know the only two possible values for $r$, what does this equation let you conclude about $c_1$?
(c) Following similar logic, what can you conclude about all of the coefficients $c_n$, for $n > 2$?
(d) Write the general solution to this equation in closed form.

12.96 In the Explanation (Section 12.6.2) we found the general solution to Equation 12.6.2 in the form of a sum of two series, with arbitrary constants $c_0$ and $d_0$. In this problem you’re going to find specific solutions for some initial and boundary conditions. First consider some initial conditions.
(a) Set $y(0) = 1$. What does that tell you about the constants $c_0$ and $d_0$?
(b) Calculate $y'(0)$. What must be true about the coefficients in order for $y'(0)$ to be finite?
(c) Write the fourth partial sum of the specific solution for $y(0) = 1$ with $y'(0)$ finite.
You may be surprised that “$y'(0)$ is finite” is a sufficient initial condition. If we know that $y(0) = 1$ and $y'(0)$ is finite then we don’t get to pick $y'(0)$; it must be the same as $y(0)$. We will see more generally when such a condition is sufficient when we take up Sturm-Liouville theory later in the chapter.
Next you’ll treat this as a boundary value problem.
(d) Set $y(0) = 0$. What does that tell you about the coefficients in the problem?
(e) Calculate the fourth partial sum of the solution with $y(0) = 0$. Your answer will still have one arbitrary constant in it.
(f) Using that fourth partial sum, set $y(1) = 1$ and calculate the remaining arbitrary constant. (This will only be an approximate value since you only used the
fourth partial sum, but it should be a very good approximation.)

12.97 In this problem and Problem 12.98 you will encounter indicial roots that differ by an integer. The two problems will resolve themselves very differently, however. In this problem you will solve the equation $y'' + y' - (2/x^2)y = 0$.
(a) Proceed in the usual way with the method of Frobenius until you have written and solved the indicial equation and written the recurrence relation for $n \geq 1$.
(b) You should have found that the indicial equation has two roots that differ by an integer. When this happens, the higher of the two roots never causes a problem. Show in this particular case that the recurrence relation for the higher $r$-value never blows up.
(c) For the lower of the two $r$-values, the recurrence relation blows up at what $n$-value?
(d) Start generating terms for the series with the lower $r$-value. You should find that your series terminates before it runs into trouble.
(e) Is that really valid? Can we ignore the fact that this series seems destined to blow up? As with any differential equation, the solution will bear itself out. Write your terminating solution (arbitrary constant $c_0$ and all) and verify that it solves the differential equation.

12.98 The equation $(x^2 + x)y'' - xy' + y = 0$ leads to an indicial equation whose roots differ by an integer. You’ll see how that can lead to a problem, and what you can do about it.
(a) Find the two allowed values of $r$ and the recurrence relation for each.
(b) You should have found that the indicial equation has two roots that differ by an integer. When this happens, the higher of the two roots never causes a problem, but the lower can. Show in this particular case that the recurrence relation for the higher $r$-value never blows up but that the lower $r$-value must be discarded.
(c) Show that the valid solution terminates after a finite number of terms. Write that solution in closed form and verify that it works. We’ll call it $y_1$.
(d) Since the other series had to be discarded, you need to find the second solution using reduction of order. Plug the guess $y_2 = u(x)y_1(x)$ into the differential equation and simplify as much as possible. Solve the resulting equation to find $u(x)$. Hint: solving the equation will involve a substitution to turn a second-order equation into a first-order equation, then separation of variables, and finally partial fractions.
(e) Write the general solution to the differential equation and verify that it works.

12.99 Solve the equation $y'' + (1/x)y' + (1/x^2)y = 0$ using the method of Frobenius. You should find that both series terminate so you can write your solution in closed form. Hint: you will not get a normal recursion relation, but simply an equation relating $c_n$ and $r$. As always, assume $c_0 \neq 0$ and solve for the two allowed values of $r$. Then figure out what the equation tells you about all the other coefficients $c_n$. At the end, plug your closed form answer back in and check that it satisfies the differential equation.

12.100 The “Hypergeometric equation” $x(1-x)y'' + [C - (A + B + 1)x]y' - ABy = 0$ (where $A$, $B$, and $C$ are constants) has been much studied because its solutions, “hypergeometric functions,” include many other classes of special functions.
(a) Write and solve the indicial equation. You should find $r = 0$ and one other solution.
(b) Find the equation for $c_{n+1}$ as a function of $c_n$.
(c) Simplify the recurrence relation for the case of $r = 0$.
Our point is that Bessel functions are a big deal. They appear in electromagnetism,
thermodynamics, quantum mechanics, and many other fields, and there is a lot to learn about
them. But we can cover the basics in a relatively short space, and from there you can look up
more details on computers or reference tables as you need them.
We recommend first reading the section “Bessel Functions—The Unjustified Essentials”
in Chapter 11. That section is our best introduction to the subject, covering the information
you need to use these functions in PDEs and other contexts. Then come back to this section
to see where Bessel functions come from and why they have the properties they do.
This exercise requires a computer. The problems in this exercise refer to “Bessel functions of the first kind” $J_p(x)$ and “Bessel functions of the second kind” $Y_p(x)$. Refer to your software’s documentation to find the proper syntax for entering these functions. For all the following questions, use the domain $0 \leq x \leq 50$.
1. Graph the function J1 (x). Answer in words: how is J1 (x) like a sine function? How does
it differ from a sine function?
2. How many zeros does J1 (x) have in this domain?
3. The first three positive zeros of this function are called 𝛼1,1 , 𝛼1,2 , and 𝛼1,3 . Find their
values. Your answers should be accurate to the second decimal place.
4. Find the absolute maximum of J1 (x).
5. Graph the functions J1.2 (x), J2 (x), and J15 (x). Answer in words: how are these functions
alike? How do they differ?
6. Graph the function Y1 (x). Answer in words: how is this function like J1 (x)? How is it
different?
In that exercise you looked up this equation in Appendix J and found that the solutions
are called “Bessel functions.” In this section we will solve this equation with the method
of Frobenius and thus derive the properties of Bessel functions that you found in that
appendix. The first step is the substitution x = k𝜌. You’ll show in Problem 12.109 that this
turns Equation 12.7.1 into Bessel’s equation (defined below).
The general solution to this equation is $y = AJ_p(x) + BY_p(x)$, where $J_p$ and $Y_p$ are called Bessel functions of the first and second kind, respectively, both of order $p$.
Notice that each $p$ represents a different differential equation with different solutions.
For example, the general solution to $x^2y'' + xy' + (x^2 - 9/4)y = 0$ is $y = AJ_{3/2}(x) + BY_{3/2}(x)$.
The next n tells us that c1 = 0. Finally, using all four series, we find the recurrence relation.
$$c_n = \frac{c_{n-2}}{p^2 - (r+n)^2}$$
Because every coefficient is a multiple of the coefficient two below it, c1 = 0 means that all
odd coefficients are zero. That means the entire series is made up from even values of n, so
we will write n = 2m where m runs over all non-negative integers.
$$c_{2m} = \frac{c_{2m-2}}{p^2 - (r+2m)^2} \qquad (12.7.3)$$
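It’s not obvious where that leads, so let’s start generating terms, taking $r = p$.
$$m=1 \;\rightarrow\; c_2 = -\frac{1}{4(p+1)}\, c_0 = -\frac{1}{1!\, 2^2\, (p+1)}\, c_0$$
$$m=2 \;\rightarrow\; c_4 = -\frac{1}{4(2)(p+2)}\, c_2 = \frac{1}{2!\, 2^4\, (p+1)(p+2)}\, c_0$$
$$m=3 \;\rightarrow\; c_6 = -\frac{1}{4(3)(p+3)}\, c_4 = -\frac{1}{3!\, 2^6\, (p+1)(p+2)(p+3)}\, c_0$$
Since $(p+1)(p+2)\cdots(p+m) = \Gamma(p+m+1)/\Gamma(p+1)$, the pattern for general $m$ can be written with gamma functions.
$$c_{2m} = \frac{(-1)^m\, \Gamma(p+1)}{m!\, 2^{2m}\, \Gamma(p+m+1)}\, c_0$$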
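A quick way to gain confidence in a guessed pattern like this is to compare it against the recurrence directly. The sketch below is illustrative only: it assumes $r = p$ and $c_0 = 1$, and the helper names `c_recursive`, `c_closed`, and `c_gamma` are hypothetical. It generates $c_{2m}$ from Equation 12.7.3 with exact rational arithmetic and checks it against both the product form and the gamma-function form.

```python
from fractions import Fraction
from math import factorial, gamma, isclose

def c_recursive(p, m_max):
    """c_0, c_2, ..., c_{2 m_max} from c_{2m} = c_{2m-2} / (p^2 - (p + 2m)^2), with c_0 = 1."""
    c = [Fraction(1)]
    for m in range(1, m_max + 1):
        c.append(c[-1] / (p ** 2 - (p + 2 * m) ** 2))
    return c

def c_closed(p, m):
    """Product form: (-1)^m / (m! 2^(2m) (p+1)(p+2)...(p+m))."""
    product = Fraction(1)
    for k in range(1, m + 1):
        product *= (p + k)
    return Fraction((-1) ** m) / (factorial(m) * 4 ** m * product)

def c_gamma(p, m):
    """Gamma-function form of the same coefficient, evaluated in floating point."""
    return (-1) ** m * gamma(p + 1) / (factorial(m) * 2 ** (2 * m) * gamma(p + m + 1))

if __name__ == "__main__":
    p = Fraction(3, 2)  # any p that is not a negative integer works here
    for m, c in enumerate(c_recursive(p, 4)):
        print(m, c, c == c_closed(p, m), isclose(float(c), c_gamma(float(p), m)))
```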
Applying those coefficients to the generalized power series gives us one solution to Bessel’s
equation.
$$y = x^p \sum_{m=0}^{\infty} \frac{c_0\, (-1)^m\, \Gamma(p+1)}{m!\, 2^{2m}\, \Gamma(p+m+1)}\, x^{2m} = c_0\, x^p\, \Gamma(p+1) \sum_{m=0}^{\infty} \frac{(-1)^m}{m!\, \Gamma(p+m+1)} \left(\frac{x}{2}\right)^{2m}$$
Because $m$ counts by ones and $x$ is raised to the $2m$ we are still generating only even-numbered terms, consistent with our earlier conclusion that all odd-numbered terms are zero.
But just as we did with Legendre polynomials, we now get to choose a convenient $c_0$ to define the Bessel functions. It seems clear that a choice with $\Gamma(p+1)$ in the denominator will make the formula simpler. It is also conventional to put a $2^p$ in the denominator; this turns $x^p$ into $(x/2)^p$ so it combines with the other power of $x$. So we set $c_0 = \dfrac{1}{2^p\, \Gamma(p+1)}$ and arrive at the all-important formula we have been looking for.
$$J_p(x) = \sum_{m=0}^{\infty} \frac{(-1)^m}{m!\, \Gamma(p+m+1)} \left(\frac{x}{2}\right)^{2m+p}$$
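Repeating the derivation with $r = -p$ (Problem 12.103) gives the corresponding formula for $J_{-p}$.
$$J_{-p}(x) = \sum_{m=0}^{\infty} \frac{(-1)^m}{m!\, \Gamma(-p+m+1)} \left(\frac{x}{2}\right)^{2m-p} \qquad (12.7.6)$$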
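The series is straightforward to evaluate numerically for moderate $x$. The following is a minimal sketch rather than a production routine: the names `reciprocal_gamma` and `bessel_j` are hypothetical, the sum is simply truncated after a fixed number of terms, and $1/\Gamma$ is taken to be zero at the poles of the gamma function (the convention used in Problem 12.105). It assumes $x > 0$. For real work a library routine would be the better choice.

```python
import math

def reciprocal_gamma(z):
    """1/Gamma(z), taken to be 0 at the poles z = 0, -1, -2, ..."""
    if z <= 0 and float(z).is_integer():
        return 0.0
    return 1.0 / math.gamma(z)

def bessel_j(p, x, terms=40):
    """Partial sum of the series for J_p(x), x > 0; negative p gives Equation 12.7.6."""
    total = 0.0
    for m in range(terms):
        sign_over_factorial = (-1) ** m / math.factorial(m)
        total += sign_over_factorial * reciprocal_gamma(p + m + 1) * (x / 2) ** (2 * m + p)
    return total

if __name__ == "__main__":
    x = 1.5
    # Half-integer order has a closed form: J_{1/2}(x) = sqrt(2/(pi x)) sin x
    print(bessel_j(0.5, x), math.sqrt(2 / (math.pi * x)) * math.sin(x))
    # Integer order: J_{-2} and J_2 coincide, so they are not independent solutions
    print(bessel_j(2, x), bessel_j(-2, x))
```

Truncating at 40 terms is adequate for small and moderate $x$; for large $x$ the alternating terms grow before they shrink and the sum loses accuracy to cancellation.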
If you were paying close attention in Section 12.6 you may spot a potential snag. If the two values of $r$ differ by an integer we saw that sometimes you get two valid solutions and sometimes you don’t. For Bessel’s equation the two values of $r$ are $p$ and $-p$, which differ by an integer if $p$ is an integer or a half-integer. As it turns out, half-integer values of $p$ do not lead to trouble here but integer values do. In Problem 12.104 you will apply the method of Frobenius to Bessel’s equation for $p = 2$ and show that you don’t get two valid, independent solutions. In Problem 12.105 you will start from the formulas we’ve derived for $J_p$ and $J_{-p}$ and show that the following holds for any integer $p$:
$$J_{-p}(x) = (-1)^p J_p(x)$$
Do you see why that matters? For non-integer values of p we have the two solutions we need,
and we can write the general solution to Bessel’s equation as AJp (x) + BJ−p (x). But for integer
p we have two linearly dependent functions, so AJp (x) + BJ−p (x) is really just one solution
CJp (x). We do not have a general solution in that case.
If you understand the problem as we have described it, you can appreciate the following
very clever solution.
$$Y_p(x) = \frac{\cos(\pi p)\, J_p(x) - J_{-p}(x)}{\sin(\pi p)} \qquad (12.7.7)$$
This is the definition for non-integer values of p. For integer n, Yn (x) is defined as the limit of
Equation 12.7.7 as p → n.
(This function is sometimes labeled Np (x) but we will stick with Y .)
But there is one big difference: for all $p$-values, $\lim_{x \to 0^+} Y_p(x) = -\infty$. For this reason, whenever
you solve Bessel’s equation on a region that requires a finite answer at $x = 0$, you immediately
discard the $Y_p(x)$ function.
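Both the definition by a limit and the divergence at the origin can be explored numerically. The sketch below rests on an assumption: that Equation 12.7.7 is the standard definition $Y_p(x) = [\cos(\pi p)J_p(x) - J_{-p}(x)]/\sin(\pi p)$. The function names are hypothetical, the series for $J_p$ is truncated, and the integer-order case is only approximated by evaluating the formula at an order slightly off the integer, which is a crude stand-in for the limit.

```python
import math

def bessel_j(p, x, terms=40):
    """Truncated series for J_p(x), x > 0, for non-integer p (Gamma never hits a pole)."""
    return sum((-1) ** m / (math.factorial(m) * math.gamma(p + m + 1))
               * (x / 2) ** (2 * m + p) for m in range(terms))

def bessel_y(p, x):
    """Y_p(x) built from J_p and J_{-p}; usable as written only for non-integer p."""
    return ((math.cos(math.pi * p) * bessel_j(p, x) - bessel_j(-p, x))
            / math.sin(math.pi * p))

def bessel_y_integer(n, x, eps=1e-6):
    """Rough approximation to Y_n(x) for integer n: evaluate at order n + eps."""
    return bessel_y(n + eps, x)

if __name__ == "__main__":
    # Y_p heads toward minus infinity as x approaches 0 from the right
    for x in (1.0, 0.1, 0.01, 0.001):
        print(x, bessel_y(0.5, x), bessel_y_integer(0, x))
```

The choice of `eps` is a compromise: a smaller value follows the limit more closely but amplifies rounding error, because the numerator and denominator both shrink toward zero.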
Fourier-Bessel Series
By analogy to Legendre series you might expect us to expand a function into a series of Bessel
functions like A0 J0 (x) + A1 J1 (x) + … Well, it doesn’t work that way. Instead, you start with a
function f (x) that you want to expand and a particular Bessel function Jp (x) that you want
to expand it into. A Fourier-Bessel series gives you a series representation of f (x) in terms of
the Bessel function you specified. For a function f (x) with domain 0 ≤ x ≤ a, a Fourier-Bessel
series expansion in terms of Jp looks like this.
$$f(x) = \sum_{n=1}^{\infty} A_n\, J_p\!\left(\frac{\alpha_{p,n}}{a}\, x\right)$$
Question: Expand the function f (x) = sin x on the domain 0 ≤ x ≤ 2 into a series
based on the function J8 (x).
Solution:
Using the equation in Appendix J the first coefficient is:
$$A_1 = \frac{2}{4 J_9^2(\alpha_{8,1})} \int_0^2 \sin(x)\, J_8\!\left(\frac{\alpha_{8,1}}{2}\, x\right) x\, dx$$
At this point we reach for a computer, both to find that 𝛼8,1 ≈ 12.2 and to numerically
evaluate the integral to get A1 = 3.94. The computer then tells us that 𝛼8,2 ≈ 16.0 and
then another integral gives A2 = 0.706, and so on. Putting these together, we get the
p = 8-order Fourier-Bessel series for sin x on the interval [0, 2].
$$\sin x = 3.94\, J_8\!\left(\frac{12.2}{2}\, x\right) + 0.706\, J_8\!\left(\frac{16.0}{2}\, x\right) + 1.68\, J_8\!\left(\frac{19.6}{2}\, x\right) + \ldots$$
As we add more terms the series gradually converges to sin x within the domain [0, 2].
Outside that domain it doesn’t match at all.
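The "reach for a computer" step in this example can be sketched with nothing beyond the standard library. Everything here is illustrative: the function names are hypothetical, the coefficient formula is assumed to be the standard one, $A_n = \frac{2}{a^2 J_{p+1}^2(\alpha_{p,n})}\int_0^a f(x)\,J_p(\alpha_{p,n}x/a)\,x\,dx$ (which matches the expression for $A_1$ above with $a = 2$, $p = 8$), the zeros are found by a simple sign scan plus bisection, and the integral uses Simpson's rule. The series for $J_p$ is only trustworthy for moderate arguments.

```python
import math

def bessel_j(p, x, terms=80):
    """Truncated series for J_p(x), integer p >= 0, built from the ratio of successive terms."""
    term = (x / 2) ** p / math.gamma(p + 1)        # m = 0 term
    total = term
    for m in range(1, terms):
        term *= -(x / 2) ** 2 / (m * (m + p))      # term_m / term_{m-1}
        total += term
    return total

def bessel_zeros(p, count, step=0.1):
    """First `count` positive zeros of J_p: scan for sign changes, then bisect."""
    zeros, x = [], step
    while len(zeros) < count:
        lo, hi = x, x + step
        if bessel_j(p, lo) * bessel_j(p, hi) < 0:
            for _ in range(60):
                mid = (lo + hi) / 2
                if bessel_j(p, lo) * bessel_j(p, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            zeros.append((lo + hi) / 2)
        x += step
    return zeros

def simpson(f, a, b, n=400):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def fourier_bessel_coeffs(f, p, a, count):
    """Zeros alpha_{p,n} and coefficients A_n for f on [0, a], n = 1..count."""
    alphas = bessel_zeros(p, count)
    coeffs = []
    for alpha in alphas:
        integral = simpson(lambda x: f(x) * bessel_j(p, alpha * x / a) * x, 0.0, a)
        coeffs.append(2 * integral / (a ** 2 * bessel_j(p + 1, alpha) ** 2))
    return alphas, coeffs

if __name__ == "__main__":
    alphas, coeffs = fourier_bessel_coeffs(math.sin, 8, 2.0, 3)
    for alpha, A in zip(alphas, coeffs):
        print(f"alpha = {alpha:.4f}, A = {A:.4f}")
```

Running the script prints the first three zeros and coefficients, which can be compared with the values quoted in the example. A useful sanity check is to expand $f(x) = J_8(\alpha_{8,1}x/2)$ itself: orthogonality should then give $A_1 = 1$ and all other coefficients zero.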
Stepping Back
Bessel’s equation arises in many problems, including the solutions of PDEs in cylindrical
coordinates. The general solution is AJp (x) + BYp (x). In the most common cases, x = 0 is
part of the domain of the problem and the solution is constrained to be finite at that point,
so we set B = 0.
In many applications, however, the ODE that arises is not precisely Equation 12.7.2, but
some close cousin of it. These other equations can often be turned into Bessel’s equation by
a simple change of variables, but when we convert back to the original variables we end up
with Bessel functions with different arguments and/or different coefficients in front. Many
of these functions arise in applications enough to have their own names. In the Problems you
will use variable substitution and Bessel functions to derive the “modified Bessel functions,”
the “Kelvin functions,” the “spherical Bessel functions” and the “Riccati–Bessel functions.”
You can find the most commonly used of these functions listed in terms of the differential
equations they solve in Appendix J.
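The simplest such change of variables is a rescaling of the independent variable, and it is worth seeing once why it leaves most of Bessel’s equation untouched. The sketch below is illustrative: the constant $a$ and the sample equation containing $k$ are assumptions chosen for the example, not formulas quoted from the text.

```latex
\begin{align*}
u = ax:&\quad \frac{dy}{dx} = a\,\frac{dy}{du}, \qquad \frac{d^2y}{dx^2} = a^2\,\frac{d^2y}{du^2} \\
&\quad x\,\frac{dy}{dx} = u\,\frac{dy}{du}, \qquad x^2\,\frac{d^2y}{dx^2} = u^2\,\frac{d^2y}{du^2} \\
\text{Example:}&\quad x^2 y'' + x y' + (k^2x^2 - p^2)\,y = 0
  \;\xrightarrow{\;u \,=\, kx\;}\;
  u^2\,\frac{d^2y}{du^2} + u\,\frac{dy}{du} + (u^2 - p^2)\,y = 0 \\
&\quad y = A\,J_p(kx) + B\,Y_p(kx)
\end{align*}
```

The combinations $xy'$ and $x^2y''$ are unchanged by the rescaling, so only the term with an explicit power of $x$ picks up a factor, and that factor is what the constant is chosen to absorb.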
however, for the case $p = \pm 1/2$. In this problem you’ll consider the case $p = 1/2$. (The argument for $p = -1/2$ is exactly the same.)
(a) For $p = 1/2$ what are the possible values of $r$?
(b) Write the equation for $c_1$. For which value of $r$ can you conclude that $c_1 = 0$ and for which value of $r$ can you not conclude this?
(c) Write the recurrence relation for $c_n$ in terms of $c_{n-2}$, valid for $n \geq 2$.
(d) For the value of $r$ for which you cannot conclude that $c_1 = 0$, write the first three terms of the series solution that begins with $c_1$. Your answer will have $c_1$ in it as an undetermined constant.
(e) For the value of $r$ for which you can conclude that $c_1 = 0$, write the first three terms of the series solution. Your answer will have $c_0$ in it as an undetermined constant.
(f) Looking at your answers to Parts (d)–(e), why are you allowed to set $c_1 = 0$ for $p = 1/2$ without losing any generality?

12.103 In the Explanation (Section 12.7.2) we found that Bessel’s equation can be solved with generalized power series using $r = \pm p$. We then used $r = p$ to derive the formula for Bessel functions of the first kind $J_p(x)$. In this problem you will derive the formula for $J_{-p}(x)$.
(a) Plug $r = -p$ into the recurrence relation Equation 12.7.3. Simplify your answer as much as possible.
(b) Use your recurrence relation to find $c_2$, $c_4$, and $c_6$ in terms of $c_0$.
(c) Write the general formula for $c_{2m}$ in terms of $c_0$. Depending on how you answered the previous part this may be very easy or it may require a lot of thinking about where the numbers came from.
(d) Your answer to Part (c) should include the product $(p-1)(p-2)(p-3)\ldots(p-m)$. (If it doesn’t, figure out how to make it include that product!) If you factor $-1$ out of each term it reverses all the subtractions and leaves a new factor of $(-1)^m$ outside. You cannot rewrite the resulting product with factorials in this case (do you see why?) so rewrite it using gamma functions.
(e) Write the resulting series solution to Bessel’s equation. Show that your answer can be written as Equation 12.7.6 with the proper choice of $c_0$.

12.104 The equation $x^2y'' + xy' + \left(x^2 - 4\right)y = 0$ is Bessel’s equation with $p = 2$. Starting at the beginning with the method of Frobenius show that you get two $r$-values that differ by an integer, that the higher $r$-value leads to a valid solution, and that the lower $r$-value does not.

12.105 In this problem you will prove that $J_{-p}(x) = (-1)^p J_p(x)$ for positive integers $p$.⁹ Your starting point will be Equation 12.7.6.
(a) Because $\Gamma(x)$ is infinite when $x$ is zero or a negative integer, the sum begins with one or more terms that equal zero. Rewrite the lower limit of the sum so instead of starting at $m = 0$ it starts at the first non-zero term.
(b) Reindex the sum so it begins at $m = 0$ again, but this time with all the terms non-zero.
(c) Pull $(-1)^p$ out of the sum. What other manipulations do you have to do to this equation to show that it equals $(-1)^p J_p(x)$? Hint: remember how gamma functions are related to factorials.
(d) You have now shown that $J_{-p}(x)$ and $J_p(x)$ are linearly dependent, but that’s only true for integer $p$-values. Name at least one step in your solution that was invalid for non-integer values of $p$, and explain why.

12.106 The Explanation (Section 12.7.2) discussed the fact that $J_p(x)$ and $J_{-p}(x)$ are linearly dependent for integer values of $p$ (a fact that you proved in Problem 12.105). The function $Y_p(x)$ is therefore constructed to be the second solution to Bessel’s equation. The Explanation showed that $Y_p(x)$ solves the equation; in this problem you will show for a few specific cases that $J_p(x)$ and $Y_p(x)$ are linearly independent.
(a) Choose two positive numbers, one integer and the other non-integer. Plot $J_p(x)$ and $Y_p(x)$ for $p = 0$ and for $p$ equal to each of the two numbers you chose, six plots in all. Choose domains for the plots that allow you to see the behavior for both small and large positive values of $x$. (You do not need to include negative values of $x$ in your plot.)
9 Clearly if it’s true for positive integers $p$ it’s also true for negative integers, but the proof is a bit easier if you
assume $p$ is positive.
(b) Based on your plots, describe how each of these six functions behaves in the limit $x \to 0^+$.
(c) Based on your answer to Part (b) argue that $J_p$ and $Y_p$ cannot be linearly dependent for the three values of $p$ you considered.

12.107 Hankel Functions. The solution to Bessel’s equation is sometimes written in the form $AH_p^{(1)}(x) + BH_p^{(2)}(x)$, where $H_p^{(1)}$ and $H_p^{(2)}$ are “Hankel functions” of the first and second kind, respectively, defined as $H_p^{(1)}(x) = J_p(x) + iY_p(x)$ and $H_p^{(2)}(x) = J_p(x) - iY_p(x)$.
(a) Show that this solution is equivalent to the general solution $CJ_p(x) + DY_p(x)$ by finding what values of the arbitrary constants $C$ and $D$ make these two solutions equal.
(b) Express the Hankel functions in terms of $J_p(x)$ and $J_{-p}(x)$.

12.108 In this problem you’ll examine the behavior of $J_{.1}$, $J_{.5}$, and $J_{1.5}$.
(a) Plot all three functions from $x = 0$ to $x = 100$. Describe the plots. What is happening to the amplitude? To the frequency?
(b) For each function, make a list of the first 100 zeroes. Then, for each function, make a list of the differences between successive zeroes. (For example, if you were doing this for $\sin x$ the second list would be $(\pi, \pi, \pi, \ldots)$.) Describe what is happening to the distance between zeroes in each case.
(c) Repeat Parts (a)–(b) for $Y_{.1}$, $Y_{.5}$, and $Y_{1.5}$.

12.109 Equation 12.7.1 comes up in the solution of important partial differential equations in polar coordinates. Apply the substitution $x = k\rho$ to rewrite this differential equation. Remember that this is a substitution for the independent variable, so you will have to write $dR/d\rho$ and $d^2R/d\rho^2$ in terms of $dR/dx$ and $d^2R/dx^2$. Show that the final result can be written in the form of Equation 12.7.2.

12.110 Modified (or Hyperbolic) Bessel Functions. In this problem you will solve the equation $x^2y'' + xy' - (x^2 + p^2)y = 0$. Rather than solving from scratch with the method of Frobenius, you can use a variable substitution to turn it into Bessel’s equation and use the solution we have already worked out.
(a) The substitution will be of the form $u = ax$ where $a$ is a constant. (Part of your job is to determine the right constant to use.) Based on that substitution find the formulas you need to replace $x$, $dy/dx$, and $d^2y/dx^2$ with $u$, $dy/du$ and $d^2y/du^2$.
(b) Substitute to rewrite the differential equation in terms of $y(u)$, with the variable $x$ completely gone.
(c) Choose the constant $a$ to make your equation look like Bessel’s equation. Based on your equation write the solutions $y(x)$ to the original differential equation, the “modified Bessel functions.”

12.111 Kelvin Functions. Solve the equation $y'' + (1/x)y' - iy = 0$ by making the substitution $u = ax$ and choosing the appropriate value of $a$ to turn this into Bessel’s equation. If you are stuck you may want to first work through Problem 12.110; the solution is different but the process is the same.

12.112 Spherical Bessel Functions. In this problem you will solve the equation $x^2y'' + 2xy' + [x^2 - n(n+1)]y = 0$. Rather than solving from scratch with the method of Frobenius, you can use a variable substitution to turn it into Bessel’s equation and use the solution we have already worked out.
(a) The substitution will be of the form $v = x^s y$ where $s$ is a constant. (Part of your job is to determine the right constant to use.) Based on that substitution find the formulas you need to replace $y$, $dy/dx$, and $d^2y/dx^2$ with the appropriate functions of $v$.
(b) Plug all that into the equation. The result should be a differential equation for $v(x)$, with the variable $y$ completely gone. Simplify your answer as much as possible.
(c) Multiply both sides of your equation by the same thing so that your equation, like Bessel’s equation, will have $x^2$ as the coefficient of the second derivative.
(d) Choose the right value of the constant $s$ so that your equation, like Bessel’s equation, will have $x$ as the coefficient of the first derivative.
(e) You should recognize your equation now as Bessel’s equation, with a different constant playing the role of $p$. Based on your equation and the substitution you used, write the solution $y(x)$ to the original differential equation. These functions are called “spherical Bessel functions.”
12.113 Riccati–Bessel Functions. Solve the equation $x^2y'' + [x^2 - n(n+1)]y = 0$ by making the substitution $v = x^s y$ and choosing the appropriate value of $s$ to turn this equation into Bessel’s equation. If you are stuck you may want to first work through Problem 12.112; the solution is different but the process is the same.

12.114 The last few problems featured modified Bessel functions, Kelvin functions, spherical Bessel functions, and Riccati–Bessel functions. All these functions are special cases of the solution to the following differential equation.
$$y'' + \frac{1 - 2a}{x}\, y' + \left[(bcx^{c-1})^2 + \frac{a^2 - p^2c^2}{x^2}\right] y = 0$$
(a) Begin by making the substitution $y(x) = x^a u(x)$. You are replacing the dependent variable $y$ with a new dependent variable $u$, so you will end up with a second-order differential equation for $u(x)$. Simplify as much as possible.
(b) Now apply the substitution $z = bx^c$. This time you are replacing the independent variable $x$ with a new independent variable $z$, so you will end up with a second-order differential equation for $u(z)$. Simplify as much as possible.
(c) Write the solution to the original equation.
10 Actually the ODE is more complicated because it involves terms related to the particle’s angular momentum.
Here we’re considering particles with zero angular momentum.
In classical mechanics the energy could be any non-negative real number; for any particu-
lar energy, you could write an equation for the motion. In quantum mechanics, as you will
see, only certain energy levels are possible; for any allowable energy level, you can write an
equation for the wavefunction.
1. Look up the equation for “spherical Bessel functions” in Appendix J. Do a variable
substitution of the form u = kr and find the value of k that turns Equation 12.8.1 into
that equation. (The spherical Bessel equation has a parameter p in it. Equation 12.8.1
should match it for one particular value of p.)
2. Write the general solution to Equation 12.8.1 in terms of spherical Bessel functions.
Write your answer in terms of r , not in terms of u.
See Check Yourself #87 in Appendix L
3. One boundary condition says that R(r ) must be finite at r = 0. Look up the properties
of spherical Bessel functions in Appendix J and use that boundary condition to set
one of the arbitrary constants in your general solution equal to zero.
4. The second boundary condition says that R(a) = 0. Using this boundary condition,
what are the possible values of E?
5. This problem—which is not just Equation 12.8.1, but also includes the two boundary
conditions—cannot be solved for all values of E. It can only be solved for the values
of E that you listed in Part 4. Each allowable E is called an “eigenvalue” of this prob-
lem. Is there a finite number of eigenvalues, or an infinite number? Is there a lowest
eigenvalue? Is there a highest eigenvalue?
6. Every eigenvalue E is associated with a particular solution to this problem, which is
called its “eigenfunction.” Write the first two eigenfunctions. If you are doing this
exercise as homework (or in a classroom with computers), choose any positive values
for m and a and graph those eigenfunctions.
$$\begin{aligned} c_1\, y(a) + c_2\, y'(a) &= 0 \\ c_3\, y(b) + c_4\, y'(b) &= 0 \end{aligned} \qquad (12.8.3)$$
A particular Sturm-Liouville problem is defined by specific functions p(x), q(x) and w(x) and specific
constants c1 –c4 . The problem definition does not specify the constant 𝜆. Rather, your goal is to
find the constants 𝜆 for which solutions exist. These 𝜆-values are called the “eigenvalues” of the
differential equation. Each such 𝜆 is associated with a particular solution, its “eigenfunction” y𝜆 (x).
The functions p, p ′ , q, and w should all be real and continuous, and both p and w should be
positive on the interval (a, b).
Equation 12.8.2 may look unfamiliar, but it is just an unusual way of expressing any second-order linear homogeneous ODE (as you will show in Problem 12.125). Equations 12.8.3 may also look unfamiliar, but again, they are a general way of writing the sorts of boundary conditions we are used to. Our purpose in this section is not to introduce a new type of equation, but to discuss a general class that includes many of the differential equations we have seen.
But unlike many differential equations, Equation 12.8.2 has an unspecified constant. Your job is not only to solve the differential equation, but to find the $\lambda$-values for which solutions exist. This type of problem is very common: when you separate variables in a partial differential equation, you get ordinary differential equations with unspecified constants. So Sturm-Liouville theory is most useful for describing the normal modes of a PDE.
Even in that case, Sturm-Liouville theory doesn’t always apply. Both Equation 12.8.2 and Equations 12.8.3 are necessarily homogeneous, so they do not apply to many important ODEs and conditions. In addition, Equations 12.8.3 represent one condition at each end of the interval. These are the sort of conditions we usually get for spatial variables (as in the example below, where the boundary conditions are $y(0) = y(L) = 0$). With time variables we often get separate values for $y(0)$ and $y'(0)$, which cannot be expressed in this form.
But despite these limitations, Sturm-Liouville theory applies to a broad class of important
problems. In the example below we solve an old favorite example in order to acquaint you
with this new formalism.
Solve $\dfrac{d^2y}{dx^2} + \lambda y(x) = 0$ on $0 \leq x \leq L$ with $y(0) = y(L) = 0$ (12.8.4)
You know how to do this, but let’s walk quickly through the steps with Sturm-Liouville
terminology.
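1. Find the solution or solutions to the differential equation. The real-valued solution in this case breaks into three categories based on $\lambda$.
$$y = \begin{cases} A e^{\sqrt{-\lambda}\,x} + B e^{-\sqrt{-\lambda}\,x} & : \lambda < 0 \\ A \cos\left(\sqrt{\lambda}\,x\right) + B \sin\left(\sqrt{\lambda}\,x\right) & : \lambda > 0 \\ Ax + B & : \lambda = 0 \end{cases}$$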
2. Apply the boundary conditions. This will in general restrict the possible values of $\lambda$, as well as eliminating some solutions. In this case the $\lambda < 0$ solutions cannot possibly meet the boundary conditions (Problem 12.128), and the $\lambda = 0$ solution can only meet them with the trivial solution $y(x) = 0$. So we focus on positive $\lambda$-values and their sine-and-cosine solutions. The condition $y(0) = 0$ means that $A = 0$. The condition $y(L) = 0$ means that $\sqrt{\lambda}\, L = n\pi$ for any positive integer $n$.
What have we found?
∙ If we choose n = 1 then 𝜆 = 𝜋²∕L². This is the first eigenvalue of the problem. It gives us the differential equation y′′ + (𝜋²∕L²)y = 0 whose solution, the eigenfunction associated with that eigenvalue, is $y_{\pi^2/L^2} = B\sin(\pi x/L)$.
∙ If we choose n = 2 then 𝜆 = 4𝜋²∕L². This is the second eigenvalue of the problem. It gives us the differential equation y′′ + (4𝜋²∕L²)y = 0 whose solution, the eigenfunction associated with that eigenvalue, is $y_{4\pi^2/L^2} = B\sin(2\pi x/L)$.
∙ More generally, every eigenvalue of this problem is of the form 𝜆 = n²𝜋²∕L² for a positive integer n. Each such value is associated with an eigenfunction $y_\lambda = B\sin(n\pi x/L)$.
Usually when you solve a second-order ODE with two boundary conditions you expect to
get a unique answer, but with a Sturm-Liouville problem you never will. That’s because the
ODE and the boundary conditions are linear and homogeneous, so you can stick an arbitrary
constant in front of any solution. Between the two boundary conditions you can solve for one
arbitrary constant (setting A = 0 in the example above) and restrict the possible values of 𝜆.
For 𝜆 = 9𝜋²∕L² the problem above has infinitely many solutions, and for 𝜆 = 3𝜋²∕L² it has
none.
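Those two claims can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes L = 1, integrates y′′ = −𝜆y forward from y(0) = 0 with the slope y′(0) = 1 (one choice of the arbitrary constant), and then searches for 𝜆-values at which y(L) = 0. The function names `shoot` and `eigenvalue_between` are hypothetical helpers, not anything defined in the text.

```python
import math

def shoot(lam, L=1.0, steps=2000):
    """Integrate y'' = -lam*y from x=0 to x=L with y(0)=0, y'(0)=1 (RK4).

    Returns y(L). A nontrivial solution of the boundary value problem
    exists only for those lam where this return value is zero.
    """
    h = L / steps
    y, v = 0.0, 1.0  # v stands for y'
    for _ in range(steps):
        k1y, k1v = v, -lam * y
        k2y, k2v = v + 0.5 * h * k1v, -lam * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -lam * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -lam * (y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y

def eigenvalue_between(lo, hi, L=1.0, tol=1e-10):
    """Bisect on lam, assuming shoot(lam) changes sign between lo and hi."""
    f_lo = shoot(lo, L)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid, L)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print("first eigenvalue :", eigenvalue_between(5.0, 15.0), "vs", math.pi**2)
    print("second eigenvalue:", eigenvalue_between(30.0, 50.0), "vs", 4 * math.pi**2)
    print("y(L) at 3*pi^2   :", shoot(3 * math.pi**2))  # not zero: no solution
    print("y(L) at 9*pi^2   :", shoot(9 * math.pi**2))  # essentially zero
```

The brackets (5, 15) and (30, 50) were chosen by hand to straddle 𝜋² and 4𝜋²; a different L would need different brackets.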
If Equation 12.8.4 had come from separating variables in a PDE, every eigenfunction
yn = B sin (n𝜋x∕L) would represent one normal mode of the system. (The system in this case
might be a vibrating string tacked down at the ends.) We would then need to build the initial
position of the string as a series of those functions.
You don’t need Sturm-Liouville theory to tell you we can sum these eigenfunctions to
match almost any given initial condition; you just need to have studied Fourier series. You can
build any function on a finite domain from sines (by building an odd extension), subject to
the not-very-restrictive Dirichlet conditions. All you need is the formula for the coefficients,
and you can derive that formula based on the orthogonality of the trig functions. (If you
don’t know all that, you may want to review Chapter 9.)
But Sturm-Liouville theory promises us similar results over a very wide class of solutions
to Equation 12.8.2. It tells us that …
∙ A list of the eigenvalues from lowest to highest will start somewhere (the lowest eigen-
value), but it will generally go on infinitely up from there (no highest eigenvalue). In
our example above we saw that the eigenvalues were 𝜋²∕L², 4𝜋²∕L², 9𝜋²∕L² and so on.
∙ The nth eigenfunction will have n − 1 zeros on the interval (a, b). In our example above
the nth eigenfunction B sin(n𝜋x∕L) crosses the x-axis exactly n − 1 times between 0 and
L (not including the endpoints).
∙ If p(a) = 0 then “y must remain finite at x = a” is a sufficient condition at that boundary.
(This also means you cannot specify a particular numerical value at that boundary; you
solve the problem and find out what value y must have at the boundary.) If p(a) ≠ 0
then a specific numerical boundary condition must be specified at that boundary, and
the constant(s) can be chosen to meet that condition.
∙ And of course the same is true at the other boundary x = b.
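The zero-counting property in the list above is easy to confirm for the sine eigenfunctions. The following is a minimal sketch, assuming L = 1 and B = 1; the helper name `count_interior_zeros` and the sample count are illustrative choices. It counts sign changes of sin(n𝜋x∕L) on a grid of interior points.

```python
import math

def count_interior_zeros(n, L=1.0, samples=997):
    """Count sign changes of sin(n*pi*x/L) strictly inside (0, L).

    The sample count is a prime so that, for n below it, no grid point
    lands exactly on a zero of the eigenfunction.
    """
    xs = [L * k / samples for k in range(1, samples)]
    ys = [math.sin(n * math.pi * x / L) for x in xs]
    return sum(1 for a, b in zip(ys, ys[1:]) if a * b < 0)

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_interior_zeros(n))
```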
But the most important result of Sturm-Liouville theory is worth putting in its own box.
The property of “completeness” promises us that we can build any initial condition from our
normal modes, and “orthogonality” tells us how to do it. At this point we should define those
terms a bit more carefully.
A set of functions yn (x) is “orthogonal” on an interval a ≤ x ≤ b with respect to the “weight function”
w(x) if
$$\int_a^b w(x)\, y_m^*(x)\, y_n(x)\, dx = 0 \quad \text{for all } m \neq n$$
The sentence “the functions sin(n𝜋x∕L) are orthogonal on the interval from 0 to L” is just a
way of asserting that that integral comes out zero when m ≠ n. But orthogonality isn’t quite
enough to find the coefficients of a Fourier series. We also need to know the “norm” of these
sine functions, meaning the value of the integral when m does equal n. In this case it equals
L∕2. We can express both the m ≠ n and m = n results in a piecewise function, but it is more
common to use the “Kronecker delta.”
Using this symbol we can express the integral of the sine products concisely.
$$\int_0^L \sin\left(\frac{m\pi}{L}x\right)\sin\left(\frac{n\pi}{L}x\right)dx = \frac{L}{2}\,\delta_{mn} \tag{12.8.5}$$
Now suppose we want to represent an arbitrary function on the interval [0, L] as a series of
sine functions.
$$y_0(x) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L}x\right) \tag{12.8.6}$$
We can find the coefficients using Fourier’s trick, which relies only on the orthogonality
relationship (12.8.5). Begin by multiplying both sides of Equation 12.8.6 by sin(m𝜋x∕L) and
integrating.
$$\int_0^L y_0(x)\sin\left(\frac{m\pi}{L}x\right)dx = \sum_{n=1}^{\infty} b_n \int_0^L \sin\left(\frac{m\pi}{L}x\right)\sin\left(\frac{n\pi}{L}x\right)dx$$
Because of orthogonality, all the terms in the sum vanish except for the term n = m, which
gives $(L/2)\,b_m$. Multiplying both sides by 2∕L and relabeling m as n, $b_n = \frac{2}{L}\int_0^L y_0(x)\sin(n\pi x/L)\,dx$.
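As a sanity check on the orthogonality relation and the coefficient formula, here is a short numerical sketch. It assumes L = 1 and an illustrative function y0(x) = x(1 − x), neither of which comes from the text, and uses a hand-written Simpson's rule; `simpson` and `sine_coefficient` are hypothetical helper names.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def sine_product_integral(m, n, L=1.0):
    """Left-hand side of the orthogonality relation for sin(m pi x / L)."""
    return simpson(
        lambda x: math.sin(m * math.pi * x / L) * math.sin(n * math.pi * x / L),
        0.0, L)

def sine_coefficient(y0, n, L=1.0):
    """b_n = (2/L) * integral of y0(x) sin(n pi x / L) from 0 to L."""
    return (2.0 / L) * simpson(
        lambda x: y0(x) * math.sin(n * math.pi * x / L), 0.0, L)

if __name__ == "__main__":
    print(sine_product_integral(2, 3))   # close to 0
    print(sine_product_integral(3, 3))   # close to L/2
    y0 = lambda x: x * (1.0 - x)         # illustrative choice only
    print([sine_coefficient(y0, n) for n in range(1, 5)])
```

For this particular y0 the even coefficients vanish and the odd ones equal 8∕(n𝜋)³, which gives a closed form to compare against.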
Our point here is not to re-teach you Fourier series. Our point is that we were able to find
the coefficients of a Fourier series because we knew that the sine and cosine functions are
orthogonal. Sturm-Liouville theory guarantees us that a much broader class of functions is
orthogonal, and in doing so it gives us the way to derive their series expansion formulas.
Problem:
The “Chebyshev polynomials” of the first kind, Tn (x), are complete on the interval
[−1, 1] with orthogonality relation
$$\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\, T_n(x)\, T_m(x)\, dx = \begin{cases} 0 & n \neq m \\ \pi & n = m = 0 \\ \pi/2 & n = m \neq 0 \end{cases}$$
Find the fourth partial sum of the series expansion of e^x in Chebyshev polynomials.
Solution:
We begin by writing the expansion.
$$e^x = \sum_{n=0}^{\infty} c_n T_n(x)$$
From this equation we can get the numerical values of the coefficients on a computer.
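$$\begin{aligned}
c_0 &= \frac{1}{\pi}\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\, T_0(x)\, e^x\, dx \approx 1.266 \\
c_1 &= \frac{2}{\pi}\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\, T_1(x)\, e^x\, dx \approx 1.130 \\
c_2 &= \frac{2}{\pi}\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\, T_2(x)\, e^x\, dx \approx 0.271 \\
c_3 &= \frac{2}{\pi}\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\, T_3(x)\, e^x\, dx \approx 0.044
\end{aligned}$$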
It’s clear from these numbers that the coefficients are rapidly shrinking. This is
further confirmed by plotting each of the first four partial sums. Each plot below
shows e^x (solid) and a partial sum of its Chebyshev series (dashed). By the fourth
partial sum the plots are indistinguishable within the interval [−1, 1].
[Figure: four panels showing e^x (solid) and the partial sums c0T0, c0T0 + c1T1, c0T0 + c1T1 + c2T2, and c0T0 + c1T1 + c2T2 + c3T3 (dashed), each plotted for −2 ≤ x ≤ 2.]
The most important lesson from the example above is that you didn’t need to have heard of
a Chebyshev polynomial. You needed to know a few key properties that you could easily look
up, and ask a computer to calculate some numerical integrals.
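One way to "ask a computer" is sketched below. This is an illustration, not the method the authors used: it relies on the standard identity T_n(cos 𝜃) = cos(n𝜃), so the substitution x = cos 𝜃 turns each weighted integral into an integral of e^(cos 𝜃) cos(n𝜃) from 0 to 𝜋 and removes the endpoint singularity. The helper names are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def chebyshev_coefficient(n, f=math.exp):
    """c_n for f on [-1, 1], computed with the substitution x = cos(theta)."""
    integral = simpson(lambda t: f(math.cos(t)) * math.cos(n * t), 0.0, math.pi)
    norm = math.pi if n == 0 else math.pi / 2  # from the orthogonality relation
    return integral / norm

def chebyshev_T(n, x):
    """Evaluate T_n(x) with the recurrence T_{k+1} = 2x T_k - T_{k-1}."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr

def partial_sum(x, terms=4):
    """Sum of the first `terms` terms of the Chebyshev series of e^x."""
    return sum(chebyshev_coefficient(n) * chebyshev_T(n, x) for n in range(terms))

if __name__ == "__main__":
    print([round(chebyshev_coefficient(n), 3) for n in range(4)])
    for x in (-1.0, 0.0, 0.5, 1.0):
        print(x, partial_sum(x), math.exp(x))
```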
The properties you need to know about any given set of functions, if you want to build a
series from it, are its orthogonality relationship (which Sturm-Liouville theory often guaran-
tees must exist) and its norm. Finding the norm is done on a case by case basis. It’s pretty easy
for sines and cosines. In Section 12.5 Problem 12.87 you derived the norm for the Legendre
polynomials.
Assuming the drum starts in some shape f(𝜌) and is released from rest, its motion is
described by the equation $z(\rho, t) = \sum_{n=1}^{\infty} B_n J_0(\alpha_{0,n}\rho/a)\cos(v\alpha_{0,n}t/a)$. The derivation of
those cosine functions is not part of this chapter (see Chapter 11 for that part of the
story). But once you have the solution in this form you can write the initial condition as
$f(\rho) = \sum_{n=1}^{\infty} B_n J_0(\alpha_{0,n}\rho/a)$, at which point Sturm-Liouville theory comes into play again.
6. Write the coefficients
$$B_n = \frac{2}{a^2 J_1^2(\alpha_{0,n})}\int_0^a f(\rho)\, J_0\left(\frac{\alpha_{0,n}}{a}\rho\right)\rho\, d\rho.$$
Explanation: Sturm-Liouville theory says that the eigenfunctions J0 (𝛼0,n 𝜌∕a) must form
a complete set, so any initial condition f (𝜌) can be written in this form. The orthog-
onality of the eigenfunctions, also promised by Sturm-Liouville theory, enables us to
calculate the formula for the coefficients in Appendix J. You’ll do that calculation in
Problem 12.117 (taking a = 1 for simplicity).
(a) Begin by solving for Φ(𝜙). The condition of periodicity will restrict p to a discrete-but-still-infinite set of allowed values.
(b) Use a variable substitution to put the R(𝜌) equation into a familiar form and then solve it. The solution will be more general than the one we found with azimuthal symmetry because it will have a p in it. Make sure to apply its two boundary conditions and note the resulting restriction on k.
(c) How could you have known from looking at the ODE (not at the solution or the physical setup) that “R is finite” is a sufficient boundary condition at 𝜌 = 0, while you need a specific value at 𝜌 = a?
(d) Solve for T(t).
(e) Multiply your three separate functions to create one solution z(𝜌, 𝜙, t). You will at this point have five arbitrary constants, but you can absorb one into the other four.
(f) Write the general solution as a series over the indices n and p.
(g) Show that if you apply the assumptions of azimuthal symmetry and ż(𝜌, 𝜙, 0) = 0 to your final solution, it becomes the solution we found.
CHAPTER 13
This chapter is not an introduction to complex numbers. Chapter 3 presented all of the skills above
(except the last one). It is particularly important that you are comfortable visualizing numbers on
the complex plane and working with oscillatory functions in complex form. If those skills feel rusty,
please review them before diving into this chapter.
After you read this chapter, you should be able to …
∙ graph a complex function in several different ways, including a “mapping” from one region
of the complex plane to another.
∙ identify branch cuts and singularities (especially poles) of a complex function.
∙ find the derivative of a complex function, or prove that none exists.
∙ determine if a given function is analytic.
∙ use analytic functions to solve Laplace’s equation with boundary conditions, both with and
without conformal mapping.
∙ evaluate contour integrals, especially around closed loops.
∙ use contour integrals to evaluate real integrals, and to find inverse Laplace transforms.
∙ find the Taylor series or Laurent series for a complex function around a given point, and
find the radius of convergence of such a series.
∙ find the residue of a complex function at a given pole, with or without a Laurent series.
In Chapter 3 we worked with many complex-valued functions, especially those of the form
𝐗e^(i𝜔t). But although the results of those functions were complex, the arguments (t in this
example) were generally real. This chapter will focus on complex functions of complex vari-
ables, the study of which is called “complex analysis.” This will allow us to move from complex
algebra to complex calculus, extending the ideas of differentiation and integration to com-
plex functions. As we found with complex algebra, doing calculus on complex functions is
often the easiest way to answer difficult questions about real functions.
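$$\frac{\partial^2 T}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial T}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2 T}{\partial\phi^2} = 0$$
2. Once again, from symmetry, we can conjecture that T only depends on one of the two polar coordinates. Which one, and why?
3. Using that fact, rewrite Laplace’s equation as an ODE and solve it with the correct boundary conditions.
See Check Yourself #91 in Appendix L
[FIGURE 13.2: region with boundary labels T = 1 and T = 0, drawn against the x-axis.]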
Those problems may have required dusting off a few skills with partial derivatives and ODEs, but hopefully nothing about them seemed new or unsolvable. Now try this one on for size. Consider the same semi-infinite slab we started with, but this time assume the bottom and right edges are at T = 0 and the left edge is at T = 1. (Figure 13.3.) You’re welcome to try to solve this, but unless you’ve studied complex analysis you’re going to find it a lot harder than the others.
[FIGURE 13.3: the semi-infinite slab, with T = 1 on the left edge, T = 0 on the bottom and right edges, and the right edge at x = 𝜋∕2.]
4. Even though you can’t solve Laplace’s equation with these boundary conditions, you can still predict a fair amount about the behavior from what you know. What is the temperature range inside the slab? Where is it hottest, and where coldest? What other qualitative predictions can you make?
We’ll go ahead and solve the equation for you: $T(x, y) = (2/\pi)\tan^{-1}\left[\cot(x)\,(e^y - e^{-y})/(e^y + e^{-y})\right]$.
Sorry to give that away when you were just on the verge of guessing it! While this problem
looks superficially similar to the slab problem we started with, we can use complex analysis
to show that it is actually mathematically equivalent to the problem in Figure 13.2. Once you
have the solution to the first quadrant problem, 5–10 minutes of easy algebra can convert
it to this hideous-looking solution to the second slab problem. You won’t get to that point
until Section 13.9, though. First, we have to start with the idea of a complex function…
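Even without the derivation, the quoted formula can be tested against the stated problem. The sketch below is an assumption-laden check, not part of the exercise: it rewrites (e^y − e^(−y))∕(e^y + e^(−y)) as tanh y, evaluates the inverse tangent with `atan2` so that x = 0 causes no division by zero, and estimates the Laplacian with a five-point finite difference. The function names are hypothetical.

```python
import math

def slab_temperature(x, y):
    """T(x, y) = (2/pi) * atan(cot(x) * tanh(y)) for 0 <= x <= pi/2, y >= 0.

    atan2(cos(x) * tanh(y), sin(x)) equals atan(cot(x) * tanh(y)) whenever
    sin(x) > 0, and it also gives the limiting value at x = 0.
    """
    return (2.0 / math.pi) * math.atan2(math.cos(x) * math.tanh(y), math.sin(x))

def laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference estimate of f_xx + f_yy."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4.0 * f(x, y)) / h**2

if __name__ == "__main__":
    print("left edge  :", slab_temperature(0.0, 1.0))          # expect about 1
    print("right edge :", slab_temperature(math.pi / 2, 1.0))  # expect about 0
    print("bottom edge:", slab_temperature(0.7, 0.0))          # expect 0
    print("Laplacian at an interior point:", laplacian(slab_temperature, 0.6, 0.8))
```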
1 For the problems in this exercise to have unique solutions we have to assume the implicit boundary condition that
T remains finite as y → ∞.
[Figure: three surface plots, |f|, Re(f), and Im(f), each with Re(z) running from −𝜋 to 𝜋.]
FIGURE 13.5 Plots of the modulus, real part, and imaginary part of f (z) = sin z. The function oscillates in the real
direction and grows exponentially in both the positive and negative imaginary directions.
Over the years you have studied elements of real graphs such as holes, asymptotes, and
jump discontinuities. These are not just visual effects; they are visual ways of describing and
understanding the behavior of a function. Complex graphs have their own peculiar phenom-
ena, and as you learn about them you will learn more about complex functions. Below we
introduce two of the most important such properties, “branch cuts” and “poles.”
When you multiply two complex numbers, their moduli multiply and their phases
add. So when you cube a complex number, its modulus cubes and its phase triples.
That tells us how to find a cube root. The number −8 has a modulus of 8 and a phase
of 𝜋, so its cube root must have a modulus of 2 and a phase of 𝜋∕3. That gives us the
number $2e^{(\pi/3)i}$, aka $1 + \sqrt{3}\,i$.
But that’s only one answer; we’re supposed to get three! Well, we said above that
the number −8 has a phase of 𝜋, but remember that the phase itself is multivalued.
We can also describe −8 as having a modulus of 8 and a phase of 3𝜋. Now its cube
root has a modulus of 2 and a phase of 𝜋, giving us a new answer, ∛(−8) = −2. (That
answer at least comes as no surprise.) We can also describe −8 as having a phase of 5𝜋,
so $\sqrt[3]{-8} = 2e^{(5\pi/3)i} = 1 - \sqrt{3}\,i$. There are our three answers. (We’ll leave it to you to
figure out what happens when you consider the phases 7𝜋, 9𝜋, and so on.)
Question: If we create a branch so that ∛z is a single valued function, where is the
branch cut? (Assume here that the phase function is defined as going from −𝜋 to 𝜋.)
Answer:
If you start with a number z whose phase is slightly less than 𝜋, its cube root will have a
phase slightly less than 𝜋∕3. Now move z down just a bit in the complex plane
and—because of the way we have defined phase—suddenly its phase undergoes a
discontinuous jump to a number slightly above −𝜋, so its cube root becomes
something completely different.
[Figure: the complex plane with axes Re(z) and Im(z), with the branch cut drawn as a wavy line.]
The wavy line shows the branch cut for ∛z, which is the same as the branch cut for the phase. The arrows
show that when you take the cube root of a number just above the branch cut its phase goes from 𝜋 to 𝜋∕3,
while the cube root of a number just below the branch cut takes its phase from −𝜋 to −𝜋∕3.
The moral of the story is that functions that depend on phase inherit the branch
cut from the phase function. For fractional powers such as x^(1∕3), and (as we’ll see
below) for the logarithm, the principal value has a branch cut on the negative real
axis.
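The cube-root bookkeeping above translates directly into a few lines of code. This is a minimal sketch using Python's standard `cmath` module; the helper names `nth_roots` and `principal_cube_root` are illustrative. It lists all three cube roots of −8 by adding multiples of 2𝜋 to the phase, then evaluates the principal cube root just above and just below the negative real axis to show the jump across the branch cut.

```python
import cmath
import math

def nth_roots(w, n):
    """All n complex n-th roots of w: root the modulus, divide each phase by n."""
    r, phi = cmath.polar(w)  # phi is the principal phase, between -pi and pi
    return [cmath.rect(r ** (1.0 / n), (phi + 2 * math.pi * k) / n)
            for k in range(n)]

def principal_cube_root(z):
    """Cube root built from the principal phase, so it inherits its branch cut."""
    return cmath.rect(abs(z) ** (1.0 / 3.0), cmath.phase(z) / 3.0)

if __name__ == "__main__":
    for root in nth_roots(-8, 3):
        print(root, "cubed gives", root ** 3)

    above = principal_cube_root(complex(-8, 1e-9))   # just above the cut
    below = principal_cube_root(complex(-8, -1e-9))  # just below the cut
    print("above:", above)
    print("below:", below)
```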
At x = −4 and x = 7 this function is undefined, but around those values the behavior is
very different. x = 7 is a “removable discontinuity” or “hole” meaning that the undefined
value has no effect on the nearby behavior of the function. x = −4 is a “vertical asymptote”
meaning that the function blows up around this value, or $\lim_{x\to -4} |f(x)| = \infty$. (The absolute
value is required because the function might approach ∞, −∞, or both.)
Now consider the following complex-valued function of a complex variable.
$$f(z) = \frac{(z - 4)(z - i)}{(z - 4)(z - 7 + 2i)}$$
This function is undefined at z = 4 and z = 7 − 2i, but once again we see very different behav-
ior around these points. The function is completely unaffected by the z − 4 terms in the
numerator and denominator except at z = 4 itself. But around z = 7 − 2i the function blows
up: $\lim_{z\to 7-2i} |f(z)| = \infty$ where the modulus of this function serves roughly the same purpose
that the absolute value served for the previous one.
Both z = 4 and z = 7 − 2i are called “singularities” of f (z). The former is a “removable
singularity” or “hole.” The latter is a “pole.” Later in the chapter we’ll define the term “pole”
more precisely and introduce a third type, an “essential” singularity, but the most important
thing to know about poles is that the modulus of the function approaches infinity as you
approach them.
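A quick numerical experiment makes the contrast concrete. The sketch below simply evaluates the function f(z) given above at points approaching each singularity along the real direction; the step sizes are arbitrary illustrative choices.

```python
def f(z):
    """f(z) = (z - 4)(z - i) / ((z - 4)(z - 7 + 2i)); undefined at z = 4 and z = 7 - 2i."""
    return ((z - 4) * (z - 1j)) / ((z - 4) * (z - 7 + 2j))

# Value the function approaches at the removable singularity z = 4,
# obtained by cancelling the common (z - 4) factor by hand.
limit_at_hole = (4 - 1j) / (4 - 7 + 2j)

if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-6):
        near_hole = f(4 + eps)
        near_pole = f(7 - 2j + eps)
        # The first number shrinks toward 0; the second grows without bound.
        print(eps, abs(near_hole - limit_at_hole), abs(near_pole))
```

Calling `f(4)` itself raises `ZeroDivisionError`, which is the numerical counterpart of "undefined at that point."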
the left by 3. Similarly, f (z) + 7 − 2i adds 7 − 2i to every value of f , while f (z + 7 − 2i) shifts
the graph by −7 in the real direction and +2 in the imaginary.
Finally, a note about notation: many texts use “log z” for the natural log of a complex
number. In some cases (lowercase) “log z” means the original multivalued function, and
(uppercase) “Log z” the principal branch. All this can be useful, but we don’t need it for
our purposes, so we will stick to “ln z” and if we want the principal branch we’ll say so.
three facts that you can see about the function cos z by looking at your graph.
13.7 Consider Equation 13.2.1 for the special case y = 0.
(a) In this case, what are the real and imaginary parts of cos z?
(b) If you were to follow along the complex plane from x = 0 to x → ∞, always holding y at zero, what behavior would you see in the cosine function?
(c) How do your answers change if y is held at a non-zero constant y0?
13.8 Consider Equation 13.2.1 for the special case x = 0.
(a) In this case, what are the real and imaginary parts of cos z?
(b) If you were to follow along the complex plane from y = 0 to y → ∞, always holding x at zero, what behavior would you see in the cosine function?
(c) How do your answers change if x is held at a non-zero constant x0?
13.9 For complex numbers, just as for real numbers, sin z = cos (z − 𝜋∕2). Once you know the behavior of the cosine function on the complex plane, how does that behavior translate to the behavior of the sine function? Hint: remember that 𝜋∕2 is a real number!
13.10 For each of the functions below, make 3D plots of |f|, Re(f), and Im(f). In each case the horizontal plane will represent z and the vertical axis will be the real quantity you are plotting. Choose your domains so that you can clearly see how the function behaves. Describe some property of each function that is most easily seen in one of the plots, and what about that plot reveals that property.
(a) f(z) = z³
(b) f(z) = z*
(c) f(z) = tan z
(d) f(z) = ln(cos z)
13.11 (a) If the complex number z has modulus |z| and phase 𝜙z, what are the modulus and phase of z^(1∕2)?
(b) What are the two values of (4i)^(1∕2)?
(c) We can define the principal value of z^(1∕2) by using the principal value of 𝜙 in your formula from Part (a). Where is the branch cut for the principal value of z^(1∕2)?
13.12 Which of the following functions have branch cuts?
(a) z³
(b) z^(1∕5)
(c) z^(2∕5)
(d) e^z
(e) ln |z|
13.13 Identify all the zeros (places where f(z) = 0) and poles of the function
$$f(z) = \frac{(z + 1)(z - 2i)}{(z - 3)^3 (z + 5 - 2i)}.$$
13.14 The real function $\frac{1}{x^2 + 9}$ has no vertical asymptotes, but the complex function $\frac{1}{z^2 + 9}$ has two poles. Where are they?
13.15 The function $f(z) = \frac{z + 1 + i}{(z + 4i)(z + 1 + i)}$ has a removable singularity and a pole.
(a) Identify the removable singularity. How does f(z) behave for z-values close to that number?
(b) Identify the pole. How does f(z) behave for z-values close to that number?
13.16 The Explanation (Section 13.2.2) shows that the principal branch of ∛z has a branch cut on the negative real axis. Using this fact you should be able to answer the following questions without too much work.
(a) Where is the branch cut for the function $\sqrt[3]{z} - 2 - 3i$?
(b) Where is the branch cut for the function $\sqrt[3]{z - 2}$?
(c) Where is the branch cut for the function $\sqrt[3]{z - 3i}$?
(d) Where is the branch cut for the function $\sqrt[3]{z - 2 - 3i}$?
So df ∕dx does not just ask how much f changed; it divides that by how much x changed,
which measures how far you traveled along the x-axis.
The derivative of a complex function can be written the same way.
$$\frac{df}{dz} = \lim_{\Delta z\to 0}\left(\frac{\Delta f}{\Delta z}\right)$$
So df ∕dz divides the change in f by how much z changed. But Δz is not a measure of how
far you traveled; it is a complex number that measures how much z changed as you traveled
in whatever direction you did your traveling. To consider what all that means, we present you
with the following fact.
This tells us that if we start at z = 5 and move off by a reasonably small Δz the function should
change by roughly 10 Δz no matter what direction we move in. Let’s see how that pans out.
1. What is f (5)?
2. Suppose you move away from z = 5 by one unit in the positive real direction. What
is the new z, and what is Δz? What is the new f , and what is Δf ? Is their ratio
approximately 10?
3. Suppose you move away from z = 5 by one unit in the positive imaginary direction.
What is the new z, and what is Δz? What is the new f , and what is Δf ? Is their ratio
approximately 10?
See Check Yourself #93 in Appendix L
4. Suppose you move away from z = 5 by one unit in the negative real direction.
What is the new z, and what is Δz? What is the new f , and what is Δf ? Is their ratio
approximately 10?
5. Suppose Δz = 1 + i. What is the new z? What is the new f , and what is Δf ? Is it still
approximately 10Δz?
6. In every case above you should have found that Δf was approximately 10 times Δz,
but never exactly. If f ′ (5) = 10 exactly (which it is), why didn’t your answers come out
exactly that way?
$$\frac{df}{dz} \equiv \lim_{\Delta z\to 0}\frac{f(z + \Delta z) - f(z)}{\Delta z} \qquad \text{Definition of the derivative}$$
For a real function f (x) we have two choices for how to take the limit Δx → 0: from the
right or from the left. For example, applying the definition of the derivative to the function
f (x) = |x| at x = 0 gives f ′ (x) = 1 if we take the limit from the right and f ′ (x) = −1 if we take
it from the left. We therefore say the derivative of |x| is undefined at x = 0.
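[Figure: the complex plane with axes Re(z) and Im(z), showing a step Δz = Δx in the real direction and a step Δz = iΔy in the imaginary direction.]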
Real direction: From the definition of the complex conjugate, f (i + Δx) = −i + Δx, so
the definition of the derivative gives
We can make sense of these numbers by remembering that the complex conjugate
always mirrors over the real axis. If we start at z = i and move to the right, z* moves to
the right with us; that’s why Δf∕Δz = 1 in the real direction. If we start at z = i and
move up, z* moves down; that’s why Δf∕Δz = −1 in the imaginary direction.
The example above shows that z* has no derivative at z = i because the derivatives in the
real and imaginary directions are not the same. If those two derivatives had come out the same,
that still would not prove that the derivative was defined—it might come out different along
a diagonal line, or along a parabola, or along any other path that leads to z = i. A function
is only differentiable if all possible paths give the same answer. This is, in many cases, easier
to prove than it sounds.
The above bit of algebra shows that the derivative of z² is 2z, but it also proves that the
complex function z² is differentiable at all.
The algebra in this case looks identical, step by step, to the same problem with real
numbers. We did not take the limit until the last step, by which point it is clear that as Δz
approaches zero—no matter how it approaches it!—the derivative is 2z. This enabled us to
confidently claim in the Discovery Exercise that f ′ (5) = 10. In Problem 13.18 you’ll try to use
the same formula on z* and discover that it doesn’t work.
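The direction dependence is easy to see numerically. In the sketch below (an illustration with a hypothetical helper name `difference_quotient` and an arbitrarily chosen small step size), the ratio Δf∕Δz is computed for steps in the real, imaginary, and diagonal directions, once for z² at z = 5 and once for the complex conjugate at z = i.

```python
def difference_quotient(f, z, dz):
    """Return (f(z + dz) - f(z)) / dz for a complex step dz."""
    return (f(z + dz) - f(z)) / dz

def square(z):
    return z * z

def conjugate(z):
    return complex(z).conjugate()

h = 1e-6  # small step length, chosen for illustration
steps = {"real": h, "imaginary": h * 1j, "diagonal": h * (1 + 1j)}

if __name__ == "__main__":
    for name, dz in steps.items():
        # z**2: every direction gives a value near 10.
        # conjugate: the value changes with the direction of dz.
        print(name,
              difference_quotient(square, 5, dz),
              difference_quotient(conjugate, 1j, dz))
```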
Analytic Functions
The field of complex analysis is mostly the study of “analytic functions,” which is roughly
synonymous with “differentiable functions.” More precisely, we say a function is “analytic”
at z = z0 if there is some disk centered on z0 in which the function is differentiable. The
mathematical term for such a disk is a “neighborhood” of z0 .
Some sources define “analytic” to mean that a function has a Taylor series that converges
to the value of the function, and use “holomorphic” for a function that is differentiable. For-
tunately there is a theorem proving that the two are equivalent, so this definition of “analytic”
will serve our purposes.
We would love to summarize this even more concisely by saying “practically all functions
are analytic outside of obvious discontinuities,” but that isn’t true. For example the complex
conjugate, a simple and very important function, is continuous everywhere and differen-
tiable nowhere. Fortunately there is a rigorous and easy test for determining if any function
is analytic.
A complex function starts with a complex number z = x + iy and produces a new complex
number f (z) = u + iv. So every complex function is defined by two real functions u(x, y) and
v(x, y). For instance, z² = (x + iy)² = x² + 2ixy − y², so we can describe it as u(x, y) = x² − y²
and v(x, y) = 2xy. The “Cauchy-Riemann equations” determine if f (z) is analytic based on
the properties of its two constituent functions.
This rule is reversible: if u(x, y) and v(x, y) have continuous partial derivatives and satisfy the Cauchy-
Riemann equations in some region, then f (z) is analytic in that region.
You may want to confirm for yourself, as an example, that f(z) = z² satisfies the Cauchy-
Riemann equations.
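A numerical version of that check is sketched here. It assumes the Cauchy-Riemann equations in their standard form, 𝜕u∕𝜕x = 𝜕v∕𝜕y and 𝜕u∕𝜕y = −𝜕v∕𝜕x, and estimates the partial derivatives with central differences; the function name `cauchy_riemann_residuals` and the test point are illustrative choices. Both residuals should be near zero for an analytic function.

```python
def cauchy_riemann_residuals(f, x, y, h=1e-5):
    """Return (u_x - v_y, u_y + v_x) at the point x + iy.

    u and v are the real and imaginary parts of f; the partial derivatives
    are central-difference estimates with step h.
    """
    def u(a, b):
        return f(complex(a, b)).real

    def v(a, b):
        return f(complex(a, b)).imag

    u_x = (u(x + h, y) - u(x - h, y)) / (2 * h)
    u_y = (u(x, y + h) - u(x, y - h)) / (2 * h)
    v_x = (v(x + h, y) - v(x - h, y)) / (2 * h)
    v_y = (v(x, y + h) - v(x, y - h)) / (2 * h)
    return u_x - v_y, u_y + v_x

if __name__ == "__main__":
    print("z**2     :", cauchy_riemann_residuals(lambda z: z * z, 1.3, -0.4))
    print("conjugate:", cauchy_riemann_residuals(lambda z: z.conjugate(), 1.3, -0.4))
```

The conjugate is included for contrast: its first residual comes out near 2 rather than 0, consistent with its failing to be differentiable.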
These equations show that being analytic is actually a very restrictive condition. If you
choose a given function u(x, y), Equations 13.3.1 tell you what 𝜕v∕𝜕x and 𝜕v∕𝜕y must be, and
therefore determine v(x, y) to within an additive constant. If your v function does not happen
to be a perfect match for your u function, the resulting f (z) will not be analytic.
Below we will see another property of analytic functions, easily derivable from the
Cauchy-Riemann conditions: they give us solutions to an important differential equation
called Laplace’s equation.
The above problem is an example of a “Dirichlet problem.” You are given the value of a
function on the boundary of a region and asked to find a solution to a PDE for that function in
the interior of that region, matching the given boundary conditions. The Dirichlet problem
for Laplace’s equation arises, not only in thermodynamics, but in electrostatics (because
electric potential V obeys the same mathematical law as temperature T ) and other fields. In
these cases you are looking for a function that has two properties.
∙ The function T (x, y) obeys Laplace’s equation inside a given region. In two dimensional
Cartesian coordinates Laplace’s equation looks like this.
$$\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} = 0 \qquad \text{Laplace's equation in two dimensions} \tag{13.3.2}$$
Any function that satisfies Laplace’s equation is called “harmonic.”
∙ The function meets certain specified values along the boundary of that region. In our
example above we need T (x, y) = T0 on the inner circle and T (x, y) = 2T0 on the outer.
Such a problem has nothing to do with complex numbers. But one of the most powerful
approaches to such problems begins with a remarkable property of analytic functions.
It’s a relatively short hop from the Cauchy-Riemann conditions to that fact, as you will
demonstrate in Problem 13.31. But here let’s focus on what that fact tells us. We said in
Section 13.3.2 that the function z² is analytic. We also said that its real part is u(x, y) = x² − y²
and its imaginary part is v(x, y) = 2xy. We therefore have a guarantee that the function x² − y²
satisfies Laplace’s equation (13.3.2). We also have a guarantee that the function 2xy satisfies
Laplace’s equation. We can use those facts if we happen to be looking for harmonic functions
in the right regions.
Problem:
Find a function V (x, y) that satisfies Laplace’s equation with boundary conditions
V(x, y) = 1 along the curve x² − y² = 1 and V(x, y) = 4 along the curve x² − y² = 4.
[Figure: the hyperbolas x² − y² = 1 (labeled V = 1) and x² − y² = 4 (labeled V = 4) in the xy-plane.]
Solution:
The function V(x, y) = x² − y² works perfectly. It clearly meets those two boundary
conditions, and we know it is harmonic because it is the real part of the analytic
function z².
Most problems don’t work out quite that easily, of course. But because any composition
of analytic functions is itself an analytic function, you can often stretch your solution to meet
less contrived boundary conditions. This brings us to the problem we started with, the tem-
perature between two concentric circles.
Problem:
Find the steady-state temperature in the region between the circles x² + y² = R² and
x² + y² = 4R² if the temperature on the inner circle is held constant at T = T0 and
the temperature on the outer circle is held constant at T = 2T0 . (Assume the region
is flat and insulated on top and bottom so we can ignore the third dimension.)
[Figure: two concentric circles of radius R and 2R centered on the origin, labeled T = T0 and T = 2T0.]
Solution:
The temperature distribution must obey two rules: it must satisfy Laplace’s equation
inside our region, and it must meet the given conditions on the boundary of our
region.
We said above that ln z is an analytic function (except along its branch cut) and we
said on Page 714 that ln z = ln |z| + i𝜙. That means its real part u = ln(√(x² + y²))
and its imaginary part v = tan⁻¹(y∕x) must both be harmonic functions. The first of
these only depends on distance from the origin, so it is constant along the inner
circle and along the outer circle.
That’s a good start, but to match the boundary conditions we need some arbitrary
constants. Recall that the composition of analytic functions is still analytic, so if we
multiply by a constant, then take the log, and then multiply by another constant, it’s
still analytic: f(z) = A ln(Bz). If we take A and B to be real then the real part of f is
A ln(B|z|) = A ln(B√(x² + y²)). Now we have a harmonic function with two arbitrary
constants that is constant along circles centered on the origin, and we can use the
boundary conditions to find those constants: A ln(BR) = T0, A ln(2BR) = 2T0. A little
algebra gives A = T0∕ln 2, B = 2∕R, so
$$T(x, y) = \frac{T_0}{\ln 2}\ln\left(\frac{2}{R}\sqrt{x^2 + y^2}\right)$$
(The argument of the log is unitless and the coefficient in front has units of
temperature, so the units work.)
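The "little algebra" generalizes to any pair of radii and boundary values, which the sketch below spells out. It is an illustration under assumptions: the helper names are hypothetical, the two boundary values must differ, and the sample numbers T0 = 3 and R = 0.5 are arbitrary. Subtracting the two boundary equations gives A, and substituting back gives B.

```python
import math

def washer_constants(r_in, t_in, r_out, t_out):
    """Solve A*ln(B*r_in) = t_in and A*ln(B*r_out) = t_out for real A and B.

    Subtracting the equations gives A*ln(r_out/r_in) = t_out - t_in.
    Assumes t_out != t_in, so that A is nonzero.
    """
    A = (t_out - t_in) / math.log(r_out / r_in)
    B = math.exp(t_in / A) / r_in
    return A, B

def washer_temperature(x, y, A, B):
    """T(x, y) = A * ln(B * sqrt(x^2 + y^2))."""
    return A * math.log(B * math.hypot(x, y))

if __name__ == "__main__":
    T0, R = 3.0, 0.5  # illustrative values only
    A, B = washer_constants(R, T0, 2 * R, 2 * T0)
    print("A =", A, "expected T0/ln 2 =", T0 / math.log(2))
    print("B =", B, "expected 2/R =", 2 / R)
    print("inner circle:", washer_temperature(R, 0.0, A, B))
    print("outer circle:", washer_temperature(0.0, 2 * R, A, B))
```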
You may well have two objections to this process. First: how do we know that the function
we found is the right one? The answer lies in a uniqueness theorem. Given a closed bounded
region with a function V specified everywhere on the boundary, there is only one function V
that satisfies Laplace’s equation inside the region and meets the boundary condition around
the region. (You will prove this uniqueness theorem in Section 13.11 Problem 13.158, see
felderbooks.com.) So once we find a working solution by any means we know it is the right
solution.
Second: how did we happen to think of using a log, and how are you supposed to think
of the right functions? You have to be familiar with your basic analytic functions (there
aren’t many of them) and their properties. There are three shapes in particular to keep your
eye on.
[Figure: a washer centered on the origin in the xy-plane, with V = V1 and V = V2 on its two bounding circles.]
A “washer” centered on the origin with specified values on the two bounding circles. As we saw in the example above, the solution begins with ln |z| (the real part of the analytic function ln z).
[Figure: a rectangle in the xy-plane with V = V1 and V = V2 on its two vertical edges and insulated boundaries on its two horizontal edges.]
A rectangle with specified values on the two vertical edges, with the horizontal edges insulated (see below). The solution is a linear function of x. (We know that x is a harmonic function since it is the real part of the analytic function z.) The opposite case, horizontal edges specified and vertical edges insulated, works equally well with y (the imaginary part of z).
A common boundary condition is that one or more edges are “insulated.” The solution
function must have a derivative of zero perpendicular to such a boundary. In a wedge for
instance your solution will be a function of 𝜙, and any function of 𝜙 has a zero derivative
perpendicular to a circle around the origin.
The above list is not exhaustive, of course. In our first example above we needed to match
a boundary curve shaped like a hyperbola, which lines up with the real or imaginary parts
of $z^2$. But the three shapes in the above list are the most useful. In Section 13.9 we’ll show
you a technique for matching a somewhat broader set of boundary conditions than you can
find by just guessing at analytic functions. But that technique is based on mapping more
complicated regions to familiar ones, so it is still useful to have a bank of solved regions to
draw on.
Stepping Back
Analytic functions have a number of not-obviously-related properties. If a function f (z) is
analytic…
∙ The derivative f ′ (z) is defined, which is to say, it is the same in all directions.
∙ The Taylor series for f (z) converges to f (z).
∙ The real part u(x, y) and the imaginary part v(x, y) obey the Cauchy-Riemann
conditions.
∙ The real part u(x, y) and the imaginary part v(x, y) are both, separately, solutions to
Laplace’s equation.
If any of the first three criteria are met then f is analytic and you’re guaranteed that all
four are true.
The idea of an analytic function is useful in many contexts, but in this section we have seen
one of the most important: you can easily determine that many basic functions are analytic,
and based on that find solutions to Laplace’s equation.
(g) The problem as stated takes place in an infinitely long region. Now suppose the region is terminated by the curve $\rho = R$, and that this curve is insulated, meaning that $\partial V/\partial\rho = 0$ along this border. Does the same solution still work? If it does not, adjust it properly for this changed scenario. If it does, explain why.

In Problems 13.27–13.29 find the solution $V(x, y)$ to Laplace’s equation in the given region subject to the given boundary conditions.

13.27 $V(x, y) = 0$ on the circle $x^2 + y^2 = R^2$ and $V(x, y) = V_0$ on the circle $x^2 + y^2 = 9R^2$. Solve for V in between them.
13.28 $V = V_U$ along the y-axis for $y > 0$ and $V = V_L$ along the y-axis for $y < 0$. Solve for V in the region $x \ge 0$.
13.29 A rectangle. The bottom at $y = 3$ is at $V_1$ and the top at $y = 5$ is at $V_2$. The left and right sides are insulated, so $\partial V/\partial x = 0$ on the edges (and in fact everywhere). You can answer this even though we didn’t tell you where the right and left edges are.

13.30 The circle $x^2 + y^2 = R^2$ is held at constant potential $V(x, y) = V_0$. The potential in the interior of the circle obeys Laplace’s equation.
(a) We have used logs to solve similar problems. Explain why an answer that involves $\ln\sqrt{x^2 + y^2}$ cannot possibly work for this problem.
(b) Just by examining the problem find another, simpler solution that will work.
13.31 Show that if a function $f(z)$ is analytic, where $z = x + iy$ and $f = u + iv$, then u and v must obey Laplace’s equation (Equation 13.3.2). Hint: any analytic function must obey the Cauchy-Riemann equations.
13.32 In the Explanation (Section 13.3.3) we solved $\nabla^2 T = 0$ in between the circles $x^2 + y^2 = R^2$ with $T = T_0$ and $x^2 + y^2 = 4R^2$ with $T = 2T_0$. It’s also possible to solve that problem directly without complex functions.
(a) This problem is easiest in polar coordinates. Based on symmetry, what should be true about the derivatives of T with respect to $\phi$?
(b) In polar coordinates $\nabla^2 T = \partial^2 T/\partial\rho^2 + (1/\rho)\,\partial T/\partial\rho + (1/\rho^2)\,\partial^2 T/\partial\phi^2$. Using your answer to Part (a), write Laplace’s equation in polar coordinates as an ODE for $T(\rho)$.
(c) Find the general solution to that ODE.
(d) Plug in the boundary conditions to find the specific solution that we found in the explanation.
13.33 The function $f(z) = az + b$ with a and b real is analytic. (It would still be analytic if a and b weren’t real, but choosing them to be real will be more useful for our purposes.)
(a) Since $f(z)$ is analytic, $u(x, y) = \mathrm{Re}(f)$ is harmonic. On what curves is u constant?
(b) If we used u as the solution to an electric potential problem, what would the boundary conditions for that problem be?
(c) Specify some particular boundary conditions for the electric potential of the type you described in Part (b). Define curves and give the value of the potential on those curves. You may use numbers or letters. Then use $u(x, y)$ to solve Laplace’s equation with the boundary conditions you specified.
13.34 On Page 722 we used the real part of the analytic function $f(z) = \ln z$ to solve Laplace’s equation in a region between two concentric circles. Using the same technique, write a problem about steady-state temperature that can be solved using the imaginary part of $z^2$. Solve the problem you wrote.
13.35 If the number z has a modulus of $|z|$ and a phase of $\phi$ then the function $\sqrt{z}$ has a modulus of $\sqrt{|z|}$ and a phase of $\phi/2$.
(a) Find the real and imaginary parts of $\sqrt{z}$.
(b) What shape is described by the level curves of the real part of this function? (This tells us what kind of region might lead us to use this function in a Laplace’s equation problem.)
Our organization of this topic is one that we generally avoid, but it seemed cleanest in this
case: we present the pure math here, and a few practical applications in Section 13.5.
The integral of a complex function is in some respects like a line integral. As with a line
integral, you begin with a function to integrate and a path to integrate along. As with a line
integral, the path matters: as you’ll show in Problem 13.50, the integral of f (z) = Im(z) from
z = 0 to z = 1 + i can take different values depending on the path you take between them. A
specific path through the complex plane is called a “contour,”3 and the integral along such
a path is a “contour integral.”
But there are important differences too. When you evaluate the line integral of a scalar
function you multiply the value of the function at each point by ds, the arclength. When
you evaluate the line integral of a vector function you dot the function with d s⃗, or multiply
the along-the-path component of the function by ds. In either case the end result is a scalar
value. By contrast, a contour integral multiplies the value of the function by dz, just as we saw
with derivatives, and the final result is a complex number.
To take a concrete example, the line integral of f (x, y) = 1 along any curve adds up all
the differential arclengths, so it gives you the total arclength of the curve. The integral of
f (z) = 1 along any contour adds up all the differential changes in z, so it gives you $z_{\text{end}} - z_{\text{start}}$.
The integral of f (z) can be defined by a Riemann sum. We break the contour into N
segments at points $z_0, z_1, \ldots, z_N$, calculate $\sum_{k=0}^{N-1} f(z_k)(z_{k+1} - z_k)$, and then take the limit as the
points get arbitrarily close together.
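The Riemann-sum definition translates almost directly into code. The sketch below is illustrative only: the helper names `riemann_contour_sum` and `segment` are hypothetical, and the number of points is an arbitrary choice. It sums f(z_k)(z_{k+1} − z_k) along piecewise-straight paths, first for f(z) = 1 and then for f(z) = Im(z) along two different paths from 0 to 1 + i.

```python
def riemann_contour_sum(f, points):
    """Sum f(z_k) * (z_{k+1} - z_k) over consecutive points z_0, ..., z_N."""
    return sum(f(points[k]) * (points[k + 1] - points[k]) for k in range(len(points) - 1))

def segment(a, b, n=2000):
    """n + 1 equally spaced points on the straight line from a to b."""
    return [a + (b - a) * k / n for k in range(n + 1)]

# Two piecewise-straight paths from 0 to 1 + i.
path_via_1 = segment(0, 1) + segment(1, 1 + 1j)[1:]
path_via_i = segment(0, 1j) + segment(1j, 1 + 1j)[1:]

# f(z) = 1: the sum telescopes to z_end - z_start whatever the path.
one = riemann_contour_sum(lambda z: 1, path_via_1)

# f(z) = Im(z): the two paths give different values.
via_1 = riemann_contour_sum(lambda z: z.imag, path_via_1)
via_i = riemann_contour_sum(lambda z: z.imag, path_via_i)
print(one, via_1, via_i)
```

Increasing `n` moves the sums toward the limiting values of the integrals; the difference between the two paths for Im(z) does not shrink.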
Now, suppose f (z) is entire. This implies that its antiderivative F (z) is also entire.4 When we
evaluate $\int f(z)\,dz$, each step along the contour multiplies $f(z) = dF/dz$ times $dz$, giving us the
small change $dF$. Adding up those changes along the full contour gives us $F(z_{\text{end}}) - F(z_{\text{start}})$.
So the fundamental theorem applies to analytic complex functions much as it does to real
functions.
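As a small worked illustration of that statement (the function $z^2$ and the endpoints 0 and 1 + i are chosen here only as an example, not taken from the text), the antiderivative gives the integral directly, and an explicit parametrization of the straight path $z = (1+i)t$ agrees with it:

```latex
\int_{0}^{1+i} z^2\,dz = \frac{(1+i)^3}{3} - \frac{0^3}{3} = \frac{-2+2i}{3},
\qquad
\int_{0}^{1} \big[(1+i)t\big]^2 (1+i)\,dt = (1+i)^3 \int_0^1 t^2\,dt = \frac{(1+i)^3}{3}.
```

Because $z^2$ is entire, any other contour between the same two endpoints would give the same value.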
You might be surprised to learn that we’re not going to use that result very much. “Sup-
pose f (z) is entire”—no branch cuts, no poles, everywhere differentiable—turns out to be a
relatively uninteresting special case. There is a more general formula for integrating a com-
plex function along a parameterized contour, but we’re not going to talk about that either.
Instead, we’re going to focus on two special cases.
1. Integrals along a closed contour. A contour is “closed” if it ends where it began. Most of
our applications will apply to integrals around closed contours, because—as we will see
below—they are especially easy to evaluate. We define the positive direction around
a closed curve to be the one where the interior of the curve is always on your left as
you move around. (For simple closed curves this is equivalent to saying you go around
counterclockwise.)
2. Integrals along the real axis. When you evaluate a contour integral of a function f (z)
along the real axis from z = a to z = b and f (z) happens to be real along that part
of the axis, the contour integral reduces to the real integral we are used to taking.
In the next section we’ll use that fact to solve real integrals by evaluating complex
integrals.
3 More precisely, a contour is a directed curve that can be defined by a parametrization z(t) that is piecewise contin-
uous and differentiable and that is one-to-one with the possible exception of ending at the same point it started.
For most purposes you can just think of it as a curve with a specified direction that you move along.
4 That leap isn’t obvious; it’s another wonderful property of analytic functions.
\[
\oint_C f(z)\,dz = \oint_C \frac{g(z)}{z - z_0}\,dz = 2\pi i\, g(z_0) \qquad (13.4.1)
\]
Be careful when applying Equation 13.4.1: the bottom must be z − z0 for some constant z0 . If the
bottom were 2z − 7 + i, for instance, you would have to divide the top and bottom by 2 before
applying this formula.
It’s easy to see that the integral of 1∕z around any loop that encloses the origin is a special
case of Cauchy’s integral formula, where g (z) = 1 and z0 = 0. What isn’t quite as obvious is
that any such integral can be viewed as a variation of that case, so when you prove the integral
for 1∕z (Problem 13.53) you have done most of the work to prove Cauchy’s formula.
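Slightly more complicated formulas apply when there are higher powers in the denominator. Given the same conditions stated above,
\[
\oint_C \frac{g(z)}{(z - z_0)^2}\,dz = 2\pi i\, g'(z_0), \qquad \oint_C \frac{g(z)}{(z - z_0)^3}\,dz = \pi i\, g''(z_0)
\]
and so on. These are examples of the “generalized Cauchy’s integral formula”: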
\[
\oint_C \frac{g(z)}{(z - z_0)^n}\,dz = \frac{2\pi i}{(n-1)!}\left(\frac{d^{n-1} g}{dz^{n-1}}\right)(z_0)
\]
Read the last part of this formula as “the (n − 1)th derivative of g (z) evaluated at z = z0 .”
Before showing you examples of how to use these formulas, we should note that they are
often presented as formulas for the value of some analytic function g at an arbitrary point
inside the contour. For instance, Equation 13.4.1 is often written g (z0 ) = 1∕(2𝜋i) ∮ g (z)∕(z −
z0 ) dz. While this is clearly the same formula, it is written to say that the values of g (z) inside
a region can be determined from its values on the boundary. That’s a fascinating fact about
analytic functions, but we will be using Cauchy’s formulas to calculate contour integrals, not
to find values of g (z) at interior points.
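[Figure: two closed contours, labeled A and B, in the complex plane. The real axis is marked from −1 to 4 and the imaginary axis from −i to i.]
Question: Evaluate $\oint e^{z^2}\,dz$ around each of the two contours shown above.
Answer:
The function $e^{z^2}$ is analytic everywhere (can you see how we know that?) so the
integral around either contour is zero.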
Question: Evaluate ∮ (cos z)∕(z − 𝜋)dz around each of the two contours shown above.
Answer:
The function cos z is analytic everywhere, so we can use Cauchy’s integral formula.
Since contour A doesn’t enclose the point z = 𝜋 the integral around it is zero. Since
contour B does enclose z = 𝜋, the integral around it is 2𝜋i cos 𝜋 = −2𝜋i.
Question: Evaluate ∮ 1∕[(2z − 1)(z − 2)]dz around each of the two contours shown
above.
Answer:
The integrand has two poles, at $z = 1/2$ and $z = 2$. Contour B encloses the second pole. For that contour $g(z) = 1/(2z - 1)$, which is analytic everywhere on and inside B. Since $g(2) = 1/3$, the integral around B is $(2/3)\pi i$. Contour A encloses the pole at $z = 1/2$, but we can’t use Cauchy’s integral formula until we write the integrand in the form $g(z)/(z - z_0)$, which means we have to factor a 2 out of $2z - 1$:
\[
f(z) = \frac{1}{2(z - 2)}\,\frac{1}{z - 1/2} \;\rightarrow\; g(z) = \frac{1}{2(z - 2)} \;\rightarrow\; g(1/2) = -\frac{1}{3}
\]
so the integral around A is $2\pi i\, g(1/2) = -(2/3)\pi i$.
surround the point $z = 3 + i\pi/4$. Looking at the plot, the point $3 + i\pi/4 \approx 3 + 0.8i$ is
inside contour B. To find the integral around B we take the second derivative of $e^{2z}$,
which is $4e^{2z}$, and evaluate it at $z = 3 + i\pi/4$ to get $4e^6 e^{i\pi/2} = 4ie^6$. Plugging this into
the generalized Cauchy’s integral formula we get $(2\pi i/2!)(4ie^6) = -4\pi e^6$.
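The values in these examples can be checked by brute-force numerical integration. The sketch below rests on assumptions: the exact shapes of contours A and B are not recoverable from the text, so two circles are used as hypothetical stand-ins that enclose the same points; the last integrand, $e^{2z}/(z - z_0)^3$, is inferred from the use of the second derivative of $e^{2z}$; and the helper name `circle_integral` is made up for this illustration.

```python
import cmath
import math

def circle_integral(f, center, radius, n=20000):
    """Midpoint-rule estimate of the counterclockwise integral of f around a circle."""
    total = 0j
    for k in range(n):
        z0 = center + radius * cmath.exp(2j * math.pi * k / n)
        z1 = center + radius * cmath.exp(2j * math.pi * (k + 1) / n)
        total += f((z0 + z1) / 2) * (z1 - z0)
    return total

# Hypothetical stand-ins for the two contours (assumed shapes, not the book's):
A = (0, 1.0)   # radius 1 about z = 0: encloses 1/2, but not 2, pi, or 3 + i*pi/4
B = (3, 1.5)   # radius 1.5 about z = 3: encloses 2, pi, and 3 + i*pi/4, but not 1/2

entire_A = circle_integral(lambda z: cmath.exp(z * z), *A)
cos_A = circle_integral(lambda z: cmath.cos(z) / (z - math.pi), *A)
cos_B = circle_integral(lambda z: cmath.cos(z) / (z - math.pi), *B)
two_poles_A = circle_integral(lambda z: 1 / ((2 * z - 1) * (z - 2)), *A)
two_poles_B = circle_integral(lambda z: 1 / ((2 * z - 1) * (z - 2)), *B)
z0 = 3 + 1j * math.pi / 4
cubic_B = circle_integral(lambda z: cmath.exp(2 * z) / (z - z0) ** 3, *B)
print(entire_A, cos_A, cos_B, two_poles_A, two_poles_B, cubic_B)
```

Under those assumptions the printed values should agree, up to numerical error, with 0, 0, $-2\pi i$, $-(2/3)\pi i$, $(2/3)\pi i$, and $-4\pi e^6$ respectively.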
With that terminology we can state Cauchy’s integral formula more concisely than we did
above. The formulation below is also more general, because it can handle a contour that
encloses multiple poles.
See if you can convince yourself by means of a few drawings that the one-pole result implies
the many-pole result.
We’ve defined the terms “pole” and “residue” here using functions of the form $g(z)/(z - z_0)^n$
where g is analytic and non-zero. When we discuss Laurent series in Section 13.7
we’ll give more general definitions and techniques for finding poles and residues for functions
that aren’t of this form, but these definitions are useful in most practical cases.
[Figure: an open semicircular contour C of radius R in the upper half-plane, with its ends at −R and R on the real axis.]
Let C be the open semicircular contour shown above.
Fact 1: If $p_1(z)$ and $p_2(z)$ are polynomials and $p_2$ is at least two orders higher than $p_1$, then:
\[
\lim_{R\to\infty} \int_C \frac{p_1(z)}{p_2(z)}\,dz = 0 \qquad (13.4.2)
\]
This result also holds if C is any arc or a complete circle about the origin (still with R → ∞), but the semicircular case is the one we’ll make use of.
Fact 2: If $p_1(z)$ and $p_2(z)$ are polynomials and $p_2$ is at least one order higher than $p_1$, then:
\[
\lim_{R\to\infty} \int_C \frac{p_1(z)}{p_2(z)}\,e^{iz}\,dz = 0 \qquad (13.4.3)
\]
This result only holds for the semicircle in the upper half-plane (or any portion of it) because $e^{iz}$ decays in the positive imaginary direction, whereas it oscillates in the real direction and grows in the negative imaginary direction. More generally, $e^{iz}$, $e^{-iz}$, $e^{z}$, or $e^{-z}$ each decays in one direction. For example, if we use $e^{z}$ the semicircle has to go in the left half-plane.
We can justify Equation 13.4.2 in a rough way by noting that the modulus of the integrand
falls like 1∕R 2 (or faster) and the length of the contour is 𝜋R, so as R → ∞ the integral
must approach zero. In Problem 13.55 you’ll fill in some of the steps in that argument. In
Problem 13.56 you’ll extend the argument to Equation 13.4.3 (see footnote 5).
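The rough argument above can also be watched numerically. The following sketch is an illustration under assumptions: the two integrands, $1/(z^2+1)$ and $e^{iz}/z$, are simple examples chosen here, and the helper name `arc_integral` is hypothetical. It integrates each along the upper semicircle for growing R and records the modulus of the result.

```python
import cmath
import math

def arc_integral(f, R, n=20000):
    """Midpoint-rule estimate of the integral of f along the upper semicircle from R to -R."""
    total = 0j
    for k in range(n):
        z0 = R * cmath.exp(1j * math.pi * k / n)
        z1 = R * cmath.exp(1j * math.pi * (k + 1) / n)
        total += f((z0 + z1) / 2) * (z1 - z0)
    return total

radii = (10, 100, 1000)
# Denominator two orders higher than the numerator (the situation of Equation 13.4.2).
fact1 = [abs(arc_integral(lambda z: 1 / (z * z + 1), R)) for R in radii]
# Denominator one order higher, multiplied by e^{iz} (the situation of Equation 13.4.3).
fact2 = [abs(arc_integral(lambda z: cmath.exp(1j * z) / z, R)) for R in radii]
print(fact1)
print(fact2)
```

Both lists should shrink as R grows. This is a numerical illustration of the limits, not a proof.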
5 For a more complete derivation of Equations 13.4.2–13.4.3 see any standard book on complex analysis, e.g. “Visual
(c) Which poles are enclosed by the contour C?
(d) Find the residue for each pole enclosed by C.
(e) What is $\oint_C f(z)\,dz$?
(f) Curve $C_1$ is the circle with the same center as C, but with radius 1. Evaluate $\oint_{C_1} f(z)\,dz$.
(g) Curve $C_3$ is the circle with the same center as C, but with radius 3. Evaluate $\oint_{C_3} f(z)\,dz$.

For Problems 13.38–13.49 evaluate $\oint_C f(z)\,dz$.

13.38 $f(z) = 1/(z - i - 3)$, C is the circle of radius 10 centered on the origin.
13.39 $f(z) = 1/(2z - 3 - i)$, C is the square with corners at $z = -1 - i$ and $z = 1 + i$.
13.40 $f(z) = 1/[z(2z - i)(z + 2)]$, C is the square with corners at $z = -1 - i$ and $z = 1 + i$.
13.41 $f(z) = (z + 1)/(z^2 - 1)$, C is the circle of radius 3 centered on the origin.
13.42 $f(z) = (\cos z)/z^2$, C is the square with corners at $z = -1 - i$ and $z = 1 + i$.
13.43 $f(z) = e^{z^3}$, C is the ellipse $x^2 + 2y^2 = 1$ where $x = \mathrm{Re}(z)$, $y = \mathrm{Im}(z)$.
13.44 $f(z) = e^{z} \sum_{n=-3}^{3} \frac{1}{z - in}$, C is the circle of radius 1 centered on the point $z = (3/2)i$.
13.45 $f(z) = (\ln z)/(z^2 - 3zi)$, C is the rectangle with corners at $z = -1 + i$ and $z = 1 + 10i$.
13.46 $f(z) = (\cos z)/(2z^2 - z)$, C is the circle of radius 10 centered on the origin.
13.47 $f(z) = 1/(2z^2 + z(1 - 2i) - i)$, C is the circle of radius 10 centered on the origin.
13.48 $f(z)$ is a branch of ln z defined with the branch cut on the positive real axis. C is the circle of radius 1 centered on the point $z = -3$.
13.49 $f(z) = (\sin z)/(z^2 + 2iz - 1)^3$, C is the circle of radius 2 centered on $z = -i$.

13.50 In this problem you will show that $\int \mathrm{Im}(z)\,dz$ from $z = 0$ to $z = 1 + i$ is path-dependent.
(a) First, evaluate the integral along the path 0 → 1 → 1 + i. Draw this path in the complex plane, integrate Im(z) along each of the two line segments that make it up, and add the results.
(b) Similarly, draw the path 0 → i → 1 + i and evaluate $\int \mathrm{Im}(z)\,dz$ along it. You should get a different answer than you got in Part (a).
13.51 Argue, using the definition of a contour integral, that the contour integral of a constant function around any closed loop must be zero. (Don’t say “because it’s analytic.” Yes, we told you this is true of any analytic function, but don’t assume that going in. Instead, base your answer on the fact that the contour integral is defined using Δz, not arclength.)
13.52 In this problem you’re going to evaluate $\oint \tan z\,dz$ around the circle of radius $\pi/2$ centered on $z = \pi/2$. The function has a pole at $z = \pi/2$, but is it a simple pole and what is its residue? Try answering those questions yourself and then look at the trick we describe below.
(a) One trick for finding simple poles and their residues is to evaluate $\lim_{z\to z_0} (z - z_0) f(z)$. If $f(z_0)$ is undefined, but this limit is finite and non-zero, then f has a simple pole at $z = z_0$ and the limit equals its residue. Use the trick introduced in this problem to show that $f(z) = \tan z$ has a simple pole at $z = \pi/2$ and find its residue.
(b) Evaluate $\oint \tan z\,dz$ around the circle described above.
13.53 In this problem you will prove Cauchy’s integral formula for the simple case of a contour around the unit circle and the function $f(z) = 1/z$. (It’s not hard to generalize this proof to circles of other radii or centers other than the origin.)
(a) The unit circle can be defined as all the points $z = e^{i\phi}$ for $0 \le \phi < 2\pi$. Write a Riemann sum for $\oint f(z)\,dz$ using steps of size $\Delta\phi$. Your answer should be a sum involving nothing but numbers and the variables $\phi$ and $\Delta\phi$. Simplify your answer as much as possible.
(b) You should have found that the “summand” (the quantity you are summing) didn’t depend on $\phi$. Given that, you can evaluate the sum by simply multiplying the summand by the number of steps, $2\pi/\Delta\phi$. Evaluate $\oint f(z)\,dz$ as the limit as $\Delta\phi \to 0$ of the resulting quantity.
13.54 In this problem you will evaluate $\int (1/z)\,dz$ along a semicircle of radius R around the origin.
(a) Argue from symmetry that $\int (1/z)\,dz$ around a semicircle in the upper half-plane equals $\int (1/z)\,dz$ around a semicircle in the lower half-plane. Hint: consider the modulus and phase of z and dz at different points around the circle.
(b) What is $\oint (1/z)\,dz$ around the entire circle of radius R around the origin?
(c) Using your answers to Parts (a)–(b), what is $\int (1/z)\,dz$ around a semicircle of radius R centered on the origin in the upper half-plane?
13.55 In this problem you will derive Equation 13.4.2 for the special case where $p_1(z) = z^m$ and $p_2(z) = z^n$. Recall that C is a semicircle of radius R around the origin.
(a) Along the contour C, what is $|p_1/p_2|$? Your answer should be in terms of R.
(b) What is L, the length of the contour?
(c) For any function $f(z)$, if $|f(z)| \le M$ everywhere along contour C then $\left|\int_C f(z)\,dz\right| \le ML$, where L is the length of C. (You should be able to convince yourself of this by thinking of the integral as a Riemann sum.) Use this result and what you’ve derived in this problem to prove that $\lim_{R\to\infty} \int_C (p_1/p_2)\,dz = 0$.

13.56 Exploration: Deriving Equation 13.4.3. Despite the name of this problem, you will only derive Equation 13.4.3 for the special case $p_1/p_2 = 1/z$, so you’ll evaluate $I = \lim_{R\to\infty} \int (e^{iz}/z)\,dz$. It’s not hard to generalize this argument to any rational expression $p_1/p_2$ where the order of the denominator is at least one higher than the order of the numerator.
(a) For points on the contour C, $z = Re^{i\phi}$ where $\phi$, the phase of z, goes from 0 to $\pi$. Rewrite I as an integral over $\phi$. (This will include finding dz in terms of $d\phi$.)
(b) Using Euler’s formula to rewrite $e^{i\phi}$ in terms of sines and cosines, simplify your answer to Part (a) as much as possible.
(c) To find an upper bound on $|I|$, take the modulus of the integrand. We’ll call the resulting integral J (just because it comes after I).
You’ve now identified a real integral J whose value is an upper bound on $|I|$, so the goal from here is to show that $\lim_{R\to\infty} J = 0$. Unfortunately, J still isn’t an integral you can evaluate, so now you’re going to find an upper bound for J, and then prove that that integral approaches zero. To make life easier you’ll rewrite J as 2 times the integral from 0 to $\pi/2$. (You should be able to convince yourself that this is valid if you have the correct form for J.)
(d) Your formula for J should involve $\sin\phi$. Draw a plot of $y = \sin\phi$ from $\phi = 0$ to $\phi = \pi/2$ and on the same plot draw a line $y(\phi)$ from $(0, 0)$ to $(\pi/2, 1)$. You can see that the line is always below the sine curve. Find the equation of that line. Argue that replacing $\sin\phi$ with that linear function of $\phi$ in your integral increases the value of the integral. Let’s call this new integral K.
(e) Evaluate the integral K.
(f) Show that $\lim_{R\to\infty} K = 0$. Since $K > J > |I|$, this proves that $\lim_{R\to\infty} I = 0$. (If the modulus of a complex number approaches 0 then the number itself must approach 0.)
As we stressed in Chapter 3, complex numbers are a shortcut for getting from a real question
to a real answer. In this section we will see two such uses for contour integrals. First we will see
some integrals of real functions, on the real number line, that can most easily be evaluated
on the complex plane. Then we will look at the Laplace transform—a real transformation
that turns one real function into another real function, but that can only be inverted by a
contour integral.
is said to “converge” (the limit exists) or “diverge” (the limit does not exist). The following
examples both converge.
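\[
\int_1^{\infty} \frac{1}{x^2}\,dx = \lim_{a\to\infty} \int_1^{a} \frac{1}{x^2}\,dx = \lim_{a\to\infty} \left.-\frac{1}{x}\right|_1^{a} = \lim_{a\to\infty} \left(-\frac{1}{a} + 1\right) = 1
\]
\[
\int_0^{1} \ln x\,dx = \lim_{a\to 0^+} \int_a^{1} \ln x\,dx = \lim_{a\to 0^+} \Big.(x\ln x - x)\Big|_a^{1} = \lim_{a\to 0^+} (-1 - a\ln a + a) = -1
\]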
That last step requires a bit of algebra followed by l’Hôpital’s rule, but the important part
here is the first step in which the improper integral is rewritten as a limit.
The integral in Figure 13.8 is less clear. You have to break up the limit at x = 0 (the vertical
asymptote) and evaluate the two sides separately, each with a limit. But do you evaluate the
limit before you add the two pieces, or after?
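\[
\int_{-1}^{1} \frac{1}{x^3}\,dx = \lim_{a\to 0^-} \int_{-1}^{a} \frac{1}{x^3}\,dx + \lim_{b\to 0^+} \int_{b}^{1} \frac{1}{x^3}\,dx \qquad \text{the standard definition (limits before addition)} \qquad (13.5.1)
\]
\[
\int_{-1}^{1} \frac{1}{x^3}\,dx = \lim_{c\to 0^+} \left( \int_{-1}^{-c} \frac{1}{x^3}\,dx + \int_{c}^{1} \frac{1}{x^3}\,dx \right) \qquad \text{the Cauchy principal value (addition before limits)} \qquad (13.5.2)
\]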
FIGURE 13.8 The integral $\int_{-1}^{1} dx/x^3$ is divergent by the standard definition, but its Cauchy principal value is not.
In Problem 13.57 you will show that this particular integral diverges under the first definition but converges under the second. In effect the Cauchy principal value allows the infinite negative value on the left to cancel the infinite positive value on the right. In Problem 13.58 you will show that $\int_{-\infty}^{\infty} x\,dx$ leads to the same two answers.
We don’t want to overstate this case. Sometimes both the standard and Cauchy definitions lead to convergence, and sometimes both lead to divergence. But we are going to demonstrate techniques for using contour integrals to evaluate real integrals, and what we’re going to find (especially once we start integrating around poles in Section 13.6, see felderbooks.com) will be the Cauchy principal value. If you have learned the standard definition, or if you ask your favorite software for these integrals, you may find that our answers don’t match. Now you know why.
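The difference between the two definitions shows up clearly in numbers. The sketch below is only an illustration: the function names are hypothetical, and the “lopsided” cutoff of 2c on the left is an arbitrary choice meant to mimic limits taken independently. It uses the exact antiderivative $-1/(2x^2)$ of $1/x^3$ on each side of the asymptote.

```python
def integral_of_inverse_cube(lo, hi):
    """Exact integral of 1/x^3 from lo to hi (both on the same side of 0)."""
    return -1 / (2 * hi**2) + 1 / (2 * lo**2)

def symmetric_cutoff(c):
    """Cut out (-c, c) and add the two sides first, as in the Cauchy principal value."""
    return integral_of_inverse_cube(-1, -c) + integral_of_inverse_cube(c, 1)

def lopsided_cutoff(c):
    """Cut out (-2c, c): the two sides approach 0 at different rates."""
    return integral_of_inverse_cube(-1, -2 * c) + integral_of_inverse_cube(c, 1)

for c in (1e-1, 1e-2, 1e-3):
    print(c, symmetric_cutoff(c), lopsided_cutoff(c))
```

The symmetric cutoff stays at zero as c shrinks, while the lopsided one grows without bound, which is why the answer depends on whether the limits are taken before or after adding.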
\[
\int_C \frac{1}{z^2 + 1}\,dz \quad \text{where C is the real axis traversed from } -\infty \text{ to } \infty
\]
FIGURE 13.9 A semicircular contour. The gray circles at ±i are poles of f (z).
As we have seen, the easiest contour integrals to evaluate are around closed loops. So we draw a closed loop, a semicircular contour of radius R centered on the origin (Figure 13.9). Now, that loop is the sum of two parts: a curve (that we don’t care about) and a line segment (that approaches our line C as R → ∞). If we can determine the integral around the entire closed loop and the integral around the curved part, we can subtract them to get the integral we’re looking for.
The integral around the curved part sounds like the hardest step, but remember that R is approaching ∞. In that limit the curved part of the integral is 0 by Equation 13.4.2.
So now let’s tackle the closed loop. The function f is analytic
everywhere except at the two poles z = i (enclosed) and z = −i (not
enclosed), so ∮ f (z)dz is 2𝜋i times the residue at z = i. Rewriting f as 1∕[(z − i)(z + i)] we see
that the residue at z = i is 1∕(2i). So the integral around the closed loop, and thus the inte-
gral along the real axis, and thus the real integral we started with, is 2𝜋i∕(2i) = 𝜋, just as we
expected it to be.
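The bookkeeping in this argument (closed loop = line segment + curved part) can be checked at a finite radius. This is a sketch under assumptions: R = 50 is an arbitrary choice, and `path_integral` is a hypothetical helper that applies the midpoint rule along a parametrized path.

```python
import cmath
import math

def path_integral(f, z_of_t, n=20000):
    """Midpoint-rule estimate of the integral of f along the path z(t), 0 <= t <= 1."""
    total = 0j
    for k in range(n):
        z0, z1 = z_of_t(k / n), z_of_t((k + 1) / n)
        total += f((z0 + z1) / 2) * (z1 - z0)
    return total

def f(z):
    return 1 / (z * z + 1)

R = 50.0  # arbitrary finite radius, chosen only for illustration
line = path_integral(f, lambda t: -R + 2 * R * t)                 # real axis from -R to R
arc = path_integral(f, lambda t: R * cmath.exp(1j * math.pi * t))  # semicircle from R to -R
loop = line + arc
print(line, arc, loop)
```

The closed-loop value should already equal $\pi$ at this radius, since the loop encloses the pole at i; the line segment alone falls short of $\pi$ by the small contribution of the arc, which shrinks as R grows.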
Question: Evaluate $\int_{-\infty}^{\infty} (x^2 + 1)/(x^4 + 4)\,dx$.
Answer:
As we did above, we consider this real integral to be one leg of a semicircular contour
integral $\oint f(z)\,dz$, where $f(z) = (z^2 + 1)/(z^4 + 4)$. Once again, in the limit where the
radius of the semicircle approaches ∞, the modulus of the integrand along the curved
part decreases like $1/R^2$ and the length grows like R, so the curved part of the
contour integral vanishes by Equation 13.4.2 and the real integral equals the contour
integral.
[Figure: the semicircular contour C of radius R in the upper half of the complex plane.]
\[
f(z) = \frac{z^2 + 1}{(z - (1 + i))(z - (-1 + i))(z - (-1 - i))(z - (1 - i))}
\]
\[
\frac{(1 + i)^2 + 1}{((1 + i) - (-1 + i))\,((1 + i) - (-1 - i))\,((1 + i) - (1 - i))}
\]
After a bit of algebra this can be simplified to 1∕16 − (3∕16)i. The residue at −1 + i
can similarly be calculated to be −1∕16 − (3∕16)i, so the sum of the residues is
−(3∕8)i. The final answer is 2𝜋i times the sum of the residues, or (3∕4)𝜋.
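The residue arithmetic is easy to reproduce with complex numbers in code. The sketch below (the names `poles` and `residue` are hypothetical) evaluates the numerator at each pole in the upper half-plane and divides by the product of the remaining factors of the denominator, exactly as in the factored form above.

```python
import math

poles = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]  # the four roots of z^4 + 4 = 0

def residue(p):
    """Residue of (z^2 + 1) / (z^4 + 4) at the simple pole p, from the factored denominator."""
    denominator = 1
    for q in poles:
        if q != p:
            denominator *= p - q
    return (p * p + 1) / denominator

enclosed = [p for p in poles if p.imag > 0]   # poles inside the upper semicircle
total = sum(residue(p) for p in enclosed)
integral = 2j * math.pi * total
print(residue(1 + 1j), residue(-1 + 1j), integral)
```

The two residues should come out as $1/16 - (3/16)i$ and $-1/16 - (3/16)i$, and the final value as $(3/4)\pi$ with no imaginary part.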
This technique of using an infinitely large semicircular contour to evaluate a real integral
works when the integral goes from −∞ to ∞ and the function f (z) falls off quickly enough
at large z that the curved part of the integral goes to zero.
As a slightly more complicated example, consider the integral $\int_{-\infty}^{\infty} (\cos x)/(x^2 + 1)\,dx$. We
could try the same thing we did above, extending this into an infinitely large semicircular
contour in the complex plane. The problem is that cos z blows up as z moves infinitely up
the imaginary axis. To see why this is true, and how to get around the problem, we can
use Euler’s formula to rewrite cos z as $(1/2)(e^{iz} + e^{-iz})$. (If that formula isn’t familiar, take a
minute to expand those exponentials using Euler’s formula and prove that this equals cos z.)
As $z \to i\infty$ the first exponential decays but the second one approaches ∞. That fact shows
why we have a problem, but it also suggests a solution. We can rewrite our original integral
as a sum of two integrals.
\[
\int_{-\infty}^{\infty} \frac{\cos x}{x^2 + 1}\,dx = \frac{1}{2}\left[ \int_{-\infty}^{\infty} \frac{e^{ix}}{x^2 + 1}\,dx + \int_{-\infty}^{\infty} \frac{e^{-ix}}{x^2 + 1}\,dx \right]
\]
FIGURE 13.10 One contour is in the upper half-plane and the other in the lower half-plane. The gray circles show the poles at ±i.
We can evaluate the first integral by closing the contour in the upper half of the complex plane, and evaluate the second one by closing it off in the lower half. Each of these contour integrals goes to zero along the curved part by Equation 13.4.3, so the integral along the real axis equals the entire contour integral. If we rewrite the first integrand as $e^{iz}/[(z - i)(z + i)]$ we immediately see that its residue at $z = i$ is $e^{-1}/(2i)$, so the contour integral equals $\pi/e$.
The second integral proceeds similarly, but there’s a subtlety we have to be careful about. Rewriting the integrand as $e^{-iz}/[(z - i)(z + i)]$ we see that the residue at $z = -i$ is $-e^{-1}/(2i)$. However, from Figure 13.10 we can see that this contour goes around clockwise. That means the contour integral equals $-2\pi i$ times the residue, so once again it gives us $\pi/e$. Putting these together the final answer is $(1/2)(\pi/e + \pi/e) = \pi/e$.
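Both the sign bookkeeping and the final value can be cross-checked in a few lines. This is a sketch under assumptions: the cutoff of ±200 and the number of steps in the direct quadrature are arbitrary choices, and `midpoint_rule` is a hypothetical helper. The residue part mirrors the steps above; the quadrature part integrates the real integrand directly over a long but finite interval.

```python
import cmath
import math

pole_up, pole_down = 1j, -1j

# Residue of e^{iz} / ((z - i)(z + i)) at z = i, and of e^{-iz} / ((z - i)(z + i)) at z = -i.
res_up = cmath.exp(1j * pole_up) / (pole_up - pole_down)
res_down = cmath.exp(-1j * pole_down) / (pole_down - pole_up)

first = 2j * math.pi * res_up       # counterclockwise loop in the upper half-plane
second = -2j * math.pi * res_down   # clockwise loop in the lower half-plane
from_residues = 0.5 * (first + second)

def midpoint_rule(g, a, b, n):
    """Midpoint-rule estimate of the real integral of g from a to b."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

# Direct quadrature on a truncated interval; the neglected tails are small but not zero.
numeric = midpoint_rule(lambda x: math.cos(x) / (x * x + 1), -200.0, 200.0, 200000)
print(from_residues, numeric, math.pi / math.e)
```

The residue result should match $\pi/e$ essentially exactly, and the truncated quadrature should agree with it to a few decimal places.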
\[
f(t) = \frac{1}{2\pi i} \int_{a - i\infty}^{a + i\infty} F(s)\,e^{st}\,ds \qquad \text{“the Bromwich integral”} \qquad (13.5.3)
\]
The integral is evaluated along a vertical straight line in the complex plane. You can choose any
value for the real number a so long as it pushes this vertical line to the right of all the poles of F (s).
Problems involving Laplace transforms typically take t = 0 as the initial time, so only pos-
itive values of t matter. In Problem 13.82 you’ll show that this formula always gives f (t) = 0
for t < 0.
Below we give an example of how to use this formula. In Section 13.5.2 we will show where
the formula comes from.
Question: Find the function f (t) whose Laplace transform is F (s) = 1∕(s − 2)
Answer:
Clearly F (s) has a single pole at s = 2, so for our contour we can choose any vertical
line to the right of that. We’ll see shortly that it doesn’t matter which one. We need to
evaluate
\[
\frac{1}{2\pi i} \int \frac{e^{st}}{s - 2}\,ds \qquad (13.5.4)
\]
To do so we’ll use the same method we used for the real integrals above: closing the
contour with an infinitely large semicircle. As always, the trick is making sure that the
curved part of the integral equals zero. We are only interested in positive values of t,
so $e^{st}$ approaches 0 as we move to the left. We therefore close the contour on that
side, and Equation 13.4.3 guarantees that the curved integral is 0.
[Figure: the closed contour in the complex plane, a vertical line to the right of the pole at s = 2 closed by a large semicircle on its left.]
This closed contour encloses the pole, whose residue is $e^{2t}$, so the integral equals
$2\pi i e^{2t}$. Plugging this into Equation 13.5.4 we get $f(t) = e^{2t}$. You can confirm this easily
enough by taking its Laplace transform: $F(s) = \int_0^{\infty} e^{2t} e^{-st}\,dt$. Did you get what you
expected?
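If you would rather check that confirmation numerically than by hand, a rough sketch follows. The assumptions are labeled in the code: the sample values s = 3 and s = 5 are arbitrary real numbers to the right of the pole, the upper limit is truncated at a finite `t_max`, and `laplace_transform` is a hypothetical helper, not a library routine.

```python
import math

def laplace_transform(f, s, t_max=40.0, n=200000):
    """Midpoint-rule estimate of the integral of f(t) e^{-st} from 0 to t_max, for real s."""
    h = t_max / n
    return h * sum(f((k + 0.5) * h) * math.exp(-s * (k + 0.5) * h) for k in range(n))

# Compare the transform of e^{2t} with 1 / (s - 2) at two sample points (assumed s > 2).
check = {s: laplace_transform(lambda t: math.exp(2 * t), s) for s in (3.0, 5.0)}
for s, value in check.items():
    print(s, value, 1 / (s - 2))
```

The truncation is harmless here because the integrand decays like $e^{-(s-2)t}$; for s at or below 2 the integral would not converge and the comparison would be meaningless.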
Stepping Back
Mathematics is an infinitely creative field, with new techniques always waiting to be discov-
ered. That’s an exciting thought for a mathematician, but it can be intimidating when you
have a test next week. “I followed your examples, but I never would have thought of those
steps on my own. How am I going to do the next problem?”
It’s not as scary as it looks. You might feel that our first example above would be a lot
harder if the limits of integration were 0 to 10 instead of −∞ to ∞, and you’re right. In fact
the whole trick wouldn’t work in that case, so we aren’t going to ask that question.
On the topic of “using contour integrals to evaluate real integrals,” all the problems you’re
likely to encounter fall into three categories.
∙ For $\int_{-\infty}^{\infty} f(z)\,dz$ where |f (z)| drops sufficiently quickly as iz → ±∞ you draw a
semicircle: upward if |f (z)| drops in that direction, downward if it drops there. Our
first example above fell into this category.
∙ Some functions can be broken into two functions, one that drops as you move up
and one that drops as you move down. You break such a function up and draw two
semicircles. Our cosine example above fell into this category.
∙ The third category, which we have not demonstrated, is rational functions of sin x or
cos x integrated over a full period. You begin by rewriting your sines and cosines as
complex exponentials, and the substitution u = e ix maps a segment on the real number
line to a closed loop on the complex plane. We will step you through such a situation
in Problem 13.71. It doesn’t look like the examples we’ve worked here, but it is always
the same u, always the same du…once you know how to do that problem, you have the
final piece of the puzzle.
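The third category can be sketched numerically. The example below uses the integrand 1∕(2 + cos x), which is an illustrative choice and not one of the book's problems. With $u = e^{ix}$ we have cos x = (u + 1∕u)∕2 and dx = du∕(iu), so the integrand becomes $2/[i(u^2 + 4u + 1)]$ on the unit circle. The function names are hypothetical.

```python
import cmath
import math

def real_integral(n=2000):
    """Midpoint rule for the real integral of 1/(2 + cos x) over one period."""
    h = 2.0 * math.pi / n
    return h * sum(1.0 / (2.0 + math.cos((k + 0.5) * h)) for k in range(n))

def contour_integral(n=2000):
    """The same integral after u = e^{ix}, summed around the unit circle."""
    h = 2.0 * math.pi / n
    total = 0j
    for k in range(n):
        u = cmath.exp(1j * (k + 0.5) * h)
        du = 1j * u * h                          # du = i u dx
        total += 2.0 / (1j * (u * u + 4.0 * u + 1.0)) * du
    return total

# Residue route: u^2 + 4u + 1 has roots -2 +/- sqrt(3), and only
# -2 + sqrt(3) lies inside the unit circle.
u_in = -2.0 + math.sqrt(3.0)
u_out = -2.0 - math.sqrt(3.0)
residue = 2.0 / (1j * (u_in - u_out))
by_residues = 2j * math.pi * residue

print(real_integral(), contour_integral(), by_residues)
```

All three numbers should agree with $2\pi/\sqrt{3}$, with negligible imaginary parts in the two complex results.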
Inverse Laplace transforms fall into only one category. Equation 13.5.3 sets up an integral
along a vertical line, with a chosen to place this line to the right of all poles of F (s). You then
close the contour to the left and get f (t) from the residues of e st F (s) for all positive values
of t.
and Laplace transformed. Then we’ll show how to modify the derivation when f (t) can’t be
Fourier transformed.
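$$\hat{f}(\omega) = \frac{1}{2\pi}\int_0^\infty f(t)\,e^{-i\omega t}\,dt, \qquad F(s) = \int_0^\infty f(t)\,e^{-st}\,dt$$
$$f(t) = \int_{-\infty}^{\infty} \hat{f}(\omega)\,e^{i\omega t}\,d\omega = \int_{-i\infty}^{i\infty} \frac{F}{2\pi}\,e^{st}\,\frac{ds}{i} = \frac{1}{2\pi i}\int_{-i\infty}^{i\infty} F(s)\,e^{st}\,ds$$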
Our original integral took place along the entire real axis. The substitution s = i𝜔 rotates it,
so our new integral is a contour integral straight up the imaginary axis.
$$g(t) = \frac{1}{2\pi i}\int_{-i\infty}^{i\infty} G(s)\,e^{st}\,ds$$
We need one more fact about Laplace transforms to complete the derivation. For any func-
tion f (t) with Laplace transform F (s), the Laplace transform of e −at f (t) is F (s + a). In other
words, multiplying f (t) by an exponential simply shifts F (s) to the left. (You’ll prove this in
Problem 13.83.) In our case this means G(s) = F (s + a). Plugging in that relationship and the
definition g (t) = e −at f (t), we get:
$$f(t) = \frac{1}{2\pi i}\int_{-i\infty}^{i\infty} F(s + a)\,e^{(s+a)t}\,ds$$
(We were able to move e at inside the integral since it doesn’t depend on s.) The last step is
the substitution u = s + a. The s integral was along the imaginary axis. Since u = s + a where
a is a real number, the integral over u goes along a vertical axis shifted to the right by an
amount a. As a shorthand, we write that as
$$f(t) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty} F(u)\,e^{ut}\,du$$
Since u is just a variable of integration we are free to rename it s, which gets us back to
Equation 13.5.3.
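The final formula can be tried out directly by integrating along a truncated vertical line. This is only a sketch under stated assumptions: the transform $F(s) = 1/(s-2)^2$ is an illustrative choice (its inverse is the standard table entry $t e^{2t}$), and the names `bromwich`, `omega_max`, and `n` are hypothetical. The line is cut off at a finite height, so the result is approximate.

```python
import cmath
import math

def bromwich(F, t, a, omega_max=400.0, n=160_000):
    """Approximate f(t) = (1/(2 pi i)) * integral of F(s) e^{st} ds along the
    vertical line s = a + i*omega, truncated to |omega| <= omega_max.
    Since ds = i d(omega), the prefactor becomes 1/(2 pi)."""
    h = 2.0 * omega_max / n
    total = 0j
    for k in range(n + 1):
        omega = -omega_max + k * h
        s = complex(a, omega)
        weight = 0.5 if k in (0, n) else 1.0     # trapezoid end weights
        total += weight * F(s) * cmath.exp(s * t)
    return (total * h / (2.0 * math.pi)).real

def F(s):
    return 1.0 / (s - 2.0) ** 2                  # double pole at s = 2

# Any a to the right of the pole should give the same f(t).
for a in (3.0, 5.0):
    print(a, bromwich(F, 1.0, a), 1.0 * math.exp(2.0))
```

Both values of a should reproduce $t e^{2t}$ at t = 1 to within the truncation error, illustrating that the horizontal position of the line does not matter as long as it lies to the right of every pole.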
(c) Explain how we know, for this particular function f(z), that in the limit R → ∞ the contour integral around the curved part of your semicircle approaches 0. (That in turn implies that the closed contour integral equals the contour integral along the real axis, which is what we wanted to find.)
(d) Find the residues of all of the poles of f enclosed by your contour, and use those residues to calculate the closed contour integral. This gives you your answer for the real integral as well.

For Problems 13.63–13.69 use a closed contour integral to find the value of the real integral. Some problems may require using two contours, as illustrated in the Explanation (Section 13.5.1).
13.63 $\int_{-\infty}^{\infty} 1/(x^2 + 9)\,dx$
13.64 $\int_{-\infty}^{\infty} 1/(x^2 + 1)^2\,dx$
13.65 $\int_{-\infty}^{\infty} (2x - 5)/[(x^2 + 1)(x^2 + 4)]\,dx$
13.66 $\int_{-\infty}^{\infty} 1/(x^2 - 2x + 2)\,dx$
13.67 $\int_{-\infty}^{\infty} \cos(5x)/(x^2 + 1)\,dx$
13.68 $\int_{-\infty}^{\infty} (\sin^2 x)/(x^4 + 4)\,dx$
13.69 $\int_0^{\infty} \cos(2x)/(x^2 + 4)\,dx$ Hint: You’ll need to start by relating this to an integral from −∞ to ∞.

13.70 In the Explanation (Section 13.5.1) we showed how to evaluate integrals like $\int_{-\infty}^{\infty} (x^2 + 1)/(x^4 + 4)\,dx$ using semicircular contour integrals.
(a) Explain why that technique would not work for $\int_{-\infty}^{\infty} (x^3 + 1)/(x^4 + 4)\,dx$.
(b) Explain why that technique would not work for $\int_0^{10} (x^2 + 1)/(x^4 + 4)\,dx$.
(c) Explain how you could easily adapt the technique to find $\int_0^{\infty} (x^2 + 1)/(x^4 + 4)\,dx$.

13.71 Walk-Through: Turning a Real Integral of Trig Functions into a Complex Contour Integral. In this problem you will evaluate $\int_0^{2\pi} dx/(5 + 4\sin x)$ by using a complex u-substitution to turn it into a contour integral.
(a) Use the identity $\sin x = (e^{ix} - e^{-ix})/(2i)$ to rewrite this integral in terms of complex exponentials, simplifying as much as possible.
(b) Do a u-substitution: $u = e^{ix}$. As always, remember to substitute for dx as well as x. Once again, simplify the integral as much as possible, but don’t worry yet about the limits of integration.
(c) Your integrand should have a quadratic function of u in the denominator. If you haven’t done so already, factor it. (You may be able to do so by inspection, but if you’re having trouble you can use the quadratic formula to find the roots.)
(d) Ok, now worry about the limits of integration. Since u is a complex number, the integral now traces out a contour in the complex plane. Draw the contour followed by u as x goes from 0 to 2𝜋.
(e) Evaluate this contour integral to get a final answer for $\int_0^{2\pi} dx/(5 + 4\sin x)$.

For Problems 13.72–13.75 use the method we outlined in Problem 13.71 to calculate $\int_0^{2\pi} f(x)\,dx$ by turning it into a contour integral. (You can check your answers by evaluating the real integrals on a computer, or in some cases by hand, but you should do each of them using a contour integral.)
13.72 $f(x) = (\sin x)/(5 + 3\cos x)$
13.73 $f(x) = 2/(25 + 7\sin x)$
13.74 $f(x) = 3\sin x/(17 + 8\sin x)$
13.75 $f(x) = 1/(2 + \sin x)$

13.76 Walk-Through: A Bromwich Integral. In this problem you will calculate the inverse Laplace transform of $F(s) = 1/(s - 1)^2$.
(a) Use Equation 13.5.3 to write an integral for the inverse Laplace transform f(t). Leave the number a unspecified.
(b) Mark on the complex plane all the poles of the integrand, and draw a vertical contour anywhere to the right of those poles.
(c) Add to your drawing a semicircular contour that encloses the vertical one you drew. Explain how you can know that the integral around the curved part of this contour approaches 0 as the radius of the semicircle approaches ∞, provided we assume t > 0. (Getting this right will depend on how you closed the contour.)
(d) Evaluate the closed contour integral to find f(t) for t > 0.

For Problems 13.77–13.81 use a Bromwich integral to calculate the inverse Laplace transform of F(s). You only need to calculate f(t) for t > 0. (You may want to first work through Problem 13.76 as a model.)
13.77 $F(s) = 1/(s - 1)$
13.78 $F(s) = 1/[(s + 1)(s - 2)]$
13.79 $F(s) = 1/s$
13.80 $F(s) = 1/(s + 5)^2$
13.81 $F(s) = (s + 1)(s + 2)/(s + 3)^3$
7 We said you have to find the distance to a “non-removable” singularity to cover a small exception. For instance,
the Maclaurin series for (z − 3i)∕(z − 3i) converges everywhere, not just until you get to 3i. Of course you could also
violate this rule by defining a function that is z 2 near the origin and Re(z) far away, and then your series would not
converge to the function you started with. Don’t do that.
Question: Where does the Maclaurin series for the real function $f(x) = 1/(x^4 + 2)$
converge?
Answer:
The real function f(x) is the complex function $f(z) = 1/(z^4 + 2)$ evaluated on the real
line. This f(z) is analytic everywhere except the poles at $z = (-2)^{1/4}$. The fourth roots
of −2 have phases 𝜋∕4, 3𝜋∕4, 5𝜋∕4 and 7𝜋∕4, but more importantly they all have
magnitude $2^{1/4}$, so that is the distance from the origin to the nearest singularity. We
therefore draw a disk around the origin with radius $2^{1/4}$. The series converges
everywhere inside this disk and diverges everywhere outside it.
[Figure: the disk of radius $2^{1/4}$ centered on the origin of the complex plane.]
We now return to our original real-valued function. It doesn’t matter that none of the
poles lies on the real axis. The complex series converges everywhere in the disk
drawn above, so the real series converges everywhere on the real axis within that
disk: in other words, between $-\sqrt[4]{2}$ and $\sqrt[4]{2}$.
For both the real and complex series we say that the “radius of convergence” is
$2^{1/4}$. On the complex plane this is the radius of a disk; on the real number line it is
the distance from 0 to either edge of the interval of convergence.
We get no guarantee of what happens to our complex series on the border of its
disk of convergence, or to our real series at the two endpoints of its interval of
convergence. If we wanted to know we would need to find the Maclaurin series for
f (x) and use tests of series convergence on it.
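The claim can be tested numerically. The sketch below locates the four poles and then compares partial sums of the Maclaurin series with f(x) on either side of $2^{1/4} \approx 1.189$. The series used, $\frac{1}{2}\sum_k (-x^4/2)^k$, comes from the geometric series and is not written out in the text, so treat it as an assumption of this example.

```python
import cmath

# Poles of 1/(z^4 + 2): the four fourth roots of -2.
poles = [2 ** 0.25 * cmath.exp(1j * cmath.pi * (2 * k + 1) / 4) for k in range(4)]
radius = min(abs(p) for p in poles)              # distance to the nearest pole

def f(x):
    return 1.0 / (x ** 4 + 2.0)

def partial_sum(x, terms):
    """Partial sum of (1/2) * sum over k of (-x^4/2)^k, the Maclaurin
    series of 1/(x^4 + 2) obtained from the geometric series."""
    ratio = -(x ** 4) / 2.0
    return 0.5 * sum(ratio ** k for k in range(terms))

print("radius of convergence:", radius)
for x in (1.1, 1.3):                             # just inside and just outside
    print(x, f(x), partial_sum(x, 200))
```

Inside the interval the 200-term partial sum matches f(x) closely. Outside it the partial sum is enormous, even though f(x) itself is perfectly well behaved there.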
Laurent Series
We said above that all analytic functions have well-behaved Taylor series, but as we’ve seen
in this chapter the most interesting behavior of a function often occurs around the points
where it is not analytic. It is possible to do a series expansion in powers of z about such a
point, but the series must include negative powers of z.8 Such a series is called a “Laurent
Series.”
8 Technically a “power series” only involves positive integer powers, so this section is not accurately named. But who
wants to read a section called “Power Series and Other Series that are a Lot Like Them but Don’t Meet the Technical
Definition”?
that converges to f (z) at every point z in that ring-shaped region. That series is a Laurent series for
f (z).
If f is analytic everywhere inside the outer radius R2 then all of the cn for negative n
will be zero and the Laurent series will simply be the Taylor series for f . Things get more
interesting, though, when z = z0 is an isolated singularity of f (z).
In this case we can define R2 to be the distance to the nearest
other singularity (∞ if there are no other singularities) and let
R1 = 0, so the “ring” is a “punctured disk” that includes every-
thing inside the radius R2 except the point z = z0 , as shown in
Figure 13.12.
You might reasonably expect that we are now going to give you the formula for $c_n$ in terms of f(z), just as we can do with Taylor series. You will work out that formula in Problem 13.104, but it is not typically useful for calculating Laurent series. Instead, our examples and problems will focus on calculating Laurent series from known Taylor series. That is easy in many cases, and it gives you a lot of information. In fact Laurent series give us a way to generalize some of the definitions in Section 13.4, as well as a surprisingly quick way to compute residues. (For some types of functions partial fractions are useful for finding Laurent series. See Problems 13.98–13.101.)
FIGURE 13.12 The Laurent series for a function f(z) around an isolated singularity converges to f everywhere inside a disk around z = z0 except for the point z = z0 itself.
∙ If cn ≠ 0 for some negative value of n but it does equal zero for all values of n below that,
then the function f (z) has a “pole” of order |n| at the point z = z0 .
∙ If cn ≠ 0 for an infinite number of negative values of n, then the function f (z) has an “essen-
tial singularity” at the point z = z0 .
∙ If either of the above is true then the “residue” of f (z) at z = z0 is the coefficient c−1 .
You can relate this definition of “pole” to our earlier definition by considering a function
like this.
$$\frac{1}{(z - 10)^3} + \frac{7}{(z - 10)^2} + \frac{4}{z - 10} + 12 + 6(z - 10) + 2(z - 10)^2$$
Our new definition tells us that this function has a third-order pole at z = 10. We could com-
bine all these terms into one fraction with a polynomial (and therefore analytic) numerator
and a denominator of (z − 10)3 and then our previous definition would tell us the same thing.
Our new definition also introduces a new term, “essential singularity,” for a point where
the negative powers just keep on going. You can think of the order of a pole as telling you
how quickly the function blows up as you approach the singularity. An nth-order pole acts like
$(z - z_0)^{-n}$ near the singularity because the term with that power of z grows infinitely larger
than all the other terms. An essential singularity blows up faster than any inverse power of z.
Perhaps most surprising is our new definition of “residue,” which is just the coefficient
of the −1-order term for a pole or an essential singularity. In fact this is the general defini-
tion, and the derivative-and-factorial formulas in Section 13.4 were a special case. In Prob-
lem 13.102 you’ll try both definitions on a specific function, and in Problem 13.103 you will
show more generally that they are equivalent.
Question: Find the Laurent series for $f(z) = (1 - e^{z^2})/z^5$ about the origin and use it to
characterize the singularity at the origin and find its residue.
Answer:
We can find the series by expanding $e^{z^2}$ in a Maclaurin series. Since
$e^x = 1 + x + x^2/(2!) + \ldots$, we know $e^{z^2} = 1 + z^2 + z^4/(2!) + \ldots$.
$$f(z) = \frac{1 - \left(1 + z^2 + z^4/(2!) + z^6/(3!) + \ldots\right)}{z^5} = -z^{-3} - \frac{1}{2} z^{-1} - \frac{1}{6} z - \ldots$$
From this we can immediately see that the origin is a third-order pole with residue
−(1∕2). It follows that a contour integral of f (z) around the origin would give −𝜋i.
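The value −𝜋i can be checked without the series by integrating f(z) numerically around a circle centered on the origin. This is a minimal sketch: the helper name `circle_integral` and the choice of radii are illustrative, and the contour is taken counterclockwise.

```python
import cmath
import math

def f(z):
    return (1.0 - cmath.exp(z * z)) / z ** 5

def circle_integral(func, radius=1.0, n=4000):
    """Counterclockwise contour integral of func around |z| = radius,
    using equally spaced points z = radius * e^{i theta}."""
    h = 2.0 * math.pi / n
    total = 0j
    for k in range(n):
        z = radius * cmath.exp(1j * k * h)
        total += func(z) * 1j * z * h            # dz = i z d(theta)
    return total

# The origin is the only singularity, so the radius should not matter.
for r in (0.5, 1.0, 2.0):
    print(r, circle_integral(f, r))
print("2 pi i times the residue:", 2j * math.pi * (-0.5))
```

Each radius should give a result very close to −𝜋i, which is 2𝜋i times the residue −1∕2 read off from the Laurent series.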
In Section 13.2 we listed three types of singularity. If a function f (z) has any kind of singu-
larity at z = z0 then you can safely assert that f (z0 ) is undefined, but the three types behave
differently near z0 . As you approach a removable singularity the function approaches some
finite limit. As you approach a pole the function blows up, which is to say, $\lim_{z \to z_0} |f(z)| = \infty$.
For an essential singularity $\lim_{z \to z_0} |f(z)|$ is undefined, and here’s why: in any neighborhood
around z0 , no matter how small, f (z) must take on every complex value, with at most one
exception. See Problem 13.105. This is honestly not the most useful fact you will ever
encounter, but it’s so remarkable that we didn’t want you to go to your grave never having
heard it.
13.95 Use a complex extension of each real function to find the radius of convergence of its (obviously real) Maclaurin series.
(a) $f(x) = 1/(x^2 + 1)$
(b) $f(x) = e^x/[(x - 1)^2 + 1]$
(c) $f(x) = 1/(e^x + 1)$

13.96
(a) Find the radius of convergence of the Maclaurin series for $f(z) = 1/(e^z + 1)$. (If you did Problem 13.95 you have already solved this part.)
(b) Now consider the real function f(x), whose Maclaurin series should have the same radius of convergence as that of f(z). Plot f(x) and the 100th partial sum on the same plot. Include vertical lines at the boundaries of the interval of convergence you found. How does the partial sum behave inside this interval? Outside it?

13.97 The function $(\sin z)/(z - \pi/4)$ has a pole at z = 𝜋∕4.
(a) Find the first three terms of the Laurent series for $(\sin z)/(z - \pi/4)$ about the point z = 𝜋∕4.
(b) Find the residue of $(\sin z)/(z - \pi/4)$ at z = 𝜋∕4. Hint: You should be able to just write it down.
(c) Find the contour integral of $(\sin z)/(z - \pi/4)$ around the unit circle.

13.98 In this problem you will find the Laurent series for $f(z) = 1/[z(z - 2)]$ about the point z = 0.
(a) You may remember the method of “partial fractions” for rewriting fractions of this type. Show that you can write f in the form a∕z + b∕(z − 2) by finding the correct constants a and b. (You should be able to easily check that your answer is correct by adding the fractions you get.)
(b) Use this expansion to write the first four non-zero terms of the Laurent series in the region around z = 0. Hint: use the Maclaurin series for 1∕(z − 2).
(c) What is the radius of convergence of this Laurent series?
(d) Use the Laurent series you found to find the residue of f(z) at z = 0.
(e) Specify a closed contour that encloses the pole at z = 0 and lies entirely in the region of convergence of your Laurent series, and evaluate the integral of f(z) around that contour.

13.99 Use partial fractions to find the first three non-zero terms of the Laurent series for $f(z) = 1/[z(z^2 - 1)]$ around z = 0. What is the radius of convergence of that series? If you’re stuck you may find it helpful to look at Problem 13.98.

13.100 Use partial fractions to find the first three non-zero terms of the Laurent series for $f(z) = e^z/[(z - 1)(z - 3)]$ around z = 1. What is the radius of convergence of that series? If you’re stuck you may find it helpful to look at Problem 13.98.

13.101 A function with a singularity at z = z0 has a Laurent series about that point valid out to a radius equal to the distance to the nearest other singularity. It also has a Laurent series about z = z0 that is valid beyond that other singularity (out to the distance to the next one). In this problem you’ll derive both Laurent series for $f(z) = 1/[z(z - 1)]$ valid about z = 0.
(a) Use partial fractions to rewrite f in the form a∕z + b∕(z − 1). (You need to find the constants a and b.)
(b) Use the Maclaurin series for 1∕(z − 1) to write the first four non-zero terms of the Laurent series for f valid in the region |z| < 1.
(c) Explain how you can know that this series doesn’t converge to f when |z| > 1.
(d) Rewrite 1∕(z − 1) as 1∕[z(1 − 1∕z)] and expand 1∕(1 − 1∕z) in powers of 1∕z to find a Laurent series for f. This one is valid in the region |z| > 1. Write your final answer in summation notation.
(e) Explain how you can know that this series does converge for |z| > 1 and doesn’t for |z| < 1.

13.102 Section 13.4 and this section gave two very different ways of calculating residues. In this problem you will try them both on $f(z) = 1/[z(z - 1)]$.
(a) First, without calculating the Laurent series, find the residue of f(z) at z = 0.
(b) Write 1∕(z − 1) as a Maclaurin series. You can calculate this directly or use the binomial expansion.
(c) Multiply that Maclaurin series by 1∕z to find the Laurent series for f(z) about z = 0. Confirm that the series gives you the same value for the residue that you found in Part (a).

13.103 Suppose $f(z) = g(z)/(z - z_0)^n$, where g(z) is analytic and non-zero at z = z0. Find the Laurent series for f by expanding g in a Taylor series about z = z0, and use your result to find the residue of f at z = z0.

13.104 In this problem you will derive the formula for the coefficients $c_n$ of a Laurent series, Equation 13.7.1. For simplicity we will consider a Laurent series in a punctured disk about the point z = z0, and we will assume that the contours we use in the problem enclose z = z0 (which may or may not be singular) but do not touch or enclose any other singularities. Recall that the residue of f(z) at z = z0 is $c_{-1}$.
(a) What is $\oint f(z)\,dz$? Your answer should contain $c_{-1}$. Rearrange it to get a formula for $c_{-1}$ in terms of this contour integral.
(b) What is $\oint f(z)(z - z_0)\,dz$? (Hint: Think about what the Laurent series for $f(z)(z - z_0)$ is.) Your answer will contain a different coefficient c this time. Once again, solve your equation for that coefficient.
(c) Evaluate $\oint f(z)(z - z_0)^m\,dz$ and solve the resulting equation for the c coefficient that appears in it.
(d) Use your result from Part (c) to write a general formula for $c_n$.

13.105 “Picard’s theorem” says that in any neighborhood of an essential singularity a function f(z) must take on every possible complex value, with at most one exception.
(a) Find the Laurent series for $f(z) = e^{1/z^2}$ about the origin and use it to prove that f has an essential singularity at the origin.
(b) Plot $f(\rho e^{i\phi})$ for fixed 𝜌 and 0 < 𝜙 < 2𝜋. Do this for several decreasing values of 𝜌 so you can see how f behaves around the origin as you get closer to it. Describe your results.
(c) Find the one value f never takes as you get close to the origin.
A complex function can be described as a “mapping” from the complex plane to itself. That
means that for every complex number that we put into the function we get a complex number
out. To see what this means, consider the function f (z) = z 2 .
1. If z = 1 + i what is f (z)?
See Check Yourself #94 in Appendix L
2. Draw a complex plane representing z. Mark on that plane the grid of points 0, 1, 2,
i, 1 + i, 2 + i, 2i, 1 + 2i, and 2 + 2i. Label the points “A,” “B,” and so on. Then draw
another complex plane and mark the points f (z) for each of the nine z-values above,
using the same labels.
As you can see, this technique represents one complex function with two different complex
planes: one for the input z, and one for the output f (z). Now draw two more complex planes,
once again representing the input and output of f (z) = z 2 . You will use these two for Parts 3–6.
3. On the z graph mark the line segment going from z = 0 to z = 2 (on the positive real
axis). On the f (z) graph mark the set of points f corresponding to all the z-values on
that line segment.
See Check Yourself #95 in Appendix L
4. Add a line segment from z = 0 to z = 2i to the z graph and the corresponding set of
points f on the f graph.
5. Add the curve |z| = 2 in the first quadrant to the z graph and the corresponding curve
to the f graph.
6. The three curves you just drew on your z graph define a quarter of a disk. What shape
does f (z) = z 2 map this quarter-disk to?
We noted in Section 13.2 that with a two dimensional input and a two dimensional output,
a complete graph of a complex function would require four dimensions. We also discussed
various workarounds for this problem. One graphing technique we did not discuss is “map-
ping,” which means you draw the input z on one plane and the output f (z) on a different
one.
In the Discovery Exercise (Section 13.8.1) you created a mapping to show how one region
of the complex plane is mapped by the function z 2 to another region. Below we show a few
examples of mappings of the function e z . No single mapping can convey the entire function,
but a few well-chosen mappings can lend significant visual intuition to your understanding
of a function. As we will see in Section 13.9, mapping also provides a powerful technique for
solving certain problems involving Laplace’s equation, such as the problem in the Motivating
Exercise (Section 13.1).
EXAMPLE Mapping
Problem:
Show how the function e z maps points in the first quadrant.
Solution:
We’ve selected a grid of nine points, which will be enough to show the basic behavior
of the function. We can calculate e z for each of these points either by hand or on a
computer, and plot the results.
[Figure: the nine grid points A through I in the z-plane (axes Re(z), Im(z)) and their images under $e^z$ in the f-plane (axes Re(f), Im(f)).]
Of course, more important than drawing the picture is interpreting what it tells us.
As z moves to the right (A → B → C, D → E → F , or G → H → I ), e z moves directly
outward from the origin. In other words, increasing the real part of z increases the
modulus of e z . Similarly, increasing the imaginary part of z (moving up through the z
points) increases the phase of e z without affecting the modulus.
If we write z = x + iy we get e z = e x e iy , so e z has modulus e x and phase y, confirming
algebraically what we just concluded graphically.
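The same conclusion can be read off from a short computation. This sketch assumes, from the axis ticks in the figure, that the nine grid points are x + iy with x and y each taken from 0, 1, 2, labeled A through I row by row starting at the bottom. That labeling is an inference, not something the text states.

```python
import cmath
import math

# Assumed grid: A, B, C on the real axis, D, E, F at height i, G, H, I at 2i.
labels = "ABCDEFGHI"
grid = [complex(x, y) for y in range(3) for x in range(3)]

images = {}
for label, z in zip(labels, grid):
    w = cmath.exp(z)
    modulus, phase = cmath.polar(w)              # phase lies in (-pi, pi]
    images[label] = (z, modulus, phase)
    print(f"{label}: z = {z}, |e^z| = {modulus:.3f}, phase = {phase:.3f}, "
          f"e^x = {math.exp(z.real):.3f}")
```

In every row of the output the modulus equals $e^x$ and the phase equals y, so moving right changes only the modulus and moving up changes only the phase.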
the new curves will bound more than one region, in which case we can choose any point in
the original region and see which new region it maps to, as illustrated in the second example
below.
Question: The region R consists of the infinite vertical band between x = 1 and x = 2.
What region does the function f (z) = z 2 map R to?
Answer:
First we find the formulas for the two lines that bound this region. Each line has a
constant real part and an imaginary part that can be any real number, so they are
z = 1 + iy and z = 2 + iy, where −∞ < y < ∞. To map these curves we plug their
formulas into f. For the first curve $f(z) = (1 + iy)^2 = (1 - y^2) + 2iy$. If we write
f = u + iv, then $u = 1 - y^2$ and $v = 2y$. We can eliminate y from these equations to get
a formula for the curve in the uv-plane: $u = 1 - (v/2)^2$. This describes a left-facing
parabola. A similar calculation, with $u = 4 - y^2$ and $v = 4y$, gives $u = 4 - (v/4)^2$ for the second curve.
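The elimination of y can be spot-checked numerically. In this sketch the helper names are hypothetical. For a vertical line z = x0 + iy the same algebra gives $u = x_0^2 - (v/(2x_0))^2$, which reduces to the two parabolas above for x0 = 1 and x0 = 2.

```python
def parabola_gap(x0, y):
    """Map z = x0 + iy through f(z) = z^2 and return how far the image
    (u, v) lies from the curve u = x0^2 - (v / (2*x0))^2."""
    w = complex(x0, y) ** 2
    u, v = w.real, w.imag
    return abs(u - (x0 ** 2 - (v / (2 * x0)) ** 2))

def between_parabolas(z):
    """True if f(z) = z^2 lies between the images of x = 1 and x = 2."""
    w = z ** 2
    u, v = w.real, w.imag
    return 1 - (v / 2) ** 2 < u < 4 - (v / 4) ** 2

ys = [k / 4 for k in range(-12, 13)]             # sample heights from -3 to 3
print(max(parabola_gap(1, y) for y in ys))       # boundary x = 1
print(max(parabola_gap(2, y) for y in ys))       # boundary x = 2
print(all(between_parabolas(complex(1.5, y)) for y in ys))
```

The first two numbers should be zero up to rounding, and the last line should print True, confirming that points inside the band land between the two parabolas.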
[Figure: the band between x = 1 and x = 2 in the z-plane (axes Re(z), Im(z)) and its image under $f(z) = z^2$ in the f-plane (axes Re(f), Im(f)).]
Question: The region R consists of the disk enclosed by the circle |z| = 2. What
region does the function f (z) = 1∕z map R to?
Answer:
The circle |z| = 2 can be described by the formula $z = 2e^{i\phi}$ with −𝜋 < 𝜙 ≤ 𝜋, which
maps to $f = (1/2)e^{-i\phi}$. This new curve is a circle of radius 1∕2. We’re not quite done,
though. In the example above there was only one region bounded by the two
parabolas, so that had to be the mapped region. In this case the new circle is the
boundary of its interior but also of its exterior. To see which is correct pick a point z
inside region R. For example, if z = 1 then f(z) = 1, which is outside the new circle, so
the mapped region is the exterior of |f| = 1∕2.
[Figure: the disk enclosed by |z| = 2 in the z-plane and its image under f(z) = 1∕z.]
Möbius Transformations
We conclude this section with one particularly important class of transformations. Transfor-
mations of the form
$$f(z) = \frac{a + bz}{c + dz} \qquad (13.8.1)$$
are known as Möbius transformations.9 Consider, for example, the Möbius transformation
f (z) = 1∕z. You’ll show in Problem 13.119 that this transformation reverses the phase of its
argument (𝜙 → −𝜙) and inverts the modulus (|z| → 1∕|z|). If you use this to map the unit
circle you just get back the unit circle. That doesn’t mean each point on the unit circle is
mapped to itself; points on the unit circle are mapped to their complex conjugates. For
points not on the unit circle z is not mapped to z ∗ , but points in the upper half-plane do
move to the lower half-plane and vice versa.
To check your understanding of complex curves and mapping, we encourage you to try
to answer each of the following questions about the transformation 1∕z before reading our
answers.
∙ What does the circle of radius R centered on the origin map to? Answer : It maps to the
circle of radius 1∕R centered on the origin.
∙ What happens to a circle of radius 1∕4 that is not centered on the origin? Answer :
It’s a bit of a trick question because the answer depends on where the circle is. If the
circle extends from |z| = 1∕4 to |z| = 3∕4 then the mapped circle will go from |z| =
4∕3 to |z| = 4, so its radius and distance from the origin will increase. If the circle
goes from |z| = 100 to |z| = 100.5 then it will map to a tiny circle very near the origin.
More generally, regions close to the origin get moved far away and become large, while
regions far from the origin get moved closer and shrink.
∙ What happens in the limit where one edge of the circle approaches the origin? Answer :
The points infinitesimally close to the origin get mapped infinitely far away. If the circle
is very close to touching the origin it will get mapped to an enormous circle. If one
edge of the original circle touches the origin, the circle will get mapped to a line. See
Figure 13.13.
FIGURE 13.13 The function 1∕z maps the black circle to the blue curve. The closer one edge of the original circle
is to the pole at z = 0 the larger the mapped circle is. In the limit where the original circle touches the pole, it
maps to a line.
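The behavior described in the caption can be explored numerically. The sketch below relies on a standard result for inversion that the text does not derive, so treat it as an assumption of this example: a circle with center c and radius r that misses the origin maps under 1∕z to a circle with center $\bar{c}/(|c|^2 - r^2)$ and radius $r/\,\bigl||c|^2 - r^2\bigr|$. The function names are hypothetical.

```python
import cmath
import math

def inverted_circle(center, radius):
    """Center and radius of the image of |z - center| = radius under 1/z."""
    d = abs(center) ** 2 - radius ** 2           # zero when the circle hits the origin
    if d == 0:
        raise ValueError("circle passes through the origin; its image is a line")
    return complex(center).conjugate() / d, radius / abs(d)

def max_deviation(center, radius, n=720):
    """Largest distance of any mapped sample point from the predicted circle."""
    new_center, new_radius = inverted_circle(center, radius)
    worst = 0.0
    for k in range(n):
        z = center + radius * cmath.exp(2j * math.pi * k / n)
        worst = max(worst, abs(abs(1 / z - new_center) - new_radius))
    return worst

# A circle of radius 1/4 whose near edge approaches the origin.
for c in (0.5, 0.3, 0.26):
    print(c, inverted_circle(complex(c, 0), 0.25), max_deviation(complex(c, 0), 0.25))
```

For the circle running from |z| = 1∕4 to |z| = 3∕4 the image has center 8∕3 and radius 4∕3, matching the range 4∕3 to 4 quoted in the list above. As the near edge approaches the origin the image radius grows without bound.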
9 Yes, it’s the same guy who thought of the Möbius strip.
We’ve discussed all this for the simple example f (z) = 1∕z, but the same principles apply
to any transformation described by Equation 13.8.1.
Question: What does the function f (z) = (−6 − z)∕(4 + 3z) map the unit disk about
the origin to?
Answer:
The unit circle can be described as z = e i𝜙 for 0 ≤ 𝜙 ≤ 2𝜋. Plugging this into the
function, the mapped curve is the set of points f = (−6 − e i𝜙 )∕(4 + 3e i𝜙 ).
Instead of doing a lot of algebra to get this into a recognizable form, we can take a
shortcut. We know this is a Möbius transformation, and the unit circle does not touch
the pole at z = −4∕3, so it must map to a circle. Moreover, since all the coefficients
are real we know it will map real numbers to real numbers, so the points z = 1 and
z = −1 will remain on the real axis. So we calculate f (1) = −1 and f (−1) = −5. The
resulting circle has radius 2 and is centered on f = −3.
As a test we can plug in z = i, which after a bit of calculation gives
f = −(27∕25) + (14∕25)i. You can easily verify that the distance from f (i) to f = −3 is
2, so this point is also on the circle of radius 2 centered on −3. In Problem 13.120
you’ll prove more generally that all the points on the original circle map to the circle
we’ve just described.
Finally, by noting that z = 0 maps to f = −3∕2, we see that the region interior to the
original circle is mapped to the region interior to the new one.
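Each number in this example is easy to confirm by direct evaluation. The sketch below samples the unit circle, maps every sample through f, and measures its distance from −3. It is a numerical check only, not the proof requested in Problem 13.120.

```python
import cmath
import math

def f(z):
    return (-6 - z) / (4 + 3 * z)

# Distance of each mapped point from the proposed center -3, minus the radius 2.
deviations = [abs(abs(f(cmath.exp(2j * math.pi * k / 360)) + 3) - 2)
              for k in range(360)]

print(f(1), f(-1), f(1j))                        # the three points used in the text
print("largest deviation from the circle:", max(deviations))
print("distance of f(0) from the center:", abs(f(0) + 3))
```

The largest deviation should be zero up to rounding, and f(0) should sit at distance 1.5 from the center, inside the circle of radius 2.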
[Figure: the unit disk in the z-plane (axes Re(z), Im(z)) and its image in the f-plane (axes Re(f), Im(f)).]
13.115 f(z) = 1∕z. The region is in the first quadrant between the circles of radius 1 and 2 centered on the origin.
13.116 $f(z) = e^z$. The curve is the line segment from z = −1 − i to z = 1 + i.
13.117 f(z) = (1 + 3z)∕(5 + z). The curve is a unit circle centered on the point z = 2 + 2i.
13.118 $f(z) = e^z$. The curve is a quarter-circle of radius 2 in the first quadrant, centered on the origin.

13.119 The Explanation (Section 13.8.2) made the claim that the transformation f(z) = 1∕z reverses the phase and inverts the modulus of its argument. Prove it. (This requires only a small bit of algebra.)

13.120 In the example on Page 752 we applied the mapping f(z) = (−6 − z)∕(4 + 3z) to the unit circle. We never proved that it would map to a circle: we just accepted that fact, and went on to find that the resulting circle must have radius 2 and center −3. This is generally the easiest approach for Möbius transformations, but in this problem you will prove that the result in this case actually is a circle.
(a) If our final conclusion was correct, then the transformation g(z) = (f(z) + 3)∕2 maps the unit circle to itself. (This doesn’t mean it’s the identity transformation; it could map each point on the unit circle to a different point on the unit circle.) Explain how we know that.
(b) You have just shown that the statement “f(z) maps the unit circle to a circle” is equivalent to the statement “g(z) maps the unit circle to itself.” To prove the former you will prove the latter, by showing that for any $z = e^{i\phi}$ the resulting g(z) has modulus |g(z)| = 1. (We recommend you start by simplifying g(z) as much as possible.)
FIGURE 13.14 A rectangle mapped by the function sin z demonstrates how angles are preserved in a conformal
mapping.
To illustrate what this definition means, Figure 13.14 shows how f (z) = sin z maps a rect-
angular region. Two of the straight edges remain straight, one becomes an arc, and one
becomes a more complicated looking curve, but the intersections of these boundaries are
all still at right angles.
We introduced mapping as a technique for visualizing the behavior of a complex func-
tion. Here we will be using it for quite a different purpose. Remember that the technique we
introduced earlier for solving Laplace’s equation required that you guess an analytic func-
tion whose real or imaginary part would work for the region of interest. With conformal
mapping, if the region doesn’t suggest an obvious solution, you can find an analytic function
that maps it to a more convenient shape and solve the problem there. We demonstrate this
process below.
The Problem
The two dimensional region shown in Figure 13.15 has no heat sources or sinks. The boundary on the x-axis is held at temperature T = 0 and the boundary on the y-axis at T = 100. The two curved boundaries are insulated, which imposes the boundary condition that the derivative of T normal to the boundary is zero on those curves. Find the steady-state temperature in this region.
With a bit of cleverness you might be able to solve this problem right now, but it will demonstrate the mapping technique very clearly. Then we will work an example where a conformal mapping is more necessary.
FIGURE 13.15 Find the temperature in this region.
The Mapping
The technique begins by selecting an analytic function that will
map Figure 13.15 to a simpler region. We have seen before that the log function works well
with circles, so let’s see what it does to our region.
Recall that ln z = ln |z| + i𝜙. In other words the real part of f comes from the modulus of
z and the imaginary part is the phase of z. Using that fact we will go through each curve on
the boundary of the original figure. We will end up defining a new shape, and we will carry
our boundary conditions with us from the old one.
∙ The line segment from z = 2 to z = 4 consists of positive real numbers, so it maps to a
line segment on the real axis going from f = ln 2 to f = ln 4. The boundary condition
for this segment is T = 0.
∙ The inner curve consists of numbers with |z| = 2 and 𝜙 going from 0 to 𝜋∕2, so it maps
to a set of numbers f whose real part is ln 2 and whose imaginary part goes from 0 to
𝜋∕2. In other words it’s a vertical line segment at u = ln 2. The boundary condition
here is that the derivative of T perpendicular to the curve is zero, so on our new shape
𝜕T ∕𝜕u must be zero on this edge.
∙ Similarly, the outer curve maps to a vertical line segment at u = ln 4, where once again
𝜕T ∕𝜕u = 0.
∙ Finally the line segment from z = 2i to z = 4i maps to a set of points f that all have
imaginary part 𝜋∕2, and whose real parts go from ln 2 to ln 4. So this vertical line in the
original shape becomes a horizontal line in the new shape, which inherits the boundary
condition T = 100.
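The four bullet points above can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `map_curve` and `sample` are hypothetical, and it simply samples a few points on each boundary curve and applies ln z to them.

```python
import cmath
import math

def map_curve(points):
    """Apply f(z) = ln z to each complex point and return the images."""
    return [cmath.log(z) for z in points]

def sample(start, stop, n=5):
    """Return n evenly spaced real values from start to stop inclusive."""
    return [start + (stop - start) * k / (n - 1) for k in range(n)]

# The four boundary curves of the original region in the z-plane.
bottom = [complex(x, 0) for x in sample(2, 4)]                  # T = 0
inner = [cmath.rect(2, phi) for phi in sample(0, math.pi / 2)]  # insulated
outer = [cmath.rect(4, phi) for phi in sample(0, math.pi / 2)]  # insulated
left = [complex(0, y) for y in sample(2, 4)]                    # T = 100

for name, curve in [("bottom", bottom), ("inner", inner),
                    ("outer", outer), ("left", left)]:
    images = map_curve(curve)
    # Each image is printed as (u, v), rounded for readability.
    print(name, [(round(w.real, 3), round(w.imag, 3)) for w in images])
```

The printed points for each curve should share either a common u or a common v, which is what it means for the four curves to become the four sides of a rectangle.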
Put all of those together and you define the rectangle shown in Figure 13.16. In this case we figured out the shape of the mapped region by thinking about the log function, but if we hadn’t been able to figure it out we could have written each curve as a function z(a) and plugged that into ln z to find the curve f (a).
Before we proceed, make sure you understand what we just determined. Figure 13.15 shows the region we are interested in: it is a collection of points (x, y), each one at a particular temperature T . If we think of those points as x + iy on the complex plane, then the function ln z maps each of those points to a different point u + iv in the region shown in Figure 13.16. We define the temperature at each point in the new region to be the same as the temperature at the corresponding point in the old region. (That’s why we inherit the same boundary conditions.)
FIGURE 13.16 The mapped region.
Finding the temperature distribution in a rectangle is easier
than the problem we were originally presented with. So we are going to solve that problem,
and then, because of the mapping, we can use that solution to solve our original problem.
Mapping Back
We now have a function T (u, v) = (200∕𝜋)v that works inside the rectangle. Remember what
it stands for: the temperature at each point in this region is the same as the temperature
T (x, y) in the corresponding point in the original washer-shaped region.
But we also have the function that maps from our original (x, y) points to the new (u, v) points. This function is f (z) = ln z, or in other words u = ln(√(x² + y²)) and v = tan⁻¹(y∕x).
Putting all that together, we can just write down the function we were looking for.
\[ T(x, y) = T(u(x, y), v(x, y)) = \frac{200}{\pi}\tan^{-1}\left(\frac{y}{x}\right) \]
That last step is almost too easy. We were looking for a function that was harmonic (satisfied
Laplace’s equation) inside the original region, and that satisfied all the conditions on the
original boundaries. Are we guaranteed that we have found one?
∙ Does the new function meet the right boundary conditions? At every point on the top
of our rectangle, T (u, v) = 100. And we know that ln z maps those points to the y-axis
boundary of our original shape. So T (u(x, y), v(x, y)) along that boundary must also
give 100. In other words, every point in Figure 13.15 has the same temperature as the
corresponding point in Figure 13.16. So if we had all the right boundary temperatures
in the rectangle, we still have them in the washer.
∙ Is the new function harmonic? Inside the rectangle we found an analytic function. The
mapping from the first region to the second was based on an analytic function, i.e. it
was a conformal mapping. And the composition of two analytic functions is another
analytic function. So our new function is still analytic, which means it must be made of
harmonic functions.
∙ Finally, and most subtly, what about the insulated boundaries? The derivative of T per-
pendicular to the vertical edges of the rectangle was zero; does that mean the same
is true for the curved edges of the washer? It does, because conformal mappings pre-
serve all angles. If ∇T was perpendicular to the boundaries of the rectangle, it is also
perpendicular to the corresponding boundaries of the washer.
Of course you can (and should) verify that our solution satisfies Laplace’s equation and meets
the boundary conditions of this particular problem. But the three arguments above make
the more general case that whenever you use a conformal mapping, the solution in the simple
shape maps to a valid solution on the original shape. And remember that all of this rests on
a uniqueness theorem; if you have found a harmonic function that satisfies the boundary
conditions, you have found the one and only solution.
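As a rough numerical version of the check recommended above, the sketch below tests the solution T(x, y) = (200∕𝜋) tan⁻¹(y∕x) with finite differences. It is an illustration under stated assumptions rather than a proof: the function names are hypothetical, `math.atan2` is used in place of tan⁻¹(y∕x) so that the y-axis (x = 0) causes no division by zero, and the step sizes are arbitrary small values.

```python
import math

def temperature(x, y):
    """T(x, y) = (200/pi) * phase of x + iy, for points in the first quadrant."""
    return (200 / math.pi) * math.atan2(y, x)

def laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference estimate of f_xx + f_yy."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4 * f(x, y)) / h**2

def radial_derivative(f, x, y, h=1e-5):
    """Central-difference estimate of the derivative of f along the radius."""
    r = math.hypot(x, y)
    ux, uy = x / r, y / r
    return (f(x + h * ux, y + h * uy) - f(x - h * ux, y - h * uy)) / (2 * h)

print(temperature(3, 0))                  # a point on the x-axis edge
print(temperature(0, 3))                  # a point on the y-axis edge
print(laplacian(temperature, 2.5, 1.5))   # an interior point
# A point on the inner arc |z| = 2; the normal direction there is radial.
print(radial_derivative(temperature, 2 * math.cos(0.7), 2 * math.sin(0.7)))
```

The last two printed values should be close to zero, up to finite-difference and rounding error.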
You may have noticed that the solution to this example, T (x, y) ∝ tan−1 (y∕x), is simply
proportional to the phase of z. We could have solved the problem simply by recognizing that
the phase of z is the imaginary part of ln z and thus harmonic, and that with appropriate
constants it can satisfy all our boundary conditions. Even though this particular problem
could have been solved without using a conformal mapping, we chose to start with it to
clearly show how the method works. As promised, however, we can now give you an example
of a problem that you would not want to solve without a conformal mapping.
Problem:
The unit circle has electric potential V = 1 along the top and V = −1 along the
bottom. It encloses no charges, so the potential inside obeys Laplace’s equation. Find
the potential function inside the circle using the mapping f (z) = (1 − z)∕(1 + z).
(After all we’ve said about logs and circles, you might be surprised that we’re not
using the log here. We invite you to try that; it doesn’t make the problem any easier.)
[Figure: the unit circle in the xy-plane, with V = 1 on the top half and V = −1 on the bottom half.]
Solution:
We begin by defining our boundary mathematically. The unit circle on the complex plane consists of all the points z = e^{i𝜙}. On the bottom half of the circle −𝜋 < 𝜙 < 0, and on the top half 0 < 𝜙 < 𝜋; we have to treat these two curves separately because they have different boundary conditions.
Now we apply our mapping function to our boundary curves to find the new region. Plugging z = e^{i𝜙} into our mapping function gives f (z) = (1 − e^{i𝜙})∕(1 + e^{i𝜙}). After a bit of algebra (multiplying the top and bottom by 1 + e^{−i𝜙} is one possible approach) and a few uses of Euler’s formula we end up here.
\[ f(z) = i\,\frac{-\sin\phi}{1 + \cos\phi} \]
We notice immediately that there is no real part, so this lies entirely along the
imaginary axis. As 𝜙 goes from 0 to 𝜋, f goes from 0 to −∞, so the top half of the
circle (where V = 1) maps to the negative imaginary axis. Similarly, the bottom half of
the circle at −𝜋 < 𝜙 < 0 maps to the positive imaginary axis, which thus inherits the
boundary condition V = −1.
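For readers who want the intermediate algebra, one way to carry out the simplification on the unit circle is sketched below. It follows the suggested approach of multiplying the top and bottom by the conjugate of the denominator; the final half-angle form is an extra observation, not something the original derivation relies on.

```latex
\begin{align*}
f(e^{i\phi}) &= \frac{1 - e^{i\phi}}{1 + e^{i\phi}} \cdot \frac{1 + e^{-i\phi}}{1 + e^{-i\phi}}
  = \frac{1 + e^{-i\phi} - e^{i\phi} - 1}{1 + e^{-i\phi} + e^{i\phi} + 1} \\
 &= \frac{-\left(e^{i\phi} - e^{-i\phi}\right)}{2 + \left(e^{i\phi} + e^{-i\phi}\right)}
  = \frac{-2i\sin\phi}{2 + 2\cos\phi}
  = -i\,\frac{\sin\phi}{1 + \cos\phi}
  = -i\tan\left(\frac{\phi}{2}\right)
\end{align*}
```

The half-angle form shows directly that the image moves down the negative imaginary axis without bound as 𝜙 goes from 0 to 𝜋, and up the positive imaginary axis as 𝜙 goes from 0 to −𝜋.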
That circle was the boundary of our original region. The imaginary axis is the
boundary of our new region, so our finite disk has been mapped to an infinite
half-plane. But which half? One way to figure that out is to pick a sample point in the
disk and try it. The easiest point is z = 0 which leads to f (z) = 1, so our points fall on
the right side of the plane. (If that seems shady, plug in z = 𝜌e^{i𝜙}. You will find that the
real part of f (z) is positive for 𝜌 < 1 and negative for 𝜌 > 1, showing again that our
disk maps to the right side of the plane.)
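[Figure: the right half of the f-plane, with V = −1 on the positive imaginary axis and V = 1 on the negative imaginary axis.]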
Now we have to solve Laplace’s equation in that region. The key insight you need
here is that this region is a big wedge, of the sort that we discussed in Section 13.3. We
know that the phase 𝜙 is the imaginary part of ln z and therefore harmonic. This
region can be described fully by −𝜋∕2 ≤ 𝜙 ≤ 𝜋∕2, with the boundary conditions V = 1 at 𝜙 = −𝜋∕2 and V = −1 at 𝜙 = 𝜋∕2. So the function V = −(2∕𝜋)𝜙 is harmonic and satisfies the boundary conditions, giving us the solution in the mapped region.
\[ V(u, v) = -\frac{2}{\pi}\tan^{-1}\left(\frac{v}{u}\right) \]
To map this back to the original region we use our f (z) function, but we have to break it down into a real part u(x, y) and an imaginary part v(x, y). So we write f (z) = (1 − x − iy)∕(1 + x + iy), multiply the top and bottom by 1 + x − iy and go through a bit more algebra, and end up here.
\[ u(x, y) = \frac{1 - x^2 - y^2}{(1 + x)^2 + y^2} \quad\text{and}\quad v(x, y) = \frac{-2y}{(1 + x)^2 + y^2} \]
We use these mapping functions to turn our solution on the half-plane into the solution we wanted for the disk.
\[ V(x, y) = -\frac{2}{\pi}\tan^{-1}\left(\frac{2y}{x^2 + y^2 - 1}\right) \]
As always, it’s worth checking that this solution does satisfy Laplace’s equation and
the boundary conditions. (If you plug in x² + y² = 1 this will blow up, but if you take a
limit as it approaches 1 you should get V = 1 on the upper half-circle and V = −1 on
the lower one. More simply, you can just sample a bunch of points very close to the
edge of the disk on the top and bottom.)
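The suggestion to sample points close to the edge of the disk can be sketched as follows. The function names are hypothetical, and `math.atan2(v, u)` is used for the phase of f; inside the disk u is positive, so this agrees with tan⁻¹(v∕u). The sketch also compares the real and imaginary parts of f (z) = (1 − z)∕(1 + z) against direct complex arithmetic.

```python
import math

def mobius(x, y):
    """(u, v) for f(z) = (1 - z)/(1 + z), computed with complex arithmetic."""
    w = (1 - complex(x, y)) / (1 + complex(x, y))
    return w.real, w.imag

def uv_formulas(x, y):
    """(u, v) from the closed-form real and imaginary parts of f(z)."""
    denom = (1 + x)**2 + y**2
    return (1 - x**2 - y**2) / denom, -2 * y / denom

def potential(x, y):
    """V(x, y) = -(2/pi) * phase of f(z), for points strictly inside the disk."""
    u, v = uv_formulas(x, y)
    return -(2 / math.pi) * math.atan2(v, u)

def laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference estimate of f_xx + f_yy."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4 * f(x, y)) / h**2

print(mobius(0.3, -0.4), uv_formulas(0.3, -0.4))   # the two should agree
r = 0.9999                                         # just inside the circle
for phi in (0.5, 1.5, 2.5):
    x, y = r * math.cos(phi), r * math.sin(phi)
    print(potential(x, y), potential(x, -y))       # near +1 and near -1
print(laplacian(potential, 0.3, 0.2))              # near 0
```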
find a Möbius transformation that would turn that circle into a line, and a line with two dif-
ferent boundaries is a problem we can solve (because it’s a special case of a wedge). You’ll
see other examples in the problems where a Möbius transformation will make the prob-
lem solvable, and you just have to do some algebra to find the right Möbius transformation
to use.
Having come this far, you are now in a position to solve the problem we posed in the
Motivating Exercise (Section 13.1). See Problem 13.131.
Stepping Back
Assuming you understand every step of the process we’ve outlined in this section, it’s still
easy to lose track of the details. It may be helpful to keep Figure 13.17 around for quick
reference. What you want to know is V (x, y): the solution to Laplace’s equation in some region
in the xy-plane. You find a function f (z) that maps that region to a simpler region where you
are able to solve Laplace’s equation with the mapped boundary conditions. That gives you
V (u, v), and you can then plug in the functions u(x, y) and v(x, y) to get the solution V (x, y)
that you wanted.
For finding the regions, finding the new boundaries, and translating the solution back in
the end, it’s the forward mapping f (z) that you need. Generally the only time you need the
backwards mapping z(f ) is when you are doing an inverse problem: start with a known simple
solution and apply a mapping z(f ) to find a complicated problem that you can solve with it.
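The recipe in this summary (find f (z), solve for V (u, v) on the simple region, then plug in u(x, y) and v(x, y)) can be expressed as a short sketch. The helper `pull_back` is a hypothetical name introduced here for illustration; the two examples reuse the mappings and mapped solutions from the washer and disk problems in this section.

```python
import cmath
import math

def pull_back(mapping, mapped_solution):
    """Build V(x, y) from f(z) and the solution V(u, v) on the mapped region."""
    def solution(x, y):
        w = mapping(complex(x, y))            # forward mapping f(z) = u + iv
        return mapped_solution(w.real, w.imag)
    return solution

# The quarter-washer: f(z) = ln z and T(u, v) = (200/pi) v.
washer_T = pull_back(cmath.log, lambda u, v: (200 / math.pi) * v)

# The unit disk: f(z) = (1 - z)/(1 + z) and V(u, v) = -(2/pi) * phase of u + iv.
disk_V = pull_back(lambda z: (1 - z) / (1 + z),
                   lambda u, v: -(2 / math.pi) * math.atan2(v, u))

print(washer_T(3.0, 3.0))   # a point on the 45-degree ray of the washer
print(disk_V(0.0, 0.5))     # a point in the upper half of the disk
```

Only the forward mapping appears in `pull_back`, which matches the remark that the forward mapping is what you need to translate the solution back.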
(d) Solve Laplace’s equation with the correct boundary conditions to find T (u, v) in region B.
(e) Write f (z) = u + iv as two real functions u(x, y) and v(x, y).
(f) Using the function u(x, y) and v(x, y) and the solution T (u, v), find the solution T (x, y) for region A.
(g) Verify that your solution satisfies the boundary conditions of the original problem. (Your solution to the last part should be long and complicated, and this part should nonetheless be very easy to do without a computer.)
(h) The point (4∕5, 0) is part of the original region A.
i. What temperature is it? Answer based on the temperature function you found for region A.
ii. What point does it correspond to in the mapped region B?
iii. What temperature is that point in? Answer based on the temperature function you found for region B.
iv. If the two temperatures did not come out the same, worry!
13.122 [This problem depends on Problem 13.121.]
(a) Plot the region described in Problem 13.121. On the same plot show contour lines of your solution T (x, y) and verify that the boundaries of the region lie on contours.
(b) Verify that your solution to Problem 13.121 satisfies Laplace’s equation. (This would be tedious without a computer.)
13.123 Walk-through: Inverse Problems. This is an inverse conformal mapping problem of the sort that we discussed in the Explanation (Section 13.9.1). You will start on the f -plane where the real axis is u and the imaginary axis is v, and solve a straightforward Laplace problem there. Then we will give you the function z(f ) to map this simple region to a complicated region on the z-plane where the real axis is x and the imaginary axis is y. You will end up with a difficult problem on the z-plane that you can solve by mapping it to the f -plane.
Your starting point will be a problem you already know how to solve: the potential between two concentric circles.
(a) Consider two concentric circles centered on the origin, one of radius 1∕2 and the other of radius 1. If the potential on the inner circle is −1 and the potential on the outer circle is 0, find the potential in the region between the two circles (assuming this region is free of charge). Call this plane the f plane where f = u + iv, so the axes are u and v and your answer should be a function V (u, v).
The next step is a mapping that turns this easy problem into a hard one. For this problem you’ll use z(f ) = √(f + 1). This turns both circles into more complicated closed loop shapes in the z-plane.
(b) Find the equations of these complicated loop shapes. We encourage you to plot them, but it’s not easy without a computer so we are not requiring it until Problem 13.124.
(c) Invert the mapping function we gave you to find the function that maps the complicated shapes back to the concentric circles.
(d) Write the function f (z) in the form u(x, y) + iv(x, y).
(e) Now you are ready to make up a problem. Your problem will look something like this: “Find the potential function in the complicated region described by ______ with boundary conditions ______ by using the mapping ______.” If you make up exactly the right problem, then you have already done almost all the work to solve it.
(f) Solve the problem you just wrote.
13.124 [This problem depends on Problem 13.123.] In Problem 13.123 you solved Laplace’s equation in the region in between the two curves generated by mapping concentric circles with the function z(f ) = √(f + 1). In this problem you’ll see what that region and your solution look like.
(a) Plot the region on which you solved the problem. (This is not the region between the two concentric circles we gave you; it is a more complicated region that you solved by mapping it to those concentric circles.)
(b) Using the solution V (x, y) you found, make a contour plot showing the equipotential lines (lines of constant V ) in the region between these two curves. Each of the boundary curves should be an equipotential.
Problems 13.125–13.127 are inverse problems of the kind that we discussed in the Explanation (Section 13.9.1). You might want to work through Problem 13.123 as a model before tackling these. In each problem you will be given a simple region R in the f = u + iv plane, a set of boundary conditions for a function T (u, v), and a transformation that maps R to a more complicated region in the z = x + iy plane. For each problem…
(a) Solve Laplace’s equation in the region R to find T (u, v) with the given boundary conditions.
(b) Use z(f ) to find the region in the z-plane that corresponds to region R in the f -plane. (In some cases you can easily sketch it; in other cases you will only have a formula that is not easy to plot without a computer.)
(c) Invert z(f ) to find the function f (z). Use that function and your solution from Part (a) to find the solution T (x, y) to Laplace’s equation in the region you found in Part (b).
(d) Plot the boundaries of the xy region and plot the isotherms of T (x, y) in that region.
13.125 R is the ring between a circle of radius 1 and a circle of radius 2, both centered on the origin. T = 1 on the inner circle and T = 2 on the outer one. z(f ) = 1∕(3 + f ). Hint: Because this is a Möbius transformation the circles will get mapped to circles. The new circles both have centers on the real line, so you can easily figure out what circles they are.
13.126 R is an upper-right quarter-disk of radius 3 centered on f = 1. T = 0 on the lower edge, T = 1 on the left edge, and the curved edge is insulating. (The derivative normal to the curved boundary equals 0.) z(f ) = √f . Hint: To find T (u, v) in region R first find it in a quarter-circle centered on the origin and then shift your answer by 1 in the u-direction.
[Figure: region R in the f-plane, with axes Re(f) and Im(f); T = 1 on the left edge and T = 0 on the lower edge between 1 and 4.]
13.127 Use the same region and boundary conditions as Problem 13.126, but with the transformation z(f ) = f². You do not need to have done that problem to do this one. (Warning: The answer will be messy.)
13.128 A circle of radius R₁ is inside a larger circle of radius R₂, and the two circles are tangent to each other (meaning they touch at one point). The inner circle is at temperature T₁ and the outer one at temperature T₂. Find the steady-state temperature in the region between them. Hint: Remember that a Möbius transformation whose pole touches a circle turns it into a line, so you can place your uv-axes in such a way that you can do a simple Möbius transformation to turn both of these circles into lines.
13.129 The Möbius transformation f (z) = 1∕(z − 2) maps the circle |z| = 1 to a circle that we will call C and the circle |z| = 2 to a line that we will call L.
(a) How do we know that |z| = 1 maps to a circle, and |z| = 2 to a line? Hint: It isn’t obvious looking at these curves; it is a property of Möbius transformations.
(b) Plugging z = e^{i𝜙} into the transformation will give you C but it’s easier to start knowing you’re going to get a circle. Find the two points at the edges of the diameter of |z| = 1 along the real axis, see where they map to, and that will give you two points that define a diameter of C. Based on this, draw circle C.
(c) It’s a bit easier to find line L; just find two points! Based on this, draw L on the same graph where you drew C.
(d) If C is held at potential V_C and L is held at potential V_L find the electric potential in the region exterior to C and to the left of L.
13.130 [This problem depends on Problem 13.129.] Use your solution to Problem 13.129 to make a plot showing the curves of constant potential (“equipotentials”) in the region bounded by the circle and the line.
13.131 The region R is the semi infinite slab 0 ≤ x ≤ 𝜋∕2, 0 ≤ y < ∞. The temperature is held at 0 on the bottom and right edges, and at 1 on the left edge. Use the transformation f (z) = sin z to map R to a different region, solve Laplace’s equation with appropriate boundary conditions on the new shape, and transform your solution back to find T (x, y) in R. (The hardest
APPENDIX A
\[ a_3(x)\frac{d^3y}{dx^3} + a_2(x)\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y = f(x) \]
\[ \frac{d^3y}{dx^3} + e^x\frac{dy}{dx} - 3y = 4x^5 \]
is a third-order differential equation, because it has a third derivative, and it has no fourth
derivatives, fifth derivatives, etc.
\[ x\frac{d^2y}{dx^2} + (\cos x)y = f(x) \]
is homogeneous if f (x) = 0, but inhomogeneous for any other value of f (x) (including a
constant).
\[ ax^2 + by^2 + cx + dy + e = 0 \]
describes a parabola if a or b is zero, an ellipse if a and b have the same sign, and a hyperbola if
a and b have opposite signs. (If a and b are both zero then it isn’t a second-order equation
at all: it’s a line.)
By analogy, the second-order, linear partial differential equation:
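\[ A\frac{\partial^2 z}{\partial x^2} + B\frac{\partial^2 z}{\partial y^2} + C\frac{\partial z}{\partial x} + D\frac{\partial z}{\partial y} + E = 0 \]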
is described as parabolic if A or B is zero, elliptic if they have the same sign, and hyperbolic if
they have different signs. (If A and B are both zero then it isn’t a second-order differential
equation at all.)
Why do you care?
∙ An elliptic equation typically involves only spatial derivatives, and its boundary
conditions are usually specified on all borders of the domain of interest. A parabolic
or hyperbolic equation typically involves both time and space. The spatial boundary
conditions are usually given on all spatial borders, but the initial conditions are
typically given at only one value of time. These are not guaranteed mathematical
truths, but they are good rules of thumb.
∙ A parabolic equation is second order in one variable and first order in the other. When
separated, it is likely to show exponential growth or decay in the first-order variable, but
something more complicated (trig functions, Bessel functions, etc) in the second-order
variable.
Note: If a second-order algebra equation also has an xy term, you can do a linear transformation to a new set of coordinates where that term vanishes and then use the rule above to determine which shape it describes. Likewise, if a second-order PDE has a 𝜕²z∕(𝜕x 𝜕y) term you can do a linear transformation to a new set of independent variables for which that term will vanish.
We do not discuss this technique in this book: see a textbook on PDEs for more details.
APPENDIX B
Taylor Series
\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \ldots \]
\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \ldots \]
Note that your ability to use this rule depends, not only on the Taylor series you are using, but on the value you are plugging in. For instance the series 1 + x + x² + x³ + … alternates for x < 0 but not for x > 0.
2. A more general formula—in that it works whether the series alternates or not—is the
“Lagrange remainder.”
\[ R_n = \frac{(x - a)^{n+1}}{(n + 1)!} f^{(n+1)}(z) \qquad \text{The Lagrange remainder} \]
Here f^{(n+1)} is the (n + 1)st derivative of f and z is a number between x and a. Note that
if z = x then the Lagrange remainder is the next term in the Taylor series—the same
upper bound we saw for alternating series above.
In practice you can’t generally use this to find the error because you don’t know
z, but you can use it to place an upper bound on the error by finding the value of z in
that domain that maximizes this formula.
3. If a Maclaurin series converges on the interval |x| < 1 and has monotonically decreasing absolute value then R_n ≤ |c_{n+1} x^{n+1}|∕(1 − |x|).
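To make the Lagrange remainder in item 2 concrete, here is a minimal sketch for the specific case f (x) = e^x expanded about a = 0. The function names are hypothetical. The choice of e^x is an assumption made for convenience: every derivative of e^x is e^x, so the largest value of |f^{(n+1)}(z)| between 0 and x is easy to write down.

```python
import math

def taylor_exp(x, n):
    """Maclaurin polynomial of e^x through the x^n term."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def lagrange_bound_exp(x, n):
    """Upper bound on |R_n| for e^x about a = 0.

    Every derivative of e^x is e^x, which is increasing, so its largest
    value for z between 0 and x is e^max(0, x).
    """
    max_derivative = math.exp(max(0.0, x))
    return abs(x)**(n + 1) / math.factorial(n + 1) * max_derivative

x, n = 1.0, 4
actual_error = abs(math.exp(x) - taylor_exp(x, n))
# The actual error should not exceed the bound.
print(actual_error, lagrange_bound_exp(x, n))
```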
APPENDIX C
\[ S_n = \sum_{i=1}^{n} a_i = a_1 + a_2 + a_3 + \ldots + a_n \]
The sum of an infinite series is defined as the limit of the partial sums.
\[ \sum_{n=1}^{\infty} a_n = \lim_{n \to \infty} S_n \]
In some rare cases such as “telescoping series” you can find a closed-form expression for
the nth partial sum and then use this definition directly to determine the convergence or
divergence of a series. For other cases we have the tests.
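As an illustration of using the definition directly, the sketch below takes one standard telescoping series, with terms a_i = 1∕(i(i + 1)) = 1∕i − 1∕(i + 1). This series is chosen here as an example and is not taken from the text. Its partial sums have the closed form S_n = 1 − 1∕(n + 1), which approaches 1.

```python
from fractions import Fraction

def partial_sum(n):
    """S_n for the series with terms a_i = 1/(i(i+1)), summed exactly."""
    return sum(Fraction(1, i * (i + 1)) for i in range(1, n + 1))

def closed_form(n):
    """Closed form S_n = 1 - 1/(n+1), from a_i = 1/i - 1/(i+1)."""
    return 1 - Fraction(1, n + 1)

for n in (1, 10, 100):
    # The two columns should match exactly, and both approach 1.
    print(n, partial_sum(n), closed_form(n))
```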
nth term test
∙ If lim_{n→∞} a_n ≠ 0 then the series diverges.
∙ If lim_{n→∞} a_n = 0 this test is inconclusive.
Notes: 1. Because it is quick and easy, this is generally the first test to try for a strictly numerical series. 2. This test can only prove divergence, never convergence!
Geometric Series, a_n = a_1 r^{n−1}
∙ If |r | < 1 it converges to a_1∕(1 − r)
∙ If |r | ≥ 1 it diverges
p-series, a_n = 1∕n^p
∙ If p > 1 the series converges
∙ If p ≤ 1 the series diverges
Notes: Don’t get these confused with geometric series! 1∕2^n is geometric; 1∕n² is a p-series.
Alternating Series Test
If the alternating series ∑_{n=1}^{∞} (−1)^{n−1} b_n = b_1 − b_2 + b_3 − b_4 + … (b_n ≥ 0) satisfies…
1. b_{n+1} ≤ b_n for all n, and
2. lim_{n→∞} b_n = 0
Notes: Note that the b_n terms here are not the individual terms of the original series; they are the terms with the alternator removed.
The discussion above applies to real series. Section 13.7 discusses convergence of complex
power series. It includes a method for using complex series to find the interval of conver-
gence of real power series, as an alternative to using the ratio test.
APPENDIX D
Curvilinear Coordinates
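From Cartesian to cylindrical: 𝜌² = x² + y², tan 𝜙 = y∕x, z = z
From Cartesian to spherical: r² = x² + y² + z², cos 𝜃 = z∕r, tan 𝜙 = y∕x
From cylindrical to Cartesian: x = 𝜌 cos 𝜙, y = 𝜌 sin 𝜙, z = z
From cylindrical to spherical: r² = 𝜌² + z², cos 𝜃 = z∕r, 𝜙 = 𝜙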
Caution: the equation 𝜙 = tan−1 (y∕x) will always give you an answer in the range −𝜋∕2 ≤ 𝜙 ≤ 𝜋∕2.
For a point in the second or third quadrant, add 𝜋 to get the correct 𝜙.
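The caution above can be seen in a few lines of code. This is a small sketch with hypothetical function names. It compares the hand correction (adding 𝜋 when x < 0) with Python's two-argument `math.atan2`, which takes the signs of both coordinates into account. The two can differ by a multiple of 2𝜋, so they are compared as angles.

```python
import math

def phi_from_atan(x, y):
    """Angle from tan^-1(y/x), adding pi in the second and third quadrants."""
    phi = math.atan(y / x)      # assumes x != 0
    if x < 0:
        phi += math.pi
    return phi

def phi_from_atan2(x, y):
    """Angle from the two-argument arctangent, in the range -pi to pi."""
    return math.atan2(y, x)

def same_angle(a, b, tol=1e-12):
    """True if a and b differ by a multiple of 2*pi (within tol)."""
    return abs(math.remainder(a - b, 2 * math.pi)) < tol

for x, y in [(1, 1), (-1, 1), (-1, -1), (1, -1)]:
    print((x, y), phi_from_atan(x, y), phi_from_atan2(x, y))
```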
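\[ d\vec{s}_{\text{Cartesian}} = dx\,\hat{i} + dy\,\hat{j} + dz\,\hat{k} \]
\[ d\vec{s}_{\text{cylindrical}} = d\rho\,\hat{\rho} + \rho\,d\phi\,\hat{\phi} + dz\,\hat{k} \]
\[ d\vec{s}_{\text{spherical}} = dr\,\hat{r} + r\,d\theta\,\hat{\theta} + r\sin\theta\,d\phi\,\hat{\phi} \]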
From these differential lengths you can immediately get differential volumes.
\[ dV_{\text{Cartesian}} = dx\,dy\,dz \]
\[ dV_{\text{cylindrical}} = \rho\,d\rho\,d\phi\,dz \]
\[ dV_{\text{spherical}} = r^2\sin\theta\,dr\,d\theta\,d\phi \]
The Jacobians for these coordinate systems are just these formulas without the differentials, e.g. r² sin 𝜃 for spherical coordinates.
In polar coordinates there’s no dz so dA_polar = 𝜌 d𝜌 d𝜙.
Unit Vectors
The symbols î, ĵ, and k̂ represent unit vectors (vectors of magnitude 1) in the x-, y-, and z-directions, respectively. The Cartesian unit vectors are invariant: that is, î is the same vector
no matter where you are. By contrast, curvilinear unit vectors are position dependent. For
instance, r̂ means a unit vector in the r -direction, which means it points directly away from
the origin—so r̂ at one point is not the same vector as r̂ at another point.
The following table gives formulas for the curvilinear unit vectors as a function of position,
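expressed in Cartesian and the native coordinate systems. Once again polar coordinates are simply cylindrical coordinates without the k̂.
Cylindrical
\[ \hat{\rho} = \frac{x\,\hat{i} + y\,\hat{j}}{\sqrt{x^2 + y^2}} = \cos\phi\,\hat{i} + \sin\phi\,\hat{j} \]
\[ \hat{\phi} = \frac{-y\,\hat{i} + x\,\hat{j}}{\sqrt{x^2 + y^2}} = -\sin\phi\,\hat{i} + \cos\phi\,\hat{j} \]
\[ \hat{k} = \hat{k} \]
Spherical
\[ \hat{r} = \frac{x\,\hat{i} + y\,\hat{j} + z\,\hat{k}}{\sqrt{x^2 + y^2 + z^2}} = \sin\theta\cos\phi\,\hat{i} + \sin\theta\sin\phi\,\hat{j} + \cos\theta\,\hat{k} \]
\[ \hat{\phi} = \frac{-y\,\hat{i} + x\,\hat{j}}{\sqrt{x^2 + y^2}} = -\sin\phi\,\hat{i} + \cos\phi\,\hat{j} \]
\[ \hat{\theta} = \frac{xz\,\hat{i} + yz\,\hat{j} - \left(x^2 + y^2\right)\hat{k}}{\sqrt{x^2 + y^2}\,\sqrt{x^2 + y^2 + z^2}} = \cos\theta\cos\phi\,\hat{i} + \cos\theta\sin\phi\,\hat{j} - \sin\theta\,\hat{k} \]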
APPENDIX E
Matrices
For a clockwise rotation you replace Δ𝜙 with −Δ𝜙, which just switches the signs on the
sin(Δ𝜙) and − sin(Δ𝜙) terms. Remember that rotating the axes clockwise (a passive trans-
formation) is accomplished with the same matrix as rotating the vectors counterclockwise
(an active transformation).
The dimensions of a matrix are the number of rows, and the number of columns, in that
order. (𝐑 has dimensions 2 × 3.) Each element is referred to by its row followed by its column,
written as a subscript. (R23 = 8.)
Column A column is a vertical list of numbers. Matrix 𝐑 has three columns, 𝐒 two. Section 6.2.
Column Matrix A column matrix is a matrix with just one column. Section 6.2.
Rank The rank of a matrix is the number of independent rows. It can be shown that this is
equal to the number of independent columns. Section 7.4 (see felderbooks.com).
Row A row is a horizontal list of numbers. Matrices 𝐑 and 𝐒 have two rows each. Section 6.2.
Row Matrix A row matrix is a matrix with just one row. Section 6.2.
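Transpose The “transpose” 𝐌^T of a matrix 𝐌 is obtained by turning all its rows into columns. (𝐌^T_{ij} = 𝐌_{ji}.) Section 7.1.
\[ \mathbf{R}^T = \begin{pmatrix} 3 & 1 \\ 2 & 5 \\ 4 & 8 \end{pmatrix} \qquad \mathbf{S}^T = \begin{pmatrix} 6 & 3 \\ 2 & 5 \end{pmatrix} \]
\[ \begin{pmatrix} 2 & 3 & 0 & 0 & 0 \\ 1 & 4 & 0 & 0 & 0 \\ 0 & 0 & 1 & 3 & 1 \\ 0 & 0 & 2 & 4 & 4 \\ 0 & 0 & 3 & 1 & 5 \end{pmatrix} \qquad \text{A block diagonal matrix} \]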
Determinant The determinant is the product of the “eigenvalues” of a matrix. The determi-
nant is zero if the rows of the matrix are linearly dependent. (If the rows are then the
columns will be also and vice-versa.) Section 6.7.
Diagonal Matrix In a diagonal matrix every element is zero except the ones on the “main
diagonal.” Section 6.2.
Identity Matrix The identity matrix 𝐈 has 1s along the “main diagonal” and 0s everywhere
else. For any matrix 𝐌 with the same dimensions as a given identity matrix, 𝐌𝐈 =
𝐈𝐌 = 𝐌. Section 6.6.
Inverse Matrix The inverse matrix 𝐌−1 of a square matrix 𝐌 is defined by the property
𝐌𝐌−1 = 𝐌−1 𝐌 = 𝐈 (where 𝐈 is the “identity matrix”). Section 6.6.
\[ \text{The inverse of } \begin{pmatrix} a & c \\ b & d \end{pmatrix} \text{ is } \frac{1}{ad - bc}\begin{pmatrix} d & -c \\ -b & a \end{pmatrix}. \]
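The 2 × 2 inverse formula can be checked with exact arithmetic. The sketch below uses hypothetical helper names and stores a matrix as a list of rows, so the matrix with rows (a, c) and (b, d) is written `[[a, c], [b, d]]`. The example entries are those of 𝐒, read off from the transpose 𝐒^T listed earlier in this appendix.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of the matrix with rows (a, c) and (b, d), via the formula above."""
    (a, c), (b, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular, so it has no inverse")
    return [[Fraction(d, det), Fraction(-c, det)],
            [Fraction(-b, det), Fraction(a, det)]]

def matmul_2x2(p, q):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

S = [[6, 2], [3, 5]]          # integer entries, so Fraction stays exact
S_inv = inverse_2x2(S)
print(S_inv)
print(matmul_2x2(S, S_inv))   # should be the identity matrix
```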
Main Diagonal (also principal diagonal, primary diagonal, or leading diagonal) The main diago-
nal is the list of items going from the upper left to the lower right (Mii ). Section 6.2.
Square Matrix A square matrix has the same number of rows as columns. Matrix 𝐒 is a square
matrix, 𝐑 is not. Section 6.2.
𝐌^† = 𝐌^{−1}. For a real matrix unitary and “orthogonal” mean the same thing. Section 7.3.
APPENDIX F
Vector Calculus
Operator      Notation    What kind of field goes in?    What kind of field comes out?
gradient      ∇⃗f          scalar                         vector
divergence    ∇⃗ ⋅ f⃗       vector                         scalar
curl          ∇⃗ × f⃗       vector                         vector
Laplacian     ∇²f         scalar                         scalar
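Cartesian Coordinates
Differential Length: \( d\vec{s} = dx\,\hat{i} + dy\,\hat{j} + dz\,\hat{k} \)
Gradient of f : \( \frac{\partial f}{\partial x}\hat{i} + \frac{\partial f}{\partial y}\hat{j} + \frac{\partial f}{\partial z}\hat{k} \)
Divergence of f⃗: \( \frac{\partial f_x}{\partial x} + \frac{\partial f_y}{\partial y} + \frac{\partial f_z}{\partial z} \)
Curl of f⃗: \( \left(\frac{\partial f_z}{\partial y} - \frac{\partial f_y}{\partial z}\right)\hat{i} + \left(\frac{\partial f_x}{\partial z} - \frac{\partial f_z}{\partial x}\right)\hat{j} + \left(\frac{\partial f_y}{\partial x} - \frac{\partial f_x}{\partial y}\right)\hat{k} \)
Laplacian of f : \( \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} \)
Cylindrical Coordinates
Differential Length: \( d\vec{s} = d\rho\,\hat{\rho} + \rho\,d\phi\,\hat{\phi} + dz\,\hat{k} \)
Gradient of f : \( \frac{\partial f}{\partial \rho}\hat{\rho} + \frac{1}{\rho}\frac{\partial f}{\partial \phi}\hat{\phi} + \frac{\partial f}{\partial z}\hat{k} \)
Divergence of f⃗: \( \frac{1}{\rho}\frac{\partial}{\partial \rho}\left(\rho f_\rho\right) + \frac{1}{\rho}\frac{\partial f_\phi}{\partial \phi} + \frac{\partial f_z}{\partial z} \)
Curl of f⃗: \( \left(\frac{1}{\rho}\frac{\partial f_z}{\partial \phi} - \frac{\partial f_\phi}{\partial z}\right)\hat{\rho} + \left(\frac{\partial f_\rho}{\partial z} - \frac{\partial f_z}{\partial \rho}\right)\hat{\phi} + \frac{1}{\rho}\left[\frac{\partial}{\partial \rho}\left(\rho f_\phi\right) - \frac{\partial f_\rho}{\partial \phi}\right]\hat{k} \)
Laplacian of f : \( \frac{\partial^2 f}{\partial \rho^2} + \frac{1}{\rho}\frac{\partial f}{\partial \rho} + \frac{1}{\rho^2}\frac{\partial^2 f}{\partial \phi^2} + \frac{\partial^2 f}{\partial z^2} \)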
Spherical Coordinates
Fundamental Theorems
APPENDIX G
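\[ \int x\sin(ax)\,dx = -\frac{x}{a}\cos(ax) + \frac{1}{a^2}\sin(ax) + C \quad (a \neq 0) \]
\[ \int x\cos(ax)\,dx = \frac{x}{a}\sin(ax) + \frac{1}{a^2}\cos(ax) + C \quad (a \neq 0) \]
\[ \int x e^{iax}\,dx = \left(-\frac{ix}{a} + \frac{1}{a^2}\right)e^{iax} + C \quad (a \neq 0) \]
\[ \int e^{kx}\sin(bx)\,dx = \frac{e^{kx}}{k^2 + b^2}\left[k\sin(bx) - b\cos(bx)\right] + C \quad (k^2 + b^2 \neq 0) \]
\[ \int e^{kx}\cos(bx)\,dx = \frac{e^{kx}}{k^2 + b^2}\left[k\cos(bx) + b\sin(bx)\right] + C \quad (k^2 + b^2 \neq 0) \]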
After you integrate remember that n (the summation index) is always an integer, and for any
integer n:
sin(𝜋n) = 0, cos(𝜋n) = (−1)^n, e^{i𝜋n} = (−1)^n
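\[ f(x) = a_0 + \sum_{n=1}^{\infty} a_n\cos\left(\frac{n\pi}{L}x\right) + \sum_{n=1}^{\infty} b_n\sin\left(\frac{n\pi}{L}x\right) \quad\text{where} \]
\[ a_0 = \frac{1}{2L}\int_{-L}^{L} f(x)\,dx \qquad a_n = \frac{1}{L}\int_{-L}^{L} f(x)\cos\left(\frac{n\pi}{L}x\right)dx \qquad b_n = \frac{1}{L}\int_{-L}^{L} f(x)\sin\left(\frac{n\pi}{L}x\right)dx \]
The complex exponential Fourier series for a function f (x) with period 2L is:
\[ f(x) = \sum_{n=-\infty}^{\infty} c_n e^{i\left(\frac{n\pi}{L}\right)x} \quad\text{where}\quad c_n = \frac{1}{2L}\int_{-L}^{L} f(x)\,e^{-i\left(\frac{n\pi}{L}\right)x}\,dx \]
You can evaluate any of these integrals over any full period. \( \int_{-L}^{L} \) is the same as \( \int_{0}^{2L} \).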
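\[ f(x) = \sum_{n=1}^{\infty} b_n\sin\left(\frac{n\pi}{L}x\right) \quad\text{where}\quad b_n = \frac{2}{L}\int_{0}^{L} f(x)\sin\left(\frac{n\pi}{L}x\right)dx \]
\[ f(x) = a_0 + \sum_{n=1}^{\infty} a_n\cos\left(\frac{n\pi}{L}x\right) \]
\[ \text{where}\quad a_0 = \frac{1}{L}\int_{0}^{L} f(x)\,dx \quad\text{and}\quad a_n = \frac{2}{L}\int_{0}^{L} f(x)\cos\left(\frac{n\pi}{L}x\right)dx \]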
Either one of these series will in general converge to f (x) at all points on the interval
0 < x < L where f is continuous. On the interval −L < x < 0 the sine series will create an
“odd extension” of the function and the cosine series an “even extension.”
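As a worked illustration of the sine-series coefficient formula, the sketch below computes b_n numerically for the example function f (x) = x on 0 < x < L and compares it with the closed form 2L(−1)^{n+1}∕(n𝜋). That closed form follows from the x sin(ax) integral in this appendix with a = n𝜋∕L, together with sin(𝜋n) = 0 and cos(𝜋n) = (−1)^n. The function names, the choice of f, and the midpoint-rule integrator are assumptions made for this example.

```python
import math

def sine_coefficient(f, n, L, steps=20000):
    """b_n = (2/L) * integral from 0 to L of f(x) sin(n pi x / L) dx (midpoint rule)."""
    dx = L / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        total += f(x) * math.sin(n * math.pi * x / L)
    return (2 / L) * total * dx

def exact_coefficient_for_x(n, L):
    """Closed form of b_n for f(x) = x on 0 < x < L."""
    return 2 * L * (-1)**(n + 1) / (n * math.pi)

L = 2.0
for n in range(1, 5):
    # Numerical and closed-form values should agree closely.
    print(n, sine_coefficient(lambda x: x, n, L), exact_coefficient_for_x(n, L))
```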
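\[ \sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\left[ A_{mn}\cos\left(\frac{m\pi}{L}x\right)\cos\left(\frac{n\pi}{W}y\right) + B_{mn}\cos\left(\frac{m\pi}{L}x\right)\sin\left(\frac{n\pi}{W}y\right) + C_{mn}\sin\left(\frac{m\pi}{L}x\right)\cos\left(\frac{n\pi}{W}y\right) + D_{mn}\sin\left(\frac{m\pi}{L}x\right)\sin\left(\frac{n\pi}{W}y\right) \right] \]
where
\[ A_{mn} = \frac{1}{QLW}\int_{-W}^{W}\int_{-L}^{L} f(x, y)\cos\left(\frac{m\pi}{L}x\right)\cos\left(\frac{n\pi}{W}y\right)dx\,dy, \]
\[ B_{mn} = \frac{1}{QLW}\int_{-W}^{W}\int_{-L}^{L} f(x, y)\cos\left(\frac{m\pi}{L}x\right)\sin\left(\frac{n\pi}{W}y\right)dx\,dy \]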
and similarly for C and D. The Q in the denominator is 4 if m and n are both zero, 2 if
one of them is zero, and 1 if neither one is zero. (This sounds strange but it’s a consistent
extrapolation of a0 in a single-variable series.)
If the function is even in x then all terms with sin x will go to zero, if a function is odd in
x then all terms with cos x will go to zero, and the same is true of course for y.
The complex exponential form is:
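$$f(x,y) = \sum_{m=-\infty}^{\infty}\sum_{n=-\infty}^{\infty} c_{mn}\,e^{i\left(\frac{m\pi}{L}\right)x}\,e^{i\left(\frac{n\pi}{W}\right)y}$$

$$\text{where}\quad c_{mn} = \frac{1}{4LW}\int_{-W}^{W}\int_{-L}^{L} f(x,y)\,e^{-i\left(\frac{m\pi}{L}\right)x}\,e^{-i\left(\frac{n\pi}{W}\right)y}\,dx\,dy$$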
Fourier Transforms
Warning: Different sources use slightly different equations for Fourier transforms. Your formulas
will work as long as you stick consistently to the conventions we present here, or to any other self-
consistent set. But if you Fourier transform with our equation, and then inverse Fourier transform
with someone else’s, you may get an incorrect answer.
The function f̂(p) is the “Fourier transform” of f (x), and can be calculated from:
$$\hat{f}(p) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\,e^{-ipx}\,dx$$
For a function defined on the half-infinite interval [0, ∞) you can define a Fourier sine or
cosine integral by making an odd or even extension to negative values.
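$$f(x) = \int_{0}^{\infty}\hat{f}_s(p)\sin(px)\,dp \quad\text{where}\quad \hat{f}_s(p) = \frac{2}{\pi}\int_{0}^{\infty} f(x)\sin(px)\,dx$$

or

$$f(x) = \int_{0}^{\infty}\hat{f}_c(p)\cos(px)\,dp \quad\text{where}\quad \hat{f}_c(p) = \frac{2}{\pi}\int_{0}^{\infty} f(x)\cos(px)\,dx$$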
Simplest case: $f(x) = e^{-x^2} \;\rightarrow\; \hat{f}(p) = \dfrac{1}{2\sqrt{\pi}}\,e^{-p^2/4}$

General case: $f(x) = Ae^{-[(x-x_0)/w]^2} \;\rightarrow\; \hat{f}(p) = \dfrac{Aw}{2\sqrt{\pi}}\,e^{-ipx_0}\,e^{-w^2p^2/4}$
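Because prefactors differ between conventions, it can help to confirm a tabulated transform numerically. The sketch below is an illustration under stated assumptions: it uses the convention given above (a factor of 1/2π and e^{−ipx} in the forward transform), a hypothetical helper `fourier_transform` based on the midpoint rule, and arbitrary sample values for A, x0, w and p. It checks the simplest case directly and checks the magnitude of the general case.

```python
import cmath
import math

def fourier_transform(f, p, x_max=12.0, n=6000):
    """Midpoint-rule estimate of (1/2pi) * integral of f(x) e^{-ipx} dx over [-x_max, x_max]."""
    h = 2 * x_max / n
    total = 0j
    for k in range(n):
        x = -x_max + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * p * x)
    return total * h / (2 * math.pi)

def gaussian(A, x0, w):
    """Return the function x -> A exp(-[(x - x0)/w]^2)."""
    return lambda x: A * math.exp(-((x - x0) / w) ** 2)

# Simplest case: f(x) = exp(-x^2), evaluated at an arbitrary sample point p0.
p0 = 1.5
simple = fourier_transform(gaussian(1.0, 0.0, 1.0), p0)
simple_exact = math.exp(-(p0 ** 2) / 4) / (2 * math.sqrt(math.pi))

# General case: sample values chosen only for illustration.
A, x0, w, p = 2.0, 1.5, 0.5, 3.0
general = fourier_transform(gaussian(A, x0, w), p)
general_magnitude = A * w / (2 * math.sqrt(math.pi)) * math.exp(-(w ** 2) * p ** 2 / 4)
```

The shift x0 only changes the phase of the transform, so `abs(general)` should match `general_magnitude` regardless of where the Gaussian is centered.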
APPENDIX H
Laplace Transforms
$$\mathcal{L}[f(t)] = F(s) = \int_{0}^{\infty} f(t)\,e^{-st}\,dt$$
Chapter 10 introduces Laplace transforms and uses them to solve ordinary differential
equations and coupled differential equations. Chapter 11 uses Laplace transforms to solve
partial differential equations. To find an inverse Laplace transform by hand requires a
contour integral on the complex plane, so this is discussed in Chapter 13.
The Laplace transform is a linear operator.
$$\mathcal{L}[af(t) + bg(t)] = aF(s) + bG(s)$$
The Laplace transform turns derivatives and integrals into multiplication and division.
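$$\mathcal{L}\left[\frac{\partial^{(n)} f}{\partial t^{(n)}}\right] = s^nF(s) - s^{n-1}f(0) - s^{n-2}\frac{\partial f}{\partial t}(0) - s^{n-3}\frac{\partial^2 f}{\partial t^2}(0) - \ldots - \frac{\partial^{(n-1)} f}{\partial t^{(n-1)}}(0)$$

$$\mathcal{L}\left[\int_{0}^{t} f(\tau)\,d\tau\right] = \frac{F(s)}{s}$$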
You can find much more extensive tables online, or ask your favorite mathematical
software package for Laplace and inverse Laplace transforms. When using a table, don’t
forget the property of linearity! For instance, you can tell immediately from the table
above that:
$$\mathcal{L}\left[3\sin(5t) + 7e^{11t}\right] = 15/(s^2 + 25) + 7/(s - 11)$$
Other cases may take more work. For instance, suppose you want $\mathcal{L}^{-1}\left[1/(2s+5)^7\right]$. Your first
step might be to pull out the 2 on the bottom.

$$\mathcal{L}^{-1}\left[\frac{1}{2^7}\,\frac{1}{(s+5/2)^7}\right]$$

That looks like the formula for $t^ne^{-at}$ with n = 6 and a = 5/2. It's missing the 6! in the numerator,
but we can put that in too.

$$\mathcal{L}^{-1}\left[\frac{1}{(2s+5)^7}\right] = \frac{1}{(2^7)(6!)}\,\mathcal{L}^{-1}\left[\frac{6!}{(s+5/2)^7}\right] = \frac{t^6e^{-5t/2}}{(2^7)(6!)}$$
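If no computer algebra system is at hand, both results above can still be spot-checked by evaluating the defining integral numerically at one value of s. This is a rough sketch under assumptions: `laplace_transform` is a hypothetical midpoint-rule helper, the integral is truncated at t_max, and the sample values of s are arbitrary (s must exceed 11 for the transform of e^{11t} to converge).

```python
import math

def laplace_transform(f, s, t_max=40.0, n=40000):
    """Midpoint-rule estimate of the integral of f(t) e^{-st} from 0 to t_max.

    Assumes the integrand is negligible beyond t_max.
    """
    h = t_max / n
    return sum(f((k + 0.5) * h) * math.exp(-s * (k + 0.5) * h) for k in range(n)) * h

# Linearity example, checked at the sample point s = 12.
s = 12.0
lhs = laplace_transform(lambda t: 3 * math.sin(5 * t) + 7 * math.exp(11 * t), s)
rhs = 15 / (s ** 2 + 25) + 7 / (s - 11)

# Inverse-transform example: the transform of t^6 e^{-5t/2} / (2^7 6!) should be 1/(2s+5)^7.
s2 = 1.0
inv_numeric = laplace_transform(
    lambda t: t ** 6 * math.exp(-2.5 * t) / (2 ** 7 * math.factorial(6)), s2)
inv_exact = 1 / (2 * s2 + 5) ** 7
```

Agreement at a single s does not prove the identity, but a mismatch immediately signals an algebra slip such as a forgotten factor of 2^7.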
Try a few! It’s easy to check your answers (to all the problems in this appendix) if you have a
computer that can take Laplace transforms.
1. $\mathcal{L}[5t^2 + 6t + 7]$
2. $\mathcal{L}[4H(t-1) - 4H(t-2)]$
3. $\mathcal{L}[\sinh t]$ Hint: $\sinh t = (e^t - e^{-t})/2$.
4. $\mathcal{L}^{-1}[1/s^3]$
5. $\mathcal{L}^{-1}[1/(3s+4)^2]$
6. $\mathcal{L}^{-1}[1/(cs+d)^5]$
7. $\mathcal{L}^{-1}[1/(2s^2+18)]$
Partial Fractions
You may have learned the technique of partial fractions as a trick for finding integrals, but
it is also very useful for looking up inverse Laplace transforms.
Problem:
Use partial fractions and the table above to find $\mathcal{L}^{-1}\left[11/(4s^2+5s-6)\right]$.
Solution:
Noting that the bottom factors into (4s − 3)(s + 2), we can rewrite the fraction. (This
is not a property of Laplace transforms, it’s just algebra.)
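$$\frac{11}{(4s-3)(s+2)} = \frac{A}{4s-3} + \frac{B}{s+2}$$

We want this to be the same as the fraction we started with. Putting the right side over the common
denominator gives the numerator $A(s+2) + B(4s-3)$, which must equal 11. The denominators are the
same so the numerators must match term by term.

$$A + 4B = 0 \quad\text{and}\quad 2A - 3B = 11$$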
You can solve these with substitution, elimination or matrices to find A = 4 and
B = −1. So we can rewrite our original problem in a different form. Then we can use
linearity to look up each of these in the table individually and combine the answers.
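$$\mathcal{L}^{-1}\left[\frac{11}{4s^2+5s-6}\right] = \mathcal{L}^{-1}\left[\frac{4}{4s-3} - \frac{1}{s+2}\right] = \mathcal{L}^{-1}\left[\frac{1}{s-3/4} - \frac{1}{s+2}\right] = e^{3t/4} - e^{-2t}$$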
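The two linear equations for A and B are small enough to solve by hand, but the same step is easy to automate. The sketch below is illustrative only: `solve_2x2` is a hypothetical helper that applies Cramer's rule with exact rational arithmetic, and the final lines confirm the decomposition at one arbitrary sample value of s.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = Fraction(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("the system has no unique solution")
    x = Fraction(b1 * a22 - a12 * b2) / det
    y = Fraction(a11 * b2 - b1 * a21) / det
    return x, y

# A + 4B = 0 and 2A - 3B = 11, from matching the numerators.
A, B = solve_2x2(1, 4, 0, 2, -3, 11)

# Spot-check the decomposition at an arbitrary sample point.
s = Fraction(5)
original = Fraction(11) / ((4 * s - 3) * (s + 2))
decomposed = A / (4 * s - 3) + B / (s + 2)
```

Using `Fraction` rather than floating point keeps the check exact, so `original == decomposed` can be tested with plain equality.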
Try a few!
8. $\mathcal{L}^{-1}\left[(s+22)/(s^2+9s+14)\right]$
9. $\mathcal{L}^{-1}\left[(9s+6)/(s^2-2s-24)\right]$
10. $\mathcal{L}^{-1}\left[13/(2s^2+11s-6)\right]$
11. $\mathcal{L}^{-1}\left[\dfrac{3s^2-13s+34}{(s+1)(s-4)^2}\right]$ Hint: You need to break this up into three fractions: one with
denominator $s+1$, one with $s-4$, and one with $(s-4)^2$.
Convolution
The property of linearity applies to the sum of two functions. “Convolution” gives you a way
of finding the inverse Laplace transform of a product of two functions.
Convolution
Given two functions f (t) and g (t), the “convolution” of these two functions is defined as:
$$(f * g)(t) = \int_{0}^{t} f(\tau)\,g(t-\tau)\,d\tau$$
The Laplace transform of a convolution is the product of the Laplace transforms of the individual
functions.
$$\mathcal{L}\left[(f * g)(t)\right] = \mathcal{L}[f(t)]\,\mathcal{L}[g(t)]$$
Problem:
Use convolution and the table above to find $\mathcal{L}^{-1}\left[11/(4s^2+5s-6)\right]$.
Solution:
Once again we begin by noting that we can factor the denominator. We can also
factor out the 11 (linearity again) and rewrite the problem like this.
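$$11\,\mathcal{L}^{-1}\left[\left(\frac{1}{4s-3}\right)\left(\frac{1}{s+2}\right)\right]$$

We can find the inverse Laplace transforms of 1/(4s − 3) and 1/(s + 2) using the table above.

$$\mathcal{L}^{-1}\left[\frac{1}{4s-3}\right] = \frac{1}{4}\,\mathcal{L}^{-1}\left[\frac{1}{s-3/4}\right] = \frac{1}{4}e^{3t/4} \quad\text{and}\quad \mathcal{L}^{-1}\left[\frac{1}{s+2}\right] = e^{-2t}$$

The inverse Laplace transform of the product of 1/(4s − 3) and 1/(s + 2) is the
convolution of their individual inverse transforms.

$$11\int_{0}^{t}\frac{1}{4}e^{3\tau/4}e^{-2(t-\tau)}\,d\tau = \frac{11}{4}e^{-2t}\int_{0}^{t}e^{11\tau/4}\,d\tau = e^{-2t}\left(e^{11t/4} - 1\right) = e^{3t/4} - e^{-2t}$$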
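The convolution integral in this solution can also be evaluated numerically, which gives an independent check on the closed form. This is a sketch under assumptions: `convolve` is a hypothetical midpoint-rule helper, and the sample times are arbitrary.

```python
import math

def convolve(f, g, t, n=4000):
    """Midpoint-rule estimate of (f*g)(t) = integral from 0 to t of f(tau) g(t - tau) dtau."""
    h = t / n
    return sum(f((k + 0.5) * h) * g(t - (k + 0.5) * h) for k in range(n)) * h

def f(tau):
    """Inverse transform of 1/(4s - 3)."""
    return 0.25 * math.exp(0.75 * tau)

def g(tau):
    """Inverse transform of 1/(s + 2)."""
    return math.exp(-2.0 * tau)

def closed_form(t):
    """The answer found in the worked solution."""
    return math.exp(0.75 * t) - math.exp(-2.0 * t)

samples = [0.5, 1.0, 2.0]
errors = [abs(11 * convolve(f, g, t) - closed_form(t)) for t in samples]

# Swapping the two functions should not change the result.
swap_difference = abs(convolve(f, g, 1.0) - convolve(g, f, 1.0))
```

The last line previews the commutativity property: the substitution u = t − τ maps one integral onto the other.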
Try a few!
12. $\mathcal{L}^{-1}\left[(s+22)/(s^2+9s+14)\right]$ (Yes, this is Problem 8 again. If you did it before with
partial fractions, make sure you get the same answer with convolution.)
13. $\mathcal{L}^{-1}\left[\dfrac{1}{s(s^2+9)}\right]$
14. $\mathcal{L}^{-1}\left[\dfrac{1}{s^2(s^2+9)}\right]$
15. $\mathcal{L}^{-1}\left[\left(\dfrac{a}{s^2+a^2}\right)\left(\dfrac{s}{s^2+a^2}\right)\right]$ Hint: You will need a few trig identities.
16. Use convolution to show that if $\mathcal{L}[f(t)] = F(s)$ then $\mathcal{L}^{-1}\left[F(s)(1/s)\right] = \int_{0}^{t} f(\tau)\,d\tau$.
(This is the property that makes Laplace transforms useful for integro-differential
equations.)
17. The table above gives the Laplace transforms of a and $\delta(t - t_0)$. Use those facts and
convolution to show that $\mathcal{L}^{-1}\left[e^{-t_0s}/s\right] = H(t - t_0)$.
18. Prove that convolution is commutative. Hint: substitute $u = t - \tau$.
APPENDIX I
Solving differential equations is in some ways like integration. You learn a battery of techniques,
each with a specific set of steps and conditions. Faced with a new problem, you have
to start by figuring out which technique to use. There are no surefire rules for this task; in fact,
for some problems, several different techniques may apply. But below we have collected the
major guidelines for the techniques presented in Chapter 11.
If you go through this process and decide to use separation of variables, you may find it
helpful to review the summary of that process on Page 611.
solution that solves the PDE and the now-modified initial conditions with homogeneous
boundary conditions (Section 11.8). If the boundary conditions are difficult
to match (e.g. time-dependent boundary conditions) you can use an extra
trick that we illustrate in Section 11.12 Problem 11.260 (see felderbooks.com).
(d) Does your problem meet all the conditions from Part (a) except that you have
too many inhomogeneous boundary conditions? You may be able to create “subproblems”
with fewer inhomogeneous conditions, and then sum the solutions to
solve the problem you were given (Section 11.8).
(e) Is your differential equation inhomogeneous? The easiest approach to an inhomogeneous
equation is to find one particular solution to the equation, often by making
a simplifying assumption. (“Hey, a line will work” or “What if I look for a steady-state
solution by setting ∂f/∂t equal to zero?”) Such a solution has no arbitrary
constants and therefore cannot match your initial conditions, but you can then
add it to a solution to the “complementary” homogeneous equation, which you
may be able to find with separation of variables (Section 11.8). If that approach
does not work you discard separation of variables altogether and move on to the
techniques discussed below.
As powerful as separation of variables is, it is not universal. For example, it cannot be used on
any equation for which you cannot separate the variables (which includes, but is not limited
to, inhomogeneous equations). When separation of variables is unusable for any reason, the
following techniques may serve.
These techniques apply to individual variables, and can be done in combination. For
example, if you have a problem where 0 < x < L, −∞ < y < ∞, and 0 < t < ∞, you might
be able to simplify the problem by applying an eigenfunction expansion to x, a Fourier
transform to y, or a Laplace transform to t. It may even prove useful to do all three!
3. If you have a spatial variable with homogeneous boundary conditions and a finite
domain, consider eigenfunction expansion (Section 11.9).
To apply this method, you have to be able to find the eigenfunctions of your differential
operator. The most common example is the operator ∂²/∂x², for which the
eigenfunctions are sines and cosines, so you expand into a Fourier series.
4. Consider using a Fourier transform (Section 11.10) if you have a spatial variable
with an infinite or semi-infinite domain that appears only in a second derivative and
(optionally) in an inhomogeneous term. In other words, consider this method if you
can rewrite your PDE as
$$\frac{\partial^2 f}{\partial x^2} + (\text{term that may include } x \text{ but doesn't depend on } f) + (\text{other terms that don't involve } x) = 0$$
We specified that this was a spatial variable because time usually doesn’t have boundary
conditions at infinity, which we need as described below.
(a) Does your variable have an infinite domain with $\lim_{x\to\infty} f(x,\ldots) = \lim_{x\to-\infty} f(x,\ldots) = 0$?
Use a regular Fourier transform with complex exponentials.
(b) Does your variable have a semi-infinite domain (e.g. 0 < x < ∞), with
$\lim_{x\to\infty} f(x,\ldots) = 0$, and an explicit boundary condition that specifies f at x = 0?
Use a Fourier sine transform.
(c) Does your variable have a semi-infinite domain (e.g. 0 < x < ∞), with
$\lim_{x\to\infty} f(x,\ldots) = 0$, and an explicit boundary condition that specifies ∂f/∂x at
x = 0?
Use a Fourier cosine transform.
5. If time appears in your equation only in the derivatives, not in the coefficients, consider
a Laplace transform (Section 11.11).
This method only works if t is defined on a semi-infinite domain 0 < t < ∞ and the
conditions on this variable are all defined at t = 0. Those criteria rarely apply to spatial
variables.
6. If you have a variable with an infinite or semi-infinite domain that does not meet the
criteria for Parts 4 or 5, consider a different transform.
The transform you choose is based on your differential operator and boundary conditions.
The transform should be an expansion of your solution in eigenfunctions of
the differential operator in your equation. Other transforms are not covered in this
book, but by stressing the commonalities of all transform methods, we hope we have
prepared you to select and use different transforms for different problems.
This summary does not include every known analytical technique for solving partial differential
equations, but it does cover the techniques that are most commonly used in simple
situations. It is not uncommon to encounter PDEs that cannot be solved by any of the methods
described above, or PDEs that cannot be analytically solved at all! This brings us to the
last step of our summary:
7. If you need to solve a partial differential equation that cannot be solved by any of
these techniques, consider solving it numerically.
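As one illustration of what "solving it numerically" can look like, here is a minimal sketch of an explicit finite-difference scheme for the heat equation ∂u/∂t = α ∂²u/∂x² with u = 0 at both ends. Everything in it is an assumption made for illustration rather than a method from this book: the function name `heat_ftcs`, the grid sizes, and the initial condition sin(πx/L), which was chosen because its exact solution e^{−α(π/L)²t} sin(πx/L) is known and can be compared against.

```python
import math

def heat_ftcs(u0, alpha, dx, dt, n_steps):
    """Advance du/dt = alpha * d2u/dx2 with u = 0 at both ends.

    Uses forward differences in time and centered differences in space.
    """
    r = alpha * dt / dx ** 2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable unless alpha*dt/dx^2 <= 1/2")
    u = list(u0)
    for _ in range(n_steps):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = u[i] + r * (u[i + 1] - 2 * u[i] + u[i - 1])
        u = new
    return u

# Illustrative setup (all values are assumptions).
nx, L, alpha = 50, 1.0, 1.0
dx = L / nx
dt = 0.25 * dx ** 2 / alpha
n_steps = 1000
xs = [i * dx for i in range(nx + 1)]
u0 = [math.sin(math.pi * x / L) for x in xs]
u0[0] = u0[-1] = 0.0            # enforce the boundary values exactly

u = heat_ftcs(u0, alpha, dx, dt, n_steps)
t_final = n_steps * dt
exact = [math.exp(-alpha * (math.pi / L) ** 2 * t_final) * math.sin(math.pi * x / L)
         for x in xs]
max_error = max(abs(a - b) for a, b in zip(u, exact))
```

Testing a numerical scheme on a problem with a known analytical solution, as done here, is a sensible first step before applying it to a PDE that has none.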
APPENDIX J
The table below lists solutions to some common differential equations. The sections below
the table give the minimum amount of information you need to use these functions to
solve ordinary differential equations, or to find the normal modes for partial differential
equations. More information about many of these functions can be found in Chapter 12.
Some special functions that are not primarily defined as solutions to particular ODEs can be
found in Appendix K.
You may need to do a bit of algebra to make your problem match an equation in this table.
Watch in particular for variable substitutions (Section 10.7).
Example: $(k^2 - x^2)y'' - 2xy' + l(l+1)y = 0$
Substitute: $u = x/k$
After some algebra the equation becomes: $(1 - u^2)y'' - 2uy' + l(l+1)y = 0$, where the primes now denote derivatives with respect to u
Look that up below to get the solution: $y = AP_l(x/k) + BQ_l(x/k)$
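The "some algebra" in this example is one application of the chain rule. The following sketch writes it out, assuming the first-derivative term in the original equation is $-2x\,dy/dx$ (as in Legendre's equation). With $u = x/k$, so that $x = ku$:

```latex
\begin{align*}
\frac{dy}{dx} &= \frac{1}{k}\frac{dy}{du}, \qquad
\frac{d^2y}{dx^2} = \frac{1}{k^2}\frac{d^2y}{du^2}, \\
(k^2 - x^2)\frac{d^2y}{dx^2} &= \left(k^2 - k^2u^2\right)\frac{1}{k^2}\frac{d^2y}{du^2}
  = (1 - u^2)\frac{d^2y}{du^2}, \\
-2x\frac{dy}{dx} &= -2ku\cdot\frac{1}{k}\frac{dy}{du} = -2u\frac{dy}{du}.
\end{align*}
```

The term $l(l+1)y$ contains no derivatives and is unchanged, so every factor of k cancels and the equation takes the standard form in u.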
Trigonometric Functions
We assume you know the basic properties of trig functions, but we include them to emphasize
the parallels with many less familiar functions. To solve a PDE whose normal modes are sines
and cosines, you generally only need the facts in this appendix—and the same is true for the
other functions listed here.
∙ sin 0 = 0; cos 0 = 1.
∙ sin x has an infinite number of zeros. They occur at n𝜋 where n is any integer. cos x has
an infinite number of zeros that occur at 𝜋∕2 + n𝜋.
∙ Both sin x and cos x are periodic, with period 2𝜋.
∙ sin x has odd symmetry, cos x even.
∙ $\frac{d}{dx}\sin x = \cos x$; $\frac{d}{dx}\cos x = -\sin x$. (Like many of the other facts listed here, this only
holds if x is measured in radians.)
∙ If a function f(x) on the domain [−L, L] is expanded into a “Fourier series” of
the form:
$$f(x) = a_0 + \sum_{n=1}^{\infty}\left[a_n\cos\left(\frac{n\pi}{L}x\right) + b_n\sin\left(\frac{n\pi}{L}x\right)\right]$$
Exponential Functions
∙ The function $e^x$ has no zeros, and is not symmetric.
∙ $\lim_{x\to\infty} e^x = \infty$ and $\lim_{x\to-\infty} e^x = 0$. The reverse is of course true for $e^{-x}$.
∙ The function $A\sinh(kx) + B\cosh(kx)$ (discussed below) is fully equivalent to $Ae^{kx} + Be^{-kx}$,
but with different values of the arbitrary constants. The hyperbolic trig form
is easier for matching given initial conditions y(0) and y′(0).
and cosine. And it is most important to note how these properties fit into solving differential
equations.
∙ sinh 0 = 0; cosh 0 = 1.
∙ The origin is the only zero of the hyperbolic sine. The hyperbolic cosine has no zeros.
∙ $\lim_{x\to\infty}\cosh x = \lim_{x\to-\infty}\cosh x = \lim_{x\to\infty}\sinh x = \infty$; $\lim_{x\to-\infty}\sinh x = -\infty$.
∙ sinh x has odd symmetry, cosh x even.
∙ $\frac{d}{dx}\sinh x = \cosh x$; $\frac{d}{dx}\cosh x = \sinh x$. (No negative sign!)
The hyperbolic trig functions are discussed in more detail in Section 12.3.
Bessel Functions
The Bessel functions are two different families of functions. Jp (x) stands for the “Bessel
functions of the first kind of order p” and Yp (x) are “Bessel functions of the second kind of
order p.”
∙ J0 (0) = 1. For p > 0, Jp (0) = 0.
∙ All the Jp (x) functions have an infinite number of zeros. These are difficult to calculate,
but they are listed in many mathematical tables and computer systems. By convention
$\alpha_{p,n}$ represents the nth zero of the Bessel function $J_p$; for instance, $\alpha_{0,4}$ is the fourth
positive zero of the function $J_0$.[1]
∙ For $p \geq 0$, $\lim_{x\to 0} Y_p(x)$ diverges. For many problems, this makes the $J_p(x)$ functions the
only solutions.
∙ All the Yp (x) functions have an infinite number of zeros.
∙ If a function f(x) in the domain [0, a] is expanded into a “Fourier-Bessel series
expansion” of the form:
$$f(x) = \sum_{n=1}^{\infty} A_n J_p\left(\frac{\alpha_{p,n}}{a}x\right)$$
Note that this involves a choice of p. For instance, you may take a given function and
expand it into third-order Bessel functions J3 (𝛼3,n x∕a), or ninth-order Bessel functions
J9 (𝛼9,n x∕a), or any other order. All these will produce different series with different
coefficients.
∙ The Bessel functions of a given order are orthogonal to each other with weight x. For
integer m and n:
$$\int_{0}^{a} x\,J_p\left(\frac{\alpha_{p,m}}{a}x\right)J_p\left(\frac{\alpha_{p,n}}{a}x\right)dx = \frac{a^2}{2}J_{p+1}^2(\alpha_{p,m})\,\delta_{mn}$$
[1] When a Bessel function is being examined on a finite interval [0, a], some textbooks use $\lambda_{p,n}$ to mean $\alpha_{p,n}/a$.
∙ The solutions to Bessel’s equation can also be written in terms of “Hankel functions,”
also known as “Bessel functions of the Third Kind.”
$$H_p^{(1)}(x) = J_p(x) + iY_p(x), \qquad H_p^{(2)}(x) = J_p(x) - iY_p(x)$$
Bessel functions are introduced and used to solve PDEs in Section 11.6, and derived in
Section 12.7.
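The orthogonality relation for Bessel functions can be verified numerically for p = 0 using only the standard library. The sketch below rests on several assumptions that are not part of this appendix: it evaluates J_p for integer p from Bessel's integral J_p(x) = (1/π)∫₀^π cos(pθ − x sin θ) dθ, it hard-codes the first two positive zeros of J_0 from standard tables, and the helper names (`simpson`, `bessel_j`, `overlap`) are hypothetical.

```python
import math

def simpson(g, a, b, n=400):
    """Composite Simpson's rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def bessel_j(p, x, n=200):
    """J_p(x) for integer p, from Bessel's integral representation."""
    return simpson(lambda th: math.cos(p * th - x * math.sin(th)), 0.0, math.pi, n) / math.pi

# First two positive zeros of J_0 (standard tabulated values).
ALPHA_0 = [2.404825557695773, 5.520078110286311]

def overlap(m, n, a=1.0):
    """Integral from 0 to a of x J_0(alpha_{0,m} x/a) J_0(alpha_{0,n} x/a) dx, for m, n in {1, 2}."""
    am, an = ALPHA_0[m - 1], ALPHA_0[n - 1]
    return simpson(lambda x: x * bessel_j(0, am * x / a) * bessel_j(0, an * x / a), 0.0, a)

off_diagonal = overlap(1, 2)                     # should be close to zero
diagonal = overlap(1, 1)                         # should equal (a^2/2) J_1(alpha_{0,1})^2
expected_diagonal = 0.5 * bessel_j(1, ALPHA_0[0]) ** 2
```

In practice a library routine for Bessel functions and their zeros would replace the hand-rolled helpers; the point here is only to see the weight x and the normalization a²/2 at work.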
As with regular Bessel functions, you can choose to expand a function f (x) in whatever
order (value of p) spherical Bessel function you want.
∙ The spherical Bessel functions of a given order are orthogonal to each other with
weight $x^2$. For integer m and n:
$$\int_{0}^{a} x^2\,j_p\left(\frac{\alpha_{p+1/2,m}}{a}x\right)j_p\left(\frac{\alpha_{p+1/2,n}}{a}x\right)dx = \frac{a^3}{2}j_{p+1}^2(\alpha_{p+1/2,m})\,\delta_{mn}$$
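Airy Functions
∙ The values at x = 0 can be expressed in terms of gamma functions.
$$\mathrm{Ai}(0) = \frac{1}{3^{2/3}\,\Gamma(2/3)}, \quad \mathrm{Bi}(0) = \frac{1}{3^{1/6}\,\Gamma(2/3)}, \quad \mathrm{Ai}'(0) = -\frac{1}{3^{1/3}\,\Gamma(1/3)}, \quad \mathrm{Bi}'(0) = \frac{3^{1/6}}{\Gamma(1/3)}$$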
(The gamma function Γ(x) is described in Section 12.3, and in Appendix K.)
∙ $\lim_{x\to\infty}\mathrm{Ai}(x) = \lim_{x\to-\infty}\mathrm{Ai}(x) = \lim_{x\to-\infty}\mathrm{Bi}(x) = 0$. $\lim_{x\to\infty}\mathrm{Bi}(x)$ diverges.
∙ Neither Airy function has any zeroes for positive x, and both have infinitely many zeroes
for negative x.
∙ For positive x the Airy functions can be written in terms of modified Bessel functions.
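$$\mathrm{Ai}(x) = \frac{1}{\pi}\sqrt{\frac{x}{3}}\,K_{1/3}\left(\frac{2}{3}x^{3/2}\right), \qquad \mathrm{Bi}(x) = \sqrt{\frac{x}{3}}\left[I_{1/3}\left(\frac{2}{3}x^{3/2}\right) + I_{-1/3}\left(\frac{2}{3}x^{3/2}\right)\right]$$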
∙ For negative x the Airy functions can be written in terms of regular Bessel functions.
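$$\mathrm{Ai}(-x) = \sqrt{\frac{x}{9}}\left[J_{1/3}\left(\frac{2}{3}x^{3/2}\right) + J_{-1/3}\left(\frac{2}{3}x^{3/2}\right)\right],$$

$$\mathrm{Bi}(-x) = \sqrt{\frac{x}{3}}\left[J_{-1/3}\left(\frac{2}{3}x^{3/2}\right) - J_{1/3}\left(\frac{2}{3}x^{3/2}\right)\right]$$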
You can therefore find the zeroes of the Airy functions in terms of the zeroes of
J1∕3 (x).
$$L_n(x) = \frac{1}{n!}e^x\frac{d^n}{dx^n}\left(x^ne^{-x}\right) = \sum_{m=0}^{n}(-1)^m\frac{n!}{(n-m)!\,(m!)^2}\,x^m$$
With this definition Ln (0) = 1 for all n. Sometimes the 1∕n! is omitted, in which case
Ln (0) = n!.
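The explicit sum makes it easy to generate Laguerre polynomials and to confirm which normalization is in use. The following is a small sketch with hypothetical helper names, using exact rational arithmetic and the convention that includes the 1/n! factor.

```python
from fractions import Fraction
from math import factorial

def laguerre_coefficients(n):
    """Coefficients c_0..c_n of L_n(x) = sum of c_m x^m, from the explicit sum (1/n! convention)."""
    return [Fraction((-1) ** m * factorial(n), factorial(n - m) * factorial(m) ** 2)
            for m in range(n + 1)]

def laguerre(n, x):
    """Evaluate L_n(x) from its coefficients."""
    return sum(c * x ** m for m, c in enumerate(laguerre_coefficients(n)))

l2 = laguerre_coefficients(2)   # coefficients of 1 - 2x + x^2/2
l3 = laguerre_coefficients(3)   # coefficients of 1 - 3x + (3/2)x^2 - x^3/6
values_at_zero = [laguerre(n, 0) for n in range(6)]
```

Every entry of `values_at_zero` equals 1 under this convention; a source that omits the 1/n! would give n! instead, which is a quick way to tell the two conventions apart.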
Warning: The formulas given below apply to the definition for the Laguerre polynomials given
above. Some sources omit the 1∕n! coefficient; the resulting functions solve the same differential
equation, but the formulas must be modified accordingly.
∙ The associated Laguerre polynomials for any given 𝛼 are orthogonal with weight x 𝛼 e −x .
$$\int_{0}^{\infty} x^{\alpha}e^{-x}L_m^{\alpha}(x)L_n^{\alpha}(x)\,dx = \frac{(n+\alpha)!}{n!}\,\delta_{mn}$$
Hermite Polynomials
Warning: The differential equation and properties given here for Hermite polynomials are standard
in physics. In probability theory different conventions are typically used.
∙ The boundary conditions that typically go with the Hermite equations are a bit complicated,
but in practice they generally restrict n to be an integer and they eliminate
the second solution to the equations.
∙ The Hermite polynomials can be defined by the following formula.
$$H_n(x) = (-1)^ne^{x^2}\frac{d^n}{dx^n}\left(e^{-x^2}\right)$$
∙ Hn (x) is an nth-order polynomial. The first few are:
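Low-order Hermite polynomials can be generated without taking derivatives by hand. The sketch below uses the recurrence H_{n+1}(x) = 2x H_n(x) − 2n H_{n−1}(x), a standard identity for the physics convention that is not stated in this appendix, starting from H_0 = 1 and H_1 = 2x. The function name and the coefficient-list representation are illustrative choices.

```python
def hermite_coefficients(n_max):
    """Return [H_0, ..., H_n_max], each as a list of integer coefficients in ascending powers of x."""
    polys = [[1], [0, 2]]
    for n in range(1, n_max):
        two_x_hn = [0] + [2 * c for c in polys[n]]                  # coefficients of 2x * H_n
        prev = polys[n - 1] + [0] * (len(two_x_hn) - len(polys[n - 1]))
        polys.append([a - 2 * n * b for a, b in zip(two_x_hn, prev)])
    return polys[: n_max + 1]

first_five = hermite_coefficients(4)
# first_five[2] == [-2, 0, 4] represents H_2(x) = 4x^2 - 2
```

Reading each list from the constant term upward gives H_3(x) = 8x³ − 12x and H_4(x) = 16x⁴ − 48x² + 12, each containing only even or only odd powers.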
Spherical Harmonics
Spherical harmonics are unique within this appendix. All the other solutions are functions
of one variable which we have chosen here to call x. Spherical harmonics are a multivariate
function of the spherical coordinates 𝜃 and 𝜙. (See Appendix D.) As such they provide a
general representation for any function that depends on direction but not on distance.
Warning: Different textbooks use different conventions for the constant in front of $P_l^m(\cos\theta)e^{im\phi}$.
The formula given above is common in physics, but be sure to look at the convention in any other
source you are using. The formula below for the coefficients of an expansion in spherical harmonics
is only valid for this convention.
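$$f(\theta,\phi) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{lm}Y_l^m(\theta,\phi)$$

with coefficients

$$A_{lm} = \int_{0}^{2\pi}\int_{0}^{\pi} f(\theta,\phi)\left[Y_l^m(\theta,\phi)\right]^*\sin\theta\,d\theta\,d\phi$$

where $\left[Y_l^m\right]^*$ is the complex conjugate of $Y_l^m$.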
Spherical harmonics are discussed and used to solve partial differential equations in
Section 11.7.2.
APPENDIX K
Special Functions
The functions in this appendix are listed in alphabetic order. Functions that are primarily
defined as solutions to particular ODEs (including Bessel functions, Legendre polynomials,
hyperbolic trig functions, and others) are listed in Appendix J.
Beta Function
$$B(x,y) = \int_{0}^{1} t^{x-1}(1-t)^{y-1}\,dt \qquad x > 0,\; y > 0$$

$$B(x,y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}$$
The beta function is “symmetric” meaning that B(x, y) = B(y, x). It is used in probability and
statistics. It is not discussed in this book.
More properly, we define the Dirac delta function as the limit of a sequence of functions.
$$\delta(x) = \lim_{k\to 0} D(x) \quad\text{where}\quad D(x) = \begin{cases} 0 & x < 0 \\ 1/k & 0 \leq x \leq k \\ 0 & x > k \end{cases}$$
∙ Strictly speaking 𝛿(x) is not a function at all, but it is often useful to treat it as one.
Mathematicians call it a “generalized function.”
∙ For any function f(x) that is continuous around 0, f(x)δ(x) is the same thing as f(0)δ(x).
For instance, $(x^2 + 2\cos x)\delta(x) = 2\delta(x)$.
∙ $\int_{-\infty}^{\infty} f(x)\,\delta(x-a)\,dx = f(a)$.
That last bullet point follows logically from the one before it, but it is the most important; that
integral is the key to using the Dirac delta in both Laplace transforms and Green’s functions.
The Dirac delta function is discussed primarily in Chapter 10. Section 10.10 defines
and introduces this function; Section 10.11 uses 𝛿(x) in differential equations to model
fast changes, and then solves those equations with Laplace transforms; Section 10.12 uses
“Green’s functions” to introduce 𝛿(x) into other differential equations as a mechanism for
solving them.
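The limit definition can be made concrete by integrating a function against the box D(x − a) for shrinking widths k and watching the result approach f(a). This is a sketch under assumptions: the helper name `box_average`, the choice f = cos, and the point a = 1 are all illustrative.

```python
import math

def box_average(f, a, k, n=1000):
    """Integral of f(x) D(x - a) dx, where D is the box of width k and height 1/k.

    The box is nonzero only on [a, a + k], so the integral reduces to the
    average of f over that interval (computed here with the midpoint rule).
    """
    h = k / n
    return sum(f(a + (j + 0.5) * h) for j in range(n)) * h / k

a = 1.0
widths = [1e-1, 1e-2, 1e-3, 1e-6]
errors = [abs(box_average(math.cos, a, k) - math.cos(a)) for k in widths]
```

The entries of `errors` shrink roughly in proportion to k, which is the numerical counterpart of the statement that the integral of f(x)δ(x − a) equals f(a).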
Error Function
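$$\mathrm{erf}\,x = \frac{2}{\sqrt{\pi}}\int_{0}^{x} e^{-t^2}\,dt$$

[Figure: graph of erf x versus x for −2 ≤ x ≤ 2; the vertical axis runs from −1 to 1.]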
∙ $\lim_{x\to\infty}\mathrm{erf}\,x = 1$.
∙ erf x is an odd function.
∙ The “complementary error function” is erfc x = 1 − erf x.
The error function and its use in statistics are discussed in Section 12.3.
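Python's standard library provides `math.erf` and `math.erfc`, so the defining integral and the properties listed above can be compared against them directly. The numerical integrator below (`erf_numeric`, a hypothetical helper based on Simpson's rule) is a sketch for illustration, not a recommended way to compute erf.

```python
import math

def erf_numeric(x, n=2000):
    """Simpson's-rule estimate of (2/sqrt(pi)) times the integral of e^{-t^2} from 0 to x; n must be even."""
    if x == 0:
        return 0.0
    h = x / n
    total = 1.0 + math.exp(-x * x)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * math.exp(-t * t)
    return 2 / math.sqrt(math.pi) * total * h / 3

definition_error = abs(erf_numeric(1.0) - math.erf(1.0))
odd_symmetry_error = abs(erf_numeric(-0.7) + erf_numeric(0.7))
erfc_error = abs((1 - erf_numeric(2.0)) - math.erfc(2.0))
large_x_gap = abs(1 - erf_numeric(6.0))
```

For large x, computing 1 − erf x in floating point loses precision, which is one practical reason a separate erfc routine exists.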
Factorial
$$n! = \prod_{i=1}^{n} i = 1 \times 2 \times 3 \times \ldots \times n$$
∙ The factorial function is only defined for non-negative integer n-values. A generalization
of the factorial function onto the real domain is the “gamma function” Γ(x).
∙ Factorials obey the recurrence relation n! = (n − 1)! × n.
∙ 0! is defined as 1, based on the recurrence relation above.
∙ For very large x the factorial function can be approximated by “Stirling’s formula”
$x! \approx x^xe^{-x}\sqrt{2\pi x}$.
Factorials are used extensively in Taylor series (Chapter 2). They are discussed as part of the
build-up to the gamma function in Section 12.3.
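To get a feel for how quickly Stirling's formula becomes accurate, one can compare it with exact factorials. The sketch below uses hypothetical helper names and evaluates the formula through logarithms, an implementation choice made here so that large arguments do not overflow.

```python
import math

def stirling(x):
    """Stirling's approximation x^x e^{-x} sqrt(2 pi x), evaluated via logarithms."""
    return math.exp(x * math.log(x) - x + 0.5 * math.log(2 * math.pi * x))

def ratio(n):
    """Exact n! divided by Stirling's approximation."""
    return math.factorial(n) / stirling(n)

ratios = {n: ratio(n) for n in (1, 10, 100)}
```

The ratio exceeds 1 and approaches it as n grows: the relative error is under 1% at n = 10 and under 0.1% at n = 100.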
Fresnel Integrals
$$S(x) = \int_{0}^{x}\sin\left(\frac{\pi}{2}t^2\right)dt \qquad C(x) = \int_{0}^{x}\cos\left(\frac{\pi}{2}t^2\right)dt$$

[Figure: graphs of S(x) and C(x) versus x for −5 ≤ x ≤ 5; each vertical axis runs from −0.5 to 0.5.]
Gamma Function
$$\Gamma(x) = \int_{0}^{\infty} t^{x-1}e^{-t}\,dt \qquad x > 0$$

[Figure: graph of Γ(x) versus x for −2 ≤ x ≤ 2.]
∙ The gamma function generalizes the factorial function. The two are related by
Γ(x) = (x − 1)! for positive integer x, but the gamma function is also defined for
negative and non-integer x-values.
∙ The gamma function obeys the recurrence relation Γ(x) = (x − 1) × Γ(x − 1).
∙ The integral above is divergent for x < 0, but Γ(x) is defined for negative values by its
recurrence relation. It is undefined for all integers x ≤ 0.
∙ For very large x the gamma function can be approximated by “Stirling’s formula”
$\Gamma(x+1) \approx x^xe^{-x}\sqrt{2\pi x}$.
∙ $\Gamma(1/2) = \sqrt{\pi}$.
The gamma function is discussed in Section 12.3.
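The properties listed above can be confirmed with `math.gamma` from Python's standard library, which accepts negative non-integer arguments and raises `ValueError` at zero and the negative integers. The helper names below are hypothetical and the sample points are arbitrary.

```python
import math

def recurrence_holds(x):
    """Check Gamma(x) = (x - 1) * Gamma(x - 1) at a point where both sides are defined."""
    return math.isclose(math.gamma(x), (x - 1) * math.gamma(x - 1), rel_tol=1e-12)

def is_undefined(x):
    """Return True if math.gamma rejects x (zero or a negative integer)."""
    try:
        math.gamma(x)
    except ValueError:
        return True
    return False

matches_factorial = all(math.isclose(math.gamma(n), math.factorial(n - 1)) for n in range(1, 10))
half_value_ok = math.isclose(math.gamma(0.5), math.sqrt(math.pi))
recurrence_ok = all(recurrence_holds(x) for x in (3.7, 0.5, -1.5))
poles = [x for x in (0, -1, -2, -3) if is_undefined(x)]
```

Note that `recurrence_holds(0.5)` relies on Γ(−1/2), a value that the integral definition cannot supply but the recurrence relation can.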
Heaviside Function
$$H(x) = \begin{cases} 0 & x < 0 \\ 1 & x \geq 0 \end{cases}$$

[Figure: graph of H(x) versus x for −2 ≤ x ≤ 2.]
The Heaviside function really is as simple as it looks, but it gives us a notation that applies to a
wide variety of discontinuous functions. Section 10.10 defines and introduces this function;
Section 10.11 uses H (x) in differential equations to model discontinuous changes, and then
solves those equations with Laplace transforms.
APPENDIX L
1. The force is towards the equilibrium point: left if the object is on the right side and
right if it’s on the left. The resulting motion will be an oscillation.
2. $d^2x/dt^2 = -(k/m)x$. In words, we might read this equation as follows: “The position
function x(t) has the property that when you take its second derivative, you get the
same function you started with, multiplied by −k/m.” As a quick check, the left side
of this equation has units m/s² and the right side has (N/m)/kg × m, or N/kg. Since a
newton is a kg·m/s², this becomes m/s², so the units match on the two sides.
3. One possible answer is $y(x) = 3x^2 + 2$.
4. $dy/dx = y$
5. $y = \pm e^{x^3/3 + C}$.
6. $u_1(x) = e^{-x^2}$, $u_2(x) = e^{2x^2}$
7. dR∕dt = 5R − 10F
8. f (x0 + Δx) ≈ f (x0 ) + f ′ (x0 )Δx. This formula should not come as any great surprise.
Since f ′ (x0 ) represents the slope (or “rise over run”), we multiply it by Δx (the “run”)
to estimate how much the function has gone up from its original value of f (x0 ).
9. 605.
10. If you plug a1 = 5, r = 3, and n = 5 into your formula, you reproduce our original
series. Your formula should therefore yield the answer 605. If it doesn’t, figure out
what went wrong!
11. 1∕2
12. $x = Ae^{-10t} + Be^{-4t}$
13. $x = Ae^{\left(-2+\sqrt{-36}\right)t} + Be^{\left(-2-\sqrt{-36}\right)t}$.
14. 34
15. B = i
16. $\dfrac{1}{1+2i}\,e^{(1+2i)x} + C$
17. y = 1∕2
18. A line has zero concavity. Because ∂²y/∂x² = 0 the equation predicts ∂²y/∂t² = 0: no
acceleration. Because it started at rest and is not accelerating, the string will never
move.
19. positive
20. y(50) = 130
21. −∞
22. down, 2
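23. $\dfrac{\partial f}{\partial x}u_x + \dfrac{\partial f}{\partial y}u_y = 1\left(\dfrac{1}{\sqrt{1+c^2}}\right) + 2\left(\dfrac{c}{\sqrt{1+c^2}}\right) = \dfrac{1+2c}{\sqrt{1+c^2}}$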
24. $a = \dfrac{\partial f}{\partial x}\left(x_0, y_0\right)$ (We'll leave b and c for you to determine.)
25. $V = -\dfrac{2+\sqrt{2}}{4}\,\dfrac{GM}{d}$. (Even if your answer is correct it may take some simplification
to make it look like ours. You can easily check your answer by punching it into a
calculator. Ours comes out to −0.85GM/d.)
26. 56 miles
27. $M = \int_{0}^{W}(kx)(H\,dx) = (1/2)kW^2H$. Since kx has units of mass per distance squared,
k must have units of mass per distance cubed, so this answer does have units of mass.
28. $M = (1/4)qW^2H^2$
29. $m = 2kR^5/15$. Since σ is mass over distance squared k must have units of mass over
distance to the fifth, so the units are correct.
30. $\rho = 5$, $\phi = \tan^{-1}(4/3)$, $z = 10$
31. $r = \sqrt{125}$, $\theta = \cos^{-1}\left(10/\sqrt{125}\right)$, $\phi = \tan^{-1}(4/3)$
56. −1∕2
57. 2𝜋∕19 Earth years
58. By eye, it’s roughly 2 cos(3x) + 0.3 cos(20x)
59. 4𝜋∕5
60. 2i∕(3𝜋)
61. 𝜋∕500, 2𝜋∕500, 3𝜋∕500 …
62. $d^2x/dt^2 + 6(dx/dt) + 9x = 0$
63. A(x)
64. $A\sin(kx) + B\cos(kx)$, $A\sin(kx + \phi)$, $Ae^{ikx} + B\sin(kx)$, $Ae^{ikx} + Be^{-ikx}$
65. $du/dt = dx/dt + 1$
66. $du/dt = u^2$
67. $xu''(x) - u'(x) = 0$
68. [Figure: number line in x with tick marks from –2 to 5.]
69. $z = e^x\sin y$.
70. π/15.
71. One of your equations should be rewriteable as $X''(x) = PX(x)$.
72. Your solutions in Parts 5, 6, and 7 should have been exponential, linear, and sinusoidal,
respectively. Since an exponential or linear function cannot reach two zeros,
you should have concluded that P must be negative. We will now replace P with $-k^2$,
which allows for all possible negative values.
73. $X(x) = A\sin(n\pi x/w)$
74. $z(x,y,t) = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty}\sin(kx)\sin(py)\left[C_{nm}\sin\left(v\sqrt{p^2+k^2}\,t\right) + D_{nm}\cos\left(v\sqrt{p^2+k^2}\,t\right)\right]$
where $k = n\pi/w$ and $p = m\pi/h$.
75. $\dfrac{R''(\rho)}{R(\rho)} + \dfrac{1}{\rho}\dfrac{R'(\rho)}{R(\rho)} = \dfrac{-Z''(z)}{Z(z)}$. Both sides of your equation must now equal a constant.
Because the boundary conditions on R(𝜌) are homogeneous, we consider the ODE
for R(𝜌) first to determine the sign of the separation constant.
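76. $V(\rho,z) = \sum_{n=1}^{\infty} A_nJ_0\left(\alpha_{0,n}\rho/a\right)\left(e^{(\alpha_{0,n}/a)z} - e^{-(\alpha_{0,n}/a)z}\right)$
77. $\sum_{n=1}^{\infty} b_n'(t)\sin\left(\dfrac{n\pi}{L}x\right) = -\sum_{n=1}^{\infty}\dfrac{\alpha\,b_n(t)\,n^2\pi^2}{L^2}\sin\left(\dfrac{n\pi}{L}x\right)$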
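78. (∂[u]/∂t) = −αp²[u]
79. $R(\rho) = CJ_0(k\rho) + DY_0(k\rho)$
80. $k = \alpha_{0,n}/a$, where $\alpha_{0,n}$ is the nth zero of $J_0$.
81. $z = J_0\left(\dfrac{\alpha_{0,n}}{a}\rho\right)\left[A\sin\left(\dfrac{\alpha_{0,n}}{a}vt\right) + B\cos\left(\dfrac{\alpha_{0,n}}{a}vt\right)\right]$
82. $f(\rho) = \sum_{n=1}^{\infty} B_nJ_0\left(\dfrac{\alpha_{0,n}}{a}\rho\right)$
83. $dy/dx = c_1 + 2c_2x + 3c_3x^2 + 4c_4x^3 + 5c_5x^4 + \ldots$
84. $2c_2 = c_1 + 1$
85. $c_{n+2} = \dfrac{n^2 + n - \mu}{(n+1)(n+2)}\,c_n$
86. $r = -3$, $r = -4$
87. $R(r) = Aj_0\left(\dfrac{\sqrt{2mE}}{\hbar}r\right) + By_0\left(\dfrac{\sqrt{2mE}}{\hbar}r\right)$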
INDEX
Regular numbers refer to page numbers in this printed book. Numbers with the symbol § and the word
“Online” refer to online sections located at www.felderbooks.com.
Milkmaid Problem, 185
Minimum Ratio Rule, §7.5(online) 14
Mites, 285
Mixed Partials, 140
Mixing Problem, 22, §1.7(online) 6
Modified Bessel Functions, 696
Modulus, 110, 115
Molecules, 295, 306
Moment Generating Function, 92
Moment of Inertia, 180, 192, 200
Momentum Density, 256, 408
Monge, Gaspard, §7.5(online) 24
Moore’s Law, 90
Mordor, 401
Morse Potential, 75
Multiple-Valued Function, 710, 711
Multivariate Function, 546

N
nth Term Test, see Series
Nabla (∇), 411
Natural Basis, 330
Neighborhood, 719
Newton’s Law of Heating and Cooling, 34, 38, 542, §1.7(online) 7, §1.9(online) 22
Newton’s Law of Universal Gravitation, 75, 188, 238, 267, §5.11(online) 1
Nezzer Chocolate Factory, §4.11(online) 20
Nonbasic Variable, §7.5(online) 13
Norm, 371, 681, 702
Normal Form, §7.5(online) 12
Normal Mode, 268, 320, 331, 555, 559
Normalize, 371
Nuclear Facility, 378
Nutrition, §7.6(online) 29
Nyquist Frequency, §9.7(online) 6

O
Objective Function, 174, 182
Oceania, §1.7(online) 5
Odd Extension, 462, 780
Odd Function, 452
ODE, see Differential Equation
Operator, 411, 486, §12.10(online) 4, 6
    Linear, §12.10(online) 6
Optimization
    Constraint, 174, 182
    Gradient, 172
    Lagrange Multipliers, 181
    Simplex Method, 377, §7.5(online) 8
Ordinary Differential Equation, see Differential Equation
Orientation of a Shape, 357
Orthogonal Basis, 371
Orthogonal Functions, 457, 470, 701, §12.9(online) 1
Orthogonal Matrix, see Matrix
Orthogonal Trajectories, §10.13(online) 43
Orthonormal Basis, 371

P
p-Series, see Series
PageRank, §6.10(online) 4
Parabolic Approximation, 59
Parametric Surfaces, 249
Parseval’s Theorem, 470
Partial Derivative, 136, §4.0(online) 1
    Definition, 138
Partial Differential Equation, see Differential Equation
Partial Fractions, 518, 784
Partial Sum, 84
Passive Transformation, 354
Path-Independent, 438
PDE, see Differential Equation
Pendulums, Coupled, §4.7(online) 7
Period, 20, 24, 450
Petri Dish, 402
Phase, 20, 24, 115
Phase Portrait, 494, §10.3(online) 1
Phase Space, §10.3(online) 7
Phasor, §3.6(online) 4
Picard’s Theorem, 747
Pitch, Dependence on Temperature, §2.9(online) 9
Planck’s Law of Blackbody Radiation, 143
Point Matrix, 298
Poisson’s Equation, 550, 635
Polar Coordinates, 221, 248, 589, 772
Pole, 713, 730, 744
Population Growth, 14, 22, 35, 37, §1.7(online) 1, 8, §1.9(online) 19, 20
Positive-Definiteness, 371, 374
Potential, 387, 402, 406, 438, 778
    Electric, 383, 394, 406, 425, 442
    Gravitational, 388, 397
    in Multiple Dimensions, 396
    in One Dimension, 387
    Magnetic, 425
Potential Energy, 439, 442
Power Factor, §3.6(online) 6
Power Series, see Taylor Series
Predator-Prey System, §1.7(online) 1, 8
Pressure, 401
Pressure Gradient, 170, §4.11(online) 20
Price Elasticity of Demand, 157
Principal Axes, §7.6(online) 29
Principal Value, 710, 711
Product Notation (Π), 655
Projectile Motion, 186
Pseudovector, 361, 366

Q
Quantum Harmonic Oscillator, §12.10(online) 4, §12.11(online) 13
Quantum Mechanics, 375, 472, 579, 615, §1.8(online) 17, §6.10(online) 3

R
Radioactive Decay, §1.9(online) 22
Radius of Convergence, 743
Raising Operator, §12.10(online) 9
Rank of a Matrix, see Matrix
Ratio Test, 770
Re, 106
Recurrence Relation, 655, 668
Recursive Definition of a Series, see Series
Recursion, see Recursion
Reduction of Order, §10.9(online) 33
Reindexing a Series, 655