
CALCULUS

SECOND EDITION

EDWIN E. MOISE

Harvard University

ADDISON-WESLEY PUBLISHING COMPANY


Reading, Massachusetts

Menlo Park, California

London

Don Mills, Ontario

This book is in the


ADDISON-WESLEY SERIES IN MATHEMATICS

Consulting Editor:

LYNN H. LOOMIS

Copyright 1972 by Addison-Wesley Publishing Company, Inc. Philippines copyright 1972 by Addison-Wesley Publishing Company, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of the publisher. Printed in the
United States of America. Published simultaneously in Canada. Library of Congress Catalog
Card No. 76-150576.

Author's Note on the Second Edition

The preface to the first edition was an explanation of the author's intentions, and
since these intentions have not changed, the original preface is reprinted after this one.
But the present edition is a thorough revision of the first, with many major changes
and even more minor ones. Some of these are as follows.
1. The use of language has been simplified throughout. Excessive colloquialisms have been eliminated.
2. Many problems have been added, most of them being easy. In cases where several problems form a sequence, they have been combined into a single problem, with parts (a), (b), (c), .... Thus it is now safe to assign all odd-numbered problems, without checking to make sure that problem-sequences are not being broken up.
3. Various long sections have been divided into two parts.
4. The classical definition of a limit has been restored. Exploratory problems, dealing with limits and continuity in terms of "boxes," have been eliminated, and this material has been inserted in later portions of the text.

5. Section 5.8, on the derivative of one function with respect to another, has been completely recast, following suggestions of Professor Hugh Thurston, of the University of British Columbia. The new version is mathematically straightforward, and it bridges the gap between the modern concepts of function and derivative and the "fractional" notation du/dv commonly used in physics.
6. Chapter 8, on the conic sections, has been shortened and simplified, by omitting various topics not ordinarily covered in a first course in calculus. In particular, the section on the geometry of the ellipse has been omitted. In a way it is a pity to leave this out, because it is very good mathematics, but in a first course in calculus we barely have time for essentials.
7. In the chapters on vector spaces, the standard use of the terms "vector space" and "inner product space" has been restored.
8. The old Chapter 10, on number theory and partial fractions, has been omitted. The above remarks about the geometry of the ellipse also apply here.
9. The chapter on infinite series has been completely recast. In the first edition, the idea of uniform convergence was built into the presentation, almost from the outset; it was used, in various special cases, to justify term-wise integration, long before the general definition of uniform convergence was stated. This treatment had advantages, for some students, but it had a serious disadvantage: it meant that the hardest part of the study of infinite series could not be skipped, or even postponed. The chapter has now been arranged in such a way that the hardest parts of it come last. Term-wise integration and differentiation of power series are introduced early, and play a central part throughout the chapter; but the justification of these processes is saved for the end.
10. The construction of the complex numbers (using congruence classes of real polynomials modulo $1 + x^2$) has been moved to an appendix.
The chapter on linear transformations, matrices, and determinants has been recast and simplified in various ways. For example, the idea of isometries between subspaces has been omitted (from the text and therefore from the problems).
11. In the chapter on functions of several variables, the Leibniz notation for partial derivatives has been introduced, in parallel with the subscript notation $f_x, f_y, \ldots$. The former notation is of course universal in physics, and it cannot be denied that it makes the chain rule easier to remember.



These examples should make it plain that this is not a perfunctory revision. The intent of the revision is to make the book more teachable and more flexible, without weakening its mathematical content. Some sections (as indicated above) have been omitted outright. Some chapters have been recast in such a way that more topics can be omitted at the teacher's discretion. But the main substance of the book, and the conception of calculus that it attempts to teach, have not been changed. All the hard problems in the first edition have been retained (except for one very embarrassing case, in which I asked the student to prove a false theorem).
Most of the calculus books now in print are of one of the following three types:
1) Some are written on a high plateau of austerity and rigor, and the Devil take the hindmost.
2) Some are "quick calculus" books. A typical device, in this sort of book, is to use the Fundamental Theorem of Integral Calculus as a definition of the definite integral. This enables the student to imitate the behavior of mathematicians, in calculating definite integrals, without sharing the mathematicians' conception of what the problem meant in the first place.
3) Some are a combination of types 1 and 2, with exact theoretical material included in the text but not in the problems. A course based on a book like this is evidently intended to play a double game, in which some of the students learn the text, but most of them study, in effect, a "quick calculus" course based on the same sort of problems used in books of type 2.

The present book is of none of these three types. It is addressed to all students who ordinarily take calculus courses and pass them, and it is designed to teach the ideas of calculus, at least in some form, to these students. In a sense we are playing a double game, because the teacher has a great deal of choice. If the entire text is taught, then the course is logically complete, and nothing that it includes needs to be taught again later. But many of the theoretical sections, especially at the ends of chapters, giving proofs of "foundational" theorems, can easily be omitted. For example, if we omit Sections 5.6, 5.7, and 5.8, then something is subtracted from the course, but nothing is disrupted or confused. And in Chapter 10 we can stop at almost any point. But even if such omissions are made on a large scale, the remaining hard core of the book conveys the ideas of calculus in conceptual forms.

The rock-bottom minimum, in any good mathematics course, is for the student to attach conceptual meanings to the problems that he solves and to the "answers" that he computes. If we settle for less than this, we are making a bad bargain. The hard fact is that "practical" calculus courses are not practical. In real life it seldom, if ever, happens that a mathematical problem takes the form of a homework exercise which can be solved by copying the pattern of the "solved problems" that immediately precede it. In physics it is the conceptual definite integral that is crucial, and numerical evaluations are often done by computers. Thus the art of setting up integrals is often more useful than the art of calculating them by elementary methods. The same principle applies very widely. When people put their mathematical training to practical use, they seldom need the logical refinements that appear in a thorough treatise, but they nearly always use their conceptual grasp (at some level) of mathematical ideas. Obviously, many techniques are needed, and in this book we have worked hard to teach them. But at the same time we have tried to produce the sort of conceptual grasp of mathematics that can be put to work in real life.


New York City, N. Y.
October 1971

E.E.M.

Preface

The mathematical content of the first ten chapters of this book is familiar and easy to describe. These chapters present, more thoroughly than is customary, the material normally covered in one-year introductions to college calculus, and end with a chapter on infinite series. (This portion of the book is being published separately, under the title Elements of Calculus.) In the last four chapters of the complete edition, the choice of material is nowhere nearly so traditional. In particular, we have laid heavy stress on the methods of linear algebra.
In the latter portion of this preface, we explain the considerations on which the selection of topics in the last few chapters is based. Most of the novelties in the Elements are in the style of treatment; and the ideas underlying them may best be explained by means of numerous examples.

1. THE SPIRAL PROCESS

The central concepts of the calculus are deep. It is not to be expected that they can be learned all at once, in the forms in which a modern mathematician thinks of them. Therefore, in this book, the more difficult ideas are presented in a series of different forms, in ascending order of difficulty, generality, and exactitude. Thus the idea of the definite integral makes its first and simplest appearance in Section 2.10; it is generalized in Section 3.7; and it is not presented in final form (using Riemann sums) until Section 7.1, where Riemann sums are needed, in the calculation of arc length.
Similarly, the chain rule for derivatives appears first in Section 3.6, for powers and square roots of functions; it is proposed, in more general forms, in Problem Sets 3.6, 3.8, 4.3, and 4.5; and it appears in final form only in Section 4.6.
The mean-value theorem is first stated, in geometric terms, in Section 3.2, before any formal definition of the derivative. It is used freely thereafter. Finally, in Section 5.7, it is proved, after the ideas needed in the proof have been used and motivated in other ways.
The idea of the limit of a function appears first in Section 2.7. The formal definition is in Section 3.3. Earlier sections include a lengthy preparation for the formal definition, designed to eliminate in advance as many of its difficulties as possible. This purpose is served by the text of Sections 1.4 and 2.5. Thus the style of treatment is such that an inspection of isolated sections of the book is likely to lead to an overestimate of the difficulty of the course. The point is that the sections are not isolated: difficult discussions have been provided with elaborate foundations, in the text and especially in the problems.

The spiral treatment, in which concepts appear in various forms as the theory develops, is intended to make the concepts easier to learn. But this is not its only purpose. The processes by which special ideas are generalized, and heuristic ideas are made concrete and exact, are part of the substance of what we ought to be teaching. Thus the heuristic treatment of exponentials and logarithms, in Section 4.9, is not given merely in order to make the student's life easier. The transition from Section 4.9 to Sections 4.10 and 4.11 (in which the theory is based on the definition $\ln x = \int_1^x (dt/t)$) is valuable in itself, as an illustration of a recasting process which is essential both in the growth of mathematics and in the growth of the people who use it.

2. MOTIVATION

The desire to solve interesting puzzles is very strong; there is no maturity level at which it disappears; and we should appeal to it continually. Most of the time, however, when new ideas are introduced, they ought to be motivated by a sense of power, and by the light that they throw on ideas already regarded as significant.
For example, if we present Riemann sums, in full generality, long before we deal with problems in which they are needed, it is not reasonable to expect the student to master their complications. Similarly, the completeness of the real number system, in the sense of Dedekind, is not needed at all in the theory of pointwise limits: this theory takes exactly the same form in the rational domain as in the real domain. If we postpone the idea of completeness until the point where it is needed, in the study of functions continuous on an interval, it is more likely to be understood, partly because it is more likely to get the student's attention.
The problem of motivating the idea of the limit of a function involves a peculiar difficulty. The only cases in which $\lim_{x \to a} f(x)$ is easy to calculate are those in which f is a continuous function, described by a simple formula. In these cases, the formula works just as well for $x = a$ as for other values of x; in practice, it turns out that the limit is $f(a)$; and the student is likely to get the idea that the expression $\lim_{x \to a} f(x)$ is merely a devious and pretentious description of $f(a)$. If we avoid this trouble by starting with significant cases, such as

$$\lim_{x \to 0} \frac{\sin x}{x},$$

then the technical difficulties are formidable, and workable problem material is hard to come by. If we choose, instead, to discuss limits of sequences, then we have evaded the issue by changing the subject: in the differential calculus, limits of functions are what we need.


But there is a fourth alternative: we can introduce the idea of a limit not as a subject in its own right, but as a device for solving a problem. In Section 2.7 we mention limits for the first time, in finding the limit of a linear function. To pass to the limit, in this case, we merely plug the hole in a punctured line. This process has no intrinsic significance. But in the context of Section 2.7, it has an extrinsic significance, because it is used to solve a nontrivial problem, namely, the problem of finding the slope of the tangent to a parabola. Similarly, in Section 2.10, we use the idea of the limit of a sequence, in a technically simple case, in order to find the area of a parabolic segment. (A formal definition of the limit of a sequence finally appears in Section 10.1.) There are many other points at which ideas are introduced, in simple forms, in connection with a discussion of something else.

3. BLACK BOXES

It is generally agreed that in a physics laboratory the student should build as much as possible of his own equipment. Nobody learns very much by watching the performance of the proverbial "black box." In mathematics the situation is similar: we do not learn mathematical principles by hearing them mentioned once, no matter how elegantly; we need to live with them and use them. Therefore, in this book certain extremely powerful theorems have been proved long before being stated. That is, the proof has been presented, in the form of a method of solving a certain class of problems; and after the student has learned the idea by using it on many problems, we have summed up the situation by stating the general theorem that the proof proves. This scheme costs very little time, even in the short run; and in the long run it is likely to save a great deal of time. The point is that if we allow recipes to take the place of ideas, in a first course, then the ideas need to be taught all over again later; and the second attempt may be harder, because the problem-solving motivation for these particular ideas has already been used up.
There are good reasons for not giving examples of this technique.
It should be understood that the avoidance of black boxes has no particular connection with the pursuit of logical rigor. Indeed, if we have to choose, it is better to master an idea in an heuristic form, by using it repeatedly, than to listen once to a rigorous exposition, and then forget it.

4. PROBLEMS

In a quick examination of a textbook, it is not a good idea to read the text and skip the problems; it is better to read the problems and skip the text. The problems represent the life that the student leads when he studies the course; and any ideas that do not appear in them are unlikely to be learned, no matter how much preachment may be devoted to them.
In this book, a variety of problems are used for a variety of purposes. There are:

1) Technical problems, as, for example, in the chapter on the technique of integration. These are carefully graded, and often they form sequences, in which the answer to one problem can be used in the solution of the next.
2) Theoretical problems, some easy, some hard. Vigorous attempts have been made to find easy ones, so as to avoid a dichotomy between techniques (which the student really uses) and "theory" (of which he is intermittently a spectator).
3) Puzzle problems.
4) Sketching exercises, in which the student is asked to translate back and forth between analytic ideas and visual images.
5) Discovery problems, which anticipate, in special cases, ideas which will later be explained in the text.

There is wide general agreement on the content of the first year course in college calculus; and in writing the Elements, the author was in the happy position of working on the basis of a consensus with which he was fully in sympathy. But there is no such general agreement on the content of a course in intermediate calculus. In the past decade, calculus courses have tended to grow, by including various topics from advanced calculus and linear algebra. But it is not easy to decide which of these topics should be included, and what relative stress should be placed on them; and in fact there is no reason to suppose that such questions have unique answers.
On the other hand, every book and every course must make some choices, and then stick to them long enough to permit a valid learning process. If the pursuit of flexibility turns an intermediate calculus book into an anthology, then its little pieces are unlikely to have any lasting effect. For example, if the treatment of infinite series is sketchy, then its residuum in the mind of the student may include hardly more than the ratio test. And the dangers presented by brief treatments of linear algebra are worse.
Modern algebra is modern because its motivations and its applications came late. Today, there are very good reasons for studying groups, rings, fields, vector spaces, normed vector spaces, inner product spaces, linear transformations, matrices, and so on. But the logical simplicity of the rudiments of these theories is misleading. For example, the manipulative process of multiplying matrices can be taught to almost anybody, at almost any level; but the significant applications of this process are another matter entirely. In a short treatment of axiomatic and linear algebra, at the freshman or sophomore level, we cannot presuppose knowledge of the significant applications, and we have no time in which to present them. Thus we may fall into a peculiar form of use-mention confusion: the reader hopes that the ideas of modern algebra are going to be used, but in the end he sees that they have merely been mentioned.
For these reasons we have tried, throughout, never to state an algebraic definition until the reader already knows at least one important instance of the idea that the definition describes; and once an algebraic idea has been introduced, we have tried, throughout, to put it to work for the purposes that it is good for. Thus, for example, matrices are introduced as a shorthand for handling linear transformations; and thereafter the treatment of the two is closely tied together. The Schwarz inequality is first introduced (on page 521) as a theorem in Cartesian three-space, and for this case it is proved by the trivial observation that $\cos^2 \theta \le 1$ for every $\theta$. Later, on page 536, it is proved in the general case, and thereafter it is used in a great variety of ways, to trivialize problems which would not otherwise be trivial. It appears in disguised forms in many problems (which should not be listed here). These examples are typical of the style of Chapters 11 through 13. It appears to the author that the nature, the purposes, and the power of algebraic methods are not likely to be understood unless they are conveyed to the student by some such extended experience.
The most impressive, but also the most difficult, of these applications occurs in
Chapter 12, on Fourier series. This topic is not ordinarily included in intermediate
courses; and if something must be omitted, in teaching a course from this book,
Chapter 12 is an excellent candidate for omission. (None of the material in it is used
later.)
Chapters 1 through 13 amount to more than 600 pages; something had to be shortened; and so the treatment of functions of many variables is shorter than might have been expected, and there is no separate chapter on differential equations. It should be noted, however, that there is a substantial treatment of linear differential equations at the end of Chapter 13, and that the viewpoint of differential equations has been stressed throughout. (Recall, for example, the treatment of the fundamental theorem of integral calculus, and of the elementary functions, in the Elements.) In Chapter 10, the standard method of showing that a given series converges to a given function is first to show that the series and the function satisfy the same differential equation, and then to show that the differential equation (with initial condition) has only one solution. Usually, the series is derived from the differential equation, and so the student is not likely to be surprised when the same process is applied later to equations whose solutions were not previously known. For this sort of reason, the book conveys much more of the spirit and methodology of differential equations than the table of contents would suggest.
Moreover, it appeared to the author that the natural sequels of the material in Chapter 14 would grow exponentially more difficult, and that they rightly belong in an advanced calculus course. The hard fact is that multivariate calculus, once we get past its beginnings, is not an elementary subject; and if we try to make it seem elementary, we are likely to give up both intuition and logic in favor of a bewildering formalism. Thus it appeared, at the end of Chapter 14, that we should say either much more, or no more at all; and since every book (even a calculus book) has got to end somewhere, the choice was clear.
The above discussion is an attempt to indicate some of the author's objectives, and some of the methods used in pursuing them. Obviously no such discussion can prove anything about the extent of the contribution that the text makes to the achievement of these objectives. A great deal has happened, in the teaching of calculus, in the past decade, and it remains to be seen how much more can be accomplished, and how.
New York City, N. Y.
October 1971

E. E. M.

Contents

Chapter 1 Inequalities

1.1 Introduction
1.2 Products which are equal to zero
1.3 Order
1.4 Absolute values. Intervals on the number line

Chapter 2 Analytic Geometry

2.1 Introduction
2.2 Coordinate systems. The distance formula
2.3 The graph of a condition. Equations for circles
2.4 Equations of lines. Slopes, parallelism, and perpendicularity
2.5 Graphs of inequalities. And, or, and if ... then
2.6 Parabolas
2.7 Tangents
2.8 A shorthand for sums
2.9 The induction principle and the well-ordering principle
2.10 Solution of the area problem for parabolas

Chapter 3 Functions, Derivatives, and Integrals

3.1 The idea of a function
3.2 The derivative of a function, intuitively considered
3.3 Continuity and limits
3.4 Theorems on limits
3.5 The process of differentiation
3.6 The process of differentiation: roots and powers of functions
3.7 The integral of a nonnegative function
3.8 The derivative of the integral
3.9 Uniformly accelerated motion
*3.10 Proof of the formula for the derivative of the integral

Chapter 4 Trigonometric and Exponential Functions

4.1 Directed angles. Trigonometric functions of angles and numbers
4.2 The law of cosines and the addition formulas
4.3 The derivatives of the trigonometric functions; the differences $\Delta x$ and $\Delta f$; the squeeze principle
4.4 The approximation of differences by differentials
4.5 Composition of functions
4.6 The chain rule
4.7 Invertible functions. The inverse trigonometric functions
4.8 Simpson's rule. The computation of $\pi$
4.9 Exponentials and logarithms
4.10 The functions ln and exp
4.11 Exponentials and logarithms. The existence of e

Chapter 5 The Variation of Continuous Functions

5.1 Intervals on which a function increases, or decreases
5.2 Local maxima and minima, direction of concavity, inflection points
5.3 The behavior of functions at infinity
5.4 The introduction of functions into geometric problems; the use of existence theorems as shortcuts
5.5 The use of functional equations as shortcuts
5.6 The completeness of R and the existence of maxima
5.7 The mean-value theorem and the no-jump theorem
5.8 The derivative of one function with respect to another

Chapter 6 The Technique of Integration

6.1 Introduction
6.2 Independent variables and indefinite integrals
6.3 Integrals leading to the logarithm and the inverse secant. Algebraic devices
6.4 Integration by parts
6.5 Integration of powers of trigonometric functions
6.6 Integration by substitution
6.7 Algebraic substitutions
6.8 Algebraic devices: completing the square and partial fractions

Chapter 7 The Definite Integral

7.1 The problem of arc length
7.2 The definite integral, defined as a limit of sample sums
7.3 The calculation of volumes, by the method of disks
7.4 The general method of cross sections, and the method of shells
7.5 The area of a surface of revolution
7.6 Moments and centroids. The theorems of Pappus
7.7 Improper integrals
*7.8 The integrability of continuous functions

Chapter 8 The Conic Sections

8.1 Translation of axes
8.2 The ellipse
8.3 The hyperbola
8.4 The general equation of the second degree. Rotation of axes

Chapter 9 Paths and Vectors in a Plane

9.1 Motion of a particle in a plane
9.2 The parametric mean-value theorem; l'Hôpital's rule
9.3 Other forms of l'Hôpital's rule
9.4 Polar coordinates
9.5 Areas in polar coordinates
9.6 The length of a path
9.7 Vectors in a plane
9.8 Free vectors
9.9 Velocity, acceleration, and curvature
9.10 Concluding remarks on vector spaces and inner product spaces

Chapter 10 Infinite Series

10.1 Limits of sequences
10.2 Infinite series. Convergence. Comparison tests
10.3 Absolute convergence. Alternating series
10.4 Estimates of remainders. Applications to power series
10.5 Termwise integration of series. Power series for $\mathrm{Tan}^{-1}$ and ln
10.6 The ratio test for absolute convergence
10.7 Power series for exp, sin, and cos
10.8 The binomial series
10.9 Taylor series
10.10 Taylor's theorem: Estimates of remainders
10.11 The complex number system
10.12 Sequences and series of complex numbers. The complex exponential function
10.13 De Moivre's theorem
*10.14 The radius of convergence. Differentiation of complex power series
*10.15 Integration and differentiation of real power series

Chapter 11 Vector Spaces and Inner Products

11.1 Cartesian coordinate systems in three-dimensional space
11.2 Direction cosines. The directed normal form
11.3 Three-dimensional space, regarded as an inner-product space
11.4 The dimension of a vector space. Various ways to form a basis
11.5 Orthonormal bases
11.6 The Schwarz inequality. More general concepts of norm and distance

Chapter 12 Fourier Series

12.1 Projections into a subspace, trigonometric polynomials and Fourier series
12.2 Uniform approximations by trigonometric polynomials
12.3 Integration of Fourier series. The uniform convergence theorem

Chapter 13 Linear Transformations, Matrices, and Determinants

13.1 Linear transformations
13.2 Composition of linear transformations and multiplication of matrices
13.3 Formal properties of the algebra of matrices. Groups and rings
13.4 The determinant function
13.5 Expansions by minors. Cramer's rule and inversion of matrices
13.6 Row and column operations. Linear independence of sets of functions
13.7 Linear differential equations
13.8 The dimension theorem for the space of solutions. The nonhomogeneous case

Chapter 14 Functions of Several Variables

14.1
14.2
14.3
14.4
14.5
14.6
14.7
14.8

Surfaces and solids in R3


The quadric surfaces
Functions of two variables. Slice functions and partial derivatives
Directional derivatives and differentiable functions

Differentiable functions of many variables. The chain rule.


Directional derivatives and gradients
Interior local maxima and minima, for functions of two variables

Line integrals

651
660
666
674
680

Level curves

14.9
14.10
14.11
14.12

Appendix A The Shorthand of Logic and Set Theory
Appendix B Algebraic Operations with Limits of Functions
Appendix C Algebraic Operations with Limits of Sequences
Appendix D The Error in the Approximation $\Delta f \approx df$
Appendix E The Continuity of Composite Functions
Appendix F The Error in Simpson's Rule
Appendix G The Idea of a Measurable Set
Appendix H Proof of the Northeast Theorem
Appendix I Proof of the Formula for Path Length
Appendix J A Method for Constructing the Complex Numbers
Appendix K Iterated Limits. Mixed Partial Derivatives
Appendix L Possible Peculiarities of Functions of Two Variables
Appendix M Maxima and Minima for Functions of Two Variables
Appendix N An Exact Definition of the Idea of a Function
Selected Answers
Index

The chain rule for paths

614
620
626
634
641
644
648

Double integrals, intuitively considered


Cylindrical coordinates in space. The definition of the integral
Moments and centroids of nonhomogeneous bodies

Chapter 1

Inequalities

1.1 INTRODUCTION

In this book it is assumed that you know elementary geometry and the algebra of the real number system. Theorems of plane geometry will be used only occasionally, and there is no need to reexamine the subject as a whole.
Inequalities, however, are another matter. We shall be using them constantly, and they are tricky. We shall therefore handle them with care. To derive the laws that govern them we first need to recall the elementary laws of the number system. These are as follows.
We have given the set R of real numbers, with the operations of addition and multiplication. Thus the number system is a triplet
$[R, +, \cdot]$.

Addition and multiplication are subject to the following laws:

Closure. For every $a$ and $b$ in R, $a + b$ and $ab$ are in R.

Associativity. For every $a$, $b$, and $c$,
$a + (b + c) = (a + b) + c$ and $a(bc) = (ab)c$.

Commutativity. For every $a$ and $b$,
$a + b = b + a$ and $ab = ba$.

Distributive Law. For every $a$, $b$, and $c$,
$a(b + c) = ab + ac$.

Existence of 0 and 1. There are two different numbers 0 and 1 such that
$a + 0 = a$ and $a \cdot 1 = a$
for every $a$.

Existence of Negatives. For every $a$ there is a number $-a$ such that $a + (-a) = 0$.

Existence of Reciprocals. For every $a \ne 0$ there is a number $1/a$ such that $a \cdot (1/a) = 1$.

These laws are called the field postulates; and any number system which satisfies
them is called a field. There are many such number systems: the real numbers form a
field, and so do the complex numbers. For a long time to come, however, we shall be
working only with the real numbers. Therefore, when we speak of numbers, we mean
real numbers, unless the contrary is stated.
We shall assume not only the field postulates but also the familiar laws based on
them. For example, we know that

(a − b)(a + b) = a² − b², and that a · 0 = 0

for every a.

1.2

PRODUCTS WHICH ARE EQUAL TO ZERO

When we perform calculations, we shall not stop to justify them on the basis of the
field postulates. But the following principle is worth special mention, because it is
used in reasoning processes which don't involve calculations:

Theorem 1. If ab = 0, then either a = 0 or b = 0.

Proof
1) If a = 0, there is nothing to prove.
2) If a ≠ 0, then a has a reciprocal. Therefore

(1/a)(ab) = (1/a) · 0,
1 · b = 0,
and
b = 0.

Thus either a = 0 or b = 0.

Obviously it is possible that a and b are both = 0. In Theorem 1 (and everywhere
else in mathematics) when we say either ... or ..., we allow the possibility of both.
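The principle in Theorem 1 can be checked by brute force on a small sample. The following is only a sketch: the helper name `zero_product_holds` is hypothetical, and arithmetic modulo 4 is brought in purely as a contrasting system (it is not taken from the text) in which a product can be 0 although neither factor is 0.

```python
from itertools import product


def zero_product_holds(elements, mul):
    """Return True if mul(a, b) == 0 forces a == 0 or b == 0 on `elements`."""
    return all(
        a == 0 or b == 0
        for a, b in product(elements, repeat=2)
        if mul(a, b) == 0
    )


# Ordinary integers in a small range: the principle holds.
holds_for_integers = zero_product_holds(range(-5, 6), lambda a, b: a * b)

# Arithmetic modulo 4 (an assumed comparison system): 2 * 2 = 0 although 2 != 0.
holds_mod_4 = zero_product_holds(range(4), lambda a, b: (a * b) % 4)

print(holds_for_integers, holds_mod_4)
```

A finite check of this kind proves nothing about all real numbers; it only illustrates what the theorem asserts and what its failure looks like.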

PROBLEM SET 1.2

1. Show that if x² = 0, then x = 0.

2. a) Obviously the numbers 1 and −1 are roots of the equation

(x − 1)(x + 1) = 0.

How do you know that no other number is a root of the equation?
b) Show that 2 and 3 are the only roots of the equation

x² − 5x + 6 = 0.

3. If 0 had a reciprocal, then its reciprocal would be a root of the equation

0 · x = 1.

Show that this equation has no root.

4. a) If ab = ac, does it follow that b = c? Why or why not?
b) If ab = ac, and a ≠ 0, does it follow that b = c? Why or why not?

5. a) Show that if abc = 0, then a = 0 or b = 0 or c = 0.
b) Show that 1, 2, and 3 are the only roots of the equation

x³ − 6x² + 11x − 6 = 0.

6. a) If a² = b², does it follow that a = b? Why or why not?
b) If a² = b², what can you conclude about the relation between a and b? Why?

7. Under what conditions (if any) is it true that

1/x + 1/a = 1/(x + a)?

8. a) Under what conditions (if any) is it true that

(a + b)² = a² + b²?

b) Under what conditions (if any) is it true that

(a + b)³ = a³ + b³?
*9. Consider the "number system" which has only two elements 0 and 1, with addition and
multiplication defined by the following tables:

[Addition and multiplication tables for 0 and 1; the entries are not recoverable here.]

Which of the field postulates hold true, in this system? Which, if any, fail to hold?
(The answer to this question suggests that the field postulates are not, in themselves,
a very adequate description of the real number system.)

*10. Consider the number system in which the "numbers" are 0, 1, 2, and 3, with addition
and multiplication defined by the following tables:

[Addition and multiplication tables for 0, 1, 2, and 3; the entries are not recoverable here.]

Exactly one of the field postulates fails to hold in this number system. Find out which
one. [Hint: Don't bother to test the Associative and Distributive Laws; in fact, they
hold true in this system, although the verifications are extremely tedious.]
Does Theorem 1 hold true in this system? Why or why not?

1.3

ORDER

We think of the real numbers as being arranged on a line, like this:

[Figure: the number line, with −√3, −1, and π among the points marked.]

When we write a < b, this means (roughly speaking) that a lies to the left of b on the
number line. Thus what we have in mind is a system

[R, +, ·, <],

where < is a relation having the following properties:

O.1. (Trichotomy) For every a and b in R, one and only one of the following
conditions holds:

a < b, or a = b, or b < a.

O.2. (Transitivity) If a < b and b < c, then a < c.

A relation satisfying O.1 and O.2 is called an order relation, and an expression of
the form a < b is called an inequality. We write b > a to mean a < b; a ≤ b means
that either a < b or a = b; and a ≥ b means that either a > b or a = b. A number
a is positive if a > 0; a is negative if a < 0. Zero is neither positive nor negative.

But O.1 and O.2 do not, by themselves, enable us to handle inequalities. We
need to know how < is related to + and ·. The laws are the following:

MO. If a > 0 and b > 0, then ab > 0.

AO. If a < b, then a + c < b + c for every c.

These four laws, in combination, tell the whole story: all of the elementary laws
of inequalities can be derived from them. You will carry out this process, in the
following problem set. Meanwhile we state the theorems without proof.

Theorem 1. If a > 0, then −a < 0.

Theorem 2. If a < 0, then −a > 0.

Theorem 3. If a < b, and c < d, then a + c < b + d.

Theorem 4. An inequality is preserved if both sides are multiplied by the same positive
number. That is, if a < b and c > 0, then ac < bc.

Similarly,

Theorem 5. An inequality is preserved if both sides are divided by the same positive
number. That is, if ac < bc and c > 0, then a < b.

Theorem 6. An inequality is reversed if both sides are multiplied by the same negative
number. That is, if a < b and c < 0, then ac > bc.

Theorem 7. An inequality is reversed if both sides are divided by the same negative
number. That is, if bc < ac, and c < 0, then b > a.
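Theorems 4 through 7 can be spot-checked numerically. The sketch below tries every triple drawn from a small, arbitrarily chosen set of exact fractions; the function name `check_order_theorems` and the sample set are assumptions made for illustration, and a finite check is of course no substitute for the proofs asked for in the problem set.

```python
from fractions import Fraction
from itertools import product

# Sample values -3, -5/2, ..., 5/2, 3 (an arbitrary choice).
samples = [Fraction(n, 2) for n in range(-6, 7)]


def check_order_theorems(values):
    """Check Theorems 4-7 on every triple (a, b, c) drawn from `values`."""
    for a, b, c in product(values, repeat=3):
        if a < b and c > 0:
            assert a * c < b * c   # Theorem 4: preserved by a positive factor
        if c > 0 and a * c < b * c:
            assert a < b           # Theorem 5: preserved by a positive divisor
        if a < b and c < 0:
            assert a * c > b * c   # Theorem 6: reversed by a negative factor
        if c < 0 and b * c < a * c:
            assert b > a           # Theorem 7: reversed by a negative divisor
    return True


theorems_hold = check_order_theorems(samples)
print(theorems_hold)
```

Exact fractions are used instead of floating-point numbers so that strict comparisons are not disturbed by rounding.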

Consider now an inequality involving an unknown number x, for example,

3x + 4 < 5x + 7.

An expression like this, involving a variable, is called an open sentence; in an open
sentence, x marks the spot where numbers are to be inserted. Some numbers, when
substituted for x, may give true statements, and other numbers may give false
statements. For example,

3 · 2 + 4 < 5 · 2 + 7

is true, because 10 < 17; but

3(−5) + 4 < 5(−5) + 7

is false, because −11 > −18.

In simple cases like this, it is easy to find out what numbers satisfy the inequality.
If

3x + 4 < 5x + 7,     (1)

then

4 < 2x + 7,     (2)

by AO. (We have added −3x to each side of the inequality.) Therefore

−3 < 2x,     (3)

by AO; and so

x > −3/2,     (4)

by Theorem 4. (We have multiplied, on each side, by 1/2, and then written the inequality
backwards, to put x on the left.)
Thus every number which satisfies (1) also satisfies (4). And all of our steps can
be reversed. If

x > −3/2,     (4)

then

−3 < 2x,     (3)

by Theorem 4; therefore

4 < 2x + 7,     (2)

by AO; and so

3x + 4 < 5x + 7,     (1)

by AO. Therefore every number which satisfies (4) also satisfies (1). We can sum all
this up briefly by writing

3x + 4 < 5x + 7  <=>  x > −3/2.

Here the symbol <=> is pronounced "is equivalent to." When we write <=> between
two inequalities (or any two open sentences of any kind) we mean that whenever one
of them is satisfied, so is the other.
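The equivalence just derived can be spot-checked by substituting numbers into both open sentences. This is a minimal sketch; the function names `original` and `solved` and the sample grid of quarter-steps from −10 to 10 are arbitrary choices made here for illustration.

```python
from fractions import Fraction


def original(x):
    """The open sentence 3x + 4 < 5x + 7."""
    return 3 * x + 4 < 5 * x + 7


def solved(x):
    """The open sentence x > -3/2."""
    return x > Fraction(-3, 2)


# Sample points -10, -39/4, ..., 10, including the boundary value -3/2.
samples = [Fraction(n, 4) for n in range(-40, 41)]

# Equivalent sentences must agree (both true or both false) at every point.
equivalent_on_samples = all(original(x) == solved(x) for x in samples)
print(equivalent_on_samples)
```

Note that the boundary value −3/2 makes both sentences false, which is consistent with the equivalence.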

We use a single-headed arrow to indicate that one condition implies another.
For example,

x > 0  =>  x² > 0.

This is true. (Why?) But

(?)  x > 0  <=>  x² > 0  (?)

is false, because −1 satisfies the second inequality but not the first. Similarly,

a = b  =>  a² = b²

is true, but

(?)  a = b  <=>  a² = b²  (?)

is false, because if a ≠ 0 and b = −a, then the second equation holds, but the first
does not.
The shorthand symbols <=> and => are worth learning and using. The reason is
that when we write down strings of formulas, in solving a problem, we ought to
indicate what the connection between them is supposed to be. We are more likely to
do this if we have a way of doing it briefly.
Using the symbols => and <=>, we can restate some of the theorems of this section
in a more efficient way. For example, AO says that

a < b  =>  a + c < b + c.     (5)

And given a + c < b + c, we can add −c to both sides, preserving the inequality.
Therefore

a + c < b + c  =>  a < b.     (6)

These fit together to give:

The Addition Law of Order. a < b  <=>  a + c < b + c.

We shall refer to this, for short, as ALO. Similarly, Theorem 4 says that

for c > 0,  a < b  =>  ac < bc.

Theorem 5 says that

for c > 0,  ac < bc  =>  a < b.

These fit together to give:

The Multiplication Law of Order. For c > 0,  a < b  <=>  ac < bc.

This will be referred to as MLO. Theorems 6 and 7 say that

for c < 0,  a < b  =>  ac > bc,     (7)

and

for c < 0,  bc < ac  =>  b > a.     (8)

And (8) can be rewritten in the form

for c < 0,  ac > bc  =>  a < b.     (8')

Thus Theorems 6 and 7 fit together to give:

Reversal of Order. For c < 0,  a < b  <=>  ac > bc.

We sum all this up in the short form below. The meanings of the
abbreviations should be plain.

Trich. For every a and b in R, one and only one of the following
conditions holds:

a < b, or a = b, or b < a.

Trans. a < b and b < c  =>  a < c.

MO. a > 0 and b > 0  =>  ab > 0.

AO. a < b  =>  a + c < b + c.

Theorem 1. a > 0  =>  −a < 0.

Theorem 2. a < 0  =>  −a > 0.

Theorem 3. a < b and c < d  =>  a + c < b + d.

ALO. a < b  <=>  a + c < b + c.

MLO. For c > 0,  a < b  <=>  ac < bc.

RO. For c < 0,  a < b  <=>  ac > bc.

The last three of these are convenient in solving inequalities; they enable us to
use <=> at each stage, instead of working first forward and then backward. For
example, the solution of the illustrative problem above can now be written like this:

3x + 4 < 5x + 7
<=>  4 < 2x + 7       by ALO
<=>  −3 < 2x          by ALO
<=>  −3/2 < x         by MLO
<=>  x > −3/2         by definition of >.

A linear inequality is said to be solved when we find an equivalent inequality of the
form x < a or x > a.
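The chain of equivalences used above can be turned into a small routine. The following is a sketch under assumptions: the function name `solve_linear_inequality` and its argument convention (coefficients of a·x + b < c·x + d) are invented here for illustration and do not come from the text.

```python
from fractions import Fraction


def solve_linear_inequality(a, b, c, d):
    """Solve a*x + b < c*x + d; return ('<', bound) or ('>', bound).

    ('<', k) stands for x < k and ('>', k) stands for x > k.
    """
    a, b, c, d = map(Fraction, (a, b, c, d))
    # ALO, applied twice: add -c*x and -b to both sides.
    coeff = a - c
    rhs = d - b            # now the sentence reads coeff * x < rhs
    if coeff == 0:
        raise ValueError("x cancels; this is not a linear inequality in x")
    bound = rhs / coeff
    # MLO keeps the direction when 1/coeff > 0; RO reverses it when 1/coeff < 0.
    return ("<", bound) if coeff > 0 else (">", bound)


# The illustrative problem 3x + 4 < 5x + 7.
print(solve_linear_inequality(3, 4, 5, 7))
```

The branch on the sign of `coeff` is where the Multiplication Law of Order and Reversal of Order enter; every other step uses only ALO.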

PROBLEM SET 1.3

Solve the following inequalities, by writing a chain of equivalent inequalities, and giving
at the right the reason for each step, as in the text.

1. 5 − 3x > 17 + x
2. 5x − 3 < 17x + 1
3. 5x + 3 > 17x + 1
4. 5 + 3x < 17 + x
5. −3x − 7 < x + 5
6. −4x − 8 < 2x + 6
7. 6x − 10 > 5x + 3
8. 3 − 2x < 4 − 3x
9. 2x − 6 < 2 − 2x
10. 6x − 2 < 3 + x
11. 2x + 6 < 3 + x
12. 6(x − 2) > x − 3

In the following problems, we develop the theory in which all of the results of
this section are derived from Trich., Trans., MO, and AO. Therefore, at the start, these
are the only statements that can be given as reasons in proofs. In each problem, however,
you may assume that the results given in the preceding problems are known and you may
cite them as reasons.

13. Following are the steps in the proof of Theorem 1. Complete the proof by giving a
reason for each step.

a) a > 0  =>  0 < a
b) 0 < a  =>  −a + 0 < −a + a
c) −a + 0 < −a + a  =>  −a < 0
d) a > 0  =>  −a < 0

14. Following is an outline of the proof of Theorem 2. Complete the proof by giving a
reason for each =>.

a < 0  =>  −a + a < −a + 0
       =>  0 < −a
       =>  −a > 0.

15. a) Give a reason for the statement

a < b  =>  a + c < b + c.

b) Similarly, for

c < d  =>  b + c < b + d.

c) Prove Theorem 3.

16. a) Show that

a < b  <=>  b − a > 0.

(More than one step is needed here.)
b) Give a reason for the statement

b − a > 0 and c > 0  =>  (b − a)c > 0.

c) Prove Theorem 4.

17. Show that

x² ≥ 0, for every x.

[Hint: By Trich., there are three cases to be considered: x > 0, or x = 0, or x < 0.
Show that in each of these cases we have either x² > 0 or x² = 0.]

18. Show that

y² − 2y + 1 ≥ 0, for every y.

19. a) Everybody knows that 1 > 0. Prove it, on the basis of the theory that we have
developed so far. (You may assume, of course, that 1 ≠ 0.)
b) Show that

a > 0  =>  1/a > 0.

That is, the reciprocal of every positive number is positive. [Hint: By Trich., it
will be sufficient to show that the conditions 1/a = 0 and 1/a < 0 are impossible.
Remember that a · (1/a) = 1.]

20. Show that

c > 0 and ac < bc  =>  a < b.

(This is Theorem 5.)

21. Give the reason for each step in the following proof of Theorem 6.

a < b and c < 0
=>  b − a > 0 and c < 0
=>  b − a > 0 and −c > 0
=>  (b − a)(−c) > 0
=>  ac − bc > 0
=>  ac > bc.

22. Give the reason for each step in the following proof of Theorem 7.

bc < ac and c < 0
=>  bc < ac and −c > 0
=>  bc < ac and 1/(−c) > 0
=>  ac − bc > 0 and 1/(−c) > 0
=>  (1/(−c))(ac − bc) > 0
=>  b − a > 0
=>  b > a.

23. Is there a positive number which is smaller than all other positive numbers? Why or
why not?

24. Is there a negative number which is larger than all other negative numbers? Why
or why not?

*25. Is it possible to define, for the complex numbers, a relation < which obeys the laws
O.1 and O.2? (That is, can an order relation be defined for the complex numbers?)
Why or why not?

*26. Is it possible to define, for the complex numbers, a relation < which satisfies not only
O.1 and O.2 but also MO and AO? [Hint: Since i ≠ 0, we must have i > 0 or −i > 0.]

The language in which these problems are stated ought to suggest what the answers are.
The answer to Problem 26 indicates why it is that arranging the complex numbers in an
order is not a useful proceeding. In the complex number system, no theory of inequalities
can be made to work.

1.4

ABSOLUTE VALUES. INTERVALS ON THE NUMBER LINE

The absolute value |x| of a number x is defined by the following two conditions:

1) If x ≥ 0, then |x| = x.

2) If x < 0, then |x| = −x.

Thus under Condition (1) we have

|2| = 2,

and under Condition (2) we have

|−2| = −(−2) = 2.

Thus the operation | | leaves positive numbers unchanged, and replaces each negative
number by the corresponding positive number. On this basis it is easy to see that the
following theorem holds.

Theorem 1. For every x, |x| ≥ 0.

Proof There are two cases to consider.

Case 1. x ≥ 0. Here |x| = x, by definition of |x|. Therefore |x| ≥ 0 in Case 1.

Case 2. x < 0. Here |x| = −x, by definition of |x|; and −x > 0, by Theorem 2
of the preceding section. Therefore |x| > 0 in Case 2.

Thus in each case we have |x| ≥ 0.

Theorem 2. For every x, |x|² = x².

This is true because |x| is either x or −x, and (−x)² = x².

A number x is a square root of a number a if x² = a. For each a > 0, √a is the
positive square root of a. Thus, for example, 9 has two square roots, 3 and −3;
and √9 is 3, which is the positive square root. We define √0 = 0. Here and hereafter,
we are assuming that positive numbers have roots of all orders: square roots, cube
roots, and so on.

Theorem 3. For every x, |x| = √(x²).

Proof By Theorem 2, |x|² = x², and so |x| is a square root of x². By Theorem 1,
|x| ≥ 0. Therefore |x| = √(x²), by definition of √.

Theorem 4. For every x, |−x| = |x|.

This is true because |−x| = √((−x)²) = √(x²) = |x|.

Theorem 5. For every x and y, |xy| = |x| |y|.

This is true because |xy| = √((xy)²) = √(x²y²) = √(x²) √(y²) = |x| |y|.

Theorem 6. For every x, x ≤ |x|.

Here, as in the proof of Theorem 1, we need to consider two cases.

Case 1. x ≥ 0. Here x ≤ |x|, because |x| = x.

Case 2. x < 0. Here |x| = −x, and −x > 0. (Why?) Thus x < 0 < −x = |x|,
and so x < |x|.

Theorem 7. (The triangular inequality) For every x and y,

|x + y| ≤ |x| + |y|.

The trouble with this theorem, if we try to prove it by brute force, is that there
are too many cases to consider: each of the numbers x and y may or may not be
negative; and if x and y have different signs, x + y may or may not be negative. It
turns out, however, that we can get a proof by examining only two cases:

Case 1. x + y ≥ 0. In this case

|x + y| = x + y.

Since x ≤ |x| and y ≤ |y|, we have

x + y ≤ |x| + |y|,

and so

|x + y| ≤ |x| + |y|.

Case 2. Suppose that x + y < 0. Then −x − y > 0. Therefore

|x + y| = |−x − y| = |(−x) + (−y)| ≤ |−x| + |−y|,

by the result of Case 1. Since |−x| = |x| and |−y| = |y|, we have

|x + y| ≤ |x| + |y|,

which was to be proved.
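A numerical spot-check of the triangular inequality can make the case analysis more concrete. The sketch below uses an arbitrarily chosen grid of exact fractions; the variable names are invented here. It also records the standard fact that strict inequality occurs exactly when x and y have opposite signs, which is an added observation, not a statement from the text.

```python
from fractions import Fraction
from itertools import product

# Sample values -3, -8/3, ..., 8/3, 3 (an arbitrary choice).
samples = [Fraction(n, 3) for n in range(-9, 10)]
pairs = list(product(samples, repeat=2))

# Theorem 7: |x + y| <= |x| + |y| for every pair.
triangle_ok = all(abs(x + y) <= abs(x) + abs(y) for x, y in pairs)

# Pairs for which the inequality is strict.
strict_pairs = [(x, y) for x, y in pairs if abs(x + y) < abs(x) + abs(y)]

# On this sample, strictness coincides with x and y having opposite signs.
strict_iff_opposite_signs = all(
    (abs(x + y) < abs(x) + abs(y)) == (x * y < 0) for x, y in pairs
)

print(triangle_ok, strict_iff_opposite_signs, len(strict_pairs))
```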


Theorem 8. Given d > 0. Then

|x| < d  <=>  −d < x < d.

[Figure: the points of the number line between −d and d, labeled |x| < d.]

This is geometrically obvious: |x| is "the distance between 0 and x, on the number
line"; and the points that lie within a distance d of the origin are the numbers between
−d and d. We get a more general result by using any given point a instead of the
origin.

Theorem 9. Given d > 0, and any number a. Then

|x − a| < d  <=>  a − d < x < a + d.

[Figure: the points of the number line between a − d and a + d, labeled |x − a| < d.]

Proof In Theorem 8, substitute x − a for x. This gives

|x − a| < d  <=>  −d < x − a < d.

And

−d < x − a < d  <=>  a − d < x < a + d.

(Reason?)
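Theorem 9 says that two ways of describing the same set of numbers agree. The sketch below compares them at sample points; the function names and the particular values a = 2 and d = 1/2 are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def within_distance(x, a, d):
    """The condition |x - a| < d."""
    return abs(x - a) < d


def in_open_interval(x, a, d):
    """The condition a - d < x < a + d."""
    return a - d < x < a + d


# Assumed example values: center a = 2, distance d = 1/2.
a, d = Fraction(2), Fraction(1, 2)

# Sample points 0, 1/8, ..., 4, including both endpoints 3/2 and 5/2.
samples = [Fraction(n, 8) for n in range(0, 33)]

theorem_9_ok = all(
    within_distance(x, a, d) == in_open_interval(x, a, d) for x in samples
)
print(theorem_9_ok)
```

The endpoints a − d and a + d fail both conditions, in agreement with the strict inequalities in the theorem.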
If a < b, then the set of all numbers between a and b is called an open interval,
and is denoted by (a, b).

[Figure: the open interval (a, b) on the number line.]

There is a shorthand for this sort of statement:

(a, b) = {x | a < x < b}.

The expression on the right denotes the set of all objects that satisfy the condition
following the vertical bar. This is called the solution set of the open sentence
a < x < b. Similarly, the set of all positive numbers is the solution set of the open
sentence x > 0; this is denoted by {x | x > 0}. Thus two open sentences are equivalent
if they have the same solution set.
Sometimes it turns out that an open sentence never gives a true statement, no
matter what we substitute for x. In such cases, the solution set is empty. The empty
set is denoted by { }. For example,

{x | √(x²) = x − 1} = { }.

The notation { } is designed to suggest its meaning: we describe sets in the brace
notation; and when there is nothing written between the braces, this means that the
set has nothing in it.
If we add to the open interval (a, b) the endpoints a and b, we get a closed interval,
denoted by [a, b].

[Figure: the closed interval [a, b] on the number line.]

Thus

[a, b] = {x | a ≤ x ≤ b}.

We shall also be dealing with "infinite intervals." In the first figure below,
the "infinite interval" is

(a, ∞) = {x | a < x}.

Similarly,

(−∞, a) = {x | x < a},

as shown in the second figure below.

[Figures: the infinite intervals (a, ∞) and (−∞, a) on the number line.]

This notation, in which "∞" is used as if it denoted a number, is not very logical,
but it is convenient. To keep track of the notation, you should think of fictitious
"numbers" −∞ and ∞ as the "ends" of the number line, as shown below.

[Figure: the number line with −∞ and ∞ marked at its "ends."]

We also use "half-open" intervals:

[a, b) = {x | a ≤ x < b}

and

(a, b] = {x | a < x ≤ b},

[Figures: the half-open intervals [a, b) and (a, b] on the number line.]

and "closed infinite" intervals:

[a, ∞) = {x | x ≥ a},

and

(−∞, a] = {x | x ≤ a}.

[Figures: the closed infinite intervals [a, ∞) and (−∞, a] on the number line.]

Finally, we may refer to the whole real number system R as the interval (−∞, ∞).
Thus we have a total of nine kinds of interval:

(a, b), (a, b], [a, b), [a, b],
(a, ∞), (−∞, a), [a, ∞), (−∞, a],
(−∞, ∞).

In some of the problems below, you may find it convenient to use the following:

Theorem 10. If |x| = |y|, then x = y or x = −y.

Proof

|x| = |y|
=>  |x|² = |y|²
=>  x² = y²
=>  x² − y² = 0
=>  (x − y)(x + y) = 0
=>  x = y or x = −y.

(The converse is obvious.)


PROBLEM SET 1.4
Describe each of the following sets in the interval notation. Your answers should be in

a form

like the following:

{xi 3x+4

<

2. {x I 3 - x > x - 3}
4. {x I Ix - 31 2}
6. {x I 2x+3 6x - 4}
8. {x I 12+xi < O}

{x J 3x+4 > 4x+5}


{x J lxl < 1}
{x I Ix - 51 < 5}
{x I 11 - xi 2}

9. a) Is it true that

Yx2

oo .

1.3 of the text.)

(This is the example discussed in Section

1.
3.
5.
7.

5x+7} = (-f,

x for every x?
.

Why or why not? Describe the set

{x I Yx2 = x},

in the interval notation.


b) Describe the set

{xi Y(x+1)2=x+1},
in the interval notation.
Find out for what numbers

(if any) each of the following conditions holds. In each

case in which the solution set is an interval, the answer should be given in the interval
notation.

10.
12.
14.
16.

Yx2- 2x+1 = x - 1
lx2 - 5x+61=x2 - 5x+6
Ix+11 =11 - xi
Yx2-l=x

11.
13.
15 .
17.

lx2- 5x+61 = Ix -31 Ix - 21


Ix - 51 = l2x - 31
vx2+1 = x
l2x -lj +Ix+31 l3x+21

Absolute Values.

1.4

19. ,12x - x2 = 1
21. Ix + ll +l2x +31

18. 1 7x +31 +1 3 - xi 6 Ix + ll
20. l2x - x21
22. Ix

Intervals on the Number Line

x +2x2

>

15

ll lx2 +xi = lx3 - xi

Indicate graphically, on a number scale, the places where the following conditions hold;
describe the graphs in the interval notation if possible.

23. lxl

25. l2x - 31 ;;;; i


27. 13 - 2xl ;;;; i
29. 1 1 - xi ;;;; 2
3 1. Ix 11 < 2

33.

24. Ix - 21 <1
26. Ix - 11 < -
28. Ix 21 < :l
30. l2x - 41 < 1
32. 1 2x - 1 1 1

<2

lxl -

and

a) Show that if

0, then

lI 11
=

lbl is the number y


lbl 1 1/bl
l.)

b) Show that if

34.

(There is a short proof.)


b) Show that for every

l,bl
a

and

and

b,
b,

x 2

Ix - 21 ;;;; 1

l2x - 11 ;;;; 1

and

such that

lbl

y = I. Therefore

0, then

a) Show that for every

and

(By definition, the reciprocal of


it is sufficient to show that

and (also)

lal

fbl.

la - bl lal - lbl.
la +bl lal - lbl.

(The proof is short.)

35.

For what numbers a is the fraction

a/lal

defined?

What is this fraction equal to, for

various values of a?

36.

Sketch

{x I Ix - 21 + 1 7 - xi

5}

on the number line, and describe this set in the interval notation.

Analytic Geometry

2.1

INTRODUCTION

This chapter includes various topics which serve as a preparation for calculus. Some
of these topics are familiar to you, at least in some form. In such cases you should
still read the text carefully, in order to learn the terminology that will be used hereafter.
2.2

COORDINATE SYSTEMS. THE DISTANCE FORMULA

We shall now apply algebra to the study of geometry. We start with a plane, in the
usual sense of Euclidean geometry; and we suppose that a unit of distance has been
chosen, once for all, so that the distance between two points P and Q is a well-defined
nonnegative number. The distance between the points P and Q is denoted by PQ.
(We say merely that PQ is nonnegative, rather than PQ > 0, because we are allowing
the case P = Q, and in this case PQ = 0.)
To set up a coordinate system in a plane, we first need to assign number-labels to
the points of a line. We choose a point O as the origin; it is given the label 0.
Each point P₁ to the right of O is labeled with the distance x₁ = OP₁, which is positive.
And each point P₂ to the left of O is labeled with the number x₂ = −OP₂, which is
negative. Thus we have a matching scheme, under which each point of the line is
matched with exactly one real number.

[Figure: a number line with the points P, Q, R, S, T, U marked.]

For the points marked in the figure, the matching pairs are

P <-> −2,   Q <-> −1,   R <-> 1,
S <-> √2,   T <-> 2,    U <-> π.

Here the double arrow is pronounced "is matched with." Every such pair has the
form P <-> x, where P is a point and x is a number. A one-to-one matching scheme,
between the elements of one set and the elements of another, is called a one-to-one
correspondence between the two sets.


If the correspondence is set up in the way that we have just described, then we
can compute the distance between any two points by means of the formula

P₁P₂ = |x₂ − x₁|.

Here P₁ <-> x₁ and P₂ <-> x₂. This distance formula holds no matter how the points
P₁ and P₂ are situated on the line:

[Figure: several arrangements of the points O, P₁, and P₂ on the line.]

and so on; in every case, P₁P₂ = |x₂ − x₁|. Thus we have a one-to-one
correspondence P <-> x, between the points of the line and the real numbers, such that
the distance formula holds for every pair of points. Such a correspondence is called a
coordinate system for the line. If P <-> x, then x is called the coordinate of P.
These ideas are summed up in the following postulate.

The Ruler Postulate. Every line has a coordinate system. And given any two points
O and P of the line, there is a coordinate system in which the coordinate of O is 0
and the coordinate of P is positive.

[Figure: a line with O labeled 0 and P labeled x > 0.]

On the basis of the ruler postulate, it is easy to set up a coordinate system in the
plane. We take two perpendicular lines X and Y, intersecting in a point O. On each
of the two lines we set up a coordinate system, in such a way that O <-> 0; that is, the
coordinate of O is zero on each of the lines X and Y. X is called the x-axis, Y is
called the y-axis, and the point O is called the origin.
Given any point P of the plane, we drop a perpendicular from P to the x-axis,
ending at a point M. The point M has a coordinate x, on the line X. If M <-> x,
then x is called the x-coordinate of P.

[Figures: a point P with the feet M and N of the perpendiculars to the x-axis and the y-axis.]

Similarly, we drop a perpendicular from P to the y-axis, ending at a point N.
If N <-> y, then y is called the y-coordinate of P. Thus we have a matching scheme

P <-> (x, y)

between the points P of the plane and the ordered pairs (x, y) of real numbers. The
order in which we write the numbers makes a difference. In the left-hand figure below,

P <-> (1, 2)   and   Q <-> (2, 1).

We may speak of "the point (1, 2)" or "the point (x, y)," meaning "the point
which is matched with (1, 2)" or "the point which is matched with (x, y)." Thus we
may write P = (x, y), meaning P <-> (x, y).

[Figures: left, the points P and Q; right, a point P with the feet M and N of its perpendiculars to the axes.]

Obviously x and y are determined when P is known. And P is determined when
x and y are known, because the vertical line through M and the horizontal line
through N intersect in exactly one point. Thus we have a one-to-one correspondence

P <-> (x, y)

between the points of the plane and the ordered pairs of real numbers. Such a
correspondence is called a coordinate system for the plane. We need to see how the
algebra in this situation is related to the geometry.
Consider first the question of distance. If we know the coordinates (x₁, y₁) and
(x₂, y₂) of two points P and Q, then the points are determined, and so the distance
between them is determined. The following theorem gives a formula for the distance.

Theorem 1. If

P = (x₁, y₁)   and   Q = (x₂, y₂),

then

PQ = √((x₂ − x₁)² + (y₂ − y₁)²).

Proof Draw the vertical line through Q and the horizontal line through P, meeting
at the point R. Let S and T be the feet of the perpendiculars to X, from P and Q
respectively. Then

PR = ST,

because opposite sides of a rectangle have the same length. And

ST = |x₂ − x₁|,

by definition of a coordinate system on a line. Therefore

PR = |x₂ − x₁|.

For the same sort of reason,

RQ = UV = |y₂ − y₁|.

But △PQR is a right triangle, with its right angle at R. Therefore, by the Pythagorean
theorem,

PQ² = PR² + RQ²
    = |x₂ − x₁|² + |y₂ − y₁|².

Therefore

PQ = √(|x₂ − x₁|² + |y₂ − y₁|²).

This is not quite the formula given in Theorem 1, because it uses absolute-value
signs instead of parentheses. But this makes no difference, because

|x₂ − x₁|² = (x₂ − x₁)²,

and

|y₂ − y₁|² = (y₂ − y₁)².

(Why? We need a theorem from Section 1.4.)
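The distance formula of Theorem 1 translates directly into a short function. This is a minimal sketch; the function name `distance` and the representation of points as (x, y) tuples are choices made here for illustration.

```python
import math


def distance(p, q):
    """Distance PQ between p = (x1, y1) and q = (x2, y2), by Theorem 1."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


# A 3-4-5 right triangle with its right angle at (3, 0).
print(distance((0, 0), (3, 4)))

# Interchanging the two points gives the same answer, since PQ = QP.
print(distance((1, 2), (4, 6)) == distance((4, 6), (1, 2)))
```

The function works equally well when the two points lie on the same horizontal or vertical line, although in that case the triangle used in the proof does not exist.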


In the previous figures, we have shown the x-axis going positively from left to
right, and the y-axis going positively from bottom to top. Logically speaking, we
could equally well have put the axes in any of a number of other positions:

[Figures: several possible positions for the axes; the usual position is shown on the right.]

But the axes are usually drawn as shown on the right above. This figure shows the
minimum that must be indicated when graph paper is used for drawing pictures of
coordinate systems. That is, the axes must be labeled, and the number scale must be
shown on each axis, by indicating the coordinate of at least one point.
The two axes separate the plane into four parts, called quadrants. The quadrants
are numbered I, II, III, IV. That is, the first quadrant is the set of all points (x, y)
of the plane for which x > 0 and y > 0; the second quadrant is the set of all points
(x, y) for which x < 0 and y > 0; and so on.

[Figures: left, the quadrants I, II, III, IV; right, the axes labeled x and y.]

We have used the letters X and Y in order to have convenient names for the x-
and y-axes. The axes are more commonly labeled as on the right above.
PROBLEM SET 2.2
Calculate the distances between the following pairs of points. Then plot the points and
check the plausibility of your answers.

1. a) (1, 2)

and

c) (7, -5)
2.

a) (3, 7)

(3, 4)

and

3. Obviously, PQ

b) ( -2, -4)

and

(4, 2)

(5, -7)

d) (1, 0)

and

(0, 1 )

(-3, -7)

b ) ( 1, 3 )

and

(--2, 7)

and

QP for every pair of points P, Q; the distance between two points

does not depend on the order in which the points are named.

Therefore any correct

distance formula has the property that when we interchange the two points, the formula
gives the same answer. Check algebraically that our distance formula has this property.
4.

Find out whether or not the points ( - 1 0 , 10), (14, 3), and (38, -4) are vertices of an
isosceles triangle. Is the triangle equilateral?

5. Find all points (x, y) such that (0, 0), (2, 2), and (x, y) are the vertices of an equilateral
triangle.
6.

Find out whether the points ( -2, 3), (0, 1), and (3, 4) are the vertices of a right triangle.
Then plot the points and check for plausibility.

(This problem can and should be

worked by the use of distances alone. The use of slopes is not necessary.)

7. Find the coordinates of the point which is equidistant from (0, 0), (1, 2), and (3, -1).
Find the radius of the circle which passes through the three given points.

8. What point on the y-axis, if any, is equidistant from ( -1, -2) and (2, 3)?
9. a) Give a formula for the perpendicular distance between (x, y) and the x-axis.
b) Give a formula for the perpendicular distance between (x, y) and the y-axis.
10. Find out whether the points (−1, −1), (0, 1), and (2, 5) are collinear. Then plot and
check for plausibility. (The remarks following Problem 6 also apply here.)

11. Find a point on the x-axis which is collinear with the points (1, 2) and (0, 3). (The
remarks following Problem 6 also apply here.)

The following problems are a review of the main theorems of elementary geometry that
we have been using so far.

12. Show that an exterior angle of a triangle is greater than either of its remote interior
angles.

[Figures: left, a triangle with exterior angle ∠ACD; right, the construction used in the proof.]

That is, show that in the left-hand figure we have ∠ACD > ∠A. The proof is
based on the figure on the right. [Query: If you know that ∠ACD > ∠A, how do
you infer that ∠ACD > ∠B?]

13. Show that there is only one perpendicular to a given line, from a given external point.
That is, show that the left-hand figure below is impossible for A ≠ B. (We needed this
in order to explain what was meant by the x-coordinate of a point; A must be determined
when P is known.)

[Figures: left, two perpendiculars from P to a line, with feet A and B; right, a figure suggesting a proof of the Pythagorean theorem.]

14. Write the proof of the Pythagorean theorem suggested by the figure on the right above.

15. The proof of Theorem 1 of this section was incomplete: it discussed only the most
significant case and neglected to mention two other cases. The point is that if P and Q
lie on the same horizontal line, or the same vertical line, then there is no such thing as
△PQR, and so the Pythagorean theorem cannot be used.
Show that the distance formula holds in the case x₁ = x₂, and also in the case
y₁ = y₂.

2.3

THE GRAPH OF A CONDITION. EQUATIONS FOR CIRCLES

Given a point P and a positive number r, the circle with center P and radius r is the
set of all points of the plane whose distance from P is equal to r. That is, a point Q
is on the circle if PQ = r.
This is the first and simplest example of the idea of the graph of a condition. If
we state a condition which every point of the plane either satisfies or doesn't satisfy,
then the graph of the condition is the set of all points of the plane that satisfy it. (Thus
the graph is simply the solution set of an open sentence; we use the word graph
when the solution set is a set of points.) In this language, we say that the graph of the
condition OQ = r is the circle with center at the origin and radius r.

[Figure: the circle with center at the origin and radius r.]

The interior of the circle with center P and radius r (r > 0) is the set of all points
Q such that PQ < r. Thus the interior is the graph of the inequality PQ < r. We
indicate such graphs in figures by means of shading or cross-hatching.

Sometimes the condition takes the form of an algebraic equation. For example,
if Q <-> (x, y), then the distance formula tells us that

OQ = √(x² + y²).

Therefore the condition

OQ = r     (1)

can be written in the equivalent form

√(x² + y²) = r,     (2)

or

x² + y² = r².     (3)

The point (x, y) is on the circle if and only if x and y satisfy (2). And

√(x² + y²) = r  <=>  x² + y² = r²     (r > 0).

Thus the circle with center at the origin and radius 2 is the graph of

√(x² + y²) = 2  <=>  x² + y² = 4;

and the interior of this circle is the graph of

√(x² + y²) < 2  <=>  x² + y² < 4.

Similarly, the first quadrant is the graph of the condition

x > 0 and y > 0.

[Figure: the first quadrant (x > 0, y > 0) and the fourth quadrant (x > 0, y < 0).]

The fourth quadrant is the graph of the condition x > 0 and y < 0.

We found that the circle with center at the origin and radius r is the graph of the
equation
$$x^2 + y^2 = r^2.$$

Consider, more generally, the circle with center at (a, b) and radius r. By
definition, the circle is
$$\{P \mid QP = r\}.$$

[Figure: the circle with center Q(a, b) and radius r, and a point P(x, y) on it.]

Using the distance formula to express QP algebraically, we get
$$QP = r \iff \sqrt{(x - a)^2 + (y - b)^2} = r \iff (x - a)^2 + (y - b)^2 = r^2.$$
Thus:

Theorem 1. The circle with center at (a, b) and radius r is the graph of the equation
$$(x - a)^2 + (y - b)^2 = r^2.$$

An equation written in the above form is easy to interpret. For example, given
$$(x + 2)^2 + (y - 5)^2 = 4,$$
we see by Theorem 1 what the graph is. On the other hand, if such an equation
is "simplified" algebraically, it may look like this:
$$x^2 + y^2 + 4x - 10y + 25 = 0.$$

To find out what the graph is, we first "unsimplify" by completing the square:
$$x^2 + 4x + y^2 - 10y = -25$$
$$\iff x^2 + 4x + 4 + y^2 - 10y + 25 = -25 + 4 + 25$$
$$\iff (x + 2)^2 + (y - 5)^2 = 4.$$

In the general case, for equations of the form
$$x^2 + y^2 + Dx + Ey + F = 0,$$
there are three possibilities for the graph. In some cases, the graph is a circle. But
$$x^2 + y^2 = 0$$
is also an equation of this form, and its graph is not a circle, but a single point, namely
the origin. And the equation
$$x^2 + y^2 + 1 = 0$$
is never satisfied, for any x and y. Its graph is therefore the empty set { }.

By completing the square, starting with the general form, we shall show that these
three possibilities (a circle, a point, and the empty set) are in fact the only ones:
$$x^2 + y^2 + Dx + Ey + F = 0$$
$$\iff x^2 + Dx + \left(\frac{D}{2}\right)^2 + y^2 + Ey + \left(\frac{E}{2}\right)^2 = -F + \left(\frac{D}{2}\right)^2 + \left(\frac{E}{2}\right)^2$$
$$\iff \left(x + \frac{D}{2}\right)^2 + \left(y + \frac{E}{2}\right)^2 = \frac{D^2 + E^2 - 4F}{4}.$$
If the fraction on the right, in the last equation, is positive, then it is $= r^2$ for some
positive number r, and so the graph is the circle with center at $(-D/2, -E/2)$ and
radius r. If the fraction on the right is 0, then the equation takes the form
$$\left(x + \frac{D}{2}\right)^2 + \left(y + \frac{E}{2}\right)^2 = 0,$$
and the graph contains only the point $(-D/2, -E/2)$. Finally, if the fraction on the
right is negative, then the equation is never satisfied, for any x and y, and so the graph
is the empty set { }.
To sum up:

Theorem 2. The graph of an equation of the form
$$x^2 + y^2 + Dx + Ey + F = 0$$
is a circle, a point, or the empty set.
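The three cases of Theorem 2 can be told apart by the sign of $D^2 + E^2 - 4F$. Here is a minimal sketch of that test; the function name `classify` and the tuple it returns are illustrative choices, not part of the text. Exact fractions are used so that the sign test is not disturbed by rounding.

```python
import math
from fractions import Fraction

def classify(D, E, F):
    """Classify the graph of x^2 + y^2 + Dx + Ey + F = 0.

    Returns (kind, center, radius); center and radius are None when
    they do not apply.
    """
    D, E, F = Fraction(D), Fraction(E), Fraction(F)
    rhs = (D * D + E * E - 4 * F) / 4   # right-hand side after completing the square
    center = (-D / 2, -E / 2)
    if rhs > 0:
        return ("circle", center, math.sqrt(rhs))
    if rhs == 0:
        return ("point", center, None)
    return ("empty", None, None)

print(classify(4, -10, 25))  # the example x^2 + y^2 + 4x - 10y + 25 = 0
print(classify(0, 0, 0))     # x^2 + y^2 = 0
print(classify(0, 0, 1))     # x^2 + y^2 + 1 = 0
```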


PROBLEM SET 2.3
Problems 1 through 6.

In the illustration below six figures are drawn. For each of these figures, state a condition
which has the given figure as its graph. In the figure, the arrowheads merely indicate that the
line is supposed to go infinitely far in the indicated direction. Thus (1) and (2) are entire
lines; (3) is a ray, going infinitely far on the right, but stopping at the point (0, 4) on the left;
and (6) is a segment, with endpoints (1, -3) and (4, -3).

[Figure: six figures, numbered (1) through (6).]

Problems 7 through 10.


Follow the same directions as in the previous problems for the illustration below.


Sketch the graphs of the following conditions, using cross-hatching to indicate regions.
11. x2

y2 = 1

14. x = 2
18. x2

and

12. x2
0 y 2

y2

<

15. x = - 3

y2 1

13. x2

1
16. y

19. x y 0

y2

>

17. y = x/lxl, x
20. x

>

0,

>=

y = 3

21. a) Sketch the graph of the condition "(x, y) is equidistant from the points (0, 1) and
(1, 0)."
b) Write this condition in the simplest possible algebraic form.
22. Write the simplest equation that you can get, for the set of all points that are equidistant
from (1, 2) and (0, 3). What sort of a figure is this graph? How is it related to the
segment from (1, 2) to (0, 3)?
23. Same problem, for the set of all points that are equidistant from (1, 2) and (2, 2).
24. Same problem, for the set of all points that are equidistant from $P_1 = (x_1, y_1)$ and
$P_2 = (x_2, y_2)$.
*25. Describe and sketch the graph of the equation
$$\sqrt{(x - 1)^2 + (y - 2)^2} + \sqrt{(x - 4)^2 + (y - 7)^2} = \sqrt{34}.$$
[Hint: If you do a lot of algebra, you will probably get the wrong answer; the graph is
not an ellipse.]

*26. Describe the graph of the equation
$$\sqrt{x^2 + (y - 1)^2} + \sqrt{(x - 2)^2 + y^2} = 1.$$
[The same hint as for Problem 25 applies here.]


27. Draw the graph of the equation
$$x^3y + y^3x - xy = 0.$$

28. Draw the graph of the equation
$$x^2y + xy^2 - xy = 0.$$

29. Consider the set of all points that are twice as far from the origin as from the point
(3, 0). Find an equation for this graph, and sketch.
2.4 EQUATIONS OF LINES. SLOPES, PARALLELISM, AND PERPENDICULARITY

Every line is the graph of an equation of the form
$$Ax + By + C = 0,$$
where A and B are not both 0. The proof is as follows.

Every line is the perpendicular bisector of some segment. If L is the perpendicular
bisector of the segment from $Q \leftrightarrow (a_1, b_1)$ to $R \leftrightarrow (a_2, b_2)$, then
$$L = \{P \mid PQ = PR\}.$$
(Remember your geometry.) Therefore L is the graph of the equation
$$\sqrt{(x - a_1)^2 + (y - b_1)^2} = \sqrt{(x - a_2)^2 + (y - b_2)^2}$$
$$\iff x^2 - 2a_1x + a_1^2 + y^2 - 2b_1y + b_1^2 = x^2 - 2a_2x + a_2^2 + y^2 - 2b_2y + b_2^2$$
$$\iff 2(a_2 - a_1)x + 2(b_2 - b_1)y + a_1^2 + b_1^2 - a_2^2 - b_2^2 = 0.$$

This has the desired form
$$Ax + By + C = 0,$$
with
$$A = 2(a_2 - a_1), \qquad B = 2(b_2 - b_1), \qquad \text{and} \qquad C = a_1^2 + b_1^2 - a_2^2 - b_2^2.$$
The numbers A and B cannot both be 0, because $a_2 - a_1$ and $b_2 - b_1$ cannot both
be 0; the number pairs $(a_1, b_1)$ and $(a_2, b_2)$ are the coordinates of Q and R, and
$Q \neq R$, because Q and R are the endpoints of a segment.

An equation of this type is called a linear equation in x and y. Thus we have

Theorem 1. Every line is the graph of a linear equation in x and y.

If the line is not vertical, we can say more. In this case, the perpendicular segment
from Q to R is not horizontal, and this means that
$$b_2 - b_1 \neq 0.$$
Therefore
$$B \neq 0,$$
and we can divide by B and solve for y. This gives
$$y = -\frac{A}{B}x - \frac{C}{B},$$
which has the form
$$y = mx + k,$$
where
$$m = -\frac{A}{B} = -\frac{a_2 - a_1}{b_2 - b_1}.$$
In the figure, the label k on the y-axis is correct, because k is the y-coordinate of
the point where L crosses the y-axis ($m \cdot 0 + k = k$). The number k is called the
y-intercept of the line.

The number m also has a geometric meaning, as we shall soon see.


If $P_1 \leftrightarrow (x_1, y_1)$ and $P_2 \leftrightarrow (x_2, y_2)$ are any two points of a nonvertical line, then the
slope of the segment from $P_1$ to $P_2$ is defined to be the fraction
$$\frac{y_2 - y_1}{x_2 - x_1}.$$
The denominator $x_2 - x_1$ is marked $\Delta x$ in the figures below; it is pronounced
"delta x," and stands for the difference in x. Similarly, $y_2 - y_1$ is marked $\Delta y$, which
stands for the difference in y. Here $\Delta y$ and $\Delta x$ are not necessarily distances in the
sense of elementary geometry, because they may be negative.

[Figure: segments from $P_1$ to $P_2$ with $\Delta x$ and $\Delta y$ marked, including a case in which $\Delta y < 0$. Slope $= \dfrac{y_2 - y_1}{x_2 - x_1} = \dfrac{\Delta y}{\Delta x}$.]

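We shall show that all segments of the same line have the same slope, and that
this slope is the number m which appears in the equation $y = mx + k$.

Given two points $P_1 \leftrightarrow (x_1, y_1)$ and $P_2 \leftrightarrow (x_2, y_2)$, on the line
$$y = mx + k,$$
then
$$y_2 = mx_2 + k \qquad \text{and} \qquad y_1 = mx_1 + k.$$
Therefore
$$y_2 - y_1 = m(x_2 - x_1)$$
and
$$\frac{y_2 - y_1}{x_2 - x_1} = m.$$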

(In this calculation we do not care whether $y_2 - y_1$ and $x_2 - x_1$ are positive or
negative. The algebra takes care of all cases at once.)

The number m is called the slope of the line. And we have proved the following
theorem:

Theorem 2. The graph of the equation
$$y = mx + k$$
is the nonvertical line with slope m and y-intercept k. All segments of this line have
slope = m.

The equation given in this theorem is called the slope-intercept form of the equation
of the line.

[Figure: the line y = x.]

A line can be described by many different equations. For example, the bisector
of the first and third quadrants above is the graph of each of the following equations:
$$y = x \iff x - y = 0 \iff 3x - 3y = 0 \iff (x - y)^2 = 0 \iff (x - y)^{177} = 0,$$
and so on. But there is only one equation, in the slope-intercept form, for every
nonvertical line, because when the line is named, its slope and its y-intercept are
determined.
Often a line will be described by its slope m and the coordinates $x_1$, $y_1$ of one of
its points. We can then find an equation for it in the following way. If (x, y) is any
other point of the line, then
$$\frac{y - y_1}{x - x_1} = m,$$
because all segments of the line have the same slope m. Therefore
$$y - y_1 = m(x - x_1).$$
The graph of this equation contains $(x_1, y_1)$, because $0 = m \cdot 0$. And the graph is a
line with slope = m, because the equation has the form
$$y = mx + (y_1 - mx_1) = mx + k.$$

Thus:

Theorem 3. The graph of the equation $y - y_1 = m(x - x_1)$ is the line which has
slope m and contains the point $(x_1, y_1)$.

For example, the graph of the equation $y - 3 = -2(x + 1)$ is the line which has
slope $-2$ and passes through the point $(-1, 3)$. Solving for y, we get the slope-intercept
form $y = -2x + 1$.
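The passage from the point-slope form to the slope-intercept form can be checked mechanically. The sketch below uses exact fractions; the function names `slope` and `slope_intercept` are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction

def slope(p1, p2):
    """Slope (y2 - y1)/(x2 - x1) of the segment from p1 to p2."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the segment is vertical; its slope is not defined")
    return Fraction(y2 - y1) / Fraction(x2 - x1)

def slope_intercept(m, point):
    """Rewrite y - y1 = m(x - x1) as y = mx + k and return (m, k)."""
    x1, y1 = point
    return m, y1 - m * x1

# The example from the text: slope -2 through the point (-1, 3).
m, k = slope_intercept(-2, (-1, 3))
print(m, k)

# Any two points of the line y = mx + k give the same slope m.
points = [(x, m * x + k) for x in (-3, 0, 2, 7)]
print({slope(points[0], p) for p in points[1:]})
```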

Theorem 4. Two nonvertical lines are parallel if and only if they have the same slope.

Given:
$$L_1: \; y = m_1x + k_1, \qquad L_2: \; y = m_2x + k_2,$$
we need to prove two things: (1) If the slopes are the same, and the lines are different,
then the lines are parallel. (2) If the slopes are different, then the lines are not parallel.

1) If $m_1 = m_2$, then $k_1 \neq k_2$, because the lines are different. Therefore the lines are
parallel, because the two equations are inconsistent: they take the form
$$y = m_1x + k_1, \qquad y = m_1x + k_2.$$
Since $k_1 \neq k_2$, these equations have no common solution.

2) If $m_1 \neq m_2$, the lines cannot be parallel, because the equations always have a
common solution. By subtraction we get
$$0 = (m_1 - m_2)x + (k_1 - k_2), \qquad x = \frac{k_2 - k_1}{m_1 - m_2},$$
and we now find the y-coordinate of the point of intersection by substituting in either
of the original equations.
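The two cases in this proof translate directly into a short computation. This is a sketch only; the function name `intersection` and the convention of returning `None` when the slopes are equal are assumptions made for the example.

```python
from fractions import Fraction

def intersection(m1, k1, m2, k2):
    """Common point of y = m1*x + k1 and y = m2*x + k2.

    Returns None when the slopes are equal: the lines are then either
    parallel (k1 != k2) or the same line (k1 == k2).
    """
    if m1 == m2:
        return None
    x = Fraction(k2 - k1) / Fraction(m1 - m2)  # obtained by subtracting the equations
    y = m1 * x + k1                            # substitute in the first equation
    return x, y

print(intersection(1, 0, -1, 2))  # the lines y = x and y = -x + 2
print(intersection(2, 1, 2, 5))   # equal slopes, different intercepts
```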
Theorem 5. If two nonvertical lines are perpendicular, then their slopes are negative
reciprocals of each other.

Proof. Given $L_1$ with slope $m_1$ and $L_2$ with slope $m_2$, intersecting at right angles at T.
Let $(a_1, b_1)$ and $(a_2, b_2)$ be points of $L_2$ which are equidistant from T. Then $L_1$ is the
perpendicular bisector of the segment between these points. As we found earlier, the
slope of $L_1$ is
$$m_1 = -\frac{a_2 - a_1}{b_2 - b_1}.$$
But we can calculate the slope $m_2$ of $L_2$ by the slope formula, using the points $(a_1, b_1)$
and $(a_2, b_2)$. This gives
$$m_2 = \frac{b_2 - b_1}{a_2 - a_1}.$$
Obviously
$$m_2 = -1/m_1.$$

This also works the other way around:

Theorem 6. Given two lines $L_1$, $L_2$, with slopes $m_1$, $m_2$. If
$$m_2 = -1/m_1,$$
then $L_1$ and $L_2$ are perpendicular.

[Figure: the lines $L_1$ and $L_2$.]

Proof. First we observe that the lines cannot be parallel, because $m_2$ cannot be $= m_1$.
(Why?) Let T be the point where they intersect. Let L be the line through T,
perpendicular to $L_1$. Then L has slope $m_2 = -1/m_1$. (Why?) But through a given point
there is only one line with a given slope. Therefore L is $L_2$, and $L_2$ is
perpendicular to $L_1$.

Probably you have seen these theorems proved before, in different ways. The
treatment given above is intended to avoid repetitions and also to furnish some
practice in drawing geometric conclusions by algebraic methods.
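The link between perpendicular bisectors and negative reciprocal slopes can be checked numerically. The sketch below computes the coefficients A, B, C from the derivation at the start of this section; the function name `perpendicular_bisector` and the sample points are assumptions chosen for illustration.

```python
from fractions import Fraction

def perpendicular_bisector(q, r):
    """Coefficients (A, B, C) of Ax + By + C = 0 for the perpendicular
    bisector of the segment from q = (a1, b1) to r = (a2, b2)."""
    (a1, b1), (a2, b2) = q, r
    if q == r:
        raise ValueError("the endpoints of a segment must be different")
    A = 2 * (a2 - a1)
    B = 2 * (b2 - b1)
    C = a1**2 + b1**2 - a2**2 - b2**2
    return A, B, C

q, r = (0, 0), (2, 4)
A, B, C = perpendicular_bisector(q, r)
m1 = Fraction(-A, B)                            # slope of the bisector
m2 = Fraction(r[1] - q[1], r[0] - q[0])         # slope of the segment
midpoint = (Fraction(q[0] + r[0], 2), Fraction(q[1] + r[1], 2))
print(m1, m2, m1 * m2)
print(A * midpoint[0] + B * midpoint[1] + C)    # the midpoint lies on the bisector
```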


PROBLEM SET 2.4

Find point-slope equations, and slope-intercept equations, for the lines containing
the following pairs of points.

1. (-3, 2), (2, 1)

2. (3, -4), (1, 2)

3. (1, 0), (3, 3)

4. (-1, 1), (2, -2)

5. Find an equation for the tangent to the graph of
$$x^2 + y^2 = 25,$$
at the point (3, 4).

6.

Given thatP1

(x1, y1) lies on the circle


x2 + y2

with

a2,

Let $P_2$ be the point where the tangent at $P_1$ crosses the x-axis. Find the distance $P_1P_2$.
[Warning: Geometric distances are never negative.]

7. Find the points P on the circle $x^2 + y^2 = 2$ so that the tangent line to the circle at P
passes through the point (2, 0). (You may use the fact that, at any point P on a circle,
the tangent and the radius are perpendicular.)

8. Sketch the graph of the equation
$$x^2 + y^2 + 1 + 2xy + 2x + 2y = 0.$$

9. Sketch the graph of the equation
$$x^2 + 4y^2 + 1 + 4xy + 2x + 4y = 0.$$

10.

Sketch the graphs of the following equations.

a) y

lxl

b) y

-l2xl

c) y

- Ix - II

For this problem we offer a hint which applies equally well to a very large number of other
problems. If you didn't know the meaning of the symbol $|x|$, you would have no hope of
sketching the graph. This suggests that you should recall the definition of $|x|$, and use it.

11. Sketch the graph of
$$|x| + |y| = 1.$$
[Hint: As a first step, sketch the portion of the graph that lies in the first quadrant.]

12. Sketch the graph of the equation
$$y = x + |x| + 1.$$

13. Sketch the graph of the equation
$$|x| - |y| = 1.$$

14. Sketch the graph of the equation
$$\sqrt{(x - 1)^2 + (y - 3)^2} + \sqrt{(x - 4)^2 + (y - 2)^2} = \sqrt{10}.$$

15. Let C be the set of all points P such that the segment from (-1, 0) to P is perpendicular
to the segment from P to (2, 1). What sort of figure is C? Sketch. (In answering this
one you should bear in mind that the endpoints of a segment are always different. That
is, there is no such thing as the segment from P to P.)

*16. Let A = (-2, 0), let B = (2, 0), and let G be the set of all points P such that $\angle APB$
is an angle of 60°. What sort of figure is G? Sketch. (You will have to remember and
use some plane geometry, to do this one. If you have suitable drawing instruments, you
ought to be able to do a good sketch.)

2.5 GRAPHS OF INEQUALITIES. AND, OR, AND IF ... THEN

We have found that the graph of the equation
$$(x - 1)^2 + (y - 1)^2 = 1$$
is the circle with center at A = (1, 1) and radius 1. The interior of the circle is the
graph of the condition $AP < 1$. This is the region marked $R_1$ in the figure. It is the
graph of the inequality
$$(x - 1)^2 + (y - 1)^2 < 1.$$
Similarly, the exterior $R_2$ is the graph of the condition $AP > 1$, so that
$$R_2 = \{(x, y) \mid (x - 1)^2 + (y - 1)^2 > 1\}.$$


The graph of the equation

is a line L. The points lying above L form a set H1, called a

halfplane.

Evidently H1

is the graph of the inequality


y

>

x.

The points lying below L form a half-plane H2; and H2 is the graph of the inequality

y< 1

x.

Consider now the double inequality

t< x < %.
The graph is an infinite vertical strip R1, lying between the lines
X

.l..

and

Ji

Similarly, the graph of

t<y< 1
is an infinite horizontal strip, as shown on the left below.
y
I
.I
I
I

y
1

___

--

__

----

--

Consider next the condition


or

I
I
I
I
I

t<y<l.

--- -


The graph of the condition using or is an infinite cross-shaped region. This region R'
is the union of an infinite vertical strip R1 and an infinite horizontal strip R2; it contains
all points of the plane that belong to R1 or to R2.
(In mathematics, when we say that one condition holds or another condition
holds, we allow the possibility that both conditions hold. If we mean " ... but not
both," we have to say so.)
Similarly, the graphs of the conditions
$$y > x, \qquad y > -x$$
are two half-planes $H_1$ and $H_2$. They are respectively to the left of the line $y = x$
and to the right of the line $y = -x$, as shown in the figure on the left below. The graph
of the condition
$$y > x \quad \text{and} \quad y > -x$$
is the intersection of these two half-planes. This is the interior $R_1$ of $\angle AOB$.


[Figure: the lines y = -x and y = x.]

The graph of the condition
$$y > x \quad \text{or} \quad y > -x$$
is the union of the two half-planes.

[Figures: regions labeled R; in the right-hand figure the vertical strips $S_1$ and $S_2$ are marked.]

Let us now see what sort of graph we get when we combine two inequalities by
"if ... then." Consider the condition

i<x<i
This says that

t<x< },

if

i<y<l.

=>

then

i<y<l.

Let R be the graph. We assert that R looks like the drawing on the right above. That
is, R contains all points that do not lie in either of the two vertical strips marked
$S_1$ and $S_2$. The reason is as follows:

1)

If

(x, y)

is a point of R, and

t < x< !,

fore the part of R that lies between the lines

then we must have


=

t and x

t <y< I.

There

i must be the interior of a

rectangle, as indicated by the dashed lines in the figure.


2)

On the other hand, if xis

not

between

t and i,

then the condition for the graph

imposes no restriction on y at all. Therefore R contains all points to the left of the
line

t and

all points to the right of the line x

R also contains these two

vertical lines, for the same reason.


The reasoning in (2) may seem a little tricky, but may be clarified by an analogy
from everyday life. The law in most places requires that if a person has seriously
defective vision, then he must wear corrective glasses when driving a car. A person
with normal vision automatically obeys this law; its restrictive clause does not apply
to him. In the same way, the "law"

i<x<imposes a restriction only on points

=>

(x, y)

t<y<l
for which

l <x< -;

all other points

automatically obey the "law," because its restrictive clause does not apply to them.
Thus the "law" holds under each of the following three conditions:
1)
2)

t <x< !
x t,

and

t<y<

1,

3) x !.
The graph of (1) is the rectangular region in the middle of the figure; the graph of (2)
is the infinite region to the left of the line x
region to the right of the line x

t;

and the graph of

i.
y

Yes

x<O
y>O

x>O
y>O

Yes

Yes

x<O
y<O

x>O
y<O

No

(3)

is the infinite


Similarly, the graph of the condition
$$x > 0 \;\Rightarrow\; y \geq 0$$
contains all of the plane except for the fourth quadrant. It is only in the fourth
quadrant that $x > 0$ holds and $y \geq 0$ does not hold; and the possibility
$$x > 0, \quad y < 0$$
is the only possibility that is ruled out by the condition
$$x > 0 \;\Rightarrow\; y \geq 0.$$
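The reading of "if ... then" used here can be made concrete in code: a condition of the form "P implies Q" fails only when P holds and Q does not. The sketch below tests the condition "x > 0 implies y ≥ 0" at one point in each quadrant; the helper names `implies` and `in_graph` are illustrative and not taken from the text.

```python
def implies(p, q):
    """Truth value of 'if p then q': false only when p is true and q is false."""
    return (not p) or q

def in_graph(x, y):
    """True if (x, y) satisfies the condition: x > 0 implies y >= 0."""
    return implies(x > 0, y >= 0)

# One sample point in each of the four quadrants.
for point in [(1, 1), (-1, 1), (-1, -1), (1, -1)]:
    print(point, in_graph(*point))
```

Only the sample point in the fourth quadrant fails the test, which agrees with the description of the graph given above.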

In each of the following cases, the shaded region is the graph of the condition
appearing below it.
[Figures: shaded regions, each with its condition written below it.]

There is no need to use graph paper in the following problem set. Reasonably
neat freehand sketches, with cross-hatching used to indicate regions, are sufficient.

PROBLEM SET 2.5

Sketch the graphs of the following conditions:

1. !

< x <

3. } g
5.

<

y < ii

Ix - 11 <

7. !y - 2i <
9.

2. Ix - 1 1 < l

4.

t and ly - 21

/0

=>-

fyl ;2 ix!

11. x2 + y2 ;2 1

< /0

fx - 11 <

ly - 21 <

6. lx - lj <

8.

'

110

( x - 1)2 + y2 ;2 1

ly - 21 < /0

;;;: lxl

10. x2 + y2 ;2 1
or

=>-

and

(x - 1)2 + y2 ;2 1


12. x2+ y2 1

(x - 1)2+ y2 1

=>

x2+ y2 1

13. (x - 1)2+ y2 1

=>

14. x2+ y2 4

and

(x+ 1)2+ y2 1

15. x2+ y2 4

or

(x+ 1)2+ y2 1

16. x2+ y2 4

=>

x2+ y2 1

18. x2 + y2 1

and

y ;;;; x

=>

x2+ y2 4

and

x lyl

-x - y 1

25. !xi - ly l 1
21. Ix - yl 1

26. Ix+ yl 1
=

19. x2+ y2 1
23.

-x+ y 1

24. Ix!+ lyl 1


28. x

=>

21. x - y 1

20. x+ y 1
22.

17. x2+ y2 1

30. Ix - 31 < t

=>

29. Ix - 21 <Yo

=>

ly - 11 < t

l y - 21 < t

31. Suppose you know that (a) $P \Rightarrow Q$ and (b) P is false. What, if anything, can you infer
about Q?

32. Suppose you know that (a) $P \Rightarrow Q$ and (b) Q is true. What, if anything, can you infer
about P?

33. Suppose you know that (a) $P \Rightarrow Q$ and (b) Q is false. What, if anything, can you infer
about P?

34. Suppose you know that $P \Rightarrow Q$. Which of the following are possible?

a) P is true and Q is true.

b) P is true and Q is false.

c) P is false and Q is true.

d) P is false and Q is false.

35. Suppose you know that $P \Leftrightarrow Q$. Which of the combinations (a), (b), (c), and (d)
in Problem 34 are possible?

2.6 PARABOLAS

The distance from a point to a line is the length of the perpendicular from the point
to the line. Given a point F and a line D not containing F, the parabola with focus F
and directrix D is the set of all points of the plane that are equidistant from F and D.

The parabola is the graph of the condition
$$FP = MP,$$
where M is the foot of the perpendicular from P to D. The perpendicular line to D
through F is called the axis of the parabola. The point where the axis crosses the
parabola is called the vertex. (There is only one such point, because any such point
is midway between the focus and the directrix.)


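The first step in the study of parabolas is to get equations for them.

[Figure: a parabola with focus F, a point P, and the foot M of the perpendicular from P to the directrix D, which is the line $y = -p/2$.]

In setting up our axes, we take the vertex as the origin, and the x-axis parallel to
the directrix, in such a way that D is below the x-axis and the focus is above it. The
number p is the distance from the focus to the directrix. Now let $P \leftrightarrow (x, y)$ be a
point of the parabola. Then
$$FP = \sqrt{x^2 + \left(y - \frac{p}{2}\right)^2} \qquad \text{and} \qquad MP = \sqrt{\left(y + \frac{p}{2}\right)^2}.$$
Therefore
$$FP = MP \iff \sqrt{x^2 + \left(y - \frac{p}{2}\right)^2} = \sqrt{\left(y + \frac{p}{2}\right)^2}$$
$$\iff x^2 + \left(y - \frac{p}{2}\right)^2 = \left(y + \frac{p}{2}\right)^2$$
$$\iff x^2 + y^2 - py + \frac{p^2}{4} = y^2 + py + \frac{p^2}{4}$$
$$\iff x^2 = 2py$$
$$\iff y = \frac{1}{2p}x^2.$$
This has the form
$$y = ax^2,$$
where $a = 1/2p$, and
$$p = \frac{1}{2a}, \qquad \frac{p}{2} = \frac{1}{4a}.$$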

Thus we have proved the following theorem:

Theorem 1. The graph of the equation
$$y = ax^2$$
is a parabola, with focus at $(0, 1/4a)$ and directrix $y = -1/4a$.

[Figure: the parabola, with $FP = MP \iff y = ax^2$.]
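Theorem 1 can be spot-checked numerically: for a point $(x, ax^2)$, the distance to the focus should agree with the distance to the directrix. The sketch below does this in floating point, for a > 0 as in the derivation above; the function names and the sample values of a and x are assumptions chosen for illustration.

```python
import math

def focus_and_directrix(a):
    """Focus (0, 1/(4a)) and directrix height -1/(4a) for y = a*x^2, a > 0."""
    return (0.0, 1 / (4 * a)), -1 / (4 * a)

def distances(a, x):
    """Return (FP, MP) for the point P = (x, a*x^2) of the graph."""
    (fx, fy), d = focus_and_directrix(a)
    y = a * x * x
    fp = math.hypot(x - fx, y - fy)  # distance from P to the focus
    mp = abs(y - d)                  # distance from P to the horizontal directrix
    return fp, mp

for a in (0.5, 1.0, 3.0):
    for x in (-2.0, 0.0, 1.5):
        fp, mp = distances(a, x)
        print(a, x, math.isclose(fp, mp))
```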

If a parabola is situated like this, relative to the axes, then the parabola is said
to be in standard position. The use of standard position simplifies the equation
considerably. For example, if F is the point (2, -1) and D is the line y = 3, then the
parabola is the graph of the equations
$$FP = MP \iff \sqrt{(x - 2)^2 + (y + 1)^2} = \sqrt{(y - 3)^2}$$
$$\iff x^2 - 4x + 4 + y^2 + 2y + 1 = y^2 - 6y + 9$$
$$\iff x^2 - 4x - 4 = -8y$$
$$\iff y = -\tfrac{1}{8}x^2 + \tfrac{1}{2}x + \tfrac{1}{2}.$$
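The same computation can be carried out for any focus and any horizontal directrix. Squaring $FP = MP$ for the focus $(f_1, f_2)$ and the directrix $y = d$ and solving for y gives the coefficients used below. This is a sketch; the function name `parabola_coefficients` and its argument names are assumptions, not notation from the text.

```python
from fractions import Fraction

def parabola_coefficients(focus, d):
    """Coefficients (A, B, C) of y = A*x^2 + B*x + C for the parabola
    with the given focus and the horizontal directrix y = d."""
    f1, f2 = Fraction(focus[0]), Fraction(focus[1])
    d = Fraction(d)
    if f2 == d:
        raise ValueError("the focus must not lie on the directrix")
    # (x - f1)^2 + (y - f2)^2 = (y - d)^2, solved for y:
    A = 1 / (2 * (f2 - d))
    B = -2 * f1 * A
    C = (f1 * f1 + f2 * f2 - d * d) * A
    return A, B, C

# The example from the text: focus (2, -1), directrix y = 3.
print(parabola_coefficients((2, -1), 3))
```

For the focus (2, -1) and the directrix y = 3, this gives the coefficients -1/8, 1/2, 1/2, in agreement with the equation derived above.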

It is not hard to check, in general, that if the directrix is horizontal, then the equation
always takes the form
$$y = Ax^2 + Bx + C, \qquad A \neq 0.$$
And if the directrix is vertical, we get
$$x = Ay^2 + By + C, \qquad A \neq 0.$$
If the directrix is neither horizontal nor vertical, then the equation involves, in
general, terms in $x^2$, $y^2$, and $xy$, as well as linear terms and a constant. In this case it
is hard to derive the equation when the focus and directrix are given; and it is even
harder, when the equation is given, to see that the graph is a parabola. This case will
be discussed in Chapter 8.

For a long time to come, however, we shall deal only with the simplest case, in
which the directrix is horizontal.

Parabolas arise in a variety of contexts which appear at first to be unrelated.
Following are a few.

1) If a right circular cone is cut by a plane parallel to an element of the cone, the
resulting curve is a parabola. This was the viewpoint from which the Greeks studied
parabolas; and it is for this reason that a parabola is one of the conic sections. There
are other kinds of conic sections, obtained by slicing cones by planes in various
positions.

[Figures: a conic section (left); the path of a projectile, showing the angle $\alpha$ and the point T (right).]

2) If a theoretical projectile is fired from the surface of the earth, in any direction
other than straight upward, the path that it moves along is a portion of a parabola.
In the figure on the right above, the x-axis lies along the surface of the earth, the
y-axis is vertical, $\angle\alpha$ represents the angle at which the gun is aimed, and T is the
point where the projectile hits the ground. We say, "a theoretical projectile," because
to get this result you must assume both that the weight of the projectile is independent
of its altitude and that the air makes no resistance. These assumptions are false, but
they are good approximations to the truth, if the projectile is not going very fast or
very high. For high-speed, long-range projectiles, both assumptions are quite
unrealistic, and the situation is more complicated.

3) If you rotate a parabola around its axis, you get a surface which is called a
paraboloid of revolution. The mirror in a reflecting telescope is a paraboloid of revolution,
as is the reflector in an automobile headlight. The reason is that if a ray of light
travels along a line parallel to the axis, and is reflected in the usual way, it always hits
the focus. And conversely, if a ray of light starts at the focus, hits the surface and is
reflected, it always continues along a line parallel to the axis. The first of these
principles is used in telescopes, and the second in headlights.

[Figure: rays parallel to the axis, reflected to the focus F.]

4) Suppose that you fire a "theoretical projectile" vertically upward. It moves up a
vertical line, for a certain distance h, and then comes down again along the same line.
Thus the path of motion is simply a segment. Suppose now that we label our
horizontal axis as the t-axis; we measure time starting at the moment of firing; and
we plot, for each time t, the height of the projectile at time t.
[Figure: the height of the projectile plotted against the time t; a solid arc, continued by dashed arcs to the left and right.]

The resulting graph is a portion of a parabola. In the figure above, a is the
time at which the projectile hits the ground. Note that the graph that we have been
discussing is not all of the parabola: one minute before firing, the projectile was in
the gun; it was not underground. And at t = a the projectile hits the ground, and
the motion stops. Therefore, in the figure there is a solid arc, which indicates the
portion of the parabola that is related to the physical problem; the irrelevant part
of the curve is indicated by dashed arcs to the left and right.

This example indicates that geometric ideas come up in physics in unexpected
ways; the uses of geometry are not limited to the study of figures in space.

PROBLEM SET 2.6


1. Take a full-size sheet of graph paper; draw the y-axis in the center; and draw the x-axis
near the bottom of the paper. Then choose the largest uniform scale that you can,

on the axes, in such a way that x ranges from - 2 to 2 and y ranges from -! to 4.
Now sketch the graph of $y = x^2$. First plot the points corresponding to the following
values of x:

x = 0, x = 0.1, x = 0.2, ..., x = 0.9, x = 1,

x = 1.2, x = 1.4, ..., x = 1.8, x = 2.

Then draw the curve, freehand, as smoothly as you can. If this is done carefully, it will
really look as if FP = MP at every point of the curve.
One of the reasons for doing this is that it will give you an accurate idea of what a
parabola really looks like.
2. Show that
$$0 < x_1 < x_2 \;\Rightarrow\; x_1^2 < x_2^2.$$

[Figure: the graph of $y = ax^2$, a > 0, with $x_1 < x_2$ and $y_1 < y_2$.]

This means that the right-hand half of a parabola in standard position rises as we go
from left to right along the curve.

3. Show that
What does this tell us about parabolas in standard position?
4. Find the focus and the directrix of the graph of the equation y = x2
5. Same problem, for the equation y = 3x2

6. Same problem, for the equation y = tx2

7. Show that the graph of the equation $y = x^2 + 1$ is a parabola. To show this, you must
find its focus F and directrix D. You can then check by deriving the equation of the
parabola with focus F and directrix D.

8. Show that the graph of the equation y = (x - 2)2 is a parabola.


9. Same problem, for the equation y = (x - 2)2 + 1.

10. Show that the graph of y = x2 - 2x is a parabola.

11. Show that the graph of the equation $y = (x + 1)^2$ is a parabola. (Find the focus F
and directrix D.)

12. Same problem, for $y = (x + 1)^2 - 1$.

13. Same problem, for $y = (2x + 1)^2$.

2.7 TANGENTS

In geometry, tangent lines to circles are defined as follows.

Definition. A tangent to a circle is a line (in the same plane) which intersects the
circle in one and only one point. This point is called the point of contact.

It is then shown that a line is tangent to the circle if and only if the line is
perpendicular to the radius drawn to the point of contact. (In fact, the latter condition
is probably the one that you used to find the slopes of tangent lines to circles, in
Problem Set 2.4.)

[Figures: a circle (left) and an ellipse (right), each with a tangent line.]

Tangency can be defined in the same way for an ellipse. Ellipses will be studied
in Chapter 8. Meanwhile we observe that an ellipse is an oval curve, of the sort shown
in the right-hand figure above, and the tangents to it are the lines that intersect
it in one and only one point.

But for some curves, tangents cannot be described by the definition that we use
for circles. Consider, for example, a parabola, as shown in the figure below. The
tangent to the parabola, at the point $(x_1, y_1)$, intersects the curve only at $(x_1, y_1)$. But
the vertical line through $(x_1, y_1)$ has the same property; and the vertical line is not
a tangent.
[Figure: a parabola, with the tangent and the vertical line through $(x_1, y_1)$.]

We may try to get around this trouble by providing that the tangent line must
only touch the curve, without crossing it. But for many curves, this won't work
either. The graph of $y = x^3$ is shown below. The tangent to this curve at the origin
turns out to be the x-axis; and the x-axis crosses the curve, at the point of tangency.
In other cases a tangent line may cross a curve in many points.

[Figures: the graph of $y = x^3$; a curve crossed by one of its tangent lines.]

The geometric idea of tangency is obvious in all these cases. But the above
examples indicate that the mathematical definition that works for circles does not
work in general. To find the tangents to other curves, we need a better definition.

Consider first the graph of $y = x^2$, and the fixed point (1, 1) at which we want to
find the slope of the tangent. For every other point $(x, x^2)$ of the curve, let $L_x$ be the
secant line through (1, 1) and $(x, x^2)$. Then the slope of $L_x$ is
$$m_x = \frac{x^2 - 1}{x - 1} \qquad (x \neq 1).$$
Here the restriction $x \neq 1$ reflects the geometric fact that it takes two different
points to determine a line. It also refers to the algebraic fact that fractions with
denominator 0 have no meaning.


[Figures: the secant line $L_x$ through (1, 1) and $(x, x^2)$; the graph of $y = m_x$.]

We shall now draw the graph of $y = m_x$ $(x \neq 1)$. We have
$$y = m_x = x + 1 \qquad (x \neq 1).$$
The graph is a line from which one point has been deleted. For x = 1, there is no
such thing as "the secant line through (1, 1) and $(1, 1^2)$"; and for x = 1, there is no
such thing as the "fraction" 0/0. But this causes no trouble, because it is easy
to see that $m_x$ is very close to 2 when x is very close to 1. We express this by writing
$$\lim_{x \to 1} m_x = 2.$$
This is read: "The limit of $m_x$, as x approaches 1, is equal to 2." Later we shall give a
general definition of the idea of a limit. But in the present case, the meaning of the
limit is clear, and so we use it in the definition of the tangent to the parabola.
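The claim that $m_x$ is close to 2 when x is close to 1 can be observed by direct calculation. The sketch below evaluates the secant slope with exact fractions for a few values of x near 1; the function name `secant_slope` and the chosen sample values are assumptions for the example.

```python
from fractions import Fraction

def secant_slope(x):
    """Slope of the secant line through (1, 1) and (x, x^2), for x != 1."""
    x = Fraction(x)
    if x == 1:
        raise ValueError("no secant line: the two points coincide")
    return (x * x - 1) / (x - 1)

for h in (Fraction(1, 10), Fraction(1, 100), Fraction(-1, 1000)):
    x = 1 + h
    print(x, secant_slope(x))
```

In each case the slope differs from 2 by exactly x - 1, as the formula $m_x = x + 1$ predicts.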
Definition. The tangent to the graph of
$$y = ax^2 + bx + c,$$
at a point $(x_0, y_0)$ of the graph, is the line through $(x_0, y_0)$ with slope
$$S_{x_0} = \lim_{x \to x_0} m_x,$$
where $m_x$ is the slope of the secant line passing through the points $(x_0, y_0)$ and
$(x, ax^2 + bx + c)$ $(x \neq x_0)$.

Even in the general case, the slope is easy to calculate on the basis of this definition.
We have
$$m_x = \frac{ax^2 + bx + c - ax_0^2 - bx_0 - c}{x - x_0} \qquad (x \neq x_0)$$
$$= \frac{a(x^2 - x_0^2) + b(x - x_0)}{x - x_0} \qquad (x \neq x_0)$$
$$= a(x + x_0) + b \qquad (x \neq x_0)$$
$$= ax + (ax_0 + b) \qquad (x \neq x_0).$$
The graph of $y = m_x$ is a line with one point missing. The line from which the point
is missing is shown on the left below. The graph of $y = m_x$ is on the right.

[Figures: the line $y = ax + (ax_0 + b)$ (left); the graph of $y = m_x$, with the point at $x = x_0$ missing (right).]

Here again, the limit of $m_x$ is simply the y-coordinate of the point that is missing from
the graph. Thus we have:


Theorem 1. Let (x, y) be a point of the graph of
$$y = ax^2 + bx + c.$$
Then the slope of the tangent to the graph, at (x, y), is
$$S_x = 2ax + b.$$
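The formula of Theorem 1 can be compared with secant slopes computed directly from the definition. In the sketch below, the coefficients a = 3, b = -2, c = 5 and the point $x_0 = 2$ are arbitrary sample values, and the function names are chosen only for this illustration.

```python
from fractions import Fraction

def secant_slope(a, b, c, x0, x):
    """Slope of the secant line of y = a*x^2 + b*x + c through the
    points of the graph with abscissas x0 and x (x != x0)."""
    if x == x0:
        raise ValueError("no secant line: the two points coincide")
    f = lambda t: a * t * t + b * t + c
    return Fraction(f(x) - f(x0)) / Fraction(x - x0)

def tangent_slope(a, b, x0):
    """Slope 2*a*x0 + b given by Theorem 1."""
    return 2 * a * x0 + b

a, b, c, x0 = 3, -2, 5, 2
for h in (Fraction(1, 10), Fraction(1, 1000), Fraction(-1, 1000)):
    m = secant_slope(a, b, c, x0, x0 + h)
    print(h, m, m - tangent_slope(a, b, x0))  # the difference is a*h
```

The difference between the secant slope and $2ax_0 + b$ is $a(x - x_0)$, which is small when x is close to $x_0$.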

For some curves, there is no tangent. Consider, for example, the graph of
$y = |x|$, at the point (0, 0). For each $x \neq 0$,
$$m_x = \frac{|x| - |0|}{x - 0} = \frac{|x|}{x}.$$
Thus:
$$m_x = 1 \quad \text{for } x > 0, \qquad m_x = -1 \quad \text{for } x < 0.$$
(Remember the definition of $|x|$.) Therefore the graph of $y = m_x$ looks like the
drawing on the right below.

[Figures: the graph of $y = |x|$ (left); the graph of $y = m_x$, consisting of $y = 1$ for $x > 0$ and $y = -1$ for $x < 0$ (right).]

For this graph there is no one number that y is close to,
whenever x is close to 0. Therefore, there is no such thing as
$$\lim_{x \to 0} m_x;$$
for every number k, the statement
$$\lim_{x \to 0} m_x = k$$
is false.

Geometrically it is obvious that the origin is the only point of the curve at which
things go wrong; at every other point $(x, |x|)$, the curve has a tangent; the slope of the
tangent is 1 for $x > 0$ and $-1$ for $x < 0$.
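The failure of the limit at the origin shows up clearly when the secant slopes are computed on both sides of 0. This is a small sketch; the function name `secant_slope_abs` and the sample values are assumptions for the example.

```python
def secant_slope_abs(x):
    """Slope of the secant line of y = |x| through (0, 0) and (x, |x|), x != 0."""
    if x == 0:
        raise ValueError("no secant line: the two points coincide")
    return (abs(x) - abs(0)) / (x - 0)

for x in (0.1, 0.001, -0.001, -0.1):
    print(x, secant_slope_abs(x))
```

However close x is taken to 0, the slopes on the right stay at 1 and those on the left stay at -1, so there is no single number that they all approach.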

PROBLEM SET 2.7


1. You already have a carefully drawn graph of the equation $y = x^2$. At each point (x, y)
of the graph, the slope of the tangent ought to be 2x. Check this graphically by drawing
lines of the proper slope at the points where x = 0.2, 0.4, 0.6, 0.8, and 1.

4x + 4. Find the slopes of the tangents at the p oints where x =

2,

x = 0, and x = 2 and sketch, showing all three of these tangents.


,

= x2 + x + I, using the points where x

3.

Same problem, for J

4.

By completing the square, show that

0, x = L and x = 1.

J = ax2 +bx + c
can be expressed in the form

y = a(x - A)2 + B.
For

a > 0, this means that the point where x = A is the lowest point on the curve.

Find the slope of the tangent at this point.

5.

a) Given the graph of

y = ax2
and a point

(x0, Jo) of the curve. Show that the tangent at (x0, Jo) is the only non
(x0, Jo) and has no other point in common with the

vertical line which passes through

48

Analytic Geometry

2.7

parabola. That is, show t hat, if the graph of

(y - Yo)
intersects the parabola only at

(x - x0)

(x0, y0), then


m

2ax0

b) Prove the corresponding theorem for the graph of

ax2 + bx +

c.

6. a) Get a plausible answer for the slope of the tangent to the graph of
point

(I, 1). Sketch the graph of y

mx,

x3, at the

explain what sort of graph it is, and explain

as well as you can why your value for the slope is plausible.

7.

a) Show that, if

< 0, then the line through the origin with slope

meets the graph

> 0, then the line through the origin with slope

meets the graph

of y

of y

x3 at precisely one point.

b) Show that, if

8.

x3, at an arbitrary point (x0, xg).

b) Do the same for

x3 at precisely three points.

Sketch the graph of

x lxl,

and describe this curve in terms of types of curve that we already know about.
which points does this graph have a tangent?

What is S2?

possible, a general formula for Sx. Is there such a thing as S0?

9.

Consider the graph of

What is S_2?

At

Give, if

y = x3 - 4x.

Where does this cross the x-axis? At which points is the tangent horizontal? What is

For what values of x is y > 0? For what values


x is y < 0? Use this information to draw a reasonable sketch of the graph, plotting

the slope of the tangent at (0, 0)?


of

onlyfive points.
10.

Carry out the steps of Problem 9 for the equation

y
11.

2x3 - 6x.

Show that every parabola has the reflecting property. In the figure, Tis the tangent at P,
and you need to show that
is a rhombus.

Cl

/3. The key to the proof is that the quadrilateral FPRQ

(That is, all four sides have the same length.)

2.8 A SHORTHAND FOR SUMS

An arithmetic series is a sum of the form

$$S_n = a + (a + d) + (a + 2d) + \cdots + (a + [n - 1]d).$$

A geometric series is a sum of the form

$$T_n = a + ar + ar^2 + ar^3 + \cdots + ar^{n-1}.$$

There is a shorthand for sums, which makes them easier to handle. Given a sum

$$S_n = a_1 + a_2 + \cdots + a_n,$$

we write

$$S_n = \sum_{i=1}^{n} a_i.$$

(This is pronounced: "The summation from 1 to $n$, of $a_i$.") That is, when we write $\sum_{i=1}^{n}$ and follow it with an expression involving $i$, this means that we are to substitute all (integral) values of $i$, from 1 to $n$, and add the results.

1) The geometric series

$$S_n = a + ar + ar^2 + \cdots + ar^{n-1}$$

can be written as

$$S_n = \sum_{i=1}^{n} ar^{i-1}.$$

The shorthand can be checked by substituting the values of $i$ from 1 to $n$:

    i           1          2          3          4          ...   n
    ar^(i-1)    ar^(1-1)   ar^(2-1)   ar^(3-1)   ar^(4-1)   ...   ar^(n-1)
    value       a          ar         ar^2       ar^3       ...   ar^(n-1)

When we add these, we get the geometric series.

2) Consider the sum

$$R_n = 1^2 + 2^2 + 3^2 + \cdots + n^2.$$

In the short form,

$$R_n = \sum_{i=1}^{n} i^2.$$

3) An arithmetic series can be written in the form

$$S_n = \sum_{i=1}^{n} [a + (i - 1)d].$$

This can be checked by means of a table of the sort that we gave above for the case of the geometric series.
In each case, the formula after $\sum$ gives the $i$th term; $i = 1$ gives the first term, $i = 2$ gives the second term, and so on. This will always be true so long as we are taking the sum from $i = 1$ to $n$. However, we also write such sums as

$$\sum_{i=2}^{5} i^3.$$

Here we take all values of $i$ from $i = 2$ to $i = 5$ inclusive and add the results. Therefore

$$\sum_{i=2}^{5} i^3 = 2^3 + 3^3 + 4^3 + 5^3 = 8 + 27 + 64 + 125 = 224.$$

In general, for $m \leq n$,

$$\sum_{i=m}^{n} a_i = a_m + a_{m+1} + \cdots + a_n.$$

Thus

$$\sum_{i=2}^{4} (a_i^2 + 1) = (a_2^2 + 1) + (a_3^2 + 1) + (a_4^2 + 1) = a_2^2 + a_3^2 + a_4^2 + 3 = \sum_{i=2}^{4} a_i^2 + 3.$$

Note that $\sum$ applies only to the expression immediately after it; in the last line, we are told to add the numbers $a_i^2$ (from $i = 2$ to $i = 4$) and add 3 to the result. The parentheses in the formula $\sum_{i=2}^{4} (a_i^2 + 1)$ indicate that 1 is part of every term of the sum.
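The summation shorthand translates directly into a short program. The sketch below is an illustration only; the helper name `sigma` and the sample values of `a`, `r`, `d`, `n`, and `seq` are assumptions made for the example.

```python
def sigma(term, m, n):
    """Sum term(i) over all integers i from m to n inclusive."""
    return sum(term(i) for i in range(m, n + 1))


# The sum of i^3 for i = 2, ..., 5, as in the text: 8 + 27 + 64 + 125.
cubes = sigma(lambda i: i ** 3, 2, 5)
print(cubes)

# Geometric and arithmetic series written in the short form.
a, r, d, n = 3, 2, 5, 6
geometric = sigma(lambda i: a * r ** (i - 1), 1, n)
arithmetic = sigma(lambda i: a + (i - 1) * d, 1, n)
print(geometric, arithmetic)

# The summation sign applies only to the expression immediately after it:
# summing (a_i^2 + 1) adds one 1 per term, which equals summing a_i^2 and
# then adding 3 when there are three terms.
seq = {2: 10, 3: 20, 4: 30}  # sample values a_2, a_3, a_4
inside = sigma(lambda i: seq[i] ** 2 + 1, 2, 4)
outside = sigma(lambda i: seq[i] ** 2, 2, 4) + 3
print(inside, outside)
```

Passing the term as a function of `i` mirrors the instruction "substitute all values of $i$ and add the results."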

PROBLEM SET 2.8


Find each of the following sums numerically:
3

1. >2
i=l

Each of the sums below is of the form

am + Gm+l +

3. I u2 - 1)
i=2

2. I u-1)2
i=l

+ an.

Lf=m a;.

s.

i=3

4. "2i2
i=l

s.

I u3

i=2

Write each of them in the long form

3
I (3b + d)

i=2

Convert each of the indicated sums to the short form:

10. 12 + 22 + 32 +
12. k2 + (k

n2

1)2 + (k + 2)2 +

13. 21 + .13 + ... +


14.

k-1

--

Is it true that

(n + 1)2

11. 32 + 42 + ... + k2

+ (n - 1)2

1
k

+-

n
I (a;

i=l
Why or why not?

1)

b;) = I a; + I b;?
i=l
i=l

9.

!i7
i=m

15. Is it true that

$$\sum_{i=1}^{n} k a_i = k \sum_{i=1}^{n} a_i?$$

Why or why not?


16.

Is it true that
n

L..
i=1

( -) hi 2
n

h3
-a
n

L..

iz?

i=l

Why or why not?


17.

()is the number of subsets with exactly k elements, in a given set with
<mis the number of possible 13-card bridge hands; (552) is
the number of possible 5-card draw poker hands. Show that ()
<D
Show that
GD G).
For 0

k b,

elements.

For example,

*18.

*19. Show that

2.9 THE INDUCTION PRINCIPLE AND THE WELL-ORDERING PRINCIPLE

Consider the following game. We have three spindles, of the sort used as targets in quoits. On the first spindle is a stack of wooden disks, diminishing in size from bottom to top. (See the figure.) The disks are numbered $1, 2, 3, \ldots, n$, from top to bottom; in the figure, $n = 5$.

A legal move consists in taking the topmost disk from one spindle and placing it on one of the other spindles, provided that we never, at any stage, place a disk above a smaller disk.

At the start, all the disks are on spindle A. The object of the game is to get all the disks onto spindle B, by a series of legal moves.

For example, we might begin by taking disk 1 off spindle A and putting it on spindle B. There would then be three possibilities for the second move: (1) put disk 1 back on spindle A, (2) put disk 1 on spindle C, and (3) put disk 2 on spindle C. It would not be legal to put disk 2 on spindle B, because disk 2 would then be above disk 1, which is smaller.

We shall see that the game can always be completed, no matter how large the positive integer $n$ may be. For each positive integer $n$, let $P_n$ be the proposition that the game can be completed, starting with $n$ disks. What we need to show is that all of the propositions $P_n$ are true.


Lemma 1. $P_1$ is true.
(A lemma is a sort of subtheorem, used as a step in the proof of a harder theorem.)

Proof of Lemma 1. Move the one and only disk from spindle A to spindle B. Then the game is over.

Lemma 2. $P_2$ is true.

Proof of Lemma 2. (1) Move disk 1 to spindle C. (2) Move disk 2 to spindle B. (3) Move disk 1 to spindle B. Then the game is over.

Lemma 3. $P_3$ is true.

Proof of Lemma 3. By Lemma 2, disks 1 and 2 can be moved to spindle C. (Lemma 2 really means that any two disks at the top of a stack can be moved to any other spindle.) Do this. Then move disk 3 to spindle B. By Lemma 2, disks 1 and 2 can then be moved to spindle B, whereupon the game is over.

A pattern is now appearing, suggesting the following lemma. This lemma states that if the game with $n$ disks can be completed, then the game with $n + 1$ disks can also be completed.

Lemma 4. For each $n$, $P_n \Rightarrow P_{n+1}$.

Proof of Lemma 4. We are given $n + 1$ disks on spindle A, and we are given by hypothesis that $P_n$ is true. Therefore the stack consisting of disks $1, 2, \ldots, n$ can be moved to spindle C by legal moves. Do this. (Disk $n + 1$ causes no trouble; it can be regarded as the base of the spindle on which it lies, because it is larger than any of the disks being moved.)

Then move disk $n + 1$ from spindle A to spindle B.

By $P_n$, we know that disks $1, 2, \ldots, n$ can be moved from spindle C to spindle B. Then the game is over.
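The proof of Lemma 4 is itself a recipe for playing the game, and it can be written as a recursive procedure. The following is a sketch under that reading; the function names `solve` and `is_legal` and the representation of a move as a `(disk, from, to)` triple are assumptions of this example, not notation from the text.

```python
def solve(n, source="A", target="B", spare="C", moves=None):
    """List the moves that take disks 1..n from source to target.

    Mirrors Lemma 4: move disks 1..n-1 to the spare spindle, move disk n,
    then move disks 1..n-1 on top of it.
    """
    if moves is None:
        moves = []
    if n == 0:
        return moves
    solve(n - 1, source, spare, target, moves)
    moves.append((n, source, target))
    solve(n - 1, spare, target, source, moves)
    return moves


def is_legal(n, moves):
    """Replay the moves and check that each one is legal and that all disks end on B."""
    spindles = {"A": list(range(n, 0, -1)), "B": [], "C": []}  # top of stack is last
    for disk, src, dst in moves:
        if not spindles[src] or spindles[src][-1] != disk:
            return False  # only the topmost disk may be taken
        if spindles[dst] and spindles[dst][-1] < disk:
            return False  # never place a disk above a smaller disk
        spindles[dst].append(spindles[src].pop())
    return spindles["B"] == list(range(n, 0, -1))


for n in range(1, 8):
    assert is_legal(n, solve(n))
print(solve(2))
```

For `n = 2` the printed moves are exactly the three steps in the proof of Lemma 2. Checking finitely many values of `n` this way is only evidence; the induction argument is what covers every `n`.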
Lemma 4 gives us an infinite chain of implications:

$$P_1 \Rightarrow P_2 \Rightarrow P_3 \Rightarrow \cdots .$$

And Lemma 1 tells us that the first statement in the chain is true. Therefore all of the statements $P_1, P_2, \ldots$ are true. This idea is conveyed mathematically as follows:

The Induction Principle. Let $P_1, P_2, \ldots$ be a sequence of propositions (one for every positive integer). If

a) $P_1$ is true, and

b) $P_n \Rightarrow P_{n+1}$ for every $n$,

then all of the propositions $P_1, P_2, \ldots$ are true.

The problem of the disks is probably the clearest illustration of what the induction principle means. The principle is used continually, in all branches of mathematics. In this section, we shall use it to get short formulas for certain sums.

Theorem 1. For every $n$,

$$\sum_{i=1}^{n} i = \frac{n}{2}(n + 1).$$

Proof. For each $n$, let $P_n$ be the proposition that

$$\sum_{i=1}^{n} i = \frac{n}{2}(n + 1).$$

a) $P_1$ is true, because

$$\sum_{i=1}^{1} i = 1 = \tfrac{1}{2}(1 + 1).$$

b) $P_n \Rightarrow P_{n+1}$ for every $n$, because

$$
\begin{aligned}
\sum_{i=1}^{n} i = \frac{n}{2}(n + 1)
&\Rightarrow \sum_{i=1}^{n} i + (n + 1) = \frac{n}{2}(n + 1) + (n + 1) \\
&\Rightarrow \sum_{i=1}^{n+1} i = \left(\frac{n}{2} + 1\right)(n + 1) \\
&\Rightarrow \sum_{i=1}^{n+1} i = \frac{n + 1}{2}(n + 2).
\end{aligned}
$$

In this chain of implications, the first equation is $P_n$ and the last is $P_{n+1}$. Therefore $P_n \Rightarrow P_{n+1}$. By the induction principle, $P_n$ is true for every $n$, which was to be proved.
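The two parts of the induction proof can be spot-checked with exact arithmetic. This is a sketch for illustration only; `closed_form` is a hypothetical helper name, and a finite check of this kind is evidence, not a substitute for the proof.

```python
from fractions import Fraction


def closed_form(n):
    """The right-hand side (n/2)(n + 1) of Theorem 1, computed exactly."""
    return Fraction(n, 2) * (n + 1)


# Part (a): the base case P_1.
assert closed_form(1) == 1

for n in range(1, 201):
    # P_n itself, checked directly against the sum 1 + 2 + ... + n.
    assert sum(range(1, n + 1)) == closed_form(n)
    # Part (b): adding n + 1 to the formula for n gives the formula for n + 1.
    assert closed_form(n) + (n + 1) == closed_form(n + 1)

print(closed_form(100))
```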

In fact, there is a simpler way of getting this result. If

$$S_n = 1 + 2 + 3 + \cdots + (n - 1) + n,$$

then

$$S_n = n + (n - 1) + (n - 2) + \cdots + 2 + 1;$$

and adding terms in pairs, we get

$$2S_n = (1 + n) + (1 + n) + \cdots + (1 + n) + (1 + n),$$

to $n$ terms. Therefore

$$2S_n = n(n + 1) \quad \text{and} \quad S_n = \frac{n}{2}(n + 1),$$

as before. This device is neat but very special. Consider now the problem of calculating

$$S_n = \sum_{i=1}^{n} i^2 = 1^2 + 2^2 + 3^2 + \cdots + n^2.$$

We have just found that the sum of the first $n$ positive integers is a polynomial in $n$, of degree 2. This suggests that $S_n$ is a polynomial of degree 3. That is, we conjecture that

$$S_n = An^3 + Bn^2 + Cn + D,$$

for some numbers $A$, $B$, $C$, $D$. The problem is to find $A$, $B$, $C$, and $D$, and prove by induction that they work. Let $P_n$ be the proposition that

$$P_n: \quad \sum_{i=1}^{n} i^2 = An^3 + Bn^2 + Cn + D.$$

Then $P_{n+1}$ asserts that

$$P_{n+1}: \quad \sum_{i=1}^{n} i^2 + (n + 1)^2 = A(n + 1)^3 + B(n + 1)^2 + C(n + 1) + D.$$

We want $P_n \Rightarrow P_{n+1}$ to make the induction proof work. This means that

$$An^3 + Bn^2 + Cn + D + (n + 1)^2 = A(n + 1)^3 + B(n + 1)^2 + C(n + 1) + D.$$

If this equation holds, then $P_n \Rightarrow P_{n+1}$. (Check the algebra.) Collecting coefficients, we get the equivalent equation

$$An^3 + (B + 1)n^2 + (C + 2)n + D + 1 = An^3 + (3A + B)n^2 + (3A + 2B + C)n + A + B + C + D,$$

or

$$(1 - 3A)n^2 + (2 - 3A - 2B)n + 1 - A - B - C = 0.$$

This holds if

$$A = \tfrac{1}{3}, \qquad B = \tfrac{1}{2}(2 - 3A) = \tfrac{1}{2}, \qquad C = 1 - A - B = \tfrac{1}{6}.$$

This gives

$$P_n: \quad \sum_{i=1}^{n} i^2 = \tfrac{1}{3}n^3 + \tfrac{1}{2}n^2 + \tfrac{1}{6}n + D.$$

Thus, for any $D$, $P_n \Rightarrow P_{n+1}$. For $D = 0$, $P_1$ is true. We take $D = 0$; and we know by the induction principle that

$$\sum_{i=1}^{n} i^2 = \tfrac{1}{3}n^3 + \tfrac{1}{2}n^2 + \tfrac{1}{6}n,$$

for every $n$. Taking a common denominator on the right and factoring, we get:

Theorem 2. For every $n$,

$$\sum_{i=1}^{n} i^2 = \frac{n}{6}(n + 1)(2n + 1).$$

For some purposes, the following idea is easier to use than the Induction Principle.

The Well-Ordering Principle. Every nonempty set of positive integers has a least element.

(See, for example, Problems 10 and 12 below.) The Well-Ordering Principle and the Induction Principle are equivalent. (See Problems 14 and 15 below.)

PROBLEM SET 2.9

1. Prove by any method that for every $n$, the sum of the first $n$ odd numbers is $n^2$. That is,

$$\sum_{i=1}^{n} (2i - 1) = n^2.$$

This can be shown by induction, but there are at least two other ways.

2. Prove by induction that

$$1 + r + r^2 + \cdots + r^n = \frac{r^{n+1} - 1}{r - 1} \qquad (r \neq 1).$$

3. Prove by induction that

4.

Find by any method a formula for


n

C3i - 1).

I
i=l

5. Find by any method a formula for


n

I C4i
6. Find a formula for

Iu
7. Find a formula for

- 2) .

i=l

+ i + 1).

i=l

i (i2

- i).

i=l

8. Assume that if $A_1$, $A_2$, $A_3$ are points, then

$$A_1A_2 + A_2A_3 \geq A_1A_3.$$

Prove that for every $n \geq 3$ we have

$$A_1A_2 + A_2A_3 + \cdots + A_{n-1}A_n \geq A_1A_n.$$

This is known as the polygonal inequality.


9. a) Let Pn be the number of moves required to complete the game with n disks. Show
that for every 11,

Pn+l

2pn + 1.

b) Let Pn be as in (a). Show that for each

p,,
(Since 210
moves.

2"

11,

1.

1024, this means that the game with 20 disks requires over a million

Thus, if you want to verify that P20 is true, the easiest way to do it is to

show by induction that Pn is true for every

11,

and then set

11

20.)

*10. Throughout this problem, the numbers under discussion are positive integers. If $a = bc$ for some $c$, then $b$ is called a factor of $a$ (or a divisor of $a$). If $p > 1$, and the only positive factors of $p$ are $p$ and 1, then $p$ is a prime. Obviously every prime has a prime factor, namely, itself. Prove that every number greater than 1 has a prime factor. [Beginning of the proof: "Let $K$ be the set of all numbers which are greater than 1 and have no prime factors. We need to show that $K$ is empty. If $K$ is not empty, then ..."]

*11. Following is the beginning of Euclid's proof that there are infinitely many primes. Suppose that there are only a finite number of primes, say $p_1, p_2, \ldots, p_n$. Consider the number

$$N = p_1 p_2 p_3 \cdots p_n + 1.$$

Complete Euclid's proof, by showing that this situation is impossible.

*12. Show that every rational number can be expressed as a fraction in lowest terms. [Hint: Try the Well-Ordering Principle.]
13. In the song "The Twelve Days of Christmas," gifts are sent on successive days according to the following scheme:
First day: a partridge in a pear tree.
Second day: another partridge, and two turtledoves.
For each $i$, let $G_i$ be the number of gifts sent on the $i$th day. Then

$$G_i = G_{i-1} + i.$$

(Which we have just observed for $i = 2$.) Let $T_n$ be the total number of gifts sent on the first $n$ days of Christmas. Get a formula for $T_n$, in the form

$$T_n = \frac{?(? + ?)(? + ?)}{?}.$$

As a check, the final value is $T_{12} = 364$. (I am indebted, for this problem, to Professor Thomas F. Banchoff.)

*14. Show that, if the Well-Ordering Principle is taken as a postulate, then the Induction Principle can be proved as a theorem. [Start of the proof: Suppose that not all of the propositions $P_n$ are true, and let

$$K = \{n \mid P_n \text{ is false}\}.$$

Then $K \neq \{\ \}$. Therefore ...]

*15. Show conversely that, if the Induction Principle is taken as a postulate, then the Well-Ordering Principle can be proved as a theorem. [Start of the proof: For each $n$, let $P_n$ be the proposition that none of the integers $1, 2, \ldots, n$ belongs to $K$. ...]
The diagram below is related to one of the problems in this section.

2.10 SOLUTION OF THE AREA PROBLEM FOR PARABOLAS

If a line intersects a parabola in two points, then it cuts off a region called a parabolic
sector. In the left-hand figure below, the sector is the region lying above the parabola
and below the line. In the third century B.C., Archimedes discovered a method for
finding the area of a parabolic sector. In this section we shall give an easier solution
of the problem.

The problem will be solved if we can find the area of a "curvilinear triangle" of
the type shown on the right above. If we can do this, then we can find the area of the
trapezoid in the other figure, and subtract the areas of the two curvilinear triangles.
The result will be the area of the sector.

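We shall attack the area problem, for the graph of $y = x^2$, by approximating the region with rectangles, like this:

[Figure: the region under the graph of $y = x^2$ from 0 to $h$, covered by rectangles.]

We cut the closed interval $[0, h]$ into $n$ little intervals of equal length, using the division points

$$0, \ \frac{h}{n}, \ \frac{2h}{n}, \ \ldots, \ \frac{(i - 1)h}{n}, \ \frac{ih}{n}, \ \ldots, \ \frac{(n - 1)h}{n}, \ h.$$

This gives a sequence of closed intervals

$$\left[0, \frac{h}{n}\right], \ \left[\frac{h}{n}, \frac{2h}{n}\right], \ \ldots, \ \left[\frac{(i - 1)h}{n}, \frac{ih}{n}\right], \ \ldots, \ \left[\frac{(n - 1)h}{n}, h\right].$$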

With each of these intervals as base, we construct a rectangle, using as altitude the height of the parabola at the right-hand endpoint. The right-hand endpoint of the $i$th interval is $ih/n$. Therefore the altitude of the $i$th rectangle is $(ih/n)^2$. Therefore the area of the $i$th rectangular region is

$$a_i = \left(\frac{ih}{n}\right)^2 \cdot \frac{h}{n} = \frac{h^3 i^2}{n^3}.$$

Let $R_n$ be the union of all these rectangular regions. Then the area of $R_n$ is

$$A_n = \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} \frac{h^3 i^2}{n^3} = \frac{h^3}{n^3} \sum_{i=1}^{n} i^2.$$

We want to find out what limit $A_n$ approaches as $n$ becomes very large. If we find this limit, then our problem is solved, because the limit is the area of the region $R$ that we started with.
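We found, in Theorem 2 of Section 2.9, that

$$\sum_{i=1}^{n} i^2 = \frac{n}{6}(n + 1)(2n + 1).$$

Therefore

$$A_n = \frac{h^3}{n^3} \cdot \frac{n}{6}(n + 1)(2n + 1) = \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right).$$

As $n$ becomes large without limit, it is easy to see that

$$\frac{1}{n} \to 0 \quad \text{and} \quad \frac{1}{2n} \to 0,$$

so that

$$1 + \frac{1}{n} \to 1 \quad \text{and} \quad 1 + \frac{1}{2n} \to 1,$$

and

$$A_n = \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right) \to \frac{h^3}{3}.$$

Therefore the area under the parabola, from 0 to $h$, is

$$A = \frac{h^3}{3}.$$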
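The approach of $A_n$ to $h^3/3$ can be watched numerically. The following sketch computes the rectangle sum exactly and compares it with the closed form derived above; the function names `outer_sum` and `outer_closed_form` are assumptions of this example.

```python
from fractions import Fraction


def outer_sum(h, n):
    """Total area of n rectangles on [0, h] whose heights are taken at right-hand endpoints."""
    h = Fraction(h)
    return sum((i * h / n) ** 2 * (h / n) for i in range(1, n + 1))


def outer_closed_form(h, n):
    """The expression (h^3 / 3)(1 + 1/n)(1 + 1/(2n)) obtained from Theorem 2 of Section 2.9."""
    h = Fraction(h)
    return h ** 3 / 3 * (1 + Fraction(1, n)) * (1 + Fraction(1, 2 * n))


h = 2
for n in (1, 10, 100, 1000):
    assert outer_sum(h, n) == outer_closed_form(h, n)
    print(n, float(outer_sum(h, n)))
print("h^3 / 3 =", float(Fraction(h) ** 3 / 3))
```

The printed values decrease toward $h^3/3$ as `n` grows, which is the numerical counterpart of the limit statement in the text.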

It would have been equally natural to approximate the area from the inside. We shall see that this procedure leads to the same answer as before. Here we have cut up the interval $[0, h]$ into the same little intervals as before; but on each little interval we have set up a rectangle whose altitude is the height of the parabola at the left-hand endpoint.

[Figure: the region under the parabola from 0 to $h$, with inscribed rectangles over the division points $(i - 1)h/n$, $ih/n$, ..., $(n - 1)h/n$, $nh/n$.]

Therefore, on $[0, h/n]$ our "rectangle" is merely the base interval, with area 0; and thereafter the area of the $i$th rectangle is

$$a_i' = \left[\frac{(i - 1)h}{n}\right]^2 \cdot \frac{h}{n} = \frac{h^3}{n^3}(i - 1)^2.$$

Let $R_n'$ be the union of these rectangular regions. Then the area of $R_n'$ is

$$A_n' = \sum_{i=1}^{n} \frac{h^3}{n^3}(i - 1)^2 = \frac{h^3}{n^3} \sum_{i=2}^{n} (i - 1)^2 = \frac{h^3}{n^3} \sum_{i=1}^{n-1} i^2.$$

To see why the last equation holds, observe that each of the indicated sums is the sum of the squares of the integers from 1 to $n - 1$. Therefore

$$A_n' = \frac{h^3}{n^3}\left[\frac{n}{6}(n + 1)(2n + 1) - n^2\right] = A_n - \frac{h^3}{n}.$$

As $n$ increases, $A_n \to h^3/3$ and $h^3/n \to 0$. Therefore $A_n' \to h^3/3$, and we get the same limit as before. To sum up:

Theorem 1. Let

$$R = \{(x, y) \mid 0 \leq x \leq h \ \text{and} \ 0 \leq y \leq x^2\}.$$

Then the area of $R$ is $h^3/3$.
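The relation between the inside and outside approximations can also be checked exactly. In the sketch below, `outer_sum` and `inner_sum` are hypothetical helper names; the check confirms that the two sums differ by exactly $h^3/n$ and that they bracket $h^3/3$.

```python
from fractions import Fraction


def outer_sum(h, n):
    """Rectangles with heights taken at right-hand endpoints (approximation from outside)."""
    h = Fraction(h)
    return sum((i * h / n) ** 2 * (h / n) for i in range(1, n + 1))


def inner_sum(h, n):
    """Rectangles with heights taken at left-hand endpoints (approximation from inside)."""
    h = Fraction(h)
    return sum(((i - 1) * h / n) ** 2 * (h / n) for i in range(1, n + 1))


h = 2
exact = Fraction(h) ** 3 / 3
for n in (1, 5, 50, 500):
    gap = outer_sum(h, n) - inner_sum(h, n)
    assert gap == Fraction(h) ** 3 / n          # the difference is h^3 / n
    assert inner_sum(h, n) < exact < outer_sum(h, n)
    print(n, float(inner_sum(h, n)), float(outer_sum(h, n)))
```

Since the gap $h^3/n$ shrinks to 0, both sums are squeezed toward the same number, which is why either approximation gives the area.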

It is easy to extend this result to the case in which the parabola is the graph of $y = kx^2$, $k > 0$.

[Figure: the region under the graph of $y = kx^2$, with an approximating rectangle whose right-hand endpoint is $ih/n$.]

When we multiply $y$ by $k$, this multiplies the area of each approximating rectangle by $k$. Thus, if $A_n = \sum_{i=1}^{n} a_i$, as before, and $B_n$ is the corresponding sum for $y = kx^2$, we have

$$B_n = \sum_{i=1}^{n} k a_i = k \sum_{i=1}^{n} a_i = kA_n.$$

Since $A_n \to h^3/3$, we have

$$B_n \to \frac{kh^3}{3}.$$

Therefore we have the following theorem:

Theorem 2. Let

$$R = \{(x, y) \mid 0 \leq x \leq h \ \text{and} \ 0 \leq y \leq kx^2\},$$

with $k > 0$. Then the area of $R$ is $kh^3/3$.

In general, for $a < b$ let $A_a^b\, kx^2$ be the area of the region under the graph of $y = kx^2$, from $a$ to $b$. Then we have the following:

Theorem 3.

$$A_a^b\, kx^2 = \frac{k}{3}(b^3 - a^3).$$

(Proof? There are three cases to consider: $a < b \leq 0$, $a < 0 \leq b$, $0 \leq a < b$.)
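Theorem 3 can be compared against a direct rectangle approximation in each of the three cases mentioned in the proof hint. This is a rough numerical sketch, not a proof; the function names, the use of midpoint heights, and the sample values of `k`, `a`, `b` are all assumptions chosen for illustration.

```python
def area_formula(k, a, b):
    """Area under y = k x^2 from a to b, according to Theorem 3."""
    return k * (b ** 3 - a ** 3) / 3


def rectangle_estimate(k, a, b, n=10_000):
    """Approximate the same area with n rectangles, heights taken at interval midpoints."""
    width = (b - a) / n
    return sum(k * (a + (i + 0.5) * width) ** 2 * width for i in range(n))


k = 2.0
cases = [(-3.0, -1.0),   # a < b <= 0
         (-1.0, 2.0),    # a < 0 <= b
         (1.0, 3.0)]     # 0 <= a < b
for a, b in cases:
    print(a, b, area_formula(k, a, b), rectangle_estimate(k, a, b))
```

In all three cases the estimate agrees with the formula to several decimal places, which is consistent with the theorem holding regardless of where $a$ and $b$ lie relative to 0.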

PROBLEM SET 2.10

Find the area under the graph of $y = 5x^2$, between the following limits.

1. From 0 to 4
2. From 0 to 2
3. From $-2$ to 0
4. From 2 to 4
5. From $-2$ to 2

Find the area under the graph of $y = 2x^2 + 1$, between the following limits.

6. From 0 to 4
7. From $-1$ to 0
8. From $-1$ to 3

9. Find the area of the parabolic sector between the graphs of y. = 2x2 and y
10.

Same problem, for y

11.

Find the area of the sector between the graphs of y = x2 - 1 and y

12. Same problem, for y


13.

x2 and y

x2 and y

x + 1.

x.
=

-x2 + 1.

2x2 - 1.

Solve, for the general case, the problem of Archimedes, stated at the beginning of
this section.

14. a) For each n, let


An = 1 +

1
----=

v'n
Obviously An > 1 for every n.

Under what condition for n can you be sure that

b) Under what condition for n can you be sure that

1
An - l <
?
10 '000 '000
c) Let

be any positive number.

An - 1 < E?

15.

a)

For each n, let

Under what condition for n can you be sure that

2n - 2
En= --.
3
n -1

Under what condition for n can you be sure that En < lo?

b) Under what condition for n can you be sure that

c) Given any positive number

E,,. < E?

16.

E,

under what condition for n can you be sure that

For each n, let

Obviously C,,. > 4 for every n.


whenever n is sufficiently large.

Given a positive number

E,

show that Cn

4 <

62

Analytic Geometry

17. a) For each n, let Dn

that Dn <

2.10

n2

3n

1
+ 2 .

Under what condition for n can you be sure

102 ?

b) Given any positive number E, under what condition for n can you be sure that

Dn < E?

18.

Given an ellipse, find its area.


y

-b
-

x2
y2
-+- = 1
a2
b2

This can be done by a method somewhat similar to one used in the preceding section of
the text.

[Hint: In the figures, what is the relation between y and k?]

19. In the discussion preceding Theorem 1, we fo und that A

A n - h3/n.

Verify this

statement geometrically, without using a formula for either An or A.

Hint: Draw a figure showing both the inner and outer rectangles, and explain why
An - A
*20. a) Find a formula for

h
=

. 1z2.

""'

.., l 3

i=l

b) Find the area of the region under the graph of y = x3, from 0 to 1.
*21. a) Let

as in the text, and let

Thus En is the error in the approximation An

""'

'13/3, and En > 0 for each n.

Calculate En and show that En < h3/n for each n.


b) Show that for every E > 0, En < E when n is sufficiently large. That is, find a number
N such that En < E whenever n > N.

Functions,
Derivatives, and Integrals

3.1 THE IDEA OF A FUNCTION

Roughly speaking, a function is a law of correspondence under which to each element of one set there corresponds one and only one element of another set. Consider some examples.

1) Suppose that we have set up a coordinate system in a plane $E$. Then to each point $P$ of $E$ there corresponds a number $x$ which is the $x$-coordinate of $P$. Thus we have a function

$$E \to \mathbf{R}$$

which matches points $P$ of $E$ with elements $x$ of $\mathbf{R}$.

2) Similarly, every point $P$ has a unique $y$-coordinate $y$. Thus we have another function

$$E \to \mathbf{R}.$$

To distinguish these two functions, we give them different names, say, $X$ and $Y$. Thus

$$X: E \to \mathbf{R},$$
$$: P \mapsto x$$

is the "$x$-coordinate function," and

$$Y: E \to \mathbf{R},$$
$$: P \mapsto y$$

is the "$y$-coordinate function." When we write $P \mapsto x$ (with the vertical bar on the left-hand end of the arrow), this means that each point $P$ is matched with its $x$-coordinate $x$. Thus we write $\to$ between sets and $\mapsto$ between elements of the sets.

3) If the real number $x$ is known, then $x^2$ is determined. Thus we have a function

$$f: \mathbf{R} \to \mathbf{R},$$
$$: x \mapsto x^2.$$

4) Every nonnegative real number has one and only one nonnegative square root. Thus we have a function

$$g: \mathbf{R}^+ \to \mathbf{R},$$
$$: x \mapsto \sqrt{x},$$

where $\mathbf{R}^+$ denotes, as usual, the set of all nonnegative real numbers.



5) If $x \geq 2$, then $x - 2$ is nonnegative, and so has one and only one nonnegative square root. Thus we have a function

$$h: [2, \infty) \to \mathbf{R},$$
$$: x \mapsto \sqrt{x - 2}.$$

6) The absolute value $|x|$ of $x$ is defined by the conditions

$$|x| = x \ \text{for} \ x \geq 0 \quad \text{and} \quad |x| = -x \ \text{for} \ x < 0.$$

In either case, if $x$ is known, then $|x|$ is determined. Thus we have a function

$$i: \mathbf{R} \to \mathbf{R},$$
$$: x \mapsto |x|.$$

In each of these six cases we have a function

$$f: A \to B,$$

where $A$ and $B$ are sets of some kind. The elements of $A$ are the objects to which things are going to correspond. The set $A$ is called the domain of the function $f$. In each case, $B$ is a set which contains all of the objects which correspond to elements of $A$. The set $B$ is called the range of the function $f$. Finally, to have a function $f$, we must have a rule under which to each element of $A$ there corresponds a unique element of $B$. Under these conditions, we have a function of $A$ into $B$.

We can sum up the preceding examples in the following table.

    Example   Function   Domain       Range   Rule
    1         X          E            R       P -> x
    2         Y          E            R       P -> y
    3         f          R            R       x -> x^2
    4         g          R+           R       x -> sqrt(x)
    5         h          [2, oo)      R       x -> sqrt(x - 2)
    6         i          R            R       x -> |x|

It is not required that all the elements of the range actually get used. Thus, in Example 3, $x^2 \geq 0$ for every $x$, and so we could equally well write

$$f: \mathbf{R} \to \mathbf{R}^+,$$
$$: x \mapsto x^2,$$

using $\mathbf{R}^+$ as the range instead of $\mathbf{R}$.
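The three ingredients of a function (a domain, a range, and a rule) can be modeled directly. The sketch below is one possible illustration, not notation from the text: the class name `Function`, the attribute names, and the choice to describe the domain by a membership test are all assumptions.

```python
import math


class Function:
    """A rule together with a test for membership in the domain."""

    def __init__(self, name, in_domain, rule):
        self.name = name
        self.in_domain = in_domain  # predicate: is x an element of the domain A?
        self.rule = rule            # the correspondence x -> f(x)

    def __call__(self, x):
        if not self.in_domain(x):
            raise ValueError(f"{x} is not in the domain of {self.name}")
        return self.rule(x)


f = Function("f", lambda x: True, lambda x: x * x)               # Example 3
g = Function("g", lambda x: x >= 0, math.sqrt)                   # Example 4
h = Function("h", lambda x: x >= 2, lambda x: math.sqrt(x - 2))  # Example 5
i_abs = Function("i", lambda x: True, abs)                       # Example 6

print(f(-3), g(9), h(6), i_abs(-4))

try:
    h(1)  # 1 is outside [2, oo), so no value corresponds to it
except ValueError as error:
    print(error)
```

Note that the range does not appear in the code at all: as the text observes, the range only has to contain the values, so the same rule and domain are compatible with more than one choice of range.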
Often functions are defined by algebraic formulas, but some of the most important
functions are defined in other ways. Consider the following example.
7) Given the parabola, shown below, which is the graph of the equation $y = x^2$. For each point $P$ of the parabola, the arc of the curve from the origin $O$ to $P$ has a certain length. If to each $x$ we let correspond the length of the arc from $O = (0, 0)$ to $P = (x, x^2)$, then we have a function

$$f: \mathbf{R} \to \mathbf{R}^+.$$

(Here we are talking about simple geometric length, independent of direction, and so the length of the arc is never negative.) Later we shall find that this function can be described by a formula. But we don't need to know this, let alone find the formula, to know that we are dealing with a function.
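[Figure: left, the parabola $y = x^2$ with the arc from $O$ to $P$; right, the same parabola with the region under the graph from 0 to $k$ shaded.]

8) Given the same parabola. To each number $k \geq 0$ there corresponds a number $A$ which measures the area of the shaded region in the right-hand figure. To be exact, the region is

$$R_2 = \{(x, y) \mid 0 \leq x \leq k, \ 0 \leq y \leq x^2\}.$$

Thus we have a function

$$f_2: \mathbf{R}^+ \to \mathbf{R}^+,$$
$$: k \mapsto A.$$

In Chapter 2 we got a formula for this function:

$$k \mapsto \tfrac{1}{3}k^3$$

for every $k \geq 0$.

9) Given the graph of $y = x^n$, for $x \geq 0$.

[Figure: the graph of $y = x^n$ for $x \geq 0$, with the region under the graph from 0 to $k$ shaded.]

(The rest of the graph goes upward when $n$ is even and downward when $n$ is odd.) To each $k \geq 0$ there corresponds a number $A$ which measures the area of the shaded region

$$R_n = \{(x, y) \mid 0 \leq x \leq k, \ 0 \leq y \leq x^n\}.$$

Thus for each $n$ we have a function

$$f_n: \mathbf{R}^+ \to \mathbf{R}^+,$$
$$: k \mapsto A.$$

Only for the cases $n = 1$ and $n = 2$ do we know how to calculate the values of $A$. But for $n = 3$, we nevertheless have a well-defined function $f_3$. Later in this chapter, you will see how this function can be calculated.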
Given a function $f: A \to B$, for each $a$ in $A$ we denote by $f(a)$ the element of $B$ which corresponds to $a$. For example, if $f$ is the function which squares things ($x \mapsto x^2$), then

$$f(1) = 1, \quad f(2) = 4, \quad f(3) = 9, \quad f(\sqrt{2}) = 2;$$

and

$$f(\sqrt{x}) = x$$

for every $x \geq 0$.

In Example 9 above, $f_3(1)$ is the area under the graph of $y = x^3$, from 0 to 1; and so on.

If the domain $A$ and the range $B$ are sets of real numbers, then we can draw pictures of the function. The graph of a function $f: A \to B$ is the set of all points of the coordinate plane that have the form $(x, f(x))$. In other words, to draw the graph of the function, we plot the point $(x, f(x))$ for each $x$ in $A$.
[Figure: left, the graph of a function whose domain is a closed interval $[a, b]$; right, the graph of $y = \sqrt{x}$.]

In the case shown in the left-hand figure above, the domain is a closed interval $[a, b]$. Consider, next, the function $g$ in Example 4, which extracts nonnegative square roots:

$$g: \mathbf{R}^+ \to \mathbf{R},$$
$$: x \mapsto \sqrt{x}.$$

The graph of $g$ (the right-hand figure above) is the graph of the equation $y = \sqrt{x}$. To see that this graph is approximately right, observe that

$$y = \sqrt{x} \iff x \geq 0, \quad y \geq 0, \quad x = y^2.$$

We get $x = y^2$ by interchanging $x$ and $y$ in the equation $y = x^2$. Therefore the graph of $x = y^2$ is a parabola with directrix $x = -\tfrac{1}{4}$ and focus $(\tfrac{1}{4}, 0)$. And the graph of $y = \sqrt{x}$ is the upper half of this graph.

A curve which is the graph of a function is called a function-graph. It is easy to see what sort of curve is a function-graph: A set of points in a coordinate plane is a function-graph if it intersects every vertical line in at most one point.
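For a finite set of plotted points, the vertical-line condition is easy to test mechanically. The following sketch applies it to sample points on the upper half of $x = y^2$ and on the whole sideways parabola; the function name `is_function_graph` and the sample point sets are assumptions made for this illustration.

```python
def is_function_graph(points):
    """Return True if no vertical line meets the point set in more than one point."""
    seen = {}
    for x, y in points:
        if x in seen and seen[x] != y:
            return False  # two different points share the same x-coordinate
        seen[x] = y
    return True


# Points of x = y^2 with y >= 0 (the graph of y = sqrt(x)) ...
upper_half = [(y * y, y) for y in range(0, 4)]
# ... and points of the whole parabola x = y^2, including negative y.
whole_parabola = [(y * y, y) for y in range(-3, 4)]

print(is_function_graph(upper_half))
print(is_function_graph(whole_parabola))
```

The whole parabola fails because, for example, the vertical line through $x = 1$ meets it at both $(1, 1)$ and $(1, -1)$.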


[Figure: examples of curves that are not function-graphs ("No") and of a curve that is ("Yes").]

Ordinarily, we make no distinction between a function-graph in a coordinate plane and the corresponding function.

[Figure: a function-graph $f$ over the interval $[-1, 7]$.]

For example, in the figure above, $f$ is a set of points, and is a function-graph. We use the same symbol $f$ for the corresponding function. Thus we say that the domain of $f$ is the closed interval $[-1, 7]$, and the range of $f$ is $\mathbf{R}$. (Obviously some smaller set could be used as the range, but it is not obvious from the figure just what the smallest possible range is.) We write $f(0) = 2$, $f(1) = 1$, $f(2) = 3$, and so on, because $0 \mapsto 2$, $1 \mapsto 1$, $2 \mapsto 3$, under the action of the function $f$.
Given a function

$$f: A \to B.$$

If $b = f(a)$ for some $a$ in $A$, we say that $b$ is a value of the function. For example, 4 is a value of the function $x \mapsto x^2$, but $-1$ is not. The set of all values of a function is called the image. If you reexamine Examples 1 through 6 above, you will find that in 1 and 2 the image is all of $\mathbf{R}$, and in the remaining cases the image is $\mathbf{R}^+$. (You should check these cases.)

Consider, similarly, the function

$$f: [0, 1] \to \mathbf{R},$$
$$: x \mapsto \sqrt{1 - x^2}.$$

Here the graph is a quadrant of a circle, as shown on the left below, and the image is the closed interval $[0, 1]$.
[Figure: left, the graph of $f$, a quadrant of a circle; right, a curve $C$ that is not a function-graph.]

A word of caution: it often happens that a figure looks like a function-graph if you look at it sidewise. In the right-hand figure above, $C$ is not a function-graph; but sidewise it looks like one. More precisely, if you reflect $C$ across the line $y = x$ you get a curve which really is a function-graph. We often use this device, to study various curves $C$ for which the reflection $C'$ is a function-graph. But this does not mean that $C$ was a function-graph in the first place. Therefore, in the following problem set, when you are asked whether certain curves are function-graphs, you must look at the curves right side up. For the curve $C$ shown in the above figure, the answer is "No," even though for $C'$ the answer is "Yes."

In some of the problems below, you are asked to find the image. In some cases, the image is not an interval; and you may find it convenient to use the notation

$$\{a, b, c, \ldots\}$$

for the set whose elements are $a, b, c, \ldots$. Thus

$$N = \{1, 2, 3, \ldots\}$$

is the set of all positive integers;

$$Z = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$$

is the set of all integers; and

$$\{0, 1\}$$

is the set whose only elements are 0 and 1.

PROBLEM SET 3.1


1. Given $f(x) = x^2 + x + 1$, for every $x$. Find $f(0)$, $f(1)$, and $f(2)$.

2. Given $f(x) = 2x^2 - x + 3$. Find $f(-1)$, $f(0)$, and $f(2)$.

3. Given $f$ as in Problem 2. Get a general formula for $f(2 + h)$.

4. For what positive integers $n$ (if any) is the graph of $y = x^n$ a function-graph? For each such case (if any), what are the domain and the range?

5. Same question, for the graphs of the equations $x = y^n$.

6. For what positive integers $n$ (if any) is the graph of $y = |x|^n$ a function-graph? For each such case (if any), what are the domain and range?

7. Same problem, for the graphs of the equations $|y|^n = x$.

*8.

Same question as

6, for y3 +

9. Is the graph of x =
Sketch.

ny = x.

Vy a function-graph?

If so, what are the domain and the image?

10. Same question, for $y = |x|/x$.

11. Same question, for $|y| = x$.

12. Same question, for $y = |x| + x$.

13. Same question, for $y = x^2 + x + 1$. (Here, of course, the only trouble is in finding the image. The image is an interval, and should therefore be described in the interval notation.)

14. The postage rate for airmail letters within the United States is now (1971) ten cents per ounce or fraction thereof. Thus we have a function
amp:

R+

___,.

R+,

where amp xis the airmail postage (in cents) for a letter of weight x (in ounces.) Thus
amp t = 10, amp 1 = 10, amp 1T = 40, amp 0 = 0, and so on. Sketch the graph of
this function. What is the image?
15. The roundoff function r: R ---+ R assigns to each number the nearest integer (with a
half-integer assigned to the next highest integer). Thus r(2) = 2, r(2!) = 2, r(2t) = 3,
r (2! ) = 3. Sketch the graph of this function from 0 to 3. What is its image?
16. Under what conditions is a semicircle a function-graph?
17. Under what conditions is a parabola a function-graph? (To solve this one, you will
need a theorem from a problem in Chapter 2.)
3.2 THE DERIVATIVE OF A FUNCTION, INTUITIVELY CONSIDERED

In Section 2.7 we solved the tangent problem for parabolas. Given the graph of

$$y = ax^2 + bx + c,$$

we found that for each $x_0$, the slope of the tangent at the point $(x_0, y_0)$ of the graph was

$$S_{x_0} = 2ax_0 + b.$$

Obviously a parabola with its axis vertical is a function-graph; its equation expresses $y$ in terms of $x$. Thus we have a function

$$f: \mathbf{R} \to \mathbf{R},$$
$$: x \mapsto ax^2 + bx + c.$$

Now at each point of the graph of $f$ there is one and only one tangent; and this tangent has a certain slope. Thus we have another function

$$f': \mathbf{R} \to \mathbf{R},$$
$$: x \mapsto S_x = 2ax + b.$$

For each $x$, $f'(x)$ is the slope of the tangent to the graph of $f$ at the point $(x, f(x))$.
To see how this works, consider the simplest example, in which

$$f(x) = x^2.$$

Here the parabola is the graph of the function

$$f: x \mapsto x^2,$$

and the line is the graph of the function

$$f': x \mapsto 2x.$$

For each $x$, the value of $f'$ is the slope of the tangent to the graph of $f$. For example,
at the point where $x = 1$ the slope of the tangent to $f$ is 2; and $f'(1) = 2 \cdot 1 = 2$.
At $x = \tfrac14$, we get $f'(\tfrac14) = \tfrac12$; and $\tfrac12$ is the slope of the tangent to the parabola. Where
$x = -1$, the slope of the tangent to the parabola is $-2$; and $f'(-1) = 2(-1) = -2$.

In general, suppose that we have a function

$$f: \mathbf{R} \to \mathbf{R}.$$

If the graph of $f$ has a nonvertical tangent at each point $(x, f(x))$, we let $f'(x)$ be the
slope of this tangent. This gives a new function

$$f': \mathbf{R} \to \mathbf{R}.$$

The new function $f'$ is called the derivative of $f$. Consider another example.

[Figure: graphs of $f$ and $f'$.]

A careful inspection of the figure above indicates that $f'$ is (at least approximately)
the derivative of $f$. Thus, at $x = 0$, the tangent to $f$ is horizontal; and $f'(0) = 0$, as it
should be. At $x = 1$, the tangent to $f$ seems to have slope $-1$; and $f'(1) = -1$.
At $x = -1$, the tangent to $f$ has slope 1; and $f'(-1) = 1$. For $x > 0$, the
tangent to $f$ has negative slope; and $f'(x) < 0$ for $x > 0$. For $x < 0$, the tangent
to $f$ has positive slope; and $f'(x) > 0$ for $x < 0$.
It may be that at some points $f$ has no tangent. At such points, $f'$ is not defined.
Thus, in some cases, the domain of $f'$ is a smaller set than the domain of $f$. Consider,
for example, the function $f: x \mapsto |x|$.

[Figure: the graph of $f$ (left) and the graph of $f'$ (right).]

For every $x > 0$, the slope of the tangent is 1; and for every $x < 0$, the slope of the
tangent is $-1$. Therefore the graph of $f'$ looks like the figure on the right above.

Drawing both $f$ and $f'$ on the same set of axes, we get the left-hand figure below.
You should carefully inspect the figure on the right below, to convince yourself that
$f'$ is the derivative of $f$, at least approximately.

Here $f$ has a tangent at $x = 0$, but the tangent is vertical, and therefore there is no
such thing as $f'(0)$. When $x > 0$ and $x$ is small, then $f'(x)$ is large, because $f$ is rising
steeply. When $x > 1$, $f'(x)$ is small. It looks as if $f'(2) = 0$; and the graph of $f$ has a
horizontal tangent at the point $(2, f(2))$.

[Figure: left, $f$ and $f'$ on the same axes; right, a second pair of graphs $f$ and $f'$.]

A function which has a derivative at every point of its domain is called differentiable.
The following theorem describes a fundamental property of differentiable functions:

The Mean-Value Theorem. Every chord of a differentiable function is parallel to the
tangent at some intermediate point.


[Figure: chords of function-graphs, with tangents parallel to them.]

Here by a chord we mean a segment joining two points of the graph. The theorem
says that if $f$ is differentiable on $[a, b]$, then there is some $\bar{x}$ between $a$ and $b$ at which
the slope of the tangent is the same as the slope of the chord. As indicated in the
right-hand figure above, there may be more than one such point.
The situation with regard to this theorem is awkward. It is geometrically
obvious. Also it is important and we shall need it soon. On the other hand, the
proof of the theorem is hard, and involves ideas which belong in the later portion of a
calculus course. We shall therefore postpone the proof, but use the theorem whenever
we need it.

The theorem can be stated in a form which looks more algebraic. If $f$ is defined
on an interval $[a, b]$, then the slope of the chord joining the endpoints is

$$\frac{f(b) - f(a)}{b - a},$$

and the slope of the tangent at $\bar{x}$ is $f'(\bar{x})$. Thus the theorem states that

$$f'(\bar{x}) = \frac{f(b) - f(a)}{b - a}$$

for some $\bar{x}$ between $a$ and $b$. In this style we can restate the theorem as follows:

The Mean-Value Theorem. Suppose that $f$ is differentiable on the closed interval
$[a, b]$. Then for some $\bar{x}$ between $a$ and $b$ we have

$$f'(\bar{x}) = \frac{f(b) - f(a)}{b - a}.$$
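As a numerical illustration, the following Python sketch looks for a point $\bar{x}$ of the kind the theorem promises, for the sample function $f(x) = x^3$ on $[0, 2]$. The function name `find_mvt_point` and the use of bisection are our own assumptions, not part of the text; bisection works here only because $f'$ is continuous and $f'(x)$ minus the chord slope changes sign on the interval.

```python
def find_mvt_point(f, f_prime, a, b, tol=1e-12):
    """Search [a, b] for a point xbar with f'(xbar) = (f(b) - f(a)) / (b - a).

    Hypothetical helper: it assumes f' is continuous and that
    f'(x) - chord_slope has opposite signs at a and b.
    """
    chord_slope = (f(b) - f(a)) / (b - a)

    def g(x):
        return f_prime(x) - chord_slope

    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change at the endpoints; bisection does not apply")
    # Halve the interval until it is shorter than the tolerance.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def cube(x):
    return x ** 3


def cube_prime(x):
    return 3 * x ** 2


xbar = find_mvt_point(cube, cube_prime, 0.0, 2.0)
print(xbar)  # the chord slope is 4, so xbar should satisfy 3 * xbar**2 = 4
```

Here the chord from $(0, 0)$ to $(2, 8)$ has slope 4, and the tangent has the same slope where $3\bar{x}^2 = 4$.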

Note that, if we merely required that the graph have a tangent at every point, the
theorem would become false. The graph shown below has a tangent at every point,
but one of these tangents is vertical. Therefore the function $f$ is not differentiable on
$[a, b]$. And no tangent line is parallel to the chord from $P$ to $Q$.

[Figure: a graph from $P$ to $Q$ with a vertical tangent at one point, and the chord from $P$ to $Q$.]

Hereafter, the mean-value theorem will be referred to as MVT.


PROBLEM SET 3.2

In each of the figures below, a function-graph $f$ is given. Do a tracing of each graph on a
sheet of writing paper, and then draw a plausible sketch of the graph of $f'$. Obviously your
sketch of $f'$ cannot be exact. But $f'$ should be $= 0$ at points where the tangent to $f$ is
horizontal; $f'(x)$ should be $> 0$ where the original graph slopes upward; $f'(x)$ should be $< 0$
where $f$ slopes downward; and so on. In some cases, you may find that the values of $f'$ are
so large that there is no room for them on the paper. In such cases, draw as much of the
graph of $f'$ as space permits.
Some but not all of the functions shown below satisfy the conditions of the mean-value
theorem. For each such function, draw the chord between the endpoints of the graph,
draw a tangent line which is parallel to this chord, and drop a dashed line from the point of
tangency to the point $\bar{x}$ on the $x$-axis. (See the figures in the text, illustrating MVT.)

[Figures for Problems 1 through 21: graphs of functions $f$, not reproduced here.]

3.3

CONTINUITY AND LIMITS

Let $f$ be a function defined on an interval, or defined for every $x$. Roughly speaking,
$f$ is continuous if we can draw the graph without lifting the pencil from the paper.
For example, the function $f(x) = \sqrt{1 - x^2}$ is continuous, because its graph is the upper
half of the circle $x^2 + y^2 = 1$. Most functions that arise naturally are of this type,
and in this book we will rarely deal with any other kind of function. But some very
simple functions are not continuous. Consider, for example, the airmail postage
function defined in Problem 14 of Problem Set 3.1. The graph looks like this:

[Figure: graph of the airmail postage function, a step function taking the values 10, 20, 30, 40.]

Here $y = \operatorname{amp} x$, where $\operatorname{amp} x$ is the airmail postage on a letter weighing $x$ ounces.
The values of this function make sudden jumps at integral values: the graph cannot be
drawn without lifting the pencil from the paper, and so the function is discontinuous.
Functions of this kind are used in physics. For example, the so-called Heaviside
function is defined by the conditions

$$h(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x \ge 0. \end{cases}$$

The graph looks like the figure below. It makes a sudden jump at $x = 0$.
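The two step functions just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `amp` and `heaviside` are ours, the postage rule is taken to be ten cents per ounce or fraction thereof (as in Problem 14 of Problem Set 3.1), and the Heaviside function is taken to be 0 for $x < 0$ and 1 for $x \ge 0$.

```python
import math


def amp(x):
    """Airmail postage in cents for a letter of weight x ounces (x >= 0).

    Assumed rule: ten cents per ounce or fraction thereof.
    """
    if x < 0:
        raise ValueError("weight must be nonnegative")
    return 10 * math.ceil(x)


def heaviside(x):
    """Assumed convention: 0 for x < 0, and 1 for x >= 0."""
    return 0 if x < 0 else 1


# The values jump as x crosses an integer (for amp) or crosses 0 (for heaviside).
for x in (0.5, 1, 1.000001, math.pi):
    print(x, amp(x))
for x in (-0.000001, 0, 0.000001):
    print(x, heaviside(x))
```

However close $x$ is taken to 1 from the right, `amp(x)` stays at 20 while `amp(1)` is 10; this is the "sudden jump" that the pencil test detects.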

We shall now make the idea of continuity more exact, in several stages. Given
a point $x_0$ in the domain of $f$, we want to explain what it means to say that $f$ is
continuous at $x_0$. First we try the following:

1) Whenever $x$ is close to $x_0$, $f(x)$ is close to $f(x_0)$. In symbols,

$$x \approx x_0 \;\Rightarrow\; f(x) \approx f(x_0).$$

This is the idea, but it is not good enough; the question is how close things are
supposed to be to each other. As $x$ gets very close to $x_0$, $f(x)$ is supposed to become
very close to $f(x_0)$. This suggests:

2) We can make $f(x)$ as close as we please to $f(x_0)$, by taking $x$ sufficiently close to $x_0$.

This is better, but it can be improved. We measure the closeness of two numbers
by taking the absolute value of their difference. Thus if $\varepsilon$ is a positive number, and

$$|f(x) - f(x_0)| < \varepsilon,$$

then we say that $f(x)$ is $\varepsilon$-close to $f(x_0)$. In these terms, we can restate (2) as
follows:

3) For each $\varepsilon > 0$, $f(x)$ is $\varepsilon$-close to $f(x_0)$ whenever $x$ is sufficiently close to $x_0$.

If $\delta > 0$ and $|x - x_0| < \delta$, then we say that $x$ is $\delta$-close to $x_0$. The idea of
"sufficiently close" can be described by taking a positive number $\delta$. This gives:

4) For each $\varepsilon > 0$, there is a $\delta > 0$ such that $f(x)$ is $\varepsilon$-close to $f(x_0)$ whenever $x$ is
$\delta$-close to $x_0$.
We can now draw a picture.

[Figure: an $\varepsilon\delta$-box for $f$ at the point $(x_0, f(x_0))$.]

In the figure, the solid rectangular region is called an $\varepsilon\delta$-box for the function $f$ at the
point $(x_0, f(x_0))$. When we call it a box for the function, we mean that no point of the
graph lies above the box or below it. If the function is continuous, then for every
positive number $\varepsilon$, no matter how small, we can find a $\delta > 0$ that gives an $\varepsilon\delta$-box.
We now restate (4) as follows.

Definition. Let $x_0$ be a point in the domain of the function $f$. Suppose that for
every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$|x - x_0| < \delta \;\Rightarrow\; |f(x) - f(x_0)| < \varepsilon.$$

Then $f$ is continuous at $x_0$.

This definition applies very simply to the function $f(x) = 2x$, at the point $(1, 2)$.
Given any $\varepsilon > 0$, we can find an $\varepsilon\delta$-box, as shown in the figure; we simply take
$\delta = \varepsilon/2$. Then, algebraically,

$$|x - 1| < \delta \;\Rightarrow\; |x - 1| < \varepsilon/2 \;\Rightarrow\; |2x - 2| < \varepsilon \;\Rightarrow\; |f(x) - f(1)| < \varepsilon;$$

and this is what we need, to show that the function $f$ is continuous at the point $x = 1$.
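The choice $\delta = \varepsilon/2$ can be spot-checked numerically. The Python sketch below samples points $x$ with $|x - 1| < \delta$ and confirms that $|2x - 2| < \varepsilon$ at each of them; the function name and the sampling scheme are our own assumptions, and a finite sample is of course only a sanity check, not a proof.

```python
def box_holds_for_2x(eps, samples=1000):
    """Check |2x - 2| < eps at sample points x with |x - 1| < delta, delta = eps/2."""
    delta = eps / 2
    for i in range(1, samples):
        # x runs through points strictly between 1 - delta and 1 + delta.
        x = 1 - delta + 2 * delta * i / samples
        if not abs(2 * x - 2) < eps:
            return False
    return True


for eps in (0.5, 0.1, 0.001):
    print(eps, box_holds_for_2x(eps))
```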

Of course, the function is also continuous at all other points, and we can show in
exactly the same way that the definition of continuity applies, taking $\delta = \varepsilon/2$. (What
would we do for $f(x) = 3x$, at any point $x_0$?)

We shall now apply the definition of continuity to the function $f(x) = x^2$, at
the point $(1, 1)$.

[Figure: the graph of $f(x) = x^2$ near $(1, 1)$, with a dotted box between $x = d$ and $x = d'$.]

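Given an $\varepsilon > 0$, we want to find an $\varepsilon\delta$-box for $f$, at $(1, 1)$, so that

$$|x - 1| < \delta \;\Rightarrow\; |x^2 - 1| < \varepsilon.$$

To find the desired number $\delta$ directly requires clumsy calculations, but there is an
easier way. Let $d$ and $d'$ be the positive numbers such that

$$d < 1 < d', \qquad f(d) = d^2 = 1 - \varepsilon; \qquad f(d') = d'^2 = 1 + \varepsilon.$$

(We may assume that $\varepsilon < 1$, since a box that works for a small $\varepsilon$ also works for every larger one. See the figure above.) The graph rises from left to right. Therefore,

$$d < x < d' \;\Rightarrow\; f(d) < f(x) < f(d') \;\Leftrightarrow\; d^2 < x^2 < d'^2 \;\Leftrightarrow\; 1 - \varepsilon < x^2 < 1 + \varepsilon \;\Leftrightarrow\; |x^2 - 1| < \varepsilon.$$

Thus the dotted rectangle in the figure boxes in the graph, in the same way that an
$\varepsilon\delta$-box does. We call such a rectangle a $dd'$-box. Obviously a $dd'$-box is just as
good as an $\varepsilon\delta$-box. And, in fact, given a $dd'$-box, we can always get an $\varepsilon\delta$-box that
lies in it. Let $\delta$ be the smaller of the positive numbers $1 - d$ and $d' - 1$. (In fact,
$d' - 1$ is the smaller, but we don't need to use this.) Then

$$|x - 1| < \delta \;\Rightarrow\; d < x < d',$$

and so

$$|x - 1| < \delta \;\Rightarrow\; |x^2 - 1| < \varepsilon,$$

which is what we wanted. We are going to use this method again, and so we record
it as a theorem.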
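Theorem 1. Let $x_0$ be a point in the domain of the function $f$. Suppose that for every
$\varepsilon > 0$ there are numbers $d$ and $d'$ such that $d < x_0 < d'$ and

$$d < x < d' \;\Rightarrow\; f(x_0) - \varepsilon < f(x) < f(x_0) + \varepsilon.$$

Then $f$ is continuous at $x_0$.

[Figure: a $dd'$-box at $(x_0, f(x_0))$, between the levels $f(x_0) - \varepsilon$ and $f(x_0) + \varepsilon$, with the distance $\delta$ marked.]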

Proof. Let $\delta$ be the smaller of the numbers $x_0 - d$ and $d' - x_0$. (In the figure,
$\delta = d' - x_0$.) Then

$$|x - x_0| < \delta \;\Rightarrow\; d < x < d' \;\Rightarrow\; f(x_0) - \varepsilon < f(x) < f(x_0) + \varepsilon \;\Leftrightarrow\; |f(x) - f(x_0)| < \varepsilon.$$
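The passage from a $dd'$-box to an $\varepsilon\delta$-box can be tried out numerically for $f(x) = x^2$ at $x_0 = 1$. In the Python sketch below, the function names are hypothetical, and we assume $0 < \varepsilon < 1$ so that $d = \sqrt{1 - \varepsilon}$ is defined and positive. The sketch takes $d' = \sqrt{1 + \varepsilon}$, sets $\delta$ equal to the smaller of $1 - d$ and $d' - 1$, and then samples the interval $|x - 1| < \delta$.

```python
import math


def dd_box_for_square(eps):
    """Return (d, d_prime, delta) for f(x) = x**2 at x0 = 1, assuming 0 < eps < 1."""
    if not 0 < eps < 1:
        raise ValueError("this sketch assumes 0 < eps < 1")
    d = math.sqrt(1 - eps)        # f(d) = 1 - eps
    d_prime = math.sqrt(1 + eps)  # f(d') = 1 + eps
    delta = min(1 - d, d_prime - 1)
    return d, d_prime, delta


def box_holds_for_square(eps, samples=1000):
    """Spot-check that |x - 1| < delta implies |x**2 - 1| < eps."""
    d, d_prime, delta = dd_box_for_square(eps)
    for i in range(1, samples):
        x = 1 - delta + 2 * delta * i / samples
        if not (d < x < d_prime and abs(x * x - 1) < eps):
            return False
    return True


print(dd_box_for_square(0.1))
print(box_holds_for_square(0.1))
```

For each admissible $\varepsilon$ the smaller of the two distances is $d' - 1$, in agreement with the remark in the text.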

We shall now reexamine the idea of a limit, which we used in defining the slope
of the tangent to the graph of a function.

To find the slope of the tangent at the point $(x_0, f(x_0))$, we let $m(x)$ be the slope of the
secant line through the points $(x_0, f(x_0))$ and $(x, f(x))$, where $x \ne x_0$. Thus the slope
of the secant is a function, and we are now describing it in functional notation. By
definition, the slope of the tangent is

$$f'(x_0) = \lim_{x \to x_0} m(x),$$

if such a limit exists. We shall now give a definition of the limit. The idea is that
$\lim_{x \to x_0} m(x) = L$ if the function $m$ becomes continuous at $x_0$ when we insert the
value $L$ as the value $m(x_0)$. Thus we want to use $L$ as $m(x_0)$ in the definition of
continuity. This gives the following:

Definition. Let $m$ be a function defined at each point of an interval $I$, except at the
point $x_0$. Suppose that for every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |m(x) - L| < \varepsilon.$$

Then

$$\lim_{x \to x_0} m(x) = L.$$

For example, for $f(x) = x^2$, $x_0 = 1$, we have

$$m(x) = \frac{x^2 - 1}{x - 1} = x + 1 \qquad (x \ne 1).$$

When we insert the point $(1, 2)$ on the graph of the function $m$, we get a continuous
function (which is equal to $x + 1$ for every $x$). Thus $\lim_{x \to x_0} m(x) = 2$, not just
intuitively but also in terms of our definition of a limit.
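The behavior of the secant slope can be watched numerically. The following Python sketch (the name `secant_slope` is ours) computes $m(x)$ for $f(x) = x^2$ and $x_0 = 1$ at points $x$ approaching 1 from either side; the function is deliberately left undefined at $x = x_0$, as in the definition.

```python
def secant_slope(f, x0, x):
    """Slope of the secant line through (x0, f(x0)) and (x, f(x)), for x != x0."""
    if x == x0:
        raise ValueError("m(x) is not defined at x = x0")
    return (f(x) - f(x0)) / (x - x0)


def square(x):
    return x * x


# As x approaches 1, m(x) = x + 1 approaches 2.
for h in (0.5, 0.1, 0.001, -0.001, -0.1):
    print(1 + h, secant_slope(square, 1, 1 + h))
```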
There is one more problem to consider. What if $f(x_0)$ is defined? In this case,
what do we mean by $\lim_{x \to x_0} f(x)$? The answer is that we ignore the value of $f$ at $x_0$,
and investigate how the rest of the graph behaves. To be exact:
Definition. Let $f$ be a function defined on an interval $I$, except perhaps at the point $x_0$.
Suppose that for every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x) - L| < \varepsilon.$$

Then

$$\lim_{x \to x_0} f(x) = L.$$

Note that here we have simply copied the preceding definition, using $f$ for $m$: all
along, the value $x = x_0$ was ruled out by the condition $0 < |x - x_0|$. The left-hand
figure, showing the $\varepsilon\delta$-box, looks the same as before, except that there is no point in
plotting $f(x_0)$ (which may not be defined, and which will not in any case be used).

[Figure: left, an $\varepsilon\delta$-box at $(x_0, L)$; right, a graph for which the limit at $x_0$ is $L$ but $f(x_0)$ is a different number.]

The definition of a limit applies in peculiar ways to certain peculiar functions.
In the right-hand figure above, $\lim_{x \to x_0} f(x) = L$, but $L \ne f(x_0)$. The following
theorem shows that the strange situation shown in the figure cannot happen if the
function is continuous.

Theorem 2. $f$ is continuous at $x_0$ if and only if

$$\lim_{x \to x_0} f(x) = f(x_0).$$

Here the displayed formula says three things at once:

1) $f(x_0)$ is defined. That is, $x_0$ is in the domain of $f$.
2) $\lim_{x \to x_0} f(x)$ exists. That is, $f(x)$ approaches a limit, as $x \to x_0$.
3) $f(x_0)$ and $\lim_{x \to x_0} f(x)$ are the same number.
PROBLEM SET 3.3
1. How close to 3 does $x$ need to be, for $2x$ to be within 0.001 of 6? (Answer in the form
$|x - 3| < \delta \Rightarrow |2x - 6| < 0.001$. Sketch the graph of $f(x) = 2x$, and sketch your
$\varepsilon\delta$-box ($\varepsilon = 0.001$).)

2. Find numbers $d$ and $d'$ such that $d < x < d' \Rightarrow |x^2 - 3^2| < 0.0001$. Sketch the graph
of $f(x) = x^2$ and sketch your $dd'$-box. In your sketch, you will have to distort the scale
grossly, because of the small size of your box.

3. Show that the function $f(x) = x^2$ is continuous at the point $x_0 = 3$. Use the method
that was used in the text for the same function at the point $x_0 = 1$ and apply Theorem 1.
Thus your answer will include statements in the form: "Let $d = \ldots$, and let $d' = \ldots$.
Then $d < x < d' \Rightarrow 3^2 - \varepsilon < x^2 < 3^2 + \varepsilon$." Sketch the function, showing your
$dd'$-box.

Answer as in Problem 3 for the following functions, at the following points.

4. $f(x) = 2x^2$, $x_0 = 1$.
5. $f(x) = \sqrt{x}$, $x_0 = 4$.
6. $f(x) = \sqrt[3]{x}$, $x_0 = 8$.
7. $f(x) = x^3$, $x_0 = 2$.
8. $f(x) = x^n$, where $n$ is any positive integer and $x_0$ is any number.
9. $f(x) = \sqrt{x}$, $x_0$ is any positive number.
10. A function $f$ is called Lipschitzian if there is a number $m > 0$ such that for every two
points $x$, $x_0$ we have

$$|f(x) - f(x_0)| \le m\,|x - x_0|.$$

Show that every Lipschitzian function is continuous.

3.4

THEOREMS ON LIMITS

In this section we shall give the elementary rules that we use in dealing with limits.
These rules are much easier to learn and to use than they are to prove, and so many of
the proofs are omitted from this section. (You will find the missing proofs in Appendix
B.) But some of the proofs are easy, and they throw some light on the idea of a limit.
Theorem 1. If $\lim_{x \to x_0} f(x) = L$, then $\lim_{x \to x_0} [-f(x)] = -L$.

Proof. To get $-f$ from $f$, we flip the graph of $f$ across the $x$-axis. We know that for
every $\varepsilon > 0$, $f$ has an $\varepsilon\delta$-box at $(x_0, L)$. If we flip the box across the $x$-axis, in the
same way that we flipped the graph, this gives us a box for $-f$ at $(x_0, -L)$.
This theorem can also be proved algebraically. The hypothesis means that:

(1) for every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x) - L| < \varepsilon;$$

the conclusion means that:

(2) for every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |-f(x) - (-L)| < \varepsilon.$$

Since $|-f(x) - (-L)| = |f(x) - L|$, it is obvious that (1) $\Rightarrow$ (2), and thus the
theorem holds.

Theorem 2. If $\lim_{x \to x_0} f(x) = L$, then $\lim_{x \to x_0} [f(x) - L] = 0$.

[Figure: the graph of $f$ with a box at $(x_0, L)$, and the graph of $f - L$ with the box moved along with it.]

Proof. Given $f$, we get $f - L$ by moving the graph up or down a certain distance
(down or up, according as $L$ is positive or negative). We move the box along with the
graph; and this gives a box for the function $f - L$.

Theorem 3. If $\lim_{x \to x_0} [f(x) - L] = 0$, then $\lim_{x \to x_0} f(x) = L$.

To prove this, merely use the previous proof in reverse; move the box along
with the graph.

Theorem 4. If $\lim_{x \to x_0} f(x) = L$, and $k$ is any number, then $\lim_{x \to x_0} kf(x) = kL$.

That is, the limit of a constant times a function is the same constant times the
limit of the function.

Proof.
1) For $k = 0$, this is easy: $kf(x) = 0$ for every $x$, and so $\lim_{x \to x_0} kf(x) = 0 = 0 \cdot L$.

2) Suppose that $k > 0$. For every $\varepsilon > 0$, the graph of $f$ has an $\varepsilon\delta$-box at $(x_0, L)$.
Therefore, for every $\varepsilon > 0$, the graph of $f$ has an $(\varepsilon/k)\delta$-box at $(x_0, L)$. Thus

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x) - L| < \varepsilon/k.$$

Multiplying by $k$, in the inequality on the right, we get:

$$0 < |x - x_0| < \delta \;\Rightarrow\; |kf(x) - kL| < \varepsilon,$$

and so $\lim_{x \to x_0} kf(x) = kL$, which was to be proved.

3) Proof for $k < 0$?

These proofs illustrate the way we work with boxes, to prove things about limits.
The following theorems will be proved later (unless you find a way to prove them
for yourself).

Theorem 5. If $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = L'$, then

$$\lim_{x \to x_0} [f(x) + g(x)] = L + L'.$$

That is, if each of the functions $f$ and $g$ has a limit, as $x \to x_0$, then the sum also
has a limit, and the limit of the sum is the sum of the limits.

Theorem 6. If $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = L'$, then

$$\lim_{x \to x_0} [f(x)g(x)] = LL'.$$

Theorem 7. If $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = L'$, and $L' \ne 0$, then

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{L}{L'}.$$

Caution: The preceding theorem says nothing about what happens when $L' = 0$.
And in fact, for $L' = 0$ anything can happen, even in very simple cases. If

$$f(x) = 2x \quad \text{and} \quad g(x) = x,$$

then

$$\lim_{x \to 0} \frac{f(x)}{g(x)} = \lim_{x \to 0} \frac{2x}{x} = 2.$$

And any number $k$ can be used in place of 2. Therefore, if $f(x) \to 0$ and $g(x) \to 0$,
the quotient $f/g$ can approach any number whatever as a limit. This should not
surprise us, because every time we calculate a derivative we are finding the limit of a
quotient

$$\frac{f(x) - f(x_0)}{x - x_0}$$

whose numerator and denominator are approaching 0.
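The caution can be illustrated with a few lines of Python. In this sketch (the function name is our own), $f(x) = kx$ and $g(x) = x$ both approach 0 as $x \to 0$, yet the quotient stays near $k$, whatever number $k$ we choose.

```python
def quotient_near_zero(k, xs=(0.1, 0.001, 1e-6)):
    """Values of f(x)/g(x) for f(x) = k*x and g(x) = x at points x near 0 (x != 0)."""
    return [(k * x) / x for x in xs]


# Numerator and denominator both tend to 0, but the quotient tends to k.
for k in (2.0, -7.0, 100.0):
    print(k, quotient_near_zero(k))
```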

In the preceding section we showed that $f$ is continuous at $x_0$ if and only if

$$\lim_{x \to x_0} f(x) = f(x_0).$$

(This was Theorem 2.) Hereafter, we shall regard the above formula as interchangeable
with the definition of continuity. Thus every theorem on limits automatically
gives us a theorem on continuous functions. Some of these are as follows:

Theorem 8. If $f$ is continuous at $x_0$, and $k$ is any number, then $kf$ is continuous at $x_0$.

Proof. We are given that

$$\lim_{x \to x_0} f(x) = f(x_0).$$

By Theorem 4,

$$\lim_{x \to x_0} kf(x) = kf(x_0),$$

and this means that $kf$ is continuous at $x_0$.


Theorem 9. If $f$ and $g$ are continuous at $x_0$, then so also are $f + g$ and $fg$.

Proof? (Use Theorems 5 and 6.)

Theorem 10. If $f$ and $g$ are continuous at $x_0$, and $g(x_0) \ne 0$, then $f/g$ is continuous
at $x_0$.
(Use Theorem 7.)
Most of the time we shall apply these results not just at one point $x_0$ but throughout
the domain of the functions $f$ and $g$. For these cases, we can state our theorems
more briefly as follows:

Theorem 11. Let $f$ and $g$ be functions with the same domain. If $f$ and $g$ are continuous,
then so also are $kf$, $f + g$, and $fg$. And $f/g$ is continuous at every point $x_0$ where
$g(x_0) \ne 0$.

Thus, for example, given that $f(x) = x^2 + 1$ and $g(x) = x^4 + 4$ are continuous,
on the entire real number system, we can infer immediately that $kf$, $f + g$, $fg$, and $f/g$
have the same property. Here $g(x) \ne 0$ for every $x$. Given

$$h(x) = x^2 - 1,$$

we can infer that $f + h$ and $fh$ are continuous everywhere, and that $f/h$ is continuous
except at 1 and $-1$. Of course, at $x = 1$ and $x = -1$ it is not just continuity that
breaks down: the quotient function is not even defined at these points, because the
denominator of $f/h$ becomes 0.
Finally, a trivial observation.

Theorem 12. Every constant function is continuous.

Proof. Given $f(x) = k$, for every $x$ in a certain domain. We need to show that for
every $\varepsilon > 0$ there is a $\delta > 0$ such that

$$|x - x_0| < \delta \;\Rightarrow\; |f(x) - k| < \varepsilon.$$

Obviously any positive number can be used as $\delta$.


PROBLEM SET 3.4

In solving the following problems, you need not base your work directly on the definition
of continuity or the definition of a limit; you are free to use all the theorems stated in this
section. Note that the later problems are not based on this section at all; they are extensions
of the theory.
1. Show that if $f(x) = kx$, then $f$ is continuous.
2. Same, for $f(x) = kx^2$.
3. Same, for $f(x) = x^n$ ($n$ a positive integer).
4. Same, for $f(x) = x^3 - 3x^2 + x - 4$.
5. A polynomial of degree $n$ is a function of the form

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

where $a_n \ne 0$. (The function which is $= 0$ for every $x$ is a polynomial of degree 0.)
Show that every polynomial is continuous.

6. Show that if

$$f(x) = \frac{x^n}{1 + x^2},$$

then $f$ is continuous.

A function $f$ is bounded if there is a number $M$ such that $-M \le f(x) \le M$ for every $x$
in the domain of $f$. If the above inequalities hold, then we say that $M$ is a bound of $f$.
Obviously, to show that a function $f$ is bounded, you have to name a bound of $f$; and to
show that a function is unbounded, you have to show that no number $M$ is a bound of $f$.


Find out which of the following functions are bounded, on the given domains, and justify
your answers:
7. f(x)

9. f(x)

11. f(x)

13. f(x)

15. f(x)

17. f(x)

19. f(x)

1
=

--

1 + x2

10. f(x) =

x3, 0 ;;:; x ;;:; 2

12. f(x)

1 + x2
x

--

1 + x3

1 + x2
1
x + 1

8. f(x)

x2, 0 ;;:; x ;;:; 1

x
=

, - oo < x < oo

--

1 + x3

x2, - oo < x < oo


x2

--2

1 + x

x
=

--

1 + x2'
x

, 1;;:; x < oo

14. f(x)

, - oo < x < oo

16. f(x)

, -1 < x < 1

18. f(x)

--

1 + x2

< x <

0 x 1
-

, - oo < x < -1

1
-- , 1 < x < oo
1 + x3
x4

- w

--

1 + x3

, 0 < x < oo

, -1 < x < 1

20. Show that if $f$ is bounded, then so also is $kf$ for every $k$.
21. Show that if $f$ and $g$ are bounded (on the same domain), then so also is $f + g$.
*22. Show that if $f$ and $g$ are bounded (on the same domain), then so also is $fg$. You may
find it convenient to write the condition for boundedness in the form $|f(x)| \le M$.
Can you infer also that $f/g$ is bounded? Why or why not?
*23. a) Show that if $f$ is bounded and

$$\lim_{x \to x_0} g(x) = 0,$$

then

$$\lim_{x \to x_0} [f(x)g(x)] = 0.$$

(First try proving this for the case $M = 1$.)
b) Show that if

$$\lim_{x \to x_0} f(x) = 0,$$

then

$$\lim_{x \to x_0} [f(x) \sin x] = 0.$$

(Here it makes no difference whether degrees or radians are being used.)
c) In Problem 23(a), can we get along without the hypothesis that $f$ is bounded? That
is, if

$$\lim_{x \to x_0} g(x) = 0,$$

does it follow that

$$\lim_{x \to x_0} [f(x)g(x)] = 0,$$

no matter what kind of function $f$ may be? Why or why not?

d) Show that if

$$\lim_{x \to 0} f(x) = 0,$$

then

$$\lim_{x \to 0} \left[ f(x) \sin \frac{1}{x} \right] = 0.$$

(Query: Can Theorem 6 be applied to this problem?)
e) Show that

$$\lim_{x \to 0} x^2 \cos \frac{1}{x} = 0.$$

*24. a) A function $f$ is locally bounded at $x_0$ if there are positive numbers $M$ and $\delta$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x)| < M.$$

b) If $f$ is bounded, does it follow that $f$ is locally bounded at each point of its domain?
c) Conversely, if $f$ is locally bounded at each point of its domain, does it follow that $f$
is bounded?
d) If $f$ is locally bounded at each point of the open interval $(0, 1)$, does it follow that $f$
is bounded on $(0, 1)$? Why or why not?
e) Show that if

$$\lim_{x \to x_0} f(x) = L,$$

then $f$ is locally bounded at $x_0$. (This result does not require that $x_0$ be in the domain
of $f$. If you draw a picture of what you have, and a picture of what you want, and
compare the two, this proof may become obvious.)
f) Show that if $f$ is locally bounded at $x_0$, and

$$\lim_{x \to x_0} g(x) = 0,$$

then

$$\lim_{x \to x_0} f(x)g(x) = 0.$$

3.5

THE PROCESS OF DIFFERENTIATION

The theorems in the preceding section tell us enough about limits to give us some
information about derivatives.
To make some formulas easier to write, we introduce an alternative notation for
the derivative: we write $Df$ to mean the derivative of $f$. Thus

$$Df = f',$$

by definition. Similarly, if

$$h(x) = f(x) + g(x)$$

for every $x$ in a certain domain, then

$$D(f + g) = h'.$$

Similarly, when we write

$$D(x^2 + 2x + 5)$$

we mean the derivative of the function

$$x \mapsto x^2 + 2x + 5.$$

We know already what this derivative is:

$$D(x^2 + 2x + 5) = 2x + 2.$$

Here we are merely rewriting the result which we got quite a while ago: for each $x$,
the slope of the tangent to the graph of

$$y = ax^2 + bx + c$$

is given by the formula

$$S_x = 2ax + b.$$

We recall that a function $f$ is differentiable at a point $x_0$ if it has a derivative at $x_0$.
When we say that $f$ is differentiable, we mean that it has a derivative at every point of
its domain. For example, if $f(x) = |x|$, then $f$ is differentiable at 1, but not at 0.
But if $f(x) = x^2$, then $f$ is differentiable (without qualification).
Theorem 1. The derivative of a constant function is 0.

Here by a constant function we mean a function $f$ for which $f(x) = k$ for every $x$.
Obviously the slope of the tangent is 0 everywhere. Algebraically,

$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{k - k}{x - x_0} = \lim_{x \to x_0} 0 = 0.$$

Theorem 2. If $f$ is differentiable, then so also is $kf$ for every $k$, and

$$D(kf) = kDf.$$

Proof. Take any $x_0$ in the domain of $f$. Then

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0),$$

by the definition of $f'(x_0)$. Therefore, by Theorem 4 of Section 3.4,

$$\lim_{x \to x_0} \frac{kf(x) - kf(x_0)}{x - x_0} = kf'(x_0).$$

Therefore, at each point $x_0$, the derivative of $kf$ is $k$ times the derivative of $f$. We
have seen an example of this:

$$Dx^2 = 2x, \quad \text{and} \quad D(kx^2) = 2kx,$$

as it should be.
Theorem 3. If $f$ and $g$ are differentiable, then so also is $f + g$, and

$$D(f + g) = Df + Dg.$$

Proof. Given an $x_0$ in the domain of $f$ and $g$, we have

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0)$$

and

$$\lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0).$$

We want to prove that

$$\lim_{x \to x_0} \frac{[f(x) + g(x)] - [f(x_0) + g(x_0)]}{x - x_0} = f'(x_0) + g'(x_0).$$

Since we know by Theorem 5 of Section 3.4 that the limit of the sum is the sum of the
limits, the result follows immediately; the big fraction in the third formula is the sum
of the two fractions in the preceding two formulas.
We shall now show that

$$Dx^3 = 3x^2.$$

For each $x$, let

$$f(x) = x^3;$$

and take any $x_0$. Then

$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{x^3 - x_0^3}{x - x_0} = \lim_{x \to x_0} \frac{(x - x_0)(x^2 + x_0 x + x_0^2)}{x - x_0} = \lim_{x \to x_0} (x^2 + x_0 x + x_0^2).$$

(Why?) Therefore

$$f'(x_0) = x_0^2 + x_0^2 + x_0^2 = 3x_0^2$$

for every $x_0$, by Theorems 6, 4, and 5 of Section 3.4; and

$$Dx^3 = 3x^2,$$

which is what we wanted.

To extend this result to $f(x) = x^n$, for every positive integer $n$, we merely need
to know the general factorization formula

$$x^n - x_0^n = (x - x_0)(x^{n-1} + x^{n-2}x_0 + \cdots + x\,x_0^{n-2} + x_0^{n-1}).$$

This is easy to check by multiplication.


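Theorem 4. $Dx^n = nx^{n-1}$, for every positive integer $n$.

Proof. Let $f(x) = x^n$ for every $x$, and take any $x_0$. Then

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{x^n - x_0^n}{x - x_0} = \lim_{x \to x_0} \frac{(x - x_0)(x^{n-1} + x^{n-2}x_0 + \cdots + x\,x_0^{n-2} + x_0^{n-1})}{x - x_0}$$

$$= \lim_{x \to x_0} (x^{n-1} + x^{n-2}x_0 + \cdots + x\,x_0^{n-2} + x_0^{n-1}) = x_0^{n-1} + x_0^{n-1} + \cdots + x_0^{n-1} \quad (\text{to } n \text{ terms}).$$

Thus $f'(x_0) = nx_0^{n-1}$ for every $x_0$, and so

$$Dx^n = nx^{n-1},$$

which was to be proved.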
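The factorization used in this proof can be checked by machine as well as by hand. The Python sketch below (the function name is our own) evaluates the second factor $x^{n-1} + x^{n-2}x_0 + \cdots + x_0^{n-1}$, confirms the factorization for sample integers, and confirms that the second factor has the value $nx_0^{n-1}$ when $x = x_0$. Integer arithmetic is used so that the comparisons are exact.

```python
def second_factor(x, x0, n):
    """The sum x**(n-1) + x**(n-2)*x0 + ... + x*x0**(n-2) + x0**(n-1), with n terms."""
    return sum(x ** (n - 1 - k) * x0 ** k for k in range(n))


# Check x**n - x0**n == (x - x0) * second_factor(x, x0, n) for sample integers.
for n in range(1, 8):
    for x, x0 in ((5, 2), (-3, 4), (7, 7)):
        assert x ** n - x0 ** n == (x - x0) * second_factor(x, x0, n)

# At x = x0 every one of the n terms equals x0**(n-1).
print(second_factor(2, 2, 5), 5 * 2 ** 4)
```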

The preceding theorems, in combination, enable us to differentiate any polynomial.
For example,

$$D(17x^{29} + \pi x^{17} - 7x^5) = D17x^{29} + D\pi x^{17} - D7x^5,$$

because the derivative of the sum is the sum of the derivatives. This is

$$= 17 \cdot 29x^{28} + 17\pi x^{16} - 35x^4.$$

Theorem 5. If $f$ is differentiable at $x_0$, then $f$ is continuous at $x_0$.


[Figure: a graph that makes a jump at $x_0$, with steep secant lines.]

This is easy to see, as a matter of common sense. By definition, $f'(x_0)$ is the
slope of the tangent. Therefore $f$ must have a nonvertical tangent at $x_0$. But if $f$
"made a jump" at $x_0$, there surely couldn't be a nonvertical tangent. The secant
lines would get too steep for their slopes to approach a limit. Pictures are useful, but
it is hard to be sure that the pictures we draw allow for all possibilities. In any case,
the theorem is easy to prove algebraically. Given

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0),$$

then

$$\lim_{x \to x_0} [f(x) - f(x_0)] = \lim_{x \to x_0} \left[ \frac{f(x) - f(x_0)}{x - x_0} \cdot (x - x_0) \right] = f'(x_0) \cdot 0 = 0.$$

(Theorem needed, for the second step?) By Theorem 3 of Section 3.4, $\lim_{x \to x_0} f(x) = f(x_0)$,
which was to be proved.

The differential calculus would be simpler if the derivative of the product were
equal to the product of the derivatives; but this is not so. For example, take

$$f(x) = x^3, \qquad g(x) = x^2.$$

Then

$$f'(x)g'(x) = 3x^2 \cdot 2x = 6x^3,$$

which is not the same as

$$Dx^5 = 5x^4.$$

A correct formula can be derived as follows. Take any $x_0$, and suppose, as usual,
that $f$ and $g$ are differentiable. Then

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0) \qquad (1)$$

and

$$\lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0). \qquad (2)$$

To find the derivative of the product, we need to find

$$\lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0}. \qquad (3)$$

In a similar situation, when we were finding the derivative of $f + g$, there was no
problem: we looked at the fraction whose limit we wanted to find, and observed
that it was the sum of the two fractions

$$\frac{f(x) - f(x_0)}{x - x_0}, \qquad \frac{g(x) - g(x_0)}{x - x_0}$$

whose limits we knew. If these fractions appeared in (3), then we could use the fact
that their limits are given. Since neither of them appears, we use a trick: we simply
put one of them there, fix up the rest of the fraction so that its value is unchanged,
and hope for the best:

$$\frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = \frac{f(x)g(x) - f(x_0)g(x) + f(x_0)g(x) - f(x_0)g(x_0)}{x - x_0} = \frac{f(x) - f(x_0)}{x - x_0}\, g(x) + \frac{g(x) - g(x_0)}{x - x_0}\, f(x_0).$$

Now we can see what happens as $x \to x_0$:

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0), \qquad \lim_{x \to x_0} g(x) = g(x_0), \qquad \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0).$$

(The middle formula holds because $g$, being differentiable at $x_0$, is continuous at $x_0$, by Theorem 5.)
Therefore, by our theorems on limits of sums and products, we have

$$\lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = f'(x_0)g(x_0) + f(x_0)g'(x_0).$$

In words:

Theorem 6. The derivative of the product of two differentiable functions is the
derivative of the first, times the second, plus the first times the derivative of the
second.

More briefly:

$$D(fg) = f'g + g'f.$$

Let us try this one out for

$$f(x) = x^3, \qquad g(x) = x^2.$$

Now

$$f'(x) = 3x^2, \qquad g'(x) = 2x.$$

Therefore

$$f'(x)g(x) + f(x)g'(x) = 3x^2 \cdot x^2 + x^3 \cdot 2x = 5x^4,$$

as it should be.
Next we want to find the derivative of the reciprocal $g = 1/f$ of a function $f$.
As usual, we take a fixed $x_0$; and we assume that $f$ is differentiable at $x_0$. We must
have $f(x_0) \ne 0$, or $g(x_0)$ would not be defined. Now

$$g'(x_0) = \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{1/f(x) - 1/f(x_0)}{x - x_0},$$

if such a limit exists. Algebraically,

$$\frac{1/f(x) - 1/f(x_0)}{x - x_0} = \frac{f(x_0) - f(x)}{f(x)f(x_0)(x - x_0)} = \frac{f(x) - f(x_0)}{x - x_0} \cdot \frac{-1}{f(x)f(x_0)}.$$

As $x \to x_0$, the first fraction approaches $f'(x_0)$, and the second fraction approaches
$-1/[f(x_0)]^2$, because $f$ is continuous at $x_0$ (Theorem 5) and $f(x_0) \ne 0$. Therefore

$$g'(x_0) = \frac{-f'(x_0)}{[f(x_0)]^2},$$

and so

$$D\left(\frac{1}{f}\right) = -\frac{f'}{f^2}$$

at every point where $f(x) \ne 0$. In words:

Theorem 7. The derivative of the reciprocal of a differentiable function is equal to
minus the derivative of the function, divided by the square of the function (wherever
the function is different from zero).
Using the preceding two theorems, we get

$$D\left(\frac{f}{g}\right) = D\left(f \cdot \frac{1}{g}\right) = f' \cdot \frac{1}{g} + f \cdot D\left(\frac{1}{g}\right) = \frac{f'}{g} + f \cdot \frac{-g'}{g^2} = \frac{f'g - g'f}{g^2}.$$

In words:

Theorem 8. The derivative of the quotient of two differentiable functions is equal to
the denominator times the derivative of the numerator, minus the numerator times
the derivative of the denominator, all divided by the square of the denominator
(wherever the denominator is not 0).

More briefly, at every point where $g(x) \ne 0$, we have

$$D\left(\frac{f}{g}\right) = \frac{gf' - fg'}{g^2}.$$

Let us try this one out in the case

$$f(x) = x^4, \qquad g(x) = x^2,$$

where we already know the answer. By Theorem 8 we get

$$D\left(\frac{x^4}{x^2}\right) = \frac{x^2 \cdot 4x^3 - x^4 \cdot 2x}{(x^2)^2} = \frac{2x^5}{x^4} = 2x.$$

This is right, because $Dx^2 = 2x$.
For convenience of reference, we list our differentiation theorems as short formulas:

(i) $Dk = 0$,
(ii) $D(kf) = kDf$,
(iii) $D(f + g) = Df + Dg$,
(iv) $Dx^n = nx^{n-1}$ ($n$ a positive integer),
(v) $D(fg) = f'g + g'f$,
(vi) $D\left(\dfrac{1}{f}\right) = -\dfrac{f'}{f^2}$ (wherever $f \neq 0$),
(vii) $D\left(\dfrac{f}{g}\right) = \dfrac{gf' - fg'}{g^2}$ (wherever $g \neq 0$).

Theorem 5 did not involve a formula. It said that if a function is differentiable at a point, then it is continuous at the same point.
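Formulas (v), (vi), and (vii) can be spot-checked numerically. The sketch below is only an illustration, not part of the text's development: it compares each formula with a central-difference estimate of the derivative, using the sample functions $f(x) = x^4$ and $g(x) = x^2$. The helper name `derivative`, the step size, and the test point are assumptions chosen for the example.

```python
def derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x) (an approximation, not a proof)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Sample functions and their known derivatives.
f = lambda x: x**4
g = lambda x: x**2
df = lambda x: 4 * x**3
dg = lambda x: 2 * x

def product_rule(x):
    # (v)  D(fg) = f'g + g'f
    return df(x) * g(x) + dg(x) * f(x)

def reciprocal_rule(x):
    # (vi)  D(1/f) = -f'/f^2, valid where f(x) != 0
    return -df(x) / f(x) ** 2

def quotient_rule(x):
    # (vii)  D(f/g) = (gf' - fg')/g^2, valid where g(x) != 0
    return (g(x) * df(x) - f(x) * dg(x)) / g(x) ** 2

x0 = 1.5  # arbitrary test point where f and g are nonzero
checks = {
    "product": (derivative(lambda x: f(x) * g(x), x0), product_rule(x0)),
    "reciprocal": (derivative(lambda x: 1 / f(x), x0), reciprocal_rule(x0)),
    "quotient": (derivative(lambda x: f(x) / g(x), x0), quotient_rule(x0)),
}
for name, (numeric, formula) in checks.items():
    print(f"{name:10s} numeric={numeric:.6f} formula={formula:.6f}")
```

In each row the two numbers should agree to several decimal places; the small remaining gap comes from the finite step `h`.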


Finally, some remarks on the notation used for derivatives. When a function is named by a letter, such as $f$, then the notation $Df$ is unambiguous: it means $f'$. But when we describe functions by formulas, it is not always obvious what function we mean. When we speak of "the function $x^2 - x + 1$," it is obvious that we mean

\[ x \mapsto x^2 - x + 1; \]

and when we speak of "the function $t^2 + 2t - 3$," we mean

\[ t \mapsto t^2 + 2t - 3. \]

But if we speak of the "function"

\[ t^2 - tx^2 + x, \]

we might have either of two things in mind:

a) $x$ is regarded as a constant, and our function is

\[ f: t \mapsto t^2 - tx^2 + x. \]

b) $t$ is regarded as a constant, and our function is

\[ g: x \mapsto t^2 - tx^2 + x. \]

In such a case, it would hardly do to indicate a derivative by writing

\[ D(t^2 - tx^2 + x) \quad (?), \]

because nobody could tell whether we meant $f'$ or $g'$. To eliminate the ambiguity, we write $D_t$ or $D_x$, to indicate which letter does not represent a constant. Thus

\[ D_t(t^2 - tx^2 + x) = f'(t) = 2t - x^2, \]

while

\[ D_x(t^2 - tx^2 + x) = g'(x) = -2tx + 1. \]

Similarly,

\[ D_x(ax^3z + z^2) = 3ax^2z, \]
\[ D_z(ax^3z + z^2) = ax^3 + 2z, \]
\[ D_a(ax^3z + z^2) = x^3z. \]
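The distinction between $D_t$ and $D_x$ can be illustrated numerically: differentiate with respect to one letter while holding the other fixed. The following is a minimal sketch under that reading of the notation; the helper `partial`, the step size, and the sample point $(t, x) = (2, 3)$ are assumptions made for the example.

```python
def partial(func, var_index, point, h=1e-6):
    """Central-difference derivative of func in one argument,
    holding every other argument fixed at its value in `point`."""
    lo = list(point)
    hi = list(point)
    lo[var_index] -= h
    hi[var_index] += h
    return (func(*hi) - func(*lo)) / (2 * h)

def expr(t, x):
    # The ambiguous "function" t^2 - t x^2 + x from the text.
    return t**2 - t * x**2 + x

t0, x0 = 2.0, 3.0
d_t = partial(expr, 0, (t0, x0))  # x treated as a constant
d_x = partial(expr, 1, (t0, x0))  # t treated as a constant

print(d_t, 2 * t0 - x0**2)        # compare with 2t - x^2
print(d_x, -2 * t0 * x0 + 1)      # compare with -2tx + 1
```

The two printed pairs differ from each other, which is exactly why a bare $D$ applied to the formula would be ambiguous.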

PROBLEM SET 3.5

All of the following are differentiation problems. Most of them can be worked by the
standard formulas that we have just derived. But in some cases you will need to start with
the definition of $f'(x_0)$ and then use various algebraic stratagems.
2.

0
1. D(7x1 - x8)
4. D -x2 + 1
n

3. D --

x + 1

x + 1

y
5. D -y3 - 3

7.

D--

()

8.

x)

b) Da(3axy + x2 + a3)

11. a) Dx(3axy + x2 + a3)

12. D[(x2 - x + l)(x2 + x + 1)]

D(?y4 _ y2 + 7T)

9. D(l + x)3

b) Dv (x3y + ay3 + xy 2)

10. a) Dx(x3y + ay3 + xy 2)

x + 1

n 2

6.

1 3. D

c) Da (x3y + ay3 + xy2)

x2 - x + 1

x2 + x + 1

15. D(x2 + x)2

14. D-x3 - x

16. If you worked Problem 9 of Section 3.3, you know that $f(x) = \sqrt{x}$ is continuous in its entire domain $R^+$. Assuming, in any case, that this is true, find $f'$. [Hint: Set up the fraction whose limit is $f'(x_0)$, rationalize the numerator, and hope for the best.]

17. Given $f(x) = \sqrt{x + 1}$ $(x \geq -1)$, find $f'$. Here you may assume that $f$ is continuous. But you should mention this fact, at the stage where you need it.

18. Given $f(x) = \sqrt{x^2 + 1}$, find $f'$. (Assume that $f$ is continuous.)

19. Given $f(x) = \sqrt{1 - x^2}$ $(-1 \leq x \leq 1)$, find $f'$.

20. Given $g(x) = x^2\sqrt{1 - x^2}$, find $g'$.

21. Find $D(1/\sqrt{x})$ $(x > 0)$.

22. Find $D(1/\sqrt{1 - x^2})$.

23. Find $D(x/\sqrt{1 - x^2})$ $(-1 < x < 1)$.

24. Now solve Problem 19 by the methods of Chapter 2, without using limits or differentiation formulas.

Find out whether the following formulas are correct, and give your reasons.

25. $D(x^2 + 1)^2 = 2(x^2 + 1)$ (?)

26. $D(x^2 + 1)^2 = 2(x^2 + 1) \cdot 2x$ (?)

27. $D(x^2 + 1)^3 = 3(x^2 + 1)^2$ (?)

28. $D(x^2 + 1)^3 = 3(x^2 + 1)^2 \cdot 2x$ (?)

29. $D(x^2 + 1)^{500} = 500(x^2 + 1)^{499}$ (?) [Hint: In fact, this formula is wrong. And it is possible to prove that it is wrong, without finding out what the derivative of the given function really is.]

30. Prove by induction that $Dx^n = nx^{n-1}$, without using either a factorization formula or the binomial theorem.

31. Same problem, for $Dx^{-n} = -nx^{-n-1}$.

32. Find $D\sqrt{x^2 + x + 1}$.

3.6 THE PROCESS OF DIFFERENTIATION: ROOTS AND POWERS OF FUNCTIONS

Some of the answers that you got in the preceding problem set deserve to be regarded as standard differentiation formulas. For example, you found that for

\[ f(x) = \sqrt{x} \]

we have

\[ f'(x) = \frac{1}{2\sqrt{x}}. \]

This problem is going to come up again. We had, therefore, better add it to the list of formulas at the end of Section 3.5:

\[ \text{(viii)} \qquad D\sqrt{x} = \frac{1}{2\sqrt{x}}. \]

You found also that

\[ D\sqrt{x + 1} = \frac{1}{2\sqrt{x + 1}} \qquad (x > -1), \]
\[ D\sqrt{x^2 + 1} = \frac{x}{\sqrt{x^2 + 1}}, \]
\[ D\sqrt{1 - x^2} = \frac{-x}{\sqrt{1 - x^2}} \qquad (-1 < x < 1). \]

In each of these cases, we have the problem of finding the derivative of the positive square root $\sqrt{f}$ of the differentiable positive function $f$. If we can solve this problem in the general case, then we can get a formula

\[ D\sqrt{f} = \; ?; \]

and we can then apply the formula hereafter.

Suppose, then, that we are given

\[ g(x) = \sqrt{f(x)}, \]

on a domain where $f(x) > 0$; and suppose that $f$ has a derivative at a certain point $x_0$ of the domain. We want to find $g'(x_0)$. By definition,

\[ g'(x_0) = \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{\sqrt{f(x)} - \sqrt{f(x_0)}}{x - x_0} \]
\[ = \lim_{x \to x_0} \left[ \frac{\sqrt{f(x)} - \sqrt{f(x_0)}}{x - x_0} \cdot \frac{\sqrt{f(x)} + \sqrt{f(x_0)}}{\sqrt{f(x)} + \sqrt{f(x_0)}} \right] \]
\[ = \lim_{x \to x_0} \left[ \frac{f(x) - f(x_0)}{x - x_0} \cdot \frac{1}{\sqrt{f(x)} + \sqrt{f(x_0)}} \right] \]
\[ = f'(x_0) \lim_{x \to x_0} \frac{1}{\sqrt{f(x)} + \sqrt{f(x_0)}}, \]

provided that the latter limit exists. It is easy to see that this limit exists, provided that

\[ \lim_{x \to x_0} \sqrt{f(x)} = \sqrt{f(x_0)}. \qquad (1) \]

Since the limit of the sum is the sum of the limits, it then follows that

\[ \lim_{x \to x_0} \left[ \sqrt{f(x)} + \sqrt{f(x_0)} \right] = 2\sqrt{f(x_0)}; \]

and since the limit of the quotient is the quotient of the limits, we get:

Theorem 1. If $f$ is positive and differentiable, then

\[ D\sqrt{f} = \frac{f'}{2\sqrt{f}}. \]
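As a sanity check on the formula $D\sqrt{f} = f'/(2\sqrt{f})$, one can compare it with a numerical estimate of the derivative for the three square-root functions from the preceding problem set. This is only a sketch; the helper names, the step size, and the test point $x_0 = 0.5$ are assumptions made for the illustration.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

def sqrt_rule(f, df, x):
    """Value of f'(x) / (2 * sqrt(f(x))); requires f(x) > 0."""
    return df(x) / (2 * math.sqrt(f(x)))

# (label, f, f', test point) for the three examples in the text.
cases = [
    ("sqrt(x + 1)", lambda x: x + 1, lambda x: 1.0, 0.5),
    ("sqrt(x^2 + 1)", lambda x: x * x + 1, lambda x: 2 * x, 0.5),
    ("sqrt(1 - x^2)", lambda x: 1 - x * x, lambda x: -2 * x, 0.5),
]

results = []
for label, f, df, x0 in cases:
    numeric = derivative(lambda x: math.sqrt(f(x)), x0)
    formula = sqrt_rule(f, df, x0)
    results.append((label, numeric, formula))
    print(f"{label:15s} numeric={numeric:.6f} formula={formula:.6f}")
```

The test point is kept inside $(-1, 1)$ so that all three functions under the radical are positive there.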
Let us try this on the function $x \mapsto \sqrt{x + 1}$. Here $f(x) = x + 1$, and $f'(x) = 1$ for every $x$. Therefore

\[ D\sqrt{x + 1} = \frac{1}{2\sqrt{x + 1}}, \]

which is the right answer. For $x \mapsto \sqrt{x^2 + 1}$, we have

\[ f(x) = x^2 + 1, \qquad f'(x) = 2x, \]
\[ D\sqrt{x^2 + 1} = D\sqrt{f(x)} = \frac{f'(x)}{2\sqrt{f(x)}} = \frac{2x}{2\sqrt{x^2 + 1}} = \frac{x}{\sqrt{x^2 + 1}}, \]

which is the right answer.

Formula (viii), of course, depends on

\[ \lim_{x \to x_0} \sqrt{f(x)} = \sqrt{f(x_0)}, \qquad (1) \]

which we haven't proved. We postpone the proof, observing meanwhile that (1) is reasonable. Since $f$ is continuous, we have

\[ x \approx x_0 \;\Rightarrow\; f(x) \approx f(x_0). \]

Since $\sqrt{\ }$ is continuous, we have

\[ f(x) \approx f(x_0) \;\Rightarrow\; \sqrt{f(x)} \approx \sqrt{f(x_0)}. \]

Fitting these two statements together, we get

\[ x \approx x_0 \;\Rightarrow\; \sqrt{f(x)} \approx \sqrt{f(x_0)}, \]

which is what we want.

Consider now the following function:

\[ g(x) = (x^{50} - x^{17} + 1)^{247}. \]

If you want to find $g'(x)$, it is not helpful to observe that $g$ is a polynomial, of degree 12,350, or to recall the binomial theorem. In fact, the right approach is to solve first a more general problem. We are given a function $g$ which is a power of a function $f$. Thus

\[ g(x) = f^n(x), \]

where by $f^n(x)$ we mean $[f(x)]^n$. Then

\[ g'(x_0) = \lim_{x \to x_0} \frac{f^n(x) - f^n(x_0)}{x - x_0} \]
\[ = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \left[ f^{n-1}(x) + f^{n-2}(x)f(x_0) + \cdots + f^{n-1}(x_0) \right]. \]

We see that this limit is

\[ g'(x_0) = nf^{n-1}(x_0)f'(x_0), \qquad (2) \]

provided that in the brackets on the right we have

\[ \lim_{x \to x_0} f^k(x) = f^k(x_0) \qquad (3) \]

for each positive integer $k$. Equation (3) states that the $k$th power of a continuous function is always continuous. And this is true: given that

\[ \lim_{x \to x_0} f(x) = f(x_0), \]

it follows that

\[ \lim_{x \to x_0} f^2(x) = \lim_{x \to x_0} [f(x)f(x)] = f(x_0)f(x_0) = f^2(x_0), \]

because the limit of the product is the product of the limits. For the same reason,

\[ \lim_{x \to x_0} f^3(x) = \lim_{x \to x_0} [f^2(x)f(x)] = f^2(x_0)f(x_0) = f^3(x_0). \]

In $k - 1$ such steps, we get Eq. (3). Therefore Eq. (2) is correct; and we have:

Theorem 2. If $f$ is differentiable, and $n$ is a positive integer, then

\[ Df^n = nf^{n-1}f'. \]

Let us try this on our polynomial of degree 12,350:

\[ D(x^{50} - x^{17} + 1)^{247} = Df^{247} = 247f^{246}f' = 247(x^{50} - x^{17} + 1)^{246}(50x^{49} - 17x^{16}). \]

Note that our use of the shortcut formula $Df^n = nf^{n-1}f'$ has two advantages
over the method based on the binomial expansion. First, the calculation is possible,
as a practical matter. Second, it gives the answer not merely in a correct form, but
also in a factored form, which is easier to handle than the binomial expansion of the
derivative.
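For a small case the two methods can be compared exactly. The sketch below is an illustration with assumed helper names: it expands $(1 - x^2 + x^5)^7$ term by term, differentiates the expansion, and checks that the result has the same coefficients as $n f^{n-1} f'$. Polynomials are stored as coefficient lists, with index equal to the power of $x$; the polynomial and exponent are stand-ins chosen to keep the expansion short.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """n-th power of a polynomial by repeated multiplication (n >= 0)."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

def poly_diff(p):
    """Term-by-term derivative: c * x^k becomes k * c * x^(k-1)."""
    return [k * c for k, c in enumerate(p)][1:]

f = [1, 0, -1, 0, 0, 1]  # 1 - x^2 + x^5
n = 7

direct = poly_diff(poly_pow(f, n))  # expand first, then differentiate
shortcut = [n * c for c in poly_mul(poly_pow(f, n - 1), poly_diff(f))]  # n f^(n-1) f'

print(direct == shortcut)  # True
```

The comparison uses exact integer arithmetic, so agreement here is coefficient-by-coefficient equality rather than an approximation.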
Since we know how to differentiate fractions, we know how to differentiate functions of the form

\[ f(x) = \frac{1}{x^k}, \]

where $k$ is a positive integer. We have

\[ f'(x) = \frac{-Dx^k}{(x^k)^2} = \frac{-kx^{k-1}}{x^{2k}} = \frac{-k}{x^{k+1}}, \]

where $k + 1 = 2k - (k - 1)$. If we express $1/x^k$ as $x^{-k}$, and make the same change in the formula for $f'$, then we get

\[ Dx^{-k} = -kx^{-k-1}. \]

This has the same form as our previous formula $Dx^n = nx^{n-1}$, with $n = -k$. What is needed, to take care of all such cases, is the following:

Theorem 3. If $n$ is a positive integer, and $f$ is differentiable, then

\[ Df^n = nf^{n-1}f'. \]

If $n$ is a negative integer, then the same formula holds at every point $x$ where $f(x) \neq 0$.

The last condition is necessary: if $n < 0$, then $f(x)$ appears in the denominator of $f^n(x) = 1/f^{-n}(x)$, and $f^n$ is therefore not defined at points where $f(x) = 0$.

Theorem 3 has already been proved for the case in which $n$ is a positive integer. For the case in which $n$ is a negative integer, $n = -k$, the proof is as follows:

\[ Df^n = Df^{-k} = D\frac{1}{f^k} = \frac{-Df^k}{(f^k)^2} = \frac{-kf^{k-1}f'}{f^{2k}} = -kf^{-k-1}f' = nf^{n-1}f'. \]

For convenience of reference, we list all the differentiation formulas that we have so far:

(i) $Dk = 0$,
(ii) $D(kf) = kDf$,
(iii) $D(f + g) = Df + Dg$,
(iv) $Dx^n = nx^{n-1}$ $(n \neq 0)$,
(v) $D(fg) = f'g + g'f$,
(vi) $D\left(\dfrac{1}{f}\right) = -\dfrac{f'}{f^2}$,
(vii) $D\left(\dfrac{f}{g}\right) = \dfrac{gf' - fg'}{g^2}$,
(viii) $D\sqrt{x} = \dfrac{1}{2\sqrt{x}}$,
(ix) $D\sqrt{f} = \dfrac{f'}{2\sqrt{f}}$,
(x) $Df^n = nf^{n-1}f'$ $(n \neq 0)$.

Each of these formulas holds for every $x$ for which its right-hand member is defined.

PROBLEM SET 3.6


Find, by any method:

I.

2. D

Dv'(x +l)(x +2)

8. DVx3 +2x +1

7. DVx(x - 2)
x2
1
10. D- -
x2 +1
-

11. D

DVx2 (Warning:

-J
1

Recall that for

x
(x2 +2x+1)2

x2 +1
6. D- -x2 - 1
9.

v'x - I
--
x2 + 1

12. DYv'

+x

Don't "simplify" yourself into a wrong answer.)

15. Dv(x3y2 - x2y3)3

14. D,, v' x2 +a2


17.

3.

5. D(x3 +x2 - x +7)2

4. DVx4 +5x2 +2

13.

1
(x2 +x +1)2

>

16 Dx

0,

x3y2
(x2 +y2)2

by definition. Show that

[Hint:

_
.a,

By definition, v

.{;xv

xv is

( :X)1'

(x

>

o) .

the number which, when raised to the qth power, gives

x'l>.

Therefore you need to show that the qth power of the right-hand side of the above
equation is

xv.]

This is an instance of a frequent phenomenon: often, a problem becomes easy if

18.

we rewrite it, using the

definitions

Find

the answer in a form which brings out the analogy with the

Dx312, and write


formula Dxn
nxn-1
=

of the ideas that the problem involves.

You may assume that

x312 is

continuous.

19. Find $D_x[\sqrt{x^3 + x}\,(x^3 + x)]$.

20. Find $D_x(x^4 + 2)^{3/2}$.

*21. Get a general formula for $Df^{3/2}$, where $f$ is any positive function (differentiable, of course). The answer should be written in such a form as to bring out the analogy with the formula $Df^n = nf^{n-1}f'$.

*22. Find a formula for $Df^{5/2}$, where $f$ is differentiable and positive.

23. Find $D_x(x^2 + 3x + 1)^{5/2}$.

24. Find a formula for $Dx^{-3/2}$ and write it in a form which brings out the analogy with the formula $Dx^n = nx^{n-1}$.

25. a) Simplify

\[ \frac{a - b}{a^3 - b^3}. \]

(Obviously, the word "simplify" can mean many different things. In this case, it means to get rid of the numerator, so as to get a formula which will be useful to you in solving the next part of the problem.)

b) Find $D\sqrt[3]{x}$, assuming that $\sqrt[3]{x}$ is continuous.

26. a) Simplify, as in Problem 25,

\[ \frac{a - b}{a^q - b^q}. \]

(Here $q$ is a positive integer.)

b) Find $D\sqrt[q]{x}$, assuming that $\sqrt[q]{x}$ is continuous.

*27. Given positive integers $p$, $q$. Find a formula for $Dx^{p/q}$, valid for $x > 0$, and write the answer in a form which brings out the analogy with the formula $Dx^n = nx^{n-1}$.
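3.7 THE INTEGRAL OF A NONNEGATIVE FUNCTION

Given the graph of $y = kx^2$. In Section 2.10, we calculated the area $A_h$ of the shaded region

\[ \{(x, y) \mid 0 \leq x \leq h,\ 0 \leq y \leq kx^2\}. \]

We found that

\[ A_h = \frac{k}{3}h^3. \]

[Figures: the shaded region under the parabola, drawn first over the $x$-axis and then over the $t$-axis with equation $y = kt^2$.]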

In this situation, we regarded $h$ as a fixed number. On the other hand, it is plain that $h$ can be any positive number, and that when $h$ is named, $A_h$ is determined. Thus $A_h$ can be regarded as a function.

To discuss $A_h$ as a function, without confusing the notation, let us relabel our horizontal axis as the $t$-axis, as shown on the right. Thus our parabola becomes the graph of the equation $y = kt^2$. (Here $t$ is acting like $x$.) For each $x \geq 0$, let $F(x)$ be the area of the region under the parabola, from 0 to $x$. Thus $F(1)$ is the area under the parabola from 0 to 1; this is

\[ F(1) = \frac{k}{3} \cdot 1^3 = \frac{k}{3}. \]

$F(3)$ is the area under the parabola from 0 to 3; this is

\[ F(3) = \frac{k}{3} \cdot 3^3 = 9k. \]

And so on. Thus we have a function $F: R^+ \to R^+$. And we have a formula giving the values of $F$:

\[ F(x) = \frac{k}{3}x^3. \]

Here we have replaced $h$ by $x$ in the area formula

\[ A_h = \frac{k}{3}h^3. \]

Thus, starting with the nonnegative function

\[ f: t \mapsto kt^2, \]

we have defined a new function

\[ F: x \mapsto \frac{k}{3}x^3. \]

For each $x$, the value of the new function is the area under the graph of the old one.
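The passage from the old function to the new one can be imitated numerically. The sketch below approximates the area under $y = kt^2$ by a midpoint sum of thin rectangles and compares it with $\frac{k}{3}x^3$. The function name `area_under`, the number of strips, and the value $k = 2$ are assumptions chosen for the illustration; the text itself obtains the area by the methods of Section 2.10, not by this sum.

```python
def area_under(f, a, x, n=10_000):
    """Midpoint-rule estimate of the area under f from a to x (assumes a <= x)."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

k = 2.0                          # sample value of the constant k
parabola = lambda t: k * t**2    # the "old" function f

def F(x):
    """The "new" function: area under the parabola from 0 to x."""
    return area_under(parabola, 0.0, x)

for x in (1.0, 3.0):
    print(x, F(x), k * x**3 / 3)  # numerical area next to the formula (k/3) x^3
```

With $k = 2$ the formula gives $F(1) = 2/3$ and $F(3) = 18$, and the numerical areas should agree with these to several decimal places.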

We can generalize this scheme in the following way.

[Figure: the graph of $y = f(t)$, with the region from $t = a$ to $t = x$ shaded.]

Given a continuous nonnegative function $f$, defined on an interval $[a, b]$. As before, we label the horizontal axis as the $t$-axis, because we want to use the symbol $x$ for another purpose. For each number $x$ on the interval $[a, b]$, let $R_x$ be the shaded region. Thus

\[ R_x = \{(t, y) \mid a \leq t \leq x,\ 0 \leq y \leq f(t)\}. \]

And let $F(x)$ be the area of the region $R_x$.

This is the scheme that we used for the parabola. For the parabola, we had $f(t) = kt^2$, and we used $a = 0$. But we can go through the same proceeding starting with any number $a$ and any continuous function $f$. Let us look at some more examples.
Consider

\[ f(t) = t + 1, \qquad t \geq 1. \]

This is a function

\[ f: [1, \infty) \to [2, \infty). \]

[Figure: the graph of $y = t + 1 = f(t)$, with the region from $a = 1$ to $x$ shaded; the upper right corner of the region is the point $(x, x + 1)$.]

For every $x$ on the infinite interval $[1, \infty)$, let $F(x)$ be the area under the graph, from 1 to $x$. We now have a new function

\[ F: [1, \infty) \to [0, \infty). \]

In this case, it is easy to write a formula for $F$. For $x = 1$, the area is 0. Therefore $F(1) = 0$. For $x > 1$, $F(x)$ is the area of a trapezoid lying on its side, with its "bases" vertical. The altitude is $h = x - 1$, and the lengths of the bases are $b_1 = 2$ and $b_2 = x + 1$. Therefore

\[ F(x) = \tfrac{1}{2}(b_1 + b_2)h = \tfrac{1}{2}(2 + x + 1)(x - 1) = \tfrac{1}{2}(x + 3)(x - 1) = \tfrac{1}{2}(x^2 + 2x - 3). \]

Let us now try the parabola $y = t^2$, taking $a = -2$:

\[ y = f(t) = t^2, \qquad t \geq -2. \]

Here we have

\[ f: [-2, \infty) \to [0, \infty). \]

For each $x \geq -2$, $F(x)$ is the area under this graph, from $-2$ to $x$. By our old formula,

\[ F(x) = \tfrac{1}{3}x^3 - \tfrac{1}{3}(-2)^3 = \tfrac{1}{3}x^3 + \tfrac{8}{3}. \]

In all these situations, $F(x)$ is determined by (a) the given function $f$, (b) the number $a$, and (c) the number $x$. All this is conveyed by the notation

\[ F(x) = \int_a^x f(t)\,dt. \]

That is, if $f$ is continuous and nonnegative, and $a \leq x$, then

\[ \int_a^x f(t)\,dt \]

is the area under the graph of $f$, from $t = a$ to $t = x$.

[Figure: the graph of $y = f(t)$, with the shaded region labeled $\int_a^x f(t)\,dt$.]

Here it should be understood that $f$ is continuous on an interval containing $a$ and $x$. The expression

\[ \int_a^x f(t)\,dt \]

is called the integral from $a$ to $x$ of the function $f$. The number $a$ is called the lower limit of integration (or, briefly, the lower limit) and $x$ is called the upper limit of integration (or simply the upper limit). The function $f$ is called the integrand. The notation for the integral may look formidable at first, but it is not hard to learn and is convenient: it shows at a glance that we are taking the integral, of a certain function, between certain limits.
We proceed to generalize these ideas in two ways.

a) Suppose that $f$ is negative, for some values of $t$. In this case, areas below the $t$-axis are counted negatively. For example, in the left-hand figure below, $A_1$ and $A_2$ are positive numbers, representing the areas of the two shaded regions. We count $A_1$ positively; it is the area of a region above the $t$-axis. We count $A_2$ negatively; it is the area of a region below the $t$-axis. Thus we have

\[ \int_a^x f(t)\,dt = A_1 - A_2. \]

Similarly, in the figure on the right, each region above the $t$-axis is counted positively and each region below it is counted negatively.

b) So far, we have required $a < x$. If $a > x$, we first find

\[ \int_x^a f(t)\,dt, \]

and then reverse the sign. Thus, in the figure on the left below, $\int_x^a f(t)\,dt$ is found under our old definition. And

\[ \int_a^x f(t)\,dt = -\int_x^a f(t)\,dt \]

under our new definition.

[Figures: on the left, the graph of $y = f(t)$ with $x < a$; on the right, the graph of $y = 2 - t$, passing through the point $(x, 2 - x)$.]

Consider, for example, $f(t) = 2 - t$, defined for every $t$ (in the right-hand figure above). Take $a = -1$. Then, for $-1 \leq x \leq 2$ we have

\[ \int_{-1}^x f(t)\,dt = \int_{-1}^x (2 - t)\,dt = \tfrac{1}{2}(x + 1)(3 + 2 - x) = \tfrac{1}{2}(x + 1)(5 - x), \]

by the formula for the area of a trapezoid. For $x \geq 2$ we have

\[ \int_{-1}^x (2 - t)\,dt = \tfrac{9}{2} - \tfrac{1}{2}(x - 2)(x - 2). \]

For $x \leq -1$ we have

\[ \int_{-1}^x (2 - t)\,dt = -\int_x^{-1} (2 - t)\,dt = -\left[\tfrac{1}{2}(-1 - x)(2 - x + 3)\right] = \tfrac{1}{2}(x + 1)(5 - x). \]

You should check that these are the right answers for the three cases. In each case, we have computed areas of triangles and trapezoids by elementary area formulas, and then attached the correct sign to the area of each region.
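The three cases can be checked together. Expanding $\tfrac{9}{2} - \tfrac{1}{2}(x - 2)^2$ gives $\tfrac{1}{2}(x + 1)(5 - x)$, so a single closed form covers every $x$. The sketch below compares that closed form with a signed midpoint sum; the function name `integral`, the number of strips, and the sample values of $x$ are assumptions made for the illustration. A negative step width is used when $x < a$, which is one way of encoding the "reverse the sign" convention.

```python
def integral(f, a, x, n=20_000):
    """Signed midpoint-rule estimate of the integral of f from a to x.

    When x < a the step width is negative, so the sum automatically
    carries the reversed sign described in part (b)."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

f = lambda t: 2 - t
closed_form = lambda x: 0.5 * (x + 1) * (5 - x)

# One sample from each case: x <= -1, -1 <= x <= 2 (twice), and x >= 2.
for x in (-3.0, 0.0, 2.0, 4.0):
    print(x, integral(f, -1.0, x), closed_form(x))
```

For $x = 4$ the area below the axis (from 2 to 4) is subtracted from the area above it, and for $x = -3$ the whole result is negative because the limits are reversed.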

PROBLEM SET 3.7


1. a) Consider $f(t) = |t|$. Get a formula for

\[ \int_0^x |t|\,dt, \]

valid when $x \geq 0$. Sketch.

b) Now get a formula for the same integral, valid when $x \leq 0$. Sketch.

c) Now get one formula for the same integral, valid for every $x$.

d) Let

\[ F(x) = \int_0^x |t|\,dt. \]

Get a formula for $F'(x)$, valid when $x > 0$.

e) Now get a formula for $F'(x)$, valid when $x < 0$.

f) Finally, get a formula for $F'(x)$, valid for every $x$.

2. Do the same six things for

\[ \int_0^x (t^2 + 1)\,dt. \]

3.

Consider the function defined by the graph below.


y

This function is called the signum.

Algebraically,
when

< 0,

when

sigt

when

0,

> 0.

Obviously sig is not continuous at 0. But we define

f'

sig t dt

in the same way as for continuous functions. For example,

i1

sig t dt

I;

here the integral is the area of a square of edge I. Similarly,

1-i
13

sigtdt

sigtdt

3,

!)

1,

x.

3.8

f1 sig t dt
J_l

-1 + 1

O;

and so on.
sig t that you did forf (t)
Do the same six things for f (t)
Do the same six things, for f(t)
t It!.
a) Explain why
=

4.
5.


It/ and f (f)

t2

+ 1.

( t3 dt
J_l
1

0.

b) Explain why

(1 t273 dt
J_l

0.

c) Let f be a cubic polynomial, and suppose that


Show that

6.

I (-a)

Explain why
0

7.

Explain why
0

3.8

<

I (0)

fa f(t) cit
J3 -dt
<

I (a)
=

0.

o.

14
< 12

3 1---+t2 dt
J
1

<

.
3

5.

3.8 THE DERIVATIVE OF THE INTEGRAL

In Problems 1 and 2 above, you found that if

\[ F(x) = \int_0^x f(t)\,dt, \]

then

\[ F'(x) = f(x) \]

for every $x$. That is, at each point the derivative of the integral function $F$ is simply the value of the integrand function $f$.

In fact, it is not hard to convince ourselves that if $f$ is a continuous function, this is what always happens.

Consider first the case in which $f$ is positive.

[Figure: the graphs of $y = f(t)$ and $y = F(x)$.]

Take a fixed $x_0$. By definition,

\[ F'(x_0) = \lim_{x \to x_0} \frac{F(x) - F(x_0)}{x - x_0}. \]

Now

\[ F(x) = \int_a^x f(t)\,dt \qquad \text{and} \qquad F(x_0) = \int_a^{x_0} f(t)\,dt. \]

Therefore

\[ F(x) - F(x_0) = \int_a^x f(t)\,dt - \int_a^{x_0} f(t)\,dt. \]

For $x_0 < x$, as in the figure below,

\[ F(x) - F(x_0) = \int_{x_0}^x f(t)\,dt. \]

Since $f$ is continuous,

\[ F(x) - F(x_0) \approx (x - x_0)f(x_0). \]

Here "$\approx$" means "is approximately equal to." We are claiming that the area under the curve, from $x_0$ to $x$, is closely approximated by the area of a rectangle with base $x - x_0$ and altitude $f(x_0)$. Therefore

\[ \frac{F(x) - F(x_0)}{x - x_0} \approx f(x_0), \]

and the approximation gets better as $x$ gets closer to $x_0$.

[Figure: the graph of $y = f(t)$ between $x_0$ and $x$, with the approximating rectangle.]

For the situation shown in the figure, in which the graph rises to the right of $x_0$, it is easy to see why the approximation is good. Here we have

\[ F(x) - F(x_0) = (x - x_0)f(x_0) + E, \]

where $E$ is the area of the little curvilinear triangle at the upper right. Now

\[ E < e(x - x_0) \qquad \text{and} \qquad \frac{E}{x - x_0} < e, \]

where $e$ is the altitude of the curvilinear triangle. Thus, when we write

\[ F(x) - F(x_0) \approx (x - x_0)f(x_0), \]

the error in the approximation is $E$; and when we write

\[ \frac{F(x) - F(x_0)}{x - x_0} \approx f(x_0), \]

the error in the approximation is $E/(x - x_0)$, which is less than $e$.


If $x < x_0$, the same approximation formula holds, although the reasons are slightly different. Here the area under the curve from $x$ to $x_0$ is

\[ F(x_0) - F(x); \]

the area of the rectangle is

\[ (x_0 - x)f(x_0); \]

these are approximately equal. Changing the sign of each, we get

\[ F(x) - F(x_0) \approx (x - x_0)f(x_0), \]

and

\[ \frac{F(x) - F(x_0)}{x - x_0} \approx f(x_0), \]

as before.

But the fraction on the left is the slope of the secant line to the graph of $F$. The limit of this fraction is $F'(x_0)$; and since the fraction is close to $f(x_0)$ when $x$ is close to $x_0$, we ought to have

\[ F'(x_0) = f(x_0). \]

If this is true, then we have:

Theorem 1. If $f$ is continuous on an interval containing $a$, then

\[ D_x \int_a^x f(t)\,dt = f(x) \]

at each point $x$ of the interval.

In fact this is true, and can be proved by a more careful use of the ideas that we have just been describing informally. But let us postpone the proof until the end of this chapter, and see, in the meantime, what the theorem is good for.
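The statement of Theorem 1 can be tried out numerically before it is proved. The sketch below builds $F(x) = \int_a^x f(t)\,dt$ from a midpoint sum, estimates $F'(x_0)$ by a difference quotient, and compares the result with $f(x_0)$. Everything here (the helper names, the step sizes, the sample integrands and points) is an assumption chosen for the illustration; it is evidence for the theorem, not a proof.

```python
import math

def integral(f, a, x, n=2_000):
    """Signed midpoint-rule estimate of the integral of f from a to x."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

def F_prime(f, a, x, h=1e-3):
    """Difference-quotient estimate of the derivative of
    F(x) = integral of f from a to x."""
    return (integral(f, a, x + h) - integral(f, a, x - h)) / (2 * h)

# (label, integrand f, lower limit a, point x0)
samples = [
    ("t^4", lambda t: t**4, 0.0, 1.2),
    ("sin t", math.sin, -1.0, 0.7),
    ("|t|", abs, 0.0, -0.5),
]

for label, f, a, x0 in samples:
    print(f"{label:6s} F'(x0) ~ {F_prime(f, a, x0):.5f}   f(x0) = {f(x0):.5f}")
```

The third sample takes $x_0 < a$, so it also exercises the sign convention for reversed limits of integration.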

Consider the following problem:

Problem 1. Calculate the area under the graph of $y = x^4$, from $x = 0$ to $x = 1$.

To solve this problem, the first step is to realize that whoever proposed the problem has asked the wrong question: the answer to his question is a number, and there is nothing about this number that is easy to see.

[Figures: the graphs of $y = x^4$ and of $y = f(t) = t^4$.]

The easiest way to solve Problem 1 is to consider instead the following:

Problem 2. Find a formula for the function

\[ F(x) = \int_0^x t^4\,dt. \]

It might seem that Problem 2 must be harder, but this is not true. The point is that, while information about the number $F(1)$ is hard to come by, Theorem 1 tells us something about the function $F$, namely,

\[ F'(x) = x^4. \]

We now ask ourselves what sort of function has $x^4$ as its derivative. We get powers of $x$ by differentiating powers of $x$, using the formula

\[ Dx^n = nx^{n-1}. \]

Thus

\[ Dx^5 = 5x^4. \]

This is like $F'(x)$, except for the factor 5. But the 5 is easy to get rid of: we divide $x^5$ by 5, getting

\[ D(\tfrac{1}{5}x^5) = \tfrac{1}{5} \cdot 5x^4 = x^4. \]

Thus we have found a function

\[ G(x) = \tfrac{1}{5}x^5, \]

which resembles the given function

\[ F(x) = \int_0^x t^4\,dt. \]

To be exact:

\[ G'(x) = F'(x) \quad \text{for every } x, \qquad (1) \]
\[ G(0) = F(0). \qquad (2) \]

To see why (2) holds, we observe that

\[ G(0) = \tfrac{1}{5} \cdot 0^5 = 0 \]

and that

\[ F(0) = \int_0^0 t^4\,dt = 0. \]

Equations (1) and (2) ought to guarantee that

\[ G(x) = F(x) \quad \text{for every } x; \]

that is, $G = F$. The functions $F$ and $G$ start with the same value, at $x = 0$. And (1) tells us that $F$ and $G$ always change at the same rate. This suggests the following:
Theorem 2. (The uniqueness theorem.) Let $F$ and $G$ be differentiable functions, defined on the same interval $I$, and let $a$ be a point of $I$. If

\[ F(a) = G(a) \qquad (3) \]

and

\[ F'(x) = G'(x) \quad \text{for every } x \text{ in } I, \qquad (4) \]

then

\[ F(x) = G(x) \quad \text{for every } x \text{ in } I. \qquad (5) \]

Here we call the interval $I$ because we want to allow intervals of all kinds, including $[a, b]$, $[a, b)$, $[a, \infty)$, $(-\infty, a]$, and so on. We also allow the case $I = (-\infty, \infty)$. This is the case for the functions

\[ F(x) = \int_0^x t^4\,dt, \qquad G(x) = \tfrac{1}{5}x^5 \]

that we have been discussing in the last few pages.

The uniqueness theorem is a consequence of the mean-value theorem (MVT) of Section 3.2. The proof is as follows.

Suppose that $F(b) \neq G(b)$ for some $b$ on the interval $I$. For each $x$ on $[a, b]$, let

\[ H(x) = F(x) - G(x). \]

Then $H(a) = 0$, but $H(b) \neq 0$.

The slope of the chord joining the endpoints of the graph of $H$ is

\[ \frac{H(b) - H(a)}{b - a} \neq 0. \]

By MVT there is an $x$ between $a$ and $b$ such that

\[ H'(x) = \frac{H(b) - H(a)}{b - a}. \]

Therefore $H'(x) \neq 0$. But this is impossible: for every $x$ we have

\[ H'(x) = F'(x) - G'(x) = 0. \]

Theorems 1 and 2, in combination, enable us to solve some difficult area problems.

Example 1. Find the area of the region above the $x$-axis and below the graph of $f(x) = 1 - x^2$.

Obviously this area is

\[ \int_{-1}^1 (1 - t^2)\,dt. \]

Let

\[ F(x) = \int_{-1}^x (1 - t^2)\,dt. \]

Then

\[ F'(x) = 1 - x^2, \]

by Theorem 1. It is easy to find another function with this derivative, namely,

\[ G(x) = x - \tfrac{1}{3}x^3 \quad (?). \]

Now

\[ F(-1) = 0, \]

and

\[ G(-1) = -1 + \tfrac{1}{3} = -\tfrac{2}{3}. \]

But this is easy to fix: we change our minds and write

\[ G(x) = x - \tfrac{1}{3}x^3 + \tfrac{2}{3}. \]

Then

\[ G(-1) = 0, \]

as it should be; and by the uniqueness theorem it follows that

\[ G(x) = F(x) \]

for every $x$. Therefore

\[ \int_{-1}^1 (1 - t^2)\,dt = F(1) = G(1) = 1 - \tfrac{1}{3} + \tfrac{2}{3} = \tfrac{4}{3}. \]

Example 2. Find the area under the graph of $y = x^2 + x + 2$, from $x = -1$ to $x = 2$.

Here the area is

\[ \int_{-1}^2 (t^2 + t + 2)\,dt. \]

Let

\[ F(x) = \int_{-1}^x (t^2 + t + 2)\,dt. \]

Then

\[ F'(x) = x^2 + x + 2. \]

As our first guess, let

\[ G(x) = \tfrac{1}{3}x^3 + \tfrac{1}{2}x^2 + 2x, \]

so that

\[ G'(x) = F'(x). \]

We find that

\[ G(-1) = -\tfrac{1}{3} + \tfrac{1}{2} - 2 = -\tfrac{11}{6}. \]

Therefore we really want

\[ G(x) = \tfrac{1}{3}x^3 + \tfrac{1}{2}x^2 + 2x + \tfrac{11}{6}, \]

so that $G(-1) = 0$. This gives the answer in the form

\[ \int_{-1}^2 (t^2 + t + 2)\,dt = F(2) = G(2) = \tfrac{1}{3} \cdot 8 + \tfrac{1}{2} \cdot 4 + 2 \cdot 2 + \tfrac{11}{6} = \tfrac{21}{2}. \]

The same scheme can be used to calculate integrals in which the integrand is negative and hence does not represent an area. Given

\[ F(x) = \int_a^x f(t)\,dt, \]

we first find (if we can) another function $G$ such that

\[ G'(x) = F'(x) = f(x). \]

We then arrange for $G(a)$ to be 0, by adding a suitable constant to the first $G$ that we tried. We then know that $G(x) = F(x)$ for every $x$. Therefore

\[ \int_a^b f(t)\,dt = F(b) = G(b), \]

in the same way as for positive functions, and for the same reasons.
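For polynomial integrands the scheme just described is mechanical enough to write out as a short program. The sketch below is an illustration with assumed names: it forms a first guess $G$ by raising each power and dividing by the new exponent, shifts the constant term so that $G(a) = 0$, and returns $G(b)$. Coefficient lists are indexed by the power of $t$, and exact fractions are used so that the results can be compared directly with Examples 1 and 2.

```python
from fractions import Fraction

def antiderivative(coeffs):
    """First guess for G: c * t^k becomes c/(k+1) * t^(k+1), constant term 0."""
    return [Fraction(0)] + [Fraction(c, k + 1) for k, c in enumerate(coeffs)]

def evaluate(coeffs, x):
    """Value at x of the polynomial with the given coefficient list."""
    return sum(c * Fraction(x) ** k for k, c in enumerate(coeffs))

def integrate(coeffs, a, b):
    """Integral from a to b of a polynomial with integer coefficients."""
    G = antiderivative(coeffs)
    G[0] = -evaluate(G, a)  # add a constant so that G(a) = 0
    return evaluate(G, b)   # equals F(b), by the uniqueness theorem

print(integrate([1, 0, -1], -1, 1))  # integrand 1 - t^2       -> 4/3
print(integrate([2, 1, 1], -1, 2))   # integrand t^2 + t + 2   -> 21/2
```

The line that resets `G[0]` is the "change our minds" step of the examples: it replaces the first guess by the one function with the right derivative and the right value at $a$.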
PROBLEM SET 3.8
1. By the methods of this section, find the area under the graph of $y = x - x^3$, from $x = 0$ to $x = 1$, and sketch. (Here, and hereafter in this problem set, you should explain what functions you are using as the functions $f$, $F$, and $G$.)

2. Find the area of the region lying below the $x$-axis and above the graph of $y = x^4 - 1$. Note that here the function $f$ is negative, so that the area and the integral are different; the area $A$ is positive, and

\[ \int_{-1}^1 f(t)\,dt = -A. \]

3. a) Find the area under the graph of $y = x^{10}$, from 0 to $b$.

b) Find the area under the graph of $y = x^{10}$, from $a$ to $b$. (Query: Do you need to give separate discussions for the cases $0 < a < b$, $a < 0 < b$, and so on?)

c) Same as Problem 3(b), for the graph of $y = x^{100}$.

d) Now find a general formula for

valid for every positive integer


4.

Find
a)

fu2

2t

5)dt

b)

f'

tndt,

11.

cx2 + 2x + 5)dx

c)

(z2

+ 2z +

5)dz


d) A general formula for

F(x) J:(t2 + 2t + 5) dt.


=

5.

a) Find

(t3 + t

1) dt.

b) Get a general formula for

F(x) f + t - 1) dt.
(t3

6.

Get a general formula for

f'(t5 - 2t3 + 1) dt.


7. a) Find the area under the graph of

x
v' x2 + 1 '

y=

1 2.
x/ v'x2 + 1,

from

to

Unless you happen to remember a function whose derivative is


you are going to have to figure out how this function might arise as

the answer to a differentiation problem.

The radical in the denominator suggests

that somebody has been using the formula

Dv'fb) Find

f'
2v'J'
-

Jl t2t + 1 dt.
__

-1

Then sketch the graph of

v'

__

y = /(!)

v'

t2 + 1 '

as well as you can, and explain how the numerical value that you got for the integral
could have been predicted, without any calculations at all.
8. Let

F(x) f(1 +
=

Express

F'(x)

formula.

not

being asked to express

Same as Problem 8, for

F(x) r(1 + v'()500 dt.


=

10.

dt.

by an elementary formula (that is, by a formula not involving integrals

or differentiations). Note that you are


9.

v'()0

Same as Problem 8, for

F(x)

by an elementary


11. Find the area under the graph of y =


12. Same, for the function y =
13.


1
_

1 __

x + 1

(and above the x-axis) from 0 to 1.

.
vz -x

--

Find

[Hint: This problem is easier if you forget the binomial theorem.]


14.

Find

15.

Find

16.

For $n \neq 0$, we have $Dx^n = nx^{n-1}$.

x2

(1

dx.
xa)2

Since $n \neq 0$, the function

\[ f(x) = x^{-1} = \frac{1}{x} \]

never appears as the derivative of a power of $x$. If we allowed $n = 0$, then $x^n = x^0 = 1$ for $x > 0$; the derivative is 0; and $1/x$ still does not appear. Thus $f(x) = 1/x$ is not the derivative of any integral power of $x$.

Question: Is there any function at all which has $f(x) = 1/x$ as its derivative, say, for $x > 0$? If so, what function?

* 1 7.

Consider

If you attempt to evaluate this integral by applying the methods of this section in a
mechanical sort of way, you will get an "answer." If you try to interpret your answer
geometrically, you will see that your answer cannot possibly be right. What went wrong?
(Evidently we must have been trying to apply a theorem in a case in which its hypothesis
is not satisfied. The question is what theorem and what hypothesis.)
*18. In Theorem 1, suppose that we had omitted the hypothesis that $f$ is continuous. Give an example to show that the resulting theorem would not have been true. [Hint: You have already seen cases in which a function of the type

\[ F(x) = \int_a^x f(t)\,dt \]

fails to have a derivative at some point $x_0$; and surely we cannot have

\[ F'(x_0) = f(x_0) \]

if there is no such thing as $F'(x_0)$.]

3.9 UNIFORMLY ACCELERATED MOTION

Suppose that a particle is moving, according to some given law, along a line. If we think of the line of motion as the $y$-axis, then the motion can be described by a function

\[ f: I \to R; \]

for each time $t$ on the interval, $f(t)$ is the $y$-coordinate of the moving particle at time $t$. Thus, for example, in the figure below, the total time interval $I$ is the closed interval $[t_1, t_4]$. The figure tells us that, at the start of the motion, the time is $t_1$ and the particle is at the point $y = 1$; in the time interval $[t_1, t_2]$, the particle rises from 1 to 3; in the time interval $[t_2, t_3]$, the particle falls from 3 to $-1$; and in the time interval $[t_3, t_4]$, the particle rises from $-1$ to 4.
[Figure: the graph of $y = f(t)$ on the time interval $[t_1, t_4]$.]

The figure shows a finite time interval $I = [t_1, t_4]$. More generally, the function $f$ may be defined on an infinite time interval $I = [t_1, \infty)$ or $I = R = (-\infty, \infty)$. But most of the time, on or near the earth, the motion begins at some time $t_0$, and eventually the motion stops. The velocity is the function

\[ v = f': I \to R, \]

provided that $f$ is differentiable. The acceleration is the function

\[ a = v': I \to R, \]

provided that $v$ is differentiable. Thus the acceleration is the derivative of the derivative of $f$. We call this the second derivative of $f$, and denote it by $f''$. Thus we can sum up:

\[ v = f', \qquad a = v' = f'' \quad \text{(by definition)}. \]

Finally, there is a fourth function associated with the motion. This is the function

\[ F: I \to R \]

which gives, for each time $t$, the force $F(t)$ acting on the body at time $t$.

We shall now see what form these functions take when $f$ describes the motion of a freely falling body. Before we can work mathematically on the problem, we have to state our physical assumptions in mathematical form.

1) Newton's second law asserts that acceleration is proportional to force divided by mass; that is,

\[ a(t) = k_1 \frac{F(t)}{m} \qquad (k_1 = \text{const}). \]

2) For a freely falling body (or a body projected vertically upward), the force is the resultant of the weight (which acts downward) and the air resistance (which acts upward when the body is falling and downward when the body is rising). If the speed is moderate, then the air resistance can be neglected. Hereafter, we shall assume that the weight is the only force, so that

\[ F(t) = W(t) < 0 \quad \text{for every } t \text{ on } [c, d], \]

where $W(t)$ is the weight at time $t$. $W(t) < 0$ because weight pulls things downward.

3) Evidently, the weight will not change merely with the passage of time, but it will depend on the altitude; the greater the altitude, the less the force of gravitation. But if the altitude is not very great, then the weight will be very nearly constant. We shall assume hereafter that the weight of a freely falling body is constant. Therefore

\[ F(t) = W(t) = k_2 < 0. \]

Therefore

\[ a(t) = k_1 \frac{k_2}{m} < 0 \quad \text{for every } t; \]

and, writing $k_3 = k_1 k_2$,

\[ a(t) = \frac{k_3}{m} < 0. \]

This last equation says that for each falling body there is a constant which is equal to the acceleration, independently of the time.

4) There remains, however, a question: is there one constant which works for all falling bodies, or does the constant acceleration depend on what sort of body is falling? Conceivably, the law governing the free fall of heavy bodies (such as cannon balls) might be different from the law governing the fall of light bodies (such as BB shots). In fact, until the time of Galileo, everybody thought that heavy bodies fell faster. The story goes that Galileo proved them wrong by dropping two iron balls of different sizes off the leaning tower of Pisa: they hit the ground at the same time. Since $k_3/m$ is independent of $m$, there is a constant $-g = k_3/m$ which gives the acceleration of every freely falling body, regardless of its mass. The number

\[ g = -\frac{k_3}{m} \]

is called the acceleration of gravity. If distance is measured in feet and time in seconds, then numerically

\[ g \approx 32, \]

measured in ft/sec$^2$.


The above discussion can be summed up as follows:

If we neglect air resistance and neglect the variation of weight with altitude, then the acceleration of a freely falling body is given by the formula

$a(t) = -g$,

where $g$ is a constant and $g \approx 32$ ft/sec$^2$.

We now consider the problem of finding the functions that satisfy the equation

$a(t) = f''(t) = -g$.

Problem. The function

$f : R \to R$

has the following properties:

a) $f''(t) = -g$ for every $t$,
b) $f'(0)$ is a given number $v_0$,
c) $f(0)$ is a given number $y_0$.

What is f?

Using the notation $v$ for $f'$ and $a$ for $v' = f''$, we write these conditions in the form:

a) $a(t) = -g$ for every $t$,
b) $v(0) = v_0$,
c) $f(0) = y_0$.

Thus our data consist of (a) the constant acceleration $-g$, (b) the initial velocity $v_0 = v(0)$, and (c) the initial position $y_0 = f(0)$. The solution is as follows:

a)

We know that

$v'(t) = a(t) = -g$

for every $t$. The function

$u(t) = -gt$ (?)

has $-g$ as its derivative; the only trouble is that $u(0)$ is 0 instead of $v_0$. But this is easy to fix: we change our minds and let

$u(t) = -gt + v_0$.

Our function $u$ then has the same derivative as $v$, and has the same value at $t = 0$. By the uniqueness theorem, $u$ and $v$ are the same function, and so

$v(t) = -gt + v_0$.

b) We know now that

$f'(t) = v(t) = -gt + v_0$.


We want to find f. Now the function

$z(t) = -\dfrac{g}{2} t^2 + v_0 t$ (?)

has $-gt + v_0$ as its derivative; the only trouble is that $z(0)$ is 0 instead of $y_0$. But this is easy to fix: we change our minds and let

$z(t) = -\dfrac{g}{2} t^2 + v_0 t + y_0$.

The function $z$ then has the same derivative as $f$, and has the same value at $t = 0$. By the uniqueness theorem, $f$ and $z$ are the same function, and so

$f(t) = -\dfrac{g}{2} t^2 + v_0 t + y_0$.

This completes the solution. We sum up in the following theorem:


Theorem 1. Let $f$ be a function $R \to R$. If

a) $f''(t) = -g$ for every $t$,
b) $f'(0) = v_0$, and
c) $f(0) = y_0$,

then

d) $f(t) = (-g/2)t^2 + v_0 t + y_0$ for every $t$.


Thus the mathematical problem defined by (a), (b), and (c) has only one solution. This fact is important in applications, because, if our mathematical problem had two solutions, we would have to find out which of the two solutions applied to the physical situation that we started out to investigate. But, if $f''(t) = -g$, $f'(0) = v_0 = 10$, and $f(0) = y_0 = 5$, then $f$ must be the function $f(t) = (-g/2)t^2 + 10t + 5$.
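As a quick numerical check of Theorem 1, the following sketch evaluates $f(t) = (-g/2)t^2 + v_0 t + y_0$ with $g = 32$, $v_0 = 10$, $y_0 = 5$ and confirms by difference quotients that conditions (a), (b), and (c) hold, up to rounding. The helper names `position`, `first_derivative`, and `second_derivative` are illustrative assumptions, not part of the text, and a numerical check of this kind is no substitute for the uniqueness argument.

```python
G = 32.0  # acceleration of gravity in ft/sec^2 (the approximate value used in the text)

def position(t, v0, y0, g=G):
    """Position given by Theorem 1: f(t) = (-g/2) t^2 + v0 t + y0."""
    return -0.5 * g * t * t + v0 * t + y0

def first_derivative(f, t, h=1e-4):
    """Central difference quotient approximating f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def second_derivative(f, t, h=1e-3):
    """Second central difference approximating f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

def f(t):
    # The particular solution with v0 = 10 and y0 = 5.
    return position(t, v0=10.0, y0=5.0)

print(f(0.0))                     # condition (c): initial position y0
print(first_derivative(f, 0.0))   # condition (b): approximately v0
print(second_derivative(f, 1.5))  # condition (a): approximately -g
```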

Theorem 1 can be stated in a more general form. If $I$ is any time interval whatever (finite or infinite) and $t_0$ is any point of $I$, then we can consider a function

$f : I \to R$,

such that

a) $f''(t) = -g$ for every $t$,
b) $f'(t_0) = v_0$,
c) $f(t_0) = y_0$.

The uniqueness theorem applies to our problem in exactly the same way as before, and the algebra is only slightly more complicated. We are given

$v'(t) = -g$.

We try

$u(t) = -gt$ (?);

we observe that $u(t_0) = -gt_0$ instead of $v_0$; to fix this, we let

$u(t) = -gt + gt_0 + v_0$.

Now $u$ has the same derivative as the unknown function $v$, and has the same value at $t = t_0$. By the uniqueness theorem, $u(t) = v(t)$ for every $t$, and so

$v(t) = -gt + gt_0 + v_0$.

This solves half of our problem.


We know that $f'(t) = v(t)$. We therefore try

$z(t) = -\dfrac{g}{2} t^2 + g t_0 t + v_0 t$ (?);

we observe that

$z(t_0) = \dfrac{g}{2} t_0^2 + v_0 t_0$

instead of $y_0$; and we fix this by letting

$z(t) = -\dfrac{g}{2} t^2 + g t_0 t + v_0 t - \dfrac{g}{2} t_0^2 - v_0 t_0 + y_0 = -\dfrac{g}{2}(t - t_0)^2 + v_0 (t - t_0) + y_0$.

Then $z$ has the same derivative as $f$, and has the same value at $t_0$. By the uniqueness theorem it follows that $z(t) = f(t)$ for every $t$. Therefore

$f(t) = -\dfrac{g}{2}(t - t_0)^2 + v_0 (t - t_0) + y_0$.

None of these formulas should be learned. What you need to learn is the process by
which they were derived; if you remember the method, you can use it. For example:
Problem. Given $f''(t) = 3$, $f'(3) = 1$, and $f(3) = 2$, what is $f$?

Solution. Let $v = f'$. Then $v'(t) = 3$. This suggests that $v(t) = 3t$. Adding the appropriate constant, to get $v(3) = 1$, we obtain

$v(t) = 3t - 8$.

Now

$f'(t) = 3t - 8$.

This suggests $f(t) = \tfrac{3}{2} t^2 - 8t$. Adding the appropriate constant, to get $f(3) = 2$, we have

$f(t) = \tfrac{3}{2} t^2 - 8t - \tfrac{3}{2} \cdot 3^2 + 8 \cdot 3 + 2 = \tfrac{3}{2} t^2 - 8t + \tfrac{25}{2}$.

This is the answer. (Two differentiations verify that it is an answer; and two applications of the uniqueness theorem tell us that it is the only answer.)
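The two-step method used in this solution can be sketched in code. The function name `solve_constant_second_derivative` and its parameters are hypothetical, chosen only for this illustration; exact fractions are used so that the constants come out without rounding.

```python
from fractions import Fraction

def solve_constant_second_derivative(c, t0, v_at_t0, f_at_t0):
    """Return (A, B, C) with f(t) = A t^2 + B t + C, given that
    f''(t) = c for every t, f'(t0) = v_at_t0, and f(t0) = f_at_t0."""
    c, t0 = Fraction(c), Fraction(t0)
    # Step 1: v(t) = c t + k, with k chosen so that v(t0) = v_at_t0.
    k = Fraction(v_at_t0) - c * t0
    # Step 2: f(t) = (c/2) t^2 + k t + m, with m chosen so that f(t0) = f_at_t0.
    A, B = c / 2, k
    m = Fraction(f_at_t0) - (A * t0 * t0 + B * t0)
    return A, B, m

# The worked problem: f''(t) = 3, f'(3) = 1, f(3) = 2.
A, B, C = solve_constant_second_derivative(3, 3, 1, 2)
print(A, B, C)  # coefficients of t^2 and t, and the constant term
```

The code only mirrors the guess-and-adjust procedure; that the result is the only solution still rests on the uniqueness theorem.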
PROBLEM SET 3.9
Find formulas for the unknown functions, under each of the following sets of conditions.
In all but one of these problems, the conditions are enough to determine the function.

In three cases, however, there are infinitely many possibilities; and in these cases you should
try to explain what the possible functions are.

1. /'(t)
3. f"(t)

3t + 4,/(0) 4
-1,/'(0) 2,/(0)
=

2. f'(x)
4. f"(x)

x3 - 7x + 5,/(0)
3x2,f' (1) O,f(1)

-1


5. J"(t)

7. g (x) =

9.

I (t) =

3.10
1
,
6. f (x) = 2 ,/(1)

3
t ,f'(O) = 1,/(1) = 0
_

x
1

,g(O)
x2

t ,/(2) =

8. g'(x) = x(x2 + 1)2, g(3) = 1

-1

10. f'(t) = t2(1 + t3)10,f(O) = 2 (By all means, do not use the binomial theorem on this
one.)

11. j'(t) = t2 + 1,/(1)


13. f'(t) =
15.

t2

(I

+ 3

t )2

,/(1)

12. f"(x) = x,/(1) = O,f'(I)

2
=

14. g"(t) =

+ l 3,g(O) = l,g(I)= 1.
(t
)

A "theoretical projectile" is fired vertically upward, from the surface of the earth, at
time 0, with initial velocity 10 ft/sec. When will it hit the ground again? For what time
interval is its motion described by the condition a(t) = -g? (Following the advice given at the end of this section, you should solve this problem with your book closed, using the methods but not the results given in the text.)

16. A "theoretical projectile" is fired vertically upward, from the surface of the earth, and
hits the ground again ten seconds later. What was the initial velocity?

17. A "theoretical projectile" is fired vertically downward from the top of a 200-foot
building and hits the ground 2 seconds later. What was the initial velocity?

18. We state this problem in a nonmilitary form. A billiard ball is raised to a certain height
y0 and simply dropped, so that it begins its free fall at velocity v0 = 0. Five seconds
later it hits the ground. What was y0?
19. Free fall near the surface of the moon works the same way as free fall near the surface
of earth, except that the constant acceleration -gL (L for lunar) is different; the smaller
mass of the moon makes the difference. Suppose you went to the moon, dropped a
billiard ball as in Problem 18, and found that it dropped 3 feet in one second. What
could you conclude about gL?
*3.10
PROOF OF THE FORMULA
FOR THE DERIVATIVE OF THE INTEGRAL

We shall now prove Theorem 1 of Section 3.8. We have a continuous function $f$; we let

$F(x) = \int_a^x f(t)\,dt$;

we take a point $x_0$; and we want to show that

$F'(x_0) = f(x_0)$.

Let $\varepsilon$ be any positive number. Since $f$ is continuous, we know that the graph of $f$ has an $\varepsilon\delta$-box at the point $(x_0, f(x_0))$.


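[Figure: the $\varepsilon\delta$-box for $f$ at the point $(x_0, f(x_0))$.]

Thus

$|x - x_0| < \delta \;\Rightarrow\; |f(x) - f(x_0)| < \varepsilon \;\Leftrightarrow\; f(x_0) - \varepsilon < f(x) < f(x_0) + \varepsilon$.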

We are going to use these inequalities to get information about the function

$m(x) = \dfrac{1}{x - x_0}\,[F(x) - F(x_0)]$.

Here $m$ is the slope function for the function $F$, so that $\lim_{x \to x_0} m(x) = F'(x_0)$. Evidently

$F(x) - F(x_0) = \int_a^x f(t)\,dt - \int_a^{x_0} f(t)\,dt = \int_{x_0}^x f(t)\,dt$,

and so

$m(x) = \dfrac{1}{x - x_0}\,[F(x) - F(x_0)] = \dfrac{1}{x - x_0} \int_{x_0}^x f(t)\,dt$.

If $f$ is positive and $x_0 < x$, as in the figure, then $F(x) - F(x_0)$ is the area of the shaded region.
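Case 1. Suppose that $x_0 < x < x_0 + \delta$, as in the figure. Then

$f(x_0) - \varepsilon < f(t) < f(x_0) + \varepsilon \quad (x_0 \le t \le x)$.

Therefore

$\int_{x_0}^x [f(x_0) - \varepsilon]\,dt < \int_{x_0}^x f(t)\,dt < \int_{x_0}^x [f(x_0) + \varepsilon]\,dt$,

and so

$[f(x_0) - \varepsilon](x - x_0) < \int_{x_0}^x f(t)\,dt < [f(x_0) + \varepsilon](x - x_0)$.

Dividing by the positive number $x - x_0$, we get

$f(x_0) - \varepsilon < m(x) < f(x_0) + \varepsilon$.

Thus we have shown that

$x_0 < x < x_0 + \delta \;\Rightarrow\; |f(x_0) - m(x)| < \varepsilon$.   (1)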

Case 2. Suppose that $x_0 - \delta < x < x_0$. Then

$f(x_0) - \varepsilon < f(t) < f(x_0) + \varepsilon \quad (x \le t \le x_0)$,

just as in Case 1. Therefore

$\int_x^{x_0} [f(x_0) - \varepsilon]\,dt < \int_x^{x_0} f(t)\,dt < \int_x^{x_0} [f(x_0) + \varepsilon]\,dt$.

(We are integrating from left to right.) Therefore

$[f(x_0) - \varepsilon](x_0 - x) < \int_x^{x_0} f(t)\,dt < [f(x_0) + \varepsilon](x_0 - x)$.

Since $x_0 - x > 0$ in Case 2, we can divide by $x_0 - x$, preserving these inequalities. This gives

$f(x_0) - \varepsilon < \dfrac{1}{x_0 - x} \int_x^{x_0} f(t)\,dt < f(x_0) + \varepsilon$.

When we interchange $x$ and $x_0$, this changes the sign of each of the factors in the middle of this expression. Therefore

$f(x_0) - \varepsilon < \dfrac{1}{x - x_0} \int_{x_0}^x f(t)\,dt < f(x_0) + \varepsilon$.

To sum up:

$x_0 - \delta < x < x_0 \;\Rightarrow\; f(x_0) - \varepsilon < m(x) < f(x_0) + \varepsilon \;\Rightarrow\; |m(x) - f(x_0)| < \varepsilon$,   (2)

exactly as in Case 1.

Fitting together our results in Cases 1 and 2, we get

$0 < |x - x_0| < \delta \;\Rightarrow\; |m(x) - f(x_0)| < \varepsilon$.

Therefore

$\lim_{x \to x_0} m(x) = \lim_{x \to x_0} \dfrac{1}{x - x_0}\,[F(x) - F(x_0)] = f(x_0)$,

which was to be proved.


This proof is not easy, but it might have been worse. It was made simpler by the fact that for each $\varepsilon > 0$, the $\delta > 0$ that we get from the hypothesis $\lim_{x \to x_0} f(x) = f(x_0)$ is precisely the $\delta$ that we need, to conclude that $\lim_{x \to x_0} m(x) = f(x_0)$.
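The statement just proved can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the sample function is the cosine, the integral is approximated by a midpoint rule, and the names `integral` and `slope` are hypothetical. It prints the slope function m(x) for x close to x0, on both sides, next to f(x0).

```python
import math

def integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f from a to b.
    If b < a the step is negative, so the result changes sign as it should."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def slope(f, x0, x):
    """m(x) = [F(x) - F(x0)] / (x - x0), using the fact from the proof that
    F(x) - F(x0) is the integral of f from x0 to x."""
    return integral(f, x0, x) / (x - x0)

f = math.cos  # a sample continuous function, assumed here for illustration
x0 = 1.0
for dx in (0.1, -0.1, 0.01, -0.01, 0.001, -0.001):
    print(dx, slope(f, x0, x0 + dx), f(x0))
```

As dx shrinks, the middle column approaches the last one from either side, which is what Cases 1 and 2 assert.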


PROBLEM SET 3.10

Find the first and second derivatives of the following functions.


1.

f(x)

3.

h(x)

5.

g(x)

7.

f(x)

9.

h(x)

11.

g(x)

13.
15.
17.

f' (t3 1) dt
{"' dt
f(x)
Jo t
h(x)
i v t2 + 1dt
f (1 + t3)100dt
8.
10. f(x) J: v2 + tdt
2
12. h( )
1 t 4dt
"'
f
-x
14. J1x t3 dt+ 1
Jx -1t2++t110dt
2
18. {"' J 1 + t dt
Ja 1 + t4

Vl

v(

4.

6.

v!

l
+

16.

1T

If you know that

D.,

for every continuous function

f,

f /(t)dt

f(x)

f(x),

this does not immediately enable you to find the

derivative of

*20.

2. g(x)

g(x)

dt
iJ,,x 4dt
2 t +1
J"' J 1 + tdt
t
-1

19.

r t2dt
J("'2,, (t4 - t)dt
r4 + t8dt
J("'4,, 1dt
{"'
dt
Jo 1 + t2
f (t2 + 1) dt

rJo2x

Vl

t8dt.

vl

t8 dt.

But find the answer f', by any method.


Findg'(x), given
g(x)

"'

(
Jo

'

Trigonometric and
Exponential Functions

4.1
DIRECTED ANGLES. TRIGONOMETRIC
FUNCTIONS OF ANGLES AND NUMBERS

In elementary geometry, when we speak of an angle we simply mean a geometric figure, that is, a set of points: If $\overrightarrow{AB}$ and $\overrightarrow{AC}$ are rays which have the same endpoint $A$, but do not lie on the same line, then their union is the angle $\angle BAC$. (In the figure, the arrowheads remind us that the sides of an angle are rays rather than segments.) Some authors define the word angle in such a way as to allow "zero angles" and "straight angles."

In any case, in elementary geometry the idea of an angle does not include the idea of order; the sides of an angle are not arranged in an order, any more than the sides of a triangle are.
[Figure: the directed angles $\angle AOB$ and $\angle BOA$, each with its initial side and terminal side labeled.]

In trigonometry, however, the order of the sides of an angle makes a difference. Henceforth, whenever we speak of an angle we shall mean a directed angle. Thus, in the figures above, $\angle AOB$ is an ordered pair of rays $(\overrightarrow{OA}, \overrightarrow{OB})$; the ray $\overrightarrow{OA}$ is the initial side, and $\overrightarrow{OB}$ is the terminal side. Thus $\angle AOB$ is different from $\angle BOA$.


We suppose that a coordinate system is given in the plane. The counterclockwise direction is the direction from the positive x-axis to the positive y-axis, as shown below. The counterclockwise direction in a coordinate plane is regarded as positive; and the clockwise direction (running the other way) is regarded as the negative direction.

A new coordinate system is called right-handed if it gives the same counterclockwise direction; otherwise it is called left-handed. In the figure below, the right-handed coordinate systems are marked R, and the left-handed ones are marked L.

[Figure: several coordinate systems, the right-handed ones marked R and the left-handed ones marked L.]

We can now define the trigonometric functions of an angle $\angle AOB$. The procedure is as follows. We set up a right-handed coordinate system, in which the initial side $\overrightarrow{OA}$ is the positive half of the x-axis. On the terminal side $\overrightarrow{OB}$ we choose a point $P \ne O$. $P$ has coordinates $(x, y)$, in the coordinate system that we have set up, and the distance $OP$ is a positive number $r$.

It is easy to show (by similar triangles) that the ratios

$x/r,\; y/r,\; y/x,\; x/y,\; r/x,\; r/y$

are independent of the choice of $P$; they depend only on the angle that we started with. Thus we can define the trigonometric functions of $\angle AOB$ as follows:

$\sin \angle AOB = y/r$,
$\cos \angle AOB = x/r$,
$\tan \angle AOB = y/x$ (for $x \ne 0$),
$\cot \angle AOB = x/y$ (for $y \ne 0$),
$\sec \angle AOB = r/x$ (for $x \ne 0$),
$\csc \angle AOB = r/y$ (for $y \ne 0$).
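The definitions above translate directly into a small computation. In this sketch the function name `trig_from_point` is a hypothetical helper, and returning `None` for an undefined ratio is a design choice made only for the illustration. Two different points on the same terminal side are used, to show that the six ratios do not depend on the choice of P.

```python
import math

def trig_from_point(x, y):
    """Six trigonometric ratios of the angle in standard position whose terminal
    side passes through P = (x, y), P != O. Undefined ratios are returned as None."""
    r = math.hypot(x, y)  # the distance OP
    if r == 0:
        raise ValueError("P must differ from the origin O")
    return {
        "sin": y / r,
        "cos": x / r,
        "tan": y / x if x != 0 else None,
        "cot": x / y if y != 0 else None,
        "sec": r / x if x != 0 else None,
        "csc": r / y if y != 0 else None,
    }

near = trig_from_point(3.0, 4.0)
far = trig_from_point(6.0, 8.0)  # another point on the same terminal side
print(near)
print(far)
```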

We have defined six functions. Note that the domains of these functions are not sets of numbers, but sets of angles.


Consider now the unit circle $C$, with center at the origin, in the xy-plane.

[Figure: the unit circle $C$, with the points $P_0 = (1, 0)$ and $P_\theta$.]

Let $P_0$ be the point $(1, 0)$, as in the figure. To each real number $\theta$ there corresponds a point $P_\theta$ of $C$, under the following rules:

1) Given $\theta > 0$, we start at $P_0$ and move around $C$ in the counterclockwise direction until we have traced out a path whose total length is $\theta$. The point where our path ends is $P_\theta$.

2) Given $\theta < 0$, we start at $P_0$ and move around $C$ in the clockwise direction, until we have traced out a path whose total length is $|\theta|$. The point where our path ends is $P_\theta$.

These rules define a function

$w : R \to C$,
$\theta \mapsto P_\theta = w(\theta)$,

under which to each real number $\theta$ there corresponds a point of $C$. The function $w$ is called the winding function. Note that the values of the function are points rather than numbers.

Note also that

$P_{\theta + 2\pi} = P_\theta$,

for every $\theta$. The reason is that when we add $2\pi$ to $\theta$, this merely means that we take another round trip around the circle, ending at the same point $P_\theta$ where we began. Similarly,

$P_{\theta - 2\pi} = P_\theta$

and

$P_{\theta + 2n\pi} = P_\theta$

for every integer n, positive, negative, or zero.
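The periodicity of the winding function can be checked numerically. The sketch below is only an illustration: it computes the point for a given arc length with the library cosine and sine, which presupposes the very functions that the text is about to define, and the names `w` and `close` are assumptions made for the example.

```python
import math

def w(theta):
    """Winding function: the point reached from P_0 = (1, 0) after moving a signed
    arc length theta along the unit circle (counterclockwise if theta > 0)."""
    return (math.cos(theta), math.sin(theta))

def close(p, q, tol=1e-9):
    """True if the points p and q agree up to rounding error."""
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol

theta = 0.75
for n in (-2, -1, 0, 1, 2):
    # Adding 2*n*pi means n extra round trips, ending at the same point.
    print(n, close(w(theta + 2 * n * math.pi), w(theta)))
```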


We shall use the winding function to define trigonometric functions of numbers, in the following way. For each number $\theta$, let

$\angle\theta = \angle P_0 O P_\theta$.

The symbol $\angle\theta$ is pronounced "angle $\theta$"; $\angle\theta$ is the angle which corresponds to the number $\theta$. We now define

$\sin\theta = \sin\angle\theta = \sin\angle P_0 O P_\theta$,
$\cos\theta = \cos\angle\theta = \cos\angle P_0 O P_\theta$,

and so on.
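We have defined these functions in terms of $\angle\theta$ because we want to emphasize their geometric meaning. But for some purposes, it is simpler to forget about angles, and merely use the coordinates of $P_\theta$. If

$P_\theta = (x_\theta, y_\theta)$,

then

$\sin\theta = y_\theta$,
$\cos\theta = x_\theta$,
$\tan\theta = y_\theta / x_\theta$ (whenever $x_\theta \ne 0$),
$\cot\theta = x_\theta / y_\theta$ (whenever $y_\theta \ne 0$),
$\sec\theta = 1 / x_\theta$ (whenever $x_\theta \ne 0$),
$\csc\theta = 1 / y_\theta$ (whenever $y_\theta \ne 0$).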

Using these definitions, we can derive the usual formulas. Since $P_\theta$ is on the unit circle $C$, we know that

$OP_\theta = 1$.

Therefore

$x_\theta^2 + y_\theta^2 = 1$,

and we have:

Theorem 1. For every $\theta$,

$\cos^2\theta + \sin^2\theta = 1$.

If the sign of $\theta$ is changed, this sends us around the circle $C$ in the opposite direction. Therefore the points $P_\theta$ and $P_{-\theta}$ are symmetric across the x-axis, as in the figure.

[Figure: the points $P_\theta$ and $P_{-\theta}$ on the unit circle.]

Therefore

$x_{-\theta} = x_\theta, \qquad y_{-\theta} = -y_\theta$.

This gives:

Theorem 2. For every $\theta$,

$\sin(-\theta) = -\sin\theta, \qquad \cos(-\theta) = \cos\theta$.

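Plotting the points $P_0$, $P_{\pi/2}$, and $P_\pi$, we get the following:

Theorem 3.

$\sin 0 = 0, \qquad \cos 0 = 1$,
$\sin \dfrac{\pi}{2} = 1, \qquad \cos \dfrac{\pi}{2} = 0$,
$\sin \pi = 0, \qquad \cos \pi = -1$.

Theorem 4. For each $\theta$,

$\sin(\pi + \theta) = -\sin\theta, \qquad \cos(\pi + \theta) = -\cos\theta$.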

Proof. For each $\theta$, the points $P_\theta$ and $P_{\pi+\theta}$ are symmetric across the origin. This holds in all quadrants. Therefore

$x_{\pi+\theta} = -x_\theta, \qquad y_{\pi+\theta} = -y_\theta$,

and the theorem follows, by definition of the sine and cosine.


In the kind of trigonometry that we are dealing with now, the relation between angles and numbers is a little tricky. If $\theta$ is known, then $P_\theta$ is determined, and so $\angle\theta$ is determined; $\angle\theta$ is $\angle P_0 O P_\theta$.


y

But if the angle is known, the number() is not determined. In the figure on the right,

LP00Q is given, but for this angle we may have


()

!7T,

or

()

In fact, for every integer

n,

21T + !7T

or

\1 7T,

()

t7T - 21T

-i1T.

positive, negative, or zero, we may have


()

If an angle

!7T + 2n1T.

$\angle AOB$ corresponds to a number $\theta$, under the rules that we have been giving, then we shall say that $\angle AOB$ has measure $\theta$, and we shall write, for short,

$\angle AOB = \angle\theta$.

(We have seen that every angle $\angle AOB$ has infinitely many measures $\theta$. For this reason, it would be misleading to speak of "the measure of an angle.")

So far, we have used the notation $\angle\theta$ only for angles "in standard position," that is, angles with the positive half of the x-axis as initial side. But it will be convenient to use the same shorthand for angles in general. Thus

LP00Q

and

LP 0Os

3
L 7T.
4



y
x'

But if we set up new axes

x',

LQOS

'
y , we can also say that

L 7T

LQOT

and

LTT.

Using other axes, not shown in the figure, we see that

LSOP0

(- )

37T ,
4

and so on.
PROBLEM SET 4.1

Derive the trigonometric identities given or suggested below.

The derivations should

be based on the definitions and theorems given in this section of the text.

1.
6.

sin

2.

7.

csc z

23. sin(7T - 6)

27. sec

( 7T - 6)

cos

16. cot

19. tan(7T + 8)

sin

8.

4.

tan x
cos

sin

12. cot2 8 + 1

11. 1 + tan2 8
1 5. tan( -8)

cos

3.

(-8)

9.
13.

20. cot(7T + 6)
24. cos(7T

8)

28. CSC(7T - 6)

cotx

sec

csc

secy
csc

sec

17. sec(-6)

18. csc (-8)

21. sec(7T + 6)

22. csc(7T + 8)

2 5. tan(7T - 6)

sin

801;:;; 10 - 001.

b) Show that the sine is a continuous function.

30. a) Show that for every 8, 80, we have


jcos fJ

--

14. sec2 8

sec x

10.

cscx

29. a) Show that for every 8, 6 0, we.have


!sin

5.

- cos fJ01 ;:;; lfJ - e01.

b) Show that the cosine is a continuous function.

26. cot(7T - 8)

='

4.2 THE LAW OF COSINES AND THE ADDITION FORMULAS

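In the figure on the left below, we have $x_\theta = \cos\theta$, and $y_\theta = \sin\theta$, by definition of the sine and cosine.

[Figure: on the left, the point $P_\theta$ on the unit circle; on the right, a point $P(x, y)$ on the ray $\overrightarrow{OP_\theta}$.]

More generally, we have:

Theorem 1. Let $P$ be any point of $\overrightarrow{OP_\theta}$, and let $OP = a$. Then the coordinates of $P$ are

$x = a\cos\theta, \qquad y = a\sin\theta$.

Proof. By similar triangles,

$\dfrac{|x|}{a} = \dfrac{|x_\theta|}{1}$ and $\dfrac{|y|}{a} = \dfrac{|y_\theta|}{1}$.

Therefore

$|x| = a\,|x_\theta|$ and $|y| = a\,|y_\theta|$.

In these equations $x$ and $x_\theta$ also agree in sign, and similarly for $y$ and $y_\theta$. Therefore

$x = a x_\theta = a\cos\theta$, and $y = a y_\theta = a\sin\theta$,

which was to be proved.

From this we get immediately:

Theorem 2 (The law of cosines). If $\angle ACB = \angle\theta$, then

$c^2 = a^2 + b^2 - 2ab\cos\theta$.

(The notation is that of the following figure.)

[Figure: triangle $ABC$ with $C$ at the origin, $A$ on the positive x-axis, and sides $a$, $b$, $c$.]

Proof. By the preceding theorem,

$B = (a\cos\theta, a\sin\theta)$.

And obviously

$A = (b, 0)$.

Therefore, by the distance formula,

$c^2 = (a\cos\theta - b)^2 + (a\sin\theta - 0)^2$
$\quad = a^2\cos^2\theta - 2ab\cos\theta + b^2 + a^2\sin^2\theta$
$\quad = a^2(\cos^2\theta + \sin^2\theta) + b^2 - 2ab\cos\theta$
$\quad = a^2 + b^2 - 2ab\cos\theta$,

which was to be proved.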
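The proof of the law of cosines can be mirrored in a short computation. This is a sketch with hypothetical function names: one function uses the formula, the other places A and B at the coordinates used in the proof and applies the distance formula, and the two results should agree up to rounding.

```python
import math

def third_side(a, b, theta):
    """Length c of side AB from the law of cosines, where CB = a, CA = b,
    and the angle at C has measure theta."""
    return math.sqrt(a * a + b * b - 2.0 * a * b * math.cos(theta))

def third_side_by_coordinates(a, b, theta):
    """The same length from the coordinates used in the proof:
    A = (b, 0) and B = (a cos theta, a sin theta), with C at the origin."""
    ax, ay = b, 0.0
    bx, by = a * math.cos(theta), a * math.sin(theta)
    return math.hypot(bx - ax, by - ay)

print(third_side(3.0, 4.0, math.pi / 2))  # right angle at C: about 5
print(third_side(2.0, 7.0, 1.1), third_side_by_coordinates(2.0, 7.0, 1.1))
```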
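Theorem 3. For every $\theta$ and $\varphi$,

$\cos(\theta + \varphi) = \cos\theta\cos\varphi - \sin\theta\sin\varphi$.

Proof. Let $A = P_0$, $B = P_\theta$, and $C = P_{\theta+\varphi}$.

[Figure: the points $A$, $B$, and $C$ on the unit circle.]

Then

$A = (1, 0)$,
$C = (\cos(\theta + \varphi), \sin(\theta + \varphi))$,

and so by the distance formula

$AC^2 = [\cos(\theta + \varphi) - 1]^2 + \sin^2(\theta + \varphi)$
$\quad = \cos^2(\theta + \varphi) - 2\cos(\theta + \varphi) + 1 + \sin^2(\theta + \varphi)$
$\quad = 2 - 2\cos(\theta + \varphi)$.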

We now set up a new coordinate system, with $\overrightarrow{OP_\theta}$ as the positive x'-axis.

[Figure: the same points, with the new x'-axis along $\overrightarrow{OP_\theta}$.]

In the new coordinate system,

$A = P_{-\theta} = (\cos(-\theta), \sin(-\theta)) = (\cos\theta, -\sin\theta)$,
$C = P_\varphi = (\cos\varphi, \sin\varphi)$.

Therefore, by the distance formula,

$AC^2 = (\cos\theta - \cos\varphi)^2 + (-\sin\theta - \sin\varphi)^2$
$\quad = \cos^2\theta - 2\cos\theta\cos\varphi + \cos^2\varphi + \sin^2\theta + 2\sin\theta\sin\varphi + \sin^2\varphi$
$\quad = 2 - 2(\cos\theta\cos\varphi - \sin\theta\sin\varphi)$.

But the distance $AC$ is independent of the coordinate system. Therefore

$2 - 2\cos(\theta + \varphi) = 2 - 2(\cos\theta\cos\varphi - \sin\theta\sin\varphi)$,

and

$\cos(\theta + \varphi) = \cos\theta\cos\varphi - \sin\theta\sin\varphi$,

which was to be proved.


Once we have the addition formula for the cosine, it is easy to get similar formulas for the other trigonometric functions.

Theorem 4. For every $\theta$ and $\varphi$,

$\cos(\theta - \varphi) = \cos\theta\cos\varphi + \sin\theta\sin\varphi$.

Proof. Using $-\varphi$ for $\varphi$ in the preceding theorem, we get

$\cos(\theta - \varphi) = \cos\theta\cos(-\varphi) - \sin\theta\sin(-\varphi)$.

But we know that

$\cos(-\varphi) = \cos\varphi$, and $\sin(-\varphi) = -\sin\varphi$.

Using these, we get the desired formula for $\cos(\theta - \varphi)$.

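Theorem 5. For every $\theta$,

$\cos\left(\dfrac{\pi}{2} - \theta\right) = \sin\theta$, and $\sin\left(\dfrac{\pi}{2} - \theta\right) = \cos\theta$.

Proof. By Theorem 4,

$\cos\left(\dfrac{\pi}{2} - \theta\right) = \cos\dfrac{\pi}{2}\cos\theta + \sin\dfrac{\pi}{2}\sin\theta = 0 \cdot \cos\theta + 1 \cdot \sin\theta$,

by Theorem 3 of Section 4.1. Therefore

$\cos\left(\dfrac{\pi}{2} - \theta\right) = \sin\theta$.

Using $\pi/2 - \theta$ for $\theta$, we get

$\cos\left[\dfrac{\pi}{2} - \left(\dfrac{\pi}{2} - \theta\right)\right] = \sin\left(\dfrac{\pi}{2} - \theta\right)$.

Therefore

$\sin\left(\dfrac{\pi}{2} - \theta\right) = \cos\theta$.

(The name of the cosine is a reference to this theorem; the word cosine is from the Latin complementi sinus, meaning sine of the complement.)

Theorem 6. For every $\theta$ and $\varphi$,

$\sin(\theta + \varphi) = \sin\theta\cos\varphi + \cos\theta\sin\varphi$.

Proof.

$\sin(\theta + \varphi) = \cos\left[\dfrac{\pi}{2} - (\theta + \varphi)\right]$
$\quad = \cos\left[\left(\dfrac{\pi}{2} - \theta\right) - \varphi\right]$
$\quad = \cos\left(\dfrac{\pi}{2} - \theta\right)\cos\varphi + \sin\left(\dfrac{\pi}{2} - \theta\right)\sin\varphi$
$\quad = \sin\theta\cos\varphi + \cos\theta\sin\varphi$.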

PROBLEM SET 4.2
1. tan (A + B)

3. cot (8 + </>)
5. sin 28
7.

cos 2li

tan A +tan B
=

l - tan A tan B

cote cot</> - 1
=

cote + cot</>

2 sine cose
1 - 2 sin2 IJ

cosine

sine of the complement.)

2. tan (A - B)

4. cot (A - B)

6.

cos 28

8. cot (Ii

2 cos2 e
</>)

is from

The Derivatives of the Trigonometric Functions

4.3

31T
9. a) sin-=
2
31T
11. a) tan-=
2
12.
15.
18.

o
(}
2 sin - cos- =
2
2

+ cos(}
2

(}
tan 2

31T
10. a) cos2 =

b) sin (3 7T + o)
2

b) tan (3 7T + e
2

13.
16.

sin(}

+ cos(}

(} 1 - cos(}
19. tan - =
sin(}
2

2 cos2

0
2

=
14.

- 1

- cos 20
=
2

17.

l
J

b) cos (3 7T + o
2

139

+ cos 20 =
2

[Hint: Let <P = (}/2, so that 8


in terms of <fi. Then prove it.]

2</i, and rewrite the formula

*20. Show geometrically (without using any of the theory developed in this section) that the
formula in Problem 18 holds whenever 0 is between 0 and TT. Discuss the problem of
extending the formula from this special case to the general case.
21. Show that there is no formula which expresses sin (0/2) in terms of sin e.
show that sin ((}/2) is not determined if only sin(} is known.

That is,

2 2. Find a formula which expresses !sin ((}/2)1 in terms of cos 0.

23 . Show that there is no formula which expresses sin(}in terms of tan e. That is, show that
tan(}does not determine sin 8.

24. Show that there is no formula which expresses sin ((}/2) in terms of sin (}and cos e.

25. Show that if Pe is known, then P3e is determined. [Hint: If Pe = P4,, what is the relation
between 0 and <P? In this case, what is the relation between 3(} and 34'? Between P3e
and P3q,?]
26. It is a consequence of Problem 25 that, if sin 8 and cos(}are known, then P38 is deter
mined, and therefore sin 3(}is determined. How ? That is, find a formula which expres
ses sin 3(}in terms of sin (}and cos e.

27. Can cos 3(}be expressed in terms of cos(}? If so, derive such a formula. If not, explain
how you know that no such formula exists.

4.3 THE DERIVATIVES OF THE TRIGONOMETRIC FUNCTIONS;
THE DIFFERENCES $\Delta x$ AND $\Delta f$; THE SQUEEZE PRINCIPLE

If we try, in a straightforward way, to find the derivative of the sine, we get into trouble. By definition,

$f'(x_0) = \lim_{x \to x_0} \dfrac{f(x) - f(x_0)}{x - x_0}$,

if the indicated limit exists. For $f(x) = \sin x$, this definition says that

$\sin' x_0 = \lim_{x \to x_0} \dfrac{\sin x - \sin x_0}{x - x_0}$


if the indicated limit exists. In fact, the limit does exist. But it is not obvious what we ought to do to this expression

$\dfrac{\sin x - \sin x_0}{x - x_0}$

in order to find its limit. For functions $f$ which were defined algebraically, we found ways to cancel out $x - x_0$ in fractions of the form

$\dfrac{f(x) - f(x_0)}{x - x_0}$

using various algebraic tricks. Evidently some new device is needed for the sine. It is as follows. Let

$\Delta x = x - x_0$.

The symbol $\Delta x$ is all one symbol. It is pronounced "delta x," and the Greek delta is supposed to remind us that $\Delta x$ is the difference in $x$. Obviously, $x = x_0 + \Delta x$. Similarly, let

$\Delta f = f(x) - f(x_0)$.

Here $\Delta f$ is the difference in $f$, as we pass from $x_0$ to $x$. Geometrically, the use of the differences $\Delta x$ and $\Delta f$ is indicated by a new set of axes, with the new origin at the point $(x_0, f(x_0))$.

[Figure: the graph of $f$, with $x_0$, $x = x_0 + \Delta x$, and $\Delta f = f(x) - f(x_0)$ marked on a new set of axes through $(x_0, f(x_0))$.]

Geometrically, it is easy to see that the expressions

$\lim_{x \to x_0} \dfrac{f(x) - f(x_0)}{x - x_0} = f'(x_0)$

and

$\lim_{\Delta x \to 0} \dfrac{\Delta f}{\Delta x} = \lim_{\Delta x \to 0} \dfrac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = f'(x_0)$

are merely two different ways of describing the same limit $f'(x_0)$.


The point of this procedure, in finding the derivative of the sine, is that it enables us to apply the addition formula for the sine. For $f(x) = \sin x$, we have

$f(x_0 + \Delta x) = \sin(x_0 + \Delta x) = \sin x_0 \cos\Delta x + \cos x_0 \sin\Delta x$.

Thus we have

$f'(x_0) = \sin' x_0 = \lim_{\Delta x \to 0} \dfrac{\sin(x_0 + \Delta x) - \sin x_0}{\Delta x}$
$\quad = \lim_{\Delta x \to 0} \dfrac{\sin x_0 \cos\Delta x + \cos x_0 \sin\Delta x - \sin x_0}{\Delta x}$
$\quad = \lim_{\Delta x \to 0} \left[\sin x_0 \, \dfrac{\cos\Delta x - 1}{\Delta x} + \cos x_0 \, \dfrac{\sin\Delta x}{\Delta x}\right]$.

We are going to show that

$\lim_{\Delta x \to 0} \dfrac{\cos\Delta x - 1}{\Delta x} = 0$   (1)

and

$\lim_{\Delta x \to 0} \dfrac{\sin\Delta x}{\Delta x} = 1$.   (2)

It will then follow that

$\sin' x_0 = \cos x_0$,

and

$D \sin x = \cos x$.

The unknown limits (1) and (2) have curious forms. Since $\cos 0 = 1$, the first limit has the form

$\lim_{\Delta x \to 0} \dfrac{\cos(0 + \Delta x) - \cos 0}{\Delta x} = \cos' 0$.   (3)

And since $\sin 0 = 0$, the second limit is

$\lim_{\Delta x \to 0} \dfrac{\sin(0 + \Delta x) - \sin 0}{\Delta x} = \sin' 0$.   (4)

Thus we have found that if $\cos' 0 = 0$ and $\sin' 0 = 1$, then $\sin' x = \cos x$ for every $x$.

To simplify the notation, in the theorems that follow, we use $\theta$ in place of $\Delta x$, and state the theorems that we need in the following way:

Theorem 1. $\lim_{\theta \to 0} \dfrac{\sin\theta}{\theta} = 1$.

Theorem 2. $\lim_{\theta \to 0} \dfrac{\cos\theta - 1}{\theta} = 0$.

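Theorem 1 is the hard part; given that Theorem 1 holds, Theorem 2 follows from it. To see this, we first observe that

$\lim_{\theta \to 0} \dfrac{\cos\theta - 1}{\theta} = \lim_{\theta \to 0} \left[\dfrac{\cos\theta - 1}{\theta} \cdot \dfrac{\cos\theta + 1}{\cos\theta + 1}\right]$.

Simplifying on the right, we express this as

$\lim_{\theta \to 0} \dfrac{\cos^2\theta - 1}{\theta(\cos\theta + 1)} = \lim_{\theta \to 0} \dfrac{-\sin^2\theta}{\theta(\cos\theta + 1)}$.

The last formula can be factored into three parts, giving

$\lim_{\theta \to 0} \dfrac{\cos\theta - 1}{\theta} = -\left[\lim_{\theta \to 0} \dfrac{\sin\theta}{\theta}\right] \left[\lim_{\theta \to 0} \sin\theta\right] \left[\lim_{\theta \to 0} \dfrac{1}{\cos\theta + 1}\right]$.

Given that Theorem 1 holds, this gives

$\lim_{\theta \to 0} \dfrac{\cos\theta - 1}{\theta} = -1 \cdot 0 \cdot \dfrac{1}{2} = 0$.

(Query: How do we know that $\lim_{\theta \to 0} \sin\theta = 0$, and that $\lim_{\theta \to 0} \cos\theta = 1$?)

It remains to prove Theorem 1. First we observe that only positive values of $\theta$ need to be considered, because when we replace $\theta$ by $-\theta$, the value of the fraction $(\sin\theta)/\theta$ is unchanged. Thus if $(\sin\theta)/\theta \to 1$ as $\theta$ approaches 0 through positive values, it follows that $(\sin\theta)/\theta \to 1$ as $\theta \to 0$ through negative values.

We shall show that for $0 < \theta < \pi/2$ we have

$\sin\theta \le \theta \le \tan\theta$.

[Figure: the unit circle, with the arc from $Q$ to $P$ and the vertical segments $RP$ and $QS$.]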

In the figure, $\theta$ is the length of the arc from $Q$ to $P$. Since

$RP = \sin\theta$ and $QS = \tan\theta$,

the inequalities that we want take the form

$RP \le \theta \le QS$.

To prove this, we have to go back to the definition of arc length.

The figure below shows a broken line inscribed in the arc from $Q$ to $P$, with segments of equal length

$a_1 = a_2 = \cdots = a_n$.

Thus the length of the broken line is

$A_n = a_1 + a_2 + \cdots + a_n$.

(In the figure, $n = 3$.) We extend the radii of the circle until they intersect the vertical line through $Q$; and for each segment of our broken line we let $b_i$ be the length of the corresponding segment on the vertical line through $Q$.

[Figure: the broken line inscribed in the arc from $Q$ to $P$, and the corresponding segments on the vertical line through $Q$.]

It is a matter of elementary geometry to check that

$RP < A_n$,

and that

$a_i < b_i$

for each $i$. Therefore

$RP < A_n < QS$,

and so

$\sin\theta < A_n < \tan\theta$.

As $n \to \infty$, $A_n \to \theta$. In fact, this is the definition of the length of a circular arc. Therefore

$\sin\theta \le \theta \le \tan\theta$.
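The behavior of the inscribed broken lines can be seen numerically. The sketch below assumes the standard fact that a chord of the unit circle subtending a central angle $\alpha$ has length $2\sin(\alpha/2)$, so that n equal chords inscribed in an arc of length $\theta$ have total length $2n\sin(\theta/2n)$; the function name `inscribed_length` is hypothetical.

```python
import math

def inscribed_length(theta, n):
    """Length A_n of the broken line made of n equal chords inscribed in an arc
    of length theta on the unit circle; each chord subtends the angle theta / n."""
    return n * 2.0 * math.sin(theta / (2.0 * n))

theta = 1.0
for n in (1, 3, 10, 100, 1000):
    A_n = inscribed_length(theta, n)
    # Each row shows sin(theta), A_n, and tan(theta), in that order.
    print(n, math.sin(theta), A_n, math.tan(theta))
```

In each row the middle value lies strictly between the other two, and it approaches $\theta = 1$ as n grows.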

(When we pass to a limit, a "weak inequality" $A_n \le b$ or $A_n \ge b$ is always preserved, but a "strong inequality" $A_n < b$ or $A_n > b$ is not necessarily preserved. For example,

$\dfrac{n + 1}{n} > 1$ for every $n$,

but we cannot conclude that

(?) $\lim_{n \to \infty} \dfrac{n + 1}{n} > 1$. (?)

In fact, the limit is 1, which is $\ge 1$, but not $> 1$. Hence the overcautious weak inequalities that we have written above. The strong inequalities $\sin\theta < \theta < \tan\theta$ always hold for $0 < \theta < \pi/2$, but we are not stopping to prove it.) Therefore

$1 \le \dfrac{\theta}{\sin\theta} \le \dfrac{1}{\cos\theta}$.

As $\theta \to 0$, $\cos\theta \to \cos 0 = 1$. (You proved this in Problem 30(b) of Section 4.1.) Therefore $1/\cos\theta \to 1$, because the limit of the reciprocal is the reciprocal of the limit. Thus the picture must look something like the figure below.

[Figure: the graph of $y = \theta/\sin\theta$ lying between the line $y = 1$ and the graph of $y = 1/\cos\theta$.]

That is, the graph of $y = \theta/\sin\theta$ is "squeezed in," and

$\lim_{\theta \to 0} \dfrac{\theta}{\sin\theta} = 1$.

(This is an instance of a general "squeeze principle," to be discussed further at the end of this section.) Therefore

$\lim_{\theta \to 0} \dfrac{\sin\theta}{\theta} = 1$,

because the limit of the reciprocal is the reciprocal of the limit. As we have seen, this means that:

Theorem 3. $D \sin x = \cos x$.
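The squeeze used in the proof of Theorem 1 can be tabulated. This is a small illustrative sketch (the function name `squeeze_bounds` is an assumption): for several values of $\theta$ between 0 and $\pi/2$ it prints the lower bound 1, the ratio $\theta/\sin\theta$, and the upper bound $1/\cos\theta$.

```python
import math

def squeeze_bounds(theta):
    """Return (1, theta / sin(theta), 1 / cos(theta)) for 0 < theta < pi/2."""
    return 1.0, theta / math.sin(theta), 1.0 / math.cos(theta)

for theta in (1.0, 0.5, 0.1, 0.01, 0.001):
    lower, middle, upper = squeeze_bounds(theta)
    # The middle value is trapped between the outer two, which both tend to 1.
    print(theta, lower, middle, upper)
```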

Once we know how to find $D \sin x$, the derivatives of the other trigonometric functions are easy.

$\cos' x_0 = \lim_{\Delta x \to 0} \dfrac{\cos(x_0 + \Delta x) - \cos x_0}{\Delta x}$
$\quad = \lim_{\Delta x \to 0} \dfrac{\cos x_0 \cos\Delta x - \sin x_0 \sin\Delta x - \cos x_0}{\Delta x}$
$\quad = (\cos x_0) \lim_{\Delta x \to 0} \dfrac{\cos\Delta x - 1}{\Delta x} - (\sin x_0) \lim_{\Delta x \to 0} \dfrac{\sin\Delta x}{\Delta x}$
$\quad = (\cos x_0) \cdot 0 - (\sin x_0) \cdot 1 = -\sin x_0$.

Thus:

Theorem 4. $D \cos x = -\sin x$.

By simpler methods, we get

$D \tan x = \sec^2 x$,
$D \cot x = -\csc^2 x$,
$D \sec x = \sec x \tan x$,
$D \csc x = -\csc x \cot x$.

You will be asked to derive these, in the problem set below.
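These six differentiation formulas can be spot-checked numerically before they are derived. The following sketch compares a central difference quotient with each stated formula at one sample point; the helper names and the sample point 0.7 are assumptions made for the illustration, and agreement at one point is of course evidence, not a derivation.

```python
import math

def derivative(f, x, h=1e-6):
    """Central difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def cot(x):
    return math.cos(x) / math.sin(x)

def sec(x):
    return 1.0 / math.cos(x)

def csc(x):
    return 1.0 / math.sin(x)

# Each entry pairs a function with the derivative formula stated in the text.
FORMULAS = {
    "sin": (math.sin, lambda x: math.cos(x)),
    "cos": (math.cos, lambda x: -math.sin(x)),
    "tan": (math.tan, lambda x: sec(x) ** 2),
    "cot": (cot, lambda x: -(csc(x) ** 2)),
    "sec": (sec, lambda x: sec(x) * math.tan(x)),
    "csc": (csc, lambda x: -csc(x) * cot(x)),
}

x = 0.7  # a sample point at which all six functions are defined
for name, (f, df) in FORMULAS.items():
    print(name, derivative(f, x), df(x))
```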


In finding the limit of $\theta/\sin\theta$, we used the following idea:

Theorem 5 (The squeeze principle). Let $f$ and $g$ be functions defined at every point of the interval $I$, except perhaps at the point $x_0$. If

$\lim_{x \to x_0} f(x) = L$,

and for each $x$, $g(x)$ is between $f(x)$ and $L$, then

$\lim_{x \to x_0} g(x) = L$.

[Figure: two illustrations of the squeeze principle, with the graph of $g$ between the graph of $f$ and the line $y = L$.]

Two illustrations of the theorem are shown above. The theorem is geometrically clear, and is also easy to prove. The point is that since $g(x)$ is between $f(x)$ and $L$, any box for $f$ at $(x_0, L)$ is automatically a box for $g$ at $(x_0, L)$. Since $f$ has an $\varepsilon\delta$-box at $(x_0, L)$ for every $\varepsilon > 0$, it follows that $g$ does also. Therefore

$\lim_{x \to x_0} g(x) = L$,

by definition of a limit.
[Figure]

The same idea also works when two functions approach the same limit, and a third function lies between them. If

$g(x) \le h(x) \le f(x)$,

and

$\lim_{x \to x_0} f(x) = \lim_{x \to x_0} g(x) = L$,

then it follows that

$\lim_{x \to x_0} h(x) = L$.

Similarly for the following situation:

[Figure]

All of these ideas are very closely related, and we shall refer to all of them as the squeeze principle.
PROBLEM SET 4.3

Derive formulas for the following:


2. D cot x

1. D tan x
5. DVl - sin2 x
6.

9.

:2

4. D csc x

[Warning: It is very easy to get a wrong answer to this one.]


[Same warning.]

DVl - cos2 x

7. D cos2 x

3. D sec x

ce>J"

D 2 sin x cos x

8. D(cos2 8 + sin2 8)
10. Dv'l + tan2 8
sin x

11. D(csc2 8 - cot2 8)

12. D

cos x
13. D l
.
+ smx

14. D(x2 sin x)

1 + cos x

Show that the following differentiation formulas are correct:


15. D sin 2x =(cos 2x)2

16. D cos 2x =(-sin 2x)2

17. D tan 2x =2 sec2 2x

18. D sin ( -x) = [cos ( -x)](-1)

19. D cos (-x) = [-sin ( -x)]( -1)

20. D cot 2x = -2 csc 2x cot 2x

21. D tan (-x) = [sec2 (-x)](-1)

22. D sin 3x = (cos 3x)3

23. D cos 3x = (-sin 3x)3

24. D tan 3x = 3 sec2 3x

*25. Make a plausible guess for D,, sin ax, and verify it if you can.

*26. Same, for $D_x \sin \sqrt{x}$.

*27. If $f(x) = \sin x$ and $g(x) = \cos x$, then

(a) $f' = g$, (b) $g' = -f$, (c) $f(0) = 0$, and (d) $g(0) = 1$.

Is it possible that there is another pair of functions satisfying the same four conditions?
[Hint: Suppose that the pairs $f_1, g_1$ and $f_2, g_2$ satisfy (a) through (d). Consider the function
$$F = (f_1 - f_2)^2 + (g_1 - g_2)^2.$$
What sort of function is $F$? From what you learn about $F$, what can you conclude about $f_1$, $f_2$, $g_1$, and $g_2$?]

The answer to this problem has a rather curious significance: it means that all
properties of the sine and cosine are contained, implicitly, in conditions (a) through (d).
That is, the sine and cosine are completely described by the conditions
$$\sin' = \cos, \qquad \cos' = -\sin, \qquad \sin 0 = 0, \qquad \cos 0 = 1.$$

4.4 THE APPROXIMATION OF DIFFERENCES BY DIFFERENTIALS

We recall, from the preceding section, the apparatus which we set up in order to
calculate the derivative of the sine. Given a function
$$f: I \to \mathbf{R},$$
where $I$ is an interval, and a fixed point $x_0$ of $I$. For each point $x$ of $I$, we let
$$\Delta x = x - x_0,$$
so that
$$x = x_0 + \Delta x.$$
We let
$$\Delta f = f(x) - f(x_0) = f(x_0 + \Delta x) - f(x_0).$$
In the old notation,
$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0},$$
by definition. In the new notation, this takes the form
$$f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}.$$

When $\Delta x$ is small, $\Delta f/\Delta x$ is close to $f'(x_0)$. Thus
$$\frac{\Delta f}{\Delta x} \approx f'(x_0) \quad \text{when } \Delta x \approx 0,$$
where $\approx$ stands for the phrase "is approximately equal to." This ought to mean that
$$\Delta f \approx f'(x_0)\,\Delta x \quad \text{when } \Delta x \approx 0.$$

Let us interpret this last statement geometrically.

[Figure: the graph of $f$ with its tangent line $T$ at $x_0$; the differential $df$ is labeled.]

In the figure, the line $T$ is the tangent to the graph of $f$ at the point $P = (x_0, f(x_0))$. Thus
the slope of $T$ is $f'(x_0)$. If $S = (x, y)$ is a point of $T$, then
$$\frac{y - f(x_0)}{x - x_0} = f'(x_0),$$
because the slope of the segment from $P$ to $S$ is the slope of the line $T$. This gives
$$y - f(x_0) = f'(x_0)\,\Delta x.$$
This quantity is called the differential of $f$ at $x_0$, and is denoted by $df$. (See the label
in the figure.) To repeat:
$$df = f'(x_0)\,\Delta x,$$
by definition. Since $x_0$ is regarded as fixed, throughout this discussion, $df$ is a function,
whose value is determined when $\Delta x$ is named. The differential is often convenient for
purposes of numerical approximation. We have observed that
$$\Delta f \approx f'(x_0)\,\Delta x \quad \text{when } \Delta x \approx 0.$$
In our new notation, this says that
$$\Delta f \approx df \quad \text{when } \Delta x \approx 0.$$

Let us try this on some numerical examples, and see how good the approximation
looks.

Example 1. Let
$$f(x) = \sqrt{x} \quad (x \ge 0);$$
and take
$$x_0 = 25, \qquad \Delta x = 0.4.$$

[Figure: graph of $y = \sqrt{x}$ for $0 \le x \le 26$, with $x_0 = 25$ marked.]

Then
$$f(x_0) = \sqrt{25} = 5, \qquad f'(x) = \frac{1}{2\sqrt{x}} \quad (x > 0),$$
and
$$df = \tfrac{1}{10}\,\Delta x = \tfrac{1}{10}(0.4) = 0.04.$$
The approximation formula
$$\Delta f \approx df$$
suggests that
$$\sqrt{25.4} = f(x_0 + \Delta x) = f(x_0) + \Delta f \approx f(x_0) + df = \sqrt{25} + 0.04;$$
$$\sqrt{25.4} \approx 5.04.$$
The actual value of $\sqrt{25.4}$, correct to six decimal places, is
$$\sqrt{25.4} = 5.039841.$$
Thus the error in our approximation is 0.000159, which is not bad. Note also that the
approximation $\Delta f \approx df$ wasn't supposed to be good except when $\Delta x$ is small; and
$\Delta x = 0.4$ is not very small. Using $\Delta x = 0.1$, we get
$$df = \frac{1}{2\sqrt{25}}(0.1) = 0.01; \qquad \sqrt{25.1} \approx 5 + df = 5.01.$$
The correct value is
$$\sqrt{25.1} = 5.00999,$$
so that our error is 0.00001, which looks better. Using $\Delta x = 0.01$, we get
$$df = \tfrac{1}{10}(0.01) = 0.001; \qquad \sqrt{25.01} \approx 5.001.$$
Using five-place common logarithms, we get
$$\sqrt{25.01} \approx 5.0010.$$
Thus, in this case, the differential is as accurate as five-place tables.
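The arithmetic of Example 1 can be repeated mechanically. The following is a minimal sketch in Python; the helper names `differential_estimate` and `sqrt_prime` are illustrative and do not come from the text.

```python
import math

def differential_estimate(f, fprime, x0, dx):
    """Return (estimate, exact, error) for f(x0 + dx) ~ f(x0) + f'(x0) * dx."""
    estimate = f(x0) + fprime(x0) * dx   # f(x0) + df
    exact = f(x0 + dx)                   # f(x0) + delta f
    return estimate, exact, abs(estimate - exact)

def sqrt_prime(x):
    """Derivative of the square-root function, valid for x > 0."""
    return 1.0 / (2.0 * math.sqrt(x))

# Repeat Example 1 with x0 = 25 and the three values of dx used in the text.
for dx in (0.4, 0.1, 0.01):
    est, exact, err = differential_estimate(math.sqrt, sqrt_prime, 25.0, dx)
    print(f"dx={dx}: estimate={est:.6f}, exact={exact:.6f}, error={err:.6f}")
```

The printed errors shrink much faster than dx itself, which is the behavior the discussion that follows sets out to explain.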


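It is natural to ask why the approximation
$$\Delta f \approx df = f'(x_0)\,\Delta x$$
should be as good as it is. The reason is as follows. We know that
$$\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = f'(x_0).$$
On this basis, we wrote
$$f'(x_0) \approx \frac{\Delta f}{\Delta x} \quad \text{when } \Delta x \approx 0. \tag{1}$$
Multiplying by $\Delta x$, we got
$$\Delta f \approx f'(x_0)\,\Delta x \quad \text{when } \Delta x \approx 0. \tag{2}$$
The second of these approximations is much better than the first. The point is that if
$$\Delta x \approx 0 \quad \text{and} \quad \frac{\Delta f}{\Delta x} - f'(x_0) \approx 0,$$
then the product
$$\left[\frac{\Delta f}{\Delta x} - f'(x_0)\right] \Delta x \approx 0;$$
when you multiply two numbers each of which is small, the product is even smaller.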
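We shall now express these ideas in a more exact form. For each $\Delta x$, let
$$E(\Delta x) = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} - f'(x_0) \quad (\Delta x \ne 0).$$
Then
$$\lim_{\Delta x \to 0} E(\Delta x) = 0,$$
because
$$\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = f'(x_0).$$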

Thus the graph of the function $E$ looks like the figure on the left below.

[Figure: left, the graph of $y = E(\Delta x)$ for $\Delta x \ne 0$; right, the graph of the extended function $E$.]

To this graph we add the origin. That is, we define
$$E(0) = 0.$$
The graph of the extended function $E$ is shown on the right above. We now have
$$\lim_{\Delta x \to 0} E(\Delta x) = E(0) = 0.$$
Note that $E$ is defined on some open interval containing 0.


[Figure: the interval $(x_0 - a, x_0 + a)$ marked on the $x$-axis.]

The reason is that if the open interval $(x_0 - a, x_0 + a)$ lies in the domain of $f$,
then the open interval $(-a, a)$ lies in the domain of $E$. An open interval containing
a given point will be called a neighborhood of the given point. In this language, we
can sum up the above discussion in the following theorem.


Theorem 1. Let $f$ be a function defined in a neighborhood of $x_0$, and suppose that $f$
is differentiable at $x_0$. Then there is a function $E$, defined in a neighborhood of 0,
such that

i) $\Delta f = f'(x_0)\,\Delta x + E(\Delta x)\,\Delta x$, and

ii) $\lim_{\Delta x \to 0} E(\Delta x) = E(0) = 0$.

(Proof. Use the function $E$ which we have just defined.)

This theorem explains why
$$df = f'(x_0)\,\Delta x$$
is a good approximation of
$$\Delta f = f(x_0 + \Delta x) - f(x_0)$$
when $\Delta x \approx 0$. The reason is that
$$\Delta f - df = \Delta f - f'(x_0)\,\Delta x = E(\Delta x)\,\Delta x.$$
Thus when $\Delta x \approx 0$, the error in the approximation $\Delta f \approx df$ is doubly small, being a
small multiple of the small number $\Delta x$.

In many cases, it is possible to estimate the largest possible error that can result
when you use $df$ as an approximation for $\Delta f$. For a discussion of this problem, see
Appendix D.
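The "doubly small" error can be observed numerically. The sketch below builds the function E of Theorem 1 for the illustrative choice f = sin and x0 = 1 (this choice, and the name `error_function`, are assumptions made for the example, not part of the text), and compares E(Δx) with the error Δf - df.

```python
import math

def error_function(f, fprime, x0):
    """Build E with E(dx) = (f(x0 + dx) - f(x0))/dx - f'(x0) for dx != 0, and E(0) = 0."""
    def E(dx):
        if dx == 0:
            return 0.0
        return (f(x0 + dx) - f(x0)) / dx - fprime(x0)
    return E

# Illustrative choice: f = sin, x0 = 1, so that f' = cos.
E = error_function(math.sin, math.cos, 1.0)

for dx in (0.1, 0.01, 0.001):
    delta_f = math.sin(1.0 + dx) - math.sin(1.0)   # the difference
    df = math.cos(1.0) * dx                        # the differential
    # delta_f - df should agree with E(dx) * dx, up to rounding.
    print(f"dx={dx}: E(dx)={E(dx):.6e}, delta_f - df={delta_f - df:.6e}")
```

In this example E(Δx) shrinks roughly in proportion to Δx, so the error E(Δx)Δx shrinks roughly like the square of Δx.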

PROBLEM SET 4.4

Following is a partial table of the sine and cosine functions, for ready reference in solving
some of the following problems:

x         sin x     cos x
0.0814    0.0814    0.9967
0.1222    0.1219    0.9925
0.2094    0.2079    0.9781
0.3840    0.3746    0.9272

1. Find sin 0.1251 approximately (sin 0.1251 = 0.1248, correct to four decimal places).
2. Find cos 0.0844 approximately (cos 0.0844 = 0.9964).
3. Find sin 0.2123 approximately (sin 0.2123 = 0.2108).
4. Find cos 0.3869 approximately (cos 0.3869 = 0.9261).
5. How do you account for the first two entries in the first line of the above table?
6. Without the use of tables of any kind, get the best approximation you can for
sin 0.5235988. (This is a trick question.)
7. Same question, for cos (-6.2832).


Without using tables of any kind, get numerical approximations for the following. The
answers given are the "right answers," correct to the indicated number of decimal places;
it is not to be expected that an approximation process based on the differential will give them
exactly.

8. $\sqrt[3]{27.1}$ [Answer: 3.004]
9. $\sqrt{25.2}$ [Answer: 5.0200]
10. $\sqrt[4]{16.3}$
11. $\sqrt[3]{-7.9}$

12. One of the standard approximation formulas used in mathematical physics says that
$$\sin x \approx x \quad \text{when } x \approx 0.$$
Explain how this formula is related to the ideas in this section of the text. [Hint:
Consider the general approximation formula
$$\Delta f \approx df \quad \text{when } \Delta x \approx 0.$$
What form does this take, for $f(x) = \sin x$, $x_0 = 0$?]

13. Same question, for
$$\cos(1 - x) \approx 1 \quad \text{when } x \approx 1.$$

14. Another standard approximation formula says that
$$(1 + x)^n \approx 1 + nx \quad \text{when } x \approx 0.$$
Interpret this in terms of the theory that we have been developing, and justify it. [Hint:
Surely the given formula is equivalent to
$$(1 + \Delta x)^n - 1 \approx n\,\Delta x \quad \text{when } \Delta x \approx 0.$$
Here, what is $f$? What is $x_0$? What are $\Delta f$ and $df$?]

15. Same question, for the formula
$$\sqrt{1 + x} \approx 1 + \frac{x}{2} \quad \text{when } x \approx 0.$$

16. Same question, for the formula


1

""'1

.3; v
x

1 +

when $x \approx 0$.

17. Without using calculus at all, justify the approximation formula
$$\frac{1}{1 + x} \approx 1 - x \quad \text{when } x \approx 0.$$
Is this a "doubly good" approximation in the same sense in which $\Delta f \approx df$ is "doubly
good"? Why or why not?

4.5 COMPOSITION OF FUNCTIONS

In calculating derivatives, we have often found it convenient to regard one function
as a power of another. For example, given
$$\phi(x) = (x^2 + 3x + 5)^5,$$
we let
$$g(x) = x^2 + 3x + 5,$$
so that
$$\phi = g^5.$$
We can then get $\phi'$ in the form
$$\phi' = 5g^4 g', \qquad \phi'(x) = 5(x^2 + 3x + 5)^4 (2x + 3).$$
Similarly, we have found it convenient to regard one function as the positive
square root of another. For example, given
$$\phi(x) = \sqrt{x^2 + 1},$$
we let
$$g(x) = x^2 + 1.$$
We then get $\phi'$ in the form
$$\phi'(x) = \frac{2x}{2\sqrt{x^2 + 1}} = \frac{x}{\sqrt{x^2 + 1}}.$$

The idea that we have been using is that of composition of functions. In the
first case, the action of $\phi$ is described by
$$\phi: x \mapsto (x^2 + 3x + 5)^5.$$
We split this operation into two steps, like this:
$$x \mapsto x^2 + 3x + 5 \mapsto (x^2 + 3x + 5)^5.$$
The first of these steps represents the action of the function
$$g: x \mapsto x^2 + 3x + 5.$$
The second step raises things to the fifth power. It can thus be described by the
function
$$f: u \mapsto u^5.$$
In this situation, $g$ is called the inside function; it represents the first step. The function
$f$ is called the outside function; it represents the second step. And $\phi$ is called the
composition of $f$ and $g$. The reason for the use of the terms inside and outside is that
we can write
$$\phi(x) = f(g(x)).$$
To get $\phi(x)$, we should substitute $g(x)$ in the formula for $f$.
Diagrammatically:
$$x \overset{g}{\mapsto} x^2 + 3x + 5 \overset{f}{\mapsto} (x^2 + 3x + 5)^5.$$

Our second example fits the same pattern. We have
$$\phi(x) = \sqrt{x^2 + 1}, \qquad g(x) = x^2 + 1, \qquad f(u) = \sqrt{u},$$
$$\phi(x) = f(g(x)) = \sqrt{g(x)} = \sqrt{x^2 + 1}.$$
Diagrammatically:
$$x \overset{g}{\mapsto} x^2 + 1 \overset{f}{\mapsto} \sqrt{x^2 + 1}.$$
Algebraically, to get the values of the composite function $\phi = f(g)$, we substitute
$g(x)$ for $u$ in the formula for $f(u)$. This is why we described the "square-root function"
$f$ by the formula
$$f(u) = \sqrt{u}$$
instead of the equally logical formula $f(x) = \sqrt{x}$. We want to form the composite
function by setting
$$u = g(x) = x^2 + 1,$$
and it would hardly make sense to set (?) $x = g(x) = x^2 + 1$ (?).
We sum all this up in the following definition:

Definition. Given two functions
$$g: A \to B, \qquad f: B \to C,$$
the composition
$$f(g): A \to C$$
is the function whose values are given by the formula
$$f(g)(x) = f(g(x)).$$
Here, for each $x$, $f(g)(x)$ denotes the value of the function $f(g)$ at the point $x$.
Diagrammatically:
$$A \overset{g}{\mapsto} B \overset{f}{\mapsto} C.$$
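The definition translates directly into a small program. The following Python sketch uses a helper named `compose` (an illustrative name, not from the text) to build f(g) from f and g, and applies it to the two examples discussed earlier in this section.

```python
import math

def compose(f, g):
    """Return the composite function f(g), whose value at x is f(g(x))."""
    return lambda x: f(g(x))

# First example: inside function g(x) = x^2 + 3x + 5, outside function f(u) = u^5.
phi = compose(lambda u: u ** 5, lambda x: x * x + 3 * x + 5)

# Second example: inside function g(x) = x^2 + 1, outside function f(u) = sqrt(u).
psi = compose(math.sqrt, lambda x: x * x + 1)

print(phi(0))   # (0 + 0 + 5)^5
print(psi(0))   # sqrt(0 + 1)
```

Note that `compose(f, g)` applies g first and f second, matching the convention that g is the inside function.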

Let us consider some more examples.

Example 1. Let
$$f(u) = \sin u, \qquad g(x) = 2x + 1.$$
Then
$$f(g(x)) = \sin g(x) = \sin(2x + 1).$$
(In this example, what is $A$? What are $B$ and $C$?)

Example 2. Let
$$f(u) = u^2 + u + 1, \qquad g(x) = \sqrt{x}.$$
Then
$$f(g(x)) = (\sqrt{x})^2 + \sqrt{x} + 1 = x + \sqrt{x} + 1.$$
(What are $A$, $B$, and $C$?)

Example 3. Let
$$f(u) = \sin u, \qquad g(x) = x^2 - 5.$$
Then
$$f(g(x)) = \sin(g(x)) = \sin(x^2 - 5).$$

Example 4. Let
$$f(u) = \int_0^u (t^4 - 1)\,dt, \qquad g(x) = x^2.$$
Then
$$f(g(x)) = \int_0^{g(x)} (t^4 - 1)\,dt = \int_0^{x^2} (t^4 - 1)\,dt.$$
Thus, for example,
$$f(g(3)) = \int_0^{3^2} (t^4 - 1)\,dt = \int_0^{9} (t^4 - 1)\,dt.$$

In Examples 1 through 4 above, we supposed that $f$ and $g$ were given, and we
then proceeded to form the composite function $\phi = f(g)$. More often, however, we
are given a function $\phi$, and in order to investigate the function $\phi$, we express it as the
composition of two other functions, each of which is simpler than $\phi$. For example,
to investigate the function
$$\phi(x) = \int_0^{x^2} (t^4 - 1)\,dt,$$
we first observe that it has the form
$$\phi(x) = \int_0^{g(x)} (t^4 - 1)\,dt,$$
where
$$g(x) = x^2.$$
Thus
$$\phi = f(g),$$
where
$$f(u) = \int_0^u (t^4 - 1)\,dt, \qquad g(x) = x^2.$$
Similarly, in the preceding three examples, if $\phi$ is given by the final formula, we shall
for many purposes need to set up a pair of functions $f$ and $g$ in such a way that
$$\phi = f(g).$$

The derivative of a function is also a function; and so we can form composite
functions of the type $f'(g)$ and $f(g')$. Consider, for example,
$$f(u) = u^3, \qquad g(x) = \sin x.$$
Then
$$f'(u) = 3u^2, \qquad g'(x) = \cos x.$$
Therefore
$$f(g(x)) = \sin^3 x, \qquad f'(g(x)) = 3\sin^2 x, \qquad f'(g(x))g'(x) = 3\sin^2 x \cos x.$$
These formulas are significant, because it will turn out that
$$Df(g(x)) = D\sin^3 x = f'(g(x))g'(x) = 3\sin^2 x \cos x.$$
Similarly,
$$g(f(u)) = \sin u^3, \qquad g'(f(u)) = \cos u^3, \qquad g'(f(u))f'(u) = (\cos u^3)\,3u^2,$$
which will turn out to be the derivative of $\sin u^3$. (Here $\cos u^3$ is the cosine of $u^3$,
not the cube of $\cos u$.)

Let us try one more example:
$$f(u) = \cos u, \qquad g(x) = \sqrt{x},$$
$$f(g(x)) = \cos\sqrt{x}, \qquad f'(g(x)) = -\sin\sqrt{x}, \qquad g'(x) = \frac{1}{2\sqrt{x}},$$
$$f'(g(x))g'(x) = (-\sin\sqrt{x})\,\frac{1}{2\sqrt{x}}.$$
In dealing with composite functions, we shall need the following:

Theorem 1. The composition of two continuous functions is continuous. That is, if
$$\lim_{x \to x_0} g(x) = g(x_0) = u_0$$
and
$$\lim_{u \to u_0} f(u) = f(u_0),$$
then
$$\lim_{x \to x_0} f(g(x)) = f(g(x_0)).$$
The idea here is that
$$x \approx x_0 \;\Rightarrow\; g(x) \approx g(x_0) \;\Rightarrow\; f(g(x)) \approx f(g(x_0)).$$
In Appendix E it is shown that this idea can be used to get a proof.

PROBLEM SET 4.5

For each of the functions $\phi$, given in the problems below, find formulas for functions
$f$ and $g$, such that $\phi = f(g)$. Then get formulas for $f'$, $g'$, $f'(g)$, and $\phi'$.

1. $\phi(x) = \sin^2 x$
2. $\phi(x) = \cos^2 x$
3. $\phi(x) = (\sin x + \cos x)^2$
4. $\phi(x) = \sin 2x$
5. $\phi(x) = \tan 2x$
6. $\phi(x) = \cos 2x$
7. $\phi(x) = \sqrt{1 - x^2}$
8. $\phi(x) = \sin^6 x$
9. <f>(x) = f't
10. $\phi(x) = \int_0^{\sin x} (t^2 + 1)\,dt$
11. $\phi(x) = \int_0^{\cos x} (t^2 + 1)\,dt$

(Note that the function
$$f(u) = \int_0^u (t^2 + 1)\,dt$$
can be expressed without the use of integral signs; $f$ can be calculated as a polynomial.)

12. a) Find $\displaystyle\lim_{u \to u_0} \frac{\sin u - \sin u_0}{u - u_0}$.

b) Find $\displaystyle\lim_{x \to x_0} \frac{\sin x^2 - \sin x_0^2}{x^2 - x_0^2}$.

(It is not hard to see a very plausible answer to Problem 12(b). To prove, in an orderly
way, that your answer is right, you should express the function
$$\phi(x) = \frac{\sin x^2 - \sin x_0^2}{x^2 - x_0^2}$$
as the composition $f(g)$ of two functions $f$ and $g$, and then apply Theorem 1.)

13. Find $\displaystyle\lim_{x \to x_0} \frac{\sin x^3 - \sin x_0^3}{x^3 - x_0^3}$.

14. Given $\phi(x) = \sin x^2$, proceed as in Problems 1 through 11.

15. Do the same, for $\phi(x) = \sin x^3$.


16. Do the same, for <f>(x)

sin <X:

17. Given
$$\phi(x) = \int_0^{\sin x} \sqrt{1 + t^2}\,dt.$$
On the basis of the theory that you know so far, you are in no position to calculate
$$f(u) = \int_0^u \sqrt{1 + t^2}\,dt.$$
And you have, so far, no general formula for $D[f(g)]$. On the other hand, you ought by this
time to be able to make a good guess about $D[f(g)]$, and then use your guess to write some
kind of formula for
$$\phi'(x) = D\int_0^{\sin x} \sqrt{1 + t^2}\,dt.$$
[Hint: As a start, what is $f'(u)$?]

18. Find $\displaystyle\lim_{x \to \pi/2} \frac{\sin x - 1}{x - \pi/2}$.
[Hint: If you can figure out what the geometric meaning of this limit is, it will then be
easy to find its numerical value.]

19. Find $\displaystyle\lim_{x \to \pi} \frac{\cos x + 1}{x - \pi}$. [Same hint.]

20. Find $\displaystyle\lim_{x \to \pi/4} \frac{\tan x - 1}{x - \pi/4}$.

21. Find $\displaystyle\lim_{x \to \pi/4} \frac{\sin 2x - 1}{x - \pi/4}$.

22. Find $\displaystyle\lim_{x \to 0} \frac{\sec x - 1}{x}$.

4.6 THE CHAIN RULE

You may have observed, in the preceding problem set, that the formula
$$Df(g) = f'(g)g'$$
held in a number of cases. For example, if
$$f(u) = u^n,$$
then
$$f'(u) = nu^{n-1};$$
and
$$Df(g) = Dg^n = ng^{n-1}g' = f'(g)g'.$$
Similarly, if
$$f(u) = \sqrt{u},$$
then
$$f'(u) = \frac{1}{2\sqrt{u}}, \qquad Df(g) = D\sqrt{g} = \frac{1}{2\sqrt{g}}\,g' = f'(g)g'.$$

The same formula seems to hold for
$$f(u) = \sin u,$$
at least in the cases where we can test the formula by calculating $Df(g) = D\sin g$.
For example, it turns out that
$$D\sin 2x = 2\cos 2x,$$
and this has the form
$$Df(g) = f'(g)g',$$
where
$$f(u) = \sin u, \qquad g(x) = 2x, \qquad f'(u) = \cos u, \qquad g'(x) = 2,$$
$$f'(g)g' = (\cos 2x)\,2 = 2\cos 2x.$$
The formula
$$Df(g) = f'(g)g'$$
is called the chain rule. In fact, it always holds, whenever the right-hand side has a
meaning, that is, whenever $f'(g)$ and $g'$ are defined. We shall prove this at the end of
this section. First, we give some illustrations of its use.
Example 1. Consider
$$\phi(x) = \sin(3x + 1).$$
This is a composite function
$$\phi(x) = f(g(x)),$$
with
$$f(u) = \sin u, \qquad f'(u) = \cos u, \qquad g(x) = 3x + 1, \qquad g'(x) = 3.$$
By the chain rule,
$$\phi'(x) = D\sin(3x + 1) = [\cos(3x + 1)]D(3x + 1) = 3\cos(3x + 1).$$

Example 2. Consider
$$\phi(x) = \sin(k + x).$$
By the chain rule,
$$\phi'(x) = [\cos(k + x)]D(k + x) = \cos(k + x).$$
Note that if the chain rule is known, and the formula
$$D\sin = \cos$$
is known, we can find $D\sin(k + x)$ without using the addition formula.

To give new applications of the chain rule, we should not be talking about cases
where the outside function $f$ is $u^n$ or $\sqrt{u}$. For these outside functions, we have known
for a long time that the chain rule held. After the trigonometric functions, the next
outside functions to consider are integrals:

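Example 3. Consider
$$\phi(x) = \int_1^{kx} \frac{1}{t}\,dt \quad (k, x > 0).$$
Here
$$\phi(x) = f(g(x)), \qquad f(u) = \int_1^u \frac{1}{t}\,dt \quad (u > 0), \qquad g(x) = kx,$$
$$f'(u) = \frac{1}{u}, \qquad g'(x) = k, \qquad f'(g(x)) = \frac{1}{kx},$$
$$D\phi = Df(g) = f'(g)g' = \frac{1}{kx}\,k = \frac{1}{x}.$$
This is a curious result:
$$D\int_1^{kx} \frac{1}{t}\,dt = D\int_1^{x} \frac{1}{t}\,dt.$$
What does it tell us about the functions?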
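The result of Example 3 can be checked numerically. The sketch below is only an illustration under stated assumptions: it approximates the integral by a midpoint rule and the derivative by a central difference, and the helper names, step counts, and sample point x = 3 are choices made for this example rather than anything taken from the text.

```python
def integral_of_reciprocal(upper, steps=20000):
    """Midpoint-rule approximation of the integral of 1/t from 1 to upper (upper > 0)."""
    width = (upper - 1.0) / steps
    return sum(width / (1.0 + (i + 0.5) * width) for i in range(steps))

def phi(x, k):
    """phi(x) = integral of 1/t from 1 to k*x, as in Example 3."""
    return integral_of_reciprocal(k * x)

def central_difference(func, x, h=1e-4):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# For several values of k, the slope of phi at x = 3 should be close to 1/3.
for k in (0.5, 2.0, 7.0):
    slope = central_difference(lambda x: phi(x, k), 3.0)
    print(f"k={k}: approximate derivative at x=3 is {slope:.6f}")
```

Whatever value of k is used, the approximate slope stays close to 1/3, in agreement with the formula Dφ = 1/x.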


Example 4. The chain rule can be applied several times in the same problem. For
example, we know that
$$D\sin g = (\cos g)g',$$
whatever $g$ may be. We can then apply the formula in cases where $g'$ itself needs to
be calculated by the chain rule:
$$D\sin\sin x = (\cos\sin x)D\sin x = (\cos\sin x)\cos x.$$
Here $\sin\sin x$ is the sine of the sine of $x$, which is different from $\sin^2 x$. Therefore
$$D\sin\sin\sin x = (\cos\sin\sin x)D\sin\sin x = (\cos\sin\sin x)(\cos\sin x)\cos x.$$
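Repeated use of the chain rule, as in Example 4, amounts to multiplying one cosine factor per layer. The following Python sketch (the function names are illustrative, not from the text) computes the n-fold nested sine and its derivative in this way.

```python
import math

def nested_sine(x, n):
    """Apply the sine n times: sin(sin(...sin(x)...))."""
    for _ in range(n):
        x = math.sin(x)
    return x

def nested_sine_derivative(x, n):
    """Derivative of the n-fold nested sine, built up by repeated use of the chain rule.

    Each layer contributes a factor cos(inner), where inner is the value
    fed into that layer's sine.
    """
    derivative = 1.0
    inner = x
    for _ in range(n):
        derivative *= math.cos(inner)
        inner = math.sin(inner)
    return derivative

x = 0.5
explicit = math.cos(math.sin(math.sin(x))) * math.cos(math.sin(x)) * math.cos(x)
print(nested_sine_derivative(x, 3), explicit)   # the two values should agree
```

For n = 3 the loop multiplies cos x, cos sin x, and cos sin sin x, which are exactly the three factors in the formula above.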


Example 5. Similarly,
$$D\{[(x^3 + 1)^2 + 1]^2 + 1\}^3 = 3\{[(x^3 + 1)^2 + 1]^2 + 1\}^2\, D\{[(x^3 + 1)^2 + 1]^2 + 1\}$$
$$= 3\{\ \}^2 \cdot 2[(x^3 + 1)^2 + 1]\, D[(x^3 + 1)^2 + 1]$$
$$= 3\{\ \}^2 \cdot 2[\ ] \cdot 2(x^3 + 1)\, D(x^3 + 1)$$
$$= 3\{\ \}^2 \cdot 2[\ ] \cdot 2(\ ) \cdot 3x^2.$$
Here we have left braces, brackets, and parentheses empty, in the intermediate
stages, to make the steps easier to follow. The final answer is
$$3\{[(x^3 + 1)^2 + 1]^2 + 1\}^2 \cdot 2[(x^3 + 1)^2 + 1] \cdot 2(x^3 + 1) \cdot 3x^2,$$
which can be simplified slightly by collecting constants.

We shall now prove the chain rule. Given
$$\phi(x) = f(g(x)),$$
we want to show that for each $x_0$ we have
$$\phi'(x_0) = f'(g(x_0))g'(x_0).$$
Obviously, we must assume that

a) $g$ has a derivative $g'(x_0)$ at $x_0$, and

b) $f$ has a derivative $f'(g(x_0))$ at $g(x_0)$.

A differentiable function is continuous. Therefore
$$\lim_{x \to x_0} g(x) = g(x_0).$$
For convenience of reference later, we write this in the form

c) $\displaystyle\lim_{\Delta x \to 0} [g(x_0 + \Delta x) - g(x_0)] = 0.$

By definition,
$$\phi'(x_0) = \lim_{x \to x_0} \frac{\phi(x) - \phi(x_0)}{x - x_0} = \lim_{\Delta x \to 0} \frac{\phi(x_0 + \Delta x) - \phi(x_0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(g(x_0 + \Delta x)) - f(g(x_0))}{\Delta x}.$$
Let
$$u_0 = g(x_0), \qquad \Delta u = g(x_0 + \Delta x) - g(x_0) = g(x_0 + \Delta x) - u_0,$$
so that
$$g(x_0 + \Delta x) = u_0 + \Delta u.$$
Then
$$\phi'(x_0) = \lim_{\Delta x \to 0} \frac{f(u_0 + \Delta u) - f(u_0)}{\Delta x}.$$
Here the numerator is a difference
$$\Delta f = f(u_0 + \Delta u) - f(u_0)$$
between two values of the function $f$.

Now comes the crucial idea: we apply to $f$ the theorem stated at the end of
Section 4.4. We need to change the notation of the theorem, using $u_0$ and $\Delta u$ in place of
$x_0$ and $\Delta x$, to fit the notation of the present discussion. The theorem then says that
there is a function $E$, defined in a neighborhood of 0, such that
$$\Delta f = f'(u_0)\,\Delta u + E(\Delta u)\,\Delta u,$$
and
$$\lim_{\Delta u \to 0} E(\Delta u) = E(0) = 0.$$

Therefore
$$\frac{\Delta f}{\Delta x} = f'(u_0)\,\frac{\Delta u}{\Delta x} + E(\Delta u)\,\frac{\Delta u}{\Delta x} = f'(g(x_0))\,\frac{g(x_0 + \Delta x) - g(x_0)}{\Delta x} + E(\Delta u)\,\frac{g(x_0 + \Delta x) - g(x_0)}{\Delta x}.$$
It is now easy to see what the limit is. By definition of $g'(x_0)$, we have
$$\lim_{\Delta x \to 0} \frac{g(x_0 + \Delta x) - g(x_0)}{\Delta x} = g'(x_0).$$
As $\Delta x \to 0$, $\Delta u \to 0$. (Remember the definition of $\Delta u$, and recall condition (c),
at the beginning of the proof.) Therefore, by Theorem 1 of Section 4.5, we have
$$\lim_{\Delta x \to 0} E(\Delta u) = 0.$$
This gives
$$\phi'(x_0) = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = f'(g(x_0))g'(x_0) + 0 \cdot g'(x_0) = f'(g(x_0))g'(x_0).$$

We therefore have:

Theorem 1. Let $f$ and $g$ be functions. Then
$$Df(g) = f'(g)g',$$
at every point $x_0$ at which the right-hand member has a meaning.

That is, the formula holds at every point $x_0$ such that (a) $g$ is differentiable at
$x_0$ and (b) $f$ is differentiable at $g(x_0)$. These conditions illustrate the normal pattern
of theorems involving differentiation formulas: the equation holds whenever the
quantities mentioned in the right-hand member exist.
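The chain rule can be compared with a direct numerical estimate of the derivative. The sketch below takes the function sin(3x + 1) of Example 1 and checks f'(g(x0))g'(x0) against a central difference quotient; the helper names, the step h, and the sample point x0 = 0.7 are assumptions made for this illustration.

```python
import math

def central_difference(phi, x0, h=1e-6):
    """Symmetric difference quotient approximating phi'(x0)."""
    return (phi(x0 + h) - phi(x0 - h)) / (2 * h)

def chain_rule(fprime, g, gprime, x0):
    """Value of f'(g(x0)) * g'(x0), the right-hand member of the chain rule."""
    return fprime(g(x0)) * gprime(x0)

# phi(x) = sin(3x + 1): outside function sin, inside function 3x + 1.
g = lambda x: 3 * x + 1
gprime = lambda x: 3.0
phi = lambda x: math.sin(g(x))

x0 = 0.7
numeric = central_difference(phi, x0)
exact = chain_rule(math.cos, g, gprime, x0)   # 3 cos(3 * 0.7 + 1)
print(numeric, exact)
```

The two printed values should agree to several decimal places; the small discrepancy comes from the difference quotient, not from the chain rule.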
PROBLEM SET 4.6

In this problem set, your main job is to learn to use the chain rule. In each odd-numbered
problem, from 1 to 19, you should indicate the logic of your work by writing formulas for
$f$, $g$, $f'$, $f'(g)$, and $g'$, before writing the answer in the form $D[f(g)] = f'(g)g'$. For example,
given the function
$$\phi(x) = \sin(x^2 + 1),$$
your solution should be written in the form
$$f(u) = \sin u, \qquad g(x) = x^2 + 1,$$
$$f'(u) = \cos u, \qquad f'(g(x)) = \cos(x^2 + 1),$$
$$g'(x) = 2x, \qquad \phi'(x) = [\cos(x^2 + 1)]2x = 2x\cos(x^2 + 1).$$
If you go through this routine for one day, you are less likely hereafter to omit the factor
$g'$ following $f'(g)$ in calculating $Df(g)$.

The parentheses and brackets in the expression $[\cos(x^2 + 1)]2x$ look clumsy, but to
eliminate the brackets we have to change the order of the factors, as in the last expression
above. It would have been simpler to write
$$(?) \qquad \phi'(x) = \cos(x^2 + 1)2x \qquad (?)$$
but this is the wrong answer: the function on the right is the function whose value, for each
$x$, is the cosine of $2x^3 + 2x$. If you write this formula for $\phi'$, you are relying on the reader to
remember what the problem was and to realize that you must not mean what you are saying.
In some cases you may not feel sure whether brackets are necessary. When in doubt,
use them.

1.

7.

12.

16.

20.
24.
27.
30.
34.
37.

Now find the derivatives of the following functions:

2.

sinx2

8.

sin(x3 + x)
tanx -

sec 2x
sec

vx2

a) sin

(vx)2

b) tanx2

a) tan2 x

3-

21.

b) \sinx

(I

,i:x

[x[2

cos4x - sin4x

cos

vx2

sin sinx

cos:c

f,k:J; -1 dt

5.

tan(t2 +

10.

,1 cosx

15.

t3Jt

f,x 1
- dt
t

1)

I I.

a) sec2 x

18.

22.

6.

19.

x
tan a) '\!tanx

23.

-1
2

tan t2 +

x
tan --

b) secx2

cos 2x

26.
29.
32.

sin cosx

35. L

+ sinx)

cos3x

cos

14.

17.

25.
28.
31.

,,--

4.

9.

sin x cos2x + sin3

cos cosx
sin

cosx3

sinx3 +x

13.

3.

sin2 x

cos2

- sin2

sinx

+ cosx
b) tan ,;:;.:

33.

sin2 sinx + cos2 sinx


sin sin sin

36.

ix'
f,k - dt.
0

cos f

dt

tan sinx

Let k be any positive number; and for each positive number x, let
(x)

Find the simplest possible formula for '(x). Then do the same, for the functions (x)

38.

f,x' - dt
1

(x >

.t

39.

0)

lx3dt-3 xdt
J t
t
sin 1
- dt (0 < <
44. J
t
45. For each x > 0, let
41.

J,"'' 1- dt - 2 Jx 1- dt

defined by the following formulas.

42. J\;dt
t

TT

f(x)

/(ab)

x3 1
40. l - dt
t
V;; 1
1 "' 1
43. J - dt - - dt
f
2J t
1

=f,x dt.
t
1

Show that for every pair of positive numbers $a$ and $b$, we have
$$f(ab) = f(a) + f(b).$$
[Hint: When we try to attack this problem by the methods of calculus, the obvious
trouble is that the problem does not appear to involve any functions. Therefore our
first step should be to introduce a function into the problem.]

46. Let $\phi(x) = f(x^n)$, where $x > 0$ and $f$ is as in the preceding problem. Find $\phi'(x)$.

*47. Given
$$D\sin = \cos, \qquad D\cos = -\sin, \qquad \sin 0 = 0, \qquad \cos 0 = 1,$$
and given no other information whatever about the sine and cosine, prove that
$$\sin(k + x) = \sin k\cos x + \cos k\sin x,$$
$$\cos(k + x) = \cos k\cos x - \sin k\sin x,$$
for every $k$ and $x$. [Hint: Let $f$ be the function which is $= 0$ if the first equation holds;
let $g$ be the function which is $= 0$ if the second equation holds, and investigate the
function $F = f^2 + g^2$.]

This result tends to confirm a claim that was made in Problem *27 of Problem Set
4.3. The claim was that all properties of the sine and cosine are contained, implicitly,
in the properties that we have just used to prove the addition formulas. Later we shall
find further confirmation of this.

*48. Let $f$ be a function, defined for every $x$, such that

(a) $f'' = -f$, (b) $f(0) = 0$, (c) $f'(0) = 1$.

Show that $f(x) = \sin x$ for every $x$.

*49. Let $g$ be a function, defined for every $x$, such that

(a) $g'' = -g$, (b) $g(0) = 1$, (c) $g'(0) = 0$.

Show that $g(x) = \cos x$ for every $x$.

4.7 INVERTIBLE FUNCTIONS. THE INVERSE TRIGONOMETRIC FUNCTIONS

A function $f$ is called invertible if its graph intersects every horizontal line in at most
one point. Thus $f(x) = x^3$ is invertible, but $f(x) = x^2$ is not.

[Figure: the graphs of $y = f(x) = x^3$ and $y = f(x) = x^2$.]

If $f$ is invertible, then for each number $y$ in the image of $f$ there is exactly one number
$x$ in the domain of $f$ such that $f(x) = y$.
Thus to every invertible function $f$ there corresponds a new function $f^{-1}$, called
the inverse of $f$. (This is pronounced $f$ inverse. The symbol $-1$ is not an exponent,
really; and $f^{-1}$ is not $1/f$.) The inverse is defined by the condition that
$$f^{-1}(x) = y \quad \text{if } f(y) = x.$$

If $f$ is invertible, this condition defines a function, because for each $x$ in the image
of $f$ there is exactly one such $y$. It is not hard to see what this relation between $f$ and
$f^{-1}$ means geometrically. The point $(x, y)$ is on the graph of $f^{-1}$ if the point $(y, x)$
is on the graph of $f$. Therefore, to get the graph of $f^{-1}$ from the graph of $f$, we should
reflect the graph of $f$ across the line $y = x$.

Let us see what this means algebraically. Consider
$$f(x) = x^3.$$
The graph of $f$ is the graph of the equation
$$y = x^3. \tag{1}$$
The graph of $f^{-1}$ is the graph of the equation
$$x = y^3. \tag{2}$$
Here we have simply interchanged $x$ and $y$ in Eq. (1). Now (2) is equivalent to
$$y = \sqrt[3]{x}.$$
Thus
$$f^{-1}(x) = \sqrt[3]{x},$$
as we would expect: the inverse of cubing is the extraction of cube roots.

Theorem 1. Let $f$ be an invertible function. Then
$$f(f^{-1}(x)) = x,$$
for every $x$.

Proof. For each $x$, let $y = f^{-1}(x)$. Then $f(y) = x$, by definition of $f^{-1}$. Therefore,
$$f(f^{-1}(x)) = f(y) = x.$$

We can use this idea to calculate the derivatives of inverse functions, assuming
that the inverse function has a derivative.

Example 1. The function $f(x) = x^3$ is invertible, and its inverse is $f^{-1}(x) = \sqrt[3]{x}$.
Thus
$$(\sqrt[3]{x})^3 = x.$$
We take the derivative on each side of this equation, using the chain rule for the
composite function on the left. This gives:
$$3(\sqrt[3]{x})^2\, D\sqrt[3]{x} = 1,$$
$$D\sqrt[3]{x} = \frac{1}{3\sqrt[3]{x^2}} \quad (x \ne 0).$$
You may have calculated this by another method, in Problem Set 3.6, but the present
method is easier.
Example 2. A function of the form $f(x) = x^q$ (where $q$ is a positive integer) is
not necessarily invertible; in fact, it never is when $q$ is even. We therefore restrict $x$
to positive values. This gives an inverse function
$$f^{-1}(x) = \sqrt[q]{x}.$$
We calculate $D\sqrt[q]{x}$ in three steps, as follows:

1) $(\sqrt[q]{x})^q = x$,

2) $q(\sqrt[q]{x})^{q-1}\, D\sqrt[q]{x} = 1$,

3) $D\sqrt[q]{x} = \dfrac{1}{q\sqrt[q]{x^{q-1}}} \quad (x > 0)$.

When we use this method, the equations that we write have the following general
form:

1) $f(f^{-1}(x)) = x$,

2) $f'(f^{-1}(x))\, Df^{-1}(x) = 1$,

3) $Df^{-1}(x) = \dfrac{1}{f'(f^{-1}(x))} \quad (f'(f^{-1}(x)) \ne 0)$.

(You should check this against the preceding examples.) The method assumes that
our problem has an answer, that is, that $f^{-1}$ has a derivative. Thus we need to show
that this holds, in every case in which the fraction at the last stage has a meaning.
This is easy to see. Consider $f$, $f^{-1}$, as in the figure below, with
$$y_1 = f^{-1}(x_1), \qquad x_1 = f(y_1),$$
as the labels indicate.

[Figure: the graphs of $f$ and $f^{-1}$, with the tangent lines $L$ and $L'$.]

If $f$ has a tangent line $L$, at $(y_1, x_1)$, then $f^{-1}$ has a tangent line $L'$, at $(x_1, y_1)$: to get
this, we reflect both the graph and the tangent line across the line $y = x$. The slope
of $L$ is
$$m = f'(y_1) = f'(f^{-1}(x_1)).$$
If $m \ne 0$, then $L$ is not horizontal. Therefore $L'$ is not vertical, and $f^{-1}$ has a derivative
at $x_1$. Thus we have completed the proof of the following theorem.

Theorem 2.
$$Df^{-1}(x) = \frac{1}{f'(f^{-1}(x))},$$
wherever the fraction on the right has a meaning.
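Theorem 2 can be tried out on the cube-root function of Example 1. The following Python sketch is an illustration only; the helper names and the sample point x = 8 are assumptions made for the example.

```python
import math

def inverse_derivative(fprime, finverse, x):
    """Value of 1 / f'(f^{-1}(x)), the formula of Theorem 2."""
    return 1.0 / fprime(finverse(x))

def cube_root(x):
    """Real cube root, the inverse of cubing (valid for negative x as well)."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def cube_prime(y):
    """Derivative of f(y) = y^3."""
    return 3.0 * y * y

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 8.0
by_theorem = inverse_derivative(cube_prime, cube_root, x)   # 1 / (3 * 2^2)
by_quotient = central_difference(cube_root, x)
print(by_theorem, by_quotient)
```

Both values should be close to 1/12, which is also what the formula $1/(3\sqrt[3]{x^2})$ of Example 1 gives at x = 8.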


In most cases, the method used in deriving this formula is easier to use than the
formula itself. To find $Df^{-1}$, we write $f(f^{-1}(x)) = x$, differentiate, and solve for
$Df^{-1}$, as in Examples 1 and 2.

We shall now discuss the so-called "inverse trigonometric functions." This
involves a slight difficulty, because the fact is that no trigonometric function is
invertible. The reason is that every trigonometric function satisfies the identity
$$f(x + 2\pi) = f(x),$$
for every $x$ for which the trigonometric function $f(x)$ is defined at all. Therefore
every value that a trigonometric function takes on at all is taken on for infinitely
many values of $x$. For example, the graph of $f(x) = \sin x$ looks something like this:

[Figure: the graph of $y = f(x) = \sin x$.]
If we restrict $x$ to the interval $[-\pi/2, \pi/2]$, then we get a new function whose graph
includes some, but not all, of the original graph. This new function is denoted by
Sin, and the graph of $y = \text{Sin}\, x$ looks like the left-hand figure below.

[Figure: left, the graph of $y = \text{Sin}\, x$; right, a semicircle illustrating the definition of the sine.]

The graph looks as if Sin ought to be invertible; and in fact this is not hard to see.
In the right-hand figure above, we have switched the notation to fit the definition of
the sine, so that $y = \sin\theta$. Every point of the semicircle corresponds to exactly one
$\theta$ on the interval $[-\pi/2, \pi/2]$; and every horizontal line intersects the semicircle in
exactly one point.

As always for inverse functions, we get the graph of $\text{Sin}^{-1}$ by reflecting the graph
of Sin across the line $y = x$. Therefore the graph of $\text{Sin}^{-1}$ looks like this:

[Figure: the graph of $y = \text{Sin}^{-1} x$.]
Similarly, we define $\text{Cos}\, x$ to be equal to $\cos x$, on the interval $[0, \pi]$, and we
show that Cos is invertible. The graphs of Cos and $\text{Cos}^{-1}$ look like this:

[Figure: the graphs of Cos and $\text{Cos}^{-1}$.]

To find the derivative of $\text{Sin}^{-1}$, we write
$$\sin \text{Sin}^{-1} x = x,$$
$$(\cos \text{Sin}^{-1} x)\, D\,\text{Sin}^{-1} x = 1,$$
$$D\,\text{Sin}^{-1} x = \frac{1}{\cos \text{Sin}^{-1} x}.$$
We want to simplify the expression $\cos \text{Sin}^{-1} x$ on the right, and, while we are at it,
we shall get a formula for $\sin \text{Cos}^{-1} x$.

Since
$$\cos^2 u + \sin^2 u = 1,$$
we can now solve, getting
$$\cos u = \pm\sqrt{1 - \sin^2 u}, \tag{1}$$
$$\sin u = \pm\sqrt{1 - \cos^2 u}. \tag{2}$$
For
$$u = \text{Sin}^{-1} x,$$
this gives
$$\sin u = \sin \text{Sin}^{-1} x = x,$$
and so from (1) we get
$$\cos \text{Sin}^{-1} x = \pm\sqrt{1 - x^2}. \tag{3}$$
Similarly, for
$$u = \text{Cos}^{-1} x$$
we have
$$\cos u = \cos \text{Cos}^{-1} x = x,$$
and so from (2) we get
$$\sin \text{Cos}^{-1} x = \pm\sqrt{1 - x^2}. \tag{4}$$
Formulas (3) and (4) are correct, but they are not good enough for our purposes.
In fact, the double signs can be omitted, and the formulas still hold:

Theorem 3.
$$\cos \text{Sin}^{-1} x = \sqrt{1 - x^2},$$
$$\sin \text{Cos}^{-1} x = \sqrt{1 - x^2}.$$

To see this, we merely need to remember that
$$-\frac{\pi}{2} \le \text{Sin}^{-1} x \le \frac{\pi}{2}.$$
On this interval, the cosine is $\ge 0$. Therefore, in (3), it must be the plus sign that
applies. Similarly,
$$0 \le \text{Cos}^{-1} x \le \pi.$$
On this interval, the sine is $\ge 0$. Therefore, in (4), it must be the plus sign that
applies.
We now substitute $\sqrt{1 - x^2}$ for $\cos \text{Sin}^{-1} x$, in the formula that we got for
$D\,\text{Sin}^{-1} x$. This gives:

Theorem 4.
$$D\,\text{Sin}^{-1} x = \frac{1}{\sqrt{1 - x^2}} \quad (-1 < x < 1).$$

Note that $D\,\text{Sin}^{-1} x$ is always $> 0$, just as the graph suggests that it ought to be.
At the endpoints of the graph, the tangents are vertical.
The proof of the following theorem is like that of the preceding one:

Theorem 5.
$$D\,\text{Cos}^{-1} x = \frac{-1}{\sqrt{1 - x^2}} \quad (-1 < x < 1).$$

Note that $D\,\text{Cos}^{-1} x$ is always $< 0$, as it should be.
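Theorems 3, 4, and 5 can be spot-checked numerically. The sketch below assumes that Python's `math.asin` and `math.acos` play the roles of Sin⁻¹ and Cos⁻¹ (they return values in [-π/2, π/2] and [0, π] respectively); the helper name and the sample points are choices made for the illustration.

```python
import math

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (-0.5, 0.0, 0.5):
    root = math.sqrt(1.0 - x * x)
    # Theorem 3: both expressions should equal sqrt(1 - x^2), with the plus sign.
    print(x, math.cos(math.asin(x)), math.sin(math.acos(x)), root)
    # Theorems 4 and 5: the derivatives should be +1/root and -1/root.
    print(x, central_difference(math.asin, x), 1.0 / root)
    print(x, central_difference(math.acos, x), -1.0 / root)
```

The sample points are kept away from the endpoints x = -1 and x = 1, where the tangents are vertical and the difference quotients would not settle down.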


For $\tan x$, the process is simpler. The graph of $y = f(x) = \tan x$ looks something
like this:

[Figure: the graph of $y = \tan x$.]

To get an invertible function Tan, we take the portion of the graph that lies between
$x = -\pi/2$ and $x = \pi/2$. We could verify by brute force that Tan is invertible, but
it is easier to prove first the following theorem:

Theorem 6. Let $f$ be a differentiable function on an interval $I$. If $f'(x) \ne 0$ for every
$x$ in $I$, then $f$ is invertible.

The proof is based on the mean-value theorem. If $f$ is not invertible, then the
graph intersects some horizontal line in more than one point. Thus
$$f(a) = f(b),$$
for some $a$ and $b$ in $I$, with $a \ne b$. Therefore the graph has a horizontal chord. By MVT, this
means that the graph has a horizontal tangent; that is, $f'(x) = 0$ for some $x$, which
contradicts the hypothesis for $f$.

Now the domain of Tan is an open interval $(-\pi/2, \pi/2)$. On this interval,
$$\text{Tan}'\, x = \sec^2 x > 0.$$
Therefore Tan is invertible.

The graphs of Tan and $\operatorname{Tan}^{-1}$ look like this:

[Figure: graphs of y = Tan x and y = Tan^{-1} x]

Theorem 7.

\[ D \operatorname{Tan}^{-1} x = \frac{1}{1 + x^2}. \]

The derivation is easier than the preceding ones, because it turns out that there
are no double signs to be eliminated.
For the secant, the situation is trickier, and some handbooks contain formulas
that are wrong.

The reason is that the graph of the secant looks like this:

[Figure: graph of y = sec x = 1/cos x]

(Remember that $\sec x = 1/\cos x$ wherever $\cos x \ne 0$.) This graph consists of infinitely many connected pieces, but none of these connected pieces is the graph of an invertible function. We therefore cannot use all of any one of the pieces. Everybody agrees that we ought to use the part of the graph where $0 \le x < \pi/2$, but there is no general agreement on what else we ought to use. To be safe, we define Sec x only for $0 \le x < \pi/2$. (See the graphs below.)

[Figure: graph of y = Sec x]

(Query: How do you know that the secant never takes on the same value twice, on the interval $[0, \pi/2)$?)

On the basis of the definition of $\operatorname{Sec}^{-1}$, it is plain that the equation

\[ y = \operatorname{Sec}^{-1} x \]

means two things:

\[ x = \sec y \tag{5} \]

and

\[ 0 \le y < \frac{\pi}{2}. \tag{6} \]

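We now calculate the derivative. We have

\[ \operatorname{Sec} \operatorname{Sec}^{-1} x = x, \tag{7} \]

and

\[ \operatorname{Sec}' u = \operatorname{Sec} u \operatorname{Tan} u, \]

for every $u$ from 0 to $\pi/2$. (Why are we justified in using capital letters on the right?) Therefore, by the chain rule,

\[ (\operatorname{Sec} \operatorname{Sec}^{-1} x)(\operatorname{Tan} \operatorname{Sec}^{-1} x)\, D \operatorname{Sec}^{-1} x = 1, \]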


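and

\[ x (\operatorname{Tan} \operatorname{Sec}^{-1} x)\, D \operatorname{Sec}^{-1} x = 1. \tag{8} \]

Therefore

\[ D \operatorname{Sec}^{-1} x = \frac{1}{x \operatorname{Tan} \operatorname{Sec}^{-1} x}. \tag{9} \]

We now need a formula for $\operatorname{Tan} \operatorname{Sec}^{-1} x$, analogous to the formulas for $\sin \operatorname{Cos}^{-1} x$ and $\cos \operatorname{Sin}^{-1} x$. We know that

\[ 1 + \tan^2 u = \sec^2 u \]

for every $u$. Therefore

\[ \tan u = \pm\sqrt{\sec^2 u - 1}. \]

For $u = \operatorname{Sec}^{-1} x$, this says that

\[ \tan \operatorname{Sec}^{-1} x = \pm\sqrt{\sec^2 \operatorname{Sec}^{-1} x - 1} = \pm\sqrt{x^2 - 1}. \]

Since $0 \le \operatorname{Sec}^{-1} x < \pi/2$, we have

\[ \tan \operatorname{Sec}^{-1} x = \operatorname{Tan} \operatorname{Sec}^{-1} x \ge 0, \]

and so it must be the plus sign that applies on the right. Therefore

\[ \operatorname{Tan} \operatorname{Sec}^{-1} x = \sqrt{x^2 - 1}, \]

and we have:

Theorem 8.

\[ D \operatorname{Sec}^{-1} x = \frac{1}{x\sqrt{x^2 - 1}}. \]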

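For convenience of reference, we repeat these differentiation formulas:

\[ D \operatorname{Sin}^{-1} x = \frac{1}{\sqrt{1 - x^2}}, \qquad D \operatorname{Cos}^{-1} x = -\frac{1}{\sqrt{1 - x^2}}, \]

\[ D \operatorname{Tan}^{-1} x = \frac{1}{1 + x^2}, \qquad D \operatorname{Sec}^{-1} x = \frac{1}{x\sqrt{x^2 - 1}}. \]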

We now have a new set of functions arising as derivatives: none of these four functions has appeared before as a result of differentiation. This means, for one thing, that we can use our new functions to solve certain area problems that we couldn't solve before. Later we shall see that the process by which we find a function whose derivative is a given function has many other applications.

You will also need to remember

\[ \cos \operatorname{Sin}^{-1} x = \sqrt{1 - x^2}, \qquad \sin \operatorname{Cos}^{-1} x = \sqrt{1 - x^2}. \]
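The four differentiation formulas can be spot-checked numerically. The following is only a rough sketch, not part of the theory: it compares a symmetric difference quotient with each closed-form derivative at one sample point. The helper names `central_difference`, `arcsec`, and `max_discrepancy` are illustrative, and `arcsec` assumes that Sec^{-1} x may be computed as Cos^{-1}(1/x) for x >= 1.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for D f(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def arcsec(x):
    # Assumption: for x >= 1 we compute Sec^-1 x as Cos^-1 (1/x),
    # which lies in [0, pi/2) as the definition of Sec requires.
    return math.acos(1.0 / x)


# name -> (function, derivative formula from Theorems 4, 5, 7, 8, sample point)
CASES = {
    "Sin^-1": (math.asin, lambda x: 1 / math.sqrt(1 - x * x), 0.3),
    "Cos^-1": (math.acos, lambda x: -1 / math.sqrt(1 - x * x), 0.3),
    "Tan^-1": (math.atan, lambda x: 1 / (1 + x * x), 2.0),
    "Sec^-1": (arcsec, lambda x: 1 / (x * math.sqrt(x * x - 1)), 2.0),
}


def max_discrepancy():
    """Largest gap between the difference quotient and the formula."""
    return max(
        abs(central_difference(f, x) - formula(x))
        for f, formula, x in CASES.values()
    )


for name, (f, formula, x) in CASES.items():
    print(name, central_difference(f, x), formula(x))
```

The two printed columns should agree to several decimal places; agreement at a sample point is of course evidence, not a proof.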


PROBLEM SET 4.7

For each of the following functions, calculate the derivative.


2. Cos-1 (x - 1)

3. Tan-1 (x + I)

4. Sec-1 (x + 1)

5. Sin Sin-1 (x + 1)

6. Cos Sin-1 x

7. Sin Sin-1 x2

8. Cos Sin-1 (2x)

9. Sin-1

1. Sin-1 (x

10. Cos-1

1)

Vl

- x2

13. Sec-1 x2
I
16.

Sec-1

19. Sec Tan-1 x

\1 I

11. Tan-1 (x2 + I)

12. Tan-1 (sec2 x

14. See1 (I + tan2 x)

15. Tan-1

1
17. Cos-1

18. Sin-1x

20. Tan Sec-1 x

21. Sin Tan-1 x

x2
-

v1 - x2

J)

22. Cos Tan-1 x

23. Tan Sin--1 x

24. Tan Cot-1 x

25. Sin-1 (2x + 1)

26. Tan-1 (I - x)

27. Sin-1 x2

28. Show that

\[ \operatorname{Sec}^{-1} x = \operatorname{Cos}^{-1} \frac{1}{x}, \]

for every x on a certain interval. What interval?


29. Show that

\[ \operatorname{Sin}^{-1} x + \operatorname{Cos}^{-1} x = \frac{\pi}{2} \]

for every x on the interval $[-1, 1]$. (A very short proof is possible. Remember the uniqueness theorem of Section 3.8.)

30. Find

f -1 - ? dt.
+ rI

Sketch.

33. Find

-1

32. Find

f/2
0

VI

t2

dt.

'/

31. Find

2dt.
l-+ t
1v2 r
I

v1

t2

Sketch.

dt.

34. Try to get the right answer for the area under the graph of $y = 1/(1 + x^2)$ on the whole interval $(-\infty, \infty)$. You need not justify your answer, so long as it is right.

35. Given
0 x l,

find a formula for f-1(x). Then explain how your answer might have been predicted
without a calculation.
36. Find

(2
1
dt.
J 21v3 iVi2 1
11v3 t
dt.
1 --v t2 + 1
-

38. Find

37. Find

J xVx2
1

_
_
_
_

dx.

39. In Theorem 6 we required that $f'(x)$ be different from 0 everywhere on the interval $I$. This hypothesis was satisfied by Tan on the open interval $(-\pi/2, \pi/2)$, and so we could conclude that Tan is invertible. But Theorem 6, as it stands, does not apply to Sin on $[-\pi/2, \pi/2]$ or to Sec on $[0, \pi/2)$, because the derivatives of these functions vanish at the endpoints $-\pi/2$, $\pi/2$, and 0. To take care of such cases, we need the following:

Theorem. If $f$ is differentiable on an interval $I$, and $f'(x) \ne 0$ at every interior point of $I$, then $f$ is invertible.

Here by an interior point of $I$ we mean a point of $I$ which is not an endpoint. Reread the proof of Theorem 6 and see whether it proves this more general theorem. If so, say so and explain. If not, furnish whatever additional reasoning is necessary.
40. It might also be convenient to have the following generalized form of the uniqueness theorem (of Section 3.8). Here we require that $F'(x) = G'(x)$ at all interior points of the interval $I$.

Theorem (?). Let $F$ and $G$ be differentiable functions, defined on the same interval $I$, and let $a$ be a point of $I$. If (i) $F(a) = G(a)$ and (ii) $F'(x) = G'(x)$ for every interior point $x$ of $I$, then (iii) $F(x) = G(x)$ for every point $x$ of $I$.

Reexamine the proof of Theorem 2 of Section 3.8, and see whether it proves the more general theorem above. (If not, complete the proof.) Then name a case in which the more general theorem is more convenient to use.

4.8

SIMPSON'S RULE. THE COMPUTATION OF π

In Section 3.8 we developed a method for evaluating definite integrals. To find

\[ \int_a^b f(x)\,dx, \]

where $f$ is continuous, we first set up the function

\[ F(x) = \int_a^x f(t)\,dt. \]

Then $F'(x) = f(x)$, for every $x$. We find another function $G$, such that $G' = f$. Then $F$ and $G$ have the same derivative $f$; and by adding a constant to $G$, we get a function, say $H$, such that $H' = G' = f$ and $H(a) = 0$. Since $F(a) = 0$, we know by the uniqueness theorem that $F(x) = H(x)$ for every $x$. Therefore

\[ \int_a^b f(t)\,dt = H(b). \]

It is possible to write a theorem which sums this up very briefly:

Theorem 1. If $f$ is continuous, and

\[ G'(x) = f(x) \qquad (a \le x \le b), \]

then

\[ \int_a^b f(x)\,dx = G(b) - G(a). \]

Proof. For each $x$, let

\[ F(x) = \int_a^x f(t)\,dt. \]

Then

\[ F'(x) = f(x), \qquad \text{and} \qquad F(a) = 0. \]

Let

\[ H(x) = G(x) - G(a). \]

Then

\[ H'(x) = G'(x) = f(x), \qquad \text{and} \qquad H(a) = 0. \]

Therefore $F(x) = H(x)$ for every $x$, and so

\[ F(b) = H(b). \]

Therefore

\[ \int_a^b f(t)\,dt = G(b) - G(a). \]

The proof reproduces the procedure that we have been using all along. $G$ is the first $G$ that we try, with $G' = f$; and $H$ is the function that we get when we adjust the constant.
But in many cases it is hard to find a known function which has a given function $f$ as its derivative. For example, if we had never heard of tan, Tan, or $\operatorname{Tan}^{-1}$, then we would have had no chance at all of finding a known function $G$ such that

\[ G'(x) = \frac{1}{1 + x^2}. \]

Later, we shall learn more and better methods for attacking such problems. But no method, and no system of methods, works all the time. Therefore we often need to use numerical methods, to calculate definite integrals approximately.

One way is the following. Suppose that we didn't know anything about derivatives, but we needed to find $\int_0^1 (1 - x^3)\,dx$ approximately. We might divide the interval $[0, 1]$ into 10 subintervals of length 0.1, and add the areas of the circumscribed rectangles.

 i     x_i     y_i      a_i
 0     0.0     1.000    0.1000
 1     0.1     0.999    0.0999
 2     0.2     0.992    0.0992
 3     0.3     0.973    0.0973
 4     0.4     0.936    0.0936
 5     0.5     0.875    0.0875
 6     0.6     0.784    0.0784
 7     0.7     0.657    0.0657
 8     0.8     0.488    0.0488
 9     0.9     0.271    0.0271
                        ------
                        0.7975

Here the areas of the ten circumscribed rectangles are the numbers $a_i$, and their total area is 0.7975. This gives

\[ A = \int_0^1 (1 - x^3)\,dx \approx 0.7975 = A_1. \]

The approximation $A \approx A_1$ is not very good: by an easy calculation based on Theorem 1, we get the exact answer

\[ \int_0^1 (1 - x^3)\,dx = 0.7500. \]

We might also have used inscribed rectangles. Their total area would be

\[ A_2 = 0.7975 - 0.1000 = 0.6975. \]

(Why?) The approximation $A \approx A_2$ is not very good either. But their average is considerably better.

\[ A_3 = \tfrac{1}{2}(A_1 + A_2) = 0.7475 \approx 0.7500. \]

The sum $A_3$ has a geometric meaning: it is the sum of the areas of the inscribed trapezoids.

[Figure: inscribed trapezoids under the graph]

Over each of the little intervals, the area of the trapezoid is the average of the areas of the inscribed and circumscribed rectangles; and it is not hard to check that the same is true of the sums. This helps to explain why the approximation $A \approx A_3$ is reasonably good; we have approximated the graph of $f$ by an inscribed broken line, and used the area under the broken line as an approximation of the integral.
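The three sums just described are easy to reproduce by machine. The sketch below is one possible arrangement, with the illustrative name `rectangle_and_trapezoid_sums`; it assumes the function is decreasing on the interval, so that left endpoints give the circumscribed rectangles and right endpoints give the inscribed ones.

```python
def f(x):
    return 1 - x ** 3


def rectangle_and_trapezoid_sums(f, a, b, n):
    """Return (circumscribed, inscribed, trapezoid) sums for a DECREASING f.

    For a decreasing function the left endpoint of each subinterval gives
    the circumscribed rectangle and the right endpoint the inscribed one.
    """
    width = (b - a) / n
    xs = [a + i * width for i in range(n + 1)]
    circumscribed = width * sum(f(x) for x in xs[:-1])
    inscribed = width * sum(f(x) for x in xs[1:])
    trapezoid = (circumscribed + inscribed) / 2
    return circumscribed, inscribed, trapezoid


A1, A2, A3 = rectangle_and_trapezoid_sums(f, 0.0, 1.0, 10)
print(A1, A2, A3)
```

Up to floating-point rounding, the three printed numbers should match the values 0.7975, 0.6975, and 0.7475 found above.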
In practice, however, nobody uses the approximation $A \approx A_3$, because there is another method which gives better results without any extra work. This method is Simpson's rule. The scheme is as follows.

Suppose that we have a function $f$, whose values we can compute, on an interval $[a, b]$. We find a quadratic function

\[ g(x) = Ax^2 + Bx + C \]

which agrees with $f$ at $a$, at $b$, and at the midpoint of $[a, b]$; and we use the approximation

\[ \int_a^b f(x)\,dx \approx \int_a^b g(x)\,dx. \]

Here by a quadratic function we mean a function given by a formula $Ax^2 + Bx + C$. We allow the case $A = 0$, and so the graph may turn out to be a line instead of a parabola. In any case, the integral on the right is easy to calculate: if

\[ G(x) = \frac{A}{3}x^3 + \frac{B}{2}x^2 + Cx, \]

then $G' = g$, and so

\[ \int_a^b g(x)\,dx = G(b) - G(a). \]

In the figure above, the approximation looks good, because the errors on the two halves of $[a, b]$ seem to cancel each other out. Most of the time, we cut $[a, b]$ into a certain number of little intervals $[a_i, a_{i+1}]$; we then use Simpson's rule on each of the little intervals, and add the results.

We shall now develop a shortcut formula for Simpson's rule, in a special case.

Theorem 2. Let $g$ be a quadratic function, and let $k$ be a positive number. Then

\[ \int_{-k}^{k} g(x)\,dx = \frac{k}{3}(y_0 + 4y_1 + y_2), \]

where $y_0 = g(-k)$, $y_1 = g(0)$, and $y_2 = g(k)$.


Before proving that this formula is true, let us first check it, in a simple case, to make sure that it is not absurd. One of the possibilities is that $g(x) = 1$ for every $x$. In this case, the integral on the left is equal to $2k$. Thus our formula says that

\[ 2k = \frac{k}{3}(1 + 4 + 1), \]

which is correct. Any time you wonder whether you have remembered Simpson's rule correctly, you should check by this method; the check uncovers the most common errors in recollection.
We proceed to the proof. We have

\[ g(x) = Ax^2 + Bx + C. \]

Let

\[ G(x) = \frac{A}{3}x^3 + \frac{B}{2}x^2 + Cx, \]

so that $G' = g$. Then

\[ \int_{-k}^{k} g(x)\,dx = G(k) - G(-k) = \tfrac{2}{3}Ak^3 + 2Ck. \]

(The algebra here is straightforward.) We need to express $A$ and $C$ in terms of $y_0$, $y_1$, $y_2$, and $k$. Evidently $C$ is no problem:

\[ C = g(0) = y_1. \]

To find $A$, we use

\[ y_0 = Ak^2 - Bk + C, \qquad y_2 = Ak^2 + Bk + C, \]

\[ y_0 + y_2 = 2Ak^2 + 2C = 2Ak^2 + 2y_1. \]

We can now solve for $A$:

\[ A = \frac{y_0 - 2y_1 + y_2}{2k^2}. \]

Our expressions for $A$ and $C$ now give

\[ \int_{-k}^{k} g(x)\,dx = \tfrac{2}{3}Ak^3 + 2Ck = \frac{k}{3}(y_0 - 2y_1 + y_2) + 2ky_1 = \frac{k}{3}(y_0 + 4y_1 + y_2), \]

which was to be proved.

Let us try Simpson's rule on the function

\[ f(x) = \frac{1}{x + 2}, \qquad -1 \le x \le 1. \]

Here we have

\[ k = 1, \qquad y_0 = 1, \qquad y_1 = \tfrac{1}{2}, \qquad y_2 = \tfrac{1}{3}. \]

[Figure: graph of f(x) = 1/(x + 2) on the interval from -1 to 1]

The rule gives

\[ \int_{-1}^{1} \frac{dx}{x + 2} \approx \tfrac{1}{3}\left(1 + 2 + \tfrac{1}{3}\right) \approx 1.11. \]

Later, we shall find ways to calculate this integral as exactly as we please. It will then turn out that the right answer, correct to four decimal places, is 1.0986. In this case, the approximation is good, in spite of the length of the interval $[-1, 1]$, because the portion of the graph of $f$ that we are dealing with is very close to its approximating parabola.

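Let us now try

\[ f(x) = \frac{1}{1 + x^2}, \qquad -1 \le x \le 1. \]

Here we have

\[ k = 1, \qquad y_0 = \tfrac{1}{2}, \qquad y_1 = 1, \qquad y_2 = \tfrac{1}{2}. \]

The rule gives

\[ \int_{-1}^{1} \frac{dx}{1 + x^2} \approx \tfrac{1}{3}\left(\tfrac{1}{2} + 4 + \tfrac{1}{2}\right) \approx 1.67. \]

Since

\[ D \operatorname{Tan}^{-1} x = \frac{1}{1 + x^2}, \]

the right answer is

\[ \int_{-1}^{1} \frac{dx}{1 + x^2} = \operatorname{Tan}^{-1} 1 - \operatorname{Tan}^{-1}(-1) = \frac{\pi}{2} \approx 1.57. \]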
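The three-point form of the rule is short enough to state as a function. This is a minimal sketch with the illustrative name `simpson_three_point`; it evaluates $(k/3)(y_0 + 4y_1 + y_2)$ on a single interval $[a, b]$ with $k = (b - a)/2$.

```python
def simpson_three_point(f, a, b):
    """Simpson's rule on one interval [a, b], using a, the midpoint, and b."""
    k = (b - a) / 2
    y0, y1, y2 = f(a), f(a + k), f(b)
    return k / 3 * (y0 + 4 * y1 + y2)


# The two examples worked in the text.
first = simpson_three_point(lambda x: 1 / (x + 2), -1.0, 1.0)
second = simpson_three_point(lambda x: 1 / (1 + x * x), -1.0, 1.0)
print(round(first, 2), round(second, 2))

# For a quadratic the rule is exact (Theorem 2): here the integral is 36.
quadratic = simpson_three_point(lambda x: 3 * x * x - 2 * x + 5, -2.0, 2.0)
print(quadratic)
```

The first two values correspond to the approximations 1.11 and 1.67 obtained by hand; the quadratic case illustrates why the rule was set up through an interpolating parabola.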


Here the error is about 0.10, which is not very bad. To get better results, we need to cut up our intervals into smaller pieces. The first step in deriving the necessary formulas is to generalize Theorem 2, to take care of the case in which the origin is not necessarily the midpoint of the interval over which we are integrating.


Theorem 3. Let $g$ be a quadratic function, and let $k$ be a positive number. Then

\[ \int_a^{a+2k} g(x)\,dx = \frac{k}{3}(y_0 + 4y_1 + y_2), \]

where

\[ y_0 = g(a), \qquad y_1 = g(a + k), \qquad y_2 = g(a + 2k). \]

The easiest way to see this is to move the graph $a + k$ units to the left, so that the point $(a + k, 0)$ falls on the origin. When a parabola (or a line) is moved in this way, it is still a parabola (or a line); the integral does not change, and neither do the numbers $k$, $y_0$, $y_1$, and $y_2$. Therefore Theorem 3 is a consequence of Theorem 2.

Consider now a function $f$, on an interval $[a, b]$. We cut up the interval $[a, b]$ into an even number $2n$ of little intervals, each of length

\[ k = \frac{b - a}{2n}. \]

The division points are $x_0, x_1, \ldots, x_{2n}$, as shown in the figure for $n = 2$.

[Figure: division points on [a, b], with the ordinates y_0, y_1, ...]

On the interval $[x_0, x_2] = [a, a + 2k]$, Simpson's rule gives

\[ \int_a^{a+2k} f(x)\,dx \approx \frac{k}{3}(y_0 + 4y_1 + y_2), \]

where $y_i = f(x_i)$ for each $i$. On the interval $[x_2, x_4] = [a + 2k, a + 4k]$ we get

\[ \int_{a+2k}^{a+4k} f(x)\,dx \approx \frac{k}{3}(y_2 + 4y_3 + y_4). \]

For the $2n$ little intervals we have

\[ \int_a^b f(x)\,dx \approx \frac{k}{3}(y_0 + 4y_1 + 2y_2 + 4y_3 + 2y_4 + \cdots + 4y_{2n-1} + y_{2n}). \]

This formula is the final form of Simpson's rule. Let us try it, with $k = 0.2$, to get a better approximation of

\[ \int_{-1}^{1} \frac{dx}{x + 2}. \]

The computation looks like this:

 i     x_i     y_i       coeff.   coeff. × y_i
 0    -1.0     1.0000    1        1.0000
 1    -0.8     0.8333    4        3.3332
 2    -0.6     0.7143    2        1.4286
 3    -0.4     0.6250    4        2.5000
 4    -0.2     0.5555    2        1.1110
 5     0       0.5000    4        2.0000
 6     0.2     0.4545    2        0.9090
 7     0.4     0.4167    4        1.6668
 8     0.6     0.3846    2        0.7692
 9     0.8     0.3571    4        1.4285
10     1.0     0.3333    1        0.3333
                                  -------
                                  16.4795

This gives

\[ \int_{-1}^{1} \frac{dx}{x + 2} \approx \frac{0.2}{3}(16.4795) \approx 1.0986. \]
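The tabular computation can be mirrored in a few lines of code. The following sketch, with the illustrative name `simpson`, applies the final form of the rule with $2n$ little intervals of length $k = (b - a)/(2n)$ and the coefficient pattern 1, 4, 2, 4, ..., 2, 4, 1. It works in full floating-point precision rather than four decimal places, so the last digits may differ slightly from the hand computation.

```python
def simpson(f, a, b, n):
    """Simpson's rule on [a, b] with 2n subintervals of length k."""
    k = (b - a) / (2 * n)
    ys = [f(a + i * k) for i in range(2 * n + 1)]
    total = ys[0] + ys[-1]            # end ordinates, coefficient 1
    total += 4 * sum(ys[1:-1:2])      # y_1, y_3, ..., y_(2n-1)
    total += 2 * sum(ys[2:-1:2])      # y_2, y_4, ..., y_(2n-2)
    return k / 3 * total


# k = 0.2 on [-1, 1] means 2n = 10 little intervals, so n = 5.
approximation = simpson(lambda x: 1 / (x + 2), -1.0, 1.0, 5)
print(approximation)
```

Note that the end ordinates enter with coefficient 1 and not 2, which is the point of the slicing in the two `sum` calls.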

This answer is actually correct, to the fourth decimal place. Obviously, however, we must have been lucky: Simpson's rule is not supposed to be exact, and, besides, we were carrying only four decimal places in the calculation.

When you use Simpson's rule, it is a good idea to use a table like the one shown above. Make sure that the last entry in the fourth column of your table is 1 and not 2.

We have postponed until now the presentation of Simpson's rule, because this is the first point at which we can do something interesting with it. The interesting thing is as follows. We know that

\[ D \operatorname{Tan}^{-1} x = \frac{1}{1 + x^2}. \]

Therefore

\[ \int_0^1 \frac{dx}{1 + x^2} = \operatorname{Tan}^{-1} 1 - \operatorname{Tan}^{-1} 0 = \frac{\pi}{4}. \]

Applying Simpson's rule, we can thus get a numerical approximation of π. This is Problem 1 below.

PROBLEM SET 4.8

1. Apply Simpson's rule to the function

f(x)
with k = !.

(0 x 1),

1 + x2

Check your answer against what people have been telling you about π.

If you want to use k = 0.1, to get a more exact approximation, it might occur to you to use a slide rule to calculate the $y_i$'s. Would this be a good idea? Why or why not? How about five-place log tables?

2. Apply Simpson's rule to the function

\[ f(x) = 3x^3 - 5x^2 + 1 \qquad (-2 \le x \le 2), \]

with k = 2. Then calculate $\int_{-2}^{2} f(x)\,dx$ exactly, and compute the error in your approximation.

3. Apply Simpson's rule to the function

\[ f(x) = x^3 + x^2 - 17 \qquad (-100 \le x \le 100), \]

with k = 100. Then calculate the integral exactly, and compute the error in the approximation.

4. Apply Simpson's rule to

\[ f(x) = x^3 - 2x + 3 \qquad (-1 \le x \le 1), \]

with k = 1. Compute the error.

5. Apply Simpson's rule to

\[ f(x) = x^4 - 2x + 3, \]

over the same interval as in Problem 4, using the same k, and compute the error.

6. There ought to be a theorem which accounts for some of the results that you have been getting. State and prove the theorem.

7. Apply Simpson's rule to the function

\[ f(x) = 1 - x^3, \]

on the interval $[0, 1]$, using k = 0.1. (This is the integral which we investigated in the text above, using inscribed rectangles, circumscribed rectangles, and finally trapezoids.)

*8. Given a positive number $k$ and numbers $y_0$, $y_1$, and $y_2$, write an explicit formula for a quadratic function $g$ such that $g(-k) = y_0$, $g(0) = y_1$, and $g(k) = y_2$. That is, write an expression of the form

\[ g(x) = Ax^2 + Bx + C, \]

in which the coefficients $A$, $B$, and $C$ are expressed algebraically in terms of $k$, $y_0$, $y_1$ and $y_2$.

*9. Does the theorem that you proved in Problem 6 hold only on intervals of the type $[-k, k]$ or does it hold on any interval $[a, b]$? Proof or refutation?

After finishing Problem 1, you may want to try a smaller k, to get a better approximation of π. As a check,

\[ \pi = 3.14159265, \]

correct to eight decimal places.

In Appendix F, at the end of the book, you will find a theorem which enables us, under some conditions, to set a limit on the error in Simpson's rule.

4.9

EXPONENTIALS AND LOGARITHMS

For the case in which the exponents are positive integers, exponentials are part of
elementary algebra. We begin with:
Definition. For each positive integer $n$,

\[ x^n = x \cdot x \cdots x \qquad (\text{to } n \text{ factors}). \]

It is then easy to see, simply by counting factors on the left and on the right, that the familiar laws of exponents hold:

\[ x^m x^n = x^{m+n}, \tag{A} \]

\[ (x^m)^n = x^{mn}. \tag{B} \]

If $n$ is a negative integer, then $-n > 0$, and for $x \ne 0$ we define

\[ x^n = \frac{1}{x^{-n}}. \]

Thus, for example, for $n = -3$ we have

\[ x^{-3} = \frac{1}{x^{-(-3)}} = \frac{1}{x^3}. \]

For $x \ne 0$, we define

\[ x^0 = 1. \]

It can be shown that, if $x \ne 0$, then formulas (A) and (B) hold for all integers $m$ and $n$.
When the exponents are allowed to range over all real numbers, exponentials cease to be part of elementary algebra. In this section we shall state the facts about exponentials and logarithms, but will make no attempt to verify them. (In the following two sections, we shall see how these facts fit together to make a logical theory.) We begin with a positive base and a rational exponent.

1) Suppose that $a > 0$, and that $x$ is a rational number $p/q$ (where $p$ and $q$ are integers and $q > 0$). We want to define $a^x = a^{p/q}$ in such a way that (A) and (B) will continue to hold. For (B) to hold, we must have

\[ (a^{p/q})^q = a^p. \]

That is, $a^{p/q}$ must be the $q$th root of $a^p$. Hence the following:


Definition. If $a > 0$ and $q > 0$, then

\[ a^{p/q} = \sqrt[q]{a^p}. \]

Here we cannot allow the case $a < 0$. For $a = -1$ we would get

\[ (-1)^{1/3} = \sqrt[3]{-1} = -1, \qquad (-1)^{2/6} = \sqrt[6]{(-1)^2} = \sqrt[6]{1} = 1. \]

Thus, for $a < 0$, $a^{p/q}$ would depend not merely on the number that we use as an exponent but also on the notation in which the number is expressed. This would lead to nothing but trouble.
2) It is a fact that for $a > 0$, and $x$ and $y$ rational, the following laws hold:

\[ a^x a^y = a^{x+y}, \tag{A} \]

\[ (a^x)^y = a^{xy}, \tag{B} \]

\[ a^0 = 1. \tag{C} \]

3) The rational numbers on the x-axis do not fill up the x-axis, because every interval, however short, contains irrational numbers. Therefore the set $Q$ of all rational numbers forms a sort of infinitely dotted line. So far, the function $f(x) = a^x$ has been defined only for rational values of $x$. Therefore $f$ is a function $Q \to R^+$, and the graph is an infinitely dotted curve, as in the figure below. Note that $f(x) > 0$ for every $x$, because $a^x = a^{p/q}$, which is the positive $q$th root of the positive number $a^p$.
[Figure: dotted graphs of f(x) = a^x, x in Q, for a > 1 and for a < 1; f: Q → R^+]

It is a fact that the definition of this function can be extended so as to give a new function

\[ f: R \to (0, \infty), \qquad a^x > 0, \]

defined for every $x$, such that $f$ is continuous and satisfies (A), (B), and (C).

4) For $a = 1$, we have $f(x) = 1^x = 1$ for every $x$. But for $a > 0$ and $a \ne 1$, $f$ is invertible. We now define $\log_a$ as the inverse of $f(x) = a^x$ $(a \ne 1)$. That is,

\[ y = \log_a x \iff a^y = x, \]

by definition. The image of the exponential function includes all positive numbers.


Therefore the domain of its inverse includes all positive numbers, and we have a function

\[ \log_a : (0, \infty) \to R. \]

Logarithms to the base $a$ obey the following laws:

\[ \log_a xy = \log_a x + \log_a y, \tag{A'} \]

\[ \log_a b^x = x \log_a b \qquad (b > 0,\ b \ne 1), \tag{B'} \]

\[ \log_a 1 = 0. \tag{C'} \]

In fact, these are derivable from (A), (B), and (C).

Since the logarithm and exponential are inverses of each other, we have

\[ \log_a a^x = x, \qquad a^{\log_a x} = x. \]

And the graph of either of these functions is the reflection of the graph of the other across the line y = x.

[Figure: graph of y = f(x) = a^x, a > 1, and its reflection across the line y = x]

5) We now want to calculate the derivative of the logarithm. By definition,

\[ \log_a' x_0 = \lim_{\Delta x \to 0} \frac{\log_a (x_0 + \Delta x) - \log_a x_0}{\Delta x}. \]

Using the laws (A'), (B'), and (C'), we express this in the form

\[ \lim_{\Delta x \to 0} \frac{1}{\Delta x} \log_a \frac{x_0 + \Delta x}{x_0} = \lim_{\Delta x \to 0} \log_a \left(1 + \frac{\Delta x}{x_0}\right)^{1/\Delta x} = \lim_{\Delta x \to 0} \frac{1}{x_0} \log_a \left[\left(1 + \frac{\Delta x}{x_0}\right)^{x_0/\Delta x}\right]. \]

Let

\[ h = \frac{\Delta x}{x_0}. \]

Since $x_0$ is fixed, $h \to 0$ as $\Delta x \to 0$. This gives

\[ \log_a' x_0 = \frac{1}{x_0} \lim_{h \to 0} \log_a (1 + h)^{1/h}. \]

If $\log_a$ has a derivative, then the limit on the right-hand side exists, and conversely. Suppose that $(1 + h)^{1/h}$ approaches a limit, and call this limit $e$. Thus

\[ e = \lim_{h \to 0} (1 + h)^{1/h}. \]

Suppose that $\log_a$ is continuous, so that the limit of the logarithm is the logarithm of the limit. Then

\[ \log_a' x_0 = \frac{1}{x_0} \log_a \lim_{h \to 0} (1 + h)^{1/h} = \frac{1}{x_0} \log_a e. \]

Thus

\[ D \log_a x = \frac{1}{x} \log_a e. \]

Since $e^1 = e$, we have

\[ \log_e e = 1; \]

and so for $a = e$ our differentiation formula takes the form

\[ D \log_e x = \frac{1}{x}. \]

Considering the complications of the preceding discussion, the simplicity of this formula is surprising; it is also important (see the next section).
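The limit that defines $e$, and the resulting formula $D \log_a x = (1/x) \log_a e$, can both be explored numerically. The sketch below is only an illustration under floating-point arithmetic; the names `e_estimate` and `log_derivative_estimate` are made up for this example, and Python's `math.e` is used solely as a reference value.

```python
import math


def e_estimate(h):
    """The quantity (1 + h)^(1/h), which is supposed to approach e as h -> 0."""
    return (1 + h) ** (1 / h)


def log_derivative_estimate(a, x0, dx=1e-6):
    """Difference quotient of log_a at x0, as in the definition of the derivative."""
    return (math.log(x0 + dx, a) - math.log(x0, a)) / dx


for h in (0.1, 0.01, 0.001, 1e-6):
    print(h, e_estimate(h))

# Compare the difference quotient with (1/x0) * log_a e for a = 10, x0 = 2.
print(log_derivative_estimate(10, 2.0), (1 / 2.0) * math.log(math.e, 10))
```

As $h$ shrinks, the printed estimates should settle toward 2.718..., and the two numbers on the last line should agree to several places.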
6) On the basis of the preceding formulas, it is easy to find $Da^x$, for $a \ne 1$. (For $a = 1$, we have no problem.) Let

\[ f(x) = \log_a x, \]

so that

\[ f^{-1}(x) = a^x. \]

The general formula

\[ f(f^{-1}(x)) = x \]

thus takes the form

\[ \log_a a^x = x. \]

Since

\[ D_u \log_a u = \frac{1}{u} \log_a e, \]

the chain rule gives:

\[ \frac{1}{a^x} (\log_a e)\, Da^x = 1. \]

Therefore

\[ Da^x = \frac{1}{\log_a e}\, a^x. \]

In particular, for $a = e$ we have

\[ De^x = e^x. \]
A final simplification: we assert that

\[ \frac{1}{\log_a e} = \log_e a. \]

Proof. Let

\[ x = \log_a e, \qquad y = \log_e a. \]

Then

\[ a^x = e, \qquad e^y = a, \]

by the definition of the logarithm. Therefore

\[ (a^x)^y = e^y, \qquad \text{and} \qquad a^{xy} = a. \]

This holds when $xy = 1$. Since the exponential function is invertible, it cannot take on the same value twice. Therefore the equation can hold only when $xy = 1$. Therefore

\[ \frac{1}{x} = y, \]

which was to be proved. Thus we can write

\[ Da^x = a^x \log_e a. \]

This is better, not just because it avoids a fraction, but also because $e$ is one of the two bases for which tables of logarithms are published.

Throughout the following problem set you may assume that the statements made in this section are true. (They will be proved in the following two sections.)
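Both the identity $1/\log_a e = \log_e a$ and the formula $Da^x = a^x \log_e a$ lend themselves to a quick numerical check. This is a hedged sketch, not a proof: `power_derivative_estimate` is an illustrative name, and `math.log` plays the role of $\log_e$ (one argument) or $\log_a$ (two arguments).

```python
import math


def power_derivative_estimate(a, x, h=1e-6):
    """Symmetric difference quotient for the function x -> a^x."""
    return (a ** (x + h) - a ** (x - h)) / (2 * h)


a, x = 2.0, 3.0

# The reciprocal identity: 1 / log_a e should equal log_e a.
print(1 / math.log(math.e, a), math.log(a))

# The derivative formula: D a^x should equal a^x * log_e a.
print(power_derivative_estimate(a, x), a ** x * math.log(a))
```

Each printed pair should agree to many decimal places; trying other bases, including bases between 0 and 1, is a reasonable further experiment.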
For convenience of reference, we give a summary.

Laws of Exponentials (a > 0)

a) $a^x a^y = a^{x+y}$,
b) $(a^x)^y = a^{xy}$,
c) $a^0 = 1$,
d) $a^x > 0$ for every $x$,
e) $Da^x = a^x \log_e a$,
f) $e = \lim_{h \to 0} (1 + h)^{1/h}$.

Laws of Logarithms (a > 0, a ≠ 1)

g) $\log_a xy = \log_a x + \log_a y$,
h) $\log_a b^x = x \log_a b$ $(b > 0)$,
i) $\log_a 1 = 0$,
j) $D \log_a x = 1/(x \log_e a)$,
k) $\log_a a^x = x$,
l) $a^{\log_a x} = x$.

PROBLEM SET 4.9

Find the first and second derivatives of the following functions.

x
e"' cos x

3.

5.
8.

[loge x]2

9. log. xsoo

2.

4.

7. log. x2

xe2"'
iex(sin x

xex
e'" sin x

1. x loge

6.

+cos x)

0),

190

4.9

Trigonometric and Exponential Functions

10. esin a:
13. [log. e]"'
16. e "'
19. log. (1 - x)
22. el-X

!h

25. log. sin x


28. log.

12. log. e"'


15. x2e "'

11. 10"'
14. elog, "'
2
17. xe"'

1 -x

20. log. (e"'

18. log. (x3)


21. e"'-1

1)

23. log. sec x

24. e"'

26. log. cos x

27. log. (sec x

29. log. (csc x

31. Show that for every x >

0,
log. x

32. Show that if a and

30. log. (x

+ cot x)

v x2

1)

l
"'
i

-dt.
t

b are positive and different from

logb

+ tan x)

1
=

1-oga b

1, then

33. Show that, under the..same conditions,

logb x

(loga x)(logb

a).

34. Show that, if a and b are positive and different from 1, then
[Hint: What is a10g b, and why?]
35. The function

\[ f(x) = e^x \]

has the property of being its own derivative. But it is not the only such function, because for every $k$, $g(x) = ke^x$ has the same property. We have, however, the following theorem.

Theorem. If $g'(x) = g(x)$, on $(-\infty, \infty)$, then there is a constant $k$ such that

\[ g(x) = ke^x \]

for every $x$. That is, $g(x)/e^x$ is a constant. Prove this.

36. Show that the function $f(x) = e^x$ is completely described by the conditions

\[ f'(x) = f(x) \qquad (-\infty < x < \infty), \tag{1} \]

\[ f(0) = 1. \tag{2} \]

That is, show that (1) and (2) imply that $f(x) = e^x$ for every $x$.

37. Show that the function $f(x) = e^{-x}$ is completely determined by the conditions

\[ f'(x) = -f(x) \qquad (-\infty < x < \infty), \tag{1} \]

\[ f(0) = 1. \tag{2} \]

That is, show that (1) and (2) imply that $f(x) = e^{-x}$ for every $x$.

4.10

THE FUNCTIONS ln AND exp

In the preceding section, we gave a sketch of the way that logarithms and exponentials ought to behave, postponing both the proofs and also the basic definitions. We shall now fill these gaps.

If you review the formulas of the preceding section, you will see that after considerable complications in the middle, we got a formula that looked simple:

\[ D \log_e x = \frac{1}{x}. \]

This enables us to write a formula for $\log_e$:

\[ \log_e x = \int_1^x \frac{1}{t}\,dt. \]

If the theory works, then this formula must be right: the functions on the two sides of the equation have the same derivative (namely, $1/x$), and they have the same value at $x = 1$ (namely, 0); and so it follows by the uniqueness theorem that they are the same function.

We shall use the function $\int_1^x (1/t)\,dt$ as the foundation of the theory of exponentials and logarithms. The scheme is to investigate the function $\int_1^x (1/t)\,dt$, learn its properties, and then define all our other functions in terms of it. Thus, at the beginning, we shall investigate $\int_1^x (1/t)\,dt$ without assuming that we know anything about logarithms, or about exponentials, or about the number $e$. To emphasize that we are starting afresh, we give the function a new name ln. (Here ln is suggested by natural logarithm.) And the official theory begins with the definition of ln in terms of an integral:

Definition. For each $x > 0$,

\[ \ln x = \int_1^x \frac{dt}{t}. \]

Soon we shall show that every real number $y$ is equal to $\ln x$ for some $x$. For this purpose we shall need:

Theorem 1 (The no-jump theorem). If $f$ is continuous on $[x_1, x_2]$, then $f$ takes on every value between $f(x_1)$ and $f(x_2)$.

That is, if $f(x_1) < k < f(x_2)$, then there is an $x$, between $x_1$ and $x_2$, such that $f(x) = k$. And if $f(x_2) < k < f(x_1)$, then the same conclusion follows. This theorem will be proved in the next chapter.


Our first few theorems on the function ln are easy.

Theorem 2. $D \ln x = 1/x$.

Proof. This follows from the definition of ln and the formula for the derivative of the integral.

Theorem 3. $\ln 1 = 0$.

This is obvious.

Theorem 4. For every $k, x > 0$, $D_x \ln kx = 1/x$.

Proof. By the chain rule,

\[ D_x \ln kx = \frac{1}{kx} \cdot k = \frac{1}{x}. \]

Theorem 5. For every $a, b > 0$, $\ln ab = \ln a + \ln b$.

Proof. The trouble with this theorem is that it does not appear to involve any functions. To prove it, we first restate it, using $k$ for $a$ and $x$ for $b$. It then says that for every $k, x > 0$,

\[ \ln kx = \ln k + \ln x. \]

The proof is now as follows. Let

\[ f(x) = \ln kx, \qquad g(x) = \ln k + \ln x. \]

Then

\[ f'(x) = \frac{1}{x} = g'(x), \qquad f(1) = \ln k, \]

and

\[ g(1) = \ln k + \ln 1 = \ln k + 0 = \ln k. \]

By the uniqueness theorem, $f(x) = g(x)$ for every $x$.
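Since ln is defined here as an integral, its values and the product law of Theorem 5 can be checked by numerical integration alone. The following sketch evaluates $\int_1^x dt/t$ by Simpson's rule; the name `ln_by_integral` and the choice of 500 pairs of subintervals are assumptions made for this illustration, and `math.log` appears only as an independent reference.

```python
import math


def ln_by_integral(x, n=500):
    """Approximate the integral of dt/t from 1 to x by Simpson's rule.

    Uses 2n subintervals. For 0 < x < 1 the step k is negative, so the
    result is negative, as an integral taken "backwards" should be.
    """
    k = (x - 1) / (2 * n)
    ys = [1 / (1 + i * k) for i in range(2 * n + 1)]
    return k / 3 * (ys[0] + 4 * sum(ys[1:-1:2]) + 2 * sum(ys[2:-1:2]) + ys[-1])


print(ln_by_integral(1.0))                      # Theorem 3: ln 1 = 0
print(ln_by_integral(2.0), math.log(2.0))       # agrees with the library logarithm
print(ln_by_integral(6.0), ln_by_integral(2.0) + ln_by_integral(3.0))  # Theorem 5
```

The last line compares $\ln 6$ with $\ln 2 + \ln 3$, both computed purely from the integral definition.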


[Figure: graph of y = ln x]

We now want to show that the graph of ln looks approximately like the drawing above. The figure suggests that $\ln 1 = 0$ and $\ln' 1 = 1$. These things we already know. Other things suggested by the figure are conveyed by some of the following theorems.

Theorem 6. ln is invertible.

Proof. We know that $\ln' x = 1/x$; and $1/x \ne 0$ for every $x$. Therefore $\ln' x \ne 0$ for every $x$. By Theorem 6 of Section 4.7, ln is invertible.

Theorem 7. For every $x > 0$, and every positive integer $n$,

\[ \ln x^n = n \ln x. \]

Proof. Obviously this formula holds when $n = 1$. And if it holds for any particular integer $n$, then it holds for the next integer $n + 1$. Proof:

\[ \ln x^{n+1} = \ln (x \cdot x^n) = \ln x + \ln x^n = \ln x + n \ln x = (n + 1) \ln x. \]

Therefore, by induction, we have $\ln x^n = n \ln x$ for every $n$, which was to be proved.

A number $M$ is called an upper bound for a function $f$ if

\[ f(x) \le M \]

for every $x$. If there is such a number $M$, then we say that $f$ is bounded above. (For example, the sine is bounded above, because $\sin x \le 1$ for every $x$.) If no such number $M$ exists, then we say that $f$ is unbounded above. (For example, if $f(x) = x$, for every $x$, then $f$ is unbounded above.)

Theorem 8. ln is unbounded above.

Proof. In Theorem 7, take $x = 2$. Then

\[ \ln 2^n = n \ln 2, \]

for every $n$. And $\ln 2 > 0$, because $\ln 2$ is the area of a region. Therefore ln cannot have an upper bound; no number $M$ is greater than or equal to all of the numbers $n \ln 2$, because

\[ n \ln 2 > M \qquad \text{whenever} \qquad n > \frac{M}{\ln 2}. \]

Theorem 9. $\ln (1/x) = -\ln x$.

Proof.

\[ \ln \frac{1}{x} + \ln x = \ln \left(\frac{1}{x} \cdot x\right) = \ln 1 = 0; \]

and from this the theorem follows.

Theorem 10. $\ln 2^{-n} = -n \ln 2$.

Proof. By Theorems 7 and 9.

A number $m$ is called a lower bound of a function $f$ if $m \le f(x)$ for every $x$. If there is such a number $m$, then we say that $f$ is bounded below. (For example, the sine is bounded below, because $-1 \le \sin x$ for every $x$.) If no such number $m$ exists, then we say that $f$ is unbounded below. (For example, if $f(x) = x$, then $f$ is unbounded below.)

Theorem 11. The function ln is unbounded below.

Because no number $m$ is less than or equal to all the numbers $\ln 2^{-n} = -n \ln 2$.

Theorem 12. Every real number is a value of the function ln. That is, every number $y$ is equal to $\ln x$ for some $x > 0$.

Proof. Since ln is unbounded both above and below, it follows that every number $y$ lies between two values of ln. If $\ln x_1 < y < \ln x_2$, then it follows by Theorem 1 that $y = \ln x$ for some $x$.

Thus the image of the function ln is the entire interval $R = (-\infty, \infty)$.
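The proof of Theorem 12 suggests a procedure for actually finding the $x$ with $\ln x = y$: trap $y$ between two values of ln, and then narrow the interval. The sketch below does this by repeated halving (bisection), a technique not described in the text; `solve_ln` is an illustrative name, and `math.log` stands in for the function ln.

```python
import math


def solve_ln(y, steps=200):
    """Find x > 0 with ln x = y, by trapping y between two values of ln."""
    lo, hi = 1.0, 1.0
    # ln is unbounded below and above (Theorems 11 and 8), so these loops end.
    while math.log(lo) > y:
        lo /= 2
    while math.log(hi) < y:
        hi *= 2
    # Now ln lo <= y <= ln hi; the no-jump theorem guarantees a solution
    # in [lo, hi], and ln is increasing, so halving the interval homes in on it.
    for _ in range(steps):
        mid = (lo + hi) / 2
        if math.log(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(solve_ln(0.0))   # the x with ln x = 0
print(solve_ln(1.0))   # the x with ln x = 1
print(solve_ln(-3.0))  # a value of x between 0 and 1
```

The number returned for y = 1 is the candidate for the base $e$ of the preceding section.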

We know by Theorem 6 that ln is invertible. Its inverse will be denoted by exp. That is:

Definition. $\exp = \ln^{-1}$.

Since $\ln x$ will turn out to be $\log_e x$, this means that $\exp x$ will turn out to be $e^x$. But we should not use the notation $e^x$, at this stage, because we have not yet defined $e$ in the present treatment.


The graphs of ln and exp are shown in the figure opposite. Since exp and ln are inverses of each other, we have:

Theorem 13. $\exp \ln x = x$.

Theorem 14. $\ln \exp x = x$.

These are instances of the general rule

\[ f^{-1}(f(x)) = x, \qquad f(f^{-1}(x)) = x. \]

As always, for functions which are inverses of one another, the image of exp is the domain of ln. Therefore

Theorem 15. $\exp x > 0$ for every $x$.

[Figure: graphs of exp and ln, with the line y = x]

This theorem is also easy to see graphically, in the figure above. The graph of ln lies to the right of the y-axis. Reflecting this graph across the line $y = x$, we get the graph of exp. Therefore the graph of exp lies above the x-axis.

Theorem 16. $\exp 0 = 1$.

Because $\ln 1 = 0$.

Theorem 17. $\exp (k + x) = (\exp k)(\exp x)$ for every $k$ and $x$.

Proof. Both sides of the equation have the same ln:

\[ \ln \exp (k + x) = k + x, \]

because ln and exp are inverses of each other. And

\[ \ln [(\exp k)(\exp x)] = \ln \exp k + \ln \exp x = k + x. \]

Since ln never takes on the same value twice, the theorem follows.

Theorem 18. $\exp' = \exp$.

Proof. We know that $\ln \exp x = x$ for every $x$. Since $\ln' u = 1/u$, the chain rule gives
$$(\ln' \exp x)\,\exp' x = 1, \qquad \frac{1}{\exp x}\,\exp' x = 1.$$
Therefore $\exp' x = \exp x$, that is, $\exp' = \exp$, which was to be proved.

We now have functions $\ln x$ and $\exp x$ which have the properties that the functions $\log_e x$ and $e^x$ are expected to have. A natural next step is to find a number $e$, and define the exponential function $e^x$, in such a way that $\exp x = e^x$. We shall do this in the next section. Meanwhile we give a quick summary of this one.

Definitions

a) $\ln x = \int_1^x \frac{dt}{t}$ $(x > 0)$,
b) $\exp = \ln^{-1}$.


Laws for ln

c) $\ln 1 = 0$,
d) $\ln x^n = n \ln x$ $(x > 0)$,
e) $\ln kx = \ln k + \ln x$ $(k, x > 0)$,
f) $\ln' x = 1/x$ $(x > 0)$.

Laws for exp

g) $\exp 0 = 1$,
h) $(\exp k)(\exp x) = \exp(k + x)$,
i) $\exp x > 0$ for every $x$,
j) $\exp \ln x = x$ $(x > 0)$,
k) $\ln \exp x = x$ for every $x$,
l) $\exp' = \exp$.

PROBLEM SET 4.10


Some of the problems below are to be solved by any method that works, including methods based on the unproved results of Section 4.9. Some, however, are supposed to be worked strictly on the basis of the theory developed in this section; and these are stated in the notation of ln and exp. Thus, if the problem uses the notation $a^x$, $\log_a x$, then the solution may use the theory in Section 4.9; but if the problem uses ln, exp, then the solution should also.

Find the derivatives of the following functions:

1.

ln2

5. exp

x2

9.

exp sin

13.

In sin x

2. In In

3.

1)

4.

In

6.

x]2

7. exp (2 In

x)

8.

In (exp

(x In x)

1 2.

e"' log,

16.

(sin

[exp

10.

sin (exp

14.

sin In

x)

In

(x2

1 1.

exp

15.

X"'

(X >

0)

x2

+ 1

x2)

x)sin x

(sin x

>

0)

17. We found that the function
$$\ln x = \int_1^x \frac{dt}{t}$$
was unbounded above. Is this true also for
$$f(x) = \int_1^x \frac{dt}{\sqrt{t}}\,?$$
Why or why not?

18. How about the function
$$g(x) = \int_1^x \frac{dt}{t^2}\,?$$

19. Given
$$h(x) = \int_1^{x^2} \frac{dt}{\sqrt{t}} \qquad (0 < x < \infty),$$
find $h'(x)$, by any method. Note, however, that you are not being asked to calculate $h$; you are being asked only to calculate $h'$.

20. Given
$$f(x) = \int_0^{\sin x} \sqrt{1 + t^2}\,dt,$$
find $f'(x)$.


Exponentials and Logarithms.

4.11
1

2 . Given

f(x)

fanx
0

find f'(x).

22. Given

g(x)

findg'(x).

23. Given

h(x)

=exp

find h'(x).
24. Find
lim

v1 +

The Existence of

197

t2dt,

fxdtt'
i

(f' dt),

sinx + 1

x3"/2 X-37TI2

(By far the easiest way to solve this problem is to think of a geometric meaning for it.)
25. Find
28\ Find

Jim

tanx-1
tanx + 1

x-rr/4 X +

I4 .

1T

--

--

X-+1if4 X-7T14 .

Jim

In x2
lnx
27. Find Jim
26. Find Jim
.
1.
x-..1X x1X - 1
exp ( 2x)- l
29. Find Jim
xo

30. Using Simpson's rule, compute an approximation of ln 2. To four decimal places, the
right answer is 0.6931; and if you cut up the interval [1, 2] into ten parts, you get a
good approximation.
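
The computation suggested in Problem 30 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `simpson` is hypothetical (it is not from the text), and the integrand is $1/t$ on $[1, 2]$ with ten subintervals, as the problem proposes.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd index) and 2 (even index).
        total += (4 if i % 2 == 1 else 2) * f(a + i * h)
    return total * h / 3

# ln 2 is the integral of 1/t from 1 to 2.
approx_ln2 = simpson(lambda t: 1.0 / t, 1.0, 2.0, 10)
print(approx_ln2, math.log(2))
```

With ten subintervals the two printed values differ only far to the right of the decimal point.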
31. Show that for $x \ge 1$, $\ln x \le x - 1$. Show that for $0 < x < 1$, the same inequality holds.

32. Given $f(x) = x - 1$, find a formula for $f^{-1}(x)$, and sketch both functions on the same set of axes. Show that $\exp x \ge x + 1$, for every $x$. [Hint: Try to use a known property of ln.]

33. Let $k = \ln^{-1} 1$. Show that $k > 2$.

34. Show that $k < 4$.


4.11 EXPONENTIALS AND LOGARITHMS. THE EXISTENCE OF e

In Section 4.9, we wanted to define $e$ as the limit of the function
$$f(h) = (1 + h)^{1/h},$$
as $h \to 0$. To investigate this limit, we first need a proper definition of the function $(1 + h)^{1/h}$. Since the exponent $1/h$ varies continuously through real values, we need a definition of the exponential $a^x$, where $a > 0$ and $x$ is not necessarily rational.

The right definition is not hard to find. We know that if $n$ is a positive integer, then
$$\ln a^n = n \ln a.$$
Therefore
$$a^n = \exp(n \ln a).$$
If the laws of logarithms hold for all exponents, then
$$\ln a^x = x \ln a,$$
which gives
$$a^x = \exp(x \ln a).$$
We take this last equation as our definition of the exponential function $a^x$. Thus:

Definition. For $a > 0$, $a^x = \exp(x \ln a)$.

This gives:

Theorem 1. $\ln a^x = x \ln a$.
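
As a small numerical illustration of this definition, the following Python sketch computes $a^x$ as $\exp(x \ln a)$. The function name `power` is a hypothetical helper chosen for this example; `math.exp` and `math.log` stand in for exp and ln.

```python
import math

def power(a, x):
    """a**x for a > 0 and any real x, using the definition a^x = exp(x ln a)."""
    if a <= 0:
        raise ValueError("the definition requires a > 0")
    return math.exp(x * math.log(a))

# For a positive integer exponent this agrees with repeated multiplication,
# and for an irrational exponent it agrees with Python's built-in power.
print(power(2.0, 10), 2.0 ** 10)
print(power(2.0, math.sqrt(2)), 2.0 ** math.sqrt(2))
```

Theorem 1 can be checked the same way: `math.log(power(a, x))` should agree with `x * math.log(a)` up to rounding.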

We are now ready to show that $(1 + h)^{1/h}$ approaches a limit, as $h \to 0$. Let
$$f(x) = (1 + x)^{1/x}.$$
Then
$$\ln f(x) = \frac{1}{x} \ln(1 + x).$$
We now replace $x$ by $\Delta x$, and observe that
$$\ln f(\Delta x) = \frac{1}{\Delta x}\ln(1 + \Delta x) = \frac{\ln(1 + \Delta x) - \ln 1}{\Delta x}.$$
This last fraction is the fraction whose limit is $\ln' 1$, by definition of the derivative. Therefore
$$\lim_{\Delta x \to 0} \ln f(\Delta x) = \ln' 1 = \frac{1}{1} = 1,$$
and
$$\lim_{\Delta x \to 0} f(\Delta x) = \lim_{\Delta x \to 0} \exp \ln f(\Delta x) = \exp \lim_{\Delta x \to 0} \ln f(\Delta x) = \exp 1 = \ln^{-1} 1.$$
Replacing $\Delta x$ by $h$ again, we have
$$\lim_{h \to 0} (1 + h)^{1/h} = \exp 1 = \ln^{-1} 1.$$

Now that we know the limit exists, we can use it as a definition of $e$:

Definition. $e = \lim_{h \to 0} (1 + h)^{1/h}$.

And we know:

Theorem 2. $e = \exp 1 = \ln^{-1} 1$.

This theorem has a geometric interpretation.

That is, $e$ is the number such that
$$\int_1^e \frac{1}{t}\,dt = 1.$$
Later we will find an efficient method of calculating $e$. In fact,
$$e = 2.7182818,$$
correct to seven decimal places. It will turn out that
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \cdots,$$
where $n! = 1 \cdot 2 \cdots n$. The series on the right is infinite, but the terms diminish so rapidly as $n$ increases that we get good numerical approximations by using the first few terms.
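
To see how the two descriptions of $e$ behave numerically, here is a short Python sketch. The function names `limit_form` and `series_partial_sum` are hypothetical helpers for this illustration; the first evaluates $(1+h)^{1/h}$ for a small $h$, and the second adds the terms of the series through $1/n!$.

```python
import math

def limit_form(h):
    """Evaluate (1 + h)**(1/h), the expression whose limit as h -> 0 defines e."""
    return (1.0 + h) ** (1.0 / h)

def series_partial_sum(n):
    """Return 1 + 1/1! + 1/2! + ... + 1/n!."""
    total, term = 1.0, 1.0
    for k in range(1, n + 1):
        term /= k          # term is now 1/k!
        total += term
    return total

for h in (0.1, 0.001, 0.000001):
    print("h =", h, "->", limit_form(h))
for n in (2, 5, 10):
    print("n =", n, "->", series_partial_sum(n))
print("math.e =", math.e)
```

The series settles down much faster than the limit expression, which is consistent with the remark that the first few terms already give good approximations.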
We expected $\exp x$ to be $e^x$. We can now show that this is true:
$$e^x = \exp(x \ln e),$$
by definition of $e^x$; and since $\ln e = 1$, we have:

Theorem 3. $e^x = \exp x$, for every $x$.

As before, we define the logarithm as the inverse of the exponential. That is,
$$y = \log_a x \iff a^y = x.$$
Since $e^x$ and $\exp x$ are the same function, they have the same inverse. Therefore we have:

Theorem 4. $\log_e x = \ln x$, for every $x > 0$.

Thus exp really is an exponential, and ln really is a logarithm. Once we know the laws governing $e^x$ and $\log_e x$, it is easy to derive the laws governing other positive bases. The first step is to express $\log_a$ in terms of ln. We recall that for $a > 0$,
$$a^y = \exp(y \ln a),$$
by definition. Therefore
$$\ln a^y = \ln \exp(y \ln a),$$
and
$$\ln a^y = y \ln a.$$
Since $a^x$ and $\log_a x$ are inverses of each other,
$$a^{\log_a x} = x.$$
In this equation, we take the ln of each side, getting
$$(\log_a x) \ln a = \ln x.$$

This gives:

Theorem 5. For every $a > 0$, $a \ne 1$,
$$\log_a x = \frac{\ln x}{\ln a}.$$

Thus the function $\log_a$ is a constant times the function ln; and this means that the
extension of the theory from ln to $\log_a$ is easy.
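
The change-of-base formula can be tried out directly. The sketch below is only an illustration; `log_base` is a hypothetical helper name, and `math.log` plays the role of ln.

```python
import math

def log_base(a, x):
    """log_a(x) computed as ln x / ln a, for a > 0, a != 1, x > 0."""
    if a <= 0 or a == 1 or x <= 0:
        raise ValueError("need a > 0, a != 1, and x > 0")
    return math.log(x) / math.log(a)

print(log_base(2, 8))        # close to 3
print(log_base(10, 1000))    # close to 3
print(log_base(math.e, 5), math.log(5))  # base e gives back ln
```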
Theorem 6. For every $a > 0$, $a \ne 1$,

$\log_a xy = \log_a x + \log_a y$ $(x, y > 0)$,   (g)
$\log_a b^x = x \log_a b$,   (h)
$\log_a 1 = 0$,   (i)
$\log_a' x = \dfrac{1}{x \ln a}$,   (j)
$\log_a a^x = x$, for every $x$,   (k)
$a^{\log_a x} = x$, for $x > 0$.   (l)

Here the formula designations are those of the summary at the end of Section 4.9. The proofs are as follows.

Proof. For $a = e$, the first three formulas are known to hold, because in this case $\log_a = \log_e = \ln$. If we divide every term by $\ln a$, then we get $\log_a$ throughout, and the equations still hold. To get Eq. (j), we observe that
$$\log_a x = \frac{\ln x}{\ln a}, \qquad D \log_a x = \frac{1/x}{\ln a} = \frac{1}{x \ln a}.$$
Equations (k) and (l) merely remind us that $\log_a x$ and $a^x$ are inverses of one another. Similarly, we get the laws governing the exponential $a^x$ by using the fact that
$$a^x = \exp(x \ln a) = e^{x \ln a}.$$

Theorem 7. For every $a > 0$,

$a^x a^y = a^{x+y}$,   (a)
$(a^x)^y = a^{xy}$,   (b)
$a^0 = 1$,   (c)
$a^x > 0$ for every $x$,   (d)
$Da^x = a^x \ln a$.   (e)
The proofs are as follows.

a) We have
$$\ln(a^x a^y) = \ln a^x + \ln a^y = x \ln a + y \ln a = (x + y) \ln a = \ln a^{x+y}.$$
Since $a^x a^y$ and $a^{x+y}$ have the same ln, they must be the same; ln is invertible, and so ln never takes on the same value twice.

b) By definition, $b^y = \exp(y \ln b)$. Therefore
$$(a^x)^y = \exp(y \ln a^x) = \exp(yx \ln a) = \exp(xy \ln a) = a^{xy}.$$

c) $a^0 = \exp(0 \ln a) = \exp 0 = 1$.

d) $a^x = \exp(x \ln a) > 0$.

e) $Da^x = D \exp(x \ln a) = [\exp(x \ln a)] \ln a$, by the chain rule. Therefore $Da^x = a^x \ln a$.

This completes the program that was sketched in Section 4.9. There are, however, some things that we still need to check. In the elementary theory, we stated:

Definition 1. For every positive integer $n$, and every real number $x$,
$$(x^n) = x \cdot x \cdots x \quad \text{(to $n$ factors)}.$$

In the new theory, we stated:

Definition 2. If $x$ is positive, and $n$ is any real number, then
$$[x^n] = \exp(n \ln x).$$

We have used ( ) and [ ] to distinguish the two definitions. If $n$ is a positive integer, and $x > 0$, then both definitions apply; and we need to know that they give the same answer. In fact, this is true:
$$\ln(x^n) = n \ln x,$$
by Theorem

of Section

4.10.

Therefore
$$(x^n) = \exp[\ln(x^n)] = \exp(n \ln x) = [x^n],$$
by definition of $[x^n]$.

Similarly, we now have two definitions of $a^{p/q}$.

Trigonometric and Exponential Functions

202

4.11

Definition 1. If a > 0, and x is a rational number p/q, then

Definition 2. If a > 0, and x = p/q, then


a

"'

a111q

These definitions agree if it is true that

a"
The proof is as follows.

( )

exp (x In a) = exp

In a .

( a) .
In

exp

Let

Then
and

q lny = p ln a.

Therefore
In y =!!.in a,

and

which was to be proved.

y =

exp

( )

In a ,

We have found that the differentiation formula
$$Dx^k = kx^{k-1}$$
holds true in certain cases. We first proved it for the case in which $k$ was a positive integer. Later we found that it held true when $k$ was a negative integer. For $k = \frac{1}{2}$, and $x > 0$, it says that
$$Dx^{1/2} = \tfrac{1}{2} x^{1/2 - 1} = \tfrac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}},$$
which is correct. We can now prove the following:

Theorem 8. For every $x > 0$, and every real number $k$,
$$Dx^k = kx^{k-1}.$$

Proof. By definition,
$$x^k = \exp(k \ln x).$$
Therefore
$$Dx^k = D[\exp(k \ln x)] = [\exp(k \ln x)]\, D[k \ln x] = x^k \cdot \frac{k}{x} = kx^{k-1}.$$
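
Theorem 8 covers exponents that are not rational, and a quick numerical experiment makes this concrete. The following Python sketch is only a sanity check, not a proof; `power` and `central_difference` are hypothetical helper names, and the choice $k = \pi$, $x = 2$ is arbitrary.

```python
import math

def power(x, k):
    """x**k for x > 0, using the definition x^k = exp(k ln x)."""
    return math.exp(k * math.log(x))

def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for the derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

k, x = math.pi, 2.0
numeric = central_difference(lambda t: power(t, k), x)
formula = k * power(x, k - 1)   # the value k * x**(k-1) given by Theorem 8
print(numeric, formula)
```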

In this section we have presented no new results, except for Theorem 8; we have merely furnished proofs for the theory sketched in Section 4.9. You therefore have no new material for problem work. Hence we give the definitions of a new set of functions, the hyperbolic functions, and list various identities which they satisfy. In the following problem set you will be asked to derive these. The theory is simpler than the theory of trigonometric functions. In fact, once you know about the exponential function, most of the following formulas have straightforward derivations.

The functions are called the hyperbolic sine, hyperbolic cosine, hyperbolic tangent, and so on.

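Definitions

$$\sinh x = \frac{e^x - e^{-x}}{2}, \qquad \cosh x = \frac{e^x + e^{-x}}{2},$$
$$\tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}} = \frac{\sinh x}{\cosh x}, \qquad \coth x = \frac{e^x + e^{-x}}{e^x - e^{-x}} = \frac{\cosh x}{\sinh x},$$
$$\operatorname{sech} x = \frac{2}{e^x + e^{-x}} = \frac{1}{\cosh x}, \qquad \operatorname{csch} x = \frac{2}{e^x - e^{-x}} = \frac{1}{\sinh x}.$$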

Identities

$\sinh(-x) = -\sinh x$.   (1)
$\cosh(-x) = \cosh x$.   (2)
$\tanh(-x) = -\tanh x$.   (3)
$\cosh^2 x - \sinh^2 x = 1$.   (4)
$1 - \tanh^2 x = \operatorname{sech}^2 x$.   (5)
$\coth^2 x - 1 = \operatorname{csch}^2 x$.   (6)
$\sinh(x + y) = \sinh x \cosh y + \cosh x \sinh y$.   (7)
$\cosh(x + y) = \cosh x \cosh y + \sinh x \sinh y$.   (8)
$\tanh(x + y) = \dfrac{\tanh x + \tanh y}{1 + \tanh x \tanh y}$.   (9)
$\sinh 2x = 2 \sinh x \cosh x$.   (10)
$\cosh 2x = \cosh^2 x + \sinh^2 x$.   (11)
$e^x = \cosh x + \sinh x$.   (12)
$e^{-x} = \cosh x - \sinh x$.   (13)
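
Before deriving these identities by hand, it can be reassuring to spot-check a few of them numerically. The Python sketch below builds sinh, cosh, and tanh from the exponential definitions; the function names and the sample points are choices made for this illustration only, and a numerical check is of course no substitute for the derivations asked for in the problem set.

```python
import math

def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2

def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2

def tanh(x):
    return sinh(x) / cosh(x)

def close(u, v, tol=1e-9):
    return abs(u - v) < tol

for x, y in [(0.3, 1.1), (-0.7, 0.2), (1.5, -2.0)]:
    assert close(cosh(x) ** 2 - sinh(x) ** 2, 1.0)                               # (4)
    assert close(sinh(x + y), sinh(x) * cosh(y) + cosh(x) * sinh(y))             # (7)
    assert close(cosh(x + y), cosh(x) * cosh(y) + sinh(x) * sinh(y))             # (8)
    assert close(tanh(x + y), (tanh(x) + tanh(y)) / (1 + tanh(x) * tanh(y)))     # (9)
    assert close(math.exp(x), cosh(x) + sinh(x))                                 # (12)
print("identities (4), (7), (8), (9), (12) hold at the sample points")
```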

Derivatives

$\sinh' x = \cosh x$.   (14)
$\cosh' x = \sinh x$.   (15)
$\tanh' x = \operatorname{sech}^2 x$.   (16)
$\coth' x = -\operatorname{csch}^2 x$.   (17)
$\operatorname{sech}' x = -\operatorname{sech} x \tanh x$.   (18)
$\operatorname{csch}' x = -\operatorname{csch} x \coth x$.   (19)

PROBLEM SET 4.11

Verify the following.

(The numbers in parentheses refer to the numbered formulas

above in the text.)

1. (12)

2. (13)

3. (1)

4. (2)

5. (3)

6. (14)

7. (15)

8. (16)

9. (17)

10. (18)

11. (19)

12. Find the derivative of $F(x) = \cosh^2 x - \sinh^2 x$.

13. Verify (4).

14. Verify (5).

15. Verify (6).

16. Let
$$A = \sinh(x + y) - \sinh x \cosh y - \cosh x \sinh y,$$
$$B = \cosh(x + y) - \cosh x \cosh y - \sinh x \sinh y.$$
Show that $A + B = 0$. (It is not necessary to go back to the definitions to show this. Try Identities (12) and (13).)

17. Let $A$ and $B$ be as in Problem 16. Show that $A - B = 0$.

Now verify the following.

18. (7)

20. (9)

19. (8)

21. (10)

22. (11)

23. Express cosh 3x in terms of cosh x.


24. Show that $x > 0 \Rightarrow \sinh x > 0$.

25. Show that $x < 0 \Rightarrow \sinh x < 0$.

26. Show that, on the interval $[0, \infty)$, cosh is increasing.

27. Show that, on the interval $(-\infty, 0]$, cosh is decreasing.

28. Show that $\cosh x \ge 1$ for every $x$.

29. Show that sinh is invertible.

30. Show that
$$\cosh x = \sqrt{1 + \sinh^2 x},$$
for every $x$. Note that there is no double sign in this formula; if your derivation leads to a "$\pm$" sign, you must find a way to get rid of it.

31. Find $\cosh \sinh^{-1} x$.

32. Find $D \sinh^{-1} x$.

33. Find $D \sinh^{-1} 2x$.

34. Show that cosh is not an invertible function.

35. Let
$$\operatorname{Cosh} x = \cosh x \qquad (0 \le x).$$
(Compare with the definition of Cos: $\operatorname{Cos} x = \cos x$ $(0 \le x \le \pi)$.) Show that Cosh is invertible.

36. Show that
$$\sinh x = \begin{cases} \sqrt{\cosh^2 x - 1}, & \text{for } x \ge 0, \\ -\sqrt{\cosh^2 x - 1}, & \text{for } x < 0. \end{cases}$$

37. Show that $\sinh \operatorname{Cosh}^{-1} x = \sqrt{x^2 - 1}$.

38. Find $D \operatorname{Cosh}^{-1} x$.

39. Find $D \operatorname{Cosh}^{-1} x^2$.

40. Show that tanh is invertible.


41. Show that
$$\operatorname{sech} x = \sqrt{1 - \tanh^2 x},$$
without any double sign in front of the radical.

42.

Find D tanh-1 x.

43.

Solve for x:

e2"'

e"'

- 6 =

0,

and explain why this equation has only one root.


44. Solve for x:
45.

Solve for x, in terms of y:

e"'

e"'

+ y - .6y2e-"'

\
I

2 - 35e-"'

0.
0.

46. Find a formula which expresses $\sinh^{-1} x$ as the logarithm of an algebraic expression.
Hint: The graph of sinh is the graph of the equation
$$y = \tfrac{1}{2}(e^x - e^{-x}). \qquad (1)$$
Therefore the graph of $\sinh^{-1}$ is the graph of the equation
$$x = \tfrac{1}{2}(e^y - e^{-y}). \qquad (2)$$
Here we have reflected the graph across the line $y = x$, by interchanging $x$ and $y$ in Eq. (1). Now solve for $y$ in (2), getting

Then

= (

"

sinh-1 x =
47.

Analogously, get a formula for Cosh-1 x.

48. Analogously, get a formula for $\tanh^{-1} x$.

5 The Variation of Continuous Functions

5.1 INTERVALS ON WHICH A FUNCTION INCREASES, OR DECREASES

The function $f$ is increasing if
$$x < x' \Rightarrow f(x) < f(x').$$
Similarly, $f$ is decreasing if
$$x < x' \Rightarrow f(x) > f(x').$$
Here $x$ and $x'$ are any points in the domain of $f$. Some simple functions are neither increasing nor decreasing. For example, $f(x) = x^2$ satisfies neither of the above conditions.

Often, however, we can get a good description of a function by cutting up its domain into subintervals, in such a way that on each subinterval the function is either increasing or decreasing. For example, the domain might be a closed interval $[a, b]$, and the graph might look like this:

[Figure: graph of a function f on an interval divided at the points x_0, x_1, ..., x_5.]

This function is increasing on the interval $[x_2, x_3]$. Similarly, $f$ is increasing on $[x_0, x_1]$ and $[x_4, x_5]$, and is decreasing on the interval $[x_1, x_2]$. Similarly, $f$ is decreasing on $[x_3, x_4]$.

If a function is differentiable, then we can find out where it is increasing or decreasing by examining the derivative.

Theorem 1. If $f'(x) > 0$ at every interior point of $I$, then $f$ is increasing on $I$.

We recall that an interior point of an interval is any point which is not an endpoint.

Theorem 1 is a consequence of the mean-value theorem (MVT). If we had
$$(?)\quad a, b \text{ in } I, \quad a < b, \quad f(a) > f(b) \quad (?),$$
as on the left of the graph above, then the slope of the chord would be
$$\frac{f(b) - f(a)}{b - a} < 0,$$
and this would give $f'(\bar{x}) < 0$ for some $\bar{x}$ between $a$ and $b$. This is impossible, because such an $\bar{x}$ would be an interior point of $I$. If we had
$$(?)\quad c, d \text{ in } I, \quad c < d, \quad f(c) = f(d) \quad (?),$$
as on the right, then the chord would be horizontal, and we would have $f'(\bar{x}) = 0$ at an interior point $\bar{x}$ of $I$.

In Theorem 1, we allow the possibility that $I$ is an infinite interval. Consider
$$f(x) = x^2, \qquad I = [0, \infty).$$
Here
$$f'(x) = 2x,$$
and so $f'(x) > 0$ at every interior point of $I$. Therefore $f$ is increasing on $I$. Note that when we allow the possibility that $f'(x) = 0$ at endpoints of $I$, we are not splitting hairs; if we required that $f'(x)$ be $> 0$ everywhere in $I$, then the theorem would not apply in the simple case $f(x) = x^2$, $I = [0, \infty)$, or to the case $f(x) = \cos x$, $I = [\pi, 2\pi]$. We don't need theorems to be as general as possible, but we want them to be general enough to be usable. And it is not unusual to find that $f'(x) = 0$ at an endpoint; in fact, this is what usually happens, when we break up the domain of our function into the largest possible subintervals on which the derivative does not change sign.

Here $f$ is increasing on $I_1$ and $I_3$, and decreasing on $I_2$; and the derivative vanishes at the endpoints $x_1$ and $x_2$.

Consider another example:
$$f(x) = x^2 - x \qquad (0 \le x \le 2).$$
Here
$$f'(x) = 2x - 1, \qquad f'(\tfrac{1}{2}) = 0,$$
and $f'(x) > 0$ on $(\tfrac{1}{2}, 2]$.

In the left-hand figure below, we have used this information, and have plotted $f(\tfrac{1}{2})$ and $f(2)$. Obviously $f(1) = 0$, and so we have plotted this point exactly also.

[Figure: two stages in sketching the graph of f(x) = x^2 - x on the interval [0, 2].]

The same principle works in the opposite case:

Theorem 2. If $f'(x) < 0$ at every interior point of $I$, then $f$ is decreasing on $I$.

(Proof. Apply Theorem 1 to the function $g = -f$. By Theorem 1, $g$ is increasing on $I$. Therefore $f$ is decreasing on $I$.)

To apply this theorem to the same function $f(x) = x^2 - x$, on the interval $I = [0, \tfrac{1}{2}]$, we observe that $f$ is decreasing on this interval, because
$$f'(x) = 2x - 1 < 0 \quad \text{for } 0 < x < \tfrac{1}{2}.$$
We use this information to complete our sketch.


This example doesn't look impressive, because we already knew how to sketch parabolas. It is not so obvious, however, how to sketch the graph of a cubic function taken at random, say,
$$f(x) = x^3 + 2x^2 - 3x - 4, \qquad -2 \le x \le 2.$$
This is not a put-up job; it is a "real-life" problem, and nothing is going to come out even. We need to find out where $f' > 0$ and where $f' < 0$. Now
$$f'(x) = 3x^2 + 4x - 3,$$
so that $f'(x) = 0$ when
$$x = \frac{-2 \pm \sqrt{13}}{3}.$$
Since $\sqrt{13} \approx 3.6$, the roots of the equation $f'(x) = 0$ are
$$x_1 = \frac{-2 + \sqrt{13}}{3} \approx 0.5, \qquad x_2 = \frac{-2 - \sqrt{13}}{3} \approx -1.9.$$
Since the graph of $f'$ is a parabola opening upward, it must look like the drawing on the left above. Thus

$f'(x) > 0$ when $x < x_2$,
$f'(x) < 0$ when $x_2 < x < x_1$,
$f'(x) > 0$ when $x > x_1$.

Therefore $f$ is increasing on $[-2, x_2]$; $f$ is decreasing on $[x_2, x_1]$; and $f$ is increasing on $[x_1, 2]$. We calculate
$$f(-2) = 2, \qquad f(x_2) \approx 2.1, \qquad f(x_1) \approx -4.9, \qquad f(2) = 6.$$
This gives us our sketch on the right. (The problems in the following problem set are
not this awkward.)
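
The arithmetic in this example is easy to reproduce. The following Python sketch recomputes the roots of $f'$ and the four plotted values; the variable names mirror the text's $x_1$ and $x_2$, and the code is only a check on the numbers, not part of the method itself.

```python
import math

def f(x):
    return x**3 + 2 * x**2 - 3 * x - 4

def f_prime(x):
    return 3 * x**2 + 4 * x - 3

# Roots of f'(x) = 0, from the quadratic formula.
x1 = (-2 + math.sqrt(13)) / 3
x2 = (-2 - math.sqrt(13)) / 3

for label, x in [("-2", -2), ("x2", x2), ("x1", x1), ("2", 2)]:
    print(f"f({label}) = {f(x):.4f}   f'({label}) = {f_prime(x):.4f}")

# Sign of f' on each of the three subintervals of [-2, 2].
print(f_prime(-1.95) > 0, f_prime(0.0) < 0, f_prime(1.0) > 0)
```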
To apply this method, you need to know how the derivative behaves; and we may use the same method in investigating the derivative. For example, in the preceding problem we had
$$f'(x) = 3x^2 + 4x - 3.$$
If we let
$$g(x) = f'(x) = 3x^2 + 4x - 3,$$
then
$$g'(x) = 6x + 4.$$
Therefore $g$ is increasing for $x > -\tfrac{2}{3}$, and is decreasing for $x < -\tfrac{2}{3}$. Plotting $g$ exactly, at the points $-2$, $x_2$, $0$, and $x_1$, we get the sketch of $f'$ which is given above. We know that $f'(x) > 0$ for $x_1 < x < 2$, because $f'$ increases, starting at the value $f'(x_1) = 0$. Similarly, $f'(x) > 0$ for $-2 < x < x_2$, because on the interval $[-2, x_2]$, $f'$ decreases toward $f'(x_2) = 0$. Similarly in the middle interval $[x_2, x_1]$. This idea is simple enough, but it is so useful that we had better record it as a theorem:


Theorem 3. If $f$ is increasing on $[x_1, x_2]$, then $f(x) > f(x_1)$ for every $x$ on $(x_1, x_2]$.

We recall that $(x_1, x_2]$ is a half-open interval;
$$(x_1, x_2] = \{x \mid x_1 < x \le x_2\}.$$
We have been talking about the case $f'(x_1) = 0$.

Theorem 4. If $f$ is decreasing on $[x_1, x_2]$, then $f(x) < f(x_1)$ for every $x$ on $(x_1, x_2]$.

PROBLEM SET 5.1

For each function given, state on what intervals the function is increasing, and on what
intervals it is decreasing; and sketch the graph.

1.

f(x)

2. f(x)
3.

f(x)

4.

f(x)

5.

f(x)

6.

f(x)

7.

f(x)

8. f(x)
9.

f(x)

sin x,

-1;:;:;

Sin-1 x,

1
- --2
1
+x

-2;:;:; x;:;:; 2

'

x3 - 3x,

(sin x + cos x)2

'

x 2
x ;:;:; 3

-1 ;:;:;

x;:;:; 2

1,

x3 + 6x2 + 9x + 3,
ew

2;:;:;

-1;:;:;

x3 + 3x2 - 2,

x;:;:; 1

-2;:;:; x;:;:; 2

'

-1 --2
+x

0;:;:; x ;:;:; 1

1 0. f(x) = x In x,

1 ;:;:; x;:;:; 5

11 .

0 ;:;:; x;:;:; 27T

f(x) =cos x,

12. f(x)
13.

f(x)

14.

f(x)

15.

f(x)

6
1 .

f(x)

7
1 .

f(x)

18. f(x)
19.

f(x)

211

0 ;:;:; x ;:;:;

sin 2x
-x

+2x2

1
=

-1 +x 4

-2;:;:; x;:;:; 2
-1;:;:;

xe-w

---4
1 +x

1T

x;:;:; 2

-1

;;;; x ;;;; 1

-1

;;;; x ;:;:; 1

x cos x - sin x
x/2 + sin x,
e"'

2x,

0 ;;=;x;:;:; h
0;:;:; x;:;:; 2

(Here you are not going to be able to get answers in an exact numerical form. The figure should indicate plausible approximations.)

20. Investigate the converse of Theorem 1. That is, find out whether the following statement is true:
Theorem (?). If (i) $f$ is continuous on $[a, b]$, (ii) $f$ is differentiable on $(a, b)$, and (iii) $f$ is increasing on $[a, b]$, then (iv) $f'(x) > 0$ for every $x$ of $(a, b)$.

21. Is the following true?
Theorem (?). If $f$ is differentiable at $x_0$ and $f'(x_0) > 0$, then some chord of the graph of $f$ has a positive slope.

22. Investigate:
Theorem (?). Let $f$ be a function satisfying (i), (ii), and (iii) of Problem 20. Then (iv') $f'(x) \ge 0$ for every $x$ of $(a, b)$.
5.2 LOCAL MAXIMA AND MINIMA,
DIRECTION OF CONCAVITY, INFLECTION POINTS

Again we consider a continuous function $f$, defined on a closed interval $[a, b]$. In the figure, $f(x_2) = M$; and $M$ is the largest value of $f$. We say that $f$ has a maximum at $x_2$; and we say that $M$ is the maximum value of $f$. Similarly, $f(x_3) = m$; and $m$ is the smallest value of $f$. We say that $f$ has a minimum at $x_3$; and we say that $m$ is the minimum value of $f$.

Here when we speak of maxima and minima, we mean maxima and minima on the whole domain of the function $f$; in this case the domain is $[a, b]$. Before you know what is a maximum or minimum, you must first know the domain of the function.

In the figure above, $f(x_1)$ is not a minimum value, because $f(x_3) < f(x_1)$. But $f(x_1)$ is the smallest value that $f$ takes on when $x$ is close to $x_1$. We say that $f$ has a local minimum at $x_1$. This is abbreviated as LMin. Local minima can occur in three ways:

[Figure: the three ways in which a local minimum can occur: on an open interval (x_1 - δ, x_1 + δ), at a left-hand endpoint, and at a right-hand endpoint.]

1) $x_1$ may lie on an open interval $(x_1 - \delta, x_1 + \delta)$, in the domain of $f$; and $f(x_1)$ may be the smallest value of the function on the interval $(x_1 - \delta, x_1 + \delta)$. In this case, we say that $f$ has an interior local minimum at $x_1$. This is abbreviated as ILMin.

2) $x_1$ may be the left-hand endpoint of the domain of $f$; and $f(x_1)$ may be the smallest value of $f$ on an interval $[x_1, x_1 + \delta)$.

3) $x_1$ may be the right-hand endpoint of the domain of $f$; and $f(x_1)$ may be the smallest value of $f$ on an interval $(x_1 - \delta, x_1]$.

Thus, for the function $f$ whose graph is sketched at the beginning of this section, we have local minima at $x_1$ and $x_3$. Note that every minimum is automatically a local minimum, just as the tallest man in the world is automatically the tallest in his own neighborhood.

Local maxima are defined similarly. Local maximum is abbreviated as LMax. A local maximum can occur in three ways:

[Figure: the three ways in which a local maximum can occur: on an open interval (x_1 - δ, x_1 + δ), at a left-hand endpoint, and at a right-hand endpoint.]

In the figure on the left, $f$ has an interior local maximum at $x_1$. This is abbreviated ILMax.

There are simple conditions under which a function has an ILMax or an ILMin at a given point.

Theorem 1. If $f$ is increasing on an interval $[x_1 - \delta, x_1]$ and decreasing on an interval $[x_1, x_1 + \delta]$, then $f$ has an ILMax at $x_1$.

Theorem 2. If $f$ is decreasing on an interval $[x_1 - \delta, x_1]$, and increasing on an interval $[x_1, x_1 + \delta]$, then $f$ has an ILMin at $x_1$.

In applying these, we use the derivative. If $f' > 0$ on $(x_1 - \delta, x_1)$ and $f' < 0$ on $(x_1, x_1 + \delta)$, we can apply Theorem 1. Similarly for Theorem 2. In fact, if you find out where a function is increasing and where it is decreasing, it is always obvious where the interior local maxima and minima are; they are at the turning points, where the graph stops behaving in one way and starts behaving in the other way.
Most of the time, for functions defined on a closed interval, the endpoints of the interval give either local maxima or local minima. Therefore, if we are investigating a function for local maxima and minima, we always investigate the endpoints. Of course, interior local maxima and minima may occur anywhere in the interior of the interval. In searching for them, we use the theorem suggested by the figure below. If the function is differentiable, then at an interior local maximum the derivative must be 0.

[Figure: graph with a horizontal tangent at an interior local maximum, f'(x_1) = 0.]

Theorem 3. If $f$ has an ILMax at $x_1$, and $f$ is differentiable at $x_1$, then $f'(x_1) = 0$.

This is geometrically obvious, and a logical proof is also easy. Let
$$m(x) = \frac{f(x) - f(x_1)}{x - x_1},$$
so that
$$\lim_{x \to x_1} m(x) = f'(x_1).$$

1) Suppose that $f'(x_1) > 0$. Then the function $m(x)$ must be $> 0$ when $x \approx x_1$. Take $x \approx x_1$, with $x > x_1$. Then
$$x \approx x_1 \text{ and } x > x_1 \;\Rightarrow\; m(x) > 0 \text{ and } x - x_1 > 0$$
$$\Rightarrow\; m(x)(x - x_1) > 0 \;\Rightarrow\; f(x) - f(x_1) > 0 \;\Rightarrow\; f(x) > f(x_1),$$
which is impossible, because $f$ has an ILMax at $x_1$.

2) Suppose that $f'(x_1) < 0$. Then the function $m(x)$ must be $< 0$ when $x \approx x_1$. Take $x \approx x_1$, with $x < x_1$. Then
$$x \approx x_1 \text{ and } x < x_1 \;\Rightarrow\; m(x) < 0 \text{ and } x - x_1 < 0$$
$$\Rightarrow\; m(x)(x - x_1) > 0 \;\Rightarrow\; f(x) - f(x_1) > 0 \;\Rightarrow\; f(x) > f(x_1),$$
which is impossible.

Since (1) and (2) are both impossible, it follows that $f'(x_1) = 0$, which was to be proved.

This is the standard method for finding an ILMax. Given a differentiable function $f$, we find the points $x$ where $f'(x) = 0$. Usually there are only a finite number of such points. These are the only possible places where interior local maxima can occur. Therefore we have only a finite number of values of $x$ to investigate; and when we are done, our list of interior local maxima is complete.

Note, however, that the converse of Theorem 3 is false: if $f'(x_1) = 0$, it does not follow that $f$ has a local maximum (or a local minimum) at $x_1$. For example, if
$$f(x) = x^3, \qquad -1 \le x \le 1,$$
then $f'(0) = 0$, but $f$ is increasing on the whole interval $[-1, 1]$.
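
The distinction between "the derivative vanishes" and "there is an interior local extremum" can be illustrated with a crude sign test. The Python sketch below is only a heuristic under stated assumptions: `classify` is a hypothetical helper, it samples $f'$ at just one point on each side of the candidate, and the step `delta` is an arbitrary choice, so it illustrates the increasing/decreasing criteria of this section rather than proving anything.

```python
def classify(f_prime, x1, delta=1e-3):
    """Guess the behavior at a point x1 where f'(x1) = 0 from the sign of f' nearby."""
    left = f_prime(x1 - delta)
    right = f_prime(x1 + delta)
    if left > 0 and right < 0:
        return "ILMax"    # increasing, then decreasing
    if left < 0 and right > 0:
        return "ILMin"    # decreasing, then increasing
    return "neither"      # no sign change detected

print(classify(lambda x: 3 * x**2, 0.0))   # f(x) = x^3: f'(0) = 0, but no extremum
print(classify(lambda x: 2 * x - 1, 0.5))  # f(x) = x^2 - x
print(classify(lambda x: -2 * x, 0.0))     # f(x) = -x^2
```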

We have a similar theorem for interior local minima:

Theorem 4. If $f$ has an ILMin at $x_1$, and $f$ is differentiable at $x_1$, then $f'(x_1) = 0$.

Proof. Let
$$g(x) = -f(x).$$
Then $g$ has an ILMax at $x_1$. Therefore $g'(x_1) = 0$. Therefore $f'(x_1) = -g'(x_1) = -0 = 0$, which was to be proved.


If $f'$ is increasing on an interval $[x_1, x_2]$, then $f$ is concave upward on $[x_1, x_2]$. (You ought to be able to convince yourself that this is a reasonable use of language.) If $f'$ is decreasing on $[x_2, x_3]$, then $f$ is concave downward on $[x_2, x_3]$. In the figure on the right, $x_2$ is the point at which the direction of concavity changes. Such a point is called an inflection point. Of course, the direction of concavity can change from up to down or from down to up. Hence:

Definition. An inflection point of a function $f$ is a point at which $f'$ has either an ILMax or an ILMin.

Note the way in which these definitions fit together. If you know how to investigate (a) increasing, (b) decreasing, (c) interior local maxima, and (d) interior local minima, then automatically you know how to investigate direction of concavity and inflection points. The reason is that $f'$, once you get it, is a function, and can be investigated in the same way as any other function, with the aid of its derivative $f''$. Where $f'$ increases, $f$ is concave upward; where $f'$ decreases, $f$ is concave downward; and where $f'$ has an interior local maximum or minimum, $f$ has an inflection point.

Most of the time, we investigate local maxima and local minima because we want to find the maxima and minima. We find the maxima and minima, on the whole domain, by looking to see which local maximum value is the largest and which local minimum value is the smallest.

Finally, we observe that a function may easily have a local maximum or minimum at an endpoint at which it is not differentiable. For example, the function $f(x) = x^{2/3}$ $(0 \le x < \infty)$ has a minimum (and hence a local minimum) at $x = 0$. The theory takes care of this case. Since the derivative $\tfrac{2}{3}x^{-1/3}$ is positive in the interior of the interval $[0, \infty)$, it follows that the function is increasing, and so it has a minimum at the left-hand endpoint.

PROBLEM SET 5.2

1 through 19. For each of the functions described in Problems 1 through 19 of the preceding problem set, find the local maxima, the local minima, the maximum, the minimum, the inflection points (if any), and the image. (The image will always turn out to be a closed interval.) Tell where each of the functions is concave upward and where it is concave downward.
20. Consider the function defined by the following conditions:
1

f(x)

= x sin

/(0)

for 0 <

'TT

0.

An exact sketch is not practical, because the ILMax and ILMin points are hard to calculate. Give a rough sketch, however, indicating as well as you can how the function behaves. Is it continuous at 0? Does it have a local maximum or minimum at 0? Is it differentiable at 0?

*21. Suppose that $f$ is both continuous and differentiable on $[0, 1]$. Does it follow that $f$ has an LMax or an LMin at 0? Why or why not?


5.3 THE BEHAVIOR OF FUNCTIONS AT INFINITY

So far, we have been discussing functions on closed intervals. In this section, we shall consider larger domains, including infinite intervals, such as $(-\infty, \infty)$, $[0, \infty)$, and so on, and also intervals with holes in them. For example, the domain of the tangent is
$$D = \{x \mid x \ne \pi/2 + n\pi\};$$
and this is an infinite interval $(-\infty, \infty)$ with infinitely many holes in it.

Most of the ideas that we shall be investigating are illustrated by a simple function, whose domain has a hole in it at 0.

[Figure: on the left, the graph of f(x) = 1/x (x ≠ 0); on the right, the graph of f(x) = 1/(x^2 - 1) (x ≠ ±1).]

A careful inspection of the left-hand graph above will give you an idea of the meanings of the following statements:

$\lim_{x \to \infty} f(x) = 0$,   (1)
$\lim_{x \to -\infty} f(x) = 0$,   (2)
$\lim_{x \to 0^+} f(x) = \infty$,   (3)
$\lim_{x \to 0^-} f(x) = -\infty$.   (4)

Definitions will come later.

Meanwhile, let us look at another example. The function whose graph is shown on the right above has the following properties:

i) $f$ has an interior local maximum at 0. (At $x = 0$, the denominator is $-1$. Everywhere else near 0, $x^2 > 0$, $x^2 - 1 > -1$, and $1/(x^2 - 1) < -1$.)

ii) $\lim_{x \to 1^-} f(x) = -\infty$.

iii) $\lim_{x \to 1^+} f(x) = \infty$.

iv) $\lim_{x \to -1^+} f(x) = -\infty$.

v) $\lim_{x \to -1^-} f(x) = \infty$.

Here statements (ii) through (v) mean the things that the figure suggests. An examination of the formula shows why the figure is right. For example, if $x > 1$, and $x \approx 1$, then $x^2 - 1 > 0$, and $x^2 - 1 \approx 0$. Therefore $1/(x^2 - 1)$ is positive and very large. This is shown in the figure and stated by (iii). Similarly, if $x < 1$ and $x \approx 1$, then $x^2 - 1 < 0$ and $x^2 - 1 \approx 0$. Therefore $1/(x^2 - 1)$ is negative and is numerically very large. This is shown in the figure and stated by (ii).
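
A few evaluations make the one-sided behavior near 1 concrete. The Python sketch below tabulates $1/(x^2 - 1)$ just to the right and just to the left of 1; the step sizes are arbitrary choices for this illustration.

```python
def f(x):
    return 1.0 / (x * x - 1.0)

# Approach 1 from the right and from the left with shrinking distance d.
for d in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"d = {d:g}:  f(1 + d) = {f(1 + d):.2f}   f(1 - d) = {f(1 - d):.2f}")

print(f(0.0))  # the value -1.0 at the interior local maximum
```

The right-hand values grow without bound through positive numbers and the left-hand values through negative numbers, matching statements (iii) and (ii).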
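Let us now make this precise, by stating definitions that we can work with.

Definition. $\lim_{x \to \infty} f(x) = L$ means that for every $\epsilon > 0$ there is an $M$ such that
$$x > M \;\Rightarrow\; L - \epsilon < f(x) < L + \epsilon.$$

This is like the definition of $\lim_{x \to x_0} f(x) = L$. Roughly,
$$\lim_{x \to x_0} f(x) = L \quad \text{means} \quad x \approx x_0 \Rightarrow f(x) \approx L,$$
and
$$\lim_{x \to \infty} f(x) = L \quad \text{means} \quad x \approx \infty \Rightarrow f(x) \approx L.$$
In the definitions, the condition $x \approx x_0$ is expressed by $0 < |x - x_0| < \delta$, and the condition $x \approx \infty$ is expressed by $x > M$.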
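Let us see how our definition of $\lim_{x\to\infty}$ applies to the function

$$f(x) = \frac{1}{x} \qquad (x \ne 0).$$

We claim that

$$\lim_{x\to\infty} \frac{1}{x} = 0.$$

Under the definition, given $\varepsilon > 0$, we are supposed to find an $M$ such that

$$-\varepsilon < \frac{1}{x} < \varepsilon \quad\text{whenever}\quad x > M.$$

This is trivial: take $M = 1/\varepsilon$. When $x > 1/\varepsilon$, obviously $0 < 1/x < \varepsilon$. Similarly:

Definition. $\lim_{x\to-\infty} f(x) = L$ means that for every $\varepsilon > 0$ there is an $M$ such that

$$x < M \;\Rightarrow\; L - \varepsilon < f(x) < L + \varepsilon.$$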
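The choice $M = 1/\varepsilon$ for $f(x) = 1/x$ can be spot-checked numerically. The following is only a sketch: the names `M_for`, `f`, and `checks` are illustrative and do not come from the text, and testing a few sample points does not replace the argument given above.

```python
def M_for(eps):
    """Cutoff suggested in the text for f(x) = 1/x: M = 1/eps (eps > 0)."""
    return 1.0 / eps

def f(x):
    return 1.0 / x

L = 0.0          # the claimed limit of 1/x as x -> infinity
checks = []
for eps in (0.5, 0.01, 1e-6):
    M = M_for(eps)
    # A few sample points beyond the cutoff M (all strictly greater than M).
    samples = [M * t for t in (1.001, 2.0, 10.0, 1e3)]
    ok = all(L - eps < f(x) < L + eps for x in samples)
    checks.append(ok)
    print(f"eps={eps:g}  M={M:g}  samples inside the band: {ok}")
```

Each tolerance gets its own cutoff; making $\varepsilon$ smaller pushes $M$ farther out, which is exactly what the definition allows.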


In the same spirit:

Definition. $\lim_{x\to x_0^+} f(x) = \infty$ means that for every $M$ there is a $\delta > 0$ such that

$$x_0 < x < x_0 + \delta \;\Rightarrow\; f(x) > M.$$

[Figure: the graph of $f$ rising above the level $M$ for $x$ between $x_0$ and $x_0 + \delta$.]

That is, you can make $f(x)$ as big as you want (i.e., $> M$) by taking $x$ to the right
of $x_0$ and very close to $x_0$ (i.e., between $x_0$ and $x_0 + \delta$).

We need to talk about one-sided limits (as $x \to x_0^+$ or $x \to x_0^-$) because these
often turn out to be different. In some cases, however, the one-sided limits have the
same value. In such cases, $\lim_{x\to x_0} f(x)$ must exist, and must be their common value.
Thus

$$\lim_{x\to 0} \frac{1}{x^2} = \infty.$$

The following two theorems justify the remarks that were made above about
$f(x) = 1/(x^2 - 1)$.

Theorem 1. Suppose that $f(x) > 0$ on an interval $(x_0, x_1)$. If $\lim_{x\to x_0^+} f(x) = 0$, then

$$\lim_{x\to x_0^+} \frac{1}{f(x)} = \infty.$$

Proof. Given $M > 0$. We need to find a $\delta > 0$ such that $1/f(x) > M$ whenever
$x_0 < x < x_0 + \delta$. Let $\varepsilon = 1/M$. By the definition of the statement
$\lim_{x\to x_0^+} f(x) = 0$, we know that there is a $\delta > 0$ such that

$$f(x) < \varepsilon \quad\text{whenever}\quad x_0 < x < x_0 + \delta.$$

This is the $\delta$ that we want: when $x_0 < x < x_0 + \delta$, we have

$$f(x) < \frac{1}{M},$$

and hence

$$\frac{1}{f(x)} > M.$$

(Remember that $M > 0$, and $f(x) > 0$ for the values of $x$ that we are interested in.)
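The proof of Theorem 1 is a recipe: from $M$ it produces $\varepsilon = 1/M$, and from $\varepsilon$ a $\delta$. As an illustration only, the sketch below carries the recipe out for $f(x) = x^2 - 1$ on the interval $(1, 2)$, with $x_0 = 1$. The helper `delta_for` and the sample points are assumptions made for this example.

```python
import math

def f(x):
    return x * x - 1.0        # positive on (1, 2); f(x) -> 0 as x -> 1+

def delta_for(M):
    """Given M > 0, return a delta > 0 such that 1/f(x) > M on (1, 1 + delta)."""
    eps = 1.0 / M
    # For x > 1:  f(x) < eps  <=>  x < sqrt(1 + eps).
    return math.sqrt(1.0 + eps) - 1.0

x0 = 1.0
results = []
for M in (10.0, 1e3, 1e6):
    d = delta_for(M)
    xs = [x0 + d * t for t in (0.1, 0.5, 0.9)]   # points strictly inside (x0, x0 + d)
    results.append(all(1.0 / f(x) > M for x in xs))
    print(f"M={M:g}  delta={d:.3e}  1/f(x) > M at the samples: {results[-1]}")
```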

Similarly, we have:

Theorem 2. Suppose that $f(x) < 0$ on an interval $(x_0, x_1)$. If $\lim_{x\to x_0^+} f(x) = 0$, then

$$\lim_{x\to x_0^+} \frac{1}{f(x)} = -\infty.$$

Proof. Given $M < 0$. Let $\varepsilon = -1/M > 0$. Let $\delta$ be a positive number such that

$$-\varepsilon < f(x) \quad\text{whenever}\quad x_0 < x < x_0 + \delta.$$

When $x_0 < x < x_0 + \delta$, we have

$$-\varepsilon < f(x), \qquad -\frac{\varepsilon}{f(x)} > 1, \qquad \frac{1}{f(x)} < -\frac{1}{\varepsilon} = M.$$

(Here we have been reversing inequalities, because we have been dividing by negative
numbers.)
Following the analogy of the above definitions, you ought to be able to write
your own definitions of the following statements:

$$\lim_{x\to\infty} f(x) = \infty, \qquad \lim_{x\to-\infty} f(x) = \infty,$$

$$\lim_{x\to\infty} f(x) = -\infty, \qquad \lim_{x\to-\infty} f(x) = -\infty.$$

Consider now the question of

$$(?) \qquad \lim_{x\to\infty} \left(1 + \frac{1}{x}\right)^x \qquad (?)$$

We use question marks, because it is not obvious that the indicated limit exists at all:
as $x \to \infty$, $1/x \to 0$, and $1 + 1/x \to 1$. Therefore we have an "indeterminate form
of the type $1^\infty$." We recall, however, a similar situation before where we got an
answer like this:

$$\lim_{h\to 0} (1 + h)^{1/h} = e = \ln^{-1} 1.$$

This was also of the form "$1^\infty$." And the two are related: if we let

$$f(u) = (1 + u)^{1/u} \quad\text{and}\quad g(x) = \frac{1}{x},$$

then

$$\left(1 + \frac{1}{x}\right)^x = f(g(x)),$$

and we want to find

$$\lim_{x\to\infty} f(g(x)).$$

This is like the situation in Section 4.5. There we found:

Theorem. If $\lim_{x\to x_0} g(x) = u_0 = g(x_0)$ and $\lim_{u\to u_0} f(u) = f(u_0)$, then

$$\lim_{x\to x_0} f(g(x)) = f(u_0).$$

For the case in which $x \to \infty$, instead of $x \to x_0$, this theorem is still true, and
the proof is virtually the same. That is:

Theorem 3. If $\lim_{x\to\infty} g(x) = u_0$ and $\lim_{u\to u_0} f(u) = L$, then

$$\lim_{x\to\infty} f(g(x)) = L.$$

Roughly speaking, the reason is that

$$x \approx \infty \;\Rightarrow\; g(x) \approx u_0 \;\Rightarrow\; f(g(x)) \approx L.$$

In fact, the same result holds if $\lim_{x\to\infty} g(x) = \infty$.

Theorem 4. If $\lim_{x\to\infty} g(x) = \infty$ and $\lim_{u\to\infty} f(u) = L$, then

$$\lim_{x\to\infty} f(g(x)) = L.$$

These theorems give quick answers to some rather hard-looking problems.
Returning to our discussion of

$$f(u) = (1 + u)^{1/u}, \qquad g(x) = \frac{1}{x}, \qquad f(g(x)) = \left(1 + \frac{1}{x}\right)^x,$$

we get immediately:

Theorem 5. $\lim_{x\to\infty} (1 + 1/x)^x = e$.

This limit is used as a definition of $e$, in some treatments of exponentials and
logarithms. In such a treatment, the formula $e = \lim_{h\to 0} (1 + h)^{1/h}$ appears as a
theorem.
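A short numerical experiment illustrates Theorem 5 and the composition $f(g(x))$ used to obtain it. This is a sketch for illustration; the sample values of $x$ are arbitrary choices, and floating-point evaluation gives evidence about the limit, not a proof.

```python
import math

def f(u):
    return (1.0 + u) ** (1.0 / u)     # f(u) = (1 + u)^(1/u), u != 0

def g(x):
    return 1.0 / x                    # g(x) = 1/x, x != 0

values = []
for x in (10.0, 1e3, 1e6):
    value = f(g(x))                   # (1 + 1/x)^x, up to rounding
    values.append(value)
    print(f"x={x:g}  f(g(x))={value:.8f}  e - f(g(x))={math.e - value:.2e}")
```

The printed gaps shrink as $x$ grows, which is consistent with the limit being $e$.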

PROBLEM SET 5.3

Investigate the following functions for maxima, minima, local maxima, local minima,
direction of concavity, and inflection points. Then investigate for limits of the sort defined
in this section.
1. f(x)

1
=

x(x

- 2)

(x >'6 0, x >'6 2)


2 f(x)

3.

f(x)

(x - l)(x -

4. f(x)

8. f(x)

IO. f(x)

x-+oo

3)

(x -2, x

x
-2
x + 1
1

x + 1
3

x -x

)"'2

( -1 x)

1
1 + 2
X

( Ir
( r
3. 3r2
(

13.
16.

lim

x-,,12

0)

xz

7. f(x)

9. f(x)

-2
x + 1

x + 1

-3

x +
1-

x3

-3

+x

(-1 x)

(-1

x)

x-o+

Jim (x)1f(x-ll

x-1

1 +-

18. f(x)

1 +

21. /(x)

2x

(x

14. Jim o + v':X)11v'x

+ cosx)secx

(1

11 . f (x)

(x 0, x 1, x -1)

Investigate the following, for lim x-oo .

20. f(x)

5. f(x)

15. Jim (1 + x4)11x'


x-o

17. f(x)

3)

-2
x + 1

Investigate:

12. lim

( x ;= l, x ;=

2
x -x -6

6. I <x)

3)

5.3

(1

2
1 +x
+

(1 L)"'
( -rx

19. f(x)

22.

e-xy'

l +

ln x

1 + x

24. Discuss as in Problems 1 through 11:

$$f(x) = \frac{\ln x}{x} \qquad (x > 0).$$

(Here the sticky point is $\lim_{x\to\infty}$. You ought to be able to figure out what this limit is,
and convince yourself that your answer must be right. But to prove that the answer is
right is an unreasonably hard problem, at this stage.)

25. Find $\lim_{x\to 0^+} x \ln (1/x)$. You need not prove that your answer is right.
26. Is there such a thing as $\lim_{x\to\infty} \sin x$? Why or why not?
27. Is there such a thing as $\lim_{x\to\infty} (1/x) \sin x$? Why or why not?
28. Prove the following:

Theorem (The squeeze principle). If

$$f(x) \le g(x) \le h(x) \qquad (x \ge a),$$

and

$$\lim_{x\to\infty} f(x) = \lim_{x\to\infty} h(x) = L,$$

then $\lim_{x\to\infty} g(x) = L$.

29. If you borrow a dollar for a year, at 100% simple interest, then at the end of the year
you owe 2 dollars. (A certain Marcus Junius Brutus lent money at this rate, in the first
century B.C. He was also an assassin.) If interest is compounded semiannually, then
at the end of the year you owe

$$\left(1 + \tfrac{1}{2}\right)^2 = \$2.25.$$

If the interest is compounded $n$ times a year, then you owe

$$\left(1 + \frac{1}{n}\right)^n$$

dollars. Suppose now that interest is compounded continuously: the bank passes to the limit,
as $n$ increases without limit, and at the end of the year they charge you the limit. How
much do you owe?

30. Suppose that the basic interest rate is 6%, but interest is compounded continuously,
as in Problem 29. How much do you owe? (To get a numerical answer to this one,
you will need to use one of the tables at the end of the book.)

5.4 THE INTRODUCTION OF FUNCTIONS INTO GEOMETRIC PROBLEMS;
THE USE OF EXISTENCE THEOREMS AS SHORTCUTS

On several occasions already we have been confronted with problems which did not
appear to involve functions, and have solved them by introducing functions.
For example, in Section 3.7 we wanted to find the area under the graph of $y = x^4$,
from 0 to 1.

[Figure: the region under the graph of $y = t^4$ from 0 to $x$, with area $F(x)$.]

We solved this problem by attacking the more general problem of calculating the
function

$$F(x) = \int_0^x t^4 \, dt.$$

We found that

$$F(x) = \frac{x^5}{5},$$

and then set $x = 1$ to get the answer $\tfrac{1}{5}$.

Similarly, in Section 4.10 we wanted to show that

$$\ln (ab) = \ln a + \ln b,$$

for every pair of positive numbers $a$, $b$. To use the methods of calculus, we had to
introduce functions into the problem. Given $k > 0$, we set

$$f(x) = \ln kx \qquad (x > 0),$$
$$g(x) = \ln k + \ln x \qquad (x > 0).$$

We then found that $f'(x) = g'(x)$ for every $x$, and $f(1) = g(1)$. It followed that
$f = g$; and this proved our theorem.
We use the same kind of method to attack problems in maxima and minima which
may be stated in geometric or physical terms. Consider some examples.
Problem 1. A segment of length 1 has its endpoints on the sides of a right angle.
What position for the segment gives maximum area for the resulting triangle?

[Figure: on the left, the segment with its endpoints on the sides of a right angle; on the
right, the same configuration placed in a coordinate system.]

The first step is to introduce a coordinate system, as shown on the right above.
The endpoints of the segment now lie on the positive ends of the axes.
Let $x$ be the x-coordinate of the endpoint that lies on the x-axis; and let the other
endpoint be $(0, y)$. When $x$ is named, $y$ is determined. Thus there is a function $f$
which gives $y$ in terms of $x$. Since

$$x^2 + y^2 = 1,$$

we have

$$f(x) = \sqrt{1 - x^2} \qquad (0 \le x \le 1).$$

And for each $x$, the area enclosed is

$$A(x) = \tfrac{1}{2} x f(x) = \tfrac{1}{2} x \sqrt{1 - x^2}.$$

We need to investigate the function $A$ for maxima. Now

$$A'(x) = \frac{1}{2}\left[x \cdot \frac{-x}{\sqrt{1 - x^2}} + \sqrt{1 - x^2}\right]
       = \frac{1}{2} \cdot \frac{-x^2 + (1 - x^2)}{\sqrt{1 - x^2}}
       = \frac{1}{2} \cdot \frac{1 - 2x^2}{\sqrt{1 - x^2}} \qquad (0 < x < 1).$$

Therefore $A'(x) = 0$ when $x = \pm\sqrt{2}/2$. Since we are concerned only with numbers
on the interval $[0, 1]$, only $x = \sqrt{2}/2$ is of interest to us. Here $A = \tfrac{1}{4}$. Any maximum
of $A$ is surely an ILMax, because $A(0) = A(1) = 0$, and $A(x) > 0$ for $0 < x < 1$.

Therefore our problem is solved, with $A = \tfrac{1}{4}$, if we know the following theorem:

Theorem 1 (Existence of maxima). If $f$ is continuous on $[a, b]$, then $f$ has a maximum
value on $[a, b]$.

[Figure: two graphs on $[a, b]$; on the left the maximum is an ILMax, on the right it occurs
at the endpoint $b$.]

The maximum may be an ILMax, as on the left above, or it may be at an endpoint,
as on the right. But in many cases, like the one we have just been discussing, it is
plain that the second of these possibilities does not arise. In such cases, we can infer
immediately that the maximum is an ILMax. If the derivative vanishes at only one
point, then this point must be the maximum.
We shall prove Theorem 1 in Section 5.6. Meanwhile, let us look at some more
applications of it. In the preceding example, there are other functions that we might
equally well have introduced.
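[Figure: the segment, with the angle $\theta$ at its endpoint $P$.]

If the angle at $P$ has measure $\theta$ $(0 \le \theta \le \pi/2)$, then

$$y = \sin\theta, \qquad x = \cos\theta,$$

and

$$A(\theta) = \tfrac{1}{2} xy = \tfrac{1}{2} \sin\theta \cos\theta = \tfrac{1}{4} \sin 2\theta.$$

Therefore

$$A'(\theta) = \tfrac{1}{4} (\cos 2\theta) \cdot 2 = \tfrac{1}{2} \cos 2\theta.$$

The only point $\theta$ on the interval $[0, \pi/2]$ where $A'(\theta) = 0$ is the point where

$$\theta = \frac{\pi}{4}.$$

We claim, without further investigation of derivatives, that this must be where the
maximum occurs. (As in the previous discussion, there must be a maximum somewhere;
this is not at an endpoint 0 or $\pi/2$; it is therefore an interior local maximum;
at an ILMax, $A'(\theta) = 0$; and $\theta = \pi/4$ is the only point of the interval at which
$A'(\theta) = 0$.)
Setting $\theta = \pi/4$, we get the maximum value of $A$ as

$$A\left(\frac{\pi}{4}\right) = \frac{1}{4} \sin\left(2 \cdot \frac{\pi}{4}\right) = \frac{1}{4} \sin\frac{\pi}{2} = \frac{1}{4},$$

as before.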
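The two ways of introducing a function into Problem 1 can be compared numerically. The sketch below searches a fine grid for the largest value of each area function; the names `area_x`, `area_theta`, and the grid size `N` are illustrative assumptions, and a grid search only approximates what the existence theorem and the derivative argument establish.

```python
import math

def area_x(x):
    """Area as a function of the x-coordinate of the lower endpoint, 0 <= x <= 1."""
    return 0.5 * x * math.sqrt(1.0 - x * x)

def area_theta(theta):
    """Area as a function of the angle theta at P, 0 <= theta <= pi/2."""
    return 0.25 * math.sin(2.0 * theta)

N = 100_000
xs = [i / N for i in range(N + 1)]
thetas = [(math.pi / 2) * i / N for i in range(N + 1)]

best_x = max(xs, key=area_x)
best_theta = max(thetas, key=area_theta)

print(best_x, math.sqrt(2) / 2, area_x(best_x))
print(best_theta, math.pi / 4, area_theta(best_theta))
```

Both searches land near the isosceles position and give an area close to $\tfrac{1}{4}$.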
On reflection, you may find a way to solve this problem by purely geometrical
methods, without taking any derivatives or even introducing any functions. The
geometric method is easier if you think of it. Even in cases where elementary methods
can be made to work, however, calculus does the same job methodically.
Problem 2. In a coordinate plane, let $A = (0, 1)$ and $B = (3, 2)$, as shown in the
figure. What is the length of the shortest path from $A$ to the x-axis to $B$? And where
should the path touch the x-axis, for this minimum to be attained?
In other words, for what choice of $P = (x, 0)$ is the sum of the distances $AP$
and $PB$ as small as possible?

[Figure: the points $A = (0, 1)$ and $B = (3, 2)$, and a path from $A$ to $P = (x, 0)$ to $B$.]

Solution. Let

$$f(x) = AP + PB = \sqrt{1^2 + x^2} + \sqrt{(3 - x)^2 + 2^2} = \sqrt{1 + x^2} + \sqrt{x^2 - 6x + 13}.$$

Then

$$f'(x) = \frac{x}{\sqrt{1 + x^2}} + \frac{x - 3}{\sqrt{x^2 - 6x + 13}}
       = \frac{x\sqrt{x^2 - 6x + 13} + (x - 3)\sqrt{1 + x^2}}{\sqrt{1 + x^2}\,\sqrt{x^2 - 6x + 13}}.$$

Therefore $f'(x) = 0$ only when

$$x^2(x^2 - 6x + 13) = (x^2 - 6x + 9)(x^2 + 1),$$

or

$$x^4 - 6x^3 + 13x^2 = x^4 - 6x^3 + 9x^2 + x^2 - 6x + 9,$$

or

$$3x^2 + 6x - 9 = 0,$$

or

$$x^2 + 2x - 3 = 0,$$

or

$$(x + 3)(x - 1) = 0.$$

To examine second derivatives looks hard. Let us try to use reasoning instead.

1) When $x$ decreases past 0, $AP$ increases, and so does $PB$. The same is true when $x$
increases past 3. Therefore, in searching for a minimum, we can restrict the search to,
say, the interval $[-1, 4]$.

2) Suppose that we know that the function has a minimum, somewhere on the
interval $[-1, 4]$. Then the minimum must be an ILMin, at which $f'(x) = 0$. There
is only one such point on our interval, namely, $x = 1$. Therefore the minimum
must be at $x = 1$.
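The conclusion of Problem 2 can be checked numerically. The sketch below evaluates the path length on a grid over $[-1, 4]$ and also evaluates the derivative at the two roots of $(x + 3)(x - 1) = 0$. The function names and the grid size are assumptions made for the illustration. Note that $x = -3$ satisfies the squared equation but does not make the derivative zero; it enters only because both sides were squared.

```python
import math

def f(x):
    """Length of the path A -> P -> B with A = (0, 1), P = (x, 0), B = (3, 2)."""
    return math.hypot(x, 1.0) + math.hypot(3.0 - x, 2.0)

def fprime(x):
    """Derivative of f, term by term."""
    return x / math.hypot(x, 1.0) + (x - 3.0) / math.hypot(3.0 - x, 2.0)

N = 50_000
grid = [-1.0 + 5.0 * i / N for i in range(N + 1)]    # the interval [-1, 4]
x_min = min(grid, key=f)

print("grid minimizer:", x_min, " f there:", f(x_min))
print("f'(1)  =", fprime(1.0))
print("f'(-3) =", fprime(-3.0))
```

At $x = 1$ the length is $\sqrt{2} + \sqrt{8} = 3\sqrt{2}$.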

Obviously, to complete this discussion, we need the following theorem:

Theorem 2 (Existence of minima). If $f$ is continuous on $[a, b]$, then $f$ has a minimum
value on $[a, b]$.

The proof is easy, granted that Theorem 1 is true. Since $-f$ is continuous, it has
a maximum; and any maximum of $-f$ is a minimum of $f$.

[Figure: the graphs of $f$ and $-f$ on $[a, b]$.]

Here again, once the problem is solved, you may be able to think of a simpler
attack on it. But the methods of calculus work in any case.

Problem 3. Find the right circular cylinder of largest volume, inscribed in a sphere
of radius 1.
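[Figure: a plane cross section of the sphere of radius 1 and the inscribed cylinder.]

To avoid a difficult drawing problem, we show not the three-dimensional figure
but merely a plane cross section of it. One way to introduce a function into this
problem is to express the volume of the cylinder as

$$V(x) = \pi x^2 \cdot 2\sqrt{1 - x^2} = 2\pi x^2 \sqrt{1 - x^2} \qquad (0 \le x \le 1),$$

where $x$ is the radius of the cylinder. This gives

$$V'(x) = 2\pi\left[2x\sqrt{1 - x^2} + x^2 \cdot \frac{-x}{\sqrt{1 - x^2}}\right] = 2\pi \cdot \frac{2x - 3x^3}{\sqrt{1 - x^2}}.$$

Therefore

$$V'(x) = 0 \;\Leftrightarrow\; 2x - 3x^3 = 0 \;\Leftrightarrow\; x(3x^2 - 2) = 0.$$

Since $x$ must lie on $[0, 1]$, we find that $V'(x) = 0$ only when $x = 0$ or $x = \sqrt{2/3}$.
Now $V$ has a maximum, because $V$ is continuous on $[0, 1]$. And this must be an
ILMax, because $V(0) = 0 = V(1)$, and $V(x) > 0$ everywhere else on $[0, 1]$. Therefore
$V'(x) = 0$ at the maximum. Therefore the maximum occurs at $x = \sqrt{2/3}$. Hence
the maximum volume is

$$V\left(\sqrt{2/3}\right) = 2\pi \cdot \frac{2}{3} \cdot \sqrt{\frac{1}{3}} = \frac{4\pi}{3\sqrt{3}}.$$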

There is another function that we might have used to solve the same problem. We
might have written

$$V(y) = \pi r^2 h = \pi x^2 \cdot 2y = \pi(1 - y^2) \cdot 2y = 2\pi(y - y^3).$$

This would give

$$V'(y) = 2\pi(1 - 3y^2),$$

so that

$$V'(y) = 0 \;\Leftrightarrow\; 3y^2 = 1 \;\Leftrightarrow\; y = \sqrt{1/3} \ \text{ or } \ y = -\sqrt{1/3}.$$

Here again only the positive number applies, because $y$ must be on the interval $[0, 1]$.
As before, we conclude that the maximum value occurs at $y = \sqrt{1/3}$.

The second method is simpler. This sort of thing happens often. It is therefore a
good idea to have a quick look at all of the functions that it seems natural to try,
before doing any hard work with any one of them. If the first function that you try
looks simple, there is no point in examining others.
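Both functions for Problem 3 can be compared side by side. The sketch below searches a grid for the maximum of each; `V_of_x`, `V_of_y`, and the grid size are illustrative names and choices, and the closed form $4\pi/(3\sqrt{3})$ is simply the value of $2\pi(y - y^3)$ at $y = \sqrt{1/3}$.

```python
import math

def V_of_x(x):
    """Volume in terms of the radius x of the cylinder (half-altitude sqrt(1 - x^2))."""
    return 2.0 * math.pi * x * x * math.sqrt(1.0 - x * x)

def V_of_y(y):
    """Volume in terms of the half-altitude y (radius squared is 1 - y^2)."""
    return 2.0 * math.pi * (y - y ** 3)

N = 100_000
grid = [i / N for i in range(N + 1)]          # the interval [0, 1]
x_best = max(grid, key=V_of_x)
y_best = max(grid, key=V_of_y)
V_exact = 4.0 * math.pi / (3.0 * math.sqrt(3.0))

print(x_best, math.sqrt(2.0 / 3.0), V_of_x(x_best))
print(y_best, math.sqrt(1.0 / 3.0), V_of_y(y_best))
print(V_exact)
```

The two searches describe the same cylinder: the best radius and the best half-altitude satisfy $x^2 + y^2 = 1$ up to the grid spacing.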
Our third problem shows a danger which should be remembered hereafter. We
might have supposed that the inscribed cylinder attains its maximum volume at the
stage where the inscribed rectangle (in the cross section) attains its maximum area.
But this is false: it is easy to show that the inscribed rectangle of maximum area is a
square; and the cross section of the maximal cylinder is a rectangle of base $2\sqrt{2/3}$ and
altitude $2\sqrt{1/3}$. Therefore we should never assume without proof that two
maximum-minimum problems are equivalent.

A further word of caution: In establishing that a certain $x_0$ gives a maximum or
minimum, you may use the theorems of the preceding sections. Under certain conditions,
you may avoid these theorems (and the calculations that they require) by the
sort of reasoning that we have used in the problems above. But in any case, you must
use either the theorems of the preceding sections or a reasoning process which justifies
your conclusions. To find a point $x_0$ where a derivative vanishes and hence infer that
your problem is solved is a mistake. For one thing, $x_0$ may give a minimum when you
were looking for a maximum, or vice versa. For another thing, $x_0$ may give a point
of inflection.
PROBLEM SET 5.4

1. Find the area of the largest rectangle that can be inscribed in a semicircle of radius $a$.

2. Find the area of the largest rectangle that can be inscribed in an equilateral triangle
whose sides have length $a$.

3. Find the area of the triangle with the smallest area which contains a square with side $a$.

4. Find the perimeter of the triangle with the smallest perimeter which contains a square
with side $a$.

5. A rectangular field has one side along a river and a fence along the other three sides.
If the total length of the fence is $k$, what is the maximum possible area of the field?

6. Given a rectangular field with one side along a river, as in Problem 5. If the area of
the field is $A$, what is the minimum possible length of the fence?


7. If a rectangular wooden beam is supported horizontally at its ends, then the maximum
weight that it can support at its midpoint is proportional (at least approximately) to its
width, and to the square of its thickness. That is, $W = kxy^2$, where $x$ is the width,
$y$ is the thickness, and $k$ is a constant depending on the wood (and on the units of
length and weight). Suppose that such a beam is to be cut from a cylindrical log of radius
$a$, in such a way as to maximize $W$. What should be the width and the thickness?

8. An open pan is to be made out of a square metal sheet, by cutting out the square pieces
from the corners of the sheet and folding up the sides of the metal that is left. (The
square pieces are to be thrown away.) If the sheet has edges of length $a$, what is the
volume of the pan of largest volume that can be made in this way?

9. An open pan, of the sort described in the preceding problem, has a total surface area
of 128 sq. in. What is the largest possible volume?

10. Find the closed circular cylinder with volume 10 cu. in. and surface area as small as
possible.

11. Solve the same problem, given that the cylinder is open at one end.

12. Solve the same problem, given that the cylinder is open at both ends. (It sits on a flat
table and holds flour.)

13.

A piece of sheet metal,

feet long and

feet wide, is to be bent so as to form a trough

n feet long, with open top, open ends, and triangular cross sections. What is the greatest
possible cross sectional area?

14. A trough is to be made with isosceles right triangles as endpieces and congruent rectangles
as sides, as shown in the figure. If the total surface area is to be 100 sq. in., what is
the maximum volume?

15. In a rectangular parallelepiped, with a square base, the total length of the edges is $k$.
What is the largest possible volume?

16.

A rectangle is to be inscribed in the region above the x-axis and below the graph of

y =

x2 Find the area of the rectangle of maximum area.

y =

17.

Same problem, for

18.

Find the rectangle of maximum area contained in the region above the line
to the right of the line x

19.

x4

1,

and under the graph of y

A rectangle is inscribed in the region R

{(x,

y) I ixl

= 1/x.

[y\ 1},

y = t,

in such a way as to

maximize the area. Find the area of the rectangle.


Find the values of x at which the following functions take on their maximum values,

and j ustify your answers. You need not find the maximum values of the functions.

20. f(x)
22.

h(x)

x
=

+ x2

J"'

Problems

-1

Sin-1 t dt

24 through

21.
23.

g(x)

</>(X)

=
=

x
--

1 + x4

finx

V1

+ t8 dt

27. Investigate the preceding four functions for minimum values.

28. An isosceles triangle has base $d$ and altitude $h$. Find the area of the rectangle of largest
area that can be inscribed in it.

29. Given a triangle with angles of 30°, 60°, and 90°, there are three plausible ways of
inscribing in it a rectangle of maximum area; the rectangle may have a side lying along
any one of the three sides of the triangle. Show that all three of these "maximal"
rectangles are really maximal; that is, show that they all have the same area.

30. Show that there are some triangles for which the conclusion in Problem 29 does not hold.

31. Show, however, that the conclusion of Problem 29 holds for a class of triangles which
includes more than the 30-60-90 triangles.

32. Consider the curve which is the graph of the equation $x^2 + 4y^2 = 4$. Find the area
of the rectangle of largest area that can be inscribed in this curve.

33. A right circular cone has a base of diameter $d$, and altitude $h$. Find the volume of the
largest right circular cylinder that can be inscribed in it.

34. Find the area of the isosceles triangle of maximum area that can be inscribed in a circle
of radius $r$.

35. Find the volume of the right circular cylinder of maximum volume that can be inscribed
in a sphere of radius $r$.

36. Suppose that in Problem 34 the word "isosceles" is omitted. Is the solution of the
resulting problem the same as before?

37. Similarly, discuss the problem obtained by omitting the word "right" in Problem 35.

38. Find the length of the longest ladder that can be carried (in a horizontal position)
around the corner shown on the left below. The segment from $P$ to $Q$ shows a possible
position of the ladder.

39. In the right-hand figure above, the circle (of radius $r$) is inscribed in the right angle
$\angle BAC$. What is the minimum possible area of $\triangle ADE$?

**40. Suppose that in Problem 39 we do not require that $\angle BAC$ be a right angle. Given
that $\angle BAC$ has measure $\alpha$, find the minimum possible area of $\triangle ADE$, in terms of
$\alpha$ and $r$. (This is much harder than Problem 39.)

5.5 THE USE OF FUNCTIONAL EQUATIONS AS SHORTCUTS

In the preceding section, we found that under some conditions we could locate maximum
and minimum values merely by finding a point where the derivative vanishes.
We shall now see that in some cases we can locate maximum and minimum values
without calculating the function. Consider first a simple problem, from Section 5.4.

Problem 1. A segment of length 1 has its endpoints on the sides of a right angle.
What position for the segment gives maximum area for the resulting triangle?

[Figure: the segment with its endpoints on the positive ends of the axes.]

As in Section 5.4, we set up the axes as shown. Let $x$ be the x-coordinate of the
lower endpoint of the segment; and for each $x$ from 0 to 1, let $f(x)$ be the y-coordinate
of the other endpoint.

Note that we are entitled to use functional notation: $f(x)$
really is determined when $x$ is named. And for each $x$, we have

$$x^2 + [f(x)]^2 = 1^2,$$

because $x^2 + [f(x)]^2$ is the square of the length of the segment. Therefore the function
$f$ satisfies the equation

$$x^2 + f^2 = 1 \qquad (0 \le x \le 1). \qquad (1)$$

The area of the triangle is

$$A(x) = \tfrac{1}{2} x f(x). \qquad (2)$$

Now in (1), the left-hand member is a function, whose derivative is $2x + 2ff'$.
But this function is known to be a constant, equal to 1 for every $x$ from 0 to 1.
Therefore

$$x + ff' = 0 \qquad (0 < x < 1). \qquad (1')$$

Here, of course, we are assuming that $f$ has a derivative, for $0 < x < 1$, but this
must be true, because the graph of $f$ is a quadrant of a circle. Obviously

$$A'(x) = \tfrac{1}{2} x f'(x) + \tfrac{1}{2} f(x). \qquad (2')$$

The maximum of $A(x)$ must be an ILMax; and so, at the maximum of $A(x)$, we have
$xf' + f = 0$. We now know:

$$f' = -\frac{x}{f} \quad\text{on } (0, 1), \qquad (1'')$$

$$f' = -\frac{f}{x} \quad\text{at the maximum.} \qquad (2'')$$

Therefore, at the maximum, both of these equations hold, and

$$-\frac{x}{f} = -\frac{f}{x} \quad\text{and}\quad x = f(x).$$

That is, the maximum is achieved when the triangle is isosceles.
This discussion has been long, because ideas needed to be explained; but once the
ideas are understood, the calculations are simple:

$$x^2 + f^2 = 1, \qquad 2x + 2ff' = 0, \qquad f' = -\frac{x}{f};$$

$$A(x) = \tfrac{1}{2} x f(x), \qquad A'(x) = \tfrac{1}{2} x f'(x) + \tfrac{1}{2} f(x);$$

and hence

$$A' = 0 \;\Leftrightarrow\; f' = -\frac{f}{x}.$$

Therefore, at the maximum,

$$-\frac{x}{f} = -\frac{f}{x} \quad\text{and}\quad x = f(x).$$

In this case, of course, it was not much trouble to find a formula for $f$ and use it.
But in many cases, equations like

$$x^2 + f^2 = 1$$

are more convenient than formulas for the function $f$. These are called functional
equations. Obviously every trigonometric identity is a functional equation. Usually,
however, we use the word identity when the function is known, and the term functional
equation when the equation itself is being used as a working definition of the
function.

Consider another example, Problem 3 in Section 5.4.

Problem 3. Find the right circular cylinder of largest volume, inscribed in a
sphere of radius $a$.

[Figure: a vertical cross section of the sphere of radius $a$ and the inscribed cylinder.]

As before, we show a vertical cross section of the figure. Let $x$ be the radius of the
inscribed cylinder, and let $f(x)$ be half the altitude. Then

$$x^2 + f^2 = a^2, \qquad 2x + 2ff' = 0,$$

and

$$f' = -\frac{x}{f} \qquad (0 < x < a). \qquad (3)$$

Now the volume is

$$V(x) = \pi x^2 \cdot 2f(x),$$

so that

$$V' = 2\pi(x^2 f' + 2xf).$$

At the maximum, $V' = 0$, and so

$$f' = -\frac{2xf}{x^2} = -\frac{2f}{x} \qquad \text{(at Max).} \qquad (4)$$

Therefore, at the maximum, both our formulas for $f'$ must hold, and so

$$-\frac{x}{f} = -\frac{2f}{x} \quad\text{and}\quad f = \frac{\sqrt{2}}{2}\, x. \qquad (5)$$

For $a = 1$, this tells us that

$$x = \sqrt{\tfrac{2}{3}},$$

as before.
Note, however, that in a way the most natural answer to a problem like this is a
shape, rather than a size. And the solution based on the functional equation ordinarily
gives the answer in the form of a shape, that is, in the form of a ratio between two
measurements. For example, in the preceding problem the constant $a$, which determines
the size of the whole configuration, disappeared immediately when we differentiated
in the equation $x^2 + f^2 = a^2$. Our final equation (5) means that at the
maximum,

$$2f(x) = \sqrt{2}\, x,$$

that is, the altitude of the maximum cylinder is equal to $\sqrt{2}$ times the radius of its base.
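The claim that the answer is a shape, independent of $a$, can be tested numerically. The sketch below is an illustration under stated assumptions: the helper `best_shape` and the three sample radii are invented for this check. It locates the maximum of $V(x) = 2\pi x^2 \sqrt{a^2 - x^2}$ on a grid and reports the ratio of altitude to radius.

```python
import math

def best_shape(a, n=100_000):
    """Grid-search the radius x in (0, a) that maximizes V = 2*pi*x^2*f,
    where f = sqrt(a^2 - x^2), and return altitude/radius = 2f/x there."""
    best_x, best_v = None, -1.0
    for i in range(1, n):
        x = a * i / n
        f = math.sqrt(a * a - x * x)
        v = 2.0 * math.pi * x * x * f
        if v > best_v:
            best_x, best_v = x, v
    f = math.sqrt(a * a - best_x * best_x)
    return 2.0 * f / best_x

ratios = [best_shape(a) for a in (0.5, 1.0, 7.0)]
print(ratios, math.sqrt(2.0))
```

Changing the sphere changes the size of the best cylinder but not the ratio, which stays close to $\sqrt{2}$ in each case.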

The answer is also a shape when the problem is to find the rectangle of maximum
area in a given circle:

$$x^2 + f^2 = a^2, \qquad 2x + 2ff' = 0, \qquad f' = -\frac{x}{f} \qquad (0 < x < a);$$

$$A(x) = (2x) \cdot 2f(x) = 4xf, \qquad A'(x) = 4(xf' + f),$$

$$A'(x) = 0 \;\Leftrightarrow\; f' = -\frac{f}{x}.$$

Therefore at the maximum,

$$-\frac{x}{f} = -\frac{f}{x} \quad\text{and}\quad x = f,$$

because $x$ and $f$ are both positive. This is a qualitative answer, as it should be: it
says that the maximum rectangle is a square. The constant $a$ has disappeared,
because the shape of the maximum rectangle is the same for all circles.
In the following problem set, you will find more cases in which maxima and
minima can most conveniently be found by using functional equations.

Meanwhile let us look carefully at what happens when we take the derivative on each side of a
functional equation. The ideas here are illustrated by a simple case. When we write

$$x^2 + f^2 = a^2 \qquad (6)$$
$$\Rightarrow \quad 2 \cdot x + 2 \cdot ff' = 0, \qquad (7)$$

we are claiming that every differentiable function which satisfies Eq. (6) also satisfies
Eq. (7). It often happens that there is more than one such function $f$. For example,
consider

$$f_1(x) = \sqrt{a^2 - x^2}, \qquad f_2(x) = -\sqrt{a^2 - x^2}.$$

Here

$$f_1'(x) = \frac{-x}{\sqrt{a^2 - x^2}} = \frac{-x}{f_1(x)}$$

and

$$f_2'(x) = \frac{x}{\sqrt{a^2 - x^2}} = \frac{-x}{-\sqrt{a^2 - x^2}} = \frac{-x}{f_2(x)}.$$

Therefore

$$f_1(x) f_1'(x) = -x, \qquad f_2(x) f_2'(x) = -x.$$

Therefore

$$2 \cdot x + 2 \cdot f_1 f_1' = 0, \qquad 2 \cdot x + 2 \cdot f_2 f_2' = 0. \qquad (8)$$

That is, both $f_1$ and $f_2$ satisfy (7). A figure makes it obvious what is going on here.

[Figure: the circle $x^2 + y^2 = a^2$, with labeled points on the graphs of $f_1$ and $f_2$.]

At each of the labeled points, we have

$$f_i'(x) = \frac{-1}{f_i(x)/x} = \frac{-x}{f_i(x)},$$

because the tangent is perpendicular to the radius.


The same sort of thing goes on in more complicated cases. The graph of

$$y = x^3 - x \qquad (9)$$

looks like the left-hand figure below. Therefore the graph of

$$x = y^3 - y \qquad (10)$$

looks like the right-hand drawing below. We have interchanged $x$ and $y$ in Eq. (9),
and reflected the graph across the line $y = x$.

[Figure: on the left, the graph of $y = x^3 - x$; on the right, the curve $C$, the graph of
$x = y^3 - y$, shown as the union of the graphs of $f_1$, $f_2$, $f_3$.]

This gives the curve $C$ which is the graph of (10). $C$ is not a function-graph.
But $C$ is the union of the graphs of three functions $f_1$, $f_2$, $f_3$, as indicated in the figure.
And each of the functions $f_1$, $f_2$, and $f_3$ satisfies the functional equation

$$x = f^3 - f.$$

Therefore each of these functions satisfies the differential equation

$$1 = 3f^2 f' - f'.$$

This is what we are claiming when we differentiate the functional equation, and write

$$x = f^3 - f \quad\Rightarrow\quad 1 = 3 \cdot f^2 f' - f'.$$
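The claim that every branch satisfies the same differential equation can be checked numerically. In the sketch below, each branch of $x = y^3 - y$ is computed by bisection on an interval where $y^3 - y$ is monotone, its derivative is estimated by a central difference, and the residual $3f^2 f' - f' - 1$ is evaluated. The branch labels, the brackets $[-2, 2]$, the sample point $x = 0.1$, and the step `h` are all assumptions of this illustration; the labels need not match those in the book's figure.

```python
import math

C = 1.0 / math.sqrt(3.0)      # y^3 - y has its turning points at y = +C and y = -C

# (low end, high end, is y^3 - y increasing there?) for three monotone pieces
BRANCHES = {"f1": (C, 2.0, True), "f2": (-C, C, False), "f3": (-2.0, -C, True)}

def solve_branch(x, lo, hi, increasing):
    """Bisection for y in [lo, hi] with y**3 - y = x on a monotone piece."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        too_small = (mid ** 3 - mid < x)
        if too_small == increasing:
            lo = mid          # the root lies to the right of mid
        else:
            hi = mid          # the root lies to the left of mid
    return 0.5 * (lo + hi)

def branch(name, x):
    lo, hi, inc = BRANCHES[name]
    return solve_branch(x, lo, hi, inc)

x, h = 0.1, 1e-6
residuals = {}
for name in BRANCHES:
    y = branch(name, x)
    dy = (branch(name, x + h) - branch(name, x - h)) / (2.0 * h)   # estimate of f'(x)
    residuals[name] = 3.0 * y * y * dy - dy - 1.0                   # should be near 0
    print(name, "f(x) =", y, " f'(x) ~", dy, " residual =", residuals[name])
```

The three branches take different values at the same $x$, and their derivatives differ as well, yet each residual is close to zero.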

PROBLEM SET 5.5

In Problems 1 through 10 below, the notation 5.4.n refers to Problem n of Problem Set
5.4. In each of these cases, the indicated ratio is to be found by the method based on
functional equations.
1. In 5.4.1, find altitude/base, at the maximum.
2. In 5.4.2, same.
3. In 5.4.5, same, using the side parallel to the river as base.
4. In 5.4.7, find $y/x$, at the maximum.
5. In 5.4.14, let $l$ be the length of the rectangular side and let $w$ be the width. Find $w/l$
at the maximum.
6. In 5.4.15, let $h$ be the altitude and let $e$ be the length of each edge of the base. Find
$h/e$, at the maximum.
7. In 5.4.16, find altitude/base, at the maximum.
8. In 5.4.28, same.
9. In 5.4.33, let $a$ be the altitude of the cylinder, and let $r$ be the radius of the base. Find
$a/r$, at the maximum.
10. In 5.4.34, let $h$ be the altitude, and let $a$ be half the length of the base. Find $h/a$, at the
maximum.
11. In 5.4.35, let $h$ be the altitude and let $a$ be the radius of the base. Find $h/a$, at the
maximum.
12. We know of a function f, with domain [ -1, 1], which is a solution of the functional
equation sin f(x) = x. (Our "known function," of course, is f(x) = Sin-1 x.) What
other continuous solutions of the equation have the entire interval [ -1, 1] as domain?
Draw a figure.
13. Write a differential equation which is satisfied by all solutions of the functional equation
$x^4 + [f(x)]^4 = 1$.

14. a) Let $n = 10^{10^{10}}$. Sketch the graph of $x^n + y^n = 1$. [Hint: A commonly used drawing
instrument will give you an excellent sketch.]
b) Let $n = 10^{10^{10}} + 1$. Sketch the graph of $x^n + y^n = 1$. [Same hint.]

15. Find the functions $f$ which satisfy the differential equation

$$x + ff' = 0.$$

(You need not show that the solutions that you describe are the only ones.)

16. Given that f and/' are continuous, let

F(x) =
Calculate

F(x),

f'f

(t)/1(t) dt.

in terms off

*17. Now show that your list of solutions, in Problem 15, is complete.
18. Let $f$ be the function whose graph is the union of (a) the lower left-hand quadrant of
the circle with center at $(0, 1)$ and radius 1 and (b) the upper right-hand quadrant of the
circle with center at $(0, -1)$ and radius 1. Show that $f$ is a solution of the differential
equation

$$[f'(x)]^2 = [x + f(x)f'(x)]^2,$$

except, of course, at the endpoints $x = \pm 1$, where the tangent lines are vertical and the
function has no derivative. As a start, observe that at $x = 0$, the tangent to the graph
is horizontal and the equation is satisfied: $0^2 = [0 + 0 \cdot 0]^2$.

*19. Consider the family of quadratic functions represented by the formula

$$f(x) = (x - a)^2. \qquad (1)$$

Differentiating, we get

$$f'(x) = 2(x - a),$$

and squaring, we get

$$[f'(x)]^2 = 4(x - a)^2,$$

and

$$[f'(x)]^2 = 4f(x). \qquad (2)$$

Evidently (1) $\Rightarrow$ (2). But the converse is false.

a) Show that one of the solutions of (2) is a linear function $f$.

b) Show that (2) has some solutions which are neither quadratic nor linear; that is,
the differential equation has solutions whose total graphs are neither lines nor
parabolas.

5.6 THE COMPLETENESS OF R AND THE EXISTENCE OF MAXIMA

In Section 5.4 and later, we have used the fact that, if $f$ is continuous on $[a, b]$,
then $f$ has a maximum value on $[a, b]$. In Section 5.4 this theorem was used as a
shortcut in finding maximum values, but this is only one of the uses of the theorem.
In fact, the theorem is part of the foundation of the calculus, as we shall see.
In proving it, we shall need to use, for the first time, the fact that the number line
has no holes in it. As a guide in giving an exact description of this property of the
number system, let us consider what happens when you remove a point from the
number line, thus getting a system which really does have a hole in it.
Let $A$ be the set of all negative numbers, and let $B$ be the set of all positive
numbers. We mean strictly positive and strictly negative, so that 0 belongs neither
to $A$ nor to $B$. Then

1) $B$ has no least element.

[Figure: the positive half-line $R^+$, with $x/2$ to the left of $x$.]

The reason is that if $x$ is a positive number, then so is $x/2$, and $x/2 < x$. Therefore
no positive number $x$ is less than all other positive numbers, and so (1) holds.
Similarly,

2) $A$ has no greatest element.

[Figure: the negative half-line $R^-$, with $x/2$ to the right of $x$.]

For if $x < 0$, then $x/2 < 0$, and $x/2 > x$.
Now let

$$K = A \cup B = \{x \mid x \ne 0\}.$$

Then obviously:

3) $K$ is the union of two nonempty sets $A$ and $B$, such that (a) every number in $A$
is less than every number in $B$, but (b) $A$ has no greatest element, and (c) $B$ has no
least element.

Evidently this situation could not have arisen if we had not excluded 0: if we
put 0 in $A$, then 0 would be the greatest element of $A$; and if we put 0 in $B$, then 0
would be the least element of $B$. Thus the situation described in (3) can arise only in a
number system with a hole in it, and so the following statement conveys the idea that
there are no holes in R:
The Dedekind Cut Postulate (DCP). Suppose R is expressed as the union of two nonempty sets A and B, such that every element of A is less than every element of B. Then either A has a greatest element or B has a least element.

[Figure: the number line, with the point $x_0$ and the set B marked.]

In the figure, $x_0$ must belong either to A or to B. Therefore $x_0$ is either the greatest element of A or the least element of B.


We have stated DCP as our first description of the completeness of R, because it is the best known description, and in some ways the most natural. But for some purposes, the following idea is easier to use. Suppose we are given a sequence

    $[a_1, b_1], [a_2, b_2], \ldots$

of closed intervals. If every interval in the sequence contains the next, then we say that the sequence is nested. Algebraically, this means that

    $a_i \le a_{i+1} \le b_{i+1} \le b_i$

for every i. For example, if

    $[a_i, b_i] = \left[-\frac{1}{i}, \frac{1}{i}\right]$

for every i, then the sequence is nested. This sequence "closes down on 0." That is, 0 lies in each of the intervals in the sequence, and 0 is the only number that lies in all of them.
A more important example is as follows. Given a circle of radius 1, let $p_n$ be the perimeter of an inscribed regular (n + 2)-gon, and let $q_n$ be the perimeter of a circumscribed regular (n + 2)-gon. Evidently

    $p_1 < p_2 < p_3 < \cdots$,    $q_1 > q_2 > q_3 > \cdots$,

and

    $p_i < q_i$

for each i. Thus we have a nested sequence

    $[p_1, q_1], [p_2, q_2], \ldots$

of closed intervals. And this sequence "closes down on $2\pi$." That is, $2\pi$ lies in all of the intervals in the sequence, and no other number lies in all of them.
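The closing down of the intervals $[p_n, q_n]$ on $2\pi$ can be watched numerically. The following is only a sketch: it assumes the standard trigonometric formulas $2m \sin(\pi/m)$ and $2m \tan(\pi/m)$ for the perimeters of the inscribed and circumscribed regular m-gons of a circle of radius 1, which are not derived in the text, and the function names are ours.

```python
import math

def inscribed_perimeter(n):
    """Perimeter p_n of the inscribed regular (n + 2)-gon, radius 1."""
    m = n + 2
    return 2 * m * math.sin(math.pi / m)

def circumscribed_perimeter(n):
    """Perimeter q_n of the circumscribed regular (n + 2)-gon, radius 1."""
    m = n + 2
    return 2 * m * math.tan(math.pi / m)

def is_nested(intervals):
    """True if every interval in the list contains the next one."""
    return all(a1 <= a2 <= b2 <= b1
               for (a1, b1), (a2, b2) in zip(intervals, intervals[1:]))

# The first 199 intervals [p_n, q_n].
intervals = [(inscribed_perimeter(n), circumscribed_perimeter(n))
             for n in range(1, 200)]

for n in (1, 10, 100):
    p, q = intervals[n - 1]
    print(n, p, q, q - p)
```

The printed lengths $q_n - p_n$ shrink as n grows, while each interval still contains $2\pi$.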
The following postulate says that every nested sequence of intervals closes down
on at least one point.

The Nested Interval Postulate (NIP). For every nested sequence of closed intervals
there is a number x which lies in every interval in the sequence.
This conveys the idea that the number system is complete. Suppose, for example, that $2\pi$ were missing, so that the number system had a hole in it where $2\pi$ ought to be. Then no number at all would lie in all of the intervals

    $[p_1, q_1], [p_2, q_2], \ldots$

that we have just discussed. Similarly, if $\sqrt{2}$ were missing, then there would be a nested sequence of closed intervals closing down on no number whatever. (We could use

    $\left[\sqrt{2} - \frac{1}{i}, \sqrt{2} + \frac{1}{i}\right]$

as the ith interval in the sequence.)

Using the nested interval postulate (NIP), we shall prove the following theorem:

Theorem 1. If f is continuous on [a, b], then f has an upper bound on [a, b]. That is, there is a number M such that $f(x) \le M$ for each x of [a, b].

Lemma. If f is unbounded above on an interval [c, d], then f is unbounded above on at least one of the halves of [c, d].

By the halves of [c, d] we mean the intervals

    $[c, (c + d)/2]$ and $[(c + d)/2, d]$.

The proof of the lemma is immediate: if f has an upper bound $M_1$ on $[c, (c + d)/2]$ and has an upper bound $M_2$ on $[(c + d)/2, d]$, then f has an upper bound on [c, d]. We merely use the larger of the bounds $M_1$ and $M_2$.

We proceed to prove the theorem. For short, we say that an interval is good if f is bounded above on the interval; and we say that an interval is bad if it is not good. Thus we need to prove that [a, b] is good. We start by supposing that [a, b] is bad, and we shall show that this assumption leads to a contradiction.

If [a, b] is bad, then it follows that at least one of the halves of [a, b] must be bad. Let $[a_1, b_1]$ be a bad half of [a, b]. For the same reason, $[a_1, b_1]$ must have a bad half. Let $[a_2, b_2]$ be a bad half of $[a_1, b_1]$. Continuing this process to infinity, we get a sequence

    $[a_1, b_1], [a_2, b_2], \ldots$

of closed intervals, all of which are bad, and each of which is a half of the preceding one. Therefore the sequence is nested, and

    $b_i - a_i = \frac{1}{2^i}(b - a)$

for each i. By NIP, there is an $\bar{x}$ such that

    $a_i \le \bar{x} \le b_i$ for each i.

But f is continuous at $\bar{x}$. Thus, for every $\epsilon > 0$, f has an $\epsilon\delta$-box at the point $(\bar{x}, f(\bar{x}))$. Thus

    $|x - \bar{x}| < \delta \;\Rightarrow\; f(\bar{x}) - \epsilon < f(x) < f(\bar{x}) + \epsilon$,

and so $M = f(\bar{x}) + \epsilon$ is an upper bound for f on the interval $(\bar{x} - \delta, \bar{x} + \delta)$. But since

    $\lim_{i \to \infty} (b_i - a_i) = 0$,

we have

    $b_i - a_i < \delta$

for some i. For such an i, the closed interval $[a_i, b_i]$ lies inside the open interval $(\bar{x} - \delta, \bar{x} + \delta)$. That is,

    $\bar{x} - \delta < a_i \le \bar{x} \le b_i < \bar{x} + \delta$,

as shown in the figure.

[Figure: the closed interval $[a_i, b_i]$ lying inside the open interval $(\bar{x} - \delta, \bar{x} + \delta)$.]

(This is easy to see geometrically, because $[a_i, b_i]$ contains the midpoint $\bar{x}$ of the open interval, and is less than half as long.)

But this situation is impossible, because f is bounded above on $(\bar{x} - \delta, \bar{x} + \delta)$ and is not bounded above on the smaller interval $[a_i, b_i]$. This contradiction completes the proof of the theorem.

One of the ideas that we have just used is going to be useful later. We therefore record it as a theorem:

Theorem 2. Suppose that

    $a_i \le \bar{x} \le b_i$

for each i, and

    $\lim_{i \to \infty} (b_i - a_i) = 0$.

Then every interval $(\bar{x} - \delta, \bar{x} + \delta)$ contains some interval $[a_i, b_i]$.

This was proved in the preceding discussion. If $(\bar{x} - \delta, \bar{x} + \delta)$ contains $[a_i, b_i]$, and the sequence is nested, then of course it follows that $(\bar{x} - \delta, \bar{x} + \delta)$ contains all of the later intervals in the sequence.

Given that a function f is bounded, it does not follow that f has a maximum or a minimum. Consider, for example,

    $f = \mathrm{Tan}^{-1}$.

[Figure: graph of $y = \mathrm{Tan}^{-1} x$.]

When x is far to the right, $\mathrm{Tan}^{-1} x$ is close to $\pi/2$, but $\mathrm{Tan}^{-1} x$ is never actually equal to $\pi/2$ for any x. Similarly, when x is far to the left, $\mathrm{Tan}^{-1} x$ is close to $-\pi/2$, but $-\pi/2$ is not one of the values of the function. On the other hand, it is easy to see that the numbers $\pi/2$ and $-\pi/2$ are related to the function $\mathrm{Tan}^{-1}$ in a special way: $\pi/2$ is an upper bound of the function; and of all upper bounds of the function, $\pi/2$ is the smallest. We express this by writing

    $\frac{\pi}{2} = \sup \mathrm{Tan}^{-1}$.

Here sup is pronounced supremum. To be exact:

Definition. If k is an upper bound of a function f, and k is smaller than every other upper bound of f, then k is called the supremum of f, and we write

    $k = \sup f$.

More generally, we define the supremum for any set of numbers:

Definition. Let B be a set of numbers. If $x \le k$, for every x in B, then k is an upper bound of B. If k is smaller than every other upper bound of B, then k is called the supremum of B, and we write

    $k = \sup B$.

Consider, for example, the case where B is an open interval (a, b). Every number $k \ge b$ is an upper bound of B. Thus the upper bounds of B form an interval $[b, \infty)$. Here b is an upper bound of B, and b is smaller than all other upper bounds of B. Therefore

    $b = \sup B$.

Consider now

    $B = \left\{\frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \ldots\right\}$.

Here the upper bounds of B are the points of the interval $[1, \infty)$, and sup B = 1.

In each of these cases, starting with a nonempty set B which is bounded above, we have found that the upper bounds form an interval of the type $[k, \infty)$, and k = sup B. The following postulate says that this is what always happens:

The Least Upper Bound Postulate (LUBP). Let B be a nonempty set of numbers.
If B has an upper bound, then B has a supremum.
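For the set B = {1/2, 2/3, 3/4, ...} considered above, the claim sup B = 1 has two parts: 1 is an upper bound, and no number k < 1 is an upper bound. The second part can be made concrete by producing, for each k < 1, an element of B which exceeds k. The sketch below does this with exact rational arithmetic; the function names are ours, chosen for illustration.

```python
from fractions import Fraction

def element(n):
    """The n-th element n/(n + 1) of B = {1/2, 2/3, 3/4, ...}."""
    return Fraction(n, n + 1)

def element_above(k):
    """Return an index n with element(n) > k, for any number k < 1.

    For k < 1, the inequality n/(n + 1) > k is equivalent to n > k/(1 - k).
    """
    k = Fraction(k)
    if k >= 1:
        raise ValueError("every element of B is less than 1 <= k")
    return max(1, int(k / (1 - k)) + 1)

for k in (Fraction(1, 2), Fraction(99, 100), Fraction(999_999, 1_000_000)):
    n = element_above(k)
    print(f"k = {k}: element {n} is {element(n)}, which exceeds k")
```

Since every element n/(n + 1) is less than 1, and every k < 1 is exceeded by some element, 1 is the least upper bound.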
Using the least upper bound postulate, we shall show that no continuous function can behave like $\mathrm{Tan}^{-1}$ if its domain is a closed interval:

Theorem 3 (Existence of maxima). If f is continuous on [a, b], then f has a maximum value on [a, b]. That is, there is an $\bar{x}$ in [a, b] such that, for $M = f(\bar{x})$, $f(x) \le M$ for every x in [a, b].

Proof. We know by Theorem 1 that f is bounded above. Let

    $k = \sup f$

on [a, b]. Then $f(x) \le k$ for every x. We need to show that f(x) = k for some x. Suppose not, and let

    $g(x) = \frac{1}{k - f(x)} \quad (a \le x \le b)$.

Then g is continuous. But g is unbounded. For suppose that

    $g(x) \le M$ for $a \le x \le b$.

Since k - f(x) > 0 for every x, g is positive, and so M > 0. Then

    $\frac{1}{k - f(x)} \le M, \qquad \frac{1}{M} \le k - f(x)$,

and

    $f(x) \le k - \frac{1}{M}$ for $a \le x \le b$.

This is impossible, because k is the least of the upper bounds of f.

Thus, if f has no maximum, there is a continuous function g which is unbounded on [a, b]. This contradicts Theorem 1, and so completes the proof of Theorem 3.

We have already observed, in Section 5.4, that the existence of maxima implies the existence of minima. Therefore

Theorem 4 (Existence of minima). If f is continuous on [a, b], then f has a minimum value on [a, b].

(This was Theorem

of Section 5.4.)

PROBLEM SET 5.6


1. Let B be the set of all rational numbers p/q for which $p^2/q^2 < 2$. What is sup B?

2. Consider a circle of radius 1. For each polygon P inscribed in the circle, let k(P) be the perimeter of P. Let B be the set of all numbers k(P). What is sup B?

3. Consider the graph of $f(x) = \sin x$, $0 \le x \le \pi$. Suppose that we cut up the interval $[0, \pi]$ into little intervals, in any way, using subdivision points $0 = x_1 < x_2 < \cdots < x_i < x_{i+1} < \cdots < x_n = \pi$. Over each little interval $[x_i, x_{i+1}]$ we set up the tallest possible inscribed rectangle with $[x_i, x_{i+1}]$ as base. Let s be the sum of the areas of the rectangles. Let B be the set of all numbers s which are obtainable in this way. What is sup B? (A numerical answer is called for here.)

4. Let B be any set of numbers. If $b \in B$, and b is larger than every other element of B, then b is called the greatest element of B, and we write b = Max B. Question: If B has an upper bound, does it follow that B has a Max?

5. Suppose that we had defined bounds and suprema in the following way:
"Let B be a set of numbers, and let k be a number. If x < k, for every x in B, then k is a strict upper bound of B. If k is a strict upper bound of B, and is smaller than every other strict upper bound of B, then k = sup B."

a) What is the difference between this "definition" and the usual definition of upper
bounds and suprema?
Under the new "definition" of "supremum," which if any of the following statements
are true?
b) Every finite set has a "supremum."
c) No finite set has a "supremum."
d) Every open interval has a "supremum."
e) No open interval has a "supremum."
f ) Every closed interval has a "supremum."
g) No closed interval has a "supremum."
6. If B is a set of numbers, then -B denotes the set obtained when we replace every element x of B by its negative -x. That is,

    $-B = \{-x \mid x \in B\}$.

For example, if B = [1, 2], then -B = [-2, -1]; if $B = [-1, \infty)$, then $-B = (-\infty, 1]$, and so on. Prove the following:

Theorem. If (a) k is an upper bound of B, then (b) -k is a lower bound of -B. And conversely, (b) implies (a).

(This is easy; don't try to make it hard.)
7. If k is a lower bound of the set B, and k is greater than every other lower bound of B, then k is called the infimum of B, and we write k = inf B. Show that if a set B is bounded below, then B has an infimum.

8. Let B be a set which is bounded below, and let K be the set of all lower bounds of B. Describe K in the interval notation.

*9. Let $[a_1, b_1], [a_2, b_2], \ldots$ be a nested sequence, and let $A = \{a_1, a_2, \ldots\}$, $B = \{b_1, b_2, \ldots\}$. Show that (a) every number $b_i$ is an upper bound of A. Let $\bar{x} = \sup A$. Then show that (b) $a_i \le \bar{x} \le b_i$ for every i. This result means that the least upper bound postulate (LUBP) implies the nested interval postulate (NIP).


*10. Let K be a (nonempty) set of numbers, bounded above. Let A be the set of all numbers a which are not upper bounds of K. That is, $a \in A$ if a < k for some k in K. Show that A cannot contain a greatest element.

*11. Show that the Dedekind cut postulate (DCP) implies the least upper bound postulate (LUBP).

The results of Problems 9 and 11 mean that

    $\mathrm{DCP} \;\Rightarrow\; \mathrm{LUBP} \;\Rightarrow\; \mathrm{NIP}$.

Thus our only really new assumption, in this section, is DCP.


5.7 THE MEAN-VALUE THEOREM AND THE NO-JUMP THEOREM

The mean-value theorem was stated in Chapter 3, and we have been using it ever since. We are now finally in a position to prove it. We need one preliminary result.

Rolle's Theorem. If f is continuous on the closed interval [a, b] and differentiable on the open interval (a, b), and f(a) = f(b) = 0, then $f'(\bar{x}) = 0$ for some $\bar{x}$ between a and b.

Proof. There are three cases to consider:

1) Suppose that f(x) = 0 for every x on [a, b]. Then any number $\bar{x}$ between a and b gives $f'(\bar{x}) = 0$.

2) Suppose that f(x) > 0 for some x on [a, b]. Now f has a maximum at some $\bar{x}$, and $\bar{x}$ is not a or b. Therefore f has an ILMax at $\bar{x}$. By Theorem 3 of Section 5.2 it follows that $f'(\bar{x}) = 0$.

3) If f(x) < 0 for some x, then the minimum of f is an ILMin. By Theorem 4 of Section 5.2 we know that at an ILMin the derivative vanishes.

The proof of MVT is now easy.

Given that f is continuous on [a, b] and differentiable on (a, b), let g be the linear function which agrees with f at a and at b. Thus

    $g(a) = f(a), \qquad g(b) = f(b)$.

We could write a formula for g, in the form g(x) = mx + k, if we needed to, but we don't need to. Since the derivative of a linear function is simply the slope of the line which is its graph, we know that

    $g'(x) = \frac{f(b) - f(a)}{b - a}$

for every x. For each x of [a, b], let

    $\phi(x) = f(x) - g(x)$.

Then $\phi$ is continuous on [a, b] (because f and g are), and $\phi$ is differentiable on (a, b), with

    $\phi'(x) = f'(x) - g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$.

Since $\phi(a) = \phi(b) = 0$, we can apply Rolle's theorem. Therefore $\phi'(\bar{x}) = 0$ for some $\bar{x}$. Thus

    $f'(\bar{x}) - \frac{f(b) - f(a)}{b - a} = 0$,

and

    $f'(\bar{x}) = \frac{f(b) - f(a)}{b - a}$

for some $\bar{x}$, which was to be proved.
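The construction in this proof can be carried out numerically for a particular function. The sketch below takes $f(x) = x^3$ on [0, 2] as an illustrative choice of ours, builds the linear function g and the difference $\phi = f - g$, and then looks for the interior point where $|\phi|$ is largest; as in the proof of Rolle's theorem, that is where $\phi'$ should vanish. The grid search is only a stand-in for the existence argument, and the function names are hypothetical.

```python
def secant_function(f, a, b):
    """Return (g, slope), where g is the linear function agreeing with f at a and b."""
    slope = (f(b) - f(a)) / (b - a)

    def g(x):
        return f(a) + slope * (x - a)

    return g, slope

def interior_extremum(phi, a, b, n=20_000):
    """Grid point strictly inside (a, b) at which |phi| is largest."""
    grid = (a + (b - a) * i / n for i in range(1, n))
    return max(grid, key=lambda x: abs(phi(x)))

f = lambda x: x**3          # illustrative choice of f
f_prime = lambda x: 3 * x**2  # its derivative, written by hand
a, b = 0.0, 2.0

g, slope = secant_function(f, a, b)
phi = lambda x: f(x) - g(x)

x_bar = interior_extremum(phi, a, b)
print("phi(a), phi(b):", phi(a), phi(b))
print("x_bar:", x_bar)
print("f'(x_bar) and slope:", f_prime(x_bar), slope)
```

Up to the resolution of the grid, the derivative of f at the point found agrees with the slope (f(b) - f(a))/(b - a).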


The no-jump theorem is harder. To prove it, we need to go back to first principles, and we need some preliminary results.

Lemma 1. Let f be a continuous function, on an open interval containing $x_0$. If $f(x_0) > 0$, then there is a $\delta > 0$ such that

    $x_0 - \delta < x < x_0 + \delta \;\Rightarrow\; f(x) > 0$.

Proof. Since f is continuous, we know that

    $\lim_{x \to x_0} f(x) = f(x_0)$.

In the definition of a limit, we take $\epsilon = f(x_0) > 0$. There is a $\delta > 0$ such that

    $x_0 - \delta < x < x_0 + \delta \;\Rightarrow\; f(x_0) - \epsilon < f(x) < f(x_0) + \epsilon \;\Rightarrow\; 0 < f(x)$,

because $f(x_0) - \epsilon = 0$. Therefore the $\delta$ that we have is the $\delta$ that we wanted.

Lemma 2. Let f be a continuous function, on an interval containing $x_0$. If $f(x_0) < 0$, then there is a $\delta > 0$ such that

    $x_0 - \delta < x < x_0 + \delta \;\Rightarrow\; f(x) < 0$.

Proof? (The proof of Lemma 1 can be adapted, to give a proof of Lemma 2. But it is quicker to derive Lemma 2 from the statement of Lemma 1.)

A function f changes sign, on an interval I, if f(x) > 0 for some x in I and f(x') < 0 for some x' in I.

Lemma 3. If f is continuous, on an interval containing $x_0$, and $f(x_0) \ne 0$, then there is a $\delta > 0$ such that f does not change sign on the interval $(x_0 - \delta, x_0 + \delta)$.

Proof. For $f(x_0) > 0$, this follows from Lemma 1. For $f(x_0) < 0$, it follows from Lemma 2.
We are now ready to prove the following convenient special case of the no-jump theorem.

Theorem 1. If f is continuous on [a, b], and f changes sign on [a, b], then

    $f(x_0) = 0$

for some $x_0$ in [a, b].

The proof is based on Lemma 3 and the nested interval postulate (NIP). We suppose that $f(x) \ne 0$ for every x in [a, b]. We shall show that this assumption leads to a contradiction.

Given that f changes sign on [a, b] and that f(x) is never 0, it follows that f changes sign on one of the halves of [a, b]. We recall, from Section 5.6, that the halves of [a, b] are [a, (a + b)/2] and [(a + b)/2, b]. Let $[a_1, b_1]$ be a half of [a, b], such that f changes sign on $[a_1, b_1]$. Similarly, let $[a_2, b_2]$ be a half of $[a_1, b_1]$, such that f changes sign on $[a_2, b_2]$. Proceeding to infinity in this way, we get a nested sequence

    $[a_1, b_1], [a_2, b_2], \ldots$

of closed intervals, such that f changes sign on each of them. Evidently

    $b_i - a_i = \frac{1}{2^i}(b - a)$,

and so

    $\lim_{i \to \infty} (b_i - a_i) = 0$,

as in the proof of Theorem 1 of Section 5.6. By NIP, there is an $x_0$ which lies on all of the intervals in the nested sequence. That is,

    $a_i \le x_0 \le b_i$ for every i.

By Lemma 3 there is a $\delta > 0$ such that f does not change sign on the interval $(x_0 - \delta, x_0 + \delta)$. By Theorem 2 of Section 5.6, there is an i for which $[a_i, b_i]$ lies in $(x_0 - \delta, x_0 + \delta)$, as indicated in the figure.

This is impossible, because f changes sign on $[a_i, b_i]$, but does not change sign on $(x_0 - \delta, x_0 + \delta)$. This contradiction completes the proof of Theorem 1.
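The halving process in this proof also suggests a way of locating a point where f vanishes: at each stage, keep a half on which f still changes sign. The following is a sketch of that idea in floating-point arithmetic, with a fixed number of halvings; it is not part of the proof, and the function name and the choice of 60 steps are ours.

```python
def bisect_sign_change(f, a, b, steps=60):
    """Approximate a zero of f in [a, b], given that f(a) and f(b) differ in sign.

    At each step the interval [a, b] is replaced by a half of it on which
    f still takes values of both signs at the endpoints.
    """
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if (fa > 0) == (fb > 0):
        raise ValueError("f does not change sign at the endpoints")
    for _ in range(steps):
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid
        if (fm > 0) == (fa > 0):
            a, fa = mid, fm   # the sign change is on the right half
        else:
            b, fb = mid, fm   # the sign change is on the left half
    return (a + b) / 2

root = bisect_sign_change(lambda x: x * x - 2, 1.0, 2.0)
print(root, root * root)
```

Here $f(x) = x^2 - 2$ changes sign on [1, 2], and the nested intervals close down on $\sqrt{2}$.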
It is now easy to prove the no-jump theorem.

Theorem 2 (The no-jump theorem). If f is continuous on $[x_1, x_2]$, then f takes on every value between $f(x_1)$ and $f(x_2)$.

Proof. Suppose first that

    $f(x_1) < k < f(x_2)$,

and let

    $g(x) = f(x) - k$.

Then g changes sign on $[x_1, x_2]$. Therefore $g(x_0) = 0$ for some $x_0$ on $[x_1, x_2]$. This gives $f(x_0) - k = 0$, and $f(x_0) = k$.

If $f(x_2) < k < f(x_1)$, then the same function g still changes sign, and so the proof is exactly the same.

This completes our reexamination of the foundations of calculus. It now appears that the idea of a continuous function is adequately described by the $\epsilon\delta$-definition of a limit and that the completeness of the number system R, in the sense of "no holes," is adequately described by the Dedekind cut postulate (which implies the least upper bound postulate and the nested interval postulate).

The theorems in this section and the preceding one are not news; it was obvious at the outset that these theorems ought to be true. But the fact that these theorems can be proved, on the basis of a single simple assumption DCP, is significant. It means that mathematics hangs together in a special way.

Nobody expects that a doctor will write down a definition of the word man and then write a few assumptions about men, in such a way that all medical science can be derived by logical reasoning from the definition and from the assumptions. Medicine is an empirical science: it depends on observations of fact, not just at the outset but continually. Mathematics is different.

Moreover, in your study of mathematics you have already passed the point where the truth can be relied upon to be obvious and where obvious things can be relied on to be true. From now on, logic is going to be an important part of your mathematical equipment. This is partly due to recent developments. As late as 1800, calculus was illogical, and very few people cared. In the last century, however, mathematical ideas which require careful logical analysis have become more important, in pure research and also in applications.

5.8 THE DERIVATIVE OF ONE FUNCTION WITH RESPECT TO ANOTHER

Let f and g be differentiable functions. Take a point $x_0$, and form the differences

    $\Delta f = f(x_0 + \Delta x) - f(x_0)$,
    $\Delta g = g(x_0 + \Delta x) - g(x_0)$.

If $\Delta f/\Delta g$ approaches a limit, as $\Delta x \to 0$, then this limit is called the derivative of f with respect to g, and is denoted by df/dg. That is,

    $\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta g} = \frac{df}{dg}$,

by definition. In fact, the limit always exists, whenever $g'(x_0) \ne 0$.


Theorem 1.

    $\frac{df}{dg} = \frac{f'}{g'}$,

wherever $g'(x) \ne 0$.

Proof.

    $\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta g} = \lim_{\Delta x \to 0} \frac{\Delta f/\Delta x}{\Delta g/\Delta x} = \frac{f'(x_0)}{g'(x_0)}$.

For the case in which g(x) = x for every x, the derivative of f with respect to g reduces to an ordinary derivative:

Theorem 2. If g(x) = x for every x, then

    $\frac{df}{dg} = \frac{df}{dx} = f'(x)$.

Obviously,

    $\frac{df}{dg} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = f'(x_0)$

for each $x_0$.

Some examples are as follows:

    $\frac{d \sin x}{d \cos x} = \frac{\cos x}{-\sin x} = -\cot x$  (wherever $\sin x \ne 0$),

    $\frac{d \sin x}{dx} = \cos x$,

    $\frac{d e^x}{d x^2} = \frac{e^x}{2x}$  (wherever $x \ne 0$).
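The definition of df/dg as a limit of quotients $\Delta f/\Delta g$ can be tried out directly. The sketch below approximates the limit by a single small $\Delta x$ (an assumption of ours; a limit is of course not a single quotient) and compares the result with the formula f'/g' of Theorem 1 for two of the examples above. The function name is hypothetical.

```python
import math

def difference_quotient(f, g, x0, dx=1e-6):
    """Approximate df/dg at x0 by the quotient (delta f)/(delta g)."""
    delta_f = f(x0 + dx) - f(x0)
    delta_g = g(x0 + dx) - g(x0)
    return delta_f / delta_g

x0 = 1.0

# d sin x / d cos x, compared with cos x / (-sin x) = -cot x
approx_1 = difference_quotient(math.sin, math.cos, x0)
exact_1 = -math.cos(x0) / math.sin(x0)

# d e^x / d x^2, compared with e^x / (2x)
approx_2 = difference_quotient(math.exp, lambda x: x**2, x0)
exact_2 = math.exp(x0) / (2 * x0)

print(approx_1, exact_1)
print(approx_2, exact_2)
```

In each case the quotient agrees with f'/g' to several decimal places, and the agreement improves as $\Delta x$ is made smaller (until rounding error takes over).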

We often write

    $\frac{d}{dx} f(x)$ for $\frac{df}{dx}$.

Thus every derivative can be written in the form

    $f'(x) = \frac{d}{dx} f(x) = \frac{df}{dx}$.

The notation df/dx for derivatives is widely used, especially in physics, and it is natural to use it when you are continually dealing with the derivative df/dg of one function with respect to another. It has a disadvantage, however: there is no convenient way to write the value of the derivative at a particular point $x_0$. Sometimes we denote this by

    $\left.\frac{df}{dx}\right|_{x = x_0}$,

but the notation $f'(x_0)$ is more convenient.

We now want to prove a sort of cancellation law

    $\frac{df}{dg} \cdot \frac{dg}{dh} = \frac{df}{dh}$.

We can derive this from the equation

    $\frac{\Delta f}{\Delta g} \cdot \frac{\Delta g}{\Delta h} = \frac{\Delta f}{\Delta h}$,

taking the limit as $\Delta x \to 0$. Thus we need

    $\frac{\Delta f}{\Delta g} \to \frac{df}{dg}, \qquad \frac{\Delta g}{\Delta h} \to \frac{dg}{dh}$,

as $\Delta x \to 0$. This requires

    $g'(x_0) \ne 0, \qquad h'(x_0) \ne 0$,

as in Theorem 1. Hence the conditions in the following theorem:

Theorem 3. If f, g, and h are differentiable, then

    $\frac{df}{dg} \cdot \frac{dg}{dh} = \frac{df}{dh}$,

wherever $g' \ne 0$ and $h' \ne 0$.

Theorem 4.

    $\frac{dg}{df} = \frac{1}{df/dg}$,

wherever $df/dg \ne 0$.

(The limit of the quotient is the quotient of the limits.)


We shall now find a short-cut for calculating derivatives of the type df/dg. Consider

    $\frac{d \sin^2 x}{d \sin x} = \frac{2 \sin x \cos x}{\cos x} = 2 \sin x \quad (\cos x \ne 0)$.

This has the form

    $\frac{du^2}{du} = 2u$.

This is like

    $\frac{df}{dx} = f'(x) = 2x$

for $f(x) = x^2$. That is, to find $du^2/du$ (where u is a function), we treat u as if it were a dummy variable x and differentiate in one step. This is an example of the following situation.

Definition. Let f and g be functions. If there is a function $\phi$ such that $f = \phi(g)$, then we say that f is a function of g.

For example, $\sin^2 x$ is a function of $\sin x$, with $\phi(u) = u^2$. And $\cos^2 x - 2 \cos x$ is a function of $\cos x$, with $\phi(u) = u^2 - 2u$. The easiest way to calculate df/du, in each of these cases, is to write

    $\frac{d \sin^2 x}{d \sin x} = \frac{du^2}{du} = 2u = 2 \sin x$,

    $\frac{d(\cos^2 x - 2 \cos x)}{d \cos x} = \frac{d(u^2 - 2u)}{du} = 2u - 2 = 2 \cos x - 2$.

This procedure is justified by the following theorem.

Theorem 5. Let f be a function of g, $f = \phi(g)$, where all the functions are differentiable. Then

    $\frac{df}{dg} = \phi'(g)$,

wherever $g' \ne 0$.

Proof.

    $\frac{df}{dg} = \frac{f'}{g'} = \frac{\phi'(g)\,g'}{g'} = \phi'(g)$.

Using this theorem, we can write immediately

    $\frac{d \tan^2 x}{d \tan x} = 2 \tan x$,

instead of using Theorem 1 and writing

    $\frac{d \tan^2 x}{d \tan x} = \frac{2 \tan x \sec^2 x}{\sec^2 x} = 2 \tan x$;

there is no point in writing $\sec^2 x = D \tan x$ in both numerator and denominator, since we are about to cancel it out in any case.
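The agreement between Theorem 1 and Theorem 5 can be checked at a sample point. The sketch below uses the example $f(x) = \cos^2 x - 2\cos x$, $g(x) = \cos x$, $\phi(u) = u^2 - 2u$ from the text; the derivatives are written out by hand, and the sample point $x_0 = 0.7$ and the function names are our own choices.

```python
import math

def f_prime(x):
    """Derivative of f(x) = cos^2 x - 2 cos x, by the chain rule."""
    return -2 * math.cos(x) * math.sin(x) + 2 * math.sin(x)

def g(x):
    return math.cos(x)

def g_prime(x):
    return -math.sin(x)

def phi_prime(u):
    """Derivative of phi(u) = u^2 - 2u."""
    return 2 * u - 2

x0 = 0.7                              # any point where sin x0 != 0
by_theorem_1 = f_prime(x0) / g_prime(x0)   # df/dg = f'/g'
by_theorem_5 = phi_prime(g(x0))            # df/dg = phi'(g)

print(by_theorem_1, by_theorem_5)
```

Both computations give $2\cos x_0 - 2$; the second one never has to write down g'.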
PROBLEM SET 5.8
Calculate df/dg, given:

1. /(x) =e"' , g(x) = x2

2. /(x) = e'" ,g(x) =2x

3. f(x) =e"', g(x) =Tan x

4. f(x) =exsinz,g(x) =x

5. f(x) =exsinx,g(x) = x3

__

7. /(x) =e'"3, g(x) =x4

6. f(x) =e"' ,g(x) = x2


8. f(x) = sin x,g(x) = cos x

9. f(x) = x3,g(x) =Tan x


Problems 10 through 14. In Problems 1 through 5, first calculate the function $\phi$ such that $f = \phi(g)$. (Answer in the form $\phi(u) = \ldots$.) Then calculate $\phi'(u) = \ldots$. Finally, calculate $\phi'(g)$, and compare it with your previous formula for df/dg. (Or, if you worked the problems this way in the first place, work them by the other method, and check.)

Calculate as for Problems 1 through 9.
15. $f(t) = \sin t$, $g(t) = e^t$

16. $f(t) = \cos t$, $g(t) = \mathrm{Tan}\ t$

17. $f(t) = t^6$, $g(t) = t^3$

18. $f(t) = t^6$, $g(t) = \mathrm{Tan}\ t$

19. $f(x) = \ln x$, $g(x) = e^x$


Problems 20 through 24. Solve the preceding five problems by another method.

25. Given $f^2 + t^2 + 1 = 0$, find df/dt.

26. Given $f^3 + t^3 = 1$, find df/dt. Then calculate f = f(t), find f'(t), and compare the result with df/dt.

27. Same, for $f^4 + t^4 = 1$. (Here there are two functions f = f(t) to be considered.)

28. Now try to check your answer to Problem 25 in the same way that you checked your answers to Problems 26 and 27. (It often happens that a formal process gives "answers" in cases where there never was a question.)

6 The Technique of Integration

6.1 INTRODUCTION

In Section 3.7 we found a way to solve certain types of area problem. To find the area under the graph of a continuous function f, from a to b, we introduce the area function

    $F(x) = \int_a^x f(t)\,dt$.

We know that

    $F'(x) = f(x)$ for every x.

To calculate the area function, we find another function G such that

    $G' = f$.

We then know that

    $G' = F'$.

If it happens that G(a) = 0, then we have

    $G(x) = F(x)$ for every x.

If not, we let

    $H(x) = G(x) - G(a)$.

Then

    $H'(x) = G'(x) = F'(x)$ and $H(a) = 0$.

Therefore

    $H(x) = F(x)$ for every x,

and so

    $\int_a^b f(t)\,dt = F(b) = H(b) = G(b) - G(a)$.

To sum up:

    $G' = f \;\Rightarrow\; \int_a^b f(t)\,dt = G(b) - G(a)$.

The notations t and G were introduced for the sake of the derivation. Once we have the answer, it is natural to use x and F, and write:

    $F' = f \;\Rightarrow\; \int_a^b f(x)\,dx = F(b) - F(a)$.

More formally:

Theorem 1 (The fundamental theorem of integral calculus). If f is continuous on [a, b], and F' = f, then

    $\int_a^b f(x)\,dx = F(b) - F(a)$.

To apply the theorem, of course, we need to find F when f is given. This process is called antidifferentiation. We shall see later that the method of antidifferentiation enables us to solve not only the sort of area problems that we have used it on so far, but also a variety of problems which, offhand, don't look like area problems at all. But these applications should be postponed. The point is that, to apply the method, we need to know how to calculate a function F whose derivative is a given function f; up to now we have been finding such functions F only by hit-or-miss procedures, in simple cases; and it would not be good to reduce various problems to problems in antidifferentiation, when we are unable to solve the antidifferentiation problems. We should therefore first learn better methods for calculating functions when their derivatives are given.
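The fundamental theorem can be compared with a direct approximation of the area. The sketch below is an illustration only: it approximates $\int_0^{\pi/2} \cos x\,dx$ by a sum of thin rectangles (taking heights at midpoints, a choice of ours) and compares the result with F(b) - F(a) for F = sin, whose derivative is cos. The function name is hypothetical.

```python
import math

def midpoint_sum(f, a, b, n=100_000):
    """Sum of the areas of n rectangles of equal width over [a, b],
    each with height f(midpoint of its base)."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))

a, b = 0.0, math.pi / 2
area = midpoint_sum(math.cos, a, b)       # direct approximation of the integral
by_theorem = math.sin(b) - math.sin(a)    # F(b) - F(a), with F' = cos

print(area, by_theorem)
```

The two numbers agree closely, but the second requires only two evaluations of F; this is the practical reason for learning to antidifferentiate.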

6.2 INDEPENDENT VARIABLES AND INDEFINITE INTEGRALS

The usual way of defining a function is to write an expression which gives the value of the function for every number in the domain. For example, we may define functions f and g by writing

    $f(x) = x^2 \quad (-\infty < x < \infty), \qquad g(x) = \sqrt{x} \quad (x \ge 0)$.

In these formulas, the letter "x" is called the independent variable. It is simply a dummy letter, marking the places where numbers are to be inserted. Logically speaking, it makes no difference what letter we use as a dummy. For example, we could have defined exactly the same functions by writing

    $f(t) = t^2 \quad (-\infty < t < \infty), \qquad g(t) = \sqrt{t} \quad (t \ge 0)$.

If we have decided to use, say, x as the dummy, then we say that f is a function of x. Thus, when we write

    $g(t) = \cos^2 t$,

we are describing g as a function of t; $h(\alpha) = \alpha^3 - 1$ is a function of $\alpha$; and so on.

We return now to the problem of antidifferentiation. We found long ago, from the uniqueness theorem, that if two functions have the same derivative, on an interval, then they differ by a constant. Thus, if

    $f'(x) = x^2 \quad (-\infty < x < \infty)$,

then f must be a function of the form

    $f(x) = \frac{x^3}{3} + C$,

where C is a constant. The converse is trivial: for every C, $D(x^3/3 + C) = x^2$. Therefore

    $\{F \mid F' = x^2\} = \left\{\frac{x^3}{3} + C\right\}$.

The set of all functions F for which F' = f is commonly denoted by

    $\int f(x)\,dx$.

This is called the indefinite integral of f. Thus

    $\int x^4\,dx = \{F \mid F'(x) = x^4\} = \left\{\frac{1}{5}x^5 + C\right\}$,

    $\int \cos x\,dx = \{F \mid F'(x) = \cos x\} = \{\sin x + C\}$,

and so on. Any other dummy letter would have done as well:

    $\int t^4\,dt = \{F \mid F'(t) = t^4\}$,

and

    $\int \cos t\,dt = \{F \mid F'(t) = \cos t\}$.

In each case, the braces on the right indicate that we are talking about the set of all functions of the form given inside. The symbol dx (or dt) merely reminds us that x (or t) is the dummy letter used in describing the function. In the examples above, the reminder may seem unnecessary. Similarly, when we write

    $\int (3x^3 + 2x^4)\,dx = \left\{\frac{3x^4}{4} + \frac{2x^5}{5} + C\right\}$,

we might have gotten along without the "dx," because the only constants involved are the numerical constants 2 and 3. On the other hand, if we write

    $\int (\alpha x^3 y + \beta x^2 y^2)\,dx$,

the "dx" is needed; it tells us that $\alpha$, $\beta$, and y are to be regarded as constants, and that the function which we are dealing with is

    $f(x) = \alpha x^3 y + \beta x^2 y^2$.

When the problem is understood in this sense, it is plain that the answer is

    $\int (\alpha x^3 y + \beta x^2 y^2)\,dx = \left\{\frac{\alpha x^4 y}{4} + \frac{\beta x^3 y^2}{3} + C\right\}$.    (i)

This should be compared with

    $\int (\alpha x^3 y + \beta x^2 y^2)\,dy = \left\{\frac{\alpha x^3 y^2}{2} + \frac{\beta x^2 y^3}{3} + C\right\}$,    (ii)

    $\int (\alpha x^3 y + \beta x^2 y^2)\,d\alpha = \left\{\frac{\alpha^2 x^3 y}{2} + \beta x^2 y^2 \alpha + C\right\}$,    (iii)

    $\int (\alpha x^3 y + \beta x^2 y^2)\,d\beta = \left\{\alpha x^3 y \beta + \frac{\beta^2 x^2 y^2}{2} + C\right\}$.    (iv)

In (ii), $\alpha$, $\beta$, and x are constants, and the function is

    $g(y) = \alpha x^3 y + \beta x^2 y^2$.

In (iii), x, y, and $\beta$ are constants, and the function is

    $h(\alpha) = \alpha x^3 y + \beta x^2 y^2$.

Similarly for (iv).
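The role of the dummy letter in (i) through (iv) can be tested numerically: each proposed answer, differentiated with respect to its own dummy letter while the other three letters are held constant, should give back the same expression $\alpha x^3 y + \beta x^2 y^2$. The sketch below does this with a symmetric difference quotient at one sample point; the sample values, the step h, and all names are assumptions made for illustration.

```python
def integrand(alpha, beta, x, y):
    return alpha * x**3 * y + beta * x**2 * y**2

# One member (C = 0) of each indefinite integral, keyed by its dummy letter.
ANTIDERIVATIVES = {
    "x": lambda alpha, beta, x, y: alpha * x**4 * y / 4 + beta * x**3 * y**2 / 3,
    "y": lambda alpha, beta, x, y: alpha * x**3 * y**2 / 2 + beta * x**2 * y**3 / 3,
    "alpha": lambda alpha, beta, x, y: alpha**2 * x**3 * y / 2 + beta * x**2 * y**2 * alpha,
    "beta": lambda alpha, beta, x, y: alpha * x**3 * y * beta + beta**2 * x**2 * y**2 / 2,
}

def derivative_in(func, point, name, h=1e-5):
    """Symmetric difference quotient of func in the variable `name`,
    holding the other variables in `point` constant."""
    up, down = dict(point), dict(point)
    up[name] += h
    down[name] -= h
    return (func(**up) - func(**down)) / (2 * h)

point = {"alpha": 1.5, "beta": -0.7, "x": 1.2, "y": 0.8}

errors = {
    name: abs(derivative_in(F, point, name) - integrand(**point))
    for name, F in ANTIDERIVATIVES.items()
}
print(errors)
```

All four errors are negligible, although the four antiderivatives are quite different functions.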


The process of calculating indefinite integrals is called indefinite integration, or briefly, integration. Given a differentiation formula, we can get a corresponding integration formula merely by writing the given formula "backwards," with minor adjustments in some cases to take care of constants. In each case below, the formula on the right follows from the formula or formulas on the left.

    $Dx^n = nx^{n-1}$ $(n \ne 0)$,  $Dx^{n+1} = (n+1)x^n$,  $D\left(\frac{x^{n+1}}{n+1}\right) = x^n$ $(n \ne -1)$   $\Rightarrow$   $\int x^n\,dx = \left\{\frac{1}{n+1}x^{n+1} + C\right\}$ $(n \ne -1)$,

    $D\sqrt{x} = \frac{1}{2\sqrt{x}}$,  $D(2\sqrt{x}) = \frac{1}{\sqrt{x}}$   $\Rightarrow$   $\int \frac{1}{\sqrt{x}}\,dx = \{2\sqrt{x} + C\}$,

    $D \sin x = \cos x$   $\Rightarrow$   $\int \cos x\,dx = \{\sin x + C\}$,

    $D \cos x = -\sin x$,  $D(-\cos x) = \sin x$   $\Rightarrow$   $\int \sin x\,dx = \{-\cos x + C\}$,

    $D \ln x = \frac{1}{x}$ $(x > 0)$   $\Rightarrow$   $\int \frac{1}{x}\,dx = \{\ln x + C\}$ $(x > 0)$,

    $De^x = e^x$   $\Rightarrow$   $\int e^x\,dx = \{e^x + C\}$.

We know many more differentiation formulas than this, and so we could have written many more integration formulas. But we postpone the complete list until we can write it in a better form, which we shall now explain.

Given a function f, if u is another function, then f(u) is a composite function. By the chain rule,

    $Df(u) = f'(u)u'$.

It follows that

    $\int f'(u(x))u'(x)\,dx = \{f(u(x)) + C\}$.

For example, if

    $f(u) = \sin u, \qquad u(x) = x^2 + 1$,

then

    $D[f(u(x))] = D[\sin (x^2 + 1)] = f'(u)u'(x) = (\cos u)2x = [\cos (x^2 + 1)]2x$.

Therefore

    $\int [\cos (x^2 + 1)]2x\,dx = \{\sin (x^2 + 1) + C\}$.

More generally,

    $D \sin u(x) = [\cos u(x)]u'(x)$,

and so

    $\int [\cos u(x)]u'(x)\,dx = \{\sin u(x) + C\}$.

This works for any functions. If

    $F' = f$,

so that

    $\int f(x)\,dx = \{F(x) + C\}$,

then

    $D[F(u(x))] = F'(u(x))u'(x) = f(u(x))u'(x)$,

so that

    $\int f(u(x))u'(x)\,dx = \{F(u(x)) + C\}$.

In such formulas, we abbreviate u'(x) dx by the symbol du. Thus we write

    $\int \cos u\,du = \{\sin u + C\}$,

which means that for every differentiable function u(x), we have

    $\int [\cos u(x)]u'(x)\,dx = \{\sin u(x) + C\}$.

Similarly, we write

    $\int e^u\,du = \{e^u + C\}$,

which means that if u is any differentiable function, then

    $\int e^{u(x)}u'(x)\,dx = \{e^{u(x)} + C\}$.

This is true, because

    $De^{u(x)} = e^{u(x)}u'(x)$.

Using different dummy letters, we can convert the above formula to forms such as

    $\int e^{u(t)}u'(t)\,dt = \{e^{u(t)} + C\}$,

and so on. More often, however, we start with an integral described in the long notation and observe that it is convertible to a short form. For example,

    $\int e^{x^2+1}\,2x\,dx$

has the form

    $\int e^u\,du$,

where $u(x) = x^2 + 1$. Therefore

    $\int e^{x^2+1}\,2x\,dx = \int e^u\,du = \{e^u + C\} = \{e^{x^2+1} + C\}$.

Similarly, $\int [\sin (t^2 + 1)]2t\,dt$ has the form $\int \sin u\,du$. Therefore

    $\int [\sin (t^2 + 1)]2t\,dt = \int \sin u\,du = \{-\cos u + C\} = \{-\cos (t^2 + 1) + C\}$.

Note that the solution is not finished in the third formula above, because u is a function. To complete the solution, we need to express the function in terms of the dummy letter t.

To sum up:

    $F' = f \;\Rightarrow\; D[F(u)] = f(u)u'$.

Therefore

    $\int f(x)\,dx = \{F(x) + C\} \;\Rightarrow\; \int f(u(x))u'(x)\,dx = \{F(u) + C\}$.

In the abbreviated form, using du for u'(x) dx, we have

    $\int f(x)\,dx = \{F(x) + C\} \;\Rightarrow\; \int f(u)\,du = \{F(u) + C\}$.

Using this general idea, we can write all of our old integration formulas in the more general form. The first few look like this:

    $\int u^n\,du = \left\{\frac{u^{n+1}}{n+1} + C\right\}$ $(n \ne -1)$,    $\int \frac{1}{\sqrt{u}}\,du = \{2\sqrt{u} + C\}$,

    $\int \cos u\,du = \{\sin u + C\}$,    $\int \sin u\,du = \{-\cos u + C\}$,

    $\int \frac{1}{u}\,du = \{\ln u + C\}$ $(u > 0)$,    $\int e^u\,du = \{e^u + C\}$.

And of course we have

    $\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx$,

    $\int kf(x)\,dx = k\int f(x)\,dx$,  $k \ne 0$,

because

    $D[f + g] = Df + Dg$ and $D(kf) = kDf$.

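Let us now consider how to apply such formulas as these, as a practical matter.

Example 1. Consider

    $\int (x^2 + 1)^7 x\,dx$.

This is almost, but not quite, in the form

    $\int u^7\,du$.

If we take

    $u(x) = x^2 + 1$,

then

    $du = u'(x)\,dx = 2x\,dx$.

We therefore have

    $\int (x^2 + 1)^7 x\,dx = \int \tfrac{1}{2}(x^2 + 1)^7\,2x\,dx = \int \tfrac{1}{2}u^7\,du = \left\{\tfrac{1}{2} \cdot \tfrac{1}{8}u^8 + C\right\} = \left\{\tfrac{1}{16}(x^2 + 1)^8 + C\right\}$.

This checks:

    $D\left[\tfrac{1}{16}(x^2 + 1)^8\right] = \tfrac{1}{16} \cdot 8(x^2 + 1)^7 \cdot 2x = x(x^2 + 1)^7$.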
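Since both sides of the check in Example 1 are polynomials, the check can also be done exactly, coefficient by coefficient. The sketch below expands $(x^2 + 1)^n$ by the binomial theorem, differentiates $\tfrac{1}{16}(x^2 + 1)^8$ term by term, and compares the result with $x(x^2 + 1)^7$. A polynomial is represented as a list of coefficients, with index equal to the power of x; this representation and the function names are our own choices.

```python
from fractions import Fraction
from math import comb

def power_of_x2_plus_1(n):
    """Coefficients of (x^2 + 1)^n; entry k is the coefficient of x^k."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for k in range(n + 1):
        coeffs[2 * k] = Fraction(comb(n, k))   # binomial theorem
    return coeffs

def differentiate(coeffs):
    """Term-by-term derivative: c * x^k becomes k * c * x^(k - 1)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def times_x(coeffs):
    """Multiply a polynomial by x (shift every power up by one)."""
    return [Fraction(0)] + list(coeffs)

candidate = [c / 16 for c in power_of_x2_plus_1(8)]   # (1/16)(x^2 + 1)^8
integrand = times_x(power_of_x2_plus_1(7))            # x (x^2 + 1)^7

print(differentiate(candidate) == integrand)
```

Because the arithmetic is done with exact fractions, equality of the two coefficient lists is a genuine verification, not an approximation.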

Example 2. Consider

$$\int \frac{\cos\sqrt{x}}{\sqrt{x}}\,dx \qquad (x > 0).$$

The only form that might fit this integral is the form $\int \cos u\,du$. Thus we would have

$$u(x) = \sqrt{x}, \qquad du = u'(x)\,dx = \frac{1}{2\sqrt{x}}\,dx.$$

The only difference between what we have and what we want is a multiplicative constant. Therefore

$$\int \frac{\cos\sqrt{x}}{\sqrt{x}}\,dx = \int (\cos\sqrt{x})\frac{1}{\sqrt{x}}\,dx = 2\int (\cos\sqrt{x})\frac{1}{2\sqrt{x}}\,dx = 2\int \cos u\,du = \{2\sin u + C\} = \{2\sin\sqrt{x} + C\}.$$

Example 3. Consider

$$\int e^{\cos x}\sin x\,dx.$$

So far we have only one integration formula involving the exponential function:

$$\int e^u\,du = \{e^u + C\}.$$

If our problem fits this form, we must have

$$u(x) = \cos x, \qquad du = u'(x)\,dx = -\sin x\,dx.$$

Here again the multiplicative constant causes no trouble:

$$\int e^{\cos x}\sin x\,dx = -\int e^{\cos x}(-\sin x)\,dx = -\int e^u\,du = \{-e^u + C\} = \{-e^{\cos x} + C\}.$$

Below we shall give a list of all the integration formulas that we can write, at this stage, on the basis of the differentiation formulas that we know. Special explanations are needed, however, in connection with the formula for $\int (1/u)\,du$. Given a function $u$, defined on a domain where $u(x) > 0$ for every $x$, we know that

$$D \ln u(x) = \frac{1}{u(x)}\,Du(x).$$

We need to know that $u(x) > 0$ on the domain under consideration, because only positive numbers have logarithms. Therefore we write

$$\int \frac{1}{u}\,du = \{\ln u + C\} \qquad (u > 0).$$

But even where $u(x) < 0$, it makes sense to write $\int (1/u)\,du$; that is, it makes sense to ask what functions $f$ have $(1/u)u'$ as their derivatives. The answer is easy: if $u(x) < 0$, then $-u(x) > 0$. Therefore $-u(x)$ has a logarithm, and

$$D[\ln(-u(x))] = \frac{1}{-u(x)}\,D[-u(x)] = \frac{1}{-u(x)}\,(-u'(x)) = \frac{1}{u(x)}\,u'(x).$$

This gives us

$$\int \frac{1}{u}\,du = \{\ln(-u) + C\} \qquad (u < 0).$$

Hence the two formulas for $\int (1/u)\,du$ in the list below.

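$$\int kf(x)\,dx = k\int f(x)\,dx \quad (k \neq 0) \tag{1}$$

$$\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx \tag{2}$$

$$\int u^n\,du = \left\{\frac{u^{n+1}}{n+1} + C\right\} \quad (n \neq -1) \tag{3}$$

$$\int \frac{1}{u}\,du = \{\ln u + C\} \quad (u > 0) \tag{4}$$

$$\int \frac{1}{u}\,du = \{\ln(-u) + C\} \quad (u < 0) \tag{5}$$

$$\int \cos u\,du = \{\sin u + C\} \tag{6}$$

$$\int \sin u\,du = \{-\cos u + C\} \tag{7}$$

$$\int \sec^2 u\,du = \{\tan u + C\} \tag{8}$$

$$\int \csc^2 u\,du = \{-\cot u + C\} \tag{9}$$

$$\int \sec u\tan u\,du = \{\sec u + C\} \tag{10}$$

$$\int \csc u\cot u\,du = \{-\csc u + C\} \tag{11}$$

$$\int e^u\,du = \{e^u + C\} \tag{12}$$

$$\int a^u\,du = \left\{\frac{a^u}{\ln a} + C\right\} \quad (a > 0,\ a \neq 1) \tag{13}$$

$$\int \frac{du}{\sqrt{1-u^2}} = \{\mathrm{Sin}^{-1} u + C\} \quad (|u| < 1) \tag{14}$$

$$\int \frac{du}{1+u^2} = \{\mathrm{Tan}^{-1} u + C\} \tag{15}$$

$$\int \frac{du}{u\sqrt{u^2-1}} = \{\mathrm{Sec}^{-1} u + C\} \quad (u > 1) \tag{16}$$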

To solve the following problems, you will start by expressing the given integral in the form $\int f(u)\,du$. In each such case, you should (a) say what $u$ and $du$ are and (b) state the general formula that you are applying. It is natural to write down the original integral first, and after this it would be awkward to interrupt the solution with the formulas for $u(x)$ and $du = u'(x)\,dx$. But $u$ and $du$ can be filled in on the right, like this, for example:

$$\begin{array}{rll}
\displaystyle\int (x^3+1)^{10}x^2\,dx &= \displaystyle\int \tfrac13 (x^3+1)^{10}\,3x^2\,dx & \qquad u(x) = x^3 + 1 \\[4pt]
&= \displaystyle\int \tfrac13 u^{10}\,du = \{\tfrac13 \cdot \tfrac{1}{11}u^{11} + C\} & \qquad du = 3x^2\,dx \\[4pt]
&= \{\tfrac{1}{33}(x^3+1)^{11} + C\}. &
\end{array}$$

This form of the solution shows what we have in mind; writing formulas of the type $u(x) = x^3 + 1$, $du = 3x^2\,dx$, $\int \tfrac13 u^{10}\,du = \{\tfrac13 \cdot \tfrac{1}{11}u^{11} + C\}$ will help you to avoid mistakes. For example, you might write hastily

$$(?) \qquad \int (x^3+1)^{10}x^2\,dx = \{\tfrac{1}{11}(x^3+1)^{11} + C\},$$

as if it were true that for $u(x) = x^3 + 1$, $du = x^2\,dx$. When we write formulas for $u$ and $du$, we uncover such errors.

Similarly for the following wrong solution:

$$(?) \qquad \int (x^2+1)^2\,dx = \{\tfrac13 (x^2+1)^3 + C\}.$$

In full, the solution would begin like this:

$$(?) \qquad \int (x^2+1)^2\,dx \qquad\qquad u = x^2 + 1, \quad du = dx \ (?!)$$

The error is obvious, and so we start over again:

$$\int (x^2+1)^2\,dx = \int (x^4 + 2x^2 + 1)\,dx = \{\tfrac15 x^5 + \tfrac23 x^3 + x + C\}.$$

PROBLEM SET 6.2

Calculate the following integrals, and check by differentiation in each case. Some of
these problems fit together in sequences, in which the answer to one problem helps in the
solution of another; you should watch for such patterns.

1.
4.

f
I

(1 + x2)3x dx

2.

(t4 + l)t8dt

5.

f
f (x2

(1 + t 3)t2dt

3.

t2)3 dx

6.

f
f (x2

(2 + u2)3u du
+

t2}8tx dx

264

7.

10.

13 .

16.

19.

21.

22.

24.

27.

30 .

33.

3 6.

39.

42.

45.

48.

51.

54.

57.

The Technique of Integration

f
J
f
f
f
f
fl
f
J
J
J
J
J
J
J
J
J
J
f

(x2 + t2)3 txd t

(rs/2

l)rs/2dt

11.

v'cos x sin xdx

14.

(e"'

-l- e-"')2 e"'

x2

(I

+ x3)3

f
f
f

8.

e-"')dx

6.2

(l

1
+ y1x)3-=dx
v'x

(I

+ sin x)2 cos xdx

+ 2)4 e"' dx

(e"'

17.

x2

(e"'

12.

15.

f :

20.

dx

dx
1 + xs

9.

f
f
f

e-"')3 dx

x2

(t3f2 + 5)10.,/(dt

(1 + tan x)3i2 sec2 x dx

(e"' - 2)3e-"' dx

f :

18.

(1

dx
x2)2

dx

(There are two intervals to be considered in this problem.)

23. f

in xdx

sin x cos xdx

25.

sin101 x cos xdx

28.

cos57 x sin xdx

31.

(cot2 0 + 1) dO

34.

cos 0

-- dO

37.

(cos2 0 - sin2 0)dO

40.

cos2 0dO

43.

sin2 0

sin2 0dO

46.

cos2 0 sin 0dO

49.

cos (0/2) dO

52.

x e-"'2 dx

55.

e2"'dx

58.

J
J
J
J
J
J
J
J
J
fJl
J
J

In (x2)dx

(Same comment.)

sin2 x cos xdx

26.

cos2 x sin x dx

29.

(1 + tan2 0)dO

3 2.

cot2 0dO

3 5.

sin 20d0

38.

(cos2 0 + sin2 0)dO

41.

(1 - 2 sin2 0)dO

44.

sin2 20dO

47.

sin 0

50.

(1 - sin2 0)dO

- cos 0
dO
2

5 3.

t2et3 dt

56.

e5t dt

59.

J
J
J
J
J
J
J
J
J
J
J
J

sin3 x cos xdx

cos3 x sin xdx

tan2 OdO
sin 0
--do
cos2 0
cos 20dO

(2 cos2 0 - 1)dO

(2 sin2 0 - 1)dO

sin2 0 cos2 0dO

sin3 0dO

v' 1 - cos 0 sin 0 dO

xe"'2 dx

e1

tdt

Integrals Leading to the Logarithm and the Inverse Secant,

6.3

60.

63.

66.

69.

72.
75.
78.

J dt
J sin t dt
I dx
J (2
dt
dt
I
et'+3t

61.

e008 t

64.

(10x)2

67.

+i-312)

70.

( .Y1

I
I

t3

1 +t

73.

12)s

7 6.

4 dt

ex

Yl - e2x

dx

osx
c--dx
I x
2 I x x x xdx

f dx
J2x+idx
dt
I (2
ein sec' x

62.

65.

+ o-3/2

68.

tdt

I
dt
I
I exdx

71.

.Y1 - t2
t2

74.

\o/1 +t3
ex

77.

1 +

xdx
-x
I 2
csc x x xdx
I x x

Algebraic Devices

f cosxdx
J xdx
J
dt
dt dt
I
dt
I - (2t)2
I -=dx
esin x
10x

t(2 +t2)-312

(There are different intervals to consider in Problems 79 through

79.

80.

sin

sec 2

8 .

+sec

tan

83.

secx +tan

.y 1 - t2

.y 4

ex

Y] _ex

84.)

sin

cos

+ csc

csc

265

cot

+cot

81.
84.

J xdx
J xdx
tan

sec

6.3 INTEGRALS LEADING TO THE LOGARITHM AND THE INVERSE SECANT. ALGEBRAIC DEVICES

In the preceding section, we got two formulas for $\int du/u$, for the intervals $(0, \infty)$ and $(-\infty, 0)$:

$$\int \frac{du}{u} = \{\ln u + C\} \qquad (u > 0), \tag{4}$$

$$\int \frac{du}{u} = \{\ln(-u) + C\} \qquad (u < 0). \tag{5}$$

Since $|u| = u$ when $u > 0$ and $|u| = -u$ when $u < 0$, these two formulas can be combined into one:

$$\int \frac{du}{u} = \{\ln|u| + C\} \qquad (\text{on } (0, \infty) \text{ or } (-\infty, 0)). \tag{17}$$

Here the expression in parentheses on the right reminds us that the formula can be used on an interval where $u > 0$, or on an interval where $u < 0$; it cannot be used on an interval where $u$ takes on the value 0. When $u = 0$, there is no such thing as the "$1/u$" on the left or the "$\ln|u|$" on the right. Thus, whenever we apply formula (17), we might have used formula (4) or (5). The advantage of (17) is that it is easier to use.

Consider

$$\int_{-2}^{-1} \frac{dx}{x}.$$

In the fundamental theorem of integral calculus, we take

$$f(x) = \frac{1}{x}, \qquad F(x) = \ln|x|.$$

Then $F' = f$. Therefore

$$\int_{-2}^{-1} \frac{dx}{x} = F(-1) - F(-2) = \ln|-1| - \ln|-2| = 0 - \ln 2 = -\ln 2.$$

This is negative, as it should be; the integrand is negative, and we are integrating from left to right. The calculation might be confusing if we used formula (5):

$$\int_{-2}^{-1} \frac{dx}{x} = [\ln(-x)]_{-2}^{-1} = \ln[-(-1)] - \ln[-(-2)] = 0 - \ln 2 = -\ln 2.$$
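The value $-\ln 2$ can be compared with a direct numerical approximation of the integral. The sketch below is an illustration only, not part of the original text; the function `simpson` (composite Simpson's rule) and the number of subintervals are assumptions made for the example.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Direct numerical value of the integral of 1/x from -2 to -1.
numeric = simpson(lambda x: 1 / x, -2.0, -1.0)

# Formula (17): F(x) = ln|x|.
by_formula_17 = math.log(abs(-1)) - math.log(abs(-2))

# Formula (5): F(x) = ln(-x), valid because x < 0 on the whole interval.
by_formula_5 = math.log(-(-1)) - math.log(-(-2))
```

All three numbers should agree with $-\ln 2$ to within the accuracy of the approximation.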

Hereafter, we shall use the following shorthand for this kind of calculation:

$$\int_{-2}^{-1} \frac{dx}{x} = [\ln|x|]_{-2}^{-1} = \ln|-1| - \ln|-2|.$$

In general

$$[F(x)]_a^b = F(b) - F(a),$$

by definition. Sometimes, where no confusion could result, we may omit the opening
bracket on the left. Thus

3I= -

We can convert various integrals to the form $\int du/u$. For example,

$$\int \tan u\,du = \int \frac{\sin u}{\cos u}\,du.$$

Except for sign, this has the form

$$\int \frac{dv}{v} = \{\ln|v| + C\} \qquad (v > 0 \text{ or } v < 0),$$

where $v = \cos u$, $dv = -\sin u\,du$. Therefore

$$\int \tan u\,du = -\int \frac{-\sin u}{\cos u}\,du = -\int \frac{dv}{v} = \{-\ln|v| + C\} = \{-\ln|\cos u| + C\};$$

and since $-\ln|\cos u| = \ln|\sec u|$, we have

$$\int \tan u\,du = \{\ln|\sec u| + C\} \qquad (\sec u > 0 \text{ or } \sec u < 0). \tag{18}$$

This is a standard formula.

Similarly,

$$\int \cot u\,du = \int \frac{\cos u}{\sin u}\,du.$$

This gives

$$\int \cot u\,du = \{\ln|\sin u| + C\} \qquad (\sin u > 0 \text{ or } \sin u < 0). \tag{19}$$

By an ingenious device, we can find $\int \sec x\,dx$. We multiply and divide by $\sec x + \tan x$, getting

$$\int \sec x\,dx = \int \frac{\sec^2 x + \sec x\tan x}{\sec x + \tan x}\,dx.$$

Since

$$D\sec x = \sec x\tan x \qquad \text{and} \qquad D\tan x = \sec^2 x,$$

the integral has the form $\int du/u$, where

$$u = \sec x + \tan x, \qquad du = (\sec x\tan x + \sec^2 x)\,dx.$$

Therefore

$$\int \sec x\,dx = \{\ln|u| + C\} = \{\ln|\sec x + \tan x| + C\}.$$

As always, the chain rule gives us a more general formula for $\int \sec u\,du$:

$$\int \sec u\,du = \{\ln|\sec u + \tan u| + C\} \qquad (\sec u + \tan u > 0 \text{ or } < 0). \tag{20}$$

Similarly,

$$\int \csc x\,dx = \int \frac{\csc x(\csc x + \cot x)}{\csc x + \cot x}\,dx = \{-\ln|\csc x + \cot x| + C\};$$

and this gives

$$\int \csc u\,du = \{-\ln|\csc u + \cot u| + C\} \qquad (\csc u + \cot u > 0 \text{ or } < 0). \tag{21}$$

Consider now the formula

$$D\,\mathrm{Sec}^{-1} x = \frac{1}{x\sqrt{x^2-1}} \qquad (x > 1).$$

The graph of $\mathrm{Sec}^{-1}$ looks like this:

[Figure: graph of $y = \mathrm{Sec}^{-1} x$ for $x \geq 1$, approaching the horizontal line $y = \pi/2$.]

(See Section 4.7.) Thus $\mathrm{Sec}^{-1}$ is defined on the interval $[1, \infty)$. But at 1 its tangent is vertical; and so the differentiation formula holds only for $x > 1$. It gives

$$\int \frac{dx}{x\sqrt{x^2-1}} = \{\mathrm{Sec}^{-1} x + C\} \qquad (x > 1),$$

and more generally

$$\int \frac{du}{u\sqrt{u^2-1}} = \{\mathrm{Sec}^{-1} u + C\} \qquad (u > 1). \tag{16}$$

Notice, however, that the integral

$$\int_{-3}^{-2} \frac{1}{x\sqrt{x^2-1}}\,dx$$

makes sense. We therefore need an integration formula which will apply to this integrand on the interval $(-\infty, -1)$. On this domain,

$$\int \frac{1}{x\sqrt{x^2-1}}\,dx = \int \frac{1}{-x\sqrt{(-x)^2-1}}\,(-1)\,dx = \int \frac{1}{u\sqrt{u^2-1}}\,du,$$

where

$$u(x) = -x, \qquad du = (-1)\,dx.$$

Therefore, for $x < -1$,

$$\int \frac{dx}{x\sqrt{x^2-1}} = \{\mathrm{Sec}^{-1} u + C\} = \{\mathrm{Sec}^{-1}(-x) + C\} = \{\mathrm{Sec}^{-1}|x| + C\} \qquad (x < -1),$$

because $|x| = -x$ when $x < 0$. Fitting our two formulas together, and passing to the general case (with a function $u$ instead of $x$), we get

$$\int \frac{du}{u\sqrt{u^2-1}} = \{\mathrm{Sec}^{-1}|u| + C\} \qquad (u > 1 \text{ or } u < -1). \tag{22}$$
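Formula (22) can be tried out on an interval to the left of $-1$. The following Python sketch is illustrative and not part of the original text. It assumes that $\mathrm{Sec}^{-1} x = \mathrm{Cos}^{-1}(1/x)$ for $x \geq 1$ (the helper `arcsec`), uses a hypothetical helper `simpson` for the numerical integral, and takes $[-3, -2]$ as a sample interval.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def arcsec(x):
    """Sec^{-1} x for x >= 1, computed as Cos^{-1}(1/x) (an assumed identity)."""
    return math.acos(1 / x)

def integrand(x):
    return 1 / (x * math.sqrt(x * x - 1))

# Numerical integral over an interval where x < -1.
numeric = simpson(integrand, -3.0, -2.0)

# Formula (22): antiderivative Sec^{-1}|x|.
by_formula_22 = arcsec(abs(-2.0)) - arcsec(abs(-3.0))
```

Both values are negative, as they should be: the integrand is negative for $x < -1$, and we are integrating from left to right.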

There is a rough rule to help you decide which of our present list of formulas to apply to a given problem: look in the integrand for functions which are the derivatives of other functions. The point is that all our formulas have left-hand members of the form $\int f(u)\,du$; and we need to decide, in each case, what $u$ is.

Example 1.

$$\int \frac{\ln^3 x}{x}\,dx.$$

Is there anything here that is the derivative of something else? Yes:

$$D\ln x = \frac{1}{x}.$$

Taking $u(x) = \ln x$, we have

$$du = u'(x)\,dx = \frac{1}{x}\,dx.$$

Thus our integral has the form $\int u^3\,du$:

$$\int \frac{\ln^3 x}{x}\,dx = \int (\ln^3 x)\frac{1}{x}\,dx = \int u^3\,du = \{\tfrac14 u^4 + C\} = \{\tfrac14 \ln^4 x + C\}.$$

Example 2.

$$\int \frac{x\,dx}{(1+x^2)^7}.$$

Looking for functions which are derivatives of other functions, we observe that

$$D(1 + x^2) = 2x.$$

Multiplicative constants are no trouble:

$$\int \frac{x\,dx}{(1+x^2)^7} = \frac12 \int \frac{2x\,dx}{(1+x^2)^7} = \frac12 \int u^{-7}\,du,$$

where

$$u(x) = 1 + x^2, \qquad du = u'(x)\,dx = 2x\,dx.$$

Therefore the answer is

$$\left\{\frac12 \cdot \frac{1}{-7+1}\,u^{-7+1} + C\right\} = \{-\tfrac{1}{12}u^{-6} + C\} = \{-\tfrac{1}{12}(1+x^2)^{-6} + C\}.$$

Example 3. Sometimes we have to hunt harder:

$$\int \frac{x\,dx}{\sqrt{1-x^4}}.$$

There is no hope that $1/\sqrt{1-x^4}$ is part of $du$. Either the problem is hard or $du$ must be $x\,dx$, or a constant multiple of $x\,dx$. Now $2x = Dx^2$; and $x^2$ is what gets squared under the radical sign in the denominator. This suggests $u = x^2$, and

$$\int \frac{x\,dx}{\sqrt{1-x^4}} = \frac12 \int \frac{2x\,dx}{\sqrt{1-(x^2)^2}} = \frac12 \int \frac{du}{\sqrt{1-u^2}} = \{\tfrac12\,\mathrm{Sin}^{-1} u + C\} = \{\tfrac12\,\mathrm{Sin}^{-1} x^2 + C\}.$$

Example 4. Some obscure-looking integrals may be calculated algebraically:

$$\int \frac{x\,dx}{x-1} = \int \left(1 + \frac{1}{x-1}\right)dx.$$

(Here we have divided the denominator into the numerator, getting a quotient and a remainder.) Therefore

$$\int \frac{x\,dx}{x-1} = \{x + \ln|x-1| + C\}.$$

Example 5. Sometimes we need to find other algebraic devices, for such problems as this:

$$\int \frac{dx}{1+e^{-x}}.$$

As it stands, this is hopeless: nothing in the integrand is the derivative of anything else. But

$$\int \frac{dx}{1+e^{-x}} = \int \frac{e^x\,dx}{e^x+1} = \int \frac{du}{u} \qquad (u = e^x + 1,\ du = e^x\,dx)$$

$$= \{\ln|u| + C\} = \{\ln(1+e^x) + C\}.$$

(No absolute-value signs are needed, because $1 + e^x > 1$ for every $x$.)

Example 6. Sometimes the same devices appear in more complicated forms:

$$\int \frac{dx}{e^x + e^{-x}} = \int \frac{e^x\,dx}{1+e^{2x}} = \int \frac{e^x\,dx}{1+(e^x)^2} = \int \frac{du}{1+u^2} \qquad (u = e^x,\ du = e^x\,dx)$$

$$= \{\mathrm{Tan}^{-1} u + C\} = \{\mathrm{Tan}^{-1} e^x + C\}.$$

Here we have used, in combination, the methods that worked in Examples 3 and 5.

Example 7. Often we need routine algebra and arithmetic:

$$\int \frac{dx}{4+x^2} = \frac14 \int \frac{dx}{1+(x/2)^2} = \frac14 \cdot 2\int \frac{\tfrac12\,dx}{1+(x/2)^2}.$$

Here $u = x/2$, $du = \tfrac12\,dx$. This gives

$$\left\{\tfrac12\,\mathrm{Tan}^{-1}\frac{x}{2} + C\right\}.$$

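Similarly,

$$\int \frac{dx}{\sqrt{3-x^2}} = \frac{1}{\sqrt3}\int \frac{dx}{\sqrt{1-(x/\sqrt3)^2}} = \int \frac{(1/\sqrt3)\,dx}{\sqrt{1-(x/\sqrt3)^2}} = \left\{\mathrm{Sin}^{-1}\frac{x}{\sqrt3} + C\right\}.$$

There was nothing special about the numbers $2^2$ and $(\sqrt3)^2$. In the same way, we get

$$\int \frac{dx}{a^2+x^2} = \left\{\frac1a\,\mathrm{Tan}^{-1}\frac{x}{a} + C\right\} \qquad (a > 0),$$

$$\int \frac{dx}{\sqrt{a^2-x^2}} = \left\{\mathrm{Sin}^{-1}\frac{x}{a} + C\right\} \qquad (a > 0).$$

Passing from $x$ to any differentiable function $u$, we get two more standard formulas:

$$\int \frac{du}{a^2+u^2} = \left\{\frac1a\,\mathrm{Tan}^{-1}\frac{u}{a} + C\right\} \qquad (a > 0), \tag{23}$$

$$\int \frac{du}{\sqrt{a^2-u^2}} = \left\{\mathrm{Sin}^{-1}\frac{u}{a} + C\right\} \qquad (a > 0). \tag{24}$$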
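Formulas (23) and (24) can be checked by differentiation for a particular value of $a$. The following is a minimal Python sketch, not part of the original text; the value $a = 3$, the sample points, and the helper `derivative` are assumptions made for illustration.

```python
import math

A = 3.0  # sample value of the constant a > 0 (an assumption for this check)

def derivative(g, x, h=1e-6):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

def F23(u):
    """Antiderivative in (23): (1/a) Tan^{-1}(u/a)."""
    return math.atan(u / A) / A

def f23(u):
    return 1 / (A * A + u * u)

def F24(u):
    """Antiderivative in (24): Sin^{-1}(u/a), defined for |u| < a."""
    return math.asin(u / A)

def f24(u):
    return 1 / math.sqrt(A * A - u * u)

POINTS = [-2.5, -1.0, 0.0, 0.5, 2.0]  # all satisfy |u| < a

error_23 = max(abs(derivative(F23, u) - f23(u)) for u in POINTS)
error_24 = max(abs(derivative(F24, u) - f24(u)) for u in POINTS)
```

Note that (24) only makes sense for $|u| < a$, which is why the sample points are kept inside that interval.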

PROBLEM SET 6.3

Calculate the following integrals, and check by differentiation in each case.

1. I
4. J
7. I
10. I
13. I
J
J : e
J e"'e"' - e-x
e25. J (e"' e-')(e2"'
v' 1

x4

v'4 - y4

dx

v'1 -

dy

5.

9 :sx4 dx

16.
19.

22.

zs

zs

dt

dt

2zs

dz

dz
1 :55zs

dz

xs

dx
dt

dz

,.dx

t4

11.

dz

t2

x
dx
1 + 9x 4

x
3
dx
Vl - x8

2. J 4
J
8. J
J 1
14. J
17. J 1
20. J 1 :t e2t
23. I e"'e"' - e-x
e-x
J

e-2"') dx 26.

dx

2
dx
v'2 - x3

3. J
J
J
12. J 1
J
18. J
21. J e"' e-x
24. J (e"' e- )(
2 7. J

4y4

v'1

6.

15.

:39 4 dx
x
z6

dz

6dz
+ z

x
7
dx
Vl - x8

xs

dx

dx

"'

e"' - e-"') dx
2

2
dx
v'2 - x6

272

The Technique of Integration

2 dx
28. J v1 x-2x3
J v1Sin-1x
-x2dx
34. J xe"' dx
37. J (xcosx
sinx) dx+
J x sinxdx
43. J (3x+2 Inxx2
) dx
1
46. J x( l + 1n2x) dx
49. I x2 V; - 1dx
J V 1 x2dx
J Cos-Ixdx
f
1 + e4u du
6 .1 I 1 :x2dx
64. We know that
31.

40.

52.

55.

58.

Consider

6.3

2 dx
29. J v1x-2x6
32. J Tan-Ix
-1-+x2 dx
35. J In e"'' dx.
38. fxcosxdx
J (2x Inx+ x) dx
44. J x21nxdx
47. J (x + x1)2In+(x2x2 +2x) dx
f Ve2x dx
f Sin-Ixdx
J Cos-I (2x) dx
f du
59. l :2u
62. f Tan-1 xdx
41.

50.

1
30. fxln-2xdx
33. f (xe"' + e"') dx
36. J ln2 e"' dx
f (x sinx -cosx) dx
42. J xlnxdx
J ln3x
--dx
x
48. I tVt21 dt
J (sin-Ix+ Vl x2) dx
J (Cos-Ix - Vl x_ x2) dx
f du
1 + e2u
f(Tan-Ix+ 1 :x2) dx
Jv1 -z2dz
39.

45.

51 .

53.

54.

56.

57.

D-x Dx-I -ix-2


JI-Ix21 dx.
=

60.
63.

x2

In the fundamental theorem of integral calculus, we take

$$f(x) = \frac{1}{x^2}, \qquad F(x) = -\frac{1}{x}.$$

Then $F' = f$. Therefore

$$\int_{-1}^{1} \frac{dx}{x^2} = F(1) - F(-1) = -2.$$

Now we interpret the problem geometrically. We seem to have proved that the region under a positive function has negative area.
a) What went wrong?
b) Show that the area in question is not only positive but infinite. (This does not follow from the mere fact that the region is unbounded. Some unbounded regions have finite areas.)

65. Let $R$ be the region under the graph of $f(x) = 1/\sqrt{x}$, from 0 to 1. Show that $R$ has finite area.

6.4 INTEGRATION BY PARTS

By differentiation, we get

$$D[x\sin x] = x\cos x + \sin x.$$

Since

$$D\cos x = -\sin x,$$

we have

$$D[x\sin x + \cos x] = x\cos x + \sin x - \sin x = x\cos x.$$

Therefore

$$\int x\cos x\,dx = \{x\sin x + \cos x + C\}.$$

Thus, working backward, we have found the solution of an integration problem which might have looked hard if we had approached it forward, starting with the unknown integral $\int x\cos x\,dx$. We shall now describe a general method of solving problems of this kind.
The formula for the derivative of a product is

$$D[u(x)v(x)] = u(x)v'(x) + u'(x)v(x).$$

Therefore

$$\int [u(x)v'(x) + u'(x)v(x)]\,dx = \{u(x)v(x) + C\}.$$

Therefore

$$\int u(x)v'(x)\,dx = u(x)v(x) - \int v(x)u'(x)\,dx.$$

Here we have dropped the constant, $C$, because each of the indefinite integrals on the two sides of the equation carries its own constant with it; what the equation says is that the two sides of the equation represent the same class of functions. Using the short notation

$$du = u'(x)\,dx, \qquad dv = v'(x)\,dx,$$

we get the formula

$$\int u\,dv = uv - \int v\,du.$$
This is the formula for integration by parts; the word parts refers to the functions $u(x)$ and $v'(x)$ in the integral on the left. Any time we apply the formula, we replace one integral by another. The method is useful when the new integral is easier to calculate than the old one.

Let us first try the method on

$$\int x\cos x\,dx.$$

Let

$$u = x, \qquad dv = \cos x\,dx,$$

so that

$$du = dx \qquad \text{and} \qquad v = \sin x.$$

(We need not allow for a constant here; any function $v$ whose derivative is $\cos x$ will work. We will return to this point in a moment.) By the basic formula, we get

$$\int x\cos x\,dx = \int u\,dv = uv - \int v\,du = x\sin x - \int \sin x\,dx = \{x\sin x + \cos x + C\}.$$

If we had used the seemingly more general $v = \sin x + c$, we would have

$$\int u\,dv = x(\sin x + c) - \int (\sin x + c)\,dx = \{x\sin x + cx + \cos x - cx + C\} = \{x\sin x + \cos x + C\},$$

exactly as before. The same happens in general:

$$u(v + c) - \int (v + c)\,du = uv + uc - \int v\,du - uc = uv - \int v\,du.$$

In applying the basic formula, we made what may seem to be an arbitrary choice of $u$ and $dv$. We might have taken

$$u = \cos x, \qquad dv = x\,dx, \qquad du = -\sin x\,dx, \qquad v = \frac{x^2}{2}.$$

This would have given

$$\int x\cos x\,dx = \int u\,dv = uv - \int v\,du = \frac{x^2}{2}\cos x + \frac12 \int x^2\sin x\,dx.$$

This is true, but is worthless as a method of finding $\int x\cos x\,dx$, because the new integral is harder to calculate than the old one.

An equally bad choice would be

$$u = x\cos x, \qquad dv = dx, \qquad du = (\cos x - x\sin x)\,dx, \qquad v = x,$$

which gives

$$\int x\cos x\,dx = \int u\,dv = uv - \int v\,du = x^2\cos x - \int (x\cos x - x^2\sin x)\,dx.$$

Here again the new integral is harder than the old one. We remember also that no term of the form $x^2\cos x$ appears in the right answer. Therefore the term $x^2\cos x$ cannot be the beginning of the solution, as we might hope: it must be a blind alley.

These examples indicate that integration by parts can be either a good or a bad method, according to the skill with which we choose the parts. Practice is a help, but there are general rules which help us to decide what choices are promising:

1) $dv$ has got to be something that we know how to integrate. (If it isn't, we can't apply the method at all.)

2) We want $\int v\,du$ to be an easier integral than $\int u\,dv$. Therefore we want $du$ to be simpler than $u$. At least, we don't want $du$ to be more complicated than $u$.

3) For the same reason, we want $v$ to be simpler than $dv$; at least we don't want it to look worse than $dv$.

These rules are not infallible, but they are a help. Let us try them on

$$\int xe^x\,dx.$$

We can integrate both $x$ and $e^x$. Therefore (1) gives us no guidance. Rule (2) suggests that $u = x$ and $u = e^x$ are both acceptable, but that $u = x$ is to be preferred. ($De^x = e^x$, which is no worse than $e^x$, but $Dx = 1$, which looks good.) We therefore use

$$u = x, \qquad dv = e^x\,dx, \qquad du = dx, \qquad v = e^x.$$

This gives

$$\int xe^x\,dx = \int u\,dv = uv - \int v\,du = xe^x - \int e^x\,dx = \{xe^x - e^x + C\}.$$

Note that (3) advises us not to try

$$u = e^x, \qquad dv = x\,dx,$$

because we would then get $v = x^2/2$, which looks worse than $dv$. In fact, this choice won't work.

Consider next

$$\int x^2e^x\,dx.$$

Rule (3) tells us that we had better take $dv = e^x\,dx$. We therefore take

$$u = x^2, \qquad dv = e^x\,dx, \qquad du = 2x\,dx, \qquad v = e^x.$$

This looks good under rule (2), and acceptable under rule (3). We get

$$\int x^2e^x\,dx = \int u\,dv = uv - \int v\,du = x^2e^x - 2\int xe^x\,dx = \{x^2e^x - 2xe^x + 2e^x + C\},$$

by the result of the previous problem. If we hadn't known the answer to the previous problem, it would still be easy to see that we had made progress, in replacing $\int x^2e^x\,dx$ by $\int xe^x\,dx$; we would then attack the new problem by the same method.

It sometimes happens that integration by parts gives us not an expression for the integral that we started with, but an equation that can be solved for this integral. Consider

$$\int e^x\sin x\,dx.$$

We take

$$u = e^x, \qquad du = e^x\,dx, \qquad dv = \sin x\,dx, \qquad v = -\cos x.$$

This gives

$$\int e^x\sin x\,dx = -e^x\cos x + \int e^x\cos x\,dx.$$

We repeat the process, taking

$$u = e^x, \qquad du = e^x\,dx, \qquad dv = \cos x\,dx, \qquad v = \sin x.$$

For short, we write

$$I = \int e^x\sin x\,dx.$$

We then have

$$I = -e^x\cos x + \int e^x\cos x\,dx = -e^x\cos x + e^x\sin x - \int e^x\sin x\,dx.$$

Here the last integral is simply the one we started with. Therefore

$$2I = \{e^x\sin x - e^x\cos x + C\},$$

and

$$\int e^x\sin x\,dx = \{\tfrac12 [e^x\sin x - e^x\cos x] + C\}.$$

Sometimes we need to make a strange choice, in which $u$ is the whole integrand and $dv$ is merely $dx$. This is what we need, to find

$$\int \ln x\,dx.$$

Here we use

$$u = \ln x, \qquad du = \frac1x\,dx, \qquad dv = dx, \qquad v = x.$$

When we replace 1 by $x$, we seem to have lost somewhat, but the profit in passing from $\ln x$ to $1/x$ more than makes up for it. In fact, this scheme works:

$$\int \ln x\,dx = uv - \int v\,du = x\ln x - \int x \cdot \frac1x\,dx = x\ln x - \int dx = \{x\ln x - x + C\}.$$
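Each of the answers obtained by parts in this section can be checked by differentiation. The Python sketch below does this numerically for five of them. It is an illustration only, not part of the original text; the helper `derivative`, the dictionary `PAIRS`, and the sample points are assumptions made for the example.

```python
import math

def derivative(g, x, h=1e-6):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

# name -> (antiderivative F found by parts, original integrand f)
PAIRS = {
    "x cos x": (lambda x: x * math.sin(x) + math.cos(x),
                lambda x: x * math.cos(x)),
    "x e^x": (lambda x: x * math.exp(x) - math.exp(x),
              lambda x: x * math.exp(x)),
    "x^2 e^x": (lambda x: (x * x - 2 * x + 2) * math.exp(x),
                lambda x: x * x * math.exp(x)),
    "e^x sin x": (lambda x: 0.5 * math.exp(x) * (math.sin(x) - math.cos(x)),
                  lambda x: math.exp(x) * math.sin(x)),
    "ln x": (lambda x: x * math.log(x) - x,
             math.log),
}

POINTS = (0.5, 1.0, 2.0, 3.0)  # positive, so that ln x is defined

def worst_error(F, f):
    """Largest gap between the numerical DF and f over the sample points."""
    return max(abs(derivative(F, x) - f(x)) for x in POINTS)

errors = {name: worst_error(F, f) for name, (F, f) in PAIRS.items()}
```

Every entry of `errors` should be close to zero. A large entry would indicate a slip in the choice of parts or in a sign.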

PROBLEM SET 6.4

Evaluate the following integrals. Each of them can be calculated by the method of integration by parts. You should try to work these problems with the smallest possible number of false starts. In each case, survey the situation and try to arrive at a conclusion on the question of what choice of $u$ and $dv$ is most promising. If you do this carefully, you ought to be able to solve each of the problems below on the first try. Each answer should be checked by differentiation.
1.

4.

7.

10.

13.

16.

19.
22.

J ln2xdx
J
J
J
J
J
J Tan1xdx

2.

xe axdx

5.

ea'" sin x dx

8.

eax cos bxdx

11.

x3e"'dx

14.

x3 ln x dx

17.

20.

J
J
J
J
J
J
J xTan-1xdx
In (x2) dx

3.

x sin axdx

6.

eax

9.

COS

xdx

12 .

x2sin xdx

J
J xcosaxdx
J
J
J2
J
J
(ax)e"'dx

eax sin bxdx

X " COS X dx
9

x2 ln xdx

15.

Sin-1xdx

18.

21.

'

x In2 xdx

Sin-1(2x) dx

xex sin x dx

22. Derive and check:

$$\int \ln^n x\,dx = x\ln^n x - n\int \ln^{n-1} x\,dx.$$

Formulas of this kind are called reduction formulas. By $n - 1$ applications of the formula, we can calculate the integral on the left.

23. Find $\int \ln^3 x\,dx$.

24. Derive a reduction formula for $\int x^n\sin x\,dx$.

25. Derive a reduction formula for $\int x^ne^x\,dx$.


26. Derive a reduction formula which reduces
27.

in J x" Inn x dx.

J sin In x dx. (Here you should survey the situation, decide on the most promising
procedure, and then proceed with faith.)

28. J cos In x dx. (Same comment as for the preceding problem.)


6.5 INTEGRATION OF POWERS OF TRIGONOMETRIC FUNCTIONS

We shall find how to calculate integrals of the form

$$\int \sin^n x\cos^m x\,dx,$$

where $n$ and $m$ are any integers, positive, negative, or zero, and integrals of the forms

$$\int \sec^n x\tan^m x\,dx \qquad \text{and} \qquad \int \csc^n x\cot^m x\,dx,$$

where $n \geq 0$ and $m \geq 0$. We shall discuss the various cases in the order of increasing difficulty.

$$\int \sin^n x\cos^m x\,dx, \quad m \text{ odd and positive.} \tag{1}$$

For example, we might have

$$\int \sin^2 x\cos^3 x\,dx.$$

The method in such cases is as follows. Since

$$\cos^2 x + \sin^2 x = 1,$$

we have $\cos^2 x = 1 - \sin^2 x$, and

$$\int \sin^2 x\cos^3 x\,dx = \int \sin^2 x(1 - \sin^2 x)\cos x\,dx = \int \sin^2 x\cos x\,dx - \int \sin^4 x\cos x\,dx = \{\tfrac13\sin^3 x - \tfrac15\sin^5 x + C\}.$$

This method works whenever $m$ is odd. (In this case $n$ need not be an integer; it may be any real number.) For $m = 2k + 1$, our integral has the form

$$\int \sin^n x\cos^{2k+1} x\,dx = \int \sin^n x(\cos^2 x)^k\cos x\,dx = \int \sin^n x(1 - \sin^2 x)^k\cos x\,dx.$$

We expand $(1 - \sin^2 x)^k$ by the binomial theorem. This gives us a sum of integrals of the form

$$\int u^j\,du \qquad (u = \sin x,\ du = \cos x\,dx).$$

We integrate these one at a time and add the results.

$$\int \sin^n x\cos^m x\,dx, \quad n \text{ odd and positive.} \tag{2}$$

This is like the preceding case. Here $n = 2k + 1$, and the integral has the form

$$\int \cos^m x(\sin^2 x)^k\sin x\,dx = -\int \cos^m x(1 - \cos^2 x)^k(-\sin x)\,dx.$$

Expanding by the binomial formula, we get a sum of integrals of the type

$$\int \cos^j x(-\sin x)\,dx;$$

we evaluate each of these by the formula for $\int u^j\,du$, and add.

$$\int \sin^n x\cos^m x\,dx, \quad m \text{ and } n \geq 0 \text{ and even.} \tag{3}$$

To handle this one, we recall that

$$\cos 2x = \cos^2 x - \sin^2 x = \cos^2 x - (1 - \cos^2 x) = 2\cos^2 x - 1.$$

Solving for $\cos^2 x$, we get

$$\cos^2 x = \frac{1 + \cos 2x}{2}.$$

Similarly,

$$\cos 2x = (1 - \sin^2 x) - \sin^2 x = 1 - 2\sin^2 x,$$

and

$$\sin^2 x = \frac{1 - \cos 2x}{2}.$$

Making these substitutions in the integrand, we get a form in which the exponents are divided by 2. For example,

$$\int \sin^2 x\cos^2 x\,dx = \int \frac{1 - \cos 2x}{2} \cdot \frac{1 + \cos 2x}{2}\,dx = \frac14 \int (1 - \cos^2 2x)\,dx.$$

We now make the same sort of substitution again, getting

$$\frac14 \int \frac{1 - \cos 4x}{2}\,dx = \tfrac18 x - \tfrac18 \int \cos 4x\,dx = \{\tfrac18 x - \tfrac{1}{32}\sin 4x + C\}.$$

When the exponents are large, this method is tedious, but at least we know that it will work.
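The answer to the example under case (3) can be checked by differentiation. The following Python sketch does so numerically; it is not part of the original text, and the helper `derivative` and the sample points are assumptions made for illustration.

```python
import math

def derivative(g, x, h=1e-6):
    """Central difference approximation to g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

def F(x):
    """Answer obtained by the double-angle substitutions: x/8 - (sin 4x)/32."""
    return x / 8 - math.sin(4 * x) / 32

def f(x):
    """Original integrand sin^2 x cos^2 x."""
    return math.sin(x) ** 2 * math.cos(x) ** 2

POINTS = [-2.0, -0.7, 0.0, 0.4, 1.3, 3.0]

worst = max(abs(derivative(F, x) - f(x)) for x in POINTS)
```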
$$\int \tan^n x\,dx \quad (n \text{ positive}). \tag{4}$$

For $n = 1$, we know that

$$\int \tan x\,dx = \{-\ln|\cos x| + C\} \qquad (\cos x > 0 \text{ or } \cos x < 0). \tag{4a}$$

For $n = 2$,

$$\int \tan^2 x\,dx = \int (\sec^2 x - 1)\,dx.$$

(Remember that $1 + \tan^2 x = \sec^2 x$.) This gives

$$\int \tan^2 x\,dx = \{\tan x - x + C\}. \tag{4b}$$

For $n > 2$, we have

$$\int \tan^n x\,dx = \int \tan^{n-2} x(\sec^2 x - 1)\,dx,$$

and so

$$\int \tan^n x\,dx = \frac{1}{n-1}\tan^{n-1} x - \int \tan^{n-2} x\,dx. \tag{4c}$$

This is called a reduction formula. By repeated applications of it, we can reduce the integral to one of the forms (4a) and (4b).
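The repeated use of the reduction formula (4c) is naturally expressed as a recursion. The sketch below is a minimal Python illustration, not part of the original text. The names `tan_power_antiderivative`, `simpson`, and `reduction_error`, and the sample interval $[0.2, 1]$ (on which $\cos x > 0$), are assumptions made for the example.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def tan_power_antiderivative(n, x):
    """One antiderivative of tan^n x (n a positive integer), via (4a), (4b), (4c)."""
    if n == 1:
        return -math.log(abs(math.cos(x)))   # (4a)
    if n == 2:
        return math.tan(x) - x               # (4b)
    # (4c): lower the exponent by 2 and recurse
    return math.tan(x) ** (n - 1) / (n - 1) - tan_power_antiderivative(n - 2, x)

def reduction_error(n, a=0.2, b=1.0):
    """Gap between F(b) - F(a) from the recursion and a numerical integral."""
    exact = tan_power_antiderivative(n, b) - tan_power_antiderivative(n, a)
    numeric = simpson(lambda x: math.tan(x) ** n, a, b)
    return abs(exact - numeric)

errors = [reduction_error(n) for n in range(1, 7)]
```

For odd $n$ the recursion ends at (4a), and for even $n$ at (4b), exactly as described above.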

$$\int \cot^n x\,dx \quad (n \text{ positive}). \tag{5}$$

This is like (4). For $n = 1$,

$$\int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx = \{\ln|\sin x| + C\} \qquad (\sin x > 0 \text{ or } \sin x < 0). \tag{5a}$$

For $n = 2$,

$$\int \cot^2 x\,dx = \int (\csc^2 x - 1)\,dx.$$

(Remember that $\cot^2 x + 1 = \csc^2 x$.) Thus

$$\int \cot^2 x\,dx = \{-\cot x - x + C\}. \tag{5b}$$

For $n > 2$,

$$\int \cot^n x\,dx = \int \cot^{n-2} x\cot^2 x\,dx = \int \cot^{n-2} x(\csc^2 x - 1)\,dx;$$

and so

$$\int \cot^n x\,dx = -\frac{1}{n-1}\cot^{n-1} x - \int \cot^{n-2} x\,dx. \tag{5c}$$

By repeated applications of (5c), we can reduce our integral to one of the forms (5a) and (5b).

$$\int \sec^n x\,dx, \quad n \text{ even and positive.} \tag{6}$$

For $n = 2$, $\int \sec^2 x\,dx = \{\tan x + C\}$. For $n = 2k$, $k > 1$,

$$\int \sec^n x\,dx = \int \sec^{2k} x\,dx = \int \sec^{2k-2} x\sec^2 x\,dx = \int (1 + \tan^2 x)^{k-1}\sec^2 x\,dx.$$

When we expand $(1 + \tan^2 x)^{k-1}$ by the binomial formula, we get a sum of integrals of the form

$$\int \tan^j x\sec^2 x\,dx = \int u^j\,du.$$

We integrate each of these by the power formula and add the results. For example,

$$\int \sec^6 x\,dx = \int (\sec^2 x)^2\sec^2 x\,dx = \int (1 + \tan^2 x)^2\sec^2 x\,dx = \int (1 + 2\tan^2 x + \tan^4 x)\sec^2 x\,dx = \{\tan x + \tfrac23\tan^3 x + \tfrac15\tan^5 x + C\}.$$

$$\int \csc^n x\,dx, \quad n \text{ even and positive.} \tag{7}$$

This is like (6). For $n = 2$,

$$\int \csc^2 x\,dx = \{-\cot x + C\}.$$

For $n = 2k$, $k > 1$,

$$\int \csc^{2k} x\,dx = \int (\csc^2 x)^{k-1}\csc^2 x\,dx = \int (\cot^2 x + 1)^{k-1}\csc^2 x\,dx = -\int (\cot^2 x + 1)^{k-1}(-\csc^2 x\,dx).$$

When we expand the binomial, we get a sum of integrals of the form

$$\int \cot^j x(-\csc^2 x)\,dx = \int u^j\,du.$$

$$\int \sec^n x\,dx, \quad n \text{ odd and positive.} \tag{8}$$

For $n = 1$, we found that

$$\int \sec x\,dx = \int \frac{\sec x(\sec x + \tan x)}{\sec x + \tan x}\,dx = \int \frac{du}{u},$$

where

$$u = \sec x + \tan x, \qquad du = (\sec x\tan x + \sec^2 x)\,dx.$$

Therefore

$$\int \sec x\,dx = \{\ln|\sec x + \tan x| + C\}.$$

For $n$ odd and greater than 1, we have a problem. For example, in $\int \sec^3 x\,dx$ it does no good to write

$$\sec^2 x\sec x\,dx = (1 + \tan^2 x)\sec x\,dx,$$

because the second term fits no standard form. The solution is obtained by integrating by parts. We have

$$\int \sec^n x\,dx = \int \sec^{n-2} x\sec^2 x\,dx.$$

Let

$$u = \sec^{n-2} x, \qquad dv = \sec^2 x\,dx,$$

$$du = (n-2)\sec^{n-3} x\sec x\tan x\,dx = (n-2)\sec^{n-2} x\tan x\,dx, \qquad v = \tan x.$$

This gives

$$\begin{aligned}
\int \sec^n x\,dx &= \int u\,dv = uv - \int v\,du \\
&= \sec^{n-2} x\tan x - (n-2)\int \sec^{n-2} x\tan^2 x\,dx \\
&= \sec^{n-2} x\tan x - (n-2)\int \sec^{n-2} x(\sec^2 x - 1)\,dx \\
&= \sec^{n-2} x\tan x - (n-2)\int \sec^n x\,dx + (n-2)\int \sec^{n-2} x\,dx.
\end{aligned}$$

Thus, if $I$ is the original integral, we have

$$I = \sec^{n-2} x\tan x - (n-2)I + (n-2)\int \sec^{n-2} x\,dx,$$

and

$$\int \sec^n x\,dx = \frac{1}{n-1}\sec^{n-2} x\tan x + \frac{n-2}{n-1}\int \sec^{n-2} x\,dx.$$

There is a similar reduction formula which works for odd powers of the cosecant:

$$\int \csc^n x\,dx = -\frac{1}{n-1}\csc^{n-2} x\cot x + \frac{n-2}{n-1}\int \csc^{n-2} x\,dx. \tag{9}$$

To derive this, we integrate by parts, taking $u = \csc^{n-2} x$, $dv = \csc^2 x\,dx$, and proceed as in the previous derivation to solve for $\int \csc^n x\,dx$.
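The reduction formula for odd powers of the secant can be applied mechanically until the exponent reaches 1. The following Python sketch, which is not part of the original text, does this recursively and compares the result with a numerical integral. The function names and the sample interval $[0, 1]$ (on which $\sec x + \tan x > 0$) are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def sec_power_antiderivative(n, x):
    """One antiderivative of sec^n x for odd positive n, by the reduction formula."""
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch handles odd positive n only")
    sec, tan = 1 / math.cos(x), math.tan(x)
    if n == 1:
        return math.log(abs(sec + tan))      # the case n = 1
    # Reduction step: exponent n is replaced by n - 2.
    return (sec ** (n - 2) * tan / (n - 1)
            + (n - 2) / (n - 1) * sec_power_antiderivative(n - 2, x))

def reduction_error(n, a=0.0, b=1.0):
    """Gap between F(b) - F(a) from the recursion and a numerical integral."""
    exact = sec_power_antiderivative(n, b) - sec_power_antiderivative(n, a)
    numeric = simpson(lambda x: (1 / math.cos(x)) ** n, a, b)
    return abs(exact - numeric)

errors = {n: reduction_error(n) for n in (1, 3, 5)}
```

For $n = 3$ the recursion reproduces the familiar answer $\tfrac12\sec x\tan x + \tfrac12\ln|\sec x + \tan x|$.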

By these formulas and methods we can integrate products of powers of trigonometric functions.

There is absolutely no need to memorize the formulas which we have


just finished developing. You can handle simple cases by remembering the methods.
If you need to compute an integral, in one of the difficult cases-and this almost never
happens to people, in real life-then you look up the appropriate reduction formula.
Moreover, it isn't even safe to try to memorize complicated formulas: you are very
likely to misremember them and get wrong answers.

PROBLEM SET 6.5

Before starting to work on these problems, you should read Section 6.5 carefully, until
you understand what the methods are and why they work.

In working the problems, you

should refer to the text as seldom as possible. You should try to avoid looking up even the
reduction formulas (8) and (9), unless a problem requires you to apply one of them more
than once. If only one reduction is required, you should integrate by parts, instead of using
the reduction formula. As you will see, the first few problems below are designed to remind
you of the methods that we have been using. Check by differentiation in each case.
1.

4.

7.

10.

J
J
J
J

sin2 x cos3 xdx

2.

cos2xdx

5.

cot4xdx;

8.

sec4 xdx

11.

J
J
J
J

sin3 x cos2 xdx

3.

sin2 x cos2 xdx

6.

tan5 xdx

9.

csc4xdx

12.

J
J
J
J

sin2 x dx

tan4xdx

cot5 xdx

sec3 xdx

13.
x dx
6. x x dx
x 2x dx
19.
22. 2x 2x dx
25. --dx
x
2 . J x dx
30.
J sin

sec

J sin2

cos

sin2
sec

f sec5

17.

J cos

csc

f csc

sec

tan

14.

J cos3

sin

f tan

x dx
x x dx
21.
2x dx
24. I --dx
x
1
27. I --dx
x

x dx
x x dx
20.
2x dx
23. x x dx
1
26. I --dx
x
29. x1 x dx

f csc3

6.6

The Technique of Integration

284

x dx

J sin

sec3

cos2

sin4

cot

A sinn-l

18.

sinx

tan

There is a reduction formula of the form


J sinn

J csc5

f cos2

cos4

csc

15.

x x
cos

+BJ sinn-2

x dx,

where A and Bare constants, expressed, of course, in terms of n. Derive such a formula.

31.

[Hint: It is no use trying to do this merely by the use of the elementary trigonometric

identities relating the sine and the cosine.]

There is a reduction formula of the form

$$\int \cos^n x\,dx = A\cos^{n-1} x\sin x + B\int \cos^{n-2} x\,dx.$$

Derive it.

6.6 INTEGRATION BY SUBSTITUTION

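In Section 6.2 we found that there was a close connection between certain simple integrals and some more complicated ones. For example, if we know that

$$\int x^2\,dx = \{\tfrac13 x^3 + C\},$$

then we know that

$$\int \sin^2\theta\cos\theta\,d\theta = \{\tfrac13\sin^3\theta + C\}.$$

(We are using a different dummy letter in the second problem, for reasons which will soon be clear.) Thus we have two related integration problems:

$$\begin{array}{ccc}
\displaystyle\int x^2\,dx & = & \{\tfrac13 x^3 + C\} \\[4pt]
\downarrow\ {\scriptstyle x \to \sin\theta} & & \downarrow\ {\scriptstyle x \to \sin\theta} \\[4pt]
\displaystyle\int \sin^2\theta\cos\theta\,d\theta & = & \{\tfrac13\sin^3\theta + C\}. \quad (!)
\end{array}$$

The (!) at the bottom indicates that the equation in the bottom line is the final conclusion. The pattern here is the following:

$$\begin{array}{ccc}
\displaystyle\int f(x)\,dx & = & \{F(x) + C\} \\[4pt]
\downarrow\ {\scriptstyle x \to u(\theta)} & & \downarrow\ {\scriptstyle x \to u(\theta)} \\[4pt]
\displaystyle\int f(u)\,du = \int f(u(\theta))u'(\theta)\,d\theta & = & \{F(u(\theta)) + C\}. \quad (!)
\end{array}$$

Thus, if we know how to find $\int f(x)\,dx$, we can use the result to find $\int f(u)\,du$.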

It sometimes happens, however, that we want to move in the opposite direction; sometimes we can see how to calculate

$$\int f(u(\theta))u'(\theta)\,d\theta,$$

and we want to use the result to calculate $\int f(x)\,dx$. So as to give ourselves a simple example to work with at first, let us suppose that we know about the functions Sin and $\mathrm{Sin}^{-1}$, but do not know that $1/\sqrt{1-x^2}$ is the derivative of $\mathrm{Sin}^{-1} x$. We then consider

$$\int \frac{dx}{\sqrt{1-x^2}}.$$

We observe that it does not fit any form that we know. But perhaps it would be manageable if we could extract the indicated square root. For $x = \mathrm{Sin}\,\theta$, the square root can be extracted. (See below.) If we replace the dummy letter $x$ by the function $\mathrm{Sin}\,\theta$, then $dx$ becomes $\mathrm{Sin}'\theta\,d\theta$, and we get the related integrals on the left in the following diagram:

$$\begin{array}{ccc}
\displaystyle\int \frac{dx}{\sqrt{1-x^2}} & = & ? \\[6pt]
\downarrow\ {\scriptstyle x \to \mathrm{Sin}\,\theta} & & \\[4pt]
\displaystyle\int \frac{\cos\theta\,d\theta}{\sqrt{1-\mathrm{Sin}^2\theta}} & = & ?
\end{array}$$

The trigonometric integral is easy:

$$\int \frac{\cos\theta\,d\theta}{\sqrt{1-\mathrm{Sin}^2\theta}} = \int \frac{\cos\theta\,d\theta}{\sqrt{\cos^2\theta}} = \int 1\,d\theta = \{\theta + C\}.$$

(Query: Why is it true that $\sqrt{\cos^2\theta} = \cos\theta$, for the values of $\theta$ that we need to consider? What values of $\theta$ do we need to consider?)

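The above calculation enables us to complete our diagram:

$$\begin{array}{ccc}
\displaystyle\int \frac{dx}{\sqrt{1-x^2}} & = & \{\mathrm{Sin}^{-1} x + C\} \quad (!) \\[6pt]
\downarrow\ {\scriptstyle x \to \mathrm{Sin}\,\theta} & & \uparrow\ {\scriptstyle \theta \to \mathrm{Sin}^{-1} x} \\[4pt]
\displaystyle\int \frac{\cos\theta\,d\theta}{\sqrt{1-\mathrm{Sin}^2\theta}} & = & \{\theta + C\}
\end{array}$$

In this case, of course, the solution in the top line was known before we started. But the same scheme works in general, whenever we can calculate the new integral on the lower left:

$$\begin{array}{ccc}
\displaystyle\int f(x)\,dx & = & \{G(u^{-1}(x)) + C\} \quad (!) \\[6pt]
\downarrow\ {\scriptstyle x \to u(\theta)} & & \uparrow\ {\scriptstyle \theta \to u^{-1}(x)} \\[4pt]
\displaystyle\int f(u)\,du = \int f(u(\theta))u'(\theta)\,d\theta & = & \{G(\theta) + C\}
\end{array}$$

We shall prove, at the end of this section, that this procedure is valid, whenever the symbols $u'$ and $u^{-1}$ have a meaning; that is, whenever $u$ has both a derivative and an inverse. Meanwhile, we shall show how the scheme is used to solve problems which would otherwise be hard.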


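Example 1.

$$\int \frac{dx}{x^2\sqrt{1 - x^2}} = \ ? \qquad (-1 < x < 1,\ x \ne 0).$$

As in the preceding case, it seems to be the radical that is causing the trouble; and so
we get rid of it by the substitution

$$x \to \operatorname{Sin}\theta, \qquad dx \to \operatorname{Sin}'\theta\,d\theta = \cos\theta\,d\theta \qquad \left(-\tfrac{\pi}{2} < \theta < \tfrac{\pi}{2}\right).$$

This gives

$$\int \frac{dx}{x^2\sqrt{1 - x^2}} \to \int \frac{\cos\theta\,d\theta}{\operatorname{Sin}^2\theta\,\cos\theta}
= \int \csc^2\theta\,d\theta = \{-\cot\theta + C\}.$$

(Throughout, $-\pi/2 < \theta < \pi/2$; on this interval, $\sin x = \operatorname{Sin} x$, and the usual
identities hold automatically.)

We now reverse the substitution, using $\theta \to \operatorname{Sin}^{-1}x$. This gives

$$\int \frac{dx}{x^2\sqrt{1 - x^2}} = \{-\cot \operatorname{Sin}^{-1}x + C\}.$$

A formula for $\cot \operatorname{Sin}^{-1}x$ is easy to read off from a figure.

[Figure: right triangle with angle $\theta$, hypotenuse 1, opposite side $x$, adjacent side $\sqrt{1 - x^2}$.]

Here

$$\operatorname{Sin}\theta = x, \qquad \theta = \operatorname{Sin}^{-1}x; \qquad
\cot \operatorname{Sin}^{-1}x = \cot\theta = \frac{\sqrt{1 - x^2}}{x}.$$

Therefore

$$\int \frac{dx}{x^2\sqrt{1 - x^2}} = \left\{-\frac{\sqrt{1 - x^2}}{x} + C\right\}.$$

Note that all the trigonometry has cancelled out of the problem. Our answer checks:

$$D\left(-\frac{\sqrt{1 - x^2}}{x}\right)
= -\frac{x\cdot\dfrac{-x}{\sqrt{1 - x^2}} - \sqrt{1 - x^2}}{x^2}
= \frac{-1}{x^2\sqrt{1 - x^2}}\left[-x^2 - \left(\sqrt{1 - x^2}\right)^2\right]
= \frac{1}{x^2\sqrt{1 - x^2}}.$$

We can sum this up in a diagram as follows:

$$\begin{array}{ccc}
\int \dfrac{dx}{x^2\sqrt{1 - x^2}} & \overset{(!)}{=} & \left\{-\dfrac{\sqrt{1 - x^2}}{x} + C\right\} \\
\downarrow\ x \to \operatorname{Sin}\theta & & \uparrow\ \theta \to \operatorname{Sin}^{-1}x \\
\int \csc^2\theta\,d\theta & = & \{-\cot\theta + C\}
\end{array}$$

The substitution $x \to \operatorname{Sin}\theta$ is the usual one to try, if the troublesome part of the
integrand is $\sqrt{1 - x^2}$. In other cases, $x \to \operatorname{Tan}\theta$ works in much the same way.

Example 2. Consider

$$\int \sqrt{1 + x^2}\,dx.$$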

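To get rid of the radical, we use

$$x \to \operatorname{Tan}\theta \quad \left(-\tfrac{\pi}{2} < \theta < \tfrac{\pi}{2}\right), \qquad dx \to \sec^2\theta\,d\theta.$$

This gives

$$\int \sqrt{1 + x^2}\,dx \to \int \sqrt{1 + \operatorname{Tan}^2\theta}\,\sec^2\theta\,d\theta.$$

The domain of $\operatorname{Tan}$ is the interval $(-\pi/2, \pi/2)$, on which $\sec\theta > 0$. Therefore
$\sec\theta = \sqrt{1 + \operatorname{Tan}^2\theta}$, and

$$\int \sqrt{1 + \operatorname{Tan}^2\theta}\,\sec^2\theta\,d\theta = \int \sec^3\theta\,d\theta.$$

We now use one of the reduction formulas of Section 6.5:

$$\begin{aligned}
\int \sec^3\theta\,d\theta &= \int \sec^n\theta\,d\theta \qquad (n = 3) \\
&= \frac{1}{n-1}\sec^{n-2}\theta\,\tan\theta + \frac{n-2}{n-1}\int \sec^{n-2}\theta\,d\theta \\
&= \tfrac12 \sec\theta\,\tan\theta + \tfrac12 \int \sec\theta\,d\theta \\
&= \{\tfrac12 \sec\theta\,\tan\theta + \tfrac12 \ln|\sec\theta + \tan\theta| + C\} \\
&= \{G(\theta) + C\}.
\end{aligned}$$

We complete the solution by letting $\theta \to \operatorname{Tan}^{-1}x$. This gives

$$\begin{aligned}
\int \sqrt{1 + x^2}\,dx &= \{G(\operatorname{Tan}^{-1}x) + C\} \\
&= \{\tfrac12 (\sec \operatorname{Tan}^{-1}x)(\tan \operatorname{Tan}^{-1}x) + \tfrac12 \ln|\sec \operatorname{Tan}^{-1}x + \tan \operatorname{Tan}^{-1}x| + C\}.
\end{aligned}$$

Obviously $\tan \operatorname{Tan}^{-1}x = x$. The formula for $\sec \operatorname{Tan}^{-1}x$ can be read off from a
figure.

[Figure: the point $P$ with abscissa 1 and ordinate $x$, the segment $OP$ of length $r$, and the angle $\theta$ from the $x$-axis to $OP$.]

In the figure, $-\pi/2 < \theta < \pi/2$, but $\theta$ may be positive or negative. We take the abscissa
of $P$ to be 1. This gives

$$x = \operatorname{Tan}\theta, \qquad \theta = \operatorname{Tan}^{-1}x, \qquad r = \sec\theta = \sec \operatorname{Tan}^{-1}x,$$

so that

$$\sec \operatorname{Tan}^{-1}x = OP = \sqrt{1 + x^2}.$$

Therefore the answer is

$$\int \sqrt{1 + x^2}\,dx = \{\tfrac12 x\sqrt{1 + x^2} + \tfrac12 \ln|\sqrt{1 + x^2} + x| + C\}.$$

This can be simplified slightly: since $\sqrt{1 + x^2} + x > 0$ for every $x$, we can omit the
absolute value bars, getting:

$$\int \sqrt{1 + x^2}\,dx = \{\tfrac12 x\sqrt{1 + x^2} + \tfrac12 \ln(\sqrt{1 + x^2} + x) + C\}.$$

As before, we sum up in a diagram the process by which the problem was solved:

$$\begin{array}{ccc}
\int \sqrt{1 + x^2}\,dx & \overset{(!)}{=} & \{\tfrac12 x\sqrt{1 + x^2} + \tfrac12 \ln(\sqrt{1 + x^2} + x) + C\} \\
\downarrow\ x \to \operatorname{Tan}\theta & & \uparrow\ \theta \to \operatorname{Tan}^{-1}x \\
\int \sec^3\theta\,d\theta & = & \{\tfrac12 \sec\theta\,\tan\theta + \tfrac12 \ln|\sec\theta + \tan\theta| + C\}
\end{array}$$

Such diagrams are worth drawing, especially the first few times you use the
substitution process; often the calculations are long, and it is easy to lose track of
what the process means.
The answer in Example 2 suggests that no method would have made the problem
seem easy. Note that the formulas of Section 6.5 are turning out to be useful in solving
problems which do not appear, at first, to involve trigonometry at all.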
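The two answers obtained in Examples 1 and 2 can also be checked numerically. The following is a minimal sketch, not part of the text: it compares a symmetric difference quotient of each claimed antiderivative with the integrand at a few sample points. The helper names (`central_diff`, `max_error`) and the choice of sample points are assumptions made for illustration.

```python
import math

def central_diff(F, x, h=1e-6):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Example 1: claimed antiderivative of 1 / (x^2 sqrt(1 - x^2)), 0 < |x| < 1.
def F1(x):
    return -math.sqrt(1 - x * x) / x

def f1(x):
    return 1 / (x * x * math.sqrt(1 - x * x))

# Example 2: claimed antiderivative of sqrt(1 + x^2).
def F2(x):
    r = math.sqrt(1 + x * x)
    return 0.5 * x * r + 0.5 * math.log(r + x)

def f2(x):
    return math.sqrt(1 + x * x)

def max_error(F, f, points):
    """Largest gap between the difference quotient of F and f on the points."""
    return max(abs(central_diff(F, x) - f(x)) for x in points)

err1 = max_error(F1, f1, [-0.8, -0.3, 0.2, 0.5, 0.9])
err2 = max_error(F2, f2, [-3.0, -1.0, 0.0, 0.7, 2.5])
print(err1, err2)
```

Both errors should be tiny compared with the size of the integrands; a check of this kind does not replace the check by differentiation, but it catches sign and coefficient slips quickly.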
We return to the general theory, to see why this method works. The pattern of
our work is described by the diagram:

$$\begin{array}{ccc}
\int f(x)\,dx & \overset{(!)}{=} & \{G(u^{-1}(x)) + C\} \\
\downarrow\ x \to u(\theta) & & \uparrow\ \theta \to u^{-1}(x) \\
\int f(u(\theta))\,u'(\theta)\,d\theta & = & \{G(\theta) + C\}
\end{array}$$

What we are claiming, when we use the method of substitution, is that, if the second
equation holds, so does the first. In terms of the definition of the indefinite integral,
this means the following:

Theorem 1. If $u$ is differentiable and invertible, then

$$G' = f(u)u' \quad\Rightarrow\quad D[G(u^{-1})] = f.$$

The proof is as follows. By the chain rule,

$$D[G(u^{-1})] = G'(u^{-1})\,Du^{-1}.$$

By hypothesis, $G' = f(u)u'$. Therefore

$$G'(x) = f(u(x))\,u'(x)$$

for every $x$. Therefore

$$G'(u^{-1}) = f(u(u^{-1}))\,u'(u^{-1}),$$

and

$$D[G(u^{-1})] = G'(u^{-1})\,Du^{-1} = f(u(u^{-1}))\,u'(u^{-1})\,Du^{-1}.$$

Now $f(u(u^{-1})) = f$, because $u(u^{-1}(x)) = x$ for every $x$. Therefore

$$f(u(u^{-1}))\,u'(u^{-1})\,Du^{-1} = f\,u'(u^{-1})\,Du^{-1}.$$

Now

$$Du^{-1} = \frac{1}{u'(u^{-1})},$$

by the general formula for the derivative of the inverse of a function. Therefore $u'(u^{-1})$
cancels, and gives us

$$D[G(u^{-1})] = f,$$

which was to be proved.
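Theorem 1 can be illustrated numerically. The sketch below is only an illustration under assumptions of our own: the substitution $u(t) = t^3 + t$, the integrand $f = \cos$, and the function $G(t) = \sin(u(t))$ are hypothetical choices (not taken from the text) for which $G' = f(u)u'$ holds. The inverse $u^{-1}$ is computed by bisection, and the derivative of $G(u^{-1})$ is approximated by a difference quotient.

```python
import math

def u(t):
    """A differentiable, strictly increasing (hence invertible) substitution."""
    return t ** 3 + t

def u_prime(t):
    return 3 * t * t + 1

def f(x):
    return math.cos(x)

def G(t):
    # G'(t) = cos(u(t)) * u'(t) = f(u(t)) u'(t), the hypothesis of Theorem 1.
    return math.sin(u(t))

def u_inverse(x, lo=-10.0, hi=10.0):
    """Solve u(t) = x by bisection; valid while u(lo) <= x <= u(hi)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if u(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def derivative(F, x, h=1e-5):
    """Symmetric difference quotient approximating F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Hypothesis: G' agrees with f(u)u' at some sample values of t.
hyp_gap = max(abs(derivative(G, t) - f(u(t)) * u_prime(t))
              for t in [-1.2, -0.4, 0.0, 0.6, 1.1])

# Conclusion: D[G(u^{-1})] agrees with f at some sample values of x.
H = lambda x: G(u_inverse(x))
concl_gap = max(abs(derivative(H, x) - f(x)) for x in [-2.0, -0.5, 0.0, 1.0, 3.0])
print(hyp_gap, concl_gap)
```

Bisection is used only because it needs nothing beyond the fact that $u$ is increasing; any reliable way of evaluating $u^{-1}$ would serve.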

PROBLEM SET 6.6


Calculate each of the following integrals, by any method. In most cases, but not all,
the easiest method is to use a substitution of the form $x \to \operatorname{Sin}\theta$, $x \to \operatorname{Tan}\theta$, or
$x \to \operatorname{Sec}\theta$.

In each case where you do use the method of substitution, you should sum up the process

of solution in a diagram as in Examples


differentiation.

1.
4.

7.
10.

13.

16.

19.

22.

J
J
J
J
J
J
J
J

(1 - x2)-3!2dx

2.

dx
dx
x(l + x2)

5.

dx
x2(1 + x2)

8.

dx
x2v' x2 - 1
x2dx

14.

Vl - x2
x2v'l
xv'l

11.

x2 dx
x2dx

x2v' x2 - 1dx

17.
20.

and 2 in the text. Finally, check in each case by

dx
v' x2

J
J
J
J
J

dx

dx
xv' x2 - 1

xdx
+ x2
xdx
x3dx

v'l - x2dx
+

12.

15.

v'1 - x2

x3 v' 1

6.

9.

v' x2 - 1

3.

x2dx

18.

21.

J
J
J
J
J
J
J

dx
v' x2 - 1
x(l - x2)-3f2dx
Vl - x2dx
x2v'1 - x2dx
x2dx
x2

1 +

(1

: x2)3dx
x3

dx
x2

23. a) Show that

$$\int_a^b f(u(\theta))\,u'(\theta)\,d\theta = \int_{u(a)}^{u(b)} f(x)\,dx,$$

whether $u$ is invertible or not.

b) Show that if $u$ is invertible, then

$$\int_c^d f(x)\,dx = \int_{u^{-1}(c)}^{u^{-1}(d)} f(u(\theta))\,u'(\theta)\,d\theta.$$

24. Obviously there is no point in writing this on a paper which is to be turned in and
graded; but for your own benefit, reproduce the proof of the following, without reference
to the text:

$$G' = f(u)u' \quad\Rightarrow\quad D[G(u^{-1})] = f.$$

6.7 ALGEBRAIC SUBSTITUTIONS

It is a good rule, if you have a problem which you don't see how to solve, to try to
think of an easier problem that resembles it. If you can solve the easier problem, and
bridge the gap between the two, then you have solved the problem which you started
with. For example, consider

$$\int \frac{x^2}{\sqrt{2x + 1}}\,dx.$$

This does not fit any of the standard forms that we know about. There is no reason to
suppose that a trigonometric substitution would help; and in fact, none of them
would. We note, however, that if the denominator were of the form $\sqrt{x}$, or $\sqrt{t}$, then
the problem would become easier. Now

$$2x + 1 = t \quad\Leftrightarrow\quad x = \tfrac12 (t - 1).$$

We therefore try the substitution

$$x \to u(t) = \tfrac12 (t - 1), \qquad dx \to u'(t)\,dt = \tfrac12\,dt.$$

Under this substitution,

$$\int \frac{x^2\,dx}{\sqrt{2x + 1}} \to \int \frac{\tfrac14 (t - 1)^2}{\sqrt{t}}\cdot\tfrac12\,dt.$$

The latter integral is easy to calculate. It is

$$\begin{aligned}
\tfrac18 \int \frac{t^2 - 2t + 1}{\sqrt{t}}\,dt
&= \tfrac18 \int (t^{3/2} - 2t^{1/2} + t^{-1/2})\,dt \\
&= \{\tfrac18 (\tfrac25 t^{5/2} - 2\cdot\tfrac23 t^{3/2} + 2t^{1/2}) + C\} \\
&= \{\tfrac1{20} t^{5/2} - \tfrac16 t^{3/2} + \tfrac14 t^{1/2} + C\} \\
&= \{G(t) + C\}.
\end{aligned}$$

To get the answer to the problem which we started with, we use the inverse substitution

$$t \to u^{-1}(x) = 2x + 1.$$

This gives

$$\begin{aligned}
\int \frac{x^2\,dx}{\sqrt{2x + 1}}
&= \{\tfrac1{20}(2x + 1)^{5/2} - \tfrac16 (2x + 1)^{3/2} + \tfrac14 (2x + 1)^{1/2} + C\} \\
&= \{G(u^{-1}(x)) + C\}.
\end{aligned}$$

The scheme here is the same as in the preceding section:

$$\begin{array}{ccc}
\int \dfrac{x^2}{\sqrt{2x + 1}}\,dx & \overset{(!)}{=} & \{G(u^{-1}(x)) + C\} \\
\downarrow\ x \to u(t) & & \uparrow\ t \to u^{-1}(x) \\
\int \dfrac{t^2 - 2t + 1}{8\sqrt{t}}\,dt & = & \{G(t) + C\}
\end{array}$$

The only differences are that (a) the functions $u$ and $u^{-1}$ are described algebraically,
and (b) the formulas for $G(t)$ and $G(u^{-1}(x))$ are too long to be conveniently written in
the diagram. In any case, we know that the method works: this follows from Theorem
1 of Section 6.6.

Often we can tell that a substitution is going to work, long before we know what the
answer is. As soon as we wrote

$$\int \frac{x^2\,dx}{\sqrt{2x + 1}} \to \int \frac{\tfrac14 (t - 1)^2\cdot\tfrac12\,dt}{\sqrt{t}},$$

it was evident that the numerator was a polynomial. We can integrate the quotient
of a polynomial and a power. Similarly, we know that we can integrate

f (x3 - 3x/ +4)2 dx'.

x2 3

the calculation will be tedious, but the outcome is not in doubt.


If one algebraic substitution works, there are usually others that also work.
In the preceding problem, we might have used

$$y^2 = 2x + 1, \qquad x = \tfrac12 (y^2 - 1).$$

Thus we use

$$x \to u(y) = \tfrac12 (y^2 - 1), \qquad dx \to u'(y)\,dy = y\,dy, \qquad
x^2 \to \tfrac14 (y^2 - 1)^2, \qquad \sqrt{2x + 1} \to y,$$

and get

$$\int \frac{\tfrac14 (y^2 - 1)^2}{y}\,y\,dy = \tfrac14 \int (y^4 - 2y^2 + 1)\,dy
= \{\tfrac1{20} y^5 - \tfrac16 y^3 + \tfrac14 y + C\}.$$

As usual, we now reverse the substitution, using

$$y \to u^{-1}(x) = \sqrt{2x + 1}.$$

This gives the final answer

$$\{\tfrac1{20}(2x + 1)^{5/2} - \tfrac16 (2x + 1)^{3/2} + \tfrac14 (2x + 1)^{1/2} + C\},$$

as before.
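A substitution can also be carried through a definite integral, with the limits changed to match (compare Problem 23 of Problem Set 6.6). The sketch below is an illustration under stated assumptions: the interval $0 \le x \le 4$ is our own choice, and `simpson` is a hypothetical helper implementing the composite Simpson rule. It compares the integral in $x$, the transformed integral in $t = 2x + 1$, and the value $G(9) - G(1)$ given by the antiderivative found above.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def integrand_x(x):
    return x * x / math.sqrt(2 * x + 1)

def integrand_t(t):
    # After x -> (t - 1)/2 and dx -> dt/2.
    return (t - 1) ** 2 / (8 * math.sqrt(t))

def G(t):
    """The antiderivative in t obtained in the text (with C = 0)."""
    return t ** 2.5 / 20 - t ** 1.5 / 6 + math.sqrt(t) / 4

in_x = simpson(integrand_x, 0.0, 4.0)   # x runs from 0 to 4
in_t = simpson(integrand_t, 1.0, 9.0)   # t = 2x + 1 runs from 1 to 9
exact = G(9.0) - G(1.0)                 # 8.4 - 2/15 = 124/15
print(in_x, in_t, exact)
```

The three numbers should agree to many decimal places, which is what the method of substitution predicts.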

There are no rules which tell us the best substitution to try in every case. The
best approach is to look at the integrand, ask ourselves what feature of it is most
troublesome, and then choose a substitution which seems likely to remove the
troublesome feature. For example, if we want to calculate

$$\int \frac{dx}{\sqrt{x^2 + 1}},$$

we want to extract the square root; we can do this if

$$x \to \operatorname{Tan}\theta \quad \left(-\tfrac{\pi}{2} < \theta < \tfrac{\pi}{2}\right), \qquad x^2 + 1 \to \sec^2\theta.$$

This works:

$$\int \frac{dx}{\sqrt{x^2 + 1}} \to \int \frac{\sec^2\theta\,d\theta}{\sec\theta} = \int \sec\theta\,d\theta,$$

which leads to a solution, as you found in Problem 2 of Problem Set 6.6.

We might also have tried

$$z^2 = x^2 + 1, \qquad x \to u(z) = \sqrt{z^2 - 1}, \qquad dx \to \frac{z}{\sqrt{z^2 - 1}}\,dz.$$

This gives

$$\int \frac{dx}{\sqrt{x^2 + 1}} \to \int \frac{1}{z}\cdot\frac{z}{\sqrt{z^2 - 1}}\,dz = \int \frac{dz}{\sqrt{z^2 - 1}},$$

which gets us nowhere, unless we happen to remember the solution of Problem 3 of
Problem Set 6.6.

Usually, to find out what algebraic substitution is going to work, we need to
solve an algebraic equation. For example, given

$$\int \frac{dz}{1 + \sqrt{z}},$$

we wish that the denominator had merely the form $t$. To get

$$1 + \sqrt{z} \to t,$$

we need

$$z \to (t - 1)^2.$$

We usually write this with "=" signs:

$$1 + \sqrt{z} = t, \qquad \sqrt{z} = t - 1, \qquad z = (t - 1)^2.$$

This gives

$$z \to u(t) = (t - 1)^2, \qquad dz \to u'(t)\,dt = 2(t - 1)\,dt,$$

$$\int \frac{dz}{1 + \sqrt{z}} \to \int \frac{2(t - 1)\,dt}{t} = \int \left(2 - \frac{2}{t}\right)dt = \{2t - 2\ln|t| + C\}.$$

The reverse substitution

$$t \to u^{-1}(z) = 1 + \sqrt{z},$$

gives the final answer

$$\{2(1 + \sqrt{z}) - 2\ln(1 + \sqrt{z}) + C\}.$$

(Query: Would it be all right to delete the "1" in the first parenthesis?)

This is probably the most efficient solution. If we hadn't thought of it, we might
have tried

$$\sqrt{z} = t,$$

which gives the substitution

$$z \to u(t) = t^2, \qquad dz \to u'(t)\,dt = 2t\,dt,$$

$$\int \frac{dz}{1 + \sqrt{z}} \to \int \frac{2t\,dt}{1 + t}.$$

Dividing $1 + t$ into $t$, we get

$$\frac{t}{1 + t} = 1 - \frac{1}{1 + t}.$$

Therefore

$$2\int \frac{t\,dt}{1 + t} = 2\int \left(1 - \frac{1}{1 + t}\right)dt = \{2t - 2\ln|1 + t| + C\}.$$

Finally we apply the inverse substitution

$$t \to u^{-1}(z) = \sqrt{z},$$

getting

$$\int \frac{dz}{1 + \sqrt{z}} = \{2\sqrt{z} - 2\ln(1 + \sqrt{z}) + C\}.$$

(Is this really the same as the previous answer? Why or why not?)
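We have used the substitutions $x \to \operatorname{Sin}\theta$, $x \to \operatorname{Tan}\theta$, and $x \to \operatorname{Sec}\theta$ to handle
integrands involving the radicals $\sqrt{1 - x^2}$, $\sqrt{1 + x^2}$, and $\sqrt{x^2 - 1}$.
Slight variations enable us to take care of more general cases, involving
$\sqrt{a^2 - x^2}$, $\sqrt{a^2 + x^2}$, and $\sqrt{x^2 - a^2}$. For example, to find

$$\int \sqrt{a^2 - x^2}\,dx \qquad (a > 0),$$

we use $x \to a\operatorname{Sin}\theta = u(\theta)$, so that

$$a^2 - x^2 \to a^2(1 - \operatorname{Sin}^2\theta), \qquad \sqrt{a^2 - x^2} \to a\cos\theta.$$

Here $x/a = \operatorname{Sin}\theta$, so that

$$\theta = \operatorname{Sin}^{-1}\frac{x}{a}.$$

Thus

$$\begin{aligned}
\int \sqrt{a^2 - x^2}\,dx &\to \int (a\cos\theta)\,a\cos\theta\,d\theta = \int a^2\cos^2\theta\,d\theta \\
&= \int a^2\cdot\tfrac12 (\cos 2\theta + 1)\,d\theta \\
&= \left\{\frac{a^2}{2}\left(\tfrac12 \sin 2\theta + \theta\right) + C\right\}.
\end{aligned}$$

Now

$$\sin 2\theta = 2\sin\theta\cos\theta \to 2\cdot\frac{x}{a}\cdot\frac{1}{a}\sqrt{a^2 - x^2}.$$

This gives

$$\int \sqrt{a^2 - x^2}\,dx = \left\{\tfrac12 x\sqrt{a^2 - x^2} + \frac{a^2}{2}\operatorname{Sin}^{-1}\frac{x}{a} + C\right\}.$$

In the same way, we use

$$x \to a\operatorname{Tan}\theta,$$

to get rid of $\sqrt{a^2 + x^2}$; and we use

$$x \to a\operatorname{Sec}\theta,$$

to get rid of $\sqrt{x^2 - a^2}$.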
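The formula for $\int \sqrt{a^2 - x^2}\,dx$ has a geometric check: from 0 to $a$ the integral is the area of a quarter of a disc of radius $a$. The following sketch, with the hypothetical helper names `F`, `quarter_disc`, and `midpoint_rule` and the arbitrary choice $a = 3$, tests this, and also compares the formula with a crude numerical integration over $[0, a/2]$.

```python
import math

def F(x, a):
    """Antiderivative of sqrt(a^2 - x^2) obtained above (with C = 0)."""
    return 0.5 * x * math.sqrt(a * a - x * x) + 0.5 * a * a * math.asin(x / a)

def quarter_disc(a):
    """Integral from 0 to a; this should be a quarter of the disc area."""
    return F(a, a) - F(0.0, a)

def midpoint_rule(g, lo, hi, n=200000):
    """Midpoint approximation to the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

a = 3.0
by_formula = F(a / 2, a) - F(0.0, a)
numeric = midpoint_rule(lambda x: math.sqrt(a * a - x * x), 0.0, a / 2)
print(quarter_disc(a), math.pi * a * a / 4, by_formula, numeric)
```

The numerical comparison stops at $a/2$ because the integrand has a vertical tangent at $x = a$, where simple rules converge slowly.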

There are miscellaneous substitutions which work on miscellaneous problems.
For example, in

$$\int \frac{dx}{x^2(x^2 + 1)},$$

the trouble seems to be that the integrand is concentrated in its own denominator.
We ought to be able to correct this by letting

$$x \to u(t) = \frac{1}{t}, \qquad dx \to -\frac{1}{t^2}\,dt.$$

This gives

$$\begin{aligned}
\int \frac{dx}{x^2(x^2 + 1)} &\to \int \frac{-t^2}{1 + t^2}\,dt = \int \left(-1 + \frac{1}{1 + t^2}\right)dt
= \{-t + \operatorname{Tan}^{-1}t + C\} \\
&\to \left\{-\frac{1}{x} + \operatorname{Tan}^{-1}\frac{1}{x} + C\right\}.
\end{aligned}$$

Here, in the last step, we have applied the inverse substitution

$$t \to u^{-1}(x) = \frac{1}{x}.$$

In writing up solutions of problems, in the following problem set, you need not
draw diagrams of the form:

$$\begin{array}{ccc}
\int f(x)\,dx & \overset{(!)}{=} & \{G(u^{-1}(x)) + C\} \\
\downarrow\ x \to u(t) & & \uparrow\ t \to u^{-1}(x) \\
\int f(u(t))\,u'(t)\,dt & = & \{G(t) + C\}
\end{array}$$

But whenever you use a substitution, you should explain what you are doing, by
writing formulas of the type

$$x \to u(t).$$

PROBLEM SET 6.7

Calculate the following, by any method.


1.
4.

7.
10.

13.
16.
19.
22.
25.
28.
31.

J
J
J
J
J
J
J
J
J
J
J

dx

2.

V:x-)3

(I +

(a2 + x2)-af2dx

5.

dx

8.

l + x
z3dz

11.

vz + I
dx

14.

l + vx
(I +

vx)3vxdx

17.

dx

20.

Yl + e2"'
x Sin-1xdx

23.

l
dx
(1 + vx)2

26.

29.

dx
v1 +fix
dx

32.

x3(1 - x)

J
J
J
J

vx
(1 + Vx)3

3.

dx

dx

6.

v1 + e"'
dx

9.

(1 - x)2
z3dz
Vz2

12.

dx

J
J
J
J
J
J :
J
x4(x

15.

I)

dx

18.

Sin-1xdx

21.

Tan-1 xdx

24.

1
dx
vx(l + vx)2

27.

dx
e"')2

30.

(l

33.

x2 ln xdx

J
J
J Yvx
J
J
J
J
J
J
J
J
(a2

x2)-af2dx

dx

v1 - e"'
dx
+ 1

dx

YI + 'x

(l - x2)4dx
dx

(1 + e"')4

xln xdx
xTan-1 xdx
1

l + fix

dx

dx

v 1 + e3X

x2Tan-1xdx

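6.8 ALGEBRAIC DEVICES: COMPLETING THE SQUARE AND PARTIAL FRACTIONS

In Section 6.6 we used trigonometric substitutions to calculate integrals involving
$\sqrt{a^2 \pm x^2}$ and $\sqrt{x^2 - a^2}$. By completing the square, we can extend these methods so
as to take care of expressions of the form $\sqrt{ax^2 + bx + c}$. For example,

$$x^2 + x + 1 = x^2 + x + \tfrac14 + \tfrac34 = (x + \tfrac12)^2 + \left(\frac{\sqrt3}{2}\right)^2.$$

Therefore

$$\int \frac{dx}{\sqrt{x^2 + x + 1}} = \int \frac{dx}{\sqrt{(x + \tfrac12)^2 + (\sqrt3/2)^2}},$$

which has the form

$$\int \frac{du}{\sqrt{u^2 + a^2}}.$$

We can calculate this by the substitution

$$u \to a\operatorname{Tan}\theta.$$

This gives

$$\{\ln|\sqrt{u^2 + a^2} + u| + C\} = \{\ln|\sqrt{x^2 + x + 1} + x + \tfrac12| + C\}.$$

Similarly,

$$x^2 + x = x^2 + x + \tfrac14 - \tfrac14 = (x + \tfrac12)^2 - (\tfrac12)^2,$$

and so

$$\int \frac{dx}{\sqrt{x^2 + x}} = \int \frac{dx}{\sqrt{(x + \tfrac12)^2 - (\tfrac12)^2}},$$

which has the form

$$\int \frac{du}{\sqrt{u^2 - a^2}}.$$

Here we would use

$$u \to a\operatorname{Sec}\theta,$$

and proceed as in Section 6.6.
The following simple-looking problem has a curious solution:

$$\int \frac{dx}{x^2 - 1} = \ ?$$

We try

$$x \to \operatorname{Sec}\theta \qquad (x > 1),$$

so that

$$dx \to \sec\theta\tan\theta\,d\theta,$$

giving

$$\int \frac{dx}{x^2 - 1} \to \int \frac{\sec\theta\tan\theta\,d\theta}{\tan^2\theta} = \int \csc\theta\,d\theta = \{-\ln|\csc\theta + \cot\theta| + C\}.$$

Now

$$\csc\theta + \cot\theta \to \frac{x}{\sqrt{x^2 - 1}} + \frac{1}{\sqrt{x^2 - 1}} = \frac{x + 1}{\sqrt{x^2 - 1}}
= \frac{x + 1}{\sqrt{(x + 1)(x - 1)}} = \sqrt{\frac{x + 1}{x - 1}} \qquad (x > 1).$$

This gives the answer

$$\int \frac{dx}{x^2 - 1} = \left\{\tfrac12 \ln\left|\frac{x - 1}{x + 1}\right| + C\right\} \qquad (x > 1).$$

We check by differentiation:

$$\begin{aligned}
D\left(\tfrac12 \ln\left|\frac{x - 1}{x + 1}\right|\right) &= D(\tfrac12 \ln|x - 1| - \tfrac12 \ln|x + 1|) \\
&= \frac12\cdot\frac{1}{x - 1} - \frac12\cdot\frac{1}{x + 1}
= \frac12\cdot\frac{(x + 1) - (x - 1)}{(x - 1)(x + 1)} = \frac{1}{x^2 - 1}.
\end{aligned}$$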

This shows that our answer was right. But it also shows that our use of trigonometry
was unnecessary; the solution depends merely on the algebraic identity

$$\frac{1}{x^2 - 1} = \frac{\tfrac12}{x - 1} - \frac{\tfrac12}{x + 1}.$$

This suggests that we should have a systematic method of breaking up rational
functions into sums of simpler functions. We call this the method of partial fractions.

Theorem 1. If $a \ne b$, then there are numbers $A$ and $B$ such that

$$\frac{cx + d}{(x - a)(x - b)} = \frac{A}{x - a} + \frac{B}{x - b} \qquad (x \ne a, b).$$

Proof. The obvious method works:

$$\begin{aligned}
\frac{cx + d}{(x - a)(x - b)} = \frac{A}{x - a} + \frac{B}{x - b}
&\Leftrightarrow\ cx + d = A(x - b) + B(x - a) \\
&\Leftrightarrow\ A + B = c \quad\text{and}\quad Ab + Ba = -d.
\end{aligned}$$

We solve for $A$ and $B$ by any method, getting the solution

$$A = \frac{ac + d}{a - b}, \qquad B = \frac{bc + d}{b - a}.$$

These values satisfy both equations.

Note that since $a - b$ appears in the denominators, we really needed the hypothesis
$a \ne b$. And for $a = b$, the theorem is false. That is, you cannot express
$1/(x - a)^2$ in the form

$$\frac{A}{x - a} + \frac{B}{x - a} = \frac{A + B}{x - a}.$$

It might seem that we should have stated a stronger theorem, as follows:

Theorem 1'. If $a \ne b$, then

$$\frac{cx + d}{(x - a)(x - b)} = \frac{ac + d}{a - b}\cdot\frac{1}{x - a} + \frac{bc + d}{b - a}\cdot\frac{1}{x - b}.$$

But nobody could remember this formula. The efficient way to handle such
problems is the following. Given

$$\int \frac{dx}{(x - 2)(x - 5)} = \ ?$$

we know by Theorem 1 that there are numbers $A$ and $B$ such that

$$\frac{1}{(x - 2)(x - 5)} = \frac{A}{x - 2} + \frac{B}{x - 5}.$$

The only problem is to find out what they are, numerically. We first write

$$1 = A(x - 5) + B(x - 2).$$

Since this equation holds for every $x$, it must hold for $x = 2$ and for $x = 5$. Therefore

$$1 = A\cdot(-3), \qquad 1 = B\cdot 3; \qquad A = -\tfrac13, \qquad B = \tfrac13.$$

This is another example of the efficiency of existence theorems: often, if you
know in advance that a problem has a solution, you can use a simple procedure to
find out what the answer is. Without Theorem 1, the shortcut calculation of

$$\frac{1}{(x - 2)(x - 5)} = -\frac13\cdot\frac{1}{x - 2} + \frac13\cdot\frac{1}{x - 5}$$

would not have been valid. To see this, consider the following analogous procedure:

"Problem." Find the numbers $a$ and $b$ such that

$$\sin x = ax + b.$$

"Solution." Letting $x = 0$, we get

$$0 = a\cdot 0 + b.$$

Therefore $b = 0$, and

$$\sin x = ax.$$

Letting $x = \pi/2$, we get

$$1 = a\cdot\frac{\pi}{2},$$

and $a = 2/\pi$. Therefore

$$(?) \qquad \sin x = \frac{2x}{\pi} \quad\text{for every } x. \qquad (?)$$

This is wrong: in fact, our formula is correct for only three values of $x$, namely,
$x = 0$ and $x = \pm\pi/2$. The fallacy was in assuming at the outset that the problem
had a solution, when in fact it has none. What the above line of reasoning really
proves is the following:

(1) $\sin$ is a linear function $\Rightarrow$ (2) $\sin$ is the linear function $2x/\pi$.

The statement (1) $\Rightarrow$ (2) is true, but it is not useful, because (1) is false.
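The failure of the "solution" is easy to see numerically. The sketch below, an illustration only, tabulates $\sin x$ against the candidate $2x/\pi$ on a grid of our own choosing (multiples of $\pi/12$ between $-\pi/2$ and $\pi/2$) and collects the grid points at which the two agree; the names `line`, `grid`, `agree`, and `worst` are hypothetical.

```python
import math

def line(x):
    """The candidate 'solution' 2x/pi from the fallacious argument."""
    return 2 * x / math.pi

# Grid from -pi/2 to pi/2 in steps of pi/12.
grid = [k * math.pi / 12 for k in range(-6, 7)]

# Grid points where sin x and 2x/pi agree up to rounding error.
agree = [x for x in grid if math.isclose(math.sin(x), line(x), abs_tol=1e-12)]

# Largest disagreement found on the grid.
worst = max(abs(math.sin(x) - line(x)) for x in grid)
print(len(agree), worst)
```

On this grid the two functions agree only at the end points and at 0, and elsewhere they differ by as much as about 0.2.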

The method that we used for quadratic denominators also works whenever the
denominator can be factored into linear factors all of which are different.

$$\frac{2x^2 + 1}{(x + 1)(x + 2)(x + 3)} = \frac{A}{x + 1} + \frac{B}{x + 2} + \frac{C}{x + 3},$$

$$2x^2 + 1 = A(x + 2)(x + 3) + B(x + 1)(x + 3) + C(x + 1)(x + 2),$$

$$\begin{array}{lll}
x = -1, & 3 = 2A, & A = \tfrac32; \\
x = -2, & 9 = -B, & B = -9; \\
x = -3, & 19 = 2C, & C = \tfrac{19}{2}.
\end{array}$$

This solution depends on the following existence theorem.

Theorem 2. If $p(x)$ is of degree $\le 2$, and $a$, $b$, and $c$ are all different, then there are
numbers $A$, $B$, and $C$ such that

$$\frac{p(x)}{(x - a)(x - b)(x - c)} = \frac{A}{x - a} + \frac{B}{x - b} + \frac{C}{x - c}. \tag{3}$$

With our present equipment, we could give only a brute-force proof. But we know
how to handle simple cases, and in the following problem set you will see how various
more difficult problems of the same type can be solved.
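The shortcut of substituting the roots one at a time can be written out as a short program. The following is a sketch under the hypotheses of Theorems 1 and 2 (distinct roots, numerator of lower degree than the denominator); the function names `partial_fractions` and `check` are hypothetical, and exact rational arithmetic is used so that the results can be compared with the values found by hand.

```python
from fractions import Fraction

def partial_fractions(p, roots):
    """Coefficients A_k with p(x)/prod(x - r_k) = sum of A_k/(x - r_k).

    Assumes the roots are all different and deg p < len(roots).
    Each A_k is found by setting x = r_k after clearing denominators.
    """
    coeffs = []
    for k, r in enumerate(roots):
        denom = Fraction(1)
        for j, s in enumerate(roots):
            if j != k:
                denom *= Fraction(r) - Fraction(s)
        coeffs.append(Fraction(p(Fraction(r))) / denom)
    return coeffs

def check(p, roots, coeffs, x):
    """Compare both sides of the decomposition exactly at a rational x."""
    x = Fraction(x)
    left = Fraction(p(x))
    for r in roots:
        left /= (x - r)
    right = sum(A / (x - r) for A, r in zip(coeffs, roots))
    return left == right

# 1 / ((x - 2)(x - 5)) and (2x^2 + 1) / ((x + 1)(x + 2)(x + 3)).
two = partial_fractions(lambda x: 1, [2, 5])
three = partial_fractions(lambda x: 2 * x * x + 1, [-1, -2, -3])
print(two, three)
```

As the text stresses, the procedure is justified only because the existence theorems guarantee that a decomposition of this form exists; `check` tests the resulting identity at a point other than the roots.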
PROBLEM SET 6.8

Find:
1.

4.

7.

8.

9.

12.

15.

J
J
J
J
J
J
J

dx

2.

x2+2x+5
dx

5.

x2+x- 4

J
J

dx
v'x2+2x+5
dx
Yx2+x -

J
J

3.

6.

dx
x2 - 4
dx

v2 - x2

dx
>12 - 2x - x2
dx

(Is this an impossible problem at the outset?)

v -2x- x2
dx

10.

x2+ 6x + IO
dx
(x - l)(x -

13.

2)

J
J

dx

v-x2-

6x +10

x dx
(x - l)(x

11.

14

- 2)

J
J

d
x2

6 +10
dx

x(x - l)(x - 2)

x dx
x(x- l)(x-

2)

Find the unknown coefficients A, B, C, . . . which satisfy the following equations:


16.

18.

20.

1
(x - I)2 (x - 2)
1
x2 (x

23.

Find

25.

Find

(x - 1)2

B
C
+ -+
x- 1
x - 2
--

A
x+ I

--

dx
(x+ l)(xdx
x(xz + I)
dx
x(xz+1)2

2)3

B
(x-

22.
24.

2)3

17.

Find

19. Find

--

(x+ l)(x - 2)3

J
J
J

A
B
C
D
2 +x
- + (x - J)2 +x- I
l )2 = x

21. Find

c
(x -

2)2

J
J

(x

1x

x2(x

1)2

D
(x - 2)

---

A
Bx+ C
=-+
2
x2+ I
x
x(x + I)
I

x(x2+1)2

26. Find

Dx+ E
Bx+ C
A
=-+
+
x
x2+1
(x2+ 1)2

sine
d8
1 + cos 8

2)

27. Given $\theta = 2\operatorname{Tan}^{-1}x$, calculate $\sin\theta$ and $\cos\theta$, in terms of $x$.

28. Find $\int d\theta/(1 + \cos\theta)$. (One way to do this is to use the substitution

$$\theta \to u(x) = 2\operatorname{Tan}^{-1}x, \qquad d\theta \to u'(x)\,dx = \frac{2\,dx}{1 + x^2}, \qquad \sin\theta \to\ ?, \qquad \cos\theta \to\ ?$$

But when you see the answer, you may be able to think of a simpler method of solution.)
Find:
29.

32.

35.

f
f
f

d8
1 +sin 8
d8

sin 8 +cos 8
d8
1

sin 8

30.

33.

36.

f
f
J

d8

31.

cos 8
d8

sec 8 +tan 8
d8
2 +cos 8

34.

37.

f
J
J

dx
x2 +6x +9

sec3 8 cot 8 d8
d8

cos 8

sin 8

7 The Definite Integral

7.1 THE PROBLEM OF ARC LENGTH

Consider the parabolic arc which is the graph of $y = x^2$, $0 \le x \le 1$. We shall
calculate its length. The ideas that we use to do this will apply to other curves, but the
general problem is no harder than the special one; even if we were interested in arc length
only for parabolas, the ideas in this section would all be needed.
The length of an arc of a circle is defined (as in Section 4.3) as the limit of the
lengths of the inscribed broken lines. More generally, suppose that $f$ is a continuous
function on a closed interval $[a, b]$.

[Figure: the graph of $f$ over $[a, b]$, with an inscribed broken line joining the points $P_0, P_1, \ldots, P_n$.]

By a net over $[a, b]$ we mean an ascending sequence $N$ of numbers

$$a = x_0 < x_1 < \cdots < x_i < x_{i+1} < \cdots < x_n = b.$$

For each $i$, let

$$P_i = (x_i, f(x_i)).$$

We join successive points $P_{i-1}$, $P_i$ with segments, getting a broken line as in the figure.
Such a broken line is said to be inscribed in the graph of $f$. Its length is

$$P_0P_1 + P_1P_2 + \cdots + P_{n-1}P_n.$$

We denote this by $p(N)$. That is,

$$p(N) = \sum_{i=1}^{n} P_{i-1}P_i.$$

We use the functional notation $p(N)$, because when the net $N$ is named, the broken
line is determined, and so also is its length.
The graph of a continuous function on a closed interval may have infinite length.
But if the length is finite, we ought to be able to approximate it by using a net $N$
which cuts up $[a, b]$ into very small pieces. This idea is the basis of the following
definitions.
Definition. Let $N$ be a net over $[a, b]$. The mesh of $N$ is the largest of the numbers
$\Delta x_i = x_i - x_{i-1}$. The mesh of $N$ is denoted by $|N|$.

Definition. If $p(N)$ approaches a limit $L$, as $|N|$ approaches 0, then $f$ is said to be
rectifiable, and the number $L$ is called its length.

We need, of course, to explain what is meant by the statement

$$\lim_{|N| \to 0} p(N) = L.$$

Intuitively, this means that $p(N) \approx L$ when $|N| \approx 0$. We define this idea by the same
method that we used to define the limit of a function at a point. To make the analogy
clearer, we write the old and new definitions in parallel.

Definition (old). Let $f$ be a function, on an interval $[a, b]$. Let $x_0$ be a point of $[a, b]$.
Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that if $x$ is a point of $[a, b]$, then

$$0 < |x - x_0| < \delta \quad\Rightarrow\quad |f(x) - L| < \epsilon.$$

Then

$$\lim_{x \to x_0} f(x) = L.$$

Definition (new). Let $f$ be a function, on an interval $[a, b]$.
Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that if $N$ is a net over $[a, b]$, then

$$|N| < \delta \quad\Rightarrow\quad |p(N) - L| < \epsilon.$$

Then

$$\lim_{|N| \to 0} p(N) = L.$$

To calculate the arc length, we first express the length $p(N)$ (of the inscribed
broken line) in terms of things that we know how to handle. By definition,

$$p(N) = \sum_{i=1}^{n} P_{i-1}P_i.$$

(See the figure earlier in this section.) The segment from $P_{i-1}$ to $P_i$ looks like this:

[Figure: the segment from $P_{i-1}$ to $P_i$, with horizontal leg $\Delta x_i = x_i - x_{i-1}$ and vertical leg $\Delta y_i = y_i - y_{i-1}$, where $y_i = f(x_i)$ and $y_{i-1} = f(x_{i-1})$.]

Thus

$$(P_{i-1}P_i)^2 = \Delta x_i^2 + \Delta y_i^2 = \left[1 + \left(\frac{\Delta y_i}{\Delta x_i}\right)^2\right]\Delta x_i^2.$$

Here the fraction $\Delta y_i/\Delta x_i$ is the slope of the chord from $P_{i-1}$ to $P_i$; and the mean-value
theorem says that this is the slope of the tangent line at some intermediate point.
Thus we have

$$\frac{\Delta y_i}{\Delta x_i} = f'(\bar{x}_i) \qquad (x_{i-1} < \bar{x}_i < x_i).$$

Making this substitution, and extracting the square root, we get

$$P_{i-1}P_i = \sqrt{1 + [f'(\bar{x}_i)]^2}\,\Delta x_i.$$

Taking the sum from $i = 1$ to $n$, we get

$$p(N) = \sum_{i=1}^{n} P_{i-1}P_i = \sum_{i=1}^{n} \sqrt{1 + [f'(\bar{x}_i)]^2}\,\Delta x_i.$$

The problem is to find out what happens to the sum on the right as $|N| \to 0$. We can
find this by giving a geometric interpretation to the sum.

[Figure: the graph of $g$, with a rectangle of altitude $g(\bar{x}_i)$ over each interval $[x_{i-1}, x_i]$.]

For each $x$ on $[a, b]$, let

$$g(x) = \sqrt{1 + [f'(x)]^2}.$$

In the figure, which shows the graph of $g$, $x_{i-1} < \bar{x}_i < x_i$ for each $i$ from 1 to $n$.
On each little interval $[x_{i-1}, x_i]$, of length $\Delta x_i = x_i - x_{i-1}$, we have set up a rectangle
with $[x_{i-1}, x_i]$ as base, and altitude $g(\bar{x}_i)$. The area of this rectangle is then
$g(\bar{x}_i)\,\Delta x_i$. The sum of these areas is

$$\sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i = \sum_{i=1}^{n} \sqrt{1 + [f'(\bar{x}_i)]^2}\,\Delta x_i = p(N).$$

If $f'$ is continuous, then so is $g$; and $\sum g(\bar{x}_i)\,\Delta x_i$ ought to be close to the area
under the graph of $g$, when the mesh of the net $N$ is small. That is, we ought to have

$$|N| \approx 0 \quad\Rightarrow\quad \sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i \approx \int_a^b g(x)\,dx,$$

which means that

$$\lim_{|N| \to 0} \sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i = \int_a^b g(x)\,dx.$$

This gives us a formula for arc length:

$$L = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx.$$

This holds whenever $f'$ is continuous; and we will complete the proof later in this
chapter. Meanwhile, consider some examples.
Example 1. Let

$$f(x) = 1, \qquad 0 \le x \le 1.$$

Then $f'(x) = 0$, and

$$L = \int_0^1 \sqrt{1 + 0^2}\,dx = 1,$$

which is the right answer.

Example 2. Let

$$f(x) = kx, \qquad 0 \le x \le 1.$$

Then $f'(x) = k$, and

$$L = \int_0^1 \sqrt{1 + k^2}\,dx = \sqrt{1 + k^2},$$

which is the right answer, by the Pythagorean theorem.

Example 3. Let

$$f(x) = \sqrt{1 - x^2}, \qquad 0 \le x \le \frac{\sqrt2}{2}.$$

Then

$$f'(x) = \frac{-x}{\sqrt{1 - x^2}}, \qquad [f'(x)]^2 = \frac{x^2}{1 - x^2}, \qquad
1 + [f'(x)]^2 = \frac{1 - x^2 + x^2}{1 - x^2} = \frac{1}{1 - x^2},$$

and

$$L = \int_0^{\sqrt2/2} \sqrt{1 + [f'(x)]^2}\,dx = \int_0^{\sqrt2/2} \frac{dx}{\sqrt{1 - x^2}}
= \left[\operatorname{Sin}^{-1}x\right]_0^{\sqrt2/2} = \frac{\pi}{4}.$$

This is the right answer, because $L$ is one-eighth of the circumference of a circle of
radius 1.

Example 4. We return to $f(x) = x^2$, $0 \le x \le 1$. Here

$$f'(x) = 2x, \qquad 1 + [f'(x)]^2 = 1 + 4x^2, \qquad L = \int_0^1 \sqrt{1 + 4x^2}\,dx.$$

Now

$$\int \sqrt{1 + 4x^2}\,dx = \{\tfrac12 x\sqrt{1 + 4x^2} + \tfrac14 \ln|2x + \sqrt{1 + 4x^2}| + C\}.$$

Therefore, by the fundamental theorem of integral calculus, we have

$$L = \frac{\sqrt5}{2} + \tfrac14 \ln(2 + \sqrt5).$$

The answer suggests that no method would have made the problem look easy.
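The definition of arc length as a limit of lengths of inscribed broken lines can be tried out directly on the parabola of Example 4. The sketch below is an illustration only: it uses uniform nets (a special case of the nets in the definition), and the function name `polygon_length` is hypothetical.

```python
import math

def polygon_length(f, a, b, n):
    """Length p(N) of the inscribed broken line over the uniform net with n pieces."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(math.hypot(xs[i] - xs[i - 1], f(xs[i]) - f(xs[i - 1]))
               for i in range(1, n + 1))

def parabola(x):
    return x * x

# Value of L obtained in Example 4.
exact = math.sqrt(5) / 2 + math.log(2 + math.sqrt(5)) / 4

# Each net below refines the one before it, so the lengths cannot decrease.
lengths = [polygon_length(parabola, 0.0, 1.0, n) for n in (1, 10, 100, 1000)]
print(lengths, exact)
```

The lengths increase toward the value found in Example 4, and stay below it, as one expects of inscribed broken lines.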

PROBLEM SET 7.1


Find the lengths of the graphs of the following functions, between the indicated limits.

I.

f(x) = x312,

x, 0 x 7T/4
x, 7r/4 x 7r/2
*4. f (x) = Jn x, 1 ;;; x ;;; 3
5. f (x)
1 + tx3i2, 0 ;;; x 4
2.

3.

f(x) =
f(x)

x ;;; 2

In cos

In sin

*6. f (x)
7. a)

x
= e ,

/(x)

t(ex

e-

"'

) , 0 x 1 (You can solve this one, by an algebraic trick,

without using any of the standard formulas for hyperbolic functions.


problem is a little easier if you remember that sinh

}Ce"

e-x .

He"' -

e-" ,

But the

) cosh x

For the definitions of the hyperbolic functions sinh and cosh, and the

formulas governing them, see the end of Section

4.11.)

8. Let $f$ be a function with a continuous derivative, on an interval containing $x_0$.

[Figure: the graph of $f$, with the points $P_0$ and $P_x$, the arc between them, and the chord $P_0P_x$.]

Let $r(x) = \overset{\frown}{P_0P_x}/P_0P_x$, that is, the ratio of the arc length to the length of the chord.
Show that $\lim_{x \to x_0} r(x) = 1$.
To prove this, you will need to use the formula which expresses $\overset{\frown}{P_0P_x}$ as an integral.

In many books you will see a "proof" of the integral formula for arc length, based on
the assumption that $r(x) \to 1$. This is an example of a proof of infinite thinness: the
hole in it is as big as the proof, because to fill the hole you must first prove the theorem
itself by another method.
9. Consider the sequence of broken lines suggested by the figure below. Each broken line
forms a stairway from $P$ to $Q$. The $n$th stairway has $n$ vertical segments and $n$ horizontal
segments. For each $n$, let $B_n$ be the length of the $n$th broken line. Find $\lim_{n \to \infty} B_n$.

[Figure: stairways from $P$ to $Q$.]

10. Let $f$ be any function on $[a, b]$, and let $x$ be any point of $(a, b)$. Let $m_1$, $m_2$, and $m$ be the
slopes of the chords over the intervals $[a, x]$, $[x, b]$, and $[a, b]$. Show that $m$ is between
$m_1$ and $m_2$. (Unless, of course, $m_1 = m_2 = m$.)
More precisely,

$$m_1 = \frac{f(x) - f(a)}{x - a}, \qquad m_2 = \frac{f(b) - f(x)}{b - x},$$

and

$$m = \frac{f(b) - f(a)}{b - a}.$$

The theorem says that either (a) $m_1 \le m \le m_2$ or (b) $m_2 \le m \le m_1$.

7.2 THE DEFINITE INTEGRAL, DEFINED AS A LIMIT OF SAMPLE SUMS

In Section 3.7, we defined the definite integral in terms of area, with areas above the
x-axis counted positively and those below counted negatively. In the preceding
section, however, we regarded the integral as the limit of a sum:

$$\int_a^b g(x)\,dx = \lim_{|N| \to 0} \sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i.$$

Most of the time hereafter, the definite integral will be used in this way, and so we
shall redefine the integral, using the above formula as a definition. For this purpose,
we need to investigate nets, and sums of the type

$$\sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i.$$

Consider first an increasing continuous function $f$, on an interval $[a, b]$.

[Figure: (a) a positive increasing function over a net $a = x_0 < x_1 < \cdots < x_n = b$, with inscribed and circumscribed rectangles; (b) an increasing function which takes negative values, over the same kind of net.]

On the interval $[a, b]$ we form a net

$$N:\ a = x_0 < x_1 < \cdots < x_{i-1} < x_i < \cdots < x_n = b.$$

The points of the net $N$ cut up the interval $[a, b]$ into little intervals $[x_{i-1}, x_i]$. For
each $i$ from 1 to $n$, let $m_i$ be the minimum value of $f$ on the $i$th interval $[x_{i-1}, x_i]$,
and let $M_i$ be the maximum value. Since $f$ is increasing, we have $m_i = f(x_{i-1})$,
$M_i = f(x_i)$. As usual, $\Delta x_i = x_i - x_{i-1}$, so that $\Delta x_i$ is the length of the $i$th interval
$[x_{i-1}, x_i]$. If $f$ is positive, as in part (a) of the figure above, then the sum

$$s(N) = \sum_{i=1}^{n} m_i\,\Delta x_i$$

is the sum of the areas of the inscribed rectangles, and the sum

$$S(N) = \sum_{i=1}^{n} M_i\,\Delta x_i$$

is the sum of the areas of the circumscribed rectangles. For functions which may be
negative, as in part (b) of the figure, $s(N)$ and $S(N)$ are sums of signed areas. In either
case, $s(N)$ is called the lower sum of $f$ over the net $N$, and $S(N)$ is called the upper sum
of $f$ over $N$.

On each interval $[x_{i-1}, x_i]$ we choose a sample point $\bar{x}_i$. Thus

$$x_{i-1} \le \bar{x}_i \le x_i \qquad (1 \le i \le n).$$

The sequence

$$X:\ \bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n$$

is called a sample of the net $N$. The sample gives a sum

$$\sum(X) = \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i.$$

This is called the sample sum of $f$ over the sample $X$.

As in the preceding section, the mesh of $N$ is the largest of the numbers $\Delta x_i$.
The mesh is denoted by $|N|$. Thus

$$|N| = \max\{\Delta x_i\}.$$

Let $R$ be the region between the graph of $f$ and the x-axis, from $a$ to $b$.

Theorem 1. If $f$ is continuous, and $N_1$ and $N_2$ are any nets over $[a, b]$, then

$$s(N_1) \le S(N_2).$$

That is, every lower sum of $f$ is less than or equal to every upper sum of $f$. For
positive functions this is obvious, because in this case $s(N_1)$ is the area of an inscribed
polygonal region (lying under the curve) and $S(N_2)$ is the area of a circumscribed
polygonal region. In general,

$$s(N_1) = A - B, \qquad S(N_2) = C - D,$$

where $A$ and $D$ are areas of inscribed regions, and $B$ and $C$ are areas of circumscribed
regions. (To see how this works, see the figure (b) above.) Therefore,

$$A \le C, \qquad B \ge D, \qquad -B \le -D, \qquad A - B \le C - D,$$

and $s(N_1) \le S(N_2)$, as before.

Note that in Theorem 1 it is not required that $f$ be an increasing function.

Theorem 2. If $f$ is continuous and increasing, then

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0.$$

That is, the upper sums are close to the corresponding lower sums, when the mesh
is small. To prove this, we observe that the difference $S(N) - s(N)$ has a geometric
interpretation.

[Figure: the graph of an increasing function from $f(a)$ to $f(b)$; over each interval of the net, the rectangle between the inscribed and circumscribed rectangles is drawn solid.]

This difference is

$$S(N) - s(N) = \sum_{i=1}^{n} M_i\,\Delta x_i - \sum_{i=1}^{n} m_i\,\Delta x_i = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i;$$

and this is the sum of the areas of the rectangles drawn solid in the figure. These
rectangles can all be moved to the left and stacked up inside a rectangle of altitude
$f(b) - f(a)$ and base $|N|$. (Remember that $|N|$ is the largest of the numbers $\Delta x_i$.)
Therefore

$$S(N) - s(N) \le [f(b) - f(a)]\,|N|,$$

and so

$$S(N) - s(N) \to 0 \quad\text{as}\quad |N| \to 0.$$

Theorem 3. If $f$ is continuous, and

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0,$$

then the sample sums $\sum(X)$ approach a limit, as $|N| \to 0$. That is, there is a number $k$ such that

$$\lim_{|N| \to 0} \sum(X) = k.$$

Proof. We have to start the proof by naming the number $k$. The numbers $s(N)$ are
bounded above. (By Theorem 1, every upper sum is an upper bound of the lower
sums.) Let

$$k = \sup\{s(N)\}.$$

Consider an interval $[x_{i-1}, x_i]$. For each $i$,

$$m_i \le f(\bar{x}_i) \le M_i.$$

Since $\Delta x_i > 0$, this gives

$$m_i\,\Delta x_i \le f(\bar{x}_i)\,\Delta x_i \le M_i\,\Delta x_i.$$

Therefore the sums from 1 to $n$ rank in the same order:

$$\sum_{i=1}^{n} m_i\,\Delta x_i \le \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i \le \sum_{i=1}^{n} M_i\,\Delta x_i,$$

so that

$$s(N) \le \sum(X) \le S(N).$$

That is, every sample sum lies between the lower sum and the upper sum.
We are now almost done. Given $\epsilon > 0$, we want a $\delta > 0$ such that

$$|N| < \delta \quad\Rightarrow\quad \left|\sum(X) - k\right| < \epsilon.$$

By the hypothesis of Theorem 3 there is a $\delta > 0$ such that

$$|N| < \delta \quad\Rightarrow\quad S(N) - s(N) < \epsilon.$$

Thus when $|N| < \delta$, the interval from $s(N)$ to $S(N)$ has length less than $\epsilon$.
This interval contains $\sum(X)$. (See the inequalities above.) And it also contains $k$:
$s(N) \le k$, because $k$ is an upper bound for the lower sums; and $k \le S(N)$, because
$S(N)$ is an upper bound of the lower sums and $k$ is the least upper bound of the lower sums.
Therefore $|\sum(X) - k| < \epsilon$, because
$\sum(X)$ and $k$ are squeezed together: they both lie on the same short interval.
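The inequalities used in Theorems 2 and 3 can be watched at work on a concrete function. The sketch below is an illustration under assumptions of our own: the function is $\exp$ on $[0, 1]$, the nets are deliberately non-uniform, the sample points are chosen at random, and the number $k$ is taken to be $e - 1$, the value of the integral known from the fundamental theorem. The names `riemann_sums`, `mesh`, and `uneven_net` are hypothetical.

```python
import math
import random

def riemann_sums(f, net, rng):
    """Lower sum, a random sample sum, and upper sum of an increasing f over a net."""
    s = sample = S = 0.0
    for left, right in zip(net, net[1:]):
        dx = right - left
        point = rng.uniform(left, right)   # a sample point of [left, right]
        s += f(left) * dx                  # m_i = f(x_{i-1}) since f increases
        sample += f(point) * dx
        S += f(right) * dx                 # M_i = f(x_i)
    return s, sample, S

def mesh(net):
    """The largest of the lengths of the intervals of the net."""
    return max(right - left for left, right in zip(net, net[1:]))

def uneven_net(a, b, n):
    """A deliberately non-uniform net whose points bunch up near a."""
    return [a + (b - a) * (i / n) ** 2 for i in range(n + 1)]

rng = random.Random(0)        # fixed seed so that runs are repeatable
a, b = 0.0, 1.0
k = math.e - 1                # assumed value of the integral of exp over [0, 1]
rows = []
for n in (4, 32, 256):
    net = uneven_net(a, b, n)
    s, sample, S = riemann_sums(math.exp, net, rng)
    rows.append((mesh(net), s, sample, S))
    print(f"|N|={mesh(net):.4f}  s={s:.6f}  sample={sample:.6f}  S={S:.6f}")
```

In each row the sample sum and $k$ both lie between $s(N)$ and $S(N)$, and the gap $S(N) - s(N)$ stays below $[f(b) - f(a)]\,|N|$, so the sample sums are forced toward $k$ as the mesh shrinks.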

We can now give the new definition of the integral.


Definition.

fa f(x) dx

lim1N1-?0

I (X), if the indicated limit on the right exists.


f is integrable on [a, b ].

If the limit exists, then we say that

Theorems 2 and 3 fit together to give:


Theorem 4.

If f is continuous and increasing on

[a, b], then f is integrable on [a, b ].

Later we shall see that all continuous functions are integrable, whether or not
they are increasing.
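
The squeeze used in Theorems 2 through 4 can be watched numerically. The following is only a sketch, under assumptions that are not in the text: the function is $f(x) = x^2$ on $[0, 1]$, the net is chosen at random, and the names `sums_over_net`, `net`, and `sample` are illustrative. It computes a lower sum, an arbitrary sample sum, and an upper sum, together with the bound $[f(b) - f(a)]\,|N|$.

```python
import random

def sums_over_net(f, net, sample):
    """Return (lower sum, sample sum, upper sum) for an increasing f."""
    lower = middle = upper = 0.0
    for x_prev, x_next, x_bar in zip(net, net[1:], sample):
        dx = x_next - x_prev
        lower += f(x_prev) * dx    # m_i = f(x_{i-1}) because f is increasing
        middle += f(x_bar) * dx    # term of the sample sum
        upper += f(x_next) * dx    # M_i = f(x_i) because f is increasing
    return lower, middle, upper

def f(x):
    return x * x                   # assumed example; increasing on [0, 1]

a, b, n = 0.0, 1.0, 1000
random.seed(0)
# An unequally spaced net a = x_0 <= x_1 <= ... <= x_n = b.
net = [a] + sorted(random.uniform(a, b) for _ in range(n - 1)) + [b]
# One sample point in each interval of the net.
sample = [random.uniform(p, q) for p, q in zip(net, net[1:])]
mesh = max(q - p for p, q in zip(net, net[1:]))

s, sigma, S = sums_over_net(f, net, sample)
print("lower, sample, upper:", s, sigma, S)
print("S - s =", S - s, " bound =", (f(b) - f(a)) * mesh)
```

Whatever sample is drawn, the sample sum lies between the lower and upper sums, and the gap between those two is at most $[f(b) - f(a)]\,|N|$.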
Our calculations of definite integrals have been based on the differentiation formula

$$D \int_a^x f(t)\,dt = f(x),$$

where $f$ is continuous. We need to know that this differentiation formula still holds, under the new definition of the integral. This is the purpose of the following theorem.
Theorem 5 (The betweenness theorem for integrals). If $f$ is integrable on $[a, b]$, and

$$m \le f(x) \le M \qquad (a \le x \le b),$$

then

$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$

Proof. Let $N$ be any net over $[a, b]$, and let $X$ be any sample of $N$. Then

$$m \le f(\bar{x}_i) \le M$$

for every $i$. Therefore

$$m\,\Delta x_i \le f(\bar{x}_i)\,\Delta x_i \le M\,\Delta x_i.$$

Forming the sample sum $\sum (X)$ by addition, we get

$$\sum_{i=1}^{n} m\,\Delta x_i \le \sum (X) \le \sum_{i=1}^{n} M\,\Delta x_i.$$

But

$$\sum_{i=1}^{n} m\,\Delta x_i = m \sum_{i=1}^{n} \Delta x_i, \qquad \sum_{i=1}^{n} M\,\Delta x_i = M \sum_{i=1}^{n} \Delta x_i,$$

and

$$\sum_{i=1}^{n} \Delta x_i = b - a.$$

(Why?) Therefore

$$m(b - a) \le \sum (X) \le M(b - a);$$

and this holds for every sample sum, over every net $N$. Therefore the same inequalities hold for $\lim \sum (X)$, and the integral lies between $m(b - a)$ and $M(b - a)$, which was to be proved.
If you review the proof of the formula

$$D \int_a^x f(t)\,dt = f(x),$$

in Section 3.10, you will find that in this proof, all that we needed to know about the integral was the information conveyed by the betweenness theorem for integrals. Therefore the differentiation formula continues to hold, under our new definition, wherever the integrand is continuous. It follows that the fundamental theorem of integral calculus still holds true.

At the end of this chapter we shall prove that every continuous function is integrable.
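
The way betweenness controls the difference quotient of $F(x) = \int_a^x f(t)\,dt$ can be checked on a small example. This is a sketch under assumptions that are not in the text: $f = \cos$, the point $x = 0.5$, the step $h = 10^{-3}$, and a midpoint sample sum standing in for the integral; the helper name `integral` is illustrative.

```python
import math

def integral(f, a, b, n=20_000):
    """Midpoint sample sum of f over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

f = math.cos          # assumed example of a continuous integrand
x, h = 0.5, 1e-3

# On [x, x + h] the cosine is decreasing, so its minimum and maximum there are:
m, M = f(x + h), f(x)

# Betweenness on [x, x + h] gives m*h <= F(x + h) - F(x) <= M*h,
# so the difference quotient is trapped between m and M.
quotient = integral(f, x, x + h) / h
print(m, quotient, M)
```

As $h$ shrinks, $m$ and $M$ both approach $f(x)$ by continuity, and the trapped quotient has to follow them.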
*PROBLEM SET 7.2
1. In Theorem 2 it was assumed that the function $f$ is increasing. Does the same scheme of proof work for a decreasing function? If so, draw a figure illustrating the proof for decreasing functions. If not, explain how the scheme breaks down, for the case of a decreasing function.

2. In Theorem 2 it was assumed that $f$ is both continuous and increasing. Suppose we assume that $f$ is increasing, but not that $f$ is continuous. What changes (if any) do we then need to make in
a) the definitions of $m_i$ and $M_i$,
b) the definitions of $s(N)$ and $S(N)$, and
c) the proof of Theorem 2?

3. Prove the following:

Theorem A (The mean-value theorem for integrals). If $f$ is continuous on $[a, b]$, then there is a point $\bar{x}$, between $a$ and $b$, such that

$$\int_a^b f(x)\,dx = f(\bar{x})(b - a).$$

Consider the function f defined by the following conditions: /(!)


!, /Ci)
!,
0 for every other x on [O, I]. Is this function integrable? Why
/Ci)
!; /(x)
or why not?
=

*5. Consider the following function, on $[0, 1]$. If $x$ is irrational, then $f(x) = 0$. If $x = p/q$, in lowest terms, then $f(x) = 1/q$. At what points (if any) is this function continuous? Is the function integrable? Why or why not?

6. Given a continuous function $g$, on $[a, b]$, and a net $N$ over $[a, b]$. Show that there is a sample $X$ of $N$ such that $\sum (X) = \int_a^b g(x)\,dx$.

7. In Section 7.1 we showed that for every net $N$ we could choose a sample $X$ in such a way that the length of the inscribed broken line is equal to the sample sum $\sum (X)$, not just approximately but exactly. Is it always possible to choose a sample $X'$ such that $\sum (X')$ is exactly equal to the arc length? (Here we are assuming, as usual, that $f'$ is continuous.)

*8. The following remarks are a very sketchy indication of an amusing proof of an important theorem which is known to you in a slightly weaker form. Fill in the gaps, and state the theorem which is proved.

$F' = f$. $f$ is known to be integrable on $[a, b]$, but is not necessarily continuous.

$$\sum_{i=1}^{n} [F(x_i) - F(x_{i-1})] = \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i.$$

As $|N| \to 0$, $\sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i \to\ ?$; but $\sum_{i=1}^{n} [F(x_i) - F(x_{i-1})]$ was something simple, all along.

*9. Let $f$ be differentiable on $[a, b]$. Show that if

$$f'(a) < k < \frac{f(b) - f(a)}{b - a},$$

then $k = f'(x)$, for some $x$ between $a$ and $b$. [Hint: Remember the definition of $f'(a)$. Do a sketch illustrating the definition.]

*10. Theorem (The no-jump theorem for derivatives). If $f$ is differentiable on $[a, b]$, and $k$ is between $f'(a)$ and $f'(b)$, then $k = f'(x)$ for some $x$ between $a$ and $b$.
Thus, for example, the function

for x ,e. 0,
for x
cannot be the derivative of any other function f

*11. Theorem. If $f$ is differentiable at $a$, then

$$\lim_{\substack{x_1 \to a^- \\ x_2 \to a^+}} \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(a).$$

More precisely, for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$a - \delta < x_1 < a < x_2 < a + \delta \;\Rightarrow\; \left| \frac{f(x_2) - f(x_1)}{x_2 - x_1} - f'(a) \right| < \epsilon.$$

*12. a) A function $f$ is Lipschitzian on $[a, b]$ if there is a number $k > 0$ such that for every $x_1$ and $x_2$ of $[a, b]$,

$$|f(x_1) - f(x_2)| \le k\,|x_1 - x_2|.$$

Show that $f(x) = \sin x$ is Lipschitzian on the interval $(-\infty, \infty)$.

b) Show that every Lipschitzian function is continuous.

c) Give an example to show that a continuous function is not necessarily Lipschitzian.

d) Show that if $f'$ is continuous on $[a, b]$, then $f$ is Lipschitzian on $[a, b]$.

e) Show that if $f$ is Lipschitzian on $[a, b]$, then $f$ is integrable. Since Theorem 3 is known, it will be sufficient to show that if $f$ is Lipschitzian, then

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0.$$
13. a) A function $f$ is uniformly continuous on $[a, b]$ if for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$|x - x'| < \delta \;\Rightarrow\; |f(x) - f(x')| < \epsilon.$$

(Here $x$ and $x'$ are any points of $[a, b]$.) Show that if $f'$ is continuous, then $f$ is uniformly continuous on $[a, b]$.

b) Show that every uniformly continuous function is integrable.
7.3 THE CALCULATION OF VOLUMES, BY THE METHOD OF DISKS

The volumes of various solids can be expressed as definite integrals. In this process,
we shall assume that the following volume formulas are known.

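[Figure: a rectangular parallelepiped with edges a, b, c; a right circular cylinder with base radius r and altitude h; a cylindrical shell with altitude h, outer radius r, and inner radius s.]

$$V = abc, \qquad V = \pi r^2 h, \qquad V = \pi h (r^2 - s^2).$$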

The first of these solids is a rectangular parallelepiped; the second is a right circular
cylinder; and the third is a cylindrical shell, that is, the portion of the larger cylinder
that lies outside the smaller cylinder.
We get a coordinate system in space by setting up a z-axis, perpendicular to the
xy-plane at the origin. Here, and throughout this chapter, we shall indicate only the
positive half of each axis, thus getting a picture of only the "first octant," in which
the points have nonnegative coordinates.
Consider now the function

$$f(x) = \frac{1}{x}$$

on the interval $[1, 2]$. Let $R$ be the region under the graph, in the $xy$-plane. We rotate the region $R$ about the $x$-axis. This gives a solid $S$.

Let $vS$ be the volume of $S$. We shall express $vS$ as a definite integral. First we form a net

$$N: x_0, x_1, \ldots, x_n$$

over the interval $[1, 2]$. For convenience, we use equally spaced points, so that

$$x_i - x_{i-1} = \Delta x = \frac{1}{n}$$

for each $i$. Over the intervals $[x_{i-1}, x_i]$ we set up the circumscribed rectangles. These form a region $R_n$ which is an approximation of the region $R$. Then we rotate $R_n$ about the $x$-axis. Each of our rectangles then gives a cylinder (lying on its side), and the cylinders form a solid $S_n$ which is an approximation of $S$. In the figure we show only the $i$th cylinder. Its altitude is $\Delta x = x_i - x_{i-1}$, and the radius of its base is

$$r = f(x_{i-1}) = \frac{1}{x_{i-1}}.$$

Therefore its volume is

$$v_i = \pi r^2\,\Delta x = \pi \left( \frac{1}{x_{i-1}} \right)^2 \Delta x,$$

and so the total volume of the circumscribed solid $S_n$ is

$$vS_n = \sum_{i=1}^{n} v_i = \sum_{i=1}^{n} \pi \left( \frac{1}{x_{i-1}} \right)^2 \Delta x.$$

This is a sample sum of the function

$$g(x) = \frac{\pi}{x^2},$$

over the net $N$. (In fact, it is the upper sum of $g$ over the net $N$, because $g(x_{i-1})$ is the maximum value of $g$ on the interval $[x_{i-1}, x_i]$.) The mesh of $N$ is

$$|N| = \frac{1}{n},$$

and so $|N| \to 0$ as $n \to \infty$. Therefore

$$\lim_{n \to \infty} vS_n = \int_1^2 \frac{\pi}{x^2}\,dx. \qquad (1)$$

If we use inscribed rectangles, and rotate them about the $x$-axis, then we get an inscribed solid $S_n'$, with volume

$$vS_n' = \sum_{i=1}^{n} \pi \left( \frac{1}{x_i} \right)^2 \Delta x.$$

This is also a sample sum, of the same function $g = \pi/x^2$. Therefore

$$\lim_{n \to \infty} vS_n' = \int_1^2 \frac{\pi}{x^2}\,dx. \qquad (2)$$

Therefore the volume $vS$ of $S$ is squeezed between the volumes of the inscribed and circumscribed solids:

$$vS_n' \le vS \le vS_n$$

for every $n$, and so

$$vS = \int_1^2 \frac{\pi}{x^2}\,dx = \frac{\pi}{2}. \qquad (3)$$
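
The squeeze between the inscribed and circumscribed solids can be tabulated directly. The sketch below uses the same function $f(x) = 1/x$ on $[1, 2]$ with $n$ equal subdivisions; the function name `disk_volumes` is illustrative, not from the text.

```python
import math

def disk_volumes(n):
    """Inscribed and circumscribed disk sums for f(x) = 1/x on [1, 2]."""
    dx = 1.0 / n
    xs = [1.0 + i * dx for i in range(n + 1)]
    # Circumscribed cylinders use the radius at the left endpoint, 1/x_{i-1}.
    circumscribed = sum(math.pi * (1.0 / xs[i - 1]) ** 2 * dx
                        for i in range(1, n + 1))
    # Inscribed cylinders use the radius at the right endpoint, 1/x_i.
    inscribed = sum(math.pi * (1.0 / xs[i]) ** 2 * dx
                    for i in range(1, n + 1))
    return inscribed, circumscribed

for n in (10, 100, 1000):
    inner, outer = disk_volumes(n)
    print(n, inner, outer)
print("pi/2 =", math.pi / 2)
```

For every $n$ the value $\pi/2$ lies between the two printed volumes, and the gap between them shrinks in proportion to $1/n$.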

We shall now review this process and state the assumptions on which it is based. Not all solids are measurable, in the sense that they have volumes; but the solids that you are likely to encounter soon are measurable, and their volumes are governed by the following laws.

By an elementary solid we mean a right parallelepiped, cylinder, or cylindrical shell, as at the beginning of this section. We have been assuming that:

V.1. Elementary solids are measurable, and their volumes are given by the formulas $v = abc$, $v = \pi r^2 h$, $v = \pi h (r^2 - s^2)$.

Two solids are nonoverlapping if they have no solid in common. (They may have surfaces in common.)

V.2. If $s_1, s_2, \ldots, s_n$ are nonoverlapping elementary solids, and $S_n$ is their union, then $S_n$ is measurable, and

$$vS_n = \sum_{i=1}^{n} v s_i.$$

V.3. If $S$ and $S'$ are measurable, and $S'$ lies in $S$, then $vS' \le vS$.

V.4 (The squeeze principle). If (a) $S_1, S_2, \ldots$ are measurable solids containing $S$, (b) $S_1', S_2', \ldots$ are measurable solids lying in $S$, and (c)

$$\lim_{n \to \infty} vS_n = L = \lim_{n \to \infty} vS_n',$$

then $S$ is measurable, and

$$vS = L.$$

Using V.1 through V.4, we can show that the method of disks, which we have used for the function $f(x) = 1/x$, works for every function $f$ which is $\ge 0$ and continuous.

[Figure: the graph of f over [a, b], with the inscribed and circumscribed rectangles of heights m_i and M_i over the ith interval [x_{i-1}, x_i].]

Given such a function $f$, on a closed interval $[a, b]$, let $R$ be the region under the graph, and let $S$ be the solid of revolution of $R$, about the $x$-axis. Take a net $N$ over $[a, b]$, with equal spacing. As usual, let $m_i$ and $M_i$ be the minimum and maximum values of $f$ on the $i$th interval $[x_{i-1}, x_i]$.

If we rotate the inscribed rectangles about the $x$-axis, we get an inscribed solid $S_n'$, of volume

$$vS_n' = \sum_{i=1}^{n} \pi m_i^2\,\Delta x.$$

If we rotate the circumscribed rectangles about the $x$-axis, we get a circumscribed solid $S_n$, of volume

$$vS_n = \sum_{i=1}^{n} \pi M_i^2\,\Delta x.$$

V.3 says that

$$vS_n' \le vS_n$$

for every $n$. But $vS_n$ is an upper sum of the function

$$g(x) = \pi f(x)^2,$$

and $vS_n'$ is a lower sum of the same $g$. As $n \to \infty$, $|N| \to 0$;

$$vS_n \to \int_a^b \pi f(x)^2\,dx, \qquad vS_n' \to \int_a^b \pi f(x)^2\,dx.$$

By the squeeze principle it follows that $S$ is measurable, and

$$vS = \int_a^b \pi f(x)^2\,dx.$$

We also use this formula sidewise.

[Figure: the region R between the y-axis and the graph of y = √x.]

Suppose that the region $R$ on the left is rotated about the $y$-axis. Sidewise, $R$ can be regarded as the region under the graph of a function

$$x = f(y) = y^2, \qquad 0 \le y \le 1.$$

Therefore the volume is

$$\int_0^1 \pi [f(y)]^2\,dy = \int_0^1 \pi y^4\,dy = \frac{\pi}{5}.$$

PROBLEM SET 7.3

1. Obviously a right circular cone can be regarded as the solid of revolution of a right triangle about one of its legs. If we place the triangle in the $xy$-plane as shown in the figure, then the hypotenuse becomes the graph of a function $f$. Calculate $f$, and find the volume of the cone by the methods of this section.

2. Similarly, a round ball of radius $r$ can be regarded as the solid of revolution of a semicircular region about its diameter. Find the volume, by the methods of this section.

3. The region under the graph of $f(x) = \sqrt{x}$ $(0 \le x \le 1)$ is rotated about the $x$-axis. Find the volume of the resulting solid.

4. Same, for $f(x) = \sin x$, $0 \le x \le \pi$.

5. Same, for $f(x) = x^{3/2}$, $0 \le x \le 1$.

6. Same, for $f(x) = \cos x$, $-\pi/2 \le x \le \pi/2$.

7. Let $R = \{(x, y) \mid 0 \le x \le 1,\ x^2 \le y \le 1\}$ be rotated about the $y$-axis. Find the volume.

8. Same, for $R = \{(x, y) \mid 0 \le x \le 1,\ \operatorname{Sin}^{-1} x \le y \le \pi/2\}$.

9. Same, for $R = \{(x, y) \mid 0 \le x \le \pi/2,\ \sin x \le y \le 1\}$.

10. Same, for $\{(x, y) \mid 0 \le x \le 1,\ x^3 \le y \le 1\}$.

11. Same, for $\{(x, y) \mid 0 \le x \le \sqrt{2}/2,\ x \le y \le \sqrt{1 - x^2}\}$.

12. Find out whether the following is true:

Theorem (?). Let $T$ and $T'$ be triangles each of which has a side on the $x$-axis. If $T$ and $T'$ have the same area, then when they are rotated about the $x$-axis, they give solids with the same volume.

13. a) For each $x$ from 0 to 1, let $T_x$ be the triangle whose vertices are $(0, 0)$, $(1, 0)$, and $(x, 1)$. What value or values of $x$ give maximum volume, when $T_x$ is rotated about the $x$-axis?
b) Suppose that the triangles $T_x$ are rotated about the $y$-axis (instead of the $x$-axis). Which value or values of $x$ give maximum volume?

14. For each $k$ from 0 to 1, let $T_k$ be the triangle whose vertices are $(0, 0)$, $(k, 0)$, and $(0, \sqrt{1 - k^2})$. (Thus the hypotenuse of $T_k$ has length 1.) $T_k$ is rotated about the $x$-axis. What value of $k$ gives maximum volume? What is the maximum volume?

15. a) Given $f(x) = 1/x$. Let $R$ be the region under the graph of $f$, from 1 to $\infty$. Give a reasonable definition of the area of $R$. Is this area finite?
b) The region $R$ is rotated about the $x$-axis, giving a solid $S$. Give a reasonable definition of the volume of $S$. Is this volume finite?

16. a) The region $R$ under the graph of $f(x) = 1/x^2$ from 1 to $\infty$ is rotated about the $x$-axis, giving a solid $S$. Does $S$ have finite volume?
b) If the region $R$ is rotated about the $y$-axis, do we obtain a solid with finite volume?

7.4 THE GENERAL METHOD OF CROSS SECTIONS, AND THE METHOD OF SHELLS

The method of disks can be generalized in the following way. Given a solid $S$ in space. Suppose that we can calculate the areas of the cross sections perpendicular to the $x$-axis.

[Figure: a solid S with a cross section perpendicular to the x-axis; at the right, the ith approximating cylinder.]

For each $x$ from $a$ to $b$ we let $A(x)$ be the area of the cross section. This gives a function $A$ which expresses the cross-sectional area in terms of $x$. (In our previous examples, the cross sections were all circular.) As before, we divide the interval $[a, b]$ into $n$ equal parts, and we approximate the volume by cylinders. In the figure at the right, we show only the $i$th cylinder.

We then have

$$v_n = \sum_{i=1}^{n} A(x_i)\,\Delta x,$$

and the sum on the right-hand side is a sample sum of the function $A$. Therefore, as the mesh goes to 0,

$$v_n \to \int_a^b A(x)\,dx.$$

It is plausible to suppose that as the mesh approaches 0, $v_n$ approaches the volume of $S$; and in fact this is true, for measurable solids, although we are not in a position to prove it. Thus

$$vS = \int_a^b A(x)\,dx.$$

By this method we can calculate volumes. For example, take the parabola

$$y = x^2, \qquad \text{for } 0 \le x \le 1.$$

For each $y$ from 0 to 1, we take the horizontal segment from $(0, y)$ to the point $(x, y)$ of the parabola; and using this segment as an edge, we construct a horizontal square. Thus we get a solid, as shown in the figure.

The cross-sectional areas perpendicular to the $y$-axis are given by the formula

$$A(y) = x^2 = y.$$

Therefore the volume is

$$V = \int_0^1 A(y)\,dy = \int_0^1 y\,dy = \left[ \tfrac{1}{2} y^2 \right]_0^1 = \frac{1}{2}.$$

The general method of cross sections applies, in a sense, to every volume problem. That is, it is always true that

$$vS = \int_a^b A(x)\,dx.$$

But often this formula leads to difficult calculations.

Consider, for example, the region $R$ under the graph of

$$f(x) = \cos x, \qquad 0 \le x \le \frac{\pi}{2}.$$

We rotate $R$ about the $y$-axis, getting a solid of revolution, of which only the front half is shown in the figure. We can find the volume by the cross-section method. We have

$$A(y) = \pi x^2 = \pi (\operatorname{Cos}^{-1} y)^2.$$

Therefore

$$V = \int_0^1 A(y)\,dy = \int_0^1 \pi (\operatorname{Cos}^{-1} y)^2\,dy.$$

We can calculate this by integrating by parts twice, but there is a better way. Instead of approximating the solid by thin cylinders, we approximate it by thin cylindrical shells.

First we approximate the region $R$ by rectangles with equal bases $\Delta x = \pi/2n$, as shown on the left. Then we rotate each of these rectangles about the $y$-axis, getting cylindrical shells, as shown on the right. The altitude of the $i$th shell is

$$f(x_i) = \cos x_i;$$

the outer radius is $x_i$; and the inner radius is $x_{i-1} = x_i - \Delta x$. Therefore the volume of the $i$th shell is

$$\pi x_i^2 \cos x_i - \pi (x_i - \Delta x)^2 \cos x_i = \pi (x_i^2 - x_i^2 + 2 x_i\,\Delta x - \Delta x^2) \cos x_i = 2\pi x_i \cos x_i\,\Delta x - \pi (\cos x_i)(\Delta x)^2.$$

Therefore the total volume of the inscribed solid $S_n$ is

$$vS_n = \sum_{i=1}^{n} 2\pi x_i \cos x_i\,\Delta x - \pi\,\Delta x \sum_{i=1}^{n} \cos x_i\,\Delta x.$$

We need to find out what happens as the mesh goes to 0. The first sum is a sample sum of the function $2\pi x \cos x$. Therefore

$$\sum_{i=1}^{n} 2\pi x_i \cos x_i\,\Delta x \to \int_0^{\pi/2} 2\pi x \cos x\,dx.$$

For the same reason,

$$\sum_{i=1}^{n} \cos x_i\,\Delta x \to \int_0^{\pi/2} \cos x\,dx.$$

Therefore

$$\pi\,\Delta x \sum_{i=1}^{n} \cos x_i\,\Delta x \to \pi \cdot 0 \cdot \int_0^{\pi/2} \cos x\,dx = 0.$$

Thus the entire second sum, in the expression for $vS_n$, drops out when we pass to the limit. Therefore

$$vS_n \to \int_0^{\pi/2} 2\pi x \cos x\,dx,$$

and

$$V = \int_0^{\pi/2} 2\pi x \cos x\,dx = 2\pi \left[ x \sin x + \cos x \right]_0^{\pi/2} = 2\pi \left[ \frac{\pi}{2} + 0 \right] - 2\pi [1] = \pi^2 - 2\pi.$$
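
The claim that the second sum drops out can be checked numerically. The sketch below adds up the exact shell volumes $\pi x_i^2 \cos x_i - \pi (x_i - \Delta x)^2 \cos x_i$ for the region under $\cos x$ on $[0, \pi/2]$ and separately tracks the term $\pi (\cos x_i)(\Delta x)^2$; the function name `shell_sum` is illustrative.

```python
import math

def shell_sum(n):
    """Total volume of n exact cylindrical shells, and the part that drops out."""
    dx = (math.pi / 2) / n
    total = 0.0
    leftover = 0.0
    for i in range(1, n + 1):
        x = i * dx                                    # outer radius x_i
        # Exact shell volume: outer cylinder minus inner cylinder, altitude cos x_i.
        total += math.pi * (x ** 2 - (x - dx) ** 2) * math.cos(x)
        # The term pi * cos(x_i) * dx^2 from the expansion.
        leftover += math.pi * math.cos(x) * dx ** 2
    return total, leftover

exact = math.pi ** 2 - 2 * math.pi
for n in (10, 100, 1000):
    volume, extra = shell_sum(n)
    print(n, volume, extra, exact)
```

As $n$ grows, the leftover column shrinks toward 0 while the volume column approaches $\pi^2 - 2\pi$.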

The same method applies if we rotate a region lying to the right of the $y$-axis.

If the width of the region is given by a function $h(x)$, then the volume of the $i$th cylindrical shell is

$$v_i = \pi x_i^2 h(x_i) - \pi (x_i - \Delta x)^2 h(x_i) = \pi h(x_i)(x_i^2 - x_i^2 + 2 x_i\,\Delta x - \Delta x^2) = 2\pi x_i h(x_i)\,\Delta x - \pi h(x_i)\,\Delta x^2.$$

Therefore

$$v_n = \sum_{i=1}^{n} v_i = 2\pi \sum_{i=1}^{n} x_i h(x_i)\,\Delta x - \pi\,\Delta x \sum_{i=1}^{n} h(x_i)\,\Delta x.$$

Therefore

$$V = 2\pi \int_a^b x h(x)\,dx.$$

Here again the second part of the sum drops out when we pass to the limit.
Thus the sums behave as if $v_i$ were given by the formula

$$v_i = 2\pi x_i h(x_i)\,\Delta x.$$

There is a simple reason why $v_i$ is well approximated by $2\pi x_i h(x_i)\,\Delta x$.

[Figure: the ith cylindrical shell cut and flattened into a rectangular prism of length 2πx_i, altitude h(x_i), and thickness Δx.]

If we make a vertical cut in the $i$th cylindrical shell, and flatten it out, we get a rectangular prism. The length of the prism is the circumference of the outer circle in the base of the shell. This is $2\pi x_i$. The altitude and the thickness of the prism are the same as the altitude and the thickness of the shell; these are $h(x_i)$ and $\Delta x$. Therefore the volume of the prism is exactly $2\pi x_i h(x_i)\,\Delta x$; and this ought to be a good approximation to $v_i$ when $\Delta x$ is small, because when the shell is thin, we can flatten it out without distorting it very much. As we have seen, the error goes to zero as the mesh goes to zero.
The method of shells applies to the problem that we were discussing above. We know that the volume is

$$V = \int_0^{\pi/2} 2\pi x \cos x\,dx.$$

Integrating by parts, we get

$$\int x \cos x\,dx = x \sin x + \cos x + C.$$

Therefore

$$V = \left[ 2\pi (x \sin x + \cos x) \right]_0^{\pi/2} = \pi^2 - 2\pi.$$

The same method applies more generally. Consider the region

$$R = \{(x, y) \mid 2 \le x \le 3,\ 0 \le y \le -x^2 + 5x - 6\}.$$

If $R$ is rotated about the line $x = -1$, then by the shell method the volume of the resulting solid is

$$\int_2^3 2\pi (x + 1)(-x^2 + 5x - 6)\,dx = \int_2^3 2\pi (-x^3 + 4x^2 - x - 6)\,dx = 2\pi \left[ -\frac{x^4}{4} + \frac{4x^3}{3} - \frac{x^2}{2} - 6x \right]_2^3 = \frac{7}{6}\pi.$$

If the same region is rotated about the line $x = 6$, the volume is

$$\int_2^3 2\pi (6 - x)(-x^2 + 5x - 6)\,dx,$$

which is also equal to $7\pi/6$. (Why?)
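
Both volumes can be checked with a midpoint sample sum of $2\pi\,|x - c|\,h(x)$, where $x = c$ is the axis of rotation. This is a sketch only; the helper `shell_volume` and its parameter names are illustrative and do not come from the text.

```python
import math

def shell_volume(height, a, b, axis, n=10_000):
    """Midpoint sample sum of 2*pi*|x - axis|*height(x) over [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        total += 2 * math.pi * abs(x - axis) * height(x) * dx
    return total

def height(x):
    return -x * x + 5 * x - 6     # height of R above x, for 2 <= x <= 3

v_left = shell_volume(height, 2.0, 3.0, axis=-1.0)   # rotation about x = -1
v_right = shell_volume(height, 2.0, 3.0, axis=6.0)   # rotation about x = 6
print(v_left, v_right, 7 * math.pi / 6)
```

The two results agree. One way to see why: the region is symmetric about the line $x = 5/2$, and that line is at distance $7/2$ from each of the two axes.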

In some cases there is little to choose between the cross-section method and the shell method. For example, suppose we take the region below the graph of $y = x^2$, $0 \le x \le 1$, and rotate it about the $y$-axis.

By the shell method,

$$V = \int_0^1 2\pi x \cdot x^2\,dx = \frac{\pi}{2}.$$

The horizontal cross section at height $y$ is the region between a circle of radius 1 and a circle of radius $x = \sqrt{y}$. Therefore

$$V = \int_0^1 A(y)\,dy = \int_0^1 [\pi \cdot 1^2 - \pi y]\,dy = \pi - \pi \int_0^1 y\,dy = \frac{\pi}{2},$$

as before.
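
The agreement of the two methods can be confirmed with midpoint sample sums. This is a sketch; the helper name `midpoint_integral` is illustrative.

```python
import math

def midpoint_integral(g, a, b, n=10_000):
    """Midpoint sample sum of g over [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx

# Shell method: radius x, altitude x^2.
v_shells = midpoint_integral(lambda x: 2 * math.pi * x * x ** 2, 0.0, 1.0)

# Cross sections: region between circles of radius 1 and radius sqrt(y),
# so A(y) = pi * 1^2 - pi * y.
v_sections = midpoint_integral(lambda y: math.pi * 1 ** 2 - math.pi * y, 0.0, 1.0)

print(v_shells, v_sections, math.pi / 2)
```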


PROBLEM SET 7.4

1. Let $R$ be the circular region with center at $(5, 0)$ and radius 2. $R$ is rotated about the $y$-axis. Find the volume of the resulting solid.

2. A solid of the sort described in Problem 1 is called a solid torus. More generally, suppose we have given a circular region $R$ of radius $a$, and a line $L$ in the same plane, such that the perpendicular distance from $L$ to the center of $R$ is $b$, with $b > a$. When $R$ is rotated about the line $L$, the result is a solid torus. Find its volume, in terms of $a$ and $b$.

3. Let $R$ be the square region with center at $(4, 0)$ and sides of length 2, parallel to the coordinate axes. $R$ is rotated about the $y$-axis. Find the volume of the resulting solid.

4. Let $T$ be the square region with center at $(4, 0)$ and sides of length 2, with diagonals parallel to the coordinate axes. Find the volume of the solid which results when $T$ is rotated about the $y$-axis.

5. a) The region under the graph of

$$y = \ln x, \qquad 1 \le x \le e,$$

is rotated about the $x$-axis. Find the volume, by the method of disks.
b) Now solve this problem by the method of shells.

6. For each $x$ from 0 to 1, let $R_x$ be the circular region perpendicular to the $xy$-plane, with center at the point $(x, x^2)$ and radius 1. Let $S$ be the solid formed by the regions $R_x$. Find the volume of $S$.

7. a) The region described in Problem 5a is rotated about the $y$-axis. Find the volume, by the shell method.
b) Now solve Problem 7a by the method of cross sections.

8. a) The region under the graph of $y = e^x$, $0 \le x \le 1$, is rotated about the $y$-axis. Find the volume by the method of shells.
b) Now solve Problem 8a by the cross-section method.

9. Let $C$ be the cylinder with the $y$-axis as its axis of symmetry, and radius 1. Let $S$ be the sphere with center at the origin and radius 2. Find the volume of the solid which lies inside the sphere and outside the cylinder.

10. Let $C_x$ be the cylinder of radius 1, with the $x$-axis as its axis of symmetry; and let $C_y$ be the cylinder of radius 1 with the $y$-axis as its axis of symmetry. Find the volume of the solid which lies in both $C_x$ and $C_y$.

11. Let $S$ be the sphere of radius $\sqrt{2}$ with center at the origin. Let $C$ be the cone with vertex at the origin, axis along the $y$-axis, and passing through the point $(1, 1)$. Find the volume of the solid which lies inside the sphere and inside the cone.

7.5 THE AREA OF A SURFACE OF REVOLUTION

Given a line and a curve, lying in the same plane and lying on one side of the given line. If the curve is rotated about the line, the resulting surface is called a surface of revolution. The area of such a surface can be expressed as an integral. We begin with the simplest case, in which a function-graph is rotated about the $y$-axis. Here the function $f$ is defined on a closed interval $[a, b]$ on the positive half of the $x$-axis. We assume that $f$ has a continuous derivative.

To calculate the area of the surface of revolution, we need the formula for the lateral surface of a right circular cone. Let $s$ be the slant height of the cone, and let $r$ be the radius of the base, so that the circumference of the base is $2\pi r$. We assert that the lateral surface is the same as the area of a circular sector of radius $s$, with boundary arc of length $2\pi r$. The reason is that we can make a straight cut in the cone, starting at the vertex, and then flatten out the surface, without changing its area, so that the resulting surface lies in a plane. The plane surface thus obtained is the sector shown below.

But the area of a circular sector is half the product of its radius and the length of its boundary arc. Therefore, for cones, we have

$$A = \pi r s.$$

(Note that for a "cone of altitude 0," that is, a disk, this formula gives the right answer $\pi r s = \pi r^2$.) From this we can get a formula for the lateral area of a frustum of a cone. If the larger cone (with slant height $s_2$) has area $A_2$, and the smaller cone has area $A_1$, then the area of the frustum is

$$\Delta A = A_2 - A_1 = \pi r_2 s_2 - \pi r_1 s_1.$$

Evidently

$$\frac{s_1}{r_1} = \frac{s_2}{r_2}.$$

If $s_1 = k r_1$, $s_2 = k r_2$, then

$$\Delta A = \pi k r_2^2 - \pi k r_1^2 = \pi k (r_2 - r_1)(r_2 + r_1) = \pi (s_2 - s_1)(r_2 + r_1) = 2\pi\,\frac{r_1 + r_2}{2}\,\Delta s,$$

and so

$$\Delta A = 2\pi \bar{r}\,\Delta s,$$

where

$$\bar{r} = \tfrac{1}{2}(r_1 + r_2).$$

That is, the area of the frustum is equal to its "average circumference" $2\pi \bar{r}$ times its slant height $\Delta s$.

Consider now the surface of revolution obtained by rotating the graph of $f$ about the $y$-axis. We take a net

$$N: x_0, x_1, \ldots, x_{i-1}, x_i, \ldots, x_n$$

over the interval $[a, b]$, with equal subdivisions, so that

$$x_i - x_{i-1} = \Delta x = \frac{b - a}{n} = |N|$$

for each $i$. For each $i$, let $P_i$ be the point $(x_i, f(x_i))$. These points determine a broken line $B_n$ which is an approximation of the graph of $f$. When $B_n$ is rotated about the $y$-axis, we get a surface $S_n$, with area $A_n$. By definition, the area of the surface of revolution of $f$ is

$$A = \lim_{|N| \to 0} A_n,$$

if the limit exists. (This is like the definition of arc length.)

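We shall now calculate $A_n$, and find its limit as $|N| \to 0$. Consider the $i$th segment, from $P_{i-1}$ to $P_i$. When this segment is rotated, it gives a frustum whose area is

$$a_i = 2\pi \hat{x}_i \cdot P_{i-1}P_i.$$

[Figure: the ith segment from P_{i-1} to P_i over the interval [x_{i-1}, x_i], with the points x̂_i and x̄_i marked.]

As in the calculation of arc length,

$$P_{i-1}P_i = \sqrt{\Delta x^2 + [f(x_i) - f(x_{i-1})]^2} = \sqrt{1 + \left( \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}} \right)^2}\,\Delta x = \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x,$$

where $x_{i-1} < \bar{x}_i < x_i$, as shown in the last figure.

We now have a formula for the area $A_n$ of the approximating surface:

$$A_n = \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} 2\pi \hat{x}_i\,P_{i-1}P_i = \sum_{i=1}^{n} 2\pi \hat{x}_i \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x.$$

Here $\hat{x}_i$ is the midpoint of $[x_{i-1}, x_i]$, and $\bar{x}_i$ is somewhere on the same interval. If it were true that $\bar{x}_i = \hat{x}_i$ for each $i$, then $A_n$ would be a sample sum of the function

$$g(x) = 2\pi x \sqrt{1 + f'(x)^2},$$

and we would have no problem, because for

$$\sum_{i=1}^{n} \bar{a}_i = \sum_{i=1}^{n} 2\pi \bar{x}_i \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x,$$

we know that

$$\lim_{n \to \infty} \sum_{i=1}^{n} \bar{a}_i = \int_a^b g(x)\,dx = \int_a^b 2\pi x \sqrt{1 + f'(x)^2}\,dx.$$

What we need to show, therefore, is that

$$\lim_{|N| \to 0} \left| \sum_{i=1}^{n} a_i - \sum_{i=1}^{n} \bar{a}_i \right| = 0.$$

It will then follow that

$$\lim_{|N| \to 0} \sum_{i=1}^{n} a_i = \lim_{|N| \to 0} \sum_{i=1}^{n} \bar{a}_i = \int_a^b 2\pi x \sqrt{1 + f'(x)^2}\,dx.$$

Now

$$|\hat{x}_i - \bar{x}_i| < \frac{\Delta x}{2},$$

because $\bar{x}_i$ lies on the interval $[x_{i-1}, x_i]$, whose midpoint is $\hat{x}_i$. Therefore

$$|a_i - \bar{a}_i| = \left| 2\pi \hat{x}_i \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x - 2\pi \bar{x}_i \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x \right| = 2\pi \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x\,|\hat{x}_i - \bar{x}_i| < \pi \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x^2.$$

Therefore

$$\left| \sum_{i=1}^{n} a_i - \sum_{i=1}^{n} \bar{a}_i \right| \le \sum_{i=1}^{n} |a_i - \bar{a}_i| < \sum_{i=1}^{n} \pi \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x^2 = \Delta x \sum_{i=1}^{n} \pi \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x.$$

Now

$$\lim_{|N| \to 0} \Delta x \sum_{i=1}^{n} \pi \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x = 0 \cdot \int_a^b \pi \sqrt{1 + f'(x)^2}\,dx = 0.$$

Therefore $\left| \sum a_i - \sum \bar{a}_i \right|$ is squeezed to 0, which was to be proved.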

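Let us try this formula on some problems to which we already know the answers.

[Figure: the segment from (0, b) to (a, 0), rotated about the y-axis to give a cone.]

A cone is the surface of revolution of a segment. Here

$$f(x) = b - \frac{b}{a}x, \qquad f'(x) = -\frac{b}{a},$$

and

$$A = \int_0^a 2\pi x \sqrt{1 + \frac{b^2}{a^2}}\,dx = \frac{2\pi}{a} \sqrt{a^2 + b^2} \int_0^a x\,dx = \pi a \sqrt{a^2 + b^2},$$

which is the right answer.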

Consider next a quadrant of a circle of radius $a$, rotated about the $y$-axis. In this case,

$$f(x) = \sqrt{a^2 - x^2} \qquad (0 \le x \le a),$$

$$f'(x) = \frac{-x}{\sqrt{a^2 - x^2}},$$

$$1 + f'(x)^2 = 1 + \frac{x^2}{a^2 - x^2} = \frac{a^2}{a^2 - x^2}.$$

Therefore the area is

$$A = \int_0^a 2\pi x\,\frac{a}{\sqrt{a^2 - x^2}}\,dx = 2\pi a \left[ -\sqrt{a^2 - x^2} \right]_0^a = 2\pi a^2.$$

It follows that the total area of a sphere of radius $a$ is $4\pi a^2$. This is the standard formula.
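
The definition of the area as a limit of the areas $A_n$ can be tried out numerically. The sketch below adds up the frustum areas $2\pi \cdot (\text{midpoint}) \cdot P_{i-1}P_i$ for a broken line rotated about the $y$-axis. The dimensions (a cone with $a = 3$, $b = 4$, and a quadrant of radius 3) and the function name are assumptions chosen for illustration.

```python
import math

def frustum_area_about_y_axis(f, a, b, n):
    """Area A_n swept out when the broken line through the points
    (x_i, f(x_i)) is rotated about the y-axis."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        x_prev, x_next = a + (i - 1) * dx, a + i * dx
        chord = math.hypot(x_next - x_prev, f(x_next) - f(x_prev))
        x_mid = 0.5 * (x_prev + x_next)         # average radius of the frustum
        total += 2 * math.pi * x_mid * chord    # average circumference times slant height
    return total

radius, altitude = 3.0, 4.0

# Cone: the graph is already a segment, so A_n matches the formula for every n.
cone = frustum_area_about_y_axis(
    lambda x: altitude - (altitude / radius) * x, 0.0, radius, 50)

# Quadrant of a circle; max(..., 0.0) guards against rounding at x = radius.
hemisphere = frustum_area_about_y_axis(
    lambda x: math.sqrt(max(radius ** 2 - x ** 2, 0.0)), 0.0, radius, 20_000)

print(cone, math.pi * radius * math.hypot(radius, altitude))
print(hemisphere, 2 * math.pi * radius ** 2)
```

For the cone the two printed numbers agree up to rounding; for the quadrant, $A_n$ approaches $2\pi a^2$ from below as $n$ increases.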
It is harder to find the area when we rotate a function-graph about the $x$-axis instead of the $y$-axis.

Given a nonnegative function $f$, on an interval $[a, b]$. As before, take a net

$$N: x_0, x_1, \ldots, x_n$$

over $[a, b]$, with equal subdivisions, so that for each $i$,

$$x_i - x_{i-1} = \Delta x = \frac{b - a}{n} = |N|.$$

As before, we approximate the graph by a broken line $B_n$. Then we rotate $B_n$ about the $x$-axis, getting a surface $S_n$, with area $A_n$. We define the area of the surface of revolution to be $\lim_{|N| \to 0} A_n$, if such a limit exists. We proceed to calculate:

$$A_n = \sum_{i=1}^{n} a_i,$$

where $a_i$ is the area of the $i$th frustum, shown in the figure. As before,

$$P_{i-1}P_i = \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x,$$

where $x_{i-1} < \bar{x}_i < x_i$. But when we rotate the chord from $P_{i-1}$ to $P_i$ about the $x$-axis, the "average circumference" is $2\pi \bar{y}_i$, where

$$\bar{y}_i = \tfrac{1}{2}[f(x_{i-1}) + f(x_i)].$$

Obviously $\bar{y}_i$ is between $f(x_{i-1})$ and $f(x_i)$, because $\bar{y}_i$ is their average. By the no-jump theorem of Section 5.7,

$$\bar{y}_i = f(x_i')$$

for some $x_i'$ on the interval $[x_{i-1}, x_i]$. Therefore

$$A_n = \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} 2\pi f(x_i') \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x.$$

If it were true that $x_i' = \bar{x}_i$ for each $i$, then the sum on the right-hand side would be a sample sum of the function

$$g(x) = 2\pi f(x) \sqrt{1 + f'(x)^2}.$$

As it stands, it is very close to being a sample sum. The idea is that when $|N| \approx 0$, we have $x_i' \approx \bar{x}_i$ and $f(x_i') \approx f(\bar{x}_i)$ for each $i$, so that

$$\sum_{i=1}^{n} 2\pi f(x_i') \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x \approx \sum_{i=1}^{n} 2\pi f(\bar{x}_i) \sqrt{1 + f'(\bar{x}_i)^2}\,\Delta x \approx \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2}\,dx.$$

At the end of the chapter, these ideas will be turned into a proof. Meanwhile let us look at some applications of the formula

$$A = \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2}\,dx.$$

1) If $f(x) = k$, on $[a, b]$, then the surface of revolution is a cylinder. By the integral formula,

$$A = \int_a^b 2\pi k \sqrt{1 + 0^2}\,dx = 2\pi k (b - a),$$

which is the right answer.

2) A sphere of radius $a$ is the surface of revolution of a semicircle of radius $a$. Here

$$f(x) = \sqrt{a^2 - x^2}, \qquad x^2 + f^2 = a^2.$$

Hence

$$2x + 2ff' = 0, \qquad f' = -\frac{x}{f}, \qquad f'^2 = \frac{x^2}{f^2}.$$

Therefore

$$1 + f'^2 = 1 + \frac{x^2}{f^2} = \frac{f^2 + x^2}{f^2} = \frac{a^2}{f^2},$$

and

$$A = \int_{-a}^{a} 2\pi f(x) \sqrt{\frac{a^2}{f(x)^2}}\,dx = \int_{-a}^{a} 2\pi a\,dx = 4\pi a^2.$$
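
The two applications can be checked against the defining limit, with the "average circumference" now taken from the average of the two heights. As before this is a sketch; the sample dimensions ($k = 2$ on $[1, 4]$, and a sphere of radius 1.5) and the function name are assumptions.

```python
import math

def frustum_area_about_x_axis(f, a, b, n):
    """Area A_n swept out when the broken line through the points
    (x_i, f(x_i)) is rotated about the x-axis."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        x_prev, x_next = a + (i - 1) * dx, a + i * dx
        y_prev, y_next = f(x_prev), f(x_next)
        chord = math.hypot(x_next - x_prev, y_next - y_prev)
        y_avg = 0.5 * (y_prev + y_next)         # average radius of the frustum
        total += 2 * math.pi * y_avg * chord
    return total

# 1) Constant function: a cylinder of radius k over [a, b].
k, a, b = 2.0, 1.0, 4.0
cylinder = frustum_area_about_x_axis(lambda x: k, a, b, 10)

# 2) Semicircle of radius r over [-r, r]: a sphere.
r = 1.5
sphere = frustum_area_about_x_axis(
    lambda x: math.sqrt(max(r * r - x * x, 0.0)), -r, r, 20_000)

print(cylinder, 2 * math.pi * k * (b - a))
print(sphere, 4 * math.pi * r * r)
```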

PROBLEM SET 7.5


1. Let $C_a$ be the circle with center at the origin and radius $a$, and let $A$ be the arc of $C_a$ lying above the interval $[-a/2, a/2]$. $A$ is rotated about the $x$-axis. Find the area of the resulting surface. What proportion is this, of the total area of the sphere?

2. The entire circle $C_a$ is rotated about the $x$-axis, giving a sphere of radius $a$. $E_b$ and $E_c$ are two planes, perpendicular to the $x$-axis, at $x = b$ and $x = c$; and $S$ is the part of the sphere that lies between them. Find the area of $S$, in terms of $a$, $b$, and $c$. The form of your answer ought to suggest a somewhat surprising theorem which can be stated without the use of formulas. What is the theorem?

3. The circle with center at $(b, 0)$ and radius $a$, $a < b$, is rotated about the $y$-axis. The resulting surface is called a torus. Find its area.

4. The square with corners at the points $(a, 0)$, $(a + k, k)$, $(a + k, -k)$, and $(a + 2k, 0)$ is rotated about the $y$-axis. (Here $0 < k < a$.) Find the area of the resulting surface.

5. Find the volume of the solid obtained when the corresponding square region is rotated about the $y$-axis.

6. The same square is rotated about the line $x = a + 2k$. Find the surface area.

7. The square region is rotated about the line $x = a + 2k$. Find the volume.

8. The square with center at $(a, 0)$ and sides of length $2k$ parallel to the coordinate axes is rotated about the line $x = 2a$. Find the area of the resulting surface. (Here $0 < k < a$.)

9. Find the volume when the corresponding square region is rotated about the line $y = k/2$.

10. Consider the curve consisting of (a) the segment from $(0, 0)$ to $(a, 0)$, (b) the segment from $(0, 1)$ to $(a, 1)$, and (c) the semicircle, pointing outward, with endpoints at $(a, 0)$ and $(a, 1)$. This curve is to be rotated about the $y$-axis. For what value of $a$ is it true that the total area of the resulting surface is equal to 15?

11. For each $a$, let $S_a$ be the area of the surface described in Problem 10, and let $V_a$ be the volume of the solid that it encloses. What value of $a$ maximizes the ratio $V_a/S_a$?

12. The circle with center at $(b, b)$ and radius $a$ is rotated about the line

$$x + y = 1.$$

Here $a$ and $b$ are both positive, and the circle does not intersect the line. Find (a) the area of the resulting surface, and (b) the volume of the solid that it encloses.

13. Same question, for the circle with center at $(2, 10)$ and radius 1 and the line $x + y = 2$. (The only natural solutions of this, on the basis of the theory that we have so far, are rather clumsy. This suggests that some new ideas are needed.)

14. The graph of

$$y = 2x^2,$$

from $x = -1$ to $x = 1$, is rotated about the line $x = 5$. Find the area of the resulting surface.

15. If the same surface is rotated about the line $x = 4$, would the area of the resulting surface be greater, or would it be less, than the answer to Problem 14? Get a plausible answer to this, and justify it as well as you can.

16. The graph of $y = \tfrac{1}{2}(e^x + e^{-x})$, $0 \le x \le 1$, is rotated about the $x$-axis. Find the area of the resulting surface.

17. The same graph is rotated about the $y$-axis. Find the area of the resulting surface.

18. Let $G$ be the graph of $f(x) = \sin x$, from $x = 0$ to $x = \pi/2$. $G$ is first rotated about the line $x + y = 4$, and then about the line $x + y = 5$. Which of the resulting surfaces has the larger area? Why? (A right answer, with a plausibility argument, is acceptable as an answer to this one. It is possible, however, to give a proof of the right answer, without calculating the area of either of the surfaces. That is, you can prove an inequality of the form $A < B$, without calculating either $A$ or $B$.)

7.6 MOMENTS AND CENTROIDS. THE THEOREMS OF PAPPUS

The ideas in this section are mathematical descriptions of physical ideas. Given a finite set of "point masses" $m_i$, at the points $P_i = (x_i, y_i)$ in a coordinate plane, the moment (of the system) about the y-axis is

$$M_y = \sum_{i=1}^{n} x_i m_i.$$

The left-hand figure below shows the general case.

[Figure: left, point masses at $P_1, P_2, P_3, \ldots, P_n$ in the plane; right, an example with two point masses.]

In the example on the right, we have:

$$M_y = x_1 m_1 + x_2 m_2 = 2 + (-2) = 0.$$

Physically speaking, this means that if the plane is horizontal, resting on a knife-edge along the y-axis, it will balance. The formula $\sum m_i x_i$ for $M_y$ makes it plain that the effect of each point mass depends only on the product $m_i x_i$; if we divide $m_i$ by 2, and double $x_i$, then the moment $M_y$ is unchanged.
Similarly, the moment about the x-axis, of our finite system of point masses, is defined to be

$$M_x = \sum_{i=1}^{n} y_i m_i.$$

The total mass of all the particles in the system is denoted by

$$m = \sum_{i=1}^{n} m_i.$$

The centroid of the system is defined to be the point $P = (\bar{x}, \bar{y})$ such that

$$M_y = \bar{x} m \quad\text{and}\quad M_x = \bar{y} m.$$

That is,

$$\bar{x} = \frac{M_y}{m}, \qquad \bar{y} = \frac{M_x}{m}.$$

Thus if we concentrate the entire mass of the system at P, the moments about the
x-axis and y-axis are unchanged.
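For example, if we have $m_1 = 2$ at $P_1 = (1, 2)$ and $m_2 = 3$ at $P_2 = (2, 5)$, then

$$M_y = 2 + 6 = 8, \qquad M_x = 4 + 15 = 19, \qquad m = \sum m_i = 5,$$

$$8 = \bar{x} \cdot 5, \qquad 19 = \bar{y} \cdot 5, \qquad \bar{x} = \tfrac{8}{5}, \qquad \bar{y} = \tfrac{19}{5}.$$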
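The arithmetic in this example can be checked mechanically. The following is a minimal sketch in Python; the function name `centroid` and its argument layout are illustrative choices, not part of the text.

```python
def centroid(masses, points):
    """Moments and centroid of a finite system of point masses.

    masses: list of m_i; points: list of (x_i, y_i) pairs.
    Returns (M_y, M_x, m, (x_bar, y_bar)).
    """
    m = sum(masses)                                        # total mass
    M_y = sum(mi * x for mi, (x, _) in zip(masses, points))  # moment about the y-axis
    M_x = sum(mi * y for mi, (_, y) in zip(masses, points))  # moment about the x-axis
    return M_y, M_x, m, (M_y / m, M_x / m)


# The example from the text: m1 = 2 at (1, 2), m2 = 3 at (2, 5).
M_y, M_x, m, (x_bar, y_bar) = centroid([2, 3], [(1, 2), (2, 5)])
print(M_y, M_x, m, x_bar, y_bar)
```

The same function can be used to see that halving a mass and doubling its x-coordinate leaves $M_y$ unchanged.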

The above discussion does not prove that $M_y$, $M_x$, and $P = (\bar{x}, \bar{y})$ have any physical significance; only experiments can prove this. The fact, however, is that the physical conditions for equilibrium are described by moments and centroids.
Let us now consider how these ideas can be applied to a region $R$ in the xy-plane. We shall think of $R$ as a very thin sheet of homogeneous material, so that the mass per unit area is constant, say, = 1.
[Figure: the region $R$ over the interval $[a, b]$, with a narrow rectangle of height $h(x_i)$ at $x_i$.]

Suppose that we take a net over the interval $[a, b]$, as in the figure; for each $x$, we let $h(x)$ be the height of the cross section of $R$ at $x$, and we let $h_i = h(x_i)$. We use equal subdivisions, so that

$$x_i - x_{i-1} = \Delta x_i = \Delta x$$

for each $i$. Then $h_i \, \Delta x$ is the area of the rectangle in the figure. The rectangle is narrow, and so its moment about the y-axis should be approximately $x_i h_i \, \Delta x$.

If we approximate the region $R$ by a finite set of such narrow rectangles, then the moment of $R$ about the y-axis ought to be approximately

$$\sum_{i=1}^{n} x_i h_i \, \Delta x = \sum_{i=1}^{n} x_i h(x_i) \, \Delta x,$$

and the approximation ought to get better as the mesh $\Delta x$ decreases. This is the idea of the following definition.


Definition. Let $R$ be the region lying between the graphs of two continuous functions $f_1$ and $f_2$, on an interval $[a, b]$, with $f_1 \le f_2$, and let

$$h(x) = f_2(x) - f_1(x).$$

Then the moment of $R$ about the y-axis is

$$M_y = \int_a^b x h(x) \, dx.$$

The definition of $M_x$ is similar. Here (see the right-hand figure)

$$R = \{(x, y) \mid c \le y \le d, \; g_1(y) \le x \le g_2(y)\}, \qquad w(y) = g_2(y) - g_1(y),$$

and by definition,

$$M_x = \int_c^d y w(y) \, dy.$$

Since the total mass of $R$ is its area

$$A = \int_a^b h(x) \, dx = \int_c^d w(y) \, dy,$$

it is natural to define the centroid of $R$ as the point $P = (\bar{x}, \bar{y})$ such that

$$M_y = \bar{x} A, \qquad M_x = \bar{y} A.$$

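For example, consider a quadrant of a circle of radius $a$.

[Figure: the quadrant under $y = \sqrt{a^2 - x^2}$, $0 \le x \le a$.]

Here

$$M_y = \int_0^a x \sqrt{a^2 - x^2} \, dx = \left[ -\tfrac{1}{3}(a^2 - x^2)^{3/2} \right]_0^a = \frac{a^3}{3}.$$

Obviously $A = \pi a^2 / 4$, and so

$$\frac{a^3}{3} = \bar{x} \cdot \frac{\pi a^2}{4}.$$

Therefore

$$\bar{x} = \frac{4}{3\pi} a.$$

By symmetry, interchanging $x$ and $y$, we get:

$$\bar{y} = \bar{x} = \frac{4}{3\pi} a.$$

The moment about the line $x = x_0$ is defined to be

$$M_{x = x_0} = \int_a^b (x - x_0) h(x) \, dx,$$

and the moment about the line $y = y_0$ is

$$M_{y = y_0} = \int_c^d (y - y_0) w(y) \, dy.$$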
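The value $\bar{x} = 4a/3\pi$ for the quarter-circle can be checked numerically from the definitions $M_y = \int x h(x)\,dx$ and $A = \int h(x)\,dx$. The sketch below uses a plain midpoint rule; the helper names and the number of subdivisions are assumptions made for illustration, and the result is only an approximation.

```python
import math


def midpoint_integral(g, lo, hi, n=20000):
    """Approximate the integral of g over [lo, hi] by the midpoint rule."""
    dx = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * dx) for i in range(n)) * dx


def quadrant_centroid_x(a):
    """x-coordinate of the centroid of the quarter disk of radius a."""
    def h(x):
        # height of the cross section of the quadrant at x
        return math.sqrt(max(a * a - x * x, 0.0))

    M_y = midpoint_integral(lambda x: x * h(x), 0.0, a)  # moment about the y-axis
    A = midpoint_integral(h, 0.0, a)                      # area of the quadrant
    return M_y / A


a = 2.0
print(quadrant_centroid_x(a), 4 * a / (3 * math.pi))
```

The two printed numbers should agree to several decimal places.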

It is now easy to see that

Theorem 1. $M_{x = \bar{x}} = 0 = M_{y = \bar{y}}$.

Proof. For any $x_0$,

$$M_{x = x_0} = \int_a^b (x - x_0) h(x) \, dx = \int_a^b x h(x) \, dx - x_0 \int_a^b h(x) \, dx = M_y - x_0 A.$$

This is 0 for $x_0 = \bar{x}$. The proof of the other half of the theorem is the same. In fact, the equation

$$M_{x = x_0} = M_y - x_0 A$$

shows that the converse of Theorem 1 is also true.

Theorem 2. If $M_{x = x_0} = 0$, then $x_0 = \bar{x}$; and if $M_{y = y_0} = 0$, then $y_0 = \bar{y}$.

Centroids are easy to find for regions which are symmetric, in the sense now to be defined. Two points $P$ and $P'$ are symmetric across the line $L$ if $L$ is the perpendicular bisector of the segment between them. In the left-hand figure below, we say that $P'$ is the point symmetrically across $L$ from $P$. A figure is symmetric about a line $L$ if for each point $P$ of the figure, $P'$ also lies in the figure. The figure may be either a region or a curve. For example, a circle is symmetric about any line through its center, and the interior of a circle has the same property.

[Figure: points $P$ and $P'$ symmetric across a line $L$.]

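It is easy to see that if $R$ is symmetric about the y-axis, then $\bar{x} = 0$. In the figure on the left, $h(x)$ is an even function, with $h(-x) = h(x)$. Therefore $x h(x)$ is an odd function, with $(-x) h(-x) = -[x h(x)]$. Therefore

$$M_y = \int_{-a}^{a} x h(x) \, dx = 0,$$

and $\bar{x} = 0$.

[Figure: left, a region symmetric about the y-axis; right, a region symmetric about the line $x = x_0$, over the interval from $x_0 - k$ to $x_0 + k$.]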

More generally, as in the right-hand figure, we have:

Theorem 3. If $R$ is symmetric about the line $x = x_0$, then $\bar{x} = x_0$.

Proof.

$$M_{x = x_0} = \int_{x_0 - k}^{x_0 + k} (x - x_0) h(x) \, dx = \int_{x_0 - k}^{x_0 + k} \phi(x) \, dx.$$

By symmetry, $h(x_0 - t) = h(x_0 + t)$ for every $t$; and so

$$\phi(x_0 - t) = [(x_0 - t) - x_0] h(x_0 - t) = -t h(x_0 + t) = -\phi(x_0 + t).$$

Therefore the graph of $\phi$ must be like the graph shown below.

[Figure: graph of $\phi$, odd about $x_0$, on the interval from $x_0 - k$ to $x_0 + k$.]

Therefore

$$\int_{x_0 - k}^{x_0} \phi(x) \, dx = -\int_{x_0}^{x_0 + k} \phi(x) \, dx,$$

and

$$M_{x = x_0} = \int_{x_0 - k}^{x_0 + k} \phi(x) \, dx = 0.$$

It follows that $x_0 = \bar{x}$.

In this proof, all that we have used is the assumption that

$$h(x_0 - t) = h(x_0 + t).$$

This condition may hold for regions which are not symmetric, as below.

[Figure: a non-symmetric region whose cross-section heights satisfy $h(x_0 - t) = h(x_0 + t)$.]

And interchanging $x$ and $y$, we get the following theorem.

Theorem 4. If $R$ is symmetric about the line $y = y_0$, then $\bar{y} = y_0$.

These ideas have the following geometric consequence:

Theorem 5 (Pappus' theorem, for volumes). If a region is rotated about a line not
intersecting it, then the volume of the resulting solid is equal to the area of the region
times the circumference of the circle described by the centroid.
That is, if the region below is rotated about the y-axis, then

$$V = 2\pi \bar{x} A.$$

Proof. By the method of shells,

$$V = \int_a^b 2\pi x h(x) \, dx.$$

Therefore

$$V = 2\pi M_y = 2\pi \bar{x} A,$$

because $\bar{x}$ was defined by the equation $M_y = \bar{x} A$.

Pappus' theorem can be applied in two ways. If we know $\bar{x}$ and $A$, we can compute $V = 2\pi \bar{x} A$; and if we know $V$ and $A$, we can solve for $\bar{x} = V / 2\pi A$. For example, consider a circular region $R$, of radius $a$, with center at the point $(b, 0)$, $b > a$. When $R$ is rotated about the y-axis, we get a solid which is called a solid torus. (The surface of the solid is called a torus.) By Pappus' theorem, we get

$$V = 2\pi b \cdot \pi a^2 = 2\pi^2 a^2 b.$$

[Figure: a semicircular region of radius $a$ to the right of the y-axis, with centroid $P$ to be found.]

We can use the theorem in reverse to find the centroid of a semicircular region. If the region is rotated about the y-axis, we get a sphere of radius $a$, with volume

$$V = \tfrac{4}{3} \pi a^3.$$

Obviously $A = \tfrac{1}{2} \pi a^2$. Therefore

$$\tfrac{4}{3} \pi a^3 = 2\pi \bar{x} \cdot \tfrac{1}{2} \pi a^2,$$

and

$$\bar{x} = \frac{4}{3\pi} a.$$
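As a rough numerical check of the torus formula $V = 2\pi^2 a^2 b$, one can compute the same volume by the method of shells, which is what the proof of Pappus' theorem uses. This is only a sketch: the midpoint-rule helper, the function names, and the sample values of $a$ and $b$ are assumptions for illustration.

```python
import math


def midpoint_integral(g, lo, hi, n=20000):
    """Approximate the integral of g over [lo, hi] by the midpoint rule."""
    dx = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * dx) for i in range(n)) * dx


def torus_volume_shells(a, b):
    """Volume of the solid torus: disk of radius a centered at (b, 0), b > a,
    rotated about the y-axis, computed as the integral of 2*pi*x*h(x)."""
    def h(x):
        # vertical height of the disk's cross section at x
        return 2.0 * math.sqrt(max(a * a - (x - b) ** 2, 0.0))

    return midpoint_integral(lambda x: 2.0 * math.pi * x * h(x), b - a, b + a)


def torus_volume_pappus(a, b):
    """Area of the disk times the circumference described by its centroid."""
    return (2.0 * math.pi * b) * (math.pi * a * a)


print(torus_volume_shells(1.0, 3.0), torus_volume_pappus(1.0, 3.0))
```

The two values should agree closely, since the centroid of the disk is its center $(b, 0)$ by symmetry.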
These ideas apply also to arcs. We shall think of an arc as a thin homogeneous wire whose mass per unit length is constant, say, = 1. Suppose that the arc is the graph of a function $f$, on an interval $[a, b]$. As usual, we take a net over $[a, b]$, with equal subdivisions. The arc length over the interval $[x_{i-1}, x_i]$ is

$$s_i = \int_{x_{i-1}}^{x_i} \sqrt{1 + f'(x)^2} \, dx.$$

Now

$$s_i \approx \sqrt{1 + f'(x_i)^2} \, \Delta x;$$

the moment of this little arc about the y-axis ought to be approximately $x_i s_i$; and so the moment of the whole graph ought to be

$$M_y \approx \sum_{i=1}^{n} x_i \sqrt{1 + f'(x_i)^2} \, \Delta x.$$

Definitions. Given the function $f$, with a continuous derivative $f'$, on $[a, b]$, the moment of the graph about the y-axis is

$$M_y = \int_a^b x \sqrt{1 + f'(x)^2} \, dx;$$

and the moment about the line $x = x_0$ is

$$M_{x = x_0} = \int_a^b (x - x_0) \sqrt{1 + f'(x)^2} \, dx.$$

Similarly, we state the following:

Definitions. The moment of the graph of $f$ about the x-axis is

$$M_x = \int_a^b f(x) \sqrt{1 + f'(x)^2} \, dx,$$

the moment about the line $y = y_0$ is

$$M_{y = y_0} = \int_a^b (f(x) - y_0) \sqrt{1 + f'(x)^2} \, dx;$$

and the centroid of the graph is the point $P = (\bar{x}, \bar{y})$ for which

$$M_y = \bar{x} L, \qquad M_x = \bar{y} L,$$

where $L$ is the total arc length.


Our previous theorems for regions now have analogous forms for arcs, as follows.

Theorem 6. If the graph of $f$ is symmetric about a line $x = x_0$ (or $y = y_0$) then this line contains the centroid.

Theorem 7. If the graph of $f$ is rotated about a horizontal or vertical line not intersecting the graph, then the area of the resulting surface of revolution is equal to the length of the arc times the circumference of the circle described by the centroid.

[Figure: an arc rotated about the y-axis.]

For example, if we rotate about the y-axis, then the area of the resulting surface is

$$S = \int_a^b 2\pi x \sqrt{1 + f'(x)^2} \, dx = 2\pi M_y = 2\pi \bar{x} L,$$

by definition of $\bar{x}$. The proof of the theorem in the other cases is similar.


Throughout this section, we have used a fixed coordinate system to define and investigate moments and centroids of regions and arcs. It is a fact, however, that moments and centroids do not depend on the choice of a coordinate system; they depend only on the regions and the arcs. In particular, any line of symmetry (horizontal, vertical, or sloping) must contain the centroid. You may use this fact in the problem set below.

PROBLEM SET 7.6

1. Let $A$, $B$, and $C$ be the points $(0, 0)$, $(a, 0)$, and $(b, c)$, $a, c > 0$. At each of these points there is a particle of mass 1. Find the centroid of the resulting system.

2. Suppose at the points $A$, $B$, and $C$ of Problem 1 there are particles of mass 1, 2, and 3 respectively. Find the centroid of the resulting system.

3. A median of a triangle is a segment between a vertex and the midpoint of the opposite side. Show that every median of the triangle described in Problem 1 passes through the centroid.

4. Now consider the triangular region $R$ determined by the same points $A$, $B$, and $C$. Find the centroid of $R$.

5. The region $R$ is rotated about the x-axis. Find the volume.

6. $R$ is rotated about the y-axis. Find the volume.

7. The figure formed by the sloping sides of the triangle is rotated about the x-axis. Find the area of the resulting surface.

8. Same question, for rotation about the y-axis, assuming $b \ge 0$.

9. A trapezoid has vertices $(0, 0)$, $(a, c)$, $(b - a, c)$, $(b, 0)$ (with $b > a > 0$ and $c > 0$). Find the centroid of the region $T$ bounded by this trapezoid.

10. The region $T$ is rotated about the x-axis. Find the volume.

11. The region $T$ is rotated about the y-axis. Find the volume.

12. The figure formed by the four sides of the trapezoid is rotated about the y-axis. Find the surface area.

13. The circle with center at $(b, 0)$ and radius $a$, with $0 < a < b$, is rotated about the y-axis. Find the area of the resulting torus.

14. Let the arc $A$ be the portion of the circle with center at the origin and radius $a$ which lies in the first quadrant. Find the centroid of $A$.

15. The square with corners at $(a, 0)$, $(a + k, k)$, $(a + k, -k)$, $(a + 2k, 0)$ is rotated about the y-axis. (Here $0 < k < a$.) Find the area of the resulting surface.

16. Find the volume of the solid obtained if the corresponding square region is rotated.

17. The same square is rotated about the line x = a + 1k. Find the surface area.

18. The square region is rotated about the line $x = a + 2k$. Find the volume.

19. Consider the curve consisting of: (a) the segment from $(0, 0)$ to $(a, 0)$; (b) the segment from $(0, 1)$ to $(a, 1)$; and (c) a semicircle, pointing outward, with endpoints at $(a, 0)$ and $(a, 1)$. This curve is to be rotated about the y-axis. For what value of $a$ is it true that the total area of the resulting surface is equal to 15?

20. For each $a$, let $S_a$ be the area of the surface described in Problem 19, and let $V_a$ be the volume of the solid that it encloses. What value of $a$ maximizes the ratio $V_a/S_a$?

7.7

IMPROPER INTEGRALS

The definite integral is defined as a limit of sample sums as the mesh of the net approaches 0. This limit exists if the integrand $f$ is continuous. But this definition of the integral does not apply to the function

$$f(x) = \frac{1}{\sqrt{x}}$$

on the half-open interval from 0 to 1. It really is the half-open interval

$$(0, 1] = \{x \mid 0 < x \le 1\}$$

that we are dealing with, because at $x = 0$ the function is not defined. On this half-open interval the function is unbounded. Therefore, for every net over $(0, 1]$ we can form a sample sum as large as we please, by taking the first sample point close to 0. Thus, the sample sum is large when $x$ is small, and so the sample sums do not approach a limit as the mesh approaches 0.

Nevertheless, we can extend the definition of the integral in such a way that our problem has an answer. The function $f(x) = 1/\sqrt{x}$ is defined and continuous on every closed interval $[a, 1]$, where $a > 0$. Therefore $\int_a^1 (1/\sqrt{x}) \, dx$ is well defined.

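[Figure: graph of $f(x) = 1/\sqrt{x}$.]

We define a new kind of integral by saying that

$$\int_0^1 \frac{dx}{\sqrt{x}} = \lim_{a \to 0^+} \int_a^1 \frac{dx}{\sqrt{x}},$$

if the indicated limit on the right exists. (We write $a \to 0^+$, because $a$ takes on only positive values.) In the present case, the limit exists and is finite:

$$\int_a^1 x^{-1/2} \, dx = \left[ 2x^{1/2} \right]_a^1 = 2 - 2\sqrt{a}.$$

Therefore

$$\lim_{a \to 0^+} \int_a^1 \frac{dx}{\sqrt{x}} = \lim_{a \to 0^+} \left[ 2 - 2\sqrt{a} \right] = 2.$$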

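There are similar-looking problems for which the limit is infinite. For example,

$$\int_0^1 \frac{dx}{x^2} = \lim_{a \to 0^+} \int_a^1 \frac{dx}{x^2},$$

if the limit exists. In this case,

$$\int_a^1 \frac{dx}{x^2} = \left[ -\frac{1}{x} \right]_a^1 = -1 + \frac{1}{a} \qquad (a > 0),$$

and so

$$\lim_{a \to 0^+} \int_a^1 \frac{dx}{x^2} = \infty,$$

in the sense defined in Section 5.3. We abbreviate this by writing

$$\int_0^1 \frac{dx}{x^2} = \infty.$$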
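The contrast between these two examples is easy to see by tabulating the closed forms $2 - 2\sqrt{a}$ and $-1 + 1/a$ for small positive $a$. The short Python sketch below does only that; the function names are illustrative, and it evaluates the formulas derived above rather than integrating numerically.

```python
import math


def integral_inv_sqrt(a):
    """Value of the integral of 1/sqrt(x) from a to 1, for 0 < a <= 1."""
    return 2.0 - 2.0 * math.sqrt(a)


def integral_inv_square(a):
    """Value of the integral of 1/x**2 from a to 1, for 0 < a <= 1."""
    return -1.0 + 1.0 / a


# As a shrinks, the first column settles near 2; the second grows without bound.
for a in (1e-2, 1e-4, 1e-6, 1e-8):
    print(a, integral_inv_sqrt(a), integral_inv_square(a))
```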

The same test can be applied at any point where the function "blows up," as long as there is only one such point, at an endpoint of the interval. For example, consider

$$\int_0^{\pi/2} \sec x \, dx = \lim_{a \to \pi/2^-} \int_0^a \sec x \, dx,$$

where the minus sign means that $a \to \pi/2$ through values less than $\pi/2$.

[Figure: graph of $\sec x$ on the interval from 0 to $\pi/2$.]

Now

$$\int_0^a \sec x \, dx = \left[ \ln |\sec x + \tan x| \right]_0^a = \ln |\sec a + \tan a| - \ln |1 + 0| = \ln |\sec a + \tan a|.$$

As $a \to \pi/2^-$,

$$\sec a = \frac{1}{\cos a} \to \infty \quad\text{and}\quad \tan a = \frac{\sin a}{\cos a} \to \infty.$$

Therefore $\ln [\sec a + \tan a] \to \infty$ and

$$\lim_{a \to \pi/2^-} \int_0^a \sec x \, dx = \infty.$$

We use the same method to define and evaluate such integrals as

$$\int_1^\infty \frac{dx}{x^2}.$$

Here the integration is supposed to be carried out all the way to the right, starting at $x = 1$. Again our definition of the definite integral (as the limit of the sample sums) does not apply, and so we define the improper integral as a limit:

$$\int_1^\infty \frac{dx}{x^2} = \lim_{a \to \infty} \int_1^a \frac{dx}{x^2},$$

if the limit on the right exists. For the function

$$g(a) = \int_1^a \frac{dx}{x^2},$$

the limit exists and is finite:

$$\int_1^a \frac{dx}{x^2} = \left[ -\frac{1}{x} \right]_1^a = -\frac{1}{a} + 1;$$

and so

$$\lim_{a \to \infty} \int_1^a \frac{dx}{x^2} = \lim_{a \to \infty} \left[ 1 - \frac{1}{a} \right] = 1.$$

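Here we have a second example of an unbounded figure with a finite area.

[Figure: the region under the graph of $f(x) = 1/x^2$, to the right of $x = 1$.]

For very similar-looking functions, the integral from 1 to $\infty$ may be infinite. For example,

$$\int_1^\infty \frac{dx}{x} = \lim_{a \to \infty} \int_1^a \frac{dx}{x} = \lim_{a \to \infty} \left[ \ln x \right]_1^a = \lim_{a \to \infty} \ln a = \infty.$$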

Integrals of the type that we have been discussing here are called improper integrals, of the first and second kind respectively. The two kinds can occur in combination. For example,

$$\int_0^\infty \frac{dx}{\sqrt{x}(1 + x)}$$

is improper in two ways: the function blows up at the lower limit, and also the upper limit is $\infty$. Thus, for the integral to be finite, the two limits

$$\lim_{a \to 0^+} \int_a^1 \frac{dx}{\sqrt{x}(1 + x)}, \qquad \lim_{b \to \infty} \int_1^b \frac{dx}{\sqrt{x}(1 + x)}$$

must both be finite; if they are, then the integral from 0 to $\infty$ is their sum. (Any positive number $k$ would have done equally well in place of 1.) In this case, both limits are finite.
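To see concretely that both limits are finite, one can use the antiderivative $F(x) = 2 \tan^{-1} \sqrt{x}$; this formula is not given in the text, but differentiating it gives $1/(\sqrt{x}(1 + x))$. The following Python sketch, with illustrative function names, evaluates the two pieces for $a$ near 0 and large $b$; under that assumption their sum approaches $\pi$.

```python
import math


def F(x):
    """An antiderivative of 1/(sqrt(x)*(1+x)) for x > 0 (assumed, see lead-in)."""
    return 2.0 * math.atan(math.sqrt(x))


def lower_part(a):
    """Integral from a to 1, for 0 < a <= 1."""
    return F(1.0) - F(a)


def upper_part(b):
    """Integral from 1 to b, for b >= 1."""
    return F(b) - F(1.0)


# Each piece approaches pi/2, so the sum approaches pi.
for a, b in ((1e-2, 1e2), (1e-6, 1e6), (1e-12, 1e12)):
    print(lower_part(a), upper_part(b), lower_part(a) + upper_part(b))
```

Splitting at any other positive number $k$ in place of 1 changes the two pieces but not their sum.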

Improper integrals may appear in a disguised form, and so we need to be careful. For example, a careless calculation gives

$$\int_{-1}^{1} \frac{dx}{x^2} = \left[ -\frac{1}{x} \right]_{-1}^{1} = -1 - 1 = -2.$$

This is impossible, because the integrand is positive. The trouble here is that the function blows up at $x = 0$, and so we need to make separate investigations of the integrals

$$\int_{-1}^{0} \frac{dx}{x^2}, \qquad \int_0^1 \frac{dx}{x^2}.$$

We find that both these limits are equal to $\infty$. Therefore the original integral is equal to $\infty$.

Note that when we got the "answer" -2 we had no right to complain that the
theory was wrong, because the Fundamental Theorem of Integral Calculus does not
apply to functions whose domains have holes in them; the theorem applies only
to functions which are defined and continuous on the interval over which we are
integrating.
An even worse example of the same kind is

$$\int_{-1}^{1} \frac{dx}{x} = \left[ \ln |x| \right]_{-1}^{1} = \ln 1 - \ln 1 = 0 \quad (?).$$

Splitting this into two parts, we get

$$\int_{-1}^{0} \frac{dx}{x} = \lim_{a \to 0^-} \left[ \ln |x| \right]_{-1}^{a} = \lim_{a \to 0^-} \ln |a| = -\infty,$$

$$\int_0^1 \frac{dx}{x} = \lim_{a \to 0^+} \left[ \ln |x| \right]_a^1 = \lim_{a \to 0^+} [-\ln a] = \infty.$$

The limits $-\infty$ and $\infty$ do not combine to give a well-defined limit, either finite or infinite, and therefore the original integral is not defined at all. We also get no answer for

$$\int_0^\infty \sin x \, dx.$$

Here

$$\int_0^a \sin x \, dx = \left[ -\cos x \right]_0^a = -\cos a + 1,$$

which oscillates forever between 0 and 2, and therefore does not approach a limit.
Thus, when we investigate an improper integral, there are three situations that
we may encounter.

1) The integral may exist, as a finite limit. For example,

$$\int_1^\infty \frac{dx}{x^2} = \lim_{a \to \infty} \int_1^a \frac{dx}{x^2} = \lim_{a \to \infty} \left[ 1 - \frac{1}{a} \right] = 1.$$

2) The integral may exist, as an infinite limit $\infty$ or $-\infty$. For example,

$$\int_1^\infty \frac{dx}{x} = \lim_{a \to \infty} \int_1^a \frac{dx}{x} = \lim_{a \to \infty} \ln a = \infty.$$

3) Finally, the limit may not exist as any limit, finite or infinite. This is what we find for

$$\int_{-1}^{1} \frac{dx}{x}.$$

In the following problem set, when you are asked to "investigate" an improper
integral, you should find out which of the above three cases it represents. If it is
Case 1, you should find the limit, unless the contrary is stated.


PROBLEM SET 7.7

loo dx
x2
oo
dx
J x3
loo dx
1 x1.0001
f dx
1x x
loo
2 (x - l)s dx
J000 xe-xdx
loo xdx
1 x4
f;' xn dx

loo dx
1 x2 x
f dx
loo dx
1 x0.9999
f dx1 0001
0 x.
f
1 (x - l)s dx
Loox2e-xdx

Investigate:
1.

4.

2.

1 +

5.

7.

10.

13.

16.

19.
20.

8.

11.

In

3.

6.

-v:x

9.

12.

14.

17.

15.

18.

-v:x

v'x(l

x)

1 +

Show that

is never finite, for any value of n, positive, negative, or zero. (The

l/x,

point is that something always goes wrong, either at 0 or at


21.

loo dx
1 xs
loo dx
loo dx
xlnx
f2 dx
0 xo.9999
fo00e-xdx
Joo dx

Consider the graph of/(x)

< oo. Let

oo.)

R be the region under the graph;

let S be the solid of revolution (about the x-axis); and let Tbe the

surface of revolution.

Investigate the improper integrals which represent (a) the area of R, (b) the volume of S,
and (c) the area of T.
Investigate the following for existence. (That is, find out

whether the integral

represents

a finite limit; but in the cases where it does, you need not calculate the liP)-it.)
22.

25.

27.

30.

33.

*36.

loo-1--.,
dx dx
1 e
loo--dx
x2e-x
x
+

1 1 +

J00 e-xdx
-00
J00 e-x dx
-00/2
f0 xdx
x
Joo --dx
x2
csc
sin

..

23.

26.

28.

31.

34.

loo dx
2 x x
f-00oo e-x dx
1 +

loo

24.

In

1 +

v'

dx

1 +

Tan-1

< oo. It will

then follow by symmetry that

e-x

< oo. And obviously

because

lsin

dx
x(l x)
f;" e-x dx f= e-"'2
f:_1 e-x-dx1,
loo dx
2 v'xinx
/2tanxdx
f0
xi
Joo --dx
x2

(To show that this is finite, it will be sufficient to show that

x4 dx
i00 G - x) dx
i00 x2 xi dx
1

Joo

is continuous on
29.

32.

!sin

35.

1T

< oo,
1].)

7.8

The Definite Integral

350

**37. a) Show that for each

n,

IJv';;;(n+iJ.-.
f 00 v

I 'Jv;;;

sm x2 dx <

b) Investigate
38.

v'..

__

sin x2 dx .

v' (n-1>.-

sin x2 dx.

Let f be a decreasing function, with a continuous derivative, on the interval [a,


The graph (and the region under it) are rotated about the x-axis.

oo ) .

Show that if the

surface area is finite, then so also is the volume.


*7.8

THE INTEGRABILITY OF CONTINUOUS FUNCTIONS

Let $f$ be a function which is continuous on an interval $I$. The function $f$ is continuous if for each $x$, and each $\epsilon > 0$, the graph has an $\epsilon\delta$-box at the point $(x, f(x))$. The total height of an $\epsilon\delta$-box is $h = 2\epsilon$. The box is then called an $h$-box of every point of the graph that lies in it.

[Figure: an $h$-box at a point of the graph.]

The definition of continuity applies to the points x of the interval, one at a time.
It may appear, therefore, that if $f$ is continuous at each point $x$ of the interval, then we have to use infinitely many boxes (one for each $x$) in order to exhibit the fact. But if $I$ is a closed interval, this is not so:


Theorem 1 (The finite covering theorem). Let $f$ be a continuous function on the closed interval $[a, b]$, and let $h$ be a positive number. Then there is a finite collection of $h$-boxes, covering the entire graph of $f$.

[Figure: a graph covered by a finite collection of $h$-boxes.]

Proof. Let $[c, d]$ be a subinterval of $[a, b]$. If there is a finite collection of $h$-boxes, covering the part of the graph determined by $[c, d]$, then we shall say that $[c, d]$ is good. If no such finite collection exists, then we say that $[c, d]$ is bad. We allow the case in which $[c, d]$ is all of $[a, b]$. Thus what we need to show is that $[a, b]$ is good. Suppose, then, that $[a, b]$ is bad. We shall show that this leads to a contradiction.

Let $[a_1, b_1] = [a, b]$. If the left-hand half of $[a_1, b_1]$ and the right-hand half of $[a_1, b_1]$ are both good, then it follows that $[a_1, b_1]$ is good; we can fit together two finite collections of boxes, getting another finite collection of boxes that covers the whole graph. Therefore one or both of the halves of $[a_1, b_1]$ must be bad. Let $[a_2, b_2]$ be a bad half of $[a_1, b_1]$. Similarly, one of the halves of $[a_2, b_2]$ must be bad. Let $[a_3, b_3]$ be a bad half of $[a_2, b_2]$. Proceeding to infinity in this way, we get a nested sequence

$$[a_1, b_1], [a_2, b_2], \ldots, [a_n, b_n], \ldots$$

of closed intervals, each of which is bad. By the nested interval postulate, there is a number $\bar{x}$ which lies on all of these intervals.

Now $f$ is continuous at $\bar{x}$. Therefore $f$ has an $h$-box at $(\bar{x}, f(\bar{x}))$. Suppose that the box is

$$\{(x, y) \mid x_0 < x < x_1, \; y_0 < y < y_1\}.$$

Since

$$b_n - a_n = \frac{b - a}{2^{n-1}},$$

the length of the $n$th interval $[a_n, b_n]$ approaches 0. Therefore we must have

$$x_0 < a_n < b_n < x_1$$

for some $n$. This means that $[a_n, b_n]$ is good after all: one $h$-box covers the part of the graph that lies above it; and 1 is finite.

We continue now at the point where we left off in Section 7.2. There we defined net, mesh (of a net), upper sum $S(N)$, lower sum $s(N)$, and sample sum $\Sigma(X)$. Theorem 3 of Section 7.2 was as follows:

Theorem. If $f$ is continuous, and

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0,$$

then the sample sums approach a limit, as $|N| \to 0$.

Now $\int_a^b f(x) \, dx$ is defined to be $\lim_{|N| \to 0} \Sigma(X)$. Therefore what we need, to complete the proof that continuous functions are integrable, is the following theorem:

Theorem 2. If $f$ is continuous on $[a, b]$, then

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0.$$

Proof. Let $\epsilon > 0$ be given. We need to show that there is a $\delta > 0$ such that

$$|N| < \delta \;\Rightarrow\; S(N) - s(N) < \epsilon.$$

We know by the finite covering theorem that for every $h > 0$ there is a finite collection of $h$-boxes, covering the graph. (See the left-hand figure below. We have not yet decided what $h$ we want to use.) The x-coordinates of the vertical sides of the boxes, together with $a$ and $b$, form a net $N_0$ over $[a, b]$. Let $\delta$ be the length of the shortest interval in $N_0$. We assert that if $N$ is any other net over $[a, b]$, with $|N| < \delta$, then every little interval $[x_{i-1}, x_i]$ in $N$ lies under some one of our boxes. We illustrate with the simpler figure on the right.

[Figure: left, the net $N_0$ formed by the sides of the boxes; right, a little interval of $N$ lying under one box.]

If $[x_{i-1}, x_i]$ contains no point of $N_0$ (as on the right) this is evident. If $[x_{i-1}, x_i]$ contains a point $y_i$ of $N_0$ (as on the left), then $y_i$ lies on the open interval under one of our boxes, and so $[x_{i-1}, x_i]$ lies under the same box.

Now take a net $N$, with $|N| < \delta$. The difference $S(N) - s(N)$ is the sum of the areas of a collection of rectangles, like this:

[Figure: the rectangles whose total area is $S(N) - s(N)$.]

Each of these rectangles lies in one of our $h$-boxes. (Why?) Therefore each of them has height $\le h$. Hence

$$S(N) - s(N) \le h(b - a).$$

Thus we want

$$h(b - a) < \epsilon,$$

and this will hold if

$$h < \frac{\epsilon}{b - a}.$$

This is the way we should choose $h$ at the beginning of the proof. The resulting $\delta$ is the $\delta$ that we need.

Theorem 3. Every continuous function on a closed interval is integrable.

Proof. By the preceding theorem, $\lim [S(N) - s(N)] = 0$. By Theorem 3 of Section 7.2, $f$ is integrable.

Some of the ideas in this proof are worth examining further. In Problem Set 7.2, we gave the following definition.

Definition. Let $f$ be a function on an interval $I$. Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$|x - x'| < \delta \;\Rightarrow\; |f(x) - f(x')| < \epsilon,$$

where $x$ and $x'$ are any two points of $I$. Then $f$ is uniformly continuous on $I$.

Note that while continuity is defined for one point x at a time, uniform continuity
is defined for the graph as a whole. The difference between these ideas may be clarified
by an analogy:
1) A man is literate if there is a language that he can read and write.
2) A group of men is literate if each of its members is literate.
3) A group of men is uniformly literate if there is one language that every member of the group can read and write.

Thus, uniform literacy is a property not of the individuals in a group but of the
group as a whole; if each of the members of the group is literate, it follows that the
group is literate [see (2)], but it does not follow that the group is uniformly literate.
The difference between continuity of a function $f$ on an interval $I$ and uniform continuity of $f$ on $I$ is analogous. For example, $f(x) = 1/x$ is continuous on the open interval $I = (0, 1)$, because $f$ is continuous at every point $x$ of $I$. But $f$ is not uniformly continuous on $I$. (For every $\epsilon > 0$, we can find two points $x$, $x'$, as close together as we please, such that $|f(x) - f(x')| > \epsilon$.)
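The parenthetical claim about $f(x) = 1/x$ can be made concrete. Given $\epsilon$ and any proposed $\delta$, one possible choice (an illustration, not the text's own construction) is $x = t$ and $x' = t/2$ with $t \le \min(\delta, 1/(2\epsilon), 1/2)$: then $|x - x'| = t/2 < \delta$ while $|f(x) - f(x')| = 1/t \ge 2\epsilon > \epsilon$. A small Python sketch, with a hypothetical function name `witness`:

```python
def witness(eps, delta):
    """Return points x, x' in (0, 1) with |x - x'| < delta
    and |1/x - 1/x'| > eps, showing that delta fails for this eps."""
    t = min(delta, 1.0 / (2.0 * eps), 0.5)
    return t, t / 2.0


for eps, delta in ((1.0, 0.1), (10.0, 1e-3), (0.01, 0.5)):
    x, xp = witness(eps, delta)
    print(eps, delta, abs(x - xp), abs(1.0 / x - 1.0 / xp))
```

No single $\delta$ survives this test for a given $\epsilon$, which is exactly the failure of uniform continuity on the open interval.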
But continuity implies uniform continuity when the domain of the function is a
closed interval.
Theorem 4 (the uniform continuity theorem). If $f$ is continuous on $[a, b]$, then $f$ is uniformly continuous on $[a, b]$.

Proof. Let $\epsilon > 0$ be given. By the finite covering theorem there is a finite collection of boxes of height $\epsilon$, covering the graph. (We are using $\epsilon$ as the $h$ of Theorem 1.) Let $N_0$ be the corresponding net over $[a, b]$, as in the proof of Theorem 2. As before, let $\delta$ be the length of the shortest interval in $N_0$. It follows that if $|x - x'| < \delta$, then $x$ and $x'$ lie under some one of our boxes. Therefore $|f(x) - f(x')| < \epsilon$, which was to be proved.
This is the idea that we need, to complete the proof of the formula

$$A = \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2} \, dx$$

for the area of a surface of revolution about the x-axis. In Section 7.5, we knew that as $|N| \to 0$, the sample sum

$$\Sigma' = \sum_{i=1}^{n} 2\pi f(x_i') \sqrt{1 + f'(x_i')^2} \, \Delta x$$

approaches the integral. But the area of the approximating surface was

$$\Sigma = \sum_{i=1}^{n} 2\pi f(\bar{x}_i) \sqrt{1 + f'(x_i')^2} \, \Delta x,$$

with two different sample points $\bar{x}_i$, $x_i'$ used on each interval $[x_{i-1}, x_i]$. Thus we need to show that

$$\lim_{|N| \to 0} |\Sigma' - \Sigma| = 0.$$

We are assuming that $f'$ is continuous. Therefore so also is $2\pi \sqrt{1 + f'(x)^2}$. Therefore the latter function is bounded. Let $M$ be such that

$$2\pi \sqrt{1 + f'(x)^2} < M,$$

for every $x$. Let $\epsilon$ be any positive number. Then

$$\frac{\epsilon}{M(b - a)} > 0.$$

By the uniform continuity theorem, there is a $\delta > 0$ such that

$$|x - x'| < \delta \;\Rightarrow\; |f(x) - f(x')| < \frac{\epsilon}{M(b - a)}.$$

This is the $\delta$ that we need.

Proof.

$$|N| < \delta \;\Rightarrow\; |\bar{x}_i - x_i'| < \delta \text{ for each } i \;\Rightarrow\; |f(\bar{x}_i) - f(x_i')| < \frac{\epsilon}{M(b - a)} \text{ for each } i.$$

Now

$$\Sigma - \Sigma' = \sum_{i=1}^{n} 2\pi [f(\bar{x}_i) - f(x_i')] \sqrt{1 + f'(x_i')^2} \, \Delta x,$$

and so

$$|\Sigma - \Sigma'| \le \sum_{i=1}^{n} |f(\bar{x}_i) - f(x_i')| \, 2\pi \sqrt{1 + f'(x_i')^2} \, \Delta x < \sum_{i=1}^{n} M |f(\bar{x}_i) - f(x_i')| \, \Delta x.$$

Therefore

$$|N| < \delta \;\Rightarrow\; |\Sigma - \Sigma'| < \sum_{i=1}^{n} M \cdot \frac{\epsilon}{M(b - a)} \, \Delta x = M \cdot \frac{\epsilon}{M(b - a)} \sum_{i=1}^{n} \Delta x = \frac{\epsilon}{b - a} \cdot (b - a) = \epsilon,$$

which was to be proved.

'PROBLEM SET 7.8

Most of the questions below can be answered on the basis of a careful reexamination of
the theorems and proofs in Sections 7.2 and 7.8. Some of them, however, require independent
investigation. Naturally, all answers should be explained.
I.

Suppose that/is known to be increasing on

[a, b],

but is not known to be continuous.

Does it follow that f is integrable?


2.

Show that Tan-1 is uniformly continuous on ( -

oo

).

3.

Same, for f(x)

4.

Is it possible for a function to be uniformly continuous on an open interval

oo,

on

( -co,

oo

).

(a,

b)?

Why or why not?


5.

If a function is uniformly continuous on an interval I, does it follow thatf is continuous


on I? Why or why not?

6.

Suppose that/is (a) continuous at


on

*7.

(a, b) .

a,

(b) continuous at b, and (c) uniformly continuous

Does it follow that f is uniformly continuous on

[a, b]?

Suppose that f is bounded and integrable, but not necessarily continuous, on


For each

of

[a, b],

let

F(x)

[a, b].

f' f

(t) dt.

Show that Fis continuous. (The betweenness theorem for integrals, which is Theorem 5
of Section 7.2, may be useful here.)
*8.

For the definition of

Lipschitzian,

see Problem 12a of Problem Set 7.2.

is Lipschitzian on I, then/ is uniformly continuous on I.


open or closed, finite or infinite.)
9.

10.

Let

a1, a2,

an

be any finite sequence of numbers. Show that

Let f be continuous on

[a, b].

Show that

If

f(x) dx

I f

If(x)I dx.

Show that if f

(Here I may be any interval,

8.1

The Conic Sections

TRANSLATION OF AXES

In Section 2.2 we stated the definition of a coordinate system on a line L.


[Figure: a line $L$ with points $P_1$, $P_2$ and coordinates $x_1$, $x_2$.]

A coordinate system on $L$ is a one-to-one correspondence

$$L \leftrightarrow \mathbf{R}, \qquad P \leftrightarrow x,$$

between the points $P$ of $L$ and the real numbers $x$, such that the distance between any two points is the absolute value of the difference of the corresponding numbers. That is,

$$P_1 P_2 = |x_2 - x_1|.$$

If we subtract the same number from the coordinate of every point, we obtain another coordinate system on the line. If we subtract $h$ from every $x$, then

$$x_2' - x_1' = (x_2 - h) - (x_1 - h) = x_2 - x_1,$$

and so

$$|x_2' - x_1'| = |x_2 - x_1|.$$

Therefore the distance formula works, for the new coordinates $x' = x - h$. This process is called translation. The origin is moved to the point $h$, and all the other number labels are moved with it.

[Figure: the same line labeled with old coordinates $x$ and new coordinates $x' = x - h$.]

Thus the old and new coordinates are related by the formulas

$$x' = x - h, \qquad x = x' + h.$$

Consider now a plane with a coordinate system.

[Figure: old axes $x$, $y$ and new axes $x'$, $y'$ along the lines $y = k$ and $x = h$.]

Suppose that we translate the coordinate system on the x-axis, subtracting $h$ from every x-coordinate, and then translate the coordinate system on the y-axis, subtracting $k$ from every y-coordinate. The effect is to move the origin to the point $(h, k)$. Every point $P$ now has a new pair of coordinates $x'$, $y'$, and these are related to the old ones by the formulas

$$x = x' + h, \qquad x' = x - h,$$
$$y = y' + k, \qquad y' = y - k.$$

These formulas are easy to remember; the only way you are likely to go wrong is to get them backwards (by writing $x' = x + h$ (?) or $y' = y + k$ (?)). It is easy to see, however, that the new origin must have old coordinates $h$, $k$ and new coordinates 0, 0; and from this we can tell which way the formulas ought to go.

As usual, $(x, y)$ denotes the point whose old coordinates are $x$ and $y$. Thus the old origin is $(0, 0)$, and the new origin is $(h, k)$. When we write $(a, b)'$ (with a prime outside the parentheses) we mean the point whose new coordinates are $a$ and $b$. Thus the new origin is $(0, 0)'$, and the old origin is $(-h, -k)'$. More examples are given below:

[Figure: old and new axes with $P = (5, 2) = (2, 1)'$ and $Q = (-1, -1) = (-4, -2)'$.]

In the figure, $h = 3$ and $k = 1$. Two points have been labeled both ways. At the point $P$, we have

$$x = 5, \quad y = 2, \quad x' = 2, \quad y' = 1,$$

so that the label

$$P = (5, 2) = (2, 1)'$$

is correct. Similarly, at $Q$ we have

$$x = -1, \quad y = -1, \quad x' = -4, \quad y' = -2,$$

so that

$$Q = (-1, -1) = (-4, -2)'.$$
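The translation formulas are simple enough to express as a pair of functions, which also makes it easy to check which way they go. This is a minimal Python sketch; the names `to_new` and `to_old` are illustrative.

```python
def to_new(point, h, k):
    """Old coordinates (x, y) -> new coordinates (x', y') = (x - h, y - k)."""
    x, y = point
    return (x - h, y - k)


def to_old(point, h, k):
    """New coordinates (x', y') -> old coordinates (x, y) = (x' + h, y' + k)."""
    xp, yp = point
    return (xp + h, yp + k)


h, k = 3, 1
print(to_new((5, 2), h, k))    # the point P of the figure
print(to_new((-1, -1), h, k))  # the point Q of the figure
print(to_old((0, 0), h, k))    # the new origin, in old coordinates
```

The last line is the check suggested in the text: the new origin $(0, 0)'$ must come out as $(h, k)$ in old coordinates.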

When we write an equation to describe a figure in the plane, the equation depends
on the choice of axes; and often one choice of axes gives a simpler equation than any
other. If we didn't start with the axes in the best position, then we can simplify the
equation by translation of axes.

For example, consider the parabola with directrix

y = -1 and focus F = (3, 3).


y
F=(3, 3)

x, y)
---- P=(
--\

"'-.__

-+---- x
0
1
2
4
5
16
3
D---1

________________

n_
M

__

The parabola is the graph of the condition

$$FP = MP.$$

Algebraically, this says that

$$\sqrt{(x - 3)^2 + (y - 3)^2} = \sqrt{(y + 1)^2}$$
$$\iff x^2 - 6x + 9 + y^2 - 6y + 9 = y^2 + 2y + 1$$
$$\iff 8y = x^2 - 6x + 17$$
$$\iff y = \tfrac{1}{8}x^2 - \tfrac{3}{4}x + \tfrac{17}{8}.$$

We know, however, that a parabola with a horizontal directrix and vertex at the origin always has an equation of the form

$$y = ax^2.$$

The vertex of our parabola is halfway between the focus and the directrix, at the point $V = (3, 1)$. This means that we should translate the axes so that the new origin becomes the point

$$O' = (h, k) = (3, 1).$$

Relative to the new axes, the equation becomes

$$8(y' + 1) = (x' + 3)^2 - 6(x' + 3) + 17;$$

here we have replaced $x$ by $x' + h = x' + 3$ and $y$ by $y' + k = y' + 1$. This gives

$$8y' + 8 = x'^2 + 6x' + 9 - 6x' - 18 + 17,$$

or

$$8y' = x'^2, \quad\text{or}\quad y' = \tfrac{1}{8}x'^2.$$

This is in the standard form $y' = ax'^2$, where $a = 1/(2p)$ and $p$ is the distance between the focus and the directrix.
Thus, by a translation of axes, we have eliminated the linear term in $x$ and the constant term. Here we knew in advance where the origin ought to be for the equation to appear in a simple form. If we hadn't known this, we could still have investigated algebraically, to find out what sort of simplifications a translation could accomplish. To do this, we would regard $h$ and $k$ as unknown quantities, and make the substitution

$$x = x' + h, \qquad y = y' + k$$

in general form. This gives:

$$8(y' + k) = (x' + h)^2 - 6(x' + h) + 17,$$

or

$$x'^2 + (2h - 6)x' - 8y' + h^2 - 6h + 17 - 8k = 0.$$

Certain facts are now obvious: (1) We can't get rid of the term $x'^2$, by any choice of $h$ and $k$, because $h$ and $k$ do not appear in the coefficient of $x'^2$. (2) For the same reason, we can't get rid of the linear term in $y'$. (3) The total coefficient of $x'$ is $2h - 6$, and the total constant term is

$$h^2 - 6h + 17 - 8k.$$

We can therefore get rid of the $x'$ term, by using $h = 3$. The constant term then becomes

$$9 - 18 + 17 - 8k, \quad\text{or}\quad 8 - 8k,$$

which is 0 when $k = 1$. Thus, translating the origin to the point

$$(h, k) = (3, 1),$$

we get the equation in the form

$$8y' = x'^2 \quad\text{or}\quad y' = \tfrac{1}{8}x'^2,$$

as before. This is the process that you follow if you don't know the answer in advance.
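The same bookkeeping can be done once and for all for a general second-degree equation $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$. The sketch below is an illustration only; the function name `translate_conic` is hypothetical. It expands the substitution $x = x' + h$, $y = y' + k$ by hand and returns the coefficients of the translated equation. Note that $A$, $B$, and $C$ are unchanged, which agrees with facts (1) and (2) above.

```python
# Sketch: coefficients of a second-degree equation after the translation
# x = x' + h, y = y' + k.  The name translate_conic is an illustrative assumption.

def translate_conic(A, B, C, D, E, F, h, k):
    """Return (A', B', C', D', E', F') for the translated equation."""
    D_new = 2 * A * h + B * k + D          # total coefficient of x'
    E_new = B * h + 2 * C * k + E          # total coefficient of y'
    F_new = A * h * h + B * h * k + C * k * k + D * h + E * k + F  # constant term
    return (A, B, C, D_new, E_new, F_new)  # quadratic coefficients do not change

# The parabola 8y = x^2 - 6x + 17, written as x^2 - 6x - 8y + 17 = 0,
# with the origin moved to (h, k) = (3, 1).
print(translate_conic(1, 0, 0, -6, -8, 17, h=3, k=1))
```

For the parabola of the text, the linear term in $x'$ and the constant term both come out to 0, leaving $x'^2 - 8y' = 0$.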
PROBLEM SET 8.1

1. Find a translation which eliminates both of the linear terms from the equation
$$xy - 5y - 6x - 30 = 0.$$
Then sketch the graph, showing both sets of axes.

2. Is there a translation which eliminates the xy-term from the above equation? Why or why not? How about the possibility of removing the constant term?

3. Find a translation which removes both linear terms from the equation
$$x^2 + y^2 + x + y - 2 = 0.$$
Then sketch the graph, showing both sets of axes.

4. Find a translation which removes both linear terms from the equation
$$2xy - x + 3y - 2 = 0.$$

5. Find a translation which removes both linear terms from the equation
$$x^2 + y^2 + 4x + 2y + 1 = 0.$$

6. Find a translation which removes both linear terms from the equation
$$x^2 + xy - 3x + 2 = 0.$$

7. Find a translation that eliminates both linear terms in the equation
$$x^2 + xy + y^2 + x + y + 5 = 0.$$

8. Find a translation which removes both linear terms from the equation
$$x^2 + xy + y^2 + x + y + 1 = 0.$$

9. Show that there is no translation which removes both linear terms from the equation
$$x^2 + 2xy + y^2 + x - y + 1 = 0.$$

10. Show that there is no translation which removes both linear terms from the equation
$$4x^2 + 4xy + y^2 + 2x + y + 8 = 0.$$

11. Consider the equation $x^2 + y^2 + x + y - 2 = 0$. Under what conditions for $h$ and $k$ does this equation take the form
$$x'^2 + y'^2 + Ax' + By' = 0,$$
with possible linear terms but no constant term? (You may be able to think of a way to answer this question without doing any calculations at all.)

12. Show that if $ad - bc \neq 0$, then the linear system
$$ah + bk = e,$$
$$ch + dk = f$$
always has a solution. (Simply start solving it; at some point, you will need to assume that $ad - bc \neq 0$.)

13. Consider an equation of the form
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0.$$
Show that if $B^2 - 4AC \neq 0$, then there is always a translation that eliminates both of the linear terms. (The converse is not true; there are simple examples of equations where $B^2 - 4AC = 0$, but where the linear terms are absent to begin with. Examples?)

14. Sketch the graph of the equation $y^2 = x(x + 1)(x - 1)$.

15. Let $C$ be the graph of an equation of the form
$$b_2 y^2 + b_1 y + b_0 = a_3 x^3 + a_2 x^2 + a_1 x + a_0.$$
Show that if the axes are translated, then $C$ is the graph of an equation of the same form, in the new coordinates $x'$ and $y'$.

8.2

THE ELLIPSE

Let F and F' be two points, let c be half the distance between them, so that
FF'= 2c,
and let a be a number greater than c. Let C be the graph of the equation
FP + F'P = 2a.
The curve C is called the ellipse with foci F, F' and focal sum 2a.


To draw an ellipse, we put two thumbtacks in a drawing board, at the foci $F$ and $F'$. We tie the ends of a string to the thumbtacks, in such a way that the length of the string left free between the thumbtacks is $2a$. Then we put a pencil in the loop of string, placing the point so that the string is taut, and move the pencil around, keeping the string taut all the way. (We need to do this in two steps, on the two sides of the line through $F$ and $F'$.)

In the definition of an ellipse, we really mean that $F$ and $F'$ are two points; that is, $F \neq F'$. (Thus a circle is not an ellipse.) Also, we really mean $a > c$. (For $a = c$, the graph of the condition $FP + F'P = 2a$ is the segment from $F$ to $F'$; and for $a < c$, the graph is empty.)

Some things about ellipses are easily seen from the definition. For the definition
of symmetry of a figure, about a line or a point, see Section 7.6.
Theorem 1. An ellipse is symmetric about the line through its foci.

Proof. In the left-hand figure below, $P$ is on the ellipse, so that
$$FP + F'P = 2a.$$
And $L$ is the perpendicular bisector of the segment from $P$ to $P'$. By elementary geometry, $FP = FP'$ and $F'P = F'P'$. Therefore
$$FP' + F'P' = 2a,$$
and $P'$ is on the ellipse.

Theorem 2. An ellipse is symmetric about the perpendicular bisector of the segment between its foci.

Proof? (This is not quite as simple as the preceding theorem. See the right-hand figure above.)

Theorem 3. If a curve is symmetric about each of two perpendicular lines, then it is symmetric about their point of intersection.

Proof? (We need to show that $OP = OP''$.)

[Figure: two perpendicular lines of symmetry meeting at O, with a point P and its reflected images P' and P''.]

For ellipses, this gives us:

Theorem 4. Every ellipse is symmetric about the point midway between its foci.

This point is called the center of the ellipse. (See the right-hand figure above.)
These symmetry theorems convey nearly all that is easy to see about ellipses
merely from the definition. Our next step is to set up a coordinate system, and describe
our ellipses by equations. We take the origin at the center of the ellipse, and the foci

on the x-axis. The ellipse is then said to be in standard position, relative to the axes.

[Figure: an ellipse in standard position, with a point P(x, y) on the curve and the foci on the x-axis.]

As indicated in the figure above, let $F$ and $F'$ be the points $(-c, 0)$ and $(c, 0)$. Then

$$FP + F'P = 2a$$
$$\iff \sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a$$
$$\iff \sqrt{(x + c)^2 + y^2} = 2a - \sqrt{(x - c)^2 + y^2}$$
$$\Longrightarrow x^2 + 2cx + c^2 + y^2 = 4a^2 - 4a\sqrt{(x - c)^2 + y^2} + x^2 - 2cx + c^2 + y^2$$
$$\iff a\sqrt{(x - c)^2 + y^2} = a^2 - cx$$
$$\Longrightarrow a^2x^2 - 2a^2cx + a^2c^2 + a^2y^2 = a^4 - 2a^2cx + c^2x^2$$
$$\iff (a^2 - c^2)x^2 + a^2y^2 = a^2(a^2 - c^2)$$
$$\iff \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.$$

Thus every point on the ellipse satisfies the final equation. It is possible to show, conversely, that every point $(x, y)$ that satisfies the final equation lies on the ellipse. (See Problem 22 below.) Thus we have:

Theorem 5. The ellipse with foci at $(\pm c, 0)$ and focal sum $2a$ is the graph of the equation
$$\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.$$

For example, for $a = 3$ and $c = 2$ we get
$$\frac{x^2}{9} + \frac{y^2}{5} = 1.$$

To sketch, we observe that for $y = 0$, $x = \pm 3$, and for $x = 0$, $y = \pm\sqrt{5}$. We then sketch an oval with these as its extreme points.

[Figure: the ellipse x^2/9 + y^2/5 = 1.]

Given an equation
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$
with $b^2 < a^2$, the graph is always an ellipse. Since $a^2 - b^2 > 0$, it follows that $a^2 - b^2 = c^2$ for some $c > 0$. The graph is therefore the ellipse described in Theorem 5. Thus we have proved half of the following theorem.

Theorem 6. Given the equation
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$
For $b^2 < a^2$, the graph is the ellipse with focal sum $2a$ and foci at $(\pm c, 0)$, where $c = \sqrt{a^2 - b^2}$. For $a^2 < b^2$, the graph is the ellipse with focal sum $2b$ and foci at $(0, \pm c)$, where $c = \sqrt{b^2 - a^2}$.

Proof of the second half of the theorem?

[Figure: two ellipses in standard position. Left: b < a, c = \sqrt{a^2 - b^2}, FP + F'P = 2a. Right: a < b, c = \sqrt{b^2 - a^2}, FP + F'P = 2b.]
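Theorem 6 translates directly into a short computation. The following is a sketch under the assumption that $a$ and $b$ are positive and unequal; the names `ellipse_foci` and `focal_sum_at` are hypothetical helpers, introduced here only to spot-check the theorem at the extreme points of the curve.

```python
import math

# Sketch of Theorem 6 for x^2/a^2 + y^2/b^2 = 1 with a, b > 0 and a != b.
# The helper names are illustrative assumptions, not from the text.

def ellipse_foci(a, b):
    """Return ((F, F'), focal_sum) for the ellipse x^2/a^2 + y^2/b^2 = 1."""
    if a <= 0 or b <= 0 or a == b:
        raise ValueError("need positive a and b with a != b")
    if b < a:                      # foci on the x-axis, focal sum 2a
        c = math.sqrt(a * a - b * b)
        return ((-c, 0.0), (c, 0.0)), 2 * a
    c = math.sqrt(b * b - a * a)   # foci on the y-axis, focal sum 2b
    return ((0.0, -c), (0.0, c)), 2 * b

def focal_sum_at(point, foci):
    """FP + F'P for a given point P."""
    return sum(math.hypot(point[0] - fx, point[1] - fy) for fx, fy in foci)

foci, total = ellipse_foci(3, math.sqrt(5))
print(foci, total)
print(focal_sum_at((3, 0), foci), focal_sum_at((0, math.sqrt(5)), foci))
```

For $a = 3$, $b = \sqrt{5}$ this reproduces the example following Theorem 5: the foci are at distance 2 from the center, and the focal sum at each extreme point agrees with $2a = 6$ up to rounding.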

If the foci are not in either of the two positions shown above, then the equation of the ellipse is more complicated. In some cases, when the equation is given, we can simplify the equation by a translation of axes. Consider

$$4x^2 + 9y^2 - 8x + 18y - 23 = 0.$$

Making the substitutions $x = x' + h$, $y = y' + k$, we get

$$4x'^2 + 9y'^2 + (8h - 8)x' + (18k + 18)y' + 4h^2 + 9k^2 - 8h + 18k - 23 = 0.$$

Evidently we want $h = 1$, $k = -1$; and this gives the equation in the form

$$4x'^2 + 9y'^2 - 36 = 0,$$

or

$$\frac{x'^2}{9} + \frac{y'^2}{4} = 1.$$

The graph is the ellipse with foci at $(\pm\sqrt{5}, 0)'$ and focal sum 6; it intersects the $x'$- and $y'$-axes at the points $(\pm 3, 0)'$, $(0, \pm 2)'$. We can now sketch, showing both sets of axes.

[Figure: the ellipse x'^2/9 + y'^2/4 = 1, drawn with both the old axes and the new axes.]

In doing such sketches, we start by drawing the new axes and the curve, in a convenient position on the paper, and then draw the old axes, in the position where they must have been.
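When there is no xy-term and neither square term is missing, the right choice of $h$ and $k$ can be read off by setting the two linear coefficients equal to 0. The sketch below assumes an equation $Ax^2 + Cy^2 + Dx + Ey + F = 0$ with $A \neq 0$ and $C \neq 0$; the function name `center_by_translation` is hypothetical. Exact fractions are used so that the result can be compared with the hand computation.

```python
from fractions import Fraction

# Sketch: choose (h, k) so that A x^2 + C y^2 + D x + E y + F = 0 loses both
# linear terms under x = x' + h, y = y' + k.  Assumes A != 0 and C != 0.
# The name center_by_translation is an illustrative assumption.

def center_by_translation(A, C, D, E, F):
    A, C, D, E, F = (Fraction(v) for v in (A, C, D, E, F))
    h = -D / (2 * A)   # makes the coefficient 2Ah + D of x' vanish
    k = -E / (2 * C)   # makes the coefficient 2Ck + E of y' vanish
    F_new = A * h * h + C * k * k + D * h + E * k + F  # new constant term
    return h, k, F_new

h, k, F_new = center_by_translation(4, 9, -8, 18, -23)
print(h, k, F_new)                 # new origin and new constant term
print(-F_new / 4, -F_new / 9)      # denominators of x'^2 and y'^2 in standard form
```

For the example above this gives the new origin $(1, -1)$ and the constant term $-36$, so that the denominators in the standard form are 9 and 4.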


PROBLEM SET 8.2

Write equations for the ellipses described by the following conditions and sketch.

1. Foci at $(\pm 1, 0)$; focal sum 4.
2. Foci at $(0, \pm 1)$; focal sum 4.
3. Foci at $(1, 2)$, $(1, 4)$; focal sum 4.
4. Foci at $(-1, -1)$, $(1, 1)$; focal sum 4.
5. Foci at $(-1, 1)$, $(1, -1)$; focal sum 4.
6. Foci at $(\pm 2, 0)$; focal sum 6.
7. Foci at $(0, \pm 2)$; focal sum 6.
8. Foci at $(-1, 1)$ and $(1, -1)$; focal sum 6.

Find the foci and the focal sum, and sketch, showing both sets of axes, in cases where more than one set is used.

9. $x^2/4 + y^2 = 1$
10. $x^2 + 9y^2 - 2x + 36y + 28 = 0$
11. $9x^2 + y^2 + 36x - 2y + 28 = 0$
12. $x^2 + 2y^2 + 3x + 4y - 6 = 0$ (This one does not "come out even.")
13. $4x^2 + y^2 = 1$
14. $x^2 + x + y^2 - 2y + 1 = 0$

15. Given an equation of the form
$$Ax^2 + By^2 + Cx + Ey + F = 0,$$
where $A$ and $B$ are both positive, show that the graph is (a) an ellipse, (b) a point, or (c) the empty set. (The same conclusion follows if $A$ and $B$ are both negative.)

16. A function $f$ is odd if $f(-x) = -f(x)$ for every $x$. Show that the graph of an odd function is symmetric about the origin.

17. a) Let $C$ be the graph of the sine function. Show that $C$ is symmetric about the origin.
b) Now show that $C$ is also symmetric about infinitely many other points. (Thus it may happen that an unbounded figure has more than one "center." In fact, there is a simpler example: a line is symmetric about each of its points, and so every point of a line is "a center" of the line. For this reason, we ordinarily use the word center only for bounded figures.)

18. a) Show that the graph of the cosine is symmetric about infinitely many points.
b) Show that the graph of the sine is symmetric about infinitely many lines.

19. Consider the infinite strip $R$ between the lines $y = 1$ and $y = -1$. That is,
$$R = \{(x, y) \mid -\infty < x < \infty,\ -1 \le y \le 1\}.$$
Show that $R$ is symmetric about infinitely many points, and find a simple description of the set $C$ consisting of all points which are "centers" of $R$.

20. Show that every cubic curve is symmetric about its point of inflection. Here by a cubic curve we mean the graph of an equation $y = ax^3 + bx^2 + cx + d$, with $a \neq 0$.

21. Suppose that in Theorem 3 of this section we drop the hypothesis that the two lines of symmetry are perpendicular. Would the resulting theorem be true? Why or why not?

*22. Given $0 < c < a$, as in the definition of an ellipse. Let $P = (x, y)$ be a point satisfying the equation
$$\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.$$
Let $F = (-c, 0)$, $F' = (c, 0)$.

a) Show that
$$y^2 = \frac{a^2 - c^2}{a^2}(a^2 - x^2).$$

b) Show that
$$FP + F'P = \frac{1}{a}\sqrt{(a^2 + cx)^2} + \frac{1}{a}\sqrt{(a^2 - cx)^2}.$$

c) Show that $a^2 + cx > 0$. (There are two cases to consider, $x \ge 0$ and $x < 0$. Remember that $0 < c < a$, and use the fact that $|x| \le a$.)

d) Show that $a^2 - cx > 0$.

e) Show that $FP + F'P = 2a$.

(This completes the proof of Theorem 5.)

8.3

THE HYPERBOLA

Given $0 < a < c$, and the points $F$ and $F'$, with $FF' = 2c$. Let $C$ be the graph of the condition

$$FP - F'P = \pm 2a \qquad (a < c).$$

The curve $C$ is called the hyperbola with foci $F$, $F'$ and focal difference $2a$. The figure shows what a hyperbola looks like, but the reasons for this appearance of the graph are not obvious; the only thing that is easy to see, on the basis of the definition, is that the hyperbola is symmetric about each of the two perpendicular lines. The first step in our investigation of hyperbolas is to take the axes in a convenient position, as shown above, with $F = (-c, 0)$ and $F' = (c, 0)$, and get an equation for the curve.

$$FP - F'P = \pm 2a$$
$$\iff FP = F'P \pm 2a$$
$$\iff \sqrt{(x + c)^2 + y^2} = \sqrt{(x - c)^2 + y^2} \pm 2a$$
$$\Longrightarrow x^2 + 2cx + c^2 + y^2 = x^2 - 2cx + c^2 + y^2 \pm 4a\sqrt{(x - c)^2 + y^2} + 4a^2$$
$$\iff cx - a^2 = \pm a\sqrt{(x - c)^2 + y^2}$$
$$\Longrightarrow c^2x^2 - 2a^2cx + a^4 = a^2x^2 - 2a^2cx + a^2c^2 + a^2y^2$$
$$\iff (c^2 - a^2)x^2 - a^2y^2 = a^2(c^2 - a^2)$$
$$\iff \frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1.$$

Thus every point P = (x, y) of the hyperbola satisfies the final equation. As in the
case of the ellipse, it can be shown conversely that every point on the graph of the
final equation is on the hyperbola. (See Problem 32 below.) Since c2 > a2, we may let
b2 = c2 - a2.

This substitution gives the standard form of the equation:

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$

And we can sum up as follows:

Theorem 1. The graph of the equation
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$$
is the hyperbola with foci at $(\pm c, 0)$ (where $c = \sqrt{a^2 + b^2}$) and focal difference $2a$.

We shall use our equation to justify the sketch which we gave at the outset.

1) No point of the hyperbola lies between the lines $x = -a$ and $x = a$. The reason is as follows. Solving for $y$, we get
$$y = \pm\frac{b}{a}\sqrt{x^2 - a^2}.$$
Therefore the hyperbola is the union of the graphs of two functions
$$f(x) = \frac{b}{a}\sqrt{x^2 - a^2} \quad\text{and}\quad g(x) = -\frac{b}{a}\sqrt{x^2 - a^2},$$
and neither of these functions is defined for $-a < x < a$.

2) The curve is symmetric about each of the coordinate axes. This is easy to see algebraically. For each point $(x, y)$, the symmetric point across the x-axis is $(x, -y)$; and if $(x, y)$ is on the hyperbola, then so also is $(x, -y)$. Similarly, if $(x, y)$ is on the curve, then so also is $(-x, y)$; and so the hyperbola is symmetric about the y-axis.

3) The hyperbola is unbounded in both the x- and y-directions. Obviously $f(x)$ and $g(x)$ are defined whenever $|x| \ge a$. As $x \to \infty$, $f(x) \to \infty$ and $g(x) \to -\infty$. And as $x \to -\infty$, $f(x) \to \infty$ and $g(x) \to -\infty$.

It remains to discuss the two lines which the curve seems to be getting close to when both $x$ and $y$ become numerically large. The behavior of the hyperbola relative to these lines seems to be similar to that of the curve

$$y = f(x) = \frac{1}{x}$$

relative to the coordinate axes. The coordinate axes are called asymptotes of this curve. By this we mean, roughly speaking, that points of the curve, far from the origin, in the appropriate directions, are close to the axes. We want to extend this idea to cases in which the asymptote is neither horizontal nor vertical.

As $x \to \infty$, the distance from the point $P = (x, y)$ to the x-axis approaches 0. (This distance is $MP = |y| = |1/x|$.) We shall take this property as our definition of an asymptote. That is, a line $L$ is an asymptote of a function-graph if the distance from the line to the point $(x, f(x))$ approaches 0 as $x \to \infty$, or as $x \to -\infty$.

It is evident that the x-axis is an asymptote of the graph of $f(x) = 1/x$ under this definition. In fact, the x-axis is an asymptote in both the positive and negative directions. We also say that a line $L$ is an asymptote of a curve $C$ if $C$ contains a function-graph which has $L$ as an asymptote. In the case of $y = 1/x$, we also have $x = g(y) = 1/y$; thus the curve, looked at sidewise, is still a function-graph, and has the y-axis as an asymptote, in both the positive and negative directions. This is shown in the left-hand figure below.

[Figure, left: the graph of y = f(x) = 1/x (x ≠ 0), also described by x = g(y) = 1/y (y ≠ 0), with lim MP = 0 as y → ∞ and as y → -∞.]

We return to our hyperbola. In the last figure on the preceding page, the slope of the segment from the origin to the point $P = (x, y)$ is

$$m(x) = \frac{y}{x} = \frac{1}{x}\cdot\frac{b}{a}\sqrt{x^2 - a^2} = \frac{b}{a}\sqrt{1 - \frac{a^2}{x^2}}.$$

Obviously

$$\lim_{x \to \infty} m(x) = \frac{b}{a},$$

and this suggests that the line $y = bx/a$, or $x/a - y/b = 0$, is an asymptote of the part of the curve that lies in the first quadrant. If we show this, then it will follow by symmetry that the lines $x/a \pm y/b = 0$ are asymptotes of the curve in all four quadrants.

Thus we need to show that $\lim_{x \to \infty} MP = 0$. Since $MP < NP$, it will be sufficient to show that $\lim NP = 0$. This can be done by an algebraic trick.

$$NP = \frac{b}{a}x - \frac{b}{a}\sqrt{x^2 - a^2} = \frac{b}{a}\left(x - \sqrt{x^2 - a^2}\right)$$
$$= \frac{b}{a}\left(x - \sqrt{x^2 - a^2}\right)\cdot\frac{x + \sqrt{x^2 - a^2}}{x + \sqrt{x^2 - a^2}}$$
$$= \frac{b}{a}\cdot\frac{a^2}{x + \sqrt{x^2 - a^2}}.$$

Obviously $NP \to 0$ as $x \to \infty$. Therefore $MP \to 0$, which was to be proved. This gives the following theorem.

Theorem 2. The lines
$$\frac{x}{a} \pm \frac{y}{b} = 0$$
are asymptotes of the hyperbola
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$

You can sketch a hyperbola by drawing in the asymptotes and x-intercepts exactly, and then filling in the curve freehand.
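The algebraic trick in the proof can be watched numerically. The sketch below is an illustration only; the function names are hypothetical. It computes the vertical gap $NP$ between the line $y = bx/a$ and the upper branch of the hyperbola in two ways, directly and by the rationalized expression $ab/(x + \sqrt{x^2 - a^2})$, for $a = 3$, $b = 2$.

```python
import math

# Sketch: the vertical gap NP between the asymptote y = (b/a) x and the
# first-quadrant branch y = (b/a) sqrt(x^2 - a^2), computed two ways.
# Function names are illustrative assumptions, not from the text.

def gap_direct(x, a, b):
    return (b / a) * x - (b / a) * math.sqrt(x * x - a * a)

def gap_rationalized(x, a, b):
    return a * b / (x + math.sqrt(x * x - a * a))

a, b = 3, 2
for x in (3, 10, 100, 1000):
    print(x, gap_direct(x, a, b), gap_rationalized(x, a, b))
```

The two columns agree up to rounding and shrink roughly like $ab/(2x)$. The rationalized form is also the better one to use for very large $x$, since the direct form subtracts two nearly equal numbers.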

For example, consider

$$\frac{x^2}{9} - \frac{y^2}{4} = 1.$$

The x-intercepts are at $x = \pm 3$, and the asymptotes are the lines

$$\frac{x}{3} \pm \frac{y}{2} = 0, \quad\text{or}\quad y = \pm\tfrac{2}{3}x.$$

A hyperbola whose asymptotes are perpendicular is called rectangular.

[Figure: a rectangular hyperbola with its asymptotes y = x and y = -x.]

If such a hyperbola is in standard position, then the asymptotes must be the lines $x \pm y = 0$, and the equation must have the form

$$\frac{x^2}{a^2} - \frac{y^2}{a^2} = 1, \quad\text{or}\quad x^2 - y^2 = a^2.$$

If the foci are on the y-axis, at the points $(0, \pm c)$, then the equation of the hyperbola takes the form

$$\frac{y^2}{a^2} - \frac{x^2}{c^2 - a^2} = 1.$$

It follows that the graph of the condition

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \pm 1$$

is the union of two hyperbolas with the same asymptotes. These are called conjugate hyperbolas.


PROBLEM SET 8.3


Sketch the graphs of the following equations.
l.

x2 - 4y2

4. y2 - 4x2

7. -x2 + 9y2

2. y2 - 4x2

-4

10. 25 y2 - 4x2

5. 9x2 - 4y2

8.

6. 9x2 - 4y2

36

-9x2 + y2

3. x2 - 4y2

9. 25x2 - 4y2

-4

-36

100

100

Derive equations for the hyperbolas determined by the following conditions, and sketch.

11. Foci at $(\pm 2, 0)$; focal difference 3.
12. Foci at (2, 2); focal difference 3.
13. Foci at $(0, \pm 2)$; focal difference 3.
14. Foci at $(0, 0)$ and $(0, 4)$; focal difference 3.
15. Foci at (1, 1); focal difference 2.
16. Foci at $(\pm 2, 0)$; passing through the point $(3, 4)$.
17. Foci at $(\pm 2, 0)$; focal difference 2.
18. Foci at $(\pm 3, 0)$; focal difference 4.
19. Foci at $(0, \pm 3)$; focal difference 4.
20. Foci at $(\pm 3, 0)$; passing through the point $(5, 5)$.

21. Given $F$, $F'$, and $a$, as for a hyperbola in standard position. What is the graph of the condition $FP - F'P = 2a$? How about the graph of $FP - F'P = -2a$?

22. Find a rectangular hyperbola in standard position (with asymptotes $x + y = 0$ and $x - y = 0$) passing through the point $(5, 3)$.

Investigate the graphs of the following equations. In each case, find all asymptotes.
23. (x2 - y2 - 1)2
25.

x2y2 - xy +

=0
=

24. (x2 - y2)2


26. y

27. Let D be the line x

I
------

(x - l)(x - 2)

-1, let F be the origin, and for each point P let DP be the

perpendicular distance between D and P. Let C be the graph of the condition


FP

DP=
What sort of curve is this?

Sketch.

28. Let F and D be as in the preceding problem, and let C' be the graph of the condition
FP
DP
What sort of curve is this?

Sketch.

29. _Let G be the set of points P such that CP


D is the line x

=
1.
=

2DP, where C is the circle x2 + y2

I and

4. What sort of figme is G? Discuss and sketch.

30. The following passage occurs in the U.S.Internal Revenue Act of 1964.
" ...There shall be allowed as a deduction moving expenses paid ... in connection
with the commencement of work by the taxpayer ... at a new principal place of work ...
[However,] no deduction shall be allowed ... unless ... the taxpayer's new principal
place of work ... is at least 20 miles farther from his former residence than was his
former principal place of work ..."

Give a sketch, showing what this means. Your sketch should show (a) the former
residence, (b) the former place of work, and (c) the region in which the new place of
work must lie, for the moving expenses to be deductible. (The author is indebted,
for this problem, to Dr. Henry Pollak, of the Bell Telephone Laboratories.)
31. The region between two conjugate hyperbolas stretches out infinitely far, in each of four directions. Find out whether the area of such a region is finite.

*32. Given $0 < a < c$, as in the definition of a hyperbola. Let $P = (x, y)$ be a point satisfying the equation
$$\frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1.$$
Let $F = (-c, 0)$, $F' = (c, 0)$.

a) Show that
$$y^2 = \frac{c^2 - a^2}{a^2}(x^2 - a^2).$$

b) Show that
$$FP - F'P = \frac{1}{a}\sqrt{(cx + a^2)^2} - \frac{1}{a}\sqrt{(cx - a^2)^2}.$$

c) Show that, if $x \ge a$, then
$$cx + a^2 > 0, \qquad cx - a^2 > 0, \qquad\text{and}\qquad FP - F'P = 2a.$$

d) Show that if $x \le -a$, then
$$cx + a^2 < 0, \qquad cx - a^2 < 0, \qquad\text{and}\qquad FP - F'P = -2a.$$

(This completes the proof of Theorem 1.)


8.4 THE GENERAL EQUATION OF
THE SECOND DEGREE. ROTATION OF AXES

An equation of the second degree in $x$ and $y$ is an equation of the form

$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0,$$

where at least one of the coefficients $A$, $B$, and $C$ is different from zero. The latter condition is to guarantee that the degree of the equation really is 2, rather than 1 or 0. We have found that all conic sections are graphs of equations of this type; and we shall now investigate the converse. That is, we propose to find out what sort of figure can be the graph of a second-degree equation. The possibilities that we have already found are

a) a circle,
b) a parabola,
c) an ellipse,
d) a hyperbola.

There are other possibilities, which we noted as exceptional cases when we were studying the equation

$$x^2 + y^2 + Dx + Ey + F = 0,$$

in connection with the circle. The graph of

$$x^2 + y^2 = 0$$

is a point; and the graph of

$$x^2 + y^2 + 1 = 0$$

is empty. (See Theorem 2 of Section 2.3.) Our list of possible graphs of second-degree equations must therefore include

e) a point, and
f) the empty set.

And this is not all. The graph of

$$y^2 = 0$$

is a line, namely, the x-axis. And the graph of

$$xy = 0$$

is the union of two lines, namely, the two axes. Similarly, the graph of

$$x^2 - y^2 = 0$$

is the union of two lines. The reason is that $x^2 - y^2 = (x - y)(x + y)$. This is $= 0$ if and only if either $x - y = 0$ or $x + y = 0$. Therefore a point $P = (x, y)$ is on the graph of $x^2 - y^2 = 0$ if and only if (i) $P$ is on the line $y = x$ or (ii) $P$ is on the line $y = -x$.

In this example, the lines intersect, but we may get the union of two parallel lines. The equation

$$x^2 - x = 0$$

is equivalent to

$$x(x - 1) = 0;$$

and the graph is therefore the union of the two parallel lines $x = 0$ and $x = 1$. Thus the graphs, for the general equation of the second degree, include

g) a line, and
h) the union of two lines, either parallel or intersecting.

We shall show that the eight possibilities that we have just listed are the only possibilities. The method will be to reduce the equation to a recognizable form by moving the axes. In some cases, this cannot be done by translation; we may also have to use rotation of the axes.

Suppose that we rotate the axes through an angle of measure $\theta$, getting a new pair of axes.

[Figure: the old axes x, y and the rotated axes x', y'; the segment OP has length r and makes the angle φ with the old x-axis.]

In the figure, $r$ is the distance $OP$; $P$ has coordinates $x$, $y$ in the old coordinate system, and coordinates $x'$, $y'$ in the new coordinate system. Evidently

$$x = r\cos\varphi, \qquad y = r\sin\varphi,$$
$$x' = r\cos(\varphi - \theta) = r\cos\varphi\cos\theta + r\sin\varphi\sin\theta,$$
$$y' = r\sin(\varphi - \theta) = r\sin\varphi\cos\theta - r\cos\varphi\sin\theta.$$

Therefore the new coordinates are given in terms of the old ones by the formulas

$$x' = x\cos\theta + y\sin\theta,$$
$$y' = -x\sin\theta + y\cos\theta.$$

If we rotate the new axes through an angle of measure $-\theta$ we are back where we started. Therefore the old coordinates are expressed in terms of the new ones by the formulas

$$x = x'\cos(-\theta) + y'\sin(-\theta),$$
$$y = -x'\sin(-\theta) + y'\cos(-\theta).$$

These give

$$x = x'\cos\theta - y'\sin\theta,$$
$$y = x'\sin\theta + y'\cos\theta.$$

Theorem 1. In any second-degree equation, the xy-term can be eliminated by a rotation of axes.
Before going into the proof, let us try a simple example:

$$xy = 1.$$

To rotate the axes through an angle $\theta$, we should substitute

$$x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta. \qquad (1)$$

The equation then becomes

$$(x'\cos\theta - y'\sin\theta)(x'\sin\theta + y'\cos\theta) = 1,$$

or

$$x'^2\sin\theta\cos\theta + x'y'(\cos^2\theta - \sin^2\theta) - y'^2\sin\theta\cos\theta = 1.$$

We want the $x'y'$-term to vanish. Thus we want

$$\cos^2\theta - \sin^2\theta = 0, \quad\text{or}\quad \cos 2\theta = 0;$$

and this will happen when

$$2\theta = \frac{\pi}{2} + n\pi, \quad\text{or}\quad \theta = \frac{\pi}{4} + \frac{n\pi}{2}.$$

One value of $\theta$ is all we need, and so we take $\theta = \pi/4$, which gives

$$\sin\theta = \cos\theta = \frac{\sqrt{2}}{2} \quad\text{and}\quad \sin\theta\cos\theta = \frac{1}{2}.$$

Thus our new equation is

$$\frac{x'^2}{2} - \frac{y'^2}{2} = 1.$$

This is the equation of a rectangular hyperbola.
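The rotation formulas and the example $xy = 1$ can be checked on a few sample points. This is a minimal sketch with hypothetical function names: it converts points of the curve $xy = 1$ to the new coordinates for $\theta = \pi/4$ and evaluates $x'^2/2 - y'^2/2$, which should come out equal to 1 up to rounding.

```python
import math

# Sketch of the rotation formulas.  Function names are illustrative assumptions.

def new_from_old(x, y, theta):
    """x' = x cos(theta) + y sin(theta), y' = -x sin(theta) + y cos(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return (x * c + y * s, -x * s + y * c)

def old_from_new(xp, yp, theta):
    """x = x' cos(theta) - y' sin(theta), y = x' sin(theta) + y' cos(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return (xp * c - yp * s, xp * s + yp * c)

theta = math.pi / 4
for x in (0.5, 1.0, 2.0, -3.0):
    y = 1 / x                               # a point on the curve xy = 1
    xp, yp = new_from_old(x, y, theta)
    print(x, xp * xp / 2 - yp * yp / 2)     # value of x'^2/2 - y'^2/2
```

Applying `old_from_new` to the result of `new_from_old` recovers the original point, which reflects the remark that a rotation through $-\theta$ undoes a rotation through $\theta$.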

Let us now return to our general equation

$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0.$$

Making the usual substitution, to rotate the axes through $\theta$, we get

$$A(x'\cos\theta - y'\sin\theta)^2 + B(x'\cos\theta - y'\sin\theta)(x'\sin\theta + y'\cos\theta)$$
$$+\ C(x'\sin\theta + y'\cos\theta)^2 + D(x'\cos\theta - y'\sin\theta)$$
$$+\ E(x'\sin\theta + y'\cos\theta) + F = 0.$$

When we collect coefficients for the terms of various types, we get a new equation of the same form, like this:

$$A'x'^2 + B'x'y' + C'y'^2 + D'x' + E'y' + F' = 0.$$

Algebraically,

$$A' = A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta,$$
$$B' = -2A\sin\theta\cos\theta + B(\cos^2\theta - \sin^2\theta) + 2C\sin\theta\cos\theta,$$
$$C' = A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta,$$
$$D' = D\cos\theta + E\sin\theta,$$
$$E' = -D\sin\theta + E\cos\theta,$$
$$F' = F.$$

For future reference, we have written down all of these, but for the moment, all we are interested in is $B'$: we want to find a $\theta$ that makes $B' = 0$. Simplifying trigonometrically, we get

$$B' = (C - A)\sin 2\theta + B\cos 2\theta.$$

There are now two cases:

1) If $A = C$, then $B' = B\cos 2\theta$. We must have $B \neq 0$, or there wouldn't be any xy-term in the original equation. Therefore $B' = 0$ when $\cos 2\theta = 0$, and $\cos 2\theta = 0$ when

$$\theta = \frac{\pi}{4}.$$

Thus a rotation through $\pi/4$ eliminates the xy-term whenever $A = C$.

2) If $A \neq C$, then we can divide by $A - C$. Therefore $B' = 0$ when

$$B\cos 2\theta = (A - C)\sin 2\theta, \quad\text{or}\quad \frac{B}{A - C} = \tan 2\theta.$$

Thus, to get $B' = 0$, we take

$$\theta = \frac{1}{2}\,\mathrm{Tan}^{-1}\frac{B}{A - C}.$$

This proves the theorem. (The theorem did not say that the coefficients in the new equation were easy to compute.)
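The proof is constructive, so it can be followed step by step in code. The sketch below uses hypothetical function names: it picks $\theta$ according to the two cases of the proof and then evaluates the six formulas for $A', \ldots, F'$. The branch for $B = 0$ is an added convenience and is not one of the cases in the text; it simply returns $\theta = 0$ when there is no xy-term to remove.

```python
import math

# Sketch of the proof of Theorem 1.  Function names are illustrative assumptions.

def rotation_angle(A, B, C):
    """An angle theta for which the rotated equation has no x'y'-term."""
    if B == 0:
        return 0.0                      # added case: nothing to eliminate
    if A == C:
        return math.pi / 4              # case 1 of the proof
    return 0.5 * math.atan(B / (A - C)) # case 2 of the proof

def rotate_conic(A, B, C, D, E, F, theta):
    """Coefficients (A', B', C', D', E', F') after rotating the axes through theta."""
    c, s = math.cos(theta), math.sin(theta)
    A_new = A * c * c + B * s * c + C * s * s
    B_new = -2 * A * s * c + B * (c * c - s * s) + 2 * C * s * c
    C_new = A * s * s - B * s * c + C * c * c
    D_new = D * c + E * s
    E_new = -D * s + E * c
    return (A_new, B_new, C_new, D_new, E_new, F)

# The example xy = 1, written as xy - 1 = 0.
theta = rotation_angle(0, 1, 0)
print(theta, rotate_conic(0, 1, 0, 0, 0, -1, theta))
```

For $xy - 1 = 0$ the new coefficients are $A' = 1/2$, $C' = -1/2$, and $B'$ equal to 0 up to rounding, in agreement with the equation $x'^2/2 - y'^2/2 = 1$ found by hand.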


Theorem 2. The graph of a second-degree equation is (a) a circle, (b) a parabola, (c) an ellipse, (d) a hyperbola, (e) a point, (f) the empty set, (g) a line, or (h) the union of two lines (either parallel or intersecting).

Proof. By the preceding theorem, we can assume that there is no xy-term. The equation then has the form

$$Ax^2 + Cy^2 + Dx + Ey + F = 0.$$

We now need to discuss various cases.

1) Suppose that neither $A$ nor $C$ is 0. We can then write

$$A\left(x^2 + \frac{D}{A}x\right) + C\left(y^2 + \frac{E}{C}y\right) = -F,$$

and complete the square twice to get

$$A\left(x + \frac{D}{2A}\right)^2 + C\left(y + \frac{E}{2C}\right)^2 = -F + \frac{D^2}{4A} + \frac{E^2}{4C},$$

which has the form

$$Ax'^2 + Cy'^2 = F'.$$

Here we have translated the axes, letting

$$x' = x + \frac{D}{2A}, \qquad y' = y + \frac{E}{2C}.$$

Since $A \neq 0$, we can divide by $A$, getting

$$x'^2 + C'y'^2 = F'' \qquad (C' = C/A \neq 0).$$

There are six possibilities to be considered. For each of these cases, we have indicated on the right what sort of figure the graph is.

C' > 0, F'' > 0: a circle or ellipse
C' > 0, F'' = 0: a point
C' > 0, F'' < 0: the empty set
C' < 0, F'' > 0: a hyperbola
C' < 0, F'' = 0: two intersecting lines
C' < 0, F'' < 0: a hyperbola (with foci on the new y-axis)

2) Suppose that $C = 0$. The equation then has the form

$$Ax^2 + Dx + Ey + F = 0,$$

where $A \neq 0$, because the degree is 2. We divide by $A$, getting

$$x^2 + D'x + E'y + F' = 0;$$

and then we complete the square in $x$, so as to eliminate the linear term in $x$. This gives an equation of the form

$$x'^2 + F'' = -E'y.$$

For $E' \neq 0$, this is a parabola. For $E' = 0$, the equation $x'^2 = -F''$ gives one line, two lines, or the empty set.

3) Suppose that $A = 0$. This is exactly like Case 2; we interchange $x$ and $y$, and proceed as before. This completes the proof of the theorem.
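The six possibilities in Case 1 amount to a small decision table. The following sketch simply transcribes that table; the function name `classify_central` and the returned strings are assumptions made for illustration, and the split of "a circle or ellipse" according to whether $C' = 1$ is added here as a convenience.

```python
# Sketch of the table in Case 1: the graph of x'^2 + C1 * y'^2 = F2, with C1 != 0.
# The name classify_central and the returned labels are illustrative assumptions.

def classify_central(C1, F2):
    if C1 == 0:
        raise ValueError("Case 1 assumes C' != 0")
    if C1 > 0:
        if F2 > 0:
            return "circle" if C1 == 1 else "ellipse"
        if F2 == 0:
            return "point"
        return "empty set"
    if F2 > 0:
        return "hyperbola"
    if F2 == 0:
        return "two intersecting lines"
    return "hyperbola (foci on the new y-axis)"

# 4x'^2 + 9y'^2 = 36, divided by 4, is x'^2 + (9/4) y'^2 = 9.
print(classify_central(9 / 4, 9))
print(classify_central(-1, 0))    # x^2 - y^2 = 0
print(classify_central(1, -1))    # x^2 + y^2 = -1
```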


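It is easy to compute the new coefficients produced by a translation of axes. For a rotation, the new coefficients are expressed in terms of $\sin\theta$ and $\cos\theta$, and $\theta$ is defined by the equation

$$\theta = \frac{1}{2}\,\mathrm{Tan}^{-1}\frac{B}{A - C}.$$

Thus we want to express $\sin\theta$ and $\cos\theta$ in terms of $\tan 2\theta$ $(= B/(A - C))$ for the case where

$$-\frac{\pi}{2} < 2\theta < \frac{\pi}{2}.$$

When $2\theta$ is in the first or fourth quadrant, $\cos 2\theta > 0$, and $\sin 2\theta$ has the same sign as $\tan 2\theta$.

[Figure: the angle 2θ in the first and in the fourth quadrant, drawn in right triangles with hypotenuse \sqrt{1 + k^2}.]

In the figure,

$$k = \tan 2\theta = \frac{B}{A - C}.$$

Therefore

$$\cos 2\theta = \frac{1}{\sqrt{1 + k^2}}.$$

The half-angle formulas are

$$\cos\frac{x}{2} = \pm\sqrt{\frac{1 + \cos x}{2}}, \qquad \sin\frac{x}{2} = \pm\sqrt{\frac{1 - \cos x}{2}}.$$

For the present case, these give

$$\cos\theta = \sqrt{\frac{1 + \cos 2\theta}{2}}, \qquad \sin\theta = \pm\sqrt{\frac{1 - \cos 2\theta}{2}},$$

where

$$\cos 2\theta = \frac{1}{\sqrt{1 + k^2}}$$

and where the sign in the formula for $\sin\theta$ is the same as the sign of $k = \tan 2\theta$.

For example, consider

$$3x^2 + 2xy + y^2 = 1.$$

Here

$$A = 3, \qquad B = 2, \qquad C = 1,$$

and

$$k = \frac{B}{A - C} = \frac{2}{3 - 1} = 1.$$

Therefore

$$\cos 2\theta = \frac{1}{\sqrt{1 + k^2}} = \frac{1}{\sqrt{2}}.$$

Hence

$$\cos\theta = \sqrt{\frac{1 + 1/\sqrt{2}}{2}} = \sqrt{\frac{2 + \sqrt{2}}{4}}$$

and

$$\sin\theta = \sqrt{\frac{1 - 1/\sqrt{2}}{2}} = \sqrt{\frac{2 - \sqrt{2}}{4}}.$$

(In the second formula, $\sin\theta > 0$ because $k > 0$.) Therefore

$$\cos^2\theta = \frac{2 + \sqrt{2}}{4}, \qquad \sin^2\theta = \frac{2 - \sqrt{2}}{4}, \qquad \sin\theta\cos\theta = \frac{\sqrt{2}}{4}.$$

The new equation is

$$A'x'^2 + C'y'^2 = 1,$$

where

$$A' = A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta = 3\cdot\frac{2 + \sqrt{2}}{4} + 2\cdot\frac{\sqrt{2}}{4} + \frac{2 - \sqrt{2}}{4} = 2 + \sqrt{2}$$

and

$$C' = A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta = 3\cdot\frac{2 - \sqrt{2}}{4} - 2\cdot\frac{\sqrt{2}}{4} + \frac{2 + \sqrt{2}}{4} = 2 - \sqrt{2}.$$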
PROBLEM SET 8.4


In these problems, when you are asked to investigate an equation, you should find out what sort of figure the graph is, and sketch. If the graph is a conic section, you should also find the coefficients in the standard form.

1. Investigate

x - xy

(Here it is easier to translate first and rotate second.

Sketch, showing all three sets

of axes.)
2.

Investigate

3.

Investigate

x2 - xy

4.

xy -

1.

- 2y

0.

Investigate

5 . . Investigate
6. Investigate
7.

2xy - y2 + 2

0.

x2 + 2xy + y2 + 2x + 2y + l = 0.

x2 + 4xy + 4y2 + 4x + 8y + 3

0.

Show that, under a rotation of axes,

$$A' + C' = A + C \quad\text{and}\quad F' = F.$$

We express this by saying that $A + C$ and $F$ are invariant under rotation.

8. a) Given the general equation of the second degree.

Let

A0, B0, C0,

be the new

coefficients, when the axes are rotated through an angle of measure 8. Thus

C8,

are the

A', B', C', . .


A8
B8
C0

A cos2 8 + B sin 8 cos 8 + C sin2 8,


(C - A) sin 28 + B cos 28,
A sin2 8 - B sin 8 cos 8 + C cos2 8.

Show that the derivatives

A, B, C satisfy the differential equations

b) Show that the function


f(8)
is a constant.

A8, B8,

. of the text; and so

Thus we say that

B - 4A8C8

B 2 - 4AC is invariant

under rotation of axes.

It

may be of some interest to check this, in the cases where we have computed the new
coefficients.
9. Given $x^2 + 2xy + 3y^2 + 4x + 5y + 6 = 0$, the axes are rotated so as to eliminate the xy-term. What are the possibilities for the coefficients of $x^2$ and $y^2$, in the new equation?
10. Same question, for the equation $x^2 + 2xy + 5y^2 - 10 = 0$.

11. Same question for $4x^2 + \sqrt{3}\,xy + y^2 + 2x + 3 = 0$.

12. a) Let $D$ be a line, let $F$ be a point not on $D$, and let $e$ be a positive number. Let $G$ be the set of all points $P$ such that

\[ \frac{FP}{DP} = e, \]

where $DP$ is the perpendicular distance from $D$ to $P$. $G$ is called the conic section with directrix $D$, focus $F$, and eccentricity $e$. Show that $G$ is (a) an ellipse if $e < 1$, (b) a parabola if $e = 1$, and (c) a hyperbola if $e > 1$.

b) Is a circle a conic section, in the sense defined in Problem 12(a)? Why or why not?
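
The invariance statements in Problems 7 and 8 can be spot-checked numerically. The following is only a sketch, not part of the text: the function names and the sample coefficients are assumptions chosen for illustration, and a numerical check is of course no substitute for the proof the problems ask for.

```python
import math

def rotate_coefficients(A, B, C, theta):
    """New second-degree coefficients (A', B', C') after rotating the axes by theta."""
    s, c = math.sin(theta), math.cos(theta)
    A_new = A * c * c + B * s * c + C * s * s
    B_new = (C - A) * math.sin(2 * theta) + B * math.cos(2 * theta)
    C_new = A * s * s - B * s * c + C * c * c
    return A_new, B_new, C_new

def invariants(A, B, C):
    """The quantities A + C and B^2 - 4AC."""
    return A + C, B * B - 4 * A * C

# Illustrative coefficients (an arbitrary choice): A = 1, B = 2, C = 3.
A, B, C = 1.0, 2.0, 3.0
for theta in (0.3, 1.0, 2.5):
    total, disc = invariants(*rotate_coefficients(A, B, C, theta))
    print(f"theta = {theta}: A' + C' = {total:.12f}, B'^2 - 4A'C' = {disc:.12f}")
```

For every angle the two printed quantities should agree, up to rounding, with $A + C$ and $B^2 - 4AC$ computed from the original coefficients.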

Paths and Vectors in a Plane

9.1

MOTION OF A PARTICLE IN A PLANE

To describe the motion of a particle in a plane $E$, we need to explain where the particle is, at each time $t$ on a certain interval $I$. Thus to each time $t$ on the given interval $I$ there corresponds a point $P(t)$; and the motion is described by a function

\[ P: I \to E. \]

For the motion shown in the figure, $I$ is the infinite interval $[0, \infty)$, and the initial point is the point $P(0) = (1, 1)$. In general:

Definition. A plane path is a function

\[ P: I \to E, \]

where $I$ is an interval and $E$ is a plane.

The same idea applies more generally: a path in space is a function

\[ P: I \to S, \]

where $I$ is an interval and $S$ is space. In this chapter we shall be dealing only with plane paths, and so we shall refer to them for short simply as paths.

The locus of a path is the curve which is traced out by the moving point. More precisely:

Definition. Given a path

\[ P: I \to E, \]

the locus of $P$ is the set of all points $Q$ which are $= P(t)$ for some $t$ in $I$.

Briefly, the locus of $P$ is the image of $I$ under the function $P$. The locus is determined when the path is named, but given a locus, the path is not determined: the same curve can be traced out by a moving point in infinitely many ways.

We describe a path in a coordinate plane by defining two functions which give the coordinates of the moving point for each time $t$. For example, we might take

\[ x = f(t) = 4t, \qquad y = g(t) = 8t^2. \]

Here $I = (-\infty, \infty)$, and

\[ P(t) = (4t, 8t^2). \]

At $t = 0$, $P(t) = (0, 0)$. As $t$ increases, starting from 0, both $x$ and $y$ increase, but $y$ increases faster. In fact, the locus of this path is a parabola. To see this, we observe that from the first equation, $t = x/4$. Substituting in the second equation, we get

\[ y = 8\left(\frac{x}{4}\right)^2 = \tfrac{1}{2}x^2. \]

Thus every point of the path lies on the graph of the equation $y = \tfrac{1}{2}x^2$. And it is easy to check, conversely, that every point $(x, y)$ of the parabola is on the path.

When a path is described by a pair of functions $x = f(t)$, $y = g(t)$, the two functions are called the coordinate functions, and $t$ is called a parameter. Sometimes we can get a simple description of the locus of a path by writing an equation in $x$ and $y$. We then say that we have eliminated the parameter, getting a rectangular equation of the locus. Often this process is useful: a path may trace out a simple figure, such as a segment or a circle, in a complicated way; and when this happens we want to know it.
The process of getting rectangular equations for loci is often tricky. Consider, for example, the path $P$ described by the equations

\[ x = f(t) = t^2, \qquad y = g(t) = t^4. \]

Every point of this path lies on the parabola $y = x^2$. But the converse is not true. On the path, we always have $x \geq 0$, because $t^2 \geq 0$. Therefore the locus of $P$ is only half of the parabola.
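
The two eliminations of the parameter discussed above can be checked on sample values of $t$. The sketch below is an illustration only; the function names and the sampling grid are assumptions, not notation from the text.

```python
def parabola_path(t):
    """P(t) = (4t, 8t^2), the first example in the text."""
    return 4 * t, 8 * t * t

def half_parabola_path(t):
    """P(t) = (t^2, t^4), the second example in the text."""
    return t ** 2, t ** 4

# Sample parameter values; the grid itself is an arbitrary choice.
ts = [k / 4 for k in range(-12, 13)]

# Every sampled point of the first path satisfies y = x^2 / 2.
first_on_curve = all(abs(y - x * x / 2) < 1e-12 for x, y in map(parabola_path, ts))

# Every sampled point of the second path satisfies y = x^2 ...
second_on_curve = all(abs(y - x * x) < 1e-12 for x, y in map(half_parabola_path, ts))
# ... but x = t^2 is never negative, so the left half of the parabola is missed.
smallest_x = min(x for x, _ in map(half_parabola_path, ts))

print(first_on_curve, second_on_curve, smallest_x)
```

A rectangular equation therefore tells us which curve contains the locus, but it may take a separate argument to decide how much of that curve is actually traced out.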

We are always free to regard a parameter as representing time, and in many physical problems, this is what the parameter means. But it often makes equally good sense to regard the parameter as the measure of an angle. Consider, for example,

\[ x = \cos t, \qquad y = \sin t. \]

These functions describe uniform motion around a circle. Here we may regard the parameter as the measure of an angle, and write

\[ x = \cos\theta, \qquad y = \sin\theta \]

to describe the same path.

Somewhat similar looking paths have ellipses as their loci. For example, consider

\[ x = a\cos\theta, \qquad y = b\sin\theta. \]

Here the locus of $P$ is an ellipse:

\[ \frac{x}{a} = \cos\theta, \qquad \frac{y}{b} = \sin\theta, \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} = \cos^2\theta + \sin^2\theta = 1. \]

Investigating further, we see what values of $\theta$ correspond to what points of the elliptical locus. We draw circles of radii $a$ and $b$, with centers at the origin, and construct $\angle\theta$ in standard position.

In the figure,

\[ Q = (a\cos\theta, a\sin\theta), \qquad R = (b\cos\theta, b\sin\theta). \]

Therefore

\[ P = P(\theta) = (a\cos\theta, b\sin\theta). \]

Following the scheme of the above figure, using drawing instruments, you can plot as many points of the ellipse as you want to, without making any numerical calculations. The same idea is used in the construction of a drawing instrument called the ellipsograph, which can be adjusted so as to draw the ellipse with any pair of semiaxes $a$, $b$.
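
The two-circle construction can be imitated numerically: take the $x$-coordinate of $Q$ and the $y$-coordinate of $R$, and test the ellipse equation. This is a minimal sketch under the assumption that $a$ and $b$ are positive; the function names are illustrative.

```python
import math

def ellipse_point(a, b, theta):
    """Build P from Q on the circle of radius a and R on the circle of radius b."""
    qx, qy = a * math.cos(theta), a * math.sin(theta)   # Q = (a cos t, a sin t)
    rx, ry = b * math.cos(theta), b * math.sin(theta)   # R = (b cos t, b sin t)
    return qx, ry   # P takes its x-coordinate from Q and its y-coordinate from R

def ellipse_residual(a, b, theta):
    """x^2/a^2 + y^2/b^2 - 1 at P(theta); should vanish up to rounding."""
    x, y = ellipse_point(a, b, theta)
    return x * x / (a * a) + y * y / (b * b) - 1.0

for k in range(8):
    theta = k * math.pi / 4
    print(ellipse_point(3.0, 2.0, theta), ellipse_residual(3.0, 2.0, theta))
```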
PROBLEM SET 9.1

Investigate the paths described by the following pairs of coordinate functions, sketch the loci, and label a few points as $P(0)$, $P(\pi/4)$, and so on, so as to indicate the way in which the moving point traverses the locus.

1. $x = \sec\theta$, $y = \tan\theta$

2. $x = \cos\theta$, $y = \cos^2\theta$

3. $x = 2\cos\theta$, $y = \sin\theta$

4. $x = \cos^2\theta$, $y = \sin^2\theta$

5. $x = t^3$, $y = |t^3|$

(Check that not only $f(t) = t^3$ but also $g(t) = |t^3|$ have continuous derivatives. Thus a moving point can go smoothly around a sharp corner, if only it does so slowly enough.)

6. $x = \sec^2\theta$, $y = \tan^2\theta$

7. $x = \sec\theta$, $y = \cos\theta$

8. $x = \csc\theta$, $y = \cot\theta$

9. $x = \sin\theta$, $y = |\sin\theta|$

10. $x = t^6$, $y = t^4$

11. In the left-hand figure above, $\theta$ ranges over the open interval $(0, \pi)$, $OR = b$, and $QP$ is a constant $a$. Find a parametric description of the path, and sketch the locus.

12. In the right-hand figure above, $OR = b$ as before, and $QP$ is a constant $a$. Find a parametric description of the path, and sketch the loci, showing the three cases $a < b$, $a = b$, $a > b$.

13. A circle of radius $a$ rolls without slipping inside a circle of radius $2a$. The initial position is shown on the left below; a later position is shown on the right. Observe that $RQ = 2a\theta$, $PQ = a\phi$. Therefore $\phi = 2\theta$. Let

\[ S = (h, k) = (a\cos\theta, a\sin\theta). \]

Then

\[ x' = a\cos(\theta - \phi), \qquad y' = a\sin(\theta - \phi), \]
\[ x = x' + h, \qquad y = y' + k. \]

Complete the discussion to get a parametric description of the path, and find out what the locus is. It will turn out that the figure on the right above is slightly misleading.

**14. If you solved the preceding problem correctly, you found that some of the machinery that you used was not necessary after all. But consider the case where the outer circle has radius $a$ and the inner circle has radius $b = a/4$. Find parametric equations for the path, and eliminate the parameter to get the rectangular equation

\[ x^{2/3} + y^{2/3} = a^{2/3}. \]

This curve is called a four-cusped hypocycloid.

*15. Show that if $f$ is differentiable for $x \neq a$, $f$ is continuous at $a$, and $\lim_{x \to a} f'(x) = k$, then $f$ is differentiable at $a$, and $f'(a) = k$. [Hint: The theorem is conceptual, and the proof goes back to first principles. Start by writing out the hypothesis and conclusion in terms of the basic definitions of the statements (a) $\lim_{x \to a} f'(x) = k$ and (b) $f'(a) = k$.]

9.2

THE PARAMETRIC MEAN-VALUE THEOREM; L'HOPITAL'S RULE

Given a path described parametrically by a pair of coordinate functions

\[ x = f(t), \qquad y = g(t), \]

we may want to find the slope of the tangent line at the point corresponding to a particular $t$.

In the figure, we see the path; we want to find the slope of the tangent at $P$, if such a tangent exists. Suppose that $P$ is the point corresponding to a certain $t$; and let $Q$ be the neighboring point corresponding to $t + \Delta t$. Let

\[ \Delta x = f(t + \Delta t) - f(t) \qquad \text{and} \qquad \Delta y = g(t + \Delta t) - g(t), \]

as indicated in the figure. Then the slope of the tangent at $P$ is

\[ m = \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta x}, \]

if such a limit exists. Suppose now that $f$ and $g$ are differentiable, and that $f'(t) \neq 0$. Then we can write

\[ m = \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta t \to 0} \frac{\Delta y/\Delta t}{\Delta x/\Delta t} = \lim_{\Delta t \to 0} \frac{[g(t + \Delta t) - g(t)]/\Delta t}{[f(t + \Delta t) - f(t)]/\Delta t} \]
\[ = \frac{\lim_{\Delta t \to 0} \{[g(t + \Delta t) - g(t)]/\Delta t\}}{\lim_{\Delta t \to 0} \{[f(t + \Delta t) - f(t)]/\Delta t\}} = \frac{g'(t)}{f'(t)}. \]

Thus we get the formula

\[ m = \frac{g'(t)}{f'(t)}. \]

This will be called the parametric slope formula. We have shown:

Theorem 1. Given a path defined by functions

\[ x = f(t), \qquad y = g(t). \]

If $f$ and $g$ are differentiable at $t$, and $f'(t) \neq 0$, then the path has a tangent at the corresponding point $P$, and the slope of the tangent is given by the formula

\[ m = m(t) = \frac{g'(t)}{f'(t)}. \]

An important case is the one in which $f'(t) \neq 0$ for $a < t < b$. Here $x = f(t)$ can never take on the same value twice, and so the locus of the path is the graph of a function $\phi$.
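
The parametric slope formula is easy to try out numerically. The sketch below approximates $f'(t)$ and $g'(t)$ by central differences (an assumption made for the illustration; the step size `h` and the function names are not from the text) and compares the quotient with a slope known in advance.

```python
def parametric_slope(f, g, t, h=1e-6):
    """Approximate m(t) = g'(t) / f'(t) using central differences."""
    df = (f(t + h) - f(t - h)) / (2 * h)
    dg = (g(t + h) - g(t - h)) / (2 * h)
    if df == 0:
        raise ZeroDivisionError("f'(t) = 0: the slope formula does not apply here")
    return dg / df

# The path x = 4t, y = 8t^2 has locus y = x^2 / 2, so dy/dx = x = 4t.
f = lambda t: 4 * t
g = lambda t: 8 * t * t

t0 = 0.75
print(parametric_slope(f, g, t0), 4 * t0)
```

The two printed numbers should agree closely, since $g'(t)/f'(t) = 16t/4 = 4t$ for this path.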

If $P$ and $Q$ are the endpoints of the graph, as in the figure, then the slope of the secant line through $P$ and $Q$ is

\[ \frac{g(b) - g(a)}{f(b) - f(a)}. \]

By the mean-value theorem, there is a point $\bar{x}$ where the derivative $\phi'(\bar{x})$ is the slope of the secant line. Thus

\[ \frac{g(b) - g(a)}{f(b) - f(a)} = \phi'(\bar{x}). \]

This number $\bar{x}$ must have come from somewhere. That is, there must be a $\bar{t}$ between $a$ and $b$ such that

\[ \bar{x} = f(\bar{t}). \]

It follows that

\[ \phi'(\bar{x}) = \frac{g'(\bar{t})}{f'(\bar{t})}. \]

Therefore

\[ \frac{g(b) - g(a)}{f(b) - f(a)} = \frac{g'(\bar{t})}{f'(\bar{t})} \qquad (a < \bar{t} < b). \]

What we have just proved is a parametric form of the mean-value theorem. The idea is that, if a function-graph is presented parametrically, then we can rewrite the mean-value theorem parametrically, expressing both the slope of the secant and the slope of the tangent in terms of the parameter.

Theorem 2 (The parametric mean-value theorem). Given two continuous functions $f$ and $g$, for $a \leq t \leq b$. If both functions are differentiable for $a < t < b$, and $f'(t) \neq 0$ for $a < t < b$, then

\[ \frac{g(b) - g(a)}{f(b) - f(a)} = \frac{g'(\bar{t})}{f'(\bar{t})} \]

for some $\bar{t}$ between $a$ and $b$.
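
For a concrete pair of functions, the number $\bar{t}$ promised by the parametric mean-value theorem can be located numerically. The following sketch uses bisection, and it assumes that $g'(t)/f'(t)$ minus the secant slope changes sign on $[a, b]$; that assumption holds in the example chosen here but is not guaranteed in general, and the names are illustrative.

```python
def find_t_bar(f, g, df, dg, a, b, steps=60):
    """Bisection for a t_bar in (a, b) where g'(t_bar)/f'(t_bar) equals the secant slope."""
    secant = (g(b) - g(a)) / (f(b) - f(a))

    def gap(t):
        return dg(t) / df(t) - secant

    lo, hi = a, b
    if gap(lo) * gap(hi) > 0:
        raise ValueError("no sign change on [a, b]; bisection does not apply")
    for _ in range(steps):
        mid = (lo + hi) / 2
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Example (an arbitrary choice): f(t) = t^2, g(t) = t^3 on [1, 2].
t_bar = find_t_bar(lambda t: t ** 2, lambda t: t ** 3,
                   lambda t: 2 * t, lambda t: 3 * t ** 2, 1.0, 2.0)
print(t_bar)
```

Here the secant slope is $7/3$ and $g'(t)/f'(t) = 3t/2$, so the value found should be close to $14/9$.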

This theorem takes a simple form when

\[ f(a) = g(a) = 0. \]

In this case, the theorem says that

\[ \frac{g(b)}{f(b)} = \frac{g'(\bar{t})}{f'(\bar{t})}, \]

for some $\bar{t}$ between $a$ and $b$. This has the following consequence: if $g'(t)/f'(t)$ approaches a limit $L$, as $t \to a$, then $g(t)/f(t)$ approaches the same limit $L$. That is, if

\[ f(a) = g(a) = 0, \qquad \text{and} \qquad \lim_{t \to a} \frac{g'(t)}{f'(t)} = L, \]

then

\[ \lim_{t \to a} \frac{g(t)}{f(t)} = L. \]

This is called l'Hopital's rule. Roughly, the reason why it holds true is as follows. Since $\bar{t}$ is between $t$ and $a$, we know that

\[ t \approx a \;\Rightarrow\; \bar{t} \approx a. \]

But

\[ \bar{t} \approx a \;\Rightarrow\; \frac{g'(\bar{t})}{f'(\bar{t})} \approx L, \]

because $g'(t)/f'(t) \to L$. Therefore

\[ t \approx a \;\Rightarrow\; \bar{t} \approx a \;\Rightarrow\; \frac{g(t)}{f(t)} = \frac{g'(\bar{t})}{f'(\bar{t})} \approx L. \]

Therefore

\[ t \approx a \;\Rightarrow\; \frac{g(t)}{f(t)} \approx L, \]

and so

\[ \lim_{t \to a} \frac{g(t)}{f(t)} = L. \]

It is very easy to express this idea in the form of an $\epsilon$-$\delta$ proof; all we do is to formalize our statements involving "$\approx$" in the following way:

1) Hypothesis. For every $\epsilon > 0$ there is a $\delta > 0$ such that

\[ 0 < |t - a| < \delta \;\Rightarrow\; \left| \frac{g'(t)}{f'(t)} - L \right| < \epsilon. \]

2) Conclusion. For every $\epsilon > 0$ there is a $\delta > 0$ such that

\[ 0 < |t - a| < \delta \;\Rightarrow\; \left| \frac{g(t)}{f(t)} - L \right| < \epsilon. \]

We need to show that (1) $\Rightarrow$ (2). Given $\epsilon > 0$, as in (2), we take the $\delta > 0$ furnished by (1). For each $t$, let $\bar{t}$ be the $\bar{t}$ furnished by the parametric mean-value theorem. This is the $\delta$ that we need:

\[ 0 < |t - a| < \delta \;\Rightarrow\; 0 < |\bar{t} - a| < \delta \;\Rightarrow\; \left| \frac{g'(\bar{t})}{f'(\bar{t})} - L \right| < \epsilon \;\Rightarrow\; \left| \frac{g(t)}{f(t)} - L \right| < \epsilon. \]

These fit together to give the desired conclusion.

In the above discussion, we assumed that $f(a)$ and $g(a)$ were both defined, and were $= 0$. It would have been sufficient, however, to suppose that

\[ \lim_{t \to a} f(t) = \lim_{t \to a} g(t) = 0. \]

If these relations hold, and $f$ and $g$ are not defined at $a$, then we define $f(a)$ and $g(a)$ to be 0; $f$ and $g$ are then continuous, and the discussion of the limit of $g/f$ goes through exactly as before.

Using $x$ instead of $t$, we get our theorem in the following form:

Theorem 3 (l'Hopital's rule, first form). If

\[ \lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0 \qquad \text{and} \qquad \lim_{x \to a} \frac{g'(x)}{f'(x)} = L, \]

then

\[ \lim_{x \to a} \frac{g(x)}{f(x)} = L. \]
Let us now look at some applications. Consider

\[ \lim_{x \to 0} \frac{\sin x}{x}. \]

This satisfies the conditions of l'Hopital's rule, with

\[ g(x) = \sin x, \qquad f(x) = x. \]

We investigate the quotient of the derivatives:

\[ \lim_{x \to 0} \frac{\cos x}{1} = 1. \]

Therefore

\[ \lim_{x \to 0} \frac{\sin x}{x} = 1. \]

This discussion does not supersede the geometric proof of the same statement, given in Section 4.2. The reason is that to apply l'Hopital's rule, we had to know the derivative of the sine, and to find the derivative of the sine we needed to know that $\lim_{x \to 0} [(\sin x)/x] = 1$. Moreover, if you know the derivative of the sine, you can remind yourself of what $\lim_{x \to 0} [(\sin x)/x]$ is, without using l'Hopital's rule. The point is that

\[ \lim_{x \to 0} \frac{\sin x}{x} = \lim_{x \to 0} \frac{\sin x - \sin 0}{x - 0} = \sin' 0, \]

by definition of $\sin' 0$. Since $\sin' = \cos$, and $\cos 0 = 1$, we get the answer immediately.

It is not an accident that in applying the first form of l'Hopital's rule, we sometimes find that we are merely solving a differentiation problem. The reason is that the formula used in the definition of the derivative is always an instance of the rule, whenever the function is differentiable. By definition,

\[ F'(x_0) = \lim_{x \to x_0} \frac{F(x) - F(x_0)}{x - x_0}. \]

The indicated limit on the right satisfies the conditions of Theorem 3, with

\[ g(x) = F(x) - F(x_0) \to 0, \qquad f(x) = x - x_0 \to 0, \]

as $x \to x_0$. Thus every differentiation problem is a problem of the sort that l'Hopital's rule deals with. The rule, of course, applies in many other cases; and it is the other cases that make it significant.

For example,

\[ \lim_{x \to 0} \frac{\sin^2 x + x}{e^x - 1} = \lim_{x \to 0} \frac{2\sin x\cos x + 1}{e^x} = 1 \]

by the rule; and here the rule is needed.

Often, the application of l'Hopital's rule requires the use of some preliminary device. For example, consider the possible limit

\[ \lim_{x \to 0} x\cot x. \]

Here we should start by writing

\[ \lim_{x \to 0} \frac{x\cos x}{\sin x} \]

and then use Theorem 3 (unless we can think of something simpler).
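
A limit obtained from l'Hopital's rule can be made plausible by tabulating the quotient $g(x)/f(x)$ near $a$. The sketch below does this for the two examples just discussed; the helper name `quotient_table` and the choice of sample points $a + 10^{-k}$ are assumptions for illustration, and a table of values is only evidence, not a proof.

```python
import math

def quotient_table(g, f, a, exponents=range(1, 7)):
    """Values of g(x)/f(x) at x = a + 10**-k, approaching a from the right."""
    return [g(a + 10.0 ** -k) / f(a + 10.0 ** -k) for k in exponents]

# (sin^2 x + x) / (e^x - 1) as x -> 0
example_1 = quotient_table(lambda x: math.sin(x) ** 2 + x,
                           lambda x: math.exp(x) - 1, 0.0)

# x cot x, rewritten as (x cos x) / sin x, as x -> 0
example_2 = quotient_table(lambda x: x * math.cos(x), math.sin, 0.0)

for row in zip(example_1, example_2):
    print(row)
```

Both columns should settle toward 1 as the sample points approach 0.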


PROBLEM SET 9.2

Investigate the following indicated limits. (That is, calculate the ones which exist.)

1. limx cotx
x-o

cos2x - 1
4. lim
x2
x -o

7.
10.

e"'

- 1

Jim
.,_0 In (x + 1)
Jim x2 sec2 x
00-+rr/2

2.

sin2 e
lim
82

3.

o-o

o-o

5.

Jim
x-1

lnx

6.

-X -

. y2 - 2y + 4
8. ltm
y-2
y- 3
1

11. limx2 sin 2


x-o

sin3 e
lim
82

9.

x- I
lim -.,-e
- 1
x-1
Jim
.,-,,/2

12. Jim
x-rr

cos3x

x3 - 1

x4 - 1
. 2X - 1

Sill

The Parametric Mean-Value Theorem;

9.2

1 3.

1 6.

1 9.

21.

23.

*25.

ln2 x - 1

Jim

x2

x-e

Jim
e-rr/4
Jim

x-l

Jim
t-e

Jim

x-1

1 4.

sine - cose

1 7.

e - Tr/4

Jim

Jim x2 csc2 x
x-o

(t)
t

22.

e13

v( dt

24.

Jim x In x
CC-+O+

26.

*27. Jim x In (sin x cos x)

*28.

a;_.o+

29.

lim

x2 - 4x + 3

x-+1 X

2 - 3x + 2

- 1

[ J,"'
X

3x
.,. 20. Jim -1
x-oe

---

In

--

x-o

18.

391

tan x

Jim

1 5.

v:X

x-o

x2 - 1

In

sin x cos x - tan x

L'Hopital's Rule

[ f
[ l"'
sect

Jim
t-rr/2
Jim

csc x

x-rr

rr/2

]
]

Vl + sin3 t dt

rr/2

Vl + sin3 t dt

Jim x In sin x
X-+0+
Jim x In (cos2 x sin2 x)

x-o+

29. A circle starts off tangent to the $x$-axis at the origin. The circle then rolls, without slipping, along the $x$-axis. The point $P$ which started at the origin then traces out a path; this path is called a cycloid. (The same term cycloid is applied also to the locus of the path.) The parameter in the coordinate functions is the $\theta$ indicated in the figure. Sketch the locus, and calculate the coordinate functions of the path.

As the figure suggests, the easiest method is to use a "moving coordinate system," as in Problem 13 of Problem Set 9.1; we need to calculate the coordinates $(h, k)$ of the "moving origin" $O'$, and calculate $x'$ and $y'$ as $a\cos\phi$ and $a\sin\phi$. (What is $\phi$?)

30. a) When a circle rolls on the inside of another circle, we get a hypocycloid. In the figure, the fixed circle has radius $a$ and the rolling circle has radius $b$. Calculate the coordinate functions, using $\theta$ as the parameter. The answer is

\[ x = f(\theta) = (a - b)\cos\theta + b\cos\frac{b - a}{b}\theta, \]
\[ y = g(\theta) = (a - b)\sin\theta + b\sin\frac{b - a}{b}\theta. \]

b) Get a rectangular equation for the locus, for the case $b = a/4$. Sketch.

31. When one circle rolls around the outside of another, the figure traced out is called an epicycloid. Derive parametric equations for the epicycloid, using radius $a$ for the fixed circle and radius $b$ for the moving circle. Use the same parameter $\theta$ as in Problem 30(a).

32. Suppose that a railroad wheel rolls (without slipping) along a flat track. Find coordinate functions for the path traced out by a point at the outer edge of the flange on the wheel. In the figure below, the outer radius is $b$ and the inner radius is $a$. Sketch the locus, bearing in mind that it is not a function-graph; it has loops in it.

33. Make the same modification in the definition of the epicycloid, as suggested by the figure below, and sketch the curve. The fixed circle has radius $a$; the rolling wheel has inner radius $b$ and outer radius $c$.

*34. A path is regular if the coordinate functions $f$, $g$ are differentiable, and we never have $f'(t) = g'(t) = 0$ for the same $t$. Show that every chord of a regular path is parallel to the tangent at some intermediate point. (Evidently the locus need not be a function-graph, and the chord may be vertical.)

*35. Given a path, with differentiable coordinate functions $f$, $g$. Show that, if the axes are rotated, then the coordinate functions $F$, $G$ that work for the new set of axes are also differentiable. Show that if $f'$ and $g'$ never vanish simultaneously, then $F'$ and $G'$ never vanish simultaneously.

9.3

OTHER FORMS OF L'HOPITAL'S RULE

The first form of l'Hopital's rule says that if

\[ \lim_{x \to a} f(x) = \lim_{x \to a} g(x) = 0, \qquad \text{and} \qquad \lim_{x \to a} \frac{g'(x)}{f'(x)} = L, \]

then

\[ \lim_{x \to a} \frac{g(x)}{f(x)} = L. \]

This can be generalized in three ways.

1) If $x \to \infty$ or $x \to -\infty$, instead of $x \to a$, it doesn't matter; the rule still holds.

2) If $g'(x)/f'(x) \to \infty$ or $\to -\infty$, as $x \to a$, the rule still holds.

3) If $f(x) \to \infty$ and $g(x) \to \infty$, instead of $f(x) \to 0$, $g(x) \to 0$, the rule still holds. Similarly if $f(x) \to -\infty$ and $g(x) \to -\infty$.

Thus, in the most general form of l'Hopital's rule, we have: (1) $x \to a$, $x \to \infty$, or $x \to -\infty$; (2) $g'(x)/f'(x) \to L$, $g'(x)/f'(x) \to \infty$, or $g'(x)/f'(x) \to -\infty$; and (3) $f(x), g(x) \to 0$, or $\to \infty$, or $\to -\infty$. Thus we have a grand total of 27 theorems, all of which are true. One of these has already been proved, and the only hard one among the others is the following.

Theorem 1 (The Northeast Theorem). If

\[ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = \infty, \qquad \text{and} \qquad \lim_{x \to \infty} \frac{g'(x)}{f'(x)} = L, \]

then

\[ \lim_{x \to \infty} \frac{g(x)}{f(x)} = L. \]

This is proved in Appendix H. Meanwhile we shall use it.

Example 1. To find

\[ \lim_{x \to \infty} \frac{\ln x}{x}, \]

we take derivatives and find

\[ \lim_{x \to \infty} \frac{1/x}{1} = 0. \]

By the Northeast theorem,

\[ \lim_{x \to \infty} \frac{\ln x}{x} = 0. \]

The case in which $g'(x)/f'(x) \to \infty$ causes no trouble.

Example 2. To find

\[ \lim_{x \to \infty} \frac{e^x}{x}, \]

we investigate

\[ \lim_{x \to \infty} \frac{e^x}{1} = \infty. \]

It follows, by one form of l'Hopital's rule, that

\[ \lim_{x \to \infty} \frac{e^x}{x} = \infty. \]

The theorem being applied here is the following.

Theorem 2. If

\[ \lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = \infty, \qquad \text{and} \qquad \lim_{x \to \infty} \frac{g'(x)}{f'(x)} = \infty, \]

then

\[ \lim_{x \to \infty} \frac{g(x)}{f(x)} = \infty. \]

The proof is easy, on the basis of the Northeast theorem; we merely investigate reciprocals. Since $g'(x)/f'(x) \to \infty$, we have

\[ \lim_{x \to \infty} \frac{f'(x)}{g'(x)} = 0. \]

By the Northeast theorem,

\[ \lim_{x \to \infty} \frac{f(x)}{g(x)} = 0. \]

And $f(x)/g(x) > 0$ when $x$ is large. Therefore

\[ \lim_{x \to \infty} \frac{g(x)}{f(x)} = \infty. \]

Example 3. Consider
\[ \lim_{x \to 0^+} \frac{\ln x}{1/x} = -\lim_{x \to 0^+} \frac{-\ln x}{1/x}. \]

Here the limit on the right takes the form $\infty/\infty$. Taking derivatives, we find

\[ \lim_{x \to 0^+} \frac{-1/x}{-1/x^2} = 0. \]

By one form of l'Hopital's rule, it follows that

\[ \lim_{x \to 0^+} \frac{-\ln x}{1/x} = 0, \]

and so the answer to the original problem is $-0 = 0$. The theorem being used here is the following.

Theorem 3. If

\[ \lim_{x \to a^+} f(x) = \lim_{x \to a^+} g(x) = \infty, \qquad \text{and} \qquad \lim_{x \to a^+} \frac{g'(x)}{f'(x)} = L, \]

then

\[ \lim_{x \to a^+} \frac{g(x)}{f(x)} = L. \]

This is not quite as easy as Theorem 2. Let

\[ y = \frac{1}{x - a}, \qquad x = a + \frac{1}{y}, \]

so that $y \to \infty$ as $x \to a^+$ and $x \to a^+$ as $y \to \infty$. Then

\[ \lim_{x \to a^+} \frac{g(x)}{f(x)} = \lim_{y \to \infty} \frac{g(a + 1/y)}{f(a + 1/y)}. \]

Taking derivatives, we find

\[ \lim_{y \to \infty} \frac{g'(a + 1/y)(-1/y^2)}{f'(a + 1/y)(-1/y^2)} = \lim_{y \to \infty} \frac{g'(a + 1/y)}{f'(a + 1/y)} = \lim_{x \to a^+} \frac{g'(x)}{f'(x)} = L. \]

The Northeast theorem now applies to

\[ \lim_{y \to \infty} \frac{g(a + 1/y)}{f(a + 1/y)}, \]

and tells us that this limit is $L$. Therefore

\[ \lim_{x \to a^+} \frac{g(x)}{f(x)} = L. \]

We have now discussed all the troublesome cases of l'Hopital's rule; once we have gotten this far, the rest of the derivations are routine. Hereafter, we shall use all forms of the rule without comment.
Sometimes we can apply the rule by taking logarithms. Consider

\[ \lim_{x \to 0^+} x^x = \;? \]

Let $\phi(x) = x^x$. Then

\[ \ln \phi(x) = x\ln x = \frac{\ln x}{1/x} = \frac{g(x)}{f(x)}. \]

Now

\[ \lim_{x \to 0^+} \frac{g'(x)}{f'(x)} = \lim_{x \to 0^+} \frac{1/x}{-1/x^2} = \lim_{x \to 0^+} (-x) = 0. \]

Therefore

\[ \lim_{x \to 0^+} \ln \phi(x) = 0, \qquad \text{and} \qquad \lim_{x \to 0^+} x^x = e^0 = 1. \]
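
The limits found in this section can be made plausible by direct evaluation at a few extreme arguments. This is a sketch only: the sample points are arbitrary choices, and numerical values suggest a limit without proving it.

```python
import math

large = (1e2, 1e4, 1e8)
small = (1e-2, 1e-4, 1e-8)

# Example 1: (ln x)/x for large x; the values should shrink toward 0.
ln_over_x = [math.log(x) / x for x in large]

# Example 2: e^x / x for moderately large x; the values should grow without bound.
exp_over_x = [math.exp(x) / x for x in (10.0, 50.0, 100.0)]

# Example 3: x ln x as x -> 0+; the values should approach 0.
x_ln_x = [x * math.log(x) for x in small]

# The logarithm device: x^x = exp(x ln x) should approach e^0 = 1.
x_to_the_x = [x ** x for x in small]

print(ln_over_x, exp_over_x, x_ln_x, x_to_the_x, sep="\n")
```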

PROBLEM SET 9.3

2. Jim y (Tan-1 y - 7T/2)

3. Jim (In Jn x)/ v:X

5. Jim e-1/x'

6. Jim [(l/x)e-1/x2]

8. Jim (1 + Tan-1 o:)Tau-ia

9. Jim ( l + csc x)in' x

x-o+

x-o+

7.

Jim [(sin x)(In x)/x]

a--+ co

10. Jim (I + ax)1fx


x-o

11. Jim x3e-x

13. Jim (l - 2x)1fx

14. Jim
x-'"12

16. Jim (tan fJ)lano

17.

19. Jim (e-lfx'/x2)

20. Jim (e-lfx' /x3)


x---+O
23. Jim (l/x + n Jn x)

X-+00

o-o+

Jim

t-oo

))Tan-'
(
()1

x--o+

x)sin

x-o

30. Jim (I - sin

kx)csc

Tf'-o+

18.

Jim x2 ln x

x-o+

21. Jim x Jn x
x-o+
24. Jim ( l/x2 +
x-o+

/1

ln x)

0.

27.

Jim (1 - COS y)l-COS y

v-o+

29. Jim (1 + sin kx)csc x


x-o

31. Jim (v2)"


v-o+

x-o

32. Jim

15. Jim x2e-11x


a:- o+

28. Jim (1 + COS y)l-COS Y


v-o+

X---+00

1 + Tan 1 (x

22. Jim (1/x + In x)


x-o+
x-o+
25. Show that for every n, Jim (e-1/x' /xn)
26. Jim (sin

x-o+

12. lim xe-1/x

'
ww

33. Jim (I + tan 3 fJ)-CSC 9


o-o

*34. The $n$th derivative of a function $f$ is denoted by $f^{(n)}$. Let

\[ f(x) = \begin{cases} e^{-1/x^2} & \text{for } x \neq 0, \\ 0 & \text{for } x = 0. \end{cases} \]

Show that for each $n > 0$, $f$ has an $n$th derivative, for every $x$; and show that $f^{(n)}(0) = 0$ for every $n$. [Hint: You are not likely to find a manageable general formula for $f^{(n)}(x)$. But you ought to be able to show that for $x \neq 0$, $f^{(n)}(x)$ is always given by a formula of a certain form, involving certain constant coefficients; and you may be able to use this form, to show that $f^{(n)}(0) = 0$, without needing to determine the coefficients.]

9.4

POLAR COORDINATES

When we set up a rectangular coordinate system in a plane $E$, to every ordered pair $(x, y)$ of numbers there corresponds a point $P$ of $E$. Thus we have a function

\[ \{(x, y)\} \to E, \qquad (x, y) \mapsto P. \]

And the correspondence also works in reverse: when $P$ is named, $x$ and $y$ are determined. Thus a rectangular coordinate system gives us a one-to-one correspondence

\[ \{(x, y)\} \leftrightarrow E \]

between the ordered pairs of real numbers and the points of $E$.

We now consider another way of labeling points with pairs of numbers. Given two numbers $r$ and $\theta$, we first draw the ray which starts at the origin and has direction $\theta$. On the line containing this ray we set up a coordinate system, with the direction $\theta$ as the positive direction; and we let $P$ be the point with coordinate $r$. (This is equivalent to saying that the directed distance $OP$ is $r$.) We then say that $P$ has polar coordinates $(r, \theta)$.

For example, in the left-hand figure it looks as if $P_1$ has polar coordinates $(2, \pi/3)$, and $P_2$ has polar coordinates $(-2, \pi/3)$.

Thus to every pair $(r, \theta)$ of numbers there corresponds a point $P$. But the correspondence does not work uniquely the other way: every point $P$ corresponds to infinitely many number pairs $(r, \theta)$. Thus, in the right-hand figure, the point $P$ with rectangular coordinates $(1, 1)$ has polar coordinates $(\sqrt{2}, \pi/4)$. But $P$ also has polar coordinates $(-\sqrt{2}, 5\pi/4)$. And this is not all; the possible polar coordinates for $P$ are

\[ (\sqrt{2},\ \pi/4 + 2n\pi), \qquad (-\sqrt{2},\ 5\pi/4 + 2n\pi), \]

where $n$ is any integer (positive, negative, or zero).
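
The failure of uniqueness is easy to see numerically: several different pairs $(r, \theta)$ from the two families above are converted to rectangular coordinates, and all of them land on $(1, 1)$. The sketch uses the standard conversion $x = r\cos\theta$, $y = r\sin\theta$; the function name and the choice of $n$ are illustrative assumptions.

```python
import math

def polar_to_rect(r, theta):
    """Rectangular coordinates of the point with polar coordinates (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)

root2 = math.sqrt(2)
candidates = [(root2, math.pi / 4 + 2 * n * math.pi) for n in (-1, 0, 1)]
candidates += [(-root2, 5 * math.pi / 4 + 2 * n * math.pi) for n in (-1, 0, 1)]

points = [polar_to_rect(r, theta) for r, theta in candidates]
for (r, theta), (x, y) in zip(candidates, points):
    print(f"r = {r:+.4f}, theta = {theta:+.4f}  ->  ({x:.6f}, {y:.6f})")
```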


Thus, when we set up a polar coordinate system we have a function

\[ \{(r, \theta)\} \to E, \]

but we do not have a one-to-one correspondence; the polar coordinates of a point are not determined when the point is named. For this reason, graphs in polar coordinates can naturally be thought of as paths. Let us look at some examples.

1) Consider the graph of

\[ r = \cos\theta \qquad (0 \leq \theta \leq 2\pi). \]

Since the cosine is periodic, with period $2\pi$, we can get all of the locus by restricting $\theta$ to the interval $[0, 2\pi]$.

As an aid in sketching the polar graph, we first sketch the rectangular graph of the equation $r = \cos\theta$. We cut the curve into four parts, as indicated, and then sketch the portion of the polar graph corresponding to each of them. As $\theta$ increases from 0 to $\pi/2$, $r$ decreases from 1 to 0. As $\theta$ increases from $\pi/2$ to $\pi$, $r$ decreases from 0 to $-1$. Therefore the second part of the curve, in the fourth quadrant, comes from values of $\theta$ in the second quadrant. (See the figure on the left.)

As $\theta$ continues to increase, from $\pi$ to $2\pi$, we trace out a curve shown on the right. This looks like the curve that we had already. And in fact it is exactly the same curve as before, because

\[ \cos(\theta + \pi) = -\cos\theta. \]

Further investigation shows that the graph is a circle. If $P$ has polar coordinates $(r, \theta)$, then the rectangular coordinates of $P$ are

\[ x = r\cos\theta, \qquad y = r\sin\theta. \]

(There are two cases to check. If $r > 0$, then these formulas follow from the definitions of the sine and cosine. Verification for $r = 0$ and $r < 0$?) Therefore

\[ x^2 + y^2 = r^2\cos^2\theta + r^2\sin^2\theta = r^2. \]

This gives the three conversion formulas

\[ x = r\cos\theta, \qquad y = r\sin\theta, \qquad x^2 + y^2 = r^2. \]

We note that the equation

\[ r = \cos\theta \]

does not involve any of the three expressions $r\cos\theta$, $r\sin\theta$, $r^2$ which we know how to convert into rectangular coordinates. But we can multiply by $r$, on both sides, getting

\[ r^2 = r\cos\theta; \]

and this means that

\[ x^2 + y^2 = x. \]

This is the circle with center at the point $(\tfrac{1}{2}, 0)$ and radius $\tfrac{1}{2}$.

2) Consider

\[ r = \sec\theta. \]

Here we might sketch the graph without using rectangular coordinates. It is easier, however, to multiply both sides of the equation by $\cos\theta$. This gives

\[ r\cos\theta = 1, \qquad \text{or} \qquad x = 1. \]

As $\theta$ increases from 0 to $2\pi$ (skipping $\pi/2$ and $3\pi/2$), this line is traversed twice. (It is worthwhile to figure out how.)
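
The conclusion of the first example, that $r = \cos\theta$ is the circle with center $(\tfrac12, 0)$ and radius $\tfrac12$, and that it is traced twice on $[0, 2\pi]$, can be checked on sample angles. The helper names below are assumptions made for the illustration.

```python
import math

def point_on_graph(theta):
    """Rectangular coordinates of the point of r = cos(theta) at angle theta."""
    r = math.cos(theta)
    return r * math.cos(theta), r * math.sin(theta)

def circle_residual(theta):
    """(x - 1/2)^2 + y^2 - 1/4; zero exactly when the point is on the circle."""
    x, y = point_on_graph(theta)
    return (x - 0.5) ** 2 + y ** 2 - 0.25

angles = [k * math.pi / 12 for k in range(24)]
worst = max(abs(circle_residual(t)) for t in angles)

# cos(theta + pi) = -cos(theta): the angles theta and theta + pi give the same point.
retrace_gap = max(
    math.hypot(point_on_graph(t)[0] - point_on_graph(t + math.pi)[0],
               point_on_graph(t)[1] - point_on_graph(t + math.pi)[1])
    for t in angles
)
print(worst, retrace_gap)
```

Both printed numbers should be zero up to rounding error.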


In these two examples, it was easy to work back to a rectangular equation.

3) Consider, however,

\[ r = 1 - \sin\theta \qquad (0 \leq \theta \leq 2\pi). \]

(As for $r = \cos\theta$, the interval $[0, 2\pi]$ gives us the entire locus of the path.) First we do a rectangular sketch as on the left. We then sketch the polar graph in four parts.

This curve is called, for obvious reasons, a cardioid.

It is possible to write a rectangular equation for the cardioid. First we observe that since

\[ 1 - \sin\theta \geq 0, \]

we always have (on this particular curve)

\[ r = \sqrt{r^2} = \sqrt{x^2 + y^2}. \tag{1} \]

We can therefore write

\[ r^2 = r - r\sin\theta, \tag{2} \]
\[ x^2 + y^2 = \sqrt{x^2 + y^2} - y. \tag{3} \]

4) To get the equation of a line $L$, in polar coordinates, we proceed as follows. Let $N$ be the perpendicular to $L$ through the origin. Rotate the axes through an angle of measure $\phi$, choosing $\phi$ so that $N$ becomes the $x'$-axis.

Then $L$ is the graph of an equation

\[ x' - p = 0, \]

where $p$ is a constant. Since $x' = x\cos\phi + y\sin\phi$, this gives

\[ x\cos\phi + y\sin\phi - p = 0. \]

Converting to polar form, we get

\[ r\cos\theta\cos\phi + r\sin\theta\sin\phi - p = 0, \]
\[ r\cos(\theta - \phi) - p = 0. \]

This is the standard form of the equation of a line in polar coordinates.
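
The rectangular equation obtained for the cardioid can be tested on sample points of the polar graph $r = 1 - \sin\theta$. This is a sketch with illustrative names; it checks the equation $x^2 + y^2 = \sqrt{x^2 + y^2} - y$ only at the sampled angles.

```python
import math

def cardioid_point(theta):
    """Rectangular coordinates of the point of r = 1 - sin(theta)."""
    r = 1 - math.sin(theta)
    return r * math.cos(theta), r * math.sin(theta)

def cardioid_residual(theta):
    """x^2 + y^2 - (sqrt(x^2 + y^2) - y); should vanish on the cardioid."""
    x, y = cardioid_point(theta)
    return x * x + y * y - (math.sqrt(x * x + y * y) - y)

angles = [k * math.pi / 18 for k in range(37)]   # 0 to 2*pi in 10-degree steps
worst_residual = max(abs(cardioid_residual(t)) for t in angles)
print(worst_residual)
```

The step $r = \sqrt{x^2 + y^2}$ depends on $r \geq 0$, which holds here because $1 - \sin\theta \geq 0$; for a graph on which $r$ changes sign the same substitution would not be justified.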


Some geometric problems are most conveniently attacked by introducing polar coordinates at the outset. To do this, we need a distance formula.

Theorem 1. Let $d$ be the distance between the points with polar coordinates $(r_1, \theta_1)$ and $(r_2, \theta_2)$. Then

\[ d^2 = r_1^2 + r_2^2 - 2r_1r_2\cos(\theta_1 - \theta_2). \]

Proof. The rectangular coordinates of the two points are

\[ x_i = r_i\cos\theta_i, \qquad y_i = r_i\sin\theta_i \qquad \text{for } i = 1, 2. \]

Therefore

\[ d^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2 \]
\[ = (r_1\cos\theta_1 - r_2\cos\theta_2)^2 + (r_1\sin\theta_1 - r_2\sin\theta_2)^2 \]
\[ = r_1^2(\cos^2\theta_1 + \sin^2\theta_1) + r_2^2(\cos^2\theta_2 + \sin^2\theta_2) - 2r_1r_2(\cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2) \]
\[ = r_1^2 + r_2^2 - 2r_1r_2\cos(\theta_1 - \theta_2), \]

which was to be proved.

For $r_1 > 0$, $r_2 > 0$, $0 < \theta_1 - \theta_2 < \pi$, this is simply the law of cosines. But the polar distance formula applies for any values of $r_1$, $r_2$, $\theta_1$, and $\theta_2$.
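
The remark that the formula holds for any values of $r_1$, $r_2$, $\theta_1$, $\theta_2$, including negative $r$, can be checked against the ordinary rectangular distance. A minimal sketch follows; the function names and sample points are illustrative.

```python
import math

def polar_distance(r1, t1, r2, t2):
    """Distance from the polar distance formula of Theorem 1."""
    d_squared = r1 * r1 + r2 * r2 - 2 * r1 * r2 * math.cos(t1 - t2)
    return math.sqrt(max(d_squared, 0.0))   # guard against tiny negative rounding

def rect_distance(r1, t1, r2, t2):
    """Distance computed after converting both points to rectangular coordinates."""
    x1, y1 = r1 * math.cos(t1), r1 * math.sin(t1)
    x2, y2 = r2 * math.cos(t2), r2 * math.sin(t2)
    return math.hypot(x1 - x2, y1 - y2)

samples = [
    (2.0, math.pi / 3, -2.0, math.pi / 3),   # opposite points on a line through O
    (1.0, 0.0, 1.0, math.pi / 2),
    (-3.0, 2.0, 0.5, -4.0),                   # negative r and arbitrary angles
]
for s in samples:
    print(polar_distance(*s), rect_distance(*s))
```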

PROBLEM SET 9.4

Sketch the following, and convert to rectangular coordinates if possible.


I.

5.

7. r2 =sin 8 sec3 8

8. r =

13. r =

=1 - cos 8

11.

sin 30

36

1 - sin0

19.

r=
sin 8 - cos 8

20. r2 =sin2 8

22.

23.

2 +cos 8

15.

4 cos2 8 +9 sin2 8

17.

r
1 +r cos 0

r=sin 28

12. r =1 + r cos 8

=sin 48

16. r sin 0

9.

sin 8 +cos 8

14. r2=

1 +cos0

6. r=sin 8 sec2 8

4. r =1 +sin 8

10. r

3. r cos (fJ - 7T/4)

2. r = -2 sec 8

r=2 csc0

= 1 +csc 8

18. r=cos 38

= 2.

21.

r2

24.

=sin0

=eBI

25. The figure given in the text suggests that at the origin, the two sides of the cardioid have the same tangent, namely, the line $\theta = \pi/2$. Show that this is correct.
Discuss, as in Problems 1 through 24.
26. r2

27. r2

cos2 0

cos 8

28. r2 = cos 28

(This curve is called a lemniscate.)

29. r2 = a2 cos 28

r2

30.

=a2 sin 20

Find polar equations for the curves defined by the following conditions, and sketch.
Identify the curve if possible.
31. The set of all points which are equidistant from the origin and the line $r = \csc\theta$.

32. The set of all points which are equidistant from the origin and the point $(2\sqrt{2}, \pi/4)$.

33. The set of all points $P$ such that $PA = 2PB$, where $A$ is the origin and $B = (2\sqrt{2}, \pi/4)$.

34. The circle with center at $(2, \pi/4)$ and radius 2.


Sketch:
35. r=2 - sin 8
38.

9.5

r
1 +r cos 0

36.

= 3 - 2 sin 0

37.

=3 + 2 cos 8

1
=- (What sort of curve is this, and why?)

AREAS IN POLAR COORDINATES

Given
r

f(8) 0,

where f is continuous, and the length of the interval [O'., /9]


region between the polar graph and the origin.

is

;;;

2rr.

Let

be the

Areas in Polar Coordinates

9.5

403

7r

That is,
R

{ (r, &) I o: ;;; e ;;; f3

and

;;;

;;; f (&)}.

Consider a subinterval [&;_1, &i] of the interval [o:, {3]. Let mi be the minimum
value off on [&;_1, &;],let M; be the maximum value, and let flA; be the area of the
region between the origin and the part of the curve from e
ei-1 to e
e i.
=

The area of the inner circular sector, with radius

m;,

and the area of the outer sector, with radius M;, is


Therefore

-M; fl&;.

Now take a net


over the interval [o:, {3]. The area of R is
n

A= .2flA;.
i=l

is

404

Paths and Vectors in a Plane

9.5

The above inequalities hold for every

i;

and so by addition we get

I -m; 6.8;

i=l

I iM; 6.0;.
i=l

But the sum on the left is the lower sum s(N) of the function

F(8)
over the net

N,

F, over the net

}/(B)2,

and the sum on the right is the upper sum S(N), of the same function
N. Thus

s(N) A S(N);
and so
I

Since

Jim s(N)
s1-o

lim s(N)
JXJ-o
is squeezed, and

it follows that

Jim S(N).
J.YJ-o

(P U(8)2 dO
J,

Thus we have:

fHC0)2

lim S(N),
J.\J-o

dO.

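Theorem 1. Let $f$ be continuous and $\ge 0$ on $[\alpha, \beta]$, with $\beta - \alpha \le 2\pi$, and let $R$ be the region between the origin and the polar graph of $f$. Then the area of $R$ is

$$A = \frac12 \int_\alpha^\beta f(\theta)^2\,d\theta.$$

Let us try this in some simple cases. For the circular region with radius $a$ and center at the origin, the formula gives

$$A = \frac12 \int_0^{2\pi} a^2\,d\theta = \frac12 a^2 \cdot 2\pi = \pi a^2,$$

which is the right answer. For

$$r = \frac{1}{\cos\theta + \sin\theta}, \qquad 0 \le \theta \le \pi/2,$$

we get

$$A = \int_0^{\pi/2} \frac12 \cdot \frac{1}{(\cos\theta + \sin\theta)^2}\,d\theta
= \frac12 \int_0^{\pi/2} \frac{1}{1 + \sin 2\theta}\,d\theta
= \frac12 \int_0^{\pi/2} \frac{1 - \sin 2\theta}{\cos^2 2\theta}\,d\theta$$

$$= \frac12 \left[\tfrac12 \tan 2\theta - \tfrac12 \sec 2\theta\right]_0^{\pi/2}
= \tfrac14 (0 + 1) - \tfrac14 (0 - 1) = \tfrac12.$$

This is correct, because the region is a right triangle with legs of length 1:

$$r\cos\theta + r\sin\theta = 1; \qquad x + y = 1; \qquad A = \tfrac12.$$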
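The two checks of the area formula $A = \tfrac12\int_\alpha^\beta f(\theta)^2\,d\theta$ can also be repeated numerically. The following is only a minimal sketch: the function name `polar_area`, the midpoint rule, and the choice of 20000 subintervals are illustrative assumptions, not part of the text.

```python
import math

def polar_area(f, alpha, beta, n=20000):
    """Midpoint-rule estimate of (1/2) * integral of f(theta)^2 over [alpha, beta].

    Hypothetical helper: assumes f is continuous and >= 0 on [alpha, beta].
    """
    d_theta = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        theta = alpha + (i + 0.5) * d_theta   # sample point in the i-th subinterval
        total += 0.5 * f(theta) ** 2 * d_theta
    return total

a = 3.0  # radius chosen only for illustration
circle_area = polar_area(lambda theta: a, 0.0, 2.0 * math.pi)
triangle_area = polar_area(
    lambda theta: 1.0 / (math.cos(theta) + math.sin(theta)), 0.0, math.pi / 2.0
)

print(circle_area, math.pi * a * a)  # both should be close to pi * a^2
print(triangle_area)                 # should be close to 1/2
```

The sums computed here are sample sums of $F(\theta) = \tfrac12 f(\theta)^2$, so they lie between the lower and upper sums used in the squeeze argument.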

PROBLEM SET 9.5


Find the areas of the regions enclosed by the following curves, and sketch.
1.

r =

4.

r2 =

7.

r =

- sin 0

r = 1 - cos()
r2 = cos 20

2.

cos2()

5.

sec fJVtan fJ, 0

()

j4

1T

8. r

Find the area of the inside loop of the graph of

11.

r =

1
lcos

fJI +

lsin

12.

01

= e0, 0

r =

6.

r =sec() tan fJ, 0

9.

r2

= sin() cos()

10.

r =

0 21T

() 1T/4

4 cos 2()

1 - 2 sin fJ, and sketch.


13.

14. Given a polar graph defined by a differentiable function


a formula for the slope of the tangent, at a point

cos()

3.

(r0, fJ 0)

= e28, 0

() 21T

r = f(O) ( <X () {3), derive


(r0 =f(fJ0)). Here we really

mean the slope, relative to a rectangular coordinate system superimposed on the polar
coordinate system.

9.6

THE LENGTH OF A PATH

Roughly speaking, the length of a path is the total distance traversed by the moving point. For example, consider the path defined by the coordinate functions

$$x = f(t) = a\cos t, \qquad y = g(t) = a\sin t \qquad (0 \le t \le 4\pi).$$

The locus of this path is a circle with radius $a$ and circumference $2\pi a$. But as $t$ increases from $0$ to $4\pi$, this locus is traversed twice. Therefore the length of the path is

$$2 \cdot 2\pi a = 4\pi a.$$

Lengths of paths are undirected; they are always positive (or zero, in trivial cases). Thus the length of the path

$$x = f(t) = \cos t, \qquad y = g(t) = 0 \qquad (0 \le t \le 2\pi)$$

is four, not zero; the two halves of the path do not cancel each other out.
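Both lengths can be approximated by summing the lengths of short chords along the path. The sketch below is illustrative only: the helper name `polygonal_length`, the uniform net, and the second path $x = \cos t$, $y = 0$ on $[0, 2\pi]$ (taken as the back-and-forth path of length four) are assumptions made for the example.

```python
import math

def polygonal_length(f, g, a, b, n=100000):
    """Length of the broken line through P_i = (f(t_i), g(t_i)) on a uniform net of [a, b]."""
    total = 0.0
    x_prev, y_prev = f(a), g(a)
    for i in range(1, n + 1):
        t = a + (b - a) * i / n
        x, y = f(t), g(t)
        total += math.hypot(x - x_prev, y - y_prev)  # chord length P_{i-1}P_i
        x_prev, y_prev = x, y
    return total

radius = 2.0  # illustrative value of a
twice_around = polygonal_length(
    lambda t: radius * math.cos(t), lambda t: radius * math.sin(t), 0.0, 4.0 * math.pi
)
back_and_forth = polygonal_length(lambda t: math.cos(t), lambda t: 0.0, 0.0, 2.0 * math.pi)

print(twice_around, 4.0 * math.pi * radius)  # circle traversed twice
print(back_and_forth)                        # close to 4, not 0
```

Because each chord length is nonnegative, nothing can cancel: the point on the segment from $1$ to $-1$ and back contributes $2 + 2$.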


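To be exact, path length is defined as follows. Given a path

$$x = f(t), \qquad y = g(t) \qquad (a \le t \le b),$$

let

$$N:\ a = t_0 < t_1 < \cdots < t_n = b$$

be a net over $[a, b]$; for each $i$ from $0$ to $n$, let

$$x_i = f(t_i), \qquad y_i = g(t_i), \qquad P_i = (x_i, y_i).$$

Then

$$\sum_{i=1}^{n} P_{i-1}P_i$$

is the length of an inscribed broken line. The path length, by definition, is

$$s = \lim_{|N| \to 0} \sum_{i=1}^{n} P_{i-1}P_i.$$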

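We can express the path length as an integral. For each $i$, let

$$\Delta x_i = x_i - x_{i-1}, \qquad \Delta y_i = y_i - y_{i-1},$$

as indicated in the figure. Then

$$P_{i-1}P_i = \sqrt{\Delta x_i^2 + \Delta y_i^2}.$$

Since

$$x_i = f(t_i) \qquad \text{and} \qquad y_i = g(t_i),$$

we know by the mean-value theorem that

$$\Delta x_i = f'(\bar t_i)\,\Delta t_i$$

for some $\bar t_i$ between $t_{i-1}$ and $t_i$; and

$$\Delta y_i = g'(\bar t_i')\,\Delta t_i$$

for some $\bar t_i'$ between $t_{i-1}$ and $t_i$. (We do not know that $\bar t_i = \bar t_i'$, and this leads to trouble, as we shall see.) Therefore

$$P_{i-1}P_i = \sqrt{f'(\bar t_i)^2 + g'(\bar t_i')^2}\,\Delta t_i,$$

and so

$$\sum_{i=1}^{n} P_{i-1}P_i = \sum_{i=1}^{n} \sqrt{f'(\bar t_i)^2 + g'(\bar t_i')^2}\,\Delta t_i.$$

This is almost, but not quite, a sample sum of the function

$$\alpha(t) = \sqrt{f'(t)^2 + g'(t)^2};$$

it differs from a sample sum in that we have used two different sample points $\bar t_i$, $\bar t_i'$ on each interval $[t_{i-1}, t_i]$ of our net $N$. Since $|N| \to 0$, and $g'$ is continuous, we ought to have

$$\sum_{i=1}^{n} \sqrt{f'(\bar t_i)^2 + g'(\bar t_i')^2}\,\Delta t_i \approx \sum_{i=1}^{n} \sqrt{f'(\bar t_i)^2 + g'(\bar t_i)^2}\,\Delta t_i = \sum_{i=1}^{n} \alpha(\bar t_i)\,\Delta t_i \to \int_a^b \sqrt{f'(t)^2 + g'(t)^2}\,dt$$

when $|N| \to 0$. For a proof of this, see Appendix I. Meanwhile we shall state the following theorem and use it.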
Theorem 1. If the coordinate functions $f$, $g$ of a path have continuous derivatives $f'$, $g'$, then the length of the path is

$$s = \int_a^b \sqrt{f'(t)^2 + g'(t)^2}\,dt.$$

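This formula can be converted to polar coordinates in the following way. Suppose that a polar path is described by a function

$$r = \phi(\theta) \qquad (a \le \theta \le b).$$

The rectangular coordinate functions of the path are then

$$x = f(\theta) = \phi(\theta)\cos\theta, \qquad y = g(\theta) = \phi(\theta)\sin\theta.$$

This gives

$$f'(\theta) = \phi'(\theta)\cos\theta - \phi(\theta)\sin\theta, \qquad g'(\theta) = \phi'(\theta)\sin\theta + \phi(\theta)\cos\theta.$$

For short, let us write $\phi$ for $\phi(\theta)$ and $\phi'$ for $\phi'(\theta)$. Squaring and adding, the cross terms cancel, and this gives

$$f'(\theta)^2 + g'(\theta)^2 = (\phi'\cos\theta - \phi\sin\theta)^2 + (\phi'\sin\theta + \phi\cos\theta)^2 = \phi'^2 + \phi^2,$$

and

$$s = \int_a^b \sqrt{\phi(\theta)^2 + \phi'(\theta)^2}\,d\theta.$$

Thus we have the following theorem.

Theorem 2. Given a path defined in polar coordinates by a function

$$r = \phi(\theta) \qquad (a \le \theta \le b),$$

where $\phi'$ is continuous, the length of the path is

$$s = \int_a^b \sqrt{\phi(\theta)^2 + \phi'(\theta)^2}\,d\theta.$$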
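The polar length formula $s = \int_a^b \sqrt{\phi(\theta)^2 + \phi'(\theta)^2}\,d\theta$ can be tried on a curve whose length is known. The sketch below uses the cardioid $r = 1 + \cos\theta$ on $[0, 2\pi]$, for which $\phi^2 + \phi'^2 = 4\cos^2(\theta/2)$ and the length is $8$. The cardioid, the helper name `polar_length`, and the midpoint rule are all choices made here for illustration; none of them comes from the text.

```python
import math

def polar_length(phi, dphi, a, b, n=20000):
    """Midpoint-rule estimate of the integral of sqrt(phi^2 + phi'^2) over [a, b].

    Hypothetical helper: phi and dphi are the polar function and its derivative.
    """
    d_theta = (b - a) / n
    total = 0.0
    for i in range(n):
        theta = a + (i + 0.5) * d_theta
        total += math.sqrt(phi(theta) ** 2 + dphi(theta) ** 2) * d_theta
    return total

cardioid_length = polar_length(
    lambda th: 1.0 + math.cos(th),   # phi(theta)
    lambda th: -math.sin(th),        # phi'(theta)
    0.0,
    2.0 * math.pi,
)
print(cardioid_length)  # should be close to 8
```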

PROBLEM SET 9.6

It is hard to propose reasonable problems in the calculation of path length; sometimes the integral takes a troublesome but manageable form such as $\int \sqrt{1 + x^2}\,dx$, but most of the time, path length problems are either easy or impossible. Therefore, if some of the problems below look impossible, you should try to think of an approach that might make them easy.
Find the lengths of the following paths.
I.

2. r = eO, 0 () 2n
1
, 0 0 n/2
4. r =
.
cos 0 + sm 0

r = cos(), 0 () n

3. x = 0, y = cost, 0 t h/2
5.

9.

r = 2 sin 8, 0 () r.

r =

8.

,.

2
cos() - sine '

1
= 0

n/2 0 n

0 () 2r.

r = sec fJ tan 0, 0 0 7T/4 (Remember the above remarks.)

10. x = 1 + sint, y
1 1.

6.

x =

cost, 0 t 7T/4

cos3 t, y = sin3t, 0 t 1Tj4 (What sort of curve is the locus of this path?)

12. $x = t^3$, $y = |t^3|$, $-1 \le t \le 1$ (Do these coordinate functions satisfy the conditions of Theorem 1? That is, does $g(t) = |t^3|$ have a continuous derivative?)

*13. The proof of Theorem 1 would have been much easier if we had been able to use the following:

Theorem (?). For each $i$, there is a single point $\bar t_i$, between $t_{i-1}$ and $t_i$, such that

$$P_{i-1}P_i = \sqrt{f'(\bar t_i)^2 + g'(\bar t_i)^2}\,\Delta t_i.$$

We could then have expressed $\sum P_{i-1}P_i$ as a sample sum, and passed to the limit, as in Section 7.1. But the above theorem is false. Give an example of a path (with $f'$ and $g'$ continuous) for which the theorem fails. There is a very simple example of this kind.

9.7

VECTORS IN A PLANE

In Section 3.8, we found that the motion of a particle on a line could be described by a single function $f$, with real numbers as values, and that the velocity and acceleration functions were the first and second derivatives

$$v = f' \qquad \text{and} \qquad a = v' = f''.$$

As we remarked at the time, these ideas are not adequate to describe the motion of a particle in a plane (or in space). The motion of a particle in a plane $E$ is described by a path, which is a function

$$P: I \to E, \qquad t \mapsto P(t),$$

where $I$ is an interval, and $P(t)$ is the location of the moving particle at time $t$. Velocity in this case is a "vector quantity," with both a magnitude and a direction, conveniently pictured by an arrow. At each point $P(t)$, the direction of the velocity vector is the direction of the motion, so that the arrow always lies on the tangent line, pointing in the appropriate direction on the tangent line; and the length of the velocity vector is the speed.

This is the idea. We need to express it in a mathematical form in which it can be used. The idea of a vector appears in a variety of forms. The simplest of these is as follows.

With each point $P$ of the plane we associate the directed segment $\overrightarrow{OP}$, starting at the origin and ending at $P$. Such a directed segment $\overrightarrow{OP}$ will be called a vector.

We allow the "degenerate segment" $\overrightarrow{OO}$; this is called the zero vector, and may be denoted simply by $\vec 0$. Moreover, since all our directed segments, in this section, are going to start at the origin, we can denote the directed segment $\overrightarrow{OP}$ by the shorter symbol $\vec P$.

Three operations can be performed, in this system:

Addition. Given $\vec P_1$, $\vec P_2$, with $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$, the sum is defined to be

$$\vec P_1 + \vec P_2 = \vec Q,$$

where

$$Q = (x_1 + x_2,\ y_1 + y_2).$$

Vector addition is governed by the same formal laws that govern addition of real numbers, as follows.

A.1 Associativity. $(\vec P_1 + \vec P_2) + \vec P_3 = \vec P_1 + (\vec P_2 + \vec P_3)$.

A.2 Existence of 0. There is a vector $\vec 0$ such that $\vec 0 + \vec P = \vec P + \vec 0 = \vec P$ for every vector $\vec P$.

A.3 Existence of negatives. For each vector $\vec P$ there is a vector $-\vec P$ such that $\vec P + (-\vec P) = (-\vec P) + \vec P = \vec 0$.

A.4 Commutativity. $\vec P_1 + \vec P_2 = \vec P_2 + \vec P_1$.

These follow from the corresponding laws for real numbers. For example, if

$$(\vec P_1 + \vec P_2) + \vec P_3 = \vec Q \qquad \text{and} \qquad \vec P_1 + (\vec P_2 + \vec P_3) = \vec Q',$$

then

$$Q = ((x_1 + x_2) + x_3,\ (y_1 + y_2) + y_3) = (x_1 + (x_2 + x_3),\ y_1 + (y_2 + y_3)) = Q',$$

and so $\vec Q = \vec Q'$.

The existence of $\vec 0$ is obvious: $\vec 0$ is $\overrightarrow{OO}$. If $P = (x, y)$, then $-\vec P = \vec Q$, where $Q = (-x, -y)$. Similarly for A.4.

Scalar multiplication. When we are discussing vectors, we refer to real numbers as scalars. To multiply a vector $\vec P$ by a scalar $\alpha$, we multiply the coordinates of $P$ by $\alpha$. That is,

$$\alpha\vec P = \vec Q, \qquad \text{where } Q = (\alpha x, \alpha y).$$

We then have a kind of associative law.

M.1. $(\alpha\beta)\vec P = \alpha(\beta\vec P)$.

This is because $(\alpha\beta)x = \alpha(\beta x)$, and $(\alpha\beta)y = \alpha(\beta y)$. Multiplication is connected with vector addition by two distributive laws.

M.2. $(\alpha + \beta)\vec P = \alpha\vec P + \beta\vec P$.

M.3. $\alpha(\vec P_1 + \vec P_2) = \alpha\vec P_1 + \alpha\vec P_2$.

Zero and 1 work in the usual way:

M.4. $0\vec P = \vec 0$, for every $\vec P$.

M.5. $1\vec P = \vec P$, for every $\vec P$.

M.6. $\alpha\vec 0 = \vec 0$, for every $\alpha$.
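Since the operations are defined coordinate by coordinate, the laws can be spot-checked mechanically. The following sketch represents a vector $\vec P$ by the pair $(x, y)$ of its terminal point; the function names `add`, `scale`, and `neg` and the sample values are hypothetical, and a check on a few integer vectors is of course an illustration, not a proof.

```python
def add(p, q):
    """Vector sum: add corresponding coordinates."""
    return (p[0] + q[0], p[1] + q[1])

def scale(alpha, p):
    """Scalar multiple: multiply both coordinates by alpha."""
    return (alpha * p[0], alpha * p[1])

def neg(p):
    """The negative -P of P = (x, y) is (-x, -y)."""
    return (-p[0], -p[1])

zero = (0, 0)
p1, p2, p3 = (2, -1), (5, 3), (-4, 7)   # sample vectors (integers, so equality is exact)
alpha, beta = 3, -2                      # sample scalars

laws = {
    "A.1": add(add(p1, p2), p3) == add(p1, add(p2, p3)),
    "A.2": add(zero, p1) == p1 == add(p1, zero),
    "A.3": add(p1, neg(p1)) == zero == add(neg(p1), p1),
    "A.4": add(p1, p2) == add(p2, p1),
    "M.1": scale(alpha * beta, p1) == scale(alpha, scale(beta, p1)),
    "M.2": scale(alpha + beta, p1) == add(scale(alpha, p1), scale(beta, p1)),
    "M.3": scale(alpha, add(p1, p2)) == add(scale(alpha, p1), scale(alpha, p2)),
    "M.4": scale(0, p1) == zero,
    "M.5": scale(1, p1) == p1,
    "M.6": scale(alpha, zero) == zero,
}
for name, holds in laws.items():
    print(name, holds)
```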

-+

Let "f/' be the set of all vectors

P. In "f/' we have defined two operations (addition

and scalar multiplication), and shown that they satisfy the laws A. l through A.4 and
M. l through M.6; "f/' is called a

vector space (relative to these two operations). More

generally, any collection "f/' of objects is called a vector space if it is provided with two
operations satisfying the above formal laws. There are many important vector spaces
other than the one which we are now discussing. For example, we may consider the
---+

directed segments

OP, starting from the origin in three-dimensional space, with

the two operations defined in an analogous way.


Finally, we introduce another kind of multiplication for vectors, called the dot product or inner product. If $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$, as before, then the inner product is a scalar, namely,

$$\vec P_1 \cdot \vec P_2 = x_1x_2 + y_1y_2.$$

The following properties of this operation are easy to check:

S.1. $\vec P_1 \cdot \vec P_2 = \vec P_2 \cdot \vec P_1$.

S.2. $(\alpha\vec P_1) \cdot \vec P_2 = \alpha(\vec P_1 \cdot \vec P_2)$.

S.3. $\vec P_1 \cdot (\vec P_2 + \vec P_3) = \vec P_1 \cdot \vec P_2 + \vec P_1 \cdot \vec P_3$.

S.4. $\vec P \cdot \vec P \ge 0$, for every $\vec P$.

S.5. If $\vec P \cdot \vec P = 0$, then $\vec P = \vec 0$.

(The last condition rules out trivial "dot products" for which $\vec P_1 \cdot \vec P_2$ is always 0, for every $\vec P_1$, $\vec P_2$.)

Thus, $\mathcal{V}$ is called an inner product space (relative to the three operations which have now been defined). More generally, any collection $\mathcal{V}$ is called an inner product space if it is provided with three operations (addition, scalar multiplication, and inner product) satisfying all the above laws.

As a matter of convenience, we have defined our three operations algebraically, using the coordinates $(x, y)$ of the terminal points $P$ of the vectors. But it is important to understand that all three of them have geometric meanings. We can add two vectors, geometrically, by completing a parallelogram, as shown on the left.

To do this, we don't need to know the directions of the $x$- and $y$-axes. Therefore the axes can be, and have been, omitted from the figure. If $\vec P_1$ and $\vec P_2$ are collinear, then the parallelogram collapses, but the idea is the same.

Geometrically, $-\vec P$ is the vector $\vec Q$ which has the same length as $\vec P$, but has the opposite direction.

To multiply a vector $\vec P$ by a positive scalar $\alpha$, we draw a vector with the same direction as $\vec P$, but multiply the length by $\alpha$. If $\alpha < 0$, we go in the opposite direction, and multiply the length by $|\alpha|$.

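The geometric meaning of the inner product is less obvious. Algebraically,

$$\vec P_1 \cdot \vec P_2 = x_1x_2 + y_1y_2.$$

Under the conditions given in the figure, $\theta = \theta_2 - \theta_1$, so that

$$\cos\theta = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2.$$

Substituting

$$\cos\theta_1 = x_1/OP_1, \quad \sin\theta_1 = y_1/OP_1, \quad \cos\theta_2 = x_2/OP_2, \quad \sin\theta_2 = y_2/OP_2,$$

we get

$$\cos\theta = \frac{x_1x_2 + y_1y_2}{OP_1 \cdot OP_2},$$

so that

$$\vec P_1 \cdot \vec P_2 = OP_1 \cdot OP_2 \cos\theta.$$

Obviously $\cos\theta$ is independent of the directions of the axes, because $\theta$ measures the angle between the two vectors. Note that the length of the vector $\vec P$ can be expressed in terms of the dot product:

$$\vec P \cdot \vec P = x^2 + y^2 = OP^2.$$

The length of a vector $\vec P$ may also be denoted by $|\vec P|$. Thus

$$|\vec P| = \sqrt{\vec P \cdot \vec P}.$$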
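The relation between the algebraic formula $x_1x_2 + y_1y_2$ and the geometric quantity $OP_1 \cdot OP_2\cos\theta$ can be checked on a pair of sample vectors. This is a small sketch under stated assumptions: vectors are pairs $(x, y)$, the names `dot`, `length`, and `angle_between` are invented for the example, and the directions $\theta_1$, $\theta_2$ are obtained from `math.atan2`.

```python
import math

def dot(p, q):
    """Inner product x1*x2 + y1*y2 of two vectors given by their terminal points."""
    return p[0] * q[0] + p[1] * q[1]

def length(p):
    """Length |P| = sqrt(P . P)."""
    return math.sqrt(dot(p, p))

def angle_between(p, q):
    """theta = theta_2 - theta_1, the difference of the directions of q and p (both nonzero)."""
    return math.atan2(q[1], q[0]) - math.atan2(p[1], p[0])

p1 = (3.0, 4.0)
p2 = (-1.0, 2.0)
theta = angle_between(p1, p2)

algebraic = dot(p1, p2)                                    # x1*x2 + y1*y2
geometric = length(p1) * length(p2) * math.cos(theta)      # OP1 * OP2 * cos(theta)
print(algebraic, geometric)
```

Rotating both vectors by the same angle changes the coordinates but leaves `theta` and the two lengths unchanged, which is the sense in which the dot product does not depend on the directions of the axes.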

By a linear combination of two vectors $\vec P_1$, $\vec P_2$ we mean a vector $\vec Q$ which can be expressed in the form

$$\vec Q = \alpha\vec P_1 + \beta\vec P_2,$$

where $\alpha$ and $\beta$ are scalars. In a coordinate plane, it is easy to find two vectors $\mathbf{i}$ and $\mathbf{j}$ such that every vector is a linear combination of them. If the vectors $\mathbf{i}$ and $\mathbf{j}$ are as in the left-hand figure below, and $P = (x, y)$, then

$$\vec P = x\mathbf{i} + y\mathbf{j} \qquad (\mathbf{i} = (1, 0),\ \mathbf{j} = (0, 1)).$$

This is an equation between vectors, not numbers. On the right, we have multiplied the vectors $\mathbf{i}$ and $\mathbf{j}$ by the scalars $x$ and $y$, and added the resulting vectors.

This section contains no new information, but quite a lot of new language. Learning a language takes practice. Therefore, while some of the problems below are genuine problems, many of them are merely exercises in the process of translation from the language of coordinate systems to the language of vectors and back again.

PROBLEM SET 9.7

Sketch the set of all points P satisfying the following conditions.


I. P

3. P

od ( -

<

oo

r.ti + r.tj (-w

5. p . i
0
.
7. p (i + j)
9. p . p
1

2. P

< w)

C'I.

<

4. P

< w)

C'I.

10. p. p

r.ti + 2et.j

13. p. j
15. P

19. P

co

C'I.

14. P

2r.ti - C'l.j (-w <

C'I.

<

w)

16. p
18. p

r.tj + r.t2i

21. p . (2i + j)
22. Let c

12. p. i

v3)

17. P(i + 2j)

(-w

<

r.t

20. p .

< co)

(i

C'I.

<

)
)

8. p . (i + j)

<

r.ti- r.tj ( - w <et. <

6. p . j

11. J3

r.tj ( -

cxi + r.t2j

( - O'J <

C'I.

< oo)

et.i + r.t3j

( - O'J <

C'I.

< O'J)

C'l.2i + r.t3j
+ 2j)

( - Ct:) <

C'I.

< (/J)

i + j, d

i - j. Express i as a linear combination of c and d.

(To do this you will need to calculate with vectors, by the same processes that you use with real numbers. This can be done; and this is why we stated and verified the laws A.1 through A.4 and M.1 through M.6.)

23. Express $\mathbf{j}$ as a linear combination of $\mathbf{c}$ and $\mathbf{d}$.

24. Now show how any vector $\vec P$ can be expressed as a linear combination of $\mathbf{c}$ and $\mathbf{d}$.

25. Let $\mathbf{e} = \mathbf{i} + 2\mathbf{j}$, $\mathbf{f} = 2\mathbf{i} - \mathbf{j}$.
a) Express $\mathbf{i}$ as a linear combination of $\mathbf{e}$ and $\mathbf{f}$.
b) Express $\mathbf{j}$ as such a linear combination.
c) Show how every vector $\vec P$ can be so expressed.

26. Same problem, for $\mathbf{e} = \mathbf{i} - 2\mathbf{j}$, $\mathbf{f} = 3\mathbf{i} + 2\mathbf{j}$.

27. The vectors $\mathbf{g}$ and $\mathbf{h}$ span the vector space $\mathcal{V}$ if every vector in $\mathcal{V}$ is a linear combination of $\mathbf{g}$ and $\mathbf{h}$. (Thus in Problem 25(c) you showed that $\mathbf{e}$ and $\mathbf{f}$ span $\mathcal{V}$.) Is it true that every pair of vectors in $\mathcal{V}$ span $\mathcal{V}$? Why or why not?

Free Vectors

9.8
28.

Let P1

(2,

1), P2

(1, 2).

Sketch the set of all points P such that

(0 ;;;i
29.

-+

415

et:

-+

;;;i I).

different vectors).

Let P1 and P2 be any two vectors (by which we mean two

Sketch the

set of all points P such that


-+

et:P1 + (1

-+

rx)P2

(0 ;;;i

;;;i I).

et:

Sketch the set of all points P satisfying the following conditions:

30. p

. i

32. p .
34. p
36. p
*33.

(i +
=

31. p . j 0

33. p .

j) 0

et:i +

{i'j ( et: 0, fi' 0)

. (i + 2j)

35. p
37. p

j) 0

(i -

et:i + {i'(i

. (i -

2j)

<

(o: 0, /! 0)

+ j)
0

Let $\mathcal{W}$ be the set of all continuous functions on the interval $[-1, 1]$. State, for the functions in $\mathcal{W}$, definitions of (a) addition, (b) scalar multiplication, and (c) inner product, in such a way that $\mathcal{W}$ forms an inner product space. Verify that under your definitions, the inner product space laws are all satisfied. (There is only one reasonable definition for (a), and similarly for (b); but the "right" definition of the inner product is less obvious. Hint: The "$\cdot$" operation is supposed to assign a number $f \cdot g$ to each pair of functions $f$, $g$. Under what significant operation does a number correspond to one function? As a check on your definition, it should turn out that if $f(x) = x^3$, $g(x) = 1$, $h(x) = x$, then $f \cdot g = 0$, $g \cdot h = 0$, and $f \cdot h = \tfrac25$.)

This inner product space has important uses, later in the theory of functions.

9.8

FREE VECTORS

In the last section, we defined a vector to be a directed segment $OP$, starting at the origin. We shall now introduce a different form of the vector concept, which for some purposes is better.

By a translation of a coordinate plane we mean a correspondence of the form

$$x \mapsto x + h, \qquad y \mapsto y + k,$$

where $h$ and $k$ are constants. This is different from the idea of translation of axes, which we used in Chapter 8. Then, we were moving the axes, while now we are moving the points $(x, y)$, with $(x, y) \mapsto (x + h, y + k)$.

Suppose that we have given two directed segments $PQ$, $P'Q'$, in a coordinate plane. If there is a translation under which $P \mapsto P'$ and $Q \mapsto Q'$, then we say that $PQ$ and $P'Q'$ are equivalent.

This idea is easy to describe in terms of coordinates. Let

$$P = (x_1, y_1), \qquad Q = (x_2, y_2), \qquad P' = (x_1', y_1'), \qquad Q' = (x_2', y_2').$$

We can always move $P$ onto $P'$ by a translation

$$x \mapsto x + h, \qquad y \mapsto y + k,$$

where

$$h = x_1' - x_1, \qquad k = y_1' - y_1.$$

If it is true that

$$x_1' - x_1 = x_2' - x_2, \qquad y_1' - y_1 = y_2' - y_2,$$

then this translation also moves $Q$ onto $Q'$, and $PQ$ and $P'Q'$ are equivalent.
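The coordinate test for equivalence is easy to mechanize: find the translation that moves $P$ onto $P'$, and see whether it also moves $Q$ onto $Q'$. The sketch below is illustrative; the function names and the sample points are hypothetical, and integer coordinates are used so that the equality test is exact.

```python
def translation_between(p, p_prime):
    """Return (h, k) for the translation x -> x + h, y -> y + k that moves p onto p_prime."""
    return (p_prime[0] - p[0], p_prime[1] - p[1])

def are_equivalent(p, q, p_prime, q_prime):
    """True if the translation moving p onto p_prime also moves q onto q_prime."""
    h, k = translation_between(p, p_prime)
    return (q[0] + h, q[1] + k) == q_prime

# Sample directed segments PQ and P'Q' (integer points chosen for illustration).
p, q = (1, 1), (4, 3)
same_arrow_elsewhere = are_equivalent(p, q, (-2, 5), (1, 7))   # shifted by h = -3, k = 4
reversed_arrow = are_equivalent(p, q, (4, 3), (1, 1))          # same points, opposite direction
print(same_arrow_elsewhere, reversed_arrow)
```

Note that the reversed segment is not equivalent to the original one: equivalence depends on direction as well as on length.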
For each pair $P$, $Q$, the symbol $\overrightarrow{PQ}$ denotes the set of all directed segments $P'Q'$ that are equivalent to $PQ$. Such a set of equivalent directed segments is called a free vector (or simply a vector, if the context makes it obvious what meaning is intended). Thus the figure on the left below is a partial picture of exactly one free vector. A free vector is called an equivalence class of directed segments; and any directed segment which belongs to such an equivalence class is called a representative of the class. Thus each of the arrows in the figure is a representative of the free vector $\overrightarrow{PQ}$.

If two directed segments $PQ$, $P'Q'$ are equivalent, then they determine the same free vector, and $\overrightarrow{PQ} = \overrightarrow{P'Q'}$. And if $\overrightarrow{PQ} = \overrightarrow{P'Q'}$, then the segments $PQ$ and $P'Q'$ are equivalent. Therefore, when we write an equation of the form

$$\overrightarrow{PQ} = \overrightarrow{P'Q'},$$

we are saying that the segments $PQ$, $P'Q'$ are equivalent under a translation.

It is now easy to define, for free vectors, the operations of addition, scalar multiplication, and dot product.

If

$$\vec P + \vec Q = \vec R$$

in the sense defined in the preceding section, then

$$\overrightarrow{OP} + \overrightarrow{OQ} = \overrightarrow{OR},$$

by definition. This definition is complete, because every free vector $\overrightarrow{ST}$ has exactly one representative segment which starts at the origin. Similarly, if $\alpha\vec P = \vec Q$, then

$$\alpha\,\overrightarrow{OP} = \overrightarrow{OQ},$$

by definition; and

$$\overrightarrow{OP} \cdot \overrightarrow{OQ} = \vec P \cdot \vec Q,$$

by definition. The form of these definitions makes it clear that all the vector laws and inner product laws of the preceding section also hold true for free vectors. Since all we would need to do is rewrite them in the new notation (using $\overrightarrow{OP}$ for $\vec P$), it is not worth while to do so. The length of a free vector is the length of each of its representatives. That is,

$$|\overrightarrow{OP}| = OP.$$

The free vector of length 0 is denoted by $\mathbf{0}$.

It is convenient, in figures, to use the label $\overrightarrow{PQ}$ for any representative of the free vector $\overrightarrow{PQ}$. Thus different segments may have the same label, as in the left-hand figure below; and when they do, this means that the segments are equivalent.

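It is easy to see that the right-hand figure above is correctly labeled. Therefore:

Theorem 1. $\overrightarrow{PQ} + \overrightarrow{QP} = \mathbf{0}$, for every $P$, $Q$.

Similarly, the labels are correct in the parallelogram below. Since

$$\overrightarrow{OP} + \overrightarrow{OR} = \overrightarrow{OQ},$$

we have

$$\overrightarrow{OP} + \overrightarrow{PQ} = \overrightarrow{OQ}.$$

This has a geometric meaning: we can add free vectors by laying representative segments end to end. Solving for $\overrightarrow{PQ}$, we get $\overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP}$. And this gives:

Theorem 2. $\overrightarrow{PQ} + \overrightarrow{QR} + \overrightarrow{RP} = \mathbf{0}$, for every $P$, $Q$, $R$.

Proof.

$$\overrightarrow{PQ} + \overrightarrow{QR} + \overrightarrow{RP} = \overrightarrow{OQ} - \overrightarrow{OP} + \overrightarrow{OR} - \overrightarrow{OQ} + \overrightarrow{OP} - \overrightarrow{OR} = \mathbf{0}.$$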

As a matter of convenience, we have defined equivalence of directed segments in terms of a coordinate system. But in fact this relation of equivalence is independent of the choice of the coordinate system. The directed segments $PQ$ and $P'Q'$ are equivalent under translation if (a) their lengths $PQ$ and $P'Q'$ are the same and (b) the directions $\theta$ and $\theta'$ are the same.

Note that while the directions $\theta$, $\theta'$ depend on the directions of the axes, the equation $\theta = \theta'$ does not; if the equation holds, and the axes are rotated, then the equation continues to hold.

Thus we say that the relation of equivalence between directed segments, used in defining free vectors, is invariant under changes in the coordinate system.

It very often happens that we use coordinate systems in the study of things which are invariant under changes of coordinates. Thus the distance between two points is invariant, and so also is the question whether a given curve is a parabola. But we use coordinate systems in the study of parabolas, and similarly we use coordinate systems in the study of vectors. If $P = (x, y)$, then $x$ and $y$ are called the $x$- and $y$-components of $\overrightarrow{OP}$. In this case

$$\vec P = x\mathbf{i} + y\mathbf{j},$$

where $\mathbf{i}$ and $\mathbf{j}$ are as in the preceding section. Corresponding to the vectors $\mathbf{i}$, $\mathbf{j}$ we have free vectors $\mathbf{i}$, $\mathbf{j}$; and $\overrightarrow{OP}$ is a linear combination of these free vectors:

$$\overrightarrow{OP} = x\mathbf{i} + y\mathbf{j},$$

as shown on the left below. And of course pictures of the new $\mathbf{i}$ and $\mathbf{j}$ can be drawn starting at any point that we want. In the right-hand figure, $\overrightarrow{PQ}$, $\mathbf{i}$, and $\mathbf{j}$ are all free vectors.

[Figure: $\overrightarrow{PQ} = \mathbf{i} + 2\mathbf{j}$.]

In general, if $\mathbf{V}$ and $\mathbf{T}$ are any vectors, with $\mathbf{T} \ne \mathbf{0}$, then the $\mathbf{T}$-component of $\mathbf{V}$ is the number

$$V_T = |\mathbf{V}| \cos\theta,$$

where $\theta$ measures the angle between the direction of $\mathbf{T}$ and the direction of $\mathbf{V}$. Thus, in the figure below, $V_T$ is the directed distance $PQ$, relative to the given positive direction on the line that contains $PR$.

Since

$$\mathbf{V} \cdot \mathbf{T} = |\mathbf{V}|\,|\mathbf{T}| \cos\theta,$$

it is easy to express the $\mathbf{T}$-component in terms of the dot product:

$$V_T = \frac{\mathbf{V} \cdot \mathbf{T}}{|\mathbf{T}|}.$$


PROBLEM SET 9.8

In the figures below, we use tick marks to indicate that segments have the same length.
Thus the tick marks in the figure below say that AB

AC.

1.

-+

--+

--+

a) Calculate OS as a linear combination of OR and OP. (The figure is a parallelogram.)

0
--+

--+

--+

b) Calculate OT as a linear combination of OR and OP, shown on the right above.


(These two answers, in combination, give a vector proof that the diagonals of a
parallelogram bisect each other.)
2.

--+

-+

--+

--+

a) Calculate SR and OT as linear combinations of OR and OS.


rhombus, so \OS\

(The figure is a

--+

-+

\OR\.)
T

0
--+

--+

b) Show that, in a rhombus, SR

OT

0.

(These two answers, in combination, give

a vector proof that the diagonals of a rhombus are perpendicular.)


-+

3. a) Calculate OS as a linear combination of OP and OR in the left-hand figure below.


R

--+

b) Calculate OT as a linear combination of OP and OR, in the right-hand figure.


4. Do Problems 3a and 3b give a vector proof that the three medians of a triangle are

concurrent? Or do you need to carry out a third calculation of the same kind, to
complete the proof?
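5. a) Show that

$$|\mathbf{V} \cdot \mathbf{T}| \le |\mathbf{V}|\,|\mathbf{T}|,$$

for every two free vectors $\mathbf{V}$ and $\mathbf{T}$.

b) Show that for any real numbers $a$, $b$, $x$, $y$, we have

$$|ax + by| \le \sqrt{a^2 + b^2}\,\sqrt{x^2 + y^2}.$$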


6.

Show that if P, Q, R, and Sare any four points of the plane, then
---+

---+

---+

PQ + QR + RS + SP
Let V0 be a fixed (free) vector.

8.

Suppose that V i
0, and [V[
1. Is this information enough to determine V?
If so, what is V? If not, give a figure, showing the possibilities for V.

9.

Given that V i

0.

7.

Show that if V0

0, for every V, then V0

0 and V

0, discuss as in Problem 8.

11.

a) A set of vectors Vv V2,


, Vn are linearly dependent if there are scalars !Xv IX2,
1Xm not all
0, such that

Given V

1 and V

10.

0.

1, discuss as in Problem 8.

Show that for any V, the vectors i, j, and V are linearly dependent.
b) Show that if one of the vectors Vi is
0, then the vectors V1, V2, ... , Vn are
linearly dependent.
c) Find a number a such that 2i + j and 7i + aj are linearly dependent.
=

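12. a) A set of vectors $\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_n$ are linearly independent if they are not linearly dependent. Thus the $\mathbf{V}_i$'s are linearly independent if

$$\sum_{i=1}^{n} \alpha_i \mathbf{V}_i = \mathbf{0} \ \Rightarrow\ \alpha_1 = \alpha_2 = \cdots = \alpha_n = 0.$$

Show that $\mathbf{i}$ and $\mathbf{j}$ are linearly independent.

b) Are $\mathbf{i}$ and $\mathbf{i} + \mathbf{j}$ linearly independent? Why or why not?

c) Given that $\mathbf{i}$ and $\overrightarrow{OP}$ are linearly dependent, what are the possibilities for $P$?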

Paths and Vectors in a Plane

422
13.

9.9

Show that if
and
then
IV1 - V2I

IW1 - W2I

(Remember that IVl2


V V, for every V.) Then draw a figure, and restate the
theorem in the language of elementary geometry.
=

14. Explain how Problems 5a and 5b can be regarded as the same problem.

15. a) Consider the vector space which you were asked to define in the last problem of the preceding problem set. Let $1$ be the constant function which is $= 1$ for each $x$ on $[-1, 1]$. Find ten nonconstant functions $f_1, f_2, \ldots, f_{10}$ such that $1 \cdot f_i = 0$ for each $i$.
b) Show that in the same vector space, $f \cdot f_0 = 0$ for every $f$ $\Rightarrow$ $f_0 = 0$.

9.9

VELOCITY, ACCELERATION, AND CURVATURE

We return to the discussion of moving particles in a plane. Suppose that the motion is described by a path

$$P: I \to E, \qquad t \mapsto P(t),$$

where $I$ is a time interval. Let the coordinate functions of the path be $f$ and $g$, so that

$$P(t) = (f(t), g(t)) \qquad (t \text{ on } I).$$

We now regard the path as a function whose values are the vectors

$$\vec P_t = \overrightarrow{OP_t},$$

where $\vec P_t$ is a vector in the sense of Section 9.7, and $P(t)$ is denoted by $P_t$, to fit it into the vector notation. We then have

$$\vec P_t = f(t)\mathbf{i} + g(t)\mathbf{j}.$$

We can now define the velocity and acceleration. These are the free vectors

$$\mathbf{V}_t = f'(t)\mathbf{i} + g'(t)\mathbf{j}, \qquad \mathbf{A}_t = f''(t)\mathbf{i} + g''(t)\mathbf{j},$$

where $\mathbf{i}$ and $\mathbf{j}$ are the free vectors corresponding to the vectors $\mathbf{i}$ and $\mathbf{j}$ of Section 9.7.

Since $\mathbf{V}_t$ and $\mathbf{A}_t$ are free vectors, we can draw pictures of them in any position we want; and so we picture them by drawing arrows starting at the point $P_t$.

The picture then says that at time $t$, the moving particle is at the point $P_t$ and has the indicated velocity and acceleration vectors $\mathbf{V}_t$ and $\mathbf{A}_t$. Note that $\mathbf{V}_t$ lies along the tangent line; and this is right. (This should be checked, for the various possible cases. (a) If $f'(t)$ and $g'(t)$ are both 0, then $\mathbf{V}_t = \mathbf{0}$, and there is nothing to prove. (b) If $f'(t) \ne 0$, then $\mathbf{V}_t$ and the tangent line both have slope $g'(t)/f'(t)$. (c) If $f'(t) = 0$ and $g'(t) \ne 0$, then $\mathbf{V}_t$ and the tangent line are both vertical.)

When we write $\mathbf{V}_t = f'(t)\mathbf{i} + g'(t)\mathbf{j}$, $\mathbf{A}_t = f''(t)\mathbf{i} + g''(t)\mathbf{j}$, we are describing each of the vectors $\mathbf{V}_t$ and $\mathbf{A}_t$ by a pair of numbers. Unfortunately, the numbers $f'(t)$, $g'(t)$, $f''(t)$, $g''(t)$ have no physical meaning, because they depend on the coordinate system. It is possible, however, to describe the acceleration by a pair of numbers which do have physical meanings. This is done in the following way. First we take a free vector $\mathbf{T}$, with the same direction as $\mathbf{V}_t$, but with length 1. $\mathbf{T}$ is called the unit tangent vector at the point $P_t$. (Here, and throughout the following discussion, we are assuming that the speed $|\mathbf{V}_t|$ is not zero. If the length of $\mathbf{V}_t$ is 0, then its direction is not determined, and so $\mathbf{T}$ is not determined either.)

Next we take a free vector $\mathbf{N}$, with length 1, perpendicular to $\mathbf{T}$, and lying on the same side of $\mathbf{T}$ as $\mathbf{A}_t$. Then $\mathbf{A}_t$ must be expressible as a linear combination

$$\mathbf{A}_t = \alpha\mathbf{T} + \beta\mathbf{N}$$

of $\mathbf{T}$ and $\mathbf{N}$. Here $\alpha$ is the $\mathbf{T}$-component of $\mathbf{A}_t$, and $\beta$ is the $\mathbf{N}$-component. These numbers are called the tangential and normal components of the acceleration. We shall now compute them.

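In the right-hand figure above, $\theta$ is the direction of $\mathbf{V}_t$. Since $\mathbf{V}_t = f'(t)\mathbf{i} + g'(t)\mathbf{j}$, we have

$$\cos\theta = \frac{f'(t)}{|\mathbf{V}_t|}, \qquad \sin\theta = \frac{g'(t)}{|\mathbf{V}_t|},$$

where

$$|\mathbf{V}_t| = \sqrt{f'(t)^2 + g'(t)^2}.$$

Similarly, $\phi$ is the direction of the acceleration, so that

$$\cos\phi = \frac{f''(t)}{|\mathbf{A}_t|}, \qquad \sin\phi = \frac{g''(t)}{|\mathbf{A}_t|}.$$

By definition of the $\mathbf{T}$-component $\alpha$ of $\mathbf{A}_t$, we have

$$\alpha = |\mathbf{A}_t| \cos(\phi - \theta) = |\mathbf{A}_t| \cos\phi\cos\theta + |\mathbf{A}_t| \sin\phi\sin\theta = f''(t)\cos\theta + g''(t)\sin\theta$$

$$= \frac{f''(t)f'(t) + g''(t)g'(t)}{|\mathbf{V}_t|} = \frac{f'(t)f''(t) + g'(t)g''(t)}{\sqrt{f'(t)^2 + g'(t)^2}}.$$

Theorem 1. The tangential component of acceleration is the derivative of the speed. That is,

$$\alpha = A_T = \frac{d}{dt}|\mathbf{V}_t| = |\mathbf{V}_t|'.$$

Once this has been observed, it is easy to check it, by differentiating the function

$$|\mathbf{V}_t| = \sqrt{f'(t)^2 + g'(t)^2}.$$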

The normal component $\beta$ is computed as follows. If $\mathbf{N}$ is counterclockwise from $\mathbf{T}$, as in the figure above, then

$$\beta = |\mathbf{A}_t| \cos[(\theta + \pi/2) - \phi] = |\mathbf{A}_t| \cos[(\theta - \phi) + \pi/2] = -|\mathbf{A}_t| \sin(\theta - \phi).$$

If the direction of $\mathbf{N}$ is reversed, the sign of $\sin(\theta - \phi)$ is also reversed. In any case, we want $\beta \ge 0$, because $\mathbf{N}$ is taken on the same side of the tangent as $\mathbf{A}_t$. Therefore we must have

$$\beta = |\mathbf{A}_t|\,|\sin(\theta - \phi)|,$$

in all cases. Therefore

$$\beta = |\mathbf{A}_t|\,|\sin\theta\cos\phi - \cos\theta\sin\phi| = \bigl|\,|\mathbf{A}_t|\sin\theta\cos\phi - |\mathbf{A}_t|\cos\theta\sin\phi\,\bigr| = |f''(t)\sin\theta - g''(t)\cos\theta|$$

$$= \frac{|f''(t)g'(t) - g''(t)f'(t)|}{|\mathbf{V}_t|} = \frac{|f''(t)g'(t) - g''(t)f'(t)|}{\sqrt{f'(t)^2 + g'(t)^2}}.$$

This formula for $\beta$ also has an interpretation, but its interpretation is harder to see, and requires the idea of the curvature of a path at a point.

For the sake of simplicity, we start with the idea of the curvature of the graph of a twice differentiable function at a point.

For each $x$, let $s(x)$ be the length of the graph from $t = a$ to $t = x$. Then

$$s(x) = \int_a^x \sqrt{1 + f'(t)^2}\,dt, \qquad \text{and} \qquad s'(x) = \sqrt{1 + f'(x)^2}.$$

For each $x$, let $\theta(x)$ be the direction of the tangent line, with

$$-\pi/2 < \theta(x) < \pi/2.$$

Since $s$ is an increasing function, $\theta(x)$ is determined when $s(x)$ is known. Thus there is a function $h$ such that

$$\theta(x) = h(s(x)),$$

and (in the language of Section 5.8)

$$\frac{d\theta}{ds} = h'(s(x)) = \frac{d\theta/dx}{ds/dx}.$$

The curvature is defined to be

$$K = \left|\frac{d\theta}{ds}\right|.$$

This is easy to calculate. Since

$$\theta(x) = \operatorname{Tan}^{-1} f'(x),$$

we have

$$\theta'(x) = \frac{d\theta}{dx} = \frac{1}{1 + f'(x)^2}\,f''(x).$$

Therefore

$$K = \left|\frac{d\theta}{ds}\right| = \left|\frac{d\theta/dx}{ds/dx}\right| = \left|\frac{f''(x)}{1 + f'(x)^2} \cdot \frac{1}{\sqrt{1 + f'(x)^2}}\right| = \frac{|f''(x)|}{[1 + f'(x)^2]^{3/2}}.$$

For future reference:

Theorem 2. The curvature of the graph of a twice differentiable function is given by the formula

$$K = \frac{|f''(x)|}{[1 + f'(x)^2]^{3/2}}.$$

For paths, the idea is similar. Take a fixed $t_0$, and for each $t$, let $s(t)$ be the length of the path, from $t_0$ to $t$. Then

$$s(t) = \int_{t_0}^{t} \sqrt{f'(u)^2 + g'(u)^2}\,du, \qquad \text{and} \qquad s'(t) = \sqrt{f'(t)^2 + g'(t)^2}.$$

For each $t$, let $\theta(t)$ be the direction of the velocity vector at time $t$. We are working on a portion of the path where $|\mathbf{V}_t| = \sqrt{f'(t)^2 + g'(t)^2} \ne 0$. On such a portion of the path, $s$ is an increasing function, and so $\theta(t)$ is determined when $s(t)$ is known. Thus there is a function $h$ such that

$$\theta(t) = h(s(t)).$$

Therefore

$$\frac{d\theta}{ds} = h'(s(t)) = \frac{d\theta/dt}{ds/dt}.$$

But according to the definition of the curvature $K$ of the path,

$$K = \left|\frac{d\theta}{ds}\right|.$$

In order to calculate $d\theta/dt$, we take first the case in which $\mathbf{V}_t$ is not vertical, so that

$$\tan\theta(t) = \frac{g'(t)}{f'(t)}.$$

Taking the derivative, we get

$$[\sec^2\theta(t)]\,\theta'(t) = \frac{f'(t)g''(t) - g'(t)f''(t)}{f'(t)^2}.$$

Now

$$\sec^2\theta(t) = 1 + \tan^2\theta(t) = 1 + \frac{g'(t)^2}{f'(t)^2} = \frac{f'(t)^2 + g'(t)^2}{f'(t)^2}.$$

Therefore

$$\theta'(t) = \frac{f'(t)g''(t) - g'(t)f''(t)}{f'(t)^2 + g'(t)^2}.$$

This derivation works whenever the velocity vector is nonvertical. (Query: How would you derive the same formula, in the case where the velocity vector is vertical?) This gives

$$K = \left|\frac{d\theta}{ds}\right| = \left|\frac{d\theta/dt}{ds/dt}\right| = \left|\frac{\theta'(t)}{s'(t)}\right| = \left|\frac{f'(t)g''(t) - g'(t)f''(t)}{f'(t)^2 + g'(t)^2} \cdot \frac{1}{\sqrt{f'(t)^2 + g'(t)^2}}\right| = \frac{|f'(t)g''(t) - g'(t)f''(t)|}{[f'(t)^2 + g'(t)^2]^{3/2}}.$$

Thus we have:

Theorem 3. The curvature of a twice differentiable path, at any point where the speed is not 0, is given by the formula

$$K = \frac{|f'(t)g''(t) - g'(t)f''(t)|}{[f'(t)^2 + g'(t)^2]^{3/2}}.$$

Comparing this with the formula

$$\beta = \frac{|f''(t)g'(t) - g''(t)f'(t)|}{\sqrt{f'(t)^2 + g'(t)^2}},$$

we get:

Theorem 4. At any point where the speed is not zero, the normal component of acceleration is given by the formula

$$\beta = A_N = K\,|\mathbf{V}_t|^2.$$
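The formulas for the tangential component $\alpha$, the normal component $\beta$, and the curvature $K$ can be tried together on one path. The sketch below uses the spiral $f(t) = e^t\cos t$, $g(t) = e^t\sin t$, which is chosen here only as an illustration and does not come from the text; for it the speed is $\sqrt2\,e^t$, so the derivative of the speed is also $\sqrt2\,e^t$. The function names are hypothetical.

```python
import math

def acceleration_components(f1, g1, f2, g2):
    """Return (alpha, beta, K) from the values f'(t), g'(t), f''(t), g''(t) at one instant.

    Assumes the speed sqrt(f1^2 + g1^2) is not zero.
    """
    speed = math.hypot(f1, g1)
    alpha = (f1 * f2 + g1 * g2) / speed          # tangential component
    beta = abs(f2 * g1 - g2 * f1) / speed        # normal component
    K = abs(f1 * g2 - g1 * f2) / speed ** 3      # curvature
    return alpha, beta, K

def spiral_derivatives(t):
    """First and second derivatives of f(t) = e^t cos t, g(t) = e^t sin t."""
    e = math.exp(t)
    f1 = e * (math.cos(t) - math.sin(t))
    g1 = e * (math.sin(t) + math.cos(t))
    f2 = -2.0 * e * math.sin(t)
    g2 = 2.0 * e * math.cos(t)
    return f1, g1, f2, g2

t = 0.5  # sample instant
f1, g1, f2, g2 = spiral_derivatives(t)
alpha, beta, K = acceleration_components(f1, g1, f2, g2)
speed = math.hypot(f1, g1)

print(alpha, math.sqrt(2.0) * math.exp(t))        # Theorem 1: alpha is the derivative of the speed
print(beta, K * speed ** 2)                       # Theorem 4: beta = K |V|^2
print(alpha ** 2 + beta ** 2, f2 ** 2 + g2 ** 2)  # alpha^2 + beta^2 = |A|^2
```

The last line reflects the fact that $\mathbf{T}$ and $\mathbf{N}$ are perpendicular unit vectors, so that $|\mathbf{A}_t|^2 = \alpha^2 + \beta^2$.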
In our discussion, we used the notation $f'$, $g'$, ... for derivatives, most of the time, in order to connect our work with the preceding theory. We used the notation $d\theta/dt$, $d\theta/ds$, ... only when we really needed to talk about the derivative of one function with respect to another, in defining and calculating curvature. In the literature of physics, however, the notation $f'$, $g'$, ... is hardly used at all. The following notations are far more common:
\[ V = \frac{df}{dt}\mathbf{i} + \frac{dg}{dt}\mathbf{j}, \qquad V = \frac{dx}{dt}\mathbf{i} + \frac{dy}{dt}\mathbf{j}, \qquad V = \dot{x}\mathbf{i} + \dot{y}\mathbf{j}. \]
In the last expression the dots over $x$ and $y$ indicate differentiation with respect to time. Similarly,
\[ A = \frac{d^2 f}{dt^2}\mathbf{i} + \frac{d^2 g}{dt^2}\mathbf{j} = \frac{d^2 x}{dt^2}\mathbf{i} + \frac{d^2 y}{dt^2}\mathbf{j} = \ddot{x}\mathbf{i} + \ddot{y}\mathbf{j}. \]

In these notations,
\[ K = \frac{|\dot{x}\ddot{y} - \dot{y}\ddot{x}|}{[\dot{x}^2 + \dot{y}^2]^{3/2}} \]
and
\[ K = \frac{|(dx/dt)(d^2y/dt^2) - (dy/dt)(d^2x/dt^2)|}{[(dx/dt)^2 + (dy/dt)^2]^{3/2}}. \]

There is a good reason, in physics, for the use of the "fractional" notation $dy/dt$, $df/dx$, ... for derivatives. Most of the time, physical problems involve a large number of interrelated functions, and physicists need to talk about the derivative of one of these with respect to another. Therefore, the rest of the time, they use the same "fractional" notation $df/dx$ for ordinary derivatives $f'$.

PROBLEM SET 9.9


1.

Find the point of maximum curvature of the parabola y =


value of

2.

and find the maximum

Find the point of maximum curvature of the parabola y =2 +


maximum value of

3.

x2,

K.

+ x2, and find the

K.

Find the points of maximum and minimum curvature of the graph of


calculate the values of

x3,

and

at these points.

4. Calculate the curvature of a circle of radius $a$.

5. Calculate the curvature at the points $(a, 0)$ and $(0, b)$ for the ellipse
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \]

6.

Sketch the path

Pt = i cost + j sint,
showing the velocity and acceleration vectors at several points.
7.

Discuss (as in Problem

6)
t

Pt = i cos 2 + j sin 2 .
8. Discuss
9. Discuss
10. Discuss

Pt =2i cost + j sin t.

Pt = it + jt2 .
Pt

ti +

(t - t2)j (0 t

11. Discuss

Pt =i cos t2 + j sin t2.

12.

Pt

Discuss

it + jt3.

13. Discuss

Pt = it3 + jt2

14. Discuss

Pt =i(l - t)2 + j(t3 - t).

15.

Pt =(t cos ()()i + ( -!gt2 + t cos ()()j.


0, and find the direction of this vector.

Discuss

1).

In the sketch , show the velocity at


16. For a certain path, the velocity at time 0 has direction I)( and length 1. The initial
point P0 is the origin. For each t, At
gj Express the path in the form Pt
f(t )i + g(t)j.
=

17. Discuss as in Problems 6 through 15, and express the tangential and normal components of acceleration as functions of the time:
\[ P_t = 3\mathbf{i}\cos 2t + 3\mathbf{j}\sin 2t. \]

18. Discuss and sketch
\[ P_\theta = \mathbf{i}\cos^2\theta + \mathbf{j}\sin\theta\cos\theta. \]
Describe this as a path in polar coordinates; find a rectangular equation for its locus, and identify the locus.
19. Discuss and sketch

Po

(cos 8 - cos2 8)i + (sin 8 - sin 8 cos 8)j.

20. Discuss and sketch

21. Discuss and sketch

Po

(3 cos 8 + cos 38)i + (3 sin 8 - sin 38)j.

P0

22. Discuss and sketch


po

i(cos2 8 - sin2 8)i + sin 8 cos 8j.

(cos 8 - cos 8 sin 8)i + (sin 8 - sin2 8)j.

23. Is the following statement true? (Why or why not?)

Theorem (?). Given a path with coordinate functions $f$ and $g$, on an interval $[a, b]$, such that $f$ and $g$ are differentiable, and the velocity is nowhere 0, then there is a time $t$ such that $V_t$ has the same direction as $P_aP_b$.

The figure indicates that in some cases, at least, there is such a time $t$.

24. Given a path which has curvature $K$ at time $t_0$, suppose that the axes are rotated. Does the curvature change? Why or why not? [Hint: This problem does not require a calculation.]

25. Let $a = \mathbf{i} + \mathbf{j}$, $b = \mathbf{j}$. Suppose we define an "inner product" $V_1 * V_2$, by agreeing that for
\[ V_1 = x_1 a + y_1 b, \qquad V_2 = x_2 a + y_2 b, \]
the $*$ product is
\[ V_1 * V_2 = x_1 x_2 + y_1 y_2. \]
a) Does $*$ obey the same formal laws as the old inner product?
b) Is it true that $V_1 * V_2 = V_1 \cdot V_2$ for every $V_1$ and $V_2$? Why or why not? In any case, express the new operation $*$ in terms of the old.

9.10 CONCLUDING REMARKS ON VECTOR SPACES AND INNER PRODUCT SPACES

The treatment of vectors in this chapter has been brief, because so far we are working in a plane, and the main advantages of a vector approach appear in three-dimensional space, and in spaces of higher dimensions. Meanwhile we must bear in mind that vector ideas appear in many different forms.

1) Free vectors. Velocity and acceleration are vectors in this sense, as in Sections 9.8 and 9.9.

2) Bound vectors. These have not only length and direction, but also position. For example, if two forces act in opposite directions on the ends of a spring, then they may be regarded as bound vectors.

[Figure: a spring with forces $F_1$ and $F_2$ acting on its two ends in opposite directions.]

In the figure, the two forces have the same length and opposite directions, but they do not cancel each other out, as free vectors would; on the contrary, they compress the spring.
3) Sequences of numbers, regarded as vectors. For example, ordered quadruplets $(w, x, y, z)$ can be regarded as vectors. We make the natural definitions
\[ (w_1, x_1, y_1, z_1) + (w_2, x_2, y_2, z_2) = (w_1 + w_2,\ x_1 + x_2,\ y_1 + y_2,\ z_1 + z_2), \]
\[ \alpha(w, x, y, z) = (\alpha w, \alpha x, \alpha y, \alpha z), \]
\[ (w_1, x_1, y_1, z_1) \cdot (w_2, x_2, y_2, z_2) = w_1 w_2 + x_1 x_2 + y_1 y_2 + z_1 z_2. \]
In fact, this is the usual way of describing a space of four dimensions.
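The three definitions for quadruplets translate directly into a few lines of code. This is a small sketch, not part of the original text; the function names `add`, `scale`, and `inner` are illustrative assumptions, and tuples stand in for the quadruplets.

```python
def add(u, v):
    """Componentwise sum of two quadruplets."""
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, u):
    """Product of the scalar alpha and the quadruplet u."""
    return tuple(alpha * a for a in u)

def inner(u, v):
    """Inner product: the sum of the products of corresponding components."""
    return sum(a * b for a, b in zip(u, v))
```

Nothing in these definitions depends on the number 4; the same functions work for sequences of any common length.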

4) Systems of other kinds, regarded as vector spaces and inner product spaces. Some of these are unexpected, but turn out to be useful. See, for example, Problem 33 of Problem Set 9.7, in which it appeared that a set of functions can be regarded as an inner product space, although functions may not seem like vectors when we look at them one at a time.
For this reason, when people speak of "vectors," we need to find out what kind
of vectors they are talking about.

10 Infinite Series

10.1 LIMITS OF SEQUENCES

Most of the time, so far, we have dealt with limits of functions, as $x \to a$ or as $x \to \infty$. But often we have dealt with limits of sequences, as $n \to \infty$. For example, in Section 2.10, we wanted to find the area $A$, under the graph of $y = x^2$, from $x = 0$ to $x = h$. We expressed $A$ as the limit of a sequence $A_1, A_2, \ldots$, where $A_n$ is the area of a circumscribed polygonal region $R_n$. We calculated
\[ A_n = \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right), \]
and we found that
\[ \lim_{n\to\infty} A_n = \frac{h^3}{3}. \]

We are now going to use limits of sequences more extensively, as a way of dealing with infinite series. Given an infinite sum
\[ \sum_{i=1}^{\infty} a_i = a_1 + a_2 + \cdots, \]
we define
\[ A_n = \sum_{i=1}^{n} a_i, \]
and we call the $A_n$'s the partial sums of $\sum_{i=1}^{\infty} a_i$. Thus the $A_n$'s form a sequence $A_1, A_2, \ldots$ If
\[ \lim_{n\to\infty} A_n = A, \]
then we say that the infinite sum is convergent, and we write
\[ \sum_{i=1}^{\infty} a_i = A. \]

We shall now examine limits of sequences more carefully, starting with the
definition of the limit, and building up the theory that is needed.
Definition. Given a sequence $A_1, A_2, \ldots$ of numbers, and a number $L$. Suppose that for every $\epsilon > 0$ there is an integer $N$ such that
\[ n > N \ \Rightarrow\ |A_n - L| < \epsilon. \]
Then
\[ \lim_{n\to\infty} A_n = L. \]

Note that this is like the definition of $\lim_{x\to\infty} f(x)$. A sequence which has a limit is called convergent. Here, as always, when we speak of a limit we mean a finite limit (unless the contrary is stated).


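Theorem 1. $\lim_{n\to\infty} \dfrac{1}{n} = 0$.

Proof. Here $L = 0$, and $|A_n - L| = |1/n - 0| = 1/n$. Thus we need to show that for every $\epsilon > 0$ there is an $N$ such that
\[ n > N \ \Rightarrow\ \frac{1}{n} < \epsilon. \]
Now
\[ \frac{1}{n} < \epsilon \ \Leftrightarrow\ n > \frac{1}{\epsilon}. \]
If $1/\epsilon$ is an integer, let $N = 1/\epsilon$. In any case, there is an integer $N > 1/\epsilon$. Then
\[ n > N \ \Rightarrow\ n > 1/\epsilon \ \Rightarrow\ 1/n < \epsilon, \]
which is what we wanted.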
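The choice of N in this proof is concrete enough to compute. The following sketch is not part of the original text; the function name `find_N` is an illustrative assumption, and it always takes the integer just above 1/epsilon, which is one of the choices the proof allows.

```python
import math

def find_N(eps):
    """Return an integer N > 1/eps, so that n > N implies 1/n < eps."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # floor(x) + 1 is always an integer strictly greater than x.
    return math.floor(1 / eps) + 1
```

For a given positive epsilon, every n beyond the returned N then satisfies 1/n < epsilon, which is exactly the condition in the definition of the limit.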


On the basis of the definition of $\lim_{n\to\infty} A_n$, we can prove the expected theorems on sums, products, and quotients. These are much like the corresponding theorems for limits of functions. In Appendix C they are listed in such an order that they become easy to prove. Meanwhile we shall state the main results and use them.

Theorem 2. If $\lim_{n\to\infty} A_n = A$ and $\lim_{n\to\infty} B_n = B$, then
\[ \lim_{n\to\infty} (A_n + B_n) = A + B \]
and
\[ \lim_{n\to\infty} A_n B_n = AB. \]
If $B \ne 0$, and $B_n \ne 0$ for each $n$, then
\[ \lim_{n\to\infty} \frac{A_n}{B_n} = \frac{A}{B}. \]

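These theorems justify the procedures that we have been using informally. For example, they give a proof that
\[ \lim_{n\to\infty} \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right) = \frac{h^3}{3}. \]
The steps are as follows:
\[ \lim_{n\to\infty} \frac{1}{n} = 0, \]
\[ \lim_{n\to\infty} \left(1 + \frac{1}{n}\right) = 1, \]
\[ \lim_{n\to\infty} \frac{1}{2n} = \frac{1}{2}\lim_{n\to\infty} \frac{1}{n} = 0, \]
\[ \lim_{n\to\infty} \left(1 + \frac{1}{2n}\right) = 1, \]
\[ \lim_{n\to\infty} \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right) = \frac{h^3}{3}. \]
(Justification for each of these steps?)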


If we start with convergent sequences $A_1, A_2, \ldots$, $B_1, B_2, \ldots$, and so on, then Theorem 2 tells us that certain other sequences are convergent. But often we deal with sequences which are not built up out of convergent sequences as in Theorem 2. We then need the following ideas.

Definition. A sequence $A_1, A_2, \ldots$ is increasing if $A_n \le A_{n+1}$ for every $n$. The sequence is decreasing if $A_n \ge A_{n+1}$ for every $n$. (If $A_n < A_{n+1}$ for every $n$, then the sequence is strictly increasing; and if $A_{n+1} < A_n$ for every $n$, then the sequence is strictly decreasing.)
Definition. If there is a number $M$ such that $A_n \le M$ for every $n$, then $M$ is called an upper bound of the sequence $A_1, A_2, \ldots$, and we say that the sequence is bounded above. If there is a number $m$ such that $m \le A_n$ for every $n$, then $m$ is called a lower bound of the sequence, and we say that the sequence is bounded below. If there is a $K > 0$ such that $|A_n| \le K$ for every $n$, then the sequence is bounded.
Example: (1) If $A_n = \sqrt{n}$ for every $n$, then the sequence is increasing, and is bounded below but not above. (2) If $A_n = e^{-n}$, then the sequence is decreasing, and is bounded. (3) If $A_n = \sin n$, then the sequence is neither increasing nor decreasing, but is bounded, with $|\sin n| \le 1$ for each $n$.


It is easy to see that if a sequence is bounded both above and below, then it is bounded. Given $m \le A_n \le M$ for every $n$, let $K$ be the larger of the numbers $|m|$ and $|M|$.
Theorem 3. If a sequence is increasing, and is bounded above, then it is convergent.

That is, if
\[ A_1 \le A_2 \le \cdots \le A_n \le A_{n+1} \le \cdots \le M, \]
then the sequence has a limit. The first application of this principle that you may have seen is in geometry. Given a circle of diameter 1, we inscribe in it a regular polygon of $2^n$ sides. For each $n$, let $A_n$ be the perimeter of our $2^n$-gon. (Note that we had better start with $n = 2$.) It is a matter of elementary geometry to show that the sequence $A_2, A_3, \ldots$ is increasing. Also, $A_n < 4$ for every $n$, because the perimeter of every inscribed polygon is less than the perimeter of the circumscribed square. (Draw a figure.) Therefore the sequence is convergent. Its limit, of course, is $\pi$.
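The inscribed-polygon sequence can be tabulated without using the value of π at all. The sketch below is not part of the original text. It assumes the standard half-angle relation for chords of a circle of diameter 1: if a chord has length s, the chord of half the arc has length s / sqrt(2(1 + sqrt(1 - s^2))). The function name `inscribed_perimeters` is illustrative.

```python
import math

def inscribed_perimeters(n_max):
    """Perimeters A_2, ..., A_{n_max} of regular 2^n-gons inscribed in a circle of diameter 1."""
    side = math.sqrt(2) / 2  # side of the inscribed square (n = 2)
    perimeters = []
    for n in range(2, n_max + 1):
        perimeters.append((2 ** n) * side)
        # Doubling the number of sides: chord of half the arc.
        side = side / math.sqrt(2 * (1 + math.sqrt(1 - side * side)))
    return perimeters
```

The values returned should increase and stay below 4, as the argument in the text requires; they can be compared with π only after the fact.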

We proceed to the proof. Let $S$ be the set of all numbers $A_n$. That is,
\[ S = \{A_1, A_2, \ldots\}. \]
Then $S$ has an upper bound. By the Least Upper Bound Postulate (LUBP), $S$ has a least upper bound. (See Section 5.6.) This is called the supremum of $S$, and is denoted by $\sup S$. Let
\[ A = \sup S. \]
We shall show that
\[ \lim_{n\to\infty} A_n = A. \]
Let $\epsilon$ be any positive number. Then $A - \epsilon < A$. Therefore $A - \epsilon$ is not an upper bound of $S$. Therefore $A_N > A - \epsilon$ for some $N$. Since the sequence is increasing, this means that
\[ n > N \ \Rightarrow\ A_n > A - \epsilon. \]
Since $A$ is an upper bound of $S$, and $A + \epsilon > A$, it follows that $A + \epsilon$ is an upper bound of $S$. Therefore
\[ A_n < A + \epsilon \quad \text{for every } n. \]
Therefore
\[ n > N \ \Rightarrow\ |A_n - A| < \epsilon, \]
and $\lim_{n\to\infty} A_n = A$, which was to be proved.

We have a similar theorem for decreasing sequences:

Theorem 4. If a sequence is decreasing, and is bounded below, then it is convergent.

That is, if $A_1, A_2, \ldots$ is decreasing, and $A_n \ge K$ for every $n$, then the sequence has a limit.

Proof. For each $n$, let $B_n = -A_n$. Then $B_1, B_2, \ldots$ is increasing, and is bounded above. Therefore it is convergent. Let $\lim_{n\to\infty} B_n = B$. Then $\lim_{n\to\infty} A_n = -B$.

Some simple sequences converge for reasons which are not covered by the preceding theorems. For example, given that
\[ \lim_{n\to\infty} \frac{1}{n} = 0, \]
it is obvious that
\[ \lim_{n\to\infty} \frac{1}{n + 2\sqrt{n}} = 0, \]
because the second sequence is smaller, term by term. This is the idea of the following theorem.

Theorem 5 (The squeeze principle). If $\lim_{n\to\infty} A_n = L$, $\lim_{n\to\infty} C_n = L$, and $A_n \le B_n \le C_n$ for every $n$, then $\lim_{n\to\infty} B_n = L$.

In many cases, it is easier to use this theorem than to do awkward calculations.

Similarly, it ought to be true that
\[ \lim_{n\to\infty} \frac{\cos n}{n} = 0, \]
because $|\cos n| \le 1$, and $1/n \to 0$. But we can't get this result from Theorem 2, because $\cos n$ does not approach a limit as $n \to \infty$. Hence we need the following:

Theorem 6 (The annihilation theorem). If $\lim_{n\to\infty} A_n = 0$, and $B_1, B_2, \ldots$ is bounded, then $\lim_{n\to\infty} A_n B_n = 0$.
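The example (cos n)/n can be watched numerically between the two sequences -1/n and 1/n that squeeze it. This is an illustrative sketch, not part of the original text; the function name `squeeze_terms` is an assumption.

```python
import math

def squeeze_terms(n):
    """Return (-1/n, cos(n)/n, 1/n); the middle term lies between the outer two."""
    return -1 / n, math.cos(n) / n, 1 / n
```

Since both outer sequences approach 0, the middle one is forced to approach 0 as well, even though cos n itself has no limit.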
Theorem 7. Every convergent sequence is bounded.

For increasing sequences, this is trivial: if $A_1, A_2, \ldots$ is increasing, and $\lim_{n\to\infty} A_n = A$, then $A_1 \le A_n \le A$ for every $n$. Similarly for decreasing sequences. For a proof in the general case, see Appendix C. This theorem has simple applications: if we show that a sequence is not bounded, then it follows that the sequence is not convergent.
The statements
\[ \lim_{n\to\infty} A_n = \infty, \qquad \lim_{n\to\infty} A_n = -\infty \]
mean what you would expect. You should be able to state your own definitions of them, following, if you need to, the models of Section 5.3. Sequences like this are not called convergent. If $\lim_{n\to\infty} A_n = \infty$, then we say that the sequence diverges to infinity. And if $\lim_{n\to\infty} A_n = -\infty$, we say that the sequence diverges to minus infinity. We have to be careful about this: if convergence allowed the limits $\infty$ and $-\infty$, then Theorem 7 would become false, and Theorem 2 would be meaningless in many cases. (You can't perform algebraic operations on the "numbers" $\infty$ and $-\infty$.)

PROBLEM SET 10.1


Investigate the following indicated limits. That is, find out whether they exist, and find
out, if possible, what they are.

1.

lim 2
noo

2.

Jim

2 + 311

3.
4.

6.

Jim

3n

n-oo n- +
Jim

(Try dividing the numerator and denominator by

---

n-Cf)

n-oo 2 ll

ll

(Try using one of the last theorems in this section.)

5.

+ n

.
sin (2n +
hm

n.)

1)

7.

lim

n->00

3
ll

ll

+ n2 +

.
cos (n
Jim

n->ro

7T

- 1)

n + 1

436

8.

9.

Infinite Series

Jim

n->OO

lim

n_,.oo

(
(

n
)

1 +

10.1

-l/n
)

[Hint:

(1 +

(1

Surely you know limx-o

1/y)v,

x)1fx.

Now find liffiy_oo

and apply the result to the problem in hand.]

10. lim,H"' Bn. where Bn is the perimeter of a regular 2n-gon circumscribed about a circle

1.

of radius

1.

13.

15.
17.

Jim Inn

n-oo

1/n

Jim In

n-oo

"dx

n-oo

i
f

Jim

.L

Jim

n-+CO

Jim

n---+CO z=l

19.

Jim

21.

14.

n---+CO

(n2)

Jim In (

Jim

n-c.o

)
ll

"dx
3
1 x

(Investigate existence only. A geometric interpretation is useful.)

"dx

n--oo J1

Jim

n-oo

dx
312

"

Jim In

12.

16.

X2

18.

20.

(You need not prove that your answer to this one is right.)

n--i-oo i=1

X
1

(Investigate existence only.)

:a
l

n
1
L .31
n-+OO 1.=l l 2

lim

Jim

Jim

(Investigate existence only.)

22.

n->OO t=l l

23.

n--co i=l

sin

24.

Jim

cos

-:

(Investigate existence only.)

( hr)n n
I ( 2n) n
:

hr

noo i=l

(Geometric interpretation?)
n

!!__

25.

I)

_L --- (
n-oo i=l I + (i/n) ;
Jim

10.2

26.

Infinite Series.

lim
n-co

"

i=l

1 + (i/211)

28.

l n
lim - L e-i/n
n-co n i=l

30.

Jim n2 sin 2
n
n--+CO

32.

Jim
n-.oo

34.

Jim
n--+CO

(1)-

27.

31.

437

L eifn

n-co n i=l

-n1
( 1n

Jim

sin

Jim 11 1
n-+CO
Jim

33.

tan

Comparison Tests

1 n

Jim -

29.

Convergence.

sec

cos

[I - ]
In

1.=l l

(In fact, this limit exists; if you can find a geometric interpretation of the problem, you can prove it. The limit is known as Euler's constant. Nobody knows whether it is rational.)


n

35.

Jim
n->OO

10.2

i=l

36.

(Investigate existence only.)

(21 + 1)

n
1
um I
.2
)2
i=l
(31
+ l
n-.oo

10.2 INFINITE SERIES. CONVERGENCE. COMPARISON TESTS

By an infinite series we mean an indicated sum of the form
\[ \sum_{i=1}^{\infty} a_i = a_1 + a_2 + \cdots + a_n + \cdots. \]
We say "an indicated sum" because in many cases there is no such thing as the sum of infinitely many terms. For example, the series
\[ 1 + 1 + 1 + \cdots \quad \text{(to infinity)} \]
has no sum; and neither does the series
\[ 1 - 1 + 1 - 1 + \cdots \quad \text{(to infinity)}. \]

In many cases, however, the "sum of infinitely many terms" can be defined, by a passage to a limit, in the following way. Given the series $\sum_{i=1}^{\infty} a_i$, for each $n$, let
\[ A_n = \sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n. \]
Then $A_n$ is called the $n$th partial sum of the series. If
\[ \lim_{n\to\infty} A_n = A, \]
where $A$ is a (finite) number, then we say that the series is convergent and that $A$ is its sum. We also say that the series converges to $A$. If the sequence $A_1, A_2, \ldots$ has no limit, then we say that the series is divergent. If
\[ \lim_{n\to\infty} A_n = \infty, \]
then the series diverges to infinity; and if
\[ \lim_{n\to\infty} A_n = -\infty, \]
then the series diverges to minus infinity. We may write these statements briefly as
\[ \sum_{i=1}^{\infty} a_i = A, \qquad \sum_{i=1}^{\infty} a_i = \infty, \qquad \sum_{i=1}^{\infty} a_i = -\infty. \]

Probably the first example that you have seen of a convergent series is the geometric series
\[ 1 + r + r^2 + \cdots \qquad (0 < r < 1). \]
Here
\[ A_n = 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r} = \frac{1}{1 - r} - \frac{r^{n+1}}{1 - r}. \]
If we know that
\[ \lim_{n\to\infty} r^n = 0 \qquad (0 < r < 1), \tag{1} \]
then it follows that
\[ \lim_{n\to\infty} A_n = \frac{1}{1 - r} \qquad (0 < r < 1); \tag{2} \]
and this means that
\[ \sum_{i=0}^{\infty} r^i = \frac{1}{1 - r} \qquad (0 < r < 1). \tag{3} \]
There are many ways of proving (1). The following proof is the easiest. Since $0 < r < 1$, we have
\[ r^{n+1} < r^n \quad \text{for every } n. \]
Therefore the sequence $r, r^2, r^3, \ldots$ is decreasing. And it has a lower bound, namely 0. Therefore the sequence is convergent, to some limit $L$. Thus
\[ \lim_{n\to\infty} r^n = L, \quad \text{and} \quad \lim_{n\to\infty} r^{n+1} = L. \]
(Why? What happens to the limit of a sequence, if you omit the first term?) Therefore
\[ L = \lim_{n\to\infty} r^{n+1} = r \lim_{n\to\infty} r^n = rL, \]
and so
\[ L = rL, \qquad (1 - r)L = 0. \]
Since $1 - r \ne 0$, it follows that $L = 0$. Therefore
\[ \lim_{n\to\infty} r^n = 0 \qquad (0 < r < 1). \]
In fact, the same conclusion holds for $-1 < r < 0$. We get this from the following observation:

Theorem 1. If $\lim_{n\to\infty} |a_n| = 0$, then $\lim_{n\to\infty} a_n = 0$, and conversely.

Proof? (If you rewrite these two statements, using the definitions of the statements $\lim_{n\to\infty} |a_n| = 0$ and $\lim_{n\to\infty} a_n = 0$, they hardly even look different.) Thus we get the following theorem.

Theorem 2. If $-1 < r < 1$, then
\[ \lim_{n\to\infty} r^n = 0. \]
Algebraically, the formula
\[ 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r} \]
holds for every $r \ne 1$. We therefore have a more general result for geometric series:

Theorem 3. If $-1 < r < 1$, then
\[ \sum_{i=0}^{\infty} r^i = \frac{1}{1 - r}. \]
This holds because $\lim_{n\to\infty} [-r^{n+1}/(1 - r)] = 0/(1 - r) = 0$. If the first term is any number $a$, rather than 1, then we have:

Theorem 4. If $-1 < r < 1$, then
\[ \sum_{i=0}^{\infty} a r^i = \frac{a}{1 - r}. \]

The following theorem often makes it easy to see that a series diverges.

Theorem 5. If $\sum_{i=1}^{\infty} a_i$ is convergent, then $\lim_{n\to\infty} a_n = 0$.

Proof. For each $n$, let
\[ A_n = \sum_{i=1}^{n} a_i. \]
Let $\lim_{n\to\infty} A_n = A$. Then $\lim_{n\to\infty} A_{n-1} = A$, where $n > 1$. Therefore
\[ \lim_{n\to\infty} (A_n - A_{n-1}) = A - A = 0. \]
But $A_n - A_{n-1} = a_n$. Therefore $\lim_{n\to\infty} a_n = 0$, which was to be proved.
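The partial sums of a geometric series are easy to compute term by term and to compare with the closed forms. This is a minimal sketch, not part of the original text; the function name `geometric_partial_sum` is an illustrative assumption.

```python
def geometric_partial_sum(a, r, n):
    """Return a + a*r + a*r**2 + ... + a*r**n, added one term at a time."""
    total, term = 0.0, a
    for _ in range(n + 1):
        total += term
        term *= r  # next term of the geometric series
    return total
```

For -1 < r < 1 the value should agree with a(1 - r^(n+1))/(1 - r) for each n, and it should approach a/(1 - r) as n grows; for example, with a = 3 and r = -1/2 the partial sums settle near 2.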

For example, the geometric series $\sum_{i=0}^{\infty} a r^i$ is divergent for $a \ne 0$ and $|r| \ge 1$. In this case, $|a_n| = |a|\,|r|^n \ge |a|$, and so $a_n$ does not approach 0.

Warning. The converse of Theorem 5 is false. That is, the $n$th term of a series may approach 0, and the series may still diverge. The simplest example of this is the series
\[ 1 + \tfrac12 + \tfrac12 + \tfrac13 + \tfrac13 + \tfrac13 + \tfrac14 + \tfrac14 + \tfrac14 + \tfrac14 + \cdots. \]
The next five terms are each equal to $\tfrac15$; and so on. Here $a_n \to 0$, but the series diverges to infinity.

A more natural example of the same phenomenon is the harmonic series
\[ \sum_{i=1}^{\infty} \frac{1}{i} = 1 + \frac12 + \frac13 + \cdots + \frac1n + \cdots. \]

In fact, this diverges. The easiest way to see this is to draw a picture:

[Figure: the graph of $y = 1/x$ with circumscribed rectangles of width 1 over the interval from 1 to $n + 1$.]

For each $n$, the area under the graph from 1 to $n + 1$ is less than the total area of the circumscribed rectangles. Therefore
\[ A_n = 1 + \frac12 + \frac13 + \cdots + \frac1n > \int_1^{n+1} \frac{dx}{x}. \]
But this integral is $\ln (n + 1)$; and
\[ \lim_{n\to\infty} \ln (n + 1) = \infty. \]
Therefore the partial sums $A_n$ form an unbounded sequence, and the series must diverge to infinity. Briefly:

Theorem 6. $\sum_{i=1}^{\infty} (1/i) = \infty$.
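The inequality between the harmonic partial sums and ln(n + 1) can be checked directly for particular values of n. This is an illustrative sketch, not part of the original text; the function name `harmonic` is an assumption.

```python
import math

def harmonic(n):
    """Partial sum A_n = 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1 / i for i in range(1, n + 1))

def exceeds_log_bound(n):
    """True when A_n > ln(n + 1), the comparison used in the divergence argument."""
    return harmonic(n) > math.log(n + 1)
```

Because ln(n + 1) grows without bound, any sequence that stays above it must be unbounded too, although the growth is very slow.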

The same sort of comparison scheme can be used for other series, to show that they converge. Consider, for example,
\[ \sum_{i=1}^{\infty} \frac{1}{i^2}. \]
Here $a_i = 1/i^2$, and so $\lim_{i\to\infty} a_i = 0$. This does not, in itself, show that the series converges. But the algebraic pattern suggests that the series is related to the improper integral
\[ \int_1^{\infty} \frac{dx}{x^2} = \lim_{a\to\infty} \int_1^{a} \frac{dx}{x^2} = \lim_{a\to\infty} \left[-\frac{1}{x}\right]_1^a = \lim_{a\to\infty} \left(-\frac{1}{a} + 1\right) = 1. \]
Since $1/x^2 > 0$ for every $x$, the integral approaches its limit from below, and
\[ \int_1^{n} \frac{dx}{x^2} < 1 \quad \text{for every } n. \]
And since the function $1/x^2$ is decreasing,
\[ \frac{1}{n^2} < \int_{n-1}^{n} \frac{dx}{x^2} \qquad (n > 1). \]
(Here the area of the rectangle is $1/n^2$.) Therefore
\[ \frac{1}{2^2} < \int_1^2 \frac{dx}{x^2}, \qquad \frac{1}{3^2} < \int_2^3 \frac{dx}{x^2}, \]
and
\[ A_n = \sum_{i=1}^{n} \frac{1}{i^2} < 1 + \int_1^n \frac{dx}{x^2} < 2. \]
Obviously the sequence $A_1, A_2, \ldots$ of partial sums is increasing; and we have just seen that it is bounded above. Therefore:

Theorem 7. $\sum_{i=1}^{\infty} (1/i^2)$ is convergent.

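Some of the ideas that we have been using to get these results are useful in so many connections that they are worth recording as theorems.

Theorem 8 (The comparison theorem). Let $\sum_{i=1}^{\infty} a_i$ and $\sum_{i=1}^{\infty} b_i$ be series, with
\[ 0 \le a_i \le b_i \quad \text{for each } i. \]
Then (1) if $\sum_{i=1}^{\infty} b_i$ is convergent, then so also is $\sum_{i=1}^{\infty} a_i$; and (2) if $\sum_{i=1}^{\infty} a_i$ is divergent, then so also is $\sum_{i=1}^{\infty} b_i$.

Proof. For each $n$, let
\[ A_n = \sum_{i=1}^{n} a_i, \qquad B_n = \sum_{i=1}^{n} b_i. \]
Then
\[ A_n \le B_n. \]
(Why?) And each of the sequences $A_1, A_2, \ldots$ and $B_1, B_2, \ldots$ is increasing. An increasing sequence is convergent if it is bounded, and conversely. We can therefore prove (1) in the following steps:
\[ \sum_{i=1}^{\infty} b_i \text{ is convergent} \ \Rightarrow\ B_1, B_2, \ldots \text{ is convergent} \ \Rightarrow\ B_1, B_2, \ldots \text{ is bounded} \]
\[ \Rightarrow\ A_1, A_2, \ldots \text{ is bounded} \ \Rightarrow\ A_1, A_2, \ldots \text{ is convergent} \ \Rightarrow\ \sum_{i=1}^{\infty} a_i \text{ is convergent.} \]
(Reason for each of these implication signs?) Similarly, we can prove (2) in the following steps:
\[ \sum_{i=1}^{\infty} a_i \text{ is divergent} \ \Rightarrow\ A_1, A_2, \ldots \text{ is divergent} \ \Rightarrow\ A_1, A_2, \ldots \text{ is unbounded} \]
\[ \Rightarrow\ B_1, B_2, \ldots \text{ is unbounded} \ \Rightarrow\ B_1, B_2, \ldots \text{ is divergent} \ \Rightarrow\ \sum_{i=1}^{\infty} b_i \text{ is divergent.} \]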

The comparison theorem gives us easy tests for some series. Consider for example,
\[ \sum_{i=0}^{\infty} \frac{1}{i!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots. \]
Here
\[ n! = n(n - 1) \cdots 1, \]
and $0! = 1$, by definition. For each $i$, let
\[ a_i = \frac{1}{i!}, \qquad b_i = \left(\frac12\right)^{i-1}. \]
Then
\[ a_i \le b_i \quad \text{for each } i; \]
\[ \frac{1}{0!} < \left(\frac12\right)^{-1} \quad (i = 0), \qquad \frac{1}{1!} = \left(\frac12\right)^{0} \quad (i = 1), \qquad \ldots \]
and thereafter the strict inequality holds, with $1/n! < (1/2)^{n-1}$ for $n > 2$. Therefore our series is term by term less than the geometric series
\[ \sum_{i=0}^{\infty} \left(\frac12\right)^{i-1}, \]
which is known to converge. Therefore:

Theorem 9. $\sum_{i=0}^{\infty} (1/i!)$ is convergent.

In fact,
\[ \sum_{i=0}^{\infty} \frac{1}{i!} = e = \lim_{x\to 0} (1 + x)^{1/x}. \]
But we won't be able to prove this until we have developed the theory much further. The situation here is peculiar: the easiest way to get this special result is first to show that
\[ e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}, \]
and then to set $x = 1$. (You have seen a situation like this before. The easiest way to find $\int_0^1 x^4\,dx$ is first to calculate the function $\int_0^x t^4\,dt$, and then to set $x = 1$.)

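Consider next
\[ \sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}. \]
Since
\[ \frac{1}{\sqrt{n}} \ge \frac{1}{n} \quad \text{for every } n, \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{1}{n} = \infty, \]
it follows that the given series diverges:
\[ \sum_{n=1}^{\infty} \frac{1}{\sqrt{n}} = \infty. \]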

While the comparison theorem tells us, under some conditions, that a series converges, it never tells us what the sum is. But such partial information may be useful. In fact, some of the most important uses of series are in cases where a number (or a function) can best be described by a series; in such cases, we use $\sum_{i=1}^{n} a_i$ (for some large $n$) to get an approximation of $\sum_{i=1}^{\infty} a_i$. For example, the approximation
\[ e \approx \sum_{i=0}^{n} \frac{1}{i!} \]
is excellent, even for fairly small values of $n$; it gives by far the best way of computing $e$; and in fact, the series approaches its infinite sum so fast that $e$ is much easier to compute than $\sqrt{2}$.
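The speed of this approximation is easy to observe. The following sketch is not part of the original text; the function name `e_partial_sum` is an illustrative assumption, and each term is obtained from the previous one by a single division rather than by recomputing the factorial.

```python
import math

def e_partial_sum(n):
    """Return 1/0! + 1/1! + ... + 1/n!, building each term from the previous one."""
    total, term = 0.0, 1.0  # term starts as 1/0!
    for i in range(n + 1):
        total += term
        term /= (i + 1)  # 1/(i+1)! from 1/i!
    return total
```

With n = 10, that is, eleven terms, the partial sum should already agree with `math.e` to about seven decimal places.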


Therefore, when you are asked to show that a series has a sum, without finding
out what the sum is, you should not consider that the problem is artificial.
PROBLEM SET 10.2

Find out which of the following series are convergent. If the series is geometric, calculate
the sum.
I.

oo

2 Vi:
i=l ----:

oo

3.
6.

1
4.
i j3/2 + 2
oo

00

.2 (-l)i7T-i

i=l

10.

00

cos3

(2i)
j2

oo

13.
16.
19

14.

il j l-1

2 l 1n I .
i=2

17.

.,

i=O
00

25.
2 8.

(i!)3

---

2
i=l i
.,

i=l

(i + l)(i + 2)
+ 1

20

i l Vi + 1
2i
i=l

sin2

0 9
i j.
1

1
i=2 l:a-n

18.

_2

z
i=2 1-:--1n

co

1
21.
iO (i !)2
co
1
24. 2
i=2 i(i - 1)

2 i(i+1)
i=l -
00

.2 -.-
i=2 1 n
00

2
.
i=2 z:---1n2 z
.2

ii'
00

15.

(2i - 1)
j2

i =l
00

12.

oo

23.

9.

i
i=l j

00

i2 ln2 i
(i ! - 1)

22.

L ( -2)ie2i

i=l

00

00

11.

.2
= --

00

00

oo

8.

)
i(
00

i j 3 /2

26. 2
(i - l)(i)(i + 2)
i=2 ---

27.

21_

30.

i_
i -._
+ 1

i=l

i:

i=l j2 - 1

31. If you think of Theorem 3 backwards, it says that
\[ \frac{1}{1 - r} = 1 + r + r^2 + \cdots. \]
That is, $1/(1 - r)$ can be expressed as the sum of an infinite series. Express $1/(1 + x)$ as the sum of an infinite series. For what numbers $x$ does your series converge?

32. Express $1/(1 + x^2)$ as an infinite series. For what numbers $x$ does the series converge?

33. Same question, for $1/(1 + x^4)$.

*34. Suppose that $\sum_{i=0}^{\infty} a_i x^i$ converges for every $x$. The series then defines a function
\[ f(x) = \sum_{i=0}^{\infty} a_i x^i. \]
It will turn out that functions which can be defined in this way are always differentiable, and that their derivatives can be calculated by differentiating the series a term at a time. That is,
\[ f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1}. \]
(Don't try to prove this; you haven't got a chance.) Granted that all this is true, what must the $a_i$'s be, if $f(0) = 1$ and $f'(x) = f(x)$ for every $x$? Comment on your result.

00

.3
:Li=l I + 1

35.
*37.

00

For which numbers

ct.

is the series

2i2

1 I1
i=l
+

36.

:L:1 (l/n") convergent?

*38. Prove the following.

Theorem A (The Integral Test). Let $f$ be a positive decreasing continuous function, on the interval $[1, \infty)$. If
\[ \int_1^{\infty} f(x)\,dx < \infty, \tag{1} \]
then
\[ \sum_{i=1}^{\infty} f(i) < \infty, \tag{2} \]
and conversely.
10.3 ABSOLUTE CONVERGENCE. ALTERNATING SERIES

Given a series $\sum_{i=0}^{\infty} a_i$ (in which the terms may be positive, negative, or zero), we can form a new series $\sum_{i=0}^{\infty} |a_i|$ by taking the absolute value $|a_i|$ of each term $a_i$. For example, if
\[ \sum_{i=0}^{\infty} a_i = \sum_{i=0}^{\infty} (-1)^i r^i = 1 - r + r^2 - \cdots, \]
then
\[ \sum_{i=0}^{\infty} |a_i| = \sum_{i=0}^{\infty} |r|^i = 1 + |r| + |r|^2 + \cdots. \]
Given that $\sum a_i$ converges, it does not follow that $\sum |a_i|$ converges. For example, the series
\[ \sum a_i = 1 - 1 + \tfrac12 - \tfrac12 + \tfrac13 - \tfrac13 + \cdots \]
is convergent, but the series
\[ \sum |a_i| = 1 + 1 + \tfrac12 + \tfrac12 + \tfrac13 + \tfrac13 + \cdots \]
is not, because the harmonic series is not. The same sort of thing happens if we take absolute values in the series
\[ \sum_{i=1}^{\infty} a_i = \sum_{i=1}^{\infty} (-1)^{i+1}\frac{1}{i} = 1 - \frac12 + \frac13 - \frac14 + \cdots. \]
Here it is plain that $\sum |a_i|$ diverges, but it is not quite so easy to see that $\sum a_i$ is convergent. This is worth proving, however, because the idea used in the proof is useful in other connections.

Theorem 1. $\sum_{i=1}^{\infty} (-1)^{i+1}(1/i)$ is convergent.

Proof. Let
\[ A_n = \sum_{i=1}^{n} (-1)^{i+1}\frac{1}{i}. \]
If $n$ is even, with $n = 2k$, then
\[ A_n = A_{2k} = \left(1 - \frac12\right) + \left(\frac13 - \frac14\right) + \cdots + \left(\frac{1}{2k-1} - \frac{1}{2k}\right). \]
Therefore the sequence $A_2, A_4, A_6, \ldots, A_{2k}, \ldots$ is increasing. And it has an upper bound, because
\[ A_{2k} = 1 - \left(\frac12 - \frac13\right) - \cdots - \left(\frac{1}{2k-2} - \frac{1}{2k-1}\right) - \frac{1}{2k} < 1. \]
Therefore the sequence $A_2, A_4, \ldots$ has a limit. Let
\[ A = \lim_{k\to\infty} A_{2k}. \tag{1} \]
We shall show that $A$ is the sum of the series. First we observe that
\[ \lim_{k\to\infty} A_{2k+1} = \lim_{k\to\infty} [A_{2k} + a_{2k+1}] = \lim_{k\to\infty} A_{2k} + \lim_{k\to\infty} \frac{1}{2k+1}, \]
so that
\[ \lim_{k\to\infty} A_{2k+1} = A + 0 = A. \tag{2} \]
Thus we see that (1) as $n \to \infty$ through even values, $A_n \to A$ and (2) as $n \to \infty$ through odd values, $A_n \to A$. It follows that (3) $\lim_{n\to\infty} A_n = A$.

Proof? (You need to show that for every $\epsilon > 0$ there is an $N$ such that $|A_n - A| < \epsilon$ for every $n > N$. Given such an $\epsilon$, you know from (1) that there is an $N_1$ such that $|A_{2k} - A| < \epsilon$ for every $k > N_1$; and you know from (2) that there is an $N_2$ such that $|A_{2k+1} - A| < \epsilon$ for every $k > N_2$. How can $N$ be defined in terms of $N_1$ and $N_2$?)
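The behavior of the even and odd partial sums in this proof can be observed numerically. This is a small sketch, not part of the original text; the function name `alt_harmonic` is an illustrative assumption.

```python
def alt_harmonic(n):
    """Partial sum A_n = 1 - 1/2 + 1/3 - ... of the alternating harmonic series, through n terms."""
    return sum((-1) ** (i + 1) / i for i in range(1, n + 1))

def even_and_odd_sums(k_max):
    """Return the lists [A_2, A_4, ..., A_{2k_max}] and [A_3, A_5, ..., A_{2k_max+1}]."""
    evens = [alt_harmonic(2 * k) for k in range(1, k_max + 1)]
    odds = [alt_harmonic(2 * k + 1) for k in range(1, k_max + 1)]
    return evens, odds
```

The even-indexed sums should increase and stay below 1, and each odd-indexed sum should exceed its even neighbor by exactly 1/(2k + 1), a gap that shrinks to 0.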

The scheme that we used to prove Theorem 1 applies more generally. If you reexamine the proof, you will see that the only facts about the series that were used were the following:

1) The series is alternating. That is, successive terms $a_i$, $a_{i+1}$ have opposite signs.

2) $\lim_{n\to\infty} a_n = 0$.

3) The sequence $|a_1|, |a_2|, \ldots$ is decreasing.
We have therefore proved the following theorem.

Theorem 2 (The alternating series test). Given an alternating series $\sum_{i=1}^{\infty} a_i$. If $\lim_{n\to\infty} a_n = 0$, and the sequence $|a_1|, |a_2|, \ldots$ is decreasing, then the series converges.

(Strictly speaking, some of our formulas in the proof of Theorem 1 used the fact that the first term was positive instead of negative. If you know that the theorem holds in this case, how would you show that it also holds when $a_1 < 0$?)
We have seen that if $\sum a_i$ converges, it does not follow that $\sum |a_i|$ converges. But the reverse implication does hold:

Theorem 3. If $\sum_{i=1}^{\infty} |a_i|$ is convergent, then so also is $\sum_{i=1}^{\infty} a_i$.

If $\sum |a_i|$ is convergent, then $\sum a_i$ is said to be absolutely convergent. In this language, we can restate Theorem 3 as follows:

Every absolutely convergent series is convergent.


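To prove this, we break up each partial sum
\[ A_n = \sum_{i=1}^{n} a_i \]
into a sum of positive terms and a sum of negative terms. To do this, we let
\[ a_i^+ = \begin{cases} a_i & \text{if } a_i \ge 0, \\ 0 & \text{if } a_i < 0, \end{cases} \]
and let
\[ a_i^- = \begin{cases} a_i & \text{if } a_i \le 0, \\ 0 & \text{if } a_i > 0. \end{cases} \]
Let
\[ A_n^+ = \sum_{i=1}^{n} a_i^+, \qquad A_n^- = \sum_{i=1}^{n} a_i^-. \]
Then
\[ A_n = A_n^+ + A_n^- \quad \text{for each } n, \]
because
\[ a_i = a_i^+ + a_i^- \quad \text{for each } i. \]
Obviously $A_1^+, A_2^+, \ldots$ is an increasing sequence, and $A_1^-, A_2^-, \ldots$ is a decreasing sequence. Let
\[ k = \sum_{i=1}^{\infty} |a_i|. \]
Then
\[ \sum_{i=1}^{n} |a_i| \le k \quad \text{for every } n. \]
Also
\[ A_n^+ \le \sum_{i=1}^{n} |a_i|, \]
because $A_n^+$ is the sum of some (perhaps all) of the terms on the right-hand side. Therefore $A_1^+, A_2^+, \ldots$ is convergent. Let
\[ A^+ = \lim_{n\to\infty} A_n^+. \]
Similarly,
\[ A_n^- \ge \sum_{i=1}^{n} (-|a_i|), \]
because $A_n^-$ is the sum of some (perhaps all) of the terms on the right-hand side; and if you omit negative terms, the sum becomes larger. Therefore the decreasing sequence $A_1^-, A_2^-, \ldots$ is bounded below. Therefore it has a limit. Let $A^- = \lim_{n\to\infty} A_n^-$. Then
\[ \lim_{n\to\infty} A_n = \lim_{n\to\infty} (A_n^+ + A_n^-) = A^+ + A^-, \]
and $\sum_{i=1}^{\infty} a_i$ is convergent, which was to be proved. In fact, we can say a little more:

Theorem 4. If $\sum_{i=1}^{\infty} |a_i|$ is convergent, then
\[ \left| \sum_{i=1}^{\infty} a_i \right| \le \sum_{i=1}^{\infty} |a_i|. \]

Proof. We know that
\[ |a + b| \le |a| + |b|. \]
By induction it follows that
\[ \left| \sum_{i=1}^{n} a_i \right| \le \sum_{i=1}^{n} |a_i| \quad \text{for every } n. \]
Passing to the limit, we get the inequality that we wanted.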


PROBLEM SET 10.3

Find out which of the following series are alternating, which are convergent, and which
are absolutely convergent.
"'

i=l

4.
1.

10.
10.4

"'

1
<-1)i -:{I

i=l"'
i1
i=lI c-2))2i-:-i
(i=lf
l

l.

s.

i=l"' 2
1
I 2
i=lI c ni1
(
i=lI ::.!_2
-

i=l

11.

'
..

i=l

( -1)' -. + I

+ i

"'

s.

3.

9.

sin

cos

rri)
2
l

12.

10.4 ESTIMATES OF REMAINDERS

Given that a series converges, we often want to use a partial sum
\[ A_n = \sum_{i=1}^{n} a_i \]

00

:L

i=2 c

t)i ' (i

00

i=l

c-i)-i

1)

as an approximation of the limit
\[ A = \lim_{n\to\infty} A_n = \sum_{i=1}^{\infty} a_i. \]
The approximation $A_n \approx A$ is used in some of the most important applications, and in all applications that use computers. As in all approximation processes, we are better off if we can set a limit on the error. We shall now find ways to do this.
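Given that $\sum_{i=1}^{\infty} a_i$ converges to a sum $A$, let $R_n = A - A_n$. Then
\[ R_n = \sum_{i=n+1}^{\infty} a_i, \]
and obviously
\[ \lim_{n\to\infty} R_n = 0. \]
For alternating series, of the type treated in Theorem 2 of the preceding section, it is easy to get an estimate of $R_n$. Let the series be
\[ \sum_{i=1}^{\infty} a_i = \sum_{i=1}^{\infty} (-1)^{i+1} b_i = b_1 - b_2 + b_3 - \cdots, \]
where $b_i = |a_i|$. Then
\[ R_n = \sum_{i=n+1}^{\infty} a_i = \sum_{i=n+1}^{\infty} (-1)^{i+1} b_i. \]
If $n$ is even, then
\[ R_n = b_{n+1} - b_{n+2} + b_{n+3} - \cdots = (b_{n+1} - b_{n+2}) + (b_{n+3} - b_{n+4}) + \cdots \ge 0. \]
But we can also write
\[ R_n = b_{n+1} - (b_{n+2} - b_{n+3}) - (b_{n+4} - b_{n+5}) - \cdots \le b_{n+1}. \]
Therefore

1) $0 \le R_n \le b_{n+1}$ when $n$ is even.

If $n$ is odd, then
\[ R_n = -b_{n+1} + b_{n+2} - b_{n+3} + \cdots = -(b_{n+1} - b_{n+2}) - (b_{n+3} - b_{n+4}) - \cdots \le 0; \]
\[ R_n = -b_{n+1} + (b_{n+2} - b_{n+3}) + (b_{n+4} - b_{n+5}) + \cdots \ge -b_{n+1}. \]
Thus

2) $-b_{n+1} \le R_n \le 0$, when $n$ is odd.

Therefore
\[ |R_n| \le b_{n+1} \quad \text{for every } n. \]
Since $b_{n+1} = |a_{n+1}|$, we have proved the following theorem.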

Theorem 1. Given $\sum_{i=1}^{\infty} a_i$. If (1) the series is alternating, (2) $\lim_{n\to\infty} a_n = 0$, and (3) the sequence $|a_1|, |a_2|, \ldots$ is decreasing, then
\[ |R_n| \le |a_{n+1}| \quad \text{for every } n. \]

That is, when you stop after a finite number of terms, the error is numerically no larger than the first term that you omit. For example, take

$$\sum_{i=1}^{\infty} (-1)^{i+1} \frac{1}{i^2} = 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots.$$

By the alternating series test, this series converges. Let A be its sum. Then

$$A \approx 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots + \frac{1}{9^2},$$

and the error in the approximation is at most $1/10^2 = 0.01$. This series does not converge very fast. Next consider

$$\sum_{i=0}^{\infty} (-1)^i \frac{1}{i!} = 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots.$$

This series converges to a sum A. (It will turn out that $A = 1/e$.) We have

$$A \approx 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{1}{10!},$$

and the error is less than 1/11!. This series converges very rapidly:

$$11! = 39,916,800,$$

and

$$\frac{1}{11!} \approx 2.5052 \times 10^{-8} = 0.000000025052.$$
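The error bound of Theorem 1 can be checked numerically for the first example. The following is only a sketch: the function name `alternating_partial_sum` is made up for illustration, and the closed form $\pi^2/12$ for the sum is an outside fact that the text does not derive.

```python
import math

def alternating_partial_sum(n):
    """Return A_n = sum_{i=1}^{n} (-1)^(i+1) / i^2."""
    return sum((-1) ** (i + 1) / i**2 for i in range(1, n + 1))

# Closed form of the full sum (assumed here, not proved in the text).
A = math.pi**2 / 12

for n in (3, 9, 50):
    remainder = A - alternating_partial_sum(n)   # R_n = A - A_n
    bound = 1 / (n + 1) ** 2                     # first omitted term, |a_{n+1}|
    print(n, abs(remainder), bound)
```

For each n printed, the absolute remainder should come out no larger than the first omitted term, as the theorem asserts.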

If you reexamine the proof of Theorem 1, you will see that the method that we
used to get an estimate of the error was very much like the method that we used to
establish convergence in the first place, in the proof of the alternating series test.
This happens most of the time: that is, a proof of convergence usually gives an
estimate of $R_n$. Consider, for example,

$$\sum_{i=1}^{\infty} \frac{1}{i^2}.$$

We let

$$A_n = \sum_{i=1}^{n} \frac{1}{i^2},$$

and we observe that the sequence $A_1, A_2, \ldots$ is increasing. To show that it is bounded above, we draw a picture and observe that

$$A_n = \sum_{i=1}^{n} \frac{1}{i^2} < 1 + \int_1^n \frac{dx}{x^2}.$$

The same sort of reasoning tells us that

$$R_n = \sum_{i=n+1}^{\infty} \frac{1}{i^2} < \int_n^{\infty} \frac{dx}{x^2}.$$

Since

$$\int_n^{\infty} \frac{dx}{x^2} = \frac{1}{n},$$

we conclude that

$$R_n < \frac{1}{n}$$

for every n.

This is nowhere nearly so small as the estimate of error for the corresponding alternating series. In fact, the positive series $\sum (1/i^2)$ converges very slowly.
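A quick numerical check of the bound $R_n < 1/n$ may make the slow convergence concrete. This is a sketch under one outside assumption: the value $\pi^2/6$ for the full sum, which is not established in this section. The name `remainder` is hypothetical.

```python
import math

def remainder(n):
    """R_n = (sum of 1/i^2 over all i) - A_n, using the assumed sum pi^2/6."""
    partial = sum(1 / i**2 for i in range(1, n + 1))
    return math.pi**2 / 6 - partial

for n in (10, 100, 1000):
    # The remainder shrinks only about as fast as 1/n.
    print(n, remainder(n), 1 / n)
```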

Similarly, Theorem 4 of Section 10.3 gives an estimate of the error for series

which are absolutely convergent.


Theorem 2. Suppose that $\sum |a_i|$ is convergent. Let

$$R_n = \sum_{i=n+1}^{\infty} a_i.$$

Then

$$|R_n| \le \sum_{i=n+1}^{\infty} |a_i|.$$

That is, the error in $\sum a_i$ is numerically no greater than the error in $\sum |a_i|$. To prove this, we apply Theorem 4 to the series

$$\sum_{i=n+1}^{\infty} a_i.$$
If we use the comparison theorem of Section 10.2, to establish the convergence of a positive series, then any estimate of the remainder of the larger series automatically is an estimate of the remainder of the smaller one. For example, we have found that

$$\sum_{i=1}^{\infty} \frac{1}{i^2} < \infty,$$

with $R_n < 1/n$ for every n. Since $0 < 1/(i^2 + 1) < 1/i^2$ for every i, it follows by the comparison theorem that

$$\sum_{i=1}^{\infty} \frac{1}{i^2 + 1} < \infty.$$

It also follows, for the remainder $R_n'$ in the new series, that $R_n' < R_n$, and so

$$R_n' < \frac{1}{n}.$$

This scheme always works, whenever we establish convergence by means of the comparison theorem.

PROBLEM SET 10.4

Each of the following series is convergent. In each case, get an estimate of the remainder
Rn, in the form \Rn\
I.

i; ( ym
i=l 3}

2.

00

3. I ( -l)i7T-i
i=l
00
sin2 (2i - 1)
5. L
i2
i=l
00 ( - l)i+l
7. I
i=l i-4
w

9.

L
i=l

( - l)i

i0.9

4.
6.

i: (- 4)
i=l
i; co 7Ti
l

i=l
00

il f3

8.
w

10.

L -:--12.
1 n 1
i=2


1 1.

co

12.

i O (i!)2

13.
15.

co

1 i(i

14.

+ 1)

co

16.

[a, b].

Let

( --r

co
2
i=l
co

i 2 i2(1 + i)
co

fv fz,

453

i l i(i + l)(i + 2)

*17. Let


I c -o-i
i=

..
f

. be a sequence of continuous functions defined on the same interval


be a function such that
Jim

n-co

fn(x)

f(x),

for each x on [a, b]. Questions: (1) Does it follow that/ is continuous? (2) If f is known
to be continuous, does it follow that
(?)Jim

n--+co

* 18. Consider

order

19.

10.5

i
I!1 ( -l) +l (l/i).

Jb fn(x) dx Jb f(x) dx?


=

Show that by writing the terms of this series in a different


(using each term once and only once) you can get a series La; whose sum is 10.

Now reexamine your solutions of Problems 1 through 16. If you used any method other
than Theorem 1, in estimating the remainder in an alternating series, try using Theorem
1, and compare the new estimate with the old one. (The alternating series test usually
gives a good estimate, in the cases where it applies at all.)
10.5 TERMWISE INTEGRATION OF SERIES. POWER SERIES FOR Tan^{-1} AND ln

A power series is a series of the form

$$\sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + a_2 x^2 + \cdots.$$

(Here, as a matter of convenience, we are defining $x^0 = 1$, so that $a_0 x^0 = a_0$ for every x, including x = 0.)

Thus every geometric series is a power series; writing x for r in the old formula, we get

$$\sum_{i=0}^{\infty} x^i = \frac{1}{1 - x} \qquad (-1 < x < 1).$$

If a given series is convergent, for every x on an open interval $(-r, r)$, then the series defines a function f, on the same interval, and we write

$$f(x) = \sum_{i=0}^{\infty} a_i x^i \qquad (-r < x < r).$$

The following theorem is fundamental:

Theorem A. Given

$$f(x) = \sum_{i=0}^{\infty} a_i x^i \qquad (-r < x < r).$$

Then f is continuous and differentiable on $(-r, r)$, and the derivative of the sum is the sum of the derivatives. That is,

$$f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} \qquad (-r < x < r).$$

The same idea applies to the integral.


Theorem B. Given

$$f(x) = \sum_{i=0}^{\infty} a_i x^i \qquad (-r < x < r).$$

Then the integral of the sum is the sum of the integrals. That is,

$$\int_0^x f(t)\,dt = \sum_{i=0}^{\infty} \int_0^x a_i t^i\,dt = \sum_{i=0}^{\infty} \frac{a_i}{i+1} x^{i+1}.$$

As you might expect, the proofs are hard; they will be postponed until the end of
this chapter. But the theorems are easy to apply, and Theorem B gives the best method
of finding series for many functions. The method is as follows.
We know that

$$1 + x + x^2 + \cdots + x^n + \cdots = \frac{1}{1 - x} \qquad (-1 < x < 1).$$

Writing this backwards, we can express the function $1/(1 - x)$ as a power series:

$$\frac{1}{1 - x} = 1 + x + x^2 + \cdots \qquad (-1 < x < 1).$$

Replacing x by -x, we get

$$\frac{1}{1 + x} = 1 - x + x^2 - \cdots + (-1)^n x^n + \cdots \qquad (-1 < x < 1);$$

and then, replacing x by $x^2$, we get

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots + (-1)^n x^{2n} + \cdots \qquad (-1 < x < 1).$$

Theorem B says that the series on the right can be integrated a term at a time. Thus

$$\int_0^x \frac{dt}{1 + t^2} = \int_0^x dt - \int_0^x t^2\,dt + \cdots + (-1)^n \int_0^x t^{2n}\,dt + \cdots \qquad (-1 < x < 1),$$

and this gives

$$\int_0^x \frac{dt}{1 + t^2} = x - \frac{x^3}{3} + \cdots + (-1)^n \frac{x^{2n+1}}{2n+1} + \cdots = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{2i+1} \qquad (-1 < x < 1).$$

The integral on the left is equal to $\mathrm{Tan}^{-1} x$. Thus we have:

Theorem 1.

$$\mathrm{Tan}^{-1} x = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{2i+1} \qquad (-1 < x < 1).$$

Granted that Theorem B is true, there is no need to test the convergence of the series on the right; Theorem B tells us not only that the series has a sum, but also that its sum is $\mathrm{Tan}^{-1} x$.

Note that the series includes only terms of odd degree. This could have been predicted, because $\mathrm{Tan}^{-1}$ is an odd function, with $\mathrm{Tan}^{-1}(-x) = -\mathrm{Tan}^{-1} x$ for every x.
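The series for the inverse tangent is easy to try out numerically. The sketch below compares a partial sum with the library function `math.atan`; the helper name `arctan_series` is hypothetical, and the error check relies on the alternating-series estimate of Section 10.4 (valid here because the terms decrease for 0 < x < 1).

```python
import math

def arctan_series(x, terms):
    """Partial sum of sum_{i>=0} (-1)^i x^(2i+1) / (2i+1), intended for |x| < 1."""
    return sum((-1) ** i * x ** (2 * i + 1) / (2 * i + 1) for i in range(terms))

x = 0.5
approx = arctan_series(x, 10)
first_omitted = x**21 / 21          # term with i = 10
print(approx, math.atan(x), first_omitted)
```

Only odd powers of x appear, so replacing x by -x simply changes the sign of every term.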

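The same method can be used to get a series for the natural logarithm.

Theorem 2.

$$\ln (1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i} \qquad (-1 < x < 1).$$

Proof. We know that

$$\ln (1 + x) = \int_0^x \frac{dt}{1 + t},$$

and we know that

$$\frac{1}{1 + t} = 1 - t + t^2 - t^3 + \cdots + (-1)^i t^i + \cdots \qquad (-1 < t < 1).$$

By Theorem B,

$$\int_0^x \frac{dt}{1 + t} = \int_0^x dt - \int_0^x t\,dt + \int_0^x t^2\,dt - \cdots + (-1)^i \int_0^x t^i\,dt + \cdots$$

$$= x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^i \frac{x^{i+1}}{i+1} + \cdots = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i} \qquad (-1 < x < 1).$$

Note that this method cannot be used to calculate the integral from 0 to 2, because the series for $1/(1 + t)$ converges only for $|t| < 1$.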
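As a sanity check on Theorem 2, a partial sum of the logarithm series can be compared with the library value. This is a minimal sketch; `log1p_series` is an illustrative name, and `math.log1p(x)` is the standard-library function computing ln(1 + x).

```python
import math

def log1p_series(x, terms):
    """Partial sum of sum_{i>=1} (-1)^(i+1) x^i / i, intended for |x| < 1."""
    return sum((-1) ** (i + 1) * x**i / i for i in range(1, terms + 1))

x = 0.5
approx = log1p_series(x, 20)
first_omitted = x**21 / 21   # alternating-series error bound for 0 < x < 1
print(approx, math.log1p(x), first_omitted)
```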

The method that we have just been using can be applied so as to give answers,
in the form of series, for problems which up to now we could not have solved.
Consider

$$\int_0^{1/2} \frac{dx}{1 + x^4}.$$

In Chapter 6, this would have been an impossible problem. But now we can solve it, by expressing the integrand as a power series, and integrating a term at a time. In the series for $1/(1 + x)$, we replace x by $x^4$. This gives

$$\frac{1}{1 + x^4} = 1 - x^4 + x^8 - \cdots + (-1)^i x^{4i} + \cdots = \sum_{i=0}^{\infty} (-1)^i x^{4i}.$$

Therefore

$$\int_0^{1/2} \frac{dx}{1 + x^4} = \sum_{i=0}^{\infty} \int_0^{1/2} (-1)^i x^{4i}\,dx = \sum_{i=0}^{\infty} (-1)^i \frac{1}{4i + 1} \left(\frac{1}{2}\right)^{4i+1}$$

$$= \frac{1}{2} - \frac{1}{5}\left(\frac{1}{2}\right)^5 + \frac{1}{9}\left(\frac{1}{2}\right)^9 - \cdots = \frac{1}{2} - \frac{1}{5 \cdot 2^5} + \frac{1}{9 \cdot 2^9} - \cdots.$$

This is an alternating series; the terms diminish numerically, and approach 0 as $n \to \infty$. Therefore, if we use the first three terms as an approximation of the integral, the error is less than the fourth term. This is

$$E = \frac{1}{13 \cdot 2^{13}},$$

which is quite small: $2^{10} = 1024$, $2^{13} = 8192$, and so

$$E < 10^{-5}.$$
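The three-term estimate can be compared with an independent numerical integration. The sketch below uses a composite Simpson's rule as the reference; that rule is not part of this section and is brought in only for comparison, and the names `series_estimate` and `simpson` are hypothetical.

```python
def integrand(x):
    return 1 / (1 + x**4)

def series_estimate(terms):
    """Partial sum of sum_{i>=0} (-1)^i (1/2)^(4i+1) / (4i+1)."""
    return sum((-1) ** i * 0.5 ** (4 * i + 1) / (4 * i + 1) for i in range(terms))

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

three_terms = series_estimate(3)
reference = simpson(integrand, 0.0, 0.5)
fourth_term = 1 / (13 * 2**13)
print(three_terms, reference, abs(three_terms - reference), fourth_term)
```

The printed difference should be below the fourth term, which in turn is below 10^-5.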

PROBLEM SET 10.5


1. Calculate $\mathrm{Tan}^{-1}\, 0.02$ to six decimal places, and explain how you know that the error in your approximation is less than $5 \times 10^{-7}$.

2. Calculate $\int_0^{1/10} \frac{1}{1 + x^4}\,dx$ to five decimal places, and explain how you know that the error in your approximation is less than $5 \times 10^{-6}$.

3. Using the first term only, in the series for $\mathrm{Tan}^{-1}$, we get the approximation formula

$$\mathrm{Tan}^{-1} x \approx x \quad \text{for } x \approx 0.$$

How might you explain and justify this approximation formula if you knew nothing about infinite series?

4. Given

f(x)

1 +

a) Express f(x) as an infinite series.


b) Express

0.6
J
o

as an infinite series.

1 +

6
x

c) Calculate numerically the sum of the first three terms of your series.
d) Get (by any method) an estimate of the error in the resulting approximation of the
integral.
5.

Do the same four things, starting with f(x )


(Your infinite series will use powers of

same reasons.)

v,

1/(1 +

vi),

on the interval

[O, 0.49}

but the same methods will apply, for the


6. Do the same four things, starting with/(x)


7. Do the same, starting with f (x)

8. Do the same, starting with/(x)

457

1/(1 + x), on the interval [O, 0.2].

1
5 2 , on the interval [O, 0.25].
1 + x 1
1

3, on [O, 1/2].
1 + x

9. Express in the form of a series:

rk [ f--=:_J

Jo i=oi +

dx

(0 < k < 1).

10. Using the first term only, in the series for ln, we get the approximation formula

$$\ln (1 + x) \approx x \quad \text{for } x \approx 0.$$

How might you explain and justify this formula, if you knew nothing about infinite series?
11. Consider the function f(x) defined by the series

$$1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + \cdots.$$

a) Express $f'(x)$ as a series.
b) Express $\int_0^x f(t)\,dt$ as a series.
The results that you get ought to enable you to guess what the function is.
*12. For each n, let

(0 x 1).
a) Find limn-->oo f fn(X) dx.
b) For each x on [O, 1], let f(x)
c) Find

rl

)o

limn-->oo fn(x). Get a formula for the function f(x).

f(x) dx

J1
0

[limfn(x)] dx.
n-co

*13. Your answers in Problem 12 suggest that the functions fn behave rather peculiarly.
Investigate as follows:
a) For each n, let .Xn be the point at whichfn takes on its maximum value. Get a formula
for Xn , and find limn_,00 Xn .
fn(.Xn ). Get a formula for Ym and find liffin_,00 Yn
b) For each n, let Yn

c) Draw a sketch showing what the graph of fn looks like for n


1, n
2, and
n R:i oo. Your sketch will throw some light on the results that you got in Prob
lem 12.
=

10.6 THE RATIO TEST FOR ABSOLUTE CONVERGENCE. APPLICATIONS TO POWER SERIES

Consider a series $\sum a_i$, in which the terms may be positive or negative, but not equal to 0. For each i, let

$$r_i = \left| \frac{a_{i+1}}{a_i} \right|,$$

so that

$$|a_{i+1}| = |a_i|\, r_i.$$

An examination of the sequence $r_1, r_2, \ldots$ gives us a convergence test which works very quickly, in the cases where it applies.

Theorem 1 (The ratio test). If

$$\lim_{i\to\infty} r_i = r < 1,$$

then $\sum_{i=0}^{\infty} a_i$ is absolutely convergent.

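Proof. Let s be any number such that r < s < 1. Then there is an N such that

$$i \ge N \;\Rightarrow\; r_i < s.$$

(In the definition of $\lim_{i\to\infty} r_i$, take $\epsilon = s - r$, so that $r + \epsilon = s$.) It follows that

$$|a_{N+1}| = |a_N|\, r_N < |a_N|\, s,$$

$$|a_{N+2}| = |a_{N+1}|\, r_{N+1} < |a_N|\, s \cdot s = |a_N|\, s^2;$$

and in general, given

$$|a_{N+j}| < |a_N|\, s^j,$$

it follows that

$$|a_{N+j+1}| = |a_{N+j}|\, r_{N+j} < |a_N|\, s^{j+1}.$$

By induction,

$$|a_{N+j}| < |a_N|\, s^j \quad \text{for every } j \ge 1.$$

Therefore

$$\sum_{j=0}^{\infty} |a_{N+j}| \le \sum_{j=0}^{\infty} |a_N|\, s^j = |a_N| \sum_{j=0}^{\infty} s^j = |a_N| \frac{1}{1 - s} < \infty.$$

It follows that

$$\sum_{i=N}^{\infty} |a_i| < \infty,$$

and so

$$\sum_{i=0}^{\infty} |a_i| = \sum_{i=0}^{N-1} |a_i| + \sum_{i=N}^{\infty} |a_i| < \infty,$$

which was to be proved.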


What we are really using here is a comparison test between the series $\sum |a_i|$ and a geometric series; the comparison does not necessarily work for the first few terms, but it does start working after a certain point; and this is good enough to tell us what we want to know.

In Section 10.2, Theorem 9, we showed by a comparison test that $\sum_{i=0}^{\infty} (1/i!)$ is convergent. The ratio test gives this result very quickly. We have

$$a_i = \frac{1}{i!}, \qquad r_i = \frac{a_{i+1}}{a_i} = \frac{i!}{(i+1)!} = \frac{1}{i+1},$$

and so $\lim_{i\to\infty} r_i = 0$. It follows that the series converges.

There are simple cases in which a series converges, but in which convergence cannot be established by the ratio test. Consider

$$\sum_{i=1}^{\infty} \frac{1}{i^2},$$

which is known to converge. Here

$$r_i = \frac{a_{i+1}}{a_i} = \frac{i^2}{(i+1)^2}.$$

Therefore, while $r_i < 1$ for each i, we have

$$\lim_{i\to\infty} r_i = \lim_{i\to\infty} \frac{1}{[1 + (1/i)]^2} = 1,$$

and so the ratio test does not apply. And Theorem 1 cannot be generalized to take care of these cases, because if $r_i \to 1$ it may easily happen that the series diverges. This happens for

$$\sum_{i=1}^{\infty} \frac{1}{i} = \infty, \qquad r_i = \frac{i}{i+1} \to 1.$$

An even simpler case is

$$\sum_{i=0}^{\infty} (-1)^i = 1 - 1 + 1 - 1 + \cdots.$$

Here

$$r_i = |(-1)^{i+1}/(-1)^i| = 1 \quad \text{for every } i,$$

so that $r_i \to 1$ automatically, but the series diverges.

On the basis of the ratio test, we can derive a more general result for power series:

Theorem 2. Given the series

$$\sum_{i=0}^{\infty} a_i x^i,$$

where $a_i \ne 0$ for every i. Suppose that

$$\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = L.$$

If L = 0, then the series is absolutely convergent for every x. If L > 0, then the series is absolutely convergent for $|x| < 1/L$.

Proof. For x = 0, there is nothing to prove. For each $x \ne 0$, we have

$$r_i = \left| \frac{a_{i+1} x^{i+1}}{a_i x^i} \right| = |x| \left| \frac{a_{i+1}}{a_i} \right|.$$

Therefore

$$\lim_{i\to\infty} r_i = |x|\, L.$$

If L = 0, then $r_i \to 0$, no matter what x may be. If L > 0, then $\lim_{i\to\infty} r_i < 1$ whenever $|x| < 1/L$. In either case, the series converges absolutely.

By the first half of Theorem 2, we conclude that

converges absolutely for every x. By the second half of Theorem 2, we see that

converges absolutely for lxl < 1.

In each of these cases, the sum of the coefficients

forms a convergent series. But the theorem also applies in cases where the sum of the
coefficients diverges.

Consider

$$\sum_{i=1}^{\infty} i \pi^i x^i.$$

Here

$$\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = \lim_{i\to\infty} \frac{(i + 1)\pi^{i+1}}{i \pi^i} = \pi.$$

Therefore the series converges absolutely whenever $|x| < 1/\pi$.


If the ratio $r_i$ approaches a limit which is greater than 1, then the series $\sum a_i$ always diverges. The reason is that in this case we have an N such that

$$i \ge N \;\Rightarrow\; r_i > 1.$$

Therefore $|a_{i+1}| > |a_i|$ for $i \ge N$, and so after a certain point the sequence $|a_1|, |a_2|, \ldots$ becomes an increasing sequence. Therefore $a_i$ cannot approach 0. This observation enables us to add something to the conclusion of Theorem 2.


Theorem 3. Given the series $\sum_{i=0}^{\infty} a_i x^i$, with

$$\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = L.$$

If L = 0, then the series converges absolutely for every x. If L > 0, then the series converges absolutely for $|x| < 1/L$ and diverges for $|x| > 1/L$.
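The limit L in Theorems 2 and 3 can be explored numerically by looking at the coefficient ratio for a large index. The sketch below is an illustration, not a proof: a ratio at one large i only suggests the limit. The function names are made up, and the two coefficient sequences are the ones used as examples in this section.

```python
import math
from fractions import Fraction

def ratio(coeff, i):
    """Return |a_{i+1} / a_i| for a coefficient function coeff(i)."""
    return abs(coeff(i + 1) / coeff(i))

def inv_factorial(i):
    # a_i = 1/i!, kept exact with Fraction
    return Fraction(1, math.factorial(i))

def i_pi_power(i):
    # a_i = i * pi^i, as in the series sum i * pi^i * x^i
    return i * math.pi**i

print(ratio(inv_factorial, 50))    # equals 1/(i+1), tending to L = 0
print(ratio(i_pi_power, 500))      # close to pi, so convergence for |x| < 1/pi
```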
This theorem can be adapted to take care of cases in which some terms of the series are equal to 0. For example,

$$\sum_{i=0}^{\infty} (-1)^i x^{2i}.$$

Setting $x^2 = y$, we get

$$\sum_{i=0}^{\infty} (-1)^i y^i,$$

which converges absolutely for $|y| < 1$ and diverges for $|y| > 1$. Therefore the given series converges absolutely for $|x| < 1$ and diverges for $|x| > 1$. Similarly,

$$\sum_{i=0}^{\infty} \frac{x^{2i+1}}{2^i} = x \sum_{i=0}^{\infty} \frac{x^{2i}}{2^i}.$$

Here

$$\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = \frac{1}{2}.$$

Therefore the series converges absolutely for $x^2 < 2$ and diverges for $x^2 > 2$.

Some more observations about Theorem 3 are in order.

1) The theorem applies only to the case in which $|a_{i+1}/a_i|$ approaches a limit. This usually happens for series which are describable by simple formulas. But for series in general it should be regarded as a remarkable accident. Suppose, for example, that we start with

$$\sum_{i=0}^{\infty} x^i = 1 + x + x^2 + \cdots.$$

Here $a_i = 1$ for every i, and so $r_i = 1$ for every i. We now divide $x^i$ by $i!$ for every even i. This gives

$$\sum_{i=0}^{\infty} b_i x^i = 1 + x + \frac{x^2}{2!} + x^3 + \frac{x^4}{4!} + \cdots.$$

The series still converges, for $|x| < 1$, but the ratio approaches no limit at all.

2) The theorem tells us that the series converges everywhere on the open interval $(-1/L, 1/L)$, but it tells us nothing about what happens at the endpoints of the interval. In fact, at the endpoints anything can happen. For example, $\sum_{i=1}^{\infty} (x^i/i^2)$ converges on $(-1, 1)$, and converges at both the endpoints. The series $\sum_{i=1}^{\infty} i x^i$ converges on the same interval, but converges at neither of the endpoints. The series $\sum_{i=1}^{\infty} (x^i/i)$ converges on $(-1, 1)$, and converges at $x = -1$, but diverges at $x = 1$. The series $\sum_{i=1}^{\infty} (-1)^i (x^i/i)$ converges on $(-1, 1)$, and converges at $x = 1$, but diverges at $x = -1$. For this reason, to tell where the series converges, we have to make separate tests at the endpoints.

3) Obviously every power series $\sum a_i x^i$ converges for x = 0; the sum is $a_0$. But sometimes 0 is the only value of x that gives convergence. Consider $\sum_{i=0}^{\infty} i!\, x^i$. For every $x \ne 0$, we have

$$r_i = |(i + 1)!\, x^{i+1} / i!\, x^i| = (i + 1)\, |x| \to \infty.$$

Therefore the series converges only for x = 0.

4) Finally, the results that we have been getting for power series suggest a conjecture. In every case that we have investigated, the domain of convergence of $\sum a_i x^i$ has turned out to be of one of the following types:

i) The entire interval $(-\infty, \infty)$.
ii) An open interval $(-a, a)$, plus, perhaps, one or both of the endpoints.
iii) The point 0 alone.

The question arises whether these are the only possibilities. For example, is there a series $\sum a_i x^i$ whose domain of convergence is an interval whose midpoint is not the origin? We shall see, as the theory develops, that the domain of convergence of $\sum a_i x^i$ is always a set of one of the forms

$$(-\infty, \infty), \quad (-a, a), \quad [-a, a), \quad (-a, a], \quad [-a, a], \quad \{0\}.$$

PROBLEM SET 10.6


For each of the following series, find the domain of convergence, remembering, of course,
to test the endpoints.
ro

ro

1.

i=1

ro

4.

_L ix2i
i=l
oo

7.
10.

13.

oo

5.

xi

3. _Li2x2i-1
i=l
x2i

xi

ro

6. .L
i=l VI

.L

i=l v'l
(-x)i

co

ro

; v2i + 1

8.

00

I -=
i= Vi - I

9.

11. L (3i)2x2i-1
i=l
x2 i
14. IC-l)i
(
2'I )'
t=O

.L (3i)x2i

i=l

x2i-1

ro

.L (-l)i+l ( . 21
i=l

ro

15.

17. _Li-ixi

i=l

x 4i

L ( - l)i (2 I')I

t=O

ro

00

16. _Lhi

i=l

12. L (3i)4x3i
i=l

oo

1)!

.L(3i)3xi
co

ro

00

19.

ro

2. _Li3xi

_Li2xi
i=l

18.

i=l

,L (Tan-1 i)xi
i=l

I (-l)i
i2 + 1

i=l
00

20. .L i(2x - l)i


i=l

(Does the answer to this one contradict Theorem

ei

oo

21. .L-:--- (x - 4)i

(Same query as for Problem

i=ll3
oo

(x - 2 )i

22. .L

i=l

(Same query as for Problem

Show that

24.

Prove the following theorem:

,L:1

Theorem. If

,L;':1 a;

then ,L:1 a;b;

20.)

20.)

(sin i)xi is absolutely convergent when

23.

3 ?)

Jxl

< 1.

is absolutely convergent, and b1b2, ... is a bounded sequence,

is absolutely convergent.

*25. Show that there are infinitely many integers i for which sin i > 1/2.
*26. Show that $\sum_{i=1}^{\infty} (\sin i) x^i$ is divergent when $|x| > 1$.
(The results of Problems 23 and 26 show that for this very irregular series, the domain of convergence is still of one of the types described by Theorem 3.)
*27. You may have noticed that the number 1 has come up very often as an endpoint of our domains of convergence. The following theorem helps to account for this:
Theorem. Let p(i) and q(i) be polynomials in i, of any degree, with q(i) never equal to 0. If $a_i = p(i)/q(i)$, then $\sum a_i x^i$ converges absolutely for $|x| < 1$, and diverges for $|x| > 1$.
Prove this theorem.
10.7 POWER SERIES FOR exp, sin, AND cos

Theorem A of Section 10.5 asserts that power series can be differentiated a term at a time. That is, if

$$f(x) = \sum_{i=0}^{\infty} a_i x^i \qquad (-r < x < r),$$

then

$$f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} \qquad (-r < x < r).$$

We shall use this to find a series for the exponential function. We start by assuming that $e^x$ can be expressed in some way as a power series, so that

$$f(x) = e^x = \sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + a_2 x^2 + \cdots + a_i x^i + \cdots$$

for some sequence of coefficients $a_0, a_1, \ldots$. On any open interval where this works, we have

$$f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} = a_1 + 2a_2 x + \cdots + i a_i x^{i-1} + (i + 1) a_{i+1} x^i + \cdots.$$

It must be true that $f'(x) = f(x)$, and $f(0) = 1$; and so we want to find a sequence of coefficients $a_0, a_1, a_2, \ldots$ which gives these results for the series. This is easy: we want

$$(i + 1) a_{i+1} = a_i, \quad \text{that is,} \quad a_{i+1} = \frac{a_i}{i + 1},$$

which gives $f'(x) = f(x)$; and we want $a_0 = 1$, which gives $f(0) = 1$. Thus

$$a_0 = 1, \quad a_1 = a_0/1 = 1, \quad a_2 = a_1/2 = \tfrac{1}{2}, \quad a_3 = a_2/3 = 1/(2 \cdot 3);$$

and, in general,

$$a_i = 1/i!.$$

This can be checked by induction. For i = 0, 1, 2, the formula $a_i = 1/i!$ holds true. And

$$a_i = \frac{1}{i!} \;\Rightarrow\; a_{i+1} = \frac{a_i}{i + 1} = \frac{1}{(i + 1)\, i!} = \frac{1}{(i + 1)!}.$$

This proceeding does not prove that

$$e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}, \qquad (1)$$

because we started off with an unproved assumption that $e^x$ had some power series expansion. But now that we know what series to examine, it is very easy to show that Eq. (1) holds.

By the ratio test, the series on the right-hand side converges for every x. It therefore defines a function g. Thus

$$g(x) = \sum_{i=0}^{\infty} \frac{x^i}{i!} \qquad (-\infty < x < \infty).$$

We chose the coefficients in such a way that $g' = g$ and $g(0) = 1$. We need to show that $g(x) = e^x$ for every x. For each x, let $\phi(x) = g(x)/e^x$. Then

$$\phi'(x) = \frac{e^x g'(x) - g(x) e^x}{e^{2x}} = \frac{1}{e^x} [g'(x) - g(x)] = 0.$$

Therefore $\phi$ is a constant, and $\phi(x) = \phi(0)$ for every x. But $\phi(0) = 1$. Therefore

$$g(x)/e^x = 1, \quad \text{and} \quad g(x) = e^x,$$

which was to be proved.

What makes this scheme work is the fact that the function $f(x) = e^x$ is completely described by the conditions $f' = f$, $f(0) = 1$; no other function satisfies these conditions. Thus we have

Theorem 1. $e^x = \sum_{i=0}^{\infty} (x^i/i!)$.

Setting x = 1 we get

Theorem 2. $e = \sum_{i=0}^{\infty} (1/i!)$.

This series converges so fast that some people enjoy using it to calculate $e \approx 2.7182818$, correct in the seventh decimal place.
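For readers who would like to try the calculation, here is a short sketch that sums the series for e using the recurrence $a_{i+1} = a_i/(i+1)$ rather than computing each factorial afresh. The name `e_partial_sum` is illustrative only.

```python
import math

def e_partial_sum(n):
    """Return sum_{i=0}^{n} 1/i!, building each term from the previous one."""
    term = 1.0      # a_0 = 1
    total = 0.0
    for i in range(n + 1):
        total += term
        term /= i + 1   # a_{i+1} = a_i / (i + 1)
    return total

for n in (5, 10, 15):
    print(n, e_partial_sum(n), abs(e_partial_sum(n) - math.e))
```

Stopping at 1/10! already leaves an error of only a few units in the eighth decimal place.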

We now want to get a series for the sine. As before, we start by assuming that our
problem has a solution, and then we try to find out what form the solution must take.
For $f(x) = \sin x$ we have

$$f'(x) = \cos x, \qquad f''(x) = -\sin x.$$

Therefore

$$f(0) = 0, \qquad f'(0) = 1,$$

and

$$f''(x) = -f(x) \quad \text{for every } x.$$

Thus if

$$\sin x = \sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + \cdots,$$

we must have $a_0 = 0$, and $a_1 = 1$. Now

$$f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1},$$

$$f''(x) = \sum_{i=2}^{\infty} i(i - 1) a_i x^{i-2} = 2a_2 + 3 \cdot 2 a_3 x + \cdots + i(i - 1) a_i x^{i-2} + (i + 1) i a_{i+1} x^{i-1} + (i + 2)(i + 1) a_{i+2} x^i + \cdots.$$

To get $f'' = -f$, we want

$$(i + 2)(i + 1) a_{i+2} = -a_i, \quad \text{or} \quad a_{i+2} = -\frac{a_i}{(i + 1)(i + 2)}.$$

Since $a_0 = 0$, it follows that every even-numbered coefficient $a_{2i}$ is also equal to 0.

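The odd-numbered coefficients are

$$a_1 = 1,$$

$$a_3 = -\frac{a_1}{(1 + 1)(1 + 2)} = -\frac{1}{2 \cdot 3} = -\frac{1}{3!},$$

$$a_5 = -\frac{a_3}{(3 + 1)(3 + 2)} = \frac{1}{3!} \cdot \frac{1}{4 \cdot 5} = \frac{1}{5!},$$

and in general

$$a_{2i+1} = \frac{(-1)^i}{(2i + 1)!}.$$

To check this by induction, we note that

$$a_{2i+1} = \frac{(-1)^i}{(2i + 1)!} \;\Rightarrow\; a_{(2i+1)+2} = -\frac{a_{2i+1}}{[(2i + 1) + 1][(2i + 1) + 2]} = (-1)^i (-1) \frac{1}{(2i + 1)!} \cdot \frac{1}{(2i + 2)(2i + 3)}$$

$$\Rightarrow\; a_{2(i+1)+1} = \frac{(-1)^{i+1}}{(2i + 3)!} = \frac{(-1)^{i+1}}{[2(i + 1) + 1]!}.$$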

Therefore, if there is a series for the sine, the series must have the form

$$g(x) = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{(2i + 1)!}.$$

We need to show that $g(x) = \sin x$ for every x. Let $h(x) = g'(x)$. Then we know that

1) $g' = h$,
2) $h' = -g$,
3) $g(0) = 0$,
4) $h(0) = 1$.

It ought to be true that

$$g(x) = \sin x, \qquad h(x) = \cos x.$$

If so, the function

$$\phi(x) = [g(x) - \sin x]^2 + [h(x) - \cos x]^2$$

must be equal to 0 for every x. And conversely, if $\phi(x) = 0$ for every x, it follows that $g(x) = \sin x$ and $h(x) = \cos x$. Now

$$\phi'(x) = 2[g(x) - \sin x][g'(x) - \cos x] + 2[h(x) - \cos x][h'(x) + \sin x]$$

$$= 2[g(x) - \sin x][h(x) - \cos x] + 2[h(x) - \cos x][-g(x) + \sin x] = 0$$

for every x. Therefore $\phi$ is a constant. But

$$\phi(0) = [0 - 0]^2 + [1 - 1]^2 = 0.$$

Therefore $\phi(x) = 0$ for every x, which was to be proved. Thus we have:

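Theorem 3.

$$\sin x = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{(2i + 1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots.$$

By differentiation,

$$\cos x = \sum_{i=0}^{\infty} (-1)^i \frac{(2i + 1) x^{2i}}{(2i + 1)!} = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i}}{(2i)!}.$$

Thus:

Theorem 4.

$$\cos x = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i}}{(2i)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots.$$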
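The two series of Theorems 3 and 4 can be evaluated directly and compared with the library functions. This is a sketch for moderate values of x only; the function names `sin_series` and `cos_series` are hypothetical, and no attempt is made to handle large arguments efficiently.

```python
import math

def sin_series(x, terms):
    """Partial sum of sum_{i>=0} (-1)^i x^(2i+1) / (2i+1)!."""
    return sum((-1) ** i * x ** (2 * i + 1) / math.factorial(2 * i + 1)
               for i in range(terms))

def cos_series(x, terms):
    """Partial sum of sum_{i>=0} (-1)^i x^(2i) / (2i)!."""
    return sum((-1) ** i * x ** (2 * i) / math.factorial(2 * i)
               for i in range(terms))

x = 1.0
print(sin_series(x, 10), math.sin(x))
print(cos_series(x, 10), math.cos(x))
```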

Obviously, the series that we have been developing in this section can be used for calculating the values of the corresponding functions. In fact, this is the way people arrived at the values that you find in the tables of exp, sin, and cos. And the series can be adapted, in simple ways, to handle a variety of related problems. For example, consider

$$\int_0^{0.5} e^{x^2}\,dx.$$

If we could get a simple formula for a function F such that

$$F'(x) = e^{x^2},$$

then the integral could be expressed as F(0.5) - F(0). There is no such simple formula. But we can express such an F as an infinite series, in the following way. We know that

$$e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!} = 1 + x + \frac{x^2}{2!} + \cdots.$$

Therefore

$$e^{x^2} = \sum_{i=0}^{\infty} \frac{x^{2i}}{i!} = 1 + x^2 + \frac{x^4}{2!} + \cdots.$$

Integrating a term at a time, we get a function

$$F(x) = \sum_{i=0}^{\infty} \frac{x^{2i+1}}{i!\,(2i + 1)} = x + \frac{x^3}{3} + \frac{x^5}{2! \cdot 5} + \cdots.$$

Evidently

$$F(0) = 0, \quad \text{and} \quad F'(x) = e^{x^2}.$$

Therefore

$$\int_0^x e^{t^2}\,dt = F(x),$$

and so, using the series for F, we can calculate $F(\tfrac{1}{2})$ approximately, with an error as small as we please.
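To see how quickly the series for F settles down, the sketch below evaluates partial sums at x = 0.5 and compares them with a composite Simpson's rule estimate of the same integral. Simpson's rule is an outside tool used here only as an independent reference, and the names `F_series` and `simpson` are made up for this illustration.

```python
import math

def F_series(x, terms):
    """Partial sum of sum_{i>=0} x^(2i+1) / (i! (2i+1))."""
    return sum(x ** (2 * i + 1) / (math.factorial(i) * (2 * i + 1))
               for i in range(terms))

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

reference = simpson(lambda t: math.exp(t * t), 0.0, 0.5)
for terms in (2, 4, 8):
    print(terms, F_series(0.5, terms), reference)
```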

PROBLEM SET 10.7


Find a series for each of the following functions.

1. f(x)

x In (x

In each case, name the interval on

2. f(x) x2 In (x2 + I).


4. rxf(t) dt, where f is as in

which you know that your series converges to the given function.

3. f(x)

5. f(x)
7. f(x)
11. f(x)
9. /(x)

13. f(x)

15. F(x)
17.

F(x)

2
x ln (x

I).

2x.
sin (x/2).

sin

cos

Jo

10. f(x)

G).

3 "'3
x e

l"'

6. f(x)
8. f(x)

+ I).

3 t3
t e

12. f(x)

dt.

lxf(t)dt,

wheref is as in Problem

14.

i"'f(t)dt, where/is as in Problem 16.

14. f(x)

16. f(x)

18. f(x)

x sin x.
cos 2x.

2.

sin x cos x.

'n-'x

"'
xe - x.

(e"'

Problem

for x

for x

for x

;e 0
0

;e 0
0
=

x ;e 0
for x
0
for x

for

468

Infinite Series

19.

F(x)

21.

f(x)

23. f(x)

10.8

f'f(t)dt,

where/ is as in Problem 18.

x3 cos2 x.

20.

f(x)

22. F(x)

cos2 x - sin2 x

24. F(x)

cos2 x.

{ f(t ) dt where f i s
J as in Problem21.
o

x cos2 x + x sin2 x.

tf and (2) f(0)


I. Either before or
25. Find a series for a function f such that (1) f'
after finding the series, find an elementary formula for such a function f
=

26. Find a series for a function f such that (1) /' (x)
/(0)
0.

2/ (x)/x for every x 0 and (2)

27. Is there only one function satisfying the conditions of Problem 26? Why or why not?
28. Get a formula for $D^i x^i$, where $D^i f$ denotes the ith derivative of the function f.
29. Get a formula for $D^i x^j$, valid for i < j.
30. Do the same, for the case i > j.
*31. Given $f(x) = \sum_{i=0}^{\infty} a_i x^i$. Get a formula for $f^{(i)}(0)$. (Here $f^{(i)}$ denotes the ith derivative of f.)
*32. Is it possible that there are two different power series for the same function, valid on the same open interval I? That is, given

$$f(x) = \sum_{i=0}^{\infty} a_i x^i = \sum_{i=0}^{\infty} b_i x^i \quad \text{on } I,$$

does it follow that $a_i = b_i$ for each i? Why or why not?
*33. A function f is called real-analytic on an interval I if f can be expressed as a power series $\sum_{i=0}^{\infty} a_i x^i$. Does there exist a real-analytic function f, on an interval $(-a, a)$, such that $f^{(i)}(0) = (i!)^2$ for each i? Why or why not?

10.8 THE BINOMIAL SERIES

It is possible to show, by induction, that if n is a positive integer, then

$$(a + b)^n = a^n + n a^{n-1} b + \frac{n(n - 1)}{2} a^{n-2} b^2 + \cdots + \frac{n(n - 1) \cdots (n - i + 1)}{i!} a^{n-i} b^i + \cdots + b^n.$$

Here the coefficient of $a^{n-i} b^i$ can be written more briefly as

$$\binom{n}{i} = \frac{n!}{i!\,(n - i)!} = \frac{n(n - 1) \cdots (n - i + 1)}{i!}.$$

The induction proof of the binomial theorem depends on the identity

$$\binom{n}{i - 1} + \binom{n}{i} = \binom{n + 1}{i}.$$

You may have seen this proved. In any case, we shall not stop to prove it now,

because the elementary form of the binomial theorem is a corollary of a more general result which we shall prove presently.

We would like to generalize the familiar binomial formula

$$(a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^i$$

to take care of the case in which n is not an integer. That is, we want a formula for $(a + b)^k$, where k is any real number. The following observations are obvious:

1) For k = 0, we have $(a + b)^k = 1$, and our problem is solved. We may therefore assume that $k \ne 0$.

2) For the case of interest, in which k is not an integer, the exponential $c^k$ is defined only for c > 0. (See Section 4.9.) Therefore we must assume that a + b > 0.

3) For a = b, the problem has an immediate solution: $(a + b)^k = (2a)^k = 2^k a^k$. We therefore may assume hereafter that $a \ne b$. And we want to assume this, because the case a = b does not fit the pattern that is going to emerge.

4) It is now a matter of notation to suppose that a > b. We let x = b/a, so that

$$a + b = a(1 + x), \quad \text{and} \quad (a + b)^k = a^k (1 + x)^k.$$

If we had $|x| = |b/a| \ge 1$, then either $b \ge a$ or $b \le -a$; and these possibilities are ruled out by conditions (2) and (4). Therefore $|x| < 1$, and our problem takes the following form:

Problem. Given $k \ne 0$, and

$$f(x) = (1 + x)^k \qquad (|x| < 1).$$

Find a formula for f(x), analogous to the binomial formula.


Our past experience with sin, cos, and exp suggests that we should investigate the relation between $f(x) = (1 + x)^k$ and its derivatives, and use the results in the investigation of the series. Now

$$f'(x) = k(1 + x)^{k-1}.$$

Therefore

$$(1 + x) f'(x) = k f(x).$$

If $f(x) = \sum_{i=0}^{\infty} a_i x^i$, then

$$f'(x) = \sum_{i=0}^{\infty} i a_i x^{i-1} = \sum_{i=1}^{\infty} i a_i x^{i-1}.$$

Therefore

$$x f'(x) = \sum_{i=0}^{\infty} i a_i x^i.$$

We want to express $(1 + x) f'(x)$ as a series, and so we need to express $f'(x)$ in the form $\sum b_i x^i$. For this purpose we use a trick. Let j = i - 1, so that i = j + 1. This gives

$$f'(x) = \sum_{j=0}^{\infty} (j + 1) a_{j+1} x^j = \sum_{i=0}^{\infty} (i + 1) a_{i+1} x^i.$$

The equation $(1 + x) f'(x) = k f(x)$ now takes the form

$$\sum_{i=0}^{\infty} [(i + 1) a_{i+1} + i a_i] x^i = \sum_{i=0}^{\infty} k a_i x^i.$$

i=O

Comparing coefficients of $x^i$, we get

$$(i + 1) a_{i+1} + i a_i = k a_i \;\Leftrightarrow\; (i + 1) a_{i+1} = (k - i) a_i \;\Leftrightarrow\; a_{i+1} = \frac{k - i}{i + 1} a_i.$$

Obviously

$$a_0 = f(0) = (1 + 0)^k = 1.$$

Therefore

$$a_0 = 1;$$

$$a_1 = \frac{k - 0}{0 + 1} a_0 = k;$$

$$a_2 = \frac{k - 1}{1 + 1} a_1 = \frac{k(k - 1)}{2};$$

$$a_3 = \frac{k - 2}{2 + 1} a_2 = \frac{k(k - 1)(k - 2)}{3 \cdot 2} = \frac{k(k - 1)(k - 2)}{3!};$$

and in general, for $i \ge 1$,

$$a_i = \frac{k(k - 1) \cdots (k - i + 1)}{i!}.$$

We denote the fraction on the right by the symbol $\binom{k}{i}$, just as in the case where k is a positive integer. The above formula then takes the form

$$a_i = \binom{k}{i} \quad \text{for each } i \ge 0,$$

and the net result of the above discussion is that

$$\sum_{i=0}^{\infty} a_i x^i = (1 + x)^k \;\Rightarrow\; a_i = \binom{k}{i} \quad \text{for each } i.$$

That is, the series that we have found is the only series that might work. To know
that our series does work, we need the following two theorems.
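Before turning to the proofs, it may help to see the recurrence $a_{i+1} = \frac{k - i}{i + 1} a_i$ in action. The following sketch builds the coefficients from the recurrence and compares a partial sum with $(1 + x)^k$ for a sample k and x; the function names are hypothetical, and a numerical agreement of this kind is only evidence, not a proof.

```python
def binomial_coefficients(k, n):
    """Return [a_0, ..., a_n] with a_0 = 1 and a_{i+1} = (k - i)/(i + 1) * a_i."""
    coeffs = [1.0]
    for i in range(n):
        coeffs.append(coeffs[-1] * (k - i) / (i + 1))
    return coeffs

def binomial_series(k, x, n):
    """Partial sum sum_{i=0}^{n} a_i x^i of the binomial series, for |x| < 1."""
    return sum(a * x**i for i, a in enumerate(binomial_coefficients(k, n)))

k, x = 0.5, 0.2
print(binomial_series(k, x, 30), (1 + x) ** k)

# For a positive integer k the coefficients reduce to the familiar ones
# and vanish from i = k + 1 onward.
print(binomial_coefficients(3, 5))
```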

Theorem 1. The series $\sum_{i=0}^{\infty} \binom{k}{i} x^i$ is convergent for $|x| < 1$.

Proof. As in Section 10.6, let

$$r_i = \left| \frac{a_{i+1} x^{i+1}}{a_i x^i} \right|.$$

Then

$$r_i = \left| \frac{k(k - 1) \cdots (k - i + 1)(k - i)}{(i + 1)!} \cdot \frac{i!}{k(k - 1) \cdots (k - i + 1)} \cdot x \right| = \left| \frac{k - i}{i + 1}\, x \right|.$$

Evidently

$$\lim_{i\to\infty} r_i = \lim_{i\to\infty} \left| \frac{k - i}{i + 1} \right| |x| = |x|.$$

Therefore, by the ratio test, the series converges absolutely for $|x| < 1$, and diverges for $|x| > 1$.

Theorem 2. For every $k \ne 0$, and every x between -1 and 1,

$$\sum_{i=0}^{\infty} \binom{k}{i} x^i = (1 + x)^k.$$

Proof. Let g be the function which is the sum of the series. We determined the coefficients in such a way that

1) $g(0) = 1$;
2) $(1 + x) g'(x) = k g(x)$.

We need to prove that

$$g(x) = f(x) = (1 + x)^k \qquad (-1 < x < 1).$$

For this purpose, we use the same device that worked for the exponential. Let

$$\phi(x) = \frac{g(x)}{f(x)} = \frac{g(x)}{(1 + x)^k}.$$

Then

$$\phi'(x) = \frac{(1 + x)^k g'(x) - g(x)\, k (1 + x)^{k-1}}{(1 + x)^{2k}} = \frac{(1 + x) g'(x) - k g(x)}{(1 + x)^{k+1}} = 0,$$

so that $\phi$ is a constant. And the constant is 1, because $g(0) = f(0) = 1$. Therefore $g(x) = f(x)$ for every x, which was to be proved.

PROBLEM SET 10.8

1.

Write a series for v', and find out how many terms of the series you would need
to use, to calculate v'u, correct to three decimal places.

10.8

Infinite Series

472

2.

.a;
x +
Do the same, for v

3.

Do the same, for V1 x

4.

Do the same, for

5. Let

1.

+ 1.

V2x + 1.

be a positive integer. Using the definition

n!
(n - i)!i!'
write formulas for

cn;1) and (i.:'.1).

6.

Using the apparatus of Problem 5, show that

7.

Using the result of Problem

<1 + xr =

6,

show that

i ( ) xi

i=O

"""

(The first and last terms on the right-hand side require a separate discussion.
note that ()
<nt1), because both are equal to 1 ; and similarly that ()
(!})
=

Since obviously (1

+ x)1

() + ()x1,

But

= 1.)

this gives an induction proof of the elementary

form of the binomial theorem.


Find a series for each of the following functions, and discuss for convergence. You need
not test for convergence at the endpoints.

f(x)

8.

v'I + x

14. f(x)

Sa"' Vl + t10 dt
f'2 VI + tlo dt

f(x) = (2 + x2)"

20. f(x)
22.

x
=

15.

V 1 + x2

--

19.

x2
.I
v

1 +x

.a;
v

x
1 + x2

(1

+ x)312

f(x) = {/2 + x

f' {/2 + t2 dt
f(x) = r (2 t2)k dt
=

21. f(x) = ( "'

J v'1 + t2
0

dt

Find a function $f$ such that (1) $(1 + x^2) f'(x) = f(x)$ and (2) $f(0) = 1$. Then show that the function that you found is the only function satisfying conditions (1) and (2).

23.

(k ;6-0)

13. f(x)

17. f(x)

16. f(x) = v'3 + x


18.

j(X) =

11. f(x)

10. f(x) = xVI + x


12. f(x) =

Same question, for the conditions (1) $f'(x) \sec x = f(x)$ and (2) $f(0) = 1$.

10.9

TAYLOR SERIES

Obviously we cannot get a series of the type $\sum a_i x^i$, converging to $f(x) = 1/x$ on an open interval containing 0, because any such series is continuous at $x = 0$, while $\lim_{x \to 0^+} 1/x = \infty$. On the other hand, there is a series

$$\frac{1}{1 + x} = \sum_{i=0}^{\infty} (-1)^i x^i \qquad (|x| < 1).$$

If we let $x' = 1 + x$, $x = x' - 1$, then the above equation takes the form

$$\frac{1}{x'} = \sum_{i=0}^{\infty} (-1)^i (x' - 1)^i \qquad (|x' - 1| < 1);$$

and dropping the prime we get

$$f(x) = \frac{1}{x} = \sum_{i=0}^{\infty} (-1)^i (x - 1)^i \qquad (|x - 1| < 1).$$

The series on the right-hand side is called a Taylor series, or a Taylor expansion of the function $f$ about the point 1. A power series $\sum a_i x^i$, of the type that we have been discussing so far, is called a Maclaurin series. Thus every Maclaurin series is a Taylor series:

$$\sum_{i=0}^{\infty} a_i x^i = \sum_{i=0}^{\infty} a_i (x - 0)^i,$$

which is a Taylor series, with $a = 0$. In this language, we may say that $f(x) = 1/x$ has no Maclaurin series, but it does have a Taylor expansion about the point 1.

Similarly, $f(x) = \ln x$ cannot have a Maclaurin series, because as $x \to 0^+$ the function approaches $-\infty$. But we do have a series for

$$g(x) = \ln (1 + x) = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i} \qquad (|x| < 1).$$

Setting $x' = 1 + x$, $x = x' - 1$, we get

$$\ln x' = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(x' - 1)^i}{i} \qquad (|x' - 1| < 1);$$

and dropping the prime we get

$$\ln x = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(x - 1)^i}{i} \qquad (|x - 1| < 1).$$

This is a Taylor expansion of $\ln$, about the point 1.
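As a numerical illustration, the Taylor expansion of ln about the point 1 can be compared with a library logarithm. The following is only a sketch in Python; the function name `ln_taylor_about_1` and the sample points are our own choices, not part of the text.

```python
import math

def ln_taylor_about_1(x, terms):
    """Partial sum of sum_{i=1}^{terms} (-1)**(i+1) * (x - 1)**i / i."""
    total = 0.0
    power = 1.0
    for i in range(1, terms + 1):
        power *= (x - 1.0)          # power now equals (x - 1)**i
        total += (-1) ** (i + 1) * power / i
    return total

# Sample points inside the interval |x - 1| < 1.
for x in (0.5, 1.2, 1.9):
    print(x, ln_taylor_about_1(x, 200), math.log(x))
```

The agreement is good near the base point 1 and degrades as x moves toward the ends of the interval, where many more terms are needed.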


With the obvious modifications, all our theorems for Maclaurin series hold also for Taylor series; and to prove them in the general case, we merely translate the axes by the substitution $x = x' + a$, $x' = x - a$. For example, Theorems A and B of Section 10.5 take the following forms:

Theorem A'. Given

$$f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i \qquad (a - r < x < a + r),$$

then $f$ is continuous and differentiable on the interval $(a - r, a + r)$, and the derivative of the sum is the sum of the derivatives. That is,

$$f'(x) = \sum_{i=1}^{\infty} i a_i (x - a)^{i-1} \qquad (a - r < x < a + r).$$

Theorem B'. Given

$$f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i \qquad (a - r < x < a + r).$$

Then the integral of the sum is the sum of the integrals. That is,

$$\int_a^x f(t)\,dt = \sum_{i=0}^{\infty} \int_a^x a_i (t - a)^i\,dt = \sum_{i=0}^{\infty} \frac{a_i}{i + 1} (x - a)^{i+1}.$$

Our other theorems can be generalized in the same style; we treat $x - a$ in exactly the same way that we used to treat $x$. Another example is as follows. You found, in Problem 31 of Problem Set 10.7, that if $f(x) = \sum_{i=0}^{\infty} a_i x^i$, on an open interval containing 0, then $f^{(n)}(0) = n!\,a_n$, so that

$$a_n = \frac{f^{(n)}(0)}{n!}.$$

An analogous formula holds for Taylor series:


Theorem 1. If $f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i$, on an interval $(a - r, a + r)$, then

$$a_n = \frac{f^{(n)}(a)}{n!}$$

for every $n$.

Proof. The $n$th derivative of a function described by a formula $\phi(x)$ will be denoted by $D^n \phi(x)$. We observe that

$$D^n (x - a)^i = 0 \quad \text{for } i < n,$$
$$D^n (x - a)^n = n!,$$
$$D^n (x - a)^i = i(i - 1) \cdots (i - n + 1)(x - a)^{i-n} \quad \text{for } i > n.$$

Therefore, for $f(x) = \sum a_i (x - a)^i$, we have

$$f^{(n)}(x) = a_n\, n! + \sum_{i=n+1}^{\infty} b_i (x - a)^{i-n}.$$

We don't care what form the $b_i$'s have, because every term of the sum on the right-hand side has $(x - a)$ raised to a positive power, and we are about to set $x = a$. This gives

$$f^{(n)}(a) = n!\,a_n,$$

so that

$$a_n = \frac{f^{(n)}(a)}{n!},$$

which was to be proved.
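The coefficient formula $a_n = f^{(n)}(a)/n!$ can be checked by hand on a polynomial, where every derivative is easy to compute. The sketch below is an illustration only; the helper names `derivative`, `evaluate` and `taylor_coefficients`, and the choice of $f(x) = x^3$ with $a = 1$, are our own assumptions.

```python
from math import factorial

def derivative(coeffs):
    """Derivative of a polynomial stored as [c0, c1, c2, ...] (c_j multiplies x**j)."""
    return [j * c for j, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def taylor_coefficients(coeffs, a):
    """Return the list of f^(n)(a) / n! for n = 0, 1, ..., degree."""
    out = []
    current = list(coeffs)
    n = 0
    while current:
        out.append(evaluate(current, a) / factorial(n))
        current = derivative(current)
        n += 1
    return out

# f(x) = x**3 re-expanded about a = 1:
# x**3 = 1 + 3(x - 1) + 3(x - 1)**2 + (x - 1)**3
print(taylor_coefficients([0, 0, 0, 1], 1))  # -> [1.0, 3.0, 3.0, 1.0]
```

For a polynomial the expansion is finite, so the computed coefficients can be verified by multiplying out the powers of $(x - 1)$.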


We have found that for some functions the use of Taylor series in place of Maclaurin series is a necessity. For example, $1/x$ and $\ln x$ don't have any Maclaurin expansions. In other cases a Taylor series may be preferable, even though the Maclaurin expansion exists. The point is that $\sum a_i x^i$ usually converges rapidly when $x$ is close to 0, and more slowly when $x$ is larger. To take an extreme case, we know that $\sin 10{,}000\pi = 0$, because 10,000 is even. Therefore it must be true that

$$\sum_{i=0}^{\infty} (-1)^i \frac{1}{(2i + 1)!} (10{,}000\pi)^{2i+1} = 0.$$

But in waiting for the partial sums to get close to 0, we had better not be impatient. In general, if we want to use a series to calculate a function numerically, we should choose the "base point" $a$ as close as possible to the value of $x$ that we want to substitute.
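To get a feeling for how impatient one would have to be, we can estimate the size of the individual terms of the sine series at $x = 10{,}000\pi$. The following Python sketch works with logarithms (through `math.lgamma`) so that nothing overflows; the function name `log10_term_magnitude` and the search range are our own assumptions.

```python
import math

def log10_term_magnitude(x, i):
    """log10 of |x**(2i+1) / (2i+1)!|, using lgamma(m + 1) = ln(m!) to avoid overflow."""
    m = 2 * i + 1
    return (m * math.log(x) - math.lgamma(m + 1)) / math.log(10)

x = 10_000 * math.pi

# Index of the largest term among the first 40,000 terms.
peak_i = max(range(0, 40_000), key=lambda i: log10_term_magnitude(x, i))
print(peak_i, log10_term_magnitude(x, peak_i))
```

The terms grow for thousands of steps, reaching magnitudes with more than ten thousand digits, before they begin to shrink; the partial sums cannot settle near 0 until long after that.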

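Suppose, for example, that we have calculated $\ln 1.5 = 0.4055$. One way to calculate $\ln 1.6$ would be to take $x = 1.6$ in the series

$$\ln x = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(x - 1)^i}{i},$$

which gives

$$\ln 1.6 = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(0.6)^i}{i}.$$

But the convergence of the series is slow. We therefore use the base point 1.5. Thus

$$\ln x = \sum_{i=0}^{\infty} a_i (x - 1.5)^i, \qquad a_i = \frac{f^{(i)}(1.5)}{i!}.$$

For $f(x) = \ln x$, we have

$$f'(x) = 1/x = x^{-1},$$
$$f''(x) = (-1)x^{-2},$$
$$\vdots$$
$$f^{(i)}(x) = (-1)^{i+1}(i - 1)!\,x^{-i}.$$

Therefore

$$f^{(0)}(1.5) = \ln 1.5 = 0.4055,$$
$$a_i = \frac{(-1)^{i+1}}{i} (1.5)^{-i} \qquad (i > 0),$$
$$\ln x = 0.4055 + \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} \left( \frac{x - 1.5}{1.5} \right)^i.$$

For $x = 1.6$, this gives

$$\ln 1.6 = 0.4055 + \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} \left( \frac{1}{15} \right)^i,$$

which converges much more rapidly.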
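The difference in speed is easy to see numerically. The sketch below compares partial sums of the two series for $\ln 1.6$, using the four-place value $\ln 1.5 = 0.4055$ from the text; the function names are our own, and the result with base point 1.5 can be no more accurate than that four-place starting value.

```python
import math

LN_1_5 = 0.4055  # four-place value of ln 1.5 quoted in the text

def ln_about_1(x, terms):
    """Partial sum of the expansion of ln about the point 1."""
    return sum((-1) ** (i + 1) * (x - 1.0) ** i / i for i in range(1, terms + 1))

def ln_about_1_5(x, terms):
    """Partial sum of the expansion of ln about the point 1.5."""
    ratio = (x - 1.5) / 1.5
    return LN_1_5 + sum((-1) ** (i + 1) * ratio ** i / i for i in range(1, terms + 1))

for n in (2, 4, 8):
    print(n, ln_about_1(1.6, n), ln_about_1_5(1.6, n), math.log(1.6))
```

With four terms, the series about 1.5 is already within the accuracy of the starting value, while the series about 1 is still off in the second decimal place.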


Note that the above derivation tacitly assumes that $\ln$ has a Taylor expansion about the point 1.5; if it has, then the coefficients must be given by the formula

$$a_i = \frac{f^{(i)}(a)}{i!}.$$


It is a fact that if a function $f$ has a Taylor expansion

$$f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i,$$

converging on an interval $|x - a| < r$, then any other point $b$ of the interval can also be used as a base point, giving an expansion

$$f(x) = \sum_{i=0}^{\infty} b_i (x - b)^i,$$

which converges on some interval containing $b$. But the proof would be hard, in the present context, and should be postponed until we can use the theory of functions of a complex variable.

PROBLEM SET 10.9

For some of the functions in the first twelve problems below, it is a practical proceeding to derive a general formula for $f^{(n)}(a)$, and use the formula to calculate the coefficients $a_i$ in the series $\sum a_i (x - a)^i$. In each such case, calculate the coefficients by this method. In cases where the derivation of the general formula seems unreasonably difficult, merely calculate the first three terms of the series.

1. $f(x) = \sin x$, $a = 0$.
2. $f(x) = \tan x$, $a = 0$.
3. $f(x) = \tan x$, $a = \pi$.
4. $f(x) = \cos x$, $a = 2\pi$.
5. $f(x) = \operatorname{Tan}^{-1} x$, $a = 0$.
6. $f(x) = \operatorname{Tan}^{-1} x$, $a = 1$.
7. $f(x) = e^x$, $a = 0$.
8. $f(x) = e^x$, $a = 1$.
9. $f(x) = \ln (2 + x)$, $a = 0$.
10. $f(x) = e^x$, $a$ = any number.
11. $f(x) = \ln (1 + x^2)$, $a = 0$.
12. $f(x) = \sin x$, $a = 1$.

13. This is a separate problem, and it requires you to think of a trick. Given that $\ln 1.4 = 0.3365$, find a way, using series, to calculate $\ln 2$, correct to three decimal places. (To four decimal places, $\ln 2 = 0.6931$.)

14. Let $\sum_{i=1}^{\infty} a_i$ be any series. For each $i$, let

$$b_i = \tfrac{1}{2}(a_i + |a_i|), \qquad c_i = \tfrac{1}{2}(a_i - |a_i|).$$

Show that if $\sum b_i$ and $\sum c_i$ both converge, then $\sum a_i$ converges absolutely.

15. Let $\sum a_i$, $\sum b_i$, $\sum c_i$ be as in Problem 14. Show that if $\sum a_i$ converges and $\sum |a_i| = \infty$, then

$$\sum b_i = \infty \qquad \text{and} \qquad \sum c_i = -\infty.$$

*16. Let $n_1, n_2, \ldots$ be a sequence of positive integers in which each positive integer appears exactly once. That is, the numbers $n_1, n_2, \ldots$ are the integers $1, 2, 3, \ldots$ arranged in some order. For each series $\sum_{i=1}^{\infty} a_i$, we can then form a "rearranged series" $\sum_{i=1}^{\infty} a_{n_i}$, in which the same terms appear in some order. The following theorem is a sort of "commutative law of addition" for positive series.

Theorem. If $a_i > 0$ for each $i$, and $\sum a_i = A$, then every rearrangement of $\sum a_i$ converges to the same sum.

Prove this.

*17. Show that

has a rearrangement which converges to 0. (Thus the "commutative law for infinite sums" does not hold in general.)

*18. Show that for every number $k$ there is a rearrangement of the above series which converges to $k$.

10.10

TAYLOR'S THEOREM. ESTIMATES OF REMAINDERS

In the preceding section, we showed that, if a function is expressible by a Taylor series, with

$$f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i \qquad (a - r < x < a + r),$$

then the coefficients $a_i$ are given by the formula

$$a_i = \frac{f^{(i)}(a)}{i!}.$$

Using the formula, we can write down a series. But there are three questions which it is natural to ask:

1) For what values of $x$ does the series converge? (We recall that $\operatorname{Tan}^{-1} x$ is defined for every $x$, but its series converges only for $-1 \le x \le 1$.)

2) Does the series converge to the function $f$ that we started with?

3) If we use a partial sum

$$S_n(x) = \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i$$

as an approximation of $f(x)$, what is the error? For this, we need an estimate of the "remainder function"

$$R_n(x) = f(x) - S_n(x) = f(x) - \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i.$$


Partial answers to these questions are given by the following theorem.


Theorem 1 (Taylor's theorem). If $f$ has $n + 1$ derivatives, on the interval $[a, x]$ or the interval $[x, a]$, then

$$R_n(x) = \frac{f^{(n+1)}(\bar{x})}{(n + 1)!} (x - a)^{n+1},$$

for some $\bar{x}$ between $a$ and $x$.


Proof. The proof is artificial, and hard to remember. We regard $x$ as a constant; and for each $t$ we let

$$F(t) = f(x) - \sum_{i=0}^{n} \frac{f^{(i)}(t)}{i!} (x - t)^i.$$

Here we have simply replaced $a$ by $t$ in the formula for $R_n(x)$. For $t = a$ we have $F(a) = R_n(x)$. For $t = x$ we have

$$F(x) = f(x) - \frac{f^{(0)}(x)}{0!} = f(x) - f(x) = 0.$$

Since

$$F(t) = f(x) - \frac{f(t)}{0!} - \frac{f'(t)}{1!}(x - t) - \frac{f''(t)}{2!}(x - t)^2 - \cdots - \frac{f^{(n)}(t)}{n!}(x - t)^n,$$

we have

$$F'(t) = -f'(t) - [f''(t)(x - t) - f'(t)] - \left[ \frac{f'''(t)}{2!}(x - t)^2 - \frac{f''(t)}{2!} \cdot 2(x - t) \right] - \cdots - \left[ \frac{f^{(n+1)}(t)}{n!}(x - t)^n - \frac{f^{(n)}(t)}{n!} \cdot n(x - t)^{n-1} \right].$$

Here all terms cancel out, telescopically, except the first term in the last bracket; and so

$$F'(t) = -\frac{f^{(n+1)}(t)}{n!} (x - t)^n.$$

Now let

$$G(t) = \frac{(x - t)^{n+1}}{(n + 1)!},$$

so that

$$G'(t) = -\frac{(x - t)^n}{n!}.$$

To the functions $F$ and $G$, on the interval between $a$ and $x$, we apply the parametric mean-value theorem. (This is Theorem 2 of Section 9.2.) It gives

$$\frac{F(x) - F(a)}{G(x) - G(a)} = \frac{F'(\bar{x})}{G'(\bar{x})},$$

for some $\bar{x}$ between $a$ and $x$. Since $F(x) = G(x) = 0$, this means that

$$\frac{-F(a)}{-G(a)} = \frac{F'(\bar{x})}{G'(\bar{x})}.$$

And

$$\frac{F'(t)}{G'(t)} = f^{(n+1)}(t)$$

for every $t$. Therefore

$$\frac{F(a)}{G(a)} = f^{(n+1)}(\bar{x}).$$

By definition of $G(a)$ and $F(a)$, we have

$$R_n(x) = F(a) = f^{(n+1)}(\bar{x})\,G(a) = \frac{f^{(n+1)}(\bar{x})}{(n + 1)!} (x - a)^{n+1},$$

which was to be proved.
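Taylor's theorem can be tested numerically in a case where the derivatives are known. For $f(x) = e^x$, $a = 0$ and $x > 0$, every derivative is $e^x$, so the theorem says that $R_n(x)$ lies between $x^{n+1}/(n+1)!$ and $e^x x^{n+1}/(n+1)!$. The following Python sketch checks this; the function names and the choice $x = 1$ are our own assumptions.

```python
import math

def taylor_exp_partial_sum(x, n):
    """S_n(x) for f(x) = e**x about the base point a = 0."""
    return sum(x ** i / math.factorial(i) for i in range(n + 1))

def remainder_bounds(x, n):
    """Lower and upper bounds on R_n(x) for x > 0, from 1 <= e**xbar <= e**x."""
    scale = x ** (n + 1) / math.factorial(n + 1)
    return scale, math.exp(x) * scale

x = 1.0
for n in range(1, 8):
    r = math.exp(x) - taylor_exp_partial_sum(x, n)   # the actual remainder
    lo, hi = remainder_bounds(x, n)
    print(n, lo, r, hi)
```

In each row the actual remainder falls between the two bounds, and both bounds shrink rapidly as $n$ increases.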


In some cases we can use this theorem to prove that a formal power series converges to the expected function. For example, we may be able to find a number $M$ such that

$$|f^{(n+1)}(\bar{x})| \le M$$

for every $n$ and every $\bar{x}$ between $a$ and $x$. In such a case it follows that

$$R_n(x) \to 0,$$

and $f(x)$ is the sum of its formal Taylor series. Most of the time, however, estimates of $(n + 1)$st derivatives are hard to come by. For example, the calculation of $f^{(n)}(x)$ is unmanageable for the function

$$f(x) = \frac{1}{1 + x^3},$$

even though we can easily see what the Maclaurin series is:

$$\frac{1}{1 + x^3} = \sum_{i=0}^{\infty} (-1)^i x^{3i} \qquad (|x| < 1).$$

It follows, of course, that

$$f^{(3i)}(0) = (-1)^i (3i)!,$$

and that $f^{(n)}(0) = 0$ if $n$ is not divisible by 3. But this does not give us any information about $f^{(n)}(x)$ for other values of $x$.
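Although the derivatives of $1/(1 + x^3)$ are awkward, the series itself is easy to test numerically, since it is a geometric series in $-x^3$. The sketch below is an illustration only; the function name `geometric_cubic_partial_sum` and the sample point $x = 0.5$ are our own assumptions.

```python
def geometric_cubic_partial_sum(x, terms):
    """Sum of (-1)**i * x**(3*i) for i = 0, 1, ..., terms - 1."""
    return sum((-1) ** i * x ** (3 * i) for i in range(terms))

x = 0.5
for terms in (2, 5, 10):
    print(terms, geometric_cubic_partial_sum(x, terms), 1 / (1 + x ** 3))
```

Here convergence is established by the formula for the sum of a geometric series, not by an estimate of the remainder in Taylor's theorem.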

PROBLEM SET 10.10


1 through 6. In at least six of the first twelve problems in Problem Set 10.9, it is easy to get an estimate of $f^{(n)}(x)$, and then show by Taylor's Theorem that the series converges to the given function. Identify these six cases, and carry out the process.

10.11

THE COMPLEX NUMBER SYSTEM

Formally speaking, complex numbers are numbers of the type

$$z = a + bi,$$

where $a$ and $b$ are real numbers, and where $i$ is some sort of number such that $i^2 = -1$. Granted that there is such a number system, and that it obeys the same manipulative rules as the real number system, the equation $i^2 = -1$ gives all that we need to perform calculations. For example,

$$(a + bi)^2 = a^2 + 2abi + b^2 i^2 = (a^2 - b^2) + 2abi;$$
$$(a + bi)(c + di) = (ac - bd) + (ad + bc)i;$$
$$\left( \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}\,i \right)^2 = i;$$
$$i^4 = (i^2)^2 = 1;$$
$$i^{10,001} = i;$$
$$\left( \frac{1}{2} + \frac{\sqrt{3}}{2}\,i \right)^3 = \tfrac{1}{8}(1 + \sqrt{3}\,i)^3 = \tfrac{1}{8}(1 + 3\sqrt{3}\,i + 3 \cdot 3i^2 + 3\sqrt{3}\,i^3) = \tfrac{1}{8}(1 - 9 + 3\sqrt{3}\,i - 3\sqrt{3}\,i) = -1.$$

Obviously $0 = 0 + 0i$ has no reciprocal. But if $a + bi \neq 0$, then $a + bi$ has a reciprocal in the complex number system. To see this, note that if $a + bi \neq 0$, then $a$ and $b$ are not both equal to 0. Therefore $a - bi \neq 0$, and

$$\frac{1}{a + bi} = \frac{1}{a + bi} \cdot \frac{a - bi}{a - bi} = \frac{a - bi}{a^2 - (bi)^2} = \frac{a - bi}{a^2 + b^2} = \frac{a}{a^2 + b^2} + \frac{-b}{a^2 + b^2}\,i = A + Bi.$$

This calculation begins with the assumption that $a + bi$ and $a - bi$ have reciprocals, but once we know the answer, it is easy to check:

$$(a + bi)(A + Bi) = (a + bi) \left( \frac{a}{a^2 + b^2} + \frac{-b}{a^2 + b^2}\,i \right) = \frac{(a + bi)(a - bi)}{a^2 + b^2} = \frac{a^2 + b^2}{a^2 + b^2} = 1.$$

Therefore $A + Bi$ is the reciprocal of $a + bi$.
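These calculations are easy to check by machine, since Python has a built-in `complex` type. The sketch below computes the reciprocal from the formula $A = a/(a^2 + b^2)$, $B = -b/(a^2 + b^2)$ and verifies that the product is 1; the function name `reciprocal` and the sample numbers are our own assumptions.

```python
import math

def reciprocal(a, b):
    """Return (A, B) with (a + bi)(A + Bi) = 1, assuming a and b are not both 0."""
    d = a * a + b * b
    if d == 0:
        raise ZeroDivisionError("0 + 0i has no reciprocal")
    return a / d, -b / d

a, b = 3.0, 4.0
A, B = reciprocal(a, b)
product = complex(a, b) * complex(A, B)
print(A, B, product)

# The cube computed in the text: (1/2 + (sqrt(3)/2) i)**3 should be -1,
# up to floating-point rounding.
w = complex(0.5, math.sqrt(3) / 2)
print(w ** 3)
```

The small discrepancies from the exact values 1 and $-1$ are rounding errors of floating-point arithmetic, not failures of the algebra.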

There are several ways to define the set of complex numbers, as a mathematical system, and check their properties. One such method is explained in Appendix J. Meanwhile we shall regard the complex numbers as known, and calculate with them, using the familiar laws of algebra and the fact that $i^2 = -1$.

The conjugate of the complex number

$$z = a + bi$$

is the number

$$\bar{z} = a - bi.$$

The absolute value of $z$ is

$$|z| = \sqrt{a^2 + b^2}.$$

By straightforward calculations, we get the following.

Theorem 1. For all complex numbers $z$, $z_1$, $z_2$, we have:

z = z,
z + z</